Send Moses-support mailing list submissions to
moses-support@mit.edu
To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
moses-support-request@mit.edu
You can reach the person managing the list at
moses-support-owner@mit.edu
When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."
Today's Topics:
1. Re: some questions about the OOV (Nadir Durrani)
----------------------------------------------------------------------
Message: 1
Date: Sun, 1 Jan 2017 23:14:12 +0300
From: Nadir Durrani <nadir.durrani@nu.edu.pk>
Subject: Re: [Moses-support] some questions about the OOV
To: moses-support@mit.edu
Message-ID:
<CAFDj2Q1Gn1xEtyuAxUpCuQS0A_FF5j7R3bdA0G1huT5RnFWYiQ@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"
Hi WULongski,
When you enable
[TRAINING]
transliteration-module = "yes"
in the EMS, it simply trains a transliteration model from your word-aligned
parallel corpus. This involves (i) mining a transliteration corpus and (ii)
training the entire phrase-based pipeline over the character-level corpus
that was just mined. At the end you have a transliteration model that can be
used to transliterate OOVs.
You can use the model to transliterate in a post-decoding step, i.e. after
the actual decoder has run, when only the OOVs remain to be transliterated.
This is done through
post-decoding-transliteration = "yes"
An alternative is to transliterate while the actual decoding takes place:
in-decoding-transliteration = "yes"
This allows the decoder to reorder OOVs along with the regular words, but it
did not give me better BLEU scores on average.
The current implementation is independent of tuning, i.e. you don't have to
retune the system when you enable transliteration. Tuning the transliteration
parameters (LM-OOV, transliteration phrase table, etc.) did not improve
results, so I just fixed the weights. Currently the LM-OOV feature gets
precedence.
>> What does --alignment mean? Does it mean I need another corpus? I only
>> have the en-fr corpus. What does "aligned text" mean?
You need word alignments of your parallel corpus to mine transliteration
pairs; no additional corpus is required. The miner works on a list of 1-1
aligned word pairs.
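To make "1-1 word list" concrete, here is a minimal sketch of pulling 1-1 aligned word pairs out of one sentence pair. It assumes the alignment is given as space-separated i-j points (source position, target position); the function name and the example sentences are illustrative only, and this is not the miner's actual code.

```python
from collections import Counter


def one_to_one_pairs(src_line, tgt_line, align_line):
    """Return (source word, target word) pairs whose alignment is strictly 1-1."""
    src = src_line.split()
    tgt = tgt_line.split()
    # Each alignment point is "i-j": source position i, target position j.
    points = [tuple(int(x) for x in p.split("-")) for p in align_line.split()]
    # Count how many links touch each source and each target position.
    src_links = Counter(i for i, _ in points)
    tgt_links = Counter(j for _, j in points)
    # Keep only points where both words have exactly one link.
    return [
        (src[i], tgt[j])
        for i, j in points
        if src_links[i] == 1 and tgt_links[j] == 1
    ]


pairs = one_to_one_pairs(
    "le président obama parle",
    "president obama speaks",
    "0-0 1-0 2-1 3-2",
)
print(pairs)  # [('obama', 'obama'), ('parle', 'speaks')]
```

In this toy example "le" and "président" both link to "president", so neither is a 1-1 pair and both are dropped.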
>> Does --srilm-dir <sri lm binary path> mean that I should install SRILM?
No. It will use lmplz (from KenLM) if you don't specify --srilm-dir.
>> If I want to translate French to English, should I first do the baseline
>> steps and then follow the steps in
>> http://www.statmt.org/moses/?n=Advanced.OOVs ?
Transliteration of OOVs helps when source and target are written in
different scripts. For French and English, simply copying over the unknown
word would be more fruitful. Transliteration may help with interesting
cognates and with transformations of borrowed words, but I don't think it
will improve translation quality.
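As a rough illustration of the "different scripts" criterion, the sketch below guesses the script of an OOV from its Unicode character names and decides whether copying is enough. The copy-or-transliterate policy and both function names are hypothetical; Moses does not make this decision this way.

```python
import unicodedata


def scripts_of(text):
    """Return the set of script names for the letters in text (rough heuristic)."""
    scripts = set()
    for ch in text:
        if ch.isalpha():
            # Unicode names start with the script, e.g. "LATIN SMALL LETTER A".
            scripts.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return scripts


def handle_oov(word, target_scripts):
    """Hypothetical policy: copy the OOV if it is already in a target script."""
    return "copy" if scripts_of(word) <= target_scripts else "transliterate"


print(handle_oov("Strasbourg", {"LATIN"}))  # copy
print(handle_oov("Москва", {"LATIN"}))      # transliterate
```

For a French to English system every source OOV is already in Latin script, so this check always says "copy", which matches the advice above.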
>> What do --input-extension <input extension> and --output-extension
>> <output-extension> mean?
>> If I just want to translate French to English, should I use
>> --input-extension fr --output-extension en ?
Yes! But try using EMS rather than running the command manually. It is much
easier.
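If you do run the script by hand, the sketch below assembles the command line from the flags listed on the Advanced.OOVs page quoted further down. All paths are placeholders to replace with your own, the helper name is made up, and --srilm-dir is left out by default so that lmplz is used.

```python
import shlex


def transliteration_train_command(corpus_f, corpus_e, alignment, moses_dir,
                                  external_bin_dir, out_dir,
                                  input_ext="fr", output_ext="en",
                                  srilm_dir=None):
    """Build the argument list for train-transliteration-module.pl."""
    script = f"{moses_dir}/scripts/Transliteration/train-transliteration-module.pl"
    cmd = [
        script,
        "--corpus-f", corpus_f,
        "--corpus-e", corpus_e,
        "--alignment", alignment,
        "--moses-src-dir", moses_dir,
        "--external-bin-dir", external_bin_dir,
        "--input-extension", input_ext,
        "--output-extension", output_ext,
        "--out-dir", out_dir,
    ]
    # Only pass --srilm-dir when SRILM is actually installed.
    if srilm_dir is not None:
        cmd += ["--srilm-dir", srilm_dir]
    return cmd


# Placeholder paths, not real locations.
cmd = transliteration_train_command(
    corpus_f="corpus/train.fr",
    corpus_e="corpus/train.en",
    alignment="train/model/aligned.grow-diag-final-and",
    moses_dir="../mosesdecoder",
    external_bin_dir="tools",
    out_dir="transliteration-model",
)
print(shlex.join(cmd))
```

The list can be passed to subprocess.run(cmd, check=True) once the paths point at real files.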
Nadir
On Fri, Dec 30, 2016 at 7:25 PM, <moses-support-request@mit.edu> wrote:
>
> Today's Topics:
>
> 1. FYI: How the results of mkcls actually used during Moses
> training (Lane Schwartz)
> 2. some questions about the OOV ( WULongski )
> 3. something about the Unsupervised Transliteration Model
> ( WULongski )
> 4. Re: Moses-support Digest, Vol 122, Issue 38 (Mike Ladwig)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Thu, 29 Dec 2016 11:18:49 -0600
> From: Lane Schwartz <dowobeha@gmail.com>
> Subject: [Moses-support] FYI: How the results of mkcls actually used
> during Moses training
> To: "moses-support@mit.edu" <moses-support@mit.edu>
> Message-ID:
> <CABv3vZkT+DFhWyZ_-2U+GeF+0erQN-iNjYsL=p6kTDOUsUzWYw@mail.gmail.com>
> Content-Type: text/plain; charset="utf-8"
>
> This email is simply to record a (to my knowledge) previously undocumented
> aspect of how the Moses training scripts interact with giza++.
>
>
> I've been looking through moses/scripts/training/train-model.perl and the
> execution scripts created by EMS, and I ran across a Perl function called
> make_classes, which (not surprisingly) calls mkcls. This didn't surprise
> me, as I assumed that giza++ used the resulting classes. But in examining
> the subsequent calls to giza++ (or mgiza), I couldn't see anywhere else in
> the Moses training pipeline that actually uses the *.vcb.classes files
> resulting from the calls to mkcls.
>
> Now, there are certainly use cases where a researcher might want to
> explicitly make use of these classes (a class LM, for example). But mkcls
> is called by default whenever training Moses using train-model.perl, and in
> the general case, I couldn't find any place where these classes are
> subsequently used. So I wondered: Am I missing something obvious? Are the
> results of mkcls actually used anywhere by default in the Moses training
> pipeline?
>
> After running mgiza --help, it appears that mgiza can accept these class
> files, but it appears that train-model.perl is not actually explicitly
> providing these class files to mgiza. So, I tried running mgiza as it was
> called by train-model.perl in a clean directory, providing it only the
> files that mgiza actually was provided via command flags (the src-tgt.cooc,
> tgt.vcb, and src.vcb files). Run this way, mgiza complains:
>
> ERROR: can not read src.vcb.classes
> ERROR: can not read tgt.vcb.classes
>
> So, the answer is that mgiza does actually need these files, but
> train-model.perl does not explicitly provide them to mgiza, instead relying
> on the fact that mgiza defaults to assuming that the class files exist in
> the same location as the vcb files, with the same prefix plus the
> additional suffix .classes.
>
> Thanks,
> Lane
>
> ------------------------------
>
> Message: 2
> Date: Fri, 30 Dec 2016 10:36:00 +0800
> From: " WULongski " <1251817914@qq.com>
> Subject: [Moses-support] some questions about the OOV
> To: " moses-support " <moses-support@mit.edu>
> Message-ID: <tencent_589E2CC664F5D88E6E9E657C@qq.com>
> Content-Type: text/plain; charset="gb18030"
>
> Hi,
> 1.
> I read the methods for handling OOVs at
> http://www.statmt.org/moses/?n=Advanced.OOVs#ntoc1. Now I want to use the
> Unsupervised Transliteration Model, which means I should first train a
> transliteration module and then train Moses with the transliteration
> option. But on the baseline web page, training is followed by tuning steps
> to optimize the parameters, whereas the OOVs page does not do tuning, so I
> am confused. I think I should do the tuning, but if so, how? The same way
> as in the baseline? Or should I add some parameters to this command:
> nohup nice ~/mosesdecoder/scripts/training/mert-moses.pl \
>   ~/corpus/news-test2008.true.fr ~/corpus/news-test2008.true.en \
>   ~/mosesdecoder/bin/moses train/model/moses.ini \
>   --mertdir ~/mosesdecoder/bin/ \
>   &> mert.out &
>
>
>
> 2.
> Some questions about the parameters in
> http://www.statmt.org/moses/?n=Advanced.OOVs#ntoc1
>
> Execute command to train transliteration:
> ../mosesdecoder/scripts/Transliteration/train-transliteration-module.pl \
>   --corpus-f <foreign text> --corpus-e <target text> \
>   --alignment <path to aligned text> \
>   --moses-src-dir <moses decoder path> --external-bin-dir <external tools> \
>   --input-extension <input extension> --output-extension <output-extension> \
>   --srilm-dir <sri lm binary path> --out-dir <path to generate output files>
>
> What does --alignment mean? Does it mean I need another corpus? I only
> have the en-fr corpus. What does "aligned text" mean?
>
> What do --input-extension <input extension> and --output-extension
> <output-extension> mean?
> If I just want to translate French to English, should I use
> --input-extension fr --output-extension en ?
>
>
> Thank you very much !
>
> ------------------------------
>
> Message: 3
> Date: Fri, 30 Dec 2016 11:07:51 +0800
> From: " WULongski " <1251817914@qq.com>
> Subject: [Moses-support] something about the Unsupervised
> Transliteration Model
> To: " moses-support " <moses-support@mit.edu>
> Message-ID: <tencent_77DB420D0B67741416FBCE65@qq.com>
> Content-Type: text/plain; charset="iso-8859-1"
>
> Hi,
> On the web page http://www.statmt.org/moses/?n=Advanced.OOVs
>
> Steps for use outside experiment.perl
>
> Execute command to train transliteration:
> ../mosesdecoder/scripts/Transliteration/train-transliteration-module.pl \
>   --corpus-f <foreign text> --corpus-e <target text> \
>   --alignment <path to aligned text> \
>   --moses-src-dir <moses decoder path> --external-bin-dir <external tools> \
>   --input-extension <input extension> --output-extension <output-extension> \
>   --srilm-dir <sri lm binary path> --out-dir <path to generate output files>
>
> Does --srilm-dir <sri lm binary path> mean that I should install SRILM?
> Also, if I want to translate French to English, should I first do the
> baseline steps and then follow the steps in
> http://www.statmt.org/moses/?n=Advanced.OOVs ?
>
> thank you very much!
>
> ------------------------------
>
> Message: 4
> Date: Fri, 30 Dec 2016 11:25:52 -0500
> From: Mike Ladwig <mdladwig@gmail.com>
> Subject: Re: [Moses-support] Moses-support Digest, Vol 122, Issue 38
> To: Hieu Hoang <hieuhoang@gmail.com>
> Cc: Moses Support <moses-support@mit.edu>
> Message-ID:
> <CAB3VaD16w4aeAcAf163CfAzS7HDtVyBsAZKj0j7B-xKfkxSMMQ@mail.gmail.com>
> Content-Type: text/plain; charset="utf-8"
>
> On Wed, Dec 28, 2016 at 4:37 AM, Hieu Hoang <hieuhoang@gmail.com> wrote:
>
> >> I am getting significantly (~20%) lower bleu scores than with 2.x but I
> >> have a lot of testing before I will know why.
> >>
> > Moses and Moses2 should give very similar results. Please let me know what
> > you find
> >
>
> In looking at training logs, I am getting many messages like this:
>
> WARNING: sentence 540930 has alignment point (4, 3) out of bounds (4, 4)
> T: europe is changing .
> S: europa verandert sich .
> WARNING: sentence 540931 has alignment point (9, 5) out of bounds (9, 10)
> T: that was the slogan of the last european elections .
> S: das war das motto der letzten europa wahlen .
> WARNING: sentence 540932 has alignment point (6, 0) out of bounds (6, 6)
> T: personally , i am convinced .
> S: personlich stimme ich dem zu .
>
> Thoughts?
> mike.
>
> ------------------------------
>
> _______________________________________________
> Moses-support mailing list
> Moses-support@mit.edu
> http://mailman.mit.edu/mailman/listinfo/moses-support
>
>
> End of Moses-support Digest, Vol 122, Issue 42
> **********************************************
>
End of Moses-support Digest, Vol 123, Issue 1
*********************************************