Moses-support Digest, Vol 122, Issue 42

Send Moses-support mailing list submissions to
moses-support@mit.edu

To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
moses-support-request@mit.edu

You can reach the person managing the list at
moses-support-owner@mit.edu

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."


Today's Topics:

1. FYI: How the results of mkcls actually used during Moses
training (Lane Schwartz)
2. some questions about the OOV ( WULongski )
3. something about the Unsupervised Transliteration Model
( WULongski )
4. Re: Moses-support Digest, Vol 122, Issue 38 (Mike Ladwig)


----------------------------------------------------------------------

Message: 1
Date: Thu, 29 Dec 2016 11:18:49 -0600
From: Lane Schwartz <dowobeha@gmail.com>
Subject: [Moses-support] FYI: How the results of mkcls actually used
during Moses training
To: "moses-support@mit.edu" <moses-support@mit.edu>
Message-ID:
<CABv3vZkT+DFhWyZ_-2U+GeF+0erQN-iNjYsL=p6kTDOUsUzWYw@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

This email is simply to record a (to my knowledge) previously undocumented
aspect of how the Moses training scripts interact with giza++.


I've been looking through moses/scripts/training/train-model.perl and the
execution scripts created by EMS, and I ran across a Perl function called
make_classes, which (not surprisingly) calls mkcls. This didn't surprise
me, as I assumed that giza++ used the resulting classes. But in examining
the subsequent calls to giza++ (or mgiza), I couldn't see anywhere else in
the Moses training pipeline that actually uses the *.vcb.classes files
resulting from the calls to mkcls.

Now, there are certainly use cases where a researcher might want to
explicitly make use of these classes (a class LM, for example). But mkcls
is called by default whenever training Moses using train-model.perl, and in
the general case, I couldn't find any place where these classes are
subsequently used. So I wondered: Am I missing something obvious? Are the
results of mkcls actually used anywhere by default in the Moses training
pipeline?

After running mgiza --help, it appears that mgiza can accept these class
files, but it appears that train-model.perl is not actually explicitly
providing these class files to mgiza. So, I tried running mgiza as it was
called by train-model.perl in a clean directory, providing it only the
files that mgiza actually was provided via command flags (the src-tgt.cooc,
tgt.vcb, and src.vcb files). Run this way, mgiza complains:

ERROR: can not read src.vcb.classes
ERROR: can not read tgt.vcb.classes

So, the answer is that mgiza does actually need these files, but
train-model.perl does not explicitly provide them to mgiza. Instead, it
relies on the fact that mgiza defaults to assuming that the class files
exist in the same location as the vcb files, with the same prefix plus the
additional suffix .classes.

Thanks,
Lane
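
The naming convention described in Message 1 can be checked before mgiza is launched. What follows is only a minimal sketch in Python, not part of train-model.perl or mgiza: the function name missing_class_files is hypothetical, and the only thing taken from the message is the rule that each class file sits next to its vcb file with the extra suffix .classes.

```python
from pathlib import Path


def missing_class_files(vcb_paths):
    """Return the expected '<vcb>.classes' paths that do not exist.

    Hypothetical helper: it only mirrors the convention described in the
    message (same location, same prefix, additional suffix '.classes').
    """
    missing = []
    for vcb in vcb_paths:
        vcb = Path(vcb)
        # Append '.classes' to the full file name, e.g. src.vcb -> src.vcb.classes
        classes = vcb.with_name(vcb.name + ".classes")
        if not classes.is_file():
            missing.append(classes)
    return missing


if __name__ == "__main__":
    # File names as they appear in the message; adjust to your own directory.
    for path in missing_class_files(["src.vcb", "tgt.vcb"]):
        print(f"missing class file: {path}")
```

Run in the directory that holds the vcb files, this lists any class file that mgiza would be expected to complain about under the convention above.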

------------------------------

Message: 2
Date: Fri, 30 Dec 2016 10:36:00 +0800
From: " WULongski " <1251817914@qq.com>
Subject: [Moses-support] some questions about the OOV
To: " moses-support " <moses-support@mit.edu>
Message-ID: <tencent_589E2CC664F5D88E6E9E657C@qq.com>
Content-Type: text/plain; charset="gb18030"

Hi,

1.
I read about the methods for handling OOVs at http://www.statmt.org/moses/?n=Advanced.OOVs#ntoc1. I want to use the Unsupervised Transliteration Model, which means I should first train a transliteration module and then train Moses with the transliteration option. On the baseline web page, however, training is followed by a tuning step to optimize the parameters, whereas the OOVs page does not do any tuning, so I am confused.
I think I should do the tuning. But if so, how? In the same way as the baseline, or should I add some parameters to the following command?

    nohup nice ~/mosesdecoder/scripts/training/mert-moses.pl \
      ~/corpus/news-test2008.true.fr ~/corpus/news-test2008.true.en \
      ~/mosesdecoder/bin/moses train/model/moses.ini --mertdir ~/mosesdecoder/bin/ \
      &> mert.out &



2.
Some questions about the parameters at http://www.statmt.org/moses/?n=Advanced.OOVs#ntoc1

Execute command to train transliteration:

    ../mosesdecoder/scripts/Transliteration/train-transliteration-module.pl \
      --corpus-f <foreign text> --corpus-e <target text> \
      --alignment <path to aligned text> \
      --moses-src-dir <moses decoder path> --external-bin-dir <external tools> \
      --input-extension <input extension> --output-extension <output-extension> \
      --srilm-dir <sri lm binary path> --out-dir <path to generate output files>

What does --alignment mean? Does it mean I need another corpus? I only have the en-fr corpus. What does "aligned text" mean?

What do --input-extension <input extension> and --output-extension <output-extension> mean?
If I just want to translate French to English, should I use --input-extension fr --output-extension en?

Thank you very much !
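
For readers who want to script the command quoted in Message 2, the sketch below assembles it as an argument list in Python. It is only an illustration: every path value is a hypothetical placeholder, the function name is invented here, and the use of fr and en for the two extension flags follows the asker's own guess, which this digest does not confirm. Only the flag names come from the quoted command.

```python
import shlex


def transliteration_command(script, corpus_f, corpus_e, alignment,
                            moses_src_dir, external_bin_dir,
                            input_ext, output_ext, srilm_dir, out_dir):
    """Build the argument list for train-transliteration-module.pl.

    Flag names are copied from the command quoted in the message;
    all values are supplied by the caller.
    """
    return [
        script,
        "--corpus-f", corpus_f,
        "--corpus-e", corpus_e,
        "--alignment", alignment,
        "--moses-src-dir", moses_src_dir,
        "--external-bin-dir", external_bin_dir,
        "--input-extension", input_ext,
        "--output-extension", output_ext,
        "--srilm-dir", srilm_dir,
        "--out-dir", out_dir,
    ]


# Hypothetical placeholder values; none of these paths come from the message.
cmd = transliteration_command(
    script="../mosesdecoder/scripts/Transliteration/train-transliteration-module.pl",
    corpus_f="corpus/train.fr",
    corpus_e="corpus/train.en",
    alignment="path/to/aligned-text",
    moses_src_dir="../mosesdecoder",
    external_bin_dir="path/to/external-tools",
    input_ext="fr",   # the asker's guess, not confirmed in this digest
    output_ext="en",  # the asker's guess, not confirmed in this digest
    srilm_dir="path/to/srilm/bin",
    out_dir="transliteration-out",
)

# Print a shell-quoted form for inspection; nothing is executed here.
print(" ".join(shlex.quote(arg) for arg in cmd))
```

Building the command as a list keeps each flag next to its value, which makes a missing space such as the one between the two extension flags in the original one-line form easy to spot.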

------------------------------

Message: 3
Date: Fri, 30 Dec 2016 11:07:51 +0800
From: " WULongski " <1251817914@qq.com>
Subject: [Moses-support] something about the Unsupervised
Transliteration Model
To: " moses-support " <moses-support@mit.edu>
Message-ID: <tencent_77DB420D0B67741416FBCE65@qq.com>
Content-Type: text/plain; charset="iso-8859-1"

Hi,

On the web page http://www.statmt.org/moses/?n=Advanced.OOVs, under "Steps for use outside experiment.perl":

Execute command to train transliteration:

    ../mosesdecoder/scripts/Transliteration/train-transliteration-module.pl \
      --corpus-f <foreign text> --corpus-e <target text> \
      --alignment <path to aligned text> \
      --moses-src-dir <moses decoder path> --external-bin-dir <external tools> \
      --input-extension <input extension> --output-extension <output-extension> \
      --srilm-dir <sri lm binary path> --out-dir <path to generate output files>

Does --srilm-dir <sri lm binary path> mean that I need to install SRILM?
Also, if I want to translate French to English, should I first do the baseline steps and then follow the steps at http://www.statmt.org/moses/?n=Advanced.OOVs?

Thank you very much!

------------------------------

Message: 4
Date: Fri, 30 Dec 2016 11:25:52 -0500
From: Mike Ladwig <mdladwig@gmail.com>
Subject: Re: [Moses-support] Moses-support Digest, Vol 122, Issue 38
To: Hieu Hoang <hieuhoang@gmail.com>
Cc: Moses Support <moses-support@mit.edu>
Message-ID:
<CAB3VaD16w4aeAcAf163CfAzS7HDtVyBsAZKj0j7B-xKfkxSMMQ@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

On Wed, Dec 28, 2016 at 4:37 AM, Hieu Hoang <hieuhoang@gmail.com> wrote:

>> I am getting significantly (~20%) lower bleu scores than with 2.x but I
>> have a lot of testing before I will know why.
>>
> Moses and Moses2 should give very similar results. Please let me know what
> you find
>

In looking at training logs, I am getting many messages like this:

WARNING: sentence 540930 has alignment point (4, 3) out of bounds (4, 4)
T: europe is changing .
S: europa verandert sich .
WARNING: sentence 540931 has alignment point (9, 5) out of bounds (9, 10)
T: that was the slogan of the last european elections .
S: das war das motto der letzten europa wahlen .
WARNING: sentence 540932 has alignment point (6, 0) out of bounds (6, 6)
T: personally , i am convinced .
S: personlich stimme ich dem zu .

Thoughts?
mike.
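
The warnings quoted in Message 4 can be reproduced offline with a small bounds check. The sketch below is an illustration only, not the code that emits the warning in Moses. It assumes whitespace-tokenised sentences and an alignment line of 0-based "source-target" index pairs; the coordinate order (source first, target second) is inferred from sentence 540931, whose source line has 9 tokens and whose target line has 10. The function name and the full alignment string in the example are hypothetical; only the point 4-3 comes from the log.

```python
def out_of_bounds_points(source, target, alignment):
    """Return (bad_points, (source_len, target_len)).

    Assumes whitespace-tokenised sentences and an alignment string of
    0-based 'srcIndex-tgtIndex' pairs, e.g. '0-0 1-1 2-2'.
    """
    src_len = len(source.split())
    tgt_len = len(target.split())
    bad = []
    for pair in alignment.split():
        s, t = (int(x) for x in pair.split("-"))
        # A valid index must satisfy 0 <= index < sentence length.
        if not (0 <= s < src_len and 0 <= t < tgt_len):
            bad.append((s, t))
    return bad, (src_len, tgt_len)


if __name__ == "__main__":
    # Sentence 540930 from the log; the alignment string is hypothetical
    # apart from the reported point 4-3.
    bad, bounds = out_of_bounds_points(
        "europa verandert sich .",
        "europe is changing .",
        "0-0 1-1 2-2 4-3",
    )
    for point in bad:
        print(f"alignment point {point} out of bounds {bounds}")
```

In all three quoted warnings the source index equals the source sentence length, so the alignment refers to one more source token than the logged sentence contains. Comparing the token counts of the corpus that was aligned with those of the corpus used for extraction is one way to look for such a mismatch.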

------------------------------

_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


End of Moses-support Digest, Vol 122, Issue 42
**********************************************
