Moses-support Digest, Vol 106, Issue 51

Send Moses-support mailing list submissions to
moses-support@mit.edu

To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
moses-support-request@mit.edu

You can reach the person managing the list at
moses-support-owner@mit.edu

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."

Today's Topics:

1. Re: MMSAPT in EMS questions (Vincent Nguyen)
2. How -xml-input with "constraint" flag works? (Rajen Chatterjee)

----------------------------------------------------------------------

Message: 1
Date: Thu, 27 Aug 2015 22:27:30 +0200
From: Vincent Nguyen <vnguyen@neuf.fr>
Subject: Re: [Moses-support] MMSAPT in EMS questions
To: moses-support <moses-support@mit.edu>
Message-ID: <55DF72B2.2060306@neuf.fr>
Content-Type: text/plain; charset=windows-1252; format=flowed

Hi Vincent :)

Here are some answers. I know, I am talking to myself.
This page
http://www.statmt.org/moses/?n=Advanced.Incremental
definitely needs to be redone.

It mixes up two things.
On one hand:
Incremental training as a lot of people have used it in the past, with:
- training a baseline model; but if you do it with EMS, unless I am
mistaken, it won't keep the VCB files.
- using IncGiza, because mgiza isn't really incremental, it just reuses
parameters (am I wrong?)
- so if the baseline model has been trained with fastalign (IBM Model 2), it
won't work, since IncGiza supports only IBM Model 1 and HMM.
- bottom line: on huge corpora like Giga, you need patience to train your
baseline with mgiza and the hmm option.
My understanding is that EMS does not include the baseline preparation
(the vcb files are missing) but is good for the incremental training itself.

On the other hand:
Incremental training capability using MMSapt:
- EMS includes the mmsapt option to train and binarize the arrays
- EMS does NOT include an automated way of incrementally adding the new
data. This has to be done manually.

Am I understanding things properly?

On 23/08/2015 09:06, Vincent Nguyen wrote:
> Hello,
>
> I have a few questions on running MMSAPT within EMS. I am referring to
> the doc here : http://www.statmt.org/moses/?n=Advanced.Incremental
> and to the sections of the config.basic file of EMS.
>
> 1) the doc says
> initial training run EMS as usual but use modified version of Giza++ and
> add training-options = "-final-alignment-model hmm"
> Does this mean that we cannot use FastAlign for initial training ? or
> does FastAlign support hmm ?
> also can the -final-alignment-model hmm parameter be on the same line
> as other training-options ? (-sort-compress -cores ....)
>
> 2) do we need to comment the
> alignment-symmetrization-method=grow-diag-final-and line ?
>
> 3) what is the difference between : ("for use with interactive
> post-editing")
>
> mmsapt = "pfwd=g pbwd=g smooth=0.01 rare=0 prov=0 sample=1000 workers=1"
> binarize-all = $moses-script-dir/training/binarize-model.perl
> and
> mmsapt = "pfwd=g pbwd=g smooth=0 rare=1 prov=1 sample=1000 workers=1"
> binarize-all = $moses-script-dir/training/binarize-model.perl
>
> 4) then there is a section in the config file for which I find no
> documentation but seems related
> use of baseline alignment model (incremental training)
> baseline=68
> then 8 lines of parameters
>
> 5) in the "Updates" section of the doc (adding new data) I see nothing
> related to EMS.
> Does this mean that at this time there is no actual incremental training
> within EMS and this part has to be done manually ?
>
> Thanks for your help,
>
> Vincent
>
>
>
>
> _______________________________________________
> Moses-support mailing list
> Moses-support@mit.edu
> http://mailman.mit.edu/mailman/listinfo/moses-support
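
On question 3 of the quoted message, it can help to see exactly which keys
differ between the two mmsapt settings before asking what each one means.
The following is only a small sketch: the helper names parse_options and
diff_options are hypothetical and are not part of Moses or EMS, and the
script compares the strings textually without interpreting the parameters.

```python
# Hypothetical helpers (not part of Moses/EMS): compare two mmsapt
# option strings copied verbatim from the quoted config.basic lines.

def parse_options(spec):
    """Split a space-separated key=value string into a dict of strings."""
    return dict(item.split("=", 1) for item in spec.split())


def diff_options(first, second):
    """Return {key: (value_in_first, value_in_second)} for differing keys."""
    a, b = parse_options(first), parse_options(second)
    return {
        key: (a.get(key), b.get(key))
        for key in sorted(set(a) | set(b))
        if a.get(key) != b.get(key)
    }


FIRST = "pfwd=g pbwd=g smooth=0.01 rare=0 prov=0 sample=1000 workers=1"
SECOND = "pfwd=g pbwd=g smooth=0 rare=1 prov=1 sample=1000 workers=1"

if __name__ == "__main__":
    for key, (old, new) in diff_options(FIRST, SECOND).items():
        print(f"{key}: {old} -> {new}")
```

Only smooth, rare and prov differ between the two settings; pfwd, pbwd,
sample and workers are identical.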

------------------------------

Message: 2
Date: Fri, 28 Aug 2015 13:20:39 +0200
From: Rajen Chatterjee <rajen.k.chatterjee@gmail.com>
Subject: [Moses-support] How -xml-input with "constraint" flag works?
To: "moses-support@mit.edu" <moses-support@mit.edu>
Message-ID:
<CAC4-+NxKP4hrvW5MZgT0--Ucug=NVTs=bfZoYpS2QHkv6Pp0Rg@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

Hi Everyone,

I am using the -xml-input option with the "constraint" flag to decode the
test set. My understanding of the constraint flag is that the translation
option given in the xml tag must be used by the decoder and should be
present in the output. But this does not hold in the following example (a
simplified version):
*Input:*
<n translation="is">is</n> going <n translation="the">the</n>
*Output:*
is to
*Phrase Table Entry:*
is going the ||| is to ||| .................. (in the trace I can notice
that this entry is used by the decoder)

Here the xml translation option "the" is not present in the output. I guess
I am not clear on how the "constraint" flag works. Please let me know if
anyone has any idea.

I read here (http://www.statmt.org/moses/?n=Advanced.Hybrid#ntoc1) that
"The XML-specified translation competes with phrase table choices that
contain the specified translation"
But in the above case the XML-specified translation "the" is also competing
with phrase table choices that do not contain the specified translation
("the" is not present on the target side).
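
To spot such cases across a whole test set, the check described above can
be sketched as a small script. This is only an illustration under stated
assumptions: it reads the markup with a regular expression rather than
Moses' own XML handling, the function names xml_translations and
missing_translations are hypothetical, and it tests only whether each
specified translation occurs as a contiguous token sequence in the output.
It says nothing about how the decoder applies the constraint.

```python
import re

# Matches inline markup such as <n translation="is">is</n>.
# Assumption: attribute values are double-quoted and tags are not nested.
TAG = re.compile(r'<(\w+)\s+translation="([^"]*)"\s*>(.*?)</\1>')


def xml_translations(line):
    """Return (source_span, specified_translation) pairs found in line."""
    return [(m.group(3), m.group(2)) for m in TAG.finditer(line)]


def missing_translations(line, output):
    """Return the pairs whose specified translation is absent from output."""
    out_tokens = output.split()
    missing = []
    for source, target in xml_translations(line):
        tgt = target.split()
        n = len(tgt)
        # Look for the target tokens as a contiguous run in the output.
        found = any(
            out_tokens[i:i + n] == tgt
            for i in range(len(out_tokens) - n + 1)
        )
        if not found:
            missing.append((source, target))
    return missing


if __name__ == "__main__":
    src = '<n translation="is">is</n> going <n translation="the">the</n>'
    print(missing_translations(src, "is to"))
```

On the example above, the script reports the pair for "the" as missing and
accepts "is".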

--
-Regards,
Rajen Chatterjee.

------------------------------

_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support

End of Moses-support Digest, Vol 106, Issue 51
**********************************************
