Send Moses-support mailing list submissions to
moses-support@mit.edu
To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
moses-support-request@mit.edu
You can reach the person managing the list at
moses-support-owner@mit.edu
When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."
Today's Topics:
1. Re: clarification CBPT vs MMSAPT (Ulrich Germann)
----------------------------------------------------------------------
Message: 1
Date: Sat, 22 Aug 2015 23:36:26 +0100
From: Ulrich Germann <ulrich.germann@gmail.com>
Subject: Re: [Moses-support] clarification CBPT vs MMSAPT
To: Vincent Nguyen <vnguyen@neuf.fr>, "moses-support@mit.edu"
<moses-support@mit.edu>
Message-ID:
<CAHQSRUovg6610wRRth5m1+fyyePXb=de_Z1jyLEs4DnZHu6E8Q@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"
Hi Vincent,
1. I don't use EMS, so I'm the wrong person to ask.
2. Please always post questions to the moses-support mailing list, so that
others can benefit from questions and answers as well.
3. Can you briefly explain what you are trying to accomplish? I don't think
I understand what you are actually trying to do.
Best regards - Uli
On Sat, Aug 22, 2015 at 10:45 PM, Vincent Nguyen <vnguyen@neuf.fr> wrote:
>
> I have read this page again and again:
> http://www.statmt.org/moses/?n=Advanced.Incremental
> but it is not clear enough for a newbie like me to use with EMS.
> I also see a section in the EMS config file:
> use of baseline alignment model (incremental training)
> and I don't really see how it fits with the rest of the parameters.
>
>
>
> On 22/08/2015 16:31, vnguyen@neuf.fr wrote:
>
> Oops.
> Using EMS, I built the phrase table with the mmsapt=
> option and it went through,
> but I had not added the training option
> -final-alignment-model hmm
>
> Do I need to start again?
>
> The thing is, I use Dyer's aligner because of the Giga corpus, and I am not
> sure that training option is compatible, since the tutorial mentions a
> modified giza++...
>
>
>
> ____________________
>
> From: "Ulrich Germann"
> Date: 21 August 2015 15:54:08
> To: Vincent Nguyen
> Cc: prashant@fbk.eu, moses-support@mit.edu
> Subject: Re: [Moses-support] clarification CBPT vs MMSAPT
>
>
>
> On Thu, Aug 20, 2015 at 5:40 PM, Vincent Nguyen <vnguyen@neuf.fr> wrote:
>
>> Thanks to both of you. I will give both solutions a try.
>>
>> For MMSAPT:
>> Will I be able to make it work with the Giga corpus fr-en? If everything
>> is loaded in memory, I may run short of RAM rather quickly.
>>
>
> For the WMT-15 fr-en data, mmsapt's files are about 20GB in total, but not
> all of it will normally be kept in memory. Mmsapt degrades gracefully: it
> just gets slow if the VM manager has to drop memory pages and re-load them.
> The LM is about 40GB, so for optimal performance you should plan for 60+GB
> of RAM. Provided you have enough RAM, cat all model files to /dev/null
> prior to starting moses. Sequential disk access is much faster than random
> disk access, and the cat to /dev/null will push them into the OS's file
> cache.
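
The cache-warming step described above (cat the model files to /dev/null) can also be sketched in Python. This is only an illustration of the same idea: read each file once, sequentially, and throw the bytes away so the OS keeps the pages in its file cache. The function name `warm_file_cache` and the chunk size are assumptions made for this sketch, not part of Moses.

```python
def warm_file_cache(paths, chunk_size=64 * 1024 * 1024):
    """Read every file sequentially and discard the data.

    Same effect as `cat FILE > /dev/null`: the OS is free to keep the
    pages it just read in its file cache. Returns the total bytes read.
    """
    total = 0
    for path in paths:
        with open(path, "rb") as handle:
            while True:
                chunk = handle.read(chunk_size)
                if not chunk:  # end of file
                    break
                total += len(chunk)
    return total
```

Calling it with the list of phrase table and LM files before starting moses only helps if the machine has enough free RAM to hold them; otherwise the OS will evict the pages again.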
>
>
>
>> Plus I was using Dyer's fast_align... so do I need to realign the whole
>> corpus with the modified version of giza++?
>
> You need word alignments in the output format produced by symal (i.e.,
> row-column pairs 1-1 2-2 3-4 etc.). How these alignments are produced
> doesn't matter for Mmsapt's ability to handle them. It may, of course,
> affect the alignment quality, but that's independent of which phrase table
> implementation you use.
>
> - Uli
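
As a rough illustration of the alignment format mentioned above (whitespace-separated row-column pairs such as 1-1 2-2 3-4), here is a minimal Python sketch that parses one alignment line and rejects malformed tokens. The function name `parse_alignment` is hypothetical, and the sketch makes no assumption about whether the aligner counts positions from 0 or from 1.

```python
def parse_alignment(line):
    """Parse one line of 'i-j' pairs into a list of (i, j) integer tuples.

    Raises ValueError if a token is not two non-negative integers
    joined by a single hyphen.
    """
    pairs = []
    for token in line.split():
        left, sep, right = token.partition("-")
        if not sep or not left.isdigit() or not right.isdigit():
            raise ValueError(f"malformed alignment token: {token!r}")
        pairs.append((int(left), int(right)))
    return pairs
```

A check like this can be run over the alignment file produced by any aligner before handing it to the phrase table builder, since only the output format matters here.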
>
>
>
>> For CBPT:
>> I would like to give the adaptive MT server a try, but I don't really
>> understand how to adapt the given "adaptive model" and "updater model"
>> to a context where my language pair is different. These preliminary steps
>> are not part of the tutorial (especially the updater_models/alignment
>> folders...).
>>
>> The only glitch I see in CBPT is that adaptive changes cannot be made
>> permanent.
>>
>>
>>
>>
>> On 20/08/2015 16:17, Ulrich Germann wrote:
>>
>> Memory-mapped phrase tables are an alternative to conventional phrase
>> tables. They are much, much faster to build, only slightly slower than
>> CompactPT at runtime, and at the very least competitive in terms of BLEU
>> performance. I usually observe slightly higher BLEU scores, but for each
>> individual evaluation, the difference is usually not significant. They
>> support only phrase-based MT, but not syntax-based MT.
>>
>> Both Mmsapt and CBPT also cater to post-editing scenarios (CBPT was
>> specifically developed for this purpose). They allow adding new material to
>> the phrase tables at run time. I can't say much about CBPT (apparently you
>> add phrase table entries, and there is a decay function that rewards more
>> recent choices approved by the translator). In the case of Mmsapt, since it
>> samples at lookup time anyway, you can add new word-aligned parallel text
>> to the training data at run time, or additional material at start-up.
>> Additions are currently not stored on disk by the server (do NOT use
>> mosesserver, use moses --server --port ...) and are lost when the server
>> exits, but they can be loaded at start-up time from text files, if they are
>> available. In other words: it's currently up to the user/client who submits
>> the additions to also store them on disk if they are meant to be permanent.
>> Mmsapt offers numerous configuration options (separate scores
>> or joint scores for background and foreground corpus, a provenance feature,
>> etc.) that affect the number of features, and there is no established best
>> practice for use in interactive MT (unless Michael Denkowski has advice to
>> offer in this respect).
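
Since the server does not persist additions, the client has to keep its own copy. The following Python sketch shows one way a client might do that, appending each submitted sentence pair and its word alignment to three parallel text files. The function name `store_addition`, the `additions.src` / `additions.tgt` / `additions.aln` file names, and the one-sentence-per-line layout are all assumptions made for this illustration; the file format that Mmsapt actually expects at start-up should be checked against the Moses documentation.

```python
from pathlib import Path


def store_addition(directory, source, target, alignment, stem="additions"):
    """Append one sentence pair and its alignment to parallel text files.

    Hypothetical layout: <stem>.src, <stem>.tgt and <stem>.aln, one
    sentence per line, so that line N in each file belongs together.
    """
    fields = (("src", source), ("tgt", target), ("aln", alignment))
    # Validate everything first so the three files never get out of step.
    for _, text in fields:
        if "\n" in text.strip():
            raise ValueError("each field must fit on a single line")
    directory = Path(directory)
    directory.mkdir(parents=True, exist_ok=True)
    for suffix, text in fields:
        with open(directory / f"{stem}.{suffix}", "a", encoding="utf-8") as out:
            out.write(text.strip() + "\n")
```

A client would call this right after submitting each addition to the server, for example `store_addition("updates", "das ist gut", "this is good", "0-0 1-1 2-2")`, so that the material is still available after the server exits.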
>>
>> For phrase-based MT I recommend Mmsapt (see also my paper in the coming
>> issue of PBML), as it saves you a lot of phrase table building agony. For
>> interactive use, the infrastructure is there but additional research is
>> required to figure out the optimal configuration of feature functions and
>> associated parameters.
>>
>> Best regards - Uli Germann
>>
>> On Thu, Aug 20, 2015 at 12:56 AM, Prashant Mathur <prashant@fbk.eu> wrote:
>>
>>> Hi Vincent,
>>>
>>> The goal is incremental adaptation, but these two are different
>>> techniques in principle.
>>> CBPT adds an additional dynamic phrase table (with 1 additional feature)
>>> which allows deletion and insertion of phrase pairs at any given time. For
>>> incremental adaptation, CBPT can be used in conjunction with
>>> constraint-based decoding as in [1], or by cascading onlineMgiza++ and a
>>> normal phrase extractor as in [2].
>>> I don't know much about the memory-mapped suffix array implementation,
>>> but afaik with MMSAPT (which uses 7 features) you can do incremental
>>> updates to your model by adding a stream of parallel data along with the
>>> alignments.
>>>
>>> --Prashant
>>>
>>> [1]
>>> http://www.cl.uni-heidelberg.de/~riezler/publications/papers/MTJOURNAL2014.pdf
>>> [2] http://mt4cat.org/software/adaptive-mt-server
>>>
>>>
>>> On Wed, Aug 19, 2015 at 6:53 PM, Vincent Nguyen <vnguyen@neuf.fr> wrote:
>>>
>>>> Hello support,
>>>>
>>>> Going into the advanced features of Moses, I am a bit confused by the
>>>> differences between the two features CBPT and MMSAPT, and therefore by
>>>> which path to follow.
>>>>
>>>> I have the feeling the ultimate goal of both is the same but maybe I am
>>>> wrong.
>>>>
>>>> Can someone explain the actual difference?
>>>>
>>>> By the way, which one is the "update" feature of this page,
>>>> http://demo.statmt.org/, based on?
>>>>
>>>> Thanks
>>>>
>>>> Vincent.
>>>> _______________________________________________
>>>> Moses-support mailing list
>>>> Moses-support@mit.edu
>>>> http://mailman.mit.edu/mailman/listinfo/moses-support
>>>>
>>>
>>>
>>>
>>>
>>
>>
>> --
>> Ulrich Germann
>> Senior Researcher
>> School of Informatics
>> University of Edinburgh
>>
>>
>>
>
>
>
>
>
--
Ulrich Germann
Senior Researcher
School of Informatics
University of Edinburgh
------------------------------
_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support
End of Moses-support Digest, Vol 106, Issue 44
**********************************************