Moses-support Digest, Vol 106, Issue 41

Send Moses-support mailing list submissions to
moses-support@mit.edu

To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
moses-support-request@mit.edu

You can reach the person managing the list at
moses-support-owner@mit.edu

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."


Today's Topics:

1. Re: clarification CBPT vs MMSAPT (Prashant Mathur)
2. Re: clarification CBPT vs MMSAPT (Ulrich Germann)


----------------------------------------------------------------------

Message: 1
Date: Thu, 20 Aug 2015 01:56:51 +0200
From: Prashant Mathur <prashant@fbk.eu>
Subject: Re: [Moses-support] clarification CBPT vs MMSAPT
To: Vincent Nguyen <vnguyen@neuf.fr>
Cc: moses-support <moses-support@mit.edu>
Message-ID:
<CAK3pNhLMpF_4d5qe1M7hWB174XiK30UsiK9JyGEC=X2dB3dJsg@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

Hi Vincent,

Both aim at incremental adaptation, but in principle they are two different
techniques.
CBPT adds an additional dynamic phrase table (with 1 additional feature) that
allows phrase pairs to be inserted or deleted at any time. For
incremental adaptation, CBPT can be used in conjunction with constraint-based
decoding, as in [1], or by cascading onlineMgiza++ and the normal phrase
extractor, as in [2].
I don't know much about the memory-mapped suffix array implementation, but
afaik with MMSAPT (which uses 7 features) you can update your model
incrementally by adding a stream of parallel data along with the alignments.
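
To make the idea of a dynamic phrase table concrete, here is a minimal
sketch in Python. It is not the Moses CBPT code or its API: the class name,
the method names and the decay_rate parameter are all hypothetical, and the
age-based exponential score is only an assumption standing in for the single
extra feature.

```python
import math
from typing import Dict, List, Optional, Tuple


class DynamicPhraseTable:
    """Toy dynamic phrase table (illustration only, not the Moses CBPT)."""

    def __init__(self, decay_rate: float = 0.1) -> None:
        self.decay_rate = decay_rate  # hypothetical tuning parameter
        self.clock = 0                # number of insert operations so far
        # (source, target) -> value of the clock at the most recent insert
        self.entries: Dict[Tuple[str, str], int] = {}

    def insert(self, source: str, target: str) -> None:
        """Add a phrase pair, or refresh it if it is already present."""
        self.clock += 1
        self.entries[(source, target)] = self.clock

    def delete(self, source: str, target: str) -> bool:
        """Remove a phrase pair; return True if it was present."""
        return self.entries.pop((source, target), None) is not None

    def score(self, source: str, target: str) -> Optional[float]:
        """Single feature value: 1.0 for the newest pair, decaying with age."""
        stamp = self.entries.get((source, target))
        if stamp is None:
            return None
        age = self.clock - stamp
        return math.exp(-self.decay_rate * age)

    def lookup(self, source: str) -> List[Tuple[str, float]]:
        """Return all targets for a source phrase, best score first."""
        options = [
            (target, math.exp(-self.decay_rate * (self.clock - stamp)))
            for (src, target), stamp in self.entries.items()
            if src == source
        ]
        return sorted(options, key=lambda option: option[1], reverse=True)
```

In this sketch, inserting a pair that is already present simply refreshes
its timestamp, so a translation that keeps being confirmed stays near the
top of the lookup result.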

--Prashant

[1]
http://www.cl.uni-heidelberg.de/~riezler/publications/papers/MTJOURNAL2014.pdf
[2] http://mt4cat.org/software/adaptive-mt-server


On Wed, Aug 19, 2015 at 6:53 PM, Vincent Nguyen <vnguyen@neuf.fr> wrote:

> Hello support,
>
> Going into advanced features of Moses, I am a bit confused by the
> differences and therefore which path to follow, regarding the 2 features
> CBPT and MMSAPT.
>
> I have the feeling the ultimate goal of both is the same but maybe I am
> wrong.
>
> Can someone explain the actual difference?
>
> By the way, which of the two is the "update" feature of this page,
> http://demo.statmt.org/, based on?
>
> Thanks
>
> Vincent.

------------------------------

Message: 2
Date: Thu, 20 Aug 2015 15:17:56 +0100
From: Ulrich Germann <ulrich.germann@gmail.com>
Subject: Re: [Moses-support] clarification CBPT vs MMSAPT
To: Prashant Mathur <prashant@fbk.eu>
Cc: moses-support <moses-support@mit.edu>
Message-ID:
<CAHQSRUri1StDaKrJL6qyyXNUFXPX_1tvpHt1jO=M=2yYu82tcg@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

Memory-mapped phrase tables are an alternative to conventional phrase
tables. They are much, much faster to build, only slightly slower than
CompactPT at runtime, and at the very least competitive in terms of BLEU
performance. I usually observe slightly higher BLEU scores, but for each
individual evaluation, the difference is usually not significant. They
support only phrase-based MT, but not syntax-based MT.

Both Mmsapt and CBPT also cater to post-editing scenarios (CBPT was
specifically developed for this purpose). They allow adding new material to
the phrase tables at run time. I can't say much about CBPT (apparently you
add phrase table entries, and there is a decay function that rewards more
recent choices approved by the translator). Mmsapt samples at lookup time
anyway, so you can add new word-aligned parallel text to the training data
at run time, or load additional material at start-up. For the server, do NOT
use mosesserver; use moses --server --port ... instead. Additions are
currently not stored on disk by the server and are lost when the server
exits, but they can be loaded at start-up from text files if those are
available. In other words, it is currently up to the user/client who
submits the additions to also store them on disk if they are meant to be
permanent. Mmsapt offers numerous configuration options (separate scores
or joint scores for background and foreground corpus, a provenance feature,
etc.) that affect the number of features, and there is no established best
practice for use in interactive MT (unless Michael Denkowski has advice to
offer in this respect).
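
Since the server does not keep additions on disk, the client has to. A
minimal sketch of that bookkeeping in Python follows. Everything in it is an
assumption made for illustration: the file names, the AdditionLog class and
the submit callable are hypothetical, and the actual call that sends an
update to the Moses server is deliberately left abstract because it is not
described here.

```python
from pathlib import Path
from typing import Callable, List, Tuple

# (source sentence, target sentence, word alignment), one line each
Addition = Tuple[str, str, str]


class AdditionLog:
    """Append-only store for additions, kept as three parallel text files."""

    def __init__(self, directory: Path) -> None:
        directory.mkdir(parents=True, exist_ok=True)
        # Hypothetical file names; any three line-aligned files would do.
        self.paths = [
            directory / name
            for name in ("additions.src", "additions.trg", "additions.aln")
        ]

    def record(self, source: str, target: str, alignment: str) -> None:
        """Append one addition, one line per file."""
        lines = (source, target, alignment)
        # Validate everything first so the files never get out of step.
        if any("\n" in line for line in lines):
            raise ValueError("each field must be a single line")
        for path, line in zip(self.paths, lines):
            with path.open("a", encoding="utf-8") as handle:
                handle.write(line + "\n")

    def replay(self) -> List[Addition]:
        """Read back all stored additions, e.g. to reload them at start-up."""
        if not all(path.exists() for path in self.paths):
            return []
        columns = [
            path.read_text(encoding="utf-8").splitlines()
            for path in self.paths
        ]
        return list(zip(*columns))


def add_and_persist(
    log: AdditionLog,
    submit: Callable[[str, str, str], None],
    source: str,
    target: str,
    alignment: str,
) -> None:
    """Store the addition on disk first, then hand it to the server.

    submit is a placeholder for whatever sends the update to the server.
    """
    log.record(source, target, alignment)
    submit(source, target, alignment)
```

Writing to disk before submitting means that an addition the server has
seen is always also in the log, so nothing is lost when the server exits.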

For phrase-based MT I recommend Mmsapt (see also my paper in the coming
issue of PBML), as it saves you a lot of phrase table building agony. For
interactive use, the infrastructure is there but additional research is
required to figure out the optimal configuration of feature functions and
associated parameters.

Best regards - Uli Germann


--
Ulrich Germann
Senior Researcher
School of Informatics
University of Edinburgh

------------------------------

_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


End of Moses-support Digest, Vol 106, Issue 41
**********************************************
