Moses-support Digest, Vol 88, Issue 46

Send Moses-support mailing list submissions to
moses-support@mit.edu

To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
moses-support-request@mit.edu

You can reach the person managing the list at
moses-support-owner@mit.edu

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."


Today's Topics:

1. Re: Adding exceptions using xml-input (Philipp Koehn)
2. kbmira, custom metric, background corpus (Marcin Junczys-Dowmunt)


----------------------------------------------------------------------

Message: 1
Date: Fri, 21 Feb 2014 15:21:17 -0500
From: Philipp Koehn <pkoehn@inf.ed.ac.uk>
Subject: Re: [Moses-support] Adding exceptions using xml-input
To: Massinissa Ahmim <massinissa.ahmim@linguacustodia.com>
Cc: "moses-support@mit.edu" <moses-support@mit.edu>
Message-ID:
<CAAFADDCmMWJwnQ0MXo5OuXiHPX08LtFAvZk7Fyagi_GZjsyn8Q@mail.gmail.com>
Content-Type: text/plain; charset="iso-8859-1"

Hi,

You could place the "exceptions" into an additional phrase table and give
it preference over the regular phrase table. The easiest way would be to
give it a single indicator feature with a very positive weight.

-phi
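
[Editor's sketch] The suggestion above could be prepared along the
following lines. This is a minimal sketch, not a definitive recipe: it
assumes the plain-text phrase table layout "source ||| target ||| scores",
and it assumes that the decoder takes the log of each score, so that a
stored value of e (2.718) yields a feature value of 1. The function name
write_exception_table and the file name are hypothetical. Registering the
extra table in moses.ini and choosing its weight depend on the Moses
version and are not shown.

```python
import math


def write_exception_table(exceptions, path, indicator=math.e):
    """Write (source, target) pairs as phrase-table lines with one score.

    Assumption: one score column acting as an indicator feature; stored
    as e so that a log-transform on loading would turn it into 1.
    """
    with open(path, "w", encoding="utf-8") as out:
        for source, target in exceptions:
            # Collapse whitespace so each side is a single-spaced token string.
            src = " ".join(source.split())
            tgt = " ".join(target.split())
            out.write(f"{src} ||| {tgt} ||| {indicator:.3f}\n")


# Example pair taken from the quoted message below.
exceptions = [("Ce document est cool", "This document is cool.")]
```

Calling write_exception_table(exceptions, "exceptions.pt") writes one line
per exception. Both sides would normally need the same tokenization and
casing as the rest of the system's training data.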


On Fri, Feb 21, 2014 at 9:20 AM, Massinissa Ahmim <
massinissa.ahmim@linguacustodia.com> wrote:

> Dear all,
>
> I am currently considering using xml-input to force Moses to translate
> certain parts of a text (titles, etc.) in a specific way.
>
> So far I have been using the command line, for instance:
>
> echo '<np translation="This document is cool."> Ce document est cool</np>
> ' | /mosesdecoder/bin/moses -xml-input exclusive -f moses.ini -t
>
> This might be fine for a relatively small number of exceptions, but as I
> want to implement a large number of exceptions, it might grow quickly. I
> was wondering whether there is already a way in Moses to store a set of
> exceptions in a file that the decoder will check before starting the
> translation process?
>
> Many thanks
>
> Regards
>
> Massinissa

------------------------------

Message: 2
Date: Fri, 21 Feb 2014 22:58:44 +0100
From: Marcin Junczys-Dowmunt <junczys@amu.edu.pl>
Subject: [Moses-support] kbmira, custom metric, background corpus
To: moses-support <moses-support@mit.edu>
Message-ID: <5307CC14.1030104@amu.edu.pl>
Content-Type: text/plain; charset=UTF-8; format=flowed

Hi,
I managed to implement my custom metric with kbmira, but keep running
into weird behavior. If I do not set --model-bg the tuning results are
actually constantly decreasing between iterations. With --model-bg it
seems to work reasonably well (kbmira recovers from occasional decreases).

The metric is plain F-score computed from three statistics: correct
edits, proposed edits, gold standard edits. For PRO and kbmira, a
sentence-level F-score is computed from those.
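
[Editor's sketch] For concreteness, the metric described above could be
computed as follows. This is a minimal sketch that assumes "plain F-score"
means balanced F1 and that empty counts score 0; the function names are
hypothetical and not taken from the poster's code.

```python
def f_score(correct, proposed, gold):
    """F1 from counts of correct, proposed and gold-standard edits."""
    precision = correct / proposed if proposed else 0.0
    recall = correct / gold if gold else 0.0
    if precision + recall == 0.0:
        return 0.0  # assumed convention when nothing is correct
    return 2.0 * precision * recall / (precision + recall)


def corpus_f_score(sentence_stats):
    """Sum the three statistics over all sentences, then compute one F1."""
    correct = sum(s[0] for s in sentence_stats)
    proposed = sum(s[1] for s in sentence_stats)
    gold = sum(s[2] for s in sentence_stats)
    return f_score(correct, proposed, gold)


# Hypothetical per-sentence statistics: (correct, proposed, gold).
stats = [(1, 2, 1), (0, 0, 1), (2, 2, 4)]
```

The corpus-level score sums the statistics before computing F1, whereas a
sentence-level score applies f_score to each triple on its own, which is
why a sentence with no proposed edits scores 0 here regardless of the
other sentences.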

I do not think this is an issue with the code of my metric itself, since
it works very well with MERT and reasonably well with PRO. I am using
essentially the same code for kbmira, just adding the background corpus
statistics to the sufficient statistics of my metric.
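
[Editor's sketch] Adding background statistics to a sentence's sufficient
statistics can be illustrated as below. This is a sketch of the general
pseudo-document smoothing idea only: the decay constant, the update rule
and all names are illustrative assumptions and are not taken from the
kbmira source or from the poster's code.

```python
def f_score(correct, proposed, gold):
    """Balanced F1 written directly in terms of the three counts."""
    denom = proposed + gold
    return 2.0 * correct / denom if denom else 0.0


def smoothed_sentence_f(sentence, background):
    """Score one sentence after adding the background statistics to it."""
    return f_score(*(s + b for s, b in zip(sentence, background)))


def update_background(background, sentence, decay=0.999):
    """Fold a sentence into the background with exponential decay.

    Assumption: background <- decay * (background + sentence).
    """
    return [decay * (b + s) for b, s in zip(background, sentence)]


# Background initialised to all ones, as described in the message.
background = [1.0, 1.0, 1.0]
```

In this sketch the initial values shape the scores of the first few
sentences, and their influence shrinks as decayed statistics from later
sentences accumulate.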

So, my question would be, is the significantly different behavior for
the two types of background corpora in kbmira justified or should I
assume that I messed something up somewhere? Also, do the initial values
for the background corpus actually matter? Currently I just set them all
to 1.

Thanks,
Best,
Marcin


------------------------------

_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


End of Moses-support Digest, Vol 88, Issue 46
*********************************************
