Moses-support Digest, Vol 120, Issue 13

Send Moses-support mailing list submissions to
moses-support@mit.edu

To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
moses-support-request@mit.edu

You can reach the person managing the list at
moses-support-owner@mit.edu

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."


Today's Topics:

1. Re: Are the numbers in the phrase-table.gz probabilities or
log prob? (Philipp Koehn)
2. Re: Sample SGML and XML files used by mteval-13a.pl
(Philipp Koehn)
3. Re: Are the numbers in the phrase-table.gz probabilities or
log prob? (Nat Gillin)
4. Re: Sample SGML and XML files used by mteval-13a.pl (Nat Gillin)


----------------------------------------------------------------------

Message: 1
Date: Wed, 12 Oct 2016 17:17:45 -0400
From: Philipp Koehn <phi@jhu.edu>
Subject: Re: [Moses-support] Are the numbers in the phrase-table.gz
probabilities or log prob?
To: Nat Gillin <nat.gillin@gmail.com>
Cc: "moses-support@mit.edu" <moses-support@mit.edu>
Message-ID:
<CAAFADDA7h2RX_sXuSEwK-HsLGTgpZibsbv1EtrXJDCqdjdV3tw@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

Hi,

the numbers in the (text) phrase table are raw probabilities, but Moses
internally uses log-probabilities, so the weights are used as:

weight * log(phrase-table-prob)

which is equivalent to

log( phrase-table-prob ^ weight)

-phi
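
As a rough illustration of the formula above (not taken from the Moses source), the following Python sketch scores one phrase pair. The phrase-table line, the probability values, and the weights are all invented for the example; only the " ||| " field separator and the weight * log(prob) form come from the discussion.

```python
import math

def weighted_log_score(probs, weights):
    """Sum of weight_i * log(prob_i) over the features of one phrase pair."""
    return sum(w * math.log(p) for p, w in zip(probs, weights))

# Hypothetical phrase-table line; all numbers are made up for illustration.
line = "das haus ||| the house ||| 0.5 0.25 0.5 0.125 ||| 0-0 1-1 ||| 4 4 2"
fields = [f.strip() for f in line.split("|||")]
probs = [float(x) for x in fields[2].split()]

# Hypothetical feature weights, standing in for values tuned by MERT.
weights = [0.2, 0.2, 0.2, 0.2]

# Form described in the reply: weight * log(prob), summed over features.
score = weighted_log_score(probs, weights)

# Equivalent product form: log(prod(prob_i ** weight_i)).
product_form = math.log(math.prod(p ** w for p, w in zip(probs, weights)))

# The other reading from the question, log(weight * prob), gives a different value.
other_reading = sum(math.log(w * p) for p, w in zip(probs, weights))

print(score, product_form, other_reading)
```

The first two printed values agree up to floating-point error, while the third does not, which is why the distinction in the question matters once several weighted features are summed.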


On Wed, Oct 12, 2016 at 5:59 AM, Nat Gillin <nat.gillin@gmail.com> wrote:

> Dear Moses community,
>
> One more question on the numbers in the phrase-table and moses.ini.
>
> Assuming that the numbers in the phrase-table are probabilities (not yet
> log), if I were to apply the weights from moses.ini to the phrase-table,
> would it be:
>
> log(weight * phrase-table-prob)
>
>
> or
>
> weight * log(phrase-table-prob)
>
>
>
> Considering that the decoder later does an argmax, I guess the value doesn't
> really matter. But it would be good to know whether it is the log taken after
> multiplying by the weights, or the weights multiplied by the log.
>
> Regards,
> Nat
>
> On Wed, Oct 12, 2016 at 4:48 PM, Nat Gillin <nat.gillin@gmail.com> wrote:
>
>> Dear Moses community,
>>
>> After training the model with train-model.pl, I have got several files
>> from the model.
>>
>> Are the numbers in the phrase-table.gz probabilities or log probs? They
>> look like probabilities (ranging from 0 to 1), but I would like to confirm
>> it. Is that also the case for the reordering-table.wbe-msd-bidirectional-fe.gz?
>>
>> And as for the weights in moses.ini after MERT, are they also raw
>> probabilities or log probabilities?
>>
>> Thanks in advance for the information!
>>
>> Regards,
>> Nat
>>
>
>
> _______________________________________________
> Moses-support mailing list
> Moses-support@mit.edu
> http://mailman.mit.edu/mailman/listinfo/moses-support
>
>

------------------------------

Message: 2
Date: Wed, 12 Oct 2016 17:22:48 -0400
From: Philipp Koehn <phi@jhu.edu>
Subject: Re: [Moses-support] Sample SGML and XML files used by
mteval-13a.pl
To: Nat Gillin <nat.gillin@gmail.com>
Cc: "moses-support@mit.edu" <moses-support@mit.edu>
Message-ID:
<CAAFADDB6vV3Dp--D7+BBEvOSw5wgAWZA_M=R8SbaDZEbKz59wQ@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

Hi,

check out the dev/test files from WMT:
http://statmt.org/wmt16/translation-task.html

The main difference between mteval-13a.pl and multi-bleu.pl is that the
latter accepts your tokenization of translation and reference, while
mteval-13a.pl expects detokenized text and performs its own tokenization
internally.

-phi
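
To make the tokenization point concrete, here is a small Python sketch. The tokenizer below is a toy stand-in and does not reproduce the actual rules in mteval-13a.pl; the sentences are invented. It only shows that the clipped unigram match count, one ingredient of BLEU, changes depending on whether the scorer takes the text's whitespace tokens as given or tokenizes it first.

```python
import re
from collections import Counter

def toy_tokenize(text):
    """Toy tokenizer that splits punctuation from words.

    This is an assumption for illustration, NOT mteval-13a.pl's tokenizer.
    """
    return re.findall(r"\w+|[^\w\s]", text)

def unigram_matches(hyp_tokens, ref_tokens):
    """Clipped unigram match count between hypothesis and reference."""
    hyp, ref = Counter(hyp_tokens), Counter(ref_tokens)
    return sum(min(count, ref[token]) for token, count in hyp.items())

# Invented detokenized hypothesis and reference.
hyp = "Hello, world!"
ref = "Hello world!"

# Text taken as given and split on whitespace only.
ws_matches = unigram_matches(hyp.split(), ref.split())

# Text tokenized by the scorer before counting.
tok_matches = unigram_matches(toy_tokenize(hyp), toy_tokenize(ref))

print(ws_matches, len(hyp.split()))           # matches out of 2 hypothesis tokens
print(tok_matches, len(toy_tokenize(hyp)))    # matches out of 4 hypothesis tokens
```

Because the counts differ, scores from a tool that trusts your tokenization and a tool that applies its own are not directly comparable unless the same tokenization reaches both.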

On Tue, Oct 11, 2016 at 8:43 PM, Nat Gillin <nat.gillin@gmail.com> wrote:

> Dear Moses community,
>
> Is it right that mteval-13a.pl is the canonical BLEU/NIST evaluation script
> used in WMT and by most researchers reporting BLEU?
>
> Are there links to sample SGML and XML files that are used by mteval-13a.pl?
>
> I understand that there's also Moses BLEU in the C++ code and also
> multi-bleu.pl. Are there other BLEU versions out there? Are they the
> same?
>
> Thanks in advance for the tips!
>
> Regards,
> Nat

------------------------------

Message: 3
Date: Thu, 13 Oct 2016 10:19:33 +0800
From: Nat Gillin <nat.gillin@gmail.com>
Subject: Re: [Moses-support] Are the numbers in the phrase-table.gz
probabilities or log prob?
To: Philipp Koehn <phi@jhu.edu>
Cc: "moses-support@mit.edu" <moses-support@mit.edu>
Message-ID:
<CAD2EOZiEugxV8VJkDVq0KPFciaW=9TW33EvzgR2zphCvHM_DtA@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

Dear Phi,

Thanks for the note on the weights and the prob vs log prob!

Regards,
Nat


------------------------------

Message: 4
Date: Thu, 13 Oct 2016 10:21:30 +0800
From: Nat Gillin <nat.gillin@gmail.com>
Subject: Re: [Moses-support] Sample SGML and XML files used by
mteval-13a.pl
To: Philipp Koehn <phi@jhu.edu>
Cc: "moses-support@mit.edu" <moses-support@mit.edu>
Message-ID:
<CAD2EOZjJLSe-m+S-x0Wt-8y6ptmArRwFqysN0Xmh9Q7s+6dgXg@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

Dear Phi,

Thanks again for the explanation!

Regards,
Nat


------------------------------

_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


End of Moses-support Digest, Vol 120, Issue 13
**********************************************
