Moses-support Digest, Vol 82, Issue 33

Send Moses-support mailing list submissions to
moses-support@mit.edu

To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
moses-support-request@mit.edu

You can reach the person managing the list at
moses-support-owner@mit.edu

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."


Today's Topics:

1. Re: How do Moses use the LM for unknown words (Nicola Bertoldi)


----------------------------------------------------------------------

Message: 1
Date: Sun, 25 Aug 2013 07:53:19 +0000
From: Nicola Bertoldi <bertoldi@fbk.eu>
Subject: Re: [Moses-support] How do Moses use the LM for unknown words
To: Per Tunedal <per.tunedal@operamail.com>
Cc: "<moses-support@mit.edu>" <moses-support@mit.edu>
Message-ID: <63F9C2B4-C67B-4F98-9634-EAD0CE53B5AA@fbk.eu>
Content-Type: text/plain; charset="iso-8859-1"

Hi Per,

For all n-grams except those of the highest order, the third field is the logarithmic back-off weight (logBO).
If it is not reported, the weight is assumed to be equal to 0 (in log scale).

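Suppose you want to compute
logP(maison | cadre seulement)

and the 3-gram
"cadre seulement maison"
is absent, so its probability cannot be read directly from the LM and we have to back off to the 2-gram.

Suppose the 2-gram is present as follows:
-0.7 seulement maison -0.1

and the context 2-gram is listed without a back-off weight, as in your file:
-4.57217 cadre seulement
so logBO(cadre seulement) = 0.0 (here -4.57217 is logP(seulement | cadre), not a back-off weight).

Hence, the LM score is computed as a sum in log scale:

logP(maison | cadre seulement) = logBO(cadre seulement) + logP(maison | seulement)
= 0.0 + -0.7 = -0.7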

(sorry for the example, but I do not speak French)
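
The back-off lookup can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code Moses or IRSTLM actually runs: the function names are hypothetical, the "seulement maison" entry is a made-up value, and sending an unseen word to the <unk> unigram is a common convention assumed here.

```python
def parse_arpa_line(line, order):
    """Split one ARPA n-gram line into (words, logprob, logbo)."""
    fields = line.split()
    logprob = float(fields[0])
    words = tuple(fields[1:1 + order])
    # The third field is optional; a missing back-off weight is 0.0 in log scale.
    logbo = float(fields[1 + order]) if len(fields) > 1 + order else 0.0
    return words, logprob, logbo


def build_lm(entries):
    """Build a dict {word tuple: (logprob, logbo)} from (line, order) pairs."""
    lm = {}
    for line, order in entries:
        words, logprob, logbo = parse_arpa_line(line, order)
        lm[words] = (logprob, logbo)
    return lm


def log_prob(lm, word, context):
    """Back-off log probability of `word` given the tuple `context`."""
    ngram = context + (word,)
    if ngram in lm:
        return lm[ngram][0]
    if not context:
        # Assumption: an unseen unigram is scored with the <unk> entry.
        return lm[("<unk>",)][0]
    # Add the context's log back-off weight (0.0 if the context is unseen),
    # then retry with the oldest context word dropped.
    logbo = lm[context][1] if context in lm else 0.0
    return logbo + log_prob(lm, word, context[1:])


entries = [
    ("-2.51072 <unk>", 1),
    ("-4.57217 cadre seulement", 2),    # no third field: logBO = 0.0
    ("-0.7 seulement maison -0.1", 2),  # made-up entry for illustration
]
lm = build_lm(entries)
score = log_prob(lm, "maison", ("cadre", "seulement"))
print(score)
```

With these three entries the 3-gram is missing, so the score is logBO(cadre seulement) + logP(maison | seulement). Note that the two terms are added, because both are logarithms.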

best
Nicola

On Aug 23, 2013, at 9:20 PM, Per Tunedal wrote:

>
> Hi,
> how does Moses calculate the probability of a sentence with an unknown
> word? How is the LM used?
>
> I've estimated a 3-gram LM with IRSTLM for a base line system, according
> to the instructions in the Wiki. The arpa-file contains entries like:
>
> -7.2625 redescendue -0.1681
> -7.2625 serviabilité -0.1681
> -2.51072 <unk>
>
> -3.26915 cadre très -0.096544
> -4.52727 cadre lors
> -4.57217 cadre seulement
>
> I suppose the first number is the probability and the second number is
> the "back-off weight". Is it used somehow? In that case, what happens
> when it's absent (-4.52727 cadre lors)?
>
> Yours,
> Per Tunedal
>
> _______________________________________________
> Moses-support mailing list
> Moses-support@mit.edu
> http://mailman.mit.edu/mailman/listinfo/moses-support




------------------------------



End of Moses-support Digest, Vol 82, Issue 33
*********************************************
