Moses-support Digest, Vol 86, Issue 37

Send Moses-support mailing list submissions to
moses-support@mit.edu

To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
moses-support-request@mit.edu

You can reach the person managing the list at
moses-support-owner@mit.edu

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."


Today's Topics:

1. Some of the confusing concepts (Andrew)
2. Regarding Bleu Score (Pranjal Das)
3. Re: Regarding Bleu Score (Prasanth K)
4. Re: Regarding Bleu Score (Pranjal Das)


----------------------------------------------------------------------

Message: 1
Date: Thu, 12 Dec 2013 05:52:39 +0900
From: Andrew <ravenyj@hotmail.com>
Subject: [Moses-support] Some of the confusing concepts
To: "moses-support@mit.edu" <moses-support@mit.edu>
Message-ID: <BLU171-W825BF04929C32BC0101E92B2DD0@phx.gbl>
Content-Type: text/plain; charset="iso-2022-jp"

Hello,
I'm trying to get a complete picture of how Moses works, and here are some of the points on which I have not yet reached a firm understanding. I apologize if this is a bit verbose, but I would greatly appreciate any help in better understanding Moses and SMT.

1) In GIZA++, what are the order and number of iterations for each model? The default order seems to be Model 1 -> Model 2 -> HMM -> Model 3 -> Model 4, but I'm not sure how many iterations of each run by default.
2) In GIZA++, is it right that a source word cannot be aligned to more than one word in the target language? What about the opposite direction? Can multiple source words be aligned to the same target word, and vice versa? What would happen in an extreme case where the source sentence is only one word and the target sentence is, say, 10 words?
3) From what I've read, it seems that all possible alignments are counted at first, and the alignment probability for each word is calculated from those counts. If so, when |source| < |target|, which source word is likely to be aligned to the empty word? My understanding is that it would be the word with the lowest alignment probability with respect to the target words, and a word with a high fertility probability for n=0.
4) If we opt not to use a reordering table in moses.ini, will the distortion limit be meaningless? Also, in that case, will grammaticality depend only on the language model?
5) If GIZA++ aligns words in both directions, why does it matter which one is the source and which one is the target? Is there a difference in weights? Or is it because of the restriction that a source word can only be aligned to one target word?
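
The counting idea in question 3 and the one-directional restriction in questions 2 and 5 can be illustrated with a minimal sketch of IBM Model 1 expectation-maximization on a toy corpus. This is not GIZA++ code and does not reflect its defaults: the toy corpus, the names train_model1 and best_links, the "<NULL>" token, and the iteration count are all assumptions made for illustration. Here "e" is the generating side and "f" the generated side.

```python
from collections import defaultdict

NULL = "<NULL>"  # hypothetical token standing in for the empty word


def train_model1(corpus, iterations=10):
    """Estimate t(f | e) with IBM Model 1 EM on (e_words, f_words) pairs."""
    f_vocab = {f for _, fs in corpus for f in fs}
    # Start from a uniform distribution t(f | e).
    t = defaultdict(lambda: 1.0 / len(f_vocab))
    for _ in range(iterations):
        count = defaultdict(float)  # expected count of each (f, e) link
        total = defaultdict(float)  # expected count of e generating any word
        for es, fs in corpus:
            es = [NULL] + es
            for f in fs:
                # E-step: each f is generated by exactly one e (or NULL), so
                # its single unit of mass is spread over all candidate e words.
                norm = sum(t[(f, e)] for e in es)
                for e in es:
                    frac = t[(f, e)] / norm
                    count[(f, e)] += frac
                    total[e] += frac
        # M-step: renormalise the expected counts into probabilities.
        t = defaultdict(float, {(f, e): c / total[e] for (f, e), c in count.items()})
    return t


def best_links(t, es, fs):
    """Pick the most probable generating word for every f."""
    es = [NULL] + es
    return [(f, max(es, key=lambda e: t[(f, e)])) for f in fs]


# Toy corpus (an assumption, not data from this thread).
toy_corpus = [
    (["the", "house"], ["das", "haus"]),
    (["the", "book"], ["das", "buch"]),
    (["a", "book"], ["ein", "buch"]),
]
t = train_model1(toy_corpus)
links = best_links(t, ["the", "house"], ["das", "haus"])
print(links)
```

In this sketch every word on the "f" side receives exactly one link, while a word on the "e" side may be chosen by several "f" words or by none. That asymmetry is why the choice of direction matters for a single alignment run, and why running both directions and combining the results is a common way to recover many-to-many links.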

------------------------------

Message: 2
Date: Thu, 12 Dec 2013 20:51:41 +0530
From: Pranjal Das <pranjal4456@gmail.com>
Subject: [Moses-support] Regarding Bleu Score
To: moses-support <moses-support@mit.edu>
Message-ID:
<CAAGh44yO=6MK8dioOEF-W8U7xJnNkg=j6i45m5uXVtQGpM14kA@mail.gmail.com>
Content-Type: text/plain; charset="iso-8859-1"

Hi all,
While doing Bengali-to-English translation I got a BLEU score of 7.02,
and doing English-to-Bengali I got 4.7.

Why is the difference so large when I am using the same parallel corpus?


*Pranjal Das*
Department of Information Technology,
Institute of Science and Technology,
Gauhati University,Guwahati,Assam
Phone- +91-8399879454

------------------------------

Message: 3
Date: Thu, 12 Dec 2013 16:28:22 +0100
From: Prasanth K <prasanthk.ms09@gmail.com>
Subject: Re: [Moses-support] Regarding Bleu Score
To: Pranjal Das <pranjal4456@gmail.com>
Cc: moses-support <moses-support@mit.edu>
Message-ID:
<CA+n+9-jJojqGON9Nshz_jWaSSU4jUbYoUPTH0AAxCQmRp6O+Bw@mail.gmail.com>
Content-Type: text/plain; charset="iso-8859-1"

Hi Pranjal,

It's not uncommon to observe such differences when changing the direction of
translation. Translation from English to Bengali is relatively harder as
Bengali is morphologically rich, making it difficult for the correct
surface forms to be generated. Given that BLEU is a pattern matching
algorithm comparing surface forms, the drop in the score could be partly
attributed to not being able to generate the correct surface forms.
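
To make the surface-form point concrete, here is a minimal sketch of the clipped n-gram precision that BLEU is built on. It is only one ingredient of the metric (full BLEU combines several n-gram orders and applies a brevity penalty), and the function names and example sentences are assumptions chosen for illustration, not output from any Moses script.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def clipped_precision(hypothesis, reference, n):
    """Fraction of hypothesis n-grams found in the reference, with each
    n-gram credited at most as often as it occurs in the reference."""
    hyp_counts = Counter(ngrams(hypothesis.split(), n))
    ref_counts = Counter(ngrams(reference.split(), n))
    matched = sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
    total = sum(hyp_counts.values())
    return matched / total if total else 0.0


# Hypothetical sentences: the hypothesis differs only in one inflection.
reference = "the cats sat on the mat"
hypothesis = "the cat sat on the mat"
p1 = clipped_precision(hypothesis, reference, 1)
p2 = clipped_precision(hypothesis, reference, 2)
print(p1, p2)
```

A single wrong inflection ("cat" for "cats") costs one unigram but two bigrams, because every n-gram that contains the wrong form fails to match. In a morphologically rich target language such near misses are more frequent, so the penalty accumulates.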

You can look at the EuroMatrix, where similar patterns can be observed.
Translation from Finnish->English gives better results than the other way
around.
http://www.statmt.org/matrix/

Prasanth



--
"Theories have four stages of acceptance. i) this is worthless nonsense;
ii) this is an interesting, but perverse, point of view, iii) this is true,
but quite unimportant; iv) I always said so."

--- J.B.S. Haldane

------------------------------

Message: 4
Date: Thu, 12 Dec 2013 21:01:47 +0530
From: Pranjal Das <pranjal4456@gmail.com>
Subject: Re: [Moses-support] Regarding Bleu Score
To: Prasanth K <prasanthk.ms09@gmail.com>
Cc: moses-support <moses-support@mit.edu>
Message-ID:
<CAAGh44yQ-kx=34Jius8cEZb9_zKMB9gowY-bhnNhrCePHn48zA@mail.gmail.com>
Content-Type: text/plain; charset="iso-8859-1"

Thank you, Prasanth. But why am I getting such a low BLEU score? Actually,
I have a very small corpus, about 2500 sentences. Is it because of that?

*Pranjal Das*
Department of Information Technology,
Institute of Science and Technology,
Gauhati University,Guwahati,Assam
Phone- +91-8399879454



------------------------------

_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


End of Moses-support Digest, Vol 86, Issue 37
*********************************************
