Send Moses-support mailing list submissions to
moses-support@mit.edu
To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
moses-support-request@mit.edu
You can reach the person managing the list at
moses-support-owner@mit.edu
When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."
Today's Topics:
1. Re: problem in translation (Rico Sennrich)
2. Re: problem in translation (Leusch, Gregor)
3. Re: Major bug found in Moses (Read, James C)
4. Re: Major bug found in Moses (Read, James C)
----------------------------------------------------------------------
Message: 1
Date: Fri, 19 Jun 2015 09:27:04 +0000 (UTC)
From: Rico Sennrich <rico.sennrich@gmx.ch>
Subject: Re: [Moses-support] problem in translation
To: moses-support@mit.edu
Message-ID: <loom.20150619T112155-748@post.gmane.org>
Content-Type: text/plain; charset=utf-8
fatma elzahraa Eltaher <fatmaeltaher@...> writes:
>
> Dears,
> I have a problem in translation. After building the Moses model, I tried to
> test it with a word, but the output was the same word.
> I do not know where the problem is; could you help me?
> Kindly find the attached picture.
>
>
>
> thank you,
hello Fatma,
I'd check if your input words are in your phrase table, and if they're
correctly aligned to English words. I don't know how you trained your model,
but the words could be unknown because you have too little training data, or
because you mixed up the languages in the training corpora. Another
possibility is that you have sentences in your training data that are Arabic
on both sides of your parallel corpus. A look at the phrase table should help
you find out.
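A quick way to check is to look the word up on the source side of the phrase
table directly. For example (a minimal sketch: it assumes a standard gzipped
Moses phrase table with " ||| "-separated fields, and the file name is just a
placeholder for your own path):

    # check_phrase_table.py -- does a word occur on the source side of a
    # gzipped Moses phrase table, and which target phrases does it map to?
    import gzip
    import sys

    table, word = sys.argv[1], sys.argv[2]  # e.g. phrase-table.gz and a test word
    with gzip.open(table, "rt", encoding="utf-8") as f:
        for line in f:
            source, target = line.split(" ||| ")[:2]
            if word in source.split():
                print(source, "->", target)

If the word never appears, it was unknown at training time, and the decoder
will simply copy it through unchanged, which matches the symptom you describe.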
------------------------------
Message: 2
Date: Fri, 19 Jun 2015 10:26:06 +0000
From: "Leusch, Gregor" <gleusch@ebay.com>
Subject: Re: [Moses-support] problem in translation
To: "moses-support@mit.edu" <moses-support@mit.edu>
Message-ID: <D1A9B9BC.7B9F%gleusch@ebay.com>
Content-Type: text/plain; charset="utf-8"
Hi Fatma,
another frequent problem with "real-world" Arabic script (or, basically,
with anything that is not just "plain ASCII") is that the text may
contain invisible or unexpected Unicode characters, such as right-to-left
markers, non-ASCII spaces, ligatures, and so on.
Token matching within Moses happens at the "byte string" level, not at a
"visual" level, so any such characters left in, either during training or
during translation, may prevent phrase table entries from matching. The
simplest way to check whether this is happening is to find the
corresponding string in your (preprocessed) training data, the phrase
table, and your input, and to compare them at the level of Unicode code points.
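One way to do that comparison is to print the code points directly, for
instance with a small script like this (a generic sketch, not part of Moses):

    # show_codepoints.py -- print the code point and name of every character,
    # so invisible marks (RLM, NBSP, zero-width joiners, ...) become visible
    import sys
    import unicodedata

    for ch in sys.argv[1]:
        print("U+%04X %s %r" % (ord(ch), unicodedata.name(ch, "<unnamed>"), ch))

Running it on the same phrase as it appears in your training data and in your
input shows immediately whether, say, a U+200F RIGHT-TO-LEFT MARK or a U+00A0
NO-BREAK SPACE differs between the two.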
Best,
Gregor
-----Original Message-----
From: Rico Sennrich <rico.sennrich@gmx.ch>
Date: Friday 19 June 2015 11:27
To: "moses-support@mit.edu" <moses-support@mit.edu>
Subject: Re: [Moses-support] problem in translation
>fatma elzahraa Eltaher <fatmaeltaher@...> writes:
>
>>
>> Dears,
>> I have a problem in translation. After building the Moses model, I tried to
>> test it with a word, but the output was the same word.
>> I do not know where the problem is; could you help me?
>> Kindly find the attached picture.
>>
>>
>>
>> thank you,
>
>hello Fatma,
>
>I'd check if your input words are in your phrase table, and if they're
>correctly aligned to English words. I don't know how you trained your
>model,
>but the words could be unknown because you have too little training data, or
>because you mixed up the languages in the training corpora. Another
>possibility is that you have sentences in your training data that are Arabic
>on both sides of your parallel corpus. A look at the phrase table should
>help you find out.
>
------------------------------
Message: 3
Date: Fri, 19 Jun 2015 13:23:11 +0000
From: "Read, James C" <jcread@essex.ac.uk>
Subject: Re: [Moses-support] Major bug found in Moses
To: amittai axelrod <amittai@umiacs.umd.edu>, Hieu Hoang
<hieuhoang@gmail.com>, Kenneth Heafield <moses@kheafield.com>,
"moses-support@mit.edu" <moses-support@mit.edu>
Cc: "Arnold, Doug" <doug@essex.ac.uk>
Message-ID:
<DB3PR06MB0713A3A8849ED7408AB75B4685A40@DB3PR06MB0713.eurprd06.prod.outlook.com>
Content-Type: text/plain; charset="iso-8859-1"
I'm sorry, but I disagree entirely. When the purpose of the research is to test and improve the performance of the TM in isolation, a baseline with an LM is entirely inappropriate. You might as well insist that I include a rule-based baseline system for the sake of completeness.
James
________________________________________
From: amittai axelrod <amittai@umiacs.umd.edu>
Sent: Wednesday, June 17, 2015 9:03 PM
To: Read, James C; Hieu Hoang; Kenneth Heafield; moses-support@mit.edu
Cc: Arnold, Doug
Subject: Re: [Moses-support] Major bug found in Moses
this is a little hard to follow. "naturally" dropping the LM from the
equation makes the system worse, but "surprisingly" filtering out
suboptimal phrase pairs from the search space makes the system better?
it is not clear what your intuition derives from, though your faith in
it is astonishing.
regarding expectations -- i would expect publishable results to include
a comparison against a standard baseline. the comparison might not be
fair to your proposed system -- but that's not the baseline's fault!
as you are proposing a brand new translation paradigm on the grounds
that the current state of the art is broken, it is incumbent upon
you, the proponent, to show that your method works better than the
current standard. that's how science works.
you can have another baseline in there with no LM if you like, and say
you're isolating some small part of the system, and you can decide not to
tune your hypothesis system, but all baselines have to be tuned.
~amittai
On 6/17/15 13:46, Read, James C wrote:
> Please note that in order for the baseline to be meaningful it has to also use no LM. So, naturally, the scores are lower than those of the baselines you are referring to.
>
> Regarding expectations: are you seriously suggesting that we would expect the translation model to be incapable of finding higher-scoring translations when not filtering out less likely phrase pairs? How high exactly would that rank on your list of desirable TM qualities?
>
> James
>
> ________________________________________
> From: amittai axelrod <amittai@umiacs.umd.edu>
> Sent: Wednesday, June 17, 2015 8:20 PM
> To: Read, James C; Hieu Hoang; Kenneth Heafield; moses-support@mit.edu
> Cc: Arnold, Doug
> Subject: Re: [Moses-support] Major bug found in Moses
>
> hi --
>
> you might not be aware, but your emails sound almost belligerently
> confrontational. i can see how you would be frustrated, but starting a
> conversation with "i have found a major bug" and then repeatedly saying
> that "clearly" everything is broken -- that may not be the best way to
> convince the few hundred people on the mailing list of the soundness of
> your approach.
>
> also, your argument could easily be misinterpreted as "this behavior is
> unexpected to me, ergo this is unexpected behavior", and that will
> unfortunately bias the listener against you, as that is the preferred
> argument structure of conspiracy theorists.
>
> at any rate, "the system" is designed to take a large number of phrase
> pairs and model scores and cobble them together into a translation. it does
> do that. it appears that you have identified a different way of doing
> that cobbling-together, one that uses far fewer models -- so far so good!
>
> however, from reading your paper, it seems that your baseline is
> completely unoptimized, so performance gains against it may not show up
> in the real world. as specific examples, Table 1 in your paper shows
> that your baseline French-English system score is 11.36, Spanish-English
> is 7.16, and German-English is 6.70 BLEU. if you compare those baselines
> against published results in those languages from the previous few
> years, you will see that those scores are well off the mark. your
> position will be helped by showing results against a stronger, yet still
> basic, baseline.
>
> what happens if you compare your approach against a vanilla use of the
> Moses pipeline [this includes tuning]?
>
> cheers,
> ~amittai
>
>
>
> On 6/17/15 12:45, Read, James C wrote:
>> Doesn't look like the LM is contributing all that much then, does it?
>>
>> James
>>
>> ________________________________________
>> From: moses-support-bounces@mit.edu <moses-support-bounces@mit.edu> on behalf of Hieu Hoang <hieuhoang@gmail.com>
>> Sent: Wednesday, June 17, 2015 7:35 PM
>> To: Kenneth Heafield; moses-support@mit.edu
>> Subject: Re: [Moses-support] Major bug found in Moses
>>
>> On 17/06/2015 20:13, Kenneth Heafield wrote:
>>> I'll bite.
>>>
>>> The moses.ini files ship with bogus feature weights. One is required to
>>> tune the system to discover good weights for it. You did not
>>> tune. The results of an untuned system are meaningless.
>>>
>>> So for example if the feature weights are all zeros, then the scores are
>>> all zero. The system will arbitrarily pick some awful translation from
>>> a large space of translations.
>>>
>>> The filter looks at one feature p(target | source). So now you've
>>> constrained the awful untuned model to a slightly better region of the
>>> search space.
>>>
>>> In other words, all you've done is a poor approximation to manually
>>> setting the weight to 1.0 on p(target | source) and the rest to 0.
>>>
>>> The problem isn't that you are running without a language model (though
>>> we generally do not care what happens without one). The problem is that
>>> you did not tune the feature weights.
>>>
>>> Moreover, as Marcin is pointing out, I wouldn't necessarily expect
>>> tuning to work without an LM.
>> Tuning does work without an LM. The results aren't half bad. fr-en
>> europarl (pb):
>> with LM: 22.84
>> retuned without LM: 18.33
>>>
>>> On 06/17/15 11:56, Read, James C wrote:
>>>> Actually, the approximation I expect is:
>>>>
>>>> p(e|f) = p(f|e)
>>>>
>>>> Why would you expect this to give poor results if the TM is well trained? Surely the results of my filtering experiments prove otherwise.
>>>>
>>>> James
>>>>
>>>> ________________________________________
>>>> From: moses-support-bounces@mit.edu <moses-support-bounces@mit.edu> on behalf of Rico Sennrich <rico.sennrich@gmx.ch>
>>>> Sent: Wednesday, June 17, 2015 5:32 PM
>>>> To: moses-support@mit.edu
>>>> Subject: Re: [Moses-support] Major bug found in Moses
>>>>
>>>> Read, James C <jcread@...> writes:
>>>>
>>>>> I have been unable to find a logical explanation for this behaviour other
>>>>> than to conclude that there must be some kind of bug in Moses which causes a
>>>>> TM-only run of Moses to perform poorly in finding the most likely
>>>>> translations according to the TM when there are less likely phrase pairs
>>>>> included in the race.
>>>> I may have overlooked something, but you seem to have removed the language
>>>> model from your config, and used default weights. your default model will
>>>> thus (roughly) implement the following model:
>>>>
>>>> p(e|f) = p(e|f)*p(f|e)
>>>>
>>>> which is obviously wrong, and will give you poor results. This is not a bug
>>>> in the code, but a poor choice of models and weights. Standard steps in SMT
>>>> (like tuning the model weights on a development set, and including a
>>>> language model) will give you the desired results.
>>>>
>>
>> --
>> Hieu Hoang
>> Researcher
>> New York University, Abu Dhabi
>> http://www.hoang.co.uk/hieu
>>
>
------------------------------
Message: 4
Date: Fri, 19 Jun 2015 13:24:22 +0000
From: "Read, James C" <jcread@essex.ac.uk>
Subject: Re: [Moses-support] Major bug found in Moses
To: Hieu Hoang <hieuhoang@gmail.com>
Cc: "moses-support@mit.edu" <moses-support@mit.edu>, "Arnold, Doug"
<doug@essex.ac.uk>
Message-ID:
<DB3PR06MB0713F41623D946B8CA163A0585A40@DB3PR06MB0713.eurprd06.prod.outlook.com>
Content-Type: text/plain; charset="iso-8859-1"
I quote:
"the decoder's job is NOT to find the high quality translation"
Did you REALLY just say that?
James
________________________________
From: Hieu Hoang <hieuhoang@gmail.com>
Sent: Wednesday, June 17, 2015 9:00 PM
To: Read, James C
Cc: Kenneth Heafield; moses-support@mit.edu; Arnold, Doug
Subject: Re: [Moses-support] Major bug found in Moses
the decoder's job is NOT to find the high quality translation (as measured by BLEU). Its job is to find translations with a high model score.
you need tuning to make sure that high translation quality correlates with high model score. If you don't tune, it's pot luck what quality you get.
You should tune with the features you use.
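To make "model score" concrete: the decoder ranks hypotheses by a weighted sum
of log feature values, so with untuned (or zero) weights the ranking is
arbitrary. A toy illustration (the feature values and weights below are
invented, not taken from any real system):

    # toy log-linear score: score(e|f) = sum_i w_i * log(feature_i)
    import math

    features = {"p(e|f)": 0.4, "p(f|e)": 0.2, "lm": 0.01}  # made-up values
    weights  = {"p(e|f)": 0.3, "p(f|e)": 0.3, "lm": 0.4}   # what tuning sets

    score = sum(weights[k] * math.log(features[k]) for k in features)
    print(score)
    # with all weights set to zero, every hypothesis scores 0.0 and the
    # decoder's choice among them is arbitrary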
Hieu Hoang
Researcher
New York University, Abu Dhabi
http://www.hoang.co.uk/hieu
On 17 June 2015 at 21:52, Read, James C <jcread@essex.ac.uk> wrote:
The analogy doesn't seem to be helping me understand just how exactly it is a desirable quality of a TM to
a) completely break down if no LM is used (thank you for showing that such is not always the case)
b) be dependent on a tuning step to help it find the higher-scoring translations
What you seem to be saying, essentially, is that the TM cannot find the higher-scoring translations because I didn't pretune the system to do so. And I am supposed to accept that this is a desirable quality of a system whose very job is to find the higher-scoring translations.
Further, I am still unclear on which features you require a system to be tuned. At the very least, it seems that I have discovered the selection process that tuning makes up for in some unspecified and altogether opaque way.
James
________________________________________
From: Hieu Hoang <hieuhoang@gmail.com>
Sent: Wednesday, June 17, 2015 8:34 PM
To: Read, James C; Kenneth Heafield; moses-support@mit.edu
Cc: Arnold, Doug
Subject: Re: [Moses-support] Major bug found in Moses
4 BLEU is nothing to sniff at :) I was answering Ken's tangential assertion
that LMs are needed for tuning.
I have some sympathy for you. You're looking at ways to improve
translation by reducing the search space. I've bashed my head against
this wall for a while as well without much success.
However, as everyone is telling you, you haven't understood the role of
tuning. Without tuning, you're pointing your lab rat at some random part
of the search space, instead of away from the furry animal with whiskers
and towards the yellow cheesy thing.
On 17/06/2015 20:45, Read, James C wrote:
> Doesn't look like the LM is contributing all that much then, does it?
>
> James
>
> ________________________________________
> From: moses-support-bounces@mit.edu <moses-support-bounces@mit.edu> on behalf of Hieu Hoang <hieuhoang@gmail.com>
> Sent: Wednesday, June 17, 2015 7:35 PM
> To: Kenneth Heafield; moses-support@mit.edu
> Subject: Re: [Moses-support] Major bug found in Moses
>
> On 17/06/2015 20:13, Kenneth Heafield wrote:
>> I'll bite.
>>
>> The moses.ini files ship with bogus feature weights. One is required to
>> tune the system to discover good weights for it. You did not
>> tune. The results of an untuned system are meaningless.
>>
>> So for example if the feature weights are all zeros, then the scores are
>> all zero. The system will arbitrarily pick some awful translation from
>> a large space of translations.
>>
>> The filter looks at one feature p(target | source). So now you've
>> constrained the awful untuned model to a slightly better region of the
>> search space.
>>
>> In other words, all you've done is a poor approximation to manually
>> setting the weight to 1.0 on p(target | source) and the rest to 0.
>>
>> The problem isn't that you are running without a language model (though
>> we generally do not care what happens without one). The problem is that
>> you did not tune the feature weights.
>>
>> Moreover, as Marcin is pointing out, I wouldn't necessarily expect
>> tuning to work without an LM.
> Tuning does work without an LM. The results aren't half bad. fr-en
> europarl (pb):
> with LM: 22.84
> retuned without LM: 18.33
>> On 06/17/15 11:56, Read, James C wrote:
>>> Actually, the approximation I expect is:
>>>
>>> p(e|f) = p(f|e)
>>>
>>> Why would you expect this to give poor results if the TM is well trained? Surely the results of my filtering experiments prove otherwise.
>>>
>>> James
>>>
>>> ________________________________________
>>> From: moses-support-bounces@mit.edu <moses-support-bounces@mit.edu> on behalf of Rico Sennrich <rico.sennrich@gmx.ch>
>>> Sent: Wednesday, June 17, 2015 5:32 PM
>>> To: moses-support@mit.edu
>>> Subject: Re: [Moses-support] Major bug found in Moses
>>>
>>> Read, James C <jcread@...> writes:
>>>
>>>> I have been unable to find a logical explanation for this behaviour other
>>>> than to conclude that there must be some kind of bug in Moses which causes a
>>>> TM-only run of Moses to perform poorly in finding the most likely
>>>> translations according to the TM when there are less likely phrase pairs
>>>> included in the race.
>>> I may have overlooked something, but you seem to have removed the language
>>> model from your config, and used default weights. your default model will
>>> thus (roughly) implement the following model:
>>>
>>> p(e|f) = p(e|f)*p(f|e)
>>>
>>> which is obviously wrong, and will give you poor results. This is not a bug
>>> in the code, but a poor choice of models and weights. Standard steps in SMT
>>> (like tuning the model weights on a development set, and including a
>>> language model) will give you the desired results.
>>>
> --
> Hieu Hoang
> Researcher
> New York University, Abu Dhabi
> http://www.hoang.co.uk/hieu
>
--
Hieu Hoang
Researcher
New York University, Abu Dhabi
http://www.hoang.co.uk/hieu
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mailman.mit.edu/mailman/private/moses-support/attachments/20150619/f3f102c4/attachment.htm
------------------------------
_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support
End of Moses-support Digest, Vol 104, Issue 49
**********************************************