Send Moses-support mailing list submissions to
moses-support@mit.edu
To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
moses-support-request@mit.edu
You can reach the person managing the list at
moses-support-owner@mit.edu
When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."
Today's Topics:
1. Re: Major bug found in Moses (Ondrej Bojar)
2. Re: Major bug found in Moses (Marcin Junczys-Dowmunt)
3. Re: Major bug found in Moses (Rico Sennrich)
----------------------------------------------------------------------
Message: 1
Date: Wed, 17 Jun 2015 16:23:31 +0200
From: Ondrej Bojar <bojar@ufal.mff.cuni.cz>
Subject: Re: [Moses-support] Major bug found in Moses
To: "Read, James C" <jcread@essex.ac.uk>, Marcin Junczys-Dowmunt
<junczys@amu.edu.pl>
Cc: "Moses-support@mit.edu" <moses-support@mit.edu>, "Arnold, Doug"
<doug@essex.ac.uk>
Message-ID: <9bd40891-797f-4b09-b1ef-4a1c3eeec86f@email.android.com>
Content-Type: text/plain; charset=UTF-8
Hi,
BLEU scores don't mean much, unless you know what the translations look like. Marcin's explanation sounds very plausible.
How did you set weights in your experiment? And were they fixed for the two contrastive runs?
Cheers, O.
On June 17, 2015 4:01:26 PM CEST, "Read, James C" <jcread@essex.ac.uk> wrote:
>Read here for a table of results for 40 language pairs:
>
>
>http://privatewww.essex.ac.uk/~jcread/paper.pdf
>
>
>Would you honestly expect such huge differences in BLEU score?
>Honestly!?
>
>
>James
>
>
>________________________________
>From: Read, James C
>Sent: Wednesday, June 17, 2015 4:56 PM
>To: Marcin Junczys-Dowmunt
>Cc: Moses-support@mit.edu; Arnold, Doug
>Subject: Re: [Moses-support] Major bug found in Moses
>
>
>You would expect an improvement of 37 BLEU points?
>
>
>James
>
>
>________________________________
>From: Marcin Junczys-Dowmunt <junczys@amu.edu.pl>
>Sent: Wednesday, June 17, 2015 4:32 PM
>To: Read, James C
>Cc: Moses-support@mit.edu; Arnold, Doug
>Subject: Re: [Moses-support] Major bug found in Moses
>
>
>Hi James,
>
>there are many more factors involved than just probability, for
>instance word penalties, phrase penalties etc. To be able to validate
>your own claim you would need to set weights for all those
>non-probabilities to zero. Otherwise there is no hope that moses will
>produce anything similar to the most probable translation. And based on
>that there is no surprise that there may be different translations. A
>pruned phrase table will produce naturally less noise, so I would say
>the behaviour you describe is quite exactly what I would expect to
>happen.
>
>Best,
>
>Marcin
>
>On 2015-06-17 15:26, Read, James C wrote:
>
>Hi all,
>
>
>
>I tried unsuccessfully to publish experiments showing this bug in Moses
>behaviour. As a result I have lost interest in attempting to have my
>work published. Nonetheless I think you all should be aware of an
>anomaly in Moses' behaviour which I have thoroughly exposed and should
>be easy enough for you to reproduce.
>
>
>
>As I understand it the TM logic of Moses should select the most likely
>translations according to the TM. I would therefore expect a run of
>Moses with no LM to find sentences which are the most likely or at
>least close to the most likely according to the TM.
>
>
>
>To test this behaviour I performed two runs of Moses. One with an
>unfiltered phrase table the other with a filtered phrase table which
>left only the most likely phrase pair for each source language phrase.
>The results were truly startling. I observed huge differences in BLEU
>score. The filtered phrase tables produced much higher BLEU scores. The
>beam size used was the default width of 100. I would not have been
>surprised if the differences in BLEU scores were minimal but they were
>quite high.
>
>
>
>I have been unable to find a logical explanation for this behaviour
>other than to conclude that there must be some kind of bug in Moses
>which causes a TM only run of Moses to perform poorly in finding the
>most likely translations according to the TM when there are less likely
>phrase pairs included in the race.
>
>
>
>I hope this information will be useful to the Moses community and that
>the cause of the behaviour can be found and rectified.
>
>
>
>James
>
>
>_______________________________________________
>Moses-support mailing list
>Moses-support@mit.edu<mailto:Moses-support@mit.edu>
>http://mailman.mit.edu/mailman/listinfo/moses-support
--
Ondrej Bojar (mailto:obo@cuni.cz / bojar@ufal.mff.cuni.cz)
http://www.cuni.cz/~obo
------------------------------
Message: 2
Date: Wed, 17 Jun 2015 16:29:23 +0200
From: Marcin Junczys-Dowmunt <junczys@amu.edu.pl>
Subject: Re: [Moses-support] Major bug found in Moses
To: "Read, James C" <jcread@essex.ac.uk>
Cc: moses-support@mit.edu, "Arnold, Doug" <doug@essex.ac.uk>
Message-ID: <2cb6b12f00ffe1c69ade04648ab965d5@amu.edu.pl>
Content-Type: text/plain; charset="utf-8"
To paint you a picture:
Imagine you have a rat in a labyrinth (the labyrinth is the TM and the
search space). That rat is quite good at finding the center of that
labyrinth. Now you somehow disable that rat's sense of smell, sense of
direction, and long-term short-term memory (that's the LM). Can you
expect the rat to find the center? Or will it just tumble around,
bumping into walls and not find anything? That's what you did to the
decoder when disabling the LM.
Now you prune the TM. In the labyrinth that's like closing all the doors
that would lead the rat away from the center. There are still a few
corridors left, but they all point into the general direction of the
point where the rat is supposed to go. Although it may never quite reach
it. Now you put that same handicapped rat into the labyrinth where all
ways lead more or less to the center. Are you really surprised that the
clueless rat finds the center nearly every time now?
That's what happened. It's not a bug. The LM is probably the strongest
feature in an MT system. If you take that away you see what happens.
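The effect of pruning on the search space can be illustrated with a toy sketch. Everything in it is hypothetical: the phrase table, its probabilities, and the helper names `prune_top1` and `candidates` are made up for illustration and are not Moses code. Reordering is ignored, so only monotone word-by-word translations are counted.

```python
from itertools import product

# Hypothetical toy phrase table: source word -> list of (target word, p(e|f)).
PHRASE_TABLE = {
    "das": [("the", 0.6), ("that", 0.3), ("this", 0.1)],
    "haus": [("house", 0.7), ("home", 0.2), ("building", 0.1)],
    "ist": [("is", 0.8), ("be", 0.1), ("has", 0.1)],
    "klein": [("small", 0.6), ("little", 0.3), ("minor", 0.1)],
}


def prune_top1(table):
    """Keep only the most likely target option for each source entry."""
    return {src: [max(opts, key=lambda o: o[1])] for src, opts in table.items()}


def candidates(table, source_words):
    """Enumerate every monotone translation the table allows for the sentence."""
    options = [table[word] for word in source_words]
    return [" ".join(target for target, _ in combo) for combo in product(*options)]


SOURCE = ["das", "haus", "ist", "klein"]
unpruned = candidates(PHRASE_TABLE, SOURCE)
pruned = candidates(prune_top1(PHRASE_TABLE), SOURCE)
print(len(unpruned), len(pruned))
```

With three options for each of four source words, the unpruned table allows 3^4 = 81 candidates, while the pruned table allows exactly one. A decoder with poorly chosen weights can pick any of the 81; with the pruned table it has nothing left to get wrong.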
On 2015-06-17 16:22, Read, James C wrote:
> All I did was break the link to the language model and then perform filtering. How is that a methodological mistake? How else would one test the efficacy of the TM in isolation?
>
> I remain convinced that this is undesirable behaviour and therefore a bug.
>
> James
>
> -------------------------
>
> FROM: Marcin Junczys-Dowmunt <junczys@amu.edu.pl>
> SENT: Wednesday, June 17, 2015 5:12 PM
> TO: Read, James C
> CC: Arnold, Doug; moses-support@mit.edu
> SUBJECT: Re: [Moses-support] Major bug found in Moses
>
> Hi James
>
> No, not at all. I would say that is expected behaviour. It's how search spaces and optimization work. If anything these are methodological mistakes on your side, sorry. You are doing weird things to the decoder and then you are surprised to get weird results from it.
>
> On 2015-06-17 16:07, Read, James C wrote:
>
> So, do we agree that this is undesirable behaviour and therefore a bug?
>
> James
>
> -------------------------
>
> FROM: Marcin Junczys-Dowmunt <junczys@amu.edu.pl>
> SENT: Wednesday, June 17, 2015 5:01 PM
> TO: Read, James C
> SUBJECT: Re: [Moses-support] Major bug found in Moses
>
> As I said. With an unpruned phrase table and a decoder that just optimizes some unreasonable set of weights all bets are off, so if you get very low BLEU scores there, it's not surprising. It's probably jumping around in a very weird search space. With a pruned phrase table you restrict the search space VERY strongly. Nearly everything that will be produced is a half-decent translation. So yes, I can imagine that would happen.
>
> Marcin
>
> On 2015-06-17 15:56, Read, James C wrote:
>
> You would expect an improvement of 37 BLEU points?
>
> James
>
> -------------------------
>
> FROM: Marcin Junczys-Dowmunt <junczys@amu.edu.pl>
> SENT: Wednesday, June 17, 2015 4:32 PM
> TO: Read, James C
> CC: Moses-support@mit.edu; Arnold, Doug
> SUBJECT: Re: [Moses-support] Major bug found in Moses
>
> Hi James,
>
> there are many more factors involved than just probability, for instance word penalties, phrase penalties etc. To be able to validate your own claim you would need to set weights for all those non-probabilities to zero. Otherwise there is no hope that moses will produce anything similar to the most probable translation. And based on that there is no surprise that there may be different translations. A pruned phrase table will produce naturally less noise, so I would say the behaviour you describe is quite exactly what I would expect to happen.
>
> Best,
>
> Marcin
>
> On 2015-06-17 15:26, Read, James C wrote:
>
> Hi all,
>
> I tried unsuccessfully to publish experiments showing this bug in Moses behaviour. As a result I have lost interest in attempting to have my work published. Nonetheless I think you all should be aware of an anomaly in Moses' behaviour which I have thoroughly exposed and should be easy enough for you to reproduce.
>
> As I understand it the TM logic of Moses should select the most likely translations according to the TM. I would therefore expect a run of Moses with no LM to find sentences which are the most likely or at least close to the most likely according to the TM.
>
> To test this behaviour I performed two runs of Moses. One with an unfiltered phrase table the other with a filtered phrase table which left only the most likely phrase pair for each source language phrase. The results were truly startling. I observed huge differences in BLEU score. The filtered phrase tables produced much higher BLEU scores. The beam size used was the default width of 100. I would not have been surprised if the differences in BLEU scores were minimal but they were quite high.
>
> I have been unable to find a logical explanation for this behaviour other than to conclude that there must be some kind of bug in Moses which causes a TM only run of Moses to perform poorly in finding the most likely translations according to the TM when there are less likely phrase pairs included in the race.
>
> I hope this information will be useful to the Moses community and that the cause of the behaviour can be found and rectified.
>
> James
>
------------------------------
Message: 3
Date: Wed, 17 Jun 2015 14:32:13 +0000 (UTC)
From: Rico Sennrich <rico.sennrich@gmx.ch>
Subject: Re: [Moses-support] Major bug found in Moses
To: moses-support@mit.edu
Message-ID: <loom.20150617T162622-973@post.gmane.org>
Content-Type: text/plain; charset=us-ascii
Read, James C <jcread@...> writes:
> I have been unable to find a logical explanation for this behaviour other
> than to conclude that there must be some kind of bug in Moses which causes a
> TM only run of Moses to perform poorly in finding the most likely
> translations according to the TM when there are less likely phrase pairs
> included in the race.
I may have overlooked something, but you seem to have removed the language
model from your config, and used default weights. Your default model will
thus (roughly) implement the following model:
p(e|f) = p(e|f)*p(f|e)
which is obviously wrong, and will give you poor results. This is not a bug
in the code, but a poor choice of models and weights. Standard steps in SMT
(like tuning the model weights on a development set, and including a
language model) will give you the desired results.
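
To make the role of the weights concrete, here is a minimal sketch of log-linear scoring over one source phrase. It is not Moses code: the candidates, their probabilities, the feature names, and all three weight settings are hypothetical values chosen for illustration. It shows that the best-scoring option depends on the weights, so untuned weights need not select the option with the highest p(e|f).

```python
import math

# Hypothetical candidates for one source phrase: (target, p(e|f), p(f|e)).
CANDIDATES = [
    ("house", 0.70, 0.20),
    ("home", 0.25, 0.80),
    ("the big house", 0.10, 0.90),
]


def features(target, p_ef, p_fe):
    """Feature values for one candidate (word penalty = minus target length)."""
    return {
        "log_p_ef": math.log(p_ef),
        "log_p_fe": math.log(p_fe),
        "word_penalty": -len(target.split()),
    }


def score(weights, feats):
    """Log-linear model score: weighted sum of the feature values."""
    return sum(weights[name] * value for name, value in feats.items())


def best(weights):
    """Return the target string with the highest model score."""
    return max(CANDIDATES, key=lambda c: score(weights, features(*c)))[0]


# Three hypothetical weight settings (assumed values, not Moses defaults).
ONLY_DIRECT = {"log_p_ef": 1.0, "log_p_fe": 0.0, "word_penalty": 0.0}
PRODUCT = {"log_p_ef": 1.0, "log_p_fe": 1.0, "word_penalty": 0.0}
WITH_PENALTY = {"log_p_ef": 0.2, "log_p_fe": 0.2, "word_penalty": -1.0}

for name, w in [("p(e|f) only", ONLY_DIRECT),
                ("p(e|f)*p(f|e)", PRODUCT),
                ("with word penalty", WITH_PENALTY)]:
    print(name, "->", best(w))
```

Under these made-up numbers each weight setting picks a different candidate: "house" when only p(e|f) counts, "home" for the product of both translation probabilities, and "the big house" once a length-favouring word penalty weight is added. Only the first setting corresponds to "the most likely translation according to the TM" in the sense used earlier in the thread.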
------------------------------
_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support
End of Moses-support Digest, Vol 104, Issue 30
**********************************************