Moses-support Digest, Vol 109, Issue 41

Send Moses-support mailing list submissions to
moses-support@mit.edu

To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
moses-support-request@mit.edu

You can reach the person managing the list at
moses-support-owner@mit.edu

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."


Today's Topics:

1. Re: baseline-system has very low BLEU-Score (Raphael Hoeps)
2. Re: baseline-system has very low BLEU-Score (Rico Sennrich)


----------------------------------------------------------------------

Message: 1
Date: Wed, 18 Nov 2015 16:12:36 +0100
From: Raphael Hoeps <raphael.hoeps@gmx.net>
Subject: Re: [Moses-support] baseline-system has very low BLEU-Score
To: moses-support@mit.edu
Message-ID: <564C9564.9060607@gmx.net>
Content-Type: text/plain; charset=windows-1252; format=flowed

Hi,
I have really tried to find my mistake and checked every step, but to
no avail. Here is what I have:

- As I want to translate from English into German, I trained a German
language model. It seems to work quite well, giving me good (low)
perplexities for some German sentences I wrote as input.
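
For reference, this kind of perplexity check can be done with KenLM's
query tool, roughly like this (a sketch; the file names are
placeholders following the baseline tutorial, adapted for German):

  # Query the binarized German LM with some test sentences; the tail
  # of the output reports perplexity with and without OOV words.
  ~/mosesdecoder/bin/query blm.de < test-sentences.de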

- I trained the translation model (English to German) and tuned it on a
development corpus (500 sentences).
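
The tuning step from the tutorial looks roughly like this (a sketch;
file names are placeholders, with dev.true.en/dev.true.de standing for
the truecased development source and reference):

  # Tune the feature weights with MERT against the dev set;
  # --mertdir points at the directory containing the mert binaries.
  ~/mosesdecoder/scripts/training/mert-moses.pl \
    dev.true.en dev.true.de \
    ~/mosesdecoder/bin/moses train/model/moses.ini \
    --mertdir ~/mosesdecoder/bin/ &> mert.out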

- Then I translated the test set, and as far as I can tell from looking
at it, the translation is not that bad! Many short sentences in the
test set are translated absolutely correctly (like "What will they do?"
or "I really don't know."). Other sentences are rough but often quite
understandable.
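
The decoding step, following the tutorial, looks roughly like this
(again, file names are placeholders for the actual ones):

  # Translate the truecased English test set with the tuned moses.ini;
  # decoder messages go to decode.log.
  ~/mosesdecoder/bin/moses -f mert-work/moses.ini \
    < test.true.en > translated.de 2> decode.log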

- When I calculate the BLEU score of the translation against the
reference with multi-bleu.perl, I get the poor result of 3.76.
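
The scoring call is along these lines (a sketch; file names are
placeholders). Note that the German reference is passed as an argument
and the hypothesis on stdin; accidentally passing the English source as
the reference would produce exactly this kind of very low score:

  # Score the German hypothesis against the German reference
  # (-lc makes the comparison case-insensitive).
  ~/mosesdecoder/scripts/generic/multi-bleu.perl -lc reference.de < translated.de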

- When I translate with the untuned system, the score is 3.56, and the
translated sentences look very similar to the output of the tuned
system.

What really confuses me is that I get such a low score for a translated
document that doesn't read like nonsense at all. Can this be normal? If
it helps, I can send you the English test set and the German
translation.

Do you have any tips on how to find my mistake?

Thanks a lot,
Raphael






On 18.11.2015 at 14:56, Rico Sennrich wrote:
> Hello Raphael,
>
> I suggest that you check if you mixed up the languages somewhere, and
> check if your translation output is actually English.
>
> 3.76 BLEU is possible to achieve without translation (because names and
> some function words are the same between English and German), and it's
> possible that you used the wrong reference file when measuring BLEU, or
> that your SMT system is broken in some way and just copies the source
> text to the output.
>
> best wishes,
> Rico
>
>
> On 18.11.2015 13:36, Raphael Hoeps wrote:
>> Hi,
>> I'm a Computer Science student from Germany working on an SMT project.
>> I tried to get into the Moses system a little bit and did the baseline
>> tutorial found here:
>> http://www.statmt.org/moses/?n=Moses.Baseline. I stuck to this tutorial
>> but used the German/English corpora.
>>
>> Unfortunately in the end I got a poor BLEU-score of only 3.76:
>> BLEU = 3.76, 24.1/6.2/2.2/0.9 (BP=0.906, ratio=0.910, hyp_len=68049,
>> ref_len=74753)
>> In the tutorial, a score of 23.5 was reported.
>>
>> I think I did everything as shown in the tutorial, except for one
>> thing: in the tuning part I cut the two development corpora down to 500
>> lines (from 2000), because my laptop is quite old and I wanted to speed
>> up the process a little. (It still took my laptop 6 hours.)
>> Do you think that this is the reason for my poor score?
>> Or is it because I used the German/English corpora, so the score can't
>> be compared to the English/French system in the tutorial? Or did I just
>> make a mistake when typing all the commands? Any ideas on how to find
>> this mistake?
>>
>> Thank you very much for your help,
>> Raphael
>>
>> _______________________________________________
>> Moses-support mailing list
>> Moses-support@mit.edu
>> http://mailman.mit.edu/mailman/listinfo/moses-support
>>
> _______________________________________________
> Moses-support mailing list
> Moses-support@mit.edu
> http://mailman.mit.edu/mailman/listinfo/moses-support



------------------------------

Message: 2
Date: Wed, 18 Nov 2015 16:17:42 +0000
From: Rico Sennrich <rico.sennrich@gmx.ch>
Subject: Re: [Moses-support] baseline-system has very low BLEU-Score
To: moses-support@mit.edu
Message-ID: <564CA4A6.50807@gmx.ch>
Content-Type: text/plain; charset=windows-1252; format=flowed

Hi Raphael,

just to reiterate: are you sure you used the correct (German) reference
file to score your (German) translation output, and didn't mistakenly
use the (English) input as reference?
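
A quick way to check both possibilities from the shell (a sketch; the
file names below are placeholders for your actual files):

  # Is the output actually German, or a copy of the English input?
  head -3 translated.de
  # Count hypothesis lines identical to the source; a large number
  # suggests the system is just copying the input through.
  paste translated.de test.en | awk -F'\t' '$1 == $2' | wc -l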

best wishes,
Rico

On 18.11.2015 15:12, Raphael Hoeps wrote:
> - When I calculate the BLEU score of the translation against the
> reference with multi-bleu.perl, I get the poor result of 3.76.
>
> Do you have any tips on how to find my mistake?
>
> Thanks a lot,
> Raphael
>
>
>
>
>
>
On 18.11.2015 at 14:56, Rico Sennrich wrote:
>> Hello Raphael,
>>
>> I suggest that you check if you mixed up the languages somewhere, and
>> check if your translation output is actually English.
>>
>> 3.76 BLEU is possible to achieve without translation (because names and
>> some function words are the same between English and German), and it's
>> possible that you used the wrong reference file when measuring BLEU, or
>> that your SMT system is broken in some way and just copies the source
>> text to the output.
>>
>> best wishes,
>> Rico
>>
>>
>> On 18.11.2015 13:36, Raphael Hoeps wrote:
>>> Hi,
>>> I'm a Computer Science student from Germany working on an SMT project.
>>> I tried to get into the Moses system a little bit and did the baseline
>>> tutorial found here:
>>> http://www.statmt.org/moses/?n=Moses.Baseline. I stuck to this tutorial
>>> but used the German/English corpora.
>>>
>>> Unfortunately in the end I got a poor BLEU-score of only 3.76:
>>> BLEU = 3.76, 24.1/6.2/2.2/0.9 (BP=0.906, ratio=0.910, hyp_len=68049,
>>> ref_len=74753)
>>> In the tutorial, a score of 23.5 was reported.
>>>
>>> I think I did everything as shown in the tutorial, except for one
>>> thing: in the tuning part I cut the two development corpora down to 500
>>> lines (from 2000), because my laptop is quite old and I wanted to speed
>>> up the process a little. (It still took my laptop 6 hours.)
>>> Do you think that this is the reason for my poor score?
>>> Or is it because I used the German/English corpora, so the score can't
>>> be compared to the English/French system in the tutorial? Or did I just
>>> make a mistake when typing all the commands? Any ideas on how to find
>>> this mistake?
>>>
>>> Thank you very much for your help,
>>> Raphael
>>>
>>> _______________________________________________
>>> Moses-support mailing list
>>> Moses-support@mit.edu
>>> http://mailman.mit.edu/mailman/listinfo/moses-support
>>>
>> _______________________________________________
>> Moses-support mailing list
>> Moses-support@mit.edu
>> http://mailman.mit.edu/mailman/listinfo/moses-support
> _______________________________________________
> Moses-support mailing list
> Moses-support@mit.edu
> http://mailman.mit.edu/mailman/listinfo/moses-support
>



------------------------------

_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


End of Moses-support Digest, Vol 109, Issue 41
**********************************************
