Moses-support Digest, Vol 100, Issue 83

Send Moses-support mailing list submissions to
moses-support@mit.edu

To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
moses-support-request@mit.edu

You can reach the person managing the list at
moses-support-owner@mit.edu

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."

Today's Topics:

1. BadDiscountException (fatma elzahraa Eltaher)
2. Re: BadDiscountException (Kenneth Heafield)
3. Number of Unique Hypotheses in the N-best List (Erinç Dikici)
4. Re: Number of Unique Hypotheses in the N-best List (Rico Sennrich)
5. Re: Number of Unique Hypotheses in the N-best List (Matthias Huck)
6. Re: Number of Unique Hypotheses in the N-best List
(Marcin Junczys-Dowmunt)

----------------------------------------------------------------------

Message: 1
Date: Tue, 24 Feb 2015 05:04:35 -0800
From: fatma elzahraa Eltaher <fatmaeltaher@gmail.com>
Subject: [Moses-support] BadDiscountException
To: moses-support@mit.edu
Message-ID:
<CAOW1BbSvZ7+P-D-XZtZMPFYZA4h_jLwNQvFmB+yHM=HAguEmhw@mail.gmail.com>
Content-Type: text/plain; charset="iso-8859-1"

Dears,
I get the following error in LM_toy_train.65.STDERR:
Unigram tokens 25188 types 39
=== 2/5 Calculating and sorting adjusted counts ===
Chain sizes: 1:468 2:322921696 3:605478272 4:968765120 5:1412782592
/home/fatma/Desktop/Folder/mosesdecoder/lm/builder/adjust_counts.cc:50 in
void lm::builder::{anonymous}::StatCollector::CalculateDiscounts(const
lm::builder::DiscountConfig&) threw BadDiscountException because `s.n[j] ==
0'.
Could not calculate Kneser-Ney discounts for 1-grams with adjusted count 4
because we didn't observe any 1-grams with adjusted count 3; Is this small
or artificial data?
How do I fix it?

thank you,

Fatma El-Zahraa El -Taher

Teaching Assistant at Computer & System department

Faculty of Engineering, Azhar University

Email : fatmaeltaher@gmail.com
mobile: +201141600434

------------------------------

Message: 2
Date: Tue, 24 Feb 2015 08:22:41 -0500
From: Kenneth Heafield <moses@kheafield.com>
Subject: Re: [Moses-support] BadDiscountException
To: moses-support@mit.edu
Message-ID: <54EC7B21.20704@kheafield.com>
Content-Type: text/plain; charset=windows-1252

The closed-form estimates for Kneser-Ney are not well-defined on toy or
class-based data. I recommend using more training data. If this is a
class-based model, pass --discount_fallback.

Kenneth
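
To illustrate why the closed-form estimates break down on toy data, here is a minimal Python sketch of the modified Kneser-Ney discount formulas from Chen and Goodman: Y = n1 / (n1 + 2*n2) and D_k = k - (k+1) * Y * n_(k+1) / n_k for k = 1, 2, 3, where n_k is the number of n-grams with adjusted count exactly k. The function name and the input format are assumptions made for this example. This is not the code in lm/builder/adjust_counts.cc, only a simplified illustration of the same kind of check.

```python
from collections import Counter


def modified_kn_discounts(adjusted_counts):
    """Return [D1, D2, D3+] for one n-gram order (hypothetical helper).

    adjusted_counts holds one adjusted count per distinct n-gram.
    """
    # n[k] = number of distinct n-grams whose adjusted count is exactly k.
    n = Counter(adjusted_counts)
    for k in (1, 2, 3):
        if n[k] == 0:
            # n_k is a denominator below, so the estimate is undefined.
            raise ValueError(
                f"no n-grams with adjusted count {k}; discounts are undefined"
            )
    y = n[1] / (n[1] + 2 * n[2])
    return [k - (k + 1) * y * n[k + 1] / n[k] for k in (1, 2, 3)]


# Made-up counts with n1=4, n2=2, n3=1, n4=1.
print(modified_kn_discounts([1, 1, 1, 1, 2, 2, 3, 4]))  # [0.5, 1.25, 1.0]

# Made-up toy counts with no n-gram of adjusted count 3.
try:
    modified_kn_discounts([1, 1, 2, 2, 4])
except ValueError as err:
    print("failed:", err)
```

With only 39 unigram types, as in the log above, it is easy for one of the n_k values to be zero, which is the situation the exception reports.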

On 02/24/2015 08:04 AM, fatma elzahraa Eltaher wrote:
> Dears,
> I get the following error in LM_toy_train.65.STDERR:
> Unigram tokens 25188 types 39
> === 2/5 Calculating and sorting adjusted counts ===
> Chain sizes: 1:468 2:322921696 3:605478272 4:968765120 5:1412782592
> /home/fatma/Desktop/Folder/mosesdecoder/lm/builder/adjust_counts.cc:50
> in void
> lm::builder::{anonymous}::StatCollector::CalculateDiscounts(const
> lm::builder::DiscountConfig&) threw BadDiscountException because `s.n[j]
> == 0'.
> Could not calculate Kneser-Ney discounts for 1-grams with adjusted count
> 4 because we didn't observe any 1-grams with adjusted count 3; Is this
> small or artificial data?
> How do I fix it?
>
>
> thank you,
>
>
>
> Fatma El-Zahraa El -Taher
>
> Teaching Assistant at Computer & System department
>
> Faculty of Engineering, Azhar University
>
> Email : fatmaeltaher@gmail.com <mailto:fatmaeltaher@gmail.com>
> mobile: +201141600434
>
>
>
> _______________________________________________
> Moses-support mailing list
> Moses-support@mit.edu
> http://mailman.mit.edu/mailman/listinfo/moses-support
>

------------------------------

Message: 3
Date: Tue, 24 Feb 2015 16:13:37 +0200
From: Erinç Dikici <erinc.dikici@boun.edu.tr>
Subject: [Moses-support] Number of Unique Hypotheses in the N-best
List
To: "moses-support@mit.edu" <moses-support@mit.edu>
Message-ID:
<CAJ=2YW1R6ae5pAXcG=rsF1gS-GVKc2G2ountDUWz+_WzXFzhBw@mail.gmail.com>
Content-Type: text/plain; charset="iso-8859-1"

Dear All,

Moving from Moses version 2.1 to 3.0, I realized a significant change in
the decoding behavior.

I use the decoder with the parameters "/opt/moses/bin/moses
-search-algorithm 1 -cube-pruning-pop-limit 5000 -s 5000 -dl 0", with an
n-best-list size of 50.

For an example test sentence, the number of unique hypotheses in the
generated test.output.1.best50 file was 32 in version 2.1. In version 3.0,
using exactly the same configuration file (thus the same parameters), the
number of unique hypotheses is only 2.

Can you please advise on what to do in order to increase the diversity in
the n-best lists?

Thanks in advance for your help,

ED

------------------------------

Message: 4
Date: Tue, 24 Feb 2015 16:01:13 +0000 (UTC)
From: Rico Sennrich <rico.sennrich@gmx.ch>
Subject: Re: [Moses-support] Number of Unique Hypotheses in the N-best
List
To: moses-support@mit.edu
Message-ID: <loom.20150224T165436-555@post.gmane.org>
Content-Type: text/plain; charset=utf-8

Erinç Dikici <erinc.dikici@...> writes:

> Dear All,
>
> Moving from Moses version 2.1 to 3.0, I realized a significant change in
> the decoding behavior.
>
> I use the decoder with the parameters "/opt/moses/bin/moses
> -search-algorithm 1 -cube-pruning-pop-limit 5000 -s 5000 -dl 0", with an
> n-best-list size of 50.
>
> For an example test sentence, the number of unique hypotheses in the
> generated test.output.1.best50 file was 32 in version 2.1. In version 3.0,
> using exactly the same configuration file (thus the same parameters), the
> number of unique hypotheses is only 2.
>
> Can you please advise on what to do in order to increase the diversity in
> the n-best lists?
>

Hi Erinç,

somewhere between 2.1 and 3.0, the keyword 'distinct' was [...]

------------------------------

Message: 5
Date: Tue, 24 Feb 2015 16:24:52 +0000
From: Matthias Huck <mhuck@inf.ed.ac.uk>
Subject: Re: [Moses-support] Number of Unique Hypotheses in the N-best
List
To: Moses-support <moses-support@mit.edu>
Message-ID: <1424795092.2192.501.camel@portedgar>
Content-Type: text/plain; charset="UTF-8"

> somewhere between 2.1 and 3.0, the keyword 'distinct' was

Oops, that was me. And it wasn't intended. I'm using this for my own
setups and apparently copied it to master when I added some other stuff.
Hope I didn't mess up other people's experiments. It's been in master
since 7 August 2014 already and nobody noticed.

Sorry for that, you can remove it again if you want.
Lines 1280 and 1282 of scripts/training/mert-moses.pl .

I'd assume that your 32 entries of the n-best list weren't actually
unique, though, but a number of duplicates of the (two) very same
outputs, as "distinct" should simply avoid duplicate entries.

Here's a link to a related previous discussion on this mailing list:
http://comments.gmane.org/gmane.comp.nlp.moses.user/11097
You can try the parameter "n-best-factor".
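
To check whether the entries of an n-best list are really unique, something like the following Python sketch may help. It assumes the usual Moses n-best layout of "sentence id ||| hypothesis ||| feature scores ||| total score". The function name and the sample lines are made up for illustration and are not taken from any real output.

```python
from collections import defaultdict


def unique_hypotheses(nbest_lines):
    """Map each sentence id to the set of distinct hypothesis strings."""
    unique = defaultdict(set)
    for line in nbest_lines:
        fields = line.split("|||")
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        sentence_id = int(fields[0])
        # Normalise whitespace so that only the surface string is compared.
        hypothesis = " ".join(fields[1].split())
        unique[sentence_id].add(hypothesis)
    return unique


# Invented sample: three entries, two of them with the same surface string.
sample = [
    "0 ||| the house is small ||| LM0= -12.3 ||| -4.1",
    "0 ||| the house is small ||| LM0= -12.3 ||| -4.2",
    "0 ||| the house is little ||| LM0= -13.0 ||| -4.5",
]
counts = {sid: len(hyps) for sid, hyps in unique_hypotheses(sample).items()}
print(counts)  # {0: 2}
```

To run it on a real file, pass an open file handle (for example for test.output.1.best50) instead of the sample list. Comparing the number of lines per sentence id with the number of distinct strings shows how many entries are duplicates.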

--
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.

------------------------------

Message: 6
Date: Tue, 24 Feb 2015 17:38:22 +0100
From: Marcin Junczys-Dowmunt <junczys@amu.edu.pl>
Subject: Re: [Moses-support] Number of Unique Hypotheses in the N-best
List
To: moses-support@mit.edu
Message-ID: <54ECA8FE.1040308@amu.edu.pl>
Content-Type: text/plain; charset=windows-1252; format=flowed

If you decide to remove it, then please add an option to activate that.
I did actually notice it, but I was happy it was there so I did not
complain :)

On 24.02.2015 at 17:24, Matthias Huck wrote:
>> somewhere between 2.1 and 3.0, the keyword 'distinct' was
> Oops, that was me. And it wasn't intended. I'm using this for my own
> setups and apparently copied it to master when I added some other stuff.
> Hope I didn't mess up other people's experiments. It's been in master
> since 7 August 2014 already and nobody noticed.
>
> Sorry for that, you can remove it again if you want.
> Lines 1280 and 1282 of scripts/training/mert-moses.pl .
>
> I'd assume that your 32 entries of the n-best list weren't actually
> unique, though, but a number of duplicates of the (two) very same
> outputs, as "distinct" should simply avoid duplicate entries.
>
> Here's a link to a related previous discussion on this mailing list:
> http://comments.gmane.org/gmane.comp.nlp.moses.user/11097
> You can try the parameter "n-best-factor".
>
>
>

------------------------------

_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support

End of Moses-support Digest, Vol 100, Issue 83
**********************************************
