Moses-support Digest, Vol 100, Issue 93

Send Moses-support mailing list submissions to
moses-support@mit.edu

To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
moses-support-request@mit.edu

You can reach the person managing the list at
moses-support-owner@mit.edu

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."


Today's Topics:

1. Re: Fwd: Re: BadDiscountException (Kenneth Heafield)
2. kbmira segfault (Matt Post)
3. Re: kbmira segfault (Barry Haddow)
4. Evaluation and tuning problems (fatma elzahraa Eltaher)


----------------------------------------------------------------------

Message: 1
Date: Thu, 26 Feb 2015 17:18:03 -0500
From: Kenneth Heafield <moses@kheafield.com>
Subject: Re: [Moses-support] Fwd: Re: BadDiscountException
To: Philipp Koehn <phi@jhu.edu>
Cc: "moses-support@mit.edu" <moses-support@mit.edu>
Message-ID: <54EF9B9B.2040900@kheafield.com>
Content-Type: text/plain; charset=utf-8

Hi,

Actually it's more than that. Boost program options doesn't like
single hyphens before multi-character options :-\.

Can you add --discount_fallback to the perl script?

Kenneth
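
The option-style mismatch described above can be sketched as a small argument rewriter. This is not the actual lmplz-wrapper.perl: it is written in Python purely for illustration, the function name `normalize_args` and the mapping table are hypothetical, and the only rename taken from the thread is "-lm" to "--arpa".

```python
# Hypothetical table of SRILM-style switches and their lmplz long-option names.
# Only "-lm" -> "--arpa" comes from the thread; anything else is an assumption.
SINGLE_TO_LONG = {"-lm": "--arpa"}


def _is_number(text):
    """True for values such as "-100" that must not be treated as switches."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def normalize_args(argv):
    """Rewrite single-hyphen multi-character switches as double-hyphen ones."""
    out = []
    for arg in argv:
        if arg in SINGLE_TO_LONG:
            # Known rename, e.g. the SRILM-style output switch.
            out.append(SINGLE_TO_LONG[arg])
        elif (arg.startswith("-") and not arg.startswith("--")
              and len(arg) > 2 and not _is_number(arg)):
            # "-discount_fallback" becomes "--discount_fallback".
            out.append("-" + arg)
        else:
            # Short options ("-o"), long options and plain values pass through.
            out.append(arg)
    return out
```

For example, `normalize_args(["-o", "3", "-lm", "out.arpa", "-discount_fallback"])` yields an argument list in which every multi-character switch carries two hyphens, the form Boost program options accepts.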

On 02/26/15 16:44, Philipp Koehn wrote:
> Hi,
>
> the wrapper script really only exists because SRILM (and the wrapper)
> sets the name of the produced LM file with "-lm" and lmplz sets it
> with "-arpa". If you allow "-lm" as an alternative name for the
> switch, I'll remove it.
>
> -phi
>
> On Tue, Feb 24, 2015 at 8:39 PM, Kenneth Heafield <moses@kheafield.com> wrote:
>> Try removing this bit of text and just calling the lmplz binary
>> directly. It's not clear to me why that wrapper script still exists.
>>
>> $moses-script-dir/ems/support/lmplz-wrapper.perl -bin
>>
>>
>> -------- Forwarded Message --------
>> Subject: Re: [Moses-support] BadDiscountException
>> Date: Tue, 24 Feb 2015 06:16:43 -0800
>> From: fatma elzahraa Eltaher <fatmaeltaher@gmail.com>
>> To: Kenneth Heafield <moses@kheafield.com>
>>
>>
>>
>> I use a KenLM model, and when I try to add --discount_fallback=1 to the
>> settings I get this error: Unknown option: discount_fallback.
>> I attached config.toy. Where must I change it to solve this problem?
>>
>>
>> thank you,
>>
>>
>>
>> Fatma El-Zahraa El -Taher
>>
>> Teaching Assistant at Computer & System department
>>
>> Faculty of Engineering, Azhar University
>>
>> Email : fatmaeltaher@gmail.com
>> mobile: +201141600434
>>
>>
>> On Tue, Feb 24, 2015 at 5:22 AM, Kenneth Heafield <moses@kheafield.com> wrote:
>>
>> The closed-form estimates for Kneser-Ney are not well-defined on toy or
>> class-based data. I recommend using more training data. If this is a
>> class-based model, pass --discount_fallback.
>>
>> Kenneth
>>
>> On 02/24/2015 08:04 AM, fatma elzahraa Eltaher wrote:
>> > Dears,
>> > I get the following error in LM_toy_train.65.STDERR:
>> > Unigram tokens 25188 types 39
>> > === 2/5 Calculating and sorting adjusted counts ===
>> > Chain sizes: 1:468 2:322921696 3:605478272 4:968765120 5:1412782592
>> > /home/fatma/Desktop/Folder/mosesdecoder/lm/builder/adjust_counts.cc:50
>> > in void
>> > lm::builder::{anonymous}::StatCollector::CalculateDiscounts(const
>> > lm::builder::DiscountConfig&) threw BadDiscountException because
>> > `s.n[j] == 0'.
>> > Could not calculate Kneser-Ney discounts for 1-grams with adjusted
>> > count 4 because we didn't observe any 1-grams with adjusted count 3;
>> > Is this small or artificial data?
>> > How do I fix it?
>> >
>> >
>> > thank you,
>> >
>> >
>> >
>> > Fatma El-Zahraa El -Taher
>> >
>> > Teaching Assistant at Computer & System department
>> >
>> > Faculty of Engineering, Azhar University
>> >
>> > Email : fatmaeltaher@gmail.com
>> > mobile: +201141600434
>> >
>> >
>> >
>> > _______________________________________________
>> > Moses-support mailing list
>> > Moses-support@mit.edu
>> > http://mailman.mit.edu/mailman/listinfo/moses-support
>> >
>>
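
The BadDiscountException quoted in this thread follows from the standard closed-form estimates for modified Kneser-Ney discounts (Chen and Goodman), which divide by the number of n-grams seen with each adjusted count from 1 to 3. The sketch below, in Python for illustration, shows that arithmetic. It is not KenLM's code: the function name, the exception class and the `fallback` parameter are hypothetical, and the fallback only loosely mirrors the idea behind --discount_fallback.

```python
from collections import Counter


class BadDiscountError(ValueError):
    """Raised when a count-of-counts needed by the closed form is zero."""


def kneser_ney_discounts(adjusted_counts, fallback=None):
    """Return (D1, D2, D3+) estimated from a list of adjusted n-gram counts.

    fallback: optional (D1, D2, D3+) returned instead of raising
    (hypothetical parameter, not KenLM's interface).
    """
    # n[j] = number of distinct n-grams whose adjusted count is exactly j.
    n = Counter(c for c in adjusted_counts if 1 <= c <= 4)
    for j in (1, 2, 3, 4):
        if n[j] == 0:
            if fallback is not None:
                return tuple(fallback)
            raise BadDiscountError(
                f"no n-grams with adjusted count {j}; "
                "is this small or artificial data?"
            )
    # Y = n1 / (n1 + 2 * n2)
    y = n[1] / (n[1] + 2 * n[2])
    # D_j = j - (j + 1) * Y * n[j+1] / n[j]  for j = 1, 2, 3
    return tuple(j - (j + 1) * y * n[j + 1] / n[j] for j in (1, 2, 3))
```

With only 39 unigram types, as in the log, it is easy for some adjusted count between 1 and 4 to be unobserved, which makes one of the divisions undefined. Supplying fixed discounts (the values are the caller's choice in this sketch) avoids the division altogether.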


------------------------------

Message: 2
Date: Thu, 26 Feb 2015 17:18:56 -0500
From: Matt Post <post@cs.jhu.edu>
Subject: [Moses-support] kbmira segfault
To: moses-support@mit.edu
Message-ID: <FDD24842-D5E8-4B2C-A0CD-95802B0C476E@cs.jhu.edu>
Content-Type: text/plain; charset="us-ascii"

kbmira segfaults on the following command:

kbmira run --ffile run1.features.dat --scfile run1.scores.dat -o mert.out

Where run1.features.dat (30 MB) and run1.scores.dat (14 MB) can be downloaded here:

https://www.dropbox.com/s/yim7ub1bmq5jv2g/run1.features.dat?dl=0
https://www.dropbox.com/s/kkek36o7aflgzuu/run1.scores.dat?dl=0

I tracked it down to this line of mert/FeatureStats.cpp.

    std::string SparseVector::decode(std::size_t id)
    {
      return m_id_to_name[id];
    }

Any obvious ideas before I go down this rabbit hole? I verified there are no blank lines or anything else funny with the formatting, at least as far as I can tell (all dense features, plus one sparse feature, OOVPenalty=-100, showing up occasionally).

matt
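
A lookup like the `decode` above can fail hard when it is handed an id that was never registered. The sketch below, in Python for illustration, shows a name registry whose `decode` checks the id and reports a readable error instead. The class name is hypothetical, and that `m_id_to_name` is an id-indexed sequence is an assumption; this is not the Moses implementation.

```python
class FeatureNameRegistry:
    """Maps sparse feature names to integer ids and back (hypothetical class)."""

    def __init__(self):
        self._name_to_id = {}
        self._id_to_name = []  # assumed analogue of m_id_to_name

    def encode(self, name):
        """Return the id for name, registering it on first use."""
        if name not in self._name_to_id:
            self._name_to_id[name] = len(self._id_to_name)
            self._id_to_name.append(name)
        return self._name_to_id[name]

    def decode(self, feature_id):
        """Return the name for feature_id, or raise a descriptive error."""
        if not 0 <= feature_id < len(self._id_to_name):
            raise KeyError(
                f"unknown feature id {feature_id}; "
                f"{len(self._id_to_name)} names registered"
            )
        return self._id_to_name[feature_id]
```

The explicit range check is the point of the sketch: an unregistered id produces a message naming the id and the registry size rather than an out-of-range access.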




------------------------------

Message: 3
Date: Thu, 26 Feb 2015 22:35:43 +0000
From: Barry Haddow <bhaddow@staffmail.ed.ac.uk>
Subject: Re: [Moses-support] kbmira segfault
To: Matt Post <post@cs.jhu.edu>, moses-support@mit.edu
Message-ID: <54EF9FBF.7060408@staffmail.ed.ac.uk>
Content-Type: text/plain; charset="iso-8859-1"

Hi Matt

When mert-moses.pl runs kbmira, it always supplies a list of the dense
features (and their initial values) using the --dense-init parameter.
Your command omits it, and I think this is your problem. I've attached a
typical file used for this feature list.

Of course, kbmira should print a sensible error message rather than
segfault. This is probably my doing.

cheers - Barry

On 26/02/15 22:18, Matt Post wrote:
> kbmira segfaults on the following command:
>
> kbmira run --ffile run1.features.dat --scfile run1.scores.dat -o mert.out
>
> Where run1.features.dat (30 MB) and run1.scores.dat (14 MB) can be
> downloaded here:
>
> https://www.dropbox.com/s/yim7ub1bmq5jv2g/run1.features.dat?dl=0
> https://www.dropbox.com/s/kkek36o7aflgzuu/run1.scores.dat?dl=0
>
> I tracked it down to this line of mert/FeatureStats.cpp.
>
> std::string SparseVector::decode(std::size_t id)
> {
> return m_id_to_name[id];
> }
>
> Any obvious ideas before I go down this rabbit hole? I verified there
> are no blank lines or anything else funny with the formatting, at
> least as far as I can tell (all dense features, plus one sparse
> feature, OOVPenalty=-100, showing up occasionally).
>
> matt

-------------- next part --------------
LexicalReordering0= 0.300000
LexicalReordering0= 0.300000
LexicalReordering0= 0.300000
LexicalReordering0= 0.300000
LexicalReordering0= 0.300000
LexicalReordering0= 0.300000
LexicalReordering0= 0.300000
LexicalReordering0= 0.300000
OpSequenceModel0= 0.080000
OpSequenceModel0= -0.020000
OpSequenceModel0= 0.020000
OpSequenceModel0= -0.001000
OpSequenceModel0= 0.030000
Distortion0= 0.300000
LM0= 0.500000
WordPenalty0= -1.000000
PhrasePenalty0= 0.200000
TranslationModel0= 0.200000
TranslationModel0= 0.200000
TranslationModel0= 0.200000
TranslationModel0= 0.200000
TranslationModel0= 0.200000
TranslationModel0= 0.200000
TranslationModel0= 0.200000
TranslationModel0= 0.200000
TranslationModel0= 0.200000
TranslationModel0= 0.200000
TranslationModel0= 0.200000
TranslationModel0= 0.200000
TranslationModel0= 0.200000
TranslationModel0= 0.200000
TranslationModel0= 0.200000
TranslationModel0= 0.200000
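
The attached feature list uses one `Name= value` pair per line, with a name repeated once for each of that feature's dense weights. A minimal Python sketch for reading such a file into ordered (name, value) pairs follows. The function name `parse_dense_init` is hypothetical, and the format is inferred only from the attachment above, not from kbmira's own parser.

```python
def parse_dense_init(text):
    """Parse 'Name= value' lines into an ordered list of (name, value) pairs."""
    features = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        name, sep, value = line.partition("=")
        if not sep:
            raise ValueError(
                f"line {line_no}: expected 'Name= value', got {line!r}"
            )
        # Order matters: repeated names are separate weights of one feature.
        features.append((name.strip(), float(value)))
    return features
```

Keeping a list rather than a dictionary preserves both the order and the repeated names, so the eight LexicalReordering0 entries above stay eight separate weights.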

------------------------------

Message: 4
Date: Thu, 26 Feb 2015 14:36:17 -0800
From: fatma elzahraa Eltaher <fatmaeltaher@gmail.com>
Subject: [Moses-support] Evaluation and tuning problems
To: moses-support@mit.edu
Message-ID:
<CAOW1BbRRXKn8xiWeoPtgfo8Emx63XfTVG=qZ79Os_qdpYXZY9Q@mail.gmail.com>
Content-Type: text/plain; charset="iso-8859-1"

Dears,

I get these messages:
step EVALUATION:test:input-from-sgm crashed
step TUNING:tune crashed
when I run this command:
./experiment.perl -config config.toy -exec
I do not know how to fix these errors.
I attached the .stderr files and my config.toy file.
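
The crashed step names line up with the attached log names in this message: colons become underscores, followed by the run number and `.STDERR`. A small Python sketch of that mapping is below. The function names are hypothetical, the naming rule is inferred only from the two attachments in this message, and the run number (111 here) must be supplied by the caller.

```python
import re

# Matches lines such as "step TUNING:tune crashed".
CRASH_LINE = re.compile(r"^step (\S+) crashed$")


def stderr_name(step, run):
    """Build the log file name for a step, e.g. TUNING_tune.111.STDERR."""
    return f"{step.replace(':', '_')}.{run}.STDERR"


def crashed_step_logs(output, run):
    """Return log file names for every 'step ... crashed' line in output."""
    logs = []
    for line in output.splitlines():
        match = CRASH_LINE.match(line.strip())
        if match:
            logs.append(stderr_name(match.group(1), run))
    return logs
```

Reading the listed STDERR files is usually the quickest way to see why a step crashed, since the "crashed" line itself carries no detail.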


thank you,



Fatma El-Zahraa El -Taher

Teaching Assistant at Computer & System department

Faculty of Engineering, Azhar University

Email : fatmaeltaher@gmail.com
mobile: +201141600434
-------------- next part --------------
A non-text attachment was scrubbed...
Name: EVALUATION_test_input-from-sgm.111.STDERR
Type: application/octet-stream
Size: 3842 bytes
Desc: not available
Url : http://mailman.mit.edu/mailman/private/moses-support/attachments/20150226/dba1e694/attachment.obj
-------------- next part --------------
A non-text attachment was scrubbed...
Name: TUNING_tune.111.STDERR
Type: application/octet-stream
Size: 3735 bytes
Desc: not available
Url : http://mailman.mit.edu/mailman/private/moses-support/attachments/20150226/dba1e694/attachment-0001.obj
-------------- next part --------------
A non-text attachment was scrubbed...
Name: config.toy
Type: application/octet-stream
Size: 18943 bytes
Desc: not available
Url : http://mailman.mit.edu/mailman/private/moses-support/attachments/20150226/dba1e694/attachment-0002.obj

------------------------------

_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


End of Moses-support Digest, Vol 100, Issue 93
**********************************************
