Send Moses-support mailing list submissions to
moses-support@mit.edu
To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
moses-support-request@mit.edu
You can reach the person managing the list at
moses-support-owner@mit.edu
When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."
Today's Topics:
1. lmplz failed to run on certain input (Dingyuan Wang)
2. Re: lmplz failed to run on certain input (Kenneth Heafield)
3. Re: lmplz failed to run on certain input (Dingyuan Wang)
4. Re: Fail to build and compile moses (Hieu Hoang)
----------------------------------------------------------------------
Message: 1
Date: Mon, 01 Jun 2015 10:13:12 +0800
From: Dingyuan Wang <abcdoyle888@gmail.com>
Subject: [Moses-support] lmplz failed to run on certain input
To: "Moses-support@mit.edu" <moses-support@mit.edu>
Message-ID: <556BBFB8.2020808@gmail.com>
Content-Type: text/plain; charset="utf-8"
Dear all,
When using lmplz to generate a 6-gram model for POS-tag-like data (small
vocabulary, no real words), lmplz sometimes fails to run, depending on the
dataset.
lmplz was built from the latest code in either the
mosesdecoder or the kenlm GitHub repo.
The command line is:
somedir/lmplz -o 6 -S 50% --text foo.txt --arpa foo.lm
Here is the stderr. The failing dataset is 2894562 lines, 92M.
=== 1/5 Counting and sorting n-grams ===
Reading /home/gumble/[somedir]/zh-cn-nw-pos1.txt
----5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
****************************************************************************************************
Unigram tokens 41634632 types 114
=== 2/5 Calculating and sorting adjusted counts ===
Chain sizes: 1:1368 2:241348592 3:452528608 4:724045824 5:1055900160 6:1448091648
/home/gumble/github/kenlm/lm/builder/adjust_counts.cc:59 in void
lm::builder::{anonymous}::StatCollector::CalculateDiscounts(const
lm::builder::DiscountConfig&) threw BadDiscountException because
`discounts_[i].amount[j] < 0.0 || discounts_[i].amount[j] > j'.
ERROR: 1-gram discount out of range for adjusted count 3: -0.2
Aborted (core dumped)
I used `awk 'BEGIN {srand()} !/^$/ { if (rand() <= .0001) print }'
zh-cn-nw-pos1.txt` to sample a few lines. The resulting samples sometimes
produce a similar error and sometimes run successfully.
The attached failed sample produces the error "ERROR: 1-gram discount
out of range for adjusted count 2: -1.23077".
Thanks.
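For context, the range check quoted in the log (`discount < 0.0 || discount > j`) can be reproduced with a short sketch. It assumes the standard closed-form modified Kneser-Ney discount estimate from counts of counts, which is what lmplz is understood to use. The function names and the counts below are invented for illustration and are not taken from the failing dataset.

```python
def kn_discounts(n1, n2, n3, n4):
    """Closed-form modified Kneser-Ney discounts D1..D3.

    n_k is the number of n-grams whose adjusted count equals k.
    Assumes n1, n2 and n3 are non-zero (hypothetical helper, not KenLM code).
    """
    y = n1 / (n1 + 2 * n2)
    counts = [n1, n2, n3, n4]
    # D_j = j - (j + 1) * Y * n_{j+1} / n_j
    return [j - (j + 1) * y * counts[j] / counts[j - 1] for j in (1, 2, 3)]


def out_of_range(discounts):
    """Return (adjusted_count, discount) pairs that fail 0 <= D_j <= j."""
    return [(j, d) for j, d in enumerate(discounts, start=1) if d < 0.0 or d > j]


# Counts of counts that fall off smoothly, as in ordinary text: all in range.
print(out_of_range(kn_discounts(1000, 400, 200, 120)))  # []

# Made-up counts where n_{j+1} grows with j instead of shrinking.
print(out_of_range(kn_discounts(2, 1, 4, 10)))  # [(2, -4.0), (3, -2.0)]
```

With only 114 unigram types, very few types have an adjusted count of 1 or 2, so a ratio such as n3/n2 can easily be large enough to push a discount below zero.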
-------------- next part --------------
A non-text attachment was scrubbed...
Name: sample-fail.txt.gz
Type: application/gzip
Size: 3113 bytes
Desc: not available
Url : http://mailman.mit.edu/mailman/private/moses-support/attachments/20150601/06443038/attachment-0002.bin
-------------- next part --------------
A non-text attachment was scrubbed...
Name: sample-success.txt.gz
Type: application/gzip
Size: 2773 bytes
Desc: not available
Url : http://mailman.mit.edu/mailman/private/moses-support/attachments/20150601/06443038/attachment-0003.bin
------------------------------
Message: 2
Date: Sun, 31 May 2015 21:52:57 -0600
From: Kenneth Heafield <moses@kheafield.com>
Subject: Re: [Moses-support] lmplz failed to run on certain input
To: moses-support@mit.edu
Message-ID: <556BD719.9050607@kheafield.com>
Content-Type: text/plain; charset=windows-1252
--discount_fallback
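Spelled out, the suggestion is to add this flag to the original command so that lmplz substitutes fallback discounts where the closed-form estimate fails. A minimal sketch of the full invocation follows; the wrapper functions are hypothetical, the `lmplz` binary is assumed to be on the PATH, and the other options are copied from the command in the first message.

```python
import subprocess


def lmplz_command(text_path, arpa_path, order=6, memory="50%",
                  discount_fallback=True, binary="lmplz"):
    """Build the lmplz argument list used in this thread (hypothetical helper)."""
    cmd = [binary, "-o", str(order), "-S", memory,
           "--text", text_path, "--arpa", arpa_path]
    if discount_fallback:
        # Flag suggested in the reply above.
        cmd.append("--discount_fallback")
    return cmd


def run_lmplz(text_path, arpa_path, **options):
    """Run lmplz and raise CalledProcessError if it exits with an error."""
    subprocess.run(lmplz_command(text_path, arpa_path, **options), check=True)


# Show the command line without executing it.
print(" ".join(lmplz_command("foo.txt", "foo.lm")))
```

Calling `run_lmplz("foo.txt", "foo.lm")` would execute the same command.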
------------------------------
Message: 3
Date: Mon, 01 Jun 2015 12:50:42 +0800
From: Dingyuan Wang <abcdoyle888@gmail.com>
Subject: Re: [Moses-support] lmplz failed to run on certain input
To: Kenneth Heafield <moses@kheafield.com>, moses-support@mit.edu
Message-ID: <556BE4A2.8010805@gmail.com>
Content-Type: text/plain; charset=windows-1252
Works. Thanks.
On 2015/06/01 11:52, Kenneth Heafield wrote:
> --discount_fallback
------------------------------
Message: 4
Date: Mon, 01 Jun 2015 10:25:50 +0400
From: Hieu Hoang <hieuhoang@gmail.com>
Subject: Re: [Moses-support] Fail to build and compile moses
To: moses-support@mit.edu, davood_mf@hotmail.com
Message-ID: <556BFAEE.7090205@gmail.com>
Content-Type: text/plain; charset="windows-1252"
That's a slightly crazy compile error I've never seen before. What
operating system and version are you using? Is it a standard
installation, or do you know if the sysadmin did anything weird to the
system?
On 30/05/2015 17:17, Davood Mohammadifar wrote:
> Hello
> After downloading Moses (using the command git clone
> https://github.com/moses-smt/mosesdecoder.git
> from the manual), I ran the following command to compile and
> build Moses:
> ./bjam -j8
>
> and got an error.
>
> The problems are likely related to GCC.
> The log file is attached.
> Ubuntu version: 12.04
>
> Thank you for your help.
--
Hieu Hoang
Researcher
New York University, Abu Dhabi
http://www.hoang.co.uk/hieu
------------------------------
_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support
End of Moses-support Digest, Vol 104, Issue 1
*********************************************