Send Moses-support mailing list submissions to
moses-support@mit.edu
To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
moses-support-request@mit.edu
You can reach the person managing the list at
moses-support-owner@mit.edu
When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."
Today's Topics:
1. Re: Error on lmplz (Lane Schwartz)
2. Re: Tuning with no language model (Read, James C)
----------------------------------------------------------------------
Message: 1
Date: Wed, 13 Jan 2016 08:59:21 -0600
From: Lane Schwartz <dowobeha@gmail.com>
Subject: Re: [Moses-support] Error on lmplz
To: Kenneth Heafield <moses@kheafield.com>
Cc: "moses-support@mit.edu" <moses-support@mit.edu>
Message-ID:
<CABv3vZkm5A1Jb4WonKLwCL037-jo_V_OT_6aF+Z_XLct-L3qAA@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"
Thanks, Kenneth. Here's what I get now.
$ ~/mosesdecoder.multisource.git/bin/lmplz -o 2 <<< "that is what happens ?
> cssd has nothing more or voldemort or pastries in prague ."
> === 1/5 Counting and sorting n-grams ===
> Reading /tmp/sh-thd-1452698150 (deleted)
>
> ----5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
> tcmalloc: large alloc 29442056192 bytes == 0x1c74000 @
> tcmalloc: large alloc 78512136192 bytes == 0x6de346000 @
>
> ****************************************************************************************************
> Unigram tokens 16 types 18
> === 2/5 Calculating and sorting adjusted counts ===
> Chain sizes: 1:216 2:107979354931
> tcmalloc: large alloc 107979358208 bytes == 0x192a648000 @
> terminate called after throwing an instance of
> 'lm::builder::BadDiscountException'
> what():
> /home/lanes/mosesdecoder.multisource.git/lm/builder/adjust_counts.cc:53 in
> void lm::builder::{anonymous}::StatCollector::CalculateDiscounts(const
> lm::builder::DiscountConfig&) threw BadDiscountException because `s.n[j] ==
> 0'.
> Could not calculate Kneser-Ney discounts for 1-grams with adjusted count 4
> because we didn't observe any 1-grams with adjusted count 3; Is this small
> or artificial data?
> Try deduplicating the input. To override this error for e.g. a
> class-based model, rerun with --discount_fallback
> Aborted (core dumped)
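The BadDiscountException above is what one would expect from input this small. Modified Kneser-Ney estimates its discounts from count-of-counts (how many n-grams have adjusted count 1, 2, 3 and 4), and the error text says one of those buckets is empty. The Python sketch below is an illustration only, not lmplz's code. It assumes that a unigram's adjusted count in a bigram model is the number of distinct words seen immediately before it, and that the here-string is one sentence wrapped in <s> and </s>. The function name unigram_count_of_counts is made up for this example.

```python
from collections import Counter, defaultdict


def unigram_count_of_counts(sentences):
    """Return a Counter mapping adjusted count -> number of word types.

    Simplifying assumption: the adjusted count of a unigram is the number
    of distinct words that appear immediately before it, with <s> and </s>
    added around every sentence.
    """
    left_contexts = defaultdict(set)
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        for previous, current in zip(tokens, tokens[1:]):
            left_contexts[current].add(previous)
    return Counter(len(contexts) for contexts in left_contexts.values())


# The toy input from the command above, treated as a single sentence.
sentences = [
    "that is what happens ? cssd has nothing more or voldemort or "
    "pastries in prague ."
]
n = unigram_count_of_counts(sentences)

# The discounts need n[1] through n[4]; list the buckets that are empty.
missing = [k for k in range(1, 5) if n[k] == 0]
print("count-of-counts:", dict(n))
print("empty buckets:", missing)
```

Under these assumptions no unigram has adjusted count 3, which agrees with the message lmplz printed. Larger, natural corpora normally populate all four buckets. As the error text says, --discount_fallback overrides the check.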
On Tue, Jan 12, 2016 at 5:40 PM, Kenneth Heafield <moses@kheafield.com>
wrote:
> Pushed the fix from kenlm master in October to Moses master.
>
> On 01/12/2016 10:34 PM, Lane Schwartz wrote:
> > Steps to reproduce this error:
> >
> > $ ~/mosesdecoder.git/bin/lmplz -o 2 <<< "that is what happens ? cssd
> > has nothing more or voldemort or pastries in prague ."
> > === 1/5 Counting and sorting n-grams ===
> > Reading /tmp/sh-thd-107574999377 (deleted)
> >
> ----5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
> > tcmalloc: large alloc 29442056192 bytes == 0x2ae2000 @
> > tcmalloc: large alloc 78512136192 bytes == 0x6df1b4000 @
> >
> ****************************************************************************************************
> > Unigram tokens 16 types 18
> > === 2/5 Calculating and sorting adjusted counts ===
> > Chain sizes: 1:216 2:107979354931
> > tcmalloc: large alloc 107979358208 bytes == 0x192b4b6000 @
> > lmplz: ./util/fixed_array.hh:104: T&
> > util::FixedArray<T>::operator[](std::size_t) [with T =
> > lm::NGramStream<lm::builder::BuildingPayload>; std::size_t = long
> > unsigned int]: Assertion `i < size()' failed.
> >
> >
> >
> >
> > On Wed, Sep 30, 2015 at 11:41 AM, Kenneth Heafield <moses@kheafield.com> wrote:
> >
> > That's bad. Would you mind sending me privately a minimal example of
> > the data that reproduces the problem?
> >
> > Kenneth
> >
> > On 09/30/2015 04:29 PM, Alex Martinez wrote:
> > > Hello,
> > > Today I pulled the Moses code and recompiled, and some experiments (EMS)
> > > that were already working are now failing on the LM training step with
> > > the following error:
> > >
> > > Executing: /opt/moses/bin/lmplz --text /home/alexmc/devel/toydata/process/lm/nc=pos.factored.1 --order 5 --arpa /home/alexmc/devel/toydata/process/lm/nc=pos.lm.1 --discount_fallback
> > > === 1/5 Counting and sorting n-grams ===
> > > Reading /mnt/a62/devel/toydata/process/lm/nc=pos.factored.1
> > >
> >
> ----5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
> > > tcmalloc: large alloc 4753956864 bytes == 0x1f7c000 @
> > > tcmalloc: large alloc 22185107456 bytes == 0x11d536000 @
> > >
> >
> ****************************************************************************************************
> > > Unigram tokens 2433135 types 47
> > > === 2/5 Calculating and sorting adjusted counts ===
> > > Chain sizes: 1:564 2:2630656000 3:4932480000 4:7891967488 5:11509120000
> > > tcmalloc: large alloc 11509121024 bytes == 0x1f7c000 @
> > > tcmalloc: large alloc 2630656000 bytes == 0x2aff70000 @
> > > tcmalloc: large alloc 4932485120 bytes == 0x34cc3a000 @
> > > tcmalloc: large alloc 7891968000 bytes == 0x64933c000 @
> > > lmplz: ./util/fixed_array.hh:104: T&
> > > util::FixedArray<T>::operator[](std::size_t) [with T =
> > > lm::NGramStream<lm::builder::BuildingPayload>; std::size_t = long
> > > unsigned int]: Assertion `i < size()' failed.
> > >
> > > I'm running a Linux server with Ubuntu 15.04
> > >
> > > Any help will be appreciated
> > >
> > > Alex Martínez
> > >
> > >
> > > _______________________________________________
> > > Moses-support mailing list
> > > Moses-support@mit.edu
> > > http://mailman.mit.edu/mailman/listinfo/moses-support
> > >
> >
> >
> >
> >
> > --
> > When a place gets crowded enough to require ID's, social collapse is not
> > far away. It is time to go elsewhere. The best thing about space travel
> > is that it made it possible to go elsewhere.
> > -- R.A. Heinlein, "Time Enough For Love"
>
------------------------------
Message: 2
Date: Wed, 13 Jan 2016 15:02:08 +0000
From: "Read, James C" <jcread@essex.ac.uk>
Subject: Re: [Moses-support] Tuning with no language model
To: Hieu Hoang <hieuhoang@gmail.com>
Cc: Moses Support <moses-support@mit.edu>
Message-ID:
<HE1PR06MB1481B819A9C130A8BC548B0A85CB0@HE1PR06MB1481.eurprd06.prod.outlook.com>
Content-Type: text/plain; charset="iso-8859-1"
OK, looks like it's running. Probably won't be able to see if it generates useful weights until tomorrow. Thanks.
As a side thought, I was wondering how many lines of configuration I could get away with deleting. How would Moses behave if I deleted the lexical reordering lines? The distortion? The word penalty? The phrase penalty?
My goal is to get Moses to choose the best translation with reference to the translation model only.
If I delete these other configuration lines, will Moses use defaults for those options, or will it disable them completely, leaving just the TM?
________________________________
From: Hieu Hoang <hieuhoang@gmail.com>
Sent: Wednesday, January 13, 2016 2:51 PM
To: Read, James C
Cc: Moses Support
Subject: Re: [Moses-support] Tuning with no language model
OK. The mert script creates a temporary directory every time you run it. By default it's named
mert-work
Since you previously ran mert with an incorrect moses.ini, it may have polluted the temporary directory and be causing the problem now.
You should find this temporary directory and delete it before running mert again.
PS: ALL paths must be absolute, which ../tuning_data/true.bg, for example, is not.
Hieu Hoang
http://www.hoang.co.uk/hieu
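The two pieces of advice above (delete the stale mert-work directory, pass only absolute paths) can be sketched in a few lines of Python. This is a hedged illustration rather than part of Moses: prepare_mert_run is a hypothetical helper, and the directory name mert-work is the default mentioned above.

```python
import shutil
from pathlib import Path


def prepare_mert_run(work_dir, *paths):
    """Delete a stale MERT working directory and return absolute paths.

    work_dir: directory left behind by an earlier run (e.g. "mert-work").
    paths:    input files or configs, possibly given relative to the cwd.
    """
    work_dir = Path(work_dir)
    if work_dir.is_dir():
        # Discard state left by the earlier run with the wrong moses.ini.
        shutil.rmtree(work_dir)
    # resolve() turns relative paths such as ../tuning_data/true.bg
    # into absolute ones, based on the current working directory.
    return [str(Path(p).resolve()) for p in paths]


if __name__ == "__main__":
    absolute = prepare_mert_run(
        "mert-work",
        "../tuning_data/true.bg",
        "../tuning_data/true.en",
        "binarised/moses-tm.ini",
    )
    for path in absolute:
        print(path)
```

The printed paths can then be passed to mert-moses.pl in place of the relative ones.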
On 13 January 2016 at 14:45, Read, James C <jcread@essex.ac.uk> wrote:
/media/bigdata/jcread/llv/data/europarlv7/raw/aligned/bg-en/training_data$ /media/bigdata/jcread/3rd_party_software/mosesdecoder/scripts/training/mert-moses.pl -no-filter-phrase-table ../tuning_data/true.bg ../tuning_data/true.en /media/bigdata/jcread/3rd_party_software/mosesdecoder/bin/moses /media/bigdata/jcread/llv/data/europarlv7/raw/aligned/bg-en/training_data/binarised/moses-tm.ini --mertdir /media/bigdata/jcread/3rd_party_software/mosesdecoder/bin/
Using the absolute path to moses-tm.ini does not resolve the problem. I still get the same error.
________________________________
From: Read, James C
Sent: Wednesday, January 13, 2016 2:40 PM
To: Hieu Hoang
Cc: Moses Support
Subject: Re: [Moses-support] Tuning with no language model
This is what I get when I run the same command as you:
LexicalReordering0= 0.300000 0.300000 0.300000 0.300000 0.300000 0.300000
Distortion0= 0.300000
UnknownWordPenalty0 UNTUNEABLE
WordPenalty0= -1.000000
PhrasePenalty0= 0.200000
TranslationModel0= 0.200000 0.200000 0.200000 0.200000
Looks just like your output. However, this is what I get when I run the mert script with the following command:
/media/bigdata/jcread/llv/data/europarlv7/raw/aligned/bg-en/training_data$ /media/bigdata/jcread/3rd_party_software/mosesdecoder/scripts/training/mert-moses.pl -no-filter-phrase-table ../tuning_data/true.bg ../tuning_data/true.en /media/bigdata/jcread/3rd_party_software/mosesdecoder/bin/moses binarised/moses-tm.ini --mertdir /media/bigdata/jcread/3rd_party_software/mosesdecoder/bin/
Loading UnknownWordPenalty0
Loading WordPenalty0
Loading PhrasePenalty0
Loading LexicalReordering0
Loading Distortion0
Loading TranslationModel0
LM0
The following weights have no feature function. Maybe incorrectly spelt weights: LM0,Exit code: 1
The decoder died. CONFIG WAS -weight-overwrite 'PhrasePenalty0= 0.043478 WordPenalty0= -0.217391 TranslationModel0= 0.043478 0.043478 0.043478 0.043478 Distortion0= 0.065217 LM0= 0.108696 LexicalReordering0= 0.065217 0.065217 0.065217 0.065217 0.065217 0.065217'
________________________________
From: Hieu Hoang <hieuhoang@gmail.com>
Sent: Wednesday, January 13, 2016 2:27 PM
To: Read, James C
Cc: Moses Support
Subject: Re: [Moses-support] Tuning with no language model
Looks fine to me. This is what I get when I run it:
# $MOSES_DIR/bin/moses -f moses.ini -show-weights
....
LexicalReordering0= 0.300000 0.300000 0.300000 0.300000 0.300000 0.300000
Distortion0= 0.300000
UnknownWordPenalty0 UNTUNEABLE
WordPenalty0= -1.000000
PhrasePenalty0= 0.200000
TranslationModel0= 0.200000 0.200000 0.200000 0.200000
Double check that the moses.ini file you sent me is truly what you're using.
To be sure, you MUST use absolute paths for all file names, ie. not
binarised/moses-tm.ini
but
/media/bigdata/whatever/binarised/moses-tm.ini
On 13/01/16 14:16, Read, James C wrote:
#########################
### MOSES CONFIG FILE ###
#########################
# input factors
[input-factors]
0
# mapping steps
[mapping]
0 T 0
[distortion-limit]
6
# feature functions
[feature]
UnknownWordPenalty
WordPenalty
PhrasePenalty
PhraseDictionaryCompact name=TranslationModel0 num-features=4 path=/media/bigdata/jcread/llv/data/europarlv7/raw/aligned/bg-en/training_data/binarised/phrase-table input-factor=0 output-factor=0
LexicalReordering name=LexicalReordering0 num-features=6 type=wbe-msd-bidirectional-fe-allff input-factor=0 output-factor=0 path=/media/bigdata/jcread/llv/data/europarlv7/raw/aligned/bg-en/training_data/binarised/reordering-table
Distortion
# dense weights for feature functions
[weight]
# The default weights are NOT optimized for translation quality. You MUST tune the weights.
# Documentation for tuning is here: http://www.statmt.org/moses/?n=FactoredTraining.Tuning
UnknownWordPenalty0= 1
WordPenalty0= -1
PhrasePenalty0= 0.2
TranslationModel0= 0.2 0.2 0.2 0.2
LexicalReordering0= 0.3 0.3 0.3 0.3 0.3 0.3
Distortion0= 0.3
/media/bigdata/jcread/3rd_party_software/mosesdecoder/scripts/training/mert-moses.pl -no-filter-phrase-table ../tuning_data/true.bg ../tuning_data/true.en /media/bigdata/jcread/3rd_party_software/mosesdecoder/bin/moses binarised/moses-tm.ini --mertdir /media/bigdata/jcread/3rd_party_software/mosesdecoder/bin/
________________________________
From: Hieu Hoang <hieuhoang@gmail.com>
Sent: Wednesday, January 13, 2016 2:05 PM
To: Read, James C
Cc: Moses Support
Subject: Re: [Moses-support] Tuning with no language model
please show me the EXACT moses.ini and command that you use.
Hieu Hoang
http://www.hoang.co.uk/hieu
On 13 January 2016 at 14:04, Read, James C <jcread@essex.ac.uk> wrote:
So now I get the following error:
Loading UnknownWordPenalty0
Loading WordPenalty0
Loading PhrasePenalty0
Loading LexicalReordering0
Loading Distortion0
Loading TranslationModel0
LM0
The following weights have no feature function. Maybe incorrectly spelt weights: LM0,Exit code: 1
I didn't get this error before deleting the LM0 line.
________________________________
From: Hieu Hoang <hieuhoang@gmail.com>
Sent: Wednesday, January 13, 2016 1:58 PM
To: Read, James C; Moses Support
Subject: Re: [Moses-support] Tuning with no language model
There lies your problem. There's a weight for a non-existent feature function.
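A mismatch like this can be spotted before launching the decoder. The Python sketch below is not a Moses tool. It assumes the layout seen in the moses.ini files in this thread: a [feature] section, a [weight] section, and feature names that are either given with name= or default to the feature type followed by an index starting at 0 (for example Distortion becomes Distortion0). The helper names read_sections, feature_names and orphan_weights are hypothetical.

```python
from collections import Counter


def read_sections(ini_text):
    """Split a moses.ini-style text into {section name: [lines]}."""
    sections, current = {}, None
    for raw in ini_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1]
            sections.setdefault(current, [])
        elif current is not None:
            sections[current].append(line)
    return sections


def feature_names(feature_lines):
    """Names of the declared features (assumed naming rule, see lead-in)."""
    names, seen = set(), Counter()
    for line in feature_lines:
        kind, *options = line.split()
        explicit = [o.split("=", 1)[1] for o in options if o.startswith("name=")]
        names.add(explicit[0] if explicit else f"{kind}{seen[kind]}")
        seen[kind] += 1
    return names


def orphan_weights(ini_text):
    """Weight names in [weight] that have no matching entry in [feature]."""
    sections = read_sections(ini_text)
    features = feature_names(sections.get("feature", []))
    weights = [l.split("=", 1)[0].strip() for l in sections.get("weight", [])]
    return [w for w in weights if w not in features]


# Shortened config from the thread: the KENLM feature line was deleted,
# but the LM0 weight was left behind.
example = """
[feature]
UnknownWordPenalty
WordPenalty
PhrasePenalty
PhraseDictionaryCompact name=TranslationModel0 num-features=4 path=/pathtodata/binarised/phrase-table input-factor=0 output-factor=0
Distortion

[weight]
UnknownWordPenalty0= 1
WordPenalty0= -1
PhrasePenalty0= 0.2
TranslationModel0= 0.2 0.2 0.2 0.2
Distortion0= 0.3
LM0= 0.5
"""
print(orphan_weights(example))
```

For this shortened config the check reports LM0, the weight that was left behind after the KENLM line was deleted.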
On 13/01/16 13:57, Read, James C wrote:
no
________________________________
From: Hieu Hoang <hieuhoang@gmail.com>
Sent: Wednesday, January 13, 2016 1:56 PM
To: Read, James C; Moses Support
Subject: Re: [Moses-support] Tuning with no language model
did you delete the line
LM0= 0.5
too?
On 13/01/16 13:53, Read, James C wrote:
The following command works fine when a language model is specified. Deleting the line KENLM lazyken=0 name=LM0 factor=0 path=$PWD/blm.en order=3
causes the script to fail with the error:
ERROR: Failed to run '/pathtomoses/mosesdecoder/bin/moses -config /pathtodata/binarised/moses.ini -show-weights'. at /pathtomoses/mosesdecoder/scripts/training/mert-moses.pl line 1744.
#########################
### MOSES CONFIG FILE ###
#########################
# input factors
[input-factors]
0
# mapping steps
[mapping]
0 T 0
[distortion-limit]
6
# feature functions
[feature]
UnknownWordPenalty
WordPenalty
PhrasePenalty
PhraseDictionaryCompact name=TranslationModel0 num-features=4 path=/pathtodata/binarised/phrase-table input-factor=0 output-factor=0
LexicalReordering name=LexicalReordering0 num-features=6 type=wbe-msd-bidirectional-fe-allff input-factor=0 output-factor=0 path=/pathtodata/binarised/reordering-table
Distortion
KENLM lazyken=0 name=LM0 factor=0 path=/pathtodata/blm.en order=3
# dense weights for feature functions
[weight]
# The default weights are NOT optimized for translation quality. You MUST tune the weights.
# Documentation for tuning is here: http://www.statmt.org/moses/?n=FactoredTraining.Tuning
UnknownWordPenalty0= 1
WordPenalty0= -1
PhrasePenalty0= 0.2
TranslationModel0= 0.2 0.2 0.2 0.2
LexicalReordering0= 0.3 0.3 0.3 0.3 0.3 0.3
Distortion0= 0.3
LM0= 0.5
/pathtomoses/mosesdecoder/scripts/training/mert-moses.pl -no-filter-phrase-table true.fr true.en /pathtomoses/mosesdecoder/bin/moses binarised/moses.ini --mertdir /pathtomoses/mosesdecoder/bin/
--
Hieu Hoang
http://www.hoang.co.uk/hieu
------------------------------
_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support
End of Moses-support Digest, Vol 111, Issue 36
**********************************************