Send Moses-support mailing list submissions to
moses-support@mit.edu
To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
moses-support-request@mit.edu
You can reach the person managing the list at
moses-support-owner@mit.edu
When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."
Today's Topics:
1. Re: Error on lmplz (Lane Schwartz)
2. Re: Error on lmplz (Kenneth Heafield)
3. Re: Tiny sample data uses outdated file format (Philipp Koehn)
4. Tuning with no language model (Read, James C)
----------------------------------------------------------------------
Message: 1
Date: Tue, 12 Jan 2016 16:34:39 -0600
From: Lane Schwartz <dowobeha@gmail.com>
Subject: Re: [Moses-support] Error on lmplz
To: Kenneth Heafield <moses@kheafield.com>
Cc: "moses-support@mit.edu" <moses-support@mit.edu>
Message-ID:
<CABv3vZk5g2NBjLd_qt-gfLfO1kt_HC3q8+mR1+JEew3s=E=7xA@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"
Steps to reproduce this error:
$ ~/mosesdecoder.git/bin/lmplz -o 2 <<< "that is what happens ? cssd has nothing more or voldemort or pastries in prague ."
=== 1/5 Counting and sorting n-grams ===
Reading /tmp/sh-thd-107574999377 (deleted)
----5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
tcmalloc: large alloc 29442056192 bytes == 0x2ae2000 @
tcmalloc: large alloc 78512136192 bytes == 0x6df1b4000 @
****************************************************************************************************
Unigram tokens 16 types 18
=== 2/5 Calculating and sorting adjusted counts ===
Chain sizes: 1:216 2:107979354931
tcmalloc: large alloc 107979358208 bytes == 0x192b4b6000 @
lmplz: ./util/fixed_array.hh:104: T& util::FixedArray<T>::operator[](std::size_t) [with T = lm::NGramStream<lm::builder::BuildingPayload>; std::size_t = long unsigned int]: Assertion `i < size()' failed.
On Wed, Sep 30, 2015 at 11:41 AM, Kenneth Heafield <moses@kheafield.com> wrote:
> That's bad. Would you mind sending me privately a minimal example of
> the data that reproduces the problem?
>
> Kenneth
>
> On 09/30/2015 04:29 PM, Alex Martinez wrote:
> > Hello,
> > Today I pulled the Moses code and recompiled, and some experiments (EMS)
> > that were already working are now failing on the LM training step with the
> > following error:
> >
> > Executing: /opt/moses/bin/lmplz --text /home/alexmc/devel/toydata/process/lm/nc=pos.factored.1 --order 5 --arpa /home/alexmc/devel/toydata/process/lm/nc=pos.lm.1 --discount_fallback
> > === 1/5 Counting and sorting n-grams ===
> > Reading /mnt/a62/devel/toydata/process/lm/nc=pos.factored.1
> > ----5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
> > tcmalloc: large alloc 4753956864 bytes == 0x1f7c000 @
> > tcmalloc: large alloc 22185107456 bytes == 0x11d536000 @
> > ****************************************************************************************************
> > Unigram tokens 2433135 types 47
> > === 2/5 Calculating and sorting adjusted counts ===
> > Chain sizes: 1:564 2:2630656000 3:4932480000 4:7891967488 5:11509120000
> > tcmalloc: large alloc 11509121024 bytes == 0x1f7c000 @
> > tcmalloc: large alloc 2630656000 bytes == 0x2aff70000 @
> > tcmalloc: large alloc 4932485120 bytes == 0x34cc3a000 @
> > tcmalloc: large alloc 7891968000 bytes == 0x64933c000 @
> > lmplz: ./util/fixed_array.hh:104: T& util::FixedArray<T>::operator[](std::size_t) [with T = lm::NGramStream<lm::builder::BuildingPayload>; std::size_t = long unsigned int]: Assertion `i < size()' failed.
> >
> > I'm running a Linux server with Ubuntu 15.04
> >
> > Any help will be appreciated
> >
> > Alex Martínez
> >
> >
> > _______________________________________________
> > Moses-support mailing list
> > Moses-support@mit.edu
> > http://mailman.mit.edu/mailman/listinfo/moses-support
--
When a place gets crowded enough to require ID's, social collapse is not
far away. It is time to go elsewhere. The best thing about space travel
is that it made it possible to go elsewhere.
-- R.A. Heinlein, "Time Enough For Love"
------------------------------
Message: 2
Date: Tue, 12 Jan 2016 23:40:19 +0000
From: Kenneth Heafield <moses@kheafield.com>
Subject: Re: [Moses-support] Error on lmplz
To: Lane Schwartz <dowobeha@gmail.com>
Cc: "moses-support@mit.edu" <moses-support@mit.edu>
Message-ID: <56958EE3.2050808@kheafield.com>
Content-Type: text/plain; charset=utf-8
Pushed the fix from kenlm master in October to Moses master.
------------------------------
Message: 3
Date: Tue, 12 Jan 2016 20:06:52 -0500
From: Philipp Koehn <phi@jhu.edu>
Subject: Re: [Moses-support] Tiny sample data uses outdated file format
To: Lane Schwartz <dowobeha@gmail.com>
Cc: "moses-support@mit.edu" <moses-support@mit.edu>
Message-ID:
<CAAFADDBjsab21mdd+hJRNeN4y2r2HvN96Dh5v-qpmk3-ZJzu3Q@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"
Hi,
I fixed it.
-phi
On Mon, Jan 11, 2016 at 4:33 PM, Lane Schwartz <dowobeha@gmail.com> wrote:
> http://www.statmt.org/moses/sample-data/complete/tiny.zip
>
> The config file included in the above sample doesn't work anymore,
> presumably due to a change in the ini format.
>
> $ ~/mosesdecoder.git/bin/moses -f moses.ini < in.txt
>> Defined parameters (per moses.ini or switch):
>> config: moses.ini
>> description: French English Tiny test data used for debugging
>> distortion-limit: 4
>> generation-file: 0 1 2 generation0-1.txt
>> input-factors: 0 1
>> lmodel-file: 0 0 3 europarl.en.srilm 0 1 3 europarl.pos.en.srilm
>> load-dir: .
>> mapping: T 0 G 0 T 1
>> ttable-file: 0 0 5 trans0-0.txt 1 1 5 trans1-1.txt
>> ttable-limit: 20 20
>> weight-d: 0.2
>> weight-generation: -110.232 -110.3
>> weight-l: 0.5 0.5
>> weight-t: 0.2 0.2 0.2 0.2 0.2 0.1 0.1 0.1 0.1 0.1
>> weight-w: -1
>> Phrase table specification in old 4-field format. No longer supported
>> Unknown parameter load-dir
>> Unknown parameter ttable-limit
------------------------------
Message: 4
Date: Wed, 13 Jan 2016 13:22:35 +0000
From: "Read, James C" <jcread@essex.ac.uk>
Subject: [Moses-support] Tuning with no language model
To: Moses Support <moses-support@mit.edu>
Message-ID:
<HE1PR06MB148162830D4E52572620E19485CB0@HE1PR06MB1481.eurprd06.prod.outlook.com>
Content-Type: text/plain; charset="iso-8859-1"
Returning to an old discussion we once had on this list.
I've tried to tune some systems with no reference in the configuration file to a language model so that I can do Moses justice when comparing it to a number of lines of work I have done that use only translation models and various schemes of phrase table filtering.
It seems that the tuning script will not run when given a config file with no reference to a language model. I could run a baseline using the weights learned from tuning with a config file that does include an LM, but this would be almost as meaningless as comparing my work against a Moses system with default weights, since the weights would not be optimised for a system with no language model. I'm trying to come up with a workaround so that I can build a TM-only Moses baseline whose config file has sensible weights optimised for a TM-only situation (no LM). I'm running short of ideas about how best to approach this.
I was wondering if the following approach might work. If I were to make some kind of minimal dummy LM file with only one entry, for some word which doesn't exist, would that be enough to get the MERT script to run so that I could obtain sensible weights optimised for a no-LM scenario?
I'm open to any other suggestions of how I can obtain such weights if anybody knows a clever way of doing this.
Just to be clear, this support request is not an invitation to discuss whether or not one would want to tune a system with no language model.
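For illustration only, the dummy LM idea above could be sketched as a unigram-only file in the standard ARPA text format. This is a minimal sketch under stated assumptions: the placeholder token `__no_such_word__` and the function name `build_dummy_arpa` are hypothetical, the probabilities are simply uniform, and whether the tuning script accepts such a file (or yields meaningful weights with it) has not been verified here.

```python
import math


def build_dummy_arpa(dummy_word="__no_such_word__"):
    """Return the text of a unigram-only ARPA file with one placeholder word.

    dummy_word is a hypothetical token assumed never to occur in real data.
    """
    # <s> is never predicted, so by convention it gets log10 probability -99.
    # The remaining entries share the probability mass uniformly.
    predicted = ["<unk>", "</s>", dummy_word]
    logprob = math.log10(1.0 / len(predicted))

    lines = ["\\data\\", f"ngram 1={len(predicted) + 1}", "", "\\1-grams:"]
    lines.append("-99\t<s>")
    for word in predicted:
        lines.append(f"{logprob:.6f}\t{word}")
    lines += ["", "\\end\\", ""]
    return "\n".join(lines)


if __name__ == "__main__":
    # Print the file contents; redirect stdout to save them.
    print(build_dummy_arpa(), end="")
```

Redirecting the script's output to a file gives an ARPA file with four unigrams that could then be referenced from the config in place of a real LM. Because the dummy word never matches real text, every target word would score as `<unk>`, so the LM feature would behave much like a second word penalty rather than being strictly absent.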
------------------------------
_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support
End of Moses-support Digest, Vol 111, Issue 27
**********************************************