Moses-support Digest, Vol 108, Issue 12

Send Moses-support mailing list submissions to
moses-support@mit.edu

To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
moses-support-request@mit.edu

You can reach the person managing the list at
moses-support-owner@mit.edu

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."


Today's Topics:

1. Re: Faster decoding with multiple moses instances
(Kenneth Heafield)
2. Re: Faster decoding with multiple moses instances
(Michael Denkowski)
3. Re: Faster decoding with multiple moses instances (Philipp Koehn)


----------------------------------------------------------------------

Message: 1
Date: Mon, 5 Oct 2015 16:10:52 +0100
From: Kenneth Heafield <moses@kheafield.com>
Subject: Re: [Moses-support] Faster decoding with multiple moses
instances
To: moses-support@mit.edu
Message-ID: <561292FC.1060502@kheafield.com>
Content-Type: text/plain; charset=windows-1252

https://github.com/kpu/usage

This injects code into shared executables so that they print usage
statistics to stderr on termination: grep the stderr output, then collate.
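
For example, a minimal sketch of the "grep stderr, collate" step. The exact
format of the printed statistics is not shown in this thread, so the
"RSSMax:<kilobytes>" line below is purely an assumption for illustration;
file names are placeholders:

  # one single-threaded moses per input shard, stderr kept per instance
  for i in $(seq 16); do
    moses -f moses.ini -threads 1 < input.$i > output.$i 2> stderr.$i &
  done
  wait
  # collate: sum the reported peak RSS across instances
  # (assumes each stderr ends with a line of the form "RSSMax:<kilobytes>")
  grep -h RSSMax stderr.* | awk -F: '{sum += $2} END {print sum " kB total"}'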

Kenneth

On 10/05/2015 04:05 PM, Michael Denkowski wrote:
> Hi Philipp,
>
> Unfortunately I don't have a precise measurement. If anyone knows of a
> good way to benchmark a process tree with many processes memory-mapping the
> same files, I would be glad to run it.
>
> --Michael
>
> On Mon, Oct 5, 2015 at 10:26 AM, Philipp Koehn <phi@jhu.edu> wrote:
>
> Hi,
>
> great - that will be very useful.
>
> Since you just ran the comparison - do you have any numbers on
> "still allowed everything to fit into memory", i.e., how much more
> memory is used by running parallel instances?
>
> -phi
>
> On Mon, Oct 5, 2015 at 10:15 AM, Michael Denkowski
> <michael.j.denkowski@gmail.com> wrote:
>
> Hi all,
>
> Like some other Moses users, I noticed diminishing returns from
> running Moses with several threads. To work around this, I
> added a script to run multiple single-threaded instances of
> moses instead of one multi-threaded instance. In practice, this
> sped things up by about 2.5x for 16 cpus and using memory mapped
> models still allowed everything to fit into memory.
>
> If anyone else is interested in using this, you can prefix a
> moses command with scripts/generic/multi_moses.py. To use
> multiple instances in mert-moses.pl,
> specify --multi-moses and control the number of parallel
> instances with --decoder-flags='-threads N'.
>
> Below is a benchmark on WMT fr-en data (2M training sentences,
> 400M words mono, suffix array PT, compact reordering, 5-gram
> KenLM) testing default stack decoding vs cube pruning without
> and with the parallelization script (+multi):
>
> ---
> 1cpu sent/sec
> stack 1.04
> cube 2.10
> ---
> 16cpu sent/sec
> stack 7.63
> +multi 12.20
> cube 7.63
> +multi 18.18
> ---
>
> --Michael
>
> _______________________________________________
> Moses-support mailing list
> Moses-support@mit.edu
> http://mailman.mit.edu/mailman/listinfo/moses-support
>
>
>
>
>
> _______________________________________________
> Moses-support mailing list
> Moses-support@mit.edu
> http://mailman.mit.edu/mailman/listinfo/moses-support
>

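For reference, the parallel-decoding usage described in the quoted message
above might look roughly like this. This is a sketch only: file names are
placeholders, and the exact interface of scripts/generic/multi_moses.py is
not shown in this thread, so check its --help first:

  # prefix an ordinary moses command with the wrapper script; the wrapper
  # runs parallel single-threaded instances instead of one threaded process
  scripts/generic/multi_moses.py moses -f moses.ini -threads 16 \
      < test.fr > test.out

  # during tuning, mert-moses.pl can launch the wrapper itself; -threads N
  # then controls the number of parallel instances
  mert-moses.pl dev.fr dev.en ./bin/moses moses.ini \
      --multi-moses --decoder-flags='-threads 16'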

------------------------------

Message: 2
Date: Mon, 5 Oct 2015 11:17:46 -0400
From: Michael Denkowski <michael.j.denkowski@gmail.com>
Subject: Re: [Moses-support] Faster decoding with multiple moses
instances
To: Hieu Hoang <hieuhoang@gmail.com>
Cc: Moses Support <moses-support@mit.edu>
Message-ID:
<CA+-Geg+PcgeZF8TLrqNGhqt6Pmt_dOG0J5toO1C_CNqsRTwAgw@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

Hi Hieu,

I'm using the memory mapped suffix array phrase table
(PhraseDictionaryBitextSampling). I can run a test with compact PT as well.

--Michael

On Mon, Oct 5, 2015 at 10:48 AM, Hieu Hoang <hieuhoang@gmail.com> wrote:

> What pt implementation did you use, and had it been pre-pruned so that
> there's a limit on how many target phrases there are for a particular
> source phrase? I.e., so you don't have 10,000 entries for 'the'.
>
> I've been digging around in multithreading in the last few weeks. I've
> noticed that the compact pt is VERY bad at handling an unpruned pt.
> Cores                  1    5   10   15   20   25
> Unpruned  compact pt  143   42   32   38   52   62
>           probing pt  245   58   33   25   24   21
> Pruned    compact pt  119   24   15   10   10   10
>           probing pt  117   25   25   10   10   10
>
> Hieu Hoang
> http://www.hoang.co.uk/hieu
>
> On 5 October 2015 at 15:15, Michael Denkowski <
> michael.j.denkowski@gmail.com> wrote:
>
>> Hi all,
>>
>> Like some other Moses users, I noticed diminishing returns from running
>> Moses with several threads. To work around this, I added a script to run
>> multiple single-threaded instances of moses instead of one multi-threaded
>> instance. In practice, this sped things up by about 2.5x for 16 cpus and
>> using memory mapped models still allowed everything to fit into memory.
>>
>> If anyone else is interested in using this, you can prefix a moses
>> command with scripts/generic/multi_moses.py. To use multiple instances in
>> mert-moses.pl, specify --multi-moses and control the number of parallel
>> instances with --decoder-flags='-threads N'.
>>
>> Below is a benchmark on WMT fr-en data (2M training sentences, 400M words
>> mono, suffix array PT, compact reordering, 5-gram KenLM) testing default
>> stack decoding vs cube pruning without and with the parallelization script
>> (+multi):
>>
>> ---
>> 1cpu sent/sec
>> stack 1.04
>> cube 2.10
>> ---
>> 16cpu sent/sec
>> stack 7.63
>> +multi 12.20
>> cube 7.63
>> +multi 18.18
>> ---
>>
>> --Michael
>>
>> _______________________________________________
>> Moses-support mailing list
>> Moses-support@mit.edu
>> http://mailman.mit.edu/mailman/listinfo/moses-support
>>
>>
>

------------------------------

Message: 3
Date: Mon, 5 Oct 2015 11:20:55 -0400
From: Philipp Koehn <phi@jhu.edu>
Subject: Re: [Moses-support] Faster decoding with multiple moses
instances
To: Barry Haddow <bhaddow@inf.ed.ac.uk>
Cc: Moses Support <moses-support@mit.edu>, Michael Denkowski
<michael.j.denkowski@gmail.com>
Message-ID:
<CAAFADDD0WoaDyMSR101gm=A2m_RL8TNgchPEeJg=Lv9vAR0Adg@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

Hi,

with regard to pruning ---

the example EMS config files have

[TRAINING]
score-settings = "--GoodTuring --MinScore 2:0.0001"

which carries out threshold pruning during phrase table construction, going
a good way towards avoiding too many translation options per phrase.
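
Outside of EMS, the same pruning can presumably be requested when running
the training script by hand. A sketch, assuming the standard
train-model.perl -score-options flag; "..." stands for the usual corpus,
alignment, and output arguments:

  # threshold pruning during phrase scoring: drop phrase pairs whose score
  # in score field 2 of the phrase table falls below 0.0001
  train-model.perl ... -score-options '--GoodTuring --MinScore 2:0.0001'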

-phi

On Mon, Oct 5, 2015 at 11:08 AM, Barry Haddow <bhaddow@inf.ed.ac.uk> wrote:

> Hi Hieu
>
> That's exactly why I took to pre-pruning the phrase table, as I mentioned
> on Friday. I had something like 750,000 translations of the most common
> word, and it took half an hour to get the first sentence translated.
>
> cheers - Barry
>
>
> On 05/10/15 15:48, Hieu Hoang wrote:
>
> What pt implementation did you use, and had it been pre-pruned so that
> there's a limit on how many target phrases there are for a particular
> source phrase? I.e., so you don't have 10,000 entries for 'the'.
>
> I've been digging around in multithreading in the last few weeks. I've
> noticed that the compact pt is VERY bad at handling an unpruned pt.
> Cores                  1    5   10   15   20   25
> Unpruned  compact pt  143   42   32   38   52   62
>           probing pt  245   58   33   25   24   21
> Pruned    compact pt  119   24   15   10   10   10
>           probing pt  117   25   25   10   10   10
>
> Hieu Hoang
> http://www.hoang.co.uk/hieu
>
> On 5 October 2015 at 15:15, Michael Denkowski <
> michael.j.denkowski@gmail.com> wrote:
>
>> Hi all,
>>
>> Like some other Moses users, I noticed diminishing returns from running
>> Moses with several threads. To work around this, I added a script to run
>> multiple single-threaded instances of moses instead of one multi-threaded
>> instance. In practice, this sped things up by about 2.5x for 16 cpus and
>> using memory mapped models still allowed everything to fit into memory.
>>
>> If anyone else is interested in using this, you can prefix a moses
>> command with scripts/generic/multi_moses.py. To use multiple instances in
>> mert-moses.pl, specify --multi-moses and control the number of parallel
>> instances with --decoder-flags='-threads N'.
>>
>> Below is a benchmark on WMT fr-en data (2M training sentences, 400M words
>> mono, suffix array PT, compact reordering, 5-gram KenLM) testing default
>> stack decoding vs cube pruning without and with the parallelization script
>> (+multi):
>>
>> ---
>> 1cpu sent/sec
>> stack 1.04
>> cube 2.10
>> ---
>> 16cpu sent/sec
>> stack 7.63
>> +multi 12.20
>> cube 7.63
>> +multi 18.18
>> ---
>>
>> --Michael
>>
>> _______________________________________________
>> Moses-support mailing list
>> Moses-support@mit.edu
>> http://mailman.mit.edu/mailman/listinfo/moses-support
>>
>>
>
>
> _______________________________________________
> Moses-support mailing list
> Moses-support@mit.edu
> http://mailman.mit.edu/mailman/listinfo/moses-support
>
>
>
> The University of Edinburgh is a charitable body, registered in
> Scotland, with registration number SC005336.
>
> _______________________________________________
> Moses-support mailing list
> Moses-support@mit.edu
> http://mailman.mit.edu/mailman/listinfo/moses-support
>
>
>

------------------------------

_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


End of Moses-support Digest, Vol 108, Issue 12
**********************************************
