Send Moses-support mailing list submissions to
moses-support@mit.edu
To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
moses-support-request@mit.edu
You can reach the person managing the list at
moses-support-owner@mit.edu
When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."
Today's Topics:
1. Re: Faster decoding with multiple moses instances (Vito Mandorino)
----------------------------------------------------------------------
Message: 1
Date: Thu, 8 Oct 2015 10:25:48 +0200
From: Vito Mandorino <vito.mandorino@linguacustodia.com>
Subject: Re: [Moses-support] Faster decoding with multiple moses
instances
To: Moses Support <moses-support@mit.edu>
Message-ID:
<CA+8mSmGwCkSfFPaBvLrGqO812nP8e4ED8H4H4zyEuV5Mbx3+Hg@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"
Hi all,
What about mosesserver? Do you think the same speed gains would occur there?
Best,
Vito
2015-10-06 22:39 GMT+02:00 Michael Denkowski <michael.j.denkowski@gmail.com>
:
> Hi Hieu and all,
>
> I just checked in a bug fix for the multi_moses.py script. I forgot to
> override the number of threads for each moses command, so if [threads] were
> specified in the moses.ini, the multi-moses runs were cheating by running a
> bunch of multi-threaded instances. If threads were only being specified on
> the command line, the script was correctly stripping the flag so everything
> should be good. I finished a benchmark on my system with an unpruned
> compact PT (with the fixed script) and got the following:
>
> 16 threads 5.38 sent/sec
> 16 procs 13.51 sent/sec
>
> This definitely used a lot more memory though. Based on some very rough
> estimates looking at free system memory, the memory mapped suffix array PT
> went from 2G to 6G with 16 processes while the compact PT went from 3G to
> 37G. For cases where everything fits into memory, I've seen significant
> speedup from multi-process decoding.
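The fix described above - forcing each spawned instance to run single-threaded no matter what the user's command line requested - could be sketched like this. This is a hypothetical helper for illustration, not the actual multi_moses.py code:

```python
def single_threaded_cmd(moses_cmd):
    """Strip any '-threads N' argument from a moses command line and
    force '-threads 1', so each spawned instance decodes single-threaded."""
    cmd = []
    skip_next = False
    for arg in moses_cmd:
        if skip_next:          # drop the value that followed '-threads'
            skip_next = False
            continue
        if arg in ("-threads", "--threads"):
            skip_next = True   # drop the flag itself
            continue
        cmd.append(arg)
    return cmd + ["-threads", "1"]

# A command that asked for 16 threads becomes single-threaded:
print(single_threaded_cmd(["moses", "-f", "moses.ini", "-threads", "16"]))
# → ['moses', '-f', 'moses.ini', '-threads', '1']
```

Appending the flag assumes the command-line value takes precedence over any [threads] setting in moses.ini, which is the override behaviour the bug fix is about.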
>
> For cases where things don't fit into memory, the multi-moses script could
> be extended to start as many multi-threaded instances as will fit into ram
> and farm out sentences in a way that keeps all of the CPUs busy. I know
> Marcin has mentioned using GNU parallel.
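The scheduling idea in the last paragraph - keep every instance busy by letting each one pull the next sentence as soon as it finishes - can be sketched with Python's multiprocessing. This is a toy stand-in: `decode` here just uppercases text rather than calling a real moses process:

```python
from multiprocessing import Pool

def decode(numbered_sentence):
    """Placeholder for handing one sentence to a decoder instance."""
    idx, sentence = numbered_sentence
    return idx, sentence.upper()  # stand-in for the actual translation

def translate_all(sentences, num_instances):
    # imap_unordered keeps every worker busy: each instance pulls the
    # next sentence as soon as it finishes; results are then re-sorted
    # so the output order matches the input order.
    with Pool(processes=num_instances) as pool:
        results = list(pool.imap_unordered(decode, enumerate(sentences)))
    return [text for _, text in sorted(results)]

if __name__ == "__main__":
    print(translate_all(["le chat", "la maison"], num_instances=2))
    # → ['LE CHAT', 'LA MAISON']
```

A real extension along these lines would cap `num_instances` at however many multi-threaded instances fit in RAM, as suggested above.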
>
> Best,
> Michael
>
> On Tue, Oct 6, 2015 at 4:16 PM, Hieu Hoang <hieuhoang@gmail.com> wrote:
>
>> I've just run some comparison between multithreaded decoder and the
>> multi_moses.py script. It's good stuff.
>>
>> It makes me seriously wonder whether we should abandon multi-threading
>> and go all out for the multi-process approach.
>>
>> There's some advantage to multi-threading - e.g. where model files are
>> loaded into memory rather than memory-mapped. But there are disadvantages
>> too - it's more difficult to maintain and there's about a 10% overhead.
>>
>> What do people think?
>>
>> Phrase-based (threads/processes: 1, 5, 10, 15, 20, 25, 30):
>>
>> Baseline (Compact pt)
>>   real  4m37.000s  1m15.391s  0m51.217s  0m48.287s  0m50.719s  0m52.027s  0m53.045s
>>   user  4m21.544s  5m28.597s  6m38.227s  8m0.975s   8m21.122s  8m3.195s   8m4.663s
>>   sys   0m15.451s  0m34.669s  0m53.867s  1m10.515s  1m20.746s  1m24.368s  1m23.677s
>>
>> Compact pt + multi_moses
>>   real  4m49.474s  1m17.867s  0m43.096s  0m31.999s  0m26.497s  0m26.296s  killed
>>   user  4m33.580s  4m40.486s  4m56.749s  5m6.692s   5m43.845s  7m34.617s  -
>>   sys   0m15.957s  0m32.347s  0m51.016s  1m11.106s  1m44.115s  2m21.263s  -
>>
>> Baseline (Probing pt)
>>   real  4m46.254s  1m16.637s  0m49.711s  0m48.389s  0m49.144s  0m51.676s  0m52.472s
>>   user  4m30.596s  5m32.500s  6m23.706s  7m40.791s  7m51.946s  7m52.892s  7m53.569s
>>   sys   0m15.624s  0m36.169s  0m49.433s  1m6.812s   1m9.614s   1m13.108s  1m12.644s
>>
>> Probing pt + multi_moses
>>   real  4m43.882s  1m17.849s  0m34.245s  0m31.318s  0m28.054s  0m24.120s  0m22.520s
>>   user  4m29.212s  4m47.693s  5m5.750s   5m33.573s  6m18.847s  7m19.642s  8m38.013s
>>   sys   0m15.835s  0m25.398s  0m36.716s  0m41.349s  0m48.494s  1m0.843s   1m13.215s
>>
>> Hiero (threads/processes: 1, 5, 10, 15, 20, 25, 30, 32):
>>
>> Baseline
>>   real  5m33.011s  1m28.935s  0m59.470s  1m0.315s   0m55.619s  0m57.347s  0m59.191s  1m2.786s
>>   user  4m53.187s  6m23.521s  8m17.170s  12m48.303s 14m45.954s 17m58.109s 20m22.891s 21m13.605s
>>   sys   0m39.696s  0m51.519s  1m3.788s   1m22.125s  1m58.718s  2m51.249s  4m4.807s   4m37.691s
>>
>> Baseline + multi_moses
>>   real  -          1m27.215s  0m40.495s  0m36.206s  0m28.623s  0m26.631s  0m25.817s  0m25.401s
>>   user  -          5m4.819s   5m42.070s  5m35.132s  6m46.001s  7m38.151s  9m6.500s   10m32.739s
>>   sys   -          0m38.039s  0m45.753s  0m44.117s  0m52.285s  0m56.655s  1m6.749s   1m16.935s
>>
>> On 05/10/2015 16:05, Michael Denkowski wrote:
>>
>> Hi Philipp,
>>
>> Unfortunately I don't have a precise measurement. If anyone knows of a
>> good way to benchmark a process tree with lots of memory mapping the same
>> files, I would be glad to run it.
>>
>> --Michael
>>
>> On Mon, Oct 5, 2015 at 10:26 AM, Philipp Koehn <phi@jhu.edu> wrote:
>>
>>> Hi,
>>>
>>> great - that will be very useful.
>>>
>>> Since you just ran the comparison - do you have any numbers on "still
>>> allowed everything to fit into memory", i.e., how much more memory is used
>>> by running parallel instances?
>>>
>>> -phi
>>>
>>> On Mon, Oct 5, 2015 at 10:15 AM, Michael Denkowski
>>> <michael.j.denkowski@gmail.com> wrote:
>>>
>>>> Hi all,
>>>>
>>>> Like some other Moses users, I noticed diminishing returns from running
>>>> Moses with several threads. To work around this, I added a script to run
>>>> multiple single-threaded instances of moses instead of one multi-threaded
>>>> instance. In practice, this sped things up by about 2.5x for 16 cpus and
>>>> using memory mapped models still allowed everything to fit into memory.
>>>>
>>>> If anyone else is interested in using this, you can prefix a moses
>>>> command with scripts/generic/multi_moses.py. To use multiple instances in
>>>> mert-moses.pl, specify --multi-moses and control the number of
>>>> parallel instances with --decoder-flags='-threads N'.
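Concretely, the usage described above looks something like the following (file names are illustrative):

```shell
# One single-threaded moses process per requested thread,
# instead of a single 16-threaded process:
scripts/generic/multi_moses.py moses -f moses.ini -threads 16 < in.txt > out.txt

# During tuning, the equivalent options to mert-moses.pl:
#   --multi-moses --decoder-flags='-threads 16'
```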
>>>>
>>>> Below is a benchmark on WMT fr-en data (2M training sentences, 400M
>>>> words mono, suffix array PT, compact reordering, 5-gram KenLM) testing
>>>> default stack decoding vs cube pruning without and with the parallelization
>>>> script (+multi):
>>>>
>>>> ---
>>>> 1cpu sent/sec
>>>> stack 1.04
>>>> cube 2.10
>>>> ---
>>>> 16cpu sent/sec
>>>> stack 7.63
>>>> +multi 12.20
>>>> cube 7.63
>>>> +multi 18.18
>>>> ---
>>>>
>>>> --Michael
>>>>
>>>> _______________________________________________
>>>> Moses-support mailing list
>>>> Moses-support@mit.edu
>>>> http://mailman.mit.edu/mailman/listinfo/moses-support
>>>>
>>>>
>>>
>>
>>
>>
>>
>> --
>> Hieu Hoang
>> http://www.hoang.co.uk/hieu
>>
>>
>
>
--
M. Vito MANDORINO - Chief Scientist
Lingua Custodia - The Translation Trustee
1, Place Charles de Gaulle, 78180 Montigny-le-Bretonneux
Tel: +33 1 30 44 04 23  Mobile: +33 6 84 65 68 89
Email: vito.mandorino@linguacustodia.com
Website: www.linguacustodia.com - www.thetranslationtrustee.com
------------------------------
End of Moses-support Digest, Vol 108, Issue 20
**********************************************