Moses-support Digest, Vol 106, Issue 3

Send Moses-support mailing list submissions to
moses-support@mit.edu

To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
moses-support-request@mit.edu

You can reach the person managing the list at
moses-support-owner@mit.edu

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."

Today's Topics:

1. Re: Parallelizer multi core (Vincent Nguyen)
2. Re: Parallelizer multi core (Hieu Hoang)
3. nplm ngram total order in ems (John Joseph Morgan)

----------------------------------------------------------------------

Message: 1
Date: Sat, 1 Aug 2015 10:51:21 +0200
From: Vincent Nguyen <vnguyen@neuf.fr>
Subject: Re: [Moses-support] Parallelizer multi core
To: Hieu Hoang <hieuhoang@gmail.com>, moses-support
<moses-support@mit.edu>
Message-ID: <55BC8889.1080704@neuf.fr>
Content-Type: text/plain; charset=windows-1252; format=flowed

I am confused.
Is type = 8 in the [LM] section just for KenLM in model creation / training?
If I want to try IRSTLM for model creation and training, then the [LM]
section should not contain any KenLM / type settings.

BUT
in the tuning section, it will not use KenLM, since the moses.ini
will have been generated with IRSTLM.

Am I wrong?

On 01/08/2015 09:53, Hieu Hoang wrote:
> in the [LM] section, make sure
> type = 8
>
>
> On 01/08/2015 11:48, Vincent Nguyen wrote:
>> fair enough.
>> One thing though ....
>> When you use irstlm for LM and training, then the EMS crashes in multi
>> thread at tuning (decoder)
>>
>> What is the easiest way so that the tuning part uses KenLM each time
>> multi threads is activated ? (I mean in EMS).
>>
>> Vincent
>>
>>
>>
>>
>> On 01/08/2015 09:38, Hieu Hoang wrote:
>>> oh alright. I've made it 4 cores. The example config files are aimed at
>>> beginners with laptops.
>>>
>>> On 01/08/2015 10:35, Marcin Junczys-Dowmunt wrote:
>>>> Hi, I agree with Nick. I am using a 64-core machine. "-threads all" will
>>>> grind to a standstill. I am however fine with a few more threads, say 16.
>>>> Best,
>>>> Marcin
>>>>
>>>> On 01.08.2015 00:35, Nikolay Bogoychev wrote:
>>>>> Hey,
>>>>>
>>>>> I have opposed this change in the past for two reasons:
>>>>>
>>>>> Using more than 4 threads doesn't help unless the user is using
>>>>> PhraseDictionaryCompact. See this issue:
>>>>> https://github.com/moses-smt/mosesdecoder/issues/39
>>>>> In fact, on most machines you rarely want to run moses on all
>>>>> available threads.
>>>>>
>>>>> Also, "-threads all" picks up virtual (hyper) threads, which are in
>>>>> fact harmful to performance.
>>>>>
>>>>> If you want to change the default, I think it would be better to have a
>>>>> sane default like 4. It would boost performance for most people, and
>>>>> if you run it on machines with fewer available cores it would not be
>>>>> too bad.
>>>>>
>>>>> Cheers,
>>>>>
>>>>> Nick
>>>>>
>>>>> On 31 Jul 2015 7:31 pm, "Hieu Hoang" <hieuhoang@gmail.com
>>>>> <mailto:hieuhoang@gmail.com>> wrote:
>>>>>
>>>>> good suggestion. Changed:
>>>>> https://github.com/moses-smt/mosesdecoder/commit/f894dec0fd8d5b15eb16c35d3d2599338894ee9d
>>>>> If you have any more suggestions, it's best if you can just send me a
>>>>> patch and I'll check it in.
>>>>>
>>>>> On 31/07/2015 15:59, Vincent Nguyen wrote:
>>>>>> for inexperienced people like me :)
>>>>>> Adding --decoder-flags="-threads 4" is key.
>>>>>>
>>>>>> If EMS config.basic had "-threads all" by default, we would gain A
>>>>>> LOT of time.
>>>>>>
>>>>>> cheers,
>>>>>>
>>>>>> Vincent
>>>>>>
>>>>>>
>>>>>> On 29/07/2015 22:05, Vincent Nguyen wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> I am wondering what tasks of the EMS are really parallelized.
>>>>>>> I activated the script line + 8 cores.
>>>>>>>
>>>>>>> Training / binarizing / tuning all seem to keep only one core actually working.
>>>>>>>
>>>>>>> Am I correct ?
>>>>>>> _______________________________________________
>>>>>>> Moses-support mailing list
>>>>>>> Moses-support@mit.edu <mailto:Moses-support@mit.edu>
>>>>>>> http://mailman.mit.edu/mailman/listinfo/moses-support
>>>>> --
>>>>> Hieu Hoang
>>>>> Researcher
>>>>> New York University, Abu Dhabi
>>>>> http://www.hoang.co.uk/hieu
>>
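
The advice in this thread (cap the decoder at a small fixed number of threads instead of using "-threads all") can be sketched as a small helper. This is only an illustration: the function name `decoder_flags` and its parameters are hypothetical, and the default cap of 4 is simply the value suggested in the discussion. Note that `os.cpu_count()` reports logical CPUs, hyper-threads included, so it is an upper bound and not a count of physical cores.

```python
import os


def decoder_flags(max_threads=4, available=None):
    """Build a value for --decoder-flags that caps the Moses thread count.

    max_threads: cap suggested in the thread (4); hypothetical default.
    available:   logical CPUs to assume; defaults to os.cpu_count().
    """
    if available is None:
        # os.cpu_count() may return None when the count cannot be determined.
        available = os.cpu_count() or 1
    # Never ask for more threads than the cap or than the machine reports,
    # and never fewer than one.
    threads = max(1, min(max_threads, available))
    return "-threads {}".format(threads)


if __name__ == "__main__":
    # Prints the option in the form quoted in the thread.
    print('--decoder-flags="{}"'.format(decoder_flags()))
```

On a 64-core machine, `decoder_flags(16, 64)` would give the "say 16" setting mentioned above, while the default stays at 4.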

------------------------------

Message: 2
Date: Sat, 1 Aug 2015 16:22:49 +0400
From: Hieu Hoang <hieuhoang@gmail.com>
Subject: Re: [Moses-support] Parallelizer multi core
To: Vincent Nguyen <vnguyen@neuf.fr>, moses-support
<moses-support@mit.edu>
Message-ID: <55BCBA19.70401@gmail.com>
Content-Type: text/plain; charset=windows-1252; format=flowed

On 01/08/2015 12:51, Vincent Nguyen wrote:
>
> I am confused
> LM section type = 8 is just for KenLM in model creation / training,
> right ?
No. type = 8 specifies which LM implementation is used during decoding;
it puts 'KENLM' into the moses.ini.

KenLM and IRSTLM also have binarizing programs that make decoding use
less memory. E.g., if you want to binarize with KenLM, then add
[LM]
lm-binarizer = $moses-src-dir/bin/build_binary
> If I want to try IRSTLM for model creation training , then the LM
> section should not contain any KenLM / type stuff
To create an LM:

with SRILM:
lm-training = $srilm-bin-dir/ngram-count
with IRSTLM:
lm-training = "$moses-script-dir/generic/trainlm-irst2.perl -cores $cores -irst-dir $irstlm-dir -temp-dir $working-dir/tmp"
settings = "-s msb -p 0"
with KenLM:
lm-training = "$moses-script-dir/ems/support/lmplz-wrapper.perl -bin $moses-bin-dir/lmplz"
settings = "--prune '0 0 1' -T $working-dir/lm -S 20%"

> BUT
> in the tuning section, it will not use the KenLM since the moses.ini
> will have been generated with IRSTLM
It uses whichever LM implementation you specified with type = ??
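
The point of this reply, that the toolkit used to train the LM and the implementation used at decoding time are set independently, can be illustrated with a short sketch. It only assembles the config lines quoted in this message into an [LM] section; the dictionary keys and the function name `lm_section` are hypothetical, and the result is not a complete EMS configuration.

```python
# lm-training / settings values are copied from the message above.
# The keys ("srilm", "irstlm", "kenlm") are hypothetical labels.
LM_TRAINING = {
    "srilm": ["lm-training = $srilm-bin-dir/ngram-count"],
    "irstlm": [
        'lm-training = "$moses-script-dir/generic/trainlm-irst2.perl '
        '-cores $cores -irst-dir $irstlm-dir -temp-dir $working-dir/tmp"',
        'settings = "-s msb -p 0"',
    ],
    "kenlm": [
        'lm-training = "$moses-script-dir/ems/support/lmplz-wrapper.perl '
        '-bin $moses-bin-dir/lmplz"',
        'settings = "--prune \'0 0 1\' -T $working-dir/lm -S 20%"',
    ],
}


def lm_section(trainer, decoder_type=8):
    """Render an [LM] fragment: training toolkit plus decoder-side type.

    decoder_type=8 is the value the thread associates with KenLM at
    decoding time; it is chosen independently of the training toolkit.
    """
    lines = ["[LM]"]
    lines.extend(LM_TRAINING[trainer])
    lines.append("type = {}".format(decoder_type))
    return "\n".join(lines)


if __name__ == "__main__":
    # Train with IRSTLM, but have the generated moses.ini use KenLM.
    print(lm_section("irstlm"))
```
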
>
> Am I wrong ?

--
Hieu Hoang
Researcher
New York University, Abu Dhabi
http://www.hoang.co.uk/hieu

------------------------------

Message: 3
Date: Sat, 1 Aug 2015 11:30:09 -0400
From: John Joseph Morgan <johnjosephmorgan@gmail.com>
Subject: [Moses-support] nplm ngram total order in ems
To: moses-support@mit.edu
Message-ID: <C1238A4A-69F6-4EFA-8062-F58C7415D631@gmail.com>
Content-Type: text/plain; charset=utf-8

I'm trying to run the toy bilingualnplm example with EMS.
The ngram order gets computed in experiment.perl on line 1868.
The formula is:
$order + 2 * $source_window + 1
If $order is 5 and $source_window is 4 this formula gives 14.
Is this correct?
It doesn't seem right.

John
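
The arithmetic in the question can be checked with a short sketch. The formula is the one quoted from experiment.perl; the function names are hypothetical. The breakdown in `total_by_parts` is an assumption that this digest does not confirm: it reads the total as the target-side history (order - 1 words), a source window of 2 * source_window + 1 words, and the predicted word itself.

```python
def total_ngram_order(order, source_window):
    """Formula as quoted from experiment.perl."""
    return order + 2 * source_window + 1


def total_by_parts(order, source_window):
    """Same total, split into assumed components (see lead-in)."""
    target_history = order - 1               # previous target words
    source_context = 2 * source_window + 1   # aligned source word plus window on each side
    predicted = 1                            # the word being predicted
    return target_history + source_context + predicted


if __name__ == "__main__":
    print(total_ngram_order(5, 4))  # 5 + 2*4 + 1 = 14
    print(total_by_parts(5, 4))     # 4 + 9 + 1 = 14
```

Under that reading, 14 is what the formula is expected to produce for order 5 and source window 4.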

------------------------------

_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support

End of Moses-support Digest, Vol 106, Issue 3
*********************************************
