Moses-support Digest, Vol 101, Issue 38

Send Moses-support mailing list submissions to
moses-support@mit.edu

To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
moses-support-request@mit.edu

You can reach the person managing the list at
moses-support-owner@mit.edu

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."


Today's Topics:

1. Re: In-memory loading of compact phrases (Marcin Junczys-Dowmunt)
2. Re: In-memory loading of compact phrases (Jesús González Rubio)


----------------------------------------------------------------------

Message: 1
Date: Wed, 11 Mar 2015 20:21:15 +0100
From: Marcin Junczys-Dowmunt <junczys@amu.edu.pl>
Subject: Re: [Moses-support] In-memory loading of compact phrases
To: Jesús González Rubio <jegonzalez@prhlt.upv.es>
Cc: moses-support <moses-support@mit.edu>
Message-ID: <550095AB.3050500@amu.edu.pl>
Content-Type: text/plain; charset="utf-8"

Maybe someone will correct me, but if I am not wrong, the gzipped version
already calculates the future score while loading (i.e. each phrase is
scored by the language model at load time). The compact phrase table
cannot do this during loading, so it does it on-line, while collecting
options. That is probably the reason for the slow speed. I suppose your
phrase table has not been pruned? If so, function words like "the" can
have hundreds of thousands of counterparts that all need to be scored by
the LM during collection.

You can limit your phrase table using Barry's prunePhraseTable tool.
With this you can limit it to, say, the 20 best phrases (corresponding
to the ttable limit), so that only these 20 phrases are scored during
collection. That should be orders of magnitude faster.
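
To make the idea concrete, here is a minimal Python sketch of pruning to
the N best targets per source phrase. It is not the prunePhraseTable tool
and does not reproduce its options or its scoring. It assumes the usual
text format "source ||| target ||| scores ..." and ranks entries by a
weighted sum of log scores; the function name, the uniform default
weights, and the ranking rule are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def prune_phrase_table(lines, limit=20, weights=None):
    """Keep at most `limit` best-ranked entries per source phrase.

    Hypothetical helper for illustration only. `lines` is an iterable of
    phrase-table lines; `weights` (optional) gives one weight per score.
    """
    by_source = defaultdict(list)
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        fields = [f.strip() for f in line.split("|||")]
        source = fields[0]
        scores = [float(s) for s in fields[2].split()]
        w = weights or [1.0] * len(scores)
        # Assumed ranking rule: weighted sum of log scores, floored to
        # avoid log(0) on zero-valued features.
        rank = sum(wi * math.log(max(s, 1e-100)) for wi, s in zip(w, scores))
        by_source[source].append((rank, line))

    kept = []
    for entries in by_source.values():
        # Best-ranked entries first; keep only the top `limit`.
        entries.sort(key=lambda e: e[0], reverse=True)
        kept.extend(line for _, line in entries[:limit])
    return kept
```

With limit=20, a source phrase such as "the" keeps only 20 entries, so
at most 20 candidates per source phrase are left to be scored by the LM
during collection. The sketch groups everything in memory, which is fine
for a small table but would need a streaming variant for a ~2G one.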

Best,
Marcin

On 11.03.2015 at 20:12, Jesús González Rubio wrote:
> Thanks for the quick response, I will try as you suggest.
>
> Nevertheless, my main concern is the time spent collecting options. Is
> the observed difference with respect to the gzip'ed tables normal?
> Since the tables are cached, shouldn't the timings be closer?
>
> 2015-03-11 18:52 GMT+00:00 Marcin Junczys-Dowmunt <junczys@amu.edu.pl>:
>
> Hi,
> Try measuring the differences again after a full system reboot
> (fresh reboot before each measurement) or after purging the OS
> read/write caches. Your phrase tables are most likely cached,
> which means they are in fact in memory.
> Best,
> Marcin
>
> On 11.03.2015 at 19:31, Jesús González Rubio wrote:
>> Hi,
>>
>> I'm obtaining some unintuitive timing results when using compact
>> phrase tables. The average translation time per sentence is much
>> higher for them in comparison to using gzip'ed phrase tables.
>> Particularly important is the difference in time required to
>> collect the options. This table summarizes the timings (in seconds):
>>
>>                      Compact            Gzip'ed
>>                 on-disk   in-memory
>> Init:              5.9       6.3         1882.8
>> Per-sentence:
>>  - Collect:        5.9       5.8            0.2
>>  - Search:         1.6       1.6            3.3
>>
>> Results in the table were computed using Moses v2.1 with a
>> single thread (-th 1), but I've seen similar results using the
>> pre-compiled binary for Moses v3.0. The model comprises two
>> phrase-tables (~2G and ~3M), two lexicalized reordering tables
>> (~700M and ~1M) and two language models (~31G and ~38M). You can
>> see the exact configuration in the attached moses.ini file.
>>
>> Interestingly, there is virtually no difference for the compact
>> table between the on-disk and in-memory options.
>> Additionally, timings were higher for the initial sentences in
>> both cases, which I think should not happen with the in-memory
>> option.
>>
>> Could it be that the in-memory option of compact tables
>> (-minpht-memory -minlexr-memory) is not working properly?
>>
>> Cheers.
>> --
>> Jesús
>>
>>
>> _______________________________________________
>> Moses-support mailing list
>> Moses-support@mit.edu
>> http://mailman.mit.edu/mailman/listinfo/moses-support

------------------------------

Message: 2
Date: Wed, 11 Mar 2015 19:31:48 +0000
From: Jesús González Rubio <jesus.g.rubio@gmail.com>
Subject: Re: [Moses-support] In-memory loading of compact phrases
To: Marcin Junczys-Dowmunt <junczys@amu.edu.pl>
Cc: moses-support <moses-support@mit.edu>
Message-ID:
<CAF+=9hjtM+tiBStJGZEjowa9gJ-r_NLm0xgD-Wyi27pLMQFU2A@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

2015-03-11 19:21 GMT+00:00 Marcin Junczys-Dowmunt <junczys@amu.edu.pl>:

> Maybe someone will correct me, but if I am not wrong, the gzipped version
> already calculates the future score while loading (i.e. each phrase is
> scored by the language model at load time). The compact phrase table cannot
> do this during loading, so it does it on-line, while collecting options.
> That is probably the reason for the slow speed. I suppose your phrase table
> has not been pruned? If so, function words like "the" can have hundreds of
> thousands of counterparts that all need to be scored by the LM during
> collection.
>

That makes sense.

> You can limit your phrase table using Barry's prunePhraseTable tool. With
> this you can limit it to, say, the 20 best phrases (corresponding to the
> ttable limit), so that only these 20 phrases are scored during collection.
> That should be orders of magnitude faster.
>

OK.


--
Jesús

------------------------------

_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


End of Moses-support Digest, Vol 101, Issue 38
**********************************************
