Moses-support Digest, Vol 106, Issue 26

Send Moses-support mailing list submissions to
moses-support@mit.edu

To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
moses-support-request@mit.edu

You can reach the person managing the list at
moses-support-owner@mit.edu

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."


Today's Topics:

1. Re: Is there multithread option for KenLM's build_binary?
(Hieu Hoang)


----------------------------------------------------------------------

Message: 1
Date: Tue, 11 Aug 2015 19:57:46 +0400
From: Hieu Hoang <hieuhoang@gmail.com>
Subject: Re: [Moses-support] Is there multithread option for KenLM's
build_binary?
To: liling tan <alvations@gmail.com>, moses-support
<moses-support@mit.edu>
Message-ID: <55CA1B7A.6070000@gmail.com>
Content-Type: text/plain; charset="windows-1252"



On 10/08/2015 17:22, liling tan wrote:
> Dear Moses devs/users,
>
> @Marcin @Ken , Thanks for the tips on the -S for build_binary, RAM
> estimation and the probing vs trie explanations.
>
> Just to check: is there currently an option for lmplz to output a
> binarized model directly, without going through ARPA? If there is, is
> there also a binary-to-ARPA dumping mechanism?
As far as I know, neither of these options is available in the current
version of KenLM.
>
> Regards,
> Liling
>
>
>
>
> On Fri, Aug 7, 2015 at 9:31 PM, liling tan <alvations@gmail.com
> <mailto:alvations@gmail.com>> wrote:
>
> Dear Moses dev/users,
>
> On a related note, without multi-threading, can anyone give a gauge
> of how much RAM is required to binarize an 80GB (compressed .gz)
> 6-gram ARPA file? The n-gram counts are:
>
> \data\
> ngram 1=7503209
> ngram 2=131003943
> ngram 3=671005861
> ngram 4=1510529519
> ngram 5=2165163610
> ngram 6=2477533666
>
>
> Also, how long would it take (single-threaded) on a 2.4GHz core
> with 128GB RAM? Is there a way to mathematically estimate the time
> taken and RAM required to binarize a language model?
>
> Also, is a binarized and quantized LM from KenLM lossy? If so, how
> lossy? The KenLM paper states "To conserve memory at the expense
> of accuracy, values may be quantized using q bits per probability
> and r bits per backoff". Can someone help point us to papers that
> quantify how lossy it gets in terms of MT experiments or a word
> perplexity task?
>
> Thanks in advance for the pointers!
>
> Regards,
> Liling
>
> On Fri, Aug 7, 2015 at 8:56 PM, liling tan <alvations@gmail.com
> <mailto:alvations@gmail.com>> wrote:
>
> Dear Moses dev/users,
>
> Is there a multithread option for KenLM's build_binary?
>
> Regards,
> Liling
>
>
>
>
>
> _______________________________________________
> Moses-support mailing list
> Moses-support@mit.edu
> http://mailman.mit.edu/mailman/listinfo/moses-support

--
Hieu Hoang
Researcher
New York University, Abu Dhabi
http://www.hoang.co.uk/hieu
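
The RAM question quoted above can at least be bounded with simple
arithmetic. What follows is a rough sketch, not KenLM's actual
accounting: the n-gram counts are copied from the quoted \data\ header,
while BYTES_PER_ENTRY and SPACE_MULTIPLIER are assumed values for a
generic hash-table layout, and the function names are hypothetical.

```python
# N-gram counts copied from the \data\ header quoted in the message.
NGRAM_COUNTS = {
    1: 7503209,
    2: 131003943,
    3: 671005861,
    4: 1510529519,
    5: 2165163610,
    6: 2477533666,
}

# ASSUMPTIONS (illustrative only, not taken from the thread or from KenLM):
BYTES_PER_ENTRY = 16    # e.g. 8-byte key + 4-byte probability + 4-byte backoff
SPACE_MULTIPLIER = 1.5  # over-allocation factor of an open-addressing hash table


def total_ngrams(counts):
    """Sum the entries across all n-gram orders."""
    return sum(counts.values())


def rough_hash_table_bytes(counts,
                           bytes_per_entry=BYTES_PER_ENTRY,
                           multiplier=SPACE_MULTIPLIER):
    """Back-of-the-envelope size: entries * bytes per entry * multiplier."""
    return int(total_ngrams(counts) * bytes_per_entry * multiplier)


if __name__ == "__main__":
    total = total_ngrams(NGRAM_COUNTS)
    estimate = rough_hash_table_bytes(NGRAM_COUNTS)
    print(f"total n-grams: {total:,}")
    print(f"rough hash-table estimate: {estimate / 2**30:.1f} GiB")
```

Under these assumed constants the estimate exceeds the 128GB mentioned
in the question, which is the kind of situation where a more compact
structure such as the trie, or quantization, would be worth considering.
The real figure depends on the data structure and options chosen.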


------------------------------

_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


End of Moses-support Digest, Vol 106, Issue 26
**********************************************