Send Moses-support mailing list submissions to
moses-support@mit.edu
To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
moses-support-request@mit.edu
You can reach the person managing the list at
moses-support-owner@mit.edu
When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."
Today's Topics:
1. Re: EMS results - makes sense ? (Vincent Nguyen)
2. Re: Is there multithread option for KenLM's build_binary?
(liling tan)
----------------------------------------------------------------------
Message: 1
Date: Mon, 10 Aug 2015 10:24:10 +0200
From: Vincent Nguyen <vnguyen@neuf.fr>
Subject: Re: [Moses-support] EMS results - makes sense ?
To: moses-support@mit.edu
Message-ID: <55C85FAA.7070505@neuf.fr>
Content-Type: text/plain; charset=windows-1252; format=flowed
Thanks for your feedback, Rico.
What I find hard to understand is how, just by changing the test set, you
can get a 2 BLEU point difference without any other change in the
tuning / training parameters.
It's even 4 points for UEDIN 2014.
On 10/08/2015 09:56, Rico Sennrich wrote:
> Hi Vincent,
>
> The KIT system description paper reports scores on newstest2010 (and
> newstest2009), while the matrix shows scores on newstest2011.
> The UEDIN WMT14 paper reports scores on newstest2012, newstest2013, and
> newstest2014 (it may admittedly be hard to see which is which:
> newstest2013 is the default in that paper). The reason why people report
> results on an older test set is that system development for a shared
> task happens without access to the test set to avoid overfitting to the
> task. Time and space permitting, some experiments are repeated on that
> year's test set for the system description (like in table 6 in the UEDIN
> paper).
>
> best wishes,
> Rico
>
> On 10/08/15 08:32, Vincent Nguyen wrote:
>> Similarly, reading the WMT14 paper from UEDIN, if I am not mistaken I read:
>> 35.9 in the matrix: http://matrix.statmt.org/systems/show/2106
>> 31.76 for B1, the best system, on page 101 of:
>> http://www.statmt.org/wmt14/pdf/W14-3309.pdf
>>
>> Maybe I am not looking at the right information.
>>
On 09/08/2015 23:07, Vincent Nguyen wrote:
>>> Still looking at WMT11; in fact something looks weird to me:
>>> this table suggests that Karlsruhe IT obtained a 30.5 BLEU score for
>>> FR to EN: http://matrix.statmt.org/matrix/systems_list/1669
>>> but reading the paper http://www.statmt.org/wmt11/pdf/WMT45.pdf shows
>>> 28.34 as the final score.
>>>
>>> I am trying not to focus too much on BLEU scores, but they are my only
>>> reference for comparing my experiments.
>>>
>>>
On 04/08/2015 17:28, Barry Haddow wrote:
>>>> Hi Vincent
>>>>
>>>> If you are comparing to the results of WMT11, then you can look at the
>>>> system descriptions to see what the authors did. In fact it's worth
>>>> looking at the WMT14 descriptions (WMT15 will be available next month)
>>>> to see how state-of-the-art systems are built.
>>>>
>>>> For fr-en or en-fr, the first thing to look at is the data. There are
>>>> some large data sets released for WMT and you can get a good gain from
>>>> just crunching more data (monolingual and parallel). Unfortunately
>>>> this takes more resources (disk, cpu etc) so you may run into trouble
>>>> here.
>>>>
>>>> The hierarchical models are much bigger, so yes, you will need more
>>>> disk. For fr-en/en-fr it's probably not worth the extra effort.
>>>>
>>>> cheers - Barry
>>>>
>>>> On 04/08/15 15:58, Vincent Nguyen wrote:
>>>>> Thanks for your insights.
>>>>>
>>>>> I am just stuck on the BLEU difference between my 26 and the 30 of
>>>>> WMT11, and some WMT14 results close to 36 or even 39.
>>>>>
>>>>> I am currently having trouble with the hierarchical rule set (instead
>>>>> of lexicalized reordering). I am wondering if I will get better
>>>>> results, but I get a "filesystem root low disk space" error message
>>>>> before it crashes. Does this model take more disk space in some way?
>>>>>
>>>>> Next I will try to use more corpora, including in-domain data from my
>>>>> internal TMX.
>>>>>
>>>>> Thanks for your answers.
>>>>>
On 04/08/2015 16:02, Hieu Hoang wrote:
>>>>>> On 03/08/2015 13:00, Vincent Nguyen wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> Just a heads up on some EMS results, to get your experienced opinions.
>>>>>>>
>>>>>>> Corpus: Europarl v7 + NC2010
>>>>>>> fr => en
>>>>>>> Evaluation: NC2011.
>>>>>>>
>>>>>>> 1) IRSTLM is much slower than KenLM for training / tuning.
>>>>>> That sounds right. KenLM is also multithreaded, whereas IRSTLM can
>>>>>> only be used for single-threaded decoding.
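
For reference, a minimal sketch of how this usually shows up in practice: a
KenLM language model is declared as a KENLM feature line in moses.ini, and
decoding is parallelized with the decoder's -threads flag. The path, order,
and thread count below are placeholders.

  [feature]
  KENLM name=LM0 factor=0 path=/path/to/lm.binary order=5

  # run the decoder with several threads; this is safe with KenLM,
  # while IRSTLM is limited to single-threaded decoding
  moses -f moses.ini -threads 8 < test.input.fr > test.output.en
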
>>>>>>> 2) BLEU results are almost the same (25.7 with IRSTLM, 26.14 with
>>>>>>> KenLM).
>>>>>> true
>>>>>>> 3) The compact models are faster than the on-disk ones in a short
>>>>>>> test (77 segments: 96 seconds vs. 126 seconds).
>>>>>> true
>>>>>>> 4) One last thing I do not understand, though:
>>>>>>> as a sanity check, I replaced NC2011 with NC2010 in the evaluation (I
>>>>>>> know that since NC2010 is part of the training data, this is not a
>>>>>>> meaningful test). I got roughly the same BLEU score. I would have
>>>>>>> expected a higher score with a test set included in the training
>>>>>>> corpus.
>>>>>>>
>>>>>>> Does that make sense?
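
For reference, in EMS the evaluation set is defined by an [EVALUATION:...]
section of the configuration file, so swapping NC2010 for NC2011 is a
one-section change. A minimal sketch, assuming the input-sgm / reference-sgm
keys used in the sample WMT EMS configs; the data path variable is a
placeholder for your own setup.

  [EVALUATION:newstest2011]
  # source and reference sides of the test set (SGML, as in the WMT releases)
  input-sgm = $wmt-data/dev/newstest2011-src.$input-extension.sgm
  reference-sgm = $wmt-data/dev/newstest2011-ref.$output-extension.sgm
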
>>>>>>>
>>>>>>>
>>>>>>> Next steps:
>>>>>>> What path should I follow to get better scores? I read the 'Optimize'
>>>>>>> section of the website, which deals more with speed, and of course I
>>>>>>> will apply all of it, but I was interested in tips to get more
>>>>>>> quality if possible.
>>>>>> Look into domain adaptation if you have multiple training corpora,
>>>>>> some of which are in-domain and some out-of-domain.
>>>>>>
>>>>>> Other than that, getting a good BLEU score is an open research question.
>>>>>>
>>>>>> Well done on getting this far.
>>>>>>> Thanks
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> Moses-support mailing list
>>>>>>> Moses-support@mit.edu
>>>>>>> http://mailman.mit.edu/mailman/listinfo/moses-support
>>>>>>>
>>>>> _______________________________________________
>>>>> Moses-support mailing list
>>>>> Moses-support@mit.edu
>>>>> http://mailman.mit.edu/mailman/listinfo/moses-support
>>>>>
>>> _______________________________________________
>>> Moses-support mailing list
>>> Moses-support@mit.edu
>>> http://mailman.mit.edu/mailman/listinfo/moses-support
>> _______________________________________________
>> Moses-support mailing list
>> Moses-support@mit.edu
>> http://mailman.mit.edu/mailman/listinfo/moses-support
>>
> _______________________________________________
> Moses-support mailing list
> Moses-support@mit.edu
> http://mailman.mit.edu/mailman/listinfo/moses-support
------------------------------
Message: 2
Date: Mon, 10 Aug 2015 15:22:25 +0200
From: liling tan <alvations@gmail.com>
Subject: Re: [Moses-support] Is there multithread option for KenLM's
build_binary?
To: moses-support <moses-support@mit.edu>
Message-ID:
<CAKzPaJL=rPKWPBE6biUmO2q4BZkanv9SVW1Sf0t5bvEHa_ki4Q@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"
Dear Moses devs/users,
@Marcin @Ken, thanks for the tips on the -S option for build_binary, the RAM
estimation, and the probing vs. trie explanations.
Just to check: is there currently an option for lmplz to output the binarized
model directly, without going through ARPA? If there is, is there also a
mechanism to dump a binary model back to ARPA?
Regards,
Liling
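
As far as I know, the usual workflow is still two steps: estimate an ARPA
file with lmplz and then binarize it with build_binary, along these lines
(the order, memory cap, and file names below are just examples):

  # estimate a 6-gram model; -S caps memory use, -T sets the temp directory
  bin/lmplz -o 6 -S 50% -T /tmp < corpus.tok.en > model.arpa

  # convert the ARPA file to the binary trie format
  bin/build_binary trie model.arpa model.binary
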
On Fri, Aug 7, 2015 at 9:31 PM, liling tan <alvations@gmail.com> wrote:
> Dear Moses dev/users,
>
> On a related note, without multithreading, can anyone give a rough idea of
> how much RAM is required to binarize an 80GB (gzip-compressed) 6-gram ARPA
> file? The n-gram counts are:
>
> \data\
> ngram 1=7503209
> ngram 2=131003943
> ngram 3=671005861
> ngram 4=1510529519
> ngram 5=2165163610
> ngram 6=2477533666
>
>
> Also, how long would it take (single-threaded) on a 2.4 GHz core with
> 128GB of RAM? Is there a way to estimate mathematically the time and RAM
> required to binarize a language model?
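
One practical pointer: if build_binary is given only the ARPA file and no
output name, it should print memory estimates for the probing and trie
structures instead of building anything, which gives a rough answer to the
RAM question, e.g.:

  # print memory-use estimates for each data structure, without building
  bin/build_binary model.arpa
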
>
> Also, is a binarized and quantized LM from KenLM lossy? If so, how lossy?
> The KenLM paper states "To conserve memory at the expense of accuracy,
> values may be quantized using q bits per probability and r bits per
> backoff". Can someone point us to papers that quantify how lossy it gets
> in terms of MT experiments or a word perplexity task?
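
For what it's worth, quantization in build_binary is opt-in: it is only
applied when the -q/-b options are used with the trie structure, so an
unquantized binary model should carry the same probabilities as the ARPA
file. A sketch of a quantized build (the bit widths are just examples):

  # trie structure with 8-bit quantized probabilities (-q) and backoffs (-b)
  bin/build_binary -q 8 -b 8 trie model.arpa model.quant.binary
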
>
> Thanks in advance for the pointers!
>
> Regards,
> Liling
>
> On Fri, Aug 7, 2015 at 8:56 PM, liling tan <alvations@gmail.com> wrote:
>
>> Dear Moses dev/users,
>>
>> Is there a multithread option for KenLM's build_binary?
>>
>> Regards,
>> Liling
>>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mailman.mit.edu/mailman/private/moses-support/attachments/20150810/148ffc11/attachment-0001.htm
------------------------------
_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support
End of Moses-support Digest, Vol 106, Issue 24
**********************************************