Moses-support Digest, Vol 112, Issue 11

Send Moses-support mailing list submissions to
moses-support@mit.edu

To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
moses-support-request@mit.edu

You can reach the person managing the list at
moses-support-owner@mit.edu

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."


Today's Topics:

1. Re: Problem with processPhraseTableMin (Marcin Junczys-Dowmunt)
2. Shared Task on Biomedical Machine Translation at ACL WMT'16
-- - 1st Call for Participation (Mariana Lara Neves)


----------------------------------------------------------------------

Message: 1
Date: Wed, 3 Feb 2016 13:01:36 +0100
From: Marcin Junczys-Dowmunt <junczys@amu.edu.pl>
Subject: Re: [Moses-support] Problem with processPhraseTableMin
To: moses-support@mit.edu
Message-ID: <56B1EC20.1050101@amu.edu.pl>
Content-Type: text/plain; charset=utf-8; format=flowed

Weird.

Jeremy, I binarized your phrase-table a couple of times with different
commits (also the most recent one), and I cannot reproduce the error.
Maybe try -threads 10 or 12.
I can make the binarized versions available for download.
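
The suggestion to cap the thread count can be tried by rerunning the quoted command with an explicit value instead of "-threads all". The sketch below only assembles that command line in Python. It uses the flags that appear in the quoted commands (-in, -out, -threads, -nscores, -T) and nothing else; the helper name build_binarize_command and the default of 12 threads are illustrative assumptions, not part of Moses.

```python
import os
import shlex


def build_binarize_command(table, out, nscores, threads=12, tmp_dir=None):
    """Assemble a processPhraseTableMin call from flags seen in this thread.

    Hypothetical helper: the name and defaults are assumptions for illustration.
    """
    # Same $MOSES environment variable as in the quoted command line.
    moses = os.environ.get("MOSES", "")
    cmd = [
        os.path.join(moses, "bin", "processPhraseTableMin"),
        "-in", table,
        "-out", out,
        "-threads", str(threads),  # explicit cap instead of "all"
        "-nscores", str(nscores),
    ]
    if tmp_dir is not None:
        cmd += ["-T", tmp_dir]  # temporary directory, as in the second quoted run
    return cmd


cmd = build_binarize_command("phrase-table.1.gz", "phrase-table.1", nscores=7)
print(" ".join(shlex.quote(arg) for arg in cmd))
```

Passing the list to subprocess.run(cmd, check=True) would launch the binarizer. Whether a lower thread count avoids the crash is not established in this thread.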

On 02.02.2016 at 18:21, Marcin Junczys-Dowmunt wrote:
> Looks fine, I had no problems running it with 18 and more domain
> indicators. Your machine is certainly more than suitable. Just one
> remark, using more than 8-12 threads usually slows things down, but
> should not cause crashes. Any chance to have a look at that table?
>
> On 02.02.2016 at 18:16, Jeremy Gwinnup wrote:
>> Marcin,
>>
>> I was able to use -T with processLexicalTableMin successfully. I also tried processPhraseTableMin using a local tmp dir with 200G free and it still crashed at step 3 with the huge malloc message. Phrase table is nothing fancy - just standard 4 scores and 3 domain indicator features. Here's a complete output with more info about the phrase table:
>>
>> Phrase table in question:
>>
>> -rw-rw-r-- 1 jgwinnup scream 2.2G Feb 1 23:58 phrase-table.1.gz
>>
>> Machine in question has 1TB RAM/32 cores - should be more than enough for the job
>>
>> Moses git-rev ends with: 80572b4 (Jan. 27)
>>
>> 1tqoct1:model> $MOSES/bin/processPhraseTableMin -in phrase-table.1.gz -out phrase-table.1 -threads all -nscores 7 -T /tmp_with_200G_free
>> WARNING: You are using a nonstandard number of scores (7) with PREnc. Set the index of P(t|s) with -rankscore int if it is not 2.
>> Used options:
>> Text phrase table will be read from: phrase-table.1.gz
>> Output phrase table will be written to: phrase-table.1.minphr
>> Step size for source landmark phrases: 2^10=1024
>> Source phrase fingerprint size: 16 bits / P(fp)=1.52588e-05
>> Selected target phrase encoding: Huffman + PREnc
>> Maxiumum allowed rank for PREnc: 100
>> Number of score components in phrase table: 7
>> Single Huffman code set for score components: no
>> Using score quantization: no
>> Explicitly included alignment information: yes
>> Running with 32 threads
>>
>> Pass 1/3: Creating hash function for rank assignment
>> ..................................................[5000000]
>> ..................................................[10000000]
>> ..................................................[15000000]
>> ..................................................[20000000]
>> ..................................................[25000000]
>> ..................................................[30000000]
>> ..................................................[35000000]
>> ..................................................[40000000]
>> ..................................................[45000000]
>> ....
>>
>> Pass 2/3: Creating source phrase index + Encoding target phrases
>> ..................................................[5000000]
>> ..................................................[10000000]
>> ..................................................[15000000]
>> ..................................................[20000000]
>> ..................................................[25000000]
>> ..................................................[30000000]
>> ..................................................[35000000]
>> ..................................................[40000000]
>> ..................................................[45000000]
>> ....
>>
>> Intermezzo: Calculating Huffman code sets
>> Creating Huffman codes for 471366 target phrase symbols
>> tcmalloc: large alloc 13808820224 bytes == 0xb0592000 @
>> tcmalloc: large alloc 27617640448 bytes == 0x3e86b0000 @
>> tcmalloc: large alloc 5187358422106112 bytes == (nil) @
>> terminate called after throwing an instance of 'std::bad_alloc'
>> what(): std::bad_alloc
>>
>>
>>
>>
>>> On Feb 2, 2016, at 10:21 AM, Jeremy Gwinnup <jeremy@gwinnup.org> wrote:
>>>
>>> Hi,
>>>
>>> I'm having a problem using processPhraseTableMin to compress a phrase table with 7 scores - the program consistently coredumps at step 3 - command and relevant output below. Is there anything I'm doing glaringly wrong?
>>>
>>> Thanks!
>>> -Jeremy
>>>
>>> Command:
>>>
>>> 1tqoct1:model> $MOSES/bin/processPhraseTableMin -in phrase-table.1.gz -out phrase-table.1 -threads all -nscores 7
>>>
>>> Once we get to step 3:
>>>
>>> Intermezzo: Calculating Huffman code sets
>>> Creating Huffman codes for 471366 target phrase symbols
>>> tcmalloc: large alloc 13983629312 bytes == 0xb14ce000 @
>>> tcmalloc: large alloc 27967250432 bytes == 0x3f3ca4000 @
>>> tcmalloc: large alloc 15681406635450368 bytes == (nil) @
>>> terminate called after throwing an instance of 'std::bad_alloc'
>>> what(): std::bad_alloc
>>>
>>> Top looked like this when the program ran into trouble:
>>> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
>>> 27416 jgwinnup 20 0 45.9g 30g 4.0g R 10.6 3.0 1589:17 processPhraseTa
>> _______________________________________________
>> Moses-support mailing list
>> Moses-support@mit.edu
>> http://mailman.mit.edu/mailman/listinfo/moses-support

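A quick sanity check on the "tcmalloc: large alloc" lines in the second quoted run: converting the byte counts to binary units shows that the first two requests are modest for this machine, while the third is thousands of times larger than the stated 1TB of RAM, so std::bad_alloc is the expected outcome of that request. The sketch below does only this arithmetic. Reading "1TB" as 1 TiB is an assumption, and the snippet is a reader's check on the numbers, not a diagnosis given in the thread.

```python
GIB = 1024 ** 3
TIB = 1024 ** 4

# Sizes copied from the "tcmalloc: large alloc" lines of the second quoted run.
requests = [13808820224, 27617640448, 5187358422106112]

# Assumption: "1TB RAM" is read as 1 TiB; the conclusion does not depend on it.
ram_bytes = 1 * TIB


def describe(n_bytes):
    """Format a byte count in GiB, or in TiB once it reaches 1 TiB."""
    if n_bytes >= TIB:
        return f"{n_bytes / TIB:.1f} TiB"
    return f"{n_bytes / GIB:.1f} GiB"


previous = None
for size in requests:
    verdict = "fits in RAM" if size <= ram_bytes else "exceeds RAM"
    growth = "" if previous is None else f", {size / previous:.1f}x the previous request"
    print(f"{size:>18} bytes = {describe(size)} ({verdict}{growth})")
    previous = size
```

In this run the second request is exactly twice the first, which would be consistent with a buffer that grows by doubling, while the third is far out of line with that pattern. What produces the final size is not determined anywhere in the thread.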



------------------------------

Message: 2
Date: Wed, 3 Feb 2016 13:45:39 +0100
From: Mariana Lara Neves <marianalaraneves@gmail.com>
Subject: [Moses-support] Shared Task on Biomedical Machine Translation
at ACL WMT'16 -- - 1st Call for Participation
To: moses-support@mit.edu
Message-ID:
<CAF+PL=fw4rLCCTt3LzzVd6oTJ+p4wn5P3p6k70YfK3MCjShJdw@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

Dear all,

We are pleased to announce a new shared task on biomedical machine
translation in the ACL First Conference on Machine Translation (WMT'16).
This new task aims to evaluate systems on the translation of scientific
publications for the biological and health domains. The documents were
retrieved from the Scielo database of scientific publications and the
training data is already available. The biomedical translation task will
address the following language pairs:

+ English-French and French-English - (health)

+ English-Spanish and Spanish-English - (biological/health)

+ English-Portuguese and Portuguese-English - (biological/health)


Please find more information (including access to the training set) on the
dedicated shared task page:

http://www.statmt.org/wmt16/biomedical-translation-task.html


Important dates:


* Release of training data: end of January 2016 (already available)

* Release of test data: April, 2016

* Results submission deadline: April, 2016

* Paper submission deadline: May 8, 2016

* Notification of acceptance: June 5, 2016

* Camera-ready deadline: June 22, 2016


Best regards,


Mariana Neves (Hasso-Plattner Institute, Germany)
Antonio Jimeno Yepes (IBM Research Australia)
Aurélie Névéol (LIMSI, CNRS, France)
Karin Verspoor (University of Melbourne, Australia)

--

Mariana Neves, PhD
Post-doc researcher, Hasso-Plattner Institut
Web site: http://sites.google.com/site/marianalaraneves

------------------------------

_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


End of Moses-support Digest, Vol 112, Issue 11
**********************************************
