Moses-support Digest, Vol 112, Issue 15

Send Moses-support mailing list submissions to
moses-support@mit.edu

To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
moses-support-request@mit.edu

You can reach the person managing the list at
moses-support-owner@mit.edu

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."


Today's Topics:

1. Re: Problem with processPhraseTableMin (Jeremy Gwinnup)
2. Phrase BitextSampling question (Marwa Refaie)
3. Re: Problem with processPhraseTableMin (Jeremy Gwinnup)


----------------------------------------------------------------------

Message: 1
Date: Thu, 4 Feb 2016 12:58:54 -0500
From: Jeremy Gwinnup <jeremy@gwinnup.org>
Subject: Re: [Moses-support] Problem with processPhraseTableMin
To: moses-support@mit.edu
Message-ID: <256070C1-159D-45A0-98A6-0D2130A18679@gwinnup.org>
Content-Type: text/plain; charset=utf-8

Kenneth,

Here's a backtrace from gdb-ia from the intel+tcmalloc release variant. We're building a debug version now; once it's ready, I'll send along that backtrace if it's any different.

-Jeremy

Intermezzo: Calculating Huffman code sets
Creating Huffman codes for 471366 target phrase symbols
tcmalloc: large alloc 14381105152 bytes == 0xae638000 @
tcmalloc: large alloc 28762210304 bytes == 0x40891c000 @
tcmalloc: large alloc 1869272846557184 bytes == (nil) @
terminate called after throwing an instance of 'std::bad_alloc'
what(): std::bad_alloc

Program received signal SIGABRT, Aborted.
0x000000000062267b in raise ()
(gdb) bt
#0 0x000000000062267b in raise ()
#1 0x000000000062d935 in abort ()
#2 0x00000000005f52b5 in __gnu_cxx::__verbose_terminate_handler ()
at ../../../../libstdc++-v3/libsupc++/vterminate.cc:95
#3 0x00000000005a4f96 in __cxxabiv1::__terminate (handler=<optimized out>)
at ../../../../libstdc++-v3/libsupc++/eh_terminate.cc:47
#4 0x00000000005a4fe1 in std::terminate ()
at ../../../../libstdc++-v3/libsupc++/eh_terminate.cc:57
#5 0x00000000005a37b8 in __cxxabiv1::__cxa_throw (obj=0x3e8daf0,
tinfo=0x963cc0 <typeinfo for std::bad_alloc>,
dest=0x5a3300 <std::bad_alloc::~bad_alloc()>)
at ../../../../libstdc++-v3/libsupc++/eh_throw.cc:87
#6 0x00000000004ea16e in (anonymous namespace)::cpp_alloc (
size=1869272846553800, nothrow=false) at src/tcmalloc.cc:1447
#7 0x00000000006b5b48 in tc_new (size=1869272846553800)
at src/tcmalloc.cc:1601
#8 0x0000000000480b1c in std::vector<unsigned long, std::allocator<unsigned long> >::_M_fill_insert(__gnu_cxx::__normal_iterator<unsigned long*, std::vector<unsigned long, std::allocator<unsigned long> > >, unsigned long, unsigned long const&) ()
#9 0x0000000000484957 in Moses::CanonicalHuffman<unsigned int>::CalcCodes(std::vector<unsigned long, std::allocator<unsigned long> >&) ()
#10 0x000000000045fca3 in Moses::PhraseTableCreator::CalcHuffmanCodes() ()
#11 0x000000000045c0cd in Moses::PhraseTableCreator::PhraseTableCreator(std::string, std::string, std::string, unsigned long, unsigned long, Moses::PhraseTableCreator::Coding, unsigned long, unsigned long, bool, bool, unsigned long, unsigned long, bool, unsigned long) ()
#12 0x0000000000400f79 in main ()
Warning: the current language does not match this frame.
(gdb)
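
As a rough sanity check on the numbers in the log above, the sizes can be compared with a few lines of Python. This is only a sketch of the arithmetic: the 8-byte element size is an assumption (unsigned long on 64-bit Linux), the helper names are made up for illustration, and nothing here identifies the cause of the bad request.

```python
# Sizes copied from the tcmalloc messages and from frame #7 of the backtrace.
LARGE_ALLOCS = [14381105152, 28762210304, 1869272846557184]
TC_NEW_REQUEST = 1869272846553800  # size argument of tc_new in frame #7

GIB = 2 ** 30
SIZEOF_UNSIGNED_LONG = 8  # assumption: 64-bit Linux (LP64)


def describe(sizes):
    """Return (bytes, GiB, ratio to previous size) rows for a list of sizes."""
    rows = []
    previous = None
    for size in sizes:
        ratio = size / previous if previous else None
        rows.append((size, size / GIB, ratio))
        previous = size
    return rows


def element_count(request_bytes, element_size=SIZEOF_UNSIGNED_LONG):
    """Elements a request corresponds to, or None if it is not a whole number."""
    quotient, remainder = divmod(request_bytes, element_size)
    return quotient if remainder == 0 else None


if __name__ == "__main__":
    for size, gib, ratio in describe(LARGE_ALLOCS):
        growth = "" if ratio is None else f"  ({ratio:.1f}x previous)"
        print(f"{size:>18} bytes = {gib:14.2f} GiB{growth}")
    print("elements requested:", element_count(TC_NEW_REQUEST))
```

The second allocation is exactly twice the first, while the third is tens of thousands of times larger than the second. The tc_new request in frame #7 divides evenly into 233659105819225 eight-byte elements, which fits the std::vector<unsigned long> seen in frame #8.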


> On Feb 4, 2016, at 12:01 PM, moses-support-request@mit.edu wrote:
>
> Today's Topics:
>
> 1. Re: Problem with processPhraseTableMin (Kenneth Heafield)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Thu, 4 Feb 2016 15:51:12 +0000
> From: Kenneth Heafield <moses@kheafield.com>
> Subject: Re: [Moses-support] Problem with processPhraseTableMin
> To: moses-support@mit.edu
> Message-ID: <56B37370.9010903@kheafield.com>
> Content-Type: text/plain; charset=UTF-8
>
> I can haz backtrace? It's clearly calculating a huge amount to allocate
> somewhere which is leading to tcmalloc returning NULL but that's not
> tcmalloc's fault.
>
> On 02/04/2016 03:47 PM, Marcin Junczys-Dowmunt wrote:
>> I've been using it with and without tcmalloc with no problems, and the part
>> where it crashes is not multi-threaded anyway. I guess it's the
>> Intel compiler, no idea why though.
>>
>> On 2016-02-04 16:37, Jeremy Gwinnup wrote:
>>
>>> Uli,
>>>
>>> I sent the phrase table to Marcin yesterday to test, and he was able to binarize it successfully. Here, we've been compiling moses with the Intel compiler. We built the same checkout with gcc, and using processPhraseTableMin from that build we were able to binarize the phrase table successfully.
>>>
>>> One thing I saw during testing these different configs was the intel-compiled version would output tcmalloc debug messages, but the gcc-compiled one would not. We're using tcmalloc-minimal for these builds. Should we be using the full version?
>>>
>>> Running moses --version on both builds shows Boost 1.54, Xmlrpc-c 1.33.17 and CMPH (version unknown) linked in. We compile static binaries on a RHEL 6-based distro (Scientific Linux 6.7).
>>>
>>> -Jeremy
>>>> Message: 2
>>>> Date: Thu, 4 Feb 2016 15:03:02 +0000
>>>> From: Ulrich Germann <ulrich.germann@gmail.com>
>>>> Subject: Re: [Moses-support] Problem with processPhraseTableMin
>>>> To: Marcin Junczys-Dowmunt <junczys@amu.edu.pl>
>>>> Cc: "moses-support@mit.edu" <moses-support@mit.edu>
>>>> Message-ID: <CAHQSRUq_gtrCUBkzwMZpVMKYPORmyGsE4sW-4rYBS_jzML1tWA@mail.gmail.com>
>>>> Content-Type: text/plain; charset="utf-8"
>>>>
>>>> I've had processPhraseTableMin crash when the phrase table contains
>>>> duplicate entries (can't remember if there was an unreasonable memory
>>>> allocation involved). Is Marcin using the exact same phrase table?
>>>> Can you check if the phrase table has duplicate entries?
>>>>
>>>> To crash or not to crash could also depend on OS and libraries used.
>>>> You can get the versions of libraries compiled into moses with
>>>>
>>>> moses --version
>>>>
>>>> I've had duplicate entries in the phrase table after running
>>>> ptable-sigtest-filter, which is Marcin's implementation of Johnson et
>>>> al.'s significance filtering that I pulled in from his WIPO branch;
>>>> compile with --with-mm --with-mm-extras to get it compiled.
>>>>
>>>> - Uli
>>> _______________________________________________
>>> Moses-support mailing list
>>> Moses-support@mit.edu <mailto:Moses-support@mit.edu>
>>> http://mailman.mit.edu/mailman/listinfo/moses-support
>
>
> End of Moses-support Digest, Vol 112, Issue 14
> **********************************************
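
Uli's question in the quoted thread, whether the phrase table has duplicate entries, can be checked with a short script along the following lines. This is a sketch under stated assumptions, not a Moses tool: it assumes the text phrase table uses " ||| " as the field separator with source and target as the first two fields, that lines sharing a source phrase are adjacent (the table is sorted by source), and that "duplicate" means the same source and target pair appearing more than once. The function names are made up for illustration.

```python
import gzip
import sys
from itertools import groupby

FIELD_SEPARATOR = " ||| "


def open_text(path):
    """Open a plain or gzip-compressed text file for reading as UTF-8."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt", encoding="utf-8")
    return open(path, "rt", encoding="utf-8")


def source_and_target(line):
    """Return the first two fields (source, target) of a phrase-table line."""
    fields = line.rstrip("\n").split(FIELD_SEPARATOR, 2)
    if len(fields) < 2:
        raise ValueError(f"malformed phrase-table line: {line!r}")
    return fields[0], fields[1]


def find_duplicates(lines):
    """Yield (source, target, count) for pairs that occur more than once.

    Assumes lines with the same source phrase are adjacent, so only one
    source group is held in memory at a time.
    """
    pairs = (source_and_target(line) for line in lines if line.strip())
    for source, group in groupby(pairs, key=lambda pair: pair[0]):
        counts = {}
        for _, target in group:
            counts[target] = counts.get(target, 0) + 1
        for target, count in counts.items():
            if count > 1:
                yield source, target, count


if __name__ == "__main__" and len(sys.argv) == 2:
    with open_text(sys.argv[1]) as handle:
        for source, target, count in find_duplicates(handle):
            print(f"{count}\t{source}{FIELD_SEPARATOR}{target}")
```

Run with the phrase table path as the only argument; it prints one line per repeated pair and nothing if there are none. If the table is not sorted by source, repeats that are far apart would be missed, and a dictionary over the whole file (at a much higher memory cost) would be needed instead.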




------------------------------

Message: 2
Date: Thu, 4 Feb 2016 20:00:39 +0000
From: Marwa Refaie <basmallah@hotmail.com>
Subject: [Moses-support] Phrase BitextSampling question
To: Moses <moses-support@mit.edu>
Message-ID: <DUB112-W5119CFA6536041D6CF1284BAD10@phx.gbl>
Content-Type: text/plain; charset="windows-1256"

I have prepared all the files for an incremental training experiment. My question now is how to set the weights in moses.ini.

PhraseDictionaryBitextSampling name=PT0 output-factor=0 path=/home/marwa/moses3/post1/post2/corp. L1=en L2=ar smooth=0 prov=1
PT0= g+ s g1jr2 0 j+0.2 1 -1 0.2

I wrote the two lines above in moses.ini, but when I translate I get no Arabic output, just the same English input text untranslated, and no errors. What am I missing?

Marwa N. Refaie


------------------------------

Message: 3
Date: Thu, 4 Feb 2016 15:31:49 -0500
From: Jeremy Gwinnup <jeremy@gwinnup.org>
Subject: Re: [Moses-support] Problem with processPhraseTableMin
To: moses-support@mit.edu
Message-ID: <6E40A219-FB1B-4124-92B0-D5044B4FAEF0@gwinnup.org>
Content-Type: text/plain; charset=us-ascii

Ok, this is weird. We just ran the intel+tcmalloc debug variant and it completed normally. We also saw no tcmalloc debug messages this time.


------------------------------

_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


End of Moses-support Digest, Vol 112, Issue 15
**********************************************
