Send Moses-support mailing list submissions to
moses-support@mit.edu
To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
moses-support-request@mit.edu
You can reach the person managing the list at
moses-support-owner@mit.edu
When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."
Today's Topics:
1. Re: differences between moses and moses2 output (Vito Mandorino)
----------------------------------------------------------------------
Message: 1
Date: Thu, 13 Oct 2016 16:08:31 +0200
From: Vito Mandorino <vito.mandorino@linguacustodia.com>
Subject: Re: [Moses-support] differences between moses and moses2 output
To: Hieu Hoang <hieuhoang@gmail.com>
Cc: moses-support <moses-support@mit.edu>
Message-ID:
<CA+8mSmGT799==M7pjNJ4c9=kGnq-OgeCMuVcwMabia2i-G3TYg@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"
We haven't checked the speedup over probingpt + minlexr yet, but we have
found some further differences between the Moses2 output and that of the
standard Moses decoder.
Sometimes the placeholders are replaced with the actual numbers in the
wrong order. For instance:
moses2 output: as of december 2012 , 31
moses output: as of december 31 , 2012
moses2 output: à jour au 2013 février 15
moses output: à jour au 15 février 2013
Is this the expected behavior?
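To find how often this happens across a whole test set, the two decoder outputs can be compared line by line. The following is only a minimal sketch, not part of Moses: the function names and the "@num@" mask token are hypothetical, and it assumes whitespace-tokenized output in which the placeholders stand for plain digit strings.

```python
import re

# Assumption: a "number" is a token made only of digits.
NUM = re.compile(r"\d+")


def mask_numbers(tokens):
    """Replace every numeric token with a fixed mask (hypothetical '@num@')."""
    return ["@num@" if NUM.fullmatch(t) else t for t in tokens]


def numbers_reordered(hyp_a, hyp_b):
    """True if the two hypotheses differ only in the order of their numbers."""
    a, b = hyp_a.split(), hyp_b.split()
    nums_a = [t for t in a if NUM.fullmatch(t)]
    nums_b = [t for t in b if NUM.fullmatch(t)]
    return (
        mask_numbers(a) == mask_numbers(b)   # same sentence apart from numbers
        and sorted(nums_a) == sorted(nums_b)  # same numbers overall
        and nums_a != nums_b                  # but in a different order
    )


def find_reordered(lines_a, lines_b):
    """Return 1-based line numbers where only the number order differs."""
    return [
        i
        for i, (x, y) in enumerate(zip(lines_a, lines_b), start=1)
        if numbers_reordered(x, y)
    ]
```

Applied to the first pair above, `numbers_reordered("as of december 2012 , 31", "as of december 31 , 2012")` returns `True`, while two identical lines return `False`.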
Another minor difference is the handling of the carriage return character
("\r"). It seems to be deleted by standard Moses and converted into a
newline by Moses2.
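The practical effect of that difference can be modelled outside the decoders. This is a sketch of the behaviour as described above, not code taken from either decoder, and the function names are hypothetical. It shows why an embedded "\r" changes the number of output lines in one case but not the other, and how stripping it beforehand gives both decoders the same input.

```python
def delete_cr(text):
    """Model of the behaviour reported for standard Moses: drop every '\\r'."""
    return text.replace("\r", "")


def cr_to_newline(text):
    """Model of the behaviour reported for Moses2: turn every '\\r' into '\\n'."""
    return text.replace("\r", "\n")


def normalize_input(text):
    """Pre-processing step (assumption): strip '\\r' so both decoders agree."""
    return delete_cr(text)


sample = "first part\rsecond part\n"
lines_deleted = delete_cr(sample).splitlines()
lines_converted = cr_to_newline(sample).splitlines()
```

Here `lines_deleted` holds a single line while `lines_converted` holds two, so the outputs of the two decoders would no longer align line by line unless the input is normalized first.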
Best,
Vito
2016-10-07 17:24 GMT+02:00 Hieu Hoang <hieuhoang@gmail.com>:
> yep, it should give you a big speedup compared to probingpt + minlexr model
>
> Hieu Hoang
> http://www.hoang.co.uk/hieu
>
> On 7 October 2016 at 16:21, Vito Mandorino <vito.mandorino@linguacustodia.com> wrote:
>
>> Yes, I modified the line in moses.ini. My comparison was against the
>> probingPT + minlexr reordering model (rather than the .gz reordering model).
>>
>> 2016-10-07 16:25 GMT+02:00 Hieu Hoang <hieuhoang@gmail.com>:
>>
>>> weird. it should be a massive speedup (~500%). You have to change the
>>> moses.ini file slightly
>>>
>>> [feature]
>>> LexicalReordering ... path=reordering-table.msd-bidirectional-fe.0.5.0-0.gz
>>> to
>>> [feature]
>>> LexicalReordering ... property-index=0
>>>
>>>
>>> Hieu Hoang
>>> http://www.hoang.co.uk/hieu
>>>
>>> On 7 October 2016 at 15:02, Vito Mandorino <vito.mandorino@linguacustodia.com> wrote:
>>>
>>>> Yes, that worked for me as well, thank you. There is a little
>>>> improvement in speed but not that much actually (about 5% faster using 30
>>>> threads).
>>>>
>>>> 2016-10-04 11:44 GMT+02:00 Hieu Hoang <hieuhoang@gmail.com>:
>>>>
>>>>> yes - the script expects the files to be gzipped.
>>>>> It runs ok for me. I executed this:
>>>>>
>>>>> MOSES_DIR=~/workspace/github/mosesdecoder.perf
>>>>>
>>>>> $MOSES_DIR/scripts/generic/binarize4moses2.perl \
>>>>>   --phrase-table=phrase-table.gz \
>>>>>   --lex-ro=reordering-table.wbe-msd-bidirectional-fe.gz \
>>>>>   --output-dir=integrated_phrase-reordering/ \
>>>>>   --num-lex-scores=6
>>>>>
>>>>> Got this:
>>>>>
>>>>> Executing: gzip -dc phrase-table.gz | /home/hieu/workspace/github/mosesdecoder.perf/scripts/generic/../../contrib/sigtest-filter/filter-pt -n 0 | gzip -c > ./tmp.14373/pt.gz
>>>>> ...
>>>>> Reading phrase table finished, writing remaining files to disk.
>>>>>
>>>>> $ ll integrated_phrase-reordering/
>>>>> total 24688
>>>>> drwxrwxr-x 2 hieu hieu 4096 Oct 4 10:38 ./
>>>>> drwxrwxr-x 5 hieu hieu 4096 Oct 4 10:42 ../
>>>>> -rw-rw-r-- 1 hieu hieu 917861 Oct 4 10:42 Alignments.dat
>>>>> -rw-rw-r-- 1 hieu hieu 2267885 Oct 4 10:42 cache
>>>>> -rw-rw-r-- 1 hieu hieu 76 Oct 4 10:42 config
>>>>> -rw-rw-r-- 1 hieu hieu 3146720 Oct 4 10:42 probing_hash.dat
>>>>> -rw-rw-r-- 1 hieu hieu 333856 Oct 4 10:42 source_vocabids
>>>>> -rw-rw-r-- 1 hieu hieu 18429920 Oct 4 10:42 TargetColl.dat
>>>>> -rw-rw-r-- 1 hieu hieu 121401 Oct 4 10:42 TargetVocab.dat
>>>>>
>>>>>
>>>>> On 04/10/2016 09:06, Vito Mandorino wrote:
>>>>>
>>>>> The command was
>>>>>
>>>>> perl /home/Moses/mosesdecoder/scripts/generic/binarize4moses2.perl \
>>>>>   --phrase-table=/home/vito/phrase-table.sorted \
>>>>>   --lex-ro=/home/vito/reordering-table.sorted \
>>>>>   --output-dir=/home/vito/integrated_phrase-reordering/ \
>>>>>   --num-lex-scores=6
>>>>>
>>>>> The tables in the command are sorted with LC_ALL. I attach them in
>>>>> .gz format. Should the .gz files also be used in the command above?
>>>>>
>>>>> Vito
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>
>>
>>
>
>
--
M. Vito MANDORINO -- Chief Scientist
The Translation Trustee
1, Place Charles de Gaulle, 78180 Montigny-le-Bretonneux
Tel : +33 1 30 44 04 23 Mobile : +33 6 84 65 68 89
Email : vito.mandorino@linguacustodia.com
Website : www.linguacustodia.finance <http://www.linguacustodia.com/>
------------------------------
End of Moses-support Digest, Vol 120, Issue 14
**********************************************