Moses-support Digest, Vol 120, Issue 9

Send Moses-support mailing list submissions to
moses-support@mit.edu

To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
moses-support-request@mit.edu

You can reach the person managing the list at
moses-support-owner@mit.edu

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."


Today's Topics:

1. Re: differences between moses and moses2 output (Vito Mandorino)
2. Re: differences between moses and moses2 output (Hieu Hoang)


----------------------------------------------------------------------

Message: 1
Date: Fri, 7 Oct 2016 17:21:52 +0200
From: Vito Mandorino <vito.mandorino@linguacustodia.com>
Subject: Re: [Moses-support] differences between moses and moses2 output
To: Hieu Hoang <hieuhoang@gmail.com>
Cc: moses-support <moses-support@mit.edu>
Message-ID:
<CA+8mSmF9h2yg0YYPf9Ku9EDxX6cPaVMk60D=ssWL_K2O8X6iyg@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

Yes, I modified the line in moses.ini. My comparison was against the
probingPT + minlexr reordering model (rather than the .gz reordering model).
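
For readers following along, the moses.ini change referred to here (swapping the path=... argument of the LexicalReordering feature for property-index=0, as described in the quoted message) can be sketched as below. Using sed for this is only one way to make the edit, not something the thread prescribes, and the sample feature line is a stand-in: name=LexicalReordering0 and num-features=6 are assumed placeholders, and a real line will carry whatever other arguments your own configuration uses.

```shell
#!/bin/sh
# Sketch only: rewrite the LexicalReordering feature line of a stand-in moses.ini.
set -eu

workdir=$(mktemp -d)

# Hypothetical minimal config; the real file contains many more features.
cat > "$workdir/moses.ini" <<'EOF'
[feature]
LexicalReordering name=LexicalReordering0 num-features=6 path=reordering-table.msd-bidirectional-fe.0.5.0-0.gz
EOF

# On lines starting with "LexicalReordering ", replace the path=... token
# with property-index=0 and write the result to a separate file.
sed -E '/^LexicalReordering /s/path=[^[:space:]]+/property-index=0/' \
    "$workdir/moses.ini" > "$workdir/moses2.ini"

cat "$workdir/moses2.ini"
```

Writing to a second file keeps the original configuration available for a side-by-side comparison between the two setups.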

2016-10-07 16:25 GMT+02:00 Hieu Hoang <hieuhoang@gmail.com>:

> weird. it should be a massive speedup (~500%). You have to change the
> moses.ini file slightly
>
> [feature]
> LexicalReordering ... path=reordering-table.msd-bidirectional-fe.0.5.0-0.gz
> to
> [feature]
> LexicalReordering ... property-index=0
>
>
> Hieu Hoang
> http://www.hoang.co.uk/hieu
>
> On 7 October 2016 at 15:02, Vito Mandorino <vito.mandorino@linguacustodia.com> wrote:
>
>> Yes, that worked for me as well, thank you. There is a little improvement
>> in speed but not that much actually (about 5% faster using 30 threads).
>>
>> 2016-10-04 11:44 GMT+02:00 Hieu Hoang <hieuhoang@gmail.com>:
>>
>>> yes - the script expects the files to be gzipped.
>>> It runs ok for me. I executed this:
>>>
>>> MOSES_DIR=~/workspace/github/mosesdecoder.perf
>>>
>>> $MOSES_DIR/scripts/generic/binarize4moses2.perl \
>>>   --phrase-table=phrase-table.gz \
>>>   --lex-ro=reordering-table.wbe-msd-bidirectional-fe.gz \
>>>   --output-dir=integrated_phrase-reordering/ \
>>>   --num-lex-scores=6
>>>
>>> Got this:
>>>
>>> Executing: gzip -dc phrase-table.gz | /home/hieu/workspace/github/mosesdecoder.perf/scripts/generic/../../contrib/sigtest-filter/filter-pt -n 0 | gzip -c > ./tmp.14373/pt.gz
>>> ...
>>> Reading phrase table finished, writing remaining files to disk.
>>>
>>> $ ll integrated_phrase-reordering/
>>> total 24688
>>> drwxrwxr-x 2 hieu hieu 4096 Oct 4 10:38 ./
>>> drwxrwxr-x 5 hieu hieu 4096 Oct 4 10:42 ../
>>> -rw-rw-r-- 1 hieu hieu 917861 Oct 4 10:42 Alignments.dat
>>> -rw-rw-r-- 1 hieu hieu 2267885 Oct 4 10:42 cache
>>> -rw-rw-r-- 1 hieu hieu 76 Oct 4 10:42 config
>>> -rw-rw-r-- 1 hieu hieu 3146720 Oct 4 10:42 probing_hash.dat
>>> -rw-rw-r-- 1 hieu hieu 333856 Oct 4 10:42 source_vocabids
>>> -rw-rw-r-- 1 hieu hieu 18429920 Oct 4 10:42 TargetColl.dat
>>> -rw-rw-r-- 1 hieu hieu 121401 Oct 4 10:42 TargetVocab.dat
>>>
>>>
>>> On 04/10/2016 09:06, Vito Mandorino wrote:
>>>
>>> The command was
>>>
>>> perl /home/Moses/mosesdecoder/scripts/generic/binarize4moses2.perl \
>>>   --phrase-table=/home/vito/phrase-table.sorted \
>>>   --lex-ro=/home/vito/reordering-table.sorted \
>>>   --output-dir=/home/vito/integrated_phrase-reordering/ \
>>>   --num-lex-scores=6
>>>
>>> The tables in the command are sorted with LC_ALL. I attach them in .gz
>>> format. Should the command above also use the .gz files?
>>>
>>> Vito
>>>
>>>
>>>
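
As the quoted exchange above establishes, binarize4moses2.perl expects its input tables to be gzipped. A minimal sketch of that preparation step follows. The table contents are tiny made-up stand-ins, the /path/to/mosesdecoder default is a placeholder, and the final command is only printed rather than executed; the flags are the ones used in the thread.

```shell
#!/bin/sh
# Sketch only: gzip the sorted tables before handing them to binarize4moses2.perl.
set -eu

workdir=$(mktemp -d)

# Hypothetical one-line stand-ins for the sorted phrase and reordering tables.
printf 'a ||| b ||| 0.5 0.5 0.5 0.5\n' > "$workdir/phrase-table.sorted"
printf 'a ||| b ||| 0.2 0.2 0.2 0.2 0.1 0.1\n' > "$workdir/reordering-table.sorted"

# Compress each table, keeping the uncompressed original in place.
for f in phrase-table reordering-table; do
  gzip -c "$workdir/$f.sorted" > "$workdir/$f.gz"
done

# Verify that both archives are intact before binarizing.
gzip -t "$workdir/phrase-table.gz" "$workdir/reordering-table.gz"

# Placeholder install location; override by exporting MOSES_DIR.
MOSES_DIR=${MOSES_DIR:-/path/to/mosesdecoder}

# Assemble (but do not run) the binarization command with the gzipped inputs.
cmd="$MOSES_DIR/scripts/generic/binarize4moses2.perl --phrase-table=$workdir/phrase-table.gz --lex-ro=$workdir/reordering-table.gz --output-dir=$workdir/integrated_phrase-reordering/ --num-lex-scores=6"
echo "$cmd"
```

Once MOSES_DIR points at a real checkout, the printed command can be run as-is.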


--
M. Vito MANDORINO -- Chief Scientist
The Translation Trustee
1, Place Charles de Gaulle, 78180 Montigny-le-Bretonneux
Tel : +33 1 30 44 04 23 Mobile : +33 6 84 65 68 89
Email : vito.mandorino@linguacustodia.com
Website : www.linguacustodia.finance <http://www.linguacustodia.com/>

------------------------------

Message: 2
Date: Fri, 7 Oct 2016 16:24:33 +0100
From: Hieu Hoang <hieuhoang@gmail.com>
Subject: Re: [Moses-support] differences between moses and moses2 output
To: Vito Mandorino <vito.mandorino@linguacustodia.com>
Cc: moses-support <moses-support@mit.edu>
Message-ID:
<CAEKMkbiCUSoW5cr_Ma7sjuLxFtJAzYFoy4dAhTnrd_nbNCbhaA@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

yep, it should give you a big speedup compared to probingpt + minlexr model

Hieu Hoang
http://www.hoang.co.uk/hieu

On 7 October 2016 at 16:21, Vito Mandorino <vito.mandorino@linguacustodia.com> wrote:

> Yes, I modified the line in moses.ini. My comparison was against the
> probingPT + minlexr reordering model (rather than the .gz reordering model).

------------------------------

_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


End of Moses-support Digest, Vol 120, Issue 9
*********************************************
