Send Moses-support mailing list submissions to
moses-support@mit.edu
To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
moses-support-request@mit.edu
You can reach the person managing the list at
moses-support-owner@mit.edu
When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."
Today's Topics:
1. Re: differences between moses and moses2 output (Vito Mandorino)
2. Re: differences between moses and moses2 output (Hieu Hoang)
----------------------------------------------------------------------
Message: 1
Date: Fri, 7 Oct 2016 16:02:54 +0200
From: Vito Mandorino <vito.mandorino@linguacustodia.com>
Subject: Re: [Moses-support] differences between moses and moses2 output
To: Hieu Hoang <hieuhoang@gmail.com>
Cc: moses-support <moses-support@mit.edu>
Message-ID:
<CA+8mSmECP-CoDny6eSt9GX6GM=N+y+fj0Jyo2c+kCd+hi4SA=g@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"
Yes, that worked for me as well, thank you. There is a little improvement
in speed but not that much actually (about 5% faster using 30 threads).
2016-10-04 11:44 GMT+02:00 Hieu Hoang <hieuhoang@gmail.com>:
> yes - the script expects the files to be gzipped.
> It runs ok for me. I executed this:
>
> MOSES_DIR=~/workspace/github/mosesdecoder.perf
>
> $MOSES_DIR/scripts/generic/binarize4moses2.perl \
>   --phrase-table=phrase-table.gz \
>   --lex-ro=reordering-table.wbe-msd-bidirectional-fe.gz \
>   --output-dir=integrated_phrase-reordering/ --num-lex-scores=6
>
> Got this:
>
> Executing: gzip -dc phrase-table.gz | /home/hieu/workspace/github/mosesdecoder.perf/scripts/generic/../../contrib/sigtest-filter/filter-pt -n 0 | gzip -c > ./tmp.14373/pt.gz
> ...
> Reading phrase table finished, writing remaining files to disk.
>
> $ ll integrated_phrase-reordering/
> total 24688
> drwxrwxr-x 2 hieu hieu 4096 Oct 4 10:38 ./
> drwxrwxr-x 5 hieu hieu 4096 Oct 4 10:42 ../
> -rw-rw-r-- 1 hieu hieu 917861 Oct 4 10:42 Alignments.dat
> -rw-rw-r-- 1 hieu hieu 2267885 Oct 4 10:42 cache
> -rw-rw-r-- 1 hieu hieu 76 Oct 4 10:42 config
> -rw-rw-r-- 1 hieu hieu 3146720 Oct 4 10:42 probing_hash.dat
> -rw-rw-r-- 1 hieu hieu 333856 Oct 4 10:42 source_vocabids
> -rw-rw-r-- 1 hieu hieu 18429920 Oct 4 10:42 TargetColl.dat
> -rw-rw-r-- 1 hieu hieu 121401 Oct 4 10:42 TargetVocab.dat
>
>
> On 04/10/2016 09:06, Vito Mandorino wrote:
>
> The command was
>
> perl /home/Moses/mosesdecoder/scripts/generic/binarize4moses2.perl \
>   --phrase-table=/home/vito/phrase-table.sorted \
>   --lex-ro=/home/vito/reordering-table.sorted \
>   --output-dir=/home/vito/integrated_phrase-reordering/ --num-lex-scores=6
>
> The tables in the command are sorted with LC_ALL . I attach them in .gz
> format. Should one use the .gz format also in the command above?
>
> Vito
>
>
>
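The gzip requirement from the quoted exchange can be sketched as follows. This is only an illustration: `gzip_table` is a hypothetical helper name, and the paths and flags in the commented example are the ones quoted in the thread, not verified against any particular Moses checkout.

```shell
#!/bin/sh
# Hypothetical helper: compress a table with gzip, keeping the uncompressed
# original, and print the path of the resulting .gz file.
gzip_table() {
    gzip -c "$1" > "$1.gz" && printf '%s\n' "$1.gz"
}

# Example with the paths and flags quoted in the thread (not run here):
#   pt=$(gzip_table /home/vito/phrase-table.sorted)
#   ro=$(gzip_table /home/vito/reordering-table.sorted)
#   perl /home/Moses/mosesdecoder/scripts/generic/binarize4moses2.perl \
#       --phrase-table="$pt" --lex-ro="$ro" \
#       --output-dir=/home/vito/integrated_phrase-reordering/ --num-lex-scores=6
```

Writing to a new `.gz` file rather than compressing in place leaves the sorted tables available for other tools.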
--
*M**. Vito MANDORINO -- Chief Scientist*
*The Translation Trustee*
*1, Place Charles de Gaulle, **78180 Montigny-le-Bretonneux*
*Tel : +33 1 30 44 04 23 Mobile : +33 6 84 65 68 89*
*Email :* *vito.mandorino@linguacustodia.com
<massinissa.ahmim@linguacustodia.com>*
*Website :*
*www.linguacustodia.finance <http://www.linguacustodia.com/>*
------------------------------
Message: 2
Date: Fri, 7 Oct 2016 15:25:30 +0100
From: Hieu Hoang <hieuhoang@gmail.com>
Subject: Re: [Moses-support] differences between moses and moses2 output
To: Vito Mandorino <vito.mandorino@linguacustodia.com>
Cc: moses-support <moses-support@mit.edu>
Message-ID:
<CAEKMkbhrr6dFOhxYX1NyMchpNpcVnrB9U2jEqdvY4KeU0gWfaA@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"
Weird. It should be a massive speedup (~500%). You have to change the
moses.ini file slightly, from
[feature]
LexicalReordering ... path=reordering-table.msd-bidirectional-fe.0.5.0-0.gz
to
[feature]
LexicalReordering ... property-index=0
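The moses.ini change described above could be applied with a small script. This is a minimal sketch, assuming the LexicalReordering feature sits on a single line and carries exactly one `path=` argument; `swap_lexro_path` is a hypothetical helper name, and the file names in the usage comment are placeholders.

```shell
#!/bin/sh
# Hypothetical helper: print a copy of a moses.ini in which the path=...
# argument on the LexicalReordering line is replaced by property-index=0.
# Every other line, and every other argument on that line, passes through.
swap_lexro_path() {
    sed -E '/^LexicalReordering[[:space:]]/ s/[[:space:]]path=[^[:space:]]+/ property-index=0/' "$1"
}

# Usage (file names are placeholders):
#   swap_lexro_path moses.ini > moses2.ini
```

Printing to standard output rather than editing in place keeps the original moses.ini untouched, so both configurations remain available for comparison.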
Hieu Hoang
http://www.hoang.co.uk/hieu
On 7 October 2016 at 15:02, Vito Mandorino <vito.mandorino@linguacustodia.com> wrote:
> Yes, that worked for me as well, thank you. There is a little improvement
> in speed but not that much actually (about 5% faster using 30 threads).
>
------------------------------
_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support
End of Moses-support Digest, Vol 120, Issue 8
*********************************************