Send Moses-support mailing list submissions to
moses-support@mit.edu
To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
moses-support-request@mit.edu
You can reach the person managing the list at
moses-support-owner@mit.edu
When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."
Today's Topics:
1. Re: Contrib Web - translate.cgi (Vincent Nguyen)
2. character-level translation (fatma elzahraa Eltaher)
3. Re: EMS help (Vincent Nguyen)
----------------------------------------------------------------------
Message: 1
Date: Thu, 30 Jul 2015 18:39:22 +0200
From: Vincent Nguyen <vnguyen@neuf.fr>
Subject: Re: [Moses-support] Contrib Web - translate.cgi
To: moses-support@mit.edu
Message-ID: <55BA533A.9000701@neuf.fr>
Content-Type: text/plain; charset=utf-8; format=flowed
found it ....
I needed #!/usr/bin/perl -w
instead of
#!/usr/bin/env perl
but now I am confused.
I thought the second one was fine.
Why does it not work in the context of translate.cgi?
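A possible explanation, which nothing in this thread confirms, is that `#!/usr/bin/env perl` resolves `perl` through PATH, and a web server may hand CGI scripts a different or more restricted PATH than an interactive shell. A minimal sketch of that mechanism follows; `fakeperl` and `script.cgi` are hypothetical stand-ins so the demo runs without Perl or a web server.

```shell
#!/bin/sh
# Hypothetical demo: "fakeperl" stands in for perl so the sketch runs anywhere.
tmp=$(mktemp -d)
mkdir "$tmp/bin"

# Stub interpreter: ignores the script it is handed and just reports that it ran.
cat > "$tmp/bin/fakeperl" <<'EOF'
#!/bin/sh
echo "interpreter found"
EOF
chmod +x "$tmp/bin/fakeperl"

# Script whose shebang relies on env to locate the interpreter via PATH.
cat > "$tmp/script.cgi" <<'EOF'
#!/usr/bin/env fakeperl
EOF
chmod +x "$tmp/script.cgi"

# PATH that contains the interpreter: the lookup done by env succeeds.
with_path=$(PATH="$tmp/bin:/usr/bin:/bin" "$tmp/script.cgi" 2>&1)

# Restricted PATH (an assumption about the CGI environment): env finds nothing.
without_path=$(PATH="/nonexistent" "$tmp/script.cgi" 2>&1)
status=$?

echo "with PATH:    $with_path"
echo "without PATH: exit status $status"
rm -rf "$tmp"
```

If this is the cause, printing the PATH seen by the CGI script would show whether `perl` is reachable from it. An absolute shebang such as `#!/usr/bin/perl -w` avoids the PATH lookup entirely.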
On 30/07/2015 16:48, Vincent Nguyen wrote:
>
> still don't know why not working but I managed to patch the original
> tokenizer / detokenizer .perl files from Hervé.
>
> I added the conversion ' => '
> and reversed in his code.
>
> not very clean but does the job.
>
>
>
> On 30/07/2015 14:53, Vincent Nguyen wrote:
>> nope.
>>
>> the tokenizer.perl in scripts/share/tokenizer is much longer than the
>> one in contrib/web/bin.
>>
>> I replaced it file for file, modified the path to nonbreaking_prefixes
>> tried with -b
>> but nothing happens in the ./daemon.pl window.
>>
>>
>>
>>
>> On 30/07/2015 10:38, Barry Haddow wrote:
>>> Try using the -b option in the tokenizer / detokenizer to disable
>>> buffering.
>>>
>>> On 29/07/15 18:47, Vincent Nguyen wrote:
>>>> Hi,
>>>>
>>>> As is, it was working fine, except that the tokenizer / detokenizer .perl code
>>>> is outdated.
>>>> It causes problems with the apostrophe in French.
>>>>
>>>> So I changed the translate.cgi file to run the 2 perl files from
>>>> moses/scripts/share/tokenizer
>>>> but it does not work at all.
>>>> Do they not take the same parameters?
>>>>
>>>> Cheers,
>>>> Vincent
>>>> _______________________________________________
>>>> Moses-support mailing list
>>>> Moses-support@mit.edu
>>>> http://mailman.mit.edu/mailman/listinfo/moses-support
>>>>
------------------------------
Message: 2
Date: Thu, 30 Jul 2015 20:07:55 +0200
From: fatma elzahraa Eltaher <fatmaeltaher@gmail.com>
Subject: [Moses-support] character-level translation
To: moses-support@mit.edu
Message-ID:
<CAOW1BbQ5ayg_Wc3Ds8v4XdNfVGESygNr924WkM_03_tMS2NS_w@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"
Dear All,
All I need is to enter one word and get its translation. Is this available
in Moses, and what must I do to use Moses for character-level translation?
thank you,
Fatma El-Zahraa El -Taher
Teaching Assistant at Computer & System department
Faculty of Engineering, Azhar University
Email : fatmaeltaher@gmail.com
mobile: +201141600434
------------------------------
Message: 3
Date: Thu, 30 Jul 2015 22:01:18 +0200
From: Vincent Nguyen <vnguyen@neuf.fr>
Subject: Re: [Moses-support] EMS help
To: Barry Haddow <bhaddow@staffmail.ed.ac.uk>, moses-support
<moses-support@mit.edu>
Message-ID: <55BA828E.6040807@neuf.fr>
Content-Type: text/plain; charset="windows-1252"
Barry,
If I want the end result to be Compact tables and not OnDisk,
do I have to binarize first, or can I convert directly to Compact (i.e.
can I skip the CreateOnDisk step)?
If so, is there a predefined script, or should I do it manually?
thanks
On 28/07/2015 15:44, Barry Haddow wrote:
> Hi Vincent
>
>
> I think the quotes are getting stripped off further down the pipeline.
> You could work around by changing to the compact phrase table. Or try
> editing binarize-model.perl to change
>
> safesystem("$RealBin/filter-model-given-input.pl $targetdir
> $input_config /dev/null $hierarchical -nofilter -Binarizer
> $binarizer") || die "binarising failed";
>
> to
>
> safesystem("$RealBin/filter-model-given-input.pl $targetdir
> $input_config /dev/null $hierarchical -nofilter -Binarizer
> \"$binarizer\"") || die "binarising failed";
>
> Note the escaped quotes around the $binarizer.
>
> cheers - Barry
>
> On 28/07/15 14:09, Vincent Nguyen wrote:
>> same error:
>>
>> #!/bin/bash
>> PATH="/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games"
>> cd /home/moses/working
>> echo 'starting at '`date`' on '`hostname`
>> mkdir -p /home/moses/working/training
>> mkdir -p /home/moses/working/model
>> /home/moses/mosesdecoder/scripts/training/binarize-model.perl
>> /home/moses/working/model/moses.ini.5
>> /home/moses/working/model/moses.bin.ini.6 -Binarizer
>> "/home/moses/mosesdecoder/bin/CreateOnDiskPt 1 1 4 100 2"
>>
>> echo 'finished at '`date`
>> touch /home/moses/working/steps/6/TRAINING_binarize-config.6.DONE
>>
>>
>>
>>
>> On 28/07/2015 14:47, Barry Haddow wrote:
>>> Hi Vincent
>>>
>>> It could be a bug. Could you edit
>>> mosesdecoder/scripts/ems/experiment.meta and change the line:
>>>
>>> template: $binarize-all IN OUT -Binarizer $ttable-binarizer
>>>
>>> to
>>>
>>> template: $binarize-all IN OUT -Binarizer "$ttable-binarizer"
>>>
>>> Note that I have added quotes. Then you'll have to delete the most
>>> recent run, and re-run experiment.perl. If it works, fine. If it
>>> doesn't, could you post the steps/6/TRAINING_binarize-config.6
>>> script (hopefully I got the name right - you may need to change the
>>> number)
>>>
>>> cheers - Barry
>>>
>>>
>>> On 28/07/15 13:11, Vincent Nguyen wrote:
>>>> I know but this is what I have in my config.basic now:
>>>> # conversion of rule table into binary on-disk format
>>>> ttable-binarizer = "$moses-bin-dir/CreateOnDiskPt 1 1 4 100 2"
>>>> binarize-all = $moses-script-dir/training/binarize-model.perl
>>>>
>>>> I don't know where else I can add the 5 arguments or if I need to
>>>> reference ttable-binarizer somewhere
>>>>
>>>>
>>>> On 28/07/2015 13:49, Barry Haddow wrote:
>>>>> Hi Vincent
>>>>>
>>>>> If you look at the error log, you will see:
>>>>>
>>>>>> Usage: /home/moses/mosesdecoder/bin/CreateOnDiskPt
>>>>>> numSourceFactors numTargetFactors numScores tableLimit
>>>>>> sortScoreIndex inputPath outputPath
>>>>> You are missing the first 5 arguments to CreateOnDiskPt, as given
>>>>> in config.basic.
>>>>>
>>>>> cheers - Barry
>>>>>
>>>>> On 28/07/15 12:37, Vincent Nguyen wrote:
>>>>>> I don't know why but the binarize crashes see below ....
>>>>>>
>>>>>>>
>>>>>>>> in my working directory I have 2 subdir,
>>>>>>>> "tuning" with inside moses.filtered.ini.5 moses.ini.5
>>>>>>>> moses.tuned.ini.5
>>>>>>>> and
>>>>>>>> "model" with inside moses.ini.5 (apparently this one does not
>>>>>>>> have the
>>>>>>>> tuned weights)
>>>>>>>>
>>>>>>>> those in the tuning subdir : the "tuned" one moses.tuned.ini.5
>>>>>>>> generated
>>>>>>>> after the moses.ini.5 seems to point on phrase-table.5.gz not
>>>>>>>> binarized
>>>>>>>> and the moses.5.ini seem to point on the binarized within
>>>>>>>> tuning/filtered.5/...
>>>>>>>> unclear to me on which one I should use.
>>>>>>> If you run EMS, there will be a filtered ini file inside the
>>>>>>> evaluation directory which can be used to translate the test set
>>>>>>> using the tuned weights. However this model is filtered for the
>>>>>>> test set, so you cannot use it on other sentences.
>>>>>>>
>>>>>>> If you want the full model binarised, then you should add:
>>>>>>>
>>>>>>> binarize-all = $moses-script-dir/training/binarize-model.perl
>>>>>>>
>>>>>>> to the [GENERAL] section of the EMS config and rerun EMS. In
>>>>>>> this case the moses.tuned.ini in tuning can be used to translate
>>>>>>> any sentences.
>>>>>>>
>>>>>>
>>>>>>
>>>>>> Executing:
>>>>>> /home/moses/mosesdecoder/scripts/training/filter-model-given-input.pl
>>>>>> /home/moses/working/model/moses.bin.ini.6.tables
>>>>>> /home/moses/working/model/moses.ini.5 /dev/null -nofilter
>>>>>> -Binarizer /home/moses/mosesdecoder/bin/CreateOnDiskPt
>>>>>> Executing: mkdir -p /home/moses/working/model/moses.bin.ini.6.tables
>>>>>> Stripping XML...
>>>>>> Executing:
>>>>>> /home/moses/mosesdecoder/scripts/training/../generic/strip-xml.perl
>>>>>> < /dev/null >
>>>>>> /home/moses/working/model/moses.bin.ini.6.tables/input.34384
>>>>>> pt:PhraseDictionaryMemory name=TranslationModel0 num-features=4
>>>>>> path=/home/moses/working/model/phrase-table.5 input-factor=0
>>>>>> output-factor=0
>>>>>> Considering factor 0
>>>>>> ro:LexicalReordering name=LexicalReordering0 num-features=6
>>>>>> type=wbe-msd-bidirectional-fe-allff input-factor=0
>>>>>> output-factor=0
>>>>>> path=/home/moses/working/model/reordering-table.5.wbe-msd-bidirectional-fe.gz
>>>>>>
>>>>>> Considering factor 0
>>>>>> Filtering files...
>>>>>> filtering /home/moses/working/model/phrase-table.5 ->
>>>>>> /home/moses/working/model/moses.bin.ini.6.tables/phrase-table.0-0.1.1...
>>>>>>
>>>>>> Executing: ln -s /home/moses/working/model/phrase-table.5.gz
>>>>>> /home/moses/working/model/moses.bin.ini.6.tables/phrase-table.0-0.1.1.gz
>>>>>>
>>>>>> binarizing...
>>>>>> Executing: /home/moses/mosesdecoder/bin/CreateOnDiskPt
>>>>>> /home/moses/working/model/moses.bin.ini.6.tables/phrase-table.0-0.1.1.gz
>>>>>> /home/moses/working/model/moses.bin.ini.6.tables/phrase-table.0-0.1.1.bin
>>>>>>
>>>>>> Usage: /home/moses/mosesdecoder/bin/CreateOnDiskPt
>>>>>> numSourceFactors numTargetFactors numScores tableLimit
>>>>>> sortScoreIndex inputPath outputPath
>>>>>> Exit code: 1
>>>>>> Can't binarize at
>>>>>> /home/moses/mosesdecoder/scripts/training/filter-model-given-input.pl
>>>>>> line 417.
>>>>>> Exit code: 1
>>>>>> binarising failed at
>>>>>> /home/moses/mosesdecoder/scripts/training/binarize-model.perl
>>>>>> line 43.
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>>
>>
>
>
>
> The University of Edinburgh is a charitable body, registered in
> Scotland, with registration number SC005336.
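The quoting problem discussed in this thread can be reproduced in isolation. In the failing log, the command after `-Binarizer` is only the `CreateOnDiskPt` path, without its five numeric arguments. That is what happens when a value containing spaces is passed on without quotes and the receiving script reads only the next word. The sketch below assumes nothing about the Moses scripts themselves; `count_args` is a hypothetical helper that only reports how many arguments arrive.

```shell
#!/bin/sh
# Hypothetical helper: prints how many arguments it received.
count_args() {
    echo "$#"
}

# Value copied from the config.basic line quoted in this thread.
binarizer="/home/moses/mosesdecoder/bin/CreateOnDiskPt 1 1 4 100 2"

# Unquoted: the shell splits the value on whitespace into six words,
# so -Binarizer is followed by the bare path and five loose arguments.
unquoted=$(count_args -Binarizer $binarizer)

# Quoted: the whole string stays a single argument after -Binarizer.
quoted=$(count_args -Binarizer "$binarizer")

echo "unquoted: $unquoted arguments"   # prints "unquoted: 7 arguments"
echo "quoted:   $quoted arguments"     # prints "quoted:   2 arguments"
```

The same reasoning applies at every hop of the pipeline (the EMS template, the generated step script, and each `safesystem` call): the quotes have to be re-added wherever the value is interpolated into a new command string.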
------------------------------
_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support
End of Moses-support Digest, Vol 105, Issue 73
**********************************************