Send Moses-support mailing list submissions to
moses-support@mit.edu
To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
moses-support-request@mit.edu
You can reach the person managing the list at
moses-support-owner@mit.edu
When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."
Today's Topics:
   1. Re: Phrase extraction breaks on unexpected format of
      aligned.grow-diag-final (Philipp Koehn)
   2. Re: Query regarding unsupervised transliteration support in
      moses (Philipp Koehn)
   3. Re: Phrase extraction breaks on unexpected format of
      aligned.grow-diag-final (Maarten van Gompel)
   4. Re: daemon.pl script issue (Ihab Ramadan)
----------------------------------------------------------------------
Message: 1
Date: Mon, 6 Oct 2014 22:57:08 -0400
From: Philipp Koehn <pkoehn@inf.ed.ac.uk>
Subject: Re: [Moses-support] Phrase extraction breaks on unexpected
format of aligned.grow-diag-final
To: Maarten van Gompel <proycon@anaproy.nl>
Cc: "moses-support@mit.edu" <moses-support@mit.edu>
Message-ID:
<CAAFADDAuQQuEk7wZZwuQ+DWysCymw3kZFq-uRmb_w8+5gsGyZQ@mail.gmail.com>
Content-Type: text/plain; charset=UTF-8
Hi,
which version of symal are you using?
The one distributed with Moses has not changed, but I am aware that
Nicola Bertoldi's online mgiza includes a version of symal with the
reported behaviour. You should use the Moses one (in the Moses bin directory).
-phi
On Mon, Oct 6, 2014 at 4:00 AM, Maarten van Gompel <proycon@anaproy.nl> wrote:
> Hi,
>
> I'm using the latest git version of Moses, and it seems the training
> pipeline has broken somehow because the format of aligned.grow-diag-final changed.
>
> I'm invoking train-model.perl as follows:
>
> /vol/customopt/machine-translation/src/mosesdecoder/scripts/training/train-model.perl -external-bin-dir /vol/customopt/machine-translation/bin -root-dir . --corpus train --f fr --e en --first-step 1 --last-step 9 -reordering msd-bidirectional-fe --lm 0:3:/scratch/proycon/mosestest/train.fr.lm -mgiza -mgiza-cpus 20 -cores 20 -sort-buffer-size 10G -sort-batch-size 253 -sort-compress gzip -sort-parallel 20
>
> And it fails with warnings like these on every sentence pair:
>
> WARNING: Et is a bad alignment point in sentence 44968
> T: If we do , I am sure we will be listened to .
> S: Et lorsque nous serons capables de le faire , je suis sûr qu' ils nous écouteront .
>
> Looking into the code of 'extract', I see that aligned.grow-diag-final is supposed to consist of lines of space-separated %d-%d pairs (the alignment points). But my aligned.grow-diag-final seems to be in a newer format and looks like this:
>
> Je trouve que ce n' est pas acceptable . {##} I consider this to be unacceptable . {##} 0-0 1-1 2-1 3-2 6-4 4-5 5-5 6-5 7-5 8-6
>
> The 'extract' program expects only the last part. So I manually stripped the source and target sentences, leaving only the alignment points, and then it works. Is something going wrong in the training pipeline?
>
> Regards,
>
> --
>
> Maarten van Gompel
> Centre for Language Studies
> Radboud Universiteit Nijmegen
>
> proycon@anaproy.nl
> http://proycon.anaproy.nl
> http://github.com/proycon
>
> GnuPG key: 0x1A31555C XMPP: proycon@anaproy.nl
> Bitcoin: 1BRptZsKQtqRGSZ5qKbX2azbfiygHxJPsd
>
> _______________________________________________
> Moses-support mailing list
> Moses-support@mit.edu
> http://mailman.mit.edu/mailman/listinfo/moses-support
------------------------------
Message: 2
Date: Mon, 6 Oct 2014 23:47:57 -0400
From: Philipp Koehn <pkoehn@inf.ed.ac.uk>
Subject: Re: [Moses-support] Query regarding unsupervised
transliteration support in moses
To: Ratish Puduppully <ratish123@gmail.com>
Cc: "moses-support@mit.edu" <moses-support@mit.edu>
Message-ID:
<CAAFADDASYqiZ7rNNoXUaiz5eAP7KVQPE6rodfd21PjN4YrpGSg@mail.gmail.com>
Content-Type: text/plain; charset=UTF-8
Hi,
thanks a lot - I added that to the web site!
-phi
On Sat, Sep 27, 2014 at 6:49 AM, Ratish Puduppully <ratish123@gmail.com> wrote:
> Thanks Prof.
> I had missed the post-decoding-transliteration.pl step.
> I ran it and got the output. The output looks good and the BLEU score
> improves.
> Compliments to the team for their efforts.
>
> I also created a short document that walks through the steps for
> unsupervised transliteration without EMS:
> https://docs.google.com/document/d/1G9RjczZXWGHU6byJFORf6uToItph1jU_piL53wQhGXg/edit?usp=sharing
>
> Regards,
>
> On Fri, Sep 26, 2014 at 11:11 PM, Philipp Koehn <pkoehn@inf.ed.ac.uk> wrote:
>>
>> Hi,
>>
>> did you run post-decoding transliteration?
>>
>> The original Moses run does not perform the transliteration; it only
>> produces additional output that is used in the post-decoding
>> transliteration step.
>>
>> That step is done by running
>> $MOSES/scripts/Transliteration/post-decoding-transliteration.pl
>>
>> Post-decoding transliteration is fully integrated into EMS, but you can
>> also run it by hand. Unfortunately there is no detailed documentation for
>> the script, but its options are fairly self-explanatory.
>>
>> -phi
>>
>> On Fri, Sep 26, 2014 at 7:43 AM, Ratish Puduppully <ratish123@gmail.com>
>> wrote:
>>>
>>> Hi,
>>> Thanks for introducing support for unsupervised transliteration in Moses.
>>> I was able to run it as described using EMS.
>>> I would now like to run it using the Moses training scripts. I performed
>>> the following steps:
>>>
>>> 1. Set up srilm and mgiza.
>>> 2. Ran train-transliteration-module.pl, which generated a phrase table
>>>    for transliteration.
>>> 3. Trained Moses for translation with two additional arguments:
>>>    -post-decoding-translit yes
>>>    -transliteration-phrase-table <path to transliteration phrase table>
>>>
>>> Accordingly, my moses.ini has additional entries.
>>> But during decoding, my transliteration phrase table is not being
>>> consulted, so the OOV words are not being transliterated.
>>> I have attached my moses.ini file.
>>> Please help me resolve the issue.
>>>
>>> --
>>> Ratish Puduppully
>>> Research Engineer
>>> Center for Indian Language Technology
>>> Computer Science and Engineering Department
>>> IIT Bombay
>>> email: ratishp@cse.iitb.ac.in
------------------------------
Message: 3
Date: Tue, 07 Oct 2014 09:43:44 +0200
From: Maarten van Gompel <proycon@anaproy.nl>
Subject: Re: [Moses-support] Phrase extraction breaks on unexpected
format of aligned.grow-diag-final
To: moses-support@mit.edu
Message-ID: <20141007074344.27323.68559@roma.anaproy.nl>
Content-Type: text/plain; charset="utf-8"
Quoting Philipp Koehn (2014-10-07 04:57:08)
> Hi,
>
> which version of symal are you using?
>
> The one distributed with Moses has not changed, but I am aware that
> Nicola Bertoldi's online mgiza includes a version of symal with the
> reported behaviour. You should use the Moses one (in the Moses bin directory).
>
> -phi
Hi,
Thanks! That may well be the issue: I am indeed using mgiza, and it
installed its own version of symal.
--
Maarten van Gompel
Centre for Language Studies
Radboud Universiteit Nijmegen
proycon@anaproy.nl
http://proycon.anaproy.nl
http://github.com/proycon
GnuPG key: 0x1A31555C XMPP: proycon@anaproy.nl
Bitcoin: 1BRptZsKQtqRGSZ5qKbX2azbfiygHxJPsd
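
For readers who already have an alignment file in the three-field format quoted in this thread, the manual workaround described there (keeping only the alignment points) could be scripted roughly as follows. This is a minimal sketch, not part of Moses or mgiza: the function names alignment_only and is_plain_alignment are hypothetical, and the only format assumption is the " {##} " separator seen in the example line from the original report. Using the symal binary shipped with Moses, as suggested above, remains the actual fix.

```python
import re

# Field separator seen in the symal output quoted in this thread (assumption:
# it always separates source sentence, target sentence, and alignment points).
FIELD_SEP = "{##}"
# One alignment point in the %d-%d form that 'extract' expects.
POINT_RE = re.compile(r"^\d+-\d+$")


def alignment_only(line: str) -> str:
    """Return only the alignment points from one line of symal output."""
    # Keep everything after the last separator; a line without a separator
    # is returned unchanged apart from surrounding whitespace.
    return line.rsplit(FIELD_SEP, 1)[-1].strip()


def is_plain_alignment(line: str) -> bool:
    """True if every whitespace-separated token looks like %d-%d."""
    return all(POINT_RE.match(tok) for tok in line.split())


if __name__ == "__main__":
    line = ("Je trouve que ce n' est pas acceptable . {##} "
            "I consider this to be unacceptable . {##} "
            "0-0 1-1 2-1 3-2 6-4 4-5 5-5 6-5 7-5 8-6")
    print(alignment_only(line))
    # prints: 0-0 1-1 2-1 3-2 6-4 4-5 5-5 6-5 7-5 8-6
```

Running is_plain_alignment over each line of aligned.grow-diag-final before phrase extraction is a cheap way to notice that the wrong symal was used.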
------------------------------
Message: 4
Date: Tue, 7 Oct 2014 13:42:37 +0200
From: "Ihab Ramadan" <i.ramadan@saudisoft.com>
Subject: Re: [Moses-support] daemon.pl script issue
To: "'Hieu Hoang'" <hieuhoang@gmail.com>, <moses-support@mit.edu>
Message-ID: <003c01cfe223$cc8f9b90$65aed2b0$@saudisoft.com>
Content-Type: text/plain; charset="us-ascii"
The exact command is
"/mosesdecoder-master/contrib/web/daemon.pl serverIP port
/path/of/moses.ini"
Could you please tell me where to find documentation for the client-server
code?
Thanks
From: Hieu Hoang [mailto:hieuhoang@gmail.com]
Sent: Thursday, October 2, 2014 11:35 PM
To: i.ramadan@saudisoft.com; moses-support@mit.edu
Subject: Re: [Moses-support] daemon.pl script issue
What is the exact command you ran?
I'm not sure how robust the code in contrib/web is.
It's been over 2 years since anyone has touched that code.
https://github.com/moses-smt/mosesdecoder/tree/master/contrib/web
If you want a more robust client-server architecture, you should use the
code in contrib/server.
On 01/10/14 15:17, Ihab Ramadan wrote:
Dear all,
Your support is much appreciated.
I have a problem running the daemon.pl script: it sometimes closes the
open socket at random, reporting
"can't bind server socket"
Any advice?
Best Regards
Ihab Ramadan | Senior Developer | Saudisoft - Egypt <http://www.saudisoft.com/>
Tel +2 02 330 320 37 Ext- 0 | Mob +201007570826 | Fax +20233032036
------------------------------
End of Moses-support Digest, Vol 96, Issue 6
********************************************