Moses-support Digest, Vol 113, Issue 72

Send Moses-support mailing list submissions to
moses-support@mit.edu

To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
moses-support-request@mit.edu

You can reach the person managing the list at
moses-support-owner@mit.edu

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."


Today's Topics:

1. Re: moses 2.1 vs 3 (Mohammad Salameh)
2. Re: ERROR: use --lm factor:order:filename to specify at least
one language model (Philipp Koehn)
3. Re: english-japanese translation (Vito Mandorino)


----------------------------------------------------------------------

Message: 1
Date: Mon, 28 Mar 2016 17:25:15 -0600
From: Mohammad Salameh <msalameh83@gmail.com>
Subject: Re: [Moses-support] moses 2.1 vs 3
To: Hieu Hoang <hieuhoang@gmail.com>
Cc: moses-support <moses-support@mit.edu>
Message-ID:
<CADbe+DHVfGBE+oU=G92kmWzr_rCPnOVCYgg+RsvXuMR2CgwFrA@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

Hi Hieu,
The Moses version I downloaded was from 8 December 2015.
I downloaded several versions after that date that all shared the same
issue.
So I think that was it.
Thanks again


On Mon, Mar 28, 2016 at 5:04 PM, Hieu Hoang <hieuhoang@gmail.com> wrote:

> well spotted, there seemed to have been a mistake which I've just fixed
>
> https://github.com/moses-smt/mosesdecoder/commit/b8bc4a9fb64731b593ba56cce1da1af0bbb6138e
> But are you sure the problem was in Moses 3? It seemed the bug was
> introduced only in Dec 2015
>
> https://github.com/moses-smt/mosesdecoder/commit/65f4f1f92af3d645b1dfc2f161bbcf47bdce1402
> whereas Moses 3 was released in Jan 2015
>
>
> On 28/03/2016 16:58, Mohammad Salameh wrote:
>
> Hi
> I have recently started using Moses 3.0 and I noticed the following
> difference compared to Moses 2.1
> Assuming ttable-limit is set to 20, Moses 3.0 retrieves the 20 translation
> options of each source segment from the phrase table. But before
> calculating the future costs and while it is checking the translation
> options spans, Moses 3.0 only considers the top 8 options for each span.
> Moses 2.1 considers the exact 20 options for each source span.
>
> I have tested this using the same configuration files on both systems.
> Although it never reports that it pruned anything, it seems most of
> the translation options are pruned.
>
> *Moses 2.1*
> Total translation options: 814
> Total translation options pruned: 0
> translation options spanning from 0 to 0 is 20
> ...
>
> *Moses 3.0*
> Total translation options: 206
> Total translation options pruned: 0
> translation options spanning from 0 to 0 is 8
> ...
>
> Is there a parameter I can set to consider exactly the top 20 without
> discarding any?
>
>
> Regards,
> Salameh
>
>
> _______________________________________________
> Moses-support mailing list
> Moses-support@mit.edu
> http://mailman.mit.edu/mailman/listinfo/moses-support
>
>
>

------------------------------

Message: 2
Date: Mon, 28 Mar 2016 23:58:23 -0400
From: Philipp Koehn <phi@jhu.edu>
Subject: Re: [Moses-support] ERROR: use --lm factor:order:filename to
specify at least one language model
To: Zhanwang Chen <zhanwang.c@gmail.com>
Cc: "moses-support@mit.edu" <moses-support@mit.edu>
Message-ID:
<CAAFADDATLiH8WpSc5E1Atg6=H=vaTJu57AT-_T6ftnHhOwf5QQ@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

Hi,

you have to give the training script the location of the language model.
The language model has to be built separately, but the moses.ini
config file (that is the result of training) includes a pointer to it, so
the training script needs to know what to fill in.

-phi
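
A minimal sketch of what this looks like in practice, written in Python only to spell out the three parts of the `--lm factor:order:filename` value. The LM path `/path/to/target.lm` and the order 3 are hypothetical placeholders; the remaining options are copied from the command quoted below. It assumes factor 0 denotes the plain surface word, as is the usual Moses convention, so passing `--lm` should not by itself turn this into a factored model.

```python
import shlex


def lm_argument(factor, order, filename):
    """Format the value expected by --lm as factor:order:filename."""
    return f"{factor}:{order}:{filename}"


def training_command(lm_path, order=3):
    """Build the train-model.perl call from the thread, plus --lm."""
    return [
        "train-model.perl",
        "-hierarchical",
        "-ghkm",
        "-external-bin-dir", "/home/zhanwang/giza-pp/tools/",
        "-root-dir", ".",
        "--corpus", "corpus/nl2mr",
        "--f", "mr",
        "--e", "nl",
        # Assumption: factor 0 (surface words) and a 3-gram LM
        # stored at a placeholder path.
        "--lm", lm_argument(0, order, lm_path),
    ]


if __name__ == "__main__":
    # Print the command as a single shell-quoted line.
    print(shlex.join(training_command("/path/to/target.lm")))
```

The language model itself still has to be built beforehand, on the target side (`--e`) of the corpus, with a separate LM toolkit.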

On Sat, Mar 26, 2016 at 8:37 AM, Zhanwang Chen <zhanwang.c@gmail.com> wrote:

> when I run:
> $MOSES/scripts/training/train-model.perl -hierarchical -ghkm
> -external-bin-dir /home/zhanwang/giza-pp/tools/ -root-dir . --corpus
> corpus/nl2mr --f mr --e nl
>
> showed:
> ERROR: use --lm factor:order:filename to specify at least one language
> model at /home/zhanwang/mosesdecoder/scripts/training/train-model.perl line
> 597.
>
> root@zhanwang-virtual-machine:/home/zhanwang/mosesmodel/corpus3#
> $MOSES/scripts/training/train-model.perl
>
> But I don't want to use a factored model.
>
> I tried this too, and it shows me the same thing. Whatever parameter I
> try, it just asks me to use --lm factor:order:filename.
> Running the training script
>
> For a standard phrase model, you will typically run the training script
> as follows.
>
> Run the training script:
>
> train-model.perl -root-dir . --corpus corpus/euro --f de --e en
>
> I want to build a syntax-based translation model. What should I do?
>
> $MOSES/scripts/training/train-model.perl -ghkm -external-bin-dir
> /home/zhanwang/giza-pp/tools/ -root-dir . --corpus corpus/nl2mr --f mr --e
> nl
>
>
> here is my corpus:
>
> root@zhanwang-virtual-machine:/home/zhanwang/mosesmodel/corpus3/corpus#
> head nl2mr.nl nl2mr.mr
>
> ==> nl2mr.nl <==
>
> Give me the cities in Virginia .
>
> What are the high points of states surrounding Mississippi ?
>
>
> ==> nl2mr.mr <==
>
> answer city loc_2 stateid 'virginia'
>
> answer high_point_1 state next_to_2 stateid 'mississippi'
>
> I want to extract ghkm rules and build a model that can translate "Give
> me the cities in Virginia ." to "answer city loc_2 stateid 'virginia'"
>
> Thanks a lot.
>
> Zhanwang
>
>
>

------------------------------

Message: 3
Date: Tue, 29 Mar 2016 13:59:08 +0200
From: Vito Mandorino <vito.mandorino@linguacustodia.com>
Subject: Re: [Moses-support] english-japanese translation
To: moses-support <moses-support@mit.edu>
Message-ID:
<CA+8mSmGtgVJBVAscj3C=nxudmMRZAs2ta=U9azj-oGzyUEogkQ@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

Thank you all for your comments and resources. I will go through them and
let you know if I stumble upon something interesting during the process.

Vito Mandorino
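
To make the pre-ordering idea from the thread quoted below concrete: the words of the source sentence are rearranged toward the target word order before training and testing. The toy sketch below applies a hand-written permutation; the example sentence, the index order, and the `preorder` helper are all hypothetical, and a real system would predict the order from a parse or a learned model rather than hard-code it.

```python
def preorder(tokens, order):
    """Return tokens rearranged according to order, a permutation of indices."""
    if sorted(order) != list(range(len(tokens))):
        raise ValueError("order must be a permutation of the token positions")
    return [tokens[i] for i in order]


if __name__ == "__main__":
    # English is subject-verb-object; Japanese is typically
    # subject-object-verb, so the verb is moved to the end.
    source = ["I", "read", "the", "book"]
    # Hypothetical hand-picked order, for illustration only.
    print(" ".join(preorder(source, [0, 2, 3, 1])))
```

The same reordering has to be applied to the source side of the training corpus and to every test sentence, so that the phrase-based system sees consistent word order.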

2016-03-28 9:47 GMT+02:00 Graham Neubig <neubig@is.naist.jp>:

> Hello Vishal,
>
> Yes, that's what pre-ordering means. Specifically it means re-ordering the
> source side.
>
> Graham
>
> On Mon, Mar 28, 2016 at 2:16 PM, Vishal Goyal <vishal.pup@gmail.com> wrote:
>
>> Dear Graham,
>> Greetings.
>> Please clarify whether pre-ordering in your reply means making the
>> word order of the source-language and target-language sentences
>> similar before training, so that the setup resembles that of a
>> closely related language pair.
>>
>> On Sat, Mar 26, 2016 at 10:57 AM, Graham Neubig <neubig@is.naist.jp>
>> wrote:
>>
>>> Hi Vito,
>>>
>>> English-Japanese and Japanese-English translation are very difficult due
>>> to the grammatical differences between the languages.
>>>
>>> You have a couple options to overcome this problem:
>>> 1) If you want to use phrase-based Moses, you will have to perform some
>>> variety of pre-ordering, in which you rearrange the words in the source
>>> sentence before training/testing.
>>> 2) You can use a syntax-based system, either using the functionality in
>>> Moses (http://www.statmt.org/moses/?n=Moses.SyntaxTutorial), or using
>>> another decoder specifically designed for syntax-based MT such as my
>>> Travatar decoder (http://www.phontron.com/travatar/). I have released
>>> the setup for training our strongest Japanese-English and English-Japanese
>>> systems here: https://github.com/neubig/wat2014
>>>
>>> Regarding the different types of characters, I would leave them as-is.
>>> It is possible to perform normalization, which will help in a limited
>>> number of cases, but if you're just starting out this is really the least
>>> of your problems.
>>>
>>> Graham
>>>
>>>
>>> On Fri, Mar 25, 2016 at 7:51 PM, Vito Mandorino <
>>> vito.mandorino@linguacustodia.com> wrote:
>>>
>>>> Dear all,
>>>>
>>>> has anyone ever done experiments for English-Japanese and
>>>> Japanese-English translation? Do you know about useful resources for this
>>>> language pair, or some specific gotchas one should be aware of?
>>>>
>>>> More specifically, what is the best policy for dealing with alphabets?
>>>> Do you think it is a good idea to keep different alphabets (Kanji,
>>>> Hiragana, Katakana, ...) in the corpus, or should one try to convert Kanji
>>>> into one of the other alphabets?
>>>>
>>>> Best regards,
>>>>
>>>> Vito Mandorino
>>>>
>>>>
>>>>
>>>
>>>
>>>
>>
>>
>> --
>> *Regards,*
>> Vishal Goyal,
>> Ph.D., M.Tech., MCA, M.C.S.D.
>> Associate Professor(Stage IV),
>> Department of Computer Science,
>> Punjabi University Patiala-147002.
>>
>>
>> *Machine Translation Systems:*
>> [*Online Hindi to Punjabi Machine Translation Tool -*
>> http://h2p.learnpunjabi.org ]
>> [*Statistical Approach Based Hindi to Punjabi Machine Translation
>> System *
>> - http://statmt.org/~vishal/hp/index.cgi
>> - http://tdil-dc.in/hi2pu/index.cgi
>> ]
>> *Online Journal: [Research Cell: An International Journal of Engineering
>> Sciences, http://ijoes.vidyapublications.com
>> <http://ijoes.vidyapublications.com>]*
>> *Book: A Simplified Approach to Data Structures, Shroff Publications and
>> Distributors*
>> http://www.shroffpublishers.com/detail.aspx?title=6163
>>
>
>


--
*M**. Vito MANDORINO -- Chief Scientist*


*The Translation Trustee*

*1, Place Charles de Gaulle, **78180 Montigny-le-Bretonneux*

*Tel : +33 1 30 44 04 23 Mobile : +33 6 84 65 68 89*

*Email :* *vito.mandorino@linguacustodia.com
<massinissa.ahmim@linguacustodia.com>*

*Website :*
*www.linguacustodia.finance <http://www.linguacustodia.com/>*

------------------------------

_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


End of Moses-support Digest, Vol 113, Issue 72
**********************************************
