Moses-support Digest, Vol 113, Issue 67

Send Moses-support mailing list submissions to
moses-support@mit.edu

To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
moses-support-request@mit.edu

You can reach the person managing the list at
moses-support-owner@mit.edu

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."


Today's Topics:

1. Re: english-japanese translation (Graham Neubig)
2. ERROR: use --lm factor:order:filename to specify at least one
language model (Zhanwang Chen)


----------------------------------------------------------------------

Message: 1
Date: Sat, 26 Mar 2016 14:27:32 +0900
From: Graham Neubig <neubig@is.naist.jp>
Subject: Re: [Moses-support] english-japanese translation
To: Vito Mandorino <vito.mandorino@linguacustodia.com>
Cc: moses-support <moses-support@mit.edu>
Message-ID:
<CADkjOCMzbkFYeybjvM5jEX5Q6ahH2d_3Ncj+W7G66_50eO2owA@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

Hi Vito,

English-Japanese and Japanese-English translation are very difficult due to
the grammatical differences between the languages.

You have a couple of options to overcome this problem:
1) If you want to use phrase-based Moses, you will have to perform some
variety of pre-ordering, in which you rearrange the words in the source
sentence before training/testing.
2) You can use a syntax-based system, either using the functionality in
Moses (http://www.statmt.org/moses/?n=Moses.SyntaxTutorial), or using
another decoder specifically designed for syntax-based MT such as my
Travatar decoder (http://www.phontron.com/travatar/). I have released the
setup for training our strongest Japanese-English and English-Japanese
systems here: https://github.com/neubig/wat2014
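
To make option 1 concrete, here is a toy sketch of pre-ordering in Python. It assumes each source word already carries a hand-assigned role label ("S", "V", "O"), which is purely hypothetical; real pre-ordering systems usually derive the reordering from a parse tree or from learned rules. The function name `preorder_svo_to_sov` is made up for this illustration.

```python
def preorder_svo_to_sov(tagged):
    """Move verbs to the end of the sentence, keeping all other words
    in their original relative order (a crude SVO -> SOV reordering).

    `tagged` is a list of (word, role) pairs; the role labels are a
    hypothetical hand annotation, not the output of any real tool.
    """
    non_verbs = [word for word, role in tagged if role != "V"]
    verbs = [word for word, role in tagged if role == "V"]
    return non_verbs + verbs


sentence = [("I", "S"), ("read", "V"), ("the", "O"), ("book", "O")]
print(" ".join(preorder_svo_to_sov(sentence)))
```

The reordered English side would then replace the original source text for both training and testing, so that the phrase-based system sees a word order closer to Japanese.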

Regarding the different types of characters, I would leave them as-is. It
is possible to perform normalization, which will help in a limited number
of cases, but if you're just starting out this is really the least of your
problems.
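
As a minimal sketch of what such normalization could look like, the snippet below applies Unicode NFKC normalization with Python's standard `unicodedata` module. Choosing NFKC is an assumption made for illustration: it folds full-width Latin letters and digits to their ASCII forms and half-width katakana to full-width katakana, and it leaves kanji, hiragana, and ordinary katakana unchanged. The helper name `normalize_line` is hypothetical.

```python
import unicodedata


def normalize_line(line):
    """Apply NFKC normalization to one line of corpus text.

    NFKC is an assumed choice: it unifies width variants
    (full-width Latin, half-width katakana) but does not convert
    between kanji, hiragana, and katakana.
    """
    return unicodedata.normalize("NFKC", line)


print(normalize_line("ＭＯＳＥＳ　２０１６"))  # full-width Latin, digits, and space
print(normalize_line("ｶﾞｲﾄﾞ"))  # half-width katakana
```

The same function would need to be applied consistently to the training, tuning, and test data.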

Graham


On Fri, Mar 25, 2016 at 7:51 PM, Vito Mandorino <
vito.mandorino@linguacustodia.com> wrote:

> Dear all,
>
> Has anyone ever done experiments on English-Japanese and
> Japanese-English translation? Do you know of useful resources for this
> language pair, or specific gotchas one should be aware of?
>
> More specifically, what is the best policy for dealing with alphabets? Do
> you think it is a good idea to keep different alphabets (Kanji, Hiragana,
> Katakana, ...) in the corpus, or should one try to convert Kanji into one
> of the other alphabets?
>
> Best regards,
>
> Vito Mandorino

------------------------------

Message: 2
Date: Sat, 26 Mar 2016 13:37:01 +0100
From: Zhanwang Chen <zhanwang.c@gmail.com>
Subject: [Moses-support] ERROR: use --lm factor:order:filename to
specify at least one language model
To: moses-support@mit.edu
Message-ID:
<CAGdkKyR0P2vy3MkF_TaorAgg38WE-vvGw8WWuzRYcED+UVvudg@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

When I run:

$MOSES/scripts/training/train-model.perl -hierarchical -ghkm \
    -external-bin-dir /home/zhanwang/giza-pp/tools/ -root-dir . \
    --corpus corpus/nl2mr --f mr --e nl

it shows:

ERROR: use --lm factor:order:filename to specify at least one language model at /home/zhanwang/mosesdecoder/scripts/training/train-model.perl line 597.

root@zhanwang-virtual-machine:/home/zhanwang/mosesmodel/corpus3#
$MOSES/scripts/training/train-model.perl

But I don't want to use a factored model.

I tried this too, and it showed me the same thing. Whatever parameter I try,
it just asks me to use --lm factor:order:filename:

    Running the training script

    For a standard phrase model, you will typically run the training script
    as follows.

    Run the training script:

    train-model.perl -root-dir . --corpus corpus/euro --f de --e en

I want to build a syntax-based translation model. What should I do?

$MOSES/scripts/training/train-model.perl -ghkm \
    -external-bin-dir /home/zhanwang/giza-pp/tools/ -root-dir . \
    --corpus corpus/nl2mr --f mr --e nl


Here is my corpus:

root@zhanwang-virtual-machine:/home/zhanwang/mosesmodel/corpus3/corpus# head nl2mr.nl nl2mr.mr
==> nl2mr.nl <==
Give me the cities in Virginia .
What are the high points of states surrounding Mississippi ?

==> nl2mr.mr <==
answer city loc_2 stateid 'virginia'
answer high_point_1 state next_to_2 stateid 'mississippi'

I want to extract GHKM rules and build a model that can translate "Give me
the cities in Virginia ." into "answer city loc_2 stateid 'virginia'".

Thanks a lot.

Zhanwang

------------------------------


End of Moses-support Digest, Vol 113, Issue 67
**********************************************
