Moses-support Digest, Vol 111, Issue 2

Send Moses-support mailing list submissions to
moses-support@mit.edu

To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
moses-support-request@mit.edu

You can reach the person managing the list at
moses-support-owner@mit.edu

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."


Today's Topics:

1. Factored instead of Phrase-based Model? (Shaimaa Marzouk)
2. Re: Factored instead of Phrase-based Model? (Ondrej Bojar)
3. MT Marathon 2010 page hacked. (liling tan)


----------------------------------------------------------------------

Message: 1
Date: Wed, 6 Jan 2016 03:48:26 +0000 (UTC)
From: Shaimaa Marzouk <marzouk_s@yahoo.de>
Subject: [Moses-support] Factored instead of Phrase-based Model?
To: <Moses-support@mit.edu>
Message-ID:
<554383138.1032811.1452052106142.JavaMail.yahoo@mail.yahoo.com>
Content-Type: text/plain; charset="utf-8"

Dear Moses-Team,

I am trying to translate two short sentences included in the same file from German into English using a "Phrase-based Model". The first sentence (das auto wurde verkauft) is translated correctly, while the second is partly translated:

I receive as a result for "ich habe das auto verkauft":
Ich|UNK|UNK|UNK habe|UNK|UNK|UNK the car sold [11111] [total=-203.330] core=(-200.000, -5.000, 5.000, 0.000, 0.000, 0.000, 0.000, 0.000, -18.660)

I tried to modify the training data in different ways, and finally included the exact sentence (along with its translation) in the training data (see attachment). But I still get the same result.

Do I need to use a "Factored Translation Model" instead of the "Phrase-based Model" to be able to translate this sentence? If yes, the page http://www.statmt.org/moses/?n=Moses.FactoredTutorial explains how to train Factored Models. Could you please tell me where I can find information about:
1. how to prepare the training data with additional factors, before training the Factored Model?
2. how to train the Language Model that considers the POS?

I currently use KenLM and Giza++.

Thanks a lot for your support.

Kind regards,
Shaimaa
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Training data.docx
Type: application/vnd.openxmlformats-officedocument.wordprocessingml.document
Size: 12271 bytes
Desc: not available
Url : http://mailman.mit.edu/mailman/private/moses-support/attachments/20160105/6d4c89cf/attachment-0001.bin

------------------------------

Message: 2
Date: Wed, 06 Jan 2016 08:42:45 +0100
From: Ondrej Bojar <bojar@ufal.mff.cuni.cz>
Subject: Re: [Moses-support] Factored instead of Phrase-based Model?
To: Shaimaa Marzouk <marzouk_s@yahoo.de>, Shaimaa Marzouk
<marzouk_s@yahoo.de>, Moses-support@mit.edu
Message-ID: <1dd724de-0775-46c7-93b1-3c817cb1a0dd@email.android.com>
Content-Type: text/plain; charset=UTF-8

Dear Shaimaa,

Adding factors can only make any out-of-vocabulary issues worse.

Use -v (perhaps even a higher verbosity level) in moses to see all the translation options that are considered for the problematic sentence. There could be some unfortunate weight settings that for some reason prefer identity translation. (The identity translation must, however, appear in the data, or the source word must not appear in the data; otherwise Moses would not produce an identity translation at all.)

And then go back to the phrase table and manually search for the lines that are supposed to cover the missing words. Here you may find the identity entries.
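
As a rough illustration of that manual search, here is a minimal Python sketch. It assumes the usual text phrase-table layout, with fields separated by "|||" and the source phrase in the first field. The names find_entries and open_table, and the sample lines, are made up for this example and do not come from the poster's model.

```python
import gzip
from typing import Iterable, Iterator, Tuple


def find_entries(lines: Iterable[str], word: str) -> Iterator[Tuple[str, str, bool]]:
    """Yield (source, target, is_identity) for entries whose source phrase contains `word`."""
    for line in lines:
        fields = [f.strip() for f in line.split("|||")]
        if len(fields) < 2:
            continue  # skip malformed lines
        source, target = fields[0], fields[1]
        if word in source.split():
            # An identity entry maps the source phrase to itself.
            yield source, target, source == target


def open_table(path: str):
    """Open a plain or gzip-compressed phrase table as text (the path is hypothetical)."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt", encoding="utf-8")
    return open(path, "r", encoding="utf-8")


# Made-up sample lines, for illustration only.
sample = [
    "das auto ||| the car ||| 1 1 1 1 ||| 0-0 1-1 ||| 1 1 1",
    "habe ||| habe ||| 1 1 1 1 ||| 0-0 ||| 1 1 1",
    "ich habe ||| i have ||| 1 1 1 1 ||| 0-0 1-1 ||| 1 1 1",
]
hits = list(find_entries(sample, "habe"))
for source, target, is_identity in hits:
    flag = "  <-- identity entry" if is_identity else ""
    print(f"{source} -> {target}{flag}")
```

On a real model one would pass open_table(...) with the actual phrase-table path instead of the sample list; if no line at all comes back for a word, that word is out of vocabulary for the table.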

And then go back and check the word alignment this (test) sentence got in the training data. There are most likely some issues with the alignment that prevented proper translations from being extracted.
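
The alignment check can be sketched in Python as well. This assumes alignment points written as zero-based "i-j" pairs (source index, then target index), one sentence pair per line. The English sentence and the alignment string below are invented for illustration, not taken from the attached training data, and the helper names are hypothetical.

```python
from typing import List, Tuple


def aligned_pairs(source: str, target: str, alignment: str) -> List[Tuple[str, str]]:
    """Turn 'i-j' alignment points into (source word, target word) pairs."""
    src, tgt = source.split(), target.split()
    pairs = []
    for point in alignment.split():
        i, j = (int(x) for x in point.split("-"))
        pairs.append((src[i], tgt[j]))
    return pairs


def unaligned_source_words(source: str, alignment: str) -> List[str]:
    """List the source words that have no alignment point at all."""
    covered = {int(point.split("-")[0]) for point in alignment.split()}
    return [word for k, word in enumerate(source.split()) if k not in covered]


# Invented example: "habe" is left without an alignment point.
source = "ich habe das auto verkauft"
target = "i sold the car"
alignment = "0-0 2-2 3-3 4-1"

pairs = aligned_pairs(source, target, alignment)
missing = unaligned_source_words(source, alignment)
for s, t in pairs:
    print(f"{s} <-> {t}")
print("unaligned source words:", missing)
```

Source words that show up as unaligned, or aligned to an unexpected target word, are the first places to look when an expected phrase pair is missing from the phrase table.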

Best, Ondrej.


--
Ondrej Bojar (mailto:obo@cuni.cz / bojar@ufal.mff.cuni.cz)
http://www.cuni.cz/~obo



------------------------------

Message: 3
Date: Wed, 6 Jan 2016 14:07:18 +0100
From: liling tan <alvations@gmail.com>
Subject: [Moses-support] MT Marathon 2010 page hacked.
To: moses-support <moses-support@mit.edu>
Message-ID:
<CAKzPaJLThevbd0Kcwd5KGit0ytMEKYG2n_o_emeLtRU3oWYWoQ@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

Dear Moses / MT Marathon organizers,

I'm not sure whether this is the right place to report this.

I was trying to retrieve a page from MT Marathon 2010, and it seems that a
Russian hacker has hacked the page and taken it over:
http://www.mtmarathon2010.info/ (see the lower right corner).

It is also using the site's high Google PageRank to bait people onto pages
like: http://www.mtmarathon2010.info/web/Program_files/survey.pdf

Regards,
Liling
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mailman.mit.edu/mailman/private/moses-support/attachments/20160106/d4fe9114/attachment.html

------------------------------

_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


End of Moses-support Digest, Vol 111, Issue 2
*********************************************
