Moses-support Digest, Vol 92, Issue 2

Send Moses-support mailing list submissions to
moses-support@mit.edu

To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
moses-support-request@mit.edu

You can reach the person managing the list at
moses-support-owner@mit.edu

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."

Today's Topics:

1. ERROR: Lexical reordering scoring failed (Tom Hoar)
2. Need help/suggestions in setup EMS (Attilio Santocchia)
3. LM Perplexity in Moses? (Masoud Komeily)

----------------------------------------------------------------------

Message: 1
Date: Sun, 01 Jun 2014 14:29:13 +0700
From: Tom Hoar <tahoar@precisiontranslationtools.com>
Subject: [Moses-support] ERROR: Lexical reordering scoring failed
To: Moses-Support <moses-support@mit.edu>
Message-ID: <538AD649.50206@precisiontranslationtools.com>
Content-Type: text/plain; charset="utf-8"

I experienced an error that I've not seen before using train-model.perl
from Moses Release 1.0. Here are the final lines of the output log as
the script terminated. The error message "Illegal reordering type used:
d" is new to me.

(7) learn reordering model @ Sun Jun 1 00:12:55 IST 2014
(7.1) [no factors] learn reordering model @ Sun Jun 1 00:12:55 IST 2014
(7.2) building tables @ Sun Jun 1 00:12:55 IST 2014
Executing:
/usr/local/lib/mosesdecoder/scripts/../bin/lexical-reordering-score
/opt/domy/TRAININGS/alignments/align-td_fr_ui-en_us-fr_fr/giza.extract.en_us-fr_fr/ext.7-gram.o.sorted.gz
0.5
/opt/domy/ENGINES/tables/phrase-s=en_us-t=fr_fr-p=td_fr_ui-a=giza-g=7/reordering-table.
--model "wbe msd wbe-msd-bidirectional-fe"
Lexical Reordering Scorer
scores lexical reordering models of several types (hierarchical,
phrase-based and word-based-extraction
*Illegal reordering type used: d*
Exit code: 1
ERROR: Lexical reordering scoring failed at
/usr/local/bin/train-model.perl line 1705.

I suspect two potential causes.

1) the system only has about 50 GB of free hard drive space. There are
several tools in the training toolchain that create atomic temp files
that disappear immediately on an error. Is it possible this is one of
those places, and would this error occur if the system ran out of hard
drive space during this stage, i.e. if it needs more than 50 GB of temp
file space?

2) the log has several warning messages that sentence pairs violate 9:1
ratio. E.g.

WARNING: The following sentence pair has source/target sentence length
ration more than the maximum allowed limit for a source word fertility
source length = 1 target length = 15 ratio 15 ferility limit : 9
Shortening sentence
Sent No: 212107 , No. Occurrences: 1

This should not be happening with our corpus preparation tools. There
doesn't seem to be anything particularly different about this training
corpus. So, we will look for our "leak" that let them pass our
preparation filters. Nonetheless, would these ratio violations cause the
fatal error in step 7 as logged above?
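
To look for the "leak", the pairs that exceed the 9:1 limit can be listed before training. The following is a minimal sketch, assuming whitespace tokenization and a ratio check applied in both directions. The function names are hypothetical and not part of Moses or GIZA++.

```python
def within_ratio(src: str, tgt: str, max_ratio: float = 9.0) -> bool:
    """Return True if neither side has more than max_ratio times the tokens of the other."""
    n_src = len(src.split())
    n_tgt = len(tgt.split())
    if n_src == 0 or n_tgt == 0:
        # Empty sides are treated as violations in this sketch.
        return False
    return max(n_src, n_tgt) / min(n_src, n_tgt) <= max_ratio


def find_violations(pairs, max_ratio: float = 9.0):
    """Yield (1-based sentence number, source length, target length) for offending pairs."""
    for number, (src, tgt) in enumerate(pairs, start=1):
        if not within_ratio(src, tgt, max_ratio):
            yield number, len(src.split()), len(tgt.split())


if __name__ == "__main__":
    # Made-up example data: the first pair has lengths 1 and 15, like the warning above.
    sample = [("a", " ".join(["w"] * 15)), ("a b", "c d")]
    for number, n_src, n_tgt in find_violations(sample):
        print(f"Sent No: {number} source length = {n_src} target length = {n_tgt}")
```

Running this over the two sides of the parallel corpus, read line by line, would show whether the preparation filters let such pairs through.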

Can anyone confirm my suspicions or suggest an alternate cause?
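
For the first suspicion, free space can be checked just before the failing step. This is a minimal sketch using Python's standard library. The directory and the 50 GiB threshold are placeholders, and the temp directory actually used by the training tools may sit on a different filesystem.

```python
import shutil


def free_gib(path: str) -> float:
    """Free space, in GiB, on the filesystem that contains path."""
    return shutil.disk_usage(path).free / 2**30


def has_headroom(path: str, required_gib: float) -> bool:
    """Return True if at least required_gib GiB are free at path."""
    return free_gib(path) >= required_gib


if __name__ == "__main__":
    work_dir = "."  # placeholder: point this at the training or temp directory
    print(f"{free_gib(work_dir):.1f} GiB free")
    print("enough for 50 GiB:", has_headroom(work_dir, 50.0))
```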

Thanks,
Tom

------------------------------

Message: 2
Date: Sun, 1 Jun 2014 10:12:49 +0000 (UTC)
From: Attilio Santocchia <attilio.santocchia@unipg.it>
Subject: [Moses-support] Need help/suggestions in setup EMS
To: moses-support@mit.edu
Message-ID: <loom.20140601T114613-318@post.gmane.org>
Content-Type: text/plain; charset=us-ascii

Hello,

I'm starting to use Moses and I'm going through the documentation.
The how-to is quite clear and I was able to set up a test machine for the
EMS, but now I'd like some advice. Our plan is to run intensive tests on
the many public and private corpora we already have, to learn which
configuration gives the best results for our needs. Here is my
question: we have different options for installing our test bench and I
don't know which one is better...

1) An OpenStack private cloud: here I can instantiate many VMs (up to 10).
Each VM could get 8 GB and 4 vCPUs

2) A commercial cloud (not S3) where I guess I can instantiate more
powerful VMs (probably up to 8-16 vCPUs and unlimited
memory... not sure yet how much), but here I have to pay for what I use

In both cases I can (if it's useful) install SGE, but after reading a few
threads on the mailing list I'm not sure it's worth the effort. So what's
your suggestion? Should I invest some time in installing and setting up
SGE in the cloud, or is it better to stick with a single powerful machine?
In the Moses FAQ I read "takes 1-2 days to tune using 15 CPUs.
10-15 iterations are typical"... is this the amount of resources needed?
If it would shorten the test phase I could also consider getting more
resources, but I don't have a feel for it...

Suggestions and/or comments are very welcome!

Thanks

Attilio

------------------------------

Message: 3
Date: Sun, 1 Jun 2014 16:10:03 +0430
From: Masoud Komeily <farazesp@gmail.com>
Subject: [Moses-support] LM Perplexity in Moses?
To: moses-support@mit.edu
Message-ID:
<CAFKrKMW_fMyDh8PnXE0uknHGezAK-_gXWda+Ruhpacehf522VQ@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

Hi,

Is it possible to compute Language Model "perplexity" directly in Moses?
Can you help?

Thanks,
Masoud Komeily
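
This digest contains no reply to the question. As background only, perplexity can be derived from per-token log probabilities, whichever tool produces them. The sketch below assumes base-10 log probabilities (the convention of ARPA-format language models) and a hypothetical list of per-token scores; whether end-of-sentence tokens are counted differs between tools.

```python
def perplexity(log10_probs):
    """Perplexity from per-token base-10 log probabilities.

    Computes 10 ** (-(1/N) * sum(log10 p_i)) over N scored tokens.
    """
    n = len(log10_probs)
    if n == 0:
        raise ValueError("need at least one token score")
    return 10 ** (-sum(log10_probs) / n)


if __name__ == "__main__":
    # Made-up scores: every token has probability 0.1.
    print(perplexity([-1.0, -1.0, -1.0]))
```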

------------------------------

_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support

End of Moses-support Digest, Vol 92, Issue 2
********************************************
