Send Moses-support mailing list submissions to
moses-support@mit.edu
To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
moses-support-request@mit.edu
You can reach the person managing the list at
moses-support-owner@mit.edu
When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."
Today's Topics:
1. EA optimization of model selection (Alex Ter-Sarkisov)
----------------------------------------------------------------------
Message: 1
Date: Tue, 19 Aug 2014 16:44:53 +0200
From: Alex Ter-Sarkisov <ater1980@gmail.com>
Subject: [Moses-support] EA optimization of model selection
To: moses-support@mit.edu
Message-ID:
<CAMW75YtfbReGioZJBVHbTL3gX+bvWvz8_m3ftO1pZTtP8Ut-CA@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"
An important part of SMT is decoding - finding the best possible
translation from the source to the target language. Very roughly, the procedure
is as follows: Moses <http://www.statmt.org/moses/> generates a number of
candidates (hypotheses) for each sentence/phrase in the training set. Each
hypothesis is represented by a vector of features (e.g. 20 of them) and has a
BLEU score, which roughly measures the quality of the translation.
An optimization algorithm uses this input to evolve a model, i.e. a vector
of coefficients whose dimensionality equals the number of features. For
example, for each phrase we want only one (best) translation. We assign the
hypothesis with the highest BLEU score the value 1 and the remaining
hypotheses the value 0. The objective function is the sum of squared
differences between the true 'rank' (0 or 1) and the score produced by the
model.
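
For concreteness, here is a minimal C++ sketch of that objective; the struct
and function names are illustrative assumptions, not the actual code:

#include <cstddef>
#include <vector>

struct Hypothesis {
    std::vector<double> features; // e.g. 20 feature values from the decoder
    double target;                // 1.0 for the best-BLEU hypothesis, 0.0 otherwise
};

// Model score = dot product of the weight vector with the feature vector.
double score(const std::vector<double>& weights, const Hypothesis& h) {
    double s = 0.0;
    for (std::size_t i = 0; i < weights.size(); ++i)
        s += weights[i] * h.features[i];
    return s;
}

// Objective = sum over all hypotheses of (target - model score)^2;
// the evolutionary optimizer tries to minimize this.
double objective(const std::vector<double>& weights,
                 const std::vector<Hypothesis>& hypotheses) {
    double sum = 0.0;
    for (const Hypothesis& h : hypotheses) {
        const double diff = h.target - score(weights, h);
        sum += diff * diff;
    }
    return sum;
}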
Currently the most popular optimizer is minimum error rate training (MERT);
see Bertoldi et al., 'Improved Minimum Error Rate Training in Moses' (2009)
and Och, 'Minimum Error Rate Training in Statistical Machine Translation'
(2003). What I'm doing now is developing a GA to select the best candidate
for each phrase. I expect a GA to perform well on this task, since the
database is relatively small (the full matrix is about 10^6 by 20) and GAs
tend to perform well on classification and optimization problems, including
the training of ANNs.
EDIT: I have developed a real-coded evolutionary optimizer in C++ for this
problem. With population = 10, elitism = 1, 1-bit mutation and Laplace
crossover with a = -1.1, b = 1 (see Deep and Thakur, 2007), it evolves a
vector of weights that gives a squared error of around 90000, down from
293000 at the start of the run. The database is ~1000 sentences
with ~1000 hypotheses per sentence.
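
For reference, a minimal sketch of Laplace crossover as defined by Deep and
Thakur (2007), with location a and scale b; the function and variable names
are my own illustration, not the optimizer's actual code:

#include <cmath>
#include <cstddef>
#include <random>
#include <utility>
#include <vector>

// Produces two offspring from two real-coded parents p1, p2.
std::pair<std::vector<double>, std::vector<double>>
laplace_crossover(const std::vector<double>& p1, const std::vector<double>& p2,
                  double a, double b, std::mt19937& rng) {
    // Lower bound slightly above 0 avoids log(0).
    std::uniform_real_distribution<double> unif(1e-12, 1.0);
    std::vector<double> c1(p1.size()), c2(p2.size());
    for (std::size_t i = 0; i < p1.size(); ++i) {
        const double u = unif(rng);
        const double r = unif(rng);
        // beta is drawn from a Laplace distribution with location a, scale b.
        const double beta = (r <= 0.5) ? a - b * std::log(u)
                                       : a + b * std::log(u);
        const double spread = beta * std::fabs(p1[i] - p2[i]);
        c1[i] = p1[i] + spread;
        c2[i] = p2[i] + spread;
    }
    return {c1, c2};
}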
I would be grateful for any suggestions on how this result could be further
improved, in terms of other genetic operators (mutation/crossover/flip) as
well as hybridization. A link to a good paper/report would do.
------------------------------
_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support
End of Moses-support Digest, Vol 94, Issue 24
*********************************************