Moses-support Digest, Vol 87, Issue 48

Send Moses-support mailing list submissions to
moses-support@mit.edu

To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
moses-support-request@mit.edu

You can reach the person managing the list at
moses-support-owner@mit.edu

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."


Today's Topics:

1. Re: Vowpal Wabbit? (Alexander Fraser)
2. factored models with pos lm (ezgi yıldırım)
3. Moses Release 2.1 (Hieu Hoang)
4. Re: factored models with pos lm (Ondrej Bojar)


----------------------------------------------------------------------

Message: 1
Date: Tue, 21 Jan 2014 11:17:46 +0100
From: Alexander Fraser <fraser@ims.uni-stuttgart.de>
Subject: Re: [Moses-support] Vowpal Wabbit?
To: Marcin Junczys-Dowmunt <junczys@amu.edu.pl>
Cc: moses-support <moses-support@mit.edu>
Message-ID:
<CADaL4v33JarYTB6h8CMVK5797qABQaVH--8590iZP5RDryFrbw@mail.gmail.com>
Content-Type: text/plain; charset="iso-8859-1"

Hi Marcin,

We currently have Vowpal Wabbit integrated as a discriminative phrase
lexicon model (work at the Hopkins workshop on domain adaptation), and as a
discriminative target word selection model (work at the 2013 MTM).

This is different from sparse features. Sparse feature weights are trained
on the dev set to maximize BLEU.

We train on the training corpus (the same corpus as phrases are extracted
from) to maximize classification accuracy (meaning, we try to make the
classifier select the correct target phrase or word given the source
context). The integration of one of our classifiers is as one "dense"
feature (the output is a probability distribution like phrase-based
p(e|f)), and the single weight of the dense feature is tuned using MERT,
MIRA or PRO along with the other dense feature weights to maximize BLEU on
dev.
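
To illustrate the integration described above, here is a minimal Python sketch. It assumes the usual log-linear model, in which each dense feature contributes its weight times the log of its value. The feature names, probabilities and weights are made up for illustration and are not taken from the damt_phrase branch.

```python
import math

def loglinear_score(features, weights):
    """Weighted sum of log feature values, as in a standard log-linear model."""
    return sum(weights[name] * math.log(value) for name, value in features.items())

# Hypothetical dense feature values for one candidate target phrase.
features = {
    "p(e|f)": 0.30,      # phrase translation probability
    "lm": 0.05,          # language model probability
    "classifier": 0.60,  # probability output by the discriminative classifier
}

# Hypothetical weights; in practice MERT, MIRA or PRO tunes them on the dev set.
weights = {"p(e|f)": 0.2, "lm": 0.5, "classifier": 0.3}

score = loglinear_score(features, weights)
print(f"{score:.4f}")
```

The classifier itself is trained separately on the training corpus; only its single weight takes part in tuning.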

The code for phrase-based is available in the damt_phrase branch; there
will also be a release for Moses hierarchical.

Cheers, Alex




On Tue, Jan 21, 2014 at 10:17 AM, Marcin Junczys-Dowmunt <junczys@amu.edu.pl> wrote:

> Hi,
> I remember during the last MT Marathon someone was working on
> integrating Vowpal Wabbit into Moses, mainly for morphology handling. If
> these people are reading the list, I'd like to ask what the current
> status of this is. This idea seems to be somewhat similar in its possible
> applications to sparse features in Moses, doesn't it?
> Best,
> Marcin
> _______________________________________________
> Moses-support mailing list
> Moses-support@mit.edu
> http://mailman.mit.edu/mailman/listinfo/moses-support
>

------------------------------

Message: 2
Date: Tue, 21 Jan 2014 14:44:54 +0200
From: ezgi yıldırım <ezgiyildrm@gmail.com>
Subject: [Moses-support] factored models with pos lm
To: moses-support@mit.edu
Message-ID:
<CAA=houUweYrCsQqEdZ5PrxvYdE7TqUzC3fBq4tekaO4_Nk6sHQ@mail.gmail.com>
Content-Type: text/plain; charset="iso-8859-1"

Hi all,

I have a problem with factored models. I used an English-Turkish parallel
corpus with three factors (surface|lemma|pos) on both sides. I trained the
decoder with --translation-factors 0-0+2-2 from English to Turkish and
specified two language models, one for the surface factor and the other for
the POS factor. Here is my training command:

mosesdecoder/scripts/training/train-model.perl --parallel --mgiza
--mgiza-cpus 32 --external-bin-dir ../../usr/local/bin/ --root-dir
$working/ --corpus $working/corpus/$name.en-tr.lowercased --f en --e tr
--alignment grow-diag-final-and --reordering msd-bidirectional-fe --lm
0:5:/home/ezgi/$working/lm/$name-surface.en-tr.lm:0 --lm
2:9:/home/ezgi/working/working_v4/lm/$name-pos.en-tr.lm:0
--translation-factors 0-0+2-2 >& $working/training.out

However, I got this error during the first iteration of the tuning step:

Check (*contextFactor[count-1])[factorType] != NULL failed in
moses/LM/SRI.cpp:155
Aborted
Exit code: 134
The decoder died. CONFIG WAS -w -0.217391 -lm 0.054348 0.054348 -d 0.065217
0.065217 0.065217 0.065217 0.065217 0.065217 0.065217 -tm 0.043478 0.043478
0.043478 0.043478 0.043478

I checked that all the factored forms already have three factors. What does
this error message mean? I suspected that I had made a mistake while
building the POS LM, but I'm using Witten-Bell discounting, which is the most
appropriate method for LMs with a small vocabulary, such as a POS LM.
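
A check of this kind can be sketched in a few lines of Python. The sketch assumes whitespace-separated tokens whose factors are joined by "|"; the function name and the sample lines are made up for illustration.

```python
def lines_with_wrong_factor_count(lines, expected=3, sep="|"):
    """Return (line_number, token) pairs for tokens with missing or empty factors."""
    bad = []
    for lineno, line in enumerate(lines, start=1):
        for token in line.split():
            factors = token.split(sep)
            if len(factors) != expected or "" in factors:
                bad.append((lineno, token))
    return bad

# Hypothetical sample corpus lines in surface|lemma|pos format.
sample = [
    "the|the|DT house|house|NN",
    "is|be|VBZ small|JJ",  # "small|JJ" is missing a factor
]
print(lines_with_wrong_factor_count(sample))
```

Note that such a check only covers the corpus files, not the factors the decoder actually produces.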

This is the command I used to build pos-lm:

tools/srilm/bin/i686-m64/ngram-count -order 9 -interpolate -wbdiscount
-text $working/lm/$name-pos.en-tr.lowercased.tr -lm
$working/lm/$name-pos.en-tr.lm

I would be grateful for any help with this.
Regards,

Ezgi

------------------------------

Message: 3
Date: Tue, 21 Jan 2014 13:07:46 +0000
From: Hieu Hoang <Hieu.Hoang@ed.ac.uk>
Subject: [Moses-support] Moses Release 2.1
To: moses-support <moses-support@mit.edu>
Message-ID:
<CAEKMkbi9UjpUymsUgCRviOrBfnPFFc2A_Ef7GkL0YYHuFLxcig@mail.gmail.com>
Content-Type: text/plain; charset="iso-8859-1"

Ladies and Gentlemen, Boys and Girls

It is my pleasure to announce the official release of Moses, version 2.1.
It has taken a year, and there are lots of changes.

The most noticeable is that the moses.ini file format has changed, due to
lots of behind-the-scenes refactoring to make it easier to extend Moses. The
documentation throughout the website has been updated to reflect the
changes. However, the decoder is still compatible with the old ini file in
most cases.

More people than ever are contributing to the Moses toolkit. Here are just
some of the things that they've done:
1. Transliteration Phrase-Table by Nadir Durrani
2. DALM integration by Jun-ya Norimatsu.
3. CoveredReferenceFeature by Ales Tamchyna. Also, constrained decoding
by Hieu Hoang.
4. Picaro by Jason Riesa.
5. Neural LM by Lane Schwartz.
6. Tokenization configuration files for Greek and Tamil by Dimitris
Mavroeidis and Arththika Paramanathan.
7. DIMwid by Robin Kurtz.
8. Placeholder by Achim Ruopp and Hieu Hoang.
9. Backward LM by Lane Schwartz.
10. Multimodel phrase-table by Rico Sennrich.
11. Operation Sequence Model by Nadir Durrani.
12. Alternate weight setting by Philipp Koehn.
13. Lattice and confusion network phrase-based decoding with any
phrase-tables by Hieu Hoang.
14. Ondisk phrase-table for phrase-based model by Hieu Hoang.
15. Clearer error messages by Hieu Hoang.
16. Updated Windows GUI by Jie Jiang

Please see the release notes for more details
http://www.statmt.org/moses/RELEASE-2.1/Mosesv2.1releasenotes.pdf

The code can be downloaded from github, or at:
http://www.statmt.org/moses/RELEASE-2.1/mosesdecoder.v21.tar.gz

Moses is available in a number of different ways:
1. As source code, which you compile yourself
2. As compiled binaries for your OS
3. Virtual machines (Linux 32 and 64 bits) with Moses pre-installed.
4. Amazon EC2 images
More details are in the release notes.

Happy MT'ing!



--
Hieu Hoang
Research Associate
University of Edinburgh
http://www.hoang.co.uk/hieu

------------------------------

Message: 4
Date: Tue, 21 Jan 2014 14:13:07 +0100 (CET)
From: Ondrej Bojar <bojar@ufal.mff.cuni.cz>
Subject: Re: [Moses-support] factored models with pos lm
To: ezgi yıldırım <ezgiyildrm@gmail.com>
Cc: moses-support@mit.edu
Message-ID:
<1455915929.8440.1390309987884.JavaMail.root@ufal.mff.cuni.cz>
Content-Type: text/plain; charset=utf-8

Hi, Ezgi,

if you are not using output factor 1, try leaving it out of the configuration altogether, so use 0-0+1-1 and 1:9:...pos.lm.
Also, check whether your moses.ini contains

[output-factors]
0
1
2

And try verbose output to see whether Moses really produces all three factors, not just that they are in the training data.

Actually, I am quite sure that factor 1 is null with your current setup. The error probably comes from Moses looking at it too eagerly.
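
One reading of the first suggestion is to rewrite the corpus so that each token carries only surface|pos, which makes the POS tag factor 1. A minimal Python sketch, assuming whitespace-separated tokens with "|"-separated factors; the function name keep_factors is made up for illustration and is not part of Moses:

```python
def keep_factors(line, keep=(0, 2), sep="|"):
    """Keep only the factors at the given positions in every token of a line."""
    out = []
    for token in line.split():
        factors = token.split(sep)
        out.append(sep.join(factors[i] for i in keep))
    return " ".join(out)

# Hypothetical surface|lemma|pos line; dropping the lemma leaves surface|pos.
line = "the|the|DT house|house|NN"
print(keep_factors(line))  # the|DT house|NN
```

The same rewrite would have to be applied to both sides of the parallel corpus before retraining.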

Cheers, Ondrej.

--
Ondrej Bojar (mailto:obo@cuni.cz / bojar@ufal.mff.cuni.cz)
http://www.cuni.cz/~obo



------------------------------


End of Moses-support Digest, Vol 87, Issue 48
*********************************************
