Send Moses-support mailing list submissions to
moses-support@mit.edu
To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
moses-support-request@mit.edu
You can reach the person managing the list at
moses-support-owner@mit.edu
When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."
Today's Topics:
1. Building problems (Sehrob Ibrohimov)
2. Re: filter parallel corpus (Anthony Rousseau)
3. Re: factored models with pos lm (ezgi yıldırım)
----------------------------------------------------------------------
Message: 1
Date: Thu, 23 Jan 2014 09:46:12 +0500
From: Sehrob Ibrohimov <isehrob@gmail.com>
Subject: [Moses-support] Building problems
To: moses-support@mit.edu
Message-ID:
<CADJquiD8yxnm-7d-E5DX6v2euG7cVY=O3=Dh-SjjL7Hfur6uZw@mail.gmail.com>
Content-Type: text/plain; charset="iso-8859-1"
Hi, dear developers and supporters!
I am very happy to live in such an era, where everything is possible
and you can find anything you want thanks to people like you. Thank
you very much for that!
I can't wait to see how Moses translates a foreign language into
mine. But, perhaps because I am in such a rush, I can't build it.
Unfortunately, I am not a specialist in computer technology; I am
just an amateur, but I am learning all of it with joy (I mean
mathematical statistics, Linux systems, C++, and so on).
So, I will be very grateful for any support!
yours
Sehrob
-------------- next part --------------
A non-text attachment was scrubbed...
Name: build.zip
Type: application/zip
Size: 1947 bytes
Desc: not available
Url : http://mailman.mit.edu/mailman/private/moses-support/attachments/20140123/ef67bb2f/attachment-0001.zip
------------------------------
Message: 2
Date: Thu, 23 Jan 2014 11:04:29 +0100
From: Anthony Rousseau <anthony.rousseau@lium.univ-lemans.fr>
Subject: Re: [Moses-support] filter parallel corpus
To: Saeed Farzi <saeedfarzi@gmail.com>
Cc: moses-support <moses-support@mit.edu>, "corpora@uib.no"
<corpora@uib.no>
Message-ID: <1A05EF75-20DC-488D-B37B-4122F0309CC1@lium.univ-lemans.fr>
Content-Type: text/plain; charset="windows-1252"
Hello Saeed,
I think you can also use a tool called XenC, which I developed and released last year.
I believe it can help you, since it was designed to address needs similar to yours.
You can read about it in this paper:
https://ufal.mff.cuni.cz/pbml/100/art-rousseau.pdf
Source code of the tool can be found here:
https://github.com/rousseau-lium/XenC
Best regards,
--
Anthony Rousseau, Ph.D.
LIUM, University of Le Mans
anthony.rousseau@lium.univ-lemans.fr
On 16 Jan 2014, at 16:43, Saeed Farzi <saeedfarzi@gmail.com> wrote:
> Dear all,
>
> I am working on a translation task with a very large parallel corpus.
> Because of the computational cost of training on such a corpus, I am
> going to filter it with respect to the test set (of course, after the
> filtering, the evaluation must still be fair).
>
> I am looking for a solution or a tool for filtering parallel corpus sentences.
>
> Note that I do not need to filter the phrase table. I know that the
> filter_ moses tool reduces the phrase table size.
>
> cheers
> --
> S.Farzi, Ph.D. Student
> Natural Language Processing Lab,
> School of Electrical and Computer Eng.,
> Tehran University
> Tel: +9821-6111-9719
> _______________________________________________
> Moses-support mailing list
> Moses-support@mit.edu
> http://mailman.mit.edu/mailman/listinfo/moses-support
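For readers following this thread, one common way to filter a parallel corpus towards a target domain is cross-entropy difference scoring: each source sentence is scored by how much better an in-domain model predicts it than a general model, and only the best-scoring sentence pairs are kept. The sketch below is a minimal illustration of that idea using add-one-smoothed unigram models. It is not XenC's code, and the function names, the `keep_ratio` parameter, and the toy data are all assumptions made for this example. A real setup would likely use proper n-gram language models.

```python
import math
from collections import Counter


def unigram_logprob(sentences):
    """Build an add-one-smoothed unigram model; return a token -> log P function."""
    counts = Counter(tok for s in sentences for tok in s.split())
    total = sum(counts.values())
    vocab = len(counts) + 1  # one extra slot reserved for unseen tokens
    return lambda tok: math.log((counts[tok] + 1) / (total + vocab))


def cross_entropy(logprob, tokens):
    """Average negative log-probability per token."""
    return -sum(logprob(t) for t in tokens) / len(tokens)


def select_pairs(pairs, in_domain_src, keep_ratio=0.5):
    """Keep the sentence pairs whose source side looks most in-domain.

    pairs:          list of (source, target) strings (hypothetical data layout)
    in_domain_src:  source-side sentences representing the target domain
    keep_ratio:     fraction of pairs to keep (hypothetical parameter)
    """
    in_lp = unigram_logprob(in_domain_src)
    gen_lp = unigram_logprob(src for src, _ in pairs)

    def score(pair):
        toks = pair[0].split()
        if not toks:
            return float("inf")  # empty source sentences sort last
        # Lower is better: in-domain model is confident, general model is not.
        return cross_entropy(in_lp, toks) - cross_entropy(gen_lp, toks)

    k = max(1, int(len(pairs) * keep_ratio))
    return sorted(pairs, key=score)[:k]


# Toy data, invented for illustration only.
pairs = [
    ("the cat sat", "x"),
    ("stock markets fell", "y"),
    ("the dog sat", "z"),
    ("bond prices rose", "w"),
]
in_domain = ["the cat sat down", "the dog ran"]
selected = select_pairs(pairs, in_domain, keep_ratio=0.5)
print(selected)
```

Only the source side is scored here; scoring both sides and summing the two differences is a natural extension when in-domain text is available in both languages.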
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mailman.mit.edu/mailman/private/moses-support/attachments/20140123/4064246c/attachment-0001.htm
------------------------------
Message: 3
Date: Thu, 23 Jan 2014 15:05:08 +0200
From: ezgi yıldırım <ezgiyildrm@gmail.com>
Subject: Re: [Moses-support] factored models with pos lm
To: Ondrej Bojar <bojar@ufal.mff.cuni.cz>
Cc: moses-support@mit.edu
Message-ID:
<CAA=houVWP41WWOLpEKxpx6QXHtRPJ2pKRrc2Wet_C2vEdvUqew@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"
Thanks, Ondrej. I have attached my "model/moses.ini" file, which is the one
used by my tuning command.
I checked whether it contains output factors; unfortunately, there is no
[output-factors] section in it.
I guess the problem is that I did not define the decoding-steps option. I tried
training with decoding options and generation-factors, but the tuning step then
became unacceptably slow: it tuned only a few sentences within two days.
In addition, I wonder how I can use a surface-to-surface "or"
(lemma+pos)-to-(lemma+pos) decoding configuration. As I understand it,
alternative decoding paths should make this possible. I want to decode with
the surface-to-surface probabilities first and, when a surface form
does not exist in the surface phrase table, fall back to the
(lemma+pos)-to-(lemma+pos) translation table.
Regards,
Ezgi
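The check described above, looking for an [output-factors] section in moses.ini, can be sketched in a few lines of Python. This is only an illustrative helper: the function name `ini_sections` and the sample text are assumptions, and the sample is a made-up fragment in the bracketed-section style, not the attached file.

```python
import re


def ini_sections(text):
    """Map each [section] name to its non-empty, non-comment lines."""
    sections = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        header = re.fullmatch(r"\[(.+)\]", line)
        if header:
            current = header.group(1)
            sections.setdefault(current, [])
        elif current is not None:
            sections[current].append(line)
    return sections


# Hypothetical fragment for illustration; not the attached moses.ini.
sample = """\
# made-up example
[input-factors]
0

[mapping]
0 T 0
"""

secs = ini_sections(sample)
print("output-factors" in secs)
print(secs.get("output-factors", "no [output-factors] section found"))
```

For a real file, the text would be read with `open(path).read()` and passed to the same function.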
On Tue, Jan 21, 2014 at 3:13 PM, Ondrej Bojar <bojar@ufal.mff.cuni.cz> wrote:
> Hi, Ezgi,
>
> if you are not using output factor 1, try leaving it out of the
> configuration altogether, so use 0-0+1-1 and 1:9:...pos.lm.
> Also, check whether your moses.ini contains
>
> [output-factors]
> 0
> 1
> 2
>
> And try verbose output to see if indeed Moses produces all the three
> factors, not just that they are in the training data.
>
> Actually, I am quite sure that the factor 1 is null with your current
> setup. The error comes probably from the fact that Moses too eagerly looks
> at it.
>
> Cheers, Ondrej.
>
> ----- Original Message -----
> > From: "ezgi yıldırım" <ezgiyildrm@gmail.com>
> > To: moses-support@mit.edu
> > Sent: Tuesday, 21 January, 2014 1:44:54 PM
> > Subject: [Moses-support] factored models with pos lm
> >
> > Hi all,
> >
> > I have a problem with factored models. I used an English-Turkish parallel
> > corpus with three factors (surface|lemma|pos) on both sides. I trained the
> > decoder with --translation-factors 0-0+2-2 from English to Turkish and
> > specified two language models, one for the surface factor and the other for
> > the pos factor. Here is my training command:
> >
> > mosesdecoder/scripts/training/train-model.perl --parallel --mgiza
> > --mgiza-cpus 32 --external-bin-dir ../../usr/local/bin/ --root-dir
> > $working/ --corpus $working/corpus/$name.en-tr.lowercased --f en --e tr
> > --alignment grow-diag-final-and --reordering msd-bidirectional-fe --lm
> > 0:5:/home/ezgi/$working/lm/$name-surface.en-tr.lm:0 --lm
> > 2:9:/home/ezgi/working/working_v4/lm/$name-pos.en-tr.lm:0
> > --translation-factors 0-0+2-2 >& $working/training.out
> >
> > However, I got this error during the first iteration of the tuning
> > step:
> >
> > Check (*contextFactor[count-1])[factorType] != NULL failed in
> > moses/LM/SRI.cpp:155
> > Aborted
> > Exit code: 134
> > The decoder died. CONFIG WAS -w -0.217391 -lm 0.054348 0.054348 -d 0.065217
> > 0.065217 0.065217 0.065217 0.065217 0.065217 0.065217 -tm 0.043478 0.043478
> > 0.043478 0.043478 0.043478
> >
> > I checked that all the factored forms already have three factors. What is
> > the meaning of this error message? I suspected that I had made a mistake
> > while building the pos lm, but I'm using Witten-Bell discounting, which is
> > the most appropriate method for LMs with a small vocabulary, such as a pos lm.
> >
> > This is the command I used to build pos-lm:
> >
> > tools/srilm/bin/i686-m64/ngram-count -order 9 -interpolate -wbdiscount
> > -text $working/lm/$name-pos.en-tr.lowercased.tr -lm
> > $working/lm/$name-pos.en-tr.lm
> >
> > I would be grateful for any help with this.
> > Regards,
> >
> > Ezgi
> >
> >
>
> --
> Ondrej Bojar (mailto:obo@cuni.cz / bojar@ufal.mff.cuni.cz)
> http://www.cuni.cz/~obo
>
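The original question mentions checking that every token in the factored corpus carries three factors. A minimal sketch of such a check is shown below. The function name `malformed_tokens`, its parameters, and the sample lines are invented for illustration. Flagging empty factors is an extra assumption added here, since an empty slot would pass a simple count of separators; it is not offered as a diagnosis of the error in this thread.

```python
def malformed_tokens(lines, expected=3, sep="|"):
    """Return (line_number, token) for tokens without `expected` non-empty factors."""
    bad = []
    for lineno, line in enumerate(lines, start=1):
        for tok in line.split():
            parts = tok.split(sep)
            # Wrong number of factors, or a factor that is an empty string.
            if len(parts) != expected or any(p == "" for p in parts):
                bad.append((lineno, tok))
    return bad


# Made-up sample in surface|lemma|pos form; not taken from the poster's corpus.
corpus = [
    "evler|ev|Noun geldi|gel|Verb",
    "kitap|kitap|Noun okudu||Verb",
    "ev|ev",
]
report = malformed_tokens(corpus)
print(report)
```

On a real corpus, the file object itself can be passed in place of the list, so the file is streamed line by line.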
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mailman.mit.edu/mailman/private/moses-support/attachments/20140123/37cb05bc/attachment.htm
-------------- next part --------------
A non-text attachment was scrubbed...
Name: moses.ini
Type: application/x-wine-extension-ini
Size: 1520 bytes
Desc: not available
Url : http://mailman.mit.edu/mailman/private/moses-support/attachments/20140123/37cb05bc/attachment.bin
------------------------------
End of Moses-support Digest, Vol 87, Issue 52
*********************************************