Send Moses-support mailing list submissions to
moses-support@mit.edu
To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
moses-support-request@mit.edu
You can reach the person managing the list at
moses-support-owner@mit.edu
When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."
Today's Topics:
1. Re: Correct form of using Mira (Barry Haddow)
2. Error while training the translation system (Sunayana Gawde)
3. Re: Moses on SGE clarification (Hieu Hoang)
4. Re: Correct form of using Mira (Davood Mohammadifar)
----------------------------------------------------------------------
Message: 1
Date: Thu, 29 Oct 2015 09:38:13 +0000
From: Barry Haddow <bhaddow@staffmail.ed.ac.uk>
Subject: Re: [Moses-support] Correct form of using Mira
To: Davood Mohammadifar <davood_mf@hotmail.com>, Moses Support
<moses-support@mit.edu>
Message-ID: <5631E905.3000909@staffmail.ed.ac.uk>
Content-Type: text/plain; charset="windows-1252"
Hi Davood
The first command you give has a quote missing at the end - is this correct?
Another difference is that you have "-v 0", so moses will run silently.
What was the actual output when you ran this command? What you have
below looks correct to me.
cheers - Barry
On 28/10/15 21:57, Davood Mohammadifar wrote:
> Hello everyone
>
> Because of variations in BLEU score when using normal MERT, I decided
> to use MIRA instead. The Moses manual (updated on 28 October 2015) tells me
> to use this command:
>
> $MOSES_SCRIPTS/training/mert-moses.pl work/dev.fr work/dev.en
> $MOSES_BIN/moses work/model/moses.ini --mertdir $MOSES_BIN --rootdir
> $MOSES_SCRIPTS --batch-mira --return-best-dev --batch-mira-args '-J
> 300' --decoder-flags '-threads 8 -v 0
>
> But this command does not work for me. When I execute it, I
> just see a list of its options and nothing else happens. So I wanted to
> change the command. Based on the usual MERT invocation, I changed it to this:
>
> $MOSES_SCRIPTS/training/mert-moses.pl
> /home/mohammadifar/corpus/tune.true.fa
> /home/mohammadifar/corpus/tune.true.en $MOSES_BIN/moses
> /home/mohammadifar/First/train/model/moses.ini --mertdir
> $MOSES_BIN --rootdir $MOSES_SCRIPTS --batch-mira --return-best-dev
> --batch-mira-args="-J 300" --decoder-flags="-threads all"
>
> The two commands differ only at the end. The latter works very well
> for me. BLEU variations on the test set are very slight (usually
> <0.1 and rarely about 0.2 across 3 runs of the whole translation
> pipeline on the same dataset). So I want to be sure: is this the
> correct way to use MIRA? (Moses v3.0)
>
> Regards
> Davood
------------------------------
Message: 2
Date: Thu, 29 Oct 2015 16:26:44 +0530
From: Sunayana Gawde <sunayanagawde17@gmail.com>
Subject: [Moses-support] Error while training the translation system
To: moses-support@mit.edu
Message-ID:
<CANQTV3TrTSiMpMLybgWVaUsHCJE8M4e0z=Fmu0-rffBeFeiV+Q@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"
Hi all,
I am trying to build a baseline translation system by following the steps
on the site http://www.statmt.org/moses/?n=Moses.Baseline
When I reached the step of training the translation system, I used this
command:
nohup nice ~/mosesdecoder/scripts/training/train-model.perl -root-dir train \
 -corpus ~/corpus/news-commentary-v8.fr-en.clean \
 -f fr -e en -alignment grow-diag-final-and -reordering msd-bidirectional-fe \
 -lm 0:3:$HOME/lm/news-commentary-v8.fr-en.blm.en:8 \
 -external-bin-dir ~/mosesdecoder/tools >& training.out &
and this is what I get:
sunayana@InspironS:~/working$ nohup nice ~/mosesdecoder/scripts/training/train-model.perl -root-dir train \
> -corpus ~/corpus/news-commentary-v8.fr-en.clean \
> -f fr -e en -alignment grow-diag-final-and -reordering msd-bidirectional-fe \
nohup: ignoring input and appending output to 'nohup.out'
sunayana@InspironS:~/working$ -lm 0:3:$HOME/lm/news-commentary-v8.fr-en.blm.en:8 \
> -external-bin-dir ~/mosesdecoder/tools >& training.out &
[1] 3458
sunayana@InspironS:~/working$
It should create a file named moses.ini, but it is not doing so.
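One possible reading of the transcript above, offered as a guess and not a confirmed diagnosis: the shell started the command right after the third line and then treated "-lm" as a new command. That is what happens when a trailing backslash is followed by a space or tab, because the backslash then escapes the whitespace instead of the newline. The sketch below is a hypothetical helper (not part of Moses) that reports such lines in a pasted multi-line command; the function name and the sample command are assumptions made for illustration.

```python
def broken_continuations(command_text):
    """Return 1-based numbers of lines where whitespace follows a trailing backslash.

    A backslash only continues a shell command onto the next line when it is
    the very last character before the newline.
    """
    bad = []
    for number, line in enumerate(command_text.splitlines(), start=1):
        stripped = line.rstrip()
        # The line ends in a backslash once trailing whitespace is removed,
        # but the raw line had extra whitespace after it.
        if stripped.endswith("\\") and stripped != line:
            bad.append(number)
    return bad


# Illustrative command text; line 2 has a space after its backslash.
pasted = (
    "train-model.perl -root-dir train \\\n"
    " -f fr -e en \\ \n"
    " -lm 0:3:lm.en:8 \\\n"
    " -external-bin-dir tools\n"
)
print(broken_continuations(pasted))  # prints [2]
```

If this is what happened, any error message from the stray "-lm" line would have been redirected to training.out, and the output of the truncated training command would be in nohup.out, so both files are worth checking.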
--
*Thanks & Regards*
Ms. Sunayana R. Gawde.
DCST, Goa University.
*Please don't print this e-mail unless you really need to.*
-------------- next part --------------
A non-text attachment was scrubbed...
Name: build.log.gz
Type: application/x-gzip
Size: 2865 bytes
Desc: not available
Url : http://mailman.mit.edu/mailman/private/moses-support/attachments/20151029/2d488663/attachment-0001.gz
------------------------------
Message: 3
Date: Thu, 29 Oct 2015 11:04:27 +0000
From: Hieu Hoang <hieuhoang@gmail.com>
Subject: Re: [Moses-support] Moses on SGE clarification
To: Vincent Nguyen <vnguyen@neuf.fr>, moses-support
<moses-support@mit.edu>
Message-ID: <5631FD3B.5030007@gmail.com>
Content-Type: text/plain; charset=windows-1252; format=flowed
On 28/10/2015 14:20, Vincent Nguyen wrote:
> Hi there,
>
> I need some clarification before screwing up some files.
> I just set up an SGE cluster with a master + 2 nodes.
>
> To make it clear, let's say my cluster name is "default", my master
> headnode is "master", and my 2 other nodes are "node1" and "node2".
>
>
> for EMS :
>
> I opened the default experiment.machines file and I see :
>
> cluster: townhill seville hermes lion seville sannox lutzow frontend
> multicore-4: freddie
> multicore-8: tyr thor odin crom
> multicore-16: saxnot vali vili freyja bragi hoenir
> multicore-24: syn hel skaol saga buri loki sif magni
> multicore-32: gna snotra lofn thrud
>
> What are townhill and the others? Names of machines / nodes? Names of
> several clusters?
> Should I just put "default" or "master node1 node2"?
I think you put 'default'. townhill, seville etc. were the names of the
master nodes in Edinburgh.
Using Moses with SGE on multiple nodes hasn't been done for a long
time, so you may encounter problems. Philipp Koehn may have started
using it again.
I also used SGE extensively a few months ago, but it ran on 1 (big) node
from start to finish. The script for it is here if you want to take a look:
scripts/ems/support/submit-grid.perl
It has hardcoded initialisation for the machine I was running on. You're
welcome to generalise it.
>
> multicore-X: should I put machine names here?
> If my 3 machines have 8 cores each:
> multicore-8: master node1 node2
> Right?
I think this is for running on 1 machine, rather than the cluster. I'm
not sure what the purpose of multicore-X is.
>
>
> then in the config file for EMS:
>
> #generic-parallelizer = $moses-script-dir/ems/support/generic-parallelizer.perl
> #generic-parallelizer = $moses-script-dir/ems/support/generic-multicore-parallelizer.perl
>
> Which one should I use if my nodes are multicore? Still the first one?
>
>
> ### cluster settings (if run on a cluster machine)
> # number of jobs to be submitted in parallel
> #
> #jobs = 10
> Should I count approximately 1 job per core across all the cores of my 3 machines?
>
> # arguments to qsub when scheduling a job
> #qsub-settings = ""
> Can this stay empty?
>
> # project for priviledges and usage accounting
> #qsub-project = iccs_smt
> Is this a standard value?
>
> # memory and time
> #qsub-memory = 4
> #qsub-hours = 48
> 4 what? GB?
>
> ### multi-core settings
> # when the generic parallelizer is used, the number of cores
> # specified here
> cores = 4
> Is this ignored if generic-parallelizer.perl is chosen?
>
>
> Is there a way to put more load on one specific node?
>
> Many thanks,
> V.
--
Hieu Hoang
http://www.hoang.co.uk/hieu
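The experiment.machines file quoted in this message appears to use a simple "label: host host ..." layout. As a rough sketch of that layout only (how EMS itself interprets the labels is not confirmed in this thread), the hypothetical helpers below parse such text into a dictionary and look up the label for a host. The sample values are the ones proposed in the question, not a verified configuration, and treating "#" lines as comments is an assumption.

```python
def parse_machines(text):
    """Map each label (e.g. 'cluster', 'multicore-8') to its list of host names."""
    groups = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, assumed comment lines, and lines without a label.
        if not line or line.startswith("#") or ":" not in line:
            continue
        label, hosts = line.split(":", 1)
        groups[label.strip()] = hosts.split()
    return groups


def label_for(host, groups):
    """Return the first label whose host list contains host, or None."""
    for label, hosts in groups.items():
        if host in hosts:
            return label
    return None


# Values taken from the question above; purely illustrative.
example = """cluster: default
multicore-8: master node1 node2
"""
groups = parse_machines(example)
print(groups)
print(label_for("node1", groups))
```

Running a check like this on an edited file is only a way to confirm that each host ends up under the label you intended before launching an experiment.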
------------------------------
Message: 4
Date: Thu, 29 Oct 2015 11:49:09 +0000
From: Davood Mohammadifar <davood_mf@hotmail.com>
Subject: Re: [Moses-support] Correct form of using Mira
To: Barry Haddow <bhaddow@staffmail.ed.ac.uk>
Cc: Moses Support <moses-support@mit.edu>
Message-ID: <SNT150-W231ACCEDD4A3821EC3A63B8C200@phx.gbl>
Content-Type: text/plain; charset="windows-1252"
Thanks Barry.
I copied the command from the manual and checked the missing quote at the end, but the problem was not solved.
However, I found the reason.
I copied the command from the Moses manual (opened with Document Viewer in Ubuntu) into my terminal and modified the paths, but I got the following message after running it:
usage: /home/mohammadifar/mosesdecoder/scripts/training/mert-moses.pl input-text references decoder-executable decoder.ini
Options:
--working-dir=mert-dir ... where all the files are created
--nbest=100 ... how big nbestlist to generate
--lattice-samples ... how many lattice samples (Chatterjee & Cancedda, emnlp 2010)
--jobs=N ... set this to anything to run moses in parallel
--cache-model=STRING ... local directory into which copy model before running decoder
--mosesparallelcmd=STR ... use a different script instead of moses-parallel
--queue-flags=STRING ... anything you with to pass to qsub, eg.
'-l ws06osssmt=true'. The default is: '-hard'
To reset the parameters, please use
--queue-flags=' '
(i.e. a space between the quotes).
--decoder-flags=STRING ... extra parameters for the decoder
--continue ... continue from the last successful iteration
--skip-decoder ... skip the decoder run for the first time,
assuming that we got interrupted during
optimization
--shortest --average --closest
... Use shortest/average/closest reference length
as effective reference length (mutually exclusive)
--nocase ... Do not preserve case information; i.e.
case-insensitive evaluation (default is false).
--nonorm ... Do not use text normalization (flag is not active,
i.e. text is NOT normalized)
--filtercmd=STRING ... path to filter-model-given-input.pl
--filterfile=STRING ... path to alternative to input-text for filtering
model. useful for lattice decoding
--rootdir=STRING ... where do helpers reside (if not given explicitly)
--mertdir=STRING ... path to new mert implementation
--mertargs=STRING ... extra args for both extractor and mert
--extractorargs=STRING ... extra args for extractor only
--mertmertargs=STRING ... extra args for mert only
--scorenbestcmd=STRING ... path to score-nbest.py
--old-sge ... passed to parallelizers, assume Grid Engine < 6.0
--inputtype=[0|1|2] ... Handle different input types: (0 for text,
1 for confusion network, 2 for lattices,
default is 0)
--no-filter-phrase-table ... disallow filtering of phrase tables
(useful if binary phrase tables are available)
--random-restarts=INT ... number of random restarts (default: 20)
--predictable-seeds ... provide predictable seeds to mert so that random
restarts are the same on every run
--range=tm:0..1,-1..1 ... specify min and max value for some features
--range can be repeated as needed.
The order of the various --range specifications
is important only within a feature name.
E.g.:
--range=tm:0..1,-1..1 --range=tm:0..2
is identical to:
--range=tm:0..1,-1..1,0..2
but not to:
--range=tm:0..2 --range=tm:0..1,-1..1
--activate-features=STRING ... comma-separated list of features to optimize,
others are fixed to the starting values
default: optimize all features
example: tm_0,tm_4,d_0
--prev-aggregate-nbestlist=INT ... number of previous step to consider when
loading data (default = -1)
-1 means all previous, i.e. from iteration 1
0 means no previous data, i.e. only the
current iteration
N means this and N previous iterations
--maximum-iterations=ITERS ... Maximum number of iterations. Default: 25
--return-best-dev ... Return the weights according to dev bleu, instead of returning
the last iteration
--random-directions ... search only in random directions
--number-of-random-directions=int ... number of random directions
(also works with regular optimizer, default: 0)
--pairwise-ranked ... Use PRO for optimisation (Hopkins and May, emnlp 2011)
--pro-starting-point ... Use PRO to get a starting point for MERT
--batch-mira ... Use Batch MIRA for optimisation (Cherry and Foster, NAACL 2012)
--hg-mira ... Use hypergraph MIRA, ie batch mira with hypergraphs instead of kbests.
--batch-mira-args=STRING ... args to pass through to batch/hg MIRA. This flag is useful to
change MIRA's hyperparameters such as regularization parameter C,
BLEU decay factor, and the number of iterations of MIRA.
--promix-training=STRING ... PRO-based mixture model training (Haddow, NAACL 2013)
--promix-tables=STRING ... Phrase tables for PRO-based mixture model training.
--threads=NUMBER ... Use multi-threaded mert (must be compiled in).
--historic-interpolation ... Interpolate optimized weights with prior iterations' weight
(parameter sets factor [0;1] given to current weights)
--spe-symal=SYMAL ... Use simulated post-editing when decoding.
(SYMAL aligns input to refs)
So MIRA was not run.
But when I changed the quote ’ to the quote ' in the command (note that these are two different characters in Ubuntu!), my problem was solved. I think the problem is related to Document Viewer in Ubuntu, because it changes the straight quote when copying text. Copying from the Firefox PDF viewer has no problem.
Regards
Davood
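A quick way to catch this kind of copy-and-paste problem before running a command is to scan it for typographic quotes. The sketch below is a hypothetical helper, not part of Moses. Which character Document Viewer actually substituted is not known from this thread, so the set of quote characters checked here is an assumption.

```python
# Typographic quotes that a PDF viewer might substitute for ASCII quotes
# (assumed set; extend as needed).
TYPOGRAPHIC_QUOTES = {
    "\u2018": "'",  # left single quotation mark
    "\u2019": "'",  # right single quotation mark
    "\u201c": '"',  # left double quotation mark
    "\u201d": '"',  # right double quotation mark
}


def find_typographic_quotes(command):
    """Return (index, character) pairs for every typographic quote in command."""
    return [(i, ch) for i, ch in enumerate(command) if ch in TYPOGRAPHIC_QUOTES]


def straighten_quotes(command):
    """Replace typographic quotes with their ASCII equivalents."""
    return command.translate(str.maketrans(TYPOGRAPHIC_QUOTES))


# Fragment of the tuning command as it might look after a bad paste.
pasted = "--batch-mira-args \u2019-J 300\u2019 --decoder-flags \u2019-threads 8 -v 0\u2019"
print(find_typographic_quotes(pasted))
print(straighten_quotes(pasted))
```

The shell does not treat typographic quotes as quoting characters, so an argument such as -J 300 wrapped in them is split on the space, which would be consistent with mert-moses.pl printing its usage message.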
------------------------------
End of Moses-support Digest, Vol 108, Issue 78
**********************************************