Send Moses-support mailing list submissions to
moses-support@mit.edu
To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
moses-support-request@mit.edu
You can reach the person managing the list at
moses-support-owner@mit.edu
When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."
Today's Topics:
1. Re: BLEU score difference about 0.13 for one dataset is
normal? (Michael Denkowski)
2. Re: BLEU score difference about 0.13 for one dataset is
normal? (Tom Hoar)
3. Mixture modeling for OSM model (Hassan Sajjad)
4. Re: Number of CPUs and cores in training for server with 24
cores? (Davood Mohammadifar)
----------------------------------------------------------------------
Message: 1
Date: Wed, 14 Oct 2015 12:13:25 -0400
From: Michael Denkowski <michael.j.denkowski@gmail.com>
Subject: Re: [Moses-support] BLEU score difference about 0.13 for one
dataset is normal?
To: Davood Mohammadifar <davood_mf@hotmail.com>, Moses Support
<moses-support@mit.edu>
Message-ID:
<CA+-GegLbXfFz3fh693WXC_V_A9BaaFT73OD3D6CMc81ZgAU7Gw@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"
Hi Davood,
If you're comparing two versions of the system to see what effect your work
has on translation quality, you can run Jon Clark's MultEval
<https://github.com/jhclark/multeval> (an implementation of the hypothesis
testing described in the paper). From the BLEU differences you reported,
1000 sentences should be enough to get pretty stable results for your
system. If you run MERT 3 times for each system and MultEval reports
statistically significant improvement across all metrics (BLEU, TER,
Meteor), that's a pretty good indicator that the system is better.
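For example, with three tuning runs per system, the MultEval call would look
something like this (the file names are placeholders for your own outputs,
one hypothesis file per MERT run):

./multeval.sh eval --refs refs.test.en \
    --hyps-baseline baseline.run1.en baseline.run2.en baseline.run3.en \
    --hyps-sys1 newsys.run1.en newsys.run2.en newsys.run3.en \
    --meteor.language en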
Best,
Michael
On Wed, Oct 14, 2015 at 1:50 AM, Davood Mohammadifar <davood_mf@hotmail.com>
wrote:
> Thanks Michael for the paper, and thanks Tom.
>
> Based on the paper, one solution is to replicate MERT and testing at
> least three times.
>
> My ideas have subtle effects on BLEU. Do you recommend running MERT and
> testing three times or more? Should I increase the number of sentences for
> tuning?
>
> My dataset for Persian to English includes:
> Training: about 240000 sentences
> Tune: 1000 sentences
> Test: 1000 sentences
>
> ------------------------------
> From: tahoar@precisiontranslationtools.com
> Date: Sun, 11 Oct 2015 12:53:37 +0700
> To: moses-support@mit.edu
> Subject: Re: [Moses-support] BLEU score difference about 0.13 for one
> dataset is normal?
>
>
> Yes. Each tuning run with the same test set will give you small variations in
> the final BLEU. Yours look like they're in a normal range.
>
>
>
> Date: Sun, 11 Oct 2015 04:23:56 +0000
> From: Davood Mohammadifar <davood_mf@hotmail.com>
> Subject: [Moses-support] BLEU score difference about 0.13 for one
> dataset is normal?
> To: Moses Support <moses-support@mit.edu>
>
> Hello everyone,
>
> I noticed different BLEU scores for the same dataset. The difference is
> not large, about 0.13.
>
> I trained on my dataset and tuned on a development set for Persian-English
> translation. After testing, the score was 21.95. The second time I ran the
> same process, I obtained 21.82. (My tools were mgiza, mert, ...)
>
> Is this difference normal?
>
> My system:
> CPU: Core i7-4790K
> RAM: 16GB
> OS: ubuntu 12.04
>
> Thanks
>
> _______________________________________________ Moses-support mailing list
> Moses-support@mit.edu
> http://mailman.mit.edu/mailman/listinfo/moses-support
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mailman.mit.edu/mailman/private/moses-support/attachments/20151014/73bd844a/attachment-0001.html
------------------------------
Message: 2
Date: Thu, 15 Oct 2015 00:15:34 +0700
From: Tom Hoar <tahoar@precisiontranslationtools.com>
Subject: Re: [Moses-support] BLEU score difference about 0.13 for one
dataset is normal?
To: Moses Support <moses-support@mit.edu>
Message-ID: <561E8DB6.2070101@precisiontranslationtools.com>
Content-Type: text/plain; charset="windows-1252"
Davood,
I don't know enough about your data and use cases to recommend one way
or another. Running MERT multiple times will give you different BLEU
scores, but I have never found the deltas to make a difference in a
production environment.
Tom
On 10/14/2015 12:50 PM, Davood Mohammadifar wrote:
> Thanks Michael for the paper, and thanks Tom.
>
> Based on the paper, one solution is to replicate MERT and testing at
> least three times.
>
> My ideas have subtle effects on BLEU. Do you recommend running MERT and
> testing three times or more? Should I increase the number of sentences
> for tuning?
>
> My dataset for Persian to English includes:
> Training: about 240000 sentences
> Tune: 1000 sentences
> Test: 1000 sentences
>
> ------------------------------------------------------------------------
> From: tahoar@precisiontranslationtools.com
> Date: Sun, 11 Oct 2015 12:53:37 +0700
> To: moses-support@mit.edu
> Subject: Re: [Moses-support] BLEU score difference about 0.13 for one
> dataset is normal?
>
> Yes. Each tuning run with the same test set will give you small variations
> in the final BLEU. Yours look like they're in a normal range.
>
>
>
> Date: Sun, 11 Oct 2015 04:23:56 +0000
> From: Davood Mohammadifar <davood_mf@hotmail.com>
> Subject: [Moses-support] BLEU score difference about 0.13 for one
> dataset is normal?
> To: Moses Support <moses-support@mit.edu>
>
> Hello everyone,
>
> I noticed different BLEU scores for the same dataset. The difference
> is not large, about 0.13.
>
> I trained on my dataset and tuned on a development set for Persian-English
> translation. After testing, the score was 21.95. The second time I ran
> the same process, I obtained 21.82. (My tools were mgiza, mert, ...)
>
> Is this difference normal?
>
> My system:
> CPU: Core i7-4790K
> RAM: 16GB
> OS: ubuntu 12.04
>
> Thanks
>
> _______________________________________________ Moses-support mailing
> list Moses-support@mit.edu
> http://mailman.mit.edu/mailman/listinfo/moses-support
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mailman.mit.edu/mailman/private/moses-support/attachments/20151014/960e5985/attachment-0001.html
------------------------------
Message: 3
Date: Thu, 15 Oct 2015 15:11:10 +0300
From: Hassan Sajjad <sajjad@ims.uni-stuttgart.de>
Subject: [Moses-support] Mixture modeling for OSM model
To: moses-support <Moses-support@mit.edu>
Message-ID:
<CAOiX71arGKBAjrn-WF7nWE4_xXtwOchHqQxEXoxkqeF2Sy74fA@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"
Hi all,
We have modified the script for the Operation Sequence Model (OSM) to
incorporate language-model-like interpolation (of OSM sub-models trained
individually on each domain), with interpolation weights optimized on the
tuning set. We found this to be useful in our recent experiments. Please
refer to the following paper for details:
Using Joint Models for Domain Adaptation in Statistical Machine Translation
<http://alt.qcri.org/~ndurrani/pubs/joint-models-domain.pdf> @ MT Summit
(2015)
The scripts for data selection using OSM and NNJM interpolation will be
committed later.
The information on how to invoke the interpolated OSM model can be found
here:
http://www.statmt.org/moses/?n=Advanced.Domain#ntoc
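If you use EMS, enabling it should look roughly like the following sketch;
please treat the page above as authoritative and check the exact setting
names there:

[TRAINING]
operation-sequence-model = "yes"
operation-sequence-model-order = 5
interpolated-operation-sequence-model = "yes"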
Best,
Hassan
Arabic Language Technologies - QCRI
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mailman.mit.edu/mailman/private/moses-support/attachments/20151015/95763304/attachment-0001.html
------------------------------
Message: 4
Date: Thu, 15 Oct 2015 12:47:03 +0000
From: Davood Mohammadifar <davood_mf@hotmail.com>
Subject: Re: [Moses-support] Number of CPUs and cores in training for
server with 24 cores?
To: Moses Support <moses-support@mit.edu>
Message-ID: <SNT150-W55745EAA1DE057F2CF93B28C3E0@phx.gbl>
Content-Type: text/plain; charset="iso-8859-1"
Hello everyone.
I didn't want to open a new topic, so I am stating my problem in my old thread.
Currently I use a Core i7-4790K CPU for running the Moses tools. Intel says it has 4 cores and 8 threads and supports multi-threading. What is a suitable command for running MGIZA?
train-model.perl -mgiza -mgiza-cpus ?? -cores ??
Should I use the -parallel option?
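Applying the rule of thumb from the earlier reply below (with -parallel, the
two MGIZA runs execute simultaneously, so each should get half the hardware
threads), my guess for this 4-core/8-thread machine would be something like:

train-model.perl -mgiza -mgiza-cpus 4 -cores 8 -parallel

but please correct me if -mgiza-cpus should count physical cores rather than
hyper-threads.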
From: davood_mf@hotmail.com
To: moses-support@mit.edu; phi@jhu.edu
Subject: FW: [Moses-support] Number of CPUs and cores in training for server with 24 cores?
Date: Fri, 26 Jun 2015 14:57:35 +0000
Thanks a lot Mr Koehn.
> Date: Fri, 26 Jun 2015 09:06:24 -0400
> Subject: Re: [Moses-support] Number of CPUs and cores in training for server with 24 cores?
> From: phi@jhu.edu
> To: davood_mf@hotmail.com
> CC: moses-support@mit.edu
>
> Hi,
>
> if you have 24 cores, then
>
> -mgiza -mgiza-cpus 12 -cores 24 -parallel
>
> sounds plausible to me.
>
> The "parallel" will parallelize the 2 mgiza runs, so you should use 12
> CPUs for each.
>
> The 24 cores setting is relevant for phrase table building.
>
> -phi
>
>
> On Fri, Jun 26, 2015 at 6:45 AM, Davood Mohammadifar
> <davood_mf@hotmail.com> wrote:
> > Hello everyone
> >
> > I want to use mgiza for training. What is your recommendation for the number
> > of cores and CPUs for good performance and quality in the train-model
> > command? Is 24 suitable for both? What is your recommendation for other
> > options (such as -parallel, ...)?
> >
> > train-model.perl -mgiza -mgiza-cpus ?? -cores ?? ...
> >
> >
> > System:
> > Number of cores: 24
> > RAM: 40GB
> > OS: Ubuntu 12.04
> >
> > Thanks
> >
> > _______________________________________________
> > Moses-support mailing list
> > Moses-support@mit.edu
> > http://mailman.mit.edu/mailman/listinfo/moses-support
> >
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mailman.mit.edu/mailman/private/moses-support/attachments/20151015/f86c8f1c/attachment.html
------------------------------
_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support
End of Moses-support Digest, Vol 108, Issue 51
**********************************************