Send Moses-support mailing list submissions to
moses-support@mit.edu
To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
moses-support-request@mit.edu
You can reach the person managing the list at
moses-support-owner@mit.edu
When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."
Today's Topics:
1. Re: MGIZA is slower than GIZA (Hieu Hoang)
2. Re: MGIZA is slower than GIZA (Li Xiang)
----------------------------------------------------------------------
Message: 1
Date: Mon, 19 Jan 2015 19:52:37 +0000
From: Hieu Hoang <hieuhoang@gmail.com>
Subject: Re: [Moses-support] MGIZA is slower than GIZA
To: xiangli@me.com, moses-support <moses-support@mit.edu>
Message-ID: <54BD6085.5010101@gmail.com>
Content-Type: text/plain; charset="utf-8"
Hi Li
You're absolutely right, mgiza has gotten slower than giza++! I have an
mgiza build from 2 years ago which is about 2x faster than giza++ on
3 cores, but the current version is about 2x slower.
Currently rolling back to find the offending commit. Will get back to
you when I find it.
These are the timings:

*CURRENT MGIZA*
1. 25722.74user 904.54system 1:26:41elapsed 511%CPU (0avgtext+0avgdata 1906128maxresident)k
2. 24095.06user 978.64system 1:20:57elapsed 516%CPU (0avgtext+0avgdata 1906176maxresident)k

*GIZA++*
4902.41user 21.95system 43:54.45elapsed 186%CPU (0avgtext+0avgdata 1906144maxresident)k

*OLD MGIZA*
6576.71user 570.62system 24:09.90elapsed 492%CPU (0avgtext+0avgdata 1906144maxresident)k
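
As a rough check on the timings above, the elapsed (wall-clock) values can be converted to seconds and compared against the GIZA++ run. This is only a sketch: the helper names `to_seconds` and `ratio` are made up for illustration and are not part of Moses or MGIZA, and the input values are copied from the `time` output above.

```shell
#!/bin/sh
# Convert a `time`-style elapsed value ([h:]mm:ss[.ff]) to seconds.
# to_seconds is a hypothetical helper, not part of Moses or MGIZA.
to_seconds() {
  awk -v t="$1" 'BEGIN {
    n = split(t, p, ":")
    s = 0
    for (i = 1; i <= n; i++) s = s * 60 + p[i]
    printf "%.2f\n", s
  }'
}

# Elapsed time of run $1 divided by elapsed time of baseline $2.
# A value above 1 means the run was slower than the baseline.
ratio() {
  awk -v a="$(to_seconds "$1")" -v b="$(to_seconds "$2")" \
    'BEGIN { printf "%.2f\n", a / b }'
}

giza="43:54.45"   # GIZA++ elapsed time from the timings above

echo "current MGIZA, run 1: $(ratio 1:26:41 "$giza")x the GIZA++ time"   # 1.97
echo "current MGIZA, run 2: $(ratio 1:20:57 "$giza")x the GIZA++ time"   # 1.84
echo "old MGIZA:            $(ratio 24:09.90 "$giza")x the GIZA++ time"  # 0.55
```

On these numbers the current build takes roughly twice as long as GIZA++, while the old build takes a little over half as long.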
On 17/01/15 08:41, Li Xiang wrote:
> Hi,
>
> GIZA:
>> ${mosesScript}/training/train-model.perl \
>> --external-bin-dir "${binDir}" \
>> --root-dir "${trainDir}" \
>> --corpus train \
>> --f src \
>> --e ref \
>> --alignment grow-diag-final-and \
>> --parallel \
>> --first-step 1 \
>> --last-step 3
> MGIZA
>
>> ${mosesScript}/training/train-model.perl \
>> --external-bin-dir "${binDir}" \
>> --root-dir "${trainDir}" \
>> --corpus train \
>> --f src \
>> --e ref \
>> --alignment grow-diag-final-and \
>> --parallel \
>> --first-step 1 \
>> --last-step 3 \
>> --mgiza --mgiza-cpus 3
>
>
>> On 17 January 2015, at 16:39, Hieu Hoang <Hieu.Hoang@ed.ac.uk> wrote:
>>
>> OK, can you tell me what you ran for giza++ and mgiza?
>>
>> On 17 January 2015 at 08:29, Li Xiang <xiangli@me.com> wrote:
>>
>> Hi Hieu,
>>
>> I have sent you a 5K sample of the training data for evaluating
>> the performance. I get a similar result on it: mgiza is slower
>> than giza.
>>
>>
>>> On 17 January 2015, at 00:34, Hieu Hoang <Hieu.Hoang@ed.ac.uk> wrote:
>>>
>>> can you provide the training corpus so I can verify your results?
>>>
>>> On 16 January 2015 at 15:53, Li Xiang <lixiang.ict@gmail.com> wrote:
>>>
>>> Hi all,
>>>
>>> I trained the alignment model on the same data with the same
>>> parameters using GIZA and MGIZA respectively. The training
>>> corpus contains 200K sentences. My server has an Intel
>>> quad-core i7-4790K CPU, which has 4 cores with 2 threads
>>> each. GIZA took 2905 seconds, but MGIZA with 3 threads took
>>> 5259 seconds. I expected MGIZA to be much faster than GIZA,
>>> but I got the opposite result. I do not know whether the
>>> cause is the way I compiled it or something else.
>>>
>>> Does anyone have relevant experience? Thanks.
>>>
>>> The following is the training command for MGIZA. The
>>> training data is the FBIS zh-en data, but I cannot publish
>>> the data because of copyright.
>>>
>>>
>>> ${mosesScript}/training/train-model.perl \
>>> --external-bin-dir "${binDir}" \
>>> --root-dir "${trainDir}" \
>>> --corpus train \
>>> --f src \
>>> --e ref \
>>> --alignment grow-diag-final-and \
>>> --parallel \
>>> --first-step 1 \
>>> --last-step 3 \
>>> --mgiza --mgiza-cpus 3
>>> _______________________________________________
>>> Moses-support mailing list
>>> Moses-support@mit.edu
>>> http://mailman.mit.edu/mailman/listinfo/moses-support
>>>
>>>
>>>
>>>
>>> --
>>> Hieu Hoang
>>> Research Associate
>>> University of Edinburgh
>>> http://www.hoang.co.uk/hieu
>>>
>>
>>
>>
>>
>>
>> --
>> Hieu Hoang
>> Research Associate
>> University of Edinburgh
>> http://www.hoang.co.uk/hieu
>>
>
------------------------------
Message: 2
Date: Tue, 20 Jan 2015 08:18:16 +0800
From: Li Xiang <lixiang.ict@gmail.com>
Subject: Re: [Moses-support] MGIZA is slower than GIZA
To: Tom Hoar <tahoar@precisiontranslationtools.com>
Cc: moses-support <moses-support@mit.edu>
Message-ID:
<CA+fVw+7oG=EdWAW8PKPrxuym6XGM_708TdTTDTV9DDVhdcDx9g@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"
Hi,
I got a new result of 3456 seconds by setting --mgiza-cpus to 4. It is
still slower than GIZA++.
On 18 January 2015, at 16:25, Tom Hoar <tahoar@precisiontranslationtools.com> wrote:
Your command line sets "--mgiza-cpus 3" which assigns 3 CPUs to each
instance of mgiza on your quad-core machine. You also set the "--parallel"
flag. The notes say the --parallel flag runs "forward and inverse GIZA++ in
parallel". This means the forward and reverse instances of mgiza are trying
to use 3 CPUs each, or a total of 6 threads on an 8-threaded CPU. Sounds
reasonable.
We've been using dual-threaded quad-cores for a while. We set "--mgiza-cpus
8" and never use the "--parallel" flag. In this configuration, we never
see a slow-down. So, I'm wondering if the "--parallel" flag causes some
other kind of resource bottleneck.
Try dropping the "--parallel" flag, and set "--mgiza-cpus 8" to see what
happens.
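
The configuration suggested above can be sketched as the original command with "--parallel" removed and "--mgiza-cpus" raised to 8. This is an illustration only: the three path variables come from the original post, but the placeholder defaults, the `planned` variable and the `DRY_RUN` switch are assumptions added so the sketch can run without starting a training job.

```shell
#!/bin/sh
# Placeholder paths (assumptions); the original post does not give real values.
mosesScript="${mosesScript:-/path/to/moses/scripts}"
binDir="${binDir:-/path/to/external-bin-dir}"
trainDir="${trainDir:-/path/to/train-dir}"

# Same options as the original command, minus --parallel,
# with --mgiza-cpus raised from 3 to 8.
set -- \
  "${mosesScript}/training/train-model.perl" \
  --external-bin-dir "${binDir}" \
  --root-dir "${trainDir}" \
  --corpus train \
  --f src \
  --e ref \
  --alignment grow-diag-final-and \
  --first-step 1 \
  --last-step 3 \
  --mgiza --mgiza-cpus 8

planned="$*"

# DRY_RUN is a hypothetical switch: print the command by default,
# and only start training when DRY_RUN=0 is set explicitly.
if [ "${DRY_RUN:-1}" = "1" ]; then
  echo "$planned"
else
  "$@"
fi
```

Wrapping the run in `time`, as in the other measurements in this thread, would make the result directly comparable.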
On 01/16/2015 11:34 PM, Hieu Hoang wrote:
can you provide the training corpus so I can verify your results?
On 16 January 2015 at 15:53, Li Xiang <lixiang.ict@gmail.com> wrote:
> Hi all,
>
> I trained the alignment model on the same data with the same parameters
> using GIZA and MGIZA respectively. The training corpus contains 200K
> sentences. My server has an Intel quad-core i7-4790K CPU, which has 4
> cores with 2 threads each. GIZA took 2905 seconds, but MGIZA with 3
> threads took 5259 seconds. I expected MGIZA to be much faster than GIZA,
> but I got the opposite result. I do not know whether the cause is the way
> I compiled it or something else.
>
> Does anyone have relevant experience? Thanks.
>
> The following is the training command for MGIZA. The training data is
> the FBIS zh-en data, but I cannot publish the data because of copyright.
>
>
> ${mosesScript}/training/train-model.perl \
> --external-bin-dir "${binDir}" \
> --root-dir "${trainDir}" \
> --corpus train \
> --f src \
> --e ref \
> --alignment grow-diag-final-and \
> --parallel \
> --first-step 1 \
> --last-step 3 \
> --mgiza --mgiza-cpus 3
> _______________________________________________
> Moses-support mailing list
> Moses-support@mit.edu
> http://mailman.mit.edu/mailman/listinfo/moses-support
>
--
Hieu Hoang
Research Associate
University of Edinburgh
http://www.hoang.co.uk/hieu
------------------------------
_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support
End of Moses-support Digest, Vol 99, Issue 41
*********************************************