Moses-support Digest, Vol 88, Issue 55

Send Moses-support mailing list submissions to
moses-support@mit.edu

To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
moses-support-request@mit.edu

You can reach the person managing the list at
moses-support-owner@mit.edu

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."

Today's Topics:

1. Moses training performance (Andrzej Zydron)
2. Re: Moses training performance (Hieu Hoang)
3. Restrict WordTranslationFeature to certain pairs
(Marcin Junczys-Dowmunt)
4. Re: Moses training performance (Andrzej Zydron)

----------------------------------------------------------------------

Message: 1
Date: Tue, 25 Feb 2014 18:01:06 +0000
From: Andrzej Zydron <azydron@xtm-intl.com>
Subject: [Moses-support] Moses training performance
To: moses-support <moses-support@mit.edu>
Message-ID: <530CDA62.9050203@xtm-intl.com>
Content-Type: text/plain; charset=UTF-8; format=flowed

Dear Support,

I realize that there may not be a simple answer, but I would like to
understand why training on a 9300-segment corpus takes about three
times as long on a 12 core Xeon E5-1650v2 with 128GB RAM running CentOS
6.5 as on my MacBook Pro (4 core i7 3720QM, 8GB RAM) running Mavericks.
I am at a loss to explain it. On the Xeon server I used a 28GB RAMDISK to
simulate an SSD and make the comparison fairer. I used mgiza, the same
data and identical settings on both machines, with the official
Moses 2.1 Git distribution compiled and linked on each machine.

These are the timings (hh:mm:ss) for the MacBook Pro 4 core i7 3720QM
8GB RAM SSD:

Step        Start     End       Time taken
mkcls       10:18:50  10:19:23  00:00:33
snt2cooc    10:19:23  10:19:25  00:00:02
mgiza       10:19:25  10:31:58  00:12:33
extract     10:31:58  10:32:04  00:00:06
score       10:32:04  10:32:14  00:00:10
reordering  10:32:14  10:32:17  00:00:03

Total                           00:13:27

and these for the 12 core Xeon E5-1650v2 128GB RAM using a 28GB RAMDISK
for all the data:

Step        Start     End       Time taken
mkcls       09:44:24  09:49:00  00:04:36
snt2cooc    09:49:00  09:49:23  00:00:23
mgiza       09:49:23  10:23:32  00:34:09
extract     10:23:32  10:24:20  00:00:48
score       10:24:20  10:26:08  00:01:48
reordering  10:26:08  10:26:20  00:00:12

Total                           00:41:56

I know that the Mac is a superb machine (the best I have ever put my
hands on), but I find it difficult to understand why it should be so
much faster than a state-of-the-art Xeon server for Moses training.
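
As a rough cross-check of the figures above, the per-step slowdown can be
computed directly from the "Time taken" columns. This is only a sketch: the
durations are copied from the two tables, and the helper names (to_seconds,
slowdown, total_seconds) are made up for illustration.

```python
def to_seconds(hms):
    """Convert an hh:mm:ss string to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


# "Time taken" values copied from the two timing tables.
MAC = {
    "mkcls": "00:00:33",
    "snt2cooc": "00:00:02",
    "mgiza": "00:12:33",
    "extract": "00:00:06",
    "score": "00:00:10",
    "reordering": "00:00:03",
}
XEON = {
    "mkcls": "00:04:36",
    "snt2cooc": "00:00:23",
    "mgiza": "00:34:09",
    "extract": "00:00:48",
    "score": "00:01:48",
    "reordering": "00:00:12",
}


def slowdown(step):
    """How many times longer the step took on the Xeon than on the Mac."""
    return to_seconds(XEON[step]) / to_seconds(MAC[step])


def total_seconds(timings):
    """Sum of all step durations, in seconds."""
    return sum(to_seconds(value) for value in timings.values())


if __name__ == "__main__":
    for step in MAC:
        print(f"{step:<11}{slowdown(step):5.1f}x")
    overall = total_seconds(XEON) / total_seconds(MAC)
    print(f"{'total':<11}{overall:5.1f}x")
```

By these figures mgiza has the smallest relative slowdown (about 2.7x) but
accounts for most of the extra wall-clock time, while the shorter steps are
between 4x and 11.5x slower. The overall ratio comes out at roughly 3.1x.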

Best Regards,

Andrzej Zydroń

---------------------------------------

CTO

*XTM International Ltd.*

PO Box 2167, Gerrards Cross, SL9 8XF, UK

email: azydron@xtm-intl.com <mailto:azydron@xtm-intl.com>

Tel: +44 (0) 1753 480 479

Mob: +44 (0) 7966 477 181

skype: Zydron

www.xtm-intl.com <http://www.xtm-intl.com/>

------------------------------

Message: 2
Date: Tue, 25 Feb 2014 18:19:16 +0000
From: Hieu Hoang <Hieu.Hoang@ed.ac.uk>
Subject: Re: [Moses-support] Moses training performance
To: Andrzej Zydron <azydron@xtm-intl.com>
Cc: moses-support <moses-support@mit.edu>
Message-ID:
<CAEKMkbjOwqKoAVNBut2g57OriB3CA0iUOCPk7rPf5aKNVyg7rA@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

Strange and interesting.

I can think of 3 issues:
1. The number of cores isn't relevant unless you explicitly ask mgiza and
the various extraction steps to use multiple cores.
2. It looks like mgiza is the issue.
3. I'm not sure how IO-bound mgiza is. However, in my tests with virtual
machines, IO-bound processes are slow:

http://www.hanselman.com/blog/VMPerformanceChecklistBeforeYouComplainThatYourVirtualMachineIsSlow.aspx

The same may be true of a RAM disk.
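
To illustrate the first point, here is a minimal sketch of how a training
command that explicitly requests multiple cores might be assembled. It
assumes the train-model.perl options -mgiza, -mgiza-cpus, -parallel and
-cores behave as described in the Moses training documentation; check them
against your Moses version. The script path and the helper name
build_training_command are hypothetical, and the other required arguments
(corpus, languages, language model, external bin dir) are left out.

```python
import os


def build_training_command(train_script, cores=None):
    """Return an argument list asking Moses training to use several cores.

    train_script is the path to train-model.perl (hypothetical here).
    cores defaults to the number of CPUs the OS reports.
    """
    if cores is None:
        cores = os.cpu_count() or 1
    return [
        train_script,
        "-mgiza",                   # use mgiza instead of GIZA++
        "-mgiza-cpus", str(cores),  # threads for word alignment
        "-parallel",                # run both alignment directions at once
        "-cores", str(cores),       # cores for the extraction/scoring steps
    ]


if __name__ == "__main__":
    # Hypothetical path; append the usual corpus and model arguments.
    print(" ".join(build_training_command("scripts/training/train-model.perl")))
```

Without options along these lines, the extra cores on the server would sit
idle and single-thread speed would decide the result.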

--
Hieu Hoang
Research Associate
University of Edinburgh
http://www.hoang.co.uk/hieu

------------------------------

Message: 3
Date: Tue, 25 Feb 2014 20:54:35 +0100
From: Marcin Junczys-Dowmunt <junczys@amu.edu.pl>
Subject: [Moses-support] Restrict WordTranslationFeature to certain
pairs
To: moses-support <moses-support@mit.edu>
Message-ID: <530CF4FB.9030803@amu.edu.pl>
Content-Type: text/plain; charset=UTF-8; format=flowed

Hi,
Is there a non-programming way to restrict WordTranslationFeature to
specific pairs rather than the complete product of two separate source
and target word lists?
Best,
Marcin

------------------------------

Message: 4
Date: Tue, 25 Feb 2014 20:37:12 +0000
From: Andrzej Zydron <azydron@xtm-intl.com>
Subject: Re: [Moses-support] Moses training performance
To: Hieu Hoang <Hieu.Hoang@ed.ac.uk>
Cc: moses-support <moses-support@mit.edu>
Message-ID: <530CFEF8.6060905@xtm-intl.com>
Content-Type: text/plain; charset="us-ascii"

An HTML attachment was scrubbed...
URL: http://mailman.mit.edu/mailman/private/moses-support/attachments/20140225/48e9d573/attachment.htm

------------------------------

_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support

End of Moses-support Digest, Vol 88, Issue 55
*********************************************
