Moses-support Digest, Vol 106, Issue 13

Send Moses-support mailing list submissions to
moses-support@mit.edu

To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
moses-support-request@mit.edu

You can reach the person managing the list at
moses-support-owner@mit.edu

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."

Today's Topics:

1. Re: EMS results - makes sense ? (Vincent Nguyen)
2. Re: Performance issue using Moses Server with Moses 3
(probably same as Oren's) (Martin Baumgärtner)

----------------------------------------------------------------------

Message: 1
Date: Thu, 6 Aug 2015 09:40:40 +0200
From: Vincent Nguyen <vnguyen@neuf.fr>
Subject: Re: [Moses-support] EMS results - makes sense ?
To: Barry Haddow <bhaddow@inf.ed.ac.uk>, Hieu Hoang
<hieuhoang@gmail.com>, moses-support <moses-support@mit.edu>
Message-ID: <55C30F78.40409@neuf.fr>
Content-Type: text/plain; charset="windows-1252"

So I dropped my hierarchical model since I got an error.
I switched back to the "more data" approach by adding the Giga FR-EN source,
but now another error pops up when running GIZA inverse:

Using SCRIPTS_ROOTDIR: /home/moses/mosesdecoder/scripts
Using multi-thread GIZA
using gzip
(2) running giza @ Wed Aug 5 21:03:56 CEST 2015
(2.1a) running snt2cooc fr-en @ Wed Aug 5 21:03:56 CEST 2015
Executing: mkdir -p /home/moses/working/training/giza-inverse.7
Executing: /home/moses/working/bin/training-tools/mgizapp/snt2cooc
/home/moses/working/training/giza-inverse.7/fr-en.cooc
/home/moses/working/training/prepared.7/en.vcb
/home/moses/working/training/prepared.7/fr.vcb
/home/moses/working/training/prepared.7/fr-en-int-train.snt
line 1000
line 2000

...
line 6609000
line 6610000
ERROR: Execution of:
/home/moses/working/bin/training-tools/mgizapp/snt2cooc
/home/moses/working/training/giza-inverse.7/fr-en.cooc
/home/moses/working/training/prepared.7/en.vcb
/home/moses/working/training/prepared.7/fr.vcb
/home/moses/working/training/prepared.7/fr-en-int-train.snt
died with signal 9, without coredump

any clue what signal 9 means ?
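
Signal 9 is SIGKILL on Linux. It cannot be caught, so the killed process gets no chance to print an error of its own. A frequent cause for a memory-hungry step such as snt2cooc on a much larger corpus is the kernel's out-of-memory killer, although that is only an assumption here; the kernel log (dmesg) would confirm or rule it out. The error line itself reads like a decoded Unix wait status. Below is a minimal C++ sketch of that decoding, assuming the conventional layout (signal in the low 7 bits, core-dump flag in bit 7, exit code in the high byte). The names WaitStatus, decodeWaitStatus and killedBySigkill are illustrative and not part of Moses or MGIZA.

```cpp
#include <signal.h>  // SIGKILL (POSIX)

// Fields packed into a conventional 16-bit Unix wait status.
struct WaitStatus {
    int  signal;      // low 7 bits: terminating signal, 0 if none
    bool coreDumped;  // bit 7: set if a core file was written
    int  exitCode;    // high byte: exit code after a normal exit
};

// Split a raw wait status into its three fields.
constexpr WaitStatus decodeWaitStatus(int status) {
    return WaitStatus{status & 127, (status & 128) != 0, (status >> 8) & 255};
}

// True when the process was terminated by SIGKILL (signal 9 on Linux).
constexpr bool killedBySigkill(int status) {
    return decodeWaitStatus(status).signal == SIGKILL;
}
```

For a raw status of 9, decodeWaitStatus reports signal 9 and no core dump, which matches "died with signal 9, without coredump" in the log above.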

On 04/08/2015 17:28, Barry Haddow wrote:
> Hi Vincent
>
> If you are comparing to the results of WMT11, then you can look at the
> system descriptions to see what the authors did. In fact it's worth
> looking at the WMT14 descriptions (WMT15 will be available next month)
> to see how state-of-the-art systems are built.
>
> For fr-en or en-fr, the first thing to look at is the data. There are
> some large data sets released for WMT and you can get a good gain from
> just crunching more data (monolingual and parallel). Unfortunately
> this takes more resources (disk, cpu etc) so you may run into trouble
> here.
>
> The hierarchical models are much bigger so yes you will need more
> disk. For fr-en/en-fr it's probably not worth the extra effort,
>
> cheers - Barry
>
> On 04/08/15 15:58, Vincent Nguyen wrote:
>> thanks for your insights.
>>
>> I am just struck by the BLEU difference between my 26 and the 30 of
>> WMT11, and some WMT14 results close to 36 or even 39.
>>
>> I am currently having trouble using a hierarchical rule set instead of
>> lexical reordering. I was wondering whether it would give better
>> results, but I get a "filesystem root low disk space" error message
>> before it crashes.
>> Does this model take more disk space in some way?
>>
>> Next I will try to use more corpora, including in-domain data from my
>> internal TMX.
>>
>> thanks for your answers.
>>
>> On 04/08/2015 16:02, Hieu Hoang wrote:
>>>
>>> On 03/08/2015 13:00, Vincent Nguyen wrote:
>>>> Hi,
>>>>
>>>> Just a heads up on some EMS results, to get your experienced opinions.
>>>>
>>>> Corpus: Europarlv7 + NC2010
>>>> fr => en
>>>> Evaluation NC2011.
>>>>
>>>> 1) IRSTLM is much slower than KenLM for training / tuning.
>>> that sounds right. KenLM is also multithreaded, IRSTLM can only be
>>> used in single-threaded decoding.
>>>> 2) BLEU results are almost the same (25.7 with IRSTLM, 26.14 with
>>>> KenLM)
>>> true
>>>> 3) Compact mode is faster than onDisk in a short test (77 segments:
>>>> 96 seconds vs 126 seconds)
>>> true
>>>> 4) One last thing I do not understand, though:
>>>> As a check, I replaced NC2011 with NC2010 in the evaluation (I know
>>>> this is not a valid test, since NC2010 is part of the training data).
>>>> I got roughly the same BLEU score. I would have expected a higher
>>>> score with a test set included in the training corpus.
>>>>
>>>> makes sense ?
>>>>
>>>>
>>>> Next steps :
>>>> Which path should I take to get better scores? I read the 'optimize'
>>>> section of the website, which deals more with speed,
>>>> and of course I will apply all of this, but I am interested in tips
>>>> for getting better quality if possible.
>>> look into domain adaptation if you have multiple training corpora,
>>> some of which are in-domain and some out-of-domain.
>>>
>>> Other than that, getting a good BLEU score is an open research question.
>>>
>>> Well done on getting this far
>>>>
>>>> Thanks
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> Moses-support mailing list
>>>> Moses-support@mit.edu
>>>> http://mailman.mit.edu/mailman/listinfo/moses-support
>>>>
>>
>
>


------------------------------

Message: 2
Date: Thu, 06 Aug 2015 11:06:49 +0200
From: Martin Baumgärtner <martin.baumgaertner@star-group.net>
Subject: Re: [Moses-support] Performance issue using Moses Server with
Moses 3 (probably same as Oren's)
To: Hieu Hoang <hieuhoang@gmail.com>
Cc: moses-support <moses-support@mit.edu>, Barry Haddow
<bhaddow@staffmail.ed.ac.uk>
Message-ID: <55C323A9.2030208@star-group.net>
Content-Type: text/plain; charset="utf-8"

Hi Hieu,

I just sent you the patch for mosesserver, because my attempts to push to
GitHub failed for some unknown reason ;-)

Best regards,
Martin

On 05.08.2015 at 14:23, Hieu Hoang wrote:
>
> It would be good if you can check in your change and take charge of it.
>
> If you're waiting for us academics to fix it, you'll be waiting a long
> time. We rarely use the server, we don't know what the issues are and
> we won't know if we've really fixed it when we change it
>
> Hieu Hoang
> Sent while bumping into things
>
> On 5 Aug 2015 4:15 pm, "Martin Baumgärtner"
> <martin.baumgaertner@star-group.net> wrote:
>
> Hi Oren,
>
> we temporarily fixed this issue with the following quick hack for
> Abyss server's constructor call:
>
> xmlrpc_c::serverAbyss myAbyssServer(
>     xmlrpc_c::serverAbyss::constrOpt()
>         .registryP(&myRegistry)
>         .portNumber(port)      // TCP port on which to listen
>         .logFileName(logfile)
>         .allowOrigin("*")
>         // *4: unofficial quick hack for the performance issue
>         .maxConn((unsigned int)numThreads * 4)
> );
>
> I'm also looking forward to the official fix, i.e. a configurable
> value for abyss connections ...
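
A configurable value of the kind asked for above could look roughly like the following sketch. It is not the official fix: the function abyssMaxConn, its requested parameter (standing in for a user-supplied option, 0 meaning "not set") and the fallback factor are all hypothetical names chosen for illustration. Only the idea of decoupling the Abyss connection limit from the --threads value comes from this thread.

```cpp
// Hypothetical helper: choose the Abyss connection limit.
// If the user supplied an explicit value, use it; otherwise fall back to
// numThreads * fallbackFactor, as in the quick hack quoted above.
// A thread count of 0 is treated as 1 so the limit is never 0.
constexpr unsigned int abyssMaxConn(unsigned int numThreads,
                                    unsigned int requested = 0,
                                    unsigned int fallbackFactor = 4) {
    return requested != 0
               ? requested
               : (numThreads == 0 ? 1u : numThreads) * fallbackFactor;
}
```

In the constructor call, the hard-coded expression would then become .maxConn(abyssMaxConn(numThreads, requested)), where requested would be read from a new command-line option that does not exist yet.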
>
> Kind regards,
> Martin
>
>
> On 04.08.2015 at 09:08, Oren wrote:
>> Hi Barry and Martin,
>>
>> Has this issue been fixed in the source code? Should I take the
>> current master branch and compile it myself to avoid this issue?
>>
>> Thanks.
>>
>> On Friday, July 24, 2015, Barry Haddow
>> <bhaddow@staffmail.ed.ac.uk> wrote:
>>
>> Hi Martin
>>
>> So it looks like it was the abyss connection limit that was
>> causing the problem? I'm not sure why this should be, either
>> it should queue the jobs up or discard them.
>>
>> Probably Moses server should allow users to configure the
>> number of abyss connections directly rather than tying it to
>> the number of Moses threads.
>>
>> cheers - Barry
>>
>> On 24/07/15 14:17, Martin Baumgärtner wrote:
>>> Hi Barry,
>>>
>>> thanks for your quick reply!
>>>
>>> We're currently testing on SHA
>>> e53ad4085942872f1c4ce75cb99afe66137e1e17 (master, from
>>> 2015-07-23). This version includes the fix for mosesserver
>>> recently mentioned by Hieu in the performance thread.
>>>
>>> Following my first intuition, I ran the critical experiments
>>> after having modified mosesserver.cpp just by simply
>>> doubling the given --threads value, but only for abyss
>>> server: .maxConn((unsigned int)numThreads*2):
>>>
>>> 2.)
>>> server: --threads: 8 (i.e. abyss: 16)
>>> client: shoots 10 threads => about 11 seconds, server shows
>>> busy CPU workload => OK
>>>
>>> 5.)
>>> server: --threads: 16 (i.e. abyss: 32)
>>> client: shoots 20 threads => about 11 seconds, server shows
>>> busy CPU workload => OK
>>>
>>> Helps. :-)
>>>
>>> Best wishes,
>>> Martin
>>>
>>> On 24.07.2015 at 13:26, Barry Haddow wrote:
>>>> Hi Martin
>>>>
>>>> Thanks for the detailed information. It's a bit strange
>>>> since command-line Moses uses the same threadpool, and we
>>>> always overload the threadpool since the entire test set is
>>>> read in and queued.
>>>>
>>>> The server was refactored somewhat recently - which git
>>>> revision are you using?
>>>>
>>>> In the case where Moses takes a long time, and cpu activity
>>>> is low, it could be either waiting on IO, or waiting on
>>>> locks. If the former, I don't know why it works fine for
>>>> command-line Moses, and if the latter then it's odd how it
>>>> eventually frees itself.
>>>>
>>>> Is it possible to run scenario 2, then attach a debugger
>>>> whilst Moses is in the low-CPU phase to see what it is
>>>> doing? (You can do this in gdb with "info threads")
>>>>
>>>> cheers - Barry
>>>>
>>>> On 24/07/15 12:07, Martin Baumgärtner wrote:
>>>>> Hi,
>>>>>
>>>>> I have followed your discussion about the mosesserver performance
>>>>> issue with much interest so far.
>>>>>
>>>>> We're seeing similar behaviour in our performance tests
>>>>> with a current GitHub master clone. Both mosesserver and the
>>>>> complete engine run on the same local machine, i.e. no NFS.
>>>>> The machine is a virtualized CentOS 7 on Hyper-V:
>>>>>
>>>>> > lscpu
>>>>>
>>>>> Architecture: x86_64
>>>>> CPU op-mode(s): 32-bit, 64-bit
>>>>> Byte Order: Little Endian
>>>>> CPU(s): 8
>>>>> On-line CPU(s) list: 0-7
>>>>> Thread(s) per core: 1
>>>>> Core(s) per socket: 8
>>>>> Socket(s): 1
>>>>> NUMA node(s): 1
>>>>> Vendor ID: GenuineIntel
>>>>> CPU family: 6
>>>>> Model: 30
>>>>> Model name: Intel(R) Core(TM) i7 CPU 860 @ 2.80GHz
>>>>> Stepping: 5
>>>>> CPU MHz: 2667.859
>>>>> BogoMIPS: 5335.71
>>>>> Hypervisor vendor: Microsoft
>>>>> Virtualization type: full
>>>>> L1d cache: 32K
>>>>> L1i cache: 32K
>>>>> L2 cache: 256K
>>>>> L3 cache: 8192K
>>>>>
>>>>>
>>>>> The following experiments use an engine with 75000 segments
>>>>> for TM/LM (--minphr-memory, --minlexr-memory):
>>>>>
>>>>> 1.)
>>>>> server: --threads: 8
>>>>> client: shoots 8 threads => about 12 seconds, server shows
>>>>> full CPU workload => OK
>>>>>
>>>>> 2.)
>>>>> server: --threads: 8
>>>>> client: shoots 10 threads => about 85 seconds, server
>>>>> shows mostly low activity, full CPU workload only near end
>>>>> of process => NOT OK
>>>>>
>>>>> 3.)
>>>>> server: --threads: 16
>>>>> client: shoots 10 threads => about 12 seconds, server
>>>>> shows busy CPU workload => OK
>>>>>
>>>>> 4.)
>>>>> server: --threads: 16
>>>>> client: shoots 16 threads => about 11 seconds, server
>>>>> shows busy CPU workload => OK
>>>>>
>>>>> 5.)
>>>>> server: --threads: 16
>>>>> client: shoots 20 threads => about 40-60 seconds
>>>>> (depending), server shows mostly low activity, full CPU
>>>>> workload only near end of process => NOT OK
>>>>>
>>>>>
>>>>> We always see a breakdown in performance when the number of
>>>>> client threads exceeds the number of threads given by the
>>>>> --threads param.
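
The observation can be written as a simple predicate and checked against the five runs listed above. The sketch below only restates the reported numbers; the type Run and the functions overloaded and ruleHolds are illustrative names, and the 40-60 second range of run 5 is stored as a low/high pair.

```cpp
// One reported experiment: server --threads value, client threads,
// reported duration range in seconds, and the OK / NOT OK verdict.
struct Run {
    int  serverThreads;
    int  clientThreads;
    int  secondsLow;
    int  secondsHigh;
    bool ok;
};

constexpr Run kRuns[] = {
    {8, 8, 12, 12, true},     // 1.)
    {8, 10, 85, 85, false},   // 2.)
    {16, 10, 12, 12, true},   // 3.)
    {16, 16, 11, 11, true},   // 4.)
    {16, 20, 40, 60, false},  // 5.)
};
constexpr int kNumRuns = sizeof(kRuns) / sizeof(kRuns[0]);

// A run is overloaded when the client opens more threads than the server has.
constexpr bool overloaded(const Run& r) {
    return r.clientThreads > r.serverThreads;
}

// Checks that a run is NOT OK exactly when it is overloaded.
// Written recursively so it stays a valid C++11 constexpr function.
constexpr bool ruleHolds(const Run* runs, int n) {
    return n == 0 ||
           ((overloaded(runs[0]) != runs[0].ok) && ruleHolds(runs + 1, n - 1));
}
```

ruleHolds(kRuns, kNumRuns) evaluates to true for these five runs, so the data are consistent with the stated rule; five runs do not prove it in general.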
>>>>>
>>>>> Kind regards,
>>>>> Martin
>>>>>
>>>>> --
>>>>>
>>>>> *STAR Group* <http://www.star-group.net>
>>>>> <http://www.star-group.net/>
>>>>>
>>>>> *Martin Baumgärtner*
>>>>>
>>>>> STAR Language Technology & Solutions GmbH
>>>>> Umberto-Nobile-Straße 19 | 71063 Sindelfingen | Germany
>>>>> Tel. +49 70 31-4 10 92-0 martin.baumgaertner@star-group.net
>>>>> Fax +49 70 31-4 10 92-70 www.star-group.net
>>>>> <http://www.star-group.net/>
>>>>> Geschäftsführer: Oliver Rau, Bernd Barth
>>>>> Handelsregister Stuttgart HRB 245654 | St.-Nr. 56098/11677
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>> The University of Edinburgh is a charitable body, registered in
>>>> Scotland, with registration number SC005336.
>>>
>>>
>>
>
>



------------------------------

_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support

End of Moses-support Digest, Vol 106, Issue 13
**********************************************
