Moses-support Digest, Vol 122, Issue 41

Send Moses-support mailing list submissions to
moses-support@mit.edu

To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
moses-support-request@mit.edu

You can reach the person managing the list at
moses-support-owner@mit.edu

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."


Today's Topics:

1. Re: Need help for parallelisation in mosesserver (Tom Hoar)
2. Re: Moses-support Digest, Vol 122, Issue 38 (Hieu Hoang)


----------------------------------------------------------------------

Message: 1
Date: Thu, 29 Dec 2016 09:38:36 +0700
From: Tom Hoar <tahoar@pttools.net>
Subject: Re: [Moses-support] Need help for parallelisation in
mosesserver
To: moses-support@mit.edu
Message-ID: <28255fca-6ff5-665d-30d4-a0346c2fc610@pttools.net>
Content-Type: text/plain; charset="windows-1252"

Hi Shubham,

As far as I can tell, Mosesserver is beneficial when you want or need to
parallelize per-segment pre- and post-processing on a different machine
from the Mosesserver process. However, most (but not all) pre- and
post-processing steps impose minimal computational overhead. They can
typically process an entire batch of sentences (a document, or even
multiple documents) in one thread in a fraction of the time it takes to
run Moses.

So, I'm not sure what you intend to gain by running Mosesserver. It all
seems like a lot of unnecessary work. Why not encapsulate the standard
Moses executable in one daemonized top-level process? Include your pre-
and post-processing toolchains as sub-processes within the same daemon
and use standard pipes to/from Moses. Then use sockets (or another
mechanism) to pump docs in and out of the daemon. You have one consistent
interface that runs the same regardless of whether your batch has one
sentence or 300. Moses' per-sentence multiprocessing kicks in
automatically.
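
The wrapper described above could be sketched roughly as follows. This is only a minimal illustration, not Tom's actual implementation: the class name `DecoderPipe`, the `preprocess`/`postprocess` placeholders, and the `ECHO_CMD` stand-in are all hypothetical, and the sketch assumes the decoder writes exactly one output line per input line, in order, and flushes its output promptly. The socket front end is left out.

```python
import subprocess
import sys
import threading
from typing import List, Sequence

# Stand-in "decoder" for trying the wrapper without Moses: echoes each line.
ECHO_CMD = [sys.executable, "-u", "-c",
            "import sys\n"
            "for line in sys.stdin:\n"
            "    sys.stdout.write(line)\n"
            "    sys.stdout.flush()"]


def preprocess(sentence: str) -> str:
    """Hypothetical placeholder for tokenisation and casing."""
    return " ".join(sentence.split()).lower()


def postprocess(sentence: str) -> str:
    """Hypothetical placeholder for recasing and detokenisation."""
    s = sentence.strip()
    return s[:1].upper() + s[1:]


class DecoderPipe:
    """Keeps one decoder process alive and talks to it over standard pipes."""

    def __init__(self, cmd: Sequence[str]) -> None:
        self.proc = subprocess.Popen(
            list(cmd), stdin=subprocess.PIPE, stdout=subprocess.PIPE,
            text=True, bufsize=1)

    def translate_batch(self, sentences: Sequence[str]) -> List[str]:
        prepared = [preprocess(s) for s in sentences]

        def feed() -> None:
            # Write from a separate thread so a large batch cannot
            # deadlock when the pipe buffers fill up.
            for line in prepared:
                self.proc.stdin.write(line + "\n")
            self.proc.stdin.flush()

        writer = threading.Thread(target=feed)
        writer.start()
        outputs = []
        for _ in prepared:
            line = self.proc.stdout.readline()
            if line == "":
                raise RuntimeError("decoder closed its output early")
            outputs.append(line.rstrip("\n"))
        writer.join()
        return [postprocess(o) for o in outputs]

    def close(self) -> None:
        self.proc.stdin.close()
        self.proc.wait()
```

For a real system, the command passed to `DecoderPipe` would be the Moses executable with its usual arguments (for example `-f moses.ini -threads 16`) in place of `ECHO_CMD`. The same `translate_batch` call then serves a batch of one sentence or 300.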

Re "then it takes around 10 seconds to translate...": For a 10-token
sentence?! That's extremely excessive. This link shows one of our
customer's experiments with his EN-RU Slate Desktop for Windows system
(you'll need to register/log in to see it, but registration is free).
Slate Desktop has a Moses kernel running natively on Windows (i.e. no
CYGWIN). It uses a technique similar to the one I described above. The
customer's experiments average 1.5 seconds per sentence for up to 40
tokens.

http://support.pttools.net/support/discussions/topics/6000042166

These times include pre-/post-processing and MT connector overhead
between memoQ and his engine. The SMT model uses the slower (now legacy)
binarized phrase/reordering tables and binarized KenLM. The newer
compact tables would run faster, but there were complications making
them work on Windows. The CAT tool feeds the engine sentence by sentence,
so Moses is effectively running single-threaded. Linux performance is
comparable, maybe a little (5%) faster. You're running 6-7 times slower!
I expect your performance bottleneck lies somewhere else.

Tom



On 12/29/2016 4:51 AM, moses-support-request@mit.edu wrote:
> Date: Thu, 29 Dec 2016 00:53:41 +0530
> From: Shubham Khandelwal<skhlnmiit@gmail.com>
> Subject: [Moses-support] Need help for parallelisation in mosesserver
> To: moses-support<moses-support@mit.edu>
>
> Hello,
>
> Since mosesserver accepts only one sentence at a time, I am creating
> another component in front of mosesserver to handle tokenisation, casing
> and splitting, and to take care of parallelisation.
>
> Following is my procedure; let me know whether I am heading in the
> right direction:
>
> Suppose I have 5 different sentences (as a paragraph) to translate
> at once (fr-en). I would first start mosesserver on 5 different ports,
> run tokenisation, casing and splitting on the 5 sentences in parallel,
> pass them to those different ports, and then concatenate the output
> after recasing and detokenisation, also done in parallel.
>
> Let me know whether this is correct or not. If not, please suggest a
> better solution.
>
> Also, I have one more question. Suppose a sentence is composed of
> around 10 words, and I pass it for translation as follows:
> -> ~/mosesdecoder/bin/mosesserver -f moses.ini -threads 16 -b 0.000000001
>
> then it takes around 10 seconds to translate. To make it faster, I could
> run this on different ports, but I don't think that is a good idea:
> splitting a single sentence into multiple fragments and translating them
> on different ports separately can give a different meaning than
> translating the whole sentence on a single port.
> So basically, my question is how to split better in such cases while
> also taking care of parallelisation.
>
> -- Yours Sincerely, Shubham Khandelwal
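
As an alternative to starting one server per port, the paragraph-level parallelism asked about above could be sketched with a thread pool that sends whole sentences concurrently to a single server. This is a rough sketch under stated assumptions, not a tested recipe: it assumes mosesserver exposes an XML-RPC `translate` method at `http://localhost:8080/RPC2` that takes a struct with a `text` field and returns one, and that the server handles concurrent requests itself. The function names and the `max_workers` value are hypothetical.

```python
import xmlrpc.client
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence


def translate_via_xmlrpc(sentence: str,
                         url: str = "http://localhost:8080/RPC2") -> str:
    """Send one whole sentence to the server (URL and method are assumed)."""
    # A fresh proxy per call, since ServerProxy objects should not be
    # shared across threads.
    proxy = xmlrpc.client.ServerProxy(url)
    return proxy.translate({"text": sentence})["text"]


def translate_paragraph(sentences: Sequence[str],
                        translate_fn: Callable[[str], str] = translate_via_xmlrpc,
                        max_workers: int = 8) -> List[str]:
    """Translate sentences concurrently; results keep the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(translate_fn, sentences))
```

Each sentence is sent intact, so no sentence is split into fragments; only the paragraph is divided across workers. Passing a different `translate_fn` makes it possible to try the pool without a running server.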


------------------------------

Message: 2
Date: Thu, 29 Dec 2016 12:10:04 +0000
From: Hieu Hoang <hieuhoang@gmail.com>
Subject: Re: [Moses-support] Moses-support Digest, Vol 122, Issue 38
To: Mike Ladwig <mdladwig@gmail.com>, moses-support
<moses-support@mit.edu>
Message-ID:
<CAEKMkbiTJbN04NX1Oxd0=13UnTmugjz_fQ=MAitT8FVD35uKNA@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

Ah, thanks. I think it's a bug in that edge case: a wall at the end of the
sentence. Fixed it:

https://github.com/moses-smt/mosesdecoder/commit/c30b28f43b902e48e399ab5cf6c60f6f62c1fb50
A wall at the end of the sentence shouldn't have any effect, right?


Hieu Hoang
http://moses-smt.org/

On 28 December 2016 at 21:40, Mike Ladwig <mdladwig@gmail.com> wrote:

>> If you have an example, please send it over. It should work.
>
> bericht skinner <zone> <wall/> ( <wall/> <ne translation="@numv@"
> entity="a5?0235/2000">@numv@</ne> <wall/> ) <wall/> </zone>
>
> terminate called after throwing an instance of 'util::Exception'
> what(): contrib/moses2/PhraseBased/Sentence.cpp:87 in static
> Moses2::Sentence* Moses2::Sentence::CreateFromStringXML(Moses2::MemPool&,
> Moses2::FactorCollection&, const Moses2::System&, const string&) threw
> util::Exception because `xmlOption->startPos >= ret->GetSize()'.
> wall is beyond the sentence
> Aborted (core dumped)
>

------------------------------

_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


End of Moses-support Digest, Vol 122, Issue 41
**********************************************
