Send Moses-support mailing list submissions to
moses-support@mit.edu
To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
moses-support-request@mit.edu
You can reach the person managing the list at
moses-support-owner@mit.edu
When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."
Today's Topics:
1. Re: Help. First request to MosesServer very slow
(Kenneth Heafield)
2. Re: question about --return-best-dev in mert-moses
(Jorg Tiedemann)
3. Re: Help. First request to MosesServer very slow (Barry Haddow)
4. No Moses translation without model >=3 (Momo Jeng)
----------------------------------------------------------------------
Message: 1
Date: Thu, 06 Mar 2014 10:40:49 -0800
From: Kenneth Heafield <moses@kheafield.com>
Subject: Re: [Moses-support] Help. First request to MosesServer very
slow
To: moses-support@mit.edu
Message-ID: <5318C131.8080006@kheafield.com>
Content-Type: text/plain; charset=ISO-8859-1
Hi,
In my view, the threading design of the server is a bug. How about a
producer-consumer queue with multiple producers (the client connections)
and consumers (the decoding threads)? Each client connection owns a
producer-consumer queue as a return channel so that decoder threads can
return their results. Or we could use boost futures.
Kenneth
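[A minimal sketch of that design, in Python purely to illustrate the pattern -- the server itself is C++, and none of these names are Moses code. A shared request queue feeds a fixed pool of decoder threads, and each connection owns its own return queue:]

```python
import queue
import threading

# Shared queue: many client-connection threads produce, decoder threads consume.
requests = queue.Queue()

def decoder_worker():
    """Decoder thread: pull (sentence, reply_queue) pairs, push the result back."""
    while True:
        sentence, reply_queue = requests.get()
        if sentence is None:                 # poison pill: shut this worker down
            break
        reply_queue.put(sentence.upper())    # stand-in for the real decoder call

# Fixed pool of decoder threads, so per-thread state survives across requests.
workers = [threading.Thread(target=decoder_worker) for _ in range(4)]
for w in workers:
    w.start()

def handle_connection(sentences):
    """Client connection: owns its own return queue, as suggested above."""
    reply_queue = queue.Queue()
    for s in sentences:
        requests.put((s, reply_queue))
    # Results arrive in completion order, not necessarily submission order.
    return [reply_queue.get() for _ in sentences]

results = handle_connection(["hola mundo", "adios"])  # order may vary

for _ in workers:                            # shut the pool down
    requests.put((None, None))
for w in workers:
    w.join()
```

[With a fixed pool, the decoder threads -- and any per-thread caches they hold -- outlive individual connections, which is exactly what a thread-per-connection design loses.]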
On 03/06/14 10:30, Barry Haddow wrote:
> Hi Marcos
>
> I think the problem is that the rules (or phrase pairs) are now cached
> on a per thread basis. This is good for command-line Moses as it uses a
> pool of threads, and having per-thread caches means that there is no
> locking on the caches, as there used to be.
>
> mosesserver, afaik, creates a new thread for each connection, so it
> can't take advantage of the cache. This is done in the xmlrpc-c library
> so we don't have much control over it. If you dig around in the xmlrpc-c
> documentation (or code!) you might find a way to control the threading
> policy.
>
> I just spoke to Marcin about the problem, and we're not sure if loading
> the compact phrase table into memory would help, as you still would need
> the higher level cache (in PhraseDictionary). But you could try this anyway.
>
> cheers - Barry
>
> On 06/03/14 17:20, Marcos Fernandez wrote:
>> Hi, I am having an issue with MosesServer.
>>
>> I am using compact phrase and reordering table, and KENLM.
>>
>> The problem is this (I'll explain with an example):
>>
>> - I have one file with 20 very short sentences. I split and tokenize
>> them and send one XMLRPC request per sentence to MosesServer
>> - If I create just one XMLRPC ServerProxy instance and I use it to send
>> all the requests through it, all the sentences get translated in approx
>> 2.5 sec. The problem is that the first sentence takes almost 2 seconds
>> to get translated, while the other 19 are much faster
>> - If I create one ServerProxy instance per request, the translation time
>> rises to 30 sec (now every sentence takes almost 2 sec)
>>
>> I don't understand the reason for that delay on the first request. I
>> have traced the source of this delay to the function:
>>
>> GetTargetPhraseCollectionLEGACY(const Phrase& src)
>>
>> in the file: ...TranslationModel/PhraseDictionary.cpp
>>
>> It seems that the first request needs to look something up in the
>> phrase table, while for subsequent requests it can be retrieved (most
>> of the time) from a cache.
>>
>> But, as the sentences in my file are not related to one another in any
>> way, the information in this cache cannot be sentence-dependent, so why
>> couldn't the cache be preloaded with the information it needs?
>>
>> I think that perhaps I have something misconfigured, because I have seen
>> other people using the approach of creating one ServerProxy object for
>> each XMLRPC request (which would simplify things a lot for me), so I
>> don't think they are experiencing this overhead. Perhaps using the
>> compact formats has something to do with it?
>>
>> Any help would be much appreciated. I paste below my moses.ini, if that
>> helps:
>>
>> Thanks :)
>>
>> ### MOSES CONFIG FILE ###
>> ###################
>>
>> # input factors
>> [input-factors]
>> 0
>>
>> # mapping steps
>> [mapping]
>> 0 T 0
>>
>> # translation tables: table type (hierarchical(0), textual (0), binary
>> (1)), source-factors, target-factors, number of scores, file
>> # OLD FORMAT is still handled for back-compatibility
>> # OLD FORMAT translation tables: source-factors, target-factors, number
>> of scores, file
>> # OLD FORMAT a binary table type (1) is assumed
>> [ttable-file]
>> 12 0 0 5 /opt/moses-compiling/modelos/es-en/phrase-model/phrase-table
>>
>> # no generation models, no generation-file section
>>
>> # language models: type(srilm/irstlm), factors, order, file
>> [lmodel-file]
>> 8 0 5
>> /opt/moses-compiling/modelos/es-en/lm/13-19-03gen_intec_head8m_sb5LM.kenlm
>>
>>
>> # limit on how many phrase translations e for each phrase f are loaded
>> # 0 = all elements loaded
>> [ttable-limit]
>> 10
>>
>> # distortion (reordering) files
>> [distortion-file]
>> 0-0 wbe-msd-bidirectional-fe-allff 6
>> /opt/moses-compiling/modelos/es-en/phrase-model/reordering-table
>>
>> # distortion (reordering) weight
>> [weight-d]
>> 0.097107
>> 0.150373
>> -0.0551767
>> -0.0307787
>> 0.114613
>> 0.214587
>> 0.0467398
>>
>> # language model weights
>> [weight-l]
>> 0.0442748
>>
>>
>> # translation model weights
>> [weight-t]
>> 0.00370888
>> 0.0425665
>> 0.0719956
>> 0.0202699
>> 0.071147
>>
>> # no generation models, no weight-generation section
>>
>> # word penalty
>> [weight-w]
>> 0.0366626
>>
>> [distortion-limit]
>> 6
>>
>> [v]
>> 0
>>
>>
>
>
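[Barry's point about per-thread caches can be illustrated with a toy lookup table -- a Python analogy with invented names, not Moses code. A reused pool thread pays the slow phrase-table probe once; fresh thread-per-connection threads pay it on every request:]

```python
import threading

PHRASE_TABLE = {"casa": "house", "perro": "dog"}  # stand-in for the real table
tls = threading.local()      # per-thread storage: no locking needed on the cache
lookups = {"slow": 0}        # count hits on the (slow) backing table

def translate(word):
    cache = getattr(tls, "cache", None)
    if cache is None:
        cache = tls.cache = {}       # first call on this thread: cold cache
    if word not in cache:
        lookups["slow"] += 1         # simulate the expensive phrase-table probe
        cache[word] = PHRASE_TABLE.get(word, word)
    return cache[word]

# Thread pool (command-line Moses): the second request reuses the warm cache.
t = threading.Thread(target=lambda: [translate("casa"), translate("casa")])
t.start(); t.join()
pool_probes = lookups["slow"]        # 1 slow probe for 2 requests

# Thread-per-connection (mosesserver): every request pays the cold-cache cost.
lookups["slow"] = 0
for _ in range(2):
    t = threading.Thread(target=lambda: translate("casa"))
    t.start(); t.join()
per_conn_probes = lookups["slow"]    # 2 slow probes for 2 requests
```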
------------------------------
Message: 2
Date: Thu, 6 Mar 2014 20:02:06 +0100
From: Jorg Tiedemann <tiedeman@gmail.com>
Subject: Re: [Moses-support] question about --return-best-dev in
mert-moses
To: Barry Haddow <bhaddow@staffmail.ed.ac.uk>
Cc: moses-support <moses-support@mit.edu>
Message-ID: <3B0111D8-6986-4A01-B208-49701B10888C@gmail.com>
Content-Type: text/plain; charset="iso-8859-1"
OK - I see. But I can't tell you if it helps or not. I didn't do any systematic comparisons. Sorry.
Thanks for the explanation.
Jörg
Jörg Tiedemann
tiedeman@gmail.com
On Mar 6, 2014, at 7:37 PM, Barry Haddow wrote:
> Hi Jörg
>
> In each MERT iteration, the first action is to decode the tuning set and
> create an n-best list, using the current weight set. The 1-bests from
> this decoding run are the hypotheses which get scored by --return-best-dev.
>
> After that decoding, MERT searches for a weight set that can rerank the
> n-best lists to give a better BLEU, and stops when it reaches a local
> maximum. This is the BLEU that is reported in the moses.ini file. So it
> is a BLEU obtained by decoding with one weight set, and then reranking
> with a different weight set. When you redecode using the new weight set
> you do not get the same set of translations, since the nbest list is
> just a tiny sample of the hypotheses that are considered during
> decoding, so there will normally be hypotheses outwith the nbest list
> which have higher model score.
>
> We haven't generally used --return-best-dev with MERT - does it help?
> It's really designed for pro and kbmira.
>
> cheers - Barry
>
> On 06/03/14 11:28, Jorg Tiedemann wrote:
>> Hi,
>>
>> I have a question about the --return-best-dev flag in mert-moses.pl
>> I have run several experiments using this flag and I don't really
>> understand how it influences the choice of settings during MERT. In
>> many cases, the system will select an early iteration whose BLEU is
>> much lower than that of many later iterations. Maybe my confusion is
>> related to the BLEU score mentioned in the moses.ini files printed
>> after each iteration? Can someone help me? Thanks!
>>
>>
>> Cheers,
>> Jörg
>>
>>
>> Jörg Tiedemann
>> tiedeman@gmail.com <mailto:tiedeman@gmail.com>
>>
>>
>>
>>
>>
>>
>> _______________________________________________
>> Moses-support mailing list
>> Moses-support@mit.edu
>> http://mailman.mit.edu/mailman/listinfo/moses-support
>
>
> --
> The University of Edinburgh is a charitable body, registered in
> Scotland, with registration number SC005336.
>
> _______________________________________________
> Moses-support mailing list
> Moses-support@mit.edu
> http://mailman.mit.edu/mailman/listinfo/moses-support
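[The reranking step Barry describes above can be sketched as follows. The feature values and weights here are invented; only the mechanics mirror MERT: the model score is a dot product of weights and feature values, and reranking a fixed n-best list with new weights can pick a different 1-best than the old weights did.]

```python
# Each n-best entry: (hypothesis, feature vector). All values are invented.
nbest = [
    ("the house is red", (-4.1, -2.0, -7.5)),
    ("the red house",    (-3.8, -2.6, -7.0)),
    ("a red house",      (-4.5, -1.9, -7.2)),
]

def model_score(features, weights):
    """Linear model: dot product of feature values and weights."""
    return sum(f * w for f, w in zip(features, weights))

old_weights = (0.5, 0.3, 0.2)  # weights used for the decoding run
new_weights = (0.2, 0.6, 0.2)  # what the optimiser found on the n-best list

def rerank(nbest, weights):
    """Return the 1-best hypothesis under the given weight set."""
    return max(nbest, key=lambda hw: model_score(hw[1], weights))[0]

print(rerank(nbest, old_weights))  # "the red house"
print(rerank(nbest, new_weights))  # "a red house"
```

[Redecoding with the new weights is a different operation again: the decoder searches a much larger hypothesis space than the n-best sample, so it can surface hypotheses the reranking never saw.]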
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mailman.mit.edu/mailman/private/moses-support/attachments/20140306/2292bc98/attachment-0001.htm
------------------------------
Message: 3
Date: Thu, 06 Mar 2014 19:09:10 +0000
From: Barry Haddow <bhaddow@staffmail.ed.ac.uk>
Subject: Re: [Moses-support] Help. First request to MosesServer very
slow
To: Kenneth Heafield <moses@kheafield.com>, moses-support@mit.edu
Message-ID: <5318C7D6.2040802@staffmail.ed.ac.uk>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Hi
I agree that it's a bad design, but I'm not sure whether it can be fixed
without changing to a different xmlrpc library.
cheers - Barry
On 06/03/14 18:40, Kenneth Heafield wrote:
> Hi,
>
> In my view, the threading design of the server is a bug. How about a
> producer-consumer queue with multiple producers (the client connections)
> and consumers (the decoding threads)? Each client connection owns a
> producer-consumer queue as a return channel so that decoder threads can
> return their results. Or we could use boost futures.
>
> Kenneth
> [rest of quoted thread trimmed; quoted in full in Message 1 above]
> _______________________________________________
> Moses-support mailing list
> Moses-support@mit.edu
> http://mailman.mit.edu/mailman/listinfo/moses-support
>
--
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.
------------------------------
Message: 4
Date: Thu, 6 Mar 2014 13:00:22 -0800
From: Momo Jeng <momo_jeng@outlook.com>
Subject: [Moses-support] No Moses translation without model >=3
To: "moses-support@mit.edu" <moses-support@MIT.EDU>
Message-ID: <SNT151-W54F2F3AE955AABA2B81EDA85880@phx.gbl>
Content-Type: text/plain; charset="iso-8859-1"
I'm having a problem getting results from Moses, although I think it's really a problem with GIZA++; please let me know if there's a better place for GIZA questions.
When I run Moses instructing GIZA++ to do only Model 1 and HMM iterations ( "--giza-option model1iterations=3,hmmiterations=3,model3iterations=0,model4iterations=0" ), Moses fails, because GIZA++ doesn't produce an alignment file (the .A3.final file).
Based on a quick look at the GIZA++ code, this failure is explicit in the code. In main.cpp, m3.viterbi(...) is only called at line 652 if at least one iteration is set for model 3, 4, 5, or 6, and m3.viterbi(...) is where the code for writing the alignment file is called. So I'm wondering if this is a bug, or by design. Is there some reason that I shouldn't create alignments without using model 3 or higher?
Momo
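[For what it's worth, a Viterbi alignment is well defined without Model 3: under Model 1 it just links each word to the candidate with the highest lexical translation probability. A toy sketch with invented probabilities -- not GIZA++ code, and aligning source words to target positions purely for illustration:]

```python
# t(f|e): toy Model 1 translation probabilities (invented numbers).
t_prob = {
    ("la", "the"): 0.7, ("la", "house"): 0.1,
    ("casa", "the"): 0.1, ("casa", "house"): 0.8,
}

def viterbi_align(src, tgt):
    """Model 1 Viterbi alignment: each source word links to its argmax target.

    Under Model 1 the words are aligned independently, so the Viterbi
    alignment is just a per-word argmax -- no Model 3+ machinery required.
    """
    return [
        max(range(len(tgt)), key=lambda j: t_prob.get((f, tgt[j]), 0.0))
        for f in src
    ]

print(viterbi_align(["la", "casa"], ["the", "house"]))  # [0, 1]
```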
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mailman.mit.edu/mailman/private/moses-support/attachments/20140306/c2d5a60e/attachment.htm
------------------------------
_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support
End of Moses-support Digest, Vol 89, Issue 15
*********************************************