Moses-support Digest, Vol 121, Issue 31

Send Moses-support mailing list submissions to
moses-support@mit.edu

To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
moses-support-request@mit.edu

You can reach the person managing the list at
moses-support-owner@mit.edu

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."


Today's Topics:

1. Re: Syntax-based Constrained Decoding (Shuoyang Ding)
2. Re: Syntax-based Constrained Decoding (Hieu Hoang)
3. Does PhraseDictionaryMultiModel require all models to contain
all phrases? (Lane Schwartz)


----------------------------------------------------------------------

Message: 1
Date: Tue, 15 Nov 2016 15:00:31 -0500
From: Shuoyang Ding <mtsding@gmail.com>
Subject: Re: [Moses-support] Syntax-based Constrained Decoding
To: Hieu Hoang <hieuhoang@gmail.com>
Cc: Moses <moses-support@mit.edu>
Message-ID: <B3810206-A7EC-45CB-9F86-560771BB6D6F@gmail.com>
Content-Type: text/plain; charset="utf-8"

Hi Hieu,

I've made changes 1, 2, and 4 before emailing you, and the coverage didn't change much. It turns out the bottleneck is beam-threshold: the default value was 1e-5, which is a pretty tough limit for constrained decoding.

After setting it to 0 I played around a little with the cube-pruning pop limit. Coverage is around 25% to 40% depending on the limit you use, but higher coverage comes with longer decoding time, as one would expect.

Still, for string-to-tree constrained decoding the easiest route may be to decode with phrase tables built per sentence, since decoding is generally slower. Even then, the default beam-threshold needs to be overridden to make it work properly.
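
For reference, here is what the overrides look like assembled into one moses.ini fragment (a sketch only; "PhraseDictionaryWHATEVER" is a stand-in for whichever phrase-table feature your config actually uses, with its other arguments omitted):

```ini
# moses.ini sketch of the pruning overrides discussed above.
# The feature line is a placeholder; keep your existing name=, path=,
# input-factor= and output-factor= arguments and only add table-limit=0.
[feature]
PhraseDictionaryWHATEVER table-limit=0

[cube-pruning-pop-limit]
1000000

[beam-threshold]
0

[stack]
1000000
```

Lowering cube-pruning-pop-limit from 1000000 trades coverage for speed, which is where the 25% to 40% range above comes from.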

Hope the info helps.

Regards,
Shuoyang Ding

Ph.D. Student
Center for Language and Speech Processing
Department of Computer Science
Johns Hopkins University

Hackerman Hall 225A
3400 N. Charles St.
Baltimore, MD 21218

http://cs.jhu.edu/~sding

> On Oct 28, 2016, at 9:27 AM, Hieu Hoang <hieuhoang@gmail.com> wrote:
>
> good point. The decoder is set up to translate quickly, so there are a few pruning parameters that throw out low-scoring rules or hypotheses.
>
> These are some of the pruning parameters you'll need to change (there may be more):
> 1. [feature]
> PhraseDictionaryWHATEVER table-limit=0
> 2. [cube-pruning-pop-limit]
> 1000000
> 3. [beam-threshold]
> 0
> 4. [stack]
> 1000000
> Make the changes one at a time in case any of them makes decoding too slow, even with constrained decoding.
>
> It may be that you have to run the decoder with phrase tables that are trained on only one sentence at a time.
>
> I'll be interested to know how you get on, so let me know how it goes.
>
> On 26/10/2016 13:56, Shuoyang Ding wrote:
>> Hi All,
>>
>> I'm trying to do syntax-based constrained decoding on the same data from which I extracted my rules, and I'm getting very low coverage (~12%). I'm using GHKM rule extraction, which in theory should be able to reconstruct the target translation even with only minimal rules.
>>
>> Judging from the search-graph output, the decoder seems to prune out rules with very low scores, even when such a rule is the only one that can reconstruct the original reference.
>>
>> I'm curious whether there is a way to disable pruning in the current constrained-decoding implementation, or at least whether it would be feasible to do so?
>>
>> Thanks!
>>
>> Regards,
>> Shuoyang Ding
>>
>> Ph.D. Student
>> Center for Language and Speech Processing
>> Department of Computer Science
>> Johns Hopkins University
>>
>> Hackerman Hall 225A
>> 3400 N. Charles St.
>> Baltimore, MD 21218
>>
>> http://cs.jhu.edu/~sding
>>
>>
>> _______________________________________________
>> Moses-support mailing list
>> Moses-support@mit.edu
>> http://mailman.mit.edu/mailman/listinfo/moses-support
>

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mailman.mit.edu/mailman/private/moses-support/attachments/20161115/91bd3154/attachment-0001.html

------------------------------

Message: 2
Date: Wed, 16 Nov 2016 10:51:57 +0000
From: Hieu Hoang <hieuhoang@gmail.com>
Subject: Re: [Moses-support] Syntax-based Constrained Decoding
To: Shuoyang Ding <mtsding@gmail.com>
Cc: Moses <moses-support@mit.edu>
Message-ID: <e708f211-5661-44ef-9071-f847334935a1@gmail.com>
Content-Type: text/plain; charset="utf-8"

Good to know that constrained decoding works. And yes, the
reachability of the training data is only theoretical in the absence of
pruning such as cube pruning, beam thresholds, etc.


On 15/11/2016 20:00, Shuoyang Ding wrote:
> Hi Hieu,
>
> I've made changes 1, 2, and 4 before emailing you, and the coverage didn't
> change much. It turns out the bottleneck is beam-threshold: the
> default value was 1e-5, which is a pretty tough limit for constrained
> decoding.
>
> After setting it to 0 I played around a little with the cube-pruning
> pop limit. Coverage is around 25% to 40% depending on the limit you
> use, but higher coverage comes with longer decoding time, as
> one would expect.
>
> Still, for string-to-tree constrained decoding the easiest route may
> be to decode with phrase tables built per sentence, since
> decoding is generally slower. Even then, the default
> beam-threshold needs to be overridden to make it
> work properly.
>
> Hope the info helps.
>
> Regards,
> Shuoyang Ding
>
> Ph.D. Student
> Center for Language and Speech Processing
> Department of Computer Science
> Johns Hopkins University
>
> Hackerman Hall 225A
> 3400 N. Charles St.
> Baltimore, MD 21218
>
> http://cs.jhu.edu/~sding
>
>> On Oct 28, 2016, at 9:27 AM, Hieu Hoang <hieuhoang@gmail.com
>> <mailto:hieuhoang@gmail.com>> wrote:
>>
>> good point. The decoder is set up to translate quickly, so there are a
>> few pruning parameters that throw out low-scoring rules or hypotheses.
>>
>> These are some of the pruning parameters you'll need to change (there
>> may be more):
>> 1. [feature]
>> PhraseDictionaryWHATEVER table-limit=0
>> 2. [cube-pruning-pop-limit]
>> 1000000
>> 3. [beam-threshold]
>> 0
>> 4. [stack]
>> 1000000
>> Make the changes one at a time in case any of them makes decoding too slow, even
>> with constrained decoding.
>>
>> It may be that you have to run the decoder with phrase tables that
>> are trained on only one sentence at a time.
>>
>> I'll be interested to know how you get on, so let me know how it goes.
>>
>> On 26/10/2016 13:56, Shuoyang Ding wrote:
>>> Hi All,
>>>
>>> I'm trying to do syntax-based constrained decoding on the same data
>>> from which I extracted my rules, and I'm getting very low coverage
>>> (~12%). I'm using GHKM rule extraction, which in theory should be
>>> able to reconstruct the target translation even with only minimal rules.
>>>
>>> Judging from the search-graph output, the decoder seems to prune out
>>> rules with very low scores, even when such a rule is the only one that
>>> can reconstruct the original reference.
>>>
>>> I'm curious whether there is a way to disable pruning in the current
>>> constrained-decoding implementation, or at least whether it
>>> would be feasible to do so?
>>>
>>> Thanks!
>>>
>>> Regards,
>>> Shuoyang Ding
>>>
>>> Ph.D. Student
>>> Center for Language and Speech Processing
>>> Department of Computer Science
>>> Johns Hopkins University
>>>
>>> Hackerman Hall 225A
>>> 3400 N. Charles St.
>>> Baltimore, MD 21218
>>>
>>> http://cs.jhu.edu/~sding
>>>
>>>
>>> _______________________________________________
>>> Moses-support mailing list
>>> Moses-support@mit.edu
>>> http://mailman.mit.edu/mailman/listinfo/moses-support
>>
>

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mailman.mit.edu/mailman/private/moses-support/attachments/20161116/b5b1a452/attachment-0001.html

------------------------------

Message: 3
Date: Wed, 16 Nov 2016 09:41:23 -0600
From: Lane Schwartz <dowobeha@gmail.com>
Subject: [Moses-support] Does PhraseDictionaryMultiModel require all
models to contain all phrases?
To: "moses-support@mit.edu" <moses-support@mit.edu>
Message-ID:
<CABv3vZm9_4mq39kafvXHrNSzBu95W_EuCtdpTT5XZUbu9=Lwcw@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

Hi,

I'm potentially interested in using PhraseDictionaryMultiModel, and in how it
differs from PhraseDictionaryGroup.

With PhraseDictionaryMultiModel, is it OK to have disjoint phrase tables?
That is, can PhraseDictionaryMultiModel handle the situation where a
translation option is present in one table but not the other(s)?
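
For concreteness, the kind of setup I have in mind looks roughly like this (a sketch only; the names, paths, and lambda values are made up, and the argument names follow my reading of the multimodel documentation, so treat them as assumptions):

```ini
# moses.ini sketch: two component tables combined by PhraseDictionaryMultiModel.
# All names, paths, and weights here are illustrative, not a tested configuration.
[feature]
PhraseDictionaryMemory name=TranslationModel0 num-features=4 path=/path/to/model1/phrase-table input-factor=0 output-factor=0
PhraseDictionaryMemory name=TranslationModel1 num-features=4 path=/path/to/model2/phrase-table input-factor=0 output-factor=0
PhraseDictionaryMultiModel name=TranslationModel2 num-features=4 components=TranslationModel0,TranslationModel1 mode=interpolate lambda=0.5,0.5
```

The question above then becomes: what happens when a source phrase is found in TranslationModel0 but not in TranslationModel1?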

Thanks,
Lane
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mailman.mit.edu/mailman/private/moses-support/attachments/20161116/41ebaef5/attachment.html

------------------------------

_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


End of Moses-support Digest, Vol 121, Issue 31
**********************************************
