Moses-support Digest, Vol 84, Issue 9

Send Moses-support mailing list submissions to
moses-support@mit.edu

To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
moses-support-request@mit.edu

You can reach the person managing the list at
moses-support-owner@mit.edu

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."


Today's Topics:

1. Re: lattices with EPSILON (Ondrej Bojar)


----------------------------------------------------------------------

Message: 1
Date: Fri, 04 Oct 2013 23:20:59 +0200
From: Ondrej Bojar <bojar@ufal.mff.cuni.cz>
Subject: Re: [Moses-support] lattices with EPSILON
To: Hieu Hoang <Hieu.Hoang@ed.ac.uk>, cdec-users@googlegroups.com
Cc: moses-support <moses-support@mit.edu>
Message-ID: <fd2ffe95-5a71-412c-95a5-2b925dc98803@email.android.com>
Content-Type: text/plain; charset=utf-8

Hi,

while you can always run rmepsilon from OpenFst or another toolkit, epsilon edges will probably be particularly useful if one uses different semirings for different components of the score vector. Generic toolkits process all components of the score vector in the same way. If Moses features do the "plus" of their respective scores on their own, each feature can use its own semiring.
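
To make the per-component idea concrete, here is a minimal sketch, not Moses or OpenFst code. It assumes scores are log-probabilities, and the names `Semiring`, `MAX_PLUS`, `LOG_SUM` and `merge_paths` are hypothetical, introduced only for illustration. When two paths collapse into one (for example after epsilon removal), each component of the score vector is combined with the "plus" of its own semiring:

```python
import math
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass(frozen=True)
class Semiring:
    """Hypothetical container: 'plus' merges alternatives, 'times' extends a path."""
    name: str
    plus: Callable[[float, float], float]
    times: Callable[[float, float], float]


def log_add(a: float, b: float) -> float:
    """Return log(exp(a) + exp(b)) without overflowing."""
    hi, lo = max(a, b), min(a, b)
    if hi == -math.inf:  # both alternatives have zero probability
        return -math.inf
    return hi + math.log1p(math.exp(lo - hi))


# Both semirings work on log-probabilities, so "times" is addition.
MAX_PLUS = Semiring("max-plus", max, lambda a, b: a + b)     # keep the best path
LOG_SUM = Semiring("log-sum", log_add, lambda a, b: a + b)   # sum over paths


def merge_paths(u: Sequence[float], v: Sequence[float],
                semirings: Sequence[Semiring]) -> List[float]:
    """Merge two score vectors, one semiring per component (assumed design)."""
    return [s.plus(a, b) for s, a, b in zip(semirings, u, v)]


# Component 0 keeps the better alternative, component 1 sums both.
merged = merge_paths([math.log(0.5), math.log(0.25)],
                     [math.log(0.25), math.log(0.25)],
                     [MAX_PLUS, LOG_SUM])
```

A generic toolkit would apply one of these two "plus" operations to the whole vector; the sketch only shows what mixing them per component could look like.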

The number of paths probably explodes the most (in some sense) when the lattice has the form of a confusion network (no epsilons): you get the full Cartesian product of the choices for the first token, the second token, etc.
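
A small sketch of that count, under stated assumptions: the confusion network is a list of columns of alternative tokens, and a lattice is a list of `(src, dst)` edges whose node ids are already in topological order (`src < dst`). The function names and the toy data are hypothetical, not taken from Moses.

```python
from itertools import product
from math import prod


def count_cn_paths(columns):
    """Number of paths through a confusion network: product of column sizes."""
    return prod(len(col) for col in columns)


def enumerate_cn_paths(columns):
    """Lazily yield every path, i.e. the Cartesian product of the columns."""
    return product(*columns)


def count_lattice_paths(edges, num_nodes):
    """Count paths from node 0 to node num_nodes - 1 in a DAG.

    Assumes src < dst for every edge, so processing edges in sorted order
    guarantees a node's count is final before its outgoing edges are used.
    Parallel edges (repeated pairs) count as separate paths.
    """
    paths = [0] * num_nodes
    paths[0] = 1
    for src, dst in sorted(edges):
        paths[dst] += paths[src]
    return paths[-1]


# Toy confusion network with 2, 3 and 2 alternatives per position.
cn = [["the", "a"], ["cat", "cap", "cut"], ["sat", "sad"]]
print(count_cn_paths(cn))  # 12

# The same network written as a lattice with parallel edges.
cn_as_lattice = [(0, 1)] * 2 + [(1, 2)] * 3 + [(2, 3)] * 2
print(count_lattice_paths(cn_as_lattice, 4))  # 12
```

Counting needs only one pass over the edges, whereas extracting every path explicitly takes time and memory proportional to the product itself.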

Cheers, Ondrej.

"Hieu Hoang" <Hieu.Hoang@ed.ac.uk> wrote:

>@nicola - i didn't see a reason either, but some lattices from a speech
>recognizer contain them, so i was just curious. I think chris has a point -
>they may be easier to create.
>
>I think they may also be more efficient to decode. In a non-deterministic
>lattice, you might have 2 edges with the same symbol coming out of 1
>node. Each would have to be decoded separately.
>
>However, it's a pain to decode epsilons and there might be weird edge cases,
>eg. consecutive epsilons, epsilons at the beginning and end, input that is
>entirely epsilons.
>
>@chris - cheers for the explanation. i might use victor's code and see how
>it goes.
>
>Do you have an example (large) lattice that blows up memory that you can
>share?
>
>Yes - i've changed the code to extract all possible paths. In fact, i
>extract all paths from beginning to end of sentence, without limit. 2
>reasons for this
> 1. I also divorced the path creation from the phrase-table
>lookup. In the general case there are multiple phrase-tables so it's
>difficult to keep track of the tries. Also, the intertwining of the binary
>pt lookup with lattices made the code difficult to read.
> 2. I want to give each feature function the opportunity to score with
>full knowledge of the path.
>
>This may have to be altered if the memory explosion is too drastic
>
>
>
>
>On 4 October 2013 17:49, Chris Dyer <cdyer@cs.cmu.edu> wrote:
>
>> It's useful to have epsilons since it simplifies the creation of
>> lattices in some cases. Yes, you can convert them to a deterministic
>> equivalent, but that involves implementing FSA determinization (or
>> using a tool like https://pypi.python.org/pypi/pyfst), which may not
>> be convenient.
>>
>> Btw, I've also noticed that memory usage with lattices/CNs explodes
>> with non-binarized phrase tables (maybe also with binarized PTs?).
>> This is independent of the size of the phrase table and only seems to
>> be a function of the lattice structure. I'm not sure what's going on
>> (the code has changed substantially since I last looked at it). But,
>> you should always match paths in the lattice with paths in the phrase
>> table trie- maybe moses is now trying to extract all possible paths in
>> the lattice up to max-phrase-size or something?
>>
>> On Fri, Oct 4, 2013 at 11:22 AM, Nicola Bertoldi <bertoldi@fbk.eu> wrote:
>> > I don't see any reason why a lattice should contain an EPSILON edge.
>> >
>> > In a confusion network, EPSILONs are needed to allow the translation of
>> > input of different lengths.
>> > The sausage structure of the CN imposes the same number of source words,
>> > and the EPSILONs overcome this constraint.
>> >
>> > This is not the case for a lattice, because you can have any number of
>> > edges/words in a complete source path.
>> >
>> >
>> > cheers,
>> > Nicola
>> >
>> >
>> >
>> > On Oct 4, 2013, at 2:52 PM, Hieu Hoang wrote:
>> >
>> > I'm just looking at the lattice decoding, as implemented in moses.
>> >
>> > for confusion networks, it's fair to have EPSILON words (that represent
>> > blank words). However, I don't see the point of them in lattices.
>> >
>> > Anyone have an opinion? How is it implemented in cdec & joshua?
>> >
>> > --
>> > Hieu Hoang
>> > Research Associate
>> > University of Edinburgh
>> > http://www.hoang.co.uk/hieu
>> >
>> > _______________________________________________
>> > Moses-support mailing list
>> > Moses-support@mit.edu<mailto:Moses-support@mit.edu>
>> > http://mailman.mit.edu/mailman/listinfo/moses-support
>> >
>> >
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "cdec users" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to cdec-users+unsubscribe@googlegroups.com.
>> For more options, visit https://groups.google.com/groups/opt_out.
>>
>
>
>
>--
>Hieu Hoang
>Research Associate
>University of Edinburgh
>http://www.hoang.co.uk/hieu
>_______________________________________________
>Moses-support mailing list
>Moses-support@mit.edu
>http://mailman.mit.edu/mailman/listinfo/moses-support

--
Ondrej Bojar
http://www.cuni.cz/~obo



------------------------------

_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


End of Moses-support Digest, Vol 84, Issue 9
********************************************
