Send Moses-support mailing list submissions to
moses-support@mit.edu
To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
moses-support-request@mit.edu
You can reach the person managing the list at
moses-support-owner@mit.edu
When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."
Today's Topics:
1. Re: using sparse features (Eva Hasler)
----------------------------------------------------------------------
Message: 1
Date: Fri, 14 Nov 2014 10:24:21 +0000
From: Eva Hasler <evahasler@gmail.com>
Subject: Re: [Moses-support] using sparse features
To: Marcin Junczys-Dowmunt <junczys@amu.edu.pl>
Cc: moses-support <moses-support@mit.edu>, Barry Haddow
<bhaddow@staffmail.ed.ac.uk>
Message-ID:
<CABsbj8QfiK3Vm=zUAmUdEwkTiBvWTSqb3G95sWOBkXGXbUhkPg@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"
In comparison to MERT? Not really. We compared English-French and
German-English at IWSLT 2012, and the baseline scores were a bit higher for
En-Fr and a bit lower for De-En.
But of course the point is that you can use more features, so you have to
define useful feature sets that are sparse but still able to generalise.
On Fri, Nov 14, 2014 at 10:16 AM, Marcin Junczys-Dowmunt <junczys@amu.edu.pl> wrote:
> Speed aside, quality did not improve significantly?
>
> W dniu 14.11.2014 o 11:11, Eva Hasler pisze:
>
> Let's say there was a bit of disillusionment about the advantages of
> online vs batch MIRA. Online MIRA was slow in comparison, but that's also
> because the implementation was still in a development state and not
> optimised.
>
>
> On Fri, Nov 14, 2014 at 9:59 AM, Marcin Junczys-Dowmunt <junczys@amu.edu.pl> wrote:
>
>> Thanks. For some reason I usually get quite weak results with kbmira.
>> What happened to that interesting online MIRA idea? Died due to lack of
>> maintenance?
>>
>> W dniu 14.11.2014 o 10:54, Barry Haddow pisze:
>> > Hi Marcin
>> >
>> > Our default option would be kbmira (k-best batch MIRA). It seems to be
>> > the most stable.
>> >
>> > cheers - Barry
>> >
>> > On 14/11/14 09:43, Marcin Junczys-Dowmunt wrote:
>> >> Apropos MIRA, what's the current best practice tuner for sparse
>> >> features? What are you guys using now for say WMT-grade systems?
>> >>
>> >> W dniu 14.11.2014 o 10:39, Barry Haddow pisze:
>> >>> Hi Prashant
>> >>>
>> >>> We had to do this kind of dynamic weight update for online MIRA. The
>> >>> code is still there, although it might have rotted; start by looking
>> >>> at the weight update methods in StaticData,
>> >>>
>> >>> cheers - Barry
>> >>>
>> >>> On 13/11/14 17:05, Prashant Mathur wrote:
>> >>>> But in the CAT scenario we do it like this:
>> >>>>
>> >>>> translate: sentence 1
>> >>>> tune: sentence 1 , post-edit 1
>> >>>> translate: sentence 2
>> >>>> tune: sentence 2 , post-edit 2
>> >>>> ...
>> >>>>
>> >>>> In this case, I don't have any features generated or tuned before I
>> >>>> start translating the first sentence.
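The translate-then-tune loop described above can be sketched as follows; `translate` and `tune_step` are invented stand-ins (toy implementations, not Moses APIs), just to show the control flow:

```python
# Toy sketch of the incremental CAT loop; translate() and tune_step()
# are hypothetical stand-ins, not Moses APIs.
def translate(sentence, weights):
    # stand-in: a real system would decode with the current weights
    return sentence.upper()

def tune_step(sentence, post_edit, weights):
    # stand-in: a real online update (e.g. MIRA-style) would move the
    # weights toward reproducing the post-edit
    new_weights = dict(weights)
    new_weights["updates"] = new_weights.get("updates", 0) + 1
    return new_weights

def cat_session(sentences, post_edits, weights):
    outputs = []
    for sentence, post_edit in zip(sentences, post_edits):
        outputs.append(translate(sentence, weights))       # translate: sentence i
        weights = tune_step(sentence, post_edit, weights)  # tune: sentence i, post-edit i
    return outputs, weights

outs, w = cat_session(["a b", "c d"], ["A B", "C D"], {})
print(outs, w)  # -> ['A B', 'C D'] {'updates': 2}
```

The point of the sketch is only that each sentence is decoded with the weights as they stand, and the weights are updated before the next sentence is seen.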
>> >>>>
>> >>>> The old version is complicated; I am coding against the latest version now.
>> >>>>
>> >>>> --Prashant
>> >>>>
>> >>>>
>> >>>> On Thu, Nov 13, 2014 at 5:26 PM, Philipp Koehn <pkoehn@inf.ed.ac.uk> wrote:
>> >>>>
>> >>>> Hi,
>> >>>>
>> >>>> Typically you want to learn these feature weights when tuning. The
>> >>>> current setup supports and produces a sparse feature file.
>> >>>>
>> >>>> -phi
>> >>>>
>> >>>> On Nov 13, 2014 11:18 AM, "Prashant Mathur" <prashant@fbk.eu> wrote:
>> >>>>
>> >>>> What if I don't know the feature names beforehand?
>> >>>> In that case, can I set the weights directly during decoding?
>> >>>>
>> >>>> On Thu, Nov 13, 2014 at 4:59 PM, Barry Haddow
>> >>>> <bhaddow@staffmail.ed.ac.uk> wrote:
>> >>>>
>> >>>> Hi Prashant
>> >>>>
>> >>>> You add something like this to your moses.ini:
>> >>>>
>> >>>> [weight-file]
>> >>>> /path/to/sparse/weights/file
>> >>>>
>> >>>> The sparse weights file has the form:
>> >>>>
>> >>>> name1 weight1
>> >>>> name2 weight2
>> >>>> name3 weight3
>> >>>> .
>> >>>> .
>> >>>> .
>> >>>>
>> >>>> At least that's how it works in Moses v2.
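As an illustration of that file format, a minimal loader might look like this (a sketch assuming whitespace-separated name/weight pairs, one per line; this is not Moses code):

```python
# Minimal sketch (not Moses code) of loading the sparse weights file
# format shown above: one "name weight" pair per line.
def load_sparse_weights(lines):
    weights = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        name, weight = line.split()
        weights[name] = float(weight)  # a later duplicate overrides an earlier one
    return weights

w = load_sparse_weights(["name1 0.5", "name2 -1.25", "name3 0.0"])
# A feature absent from the file effectively gets weight 0:
print(w.get("unseen_feature", 0.0))  # -> 0.0
```

The lookup with a default of 0.0 mirrors the behaviour described later in the thread: a feature not listed in the weights file gets weight 0.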
>> >>>>
>> >>>> cheers - Barry
>> >>>>
>> >>>> On 13/11/14 15:42, Prashant Mathur wrote:
>> >>>>
>> >>>> Thanks a lot Barry for your answers.
>> >>>>
>> >>>> I have another question.
>> >>>> When I print these sparse features at the end of decoding, all
>> >>>> sparse features are assigned a weight of 0 because all of them
>> >>>> were initialized during decoding. How can I set these weights for
>> >>>> sparse features before they are evaluated?
>> >>>>
>> >>>>
>> >>>> Thanks Hieu for the link.
>> >>>> I am going to update the code as soon as I can, but it will take
>> >>>> some time; will get back to you when I do that.
>> >>>>
>> >>>> --Prashant
>> >>>>
>> >>>>
>> >>>> On Thu, Nov 13, 2014 at 2:34 PM, Hieu Hoang
>> >>>> <Hieu.Hoang@ed.ac.uk> wrote:
>> >>>>
>> >>>> Re-iterating what Barry said, you should use the github Moses if
>> >>>> you want to create your own feature functions, especially with
>> >>>> sparse features. The reasons:
>> >>>> 1. Adding new feature functions is a pain in v0.91. It's trivial
>> >>>> now. You can watch my talk to find out why:
>> >>>> http://lectures.ms.mff.cuni.cz/video/recordshow/index/44/184
>> >>>> 2. It's confusing exactly when the feature functions are computed.
>> >>>> It's clear now (hopefully!)
>> >>>> 3. I think you had to set special flags somewhere to use sparse
>> >>>> features. Now, all feature functions can use sparse features as
>> >>>> well as dense features.
>> >>>> 4. I don't remember the 0.91 code very well, so I can't help you
>> >>>> if you get stuck.
>> >>>>
>> >>>>
>> >>>> On 13 November 2014 11:06, Barry Haddow
>> >>>> <bhaddow@staffmail.ed.ac.uk> wrote:
>> >>>>
>> >>>> Hi Prashant
>> >>>>
>> >>>> I tried to answer your questions inline:
>> >>>>
>> >>>>
>> >>>> On 12/11/14 20:27, Prashant Mathur wrote:
>> >>>> > Hi All,
>> >>>> >
>> >>>> > I have a question about implementing a sparse feature function.
>> >>>> > I went through the details of its implementation, but some things
>> >>>> > are still not clear.
>> >>>> > FYI, I am using an old version of Moses which dates back to Release
>> >>>> > 0.91, I guess. So I am sorry if my questions don't relate to the
>> >>>> > latest implementation.
>> >>>>
>> >>>> This is a bad idea. The FF interface has
>> >>>> changed a lot since 0.91.
>> >>>>
>> >>>> >
>> >>>> > 1. I was looking at the TargetNgramFeature, where
>> >>>> > MakePrefixNgrams adds features in the Evaluate function. From the
>> >>>> > code it seems MakePrefixNgrams is adding sparse features on the
>> >>>> > fly. Is that correct?
>> >>>>
>> >>>> Yes, you can add sparse features on the fly. That's really what
>> >>>> makes them sparse features.
>> >>>>
>> >>>> >
>> >>>> > What is the weight assigned to this newly added feature? 1 or 0?
>> >>>>
>> >>>> The weight comes from the weights file that you provide at
>> >>>> start-up. If the feature is not in the weights file, then it gets a
>> >>>> weight of 0.
>> >>>>
>> >>>> >
>> >>>> > 2. What is the difference between these two functions?
>> >>>> >
>> >>>> > void PlusEquals(const ScoreProducer* sp, const std::string& name,
>> >>>> >                 float score)
>> >>>> >
>> >>>> > void SparsePlusEquals(const std::string& full_name, float score)
>> >>>>
>> >>>> In the first, a string from the ScoreProducer is prepended to the
>> >>>> name, whilst in the second the string full_name is used as the name.
>> >>>> I think we should really use the first form to keep features in
>> >>>> their own namespace, but the second form has pervaded Moses.
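The naming difference can be illustrated with a toy score collection; the class and the underscore separator here are invented for illustration, not Moses' actual implementation:

```python
# Toy illustration (invented, not Moses' internals): the PlusEquals-style
# call prepends a producer prefix to the feature name, while the
# SparsePlusEquals-style call takes the full name as given.
class ScoreCollection:
    def __init__(self):
        self.scores = {}

    def plus_equals(self, producer_prefix, name, score):
        # the producer string namespaces the feature
        full_name = producer_prefix + "_" + name
        self.scores[full_name] = self.scores.get(full_name, 0.0) + score

    def sparse_plus_equals(self, full_name, score):
        # the caller supplies the complete feature name itself
        self.scores[full_name] = self.scores.get(full_name, 0.0) + score

s = ScoreCollection()
s.plus_equals("TargetNgram", "the_cat", 1.0)
s.sparse_plus_equals("TargetNgram_the_cat", 1.0)
print(s.scores)  # -> {'TargetNgram_the_cat': 2.0}
```

Both calls update the same underlying map; the only question is who is responsible for building the full feature name.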
>> >>>>
>> >>>> >
>> >>>> > It seems like both of them are used for updating sparse feature
>> >>>> > values, correct?
>> >>>> > Or does the first one point to sparse features of a particular FF
>> >>>> > and the second one to generic sparse features?
>> >>>> >
>> >>>> > 3. What is the structure like if I use one
>> >>>> > StatelessFeatureFunction with unlimited scores? Is it different
>> >>>> > from having unlimited sparse features?
>> >>>> >
>> >>>> > I assume if there is one FF then there is one weight assigned to
>> >>>> > it, but in the case of sparse features I have one weight for each
>> >>>> > feature.
>> >>>> FFs can be dense or sparse. What that really means is that the
>> >>>> number of feature values for a dense FF is known in advance (and so
>> >>>> space is allocated in the feature value array), but for sparse FFs
>> >>>> the number of feature values is not known in advance. So even dense
>> >>>> FFs can have several weights associated with them - e.g. the phrase
>> >>>> table features. In more recent versions of Moses a given FF can
>> >>>> have both dense and sparse values.
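A toy illustration of the dense/sparse distinction (the data structures and feature names here are hypothetical, not Moses'):

```python
# Hypothetical data structures (not Moses') illustrating the point:
# a dense FF declares how many values it has up front, so storage is
# preallocated; a sparse FF accumulates values in a map that grows as
# features fire during decoding.
dense_num_scores = {"PhraseTable": 4, "Distortion": 1}  # known in advance
dense_values = {ff: [0.0] * n for ff, n in dense_num_scores.items()}

sparse_values = {}  # names only discovered during decoding

def fire_sparse(name, value):
    sparse_values[name] = sparse_values.get(name, 0.0) + value

dense_values["PhraseTable"][2] = -0.31  # slot index fixed at load time
fire_sparse("reord_swap_NP", 1.0)       # name built at decode time
print(len(dense_values["PhraseTable"]), sparse_values)  # -> 4 {'reord_swap_NP': 1.0}
```

So a dense FF with four scores (like a phrase table) still has four weights; "dense" only means the count is fixed before decoding starts.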
>> >>>>
>> >>>> >
>> >>>> > 4. In general, when should I compute the sparse features?
>> >>>>
>> >>>> In general, computing them as soon as you can will probably make
>> >>>> your code more efficient. When you are able to compute your sparse
>> >>>> feature depends on the feature itself. For example, if the feature
>> >>>> depends only on the phrase pair, then it could be computed and
>> >>>> stored in the phrase table. This makes the phrase table bigger
>> >>>> (which could slow you down) but saves on computation during
>> >>>> decoding. On the other hand, a sparse reordering feature has to be
>> >>>> mainly computed during decoding, since we do not know the ordering
>> >>>> of segments until decoding. When I implemented sparse reordering
>> >>>> features, though, I precomputed the feature names, since you don't
>> >>>> want to do string concatenation during decoding.
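The name-precomputation idea might be sketched like this; the orientation and word-class names, and the functions, are invented for illustration:

```python
# Sketch of precomputing sparse feature names: the strings are built
# once at start-up, so the decoding loop only does dict lookups instead
# of string concatenation. Names here are hypothetical.
orientations = ["mono", "swap", "disc"]
word_classes = ["C0", "C1", "C2"]

precomputed_names = {                      # done once, before decoding
    (o, c): "reord_" + o + "_" + c
    for o in orientations
    for c in word_classes
}

def score_hypothesis(events, sparse_scores):
    # inner decoding loop: no string building here
    for orientation, word_class in events:
        name = precomputed_names[(orientation, word_class)]
        sparse_scores[name] = sparse_scores.get(name, 0.0) + 1.0

scores = {}
score_hypothesis([("swap", "C1"), ("mono", "C0")], scores)
print(scores)  # -> {'reord_swap_C1': 1.0, 'reord_mono_C0': 1.0}
```

This trades a small, fixed amount of start-up memory for avoiding per-hypothesis string allocation in the hot loop.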
>> >>>>
>> >>>>
>> >>>> cheers - Barry
>> >>>>
>> >>>> >
>> >>>> > Thanks for the patience,
>> >>>> > --Prashant
>> >>>> >
>> >>>> > PS: I am still trying to figure out stuff, so my questions might
>> >>>> > seem stupid.
>> >>>> >
>> >>>> >
>> >>>> >
>> >>>> _______________________________________________
>> >>>> Moses-support mailing list
>> >>>> Moses-support@mit.edu
>> >>>> http://mailman.mit.edu/mailman/listinfo/moses-support
>> >>>>
>> >>>>
>> >>>> --
>> >>>> The University of Edinburgh is a charitable body, registered in
>> >>>> Scotland, with registration number SC005336.
>> >>>>
>> >>>> -- Hieu Hoang
>> >>>> Research Associate
>> >>>> University of Edinburgh
>> >>>> http://www.hoang.co.uk/hieu
>> >>>>
>> >>
>> >
>> >
>>
>>
>
>
>
------------------------------
End of Moses-support Digest, Vol 97, Issue 29
*********************************************