Send Moses-support mailing list submissions to
moses-support@mit.edu
To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
moses-support-request@mit.edu
You can reach the person managing the list at
moses-support-owner@mit.edu
When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."
Today's Topics:
1. Re: using sparse features (Barry Haddow)
2. Re: using sparse features (Marcin Junczys-Dowmunt)
----------------------------------------------------------------------
Message: 1
Date: Fri, 14 Nov 2014 10:27:40 +0000
From: Barry Haddow <bhaddow@staffmail.ed.ac.uk>
Subject: Re: [Moses-support] using sparse features
To: evahasler@gmail.com, Marcin Junczys-Dowmunt <junczys@amu.edu.pl>
Cc: moses-support <moses-support@mit.edu>
Message-ID: <5465D91C.9020306@staffmail.ed.ac.uk>
Content-Type: text/plain; charset=UTF-8; format=flowed
Hi Marcin
I think if you look at the situations where sparse features are
successful, you often find they are tuned with multiple references. This
paper lends support to the idea that multiple references are important:
http://www.statmt.org/wmt14/pdf/W14-3360.pdf.
cheers - Barry
On 14/11/14 10:24, Eva Hasler wrote:
> In comparison to MERT? Not really, we compared English-French and
> German-English at IWSLT 2012 and the baseline scores were a bit higher
> for En-Fr, a bit lower for De-En.
> But of course the point is that you can use more features, so you have
> to define useful feature sets that are sparse but still able to
> generalise
>
> On Fri, Nov 14, 2014 at 10:16 AM, Marcin Junczys-Dowmunt
> <junczys@amu.edu.pl> wrote:
>
> Speed aside, quality did not improve significantly?
>
> On 14.11.2014 at 11:11, Eva Hasler wrote:
>> Let's say there was a bit of disillusionment about the advantages
>> of online vs batch MIRA. Online MIRA was slow in comparison, but
>> that's also because the implementation was still in a kind of
>> development state and not optimised.
>>
>>
>> On Fri, Nov 14, 2014 at 9:59 AM, Marcin Junczys-Dowmunt
>> <junczys@amu.edu.pl> wrote:
>>
>> Thanks. For some reason I usually have quite weak results
>> with kbmira.
>> What happened to that interesting online MIRA idea? Died due
>> to lack of maintenance?
>>
>> On 14.11.2014 at 10:54, Barry Haddow wrote:
>> > Hi Marcin
>> >
>> > Our default option would be kbmira (kbest batch mira). It seems to be
>> > the most stable,
>> >
>> > cheers - Barry
>> >
>> > On 14/11/14 09:43, Marcin Junczys-Dowmunt wrote:
>> >> Apropos MIRA, what's the current best-practice tuner for sparse
>> >> features? What are you guys using now for, say, WMT-grade systems?
>> >>
>> >> On 14.11.2014 at 10:39, Barry Haddow wrote:
>> >>> Hi Prashant
>> >>>
>> >>> We had to do this kind of dynamic weight update for online MIRA. The
>> >>> code is still there, although it might have rotted; start by looking at
>> >>> the weight update methods in StaticData,
>> >>>
>> >>> cheers - Barry
>> >>>
>> >>> On 13/11/14 17:05, Prashant Mathur wrote:
>> >>>> But in the CAT scenario we do it like this:
>> >>>>
>> >>>> translate: sentence 1
>> >>>> tune: sentence 1 , post-edit 1
>> >>>> translate: sentence 2
>> >>>> tune: sentence 2 , post-edit 2
>> >>>> ...
>> >>>>
>> >>>> In this case, I don't have any features generated or tuned before I
>> >>>> start translating the first sentence.
>> >>>>
>> >>>> The old version is complicated, so I am coding on the latest
>> >>>> version now.
>> >>>>
>> >>>> --Prashant
>> >>>>
>> >>>>
>> >>>> On Thu, Nov 13, 2014 at 5:26 PM, Philipp Koehn
>> >>>> <pkoehn@inf.ed.ac.uk> wrote:
>> >>>>
>> >>>> Hi,
>> >>>>
>> >>>> Typically you want to learn these feature weights when
>> >>>> tuning. The current setup supports and produces a sparse
>> >>>> feature file.
>> >>>>
>> >>>> -phi
>> >>>>
>> >>>> On Nov 13, 2014 11:18 AM, "Prashant Mathur"
>> >>>> <prashant@fbk.eu> wrote:
>> >>>>
>> >>>> What if I don't know the feature names beforehand?
>> >>>> In that case, can I set the weights directly during
>> >>>> decoding?
>> >>>>
>> >>>> On Thu, Nov 13, 2014 at 4:59 PM, Barry Haddow
>> >>>> <bhaddow@staffmail.ed.ac.uk> wrote:
>> >>>>
>> >>>> Hi Prashant
>> >>>>
>> >>>> You add something like this to your moses.ini:
>> >>>>
>> >>>> [weight-file]
>> >>>> /path/to/sparse/weights/file
>> >>>>
>> >>>> The sparse weights file has the form:
>> >>>>
>> >>>> name1 weight1
>> >>>> name2 weight2
>> >>>> name3 weight3
>> >>>> .
>> >>>> .
>> >>>> .
>> >>>>
>> >>>> At least that's how it works in Moses v2.
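>> >>>>
>> >>>> For example, a sparse weights file might contain lines such as
>> >>>> (the feature names here are made up, just to show the format):
>> >>>>
>> >>>> tgtngram_the_cat 0.05
>> >>>> tgtngram_cat_sat -0.12
>> >>>> reord_jump_-2 0.30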
>> >>>>
>> >>>> cheers - Barry
>> >>>>
>> >>>> On 13/11/14 15:42, Prashant Mathur wrote:
>> >>>>
>> >>>> Thanks a lot Barry for your answers.
>> >>>>
>> >>>> I have another question.
>> >>>> When I print these sparse features at the end of
>> >>>> decoding, all sparse features are assigned a weight of
>> >>>> 0 because all of them were initialized during decoding.
>> >>>> How can I set these weights for sparse features before
>> >>>> they are evaluated?
>> >>>>
>> >>>>
>> >>>> Thanks Hieu for the link..
>> >>>> I am going to update the code as soon as I can.. but
>> >>>> it will take some time.. will get back to you when I
>> >>>> do that.
>> >>>>
>> >>>> --Prashant
>> >>>>
>> >>>>
>> >>>> On Thu, Nov 13, 2014 at 2:34 PM, Hieu Hoang
>> >>>> <Hieu.Hoang@ed.ac.uk> wrote:
>> >>>>
>> >>>> Re-iterating what Barry said, you should use the github Moses if
>> >>>> you want to create your own feature functions, especially with
>> >>>> sparse features. The reasons:
>> >>>> 1. Adding new feature functions is a pain in v0.91. It's
>> >>>> trivial now. You can watch my talk to find out why:
>> >>>> http://lectures.ms.mff.cuni.cz/video/recordshow/index/44/184
>> >>>> 2. It's confusing exactly when the feature functions are
>> >>>> computed. It's clear now (hopefully!)
>> >>>> 3. I think you had to set special flags somewhere to use sparse
>> >>>> features. Now, all feature functions can use sparse features as
>> >>>> well as dense features.
>> >>>> 4. I don't remember the 0.91 code very well, so I can't help you
>> >>>> if you get stuck.
>> >>>>
>> >>>>
>> >>>> On 13 November 2014 11:06, Barry Haddow
>> >>>> <bhaddow@staffmail.ed.ac.uk> wrote:
>> >>>>
>> >>>> Hi Prashant
>> >>>>
>> >>>> I tried to answer your questions inline:
>> >>>>
>> >>>>
>> >>>> On 12/11/14 20:27, Prashant Mathur wrote:
>> >>>> > Hi All,
>> >>>> >
>> >>>> > I have a question about implementing a sparse feature function.
>> >>>> > I went through the details of its implementation, but some things are
>> >>>> > still not clear.
>> >>>> > FYI, I am using an old version of Moses which dates back to Release
>> >>>> > 0.91, I guess. So, I am sorry if my questions don't relate to the
>> >>>> > latest implementation.
>> >>>>
>> >>>> This is a bad idea. The FF interface has
>> >>>> changed a lot since 0.91.
>> >>>>
>> >>>> >
>> >>>> > 1. I was looking at the TargetNgramFeature, where MakePrefixNgrams
>> >>>> > adds features in the Evaluate function. From the code it seems
>> >>>> > MakePrefixNgrams is adding sparse features on the fly. Is that correct?
>> >>>>
>> >>>> Yes, you can add sparse features on the fly. That's really
>> >>>> what makes them sparse features.
>> >>>>
>> >>>> >
>> >>>> > What is the weight assigned to this newly added feature? 1 or 0?
>> >>>>
>> >>>> The weight comes from the weights file that you provide at
>> >>>> start-up. If the feature is not in the weights file, then
>> >>>> it gets a weight of 0.
>> >>>>
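>> >>>> To make that concrete, here is a rough, self-contained sketch of the
>> >>>> idea (this is not the actual Moses code; the weight-file name and the
>> >>>> feature names below are invented): features are created by name on
>> >>>> the fly, and each name is scored with the weight read from the
>> >>>> weights file, defaulting to 0 for unknown names.
>> >>>>
>> >>>> #include <fstream>
>> >>>> #include <iostream>
>> >>>> #include <map>
>> >>>> #include <string>
>> >>>>
>> >>>> int main() {
>> >>>>   // Load "name weight" pairs, same format as the [weight-file]
>> >>>>   // example above.
>> >>>>   std::map<std::string, float> weights;
>> >>>>   std::ifstream in("sparse.weights");
>> >>>>   std::string name;
>> >>>>   float value;
>> >>>>   while (in >> name >> value) weights[name] = value;
>> >>>>
>> >>>>   // A feature function would emit names like these on the fly,
>> >>>>   // e.g. per phrase pair.
>> >>>>   std::string fired[] = {"tgtngram_the_cat", "tgtngram_cat_sat"};
>> >>>>
>> >>>>   float score = 0.0f;
>> >>>>   for (int i = 0; i < 2; ++i) {
>> >>>>     std::map<std::string, float>::const_iterator it = weights.find(fired[i]);
>> >>>>     float w = (it == weights.end()) ? 0.0f : it->second;  // unknown -> 0
>> >>>>     score += w * 1.0f;  // indicator feature with value 1
>> >>>>   }
>> >>>>   std::cout << "sparse score contribution: " << score << std::endl;
>> >>>>   return 0;
>> >>>> }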
>> >>>> >
>> >>>> > 2. What is the difference between these two functions?
>> >>>> >
>> >>>> > void PlusEquals(const ScoreProducer* sp, const std::string& name,
>> >>>> > float score)
>> >>>> >
>> >>>> > void SparsePlusEquals(const std::string& full_name, float score)
>> >>>>
>> >>>> In the first, a string from the ScoreProducer is prepended to
>> >>>> the name, whilst in the second the string full_name is used as
>> >>>> the name. I think we should really use the first form to keep
>> >>>> features in their own namespace, but the second form has
>> >>>> pervaded Moses.
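>> >>>>
>> >>>> A toy illustration of that naming difference (not the real Moses
>> >>>> classes, just the behaviour described above): the producer-prefixed
>> >>>> form builds the key from a producer-specific prefix plus the local
>> >>>> name, while the full-name form uses the string exactly as given.
>> >>>>
>> >>>> #include <iostream>
>> >>>> #include <map>
>> >>>> #include <string>
>> >>>>
>> >>>> // Toy score collection mimicking the two update styles.
>> >>>> struct ToyScores {
>> >>>>   std::map<std::string, float> values;
>> >>>>
>> >>>>   // Producer-prefixed: keeps features in the producer's namespace.
>> >>>>   void PlusEquals(const std::string &producerPrefix,
>> >>>>                   const std::string &name, float score) {
>> >>>>     values[producerPrefix + "_" + name] += score;
>> >>>>   }
>> >>>>
>> >>>>   // Full-name form: caller supplies a globally unique name.
>> >>>>   void SparsePlusEquals(const std::string &fullName, float score) {
>> >>>>     values[fullName] += score;
>> >>>>   }
>> >>>> };
>> >>>>
>> >>>> int main() {
>> >>>>   ToyScores scores;
>> >>>>   scores.PlusEquals("TargetNgram", "the_cat", 1.0f);    // key "TargetNgram_the_cat"
>> >>>>   scores.SparsePlusEquals("TargetNgram_the_cat", 1.0f); // same key, named directly
>> >>>>   std::cout << scores.values["TargetNgram_the_cat"] << std::endl;  // prints 2
>> >>>>   return 0;
>> >>>> }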
>> >>>>
>> >>>> >
>> >>>> > It seems like both of them are used for updating sparse feature
>> >>>> > values.. correct?
>> >>>> > Or, does the first one point to sparse features of a particular FF and
>> >>>> > the second one to generic sparse features?
>> >>>> >
>> >>>> > 3. What is the structure like if I use one StatelessFeatureFunction
>> >>>> > with unlimited scores? Is it different from having unlimited sparse
>> >>>> > features?
>> >>>> >
>> >>>> > I assume that if there is one FF then there is one weight assigned to
>> >>>> > it, but in the case of sparse features I have one weight for each feature.
>> >>>> FFs can be dense or sparse. What that really means is that the
>> >>>> number of feature values for a dense FF is known in advance (and
>> >>>> so space is allocated in the feature value array), but for sparse
>> >>>> FFs the number of feature values is not known in advance. So even
>> >>>> dense FFs can have several weights associated with them - e.g.
>> >>>> the phrase table features.
>> >>>> In more recent versions of Moses a given FF can have both dense
>> >>>> and sparse values.
>> >>>>
>> >>>> >
>> >>>> > 4. In general, when should I compute the sparse features?
>> >>>>
>> >>>> In general, computing them as soon as you can will probably
>> >>>> make your code more efficient. When you are able to compute your
>> >>>> sparse feature depends on the feature itself. For example, if
>> >>>> the feature depends only on the phrase pair then it could be
>> >>>> computed and stored in the phrase table. This makes the phrase
>> >>>> table bigger (which could slow you down) but saves on computation
>> >>>> during decoding. On the other hand, a sparse reordering feature
>> >>>> has to be computed mainly during decoding, since we do not know
>> >>>> the ordering of segments until decoding. When I implemented
>> >>>> sparse reordering features though, I precomputed the feature
>> >>>> names, since you don't want to do string concatenation during
>> >>>> decoding.
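>> >>>>
>> >>>> A small sketch of that last point (illustrative only; the class and
>> >>>> feature names are invented): build the feature-name strings once in
>> >>>> the constructor, then only index into them during decoding instead
>> >>>> of concatenating strings per hypothesis.
>> >>>>
>> >>>> #include <iostream>
>> >>>> #include <sstream>
>> >>>> #include <string>
>> >>>> #include <vector>
>> >>>>
>> >>>> // Toy sparse reordering feature: names are precomputed up front.
>> >>>> class ToySparseReordering {
>> >>>>  public:
>> >>>>   explicit ToySparseReordering(int maxJump) : maxJump_(maxJump) {
>> >>>>     // Done once at start-up, so decoding never builds strings.
>> >>>>     for (int jump = -maxJump; jump <= maxJump; ++jump) {
>> >>>>       std::ostringstream name;
>> >>>>       name << "reord_jump_" << jump;
>> >>>>       names_.push_back(name.str());
>> >>>>     }
>> >>>>   }
>> >>>>
>> >>>>   // Called during decoding: an index lookup, no concatenation.
>> >>>>   const std::string &FeatureName(int jump) const {
>> >>>>     return names_[jump + maxJump_];
>> >>>>   }
>> >>>>
>> >>>>  private:
>> >>>>   std::vector<std::string> names_;
>> >>>>   int maxJump_;
>> >>>> };
>> >>>>
>> >>>> int main() {
>> >>>>   ToySparseReordering reord(5);
>> >>>>   std::cout << reord.FeatureName(-2) << std::endl;  // "reord_jump_-2"
>> >>>>   return 0;
>> >>>> }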
>> >>>>
>> >>>>
>> >>>> cheers - Barry
>> >>>>
>> >>>> >
>> >>>> > Thanks for the patience,
>> >>>> > --Prashant
>> >>>> >
>> >>>> > PS: I am still trying to figure out stuff, so questions might
>> >>>> > seem stupid.
>> >>>> >
>> >>>> >
>> >>>> >
>> >>>>
>> >>>>
>> >>>>
>> >>>>
>> >>>>
>> >>>>
>> >>>>
>> >>>> -- Hieu Hoang
>> >>>> Research Associate
>> >>>> University of Edinburgh
>> >>>> http://www.hoang.co.uk/hieu
>> >>>>
>> >>>>
>> >>>>
>> >>>>
>> >>>>
>> >>>>
>> >>>>
>> >>>>
>> >>>>
>> >>
>> >
>> >
>>
>>
>>
>
>
--
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.
------------------------------
Message: 2
Date: Fri, 14 Nov 2014 13:41:03 +0100
From: Marcin Junczys-Dowmunt <junczys@amu.edu.pl>
Subject: Re: [Moses-support] using sparse features
To: Barry Haddow <bhaddow@staffmail.ed.ac.uk>
Cc: moses-support <moses-support@mit.edu>
Message-ID: <f54cfa69d2482af46d6d397cb111af10@amu.edu.pl>
Content-Type: text/plain; charset="utf-8"
Hi,
Eva: And in a sparse-feature scenario compared to PRO or kbmira?
Barry: Thanks for the pointer. I understand the main problem is
evidence sparsity for sparse features. I am currently trying to counter
that by using huge devsets (up to 50,000 sentences, divided into pieces
of 5,000, then averaging weights - basically cross-validation), which
seems to help, but I am always suspicious that the optimization method
is not doing as well as it could. So I was hoping you might have
something new :) I remember Colin Cherry talking about lattice MIRA; we
don't have that in Moses, do we?
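
In case it is useful, here is a minimal standalone sketch of that averaging
step (assuming each piece produces a weights file in the usual "name weight"
format; the file names below are placeholders):

#include <fstream>
#include <iostream>
#include <map>
#include <string>

// Average sparse weight files ("name weight" per line) from several tuning
// pieces; features missing from a piece count as weight 0.
int main(int argc, char **argv) {
  std::map<std::string, double> sum;
  int numFiles = argc - 1;
  if (numFiles == 0) return 1;
  for (int i = 1; i < argc; ++i) {
    std::ifstream in(argv[i]);
    std::string name;
    double weight;
    while (in >> name >> weight) sum[name] += weight;
  }
  for (std::map<std::string, double>::const_iterator it = sum.begin();
       it != sum.end(); ++it) {
    std::cout << it->first << " " << it->second / numFiles << "\n";
  }
  return 0;
}

Run as, e.g., ./avg_weights piece1.weights piece2.weights > averaged.weights.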
On 2014-11-14 11:27, Barry Haddow wrote:
> Hi Marcin
>
> I think if you look at the situations where sparse features are
> successful, you often find they are tuned with multiple references. This
> paper lends support to the idea that multiple references are important:
> http://www.statmt.org/wmt14/pdf/W14-3360.pdf [1].
>
> cheers - Barry
>
> On 14/11/14 10:24, Eva Hasler wrote:
>
>> In comparison to MERT? Not really, we compared English-French and
>> German-English at IWSLT 2012 and the baseline scores were a bit higher
>> for En-Fr, a bit lower for De-En. But of course the point is that you
>> can use more features, so you have to define useful feature sets that
>> are sparse but still able to generalise.
>>
>> On Fri, Nov 14, 2014 at 10:16 AM, Marcin Junczys-Dowmunt
>> <junczys@amu.edu.pl> wrote:
>>
>> Speed aside, quality did not improve significantly?
>>
>> On 14.11.2014 at 11:11, Eva Hasler wrote:
Links:
------
[1] http://www.statmt.org/wmt14/pdf/W14-3360.pdf
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mailman.mit.edu/mailman/private/moses-support/attachments/20141114/0b31fde5/attachment.htm
------------------------------
_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support
End of Moses-support Digest, Vol 97, Issue 30
*********************************************