Moses-support Digest, Vol 97, Issue 20

Send Moses-support mailing list submissions to
moses-support@mit.edu

To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
moses-support-request@mit.edu

You can reach the person managing the list at
moses-support-owner@mit.edu

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."

Today's Topics:

1. Re: Meaning to language arguments for train-model.perl?
(Kenneth Heafield)
2. Re: using sparse features (Prashant Mathur)
3. Re: using sparse features (Barry Haddow)

----------------------------------------------------------------------

Message: 1
Date: Thu, 13 Nov 2014 10:28:33 -0500
From: Kenneth Heafield <moses@kheafield.com>
Subject: Re: [Moses-support] Meaning to language arguments for
train-model.perl?
To: moses-support@mit.edu
Message-ID: <5464CE21.7050909@kheafield.com>
Content-Type: text/plain; charset=ISO-8859-1

I'm not proposing to change the script or the arguments. I just want to
make sure that nobody wrote an if statement three levels deep in Perl
that deletes sentences longer than 100 words unless the "-f" language
string is "de".

On 11/13/14 10:17, Tom Hoar wrote:
> Are you kidding? I thought "decoding" the meaning of "f" and "e" was a
> rite of passage for "real" computational linguists! Besides, you go and
> change that and I have to type four more characters every time I run the
> script. Are there any better reasons for leaving things the way they are?
>
> --- Ok. I'll pull my bleeding tongue from my clenched teeth now :) but
> you will create more work for many people with automation wrappers
> around those scripts.
>
> Tom
>
>
> On 11/13/2014 04:04 PM, Kenneth Heafield wrote:
>> Dear Moses,
>>
>> Do the -e and -f arguments to train-model.perl and clean-corpus-n.perl
>> actually get interpreted by anything? Or are they just there as file
>> name extensions that could just as easily be "src" and "tgt"? I think
>> it doesn't matter.
>>
>> Kenneth

------------------------------

Message: 2
Date: Thu, 13 Nov 2014 16:42:10 +0100
From: Prashant Mathur <prashant@fbk.eu>
Subject: Re: [Moses-support] using sparse features
To: Hieu Hoang <Hieu.Hoang@ed.ac.uk>
Cc: "moses-support@mit.edu" <moses-support@mit.edu>, Barry Haddow
<bhaddow@staffmail.ed.ac.uk>
Message-ID:
<CAK3pNhJbSJe-3LTXZhn7_iuVowRpDELmuqVyjg3XE6phua9heQ@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

Thanks a lot Barry for your answers.

I have another question.
When I print these sparse features at the end of decoding, all sparse
features are assigned a weight of 0 because all of them were initialized
during decoding.
How can I set these weights for sparse features before they are evaluated?

Thanks, Hieu, for the link.
I am going to update the code as soon as I can, but it will take some
time. I will get back to you when I do.

--Prashant

On Thu, Nov 13, 2014 at 2:34 PM, Hieu Hoang <Hieu.Hoang@ed.ac.uk> wrote:

> re-iterating what Barry said, you should use the github moses if you want
> to create your own feature functions, especially with sparse features. The
> reasons:
> 1. Adding new feature functions is a pain in v 0.91. It's trivial now.
> You can watch my talk to find out why
> http://lectures.ms.mff.cuni.cz/video/recordshow/index/44/184
> 2. It's confusing exactly when the feature functions are computed. It's
> clear now (hopefully!)
> 3. I think you had to set special flags somewhere to use sparse
> features. Now, all feature functions can use sparse features as well as
> dense features
> 4. I don't remember the 0.91 code very well. So I can't help you if you
> get stuck
>
>
> On 13 November 2014 11:06, Barry Haddow <bhaddow@staffmail.ed.ac.uk>
> wrote:
>
>> Hi Prashant
>>
>> I tried to answer your questions inline:
>>
>>
>> On 12/11/14 20:27, Prashant Mathur wrote:
>> > Hi All,
>> >
>> > I have a question about implementing a sparse feature function.
>> > I went through the details of its implementation, but some things are
>> > still not clear.
>> > FYI, I am using an old version of Moses, which dates back to Release
>> > 0.91, I guess. So I am sorry if my questions don't relate to the
>> > latest implementation.
>>
>> This is a bad idea. The FF interface has changed a lot since 0.91.
>>
>> >
>> > 1. I was looking at the TargetNgramFeature, where MakePrefixNgrams adds
>> > features in the Evaluate function. From the code it seems MakePrefixNgrams
>> > is adding sparse features on the fly. Is that correct?
>>
>> Yes, you can add sparse features on the fly. That's really what makes
>> them sparse features.
>>
>> >
>> > What is the weight assigned to this newly added feature? 1 or 0?
>>
>> The weight comes from the weights file that you provide at start-up. If
>> the feature is not in the weights file, then it gets a weight of 0.
>>
>> >
>> > 2. What is the difference between these two functions?
>> >
>> > void PlusEquals(const ScoreProducer* sp, const std::string& name,
>> >                 float score)
>> >
>> > void SparsePlusEquals(const std::string& full_name, float score)
>>
>> In the first, a string from the ScoreProducer is prepended to the name,
>> whilst in the second the string full_name is used as the name. I think
>> we should really use the first form to keep features in their own
>> namespace, but the second form has pervaded Moses.
>>
>> >
>> > It seems like both of them are used for updating sparse feature
>> > values. Is that correct?
>> > Or does the first one point to the sparse features of a particular FF
>> > and the second one to generic sparse features?
>> >
>> > 3. What does the structure look like if I use one StatelessFeatureFunction
>> > with unlimited scores? Is it different from having unlimited sparse
>> > features?
>> >
>> > I assume that if there is one FF then there is one weight assigned to it,
>> > but in the case of sparse features I have one weight for each feature.
>>
>> FFs can be dense or sparse. What that really means is that the number of
>> feature values for a dense FF is known in advance (and so space is
>> allocated in the feature value array), but for sparse FFs the number of
>> feature values is not known in advance. So even dense FFs can have
>> several weights associated with them - e.g. the phrase table features.
>> In more recent versions of Moses a given FF can have both dense and
>> sparse values.
>>
>> >
>> > 4. In general when should I compute the sparse features?
>>
>> In general, computing them as soon as you can will probably make your
>> code more efficient. When you are able to compute your sparse feature
>> depends on the feature itself. For example, if the feature depends
>> only on the phrase pair then it could be computed and stored in the
>> phrase table. This makes the phrase table bigger (which could slow you
>> down) but saves on computation at decoding. On the other hand, a sparse
>> reordering feature has to be computed mainly during decoding, since we
>> do not know the ordering of segments until decoding. When I implemented
>> sparse reordering features, though, I precomputed the feature names, since
>> you don't want to do string concatenation during decoding.
>>
>>
>> cheers - Barry
>>
>> >
>> > Thanks for the patience,
>> > --Prashant
>> >
>> > PS: I am still trying to figure out stuff, so questions might seem
>> > stupid.
>> >
>> >
>>
>>
>> --
>> The University of Edinburgh is a charitable body, registered in
>> Scotland, with registration number SC005336.
>>
>
>
>
> --
> Hieu Hoang
> Research Associate
> University of Edinburgh
> http://www.hoang.co.uk/hieu
>
>
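
A minimal sketch of the distinction Barry describes above between the two
update functions: one prepends a string taken from the producer, the other
uses the caller's string as the full name. The types Producer and
SparseScores, the Get accessor, and the "_" separator are all assumptions
made for illustration; they are not the actual Moses classes, and the real
separator and storage may well differ.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical stand-in for a score producer; NOT the Moses class.
struct Producer {
  std::string description;  // assumed short name of the feature function
};

// Hypothetical sparse score container; NOT the Moses implementation.
class SparseScores {
 public:
  // Namespaced form: the producer's string is prepended to the name.
  // The "_" separator is an assumption made for this sketch.
  void PlusEquals(const Producer* sp, const std::string& name, float score) {
    scores_[sp->description + "_" + name] += score;
  }

  // Full-name form: the caller's string is used as the key unchanged.
  void SparsePlusEquals(const std::string& full_name, float score) {
    scores_[full_name] += score;
  }

  // Returns the accumulated score, or 0 if the feature never fired.
  float Get(const std::string& full_name) const {
    auto it = scores_.find(full_name);
    return it == scores_.end() ? 0.0f : it->second;
  }

 private:
  std::unordered_map<std::string, float> scores_;
};
```

Under these assumptions, PlusEquals(&p, "the:cat", 1.0f) with a producer
named "tnf" and SparsePlusEquals("tnf_the:cat", 0.5f) update the same
entry, which is why the namespaced form keeps features from different
FFs apart while the full-name form leaves that to the caller.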

------------------------------

Message: 3
Date: Thu, 13 Nov 2014 15:59:36 +0000
From: Barry Haddow <bhaddow@staffmail.ed.ac.uk>
Subject: Re: [Moses-support] using sparse features
To: Prashant Mathur <prashant@fbk.eu>, Hieu Hoang
<Hieu.Hoang@ed.ac.uk>
Cc: "moses-support@mit.edu" <moses-support@mit.edu>
Message-ID: <5464D568.8080208@staffmail.ed.ac.uk>
Content-Type: text/plain; charset=UTF-8; format=flowed

Hi Prashant

You add something like this to your moses.ini:

[weight-file]
/path/to/sparse/weights/file

The sparse weights file has the form:

name1 weight1
name2 weight2
name3 weight3
.
.
.

At least that's how it works in Moses v2.

cheers - Barry
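
A rough sketch of how a file in the "name weight" form above could be read,
with a lookup that falls back to 0 for features that are not listed (the
behaviour described earlier in the thread). LoadSparseWeights and WeightOf
are hypothetical helper names, not Moses functions, and the sketch assumes
whitespace-separated fields with one pair per line.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <unordered_map>

// Hypothetical helper, not Moses code: reads "name weight" pairs,
// one pair per line, from any input stream.
std::unordered_map<std::string, float> LoadSparseWeights(std::istream& in) {
  std::unordered_map<std::string, float> weights;
  std::string line;
  while (std::getline(in, line)) {
    std::istringstream fields(line);
    std::string name;
    float weight;
    if (fields >> name >> weight) {  // skip blank or malformed lines
      weights[name] = weight;
    }
  }
  return weights;
}

// A feature that is missing from the file gets a weight of 0.
float WeightOf(const std::unordered_map<std::string, float>& weights,
               const std::string& name) {
  auto it = weights.find(name);
  return it == weights.end() ? 0.0f : it->second;
}
```

With a file containing "name1 0.5", WeightOf(weights, "name1") gives 0.5,
while a feature name first created during decoding and absent from the
file gives 0.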

On 13/11/14 15:42, Prashant Mathur wrote:
> Thanks a lot Barry for your answers.
>
> I have another question.
> When I print these sparse features at the end of decoding, all sparse
> features are assigned a weight of 0 because all of them were
> initialized during decoding.
> How can I set these weights for sparse features before they are
> evaluated?

--
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.

------------------------------

_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support

End of Moses-support Digest, Vol 97, Issue 20
*********************************************
