Moses-support Digest, Vol 83, Issue 29

Send Moses-support mailing list submissions to
moses-support@mit.edu

To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
moses-support-request@mit.edu

You can reach the person managing the list at
moses-support-owner@mit.edu

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."


Today's Topics:

1. Re: Fwd: Reg :Syntax based model in EMS (Philipp Koehn)
2. Re: Diversify n-best list (Kevin Gimpel)
3. Re: Diversify n-best list (Marcin Junczys-Dowmunt)
4. Re: Reg:SYNTAX based training in EMS (Philipp Koehn)
5. Re: Reg:SYNTAX based training in EMS (karan sharma)


----------------------------------------------------------------------

Message: 1
Date: Tue, 17 Sep 2013 15:55:23 +0100
From: Philipp Koehn <pkoehn@inf.ed.ac.uk>
Subject: Re: [Moses-support] Fwd: Reg :Syntax based model in EMS
To: arushi sharma <arushi.saphira@gmail.com>
Cc: "moses-support@mit.edu" <moses-support@mit.edu>
Message-ID:
<CAAFADDBD6nXs5UPd4evAHqknaw4088DyFyrKYrfxnKe29zC_zw@mail.gmail.com>
Content-Type: text/plain; charset="iso-8859-1"

Hi,

the format is the same as the one for the input tree, as described in the
Syntax Tutorial here:
http://www.statmt.org/moses/?n=Moses.SyntaxTutorial#ntoc23

-phi


On Mon, Sep 16, 2013 at 11:22 AM, arushi sharma <arushi.saphira@gmail.com> wrote:

> Hey,
>
> I am working on a tree-to-tree model. The syntax-based example config
> file in the EMS folder asks for a Collins parser.
>
> Can you suggest what changes I should make in the config file if I supply
> my own XML files and skip the parser step?
>
> Also, please mention the file formats for the LM, tuning, and testing
> data. What should the data formats for those files be?
>
> Regards

------------------------------

Message: 2
Date: Tue, 17 Sep 2013 10:10:29 -0500
From: Kevin Gimpel <kgimpel@cs.cmu.edu>
Subject: Re: [Moses-support] Diversify n-best list
To: Barry Haddow <bhaddow@staffmail.ed.ac.uk>
Cc: moses-support <moses-support@mit.edu>
Message-ID:
<CACAqqMvExXzdEoiDouWEwZg=axdRDqrhGVMDHW9N0TRk0JUGTQ@mail.gmail.com>
Content-Type: text/plain; charset="iso-8859-1"

Hi again Marcin,
You might be interested in our paper appearing at EMNLP:

A Systematic Exploration of Diversity in Machine Translation (
http://ttic.uchicago.edu/~kgimpel/papers/gimpel+etal.emnlp13.pdf)

We used Moses for all of our experiments, and also do some user experiments
somewhat along the lines of what you described. Please get in touch if
you'd like to discuss things further.
Kevin


On Tue, Jul 30, 2013 at 8:55 AM, Barry Haddow <bhaddow@staffmail.ed.ac.uk> wrote:

> Hi Marcin
>
> There's a Masters student in Edinburgh looking into something related to
> this. He has a project page at
> http://rustamli.github.io/rephrase/index.html
>
> cheers - Barry
>
> On 27/07/13 10:53, Marcin Junczys-Dowmunt wrote:
> > Hi list,
> > is the --distinct parameter currently the only option to generate more
> > diverse n-best lists?
> >
> > I have the following scenario:
> > Human translators use Moses like a TM via a Trados Plugin, upon request
> > they may see a list of m alternatives, which is just a list of the first
> > m sentences from a bigger n-best list. Usually those alternatives are not
> > very useful (confirmed by translators), as they are still very similar
> > to each other and the best sentence. Current idea: generate bigger
> > n-best list, cluster using nifty similarity function, display only
> > cluster representatives. Somehow I believe something like that should
> > have been done before, have you heard of anything like that?
> > Best,
> > Marcin
>
>
> --
> The University of Edinburgh is a charitable body, registered in
> Scotland, with registration number SC005336.
>
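
The clustering idea in Marcin's quoted message (generate a larger n-best
list, group similar hypotheses, show one representative per group) can be
sketched roughly as below. This is a minimal Python sketch, not something
from the thread or from Moses: the function names, the token-level
similarity taken from difflib, and the 0.7 threshold are all assumptions
chosen for illustration, and any other similarity function could be
swapped in.

```python
from __future__ import annotations

from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Token-level similarity in [0, 1].

    Assumption: a simple stand-in for the "nifty similarity function"
    mentioned in the thread.
    """
    return SequenceMatcher(None, a.split(), b.split()).ratio()


def diverse_subset(nbest: list[str], m: int, threshold: float = 0.7) -> list[str]:
    """Greedy leader clustering over an n-best list sorted best-first.

    A hypothesis opens a new cluster, and becomes its representative, only
    if its similarity to every representative kept so far is below
    `threshold`. Otherwise it is treated as a near-duplicate and skipped.
    """
    representatives: list[str] = []
    for hyp in nbest:
        if all(similarity(hyp, rep) < threshold for rep in representatives):
            representatives.append(hyp)
            if len(representatives) == m:
                break
    return representatives


if __name__ == "__main__":
    # Hypothetical n-best list, best hypothesis first.
    nbest = [
        "the house is small",
        "the house is little",
        "the home is small",
        "this is a tiny house",
        "a small building",
    ]
    for hyp in diverse_subset(nbest, m=3):
        print(hyp)
```

Because the list is scanned best-first, each representative is the
highest-ranked member of its cluster, so the top translation is always
kept. Lowering the threshold gives fewer, more distinct alternatives.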

------------------------------

Message: 3
Date: Tue, 17 Sep 2013 17:14:56 +0200
From: Marcin Junczys-Dowmunt <junczys@amu.edu.pl>
Subject: Re: [Moses-support] Diversify n-best list
To: undisclosed-recipients:;
Cc: moses-support <moses-support@mit.edu>
Message-ID: <523871F0.1000107@amu.edu.pl>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed

Hi Kevin,
thanks! I have this on my to-do list for the coming weeks, so I will
definitely take a very close look :)
Best,
Marcin




------------------------------

Message: 4
Date: Tue, 17 Sep 2013 16:41:56 +0100
From: Philipp Koehn <pkoehn@inf.ed.ac.uk>
Subject: Re: [Moses-support] Reg:SYNTAX based training in EMS
To: karan sharma <karan.sharma.bond@gmail.com>
Cc: moses-support <moses-support@mit.edu>
Message-ID:
<CAAFADDBd8N5sVs4rUwX7M6Ry8LKs3C-GHAPvHs3OEhTkBvZFGQ@mail.gmail.com>
Content-Type: text/plain; charset="iso-8859-1"

Hi,

If you are building a model with target side syntax, you will only need the
annotation on the target side of the parallel corpus.

If you build a model with source side syntax, you will need the annotation
on the source side of the parallel corpus, the tuning set, and the
evaluation set(s).

If you already have the data parsed, you can specify it with:

[CORPUS]
parsed-stem = /file/name/without/extension

[TUNING]
parsed-input = /file/name

[EVALUATION]
parsed-input = /file/name

-phi
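
Before pointing parsed-stem or parsed-input at pre-parsed files, it may be
worth checking that every line is well-formed. The sketch below assumes
the <tree label="..."> markup described in the Syntax Tutorial, with one
sentence per line. The function names are hypothetical and are not part
of Moses or EMS.

```python
import xml.etree.ElementTree as ET


def check_tree_line(line: str) -> bool:
    """Return True if `line` is well-formed XML built only from
    <tree label="..."> elements (assumed annotation format)."""
    try:
        # Wrap in a dummy root so a line holding several top-level
        # trees still parses as a single document.
        root = ET.fromstring(f"<root>{line}</root>")
    except ET.ParseError:
        return False
    nodes = [node for node in root.iter() if node is not root]
    return bool(nodes) and all(
        node.tag == "tree" and "label" in node.attrib for node in nodes
    )


def count_bad_lines(path: str) -> int:
    """Print the location of each malformed line and return how many there were."""
    bad = 0
    with open(path, encoding="utf-8") as handle:
        for number, line in enumerate(handle, start=1):
            if not check_tree_line(line.strip()):
                print(f"{path}:{number}: not a well-formed tree annotation")
                bad += 1
    return bad
```

Running count_bad_lines on each parsed file (for a parsed-stem, once per
language extension) gives a quick sanity check before starting a long EMS
run. It only checks the markup, not whether the labels match the grammar.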



On Sun, Sep 15, 2013 at 10:16 AM, karan sharma
<karan.sharma.bond@gmail.com> wrote:

> Hey,
>
> I am using a syntax-based model in EMS. The config file asks for the
> path to the Collins parser.
> I am already giving input in XML format. Is there any way I can skip this
> step? Also, please mention the file format for the LM, development, and
> testing data. Do I have to give XML input for all steps?
>
> Regards

------------------------------

Message: 5
Date: Tue, 17 Sep 2013 22:14:24 +0530
From: karan sharma <karan.sharma.bond@gmail.com>
Subject: Re: [Moses-support] Reg:SYNTAX based training in EMS
To: Philipp Koehn <pkoehn@inf.ed.ac.uk>, moses-support
<moses-support@mit.edu>
Message-ID:
<CAFc-37QY5M8XB2teWh24=GjaqTKotksU0_Wrpe0EQRpinVc8hw@mail.gmail.com>
Content-Type: text/plain; charset="iso-8859-1"

Sir,
If I am using a tree-to-tree model, then the input will be parsed files for
both sides.

[CORPUS]
parsed-stem = /file/name/without/extension

[TUNING]
parsed-input = filename(parsed)

parsed-output = filename(parsed)

[EVALUATION]

I am not able to find the correct files, because the input and output for
testing should be in sgm format.
Could you help me with this, or correct me if I am wrong somewhere?


Regards







------------------------------

_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


End of Moses-support Digest, Vol 83, Issue 29
*********************************************
