Moses-support Digest, Vol 102, Issue 59

Send Moses-support mailing list submissions to
moses-support@mit.edu

To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
moses-support-request@mit.edu

You can reach the person managing the list at
moses-support-owner@mit.edu

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."


Today's Topics:

1. Re: Support for per-sentence language model (Ergun Bicici)
2. Question About matrix.statmt.org WMT 2014 Test Set (Graham Neubig)
3. Re: Question About matrix.statmt.org WMT 2014 Test Set
(Matthias Huck)
4. Re: Question About matrix.statmt.org WMT 2014 Test Set
(Graham Neubig)


----------------------------------------------------------------------

Message: 1
Date: Mon, 27 Apr 2015 12:14:52 +0100
From: Ergun Bicici <Ergun.Bicici@computing.dcu.ie>
Subject: Re: [Moses-support] Support for per-sentence language model
To: Kenneth Heafield <moses@kheafield.com>
Cc: moses-support <moses-support@mit.edu>
Message-ID:
<CAB2pGndgWZqbcdrbNN0pP15pPazhTx8wBxTLi24dpby1KVX4Kg@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

Can you use Google n-grams through some API? How about word2vec (
https://code.google.com/p/word2vec/) ?


Best Regards,
Ergun

Ergun Biçici, CNGL, School of Computing, DCU, www.cngl.ie
http://www.computing.dcu.ie/~ebicici/


On Sat, Apr 25, 2015 at 2:47 PM, Kenneth Heafield <moses@kheafield.com>
wrote:

> Hi,
>
> We know how to produce filtered models. The problem is StaticData
> enforces one feature set per process. Lane could theoretically run
> single-threaded and hack StaticData in between each sentence. The real
> answer is that StaticData needs to die.
>
> Kenneth
>
> On 04/25/2015 07:19 AM, Ergun Bicici wrote:
> >
> > From man ngram:
> > -limit-vocab
> > Discard LM parameters on reading that do not pertain
> > to the words specified in the vocabulary. The default is that
> > words used in the LM are automatically added to the
> > vocabulary. This option can be used to reduce the memory requirements
> > for large LMs that are going to be evaluated only on
> > a small vocabulary subset.
> >
> > Best Regards,
> > Ergun
> >
> > Ergun Biçici, CNGL, School of Computing, DCU, www.cngl.ie
> > http://www.computing.dcu.ie/~ebicici/
> >
> >
> > On Fri, Apr 24, 2015 at 9:12 PM, Lane Schwartz <dowobeha@gmail.com> wrote:
> >
> > To answer my own question...
> >
> > After talking with Hieu and Kenneth, it appears that the answer, at
> > present, is no. But if anyone would be interested in working on this
> > as an MT Marathon project, this would be great.
> >
> > On Fri, Apr 24, 2015 at 10:25 AM, Lane Schwartz <dowobeha@gmail.com> wrote:
> > > Does moses (and particularly EMS) have a mechanism to allow for each
> > > test sentence to have its own LM file that should be used when
> > > translating just that sentence?
> > >
> > > This is in the context of taking a large LM and filtering it for a
> > > single sentence.
> > >
> > > Thanks,
> > > Lane
> >
> >
> >
> > --
> > When a place gets crowded enough to require ID's, social collapse is not
> > far away. It is time to go elsewhere. The best thing about space travel
> > is that it made it possible to go elsewhere.
> > -- R.A. Heinlein, "Time Enough For Love"
> > _______________________________________________
> > Moses-support mailing list
> > Moses-support@mit.edu <mailto:Moses-support@mit.edu>
> > http://mailman.mit.edu/mailman/listinfo/moses-support
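
The vocabulary filtering that the quoted -limit-vocab entry describes can be sketched roughly as follows. This is only an illustration under stated assumptions, not SRILM's implementation: the function name filter_arpa and the toy model are hypothetical, the input is assumed to be a plain ARPA-format text file, and no renormalization is attempted. It also does not address the StaticData limitation discussed in the thread; it only shows how a per-sentence model file could be produced.

```python
import re

# Tokens kept regardless of the sentence vocabulary (an assumption of this sketch).
SPECIAL = {"<s>", "</s>", "<unk>"}


def filter_arpa(lines, vocab):
    """Keep only the n-gram entries whose words all appear in `vocab`.

    `lines` is an iterable of lines from an ARPA-format LM; the result is a
    list of lines for a smaller ARPA file with recomputed n-gram counts.
    Probabilities and backoff weights are copied unchanged.
    """
    keep = set(vocab) | SPECIAL
    sections = {}  # n-gram order -> kept entry lines
    order = None
    for raw in lines:
        line = raw.rstrip("\n")
        stripped = line.strip()
        match = re.fullmatch(r"\\(\d+)-grams:", stripped)
        if match:
            order = int(match.group(1))
            sections[order] = []
            continue
        if stripped == "\\end\\":
            break
        if order is None or not stripped:
            continue  # skip the \data\ header and blank lines
        fields = stripped.split()
        words = fields[1:1 + order]  # fields[0] is the log probability
        if all(word in keep for word in words):
            sections[order].append(line)

    out = ["\\data\\"]
    for n in sorted(sections):
        out.append("ngram " + str(n) + "=" + str(len(sections[n])))
    for n in sorted(sections):
        out.append("")
        out.append("\\" + str(n) + "-grams:")
        out.extend(sections[n])
    out.append("")
    out.append("\\end\\")
    return out


if __name__ == "__main__":
    toy_model = [
        "\\data\\", "ngram 1=4", "ngram 2=2", "",
        "\\1-grams:",
        "-1.0\t<s>\t-0.3", "-1.0\tthe\t-0.3",
        "-1.2\tcat\t-0.3", "-1.5\tdog\t-0.3", "",
        "\\2-grams:", "-0.5\tthe cat", "-0.7\tthe dog", "",
        "\\end\\",
    ]
    sentence = "the cat"
    print("\n".join(filter_arpa(toy_model, sentence.split())))
```

Calling the function once per test sentence would give one small model per sentence; how to make the decoder load a different one for each sentence is the open question in this thread.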

------------------------------

Message: 2
Date: Mon, 27 Apr 2015 22:14:40 +0900
From: Graham Neubig <neubig@is.naist.jp>
Subject: [Moses-support] Question About matrix.statmt.org WMT 2014 Test
Set
To: "<moses-support@mit.edu>" <moses-support@mit.edu>
Message-ID:
<CADkjOCMfGPmGVQ7+g-HaQLrPHGBGcyAchpCqTFenOPSYoWz4+A@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

Hi Moses List,

Sorry that this is a bit off topic, but I have a question about the files
on matrix.statmt.org. I couldn't find any information on the site about
whom to contact, so I assumed this list would be the next-best place to
ask.

Specifically, I'm looking for the SGM files for newstest2014 in the same
order as the system outputs on matrix.statmt.org. On the "test sets" page,
in the place where there should be a link to newstest2014, it seems like
the link actually points to newstest2013:
http://matrix.statmt.org/test_sets/list

And the ones downloadable from the WMT 2015 site seem to be in a different
order, and it'd be a bit of a pain (although possible) to match the lines
properly:
http://www.statmt.org/wmt15/translation-task.html

If possible, could someone help out with this, or tell me who's in charge
of the evaluation matrix so I can contact them directly?

Graham
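
For reference, the line matching described above as possible but painful could be sketched along these lines. It is a minimal illustration, assuming both files have already been reduced to lists of plain-text segments and that matching on whitespace-normalized source text is good enough. The names normalize and align_by_source are hypothetical and come from no existing tool.

```python
from collections import defaultdict, deque


def normalize(text):
    """Collapse runs of whitespace so trivial formatting differences do not block a match."""
    return " ".join(text.split())


def align_by_source(target_sources, other_sources, other_refs):
    """Reorder `other_refs` to follow the order of `target_sources`.

    `other_sources[i]` and `other_refs[i]` belong together. The result has one
    entry per target source sentence, or None where no match was found.
    Duplicate source sentences are consumed in their original order.
    """
    positions = defaultdict(deque)
    for index, source in enumerate(other_sources):
        positions[normalize(source)].append(index)

    aligned = []
    for source in target_sources:
        queue = positions.get(normalize(source))
        aligned.append(other_refs[queue.popleft()] if queue else None)
    return aligned


if __name__ == "__main__":
    matrix_order = ["Second  sentence .", "First sentence .", "Missing one ."]
    download_order = ["First sentence .", "Second sentence ."]
    download_refs = ["Erster Satz .", "Zweiter Satz ."]
    for ref in align_by_source(matrix_order, download_order, download_refs):
        print(ref)
```

Entries that come back as None mark sentences present in one ordering but not in the other, which is worth checking by hand before scoring anything.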

------------------------------

Message: 3
Date: Mon, 27 Apr 2015 15:31:33 +0100
From: Matthias Huck <mhuck@inf.ed.ac.uk>
Subject: Re: [Moses-support] Question About matrix.statmt.org WMT 2014
Test Set
To: Graham Neubig <neubig@is.naist.jp>
Cc: "<moses-support@mit.edu>" <moses-support@mit.edu>
Message-ID: <1430145093.30904.212.camel@portedgar>
Content-Type: text/plain; charset="UTF-8"

Hi Graham,

Did you have a look at the tarballs that were distributed last year?
http://www.statmt.org/wmt14/translation-task.html

There are three different versions:

- Test sets (5.2 MB) These are the source sgm files with extra "filler"
sentences. They were the actual files released for the campaign.
http://www.statmt.org/wmt14/test.tgz

- Filtered Test sets (3.2 MB) These are the source and reference sgm
files used to evaluate, i.e. the Test sets without the "filler"
sentences. If you want to reproduce results from the campaign, use
these.
http://www.statmt.org/wmt14/test-filtered.tgz

- Cleaned Test sets (3.2 MB) These include fixes to minor encoding
errors, and reinstate around 10% of the en-de data which was excluded
from the evaluation. For further research, use these.
http://www.statmt.org/wmt14/test-full.tgz
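
Once one of the tarballs above is unpacked, the segments in an sgm file could be read with something like the sketch below. It assumes the files use doc elements carrying a docid attribute and seg elements carrying a numeric id attribute; that layout, the function name read_sgm_segments, and the sample text are assumptions of this example and should be checked against the actual files.

```python
import re
from html import unescape

# Assumed layout: <doc ... docid="..."> ... <seg id="N">text</seg> ... </doc>
DOC_RE = re.compile(r'<doc\b[^>]*\bdocid="([^"]*)"[^>]*>(.*?)</doc>', re.S | re.I)
SEG_RE = re.compile(r'<seg\b[^>]*\bid="(\d+)"[^>]*>(.*?)</seg>', re.S | re.I)


def read_sgm_segments(text):
    """Return a list of (docid, segment id, segment text) tuples in file order."""
    segments = []
    for docid, body in DOC_RE.findall(text):
        for seg_id, seg_text in SEG_RE.findall(body):
            # Decode entities such as &amp; that the sgm file escapes.
            segments.append((docid, int(seg_id), unescape(seg_text.strip())))
    return segments


if __name__ == "__main__":
    sample = (
        '<srcset setid="toy" srclang="any">\n'
        '<doc docid="d1" genre="news" origlang="en">\n'
        "<p>\n"
        '<seg id="1">Hello &amp; welcome.</seg>\n'
        '<seg id="2">Second line.</seg>\n'
        "</p>\n"
        "</doc>\n"
        "</srcset>\n"
    )
    for docid, seg_id, seg_text in read_sgm_segments(sample):
        print(docid, seg_id, seg_text)
```

Keying each segment by document id and segment id, rather than by line number, gives a way to compare two files whose documents appear in a different order.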

WMT has a Google Group:
https://groups.google.com/forum/#!forum/wmt-tasks

Cheers,
Matthias





--
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.



------------------------------

Message: 4
Date: Mon, 27 Apr 2015 23:56:34 +0900
From: Graham Neubig <neubig@is.naist.jp>
Subject: Re: [Moses-support] Question About matrix.statmt.org WMT 2014
Test Set
To: Matthias Huck <mhuck@inf.ed.ac.uk>
Cc: "<moses-support@mit.edu>" <moses-support@mit.edu>
Message-ID:
<CADkjOCOR5OeN+oLh6UX00P4pG3=8O7doHk_kE_8TZK9NNVxjLg@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

Hi Matthias,

Thank you, that's exactly what I was looking for!
Thanks also for pointing me to the WMT mailing list; I'll send any further
questions about the evaluation matrix there from now on.

Graham


------------------------------

_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


End of Moses-support Digest, Vol 102, Issue 59
**********************************************
