Moses-support Digest, Vol 110, Issue 25

Send Moses-support mailing list submissions to
moses-support@mit.edu

To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
moses-support-request@mit.edu

You can reach the person managing the list at
moses-support-owner@mit.edu

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."


Today's Topics:

1. Re: how to mert when one dev sentence has one phrase table?
(Kevin Gimpel)
2. Re: Error: The build failed (Hieu Hoang)
3. Re: how to mert when one dev sentence has one phrase table? (??)
4. First Call for Participation: WMT16 Machine Translated
related Shared Tasks (Barry Haddow)


----------------------------------------------------------------------

Message: 1
Date: Sun, 13 Dec 2015 12:12:07 -0600
From: Kevin Gimpel <kgimpel@cs.cmu.edu>
Subject: Re: [Moses-support] how to mert when one dev sentence has one
phrase table?
To: ?? <yaoliang310@163.com>
Cc: "moses-support@mit.edu" <moses-support@mit.edu>
Message-ID:
<CACAqqMv2pDdwb0Ct0sJ_wsaJDfsxionbvU-J6DCQsS8v8KQo_w@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

Hi Liang,
Here's a hack to avoid changing Moses code:
First, append a unique identifier to every word in your source file.
E.g., a source file with two lines like the following:

this is sentence 1 .
sentence 2

would become:

this~~1 is~~2 sentence~~3 1~~4 .~~5
sentence~~6 2~~7

Then when generating your sentence-specific phrase tables for these
sentences, use the same IDs in the source words in those phrase table
entries. Then concatenate all the sentence-specific phrase tables together
and train as usual.
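
A minimal sketch of the tagging step described above, in Python. The function names are hypothetical and the `~~` separator is taken from the example. It assumes whitespace-tokenized input and phrase-table entries in the plain-text Moses layout with fields separated by `|||`. Emitting one tagged entry for every position where a source phrase occurs in its sentence is an assumption of this sketch, not something stated in the message.

```python
from typing import Iterable, Iterator, List, Tuple

SEP = "~~"  # separator between a word and its ID, as in the example above


def tag_sentences(
    sentences: Iterable[str], sep: str = SEP
) -> Iterator[Tuple[List[str], List[str]]]:
    """Yield (tokens, tagged tokens), using one running ID across the whole file."""
    next_id = 1
    for line in sentences:
        tokens = line.split()
        tagged = [f"{tok}{sep}{next_id + i}" for i, tok in enumerate(tokens)]
        next_id += len(tokens)
        yield tokens, tagged


def tag_phrase_table(
    tokens: List[str], tagged: List[str], entries: Iterable[str]
) -> List[str]:
    """Rewrite the source side of each entry with the IDs of the given sentence.

    An entry is copied once per position at which its source phrase occurs
    in the sentence (assumption of this sketch); all other fields are kept.
    """
    out = []
    for entry in entries:
        fields = [f.strip() for f in entry.split("|||")]
        src = fields[0].split()
        n = len(src)
        if n == 0:
            continue  # skip malformed entries with an empty source side
        for start in range(len(tokens) - n + 1):
            if tokens[start:start + n] == src:
                new_src = " ".join(tagged[start:start + n])
                out.append(" ||| ".join([new_src] + fields[1:]))
    return out


# Toy data: the two sentences from the example plus made-up phrase-table entries.
source_lines = ["this is sentence 1 .", "sentence 2"]
per_sentence_tables = [
    ["is sentence ||| X Y ||| 0.5 0.5"],
    ["sentence ||| Z ||| 1.0"],
]

tagged_source = []
combined_table = []
for (tokens, tagged), table in zip(tag_sentences(source_lines), per_sentence_tables):
    tagged_source.append(" ".join(tagged))
    combined_table.extend(tag_phrase_table(tokens, tagged, table))

print("\n".join(tagged_source))
print("\n".join(combined_table))
```

Here `combined_table` plays the role of the concatenated phrase table. Because every ID is unique across the file, an entry can only match the sentence it was built for.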

The problem with this is that the concatenated phrase table becomes really
large, but you can filter the phrase tables a bit. This is what we did
here <http://ttic.uchicago.edu/~kgimpel/papers/gimpel+smith.smt08.pdf> (Sec.
5); the filtering made things feasible to run and didn't seem to affect the
results.

Kevin

On Sun, Dec 13, 2015 at 2:41 AM, ?? <yaoliang310@163.com> wrote:

> Dear support team,
> I want to add a new feature to the Moses decoder that relies on the
> source context of the sentence being translated.
> My idea is as follows:
> 1. For each test/dev sentence, a phrase table of all potential phrases
> that can be used during decoding is extracted from the aligned training
> set.
> 2. The translation scores of the phrase pairs in the source context
> are computed and then added to the phrase table as a new feature.
> So, in my setting, I get one phrase table for each sentence, but I
> don't know how to tune in this situation.
> I know I could perhaps add a new Feature Function as the Moses
> tutorial describes, but it is difficult for me to implement the code.
> Could you give me some advice?
>
> Thanks
> Liang
>
>
>
>
> _______________________________________________
> Moses-support mailing list
> Moses-support@mit.edu
> http://mailman.mit.edu/mailman/listinfo/moses-support
>
>

------------------------------

Message: 2
Date: Sun, 13 Dec 2015 21:45:33 +0000
From: Hieu Hoang <hieuhoang@gmail.com>
Subject: Re: [Moses-support] Error: The build failed
To: Shaimaa Marzouk <marzouk_s@yahoo.de>, moses-support@mit.edu
Message-ID: <566DE6FD.1000703@gmail.com>
Content-Type: text/plain; charset="windows-1252"

Can you run
    git pull
and try to compile again? There was a compile error that I just fixed:
https://github.com/moses-smt/mosesdecoder/commit/485887528916b4b27432c4fd2edb03aafac76e49


On 12/12/15 22:04, Shaimaa Marzouk wrote:
> Dear Support-Team,
>
> I got the error "The build failed" when I ran the command ./bjam -j4 to set up Moses.
> Please find attached the "build.log.gz".
>
> Could you please help me to fix this error?
>
> Kind regards,
> Shaimaa

--
Hieu Hoang
http://www.hoang.co.uk/hieu


------------------------------

Message: 3
Date: Mon, 14 Dec 2015 11:02:50 +0800 (CST)
From: ?? <yaoliang310@163.com>
Subject: Re: [Moses-support] how to mert when one dev sentence has one
phrase table?
To: "Kevin Gimpel" <kgimpel@cs.cmu.edu>
Cc: "moses-support@mit.edu" <moses-support@mit.edu>
Message-ID: <79446e4d.4fc4.1519e70c9ab.Coremail.yaoliang310@163.com>
Content-Type: text/plain; charset="gbk"

Hi Kevin,
Thanks a lot, your advice is great!


Liang






------------------------------

Message: 4
Date: Mon, 14 Dec 2015 10:54:01 +0000
From: Barry Haddow <bhaddow@inf.ed.ac.uk>
Subject: [Moses-support] First Call for Participation: WMT16 Machine
Translated related Shared Tasks
To: moses-support <moses-support@mit.edu>
Message-ID: <566E9FC9.7040401@inf.ed.ac.uk>
Content-Type: text/plain; charset="utf-8"

ACL 2016 FIRST CONFERENCE ON MACHINE TRANSLATION (WMT16)
Shared Tasks on translation, evaluation, automated post-editing and
document alignment.
August 2016, in conjunction with ACL 2016 in Berlin, Germany

http://www.statmt.org/wmt16

As part of WMT, as in previous years, we will be organising a collection
of shared tasks related to machine translation. We hope that both
beginners and established research groups will participate. This year we
are pleased to present the following 10 tasks:

- Translation tasks
    - News
    - IT-domain
    - Biomedical
    - Multimodal
    - Pronoun
- Evaluation tasks
    - Metrics
    - Quality estimation
    - Tuning
- Other tasks
    - Automatic post-editing
    - Bilingual document alignment

Further information, including task rationale, timetables and data will
be posted on the WMT16 website, and fully announced in January. Brief
descriptions of each task are given below. Intending participants are
encouraged to register with the mailing list for further announcements
(https://groups.google.com/forum/#!forum/wmt-tasks)

For all tasks, participants will also be invited to submit a short
paper describing their system.

News Translation Task
-----------------------------
This is the translation task run at most of the past WMT editions. This
year the language pairs will be English to/from Czech, Finnish, German,
Romanian, Russian and Turkish. Sponsorship for the task comes from the
EU H2020 projects QT21 and Cracker, Yandex and the University of Helsinki.

IT Domain Translation Task
---------------------
This guest task will involve translation of queries and their responses,
on the topic of information technology. It will cover English to/from
Bulgarian, Czech, German, Spanish, Basque, Dutch and Portuguese, and be
sponsored by the EU FP7 project QTLeap.

Biomedical Translation Task
-------------------------------------
This guest task will focus on the translation of biomedical research
abstracts from English to and from Spanish, Portuguese and French.

Multimodal Translation Task
-------------------------------------
This task will aim at generating image descriptions in a target
language, given equivalent descriptions in one or more languages. The
dataset will consist of 30,000 image--description tuples in three
languages -- English, German and French.

Pronoun Translation Task
---------------------------------
This will be similar to the task run last year as part of the DiscoMT
workshop (https://www.idiap.ch/workshop/DiscoMT/shared-task)

Metrics
----------
The idea here is that participants propose evaluation metrics for
machine translation, which compare the MT output against a reference.
The metrics will be correlated against the human judgements produced in
the news translation task. This task is sponsored by QT21.

Quality Estimation
-------------------------
This consists of several sub-tasks, all of which are concerned with the
idea of assessing the quality of MT output without using a reference, at
different levels of granularity: word, phrase, sentence and document.
This task is sponsored by QT21.

Tuning
---------
Participants in this task are asked to come up with algorithms and
objectives (i.e. metrics) for tuning the parameters of a given MT system.

Automatic Post-editing
-------------------------------
In this task participants will aim to create systems that can
automatically correct machine translation outputs, given a corpus of
human post-edits. This task is sponsored by QT21.

Bilingual document alignment
----------------------------------------
The aim is to find translated document pairs from a large collection of
documents in two languages.

Best wishes
Barry Haddow
(On behalf of the organisers)








------------------------------

_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


End of Moses-support Digest, Vol 110, Issue 25
**********************************************
