Moses-support Digest, Vol 175, Issue 1

Send Moses-support mailing list submissions to
moses-support@mit.edu

To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
moses-support-request@mit.edu

You can reach the person managing the list at
moses-support-owner@mit.edu

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."


Today's Topics:

1. CfP to the Automatic Post-Editing shared task at WMT 2021
(Rajen Chatterjee)
2. Re: Language Model Inquiry (Marwa Gaser)


----------------------------------------------------------------------

Message: 1
Date: Sat, 1 May 2021 14:07:57 -0700
From: Rajen Chatterjee <rajen.k.chatterjee@gmail.com>
Subject: [Moses-support] CfP to the Automatic Post-Editing shared task
at WMT 2021
To: moses-support@mit.edu, mt-list@eamt.org, corpora@uib.no,
linguist@listserv.linguistlist.org, NLP-IP account givem to manshri
<nlp-ai@cse.iitb.ac.in>, wmt-ape@fbk.eu
Message-ID:
<CAC4-+NyiCmDcyDnD21vP+y0caA9jBtu1HMzjG+hUZ+mmiqfyLw@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

CALL FOR PARTICIPATION

in the

Seventh Automatic Post-Editing (APE) Shared Task

at the Sixth Conference on Machine Translation (WMT21)



OVERVIEW

The 7th round of the APE shared task follows the success of the previous
rounds organized from 2015 to 2020. The aim is to examine automatic
methods for correcting errors produced by an unknown machine translation
(MT) system. This has to be done by exploiting knowledge acquired from
human post-edits, which are provided as training material.

Goals

The aim of this task is to improve MT output in black-box scenarios, in
which the MT system is used "as is" and cannot be modified. From the
application point of view, APE components would make it possible to:

- Cope with systematic errors of an MT system whose decoding process is
not accessible
- Provide professional translators with improved MT output quality to
reduce (human) post-editing effort
- Adapt the output of a general-purpose system to the lexicon/style
requested in a specific application domain

Task Description

This year the task will use Wikipedia data for English --> German and
English --> Chinese language pairs. In these datasets, the source sentences
have been translated into the target language by using a state-of-the-art
neural MT system unknown to the participants (in terms of system
configuration) and then manually post-edited. This dataset is shared by
both Automatic Post-Editing and Quality Estimation shared tasks.

At the training stage, the collected human post-edits have to be used to
learn correction rules for the APE systems. At the test stage they will be
used for system evaluation with automatic metrics (TER and BLEU).

DIFFERENCES FROM THE 6th ROUND (WMT 2020)

Compared to the previous round, the main difference is:

- The same data has been re-post-edited to improve its quality

Evaluation

Systems' performance will be evaluated with respect to their capability to
reduce the distance that separates an automatic translation from its
human-revised version. This distance will be measured in terms of TER,
computed between automatic and human post-edits in case-sensitive mode.
BLEU will also be considered as a secondary evaluation metric.
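For intuition, TER counts the word-level edits needed to turn the system output into the human post-edit, divided by the post-edit's length. The official metric also allows block shifts; the sketch below omits shifts (making it essentially WER) and is only an illustration, not the evaluation script used by the task:

```python
def edit_distance(hyp, ref):
    """Word-level Levenshtein distance: insertions, deletions, substitutions."""
    m, n = len(hyp), len(ref)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i
    for j in range(n + 1):
        d[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if hyp[i - 1] == ref[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,      # deletion
                          d[i][j - 1] + 1,      # insertion
                          d[i - 1][j - 1] + cost)  # substitution / match
    return d[m][n]

def simple_ter(hypothesis, post_edit):
    """Edits over reference length; tokens compared case-sensitively.
    Real TER additionally counts block shifts as single edits."""
    hyp, ref = hypothesis.split(), post_edit.split()
    return edit_distance(hyp, ref) / len(ref)
```

A perfect match scores 0; e.g. one substitution in a four-word post-edit gives 0.25.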

Important dates

Release of training and development data: May 01, 2021
Release of test data: July 10, 2021
APE system submission deadline: July 17, 2021
Manual evaluation: August 2021
Paper submission deadline: August 5, 2021
Notification of acceptance: September 5, 2021
Camera-ready deadline: September 15, 2021
Conference (Workshops & Tutorials): November 10-11, 2021



--
-Regards,
Rajen Chatterjee.

------------------------------

Message: 2
Date: Sun, 2 May 2021 01:47:14 +0200
From: Marwa Gaser <marwagaser@gmail.com>
Subject: Re: [Moses-support] Language Model Inquiry
To: Hieu Hoang <hieuhoang@gmail.com>
Cc: moses-support@mit.edu
Message-ID:
<CADNwFMnaHP8BB_VB8y+O1J8Z+SoE8R--ZtR+Jxf-H4EqcGuONQ@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

Then which numbers do I use for IRSTLM and SRILM?

On Thu, 29 Apr 2021 at 7:10 PM Hieu Hoang <hieuhoang@gmail.com> wrote:

>
> On 4/29/2021 5:27 AM, Marwa Gaser wrote:
>
> Hello,
>
> In the baseline training, what do the numbers in the below line represent?
>
>
> 3 for the 3-gram?
>
> yes
>
> How about 0 and 8?
>
> 0 means that the LM is over the surface words (factor 0). If your output
> has other factors, e.g. Je|PRO suis|VB etudiant|ADJ, you can choose to
> have the LM on factor 1.
>
> 8 means it uses KenLM, as opposed to SRILM or IRSTLM.
>
>
> -lm 0:3:$HOME/lm/news-commentary-v8.fr-en.blm.en:8
>
>
> _______________________________________________
> Moses-support mailing list
> Moses-support@mit.edu
> http://mailman.mit.edu/mailman/listinfo/moses-support
>
> --
> Hieu Hoang
> http://statmt.org/hieu
>
--
Sent from Gmail Mobile
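For readers following this thread: the -lm argument Hieu quotes has the form factor:order:path:implementation. A small sketch to unpack it, assuming the implementation codes given in the Moses baseline documentation (0 = SRILM, 1 = IRSTLM, 8 = KenLM); check these against your Moses version's manual:

```python
# Hypothetical helper for illustration; not part of Moses itself.
LM_IMPLEMENTATIONS = {0: "SRILM", 1: "IRSTLM", 8: "KenLM"}

def parse_lm_spec(spec):
    """Unpack a Moses -lm value of the form factor:order:path:implementation."""
    factor, order, rest = spec.split(":", 2)
    path, impl = rest.rsplit(":", 1)   # split from the right so ':' in the path survives
    return {
        "factor": int(factor),         # 0 = LM over surface words
        "order": int(order),           # n-gram order, e.g. 3 for a 3-gram LM
        "path": path,                  # language model file
        "toolkit": LM_IMPLEMENTATIONS.get(int(impl), "unknown"),
    }

print(parse_lm_spec("0:3:$HOME/lm/news-commentary-v8.fr-en.blm.en:8"))
```

So, to Marwa's question: under these codes, 0 would select SRILM and 1 IRSTLM in the final field, with 8 selecting KenLM as in the example.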

------------------------------

_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


End of Moses-support Digest, Vol 175, Issue 1
*********************************************
