Send Moses-support mailing list submissions to
moses-support@mit.edu
To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
moses-support-request@mit.edu
You can reach the person managing the list at
moses-support-owner@mit.edu
When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."
Today's Topics:
1. CFP: 11th BUCC Workshop with Shared Task on Identifying
Parallel Sentences in Comparable Corpora (Reinhard Rapp)
2. First CfP: First Workshop on Translation Quality Estimation
and Automatic Post-Editing (João Graça)
----------------------------------------------------------------------
Message: 1
Date: Tue, 2 Jan 2018 21:23:58 +0100
From: "Reinhard Rapp" <reinhardrapp@gmx.de>
Subject: [Moses-support] CFP: 11th BUCC Workshop with Shared Task on
Identifying Parallel Sentences in Comparable Corpora
To: <IRList@lists.shef.ac.uk>, <listmaster@loria.fr>,
<lr_egroup@mail.iiit.ac.in>, <moses-support@mit.edu>,
<news@multilingual.com>
Message-ID: <5771E7812445465584FBFCC18D9FC8F4@ASUSPC>
Content-Type: text/plain; charset="windows-1252"
********************************************************************
11th WORKSHOP ON BUILDING AND USING COMPARABLE CORPORA
Co-located with LREC 2018, Phoenix Seagaia Resort, Miyazaki, Japan
Tuesday, May 8, 2018
Submission deadline: January 20, 2018
SHARED TASK: Identifying parallel sentences in comparable corpora
Website: https://comparable.limsi.fr/bucc2018/
********************************************************************
MOTIVATION
In the language engineering and the linguistics communities, research on
comparable corpora has been driven by two main motivations. In language
engineering, on the one hand, it is chiefly motivated by the need to use
comparable corpora as training data for statistical NLP applications such
as statistical and neural machine translation or cross-lingual retrieval.
In linguistics, on the other hand, comparable corpora are of interest in
themselves by making possible cross-language discoveries and comparisons.
It is generally accepted in both communities that comparable corpora are
documents in one or several languages that are comparable in content and
form in various degrees and dimensions. We believe that the linguistic
definitions and observations related to comparable corpora can improve
methods to mine such corpora for applications of statistical NLP. As such,
it is of great interest to bring together builders and users of such corpora.
TOPICS
Given that LREC takes place for the first time in Asia, this year's
special theme is "Comparable Corpora for Asian Languages". However, we also
solicit contributions on all other topics related to comparable corpora,
including but not limited to the following:
Building Comparable Corpora:
- Human translations
- Automatic and semi-automatic methods
- Methods to mine parallel and non-parallel corpora from the Web
- Tools and criteria to evaluate the comparability of corpora
- Parallel vs non-parallel corpora, monolingual corpora
- Rare and minority languages, across language families
- Multi-media/multi-modal comparable corpora
Applications of comparable corpora:
- Human translations
- Language learning
- Cross-language information retrieval & document categorization
- Bilingual projections
- Machine translation
- Writing assistance
- Machine learning techniques using comparable corpora
Mining from Comparable Corpora:
- Induction of morphological, grammatical, and translation rules from comparable corpora
- Extraction of parallel segments or paraphrases from comparable corpora
- Extraction of bilingual and multilingual translations of single words and multi-word expressions, proper names, and named entities from comparable corpora
- Induction of multilingual word classes from comparable corpora
- Cross-language distributional semantics
SUBMISSION INFORMATION
Please follow the style sheet and templates provided for the main conference at http://lrec2018.lrec-conf.org/en/submission/authors-kit/
The submission website is https://www.softconf.com/lrec2018/BUCC2018/
Papers should be submitted as a PDF file. Submissions must describe original and unpublished work and range from four (4) to eight (8) pages including references.
Reviewing will be double blind, so the papers should not reveal the authors' identity. Accepted papers will be published in the workshop proceedings.
Double submission policy: Parallel submission to other meetings or publications is permitted, but the workshop organizers must be notified immediately.
For further information, please contact Reinhard Rapp: reinhardrapp (at) gmx (dot) de
For further information see BUCC 2018 website: http://comparable.limsi.fr/bucc2018/
IMPORTANT DATES
Paper submission deadline: 20 January 2018
Notification of acceptance: 10 February, 2018
Early bird registration (reduced rates): 15 February, 2018
Camera ready final papers: 25 February, 2018
Workshop date: May 8, 2018
SHARED TASK: Identifying parallel sentences in comparable corpora
As a continuation of the previous year's shared task, we announce a modified
shared task for 2018. As is well known, a bottleneck in statistical machine
translation is the scarcity of parallel resources for many language pairs
and domains. Previous research has shown that this bottleneck can be
reduced by utilizing parallel portions found within comparable corpora.
These are useful for many purposes, including automatic terminology
extraction and the training of statistical MT systems. The aim of the
shared task is to quantitatively evaluate competing methods for extracting
parallel sentences from comparable monolingual corpora, so as to give an
overview on the state of the art and to identify the best performing
approaches.
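One common baseline for this task scores candidate sentence pairs by glossing the source sentence word by word with a bilingual lexicon and measuring lexical overlap with each target-side candidate. A minimal, self-contained sketch follows; the toy German-English lexicon, the example sentences, and the threshold are all invented for illustration, and competitive systems would instead use bilingual sentence embeddings or full machine translation:

```python
# A minimal, dictionary-based sketch of parallel-sentence mining.
# The tiny German-English lexicon and the example sentences below are
# invented for illustration; real systems use bilingual embeddings or MT.
from collections import Counter
from math import sqrt

# Hypothetical toy lexicon mapping German words to English words.
LEXICON = {
    "das": "the", "haus": "house", "ist": "is", "alt": "old",
    "die": "the", "katze": "cat", "schläft": "sleeps",
}

def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def mine_pairs(src_sents, tgt_sents, threshold=0.5):
    """Return (src, tgt, score) triples whose glossed-overlap score passes the threshold."""
    pairs = []
    for src in src_sents:
        # Gloss the source: replace each word with its dictionary translation if known.
        glossed = Counter(LEXICON.get(w, w) for w in src.lower().split())
        for tgt in tgt_sents:
            score = cosine(glossed, Counter(tgt.lower().split()))
            if score >= threshold:
                pairs.append((src, tgt, score))
    return pairs

pairs = mine_pairs(
    ["Das Haus ist alt", "Die Katze schläft"],
    ["The house is old", "Interest rates rose sharply"],
)
```

With this toy data only the genuinely parallel pair ("Das Haus ist alt" / "The house is old") clears the threshold; the non-parallel combinations share too few glossed words.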
Any submission to the shared task is expected to be accompanied by a short
paper (4 pages plus references). This will be accepted for publication in
the workshop proceedings after a basic quality check; the submission will
go via Softconf through the standard peer-review process.
SHARED TASK SCHEDULE
Shared task sample and training sets released: 22 December 2017
Shared task test set release: 22 January 2018
Shared task test submission deadline: 29 January 2018
Shared task paper submission deadline: 2 February 2018
Shared task camera ready papers: 25 February 2018
For further information concerning the shared task see https://comparable.limsi.fr/bucc2018/bucc2018-task.html
WORKSHOP ORGANIZERS
Reinhard Rapp (Magdeburg-Stendal University of Applied Sciences and University of Mainz, Germany), Chair
Pierre Zweigenbaum (LIMSI, CNRS, Université Paris-Saclay, Orsay, France), Shared task organizer
Serge Sharoff (University of Leeds, United Kingdom)
PROGRAMME COMMITTEE
Ahmet Aker (University of Sheffield, UK)
Caroline Barrière (CRIM, Montréal, Canada)
Hervé Déjean (Xerox Research Centre Europe, Grenoble, France)
Éric Gaussier (Université Joseph Fourier, Grenoble, France)
Silvia Hansen-Schirra (University of Mainz, Germany)
Natalie Kübler (Université Paris Diderot USPC, France)
Philippe Langlais (Université de Montréal, Canada)
Michael Mohler (Language Computer Corp., US)
Emmanuel Morin (Université de Nantes, France)
Dragos Stefan Munteanu (Language Weaver, Inc., US)
Lene Offersgaard (University of Copenhagen, Denmark)
Ted Pedersen (University of Minnesota, Duluth, US)
Reinhard Rapp (Magdeburg-Stendal University of Applied Sciences and University of Mainz, Germany)
Serge Sharoff (University of Leeds, UK)
Michel Simard (National Research Council Canada)
Richard Sproat (OGI School of Science & Technology, US)
Pierre Zweigenbaum (LIMSI, CNRS, Université Paris-Saclay, Orsay, France)
IDENTIFY, DESCRIBE AND SHARE YOUR LANGUAGE RESOURCES
Please make sure that your papers take into account the following information from the LREC organizers about the LRE Map, the "Share your LRs!" initiative, and the ISLRN number:
* Describing your LRs in the LRE Map is now a normal practice in the
submission procedure of LREC (introduced in 2010 and adopted by
other conferences). To continue the efforts initiated at LREC 2014
about "Sharing LRs" (data, tools, web-services, etc.), authors will
have the possibility, when submitting a paper, to upload LRs in a
special LREC repository. This effort of sharing LRs, linked to the
LRE Map for their description, may become a new "regular" feature
for conferences in our field, thus contributing to creating a common
repository where everyone can deposit and share data.
* As scientific work requires accurate citations of referenced work so
as to allow the community to understand the whole context and also
replicate the experiments conducted by other researchers, LREC 2018
endorses the need to uniquely Identify LRs through the use of the
International Standard Language Resource Number (ISLRN,
www.islrn.org), a Persistent Unique Identifier to be assigned to
each Language Resource. The assignment of ISLRNs to LRs cited in
LREC papers will be offered at submission time.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mailman.mit.edu/mailman/private/moses-support/attachments/20180102/9470646e/attachment-0001.html
------------------------------
Message: 2
Date: Wed, 3 Jan 2018 11:55:21 +0000
From: João Graça <gracaninja@gmail.com>
Subject: [Moses-support] First CfP: First Workshop on Translation
Quality Estimation and Automatic Post-Editing
To: moses-support <moses-support@mit.edu>, mt-list@eamt.org
Message-ID:
<CAGfH6a7Xp7CvBryaOouV_uThsp=xB94i+Q92T-yUx_NdRzu1Eg@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"
==========================================================
First Workshop on Translation Quality Estimation and Automatic Post-Editing
Boston, Massachusetts, March 21, 2018
@AMTA 2018 (http://www.conference.amtaweb.org/)
==========================================================
(TL;DR: 4 pages, in ACL format, submit by February 2.)
The goal of quality estimation is to evaluate a translation system's
quality without access to reference translations (Blatz et al., 2004;
Specia et al., 2013). This has many potential usages: informing an end user
about the reliability of translated content; deciding if a translation is
ready for publishing or if it requires human post-editing; highlighting the
words that need to be changed. Quality estimation systems are particularly
appealing for crowd-sourced and professional translation services, due to
their potential to dramatically reduce post-editing times and to save labor
costs (Specia, 2011). The increasing interest in this problem from an
industrial angle comes as no surprise (Turchi et al., 2014; de Souza et
al., 2015; Martins et al., 2016, 2017; Kozlova et al., 2016). A related
task is that of automatic post-editing (Simard et al. (2007),
Junczys-Dowmunt and Grundkiewicz (2016)), which aims to automatically
correct the output of machine translation. Recent work (Martins, 2017, Kim
et al., 2017, Hokamp, 2017) has shown that the tasks of quality estimation
and automatic post-editing benefit from being trained or stacked together.
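The paragraph above frames quality estimation as predicting translation quality with no reference translation available. A toy feature-based sketch of the sentence-level task follows; the glass-box features, the hand-set weights, and the example sentences are all invented for illustration, whereas real QE systems learn their parameters from post-edited data (e.g., HTER labels):

```python
# A toy sketch of sentence-level quality estimation: predict a quality
# score for an MT output from cheap glass-box features, with no reference
# translation. Features and weights below are illustrative only.

def qe_features(source, mt_output):
    src = source.split()
    hyp = mt_output.split()
    return {
        # Translations much shorter or longer than the source are suspicious.
        "length_ratio": min(len(hyp), len(src)) / max(len(hyp), len(src)),
        # Copied-through source tokens often signal untranslated material.
        "copied_ratio": sum(1 for w in hyp if w in src) / len(hyp),
        # Repeated adjacent tokens are a common neural MT pathology.
        "repetition": sum(1 for a, b in zip(hyp, hyp[1:]) if a == b) / max(len(hyp) - 1, 1),
    }

# Hypothetical hand-set weights: a higher score means better predicted quality.
WEIGHTS = {"length_ratio": 0.6, "copied_ratio": -0.3, "repetition": -0.5}

def qe_score(source, mt_output):
    feats = qe_features(source, mt_output)
    return 0.4 + sum(WEIGHTS[k] * v for k, v in feats.items())

good = qe_score("le chat dort sur le tapis", "the cat sleeps on the mat")
bad = qe_score("le chat dort sur le tapis", "the the cat cat le tapis")
```

The fluent output scores higher than the degenerate one with repetitions and copied source words, which is exactly the ranking a QE component would feed into routing or post-editing decisions.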
In this workshop, we will bring together researchers and industry
practitioners interested in the tasks of quality estimation (word,
sentence, or document level) and automatic post-editing, both from a
research perspective and with the goal of applying these systems in
industrial settings for routing, for improving translation quality, or for
making human post-editors more efficient. Special emphasis will be given to
the case of neural machine translation and the new open problems that it
poses for quality estimation and automatic post-editing.
The workshop will consist of one full day of technical presentations,
including (tentatively) six invited talks and one contributed talk,
followed by a 30-minute panel discussion. There will be a poster session
featuring the papers accepted for publication in the workshop proceedings.
SUBMISSIONS
============
We invite the submission of original research papers, review papers, and
position papers related to the topic of the workshop. The papers should
not be longer than four pages, excluding references. Topics of the workshop
include but are not limited to:
- Research, review, and position papers on document-level, sentence-level,
or word-level Quality Estimation
- Research, review, and position papers on Automatic Post-Editing
- Machine learning techniques for exploiting the interaction among these
two tasks (e.g. stacking and multi-task learning)
- Corpora curation technologies for developing Quality Estimation datasets
- User studies showing the impact of Quality Estimation tools on translator
productivity
- Automatic metrics for translation fluency and adequacy
- Quality Estimation tailored to Neural Machine Translation
- Quality Estimation tailored to Human Translation
Papers should be formatted according to the ACL template (
http://acl2018.org/downloads/acl18-latex.zip).
Papers should be submitted via the START system at https://www.softconf.com/amta2018/qeape.
Papers will be reviewed for relevance and quality. Accepted papers will be
posted online and presented as either oral or poster presentations.
IMPORTANT DATES
================
- Submission deadline: February 2
- Notification date: February 9
- Camera ready deadline: February 16
- Workshop day: March 21
ORGANIZERS
===========
André Martins (Unbabel and University of Lisbon): andre.martins@unbabel.com
Ramon Astudillo (Unbabel and INESC-ID Lisboa): ramon@unbabel.com
João Graça (Unbabel): joao@unbabel.com
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mailman.mit.edu/mailman/private/moses-support/attachments/20180103/66e63419/attachment.html
------------------------------
_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support
End of Moses-support Digest, Vol 135, Issue 2
*********************************************