Moses-support Digest, Vol 127, Issue 31

Send Moses-support mailing list submissions to
moses-support@mit.edu

To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
moses-support-request@mit.edu

You can reach the person managing the list at
moses-support-owner@mit.edu

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."


Today's Topics:

1. Re: Please help me to install and use moses2 (Hieu Hoang)
2. MLP 2017 Call for Participation in Shared Tasks on
Cross-lingual Word Segmentation and Morpheme Segmentation
(Mikel L. Forcada)


----------------------------------------------------------------------

Message: 1
Date: Thu, 18 May 2017 17:21:48 +0100
From: Hieu Hoang <hieuhoang@gmail.com>
Subject: Re: [Moses-support] Please help me to install and use moses2
To: Ng? Th? Vinh <ntvinh@ictu.edu.vn>, moses-support@mit.edu
Message-ID: <bf213a06-8f2d-b323-920a-517132ae0355@gmail.com>
Content-Type: text/plain; charset="utf-8"

Please subscribe to the Moses mailing list before posting to it. You can
subscribe here:

http://mailman.mit.edu/mailman/listinfo/moses-support

To answer your question - whenever you see the command

...../bin/moses

you can replace it with

..../bin/moses2

More information can be found here:

http://www.statmt.org/moses/?n=Site.Moses2
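
For anyone who launches the decoder from a script, the substitution above can be sketched in Python. This is only an illustration: the function name `use_moses2` and the example path and arguments are hypothetical, and whether an existing configuration works unchanged with moses2 should be checked against the page linked above.

```python
from pathlib import PurePosixPath


def use_moses2(command):
    """Return a copy of `command` with a `moses` executable swapped for `moses2`.

    `command` is an argument list whose first element is the decoder path.
    Only the executable name changes; all other arguments are passed through.
    """
    if not command:
        return []
    executable = PurePosixPath(command[0])
    if executable.name != "moses":
        # Not the classic decoder binary, so leave the command untouched.
        return list(command)
    return [str(executable.with_name("moses2"))] + list(command[1:])


# Hypothetical install location and arguments, used purely as an example.
print(use_moses2(["/opt/moses/bin/moses", "-f", "moses.ini"]))
```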


On 14/05/2017 11:36, Ng? Th? Vinh wrote:
> Hi all,
>
> I have read the paper Fast, Scalable Phrase-Based SMT Decoding (By
> Hieu Hoang, Nikolay Bogoychev, Lane Schwartz and Marcin
> Junczys-Dowmunt) and I know moses2 is more optimized than moses. Now,
> I want to use moses2 for my translation system, but I do not know how
> to integrate moses2 into moses.
>
> If I install moses normally, is moses2 automatically integrated into
> the system?
>
> Thank you for your help!
>
> --
> *Ng? Thi? Vinh*
> Faculty of Electronics and Communications,
> Thai Nguyen University of Information and Communication Technology (ICTU).
> TEL: 0987 706 830
> Email: ntvinh@ictu.edu.vn

--
Hieu Hoang
http://moses-smt.org/

------------------------------

Message: 2
Date: Thu, 18 May 2017 17:21:58 +0100
From: "Mikel L. Forcada" <mlf@dlsi.ua.es>
Subject: [Moses-support] MLP 2017 Call for Participation in Shared
Tasks on Cross-lingual Word Segmentation and Morpheme Segmentation
To: "mt-list@eamt.org" <mt-list@eamt.org>, moses-support@mit.edu,
"apertium-stuff@lists.sourceforge.net"
<apertium-stuff@lists.sourceforge.net>,
tradumatica@listserv.rediris.es
Message-ID: <4785bdf4-7dfe-48e9-e26b-58607c8327e9@dlsi.ua.es>
Content-Type: text/plain; charset="utf-8"

MLP 2017 Call for Participation in Shared Tasks on Cross-lingual Word
Segmentation and Morpheme Segmentation

The analysis of word formation is among the most fundamental natural
language processing (NLP) technologies for extracting basic processing
units for further NLP tasks in many languages. There are broadly two
groups of segmentation tasks related to word formation, i.e. morpheme
segmentation and word segmentation. Morpheme segmentation is required
in languages such as Turkish, for example, where words are formed by
stems, root words, prefixes, and/or suffixes. It is the foundation for
further morphological analysis tasks. Word segmentation is necessary in
languages such as Mandarin Chinese, where there are no word boundaries
in the writing system.

Although different languages are clearly similar in terms of either
morpheme segmentation or word segmentation, most segmentation tools
are designed specifically for one language. In this shared task, we
encourage the participants to submit the results of one system/method as
applied to multiple languages for one of the two segmentation tasks.
These systems are expected to demonstrate the ability of cross-lingual
processing on the segmentation tasks, which would give insights to our
community into the building of fundamental NLP tools for low resource
languages.

Popular languages such as Chinese and Japanese are also included in the
task for two reasons. Firstly, although morpheme segmentation and word
segmentation tools for these languages have been developed for many
years and are often regarded as mature technologies, human creativity,
variability of textual genres and dialects as exhibited in language
evolution still make them challenging problems for these languages.
Secondly, we would like to encourage participants of this shared task
to develop systems/methods that can be used across different languages
where morpheme segmentation or word segmentation is required for natural
language processing.

A corpus of at least 2,000 sentences will be prepared as the training
set in each language for either morpheme segmentation or word
segmentation. Development and test sets will each include 1,000
sentences for system development and evaluation purposes. The whole
corpus will comprise multiple genres where plausible in both subtasks.
Recommendations of additional language resources will also be
listed/provided for some languages by the organizers. These resources
might include, but will not be limited to, dictionaries, articles,
social media posts and bilingual (aligned) texts for the target languages.


The tasks will be organized into two kinds of subtask, constrained and
semi-constrained, which differ in the data that participants may use. In
the constrained subtasks, participants will use only the corpora provided
by the shared task to develop their systems, which makes it easier to
compare the pros and cons of different technologies. In the
semi-constrained subtasks, participants are encouraged to use additional
publicly available resources to further improve the performance of their
systems. Note that external data used in the semi-constrained subtasks
must be un-annotated (raw); data annotated with word or morpheme
boundaries cannot be used. The four subtasks are as follows; participants
can take part in any (or all) of them.

* Task: Word Segmentation (WS)
    o Subtask: Word Segmentation - Constrained (WSC)
    o Subtask: Word Segmentation - Semi-constrained (WSS)

* Task: Morpheme Segmentation (MS)
    o Subtask: Morpheme Segmentation - Constrained (MSC)
    o Subtask: Morpheme Segmentation - Semi-constrained (MSS)

During development, participants must submit results from systems tuned
only on the given development sets. Participants may also submit
additional results tuned on different development sets, provided they
describe how these sets were produced, e.g. as a subset derived manually
from the original development set or by some other method. The
organizers will provide results of baseline systems for the constrained
morpheme segmentation (MSC) and constrained word segmentation (WSC)
subtasks. The results of submitted systems will be evaluated against the
prepared test set for each language. Precision, recall and F1 measure
will be used as evaluation metrics.
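
To illustrate the three metrics, here is a minimal Python sketch. It is not the organizers' scorer. The call does not say what unit is compared, so the sketch assumes that a predicted word or morpheme counts as correct only when its exact character span also appears in the gold segmentation; the function names are hypothetical.

```python
def spans(units):
    """Map a segmented sequence to the set of (start, end) character spans."""
    result, start = set(), 0
    for unit in units:
        end = start + len(unit)
        result.add((start, end))
        start = end
    return result


def precision_recall_f1(gold, predicted):
    """Score one predicted segmentation against the gold segmentation.

    Both arguments are lists of units (words or morphemes) that
    concatenate to the same underlying string.
    """
    gold_spans, pred_spans = spans(gold), spans(predicted)
    correct = len(gold_spans & pred_spans)
    precision = correct / len(pred_spans) if pred_spans else 0.0
    recall = correct / len(gold_spans) if gold_spans else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Toy example: the system merged the last two gold units into one.
p, r, f1 = precision_recall_f1(["ab", "c", "de"], ["ab", "cde"])
print(f"P={p:.3f} R={r:.3f} F1={f1:.3f}")  # P=0.500 R=0.333 F1=0.400
```

In the toy example one of the two predicted units matches a gold span, so precision is 1/2 and recall is 1/3. A corpus-level score would sum the span counts over all sentences before dividing.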

TARGET LANGUAGES (listed in alphabetical order)

* Word Segmentation: Mandarin Chinese, Thai, Vietnamese.

* Morpheme Segmentation: Basque, Farsi, Japanese, Finnish, Kazakh,
  Marathi, Uyghur.

DATA SAMPLE


The data format is shown below.

* Uyghur; morpheme segmentation

  ??????//???? ????//????//?????//????//??? ????//????
  ??????? ???//???

* Basque, i.e. Euskara; morpheme segmentation

  Paper\\a\\k mahai\\a\\ren gain\\ean daude

* Mandarin Chinese; word segmentation

  ?? ??? ? ?? ????
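
A minimal Python sketch of reading lines in this format follows. It assumes the delimiters appear in the released data exactly as printed in the samples above (whitespace between words, `//` between Uyghur morphemes, two backslashes between Basque morphemes); the function names are hypothetical and the released files may differ.

```python
def parse_segmented_line(line, delimiter):
    """Split a line into words on whitespace, then each word into units."""
    return [word.split(delimiter) for word in line.split()]


def surface_form(words):
    """Rebuild the unsegmented sentence from parsed words."""
    return " ".join("".join(units) for units in words)


# Two literal backslash characters, as printed in the Basque sample.
BASQUE_DELIMITER = "\\\\"

sample = r"Paper\\a\\k mahai\\a\\ren gain\\ean daude"
words = parse_segmented_line(sample, BASQUE_DELIMITER)
print(words)
print(surface_form(words))
```

For the word segmentation samples, where only whitespace separates the units, `line.split()` alone gives the list of words.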

SCHEDULE

May 20, 2017      Shared Task Website Ready
May 20, 2017      First Call for Participants Ready
May 20, 2017      Registration Begins
June 20, 2017     Release of Training Set
July 5, 2017      Dry run: Release of Development Set
July 8, 2017      Dry run: Results Submission on Development Set
July 10, 2017     Dry run: Release of Scores
July 12, 2017     Release of Surprise Languages (Training and Development Sets)
July 20, 2017     Registration Ends
July 24, 2017     Release of Test Set
July 31, 2017     Submission of Systems
August 4, 2017    System Results
August 11, 2017   System Description Paper Due
August 18, 2017   Notification of Acceptance
August 25, 2017   Camera-Ready Deadline



Registration:

Please send a registration email to mlp2017.sharedtasks@gmail.com
with the following information:

* Institution:
    o Name
    o Country

* Contact person:
    o Title
    o Last Name
    o First Name
    o Email address

* Tasks and Subtasks to participate in.

The subject line of the registration email should be: Registration

ORGANIZERS: [listed in alphabetical order]

Alberto Poncelas    ADAPT Centre, Dublin City University
Alex Huynh          University of Science, Vietnam National University Ho Chi Minh City
Chao-Hong Liu       ADAPT Centre, Dublin City University
Dinh Dien           University of Science, Vietnam National University Ho Chi Minh City
Francis Tyers       UiT Norgga árktalaš universitehta
Majid Latifi        Universitat Politècnica de Catalunya
Nasun-Urt           Inner Mongolia University
Prachya Boonkwan    National Electronics and Computer Technology Center
Teresa Lynn         ADAPT Centre, Dublin City University
Thepchai Supnithi   National Electronics and Computer Technology Center
Tommi A Pirinen     Universität Hamburg
Qun Liu             ADAPT Centre, Dublin City University
Vinit Ravishankar   Maharashtra Institute of Technology
Yating Yang         University of Chinese Academy of Sciences

--
Mikel L. Forcada http://www.dlsi.ua.es/~mlf/
Departament de Llenguatges i Sistemes Informàtics
Universitat d'Alacant
E-03690 Sant Vicent del Raspeig
Spain
Office: +34 96 590 9776

------------------------------



End of Moses-support Digest, Vol 127, Issue 31
**********************************************
