Moses-support Digest, Vol 101, Issue 73

Send Moses-support mailing list submissions to
moses-support@mit.edu

To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
moses-support-request@mit.edu

You can reach the person managing the list at
moses-support-owner@mit.edu

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."

Today's Topics:

1. KENLM python module - query strings (Arda Tezcan)
2. Re: KENLM python module - query strings (Kenneth Heafield)
3. Call for Participation: WAT2015 (The 2nd Workshop on Asian
Translation) (Toshiaki Nakazawa)
4. Two MT Marathons in 2015, US and EU (MTMA and MTM) (Lane Schwartz)

----------------------------------------------------------------------

Message: 1
Date: Thu, 26 Mar 2015 21:55:09 +0100
From: Arda Tezcan <arda_te@yahoo.com>
Subject: [Moses-support] KENLM python module - query strings
To: Moses Support <moses-support@mit.edu>
Message-ID: <7673DB02-DF3F-4EFA-B005-90597318F562@yahoo.com>
Content-Type: text/plain; charset="utf-8"

Hi,
I was wondering if the python api for KENLM supports also querying just strings?
As far as I could see from the description it seems to let you query only sentences (but maybe I missed something):
sentence = 'this is a sentence .'
print(model.score(sentence))

If I query a (non-sentence) string with this function I get much lower scores compared to querying directly from the command line with the "null" option (which gives the expected results).
Is it possible that "score" function is maybe adding sentence boundaries to the input string before querying?

Thanks in advance,
Arda

------------------------------

Message: 2
Date: Thu, 26 Mar 2015 19:23:46 -0400
From: Kenneth Heafield <moses@kheafield.com>
Subject: Re: [Moses-support] KENLM python module - query strings
To: moses-support@mit.edu
Message-ID: <55149502.4050502@kheafield.com>
Content-Type: text/plain; charset=windows-1252

Hi,

This isn't really a Moses question, so please follow up with me if you
need support for KenLM. The model.score function is declared as:

def score(self, sentence, bos = True, eos = True)

Thus, if you want the same result as running query with the null option, call:

model.score(sentence, False, False)

Kenneth
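
To illustrate why the bos and eos flags change the score of a fragment, here is a minimal sketch. It does not use KenLM itself: ToyBigramModel, its probability tables, and the simplified backoff (no backoff weights) are all hypothetical stand-ins invented for this example. Only the score(sentence, bos, eos) signature mirrors the declaration quoted above.

```python
class ToyBigramModel:
    """Hypothetical stand-in that mimics the score(sentence, bos, eos) signature.

    Scores are log10 probabilities. This is NOT KenLM: backoff weights are
    ignored and the tables below are made-up numbers.
    """

    def __init__(self, unigrams, bigrams, unknown=-3.0):
        self.unigrams = unigrams
        self.bigrams = bigrams
        self.unknown = unknown

    def _word_score(self, context, word):
        # Use the bigram if there is a context and the pair is known,
        # otherwise fall back to the unigram (or the unknown-word score).
        if context is not None and (context, word) in self.bigrams:
            return self.bigrams[(context, word)]
        return self.unigrams.get(word, self.unknown)

    def score(self, sentence, bos=True, eos=True):
        # bos=True starts from the <s> context; bos=False starts from an
        # empty (null) context, so the first word is scored on its own.
        context = "<s>" if bos else None
        total = 0.0
        for word in sentence.split():
            total += self._word_score(context, word)
            context = word
        # eos=True additionally scores the end-of-sentence token.
        if eos:
            total += self._word_score(context, "</s>")
        return total


# Made-up tables: "red car" is a likely fragment, but an unlikely
# complete sentence (rarely sentence-initial, rarely sentence-final).
unigrams = {"red": -1.5, "car": -1.2, "</s>": -1.0}
bigrams = {("<s>", "red"): -2.5, ("red", "car"): -0.3, ("car", "</s>"): -2.0}
model = ToyBigramModel(unigrams, bigrams)

fragment = "red car"
with_boundaries = model.score(fragment)                      # default: bos=True, eos=True
without_boundaries = model.score(fragment, bos=False, eos=False)

print("with boundaries:   ", with_boundaries)
print("without boundaries:", without_boundaries)
```

In this toy setup the fragment scores lower with the default flags because the sentence-start and sentence-end transitions add their own (unlikely) terms, which matches the behaviour described in the question.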

On 03/26/2015 04:55 PM, Arda Tezcan wrote:
> Hi,
> I was wondering if the python api for KENLM supports also querying just
> strings?
> As far as I could see from the description it seems to let you query
> only sentences (but maybe I missed something):
>
> sentence = 'this is a sentence .'
> print(model.score(sentence))
>
>
> If I query a (non-sentence) string with this function I get much lower
> scores compared to querying directly from the command line with the
> "null" option (which gives the expected results).
> Is it possible that "score" function is maybe adding sentence boundaries
> to the input string before querying?
>
> Thanks in advance,
> Arda
>
>
> _______________________________________________
> Moses-support mailing list
> Moses-support@mit.edu
> http://mailman.mit.edu/mailman/listinfo/moses-support
>

------------------------------

Message: 3
Date: Fri, 27 Mar 2015 12:30:29 +0900
From: Toshiaki Nakazawa <nakazawa@pa.jst.jp>
Subject: [Moses-support] Call for Participation: WAT2015 (The 2nd
Workshop on Asian Translation)
To: moses-support@mit.edu
Message-ID: <m2h9t7glre.wl-nakazawa@pa.jst.jp>
Content-Type: text/plain; charset=US-ASCII

Dear all MT researchers/users,

I'm Toshiaki Nakazawa from JST (Japan Science and Technology Agency),
Japan. This is an announcement of the 2nd Workshop on Asian
Translation (WAT2015). Those who are working on machine translation,
please join us.

Best regards,

---------------------------------------------------------------------------
WAT 2015
(The 2nd Workshop on Asian Translation)
http://lotus.kuee.kyoto-u.ac.jp/WAT/
October 16, 2015, Kyoto, Japan

Following the success of the previous Workshop on Asian Translation
(WAT2014), WAT2015 brings together machine translation researchers and
users to try, evaluate, share and discuss brand-new ideas of machine
translation. We are working toward the practical use of machine
translation among all Asian countries.

For the 2nd WAT, we adopt a new translation subtask
"Chinese-to-Japanese patent translation" in addition to the subtasks
that were conducted in WAT2014.

************************* IMPORTANT NOTICE *************************
Participants of the previous WAT2014 are also required to register for
WAT2015
********************************************************************

TASK
----

The task is to improve the text translation quality for scientific
papers and patent documents. Participants choose any of the subtasks
in which they would like to participate and translate the test data
using their machine translation systems. The WAT organizers will
evaluate the results submitted using automatic evaluation and human
evaluation. We will also provide a baseline machine translation.

Subtasks:
Scientific Paper Subtask:
Japanese --> English
English --> Japanese
Japanese --> Chinese
Chinese --> Japanese
Patent Subtask: <= NEW!
Chinese --> Japanese

Dataset:
For the scientific paper subtask, WAT uses ASPEC (Asian Scientific
Paper Excerpt Corpus) for the dataset including training, development,
development test and test data. Participants must get a copy of ASPEC
by themselves from http://lotus.kuee.kyoto-u.ac.jp/ASPEC/

For the patent subtask, WAT uses the JPO Patent Corpus, which was
constructed by the Japan Patent Office (JPO). This corpus consists of
1 million Japanese-Chinese parallel sentences from patent descriptions
with four categories. Participants of the patent subtask are required
to obtain the data from the WAT2015 site for the JPO Patent Corpus.

Automatic evaluation:
We are providing an automatic evaluation server. It is free for
everyone, but you need to create an account to submit for evaluation.
Viewing the list of evaluation results does not require an account.

Registration: http://lotus.kuee.kyoto-u.ac.jp/WAT/registration/index.html
Eval. result: http://lotus.kuee.kyoto-u.ac.jp/WAT/evaluation/index.html

Human evaluation:
Human evaluation will be carried out using crowdsourcing. Participants
can submit translation results at most twice. Each system will be
compared with the baseline system sentence by sentence in a pairwise
evaluation. The crowdsourcing workers will be asked to judge which of
the two translations is better in terms of adequacy and fluency. All
systems will be ranked by the percentage of translations judged to
improve upon the baseline system.

INVITED TALK
------------

We are planning to have an invited talk as follows:

Speaker: Dr. Haizhou Li
Research Director of the Institute for Infocomm Research in Singapore
Principal Scientist and Department Head of Human Language Technology
Title: Adequacy-Fluency Metrics: Evaluating MT in the Continuous Space Model Framework

Please visit the homepage of WAT for the biography of the speaker.
http://lotus.kuee.kyoto-u.ac.jp/WAT/#invited-talk.html

REGISTRATION
------------

The registration for task participants:
The registration fee is FREE for all participants.
http://lotus.kuee.kyoto-u.ac.jp/WAT/registration/index.html

The registration for WAT2015 audiences:
There is no need to register in advance. The registration fee is FREE
for all audiences, including participants.

IMPORTANT DATES
---------------

Crowdsourcing evaluation due TBA...
System description draft paper due TBA...
Review feedback TBA...
Camera-ready paper due TBA...
WAT 2015 October 16, 2015

PAPER
-----

Participants who submit results for human evaluation should submit
description papers of their translation systems and evaluation
results.

We strongly prefer that papers include a section entitled "Issues for
Context-aware Machine Translation" which discusses the importance and
usefulness of context.

ORGANIZERS
----------

Toshiaki Nakazawa (Japan Science and Technology Agency (JST))
Hideya Mino (National Institute of Information and Communications Technology (NICT))
Isao Goto (Japan Broadcasting Corporation (NHK))
Graham Neubig (Nara Institute of Science and Technology (NAIST))
Eiichiro Sumita (National Institute of Information and Communications Technology (NICT))
Sadao Kurohashi (Kyoto University)

CONTACT
-------

wat@nlp.ist.i.kyoto-u.ac.jp

---------------------------------------------------------------------------

--
Toshiaki Nakazawa (Researcher)
Japan Science and Technology Agency (JST)
(@ Graduate School of Informatics, Kyoto University)
Yoshida-honmachi, Sakyo-ku, Kyoto, 606-8501, Japan
tel: +81-75-753-5346, fax: +81-75-753-5962
nakazawa@pa.jst.jp / nakazawa@nlp.ist.i.kyoto-u.ac.jp

------------------------------

Message: 4
Date: Fri, 27 Mar 2015 09:04:08 -0500
From: Lane Schwartz <dowobeha@gmail.com>
Subject: [Moses-support] Two MT Marathons in 2015, US and EU (MTMA and
MTM)
To: "moses-support@mit.edu" <moses-support@mit.edu>
Message-ID:
<CABv3vZm0wg5Z=bm=V9VUTNp3aV3rF=j0ihVY4+6+CaOrKVNczQ@mail.gmail.com>
Content-Type: text/plain; charset=UTF-8

(Apologies for multiple copies.)

This is a joint announcement and call for participation and papers for

First MT Marathon in the Americas (May 10-15; Urbana-Champaign, IL, USA)
and
MT Marathon 2015 (Sept 7-12; Prague, Czech Republic, EU)

This year, you are most welcome to take part in one of (or both!)
Machine Translation Marathons. The US edition is sponsored by Bloomberg,
takes place in about a month, and emphasizes the features of a summer
school. The EU edition is organized by the EU project CRACKER and takes
place in the traditional second week of September; its underlying topic
is quality in machine translation (QT Marathon).

Machine Translation Marathon is a week-long gathering of machine
translation researchers, developers, students and users. It features:

- MT Lectures and Labs covering the basics and tutorials.
- Invited talks from experienced researchers and practitioners.
- Technical Talks about open source tools.
- Hacking Projects to advance tools or research in one week.

Details:

http://www.statmt.org/mtma15 for the US edition (registration open)
http://www.statmt.org/mtm15 for the EU edition (coming soon)

** Call for papers **

Each MT Marathon will again host an Open Source Convention to advance
the state of the art in machine translation.
We invite developers of open source tools to present their work and
submit a paper (for the Prague MT Marathon) of up to 10 pages that
describes the underlying methodology and includes instructions on how to
download and use the tools.

We are looking for stand-alone tools and extensions of existing tools,
such as the Moses open source system. Accepted papers will be presented
during the MT Marathon and published in the 104th issue of the Prague
Bulletin of Mathematical Linguistics (http://ufal.mff.cuni.cz/pbml).

------------------------------

_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support

End of Moses-support Digest, Vol 101, Issue 73
**********************************************
