Moses-support Digest, Vol 87, Issue 50

Send Moses-support mailing list submissions to
moses-support@mit.edu

To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
moses-support-request@mit.edu

You can reach the person managing the list at
moses-support-owner@mit.edu

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."


Today's Topics:

1. Kind reminder: Call for Papers: 9th SaLTMiL workshop on
"Free/open-source language resources for the machine translation
of less-resourced languages" at LREC 2014 (Mikel Forcada)
2. Re: Language modelling (Arththika Paramanathan)
3. Re: Moses Release 2.1 (Tom Hoar)


----------------------------------------------------------------------

Message: 1
Date: Tue, 21 Jan 2014 19:38:02 +0100
From: Mikel Forcada <mlf@dlsi.ua.es>
Subject: [Moses-support] Kind reminder: Call for Papers: 9th SaLTMiL
workshop on "Free/open-source language resources for the machine
translation of less-resourced languages" at LREC 2014
To: moses-support@mit.edu
Message-ID: <52DEBE8A.8060407@dlsi.ua.es>
Content-Type: text/plain; charset="windows-1252"

Call for Papers: 9th SaLTMiL workshop on "Free/open-source language
resources for the machine translation of less-resourced languages" at
LREC 2014.

A full-day workshop at LREC 2014
Tuesday, 27 May 2014.
Reykjavik (Iceland)

SALTMIL: http://ixa2.si.ehu.es/saltmil/
LREC 2014: http://lrec2014.lrec-conf.org/en/
Website: http://ixa2.si.ehu.es/saltmil/
Paper submission: https://www.softconf.com/lrec2014/SaLTMiL/

The 9th International Workshop of the Special Interest Group on Speech
and Language Technology for Minority Languages (SaLTMiL) will be held in
Reykjavík, Iceland, on May 24, 2014, as part of the 2014 International
Language Resources and Evaluation Conference (LREC). (For SALTMIL see:
http://ixa2.si.ehu.es/saltmil/); it is also framed as one of the
activities of European project Abu-Matran (http://www.abumatran.eu).
Entitled "Free/open-source language resources for the machine
translation of less-resourced languages", the workshop is intended to
continue the series of SALTMIL/LREC workshops on computational language
resources for minority languages, held in Granada (1998), Athens (2000),
Las Palmas de Gran Canaria (2002), Lisbon (2004), Genoa (2006),
Marrakech (2008), La Valetta (2010) and Istanbul (2012), and is also
expected to attract the audience of Free Rule-Based Machine Translation
workshops (2009, 2011, 2012). The workshop aims to share information on
language resources, tools and best practice, to save isolated
researchers from starting from scratch when building machine translation
for a less-resourced language. An important aspect will be the
strengthening of the free/open-source language resources community,
which can minimize duplication of effort and optimize development and
adoption, in line with the LREC 2014 hot topic "LRs in the Collaborative
Age" (http://is.gd/LREChot).

The whole-day workshop will consist of short oral papers, a poster
session preceded by a poster-boaster session (2 minutes, 2 slides per
poster), and a round table.

Papers are invited that describe research and development in the
following areas:

FOS LR for rule-based machine translation (dictionaries, rule sets)
FOS LR for statistical machine translation (corpora)
FOS tools to annotate, clean, preprocess, convert, etc. LRs for machine
translation
Machine translation as a tool for creating or enriching FOS LRs for
less-resourced languages

Position papers and (web based) demonstrations will also be considered
for presentation.

The best papers, as evaluated by the programme committee, will be
presented orally and the remaining papers will be presented in poster
format.

We expect short papers of max 6,000 words (up to 6 pages) describing
research addressing one of the above topics, to be submitted as PDF
documents by using the LREC 2014 START conference management system
(https://www.softconf.com/lrec2014/SaLTMiL/).

Submissions should be anonymized. When submitting a paper through the
START page, authors will be kindly asked to share the resources that
have been used for the work described in their paper or that are the
outcome of their research. For further information on this initiative,
please refer to
http://lrec2014.lrec-conf.org/en/calls-for-papers/lrec-2014-special-highlight/.


Submissions of papers should follow the same style as the papers for the
main LREC conference (an Author's Kit made of specific guidelines and
downloadable templates will be published on the conference web site in
due time). All contributions will be included in the workshop
proceedings (CD). They will also be published on the SALTMIL website.

The registration fees will be duly announced at the LREC 2014 site.
Registration in the workshop will include a coffee break and the
Proceedings of the Workshop. Registration will be handled by the LREC
2014 Secretariat.


Important dates

Deadline for paper submission: February 10, 2014
Notification of acceptance sent: March 3, 2014
Camera-ready paper due: March 21, 2014


Organizing committee

Joint e-mail address: saltmil2014@dlsi.ua.es

(1) Dr Francis M Tyers
Institutt for språkvitskap
Det humanistiske fakultet,
N-9037 Universitetet i Tromsø
ftyers@prompsit.com

(2) Dr Kepa Sarasola
Computer Science Faculty
Dept. of Computer Languages
The University of the Basque Country
P.K. 649 20080 DONOSTIA
Basque Country, Spain
Tel: +34 943 01 81 54
Fax: +34 943 21 93 06
ksarasola@ehu.es
http://ixa.si.ehu.es

(3) Prof Mikel L. Forcada
Dept. Llenguatges i Sistemes Informàtics
Universitat d'Alacant
E-03071 Alacant (Spain)
Tel: +34 96 590 9776
Fax: +34 96 590 9326
mlf@ua.es
http://www.dlsi.ua.es/~mlf


Programme Committee

Iñaki Alegria, Euskal Herriko Unibertsitatea, Spain
Lars Borin, Göteborgs Universitet, Sweden.
Elaine Uí Dhonnchadha, Trinity College Dublin, Ireland
Mikel L. Forcada, Universitat d'Alacant, Spain
Michael Gasser, Indiana University, USA
Måns Huldén, Helsingin Yliopisto, Finland
Krister Lindén, Helsingin Yliopisto, Finland
Nikola Ljubešić, Sveučilište u Zagrebu, Croatia
Lluís Padró, Universitat Politècnica de Catalunya, Spain
Juan Antonio Pérez-Ortiz, Universitat d'Alacant, Spain
Felipe Sánchez-Martínez, Universitat d'Alacant
Kepa Sarasola, Euskal Herriko Unibertsitatea, Spain
Kevin P. Scannell, Saint Louis University, USA
Antonio Toral, Dublin City University, Ireland
Trond Trosterud, Universitet i Tromsø, Norway
Francis M. Tyers, Universitet i Tromsø, Norway

--
Mikel L. Forcada (http://www.dlsi.ua.es/~mlf/)
Departament de Llenguatges i Sistemes Informàtics
Universitat d'Alacant
E-03071 Alacant, Spain
Phone: +34 96 590 9776
Fax: +34 96 590 9326


------------------------------

Message: 2
Date: Wed, 22 Jan 2014 08:45:46 +0530
From: Arththika Paramanathan <arthiparamanathan@gmail.com>
Subject: Re: [Moses-support] Language modelling
To: Nicola Bertoldi <bertoldi@fbk.eu>, moses-support
<moses-support@mit.edu>
Message-ID:
<CAJSfqEyhVfXSdG7Yk=KupV+AcCO3D0X4pBZ9AErmheOa25JxBQ@mail.gmail.com>
Content-Type: text/plain; charset="iso-8859-1"

In Moses, English is assumed to be the target language and the other
language the source (foreign) language, so we can translate a foreign
language into English (in my case, Tamil-English). I want to translate
English-Tamil. So what do I need to change (in the train-model.perl file)?
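
A note on direction: as far as the standard options go, the "foreign" and "English" wording in train-model.perl only names the source and target sides, which are selected by the -f and -e corpus extensions. Reversing the direction is therefore usually a matter of swapping the two extensions and supplying a language model trained on the new target language, rather than editing the script. The following is a minimal Python sketch that assembles such a command line. The helper name, corpus stem and LM path are hypothetical, and a real run needs further options (for example -external-bin-dir for the word aligner).

```python
def train_model_args(source_ext, target_ext, corpus_stem, lm_path,
                     root_dir="train", lm_order=3):
    """Build an argument list for train-model.perl (illustrative helper only).

    source_ext / target_ext are the file extensions of the parallel corpus,
    e.g. corpus_stem + ".en" and corpus_stem + ".ta".
    """
    return [
        "train-model.perl",
        "-root-dir", root_dir,
        "-corpus", corpus_stem,
        "-f", source_ext,   # source ("foreign") side
        "-e", target_ext,   # target ("English") side
        # -lm takes factor:order:filename; the LM must be in the target language
        "-lm", f"0:{lm_order}:{lm_path}",
    ]


# Tamil -> English (the direction described above)
ta_en = train_model_args("ta", "en", "corpus/train.clean", "/path/to/lm.en")

# English -> Tamil: swap the extensions and point to a Tamil language model
en_ta = train_model_args("en", "ta", "corpus/train.clean", "/path/to/lm.ta")

print(" ".join(en_ta))
```

The list form can be passed to subprocess.run once the remaining options for a given installation have been added.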


On Wed, Jan 22, 2014 at 8:37 AM, Arththika Paramanathan <
arthiparamanathan@gmail.com> wrote:

> Hi Nicola,
> Thank you for your response.
>
> I think that building an LM with IRSTLM takes 4 or 5 steps.
> In step 1, it splits the corpus into 1-grams with their frequency counts
> (there is no sorting here).
> In step 2, it splits this dictionary into 3 dictionaries (balanced n-gram
> lists). Here, the threshold is approximately the total number of words
> divided by 3. Is that correct?
> In step 3, it collects n-grams for each dictionary, i.e. for each word in
> each split dictionary, it searches for 3-grams and puts them in a separate
> file.
> Then I don't understand the next step (the ARPA file).
> How are these values calculated?
> -3.72202 <s> -0.598275
> -3.17795 illegal -0.60206
> -2.42099 folder -0.500602
> -2.53169 name -0.723104
>
> Can you please explain how to calculate them?
>
>
>
>
>
>
>
> On Tue, Jan 21, 2014 at 10:46 PM, Nicola Bertoldi <bertoldi@fbk.eu> wrote:
>
>> Hi Arththika,
>>
>>
>> (1) In language modelling,
>> how does IRSTLM split the dictionary extracted from the corpus into 3
>> dictionaries?
>> How are n-gram counts calculated?
>>
>>
>>
>> I would like to answer your first question,
>> as the person responsible for the IRSTLM toolkit.
>>
>> If anything is unclear, please reply privately to me only.
>>
>>
>> I suppose you are using the build-lm.sh script from IRSTLM.
>>
>> The script splits the dictionary, sorted by 1-gram frequency,
>> in such a way that the global frequency of each part is balanced.
>>
>> In this way the corresponding partitions of the n-grams are balanced as
>> well.
>> Each n-gram is assigned to a partition according to its first
>> token.
>>
>> I am not sure what you mean by the second part of the question.
>>
>> best regards,
>> Nicola
>>
>>
>>
>>
>> On Jan 20, 2014, at 7:34 PM, Arththika Paramanathan wrote:
>>
>> Hi,
>>
>> (2) And, if English is the foreign language, what do I need to change (in
>> the train-model.perl file)?
>>
>> (3) Can anyone tell me how to use a Perl module? I want to use the
>> module named Locale-Maketext-Lexicon-0.97 to extract translatable strings
>> from po files.
>>
>>
>>
>> --
>> regards,
>> P.Arththika
>> _______________________________________________
>> Moses-support mailing list
>> Moses-support@mit.edu
>> http://mailman.mit.edu/mailman/listinfo/moses-support
>>
>>
>
>
> --
> regards,
> P.Arththika
>
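
The balanced split described in the quoted reply can be illustrated with a short Python sketch. This is not the code of build-lm.sh or of IRSTLM itself; it only assumes what the reply states: the dictionary is ordered by 1-gram frequency, cut so that each part carries roughly the same total frequency, and each n-gram goes to the part that contains its first token. The function names and the toy corpus are hypothetical.

```python
from collections import Counter


def split_dictionary(tokens, parts=3):
    """Split the unigram dictionary into `parts` lists of (word, count)
    whose total counts are roughly equal."""
    counts = Counter(tokens)
    # Descending frequency; ties broken alphabetically to stay deterministic.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    target = sum(counts.values()) / parts
    buckets = [[] for _ in range(parts)]
    running, index = 0, 0
    for word, freq in ordered:
        buckets[index].append((word, freq))
        running += freq
        # Move to the next part once this part's share of the mass is reached.
        if index < parts - 1 and running >= target * (index + 1):
            index += 1
    return buckets


def ngrams_for_bucket(tokens, bucket, n=3):
    """Collect the n-grams whose first token belongs to the given bucket."""
    words = {word for word, _ in bucket}
    return [tuple(tokens[i:i + n])
            for i in range(len(tokens) - n + 1)
            if tokens[i] in words]


corpus = "a a a a b b c c d e".split()
buckets = split_dictionary(corpus, parts=3)
partitions = [ngrams_for_bucket(corpus, bucket, n=3) for bucket in buckets]
for bucket, part in zip(buckets, partitions):
    print(bucket, len(part))
```

Because every n-gram has exactly one first token, the partitions are disjoint and can be counted independently, which is the point of splitting in the first place.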



--
regards,
P.Arththika
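
On the ARPA lines quoted in the thread: in the ARPA file format, each entry holds the base-10 logarithm of the smoothed conditional probability, then the n-gram itself, then (optionally) the base-10 logarithm of the back-off weight for that n-gram. The probabilities themselves depend on the smoothing method selected when the LM is estimated, which is not reproduced here. The Python sketch below only shows how to read such lines and how the two numbers combine when a bigram is not listed. The function names are hypothetical, and the "illegal folder" bigram is assumed to be absent purely for illustration.

```python
def parse_arpa_entry(line, order=1):
    """Parse one ARPA n-gram line into (words, log10_prob, log10_backoff)."""
    fields = line.split()
    log10_prob = float(fields[0])
    words = tuple(fields[1:1 + order])
    # The back-off weight is optional; a missing one means log10(1) = 0.
    log10_backoff = float(fields[1 + order]) if len(fields) > 1 + order else 0.0
    return words, log10_prob, log10_backoff


def backed_off_bigram_log10(history_line, word_line):
    """log10 P(word | history) when the bigram itself is not in the model:
    back-off weight of the history plus unigram log-probability of the word."""
    _, _, history_backoff = parse_arpa_entry(history_line)
    _, word_log10_prob, _ = parse_arpa_entry(word_line)
    return history_backoff + word_log10_prob


words, log10_prob, log10_backoff = parse_arpa_entry("-3.17795 illegal -0.60206")
probability = 10 ** log10_prob  # smoothed unigram probability of "illegal"

# Assumed-unseen bigram "illegal folder": bow(illegal) + log10 P(folder)
score = backed_off_bigram_log10("-3.17795 illegal -0.60206",
                                "-2.42099 folder -0.500602")
print(words, probability, score)
```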

------------------------------

Message: 3
Date: Wed, 22 Jan 2014 10:59:32 +0700
From: Tom Hoar <tahoar@precisiontranslationtools.com>
Subject: Re: [Moses-support] Moses Release 2.1
To: moses-support@mit.edu
Message-ID: <52DF4224.1040100@precisiontranslationtools.com>
Content-Type: text/plain; charset="iso-8859-1"

Congratulations. Good job, everyone!


On 01/21/2014 08:07 PM, Hieu Hoang wrote:
> Ladies and Gentlemen, Boys and Girls
>
> It is my pleasure to announce the official release of Moses, version
> 2.1. It's taken a year to do and there are lots of changes.
>
> The most noticeable is that the moses.ini file format has changed, due
> to lots of behind-the-scenes refactoring to make it easier to extend
> Moses. The documentation throughout the website has been updated to
> reflect the changes. However, the decoder is still compatible with the
> old ini file in most cases.
>
> More people than ever are contributing to the Moses toolkit. Here are
> just some of the things that they've done:
> 1. Transliteration Phrase-Table by Nadir Durrani
> 2. DALM integration by Jun-ya Norimatsu.
> 3. CoveredReferenceFeature by Ales Tamchyna. Also, constrained
> Decoding by Hieu Hoang.
> 4. Picaro by Jason Riesa.
> 5. Neural LM by Lane Schwartz.
> 6. Tokenization configuration files for Greek and Tamil by Dimitris
> Mavroeidis and Arththika Paramanathan.
> 7. DIMwid by Robin Kurtz.
> 8. Placeholder by Achim Ruopp and Hieu Hoang.
> 9. Backward LM by Lane Schwartz.
> 10. Multimodel phrase-table by Rico Sennrich.
> 11. Operation Sequence Model by Nadir Durrani.
> 12. Alternate weight setting by Philipp Koehn.
> 13. Lattice and confusion network phrase-based decoding with any
> phrase-tables by Hieu Hoang.
> 14. Ondisk phrase-table for phrase-based model by Hieu Hoang.
> 15. Clearer error messages by Hieu Hoang.
> 16. Updated Windows GUI by Jie Jiang
>
> Please see the release notes for more details
> http://www.statmt.org/moses/RELEASE-2.1/Mosesv2.1releasenotes.pdf
>
> The code can be downloaded from github, or at:
> http://www.statmt.org/moses/RELEASE-2.1/mosesdecoder.v21.tar.gz
>
> Moses is available in a number of different ways:
> 1. As source code, which you compile yourself
> 2. Compiled binaries, for your OS
> 3. Virtual machines (Linux 32 and 64 bits) with Moses pre-installed.
> 4. Amazon EC2 images
> More details in the release notes.
>
> Happy MT'ing!
>
>
>
> --
> Hieu Hoang
> Research Associate
> University of Edinburgh
> http://www.hoang.co.uk/hieu
>
>
>
> _______________________________________________
> Moses-support mailing list
> Moses-support@mit.edu
> http://mailman.mit.edu/mailman/listinfo/moses-support


------------------------------

_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


End of Moses-support Digest, Vol 87, Issue 50
*********************************************
