Send Moses-support mailing list submissions to
moses-support@mit.edu
To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
moses-support-request@mit.edu
You can reach the person managing the list at
moses-support-owner@mit.edu
When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."
Today's Topics:
1. Re: compilation issues with last version of moses
(Kenneth Heafield)
2. announcements from ASPEC and WAT (Toshiaki Nakazawa)
3. TSD 2014 - Call for Demonstrations and Participation (TSD 2014)
4. Translating a (literature) short story with moses...
(Laurent Besacier)
----------------------------------------------------------------------
Message: 1
Date: Fri, 18 Jul 2014 10:06:48 +0800
From: Kenneth Heafield <moses@kheafield.com>
Subject: Re: [Moses-support] compilation issues with last version of
moses
To: moses-support@mit.edu
Message-ID: <53C88138.4060901@kheafield.com>
Content-Type: text/plain; charset=windows-1252
Hi,
Your architecture is x86 or x86_64, right? And if you checkout master
from https://github.com/kpu/kenlm , it has the same error?
It should have built the programs anyway. Can you please run
bin/build_binary lm/test.arpa lm/test.binary
and send me the lm/test.binary file? Also it would help to know if this
is with probing or trie data structures. Can you please attach the full
build output and send it to me? The failing test names will tell me that.
Finally, what filesystem are you using? Any network mounts?
Kenneth
On 07/16/14 16:18, Pierre Lison wrote:
>
> Hi Hieu, Ken,
>
>>> 2) The test suite for the language model in "lm/model_test.cc" returns 4 failures, with the following message:
>>> "Vocabulary words are in the wrong place. This could be because the binary file was built with stale gcc and old kenlm. [...]"
>>>
>>> This is strange, since I'm compiling the latest version which, as far as I can see, also contains a recent version of KenLM. The problem arises both when I'm compiling with gcc 4.9 or with the Intel compiler.
>> what distro are you using and does it happen on an older gcc?
>
> I'm compiling moses on a Linux-based computer cluster (using 64 bit CentOS 6). Yes, I just recompiled the whole thing with gcc4.4.7, and the same error occurs. I'm using the boost libraries 1.55.0.
>
> Cheers,
>
> --
> Pierre Lison (Postdoctoral Research Fellow)
> Department of Informatics, University of Oslo
> Mobile: +47.967.998.12
> Web: http://folk.uio.no/plison
>
>
> _______________________________________________
> Moses-support mailing list
> Moses-support@mit.edu
> http://mailman.mit.edu/mailman/listinfo/moses-support
>
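A minimal sketch of the check requested above, not part of the original message. It assumes the default layout of a built KenLM or Moses checkout and relies on build_binary accepting an optional data structure type (probing or trie) before the ARPA file. The function name build_both and the output and log file names are hypothetical, chosen only for illustration.

```shell
# Build the test model once per data structure and keep a log of each run,
# so the binary file and the matching output can be sent along together.
build_both() {
    tool=$1      # path to build_binary, e.g. bin/build_binary
    arpa=$2      # ARPA input, e.g. lm/test.arpa
    prefix=$3    # prefix for output files (hypothetical naming scheme)

    if [ ! -x "$tool" ] || [ ! -f "$arpa" ]; then
        echo "skipping: $tool or $arpa not found" >&2
        return 1
    fi

    for structure in probing trie; do
        "$tool" "$structure" "$arpa" "$prefix.$structure.binary" \
            > "$prefix.$structure.log" 2>&1 \
            || echo "build_binary failed for $structure; see $prefix.$structure.log" >&2
    done
}
```

From the top of the checkout this could be called as: build_both bin/build_binary lm/test.arpa lm/test. It would leave lm/test.probing.binary and lm/test.trie.binary next to their logs.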
------------------------------
Message: 2
Date: Fri, 18 Jul 2014 11:42:50 +0900
From: Toshiaki Nakazawa <nakazawa@pa.jst.jp>
Subject: [Moses-support] announcements from ASPEC and WAT
To: moses-support@mit.edu
Message-ID: <m2r41jxu8l.wl%nakazawa@pa.jst.jp>
Content-Type: text/plain; charset=US-ASCII
Dear all,
I'm Toshiaki Nakazawa from JST (Japan Science and Technology Agency),
Japan. Here are four announcements about ASPEC and WAT.
1. User agreement of ASPEC has been updated
Recently, the user agreement of ASPEC (Asian Scientific Paper Excerpt
Corpus) has been updated to be more accommodating to researchers
working for companies. Please feel free to try ASPEC if you are
interested.
2. Automatic evaluation server is open and registration has been started
The automatic evaluation server for WAT (the 1st Workshop of
Asian Translation, which uses ASPEC as the dataset) has been
opened. It is free for everyone, but you need to create an account
for evaluation. Just showing the list of evaluation results does not
require an account.
Registration: http://lotus.kuee.kyoto-u.ac.jp/WAT/registration/index.html
Eval. result: http://lotus.kuee.kyoto-u.ac.jp/WAT/evaluation/index.html
3. Date and place of WAT have been fixed and schedule has been changed
The date and place of WAT have been fixed as follows:
Date: 4th, October, 2014
Place: Lecture Room 241, 4th floor, Faculty of Engineering Bldg.2,
Hongo Campus, The University of Tokyo, Japan
http://www.u-tokyo.ac.jp/campusmap/cam01_04_03_e.html
Also, due to the late opening of the automatic evaluation server, we
have changed the schedule of WAT as follows:
Crowdsourcing evaluation due: July 31 -> August 31
Draft paper due: August 31 -> September 14
Review feedback: September 7 -> September 21
Camera-ready paper due: September 14 -> September 28
Workshop: October 4
4. Invited speaker at WAT
We are planning to have an invited talk as follows:
Speaker: Dr. Ir. Hammam Riza, Director,
Agency for the Assessment and Application of Technology (BPPT)
Title: Leveraging ASEAN economic communities 2015 through Language Translation
Please visit the homepage of WAT for the biography of the speaker.
For details, please refer to the following URLs.
ASPEC: http://lotus.kuee.kyoto-u.ac.jp/ASPEC/
WAT: http://lotus.kuee.kyoto-u.ac.jp/WAT/
Best regards,
------------------------------
Message: 3
Date: Tue, 15 Jul 2014 22:13:56 +0200
From: TSD 2014 <xrambous@aurora.fi.muni.cz>
Subject: [Moses-support] TSD 2014 - Call for Demonstrations and
Participation
To: tsd2014@tsdconference.org
Message-ID: <E1X796q-0001EI-Nd@aurora.fi.muni.cz>
*********************************************************
TSD 2014 - CALL FOR DEMONSTRATIONS AND PARTICIPATION
*********************************************************
Seventeenth International Conference on TEXT, SPEECH and DIALOGUE (TSD 2014)
Brno, Czech Republic, 8-12 September 2014
http://www.tsdconference.org/
SUBMISSION OF DEMONSTRATION ABSTRACTS
Authors are invited to present actual projects, developed software and
hardware or interesting material relevant to the topics of the
conference. The authors of the demonstrations should provide the
abstract not exceeding one page as plain text. The submission must be
made using the online form available at the conference www pages.
The accepted demonstrations will be presented during a special
Demonstration Session (see the Demo Instructions at
www.tsdconference.org). Demonstrators can present their contribution
with their own notebook with an Internet connection provided by the
organisers or the organisers can prepare a PC computer with multimedia
support for demonstrators.
IMPORTANT DATES
August 3 2014 ............ Submission of demonstration abstracts
August 10 2014 ............ Notification of acceptance for
demonstrations sent to the authors
September 8-12 2014 ....... Conference dates
The demonstration abstracts will not appear in the Proceedings of TSD
2014 but they will be published electronically at the conference website.
KEYNOTE SPEAKERS
Ralph Grishman, New York University, USA
Active Learning for Information Extraction
Bernardo Magnini, FBK - Fondazione Bruno Kessler, Italy
Entailment graphs for text analytics
Salim Roukos, IBM, USA
Recent Progress in Statistical Machine Translation: Algorithms and Applications
The conference is organized by the Faculty of Informatics, Masaryk
University, Brno, and the Faculty of Applied Sciences, University of
West Bohemia, Pilsen. The conference is supported by International
Speech Communication Association.
Venue: Brno, Czech Republic
TSD SERIES
TSD series evolved as a prime forum for interaction between researchers in
both spoken and written language processing from all over the world.
Proceedings of TSD form a book published by Springer-Verlag in their
Lecture Notes in Artificial Intelligence (LNAI) series. TSD Proceedings
are regularly indexed by Thomson Reuters Conference Proceedings Citation
Index. Moreover, LNAI series are listed in all major citation databases
such as DBLP, SCOPUS, EI, INSPEC or COMPENDEX.
TOPICS
Topics of the conference will include (but are not limited to):
Corpora and Language Resources (monolingual, multilingual,
text and spoken corpora, large web corpora, disambiguation,
specialized lexicons, dictionaries)
Speech Recognition (multilingual, continuous, emotional
speech, handicapped speaker, out-of-vocabulary words,
alternative way of feature extraction, new models for
acoustic and language modelling)
Tagging, Classification and Parsing of Text and Speech
(morphological and syntactic analysis, synthesis and
disambiguation, multilingual processing, sentiment analysis,
credibility analysis, automatic text labeling, summarization,
authorship attribution)
Speech and Spoken Language Generation (multilingual, high
fidelity speech synthesis, computer singing)
Semantic Processing of Text and Speech (information
extraction, information retrieval, data mining, semantic web,
knowledge representation, inference, ontologies, sense
disambiguation, plagiarism detection)
Integrating Applications of Text and Speech Processing
(machine translation, natural language understanding,
question-answering strategies, assistive technologies)
Automatic Dialogue Systems (self-learning, multilingual,
question-answering systems, dialogue strategies, prosody in
dialogues)
Multimodal Techniques and Modelling (video processing, facial
animation, visual speech synthesis, user modelling, emotions
and personality modelling)
Papers on processing of languages other than English are strongly
encouraged.
PROGRAM COMMITTEE
Hynek Hermansky, USA (general chair)
Eneko Agirre, Spain
Genevieve Baudoin, France
Paul Cook, Australia
Jan Cernocky, Czech Republic
Simon Dobrisek, Slovenia
Karina Evgrafova, Russia
Darja Fiser, Slovenia
Radovan Garabik, Slovakia
Alexander Gelbukh, Mexico
Louise Guthrie, GB
Jan Hajic, Czech Republic
Eva Hajicova, Czech Republic
Yannis Haralambous, France
Ludwig Hitzenberger, Germany
Jaroslava Hlavacova, Czech Republic
Ales Horak, Czech Republic
Eduard Hovy, USA
Maria Khokhlova, Russia
Daniil Kocharov, Russia
Ivan Kopecek, Czech Republic
Valia Kordoni, Germany
Steven Krauwer, The Netherlands
Siegfried Kunzmann, Germany
Natalija Loukachevitch, Russia
Vaclav Matousek, Czech Republic
Diana McCarthy, United Kingdom
France Mihelic, Slovenia
Hermann Ney, Germany
Elmar Noeth, Germany
Karel Oliva, Czech Republic
Karel Pala, Czech Republic
Nikola Pavesic, Slovenia
Fabio Pianesi, Italy
Maciej Piasecki, Poland
Adam Przepiorkowski, Poland
Josef Psutka, Czech Republic
James Pustejovsky, USA
German Rigau, Spain
Leon Rothkrantz, The Netherlands
Anna Rumshisky, USA
Milan Rusko, Slovakia
Mykola Sazhok, Ukraine
Pavel Skrelin, Russia
Pavel Smrz, Czech Republic
Petr Sojka, Czech Republic
Stefan Steidl, Germany
Georg Stemmer, Germany
Marko Tadic, Croatia
Tamas Varadi, Hungary
Zygmunt Vetulani, Poland
Pascal Wiggers, The Netherlands
Yorick Wilks, GB
Marcin Wolinski, Poland
Victor Zakharov, Russia
FORMAT OF THE CONFERENCE
The conference program will include presentation of invited papers,
oral presentations, and poster/demonstration sessions. Papers will
be presented in plenary or topic oriented sessions.
Social events including a trip in the vicinity of Brno will allow
for additional informal interactions.
OFFICIAL LANGUAGE
The official language of the conference is English.
ACCOMMODATION
The organizing committee will arrange discounts on accommodation in
the 4-star hotel at the conference venue. The current prices of the
accommodation are available at the conference website.
ADDRESS
All correspondence regarding the conference should be
addressed to
Ales Horak, TSD 2014
Faculty of Informatics, Masaryk University
Botanicka 68a, 602 00 Brno, Czech Republic
phone: +420-5-49 49 18 63
fax: +420-5-49 49 18 20
email: tsd2014@tsdconference.org
The official TSD 2014 homepage is: http://www.tsdconference.org/
LOCATION
Brno is the second largest city in the Czech Republic with a
population of almost 400.000 and is the country's judiciary and
trade-fair center. Brno is the capital of South Moravia, which is
located in the south-east part of the Czech Republic and is known
for a wide range of cultural, natural, and technical sights.
South Moravia is a traditional wine region. Brno had been a Royal
City since 1347 and with its six universities it forms a cultural
center of the region.
Brno can be reached easily by direct flights from London, Moscow,
and Eindhoven, and by trains or buses from Prague (200 km) or Vienna
(130 km).
For the participants with some extra time, nearby places may
also be of interest. Local ones include: Brno Castle now called
Spilberk, Veveri Castle, the Old and New City Halls, the
Augustine Monastery with St. Thomas Church and crypt of Moravian
Margraves, Church of St. James, Cathedral of St. Peter & Paul,
Carthusian Monastery in Kralovo Pole, the famous Villa Tugendhat
designed by Mies van der Rohe along with other important
buildings of between-war Czech architecture.
For those willing to venture out of Brno, Moravian Karst with
Macocha Chasm and Punkva caves, battlefield of the Battle of
three emperors (Napoleon, Russian Alexander and Austrian Franz
- Battle by Austerlitz), Chateau of Slavkov (Austerlitz),
Pernstejn Castle, Buchlov Castle, Lednice Chateau, Buchlovice
Chateau, Letovice Chateau, Mikulov with one of the largest Jewish
cemeteries in Central Europe, Telc - a town on the UNESCO
heritage list, and many others are all within easy reach.
------------------------------
Message: 4
Date: Fri, 18 Jul 2014 15:10:12 +0200
From: Laurent Besacier <laurent.besacier@imag.fr>
Subject: [Moses-support] Translating a (literature) short story with
moses...
To: moses-support <moses-support@mit.edu>
Message-ID: <98D21B08-D0BB-4FD9-8E65-A950C5D65FC6@imag.fr>
Content-Type: text/plain; charset="iso-8859-1"
hello,
I was wondering if the pipeline machine translation & post-edition (MT+PE) was usable to translate a literary work (fiction, short story) and thus tried to bring a preliminary answer to this question.
I applied that to a short story by American writer Richard Powers, still not available in French, which was automatically translated (with an en-fr SMT system based on moses), post-edited, and then revised by non-professional translators.
Overall, it took 25 hours of human work for a 10k words story whereas the official French translator of R. Powers told me it would have taken him more than 60h.
But more important, quality was also analyzed with 10 readers of the translated novel + a survey, etc.
In case anyone is interested, all the data collected during this experiment (source, MT output, post-edited text, revised text, final short story translated in French, survey for readers) is available at the links below
https://github.com/powersmachinetranslation/DATA
https://github.com/powersmachinetranslation/DATA/blob/master/README
So far, only a scientific paper in French was written about this - it can be found on https://github.com/powersmachinetranslation/DATA/blob/master/taln-besacier.pdf
Feel free to use/analyze the data if you are interested in it; and if you have remarks (or want to know more) about this experiment, please let me know...
best regards
laurent besacier
------------------------
Laurent Besacier
Professeur à l'Université Joseph Fourier (Grenoble 1)
Laboratoire d'Informatique de Grenoble (LIG)
Membre Junior de l'Institut Universitaire de France (IUF 2012-2017)
laurent.besacier@imag.fr
-------------------------
------------------------------
_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support
End of Moses-support Digest, Vol 93, Issue 21
*********************************************