Send Moses-support mailing list submissions to
moses-support@mit.edu
To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
moses-support-request@mit.edu
You can reach the person managing the list at
moses-support-owner@mit.edu
When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."
Today's Topics:
1. Fw: positive log probability (arie ardiyanti)
2. Re: Fw: positive log probability (Marcin Junczys-Dowmunt)
3. Moses Server (Abdelfetah Boumerdas)
4. TweetMT 2015: Second call (Cristina)
----------------------------------------------------------------------
Message: 1
Date: Tue, 28 Apr 2015 05:56:18 +0000 (UTC)
From: arie ardiyanti <rie006@yahoo.com>
Subject: [Moses-support] Fw: positive log probability
To: Moses-support <moses-support@mit.edu>
Message-ID:
<227197137.154485.1430200578989.JavaMail.yahoo@mail.yahoo.com>
Content-Type: text/plain; charset="utf-8"
Dear Moses-support team,
I am new to Moses. Please help me with these questions:
1. I got the following error when building the binary language model. I checked the ARPA file and found some positive values in the 3-gram section. How can I handle this?
lm/read_arpa.cc:151 in void lm::PositiveProbWarn::Warn(float) threw FormatLoadException'.
Positive log probability 0.130482 in the model. This is a bug in IRSTLM; you can set config.positive_log_probability = SILENT or pass -i to build_binary to substitute 0.0 for the log probability. Error in the 3-gram at byte 134014 Byte: 134014 ERROR
2. I have done a Sundanese-to-Indonesian translation using a very small corpus. I found that translation using POS tags is worse than translation without POS tags (using only surface words). Is that possible, or does it mean that something is wrong with my tagging process?
Thank you for your help.
Best regards,
------------------------------
Message: 2
Date: Tue, 28 Apr 2015 08:43:45 +0200
From: Marcin Junczys-Dowmunt <junczys@amu.edu.pl>
Subject: Re: [Moses-support] Fw: positive log probability
To: moses-support@mit.edu
Message-ID: <553F2C21.30308@amu.edu.pl>
Content-Type: text/plain; charset="windows-1252"
Hi,
The message tells you what to do:
"or pass -i to build_binary to substitute 0.0 for the log probability"
Run
build_binary -i ...
Best,
Marcin
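Before substituting 0.0 with -i, it can be useful to see which entries are affected. The following is a minimal sketch, assuming a standard ARPA text file in which each n-gram line starts with a log10 probability followed by the words. The function name find_positive_logprobs is illustrative and is not part of KenLM, IRSTLM, or Moses.

```python
import re

# Matches ARPA section headers such as "\1-grams:" or "\3-grams:".
SECTION_HEADER = re.compile(r"^\\(\d+)-grams:$")


def find_positive_logprobs(lines):
    """Return (order, line_number, log_prob, ngram) for every positive entry.

    `lines` is any iterable of text lines, for example an open file.
    Assumes each n-gram line looks like: log10_prob, words, optional backoff.
    """
    hits = []
    order = None  # None until the first "\N-grams:" header is seen
    for line_number, raw in enumerate(lines, start=1):
        line = raw.strip()
        header = SECTION_HEADER.match(line)
        if header:
            order = int(header.group(1))
            continue
        if line == "\\end\\":
            break
        if order is None or not line:
            continue  # still in the "\data\" header, or a blank line
        fields = line.split()
        try:
            log_prob = float(fields[0])
        except ValueError:
            continue  # not an n-gram entry
        if log_prob > 0.0:
            ngram = " ".join(fields[1:1 + order])
            hits.append((order, line_number, log_prob, ngram))
    return hits
```

To check a model on disk, open it as text and pass the file object to the function, for example `find_positive_logprobs(open("model.arpa", encoding="utf-8"))`, where the path is a placeholder. An empty list means no positive log probabilities were found.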
On 28.04.2015 at 07:56, arie ardiyanti wrote:
>
> Dear Moses-support team,
>
> I am new to Moses. Please help me with these questions:
> 1. I got the following error when building the binary language model. I
> checked the ARPA file and found some positive values in the 3-gram
> section. How can I handle this?
> lm/read_arpa.cc:151 in void lm::PositiveProbWarn::Warn(float)
> threw FormatLoadException'.
> Positive log probability 0.130482 in the model. This is a bug in
> IRSTLM; you can set config.positive_log_probability = SILENT or
> pass -i to build_binary to substitute 0.0 for the log probability.
> Error in the 3-gram at byte 134014 Byte: 134014 ERROR
> 2. I have done a Sundanese-to-Indonesian translation using a very
> small corpus. I found that translation using POS tags is worse than
> translation without POS tags (using only surface words). Is that
> possible, or does it mean that something is wrong with my tagging process?
>
> thank you for your help.
> best regards,
------------------------------
Message: 3
Date: Tue, 28 Apr 2015 11:48:02 +0100
From: Abdelfetah Boumerdas <aa_boumerdas@esi.dz>
Subject: [Moses-support] Moses Server
To: moses-support <moses-support@mit.edu>
Message-ID:
<CABJLC3c5U4Ujw6ZET0df4Rhh_3C_mgv6ST=Sr2U8tT6P+v8wBg@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"
Dear Moses Users/Developers
I'm working on a web application that uses Moses, and I want to make it work
with the Moses server. Here is what I want to do:
- I want to let the user choose the options that Moses runs with, for
example the stack size.
- I want to send these options, along with the text to be translated, to the
Moses server.
- After that, I want the Moses server to print everything into a file
that I will use to extract the information I need for my application.
In doing so I ran into the following problems, and I would be very grateful
if you could tell me how to get around them:
1. Using the xmlrpc library, can I send options to the Moses server? For
example, from a Python program using xmlrpc, can I tell the Moses
server to translate a text with a stack size of 3 (-s 3)?
2. Can I launch the Moses server process once and then send it a
different moses.ini file to work on each time?
3. The Moses server is launched once and then listens for incoming
messages, as a server should. When I print the results to a file, the
server appends the lines of each execution to the file it was given at
startup, but I want it to overwrite that file each time. How can I do that?
Thank you in advance.
--
BOUMERDAS Abdelfetah
5ème Année Option Systèmes Informatiques (SIQ)
Ecole nationale Supérieure d'Informatique ESI (ex INI)
BP 68 M Oued Smar 16309 - ALGER
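One client-side way to approach questions 1 and 3 above can be sketched as follows. It assumes the server exposes an XML-RPC method named translate that takes a struct with a "text" key, as in the sample client distributed with Moses, and that it listens at a URL such as http://localhost:8080/RPC2. The helper names make_proxy and translate_to_file, the extra_params pass-through, and the output format are all hypothetical. Whether the server honours any per-request option (such as a stack size) depends on the Moses version and is not confirmed here. Overwriting is done by the client rather than by the server.

```python
import json
import xmlrpc.client


def make_proxy(url):
    """Create an XML-RPC proxy; no connection is opened until a call is made."""
    return xmlrpc.client.ServerProxy(url)


def translate_to_file(proxy, text, out_path, extra_params=None):
    """Send one translation request and overwrite out_path with the reply.

    extra_params is a hypothetical pass-through for per-request options;
    the keys the server accepts are not known here and must be checked
    against the server version in use.
    """
    params = {"text": text}
    params.update(extra_params or {})
    result = proxy.translate(params)  # assumed method name
    # Mode "w" truncates the file, so each request replaces the previous output.
    with open(out_path, "w", encoding="utf-8") as handle:
        json.dump(result, handle, ensure_ascii=False, indent=2)
    return result
```

A call would look like `translate_to_file(make_proxy("http://localhost:8080/RPC2"), "some source text", "last_result.json")`, with the URL and file name as placeholders. Keeping the file handling in the client avoids relying on how the server itself opens its output file.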
------------------------------
Message: 4
Date: Tue, 28 Apr 2015 13:13:08 +0200
From: Cristina <cristinae@cs.upc.edu>
Subject: [Moses-support] TweetMT 2015: Second call
To: moses-support@mit.edu, mt-list <mt-list@eamt.org>
Message-ID:
<CAL0MP8hEqxHYXLcL7H6mpFOnoQOgq44-S-S43E4Jpi7dzaxL6w@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"
Apologies for multiple postings
*************************************************************************
TweetMT 2015
We would like to share some updates on the TweetMT translation task.
The development sets are now available on the web page, and some public data
has been included. The date of the workshop is now confirmed as September
15. We would also like to remind you that registration remains open
until May 12.
Important dates
- *March 1*: Registration opened
- *April 21*: Release of the development set
- *May 12*: *Registration deadline*
- *May 19*: Release of the test set
- *May 21*: Result submission deadline
- *May 22-June 12*: Manual evaluation. Publication of results
- *July 3*: Short paper submission deadline
- *July 31*: Papers' camera-ready version
- *September 15*: Workshop
You can find more information on the website
http://komunitatea.elhuyar.org/tweetmt/
Tweet Translation Workshop at SEPLN 2015
TweetMT is a workshop and shared task on machine translation applied to
tweets. It will take place in September, 2015, in Alicante, co-located with
SEPLN 2015. The objective of the task is to bring together interested
researchers to join forces to experiment with and compare different
approaches to tweet MT. This workshop is a follow-up to two other workshops
organized previously also at SEPLN: TweetNorm2013 and TweetLID2014.
The machine translation of tweets is a complex task that greatly depends on
the type of data we work with. Translating tweets is very
different from translating well-formed texts posted, for instance, through a
content manager. Tweets are often written on mobile devices, which worsens
the already poor spelling, and they include errors, symbols and
diacritics. Tweets also vary in structure and
include tweet-specific features such as hashtags, user mentions, and
retweets, among others. The translation of tweets can be tackled as a
direct translation (tweet-to-tweet) or as an indirect translation (tweet
normalization to standard text (Kaufmann & Kalita, 2011), text translation
and, if needed, tweet generation). Although the first approach looks
attractive, the lack of parallel or comparable tweets for the working
languages (Petrovic et al., 2010) tends to lead us towards an indirect
approach. Some authors also try to gather similar tweets in other languages
(CLIR).
Work in this area is scarce in the literature but a growing interest is
evident (Gotti et al., 2013). An important point of reference is the work
done to translate SMS texts during the Haiti earthquake (Munro, 2010).
The current task will focus on MT of tweets between languages of the
Iberian Peninsula (Basque, Catalan, Galician, Portuguese and Spanish), as
well as English. The organizing committee has released development data
including parallel tweets that will enable participants to train their
systems. For the final evaluation participants will have to submit the
automatic translation of a number of tweet corpora in a short period of
time. The evaluation will be carried out using automatic distances to the
reference corpora.
These corpora are not meant to be representative of all types of messages
that can be observed in informal communication. This is instead an initial
attempt at tackling the task, starting with one of its
simplest parts. We are planning to use more informal and varied corpora in
future tasks as we make progress on these initial issues.
The workshop aims to be a forum where researchers will have a chance to
compare their methods, systems and results.
Organizing Committee
Iñaki Alegria (UPV/EHU)
Nora Aranberri (UPV/EHU)
Cristina España-Bonet (UPC)
Pablo Gamallo (USC)
Eva Martínez (UPC)
Hugo Oliveira (Universidade de Coimbra)
Iñaki San Vicente (Elhuyar)
Antonio Toral (DCU, Dublin)
Arkaitz Zubiaga (University of Warwick)
Proceedings
The papers of the workshop will be published in the proceedings of the "XXXI
Congreso de la Sociedad Española de Procesamiento de Lenguaje Natural". The
proceedings will also be published in the ceur-ws.org repository, and
will be indexed by DBLP, among others.
------------------------------
End of Moses-support Digest, Vol 102, Issue 62
**********************************************