Moses-support Digest, Vol 86, Issue 11

Send Moses-support mailing list submissions to
moses-support@mit.edu

To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
moses-support-request@mit.edu

You can reach the person managing the list at
moses-support-owner@mit.edu

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."

Today's Topics:

1. Viewer for parallel texts (Eleftherios Avramidis)
2. Re: merging two translation models (Alex Kalinin)
3. Re: Regarding Transliteration (Jon Dehdari)
4. Re: Tuning the tree-to-tree based SMT on Moses, Problem, How?
(Aaron L.-F. Han)
5. Re: Tuning the tree-to-tree based SMT on Moses, Problem, How?
(Hieu Hoang)

----------------------------------------------------------------------

Message: 1
Date: Wed, 04 Dec 2013 18:44:32 +0100
From: Eleftherios Avramidis <eleftherios.avramidis@dfki.de>
Subject: [Moses-support] Viewer for parallel texts
To: moses-support@mit.edu
Message-ID: <529F6A00.5060101@dfki.de>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed

Hi,

I need a viewer which would display the two sides of a parallel corpus
side-by-side, so that I can check for sanity etc. I remember I used to
have one of them, but it has been some time, so I cannot find it right
now. Could somebody give me a quick suggestion?

best
Lefteris

--
MSc. Inf. Eleftherios Avramidis
DFKI GmbH, Alt-Moabit 91c, 10559 Berlin
Tel. +49-30 238 95-1806

Fax. +49-30 238 95-1810

-------------------------------------------------------------------------------------------
Deutsches Forschungszentrum fuer Kuenstliche Intelligenz GmbH
Firmensitz: Trippstadter Strasse 122, D-67663 Kaiserslautern

Geschaeftsfuehrung:
Prof. Dr. Dr. h.c. mult. Wolfgang Wahlster (Vorsitzender)
Dr. Walter Olthoff

Vorsitzender des Aufsichtsrats:
Prof. Dr. h.c. Hans A. Aukes

Amtsgericht Kaiserslautern, HRB 2313
-------------------------------------------------------------------------------------------

------------------------------

Message: 2
Date: Wed, 04 Dec 2013 22:00:26 +0400
From: Alex Kalinin <verbalab@yandex.ru>
Subject: Re: [Moses-support] merging two translation models
To: Rico Sennrich <rico.sennrich@gmx.ch>, "moses-support@mit.edu"
<moses-support@mit.edu>
Message-ID: <292821386180026@web18j.yandex.ru>
Content-Type: text/plain; charset=koi8-r

Many thanks for your replies!

One more question concerning these scripts: is there any capability to detach translation models to free RAM and attach new TMs while the decoder is running? If not, maybe you can provide a roadmap for me to contribute such functionality?

Kind regards!

04.12.2013, 18:39, "Rico Sennrich" <rico.sennrich@gmx.ch>:
> Alex Kalinin <verbalab@...> writes:
>
>> Hello, everyone!
>>
>> I have the following question: I have trained a huge translation model
>> trained on 1m sentences of news texts. Also I have several minor
>> translation models trained on texts of different domains. Are there any
>> tools in Moses that can enable merging translation models? Is it possible
>> to merge models when the decoder is running?
>>
>> Thanks!
>>
>> Kind regards, Alex Kalinin.
>
> Hi Alex,
>
> there are several scripts that allow you to merge multiple translation
> models: http://www.statmt.org/moses/?n=Moses.AdvancedFeatures#ntoc52
>
> if you're interested in merging models at decoding time, you can do a
> log-linear combination of models (
> http://www.statmt.org/moses/?n=Moses.AdvancedFeatures#ntoc18 ), or a
> linear-interpolation / count-based merge (
> http://www.statmt.org/moses/?n=Moses.AdvancedFeatures#ntoc55 ).
>
> All methods allow you to weight the models in order to prioritize in-domain
> models. Check the documentation/literature if you're interested in more details.
>
> best wishes,
> Rico

--
Kind regards, Alex Kalinin

------------------------------

Message: 3
Date: Wed, 4 Dec 2013 13:23:24 -0500
From: Jon Dehdari <jonsafari@ling.ohio-state.edu>
Subject: Re: [Moses-support] Regarding Transliteration
To: Prasanth K <prasanthk.ms09@gmail.com>
Cc: moses-support <moses-support@mit.edu>, pranjal4456@gmail.com,
Thomas Meyer <thomas.meyer@idiap.ch>
Message-ID: <20131204182324.GZ15705@ling.ohio-state.edu>
Content-Type: text/plain; charset=iso-8859-1

To match a given orthography in Unicode, you can use Unicode Character
Properties, which in Perl regexes looks like /\p{Bengali}/ for example.
See the following link for more explanation:
http://perldoc.perl.org/perlunicode.html#*Scripts*

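So an oversimplified example (the -CS switch makes Perl decode standard
input and output as UTF-8, so that \p{Bengali} can match):
echo "??????????? is a vast country" | perl -CS -pe 's/(\p{Bengali})/$1 /g'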

Unicode Character Properties are very useful for many kinds of text
preprocessing.
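
A minimal sketch of the code-point-range filter that Prasanth describes
in the quoted message below, written in Python rather than Perl because
Python's standard `re` module has no \p{...} script properties. It
assumes that "Bengali" means the Unicode Bengali block U+0980-U+09FF
(which also covers Assamese letters). The function names, the <bn>
marker tags, and the sample word are hypothetical; the sample word is
not the word from the original question.

```python
import re
from typing import List

# Assumption: the Unicode Bengali block, U+0980 through U+09FF.
BENGALI_RUN = re.compile(r"[\u0980-\u09FF]+")


def extract_bengali(text: str) -> List[str]:
    """Return every maximal run of Bengali-block characters in text."""
    return BENGALI_RUN.findall(text)


def mark_bengali(text: str, open_tag: str = "<bn>", close_tag: str = "</bn>") -> str:
    """Wrap each Bengali-block run in marker tags (hypothetical tag names)."""
    return BENGALI_RUN.sub(lambda m: open_tag + m.group(0) + close_tag, text)


if __name__ == "__main__":
    # Hypothetical sample: four Bengali-block code points followed by English.
    sample = "\u09ad\u09be\u09b0\u09a4 is a vast country"
    print(extract_bengali(sample))
    print(mark_bengali(sample))
```

Marking the fragments, rather than deleting them, leaves it to a later
step to decide whether they are skipped, passed through, or
transliterated.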

Best,
-Jon

On Wed, Dec 04, 2013 at 01:41:44PM +0100, Prasanth K wrote:
> @Thomas, I don't think his intention was to search for a given UTF-8
> string; it was more to identify strings in a different language.
> @Pranjal, when you say Unicode words, understand that everything (even
> the English words) is encoded in UTF-8. It's more like you want to pick
> the Bengali (read as foreign-language) fragments from your sentence so
> that they are skipped when given to the decoder. I recall doing
> something like this by defining a range of valid characters (which is
> the Unicode code points for characters in Bengali in this case) and
> writing a filter to mark such characters. Maybe not a cool solution,
> but one that will work for ILs, given that they have different scripts
> and code values in Unicode.
> - Regards,
> Prasanth
>
> On Wed, Dec 4, 2013 at 11:12 AM, Thomas Meyer
> <Thomas.Meyer@idiap.ch> wrote:
>
>> Hi,
>> echo "??????????? is a vast country" | perl -pe '/(???????????)/; print $1."\n"'
>> if you want to replace it with something:
>> echo "??????????? is a vast country" | perl -pe 's/???????????/something/g'
>> But this is normally not the list to ask perl questions to...
>> Best,
>> Thomas
>>
>> On 04/12/13 10:55, Pranjal Das wrote:
>>
>>> Can anyone help with a Perl script to extract Unicode words from a
>>> sentence? E.g., I want to extract the word ??????????? from the
>>> sentence "??????????? is a vast country".
>>>
>>> Pranjal Das
>>> Department of Information Technology,
>>> Institute of Science and Technology,
>>> Gauhati University, Guwahati, Assam
>>> Phone- +91-8399879454
>
> --
> "Theories have four stages of acceptance. i) this is worthless
> nonsense; ii) this is an interesting, but perverse, point of view, iii)
> this is true, but quite unimportant; iv) I always said so."
> --- J.B.S. Haldane

------------------------------

Message: 4
Date: Thu, 5 Dec 2013 10:44:13 +0800
From: "Aaron L.-F. Han" <hanlifengaaron@gmail.com>
Subject: Re: [Moses-support] Tuning the tree-to-tree based SMT on
Moses, Problem, How?
To: "hieu.hoang@ed.ac.uk" <hieu.hoang@ed.ac.uk>
Cc: moses-support <moses-support@mit.edu>
Message-ID:
<CAETj867Ye5-eVUCKqV-TkrZKAkU314mD-Sioi85dd2Tw1KiSeA@mail.gmail.com>
Content-Type: text/plain; charset="iso-8859-1"

Hi Hieu,

We want to know whether tree-to-tree SMT in Moses can be tuned in the
same way as phrase-based translation. If it can, what is the tuning
command?

We have tried the following settings several times, and none of them
worked:
1. Use a Chinese-English parsed bilingual corpus already converted into
moses.xml format; use a command similar to the phrase-based tuning
command, just replacing "moses" with "moses_chart". The tuning exits.

2. Use a Chinese-English parsed bilingual corpus already converted into
moses.xml format; use a command similar to the phrase-based tuning
command, replacing "moses" with "moses_chart" and adding "-glue-grammar
--source-syntax --target-syntax". The tuning exits.

3. Use a Chinese-English plain corpus; use a command similar to the
phrase-based tuning command, just replacing "moses" with "moses_chart".
The tuning exits.

Best,
Aaron

--

Master Degree Candidate, Research Assistant,

Natural Language Processing & Portuguese-Chinese Machine Translation
Laboratory

Address: Room 108, Research Building, University of Macau, Av. Padre Tomás
Pereira Taipa, Macau, China

Homepage: http://www.linkedin.com/in/aaronhan

------------------------------

Message: 5
Date: Thu, 5 Dec 2013 06:09:22 +0000
From: Hieu Hoang <hieuhoang@gmail.com>
Subject: Re: [Moses-support] Tuning the tree-to-tree based SMT on
Moses, Problem, How?
To: "Aaron L.-F. Han" <hanlifengaaron@gmail.com>
Cc: "hieu.hoang@ed.ac.uk" <hieu.hoang@ed.ac.uk>, moses-support
<moses-support@mit.edu>
Message-ID: <62392611-4C21-42DA-9FE7-8520865A7BC7@gmail.com>
Content-Type: text/plain; charset="utf-8"

What is the exact error you're getting? Can you please send me the moses.ini file you used and an example of your input sentence?

Sent while bumping into things


------------------------------

_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support

End of Moses-support Digest, Vol 86, Issue 11
*********************************************
