Moses-support Digest, Vol 88, Issue 8

Send Moses-support mailing list submissions to
moses-support@mit.edu

To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
moses-support-request@mit.edu

You can reach the person managing the list at
moses-support-owner@mit.edu

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."


Today's Topics:

1. Re: Repositioning of non-translatable lexemes (Hieu Hoang)
2. Re: Get word-to-word alignments from mosesserver
(Jyotesh Choudhari)
3. Re: Repositioning of non-translatable lexemes (Achim Ruopp)
4. binarization error (Hieu Hoang)
5. Re: binarization error (Kenneth Heafield)


----------------------------------------------------------------------

Message: 1
Date: Tue, 4 Feb 2014 17:46:04 +0000
From: Hieu Hoang <Hieu.Hoang@ed.ac.uk>
Subject: Re: [Moses-support] Repositioning of non-translatable lexemes
To: Tom Hoar <tahoar@precisiontranslationtools.com>, Sorin Slavescu
<sorin.slavescu@oracle.com>
Cc: moses-support <moses-support@mit.edu>
Message-ID:
<CAEKMkbhmisoeeJzXEKhZ=6SQqN_3zLzR=QpheFT-txgvD3dUoA@mail.gmail.com>
Content-Type: text/plain; charset="iso-8859-1"

The placeholder feature we did is described here:
http://www.statmt.org/moses/?n=Moses.AdvancedFeatures#ntoc61

I think transferring formatting is fairly standardised and simple; it may
be in the M4Loc project. If anyone would like to open source their
implementation, please consider adding it to the Moses repository.

Here's a description:
http://www.mtsummit2013.info/files/proceedings/wptp2-joanis.pdf
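
As an illustration of the word-alignment based approach to transferring
formatting, here is a minimal Python sketch that uses the example from the
original question quoted below. It is not the m4loc or Moses implementation:
the function names strip_tags and reinsert_tags are hypothetical, only paired
tags are handled, and the word alignment is assumed to be supplied as
(source index, target index) pairs.

```python
import re
from collections import defaultdict

# Matches a single opening or closing tag such as <b> or </b>.
TAG = re.compile(r"(</?[A-Za-z][^>]*>)")


def strip_tags(sentence):
    """Split a tagged sentence into plain words and tag spans.

    Each span is (open_tag, close_tag, start, end), where start/end are
    word indices (end exclusive) of the text the tag pair wraps.
    """
    words, spans, stack = [], [], []
    for piece in TAG.split(sentence):
        if not piece.strip():
            continue
        if TAG.fullmatch(piece):
            if piece.startswith("</"):
                if stack:
                    open_tag, start = stack.pop()
                    spans.append((open_tag, piece, start, len(words)))
            else:
                stack.append((piece, len(words)))
        else:
            words.extend(piece.split())
    return words, spans


def reinsert_tags(target_words, spans, alignment):
    """Wrap each tag pair around the target words aligned to its source span."""
    opens, closes = defaultdict(list), defaultdict(list)
    for open_tag, close_tag, start, end in spans:
        hits = [t for s, t in alignment if start <= s < end]
        if not hits:
            continue  # unaligned span: a real tool needs a fallback here
        opens[min(hits)].insert(0, open_tag)   # outer tags open first
        closes[max(hits)].append(close_tag)    # inner tags close first
    return " ".join(
        "".join(opens[i]) + word + "".join(closes[i])
        for i, word in enumerate(target_words)
    )


words, spans = strip_tags("This is a <b>test</b>")
alignment = [(0, 0), (1, 0), (2, 1), (3, 2)]  # assumed word alignment
print(reinsert_tags("C'est un test".split(), spans, alignment))
# prints: C'est un <b>test</b>
```

The sketch silently skips tag pairs whose source words have no aligned target
word; a production tool would need a policy for that case and for
self-closing tags.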

On 4 February 2014 16:44, Tom Hoar <tahoar@precisiontranslationtools.com> wrote:

> Hieu, is the new XML-markup feature you and Achim developed a better
> match? I can't find the reference.
>
> Tom.
>
>
>
> On 02/04/2014 11:25 PM, Barry Haddow wrote:
> > Hi Sorin
> >
> > You should check out m4loc (https://code.google.com/p/m4loc/), whose
> > features include "Word-alignment based tag reinsertion".
> >
> > There is also a web translation tool in Moses
> > (http://www.statmt.org/moses/?n=Moses.WebTranslation) that handles the
> > re-insertion of html markup, but this has not been updated for a while
> > and may or may not work with the current version of Moses.
> >
> > cheers - Barry
> >
> > On 04/02/14 15:06, Sorin Slavescu wrote:
> >> Hi all,
> >>
> >> Is there any research, tools or libraries to address the issue of
> >> repositioning non-translatable content like tags, placeholders,
> >> entities into the translated sentence?
> >> For example, if the source sentence is "This is a <b>test</b>" and the
> >> translation is "C'est un test", the <b> tag should be repositioned in
> >> the translation to give "C'est un <b>test</b>".
> >>
> >> Thanks,
> >> Sorin
> >> --
> >>
> >>
> >> ORACLE <http://www.oracle.com>
> >> Sorin Slavescu | Principal Software Engineer
> >> Phone: +35318031937 | E-mail: sorin.slavescu@oracle.com
> >> Oracle Worldwide Product Translation (WPTG) - Tools
> >> Block P5, East Point Business Park | Dublin 3, Ireland
> >> Oracle is committed to developing practices and products that help
> >> protect the environment <http://www.oracle.com/commitment>
> >>
> >>
> >> _______________________________________________
> >> Moses-support mailing list
> >> Moses-support@mit.edu
> >> http://mailman.mit.edu/mailman/listinfo/moses-support



--
Hieu Hoang
Research Associate
University of Edinburgh
http://www.hoang.co.uk/hieu

------------------------------

Message: 2
Date: Tue, 4 Feb 2014 19:16:43 +0000 (UTC)
From: Jyotesh Choudhari <jyoteshrc@gmail.com>
Subject: Re: [Moses-support] Get word-to-word alignments from
mosesserver
To: moses-support@mit.edu
Message-ID: <loom.20140204T201238-354@post.gmane.org>
Content-Type: text/plain; charset=utf-8

Tomas Fulajtar <TomasFu@...> writes:

> Hi Hieu, Jyotesh,
>
> I am also interested in this feature. It would be useful for the
> M4Loc project and its reinsert_wordalign.pm tool.
>
> Could you please let us know about the progress? Can I help you somehow?

Hi Tomáš,
I have implemented the feature and sent the patch to Hieu. He will review it
and notify me if any modifications are needed.

Thanks and regards,
Jyotesh Choudhari
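
For readers who want to try this once the patch is merged, a client call
might look like the sketch below. It assumes that word alignments are
requested with a "word-align" parameter and returned under the same key as a
list of source-word/target-word index pairs. Those names are assumptions and
are not confirmed in this thread, so check them against the mosesserver
source. The helper names build_request, word_pairs and translate are
hypothetical.

```python
import xmlrpc.client


def build_request(text, word_align=True):
    """Build the parameter struct passed to the server's translate method."""
    params = {"text": text, "align": "true"}  # "align" asks for phrase alignments
    if word_align:
        params["word-align"] = "true"  # assumed parameter name
    return params


def word_pairs(response):
    """Extract (source index, target index) pairs from a response struct.

    The "word-align", "source-word" and "target-word" keys are assumptions.
    """
    return [
        (pair["source-word"], pair["target-word"])
        for pair in response.get("word-align", [])
    ]


def translate(url, text):
    """Send one sentence to a running mosesserver and return its response."""
    proxy = xmlrpc.client.ServerProxy(url)
    return proxy.translate(build_request(text))
```

Calling translate("http://localhost:8080/RPC2", "this is a test") would
contact a running server; the URL is an assumption about where mosesserver
listens.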

> Thank you,
>
> Tomáš Fulajtár | Researcher
> T: +420-545-552-340
> tomasfu@moravia.com | moravia.com
>
> From: Hieu Hoang [mailto:Hieu.Hoang@ed.ac.uk]
> Sent: Wednesday, January 15, 2014 12:52 PM
> To: Jyotesh Choudhari
> Cc: moses-support
> Subject: Re: [Moses-support] Get word-to-word alignments from mosesserver
>
> I don't think it's been implemented. However, if you want to implement it,
> I can help you. Please email me offline and I'll show you how.
>
> On 13 January 2014 13:44, Jyotesh Choudhari <jyoteshrc@gmail.com> wrote:
> Hi,
> According to the Moses tutorial
> (http://www.statmt.org/moses/?n=Moses.AdvancedFeatures#ntoc30), mosesserver
> returns only phrase alignments. I want word alignments too. How can I get
> these?
> Thanks and regards,
> Jyotesh Choudhari

------------------------------

Message: 3
Date: Tue, 4 Feb 2014 14:24:54 -0500
From: "Achim Ruopp" <achimru@gmail.com>
Subject: Re: [Moses-support] Repositioning of non-translatable lexemes
To: "'Tom Hoar'" <tahoar@precisiontranslationtools.com>,
<moses-support@mit.edu>
Message-ID: <00a201cf21de$ce865480$6b92fd80$@com>
Content-Type: text/plain; charset="us-ascii"

The new placeholder feature in Moses v2.1 is documented here:
http://www.statmt.org/moses/?n=Moses.AdvancedFeatures#ntoc61
This is for placeholders that have semantic meaning, e.g. numbers and other
named entities rather than tags.

Yes, this needs to be combined with tag handling and evaluated. Working on
this now.

Achim

-----Original Message-----
From: moses-support-bounces@mit.edu [mailto:moses-support-bounces@mit.edu]
On Behalf Of Tom Hoar
Sent: Tuesday, February 04, 2014 11:45 AM
To: moses-support@mit.edu
Subject: Re: [Moses-support] Repositioning of non-translatable lexemes

Hieu, is the new XML-markup feature you and Achim developed a better match?
I can't find the reference.

Tom.



On 02/04/2014 11:25 PM, Barry Haddow wrote:
> Hi Sorin
>
> You should check out m4loc (https://code.google.com/p/m4loc/), whose
> features include "Word-alignment based tag reinsertion".
>
> There is also a web translation tool in Moses
> (http://www.statmt.org/moses/?n=Moses.WebTranslation) that handles the
> re-insertion of html markup, but this has not been updated for a while
> and may or may not work with the current version of Moses.
>
> cheers - Barry
>
> On 04/02/14 15:06, Sorin Slavescu wrote:
>> Hi all,
>>
>> Is there any research, tools or libraries to address the issue of
>> repositioning non-translatable content like tags, placeholders,
>> entities into the translated sentence?
>> For example, if the source sentence is "This is a <b>test</b>" and
>> the translation is "C'est un test", the <b> tag should be repositioned
>> in the translation to give "C'est un <b>test</b>".
>>
>> Thanks,
>> Sorin
>> --
>>
>>
>> ORACLE <http://www.oracle.com>
>> Sorin Slavescu | Principal Software Engineer
>> Phone: +35318031937 | E-mail: sorin.slavescu@oracle.com
>> Oracle Worldwide Product Translation (WPTG) - Tools
>> Block P5, East Point Business Park | Dublin 3, Ireland
>> Oracle is committed to developing practices and products that help
>> protect the environment <http://www.oracle.com/commitment>



------------------------------

Message: 4
Date: Tue, 4 Feb 2014 19:37:31 +0000
From: Hieu Hoang <Hieu.Hoang@ed.ac.uk>
Subject: [Moses-support] binarization error
To: Kenneth Heafield <moses@kheafield.com>, moses-support
<moses-support@mit.edu>
Message-ID:
<CAEKMkbhC=OU3MvQBz4PTcHvC5qjWSbDnqjwW08rC-ov0Rw9RKw@mail.gmail.com>
Content-Type: text/plain; charset="iso-8859-1"

What's going on here?

.../bin/build_binary -i .../lm/interpolated-lm.1 .../interpolated-binlm.2

stderr:
Reading
/home/s0565741/workspace/experiment/agile10-chinese/lm/interpolated-lm.1
----5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
****************************************************************************************************
lm/read_arpa.cc:100 in void lm::ReadBackoff(util::FilePiece&, lm::Prob&)
threw FormatLoadException'.
Expected tab or newline for backoff in the 5-gram at byte 38261145623 Byte:
38261145623
ERROR

It was a standard EMS run: the language models were trained with lmplz and
then interpolated with SRILM.

--
Hieu Hoang
Research Associate
University of Edinburgh
http://www.hoang.co.uk/hieu

------------------------------

Message: 5
Date: Tue, 04 Feb 2014 11:54:50 -0800
From: Kenneth Heafield <moses@kheafield.com>
Subject: Re: [Moses-support] binarization error
To: Hieu Hoang <Hieu.Hoang@ed.ac.uk>, moses-support
<moses-support@mit.edu>
Message-ID: <52F1458A.1090005@kheafield.com>
Content-Type: text/plain; charset=ISO-8859-1

Looks like NFS and/or SRI got hungry and replaced large swathes of your
ARPA file with 0s.
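
One way to check this diagnosis is to look for runs of NUL bytes around the
offset that build_binary reported. The sketch below is a minimal Python
example; the function name nul_runs_near, the window size and the minimum run
length are arbitrary choices made for illustration.

```python
import re


def nul_runs_near(path, byte_offset, window=1 << 20, min_run=16):
    """Return (offset, length) for each run of NUL bytes near byte_offset.

    Only the region from byte_offset - window to byte_offset + window is
    read, so this stays cheap even on a very large ARPA file.
    """
    start = max(0, byte_offset - window)
    with open(path, "rb") as handle:
        handle.seek(start)
        data = handle.read(2 * window)
    # A run is min_run or more consecutive zero bytes.
    pattern = re.compile(rb"\x00{%d,}" % min_run)
    return [
        (start + match.start(), match.end() - match.start())
        for match in pattern.finditer(data)
    ]
```

For the file above, the call would be
nul_runs_near(".../lm/interpolated-lm.1", 38261145623). A non-empty result
means the text around that offset was replaced by zero bytes, which a valid
ARPA file never contains.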

On 02/04/14 11:37, Hieu Hoang wrote:
> What's going on here?
>
> .../bin/build_binary -i .../lm/interpolated-lm.1 .../interpolated-binlm.2
>
> stderr:
> Reading
> /home/s0565741/workspace/experiment/agile10-chinese/lm/interpolated-lm.1
> ----5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
> ****************************************************************************************************
> lm/read_arpa.cc:100 in void lm::ReadBackoff(util::FilePiece&, lm::Prob&)
> threw FormatLoadException'.
> Expected tab or newline for backoff in the 5-gram at byte 38261145623
> Byte: 38261145623
> ERROR
>
> It was a standard EMS run, language models trained with lmplz then
> interpolated with SRILM
>
> --
> Hieu Hoang
> Research Associate
> University of Edinburgh
> http://www.hoang.co.uk/hieu
>


------------------------------

_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


End of Moses-support Digest, Vol 88, Issue 8
********************************************
