Send Moses-support mailing list submissions to
moses-support@mit.edu
To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
moses-support-request@mit.edu
You can reach the person managing the list at
moses-support-owner@mit.edu
When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."
Today's Topics:
1. Re: differences between moses and moses2 output (Vito Mandorino)
----------------------------------------------------------------------
Message: 1
Date: Wed, 28 Sep 2016 18:03:02 +0200
From: Vito Mandorino <vito.mandorino@linguacustodia.com>
Subject: Re: [Moses-support] differences between moses and moses2
output
To: Hieu Hoang <hieuhoang@gmail.com>
Cc: moses-support <moses-support@mit.edu>
Message-ID:
<CA+8mSmEF1ptJ12GTcrMpcaKubUX32c1z1J-iLt7UNpTmCGxUBg@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"
Now it works! Thanks. On 6000 test sentences the Moses2 output is now
actually 100% identical to the standard Moses output.
Vito
2016-09-28 16:12 GMT+02:00 Hieu Hoang <hieuhoang@gmail.com>:
> hi Vito,
>
> please git pull and try decoding again. I've just pushed a fix:
> https://github.com/hieuhoang/mosesdecoder/commit/0005e98b2674906162ce7945c5edd6a42c9ca418
> Basically, I've changed the behaviour of the pugixml call so that it
> doesn't unescape the &apos; words
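
To illustrate why unescaping the input matters, here is a toy sketch in Python. It is not Moses2 code: the phrase table, its translations, and the `translate` helper are all hypothetical. It assumes the phrase table stores its source side in escaped form and that unknown tokens are copied to the output unchanged.

```python
# Hypothetical toy phrase table whose source side is stored in escaped form.
PHRASE_TABLE = {
    "d&apos;": "of",
    "information": "information",
}


def translate(tokens, unescape_input):
    """Translate token by token, optionally unescaping &apos; before lookup."""
    output = []
    for token in tokens:
        # Unescaping turns "d&apos;" into "d'", which is not a key in the table.
        key = token.replace("&apos;", "'") if unescape_input else token
        # Assumption: a token with no entry is passed through unchanged.
        output.append(PHRASE_TABLE.get(key, key))
    return output


tokens = ["d&apos;", "information"]
print(translate(tokens, unescape_input=True))
print(translate(tokens, unescape_input=False))
```

With `unescape_input=True` the token `d&apos;` becomes `d'`, misses the table, and is copied through, which resembles the untranslated `d'` in the Moses2 output quoted further down. With `unescape_input=False` the lookup key matches the escaped entry.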
>
>
> Hieu Hoang
> http://www.hoang.co.uk/hieu
>
> On 28 September 2016 at 14:33, Hieu Hoang <hieuhoang@gmail.com> wrote:
>
>> ah ok. Do you have a moses.ini and an example input sentence to go with that?
>>
>> pugixml.cpp is used to parse the input sentence for XML markups for
>> placeholders, forced-translation etc. You shouldn't change the code for
>> pugixml 'cos it's an imported library that we don't control and we may
>> reimport in future if there are new releases. The problem seems to be
>> Moses2's use of the library, so it should be fixed in Moses2.
>>
>> Hieu Hoang
>> http://www.hoang.co.uk/hieu
>>
>> On 28 September 2016 at 14:22, Vito Mandorino <
>> vito.mandorino@linguacustodia.com> wrote:
>>
>>> We are able to replicate the issue with the probingPT version of this
>>> phrase-table:
>>>
>>> &apos; ||| &apos; ||| 1 1 1 1 ||| 0-0 ||| 1 1 1 ||| |||
>>> &amp; ||| &amp; ||| 1 1 1 1 ||| 0-0 ||| 1 1 1 ||| |||
>>> &gt; ||| &gt; ||| 1 1 1 1 ||| 0-0 ||| 1 1 1 ||| |||
>>> &lt; ||| &lt; ||| 1 1 1 1 ||| 0-0 ||| 1 1 1 ||| |||
>>> &quot; ||| &quot; ||| 1 1 1 1 ||| 0-0 ||| 1 1 1 ||| |||
>>> ||| ||| 1 1 1 1 ||| 0-0 ||| 1 1 1 ||| |||
>>>   |||   ||| 1 1 1 1 ||| 0-0 ||| 1 1 1 ||| |||
>>>
>>> If we understand correctly, the origin of the issue is in the function
>>> strconv_escape in ./contrib/moses2/pugixml.cpp which replaces some of
>>> these entities with the actual symbol. Commenting out that part seems to
>>> fix the problem, but we wonder if this may cause any issues elsewhere since
>>> we don't know the purpose of the entity replacement.
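
The entity replacement described above can be sketched as follows. This is only an illustration in Python, not the pugixml code: the function name `unescape` and the table `ENTITIES` are hypothetical, and the table assumes the five predefined XML entities.

```python
# Assumed entity table: the five predefined XML entities.
# "&amp;" is listed last on purpose (dicts keep insertion order).
ENTITIES = {
    "&apos;": "'",
    "&quot;": '"',
    "&lt;": "<",
    "&gt;": ">",
    "&amp;": "&",
}


def unescape(text):
    """Replace each entity in `text` with its literal character."""
    # Handling "&amp;" last means "&amp;lt;" becomes "&lt;" rather than "<".
    for entity, char in ENTITIES.items():
        text = text.replace(entity, char)
    return text


print(unescape("documents d&apos; information"))
```

If a decoder applies such a replacement to its input while the phrase table keeps the escaped forms, tokens such as `d&apos;` no longer match any phrase-table entry.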
>>>
>>> Best regards,
>>> Vito
>>>
>>> 2016-09-28 11:19 GMT+02:00 Hieu Hoang <hieuhoang@gmail.com>:
>>>
>>>> Can you make your model files available for download?
>>>>
>>>> Moses and Moses2 aren't guaranteed to give exactly the same answer.
>>>> However, they should be of the same quality overall.
>>>>
>>>> Hieu Hoang
>>>> http://www.hoang.co.uk/hieu
>>>>
>>>> On 28 September 2016 at 09:53, Vito Mandorino <
>>>> vito.mandorino@linguacustodia.com> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> we are testing moses2 and we find a decrease in quality which seems to
>>>>> be related to apostrophes. For instance:
>>>>>
>>>>> Source segment 1:
>>>>> mise à disposition des actionnaires des documents d' information
>>>>> relatifs à la sicav
>>>>>
>>>>> MT Moses:
>>>>> provision shareholders of the briefing material for the sicav
>>>>>
>>>>> MT Moses2:
>>>>> provision of shareholders documents d' information concerning the fund
>>>>>
>>>>>
>>>>> Source segment 2:
>>>>> tout titre qui deviendrait spéculatif à la suite d' une
>>>>> rétrogradation après son acquisition par le fonds ne sera pas liquidé , à
>>>>> moins que le conseiller en investissement n' estime qu' il y va
>>>>> de l' intérêt des actionnaires .
>>>>>
>>>>> MT Moses:
>>>>> any security that would become speculative following a downgrading
>>>>> after its takeover by the fund will not be liquidated , unless the
>>>>> investment adviser believes it is in the interest of shareholders .
>>>>>
>>>>> MT Moses2:
>>>>> any security that would become speculative following a possible
>>>>> downgrade d' by the fund after its acquisition will not be liquidated ,
>>>>> unless the investment advisor believes n' stake qu' l' interest of
>>>>> shareholders .
>>>>>
>>>>> It is actually strange that the raw MT output contains the apostrophe
>>>>> symbol instead of the &apos; entity. What could the reason be?
>>>>>
>>>>> Best regards,
>>>>> Vito
>>>>>
>>>>>
>>>>> --
>>>>> *M**. Vito MANDORINO -- Chief Scientist*
>>>>>
>>>>>
>>>>> [image: Description : Description : lingua_custodia_final full logo]
>>>>>
>>>>> *The Translation Trustee*
>>>>>
>>>>> *1, Place Charles de Gaulle, **78180 Montigny-le-Bretonneux*
>>>>>
>>>>> *Tel : +33 1 30 44 04 23 Mobile : +33 6 84 65 68 89
>>>>> <%2B33%206%2084%2065%2068%2089>*
>>>>>
>>>>> *Email :* *vito.mandorino@linguacustodia.com
>>>>> <massinissa.ahmim@linguacustodia.com>*
>>>>>
>>>>> *Website :*
>>>>> *www.linguacustodia.finance <http://www.linguacustodia.com/>*
>>>>>
>>>>> _______________________________________________
>>>>> Moses-support mailing list
>>>>> Moses-support@mit.edu
>>>>> http://mailman.mit.edu/mailman/listinfo/moses-support
>>>>>
>>>>>
>>>>
>>>
>>>
>>>
>>
>>
>
------------------------------
End of Moses-support Digest, Vol 119, Issue 42
**********************************************