Moses-support Digest, Vol 86, Issue 62

Send Moses-support mailing list submissions to
moses-support@mit.edu

To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
moses-support-request@mit.edu

You can reach the person managing the list at
moses-support-owner@mit.edu

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."


Today's Topics:

1. Re: Cannot prepare corpus (rohit dholakia)
2. Re: How to convert new moses.ini format of to old format
(Rajen Chatterjee)
3. Re: How to convert new moses.ini format of to old format
(Hieu Hoang)


----------------------------------------------------------------------

Message: 1
Date: Fri, 20 Dec 2013 23:09:50 -0800
From: rohit dholakia <rdholaki@sfu.ca>
Subject: Re: [Moses-support] Cannot prepare corpus
To: "Asad A.Malik" <asad_12204@yahoo.com>
Cc: "moses-support@mit.edu" <moses-support@mit.edu>
Message-ID:
<CAA==LgsR3Fz67UQ8buZy7HgfC-BiFDyHOoGPwegLzJ5vW9v5rA@mail.gmail.com>
Content-Type: text/plain; charset="iso-8859-1"

Hi Asad,

You have the wrong redirections.

It should be: tokenizer.perl < input > output
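
The same redirection can be scripted. Below is a minimal Python sketch, assuming only that the command reads standard input and writes standard output, as the tokenizer does in the form above. The helper name run_filter is hypothetical and not part of Moses.

```python
import subprocess


def run_filter(cmd, in_path, out_path):
    """Run cmd with in_path on stdin and out_path on stdout.

    Equivalent to the shell form: cmd < in_path > out_path
    """
    with open(in_path, "rb") as fin, open(out_path, "wb") as fout:
        # check=True raises CalledProcessError if the command fails.
        subprocess.run(cmd, stdin=fin, stdout=fout, check=True)
```

For example, run_filter(["perl", "tokenizer.perl", "-l", "en"], "corpus.en", "corpus.tok.en") would correspond to tokenizer.perl -l en < corpus.en > corpus.tok.en, where the two file names are placeholders.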




On Fri, Dec 20, 2013 at 10:10 PM, Asad A.Malik <asad_12204@yahoo.com> wrote:

> Hi All,
>
> I am using Moses for the first time and following the steps shown in the
> manual, but I am unable to prepare the corpus (i.e. tokenization). I've also
> attached an image of the terminal.
> Previously I tried this step and it generated the output
> "news-commentary-v8.fr-en.tok.en", but now it is showing this output.
> If I enter all the commands that are mentioned in the manual for corpus
> preparation, the output is the same as shown in the image.
>
> Regards
>
>
> Asad A.Malik

------------------------------

Message: 2
Date: Sat, 21 Dec 2013 09:18:46 +0000
From: Rajen Chatterjee <rajen.k.chatterjee@gmail.com>
Subject: Re: [Moses-support] How to convert new moses.ini format of to
old format
To: Hieu Hoang <hieuhoang@gmail.com>
Cc: "moses-support@mit.edu" <moses-support@mit.edu>
Message-ID:
<CAC4-+Nws0hCsmQR_ZViCFKXVGaXWGmL8zU0ctumOpZHBvdm9FA@mail.gmail.com>
Content-Type: text/plain; charset="iso-8859-1"

Hi,
You are right, there is a [ character in the phrase table. But the problem
is that I am running the same language pair with the same training and test
sets on both the old Moses version and the new version. I get decoding
output from the old Moses version (where the '[' character is present in the
phrase table), but I get no decoding output from the new Moses.
If the '[' character is the problem, as you say, then why does it not cause
an error when decoding with the old Moses?

Thanks


On Fri, Dec 20, 2013 at 3:54 PM, Hieu Hoang <hieuhoang@gmail.com> wrote:

> From the line number, it's likely that there is a [ character in your
> phrase table. Moses interprets words with [ ] as non-terminals.
>
> I think the error would happen whatever version of Moses you are using.
>
>
> You should escape these characters. Moses' tokenizer converts
> [ --> &#91;
> ] --> &#93;
>
> On 20/12/2013 05:49, Rajen Chatterjee wrote:
>
> I am getting this error during decoding using the new moses:
>
> Start loading text SCFG phrase table. Moses format : [34.000] seconds
> Reading
> /home/rajen/Public/SMT/experiments/acl-14-TAG/results/pb-cross-valid/en-kK1/moses_data/model/phrase-table.gz
>
> ----5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
> ***************Check nextPos != string::npos failed in moses/Phrase.cpp:214
>
>
> So I thought I would try decoding using the old mosesdecoder.
>
>
> On Thu, Dec 19, 2013 at 5:31 PM, Hieu Hoang <Hieu.Hoang@ed.ac.uk> wrote:
>
>> There's no script to do that.
>>
>> Is there a reason you need to use the old decoder?
>>
>>
>> On 19 December 2013 16:40, Rajen Chatterjee <
>> rajen.k.chatterjee@gmail.com> wrote:
>>
>>> Hello,
>>> There is a script "scripts/training/convert-moses-ini-to-v2.perl"
>>> which converts the old moses.ini format to the new format, but I want the
>>> reverse, i.e. from the new format to the old format. How can I achieve this?
>>>
>>>
>>> --
>>> -Regards,
>>> Rajen Chatterjee.
>>
>>
>> --
>> Hieu Hoang
>> Research Associate
>> University of Edinburgh
>> http://www.hoang.co.uk/hieu
>>
>>
>
>
> --
> -Regards,
> Rajen Chatterjee.
>
>
>


--
-Regards,
Rajen Chatterjee.

------------------------------

Message: 3
Date: Sat, 21 Dec 2013 14:02:09 +0000
From: Hieu Hoang <hieuhoang@gmail.com>
Subject: Re: [Moses-support] How to convert new moses.ini format of to
old format
To: Rajen Chatterjee <rajen.k.chatterjee@gmail.com>
Cc: "moses-support@mit.edu" <moses-support@mit.edu>
Message-ID: <A4797CB8-10A1-4A56-AF90-7BEF687DDDCF@gmail.com>
Content-Type: text/plain; charset="us-ascii"

You're right. The old version had two different translation models, one for phrase-based decoding and one for syntax decoding, and the phrase-based model didn't check for [ ] characters.

The new Moses uses only one translation model, which always checks for [ ].

It will be easier for you to escape the characters than to back-port the ini file.
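
As a rough illustration of the escaping suggested above, here is a minimal Python sketch. It uses the two mappings quoted earlier in this thread ([ to &#91; and ] to &#93;). The function name is hypothetical, and the sketch assumes plain phrase-based text in which no bracket is meant as a real non-terminal. Applying it to the corpus before training, rather than to an existing phrase table, is also an assumption about the workflow.

```python
# Mappings quoted in this thread for the Moses tokenizer.
BRACKET_ESCAPES = {"[": "&#91;", "]": "&#93;"}


def escape_brackets(text):
    """Replace every [ and ] in text with its escaped form."""
    return text.translate(str.maketrans(BRACKET_ESCAPES))
```

Calling escape_brackets on each line of the training and test data would leave text without brackets unchanged, so it should be safe to apply to the whole file.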

Sent while bumping into things


------------------------------



End of Moses-support Digest, Vol 86, Issue 62
*********************************************
