Moses-support Digest, Vol 110, Issue 44

Send Moses-support mailing list submissions to
moses-support@mit.edu

To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
moses-support-request@mit.edu

You can reach the person managing the list at
moses-support-owner@mit.edu

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."


Today's Topics:

1. Re: Strange behavioure of the decoder (Hieu Hoang)
2. Re: Strange behavioure of the decoder (Hieu Hoang)
3. Re: Strange behavioure of the decoder (Sehrob Ibrohimov)


----------------------------------------------------------------------

Message: 1
Date: Sat, 26 Dec 2015 20:27:09 +0000
From: Hieu Hoang <hieuhoang@gmail.com>
Subject: Re: [Moses-support] Strange behavioure of the decoder
To: Sehrob Ibrohimov <isehrob@gmail.com>
Cc: moses-support <moses-support@mit.edu>
Message-ID:
<CAEKMkbiotO+F87sx5uTsaufe_f2oUh-+Cn=SvTyQXKWpr=6_LA@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

Are you sure the input and the training data are encoded as UTF-8?

Also, was the input tokenized exactly the same way as the training data?
The tokenization scripts change from time to time.
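
One way to check the encoding question is a small script along these lines. This is only a sketch and not part of Moses: the function names are made up for illustration, and it assumes the input and training files are ordinary text files on disk. It reads raw bytes, reports the lines that do not decode as UTF-8, and separately checks for a leading byte-order mark.

```python
def find_invalid_utf8(path, max_reports=5):
    """Return (line_number, error_message) pairs for lines that are not valid UTF-8."""
    problems = []
    # Open in binary mode so that no implicit decoding hides the problem.
    with open(path, "rb") as handle:
        for line_number, raw in enumerate(handle, start=1):
            try:
                raw.decode("utf-8")
            except UnicodeDecodeError as err:
                problems.append((line_number, str(err)))
                if len(problems) >= max_reports:
                    break
    return problems


def has_bom(path):
    """Return True if the file starts with the UTF-8 byte-order mark."""
    with open(path, "rb") as handle:
        return handle.read(3) == b"\xef\xbb\xbf"
```

Running both functions on the decoder input and on the training corpus should give an empty list and False for each file if everything is plain UTF-8.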

Hieu Hoang
http://www.hoang.co.uk/hieu

On 26 December 2015 at 13:53, Sehrob Ibrohimov <isehrob@gmail.com> wrote:

> Hello everyone!
>
> I have come across some strange behaviour in the Moses decoder and I
> can't figure out what the problem is.
> About a year ago I trained translation models with Moses (version 3.0)
> and got pretty good results; the translations were impressive. A few
> days ago I got a new machine (CentOS 7), built Moses from source, and
> gave it the translation models I had. Now I can't get any translations,
> although these models gave good results with the previous Moses
> installation on Ubuntu 14.0.
> I give it input like: "????????? ?????????? ???????? ???? ?
> ?????????? ????????? ??????".
> It gives me: ?????????|UNK|UNK|UNK ??????????|UNK|UNK|UNK
> ????????|UNK|UNK|UNK ????|UNK|UNK|UNK ?|UNK|UNK|UNK
> ??????????|UNK|UNK|UNK ?????????|UNK|UNK|UNK ??????|UNK|UNK|UNK
> [11111111] [total=-798.292]
>
> core=(-800.000,-8.000,8.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,-25.136)
> I then thought there might be some recent changes in Moses, so I
> retrained my models on the same data using EMS, but nothing changed:
> the decoder gives me the same results.
> Please help me figure out what the problem is!
>
> Sincerely yours
> Sehrob
>
> _______________________________________________
> Moses-support mailing list
> Moses-support@mit.edu
> http://mailman.mit.edu/mailman/listinfo/moses-support
>

------------------------------

Message: 2
Date: Sat, 26 Dec 2015 23:32:37 +0000
From: Hieu Hoang <hieuhoang@gmail.com>
Subject: Re: [Moses-support] Strange behavioure of the decoder
To: Sehrob Ibrohimov <isehrob@gmail.com>, moses-support
<moses-support@mit.edu>
Message-ID:
<CAEKMkbjuJXon9iSEJD6WmK1Gy-eUuxFdBvjpZ1=XFARpxKp9dg@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

Can I have a look at the moses.ini file you're using? Or upload all the
model files somewhere so I can replicate it.

The structure of the binarised phrase-tables doesn't change a lot, but I
can't promise that.

If it does change then, depending on which binary phrase-table you use,
either a version number in the file causes the decoder to exit with an
error message, or the decoder segfaults.

Since you're not getting either of these, I still think it's due to
something simpler like encoding or tokenization.
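
A quick way to test the encoding-or-tokenization theory is to compare the input tokens against the vocabulary of the tokenized source side of the training corpus. The sketch below is an illustration only, not a Moses tool: the function names are hypothetical, and it assumes whitespace-separated tokens in a UTF-8 text file. If every input token comes back as unknown, the mismatch is in the input text rather than in the model files.

```python
def load_vocabulary(corpus_path):
    """Collect every whitespace-separated token from a tokenized corpus file."""
    vocab = set()
    with open(corpus_path, encoding="utf-8") as handle:
        for line in handle:
            vocab.update(line.split())
    return vocab


def unknown_tokens(sentence, vocab):
    """Return the tokens of `sentence` that never occur in the training vocabulary."""
    return [token for token in sentence.split() if token not in vocab]
```

A handful of unknown tokens is normal for unseen words. A sentence in which all tokens are unknown suggests that the input bytes or the tokenization differ from what the model was trained on.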

Hieu Hoang
http://www.hoang.co.uk/hieu

On 26 December 2015 at 23:22, Sehrob Ibrohimov <isehrob@gmail.com> wrote:

> Hello Hieu!
>
> Thank you for your reply.
> The training data is encoded as UTF-8 (without BOM). The input is also
> encoded as UTF-8 and tokenized the same way as the training data.
> And what about the binarized models that used to give translation
> results and now give none? The decoder doesn't produce a translation
> even for a single word.
>
> Sehrob
>

------------------------------

Message: 3
Date: Sun, 27 Dec 2015 04:40:30 +0500
From: Sehrob Ibrohimov <isehrob@gmail.com>
Subject: Re: [Moses-support] Strange behavioure of the decoder
To: Hieu Hoang <hieuhoang@gmail.com>
Cc: moses-support <moses-support@mit.edu>
Message-ID:
<CADJquiB=Z9yZjHs+bysDZUh9+pX7q9Gv9_ZrarvnsSvbruH17w@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

Here is the decoder configuration I am using. I am afraid I cannot
upload all the models because of their size and my slow internet
connection, sorry.

Sehrob

-------------- next part --------------
A non-text attachment was scrubbed...
Name: moses.ini
Type: application/octet-stream
Size: 1534 bytes
Desc: not available
Url : http://mailman.mit.edu/mailman/private/moses-support/attachments/20151226/d37cc307/attachment.obj

------------------------------

_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


End of Moses-support Digest, Vol 110, Issue 44
**********************************************
