Moses-support Digest, Vol 107, Issue 32

Send Moses-support mailing list submissions to
moses-support@mit.edu

To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
moses-support-request@mit.edu

You can reach the person managing the list at
moses-support-owner@mit.edu

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."


Today's Topics:

1. sgm generation for personalized test sets (Vincent Nguyen)
2. Re: sgm generation for personalized test sets (Vincent Nguyen)
3. Performance issue with Neural LM for English-Hindi SMT
(Rajnath Patel)


----------------------------------------------------------------------

Message: 1
Date: Sat, 12 Sep 2015 22:07:06 +0200
From: Vincent Nguyen <vnguyen@neuf.fr>
Subject: [Moses-support] sgm generation for personalized test sets
To: moses-support <moses-support@mit.edu>
Message-ID: <55F485EA.70204@neuf.fr>
Content-Type: text/plain; charset=utf-8; format=flowed

Hi,

What script do you use to generate sgm sets from a txt file?

I have tried makemteval.py in contrib,
but there are a few issues.

I think these lines:

    lines = [l.replace('&quot;','\"').replace('&apos;','\'').replace('&gt;','>').replace('&lt;','<').replace('&amp;','&')
             for l in filein.read().splitlines()]
    filein.close()
    lines = [l.replace('&','&amp;').replace('<','&lt;').replace('>','&gt;').replace('\'','&apos;').replace('\"','&quot;')
             for l in lines]

are not 100% bulletproof.

In the output I still get &apos; and similar entities.
It does not handle &nbsp;.
It does not seem to handle the \r\n sequence either, since the output has
more lines than the txt file.
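
One possible way around the entity issues, as a minimal sketch rather than what makemteval.py actually does: decode every entity with the standard library's html.unescape (which also knows &nbsp;), then re-escape only the five XML special characters with xml.sax.saxutils.escape. The function name normalize_entities and the choice to turn the non-breaking space into a plain space are assumptions made for illustration.

```python
import html
from xml.sax.saxutils import escape


def normalize_entities(line):
    """Decode HTML/XML entities, then re-escape only the five XML specials."""
    # html.unescape decodes named entities such as &nbsp; and &apos;,
    # as well as numeric character references.
    text = html.unescape(line)
    # Assumption: a non-breaking space (from &nbsp;) should become a plain space.
    text = text.replace('\u00a0', ' ')
    # escape() handles &, < and > itself; the quotes are passed as extra entities.
    return escape(text, {"'": '&apos;', '"': '&quot;'})


if __name__ == '__main__':
    print(normalize_entities('Tom&nbsp;&amp; Jerry said &quot;hi&quot;'))
```

Because decoding happens in a single pass before escaping, text that is already escaped and text that is raw both end up escaped exactly once.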

Maybe there is another script.

thanks.





------------------------------

Message: 2
Date: Sun, 13 Sep 2015 10:44:02 +0200
From: Vincent Nguyen <vnguyen@neuf.fr>
Subject: Re: [Moses-support] sgm generation for personalized test sets
To: moses-support <moses-support@mit.edu>
Message-ID: <55F53752.9060603@neuf.fr>
Content-Type: text/plain; charset=windows-1252; format=flowed


To use makemteval.py, we need to remove the bytes 0D (carriage return) and
E2 80 A8 (the UTF-8 encoding of U+2028 LINE SEPARATOR) from the txt files.
Python's splitlines() treats them as additional line breaks.
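
A minimal sketch of that cleanup, assuming the input has already been decoded to a Python 3 str and that each segment should end at a line feed only. The helper name split_segments is hypothetical, and replacing U+2028 with a space (rather than deleting it) is an assumption.

```python
def split_segments(text):
    """Return one segment per line feed, ignoring other line-break characters."""
    # Drop carriage returns (byte 0D) so stray '\r' characters add no lines.
    text = text.replace('\r', '')
    # Replace U+2028 LINE SEPARATOR (bytes E2 80 A8 in UTF-8) with a space.
    text = text.replace('\u2028', ' ')
    # A trailing line feed should not produce an empty final segment.
    if text.endswith('\n'):
        text = text[:-1]
    return text.split('\n')


if __name__ == '__main__':
    sample = 'first line\r\nsecond\u2028still second\r\nthird\n'
    # str.splitlines() also breaks on U+2028, so the two counts differ.
    print(len(sample.splitlines()), len(split_segments(sample)))
```

Splitting on '\n' explicitly keeps the number of segments equal to the number of lines a typical text editor shows for the file.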



------------------------------

Message: 3
Date: Sun, 13 Sep 2015 15:21:58 +0530
From: Rajnath Patel <patelrajnath@gmail.com>
Subject: [Moses-support] Performance issue with Neural LM for
English-Hindi SMT
To: moses-support <moses-support@mit.edu>
Cc: patelrajnath <patelrajnath@gmail.com>
Message-ID:
<CAE-r4um9Z5kDgYXm6BOsaiXL0-WtXEPYZAg=WERh0sTrcm-36w@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

Hi all,

I have tried a neural LM (NPLM) with phrase-based English-Hindi SMT, but the
translation quality is noticeably worse than with an n-gram LM (scores are
given below). I trained 3-gram and 5-gram LMs with the default settings (as
described on statmt.org/moses). If anyone has tried the same for
English-Hindi SMT and obtained improved results, kindly share suggestions.
What might be the probable cause of the degraded results?

BLEU scores:
n-gram LM (5-gram) = 24.40
neural LM (5-gram) = 11.30
neural LM (3-gram) = 12.10

Thank you.

--
Regards:
Raj Nath Patel

------------------------------

_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


End of Moses-support Digest, Vol 107, Issue 32
**********************************************
