Moses-support Digest, Vol 107, Issue 37

Send Moses-support mailing list submissions to
moses-support@mit.edu

To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
moses-support-request@mit.edu

You can reach the person managing the list at
moses-support-owner@mit.edu

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."

Today's Topics:

1. Re: sgm generation for personalized test sets (Tom Hoar)
2. analysis.perl / mteval-v13a.pl / BLEU-annotation (Vincent Nguyen)

----------------------------------------------------------------------

Message: 1
Date: Mon, 14 Sep 2015 16:45:08 +0700
From: Tom Hoar <tahoar@precisiontranslationtools.com>
Subject: Re: [Moses-support] sgm generation for personalized test sets
To: moses-support@mit.edu
Message-ID: <55F69724.60306@precisiontranslationtools.com>
Content-Type: text/plain; charset="windows-1252"

Thanks, Vincent.

An earlier Moses Perl script used an SGM file as a template, but it was
limited. I never found a good tool to create SGM files for mteval from
scratch. It's just as hard to find a documented reference for the mteval
scripts. That's why I created this tool.

Your two option suggestions are interesting, but I'm not sure they're
practical. A head or tail of -nb lines would be straightforward as long
as you keep all three data sets (src, ref, tst) in sync. Doing that for
a random selection is more involved. I'll have some time late next week
to look at these options.
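
A minimal sketch of the selection described above, for illustration only. It assumes the three data sets have already been read into equal-length lists of lines; the function name `select_parallel` and its parameters are hypothetical and not part of the tool. The point is that one list of line positions is computed once and applied to all three sets, so they stay in sync even for a random selection.

```python
import random

def select_parallel(src, ref, tst, nb, mode="head", seed=0):
    """Return the same nb line positions from three parallel lists of lines."""
    if not (len(src) == len(ref) == len(tst)):
        raise ValueError("src, ref and tst must have the same number of lines")
    n = len(src)
    nb = min(nb, n)
    if mode == "head":
        idx = list(range(nb))
    elif mode == "tail":
        idx = list(range(n - nb, n))
    elif mode == "random":
        # Sample positions once, then sort to keep the original line order.
        idx = sorted(random.Random(seed).sample(range(n), nb))
    else:
        raise ValueError("unknown mode: " + mode)
    pick = lambda lines: [lines[i] for i in idx]
    return pick(src), pick(ref), pick(tst)
```

A fixed seed makes the random selection repeatable, which helps when the same subset has to be regenerated later.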

Tom

On 9/14/2015 4:30 PM, moses-support-request@mit.edu wrote:
> Date: Mon, 14 Sep 2015 08:55:28 +0200
> From: Vincent Nguyen<vnguyen@neuf.fr>
> Subject: Re: [Moses-support] sgm generation for personalized test sets
> To:moses-support@mit.edu
> Message-ID:<55F66F60.40807@neuf.fr>
> Content-Type: text/plain; charset=windows-1252; format=flowed
>
> Hi Tom,
>
> If this script is intended exactly and only to generate sgm test/dev
> files from a txt file, then yes, it needs to be amended.
>
> 1) line-break characters other than 0A need to be removed prior to the
> python execution (byte stream replace)
>
> 2) even though the XML standard is to replace ' by &apos; and so on for
> others, I have noticed that the test/dev sets do not include the xml
> codes like &apos;
> so what I did was remove the second string replace in your code.
> however I added 2 others replaces in the first sequence :   => " "
> and   => " "
>
> 3) even though this is standard for XML, I removed the first 3 lines of
> the doc (XML, DOCTYPE and MTEVAL)
> and also the last one (MTEVAL)
>
> all of this to stick to the expected file for test sets.
>
> If you have the chance, you could add 2 options :
> - nb = nb of lines you want to take from the file
> - selection = either nb first lines or random in the txt file
>
> I am just wondering whether someone has already developed another perl
> script for this. How were the sets generated to start with?
>
> cheers,
> Vincent
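
Point 1 in the quoted message can be sketched as follows. This is an illustration under assumptions, not the code of the tool: it assumes the text has already been decoded to a Python string, and the names `EXTRA_LINE_BREAKS` and `normalize_line_breaks` are hypothetical. Python's `str.splitlines()` splits on several characters besides 0A, so a stray one inside a segment would shift the line count of one file against the others. Whether to delete such characters or replace them with a space is a judgment call; the sketch uses a space so that adjacent words do not fuse.

```python
# Characters other than "\n" that str.splitlines() treats as line boundaries.
EXTRA_LINE_BREAKS = ["\r", "\x0b", "\x0c", "\x1c", "\x1d", "\x1e",
                     "\x85", "\u2028", "\u2029"]

def normalize_line_breaks(text):
    """Leave "\n" (0x0A) as the only line break in the text."""
    # Windows line endings collapse to a single "\n".
    text = text.replace("\r\n", "\n")
    # Any remaining break character sits inside a segment: use a space.
    for ch in EXTRA_LINE_BREAKS:
        text = text.replace(ch, " ")
    return text
```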

--
Best regards,

Tom Hoar
Chief Executive Officer
Precision Translation Tools Pte Ltd
Singapore/Thailand
Web: www.precisiontranslationtools.com
<http://www.precisiontranslationtools.com>
Thailand Mobile: +66 87 345-1875
Skype: tahoar

------------------------------

Message: 2
Date: Mon, 14 Sep 2015 12:13:44 +0200
From: Vincent Nguyen <vnguyen@neuf.fr>
Subject: [Moses-support] analysis.perl / mteval-v13a.pl /
BLEU-annotation
To: moses-support <moses-support@mit.edu>
Message-ID: <55F69DD8.3050701@neuf.fr>
Content-Type: text/plain; charset=utf-8; format=flowed

Guys,

While running EMS with a big test file, I noticed that analysis.perl
ran very quickly, while the actual NIST BLEU scoring took much longer.

Also, the "BLEU-Annotation" file generated during analysis does not
contain the right line numbering: it takes 0 as the first line, so all
line numbers are offset by 1.

Last, when you "average" the BLEU scores from all these lines, the result
is slightly different from the NIST BLEU score actually reported.

Is it computed differently?

Thanks,

Vincent
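
One general reason such an average can differ, sketched under assumptions: if the per-line values are sentence-level BLEU scores, their mean will generally not equal a corpus-level BLEU, because the corpus score pools n-gram matches and lengths over all segments before computing the precisions and the brevity penalty. The code below is a simplified illustration (single reference, pre-tokenized input, no smoothing); the function names are hypothetical and it does not reproduce what analysis.perl or mteval-v13a.pl actually do.

```python
import math
from collections import Counter

def ngram_counts(tokens, n):
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu_stats(hyp, ref, max_n=4):
    """Clipped n-gram matches and totals for one hypothesis/reference pair."""
    matches, totals = [], []
    for n in range(1, max_n + 1):
        h, r = ngram_counts(hyp, n), ngram_counts(ref, n)
        matches.append(sum(min(c, r[g]) for g, c in h.items()))
        totals.append(sum(h.values()))
    return matches, totals, len(hyp), len(ref)

def bleu_from_stats(matches, totals, hyp_len, ref_len):
    """Unsmoothed BLEU: geometric mean of precisions times brevity penalty."""
    if min(matches) == 0:
        return 0.0
    log_prec = sum(math.log(m / t) for m, t in zip(matches, totals)) / len(matches)
    bp = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return bp * math.exp(log_prec)

def average_sentence_bleu(pairs):
    """Mean of the per-segment scores."""
    return sum(bleu_from_stats(*bleu_stats(h, r)) for h, r in pairs) / len(pairs)

def corpus_bleu(pairs):
    """Pool the statistics over all segments, then compute one score."""
    matches, totals, hyp_len, ref_len = [0] * 4, [0] * 4, 0, 0
    for h, r in pairs:
        m, t, hl, rl = bleu_stats(h, r)
        matches = [a + b for a, b in zip(matches, m)]
        totals = [a + b for a, b in zip(totals, t)]
        hyp_len, ref_len = hyp_len + hl, ref_len + rl
    return bleu_from_stats(matches, totals, hyp_len, ref_len)

pairs = [
    ("the cat sat on the mat".split(), "the cat sat on the mat".split()),
    ("a dog".split(), "a dog barked loudly today".split()),
]
print(average_sentence_bleu(pairs), corpus_bleu(pairs))
```

In this toy example the second segment has no 4-grams, so its unsmoothed sentence score is zero, while at corpus level it only contributes to the brevity penalty. The two numbers therefore differ noticeably.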

------------------------------

_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support

End of Moses-support Digest, Vol 107, Issue 37
**********************************************
