Moses-support Digest, Vol 104, Issue 22

Send Moses-support mailing list submissions to
moses-support@mit.edu

To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
moses-support-request@mit.edu

You can reach the person managing the list at
moses-support-owner@mit.edu

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."


Today's Topics:

1. Different phrase tables with same dataset (Davood Mohammadifar)
2. CFP(PP): MT Marathon 2015 in Prague (Ondrej Bojar)
3. Re: Different phrase tables with same dataset (Barry Haddow)
4. The first complete SMT toolkit for x86-64 Windows released
(Tom Hoar)


----------------------------------------------------------------------

Message: 1
Date: Tue, 16 Jun 2015 11:00:14 +0000
From: Davood Mohammadifar <davood_mf@hotmail.com>
Subject: [Moses-support] Different phrase tables with same dataset
To: "moses-support@mit.edu" <moses-support@mit.edu>
Message-ID: <SNT150-W43AB3B2A7059C4CFE2FDAC8CA70@phx.gbl>
Content-Type: text/plain; charset="iso-8859-1"

Hello everyone
I used Moses 3 to train on my parallel corpus. I got different BLEU scores (18.5-22.5) across runs, so I tried to find the reason. Eventually I found that the phrase tables differ from each other. I trained on 50000 parallel sentences; the phrase table was about 39MB (gz format) the first time and about 59MB (gz format) the second time. The phrase tables' contents also differ somewhat (in scores and entries).
I used Mgiza and followed the instructions for the baseline system in the Moses manual. The problem remained when using Giza++, too.
The problem also remained when training on 150000 sentences.
Is it normal for phrase tables to differ in size?
Thank you

------------------------------

Message: 2
Date: Tue, 16 Jun 2015 13:05:26 +0200 (CEST)
From: Ondrej Bojar <bojar@ufal.mff.cuni.cz>
Subject: [Moses-support] CFP(PP): MT Marathon 2015 in Prague
To: moses-support@mit.edu
Message-ID:
<1076731541.193705.1434452726771.JavaMail.zimbra@ufal.mff.cuni.cz>
Content-Type: text/plain; charset=utf-8

(Apologies for multiple copies.)

This is a second call for:
- Papers - Projects - Participation

for

MT Marathon 2015 (Sept 7-12; Prague, Czech Republic, EU)

This year, you might have already attended the US edition of MT
Marathon. If you missed it or the fun was too short for you, come and
join us in September in Europe.

The EU edition of MT Marathon is organized by the EU project CRACKER,
with the underlying topic of quality in machine translation (QT Marathon).

Machine Translation Marathon is a week-long gathering of machine
translation researchers, developers, students and users. It features:

- MT Lectures and Labs covering the basics and tutorials.
- Invited talks from experienced researchers and practitioners.
- Technical Talks about open source tools.
- Hacking Projects to advance tools or research in one week.

Details:

http://www.statmt.org/mtm15 (registration will open soon)

Important dates:
July 5, 2015 Abstract submission deadline
July 19, 2015 Paper submission
August 5, 2015 Notification of acceptance
August 12, 2015 Camera-ready paper due


** Call for papers **

We invite developers of open source tools to present their work and
submit a paper of up to 10 pages that describes the underlying
methodology and includes instructions on how to download and use the
tools.

We are looking for stand-alone tools and extensions of existing tools,
such as the Moses open source system. Accepted papers will be
presented during the MT Marathon and published in the 104th issue of
the Prague Bulletin of Mathematical Linguistics
(http://ufal.mff.cuni.cz/pbml).


** Call for project proposals **

As always, project topics will get finalized on the first day of the
Marathon, but it was found useful in the past to announce and refine
project proposals earlier.

If you have an idea what you'd like to implement in a small team of
fellow participants, or if you just want to peek at what is going to
be proposed, have a look or edit the live document linked from:

http://www.statmt.org/mtm15/projects.html


--
Ondrej Bojar (mailto:obo@cuni.cz / bojar@ufal.mff.cuni.cz)
http://www.cuni.cz/~obo


------------------------------

Message: 3
Date: Tue, 16 Jun 2015 13:01:10 +0100
From: Barry Haddow <bhaddow@staffmail.ed.ac.uk>
Subject: Re: [Moses-support] Different phrase tables with same dataset
To: Davood Mohammadifar <davood_mf@hotmail.com>,
"moses-support@mit.edu" <moses-support@mit.edu>
Message-ID: <55801006.4050206@staffmail.ed.ac.uk>
Content-Type: text/plain; charset="windows-1252"

Hi Davood

It isn't normal to get such large differences in phrase table size or
quality, on the same data set, although small variations are possible.
You should check carefully that you used exactly the same settings in
each run, and check whether anything went wrong during training (errors
in the log file).

cheers - Barry

On 16/06/15 12:00, Davood Mohammadifar wrote:
> Hello everyone
>
> I used Moses 3 to train on my parallel corpus. I got different BLEU
> scores (18.5-22.5) across runs, so I tried to find the reason.
> Eventually I found that the phrase tables differ from each other. I
> trained on 50000 parallel sentences; the phrase table was about 39MB
> (gz format) the first time and about 59MB (gz format) the second
> time. The phrase tables' contents also differ somewhat (in scores and
> entries).
>
> I used Mgiza and followed the instructions for the baseline system in
> the Moses manual. The problem remained when using Giza++, too.
>
> The problem also remained when training on 150000 sentences.
>
> Is it normal for phrase tables to differ in size?
>
> Thank you
>
>
> _______________________________________________
> Moses-support mailing list
> Moses-support@mit.edu
> http://mailman.mit.edu/mailman/listinfo/moses-support
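
One way to quantify how far two training runs diverge, as discussed in this thread, is to compare the sets of phrase pairs in the two phrase tables. The following is a minimal sketch, not part of Moses: the function names are hypothetical, and it assumes the usual text phrase-table layout in which fields are separated by " ||| " and the first two fields are the source and target phrases.

```python
import gzip
from typing import Dict, Iterable, Set, Tuple

PhrasePair = Tuple[str, str]


def phrase_pairs(lines: Iterable[str]) -> Set[PhrasePair]:
    """Collect (source, target) pairs from phrase-table lines.

    Assumes fields are separated by ' ||| ' and that the first two
    fields are the source and target phrases.
    """
    pairs = set()
    for line in lines:
        fields = line.rstrip("\n").split(" ||| ")
        if len(fields) >= 2:
            pairs.add((fields[0].strip(), fields[1].strip()))
    return pairs


def load_phrase_table(path: str) -> Set[PhrasePair]:
    """Read a gzip-compressed phrase table and return its phrase pairs."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        return phrase_pairs(handle)


def compare_tables(first: Set[PhrasePair], second: Set[PhrasePair]) -> Dict[str, int]:
    """Count the entries unique to each table and the entries they share."""
    return {
        "only_in_first": len(first - second),
        "only_in_second": len(second - first),
        "shared": len(first & second),
    }
```

For example, with hypothetical paths, compare_tables(load_phrase_table("run1/model/phrase-table.gz"), load_phrase_table("run2/model/phrase-table.gz")) reports how many entries appear in only one of the two runs. A large one-sided count would suggest that the runs differed in settings or that a training step failed, rather than ordinary run-to-run variation.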


------------------------------

Message: 4
Date: Tue, 16 Jun 2015 21:27:37 +0700
From: Tom Hoar <tahoar@precisiontranslationtools.com>
Subject: [Moses-support] The first complete SMT toolkit for x86-64
Windows released
To: moses-support <moses-support@mit.edu>
Message-ID: <55803259.5050808@precisiontranslationtools.com>
Content-Type: text/plain; charset="utf-8"

PTTools is happy to announce the release of Slate, the first packaged
SMT toolkit for native Windows x86-64 operating systems. Note: "native"
means without Cygwin. There is also a parallel Slate package for Linux.
These packages include all the command-line utilities from Moses,
MGIZA++ and PTTools necessary to train, tune and evaluate phrase and
phrase-factored SMT models. They also include many more utilities, but
we test and support only phrase and phrase-factored "modes." You can
find detailed specs about the packages, where to get them, and our
commercial support offerings at this URL and the "More about Slate for
Windows
<http://www.precisiontranslationtools.com/downloads/slate-version-1-0-for-windows/>"
link at the bottom of the page:

http://www.precisiontranslationtools.com/slate/

The Making of Slate: Slate is not a "port" of Moses to Windows. Jeroen's
cross-platform updates created one C++ code base that compiles on either
Posix or Windows. The entire Moses community benefits... Linux, OS-X,
Android and now Windows users alike. That said, PTTools maintains two
forked repositories that we periodically mirror from the respective
Moses repositories. As of this package release, the C++ code in the
Moses & Slate repositories is in sync and all C++ updates that created
the Slate packages are part of the Moses repositories' master branches.

https://bitbucket.org/pttools/slate-moses/
https://bitbucket.org/pttools/slate-mgiza/

These Slate repositories exist primarily to support commercial SMT users
(Windows and Linux) with cross-platform compatibility. If this is you,
please feel free to create issues in the "issues tracker" of the
slate-moses repository. However, questions about computational
linguistics and how SMT works should remain part of this moses-support list.

Moving forward. Every 3 or 4 months, we will test then-current Moses
commits, pick a stable commit, apply engineering and cross-platform
updates, test & verify them with the Moses team and finally pull the
updated commits into the Slate repositories. This means the Slate
repositories will lag behind the Moses team's work and the Slate code
will always be stable and tested. If anyone is looking for stable
commits that are tested for cross-platform compatibility, please feel
free to pull from these repositories. This also means the Moses
repositories will continue to receive our updates, hopefully leading to
a more stable, robust and perpetual cross-platform code base for everyone.

I invite the Moses community, academic and business alike, to try Slate.
It's exciting to experience SMT on a Windows host, especially when not
so long ago it was said that Moses is "unlikely to ever run on Windows
without Cygwin." (OK, I'm easily excited.) I can already hear (most) of
you scoffing that you wouldn't be caught dead with a Windows machine.
For you, I challenge you to install Wine 1.7 and run Slate for Windows.
All of its binaries and the updated Perl scripts run fine on Wine.
Please report your findings if you try.

Finally, thank you Moses team, for your patient work with our engineer
Jeroen, who worked full-time for more than five months to make this a
reality. Thanks Jeroen!

--

Best regards,
Tom Hoar
Chief Executive Officer
*Precision Translation Tools Pte Ltd*
Singapore/Thailand
Web: www.precisiontranslationtools.com
<http://www.precisiontranslationtools.com>
Thailand Mobile: +66 87 345-1875
Skype: tahoar

------------------------------

_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


End of Moses-support Digest, Vol 104, Issue 22
**********************************************
