Moses-support Digest, Vol 97, Issue 31

Send Moses-support mailing list submissions to
moses-support@mit.edu

To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
moses-support-request@mit.edu

You can reach the person managing the list at
moses-support-owner@mit.edu

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."


Today's Topics:

1. Re: using sparse features (Barry Haddow)
2. Re: Encoding in MGIZA (Rico Sennrich)
3. Moses Build Error: Failed gcc.link (Rajen Chatterjee)


----------------------------------------------------------------------

Message: 1
Date: Fri, 14 Nov 2014 12:53:18 +0000
From: Barry Haddow <bhaddow@staffmail.ed.ac.uk>
Subject: Re: [Moses-support] using sparse features
To: Marcin Junczys-Dowmunt <junczys@amu.edu.pl>
Cc: moses-support <moses-support@mit.edu>
Message-ID: <5465FB3E.2000205@staffmail.ed.ac.uk>
Content-Type: text/plain; charset=UTF-8; format=flowed

Hi Marcin

One practical problem that online-mira faced is that it is not a
drop-in replacement for mert in the way that pro and kbmira are, so it
requires people to change their pipeline a bit. This means that it has a
higher bar for acceptance (in the sense of offering consistent
improvements) than the other methods.

On lattice mira, yes it is implemented. In my tests so far I did not
find it to be clearly different from kbmira, but I have not tested it
extensively with sparse features. It's quite a bit slower (can't
remember exactly how much) but probably we can at least optimise the
number of iterations, and hopefully optimise the code. I won't have a
chance to look at it again for a while, so if anyone else wants to pick
it up ...

cheers - Barry

On 14/11/14 12:41, Marcin Junczys-Dowmunt wrote:
>
> Hi,
>
> Eva: And in a sparse-feature scenario compared to PRO or kbmira?
>
> Barry: Thanks for the pointer. I understand the main problem is
> evidence-sparsity for sparse features. I am currently trying to
> counter that by using huge devsets (up to 50.000 sentences, divided
> into pieces of 5.000, then averaging weights, cross-validation
> basically) which seems to help, but I am always suspicious that the
> optimization method is not doing as well as it could. So I was hoping
> you might have something new :) I remember Colin Cherry talking about
> lattice Mira; we don't have this in Moses, do we?
>
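
The split-and-average scheme described above (tune on each dev-set piece separately, then average the resulting weights) can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical and not part of Moses, weights are represented as plain dicts, and a sparse feature that is missing from one run is counted as 0.0 for that run.

```python
def split_into_pieces(sentences, piece_size):
    """Split a dev set into consecutive pieces of at most piece_size items."""
    return [sentences[i:i + piece_size]
            for i in range(0, len(sentences), piece_size)]


def average_weights(weight_sets):
    """Average feature weights over several tuning runs.

    weight_sets is a list of dicts (feature name -> weight), one dict per
    dev-set piece. Assumption: a feature absent from a run contributes 0.0
    for that run; averaging only over the runs that saw the feature would
    be the other obvious choice.
    """
    if not weight_sets:
        raise ValueError("need at least one weight set")
    totals = {}
    for weights in weight_sets:
        for name, value in weights.items():
            totals[name] = totals.get(name, 0.0) + value
    runs = float(len(weight_sets))
    return dict((name, total / runs) for name, total in totals.items())
```

With 50,000 sentences and a piece size of 5,000 this yields ten pieces; each piece would be tuned on separately (that step is not shown here) and the ten weight dicts passed to average_weights.
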
> On 2014-11-14 11:27, Barry Haddow wrote:
>
>> Hi Marcin
>>
>> I think if you look at the situations where sparse features are
>> successful, you often find they are tuning with multiple references. This
>> paper lends support to the idea that multiple references are important:
>> http://www.statmt.org/wmt14/pdf/W14-3360.pdf.
>>
>> cheers - Barry
>>
>> On 14/11/14 10:24, Eva Hasler wrote:
>>> In comparison to MERT? Not really. We compared English-French and
>>> German-English at IWSLT 2012, and the baseline scores were a bit
>>> higher for En-Fr and a bit lower for De-En. But of course the point is
>>> that you can use more features, so you have to define useful feature
>>> sets that are sparse but still able to generalise.
>>>
>>> On Fri, Nov 14, 2014 at 10:16 AM, Marcin Junczys-Dowmunt
>>> <junczys@amu.edu.pl> wrote:
>>>> Speed aside, quality did not improve significantly?
>>>>
>>>> On 14.11.2014 at 11:11, Eva Hasler wrote:


--
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.



------------------------------

Message: 2
Date: Fri, 14 Nov 2014 13:20:49 +0000 (UTC)
From: Rico Sennrich <rico.sennrich@gmx.ch>
Subject: Re: [Moses-support] Encoding in MGIZA
To: moses-support@mit.edu
Message-ID: <loom.20141114T115012-848@post.gmane.org>
Content-Type: text/plain; charset=us-ascii

Hieu Hoang <Hieu.Hoang@...> writes:

> Ken - should we add encoding on open to all python scripts, rather than
> set the PYTHONIOENCODING env variable? That's basically what happens with
> the perl scripts.
>
> What python/Linux version are you using? I don't see it on my version
> (Python 2.7.3, Ubuntu 12.04)

Hi all,

It's kinda tricky to have consistent encoding between Python 2.X and Python
3. The patch to merge_alignment.py will fail under 2.X. I suggest using
io.open instead, which works with all versions from 2.6 up. And if any
string processing is done, I suggest using 'from __future__ import
unicode_literals' to ensure that all string literals are interpreted as
unicode, and making sure that all input/output is UTF-8 (including
stdin/stdout/stderr). I usually do this with the following code block:

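import codecs
import sys

# Python 2 only: wrap the byte-oriented standard streams so that they
# decode/encode UTF-8. Python 3's standard streams are already text streams.
if sys.version_info < (3, 0, 0):
    sys.stdin = codecs.getreader('UTF-8')(sys.stdin)
    sys.stdout = codecs.getwriter('UTF-8')(sys.stdout)
    sys.stderr = codecs.getwriter('UTF-8')(sys.stderr)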
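
For the io.open suggestion above, a minimal sketch of reading and writing UTF-8 files in a way that behaves the same under Python 2.6+ and Python 3 might look like this. The helper names are hypothetical and are not taken from merge_alignment.py or from the patch under discussion.

```python
from __future__ import unicode_literals

import io


def read_lines_utf8(path):
    """Return the lines of a UTF-8 text file as unicode strings."""
    # io.open accepts an explicit encoding on Python 2.6+ as well as on
    # Python 3, where it is the same function as the built-in open.
    with io.open(path, 'r', encoding='utf-8') as handle:
        return [line.rstrip('\n') for line in handle]


def write_lines_utf8(path, lines):
    """Write unicode strings to a file as UTF-8, one per line."""
    with io.open(path, 'w', encoding='utf-8') as handle:
        for line in lines:
            # With unicode_literals, '\n' is a unicode string on Python 2
            # too, which is what a text-mode io.open handle expects.
            handle.write(line + '\n')
```

Because the encoding is passed explicitly, the result does not depend on the locale or on PYTHONIOENCODING being set.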

best,
Rico



------------------------------

Message: 3
Date: Fri, 14 Nov 2014 14:30:09 +0100
From: Rajen Chatterjee <rajen.k.chatterjee@gmail.com>
Subject: [Moses-support] Moses Build Error: Failed gcc.link
To: "moses-support@mit.edu" <moses-support@mit.edu>
Message-ID:
<CAC4-+NxKPNdR-eJSL6MpYxBCV4ZoD79ac5J6MZbM=ny5R3eGWA@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

Hi Everyone,

When I build moses with the following command it works:
./bjam --with-boost=/home/chatterjee/Public/SMT/boost_1_55_0 -j4


but when I try to build with SRILM using the following command, it fails
with the error "failed gcc.link" (log file attached):
./bjam --with-boost=/home/chatterjee/Public/SMT/boost_1_55_0 \
    --with-srilm=/home/chatterjee/Public/SMT/srilm-1.7.1 -j4

Has anyone faced a similar problem, and is there a solution to it?


PS: SRILM installed successfully and all of its test cases produced
identical results, so I guess there is no problem with the SRILM installation.

--
-Regards,
Rajen Chatterjee.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: build.log.gz
Type: application/x-gzip
Size: 11292 bytes
Desc: not available
Url : http://mailman.mit.edu/mailman/private/moses-support/attachments/20141114/b3c1a451/attachment.bin

------------------------------

_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


End of Moses-support Digest, Vol 97, Issue 31
*********************************************
