Send Moses-support mailing list submissions to
moses-support@mit.edu
To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
moses-support-request@mit.edu
You can reach the person managing the list at
moses-support-owner@mit.edu
When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."
Today's Topics:
1. KenLM lazy loading with new ini format (Marcin Junczys-Dowmunt)
2. Re: KenLM lazy loading with new ini format
(Marcin Junczys-Dowmunt)
3. Memory leak with large input files? (Marcin Junczys-Dowmunt)
4. Re: KenLM lazy loading with new ini format (Kenneth Heafield)
5. Re: C++11 (Marcin Junczys-Dowmunt)
6. Re: How to Improve the translation (Lane Schwartz)
7. Re: How to Improve the translation (nadeem khan)
----------------------------------------------------------------------
Message: 1
Date: Tue, 14 Jan 2014 23:07:46 +0100
From: Marcin Junczys-Dowmunt <junczys@amu.edu.pl>
Subject: [Moses-support] KenLM lazy loading with new ini format
To: moses-support <moses-support@mit.edu>
Message-ID: <52D5B532.5060905@amu.edu.pl>
Content-Type: text/plain; charset=UTF-8; format=flowed
Hi List,
How can I specify lazy loading for KenLM with the new moses.ini format?
In the old format it was LM type 9.
Best,
Marcin
------------------------------
Message: 2
Date: Tue, 14 Jan 2014 23:11:59 +0100
From: Marcin Junczys-Dowmunt <junczys@amu.edu.pl>
Subject: Re: [Moses-support] KenLM lazy loading with new ini format
To: moses-support <moses-support@mit.edu>
Message-ID: <52D5B62F.6020204@amu.edu.pl>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Got it.
KENLM ... lazyken=<true/false>
This is sort of offensive, isn't it? :)
On 14.01.2014 23:07, Marcin Junczys-Dowmunt wrote:
> Hi List,
> how can I specify lazy loading for KenLM with the new moses.ini format?
> Used to be a 9.
> Best,
> Marcin
> _______________________________________________
> Moses-support mailing list
> Moses-support@mit.edu
> http://mailman.mit.edu/mailman/listinfo/moses-support
------------------------------
Message: 3
Date: Tue, 14 Jan 2014 23:27:58 +0100
From: Marcin Junczys-Dowmunt <junczys@amu.edu.pl>
Subject: [Moses-support] Memory leak with large input files?
To: moses-support <moses-support@mit.edu>
Message-ID: <52D5B9EE.7010003@amu.edu.pl>
Content-Type: text/plain; charset=UTF-8; format=flowed
Hi,
I think I have noticed some weird behaviour. If a huge input file
(millions of sentences) is processed by moses on stdin, memory usage
keeps growing and growing. At the beginning usage is about 2GB, after
roughly 20000 sentences it reaches around 8GB and it keeps growing.
However, if I give moses only those 20000 sentences at stdin (cut off
with head -n 20000) memory usage stays below 3 GB. It seems the input is
somehow being buffered before translation?
I came across this while retranslating the training corpus; it looks like a bug.
Best,
Marcin
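
A possible workaround for the behaviour described in this message, offered only as a minimal sketch: split the large input into batches of about 20000 lines (the size reported above to stay below 3 GB) and decode each batch in a separate run. This does not diagnose or fix the memory growth itself. The function names and the `<prefix>.<index>` file naming are hypothetical and not part of Moses.

```python
from itertools import islice
from typing import Iterable, Iterator, List


def batches(lines: Iterable[str], size: int = 20000) -> Iterator[List[str]]:
    """Yield consecutive lists of at most `size` lines, holding one batch at a time."""
    if size <= 0:
        raise ValueError("size must be positive")
    iterator = iter(lines)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def write_batches(lines: Iterable[str], prefix: str, size: int = 20000) -> List[str]:
    """Write each batch to '<prefix>.<index>' (hypothetical naming) and return the file names."""
    names = []
    for index, chunk in enumerate(batches(lines, size)):
        name = f"{prefix}.{index:05d}"
        with open(name, "w", encoding="utf-8") as handle:
            handle.writelines(chunk)
        names.append(name)
    return names
```

Each resulting file could then be fed to its own decoder run on stdin. Note that every run reloads the models, so this trades startup time for bounded memory.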
------------------------------
Message: 4
Date: Tue, 14 Jan 2014 17:52:01 -0500
From: Kenneth Heafield <moses@kheafield.com>
Subject: Re: [Moses-support] KenLM lazy loading with new ini format
To: Marcin Junczys-Dowmunt <junczys@amu.edu.pl>
Cc: moses-support@mit.edu
Message-ID:
<CAA+pi367q_1H96CjUgPpuZ8UbkMYqsRt6xK91SCs_ZDAJixD2g@mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1
I'm too lazy to fix it :-P
On Tue, Jan 14, 2014 at 5:11 PM, Marcin Junczys-Dowmunt
<junczys@amu.edu.pl> wrote:
> Got it.
>
> KENLM ... lazyken=<true/false>
>
> This is sort of offensive, isn't it? :)
>
> On 14.01.2014 23:07, Marcin Junczys-Dowmunt wrote:
>> Hi List,
>> how can I specify lazy loading for KenLM with the new moses.ini format?
>> Used to be a 9.
>> Best,
>> Marcin
------------------------------
Message: 5
Date: Wed, 15 Jan 2014 01:12:39 +0100
From: Marcin Junczys-Dowmunt <junczys@amu.edu.pl>
Subject: Re: [Moses-support] C++11
To: moses-support@mit.edu
Message-ID: <52D5D277.7000608@amu.edu.pl>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Hi Rico,
Revision d2d508184e35909aa5da901b81bb70f10f7794c7 breaks my compact
reordering model, but only at runtime and only after a clean build
without any build artifacts from earlier compilations. It segfaults
during loading in a weird low-level place. I can investigate further
if you want me to.
Best,
Marcin
On 14.01.2014 18:08, Rico Sennrich wrote:
> Hi list,
>
> I just pushed a commit that uses a C++11 feature (initializer list). It
> should work with compilers that are no older than 5 years or so (gcc >= 4.4).
>
> If you have trouble compiling it (because you're using an older gcc version
> or another compiler), please speak up. This is basically a test to see if
> it's ok to start seriously using C++11 features in Moses or if it's better
> to revert the commit and wait a couple more years because people are still
> relying on older compilers.
>
> best,
> Rico
------------------------------
Message: 6
Date: Tue, 14 Jan 2014 20:17:39 -0500
From: Lane Schwartz <dowobeha@gmail.com>
Subject: Re: [Moses-support] How to Improve the translation
To: "moses-support@mit.edu" <moses-support@mit.edu>
Message-ID:
<CABv3vZmypoCFnA=k0PBWDbtwDLmY2+QX1hfV++TsNYh7HTJCkg@mail.gmail.com>
Content-Type: text/plain; charset=UTF-8
See also:
Final Report of the 2009 Summer Camp for Applied Language Exploration
http://www.cs.jhu.edu/~ccb/publications/scale-2009-report.pdf
On Tue, Jan 14, 2014 at 5:23 AM, Hieu Hoang <Hieu.Hoang@ed.ac.uk> wrote:
> The easiest way to improve translation is to find more data. The 2nd easiest
> way is to improve the tokenization of your languages
>
>
> On 11 January 2014 07:29, Asad A.Malik <asad_12204@yahoo.com> wrote:
>>
>> Hi All,
>>
>> I've developed Urdu to English SMT using MOSES, and it is currently giving
>> me BLEU score of 8. Now I wanted to improve its translation so that it gives
>> me higher BLEU score.
>>
>> Regards
>>
>>
>> Asad A.Malik
>
>
>
> --
> Hieu Hoang
> Research Associate
> University of Edinburgh
> http://www.hoang.co.uk/hieu
--
When a place gets crowded enough to require ID's, social collapse is not
far away. It is time to go elsewhere. The best thing about space travel
is that it made it possible to go elsewhere.
-- R.A. Heinlein, "Time Enough For Love"
------------------------------
Message: 7
Date: Tue, 14 Jan 2014 17:35:08 -0800 (PST)
From: nadeem khan <nad_star06@yahoo.com>
Subject: Re: [Moses-support] How to Improve the translation
To: "moses-support@mit.edu" <moses-support@mit.edu>
Message-ID:
<1389749708.12030.YahooMailNeo@web162405.mail.bf1.yahoo.com>
Content-Type: text/plain; charset="iso-8859-1"
Hi Hieu,
We are not the same person. I just saw this question and your reply about tokenization, and I would like to know the main purpose of tokenization in SMT. As you said, good tokenization of the source language can improve the BLEU score; that is why I asked this question.
Regards
Nadeem Khan
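
As a rough illustration of why tokenization matters for the question above: splitting on whitespace alone leaves punctuation glued to words, so "score" and "score." count as different vocabulary items and the statistics for the word are split across them. The sketch below uses a deliberately simple regular expression that is an assumption for illustration only. It is not the tokenizer shipped with Moses, and it is not suitable for real Urdu text (for example, it separates combining marks from their base letters).

```python
import re
from typing import List

# Illustrative pattern: runs of word characters, or single punctuation marks.
TOKEN_PATTERN = re.compile(r"\w+|[^\w\s]")


def simple_tokenize(text: str) -> List[str]:
    """Split text into word tokens and standalone punctuation tokens."""
    return TOKEN_PATTERN.findall(text)


sentence = "a BLEU score of 8 is a low score."

# Whitespace splitting keeps "score" and "score." as two distinct types.
whitespace_types = set(sentence.split())

# After tokenization both occurrences map to the single type "score".
tokenized_types = set(simple_tokenize(sentence))

print(sorted(whitespace_types))
print(sorted(tokenized_types))
```

With the punctuation separated, both occurrences of "score" share one entry in the vocabulary, which is the kind of effect better tokenization has on sparse training data.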
On Wednesday, January 15, 2014 6:21 AM, Lane Schwartz <dowobeha@gmail.com> wrote:
See also:
Final Report of the 2009 Summer Camp for Applied Language Exploration
http://www.cs.jhu.edu/~ccb/publications/scale-2009-report.pdf
On Tue, Jan 14, 2014 at 5:23 AM, Hieu Hoang <Hieu.Hoang@ed.ac.uk> wrote:
> The easiest way to improve translation is to find more data. The 2nd easiest
> way is to improve the tokenization of your languages
>
>
> On 11 January 2014 07:29, Asad A.Malik <asad_12204@yahoo.com> wrote:
>>
>> Hi All,
>>
>> I've developed Urdu to English SMT using MOSES, and it is currently giving
>> me BLEU score of 8. Now I wanted to improve its translation so that it gives
>> me higher BLEU score.
>>
>> Regards
>>
>>
>> Asad A.Malik
>
>
>
> --
> Hieu Hoang
> Research Associate
> University of Edinburgh
> http://www.hoang.co.uk/hieu
--
When a place gets crowded enough to require ID's, social collapse is not
far away. It is time to go elsewhere. The best thing about space travel
is that it made it possible to go elsewhere.
-- R.A. Heinlein, "Time Enough For Love"
------------------------------
_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support
End of Moses-support Digest, Vol 87, Issue 30
*********************************************