Send Moses-support mailing list submissions to
moses-support@mit.edu
To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
moses-support-request@mit.edu
You can reach the person managing the list at
moses-support-owner@mit.edu
When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."
Today's Topics:
1. Re: Moses using other LMs (Kenneth Heafield)
2. Re: Problem in Web Based Version of MT System using MOSES -
Please help (Rajen Chatterjee)
3. Re: Problem in Web Based Version of MT System using MOSES -
Please help (Barry Haddow)
4. Re: Moses using other LMs (Nicola Bertoldi)
5. Re: Fwd: Moses-support post from
shachar.mirkin@xrce.xerox.com requires approval (Mirkin, Shachar)
----------------------------------------------------------------------
Message: 1
Date: Tue, 18 Mar 2014 21:55:10 -0400
From: Kenneth Heafield <moses@kheafield.com>
Subject: Re: [Moses-support] Moses using other LMs
To: moses-support@mit.edu
Message-ID: <5328F8FE.3050709@kheafield.com>
Content-Type: text/plain; charset=ISO-8859-1
The Google n-grams are tiny: 1.1 billion 5-grams while I have 263 billion.
They thresholded at 40 (and 200 for vocabulary words). Thresholding
basically means it's only useful for stupid backoff. Also, they didn't
deduplicate the data before training.
Would you like an unpruned interpolated modified Kneser-Ney language
model with these n-gram counts trained on more data than Google used?
1 2640258088
2 15297753348
3 61858786129
4 156775272110
5 263690452834
RandLM implements stupid backoff. KenLM does not; my plan is to remove
the use case for stupid backoff.
Kenneth
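For readers who have not met the term: stupid backoff (Brants et al., 2007) scores a word by its relative frequency after the longest context that was actually seen, multiplying by a fixed factor each time the context has to be shortened. It needs only raw n-gram counts, which is why thresholded counts can still be used with it. Below is a minimal Python sketch of that scoring rule, not KenLM or RandLM code; the function names, the toy corpus and the 0.4 factor (the value suggested in the original paper) are assumptions for illustration.

```python
from collections import Counter

ALPHA = 0.4  # backoff factor suggested by Brants et al. (2007); an assumption here


def count_ngrams(tokens, max_order):
    """Count every n-gram of order 1..max_order in a token list."""
    counts = Counter()
    for n in range(1, max_order + 1):
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts


def stupid_backoff(word, context, counts, total_tokens, alpha=ALPHA):
    """Return the stupid-backoff score S(word | context).

    The score is a relative frequency when the full n-gram was seen;
    otherwise the oldest context word is dropped and the result is
    multiplied by alpha. Scores are not normalised probabilities.
    """
    context = tuple(context)
    penalty = 1.0
    while context:
        ngram = context + (word,)
        if counts[ngram] > 0:
            return penalty * counts[ngram] / counts[context]
        context = context[1:]  # back off to a shorter context
        penalty *= alpha
    # Unigram base case: relative frequency of the word itself.
    return penalty * counts[(word,)] / total_tokens
```

For example, with counts built from the nine tokens of `the cat sat on the mat the cat ran`, `stupid_backoff("sat", ["the", "cat"], counts, 9)` gives 0.5, because "the cat" occurs twice and "the cat sat" once.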
On 03/18/14 20:39, Hieu Hoang wrote:
> Moses supports RandLM and neural network LMs, which can handle very large
> amounts of data, I think.
>
> I'm not sure if IRSTLM or KenLM can handle Google n-gram data, but I know
> they can handle large amounts of data.
>
>
> On 17 March 2014 14:56, Zheng Yuan <yuanzheng_bupt@126.com> wrote:
>
> Hi,
>
> I am wondering whether it is possible for Moses to use other kinds of
> LMs, such as an existing web interface or the Google n-grams?
>
> Regards,
> Zheng
>
> --
> Hieu Hoang
> Research Associate
> University of Edinburgh
> http://www.hoang.co.uk/hieu
>
------------------------------
Message: 2
Date: Wed, 19 Mar 2014 06:44:17 +0000
From: Rajen Chatterjee <rajen.k.chatterjee@gmail.com>
Subject: Re: [Moses-support] Problem in Web Based Version of MT System
using MOSES - Please help
To: Hieu Hoang <Hieu.Hoang@ed.ac.uk>
Cc: moses-support <moses-support@mit.edu>
Message-ID:
<CAC4-+NxUgsVuwatGbr7Zj=QeMfh16X=CgSRr90JaHSZqy_OQQg@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"
Hi Vishal,
I am not sure, but you could have a look at the multi-threaded Moses server
section at this link:
http://www.statmt.org/moses/?n=Moses.AdvancedFeatures#ntoc30
If you have used the --serial option, then try running without it and check again.
On Wed, Mar 19, 2014 at 12:21 AM, Hieu Hoang <Hieu.Hoang@ed.ac.uk> wrote:
> I'm not an expert on the Moses server, but I haven't heard of this error
> occurring with the Moses script.
>
> However, since you've made quite a few changes to the scripts that came
> with Moses, you'll have to debug it yourself.
>
>
> On 18 March 2014 15:02, Vishal Goyal <vishal.pup@gmail.com> wrote:
>
>> Respected All,
>> Greetings
>>
>> We have installed the Moses server and a web server on a single Linux
>> machine. When tested on localhost, the system works fine. After we
>> installed it on the web server for public use, it sometimes works, but
>> most of the time we do not get a translation of the input text. Instead,
>> the input text is only transliterated by the post-processing script,
>> transliterate.pl.
>>
>> The Moses server is running as daemon.pl. index.cgi takes the user
>> input, and translate.cgi starts the Moses thread for translation.
>>
>> The error log shows that only one Moses thread is running; all the
>> other threads are terminated with an error message (error log
>> attached).
>>
>> Please also find attached the source code of daemon.pl, index.cgi and
>> translate.cgi.
>>
>>
>>
>>
>> --
>> *Regards,*
>> Vishal Goyal,
>> Ph.D., M.Tech., MCA, M.C.S.D.
>> Assistant Professor(Stage III) and Placement Coordinator,
>> Department of Computer Science,
>> Punjabi University Patiala-147002
>> [*Online Hindi to Punjabi Machine Translation Tool -*
>> http://h2p.learnpunjabi.org ]
>> *[Research Cell: An International Journal of Engineering Sciences,
>> http://ijoes.vidyapublications.com <http://ijoes.vidyapublications.com>]*
>>
>
>
> --
> Hieu Hoang
> Research Associate
> University of Edinburgh
> http://www.hoang.co.uk/hieu
>
--
-Regards,
Rajen Chatterjee.
------------------------------
Message: 3
Date: Wed, 19 Mar 2014 08:37:06 +0000
From: Barry Haddow <bhaddow@staffmail.ed.ac.uk>
Subject: Re: [Moses-support] Problem in Web Based Version of MT System
using MOSES - Please help
To: Vishal Goyal <vishal.pup@gmail.com>, moses-support
<moses-support@mit.edu>, Ajit Kumar <ajit8671@gmail.com>
Message-ID: <53295732.60609@staffmail.ed.ac.uk>
Content-Type: text/plain; charset=UTF-8; format=flowed
Hi Vishal
The web translation components that you mention are quite old and
haven't been used much in a while.
As far as I can see, the main problem is that translate.cgi expects
to have many copies of daemon.pl running, all listening on different
ports. Each one should wrap a different instance of Moses. So this web
translation system is not multi-threaded (it was written before Moses
had threads); it is multi-process.
cheers - Barry
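The multi-process setup Barry describes can be pictured roughly as follows: a pool of daemon processes, one per port, with the front end choosing a port for each request. This is only a Python sketch of that idea, not the actual translate.cgi or daemon.pl logic; the port numbers, the function names and the one-line-in, one-line-out protocol are all assumptions.

```python
import itertools
import socket

# Hypothetical pool: one daemon process (each wrapping its own Moses
# instance) is assumed to be listening on each of these ports.
DAEMON_PORTS = [7070, 7071, 7072]


def make_port_picker(ports):
    """Return a function that hands out ports in round-robin order."""
    cycle = itertools.cycle(ports)
    return lambda: next(cycle)


def translate_line(text, host, port, timeout=30.0):
    """Send one line to a daemon and return the one line it sends back.

    The newline-terminated request/response protocol is an assumption
    made for this sketch, not a description of daemon.pl.
    """
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(text.encode("utf-8") + b"\n")
        with conn.makefile("r", encoding="utf-8") as reply:
            return reply.readline().rstrip("\n")
```

A front end would create one picker at start-up, for example `pick = make_port_picker(DAEMON_PORTS)`, and call `translate_line(sentence, "localhost", pick())` per request. If only one daemon is actually running, requests sent to the other ports fail to connect, which would be consistent with translations arriving only some of the time.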
On 18/03/14 15:02, Vishal Goyal wrote:
> Respected All,
> Greetings
>
> We have installed the Moses server and a web server on a single Linux
> machine. When tested on localhost, the system works fine. After we
> installed it on the web server for public use, it sometimes works, but
> most of the time we do not get a translation of the input text. Instead,
> the input text is only transliterated by the post-processing script,
> transliterate.pl.
>
> The Moses server is running as daemon.pl. index.cgi takes the user
> input, and translate.cgi starts the Moses thread for translation.
>
> The error log shows that only one Moses thread is running; all the
> other threads are terminated with an error message (error log
> attached).
>
> Please also find attached the source code of daemon.pl, index.cgi and
> translate.cgi.
>
>
>
> --
> *Regards,*
> Vishal Goyal,
> Ph.D., M.Tech., MCA, M.C.S.D.
> Assistant Professor(Stage III) and Placement Coordinator,
> Department of Computer Science,
> Punjabi University Patiala-147002
> [*Online Hindi to Punjabi Machine Translation Tool -*
> http://h2p.learnpunjabi.org ]
> *[Research Cell: An International Journal of Engineering Sciences,
> http://ijoes.vidyapublications.com]*
>
--
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.
------------------------------
Message: 4
Date: Wed, 19 Mar 2014 08:54:11 +0000
From: Nicola Bertoldi <bertoldi@fbk.eu>
Subject: Re: [Moses-support] Moses using other LMs
To: Zheng Yuan <yuanzheng_bupt@126.com>
Cc: moses-support support <moses-support@mit.edu>
Message-ID: <9E1C7C79-E3F3-408F-AB93-A4319859360D@fbk.eu>
Content-Type: text/plain; charset="us-ascii"
Hi Zheng
IRSTLM is able to read and manage the Google n-grams.
Nicola
On 03/18/14 20:39, Hieu Hoang wrote:
> Moses supports RandLM and neural network LMs, which can handle very large
> amounts of data, I think.
>
> I'm not sure if IRSTLM or KenLM can handle Google n-gram data, but I know
> they can handle large amounts of data.
>
> On 17 March 2014 14:56, Zheng Yuan <yuanzheng_bupt@126.com> wrote:
>
> Hi,
>
> I am wondering whether it is possible for Moses to use other kinds of
> LMs, such as an existing web interface or the Google n-grams?
>
> Regards,
> Zheng
> --
> Hieu Hoang
> Research Associate
> University of Edinburgh
> http://www.hoang.co.uk/hieu
------------------------------
Message: 5
Date: Wed, 19 Mar 2014 10:20:40 +0100
From: "Mirkin, Shachar" <shachar.mirkin@xrce.xerox.com>
Subject: Re: [Moses-support] Fwd: Moses-support post from
shachar.mirkin@xrce.xerox.com requires approval
To: ugermann@inf.ed.ac.uk
Cc: Hieu Hoang <hieu.hoang@ed.ac.uk>, moses-support
<moses-support@mit.edu>
Message-ID: <53296168.2020000@xrce.xerox.com>
Content-Type: text/plain; charset="iso-8859-1"
Hi Ulrich,
Thanks for the detailed answer. I'll try that and follow up on the
progress of this work.
Shachar
On 03/18/2014 07:50 PM, Ulrich Germann wrote:
> PhraseDictionaryDynSuffixArray is deprecated and should not be used
> any more. It will be replaced with memory-mapped suffix array phrase
> tables (mmsapt) which are currently in the branch dynamic-phrase-tables.
>
> In order to use them, you need:
>
> - the two text files, one sentence per line
> - the word alignments in symal format
>
> let fr be the language tag for the language you are translating from
> and en the tag for the language we are translating to
>
> cat train.fr | mtt-build -i -o train.fr
> cat train.en | mtt-build -i -o train.en
> cat train.symal | symal2mam train.fr-en.mam
> mmlex-build train fr en -o train.fr-en.lex -c train.fr-en.coc
>
> then in moses.ini, the line for the phrase table should look like this:
>
> Mmsapt name=PT0 output-factor=0 num-features=5 base=/path/to/train L1=fr L2=en
>
> No guarantee that this works; this is work in progress. Probably won't
> work on Mac, and works in multi-threaded mode only.
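Before running the build commands above, it may be worth checking that the three inputs line up, since the two corpus files and the alignment file must correspond line by line. A minimal Python sketch of such a check follows. It assumes that each alignment line holds space-separated, 0-based `i-j` token index pairs (the usual symal output); the function name is made up for this example and is not part of Moses.

```python
def check_parallel_data(src_lines, tgt_lines, align_lines):
    """Return a list of problems found in sentence-aligned training data.

    Each alignment line is assumed to hold space-separated "i-j" pairs,
    where i indexes a source token and j a target token (both 0-based).
    """
    problems = []
    if not (len(src_lines) == len(tgt_lines) == len(align_lines)):
        problems.append(
            "line counts differ: %d source, %d target, %d alignment"
            % (len(src_lines), len(tgt_lines), len(align_lines))
        )
    # zip stops at the shortest input, so only complete triples are checked.
    for n, (src, tgt, ali) in enumerate(
            zip(src_lines, tgt_lines, align_lines), start=1):
        src_len, tgt_len = len(src.split()), len(tgt.split())
        for point in ali.split():
            i, sep, j = point.partition("-")
            if not (sep and i.isdecimal() and j.isdecimal()):
                problems.append("line %d: malformed point %r" % (n, point))
            elif int(i) >= src_len or int(j) >= tgt_len:
                problems.append("line %d: point %s out of range" % (n, point))
    return problems
```

To use it on the files named above, read each one with something like `open("train.fr", encoding="utf-8").read().splitlines()` and pass the three lists in; an empty result means no problem was detected by these particular checks.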
>
> - Uli
>
>
>
> On Mon, Mar 17, 2014 at 4:17 PM, Mirkin, Shachar
> <shachar.mirkin@xrce.xerox.com> wrote:
>
> Hi,
>
> I'm now subscribed also from this email address.
>
> Let me give more details about the problems that I encountered.
> Trying to load the Moses server with the modified ini file, after
> replacing the PhraseDictionaryBinary line with:
>
> PhraseDictionaryDynSuffixArray source=<path-to-source-corpus> target=<path-to-target-corpus> alignment=<path-to-alignments>
>
> (with the correct paths, of course), I got:
>
> Feature function PhraseDictionaryDynSuffixArray0 specified 1 dense
> scores or weights. Actually has 0
>
> This was solved by adding "num-features=0" to the
> PhraseDictionaryDynSuffixArray line.
>
> The next error was:
>
> ...
> Loading source corpus...
> terminate called after throwing an instance of
> 'Moses::StrayFactorException'
> what(): moses/Word.cpp:112 in void
> Moses::Word::CreateFromString(Moses::FactorDirection, const
> std::vector<long unsigned int, std::allocator<long unsigned int>
> >&, const StringPiece&, bool) threw StrayFactorException because
> `fit'.
> You have configured 0 factors but the word le contains factor
> delimiter | too many times.
>
> In this test my source, target and alignment files consist each of
> a single line with no "|"s, and the word "le" is the first one in
> the source.
>
> Is there anything else I should do in the ini file?
>
> Thanks,
> Shachar
>
>
>
>
> On 03/17/2014 02:58 PM, Hieu Hoang wrote:
>> Hi Shachar
>>
>> Can you please subscribe to the mailing list before posting to
>> it? It's a public email address, so there are a lot of automated
>> spammers. You can subscribe here:
>> http://mailman.mit.edu/mailman/listinfo/moses-support
>>
>> To answer your question: the webpage does document it in the new
>> ini format, e.g.
>> PhraseDictionaryDynSuffixArray source=<path-to-source-corpus> ...
>> Do you have a printout of the old version?
>>
>> Also, the dynamic suffix array is undergoing updates, as Uli
>> Germann (cc'ed) is extending it with more features. He can tell
>> you more about it.
>>
>>
>> ---------- Forwarded message ----------
>> From: <moses-support-owner@mit.edu>
>> Date: 17 March 2014 12:13
>> Subject: Moses-support post from shachar.mirkin@xrce.xerox.com requires approval
>> To: moses-support-owner@mit.edu
>>
>>
>> As list administrator, your authorization is requested for the
>> following mailing list posting:
>>
>> List: Moses-support@mit.edu
>> From: shachar.mirkin@xrce.xerox.com
>> Subject: Incremental training and the new ini format
>> Reason: Post by non-member to a members-only list
>>
>> At your convenience, visit:
>>
>> http://mailman.mit.edu/mailman/admindb/moses-support
>>
>> to approve or deny the request.
>>
>>
>> ---------- Forwarded message ----------
>> From: "Mirkin, Shachar" <shachar.mirkin@xrce.xerox.com>
>> To: moses-support@mit.edu
>> Cc:
>> Date: Mon, 17 Mar 2014 13:06:47 +0100
>> Subject: Incremental training and the new ini format
>> Hi,
>>
>> I'm trying to use incremental training with the latest Moses
>> version, but the documentation refers to the old ini format
>> (http://www.statmt.org/moses/?n=Moses.AdvancedFeatures#ntoc34).
>> Can you please explain what changes are required to get the
>> incremental training working with the new ini format?
>>
>> Thanks,
>> Shachar
>>
>>
>>
>>
>> --
>> Hieu Hoang
>> Research Associate
>> University of Edinburgh
>> http://www.hoang.co.uk/hieu
>>
>
>
>
>
> --
> Ulrich Germann
> Research Associate
> School of Informatics
> University of Edinburgh
------------------------------
_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support
End of Moses-support Digest, Vol 89, Issue 43
*********************************************