Send Moses-support mailing list submissions to
moses-support@mit.edu
To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
moses-support-request@mit.edu
You can reach the person managing the list at
moses-support-owner@mit.edu
When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."
Today's Topics:
1. Re: Warning: Too many arguments while IRSTLM language model
Training (renubalyan)
2. Re: Warning: Too many arguments while IRSTLM language model
Training (renubalyan)
----------------------------------------------------------------------
Message: 1
Date: Thu, 5 Dec 2013 20:27:11 +0530 (IST)
From: renubalyan <renubalyan@cdac.in>
Subject: Re: [Moses-support] Warning: Too many arguments while IRSTLM
language model Training
To: Prashant Mathur <prashant@fbk.eu>
Cc: moses-support@mit.edu
Message-ID:
<1776772134.12360.1386255431231.JavaMail.open-xchange@webmail.cdac.in>
Content-Type: text/plain; charset="utf-8"
Hi Prashant,
Thanks for the response.
The manual nowhere mentions the iARPA format you refer to. It says that these
5 steps should produce a *.arpa.en file, which is then binarized with KenLM in
the next step.
But I cannot get the .arpa file because of the warning.
I am currently using IRSTLM version 5.80.03, which I think is the latest one.
---------------
Renu
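
Editor's sketch: one thing that may be worth trying for step 5 (quoted below) is passing the option as a single argument, "--text=yes", instead of the two arguments "--text yes". This rests on the assumption that the installed IRSTLM release parses options in the "name=value" form and therefore counts the stray "yes" as an extra positional parameter; the thread does not confirm this for 5.80.03. The COMPILE_LM variable and the to_arpa helper are hypothetical names used only for illustration.

```shell
# Hypothetical helper: convert an IRSTLM model to plain ARPA text.
# COMPILE_LM is an assumed variable; point it at your own compile-lm binary.
COMPILE_LM=${COMPILE_LM:-/home/renu/Desktop/irstlm/bin/compile-lm}

to_arpa() {
    in=$1
    out=$2
    # Assumption: this IRSTLM release expects "--text=yes" (one argument)
    # rather than "--text yes" (two arguments).
    "$COMPILE_LM" --text=yes "$in" "$out" || return 1
    # Succeed only if a non-empty output file was actually written.
    [ -s "$out" ]
}

# Usage (not run here):
#   to_arpa news-commentary-v8.fr-en.lm.en.gz news-commentary-v8.fr-en.arpa.en
```

If the helper returns non-zero, the warning text printed by compile-lm is still the best clue, so it is left on the terminal rather than hidden.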
On December 5, 2013 at 4:51 PM Prashant Mathur <prashant@fbk.eu> wrote:
> The command is correct, so it should work. You are using compile-lm to
> convert from iARPA to ARPA format.
> Try updating irstlm toolkit to latest version.
>
> --
> Prashant
>
> On 12/04/2013 04:58 PM, renubalyan wrote:
>
> > > Hi,
> >
> > I am building the baseline system based on Moses manual instructions.
> >
> > I have installed Moses, GIZA++ and IRSTLM as mentioned in the manual.
> > The corpus preparation (tokenization, ...cleaning) steps also go
> > well.
> >
> > However when I move to Language Model Training: I have some problems
> >
> > I am following these steps:
> >
> > 1. mkdir ~/lm
> >
> > 2. cd ~/lm
> >
> > 3. /home/renu/Desktop/irstlm/bin/add-start-end.sh <
> > /home/renu/Desktop/corpus/news-commentary-v8.fr-en.true.en>
> > news-commentary-v8.fr-en.sb.en
> >
> > 4. export IRSTLM=/home/renu/Desktop/irstlm;
> > /home/renu/Desktop/irstlm/bin/build-lm.sh -i news-commentary-v8.fr-en.sb.en
> > -t ./tmp -p -s improved-kneser-ney -o news-commentary-v8.fr-en.lm.en
> >
> > 5. /home/renu/Desktop/irstlm/bin/compile-lm --text yes
> > news-commentary-v8.fr-en.lm.en.gz news-commentary-v8.fr-en.arpa.en
> >
> > Steps 1-4 work well, but step 5 gives me the message
> > (Warning:Too many parameters)
> >
> > I have searched the web for any possible solution but could not find
> > any.
> >
> > I am not able to move ahead, kindly help.
> >
> > Thanks
> > Renu
> >
> >
> > -------------------------------------------------------------------------------------------------------------------------------
> > This e-mail is for the sole use of the intended recipient(s) and may
> > contain confidential and privileged information. If you are not the
> > intended recipient, please contact the sender by reply e-mail and destroy
> > all copies and the original message. Any unauthorized review, use,
> > disclosure, dissemination, forwarding, printing or copying of this email
> > is strictly prohibited and appropriate legal action will be taken.
> >
> > -------------------------------------------------------------------------------------------------------------------------------
> >
> >
> > _______________________________________________
> > Moses-support mailing list
> > Moses-support@mit.edu <mailto:Moses-support@mit.edu>
> > http://mailman.mit.edu/mailman/listinfo/moses-support
> > <http://mailman.mit.edu/mailman/listinfo/moses-support>
> >
> > >
------------------------------
Message: 2
Date: Thu, 5 Dec 2013 20:42:41 +0530 (IST)
From: renubalyan <renubalyan@cdac.in>
Subject: Re: [Moses-support] Warning: Too many arguments while IRSTLM
language model Training
To: Hieu Hoang <Hieu.Hoang@ed.ac.uk>, moses-support
<moses-support@mit.edu>
Message-ID:
<979774460.12384.1386256361663.JavaMail.open-xchange@webmail.cdac.in>
Content-Type: text/plain; charset="utf-8"
Hi,
Thanks for the response.
I tried this option too. If I run the command without the '--text yes' option,
it runs fine. However, I wanted to ask one thing: does this give me an ARPA
file or a binarized one? I ask because when I run the next command mentioned in
the manual:
6. /home/renu/Desktop/mosesdecoder/bin/build_binary
news-commentary-v8.fr-en.arpa.en news-commentary-v8.fr-en.blm.en
I get the following output:
Reading news-commentary-v8.fr-en.arpa.en
----5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
****************************************************************************************************
lm/read_arpa.cc:63 in void lm::ReadARPACounts(util::FilePiece&, std::vector<long
long unsigned int>&) threw FormatLoadException because `line.size() >= 4 &&
StringPiece(line.data(), 4) == "blmt"'.
This looks like an IRSTLM binary file. Did you forget to pass --text yes to
compile-lm? Byte: 40 File: news-commentary-v8.fr-en.arpa.en
ERROR
The second-to-last line of the output (the one starting with "This looks like
an IRSTLM binary file") indicates that the file I am using is a binary file.
Does that mean I already have a binary file and do not need step 6 above
(which is in fact for converting an ARPA file to a binary one)?
Thanks
Renu
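
Editor's sketch: the KenLM error above rejects the file because its first four bytes are "blmt". The same check can be run by hand before calling build_binary. The function name lm_format is hypothetical, and the rule that a plain ARPA file has "\data\" as its first non-empty line is an assumption about the ARPA text format, not something stated in this thread.

```shell
# Guess the format of a language model file from how it starts.
# Assumption: an IRSTLM binary file starts with the 4 bytes "blmt"
# (the test quoted in the KenLM error), and a plain ARPA file has
# "\data\" as its first non-empty line.
lm_format() {
    file=$1
    # Read the first 4 bytes; stderr is silenced in case of NUL bytes.
    magic=$(head -c 4 "$file" 2>/dev/null)
    # First line that contains at least one non-blank field.
    first=$(awk 'NF { print; exit }' "$file" 2>/dev/null)
    if [ "$magic" = "blmt" ]; then
        echo "irstlm-binary"
    elif [ "$first" = '\data\' ]; then
        echo "arpa"
    else
        echo "unknown"
    fi
}

# Usage (not run here):
#   lm_format news-commentary-v8.fr-en.arpa.en
```

A result of "irstlm-binary" for a file named *.arpa.en would suggest that the file extension does not reflect its actual contents, which matches what build_binary reports.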
On December 5, 2013 at 4:19 PM Hieu Hoang <Hieu.Hoang@ed.ac.uk> wrote:
> I'm not sure what is
> --text yes
> this is how the EMS runs IRSTLM compile-lm:
> .../compile-lm .../europarl_pos.lm.4 .../europarl_pos.binlm.4
>
>
>
> On 4 December 2013 15:58, renubalyan <renubalyan@cdac.in
> <mailto:renubalyan@cdac.in> > wrote:
> > > Hi,
> >
> > I am building the baseline system based on Moses manual instructions.
> >
> > I have installed Moses, GIZA++ and IRSTLM as mentioned in the manual.
> > The corpus preparation (tokenization, ...cleaning) steps also go well.
> >
> > However when I move to Language Model Training: I have some problems
> >
> > I am following these steps:
> >
> > 1. mkdir ~/lm
> >
> > 2. cd ~/lm
> >
> > 3. /home/renu/Desktop/irstlm/bin/add-start-end.sh <
> > /home/renu/Desktop/corpus/news-commentary-v8.fr-en.true.en>
> > news-commentary-v8.fr-en.sb.en
> >
> > 4. export IRSTLM=/home/renu/Desktop/irstlm;
> > /home/renu/Desktop/irstlm/bin/build-lm.sh -i news-commentary-v8.fr-en.sb.en
> > -t ./tmp -p -s improved-kneser-ney -o news-commentary-v8.fr-en.lm.en
> >
> > 5. /home/renu/Desktop/irstlm/bin/compile-lm --text yes
> > news-commentary-v8.fr-en.lm.en.gz news-commentary-v8.fr-en.arpa.en
> >
> > Steps 1-4 work well, but step 5 gives me the message
> > (Warning:Too many parameters)
> >
> > I have searched the web for any possible solution but could not find any.
> >
> > I am not able to move ahead, kindly help.
> >
> > Thanks
> > Renu
> >
> >
> >
> > _______________________________________________
> > Moses-support mailing list
> > Moses-support@mit.edu <mailto:Moses-support@mit.edu>
> > http://mailman.mit.edu/mailman/listinfo/moses-support
> > <http://mailman.mit.edu/mailman/listinfo/moses-support>
> > >
>
>
> --
> Hieu Hoang
> Research Associate
> University of Edinburgh
> http://www.hoang.co.uk/hieu <http://www.hoang.co.uk/hieu>
>
>
------------------------------
_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support
End of Moses-support Digest, Vol 86, Issue 14
*********************************************