Send Moses-support mailing list submissions to
moses-support@mit.edu
To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
moses-support-request@mit.edu
You can reach the person managing the list at
moses-support-owner@mit.edu
When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."
Today's Topics:
1. Re: moses.ini file changes (Hieu Hoang)
2. Re: moses.ini file changes (Andrzej Zydron)
3. Re: Moses-support post from nad_star06@yahoo.com requires
approval (Hieu Hoang)
4. Language Model Training failed (Janez Kadivec)
----------------------------------------------------------------------
Message: 1
Date: Mon, 3 Mar 2014 17:54:32 +0000
From: Hieu Hoang <Hieu.Hoang@ed.ac.uk>
Subject: Re: [Moses-support] moses.ini file changes
To: Andrzej Zydron <azydron@xtm-intl.com>
Cc: moses-support <moses-support@mit.edu>
Message-ID:
<CAEKMkbgapkPM0EP0-nDorS-6JkBb74yFAH3AdGJobxTZyspk8g@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"
You can see examples of equivalent old and new ini file formats in the
regression tests, e.g.
https://github.com/moses-smt/moses-regression-tests/blob/master/tests/phrase.basic-surface-only-withirstlm-binlm.oldformat/moses.ini
https://github.com/moses-smt/moses-regression-tests/blob/master/tests/phrase.basic-surface-only-withirstlm-binlm/moses.ini
and
https://github.com/moses-smt/moses-regression-tests/blob/master/tests/phrase.show-weights.lex-reorder.oldformat/moses.ini
https://github.com/moses-smt/moses-regression-tests/blob/master/tests/phrase.show-weights.lex-reorder/moses.ini
Also, you can see the difference if you run an old ini file through
scripts/training/convert-moses-ini-to-v2.perl
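For the ttable-file question specifically, here is a rough Python sketch of the kind of mapping involved. It is not the conversion script itself. It assumes an old-format [ttable-file] entry has five whitespace-separated fields (implementation code, input factor, output factor, number of scores, path) and that implementation code 0 means the in-memory phrase table. The function name and the example path are hypothetical, and other implementation codes are deliberately not handled.

```python
def convert_ttable_line(old_line: str) -> str:
    """Turn one old-style [ttable-file] entry into a new-style [feature] line.

    Assumed old layout: "<impl> <input-factor> <output-factor> <num-scores> <path>".
    Only impl code 0 (assumed to be the in-memory table) is handled here.
    """
    impl, in_factor, out_factor, num_scores, path = old_line.split(maxsplit=4)
    if impl != "0":
        raise ValueError(f"implementation code {impl} is not handled in this sketch")
    return (
        f"PhraseDictionaryMemory input-factor={in_factor} "
        f"output-factor={out_factor} path={path.strip()} "
        f"num-features={num_scores}"
    )


if __name__ == "__main__":
    # Hypothetical path, for illustration only.
    print(convert_ttable_line("0 0 0 5 /path/to/phrase-table.gz"))
```

For real ini files the converter script above remains the safer route, since it also deals with the weights and the other feature functions.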
On 3 March 2014 17:03, Andrzej Zydron <azydron@xtm-intl.com> wrote:
> Dear Moses-support,
>
> Can someone point me at any documentation or source module that
> describes the parameters of the moses.ini both 'old' and new '2.1'
> formats, specifically what has become of the ttable-file: parameters and
> their new equivalents.
>
> --
> Thank you in advance,
>
> Andrzej Zydroń
>
> ---------------------------------------
>
> CTO
>
> *XTM International Ltd.*
>
> PO Box 2167, Gerrards Cross, SL9 8XF, UK
>
> email: azydron@xtm-intl.com <mailto:azydron@xtm-intl.com>
>
> Tel: +44 (0) 1753 480 479
>
> Mob: +44 (0) 7966 477 181
>
> skype: Zydron
>
> www.xtm-intl.com <http://www.xtm-intl.com/>
>
>
> _______________________________________________
> Moses-support mailing list
> Moses-support@mit.edu
> http://mailman.mit.edu/mailman/listinfo/moses-support
>
--
Hieu Hoang
Research Associate
University of Edinburgh
http://www.hoang.co.uk/hieu
------------------------------
Message: 2
Date: Mon, 03 Mar 2014 17:59:44 +0000
From: Andrzej Zydron <azydron@xtm-intl.com>
Subject: Re: [Moses-support] moses.ini file changes
To: Hieu Hoang <Hieu.Hoang@ed.ac.uk>
Cc: moses-support <moses-support@mit.edu>
Message-ID: <5314C310.8080605@xtm-intl.com>
Content-Type: text/plain; charset=UTF-8; format=flowed
Many thanks Hieu,
Best Regards,
Andrzej Zydroń
---------------------------------------
CTO
*XTM International Ltd.*
PO Box 2167, Gerrards Cross, SL9 8XF, UK
email: azydron@xtm-intl.com <mailto:azydron@xtm-intl.com>
Tel: +44 (0) 1753 480 479
Mob: +44 (0) 7966 477 181
skype: Zydron
www.xtm-intl.com <http://www.xtm-intl.com/>
------------------------------
Message: 3
Date: Mon, 03 Mar 2014 20:47:11 +0000
From: Hieu Hoang <hieuhoang@gmail.com>
Subject: Re: [Moses-support] Moses-support post from
nad_star06@yahoo.com requires approval
To: nadeem khan <nad_star06@yahoo.com>, moses-support
<moses-support@mit.edu>
Message-ID: <5314EA4F.60005@gmail.com>
Content-Type: text/plain; charset="iso-8859-1"
Hi Nadeem,
Please make large files like your corpus file available for
download rather than emailing them. I personally use Dropbox.
To answer your question: there are many | characters in your corpus. You
must remove these lines, or use the script
scripts/tokenizer/escape-special-chars.perl
to escape them.
You can see which lines contain the | character using the command below:
$ grep -nF "|" hin-eng-train-lw.* | head -2
hin-eng-train-lw.hn:772:? ? ??? ??? ??????? ???? ???? ?? ??? ??????? |
hin-eng-train-lw.hn:773:????? ?? ?????? ?????? ? ? ? ? ??? ???? ?? ?? ?
? ( 5 ?? ? ? ?? 16 ?? ) ?? ??? ?? ? ??? ?? ???? ??? ?? ??? ? ?? ? ? ???
? ? ???? ????? |
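If you would rather check and clean the corpus programmatically, the same idea can be sketched in Python. This is only an illustration, not a replacement for escape-special-chars.perl: the function names are hypothetical, it assumes the pipe should become the entity &#124;, and it ignores the other special characters the Perl script handles.

```python
def find_pipe_lines(lines):
    """Return (line_number, text) pairs for every line containing '|'."""
    return [(n, line) for n, line in enumerate(lines, start=1) if "|" in line]


def escape_pipes(line: str) -> str:
    """Replace the Moses factor delimiter with an entity (assumed: &#124;)."""
    return line.replace("|", "&#124;")


if __name__ == "__main__":
    sample = ["a clean line", "a line ending with a bar |"]
    for number, text in find_pipe_lines(sample):
        print(number, escape_pipes(text))
```

Whichever route you take, apply it to both sides of the parallel corpus so the sentence alignment stays intact.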
On 03/03/2014 18:09, moses-support-owner@mit.edu wrote:
error is still there;
Exception: moses/Word.cpp:112 in void
Moses::Word::CreateFromString(Moses::FactorDirection, const
std::vector<unsigned int>&, const StringPiece&, bool) threw
StrayFactorException because `fit'.
You have configured 1 factors but the word | contains factor delimiter |
too many times.
My training corpus is attached with the message..
On Thursday, February 27, 2014 6:15 AM, Philipp Koehn
<pkoehn@inf.ed.ac.uk> wrote:
Hi,
as the error message says, please remove all bar characters "|" from your
training corpus when building the phrase table.
-phi
On Wed, Feb 26, 2014 at 7:58 PM, nadeem khan <nad_star06@yahoo.com
<mailto:nad_star06@yahoo.com>> wrote:
> Hi all;
> I am getting this error while running decoder with alignment flags:
>
> FeatureFunction: UnknownWordPenalty0 start: 9 end: 9
> line=PhraseDictionaryMemory input-factor=0 output-factor=0
> path=/home/legends/work/hin-eng/f5/model/phrase-table.gz num-features=5
> table-limit=20
> FeatureFunction: PhraseDictionaryMemory0 start: 10 end: 14
> Loading SRILM0
> /home/legends/work/hin-eng/f5/lm/urd-eng.lm: line 4317: warning: non-zero
> probability for <unk> in closed-vocabulary LM
> Loading Distortion0
> Loading LexicalReordering0
> Loading table into memory...done.
> Loading WordPenalty0
> Loading UnknownWordPenalty0
> Loading PhraseDictionaryMemory0
> Start loading text SCFG phrase table. Moses format : [8.000] seconds
> Reading /home/legends/work/hin-eng/f5/model/phrase-table.gz
> ----5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
> ****************************************************************************************************
> Exception: moses/Word.cpp:112 in void
> Moses::Word::CreateFromString(Moses::FactorDirection, const
> std::vector<unsigned int>&, const StringPiece&, bool) threw
> StrayFactorException because `fit'.
> You have configured 1 factors but the word | contains factor delimiter | too
> many times.
>
>
> Please help out in fixing it.
> THANKS
> Regards
> Nadeem
>
> _______________________________________________
> Moses-support mailing list
> Moses-support@mit.edu <mailto:Moses-support@mit.edu>
> http://mailman.mit.edu/mailman/listinfo/moses-support
>
------------------------------
Message: 4
Date: Mon, 3 Mar 2014 22:36:02 +0100
From: Janez Kadivec <jankad@zop-cr.com>
Subject: [Moses-support] Language Model Training failed
To: "moses-support@mit.edu" <moses-support@mit.edu>
Message-ID:
<CA+viJsd-cPs1AFNOy0nWVm8-weu3Nqwd3xzo4MMs+RAyWP116g@mail.gmail.com>
Content-Type: text/plain; charset="iso-8859-1"
Hello!
We are moving slowly through the Moses MT preparation task and have reached
the language model training step. We are following the Moses Baseline:
The language model (LM) is used to ensure fluent output, so it is built
with the target language (i.e English in this case). The IRSTLM
documentation gives a full explanation of the command-line options, but the
following will build an appropriate 3-gram language model, removing
singletons, smoothing with improved Kneser-Ney, and adding sentence
boundary symbols:
mkdir ~/lm
cd ~/lm
~/irstlm/bin/add-start-end.sh \
< ~/corpus/news-commentary-v8.fr-en.true.en \
> news-commentary-v8.fr-en.sb.en
export IRSTLM=$HOME/irstlm; ~/irstlm/bin/build-lm.sh \
-i news-commentary-v8.fr-en.sb.en \
-t ./tmp -p -s improved-kneser-ney -o news-commentary-v8.fr-en.lm.en
~/irstlm/bin/compile-lm --text news-commentary-v8.fr-en.lm.en.gz \
news-commentary-v8.fr-en.arpa.en
The first four commands executed successfully. The last one failed.
Here is the output after entering that command:
zzz@zzz-laptop:~/lm$ ~/moses/irstlm/bin/compile-lm --text
news-commentary-v8.fr-en.lm.en.gz news-commentary-v8.fr-en.arpa.en
inpfile: news-commentary-v8.fr-en.arpa.en
loading up to the LM level 1000 (if any)
dub: 10000000
Failed to open news-commentary-v8.fr-en.arpa.en!
zzz@zzz-laptop:~/lm$
----------------
Where did we make a mistake? I see the xxx.arpa.en file listed as the input file.
Shouldn't the xxx.arpa.en file be an output file?
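One possible explanation, offered as an assumption and not confirmed anywhere in this digest: if the installed IRSTLM expects --text to carry an explicit value, then the .lm.en.gz name is consumed as that value and the .arpa.en name is taken as the input file. The Python sketch below builds the command with --text=yes under that assumption. The helper names are hypothetical, and the exact option syntax should be checked against the usage message of the installed compile-lm.

```python
import os
import subprocess


def compile_lm_command(binary, lm_gz, arpa_out):
    """Build the argument list, giving --text an explicit value (assumed syntax)."""
    return [os.path.expanduser(binary), "--text=yes", lm_gz, arpa_out]


def run_compile_lm(binary, lm_gz, arpa_out):
    """Run compile-lm after checking that the input LM file actually exists."""
    if not os.path.exists(lm_gz):
        raise FileNotFoundError(lm_gz)
    subprocess.run(compile_lm_command(binary, lm_gz, arpa_out), check=True)


if __name__ == "__main__":
    # File names are the ones from the message above; nothing is executed here.
    print(compile_lm_command(
        "~/irstlm/bin/compile-lm",
        "news-commentary-v8.fr-en.lm.en.gz",
        "news-commentary-v8.fr-en.arpa.en",
    ))
```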
Best regards!
Janez Kadivec
------------------------------
End of Moses-support Digest, Vol 89, Issue 5
********************************************