Send Moses-support mailing list submissions to
moses-support@mit.edu
To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
moses-support-request@mit.edu
You can reach the person managing the list at
moses-support-owner@mit.edu
When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."
Today's Topics:
1. Re: escape-special-characters (Philipp Koehn)
2. Re: phrase table first into compact or binary format.
(Philipp Koehn)
3. Segmentation fault (Massinissa Ahmim)
----------------------------------------------------------------------
Message: 1
Date: Fri, 27 Jun 2014 09:31:47 -0400
From: Philipp Koehn <pkoehn@inf.ed.ac.uk>
Subject: Re: [Moses-support] escape-special-characters
To: charmaine ponay <csponay@gmail.com>
Cc: "moses-support@mit.edu" <moses-support@mit.edu>
Message-ID:
<CAAFADDA5fepxbC2Vqz8gPczDF4_r=SEuhOkRkNXd4r6P9M0gjQ@mail.gmail.com>
Content-Type: text/plain; charset=UTF-8
Hi,
escape-special-chars.perl is not language specific, so you should not
use the language switch.
Also, it does not do tokenization, so if you want your data tokenized,
you should use the tokenizer instead, which also escapes special
characters.
You do not need to train the language model on the clean data -
training it on all the data will do (slightly) better.
When running train-model.perl by hand, specify the right language
model with the -lm switch to get a proper language model entry in the
config file.
-phi
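
A minimal sketch of the preprocessing steps this reply points to, under
stated assumptions: the paths come from the quoted question below, the
output names training.esc.* and training.en.lm are hypothetical, and "fl"
is simply the language code the question uses. The functions only print
the command lines (a dry run), so nothing here needs Moses or SRILM to be
installed.

```shell
#!/bin/sh
# Dry-run sketch: each function prints a command line instead of executing it.
MOSES=../mosesdecoder-master   # Moses checkout, as in the question
DATA=Charmaine                 # data directory, as in the question

# Tokenize one side of the corpus; per the reply, the tokenizer also
# escapes special characters. $1 = language code.
tokenize_cmd() {
    echo "$MOSES/scripts/tokenizer/tokenizer.perl -l $1 < $DATA/training.$1 > $DATA/preprocessed/training.tok.$1"
}

# Escape only, without tokenizing. No -l switch: the script is not
# language specific. $1 is used for file names only.
escape_cmd() {
    echo "$MOSES/scripts/tokenizer/escape-special-chars.perl < $DATA/training.$1 > $DATA/preprocessed/training.esc.$1"
}

# Train the language model on all tokenized target-side data rather than
# on the cleaned subset. The output file name is hypothetical.
lm_cmd() {
    echo "ngram-count -text $DATA/preprocessed/training.tok.en -lm $DATA/preprocessed/training.en.lm -order 3"
}

tokenize_cmd en
tokenize_cmd fl
lm_cmd
```

The language model file produced by the last command is the one that
would then be passed to train-model.perl through the -lm switch, in the
factor:order:filename form used in the question.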
On Fri, Jun 27, 2014 at 1:30 AM, charmaine ponay <csponay@gmail.com> wrote:
> hi please help...
>
> I want to use the escape special characters function, is this the correct
> way?
>
> ../mosesdecoder-master/scripts/tokenizer/escape-special-chars.perl -l en \
> < `pwd`/Charmaine/training.en \
> > `pwd`/Charmaine/preprocessed/training.tok.en
> ../mosesdecoder-master/scripts/tokenizer/escape-special-chars.perl -l fl \
> < `pwd`/Charmaine/training.fl \
> > `pwd`/Charmaine/preprocessed/training.tok.fl
>
> then, do I have to make changes with the following commands?
>
> ../mosesdecoder-master/scripts/training/clean-corpus-n.perl \
> `pwd`/Charmaine/training fl en \
> `pwd`/Charmaine/preprocessed/training.clean 1 80
>
> ngram-count -text Charmaine/preprocessed/training.clean.en -lm \
> Charmaine/preprocessed/training.clean.en.lm -order 3
>
> ../mosesdecoder-master/scripts/training/train-model.perl \
> -external-bin-dir ../mosesdecoder-master/tools \
> --root-dir `pwd`/Charmaine/training \
> --corpus `pwd`/Charmaine/preprocessed/training.clean \
> --f fl --e en \
> --lm 0:3:`pwd`/Charmaine/
>
> thank you!!!
>
>
> Regards,
>
> Charmaine Salvador - Ponay
> Instructor
> Information and Computer Studies Dept.
> University of Santo Tomas
>
> _______________________________________________
> Moses-support mailing list
> Moses-support@mit.edu
> http://mailman.mit.edu/mailman/listinfo/moses-support
>
------------------------------
Message: 2
Date: Fri, 27 Jun 2014 09:35:02 -0400
From: Philipp Koehn <pkoehn@inf.ed.ac.uk>
Subject: Re: [Moses-support] phrase table first into compact or binary
format.
To: charmaine ponay <csponay@gmail.com>
Cc: "moses-support@mit.edu" <moses-support@mit.edu>
Message-ID:
<CAAFADDBXEhoJooCYzCYvUbkqFULg7MtmkmMHwcCtSSxjDuWWFg@mail.gmail.com>
Content-Type: text/plain; charset=UTF-8
Hi,
this is described here:
http://www.statmt.org/moses/?n=Moses.AdvancedFeatures#ntoc6
http://www.statmt.org/moses/?n=Moses.AdvancedFeatures#ntoc10
-phi
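
As a rough sketch of what the conversions described on those pages look
like, assuming the processPhraseTableMin and processPhraseTable tools
were built with Moses: the install and model paths below are
hypothetical, and the score count of 4 is an assumption that has to match
the actual table. The functions only print the commands (a dry run);
check the linked documentation for the options that apply to your
version.

```shell
#!/bin/sh
# Dry-run sketch: prints conversion commands instead of running them.
MOSES_BIN=/path/to/mosesdecoder/bin   # hypothetical install location
MODEL=/path/to/model                  # hypothetical model directory
NSCORES=4                             # assumed number of scores per phrase pair

# Compact phrase table, written under the given -out stem.
compact_cmd() {
    echo "$MOSES_BIN/processPhraseTableMin -in $MODEL/phrase-table.gz -out $MODEL/phrase-table -nscores $NSCORES"
}

# Binary (on-disk) phrase table; the text table is sorted in the C locale
# before being piped to the converter.
binary_cmd() {
    echo "gzip -cd $MODEL/phrase-table.gz | LC_ALL=C sort | $MOSES_BIN/processPhraseTable -ttable 0 0 - -nscores $NSCORES -out $MODEL/phrase-table"
}

compact_cmd
binary_cmd
```

After converting, the phrase table entry in moses.ini has to point at the
new files and use the matching table type; the linked pages give the
exact feature line.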
On Fri, Jun 27, 2014 at 1:34 AM, charmaine ponay <csponay@gmail.com> wrote:
> how do i convert table into compact or binary format?
>
> Regards,
>
> Charmaine Salvador - Ponay
> Instructor
> Information and Computer Studies Dept.
> University of Santo Tomas
------------------------------
Message: 3
Date: Fri, 27 Jun 2014 16:02:01 +0200
From: Massinissa Ahmim <massinissa.ahmim@linguacustodia.com>
Subject: [Moses-support] Segmentation fault
To: "moses-support@mit.edu" <moses-support@mit.edu>
Message-ID:
<CANN0mWay0EUABkfS14CRVfX_-V8VfTRHA57_uv4rnd3KL49AOg@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"
Dear All,
I'm having trouble running the latest version of Moses on a Debian
server.
I've tried training models with different corpora, but I keep getting the
same error.
When I query Moses with an unknown word, the process runs normally
to completion, but when the word is in the phrase table I get
the error below:
Defined parameters (per moses.ini or switch):
config: moses.ini
distortion-limit: 6
feature: UnknownWordPenalty WordPenalty PhrasePenalty
PhraseDictionaryMemory name=TranslationModel0 table-limit=20 num-features=4
path=/home/training/model/phrase-table.gz input-factor=0 output-factor=0
LexicalReordering name=LexicalReordering0 num-features=6
type=wbe-msd-bidirectional-fe-allff input-factor=0 output-factor=0
path=/home/training/model/reordering-table.wbe-msd-bidirectional-fe.gz
Distortion KENLM lazyken=0 name=LM0 factor=1
path=/home/Massi/UN.en-fr.en.lm.blm.en order=5
input-factors: 0
mapping: 0 T 0
weight: UnknownWordPenalty0= 1 WordPenalty0= -1 PhrasePenalty0= 0.2
TranslationModel0= 0.2 0.2 0.2 0.2 LexicalReordering0= 0.3 0.3 0.3 0.3 0.3
0.3 Distortion0= 0.3 LM0= 0.5
line=UnknownWordPenalty
FeatureFunction: UnknownWordPenalty0 start: 0 end: 0
line=WordPenalty
FeatureFunction: WordPenalty0 start: 1 end: 1
line=PhrasePenalty
FeatureFunction: PhrasePenalty0 start: 2 end: 2
line=PhraseDictionaryMemory name=TranslationModel0 table-limit=20
num-features=4 path=/home/training/model/phrase-table.gz input-factor=0
output-factor=0
FeatureFunction: TranslationModel0 start: 3 end: 6
line=LexicalReordering name=LexicalReordering0 num-features=6
type=wbe-msd-bidirectional-fe-allff input-factor=0 output-factor=0
path=/home/training/model/reordering-table.wbe-msd-bidirectional-fe.gz
FeatureFunction: LexicalReordering0 start: 7 end: 12
Initializing LexicalReordering..
line=Distortion
FeatureFunction: Distortion0 start: 13 end: 13
line=KENLM lazyken=0 name=LM0 factor=1
path=/home/Massi/UN.en-fr.en.lm.blm.en order=5
FeatureFunction: LM0 start: 14 end: 14
Loading UnknownWordPenalty0
Loading WordPenalty0
Loading PhrasePenalty0
Loading LexicalReordering0
Loading table into memory...done.
Loading Distortion0
Loading LM0
Loading TranslationModel0
Start loading text phrase table. Moses format : [0.031] seconds
Reading /home/training/model/phrase-table.gz
----5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
****************************************************************************************************
IO from STDOUT/STDIN
Created input-output object : [0.031] seconds
Translating line 0 in thread id 140175290300160
Translating: mmm
Line 0: Initialize search took 0.000 seconds total
Line 0: Collecting options took 0.000 seconds
*** Segmentation fault
Register dump:
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
RDX: 0000000001dfe000 RSI: 0000000000000000 RDI: 0000000001b2e340
RBP: 0000000001b24500 R8 : 0000000001b480c0 R9 : 0000000000000003
R10: 0000000001b0f9a0 R11: 0000000000000000 R12: 0000000001b48a40
R13: 0000000001b34100 R14: 0000000001b24500 R15: 0000000001b2e340
RSP: 00007f7d1a625c50
RIP: 0000000000500a61 EFLAGS: 00010246
CS: 0033 FS: 0000 GS: 0000
Trap: 0000000e Error: 00000004 OldMask: 00000000 CR2: 00000010
FPUCW: 0000037f FPUSW: 00000000 TAG: 00000000
RIP: 00000000 RDP: 00000000
ST(0) 0000 0000000000000000 ST(1) 0000 0000000000000000
ST(2) 0000 0000000000000000 ST(3) 0000 0000000000000000
ST(4) 0000 0000000000000000 ST(5) 0000 0000000000000000
ST(6) 0000 0000000000000000 ST(7) 0000 0000000000000000
mxcsr: 1fa4
XMM0: 00000000000000000000000000000000 XMM1: 00000000000000000000000000000000
XMM2: 00000000000000000000000000000000 XMM3: 00000000000000000000000000000000
XMM4: 00000000000000000000000000000000 XMM5: 00000000000000000000000000000000
XMM6: 00000000000000000000000000000000 XMM7: 00000000000000000000000000000000
XMM8: 00000000000000000000000000000000 XMM9: 00000000000000000000000000000000
XMM10: 00000000000000000000000000000000 XMM11: 00000000000000000000000000000000
XMM12: 00000000000000000000000000000000 XMM13: 00000000000000000000000000000000
XMM14: 00000000000000000000000000000000 XMM15: 00000000000000000000000000000000
Backtrace:
/home/Moses/mosesdecoder/bin/moses[0x500a61]
/home/Moses/mosesdecoder/bin/moses[0x48fa19]
/home/Moses/mosesdecoder/bin/moses[0x529525]
/home/Moses/mosesdecoder/bin/moses[0x52975a]
/home/Moses/mosesdecoder/bin/moses[0x529b13]
/home/Moses/mosesdecoder/bin/moses[0x529e8c]
/home/Moses/mosesdecoder/bin/moses[0x4798a0]
/home/Moses/mosesdecoder/bin/moses[0x41c895]
/home/Moses/mosesdecoder/bin/moses[0x5126a5]
/home/Moses/mosesdecoder/bin/moses[0x685384]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x6b50)[0x7f7d1c0d0b50]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7f7d1be1b0ed]
Memory map:
00400000-0075e000 r-xp 00000000 08:02 10619160
/home/Moses/mosesdecoder/bin/moses
0095e000-00961000 rw-p 0035e000 08:02 10619160
/home/Moses/mosesdecoder/bin/moses
00961000-00988000 rw-p 00000000 00:00 0
019ff000-01b3c000 rw-p 00000000 00:00 0
[heap]
01b3c000-01b3e000 rw-p 00000000 00:00 0
[heap]
01b3e000-01b7a000 rw-p 00000000 00:00 0
[heap]
01b7a000-01b7b000 rw-p 00000000 00:00 0
[heap]
01b7b000-01b82000 rw-p 00000000 00:00 0
[heap]
01b82000-01b83000 rw-p 00000000 00:00 0
[heap]
01b83000-01b94000 rw-p 00000000 00:00 0
[heap]
01b94000-01b95000 rw-p 00000000 00:00 0
[heap]
01b95000-01bc2000 rw-p 00000000 00:00 0
[heap]
01bc2000-01bc6000 rw-p 00000000 00:00 0
[heap]
01bc6000-01bea000 rw-p 00000000 00:00 0
[heap]
01bea000-01bf2000 rw-p 00000000 00:00 0
[heap]
01bf2000-01c32000 rw-p 00000000 00:00 0
[heap]
01c32000-01c42000 rw-p 00000000 00:00 0
[heap]
01c42000-01cca000 rw-p 00000000 00:00 0
[heap]
01cca000-01cea000 rw-p 00000000 00:00 0
[heap]
01cea000-01dae000 rw-p 00000000 00:00 0
[heap]
01dae000-01dee000 rw-p 00000000 00:00 0
[heap]
01dee000-01e94000 rw-p 00000000 00:00 0
[heap]
01e94000-01e98000 rw-p 00000000 00:00 0
[heap]
01e98000-01e9e000 rw-p 00000000 00:00 0
[heap]
01e9e000-01f9f000 rw-p 00000000 00:00 0
[heap]
01f9f000-0201e000 rw-p 00000000 00:00 0
[heap]
7f7d19e26000-7f7d19e27000 rw-p 00000000 00:00 0
7f7d19e27000-7f7d19e28000 ---p 00000000 00:00 0
7f7d19e28000-7f7d1a628000 rw-p 00000000 00:00 0
7f7d1a628000-7f7d1bd3f000 r--s 00000000 08:02 10355100
/home/Massi/UN.en-fr.en.lm.blm.en
7f7d1bd3f000-7f7d1bec1000 r-xp 00000000 08:02 16646158
/lib/x86_64-linux-gnu/libc-2.13.so
7f7d1bec1000-7f7d1c0c0000 ---p 00182000 08:02 16646158
/lib/x86_64-linux-gnu/libc-2.13.so
7f7d1c0c0000-7f7d1c0c4000 r--p 00181000 08:02 16646158
/lib/x86_64-linux-gnu/libc-2.13.so
7f7d1c0c4000-7f7d1c0c5000 rw-p 00185000 08:02 16646158
/lib/x86_64-linux-gnu/libc-2.13.so
7f7d1c0c5000-7f7d1c0ca000 rw-p 00000000 00:00 0
7f7d1c0ca000-7f7d1c0e1000 r-xp 00000000 08:02 16646180
/lib/x86_64-linux-gnu/libpthread-2.13.so
7f7d1c0e1000-7f7d1c2e0000 ---p 00017000 08:02 16646180
/lib/x86_64-linux-gnu/libpthread-2.13.so
7f7d1c2e0000-7f7d1c2e1000 r--p 00016000 08:02 16646180
/lib/x86_64-linux-gnu/libpthread-2.13.so
7f7d1c2e1000-7f7d1c2e2000 rw-p 00017000 08:02 16646180
/lib/x86_64-linux-gnu/libpthread-2.13.so
7f7d1c2e2000-7f7d1c2e6000 rw-p 00000000 00:00 0
7f7d1c2e6000-7f7d1c2fb000 r-xp 00000000 08:02 16646186
/lib/x86_64-linux-gnu/libgcc_s.so.1
7f7d1c2fb000-7f7d1c4fb000 ---p 00015000 08:02 16646186
/lib/x86_64-linux-gnu/libgcc_s.so.1
7f7d1c4fb000-7f7d1c4fc000 rw-p 00015000 08:02 16646186
/lib/x86_64-linux-gnu/libgcc_s.so.1
7f7d1c4fc000-7f7d1c57d000 r-xp 00000000 08:02 16646169
/lib/x86_64-linux-gnu/libm-2.13.so
7f7d1c57d000-7f7d1c77c000 ---p 00081000 08:02 16646169
/lib/x86_64-linux-gnu/libm-2.13.so
7f7d1c77c000-7f7d1c77d000 r--p 00080000 08:02 16646169
/lib/x86_64-linux-gnu/libm-2.13.so
7f7d1c77d000-7f7d1c77e000 rw-p 00081000 08:02 16646169
/lib/x86_64-linux-gnu/libm-2.13.so
7f7d1c77e000-7f7d1c866000 r-xp 00000000 08:02 30675278
/usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.17
7f7d1c866000-7f7d1ca66000 ---p 000e8000 08:02 30675278
/usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.17
7f7d1ca66000-7f7d1ca6e000 r--p 000e8000 08:02 30675278
/usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.17
7f7d1ca6e000-7f7d1ca70000 rw-p 000f0000 08:02 30675278
/usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.17
7f7d1ca70000-7f7d1ca85000 rw-p 00000000 00:00 0
7f7d1ca85000-7f7d1ca8c000 r-xp 00000000 08:02 16646182
/lib/x86_64-linux-gnu/librt-2.13.so
7f7d1ca8c000-7f7d1cc8b000 ---p 00007000 08:02 16646182
/lib/x86_64-linux-gnu/librt-2.13.so
7f7d1cc8b000-7f7d1cc8c000 r--p 00006000 08:02 16646182
/lib/x86_64-linux-gnu/librt-2.13.so
7f7d1cc8c000-7f7d1cc8d000 rw-p 00007000 08:02 16646182
/lib/x86_64-linux-gnu/librt-2.13.so
7f7d1cc8d000-7f7d1cc90000 r-xp 00000000 08:02 16646164
/lib/x86_64-linux-gnu/libSegFault.so
7f7d1cc90000-7f7d1ce90000 ---p 00003000 08:02 16646164
/lib/x86_64-linux-gnu/libSegFault.so
7f7d1ce90000-7f7d1ce91000 r--p 00003000 08:02 16646164
/lib/x86_64-linux-gnu/libSegFault.so
7f7d1ce91000-7f7d1ce92000 rw-p 00004000 08:02 16646164
/lib/x86_64-linux-gnu/libSegFault.so
7f7d1ce92000-7f7d1ce94000 r-xp 00000000 08:02 16646176
/lib/x86_64-linux-gnu/libdl-2.13.so
7f7d1ce94000-7f7d1d094000 ---p 00002000 08:02 16646176
/lib/x86_64-linux-gnu/libdl-2.13.so
7f7d1d094000-7f7d1d095000 r--p 00002000 08:02 16646176
/lib/x86_64-linux-gnu/libdl-2.13.so
7f7d1d095000-7f7d1d096000 rw-p 00003000 08:02 16646176
/lib/x86_64-linux-gnu/libdl-2.13.so
7f7d1d096000-7f7d1d0b6000 r-xp 00000000 08:02 16646184
/lib/x86_64-linux-gnu/ld-2.13.so
7f7d1d29d000-7f7d1d2a4000 rw-p 00000000 00:00 0
7f7d1d2b3000-7f7d1d2b5000 rw-p 00000000 00:00 0
7f7d1d2b5000-7f7d1d2b6000 r--p 0001f000 08:02 16646184
/lib/x86_64-linux-gnu/ld-2.13.so
7f7d1d2b6000-7f7d1d2b7000 rw-p 00020000 08:02 16646184
/lib/x86_64-linux-gnu/ld-2.13.so
7f7d1d2b7000-7f7d1d2b8000 rw-p 00000000 00:00 0
7fff87704000-7fff87725000 rw-p 00000000 00:00 0
[stack]
7fff877ff000-7fff87800000 r-xp 00000000 00:00 0
[vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0
[vsyscall]
Erreur de segmentation [Segmentation fault]
Any idea?
Many thanks !
--
The Translation Trustee
1, Place Charles de Gaulle
78180 Montigny-le-Bretonneux
Tel : +33 1 30 44 04 23 Mobile : +33 7 61 44 40 84
Email : massinissa.ahmim@linguacustodia.com
Website : www.linguacustodia.com - www.thetranslationtrustee.com
Please do not print this email unless it is absolutely necessary. Spread
environmental awareness.
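
The backtrace in this message lists only raw addresses inside the moses
binary. One common way to make such a trace readable is to resolve the
addresses with addr2line from binutils. The sketch below assumes the
binary path shown in the backtrace and only builds and prints the
addr2line command from backtrace lines given on standard input; it does
not run it. Function names and line numbers will only come out if the
binary still carries symbols and debug information, which is an
assumption about this particular build.

```shell
#!/bin/sh
# Sketch: turn glibc-style backtrace lines into one addr2line invocation.
# The command is printed, not executed.
BINARY=/home/Moses/mosesdecoder/bin/moses   # path shown in the backtrace

# Read backtrace lines on stdin and print the bracketed addresses of the
# frames that belong to $BINARY, separated by single spaces.
frame_addrs() {
    grep -F "$BINARY[" |
        sed 's/.*\[\(0x[0-9a-fA-F]*\)\]$/\1/' |
        tr '\n' ' ' |
        sed 's/ $//'
}

# -e selects the executable, -f prints function names,
# -C demangles C++ symbol names.
addr2line_cmd() {
    echo "addr2line -C -f -e $BINARY $(frame_addrs)"
}

# Example with the first two frames and one library frame from the trace;
# the library frame is skipped because it is not inside $BINARY.
printf '%s\n' \
    '/home/Moses/mosesdecoder/bin/moses[0x500a61]' \
    '/home/Moses/mosesdecoder/bin/moses[0x48fa19]' \
    '/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7f7d1be1b0ed]' |
    addr2line_cmd
```

Running the printed command on the machine that produced the crash would
show which functions the top frames correspond to, which is usually the
first thing worth including in a follow-up to the list.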
------------------------------
End of Moses-support Digest, Vol 92, Issue 50
*********************************************