Send Moses-support mailing list submissions to
moses-support@mit.edu
To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
moses-support-request@mit.edu
You can reach the person managing the list at
moses-support-owner@mit.edu
When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."
Today's Topics:
1. Incremental training issue (Ander Corral Naves)
2. mert error (Claudia Matos Veliz)
----------------------------------------------------------------------
Message: 1
Date: Thu, 6 Sep 2018 10:14:02 +0200
From: Ander Corral Naves <a.corral@elhuyar.eus>
Subject: [Moses-support] Incremental training issue
To: moses-support@mit.edu
Message-ID:
<CAPWP6Rq3E3z8adwkF5G8V8jnVsmW9+n302oyJXCtz=WUQNO+Wg@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"
Hi,
I have trained an SMT model with Moses on my own data. My goal is to build
an incremental model so that I can add more data later on. I followed the
instructions on the Moses web page about incremental training, and my data
is preprocessed and prepared as described there. However, when trying to
update and compute the new alignments I get the following error, which I
can't make sense of.
[sent:2900000]
Reading more sentence pairs into memory ...
Reading more sentence pairs into memory ...
[sent:3000000]
Reading more sentence pairs into memory ...
Reading more sentence pairs into memory ...
[sent:3100000]
Reading more sentence pairs into memory ...
Reading more sentence pairs into memory ...
Reading more sentence pairs into memory ...
Model1: (1) TRAIN CROSS-ENTROPY 15.768 PERPLEXITY 55801.4
Model1: (1) VITERBI TRAIN CROSS-ENTROPY 19.1387 PERPLEXITY 577188
Model 1 Iteration: 1 took: 811 seconds
Entire Model1 Training took: 811 seconds
Loading HMM alignments from file.
*** Error in `/opt/inc-giza-pp/GIZA++-v2/GIZA++': malloc(): memory
corruption: 0x0000000089e29700 ***
======= Backtrace: =========
[0x5bbe01]
[0x5c605a]
[0x5c7fe1]
[0x4e3288]
[0x4a0dad]
[0x4a3816]
[0x4a430c]
[0x49890e]
[0x436bb4]
[0x40396b]
[0x598f56]
[0x59914a]
[0x404ad9]
======= Memory map: ========
00400000-0072d000 r-xp 00000000 08:02 3278867
/opt/inc-giza-pp/GIZA++-v2/GIZA++
0092c000-00936000 rw-p 0032c000 08:02 3278867
/opt/inc-giza-pp/GIZA++-v2/GIZA++
00936000-00940000 rw-p 00000000 00:00 0
01cfe000-17a52d000 rw-p 00000000 00:00 0
[heap]
7f0894000000-7f089402d000 rw-p 00000000 00:00 0
7f089402d000-7f0898000000 ---p 00000000 00:00 0
7f089a237000-7f08a21f0000 rw-p 00000000 00:00 0
7f08a3bdf000-7f08a5dbc000 rw-p 00000000 00:00 0
7ffdad243000-7ffdad264000 rw-p 00000000 00:00 0
[stack]
7ffdad3b9000-7ffdad3bc000 r--p 00000000 00:00 0
[vvar]
7ffdad3bc000-7ffdad3be000 r-xp 00000000 00:00 0
[vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0
[vsyscall]
3-update-alingments.sh: line 2: 3236 Aborted (core dumped)
/opt/inc-giza-pp/GIZA++-v2/GIZA++ giza.conf.2
I don't know whether this is a GIZA++ issue (it's the incremental GIZA++
adaptation) or something related to the earlier data preparation steps.
The instructions below, regarding data preparation, come from the web page.
However, it is not clear to me whether the two files mentioned in the last
paragraph are in the correct order. That is, should I simply use the first
file with the first command and the second file with the second command, or
do I need to take the source-target order into account? Maybe this is
related to the error above.
snt2cooc
$ $INC_GIZA_PP/bin/snt2cooc.out <new-source-vcb> <new-target-vcb> <new-source_target.snt> \
    <previous-source-target.cooc > new.source-target.cooc
$ $INC_GIZA_PP/bin/snt2cooc.out <new-target-vcb> <new-source-vcb> <new-target_source.snt> \
    <previous-target-source.cooc > new.target-source.cooc
This command is run once in the source-target direction, and once in the
target-source direction. The previous co-occurrence files can be found in
<experiment-dir>/training/giza.<run>/<target-lang>-<source-lang>.cooc and
<experiment-dir>/training/giza-inverse.<run>/<source-lang>-<target-lang>.cooc.
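To convince myself why the argument order should matter at all, I wrote a toy Python sketch of what a co-occurrence extraction does (this is my own illustration with made-up word ids, not the actual snt2cooc code): an .snt file stores each sentence pair as three lines (a count, the source word ids, the target word ids), and the .cooc output is essentially the set of (source-id, target-id) pairs, so swapping the direction transposes every pair.

```python
# Toy illustration of the snt2cooc idea (not the real implementation).
# An .snt file stores sentence pairs as triples of lines:
# a count, the source word ids, and the target word ids.

def cooc_pairs(snt_lines):
    """Collect the set of (src_id, tgt_id) pairs from .snt-style triples."""
    pairs = set()
    for i in range(0, len(snt_lines), 3):
        src = snt_lines[i + 1].split()
        tgt = snt_lines[i + 2].split()
        pairs.update((int(s), int(t)) for s in src for t in tgt)
    return pairs

# Two hypothetical sentence pairs in source-target order...
src_tgt = ["1", "2 5", "7 9", "1", "5 6", "9"]
# ...and the same corpus in target-source order (lines 2 and 3 swapped).
tgt_src = ["1", "7 9", "2 5", "1", "9", "5 6"]

fwd = cooc_pairs(src_tgt)
rev = cooc_pairs(tgt_src)
# The reverse direction yields exactly the transposed pairs.
assert rev == {(t, s) for (s, t) in fwd}
```

So if the two commands were fed the files in the wrong direction, the resulting .cooc would pair ids from the wrong vocabularies, which is why I wonder whether that could cause the crash above.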
Thank you in advance.
Ander Corral Naves
Translation Technologies (ITZULPENGINTZARAKO TEKNOLOGIAK)
a.corral@elhuyar.eus
Tel.: 943363040 | luzp.: 200
Zelai Haundi, 3
Osinalde industrialdea
20170 Usurbil
www.elhuyar.eus
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mailman.mit.edu/mailman/private/moses-support/attachments/20180906/712d67a9/attachment-0001.html
------------------------------
Message: 2
Date: Thu, 6 Sep 2018 11:22:01 +0200
From: Claudia Matos Veliz <claudia.matosveliz@ugent.be>
Subject: [Moses-support] mert error
To: moses-support@mit.edu
Message-ID: <E8784069-28DD-409E-9BA9-2EBDF5472386@ugent.be>
Content-Type: text/plain; charset="us-ascii"
Hello,
I have trained an SMT model using Moses. Everything was fine during training, but when I tried to tune the model I got the following error involving SRILM:
Using SCRIPTS_ROOTDIR: /dependencies/normalisation_demo/moses_kenlm10/scripts
Assuming --mertdir=/dependencies/normalisation_demo/moses_kenlm10/bin
Assuming the tables are already filtered, reusing filtered/moses.ini
Asking moses for feature names and values from filtered/moses.ini
Executing: /opt/mosesdecoder/bin/moses -config filtered/moses.ini -inputtype 0 -show-weights > ./features.list
Defined parameters (per moses.ini or switch):
config: filtered/moses.ini
distortion-limit: 6
feature: UnknownWordPenalty WordPenalty PhraseDictionaryMemory name=TranslationModel0 table-limit=20 num-features=5 path=/home/claudia/NeuralMoses/twe_token_test/mert-work/filtered/phrase-table.0-0.1.1.gz input-factor=0 output-factor=0 LexicalReordering name=LexicalReordering0 num-features=6 type=wbe-msd-bidirectional-fe-allff input-factor=0 output-factor=0 path=/home/claudia/NeuralMoses/twe_token_test/mert-work/filtered/reordering-table.wbe-msd-bidirectional-fe Distortion SRILM name=LM0 factor=0 path=/home/claudia/NeuralMoses/lm/model.5 order=5
input-factors: 0
inputtype: 0
mapping: 0 T 0
show-weights:
weight: UnknownWordPenalty0= 1 WordPenalty0= -1 TranslationModel0= 0.2 0.2 0.2 0.2 0.2 LexicalReordering0= 0.3 0.3 0.3 0.3 0.3 0.3 Distortion0= 0.3 LM0= 0.5
/opt/mosesdecoder/bin
line=UnknownWordPenalty
FeatureFunction: UnknownWordPenalty0 start: 0 end: 1
WEIGHT UnknownWordPenalty0=1.000,
line=WordPenalty
FeatureFunction: WordPenalty0 start: 1 end: 2
WEIGHT WordPenalty0=-1.000,
line=PhraseDictionaryMemory name=TranslationModel0 table-limit=20 num-features=5 path=/home/claudia/NeuralMoses/twe_token_test/mert-work/filtered/phrase-table.0-0.1.1.gz input-factor=0 output-factor=0
FeatureFunction: TranslationModel0 start: 2 end: 7
WEIGHT TranslationModel0=0.200,0.200,0.200,0.200,0.200,
line=LexicalReordering name=LexicalReordering0 num-features=6 type=wbe-msd-bidirectional-fe-allff input-factor=0 output-factor=0 path=/home/claudia/NeuralMoses/twe_token_test/mert-work/filtered/reordering-table.wbe-msd-bidirectional-fe
FeatureFunction: LexicalReordering0 start: 7 end: 13
Initializing LexicalReordering..
Loading table into memory...done.
WEIGHT LexicalReordering0=0.300,0.300,0.300,0.300,0.300,0.300,
line=Distortion
FeatureFunction: Distortion0 start: 13 end: 14
WEIGHT Distortion0=0.300,
line=SRILM name=LM0 factor=0 path=/home/claudia/NeuralMoses/lm/model.5 order=5
ERROR:Unknown feature function:SRILM
Exit code: 1
Failed to run moses with the config filtered/moses.ini at /dependencies/normalisation_demo/moses_kenlm10/scripts/training/mert-moses.pl line 1271.
The command I used was this:
nice /dependencies/normalisation_demo/moses_kenlm10/scripts/training/mert-moses.pl dev/twe_tok.dev.ori dev/twe_tok.dev.tgt /opt/mosesdecoder/bin/moses model/moses.ini
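My own suspicion (please correct me if this is wrong) is that this decoder binary was compiled without SRILM support, since Moses only registers the SRILM feature function when built with SRILM enabled. Because the same ARPA file can be loaded by KenLM, which Moses includes by default, would it be enough to change the feature line in filtered/moses.ini like this (paths taken from the log above)?

```ini
# replace the SRILM feature line
# SRILM name=LM0 factor=0 path=/home/claudia/NeuralMoses/lm/model.5 order=5
# with the KenLM equivalent, keeping the same LM0 name and weight
KENLM name=LM0 factor=0 path=/home/claudia/NeuralMoses/lm/model.5 order=5
```

Or is rebuilding Moses with SRILM support the only option?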
Any help would be appreciated.
Thanks!
Claudia
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mailman.mit.edu/mailman/private/moses-support/attachments/20180906/8b778345/attachment.html
------------------------------
_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support
End of Moses-support Digest, Vol 143, Issue 4
*********************************************