Moses-support Digest, Vol 103, Issue 48

Send Moses-support mailing list submissions to
moses-support@mit.edu

To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
moses-support-request@mit.edu

You can reach the person managing the list at
moses-support-owner@mit.edu

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."

Today's Topics:

1. Incremental Giza(Inc-giza) crashes with core dump when doing
Incremental training using MOSES EMS (Hegde, Sujay)
2. BLEU unknown on dev for ini when using PRO (jian zhang)
3. Re: keep some features fixed when tuning (Matthias Huck)

----------------------------------------------------------------------

Message: 1
Date: Wed, 20 May 2015 10:32:42 +0000
From: "Hegde, Sujay" <Sujay.Hegde@xerox.com>
Subject: [Moses-support] Incremental Giza(Inc-giza) crashes with core
dump when doing Incremental training using MOSES EMS
To: "moses-support@mit.edu" <moses-support@mit.edu>
Cc: "Venkatapathy, Sriram $Calendar$"
<sriram.venkatapathy@xrce.xerox.com>, "MudaliarMudaliar, Preeti J"
<preeti.mudaliarmudaliar@xerox.com>
Message-ID:
<586EA7C483504E48870F5BF54319B6EC3A3425@USA7109MB006.na.xerox.net>
Content-Type: text/plain; charset="us-ascii"

Hi Admin,

I am using Inc-giza taken from https://code.google.com/p/inc-giza-pp/
and compiled manually.

The command invoked by Moses EMS (actually train-model.perl) is:
/mnt/hd1/git/mosesdecoder/training-tools/GIZA++ -CoocurrenceFile /mnt/hd1/training/working-dir-en-es/training/giza.5/en-es.cooc -c /mnt/hd1/training/working-dir-en-es/training/prepared.5/en-es-int-train.snt -hmmdumpfrequency 5 -hmmiterations 5 -m1 5 -m2 0 -m3 0 -m4 0 -m5 0 -model1dumpfrequency 0 -model2dumpfrequency 0 -model345dumpfrequency 0 -model4smoothfactor 0.4 -nodumps 0 -nsmooth 4 -o /mnt/hd1/training/working-dir-en-es/training/giza.5/en-es -oldAlPrbs /mnt/hd1/training/working-dir-en-es/training/giza.1/en-es.hhmm.5 -oldTrPrbs /mnt/hd1/training/working-dir-en-es/training/giza.1/en-es.thmm.5 -onlyaldumps 1 -p0 0.999 -s /mnt/hd1/training/working-dir-en-es/training/prepared.5/es.vcb -step_k 1 -t /mnt/hd1/training/working-dir-en-es/training/prepared.5/en.vcb

The baseline model used is 1.
It crashes with the output below on the console. It is a heap corruption.

-----------
Model1: Iteration 1
number of French (target) words = 1255
initial unifrom prob = 0.000796813
Model1: (1) TRAIN CROSS-ENTROPY 16.2367 PERPLEXITY 77219
Model1: (1) VITERBI TRAIN CROSS-ENTROPY 20.8406 PERPLEXITY 1.87773e+06
Model 1 Iteration: 1 took: 0 seconds
-----------
Model1: Iteration 2
number of French (target) words = 1255
initial unifrom prob = 0.000796813
Model1: (2) TRAIN CROSS-ENTROPY 5.75346 PERPLEXITY 53.9466
Model1: (2) VITERBI TRAIN CROSS-ENTROPY 9.11781 PERPLEXITY 555.562
Model 1 Iteration: 2 took: 0 seconds
-----------
Model1: Iteration 3
number of French (target) words = 1255
initial unifrom prob = 0.000796813
Model1: (3) TRAIN CROSS-ENTROPY 5.61082 PERPLEXITY 48.868
Model1: (3) VITERBI TRAIN CROSS-ENTROPY 8.86927 PERPLEXITY 467.646
Model 1 Iteration: 3 took: 0 seconds
-----------
Model1: Iteration 4
number of French (target) words = 1255
initial unifrom prob = 0.000796813
Model1: (4) TRAIN CROSS-ENTROPY 5.51417 PERPLEXITY 45.7014
Model1: (4) VITERBI TRAIN CROSS-ENTROPY 8.64774 PERPLEXITY 401.078
Model 1 Iteration: 4 took: 0 seconds
-----------
Model1: Iteration 5
number of French (target) words = 1255
initial unifrom prob = 0.000796813
Model1: (5) TRAIN CROSS-ENTROPY 5.44065 PERPLEXITY 43.4309
Model1: (5) VITERBI TRAIN CROSS-ENTROPY 8.44812 PERPLEXITY 349.25
Model 1 Iteration: 5 took: 0 seconds
Entire Model1 Training took: 0 seconds
NOTE: I am doing iterations with the HMM model!
Loading HMM alignments from file.
*** glibc detected *** /mnt/hd1/git/mosesdecoder/training-tools/GIZA++: malloc(): memory corruption: 0x00000000011c4470 ***
======= Backtrace: =========
[0x524d22]
[0x526f4d]
[0x527b7b]
[0x4f0fcd]
[0x4d2829]
[0x4d329a]
[0x4d340b]
[0x495c39]
[0x48bdf2]
[0x42822f]
[0x4294ba]
[0x505e66]
[0x4001e9]
======= Memory map: ========
00400000-005d9000 r-xp 00000000 08:10 6030184 /mnt/hd1/git/mosesdecoder/training-tools/GIZA++
007d8000-007db000 rw-p 001d8000 08:10 6030184 /mnt/hd1/git/mosesdecoder/training-tools/GIZA++
007db000-007f2000 rw-p 00000000 00:00 0
00fc4000-01438000 rw-p 00000000 00:00 0 [heap]
7f3054000000-7f3054028000 rw-p 00000000 00:00 0
7f3054028000-7f3058000000 ---p 00000000 00:00 0
7f3059db4000-7f3059db5000 rw-p 00000000 00:00 0
7fff689b0000-7fff689c5000 rw-p 00000000 00:00 0 [stack]
7fff689ff000-7fff68a00000 r-xp 00000000 00:00 0 [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall]
Aborted (core dumped)

Analysis of the core dump with GDB yields:

Core was generated by `/mnt/hd1/git/mosesdecoder/training-tools/GIZA++ -CoocurrenceFile /mnt/hd1/train'.
Program terminated with signal 6, Aborted.
#0 0x0000000000547ee5 in raise ()
(gdb) where
#0 0x0000000000547ee5 in raise ()
#1 0x000000000050e635 in abort ()
#2 0x000000000051f365 in __libc_message ()
#3 0x0000000000524d22 in malloc_printerr ()
#4 0x0000000000526f4d in _int_malloc ()
#5 0x0000000000527b7b in malloc ()
#6 0x00000000004f0fcd in operator new(unsigned long) ()
#7 0x00000000004d2829 in std::basic_string<char, std::char_traits<char>, std::allocator<char> >::_Rep::_S_create(unsigned long, unsigned long, std::allocator<char> const&) ()
#8 0x00000000004d329a in char* std::basic_string<char, std::char_traits<char>, std::allocator<char> >::_S_construct<char const*>(char const*, char const*, std::allocator<char> const&, std::forward_iterator_tag) ()
#9 0x00000000004d340b in std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(char const*, unsigned long, std::allocator<char> const&) ()
#10 0x0000000000495c39 in HMMTables<int, WordClasses>::readJumps(std::basic_istream<char, std::char_traits<char> >&) ()
#11 0x000000000048bdf2 in hmm::load_table(char const*) ()
#12 0x000000000042822f in StartTraining(int&) ()
#13 0x00000000004294ba in main ()

Is this a problem with Inc-giza, or could it be a problem with the libraries being used?

Regards,
Sujay

------------------------------

Message: 2
Date: Wed, 20 May 2015 12:16:46 +0100
From: jian zhang <zhangj@computing.dcu.ie>
Subject: [Moses-support] BLEU unknown on dev for ini when using PRO
To: moses-support@mit.edu
Message-ID:
<CALA=z0ADeev-H6GY=MhK0tJoQNRYvS0ZLDfJP_No0k-8BQAHFg@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

Hi,

The BLEU score is written as "unknown" in the ini files during tuning when using PRO.
Is there any way to fix it, or is this the expected behavior?

Thanks,

Jian
--
Jian Zhang
Centre for Next Generation Localisation (CNGL)
<http://www.cngl.ie/index.html>
Dublin City University <http://www.dcu.ie/>

------------------------------

Message: 3
Date: Wed, 20 May 2015 12:39:00 +0100
From: Matthias Huck <mhuck@inf.ed.ac.uk>
Subject: Re: [Moses-support] keep some features fixed when tuning
To: Vito Mandorino <vito.mandorino@linguacustodia.com>
Cc: moses-support <moses-support@mit.edu>
Message-ID: <1432121940.30904.684.camel@portedgar>
Content-Type: text/plain; charset="UTF-8"

Hi Vito,

tuneable=false should work.

However, in case you use the EMS to run experiments, there's a pitfall:
If a filtered phrase table for tuning exists from a previous
experimental run, then the EMS will typically apply the filter and
replace the line for the phrase table feature function in your moses.ini
by the respective line from the filtered directory. If the previous
run didn't have tuneable=false, the setting will be dropped. You can easily
check this and manually edit the filtered moses.ini in the tuning
directory. A filtered moses.ini can be specified in the EMS config:

[TUNING]
filtered-config = $working-dir/tuning/moses.filtered.ini.10
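
The check on the filtered moses.ini mentioned above could be scripted roughly as follows. This is only a sketch, not part of Moses or the EMS: the function name is hypothetical, and it assumes that each line in the [feature] section has the form "FeatureType key=value ..." and that the features of interest carry an explicit name= key.

```python
def features_missing_tuneable_false(ini_text, feature_names):
    """Return the names from feature_names that are NOT marked tuneable=false.

    Hypothetical helper. Assumes [feature] lines look like
    'FeatureType key=value key=value ...' with an explicit name= key.
    A name that does not occur in the [feature] section is also reported.
    """
    marked_fixed = {}
    in_feature_section = False
    for raw in ini_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.startswith("[") and line.endswith("]"):
            # Track whether we are inside the [feature] section.
            in_feature_section = (line == "[feature]")
            continue
        if not in_feature_section:
            continue
        # First token is the feature type, the rest are key=value arguments.
        tokens = line.split()
        args = dict(t.split("=", 1) for t in tokens[1:] if "=" in t)
        name = args.get("name")
        if name in feature_names:
            marked_fixed[name] = args.get("tuneable", "true").lower() == "false"
    return [n for n in feature_names if not marked_fixed.get(n, False)]
```

To use it, read the filtered moses.ini from the tuning directory into a string and pass the names of the features you expect to be fixed (for example "TranslationModel1"). Any name returned needs tuneable=false added back by hand.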

tuneable=false will keep all the score components of a feature function
at their initial values. If you need something similar for individual
components only, please have a look at this:
https://www.mail-archive.com/moses-support@mit.edu/msg11653.html

Cheers,
Matthias

On Wed, 2015-05-20 at 10:12 +0200, Vito Mandorino wrote:
> Dear all,
>
> is it possible when tuning to tell Moses to constrain the value of a subset
> of the features to some fixed, given-in-advance values?
> I would like to do that because I'm dealing with a very small tuning set,
> and I think that reducing the number of tuneable features will prevent
> overfitting.
> I have tried two approaches so far but results were not as expected (or
> desired):
> - add tuneable=false to the concerned features in the moses.ini
> - add
> --decoder-flags "-weight-overwrite 'LM0= 0.086 WordPenalty0= -0.021
> PhrasePenalty0= 0.022 Distortion0= 0.3 TranslationModel1= 0.04 -0.01 0.25
> 0.19'"
> to the mert-moses.pl command.
>
>
> Best regards,
>
> Vito Mandorino
>
> _______________________________________________
> Moses-support mailing list
> Moses-support@mit.edu
> http://mailman.mit.edu/mailman/listinfo/moses-support

--
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.

------------------------------

_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support

End of Moses-support Digest, Vol 103, Issue 48
**********************************************
