Moses-support Digest, Vol 111, Issue 46

Send Moses-support mailing list submissions to
moses-support@mit.edu

To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
moses-support-request@mit.edu

You can reach the person managing the list at
moses-support-owner@mit.edu

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."


Today's Topics:

1. Re: kbmira died with SIGABRT when tuning (Hieu Hoang)
2. mgiza signal 11 coredump (Matjaz Rihtar)
3. Re: kbmira died with SIGABRT when tuning (Dingyuan Wang)
4. Re: kbmira died with SIGABRT when tuning (Hieu Hoang)


----------------------------------------------------------------------

Message: 1
Date: Fri, 15 Jan 2016 18:45:16 +0000
From: Hieu Hoang <hieuhoang@gmail.com>
Subject: Re: [Moses-support] kbmira died with SIGABRT when tuning
To: Dingyuan Wang <abcdoyle888@gmail.com>, moses-support@mit.edu
Message-ID: <56993E3C.2080000@gmail.com>
Content-Type: text/plain; charset=utf-8; format=flowed

Could you make your model files available for download so I can
replicate this problem?

It seems like you're using a feature function with sparse scores. I
think the character '_' must be escaped.
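
One way to narrow this down before sharing the model files is to scan the
offending line of run7.features.dat token by token. The sketch below is only
an illustration. It assumes the layout suggested by the quoted error message
(a run of dense scores followed by sparse name=value tokens) and is not taken
from kbmira's own parser. The function names are hypothetical.

```python
def is_float(text):
    """Return True if text parses as a floating-point number."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def find_bad_tokens(line):
    """Return (position, token) pairs that do not fit the assumed layout.

    Assumption (inferred from the error message, not from kbmira's source):
    a line is a run of whitespace-separated dense scores followed by sparse
    tokens of the form name=value, where value is a number.
    Note that str.split() splits on any Unicode whitespace.
    """
    bad = []
    in_sparse = False
    for pos, tok in enumerate(line.split()):
        if not in_sparse and is_float(tok):
            continue  # still inside the leading dense scores
        in_sparse = True
        name, sep, value = tok.rpartition("=")
        if not (sep and name and is_float(value)):
            bad.append((pos, tok))
    return bad


if __name__ == "__main__":
    # Made-up line: the last feature name contains a space, so it splits
    # into two tokens and the first half has no "=value" part.
    sample = "-4.5 0 2 PL_s1=4 WT_a~b=1 WT_a~ b=1"
    for pos, tok in find_bad_tokens(sample):
        print(f"token {pos} looks malformed: {tok!r}")
```

If a feature name itself contains whitespace, it shows up here as a token
without an "=value" part, which would be one possible cause of a
FileFormatException.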


On 12/01/16 04:00, Dingyuan Wang wrote:
> Hi all,
>
> I'm using EMS to run experiments. Every time, kbmira died with
> SIGABRT when tuning in one direction, while tuning in the opposite
> direction (same config and test set) was successful.
>
> The mert.log (stderr) shows the following:
>
>
> kbmira with c=0.01 decay=0.999 no_shuffle=0
> Initialising random seed from system clock
> Found 15323 initial sparse features
> ....terminate called after throwing an instance of
> 'MosesTuning::FileFormatException'
> what(): Error in line "-4.51933 0 0 -6.09733 0 0 0 -121.556 2 -20 12
> -31.6201 -38.5211 -26.5112 -60.6166 WT_?~?=2 WT_?~?=1 PL_s1=4
> PL_s3=1 PL_3,3=1 PL_2,2=3 PL_1,2=1 PL_2,1=3 PL_t1=6 PL_t2=4 PL_t3=2
> PL_2,3=1 PL_s2=7 PL_1,1=3 WT_?~??=1 WT_?~??=1 WT_?~?=1 WT_?~?
> ?=1 WT_?~?=1 WT_?~?=2 WT_?~?=1 WT_?~?=1 WT_?~??=1 WT_?~?=1
> WT_?~??=1 WT_?~?=1 WT_?~??=1 WT_?~??=1 WT_?~??=1 WT_?~
> ?=1 WT_?~??=1 " of run7.features.dat
> Aborted
>
>
> Since run7.scores.dat is generated by scripts, I don't think I am
> responsible for the bad format. The last time it died, I removed the
> likely offending line from the test set, but this time the error
> appears on another line.
>
> --
> Dingyuan Wang
> _______________________________________________
> Moses-support mailing list
> Moses-support@mit.edu
> http://mailman.mit.edu/mailman/listinfo/moses-support

--
Hieu Hoang
http://www.hoang.co.uk/hieu



------------------------------

Message: 2
Date: Sat, 16 Jan 2016 11:23:05 +0100
From: Matjaz Rihtar <matjaz@eunet.si>
Subject: [Moses-support] mgiza signal 11 coredump
To: Moses SMT Support <moses-support@mit.edu>
Message-ID: <569A1A09.7060509@eunet.si>
Content-Type: text/plain; charset=utf-8

Hello!
I am trying to set up a working installation of Moses for our project. I followed the guidelines on the Moses web pages (Baseline System, ...) and was mostly successful, except for the use of mgiza.

My system: virtual machine with Ubuntu 14.04 x64, 2 cores, 12 GB of memory.

History of what I did:
1. Installed release 3.0 from the web page;
   tried the commands from "Baseline System" ==> mgiza fails with signal 11, coredump
2. Compiled and installed the latest version of mgiza from GitHub;
   tried the commands from "Baseline System" ==> mgiza fails with signal 11, coredump
3. Compiled and installed the latest version of GIZA++ from GitHub;
   tried the commands from "Baseline System" ==> all OK
4. Compiled and installed the latest versions of moses, GIZA++ and mgiza from GitHub;
   tried the commands from "Baseline System" ==> OK with GIZA++, fails with mgiza

For calling GIZA++/mgiza I use the same command with the same input files; the only difference is the following two switches:

GIZAOPT="-mgiza -mgiza-cpus 2"

Command:
$HOME/mosesdecoder/scripts/training/train-model.perl -cores 2 $GIZAOPT -root-dir train -corpus $HOME/corpus/news-commentary-v8.fr-en.clean -f fr -e en -alignment grow-diag-final-and -reordering msd-bidirectional-fe -lm 0:3:$HOME/lm/news-commentary-v8.fr-en.blm.en:8 -external-bin-dir $HOME/mosesdecoder/training-tools 2>&1 > train.out

If GIZA++ is called (when GIZAOPT=""), all is OK; when mgiza is called (when GIZAOPT="-mgiza ..."), mgiza fails with:

Executing: $HOME/mosesdecoder/training-tools/mgiza -CoocurrenceFile $HOME/tm/train/giza.fr-en/fr-en.cooc -c $HOME/tm/train/corpus/fr-en-int-train.snt -m1 5 -m2 0 -m3 3 -m4 3 -model1dumpfrequency 1 -model4smoothfactor 0.4 -ncpus 2 -nodumps 1 -nsmooth 4 -o $HOME/tm/train/giza.fr-en/fr-en -onlyaldumps 1 -p0 0.999 -s $HOME/tm/train/corpus/en.vcb -t $HOME/tm/train/corpus/fr.vcb
Starting MGIZA
Initializing Global Paras
DEBUG: EnterERROR: Execution of: $HOME/mosesdecoder/training-tools/mgiza -CoocurrenceFile $HOME/tm/train/giza.fr-en/fr-en.cooc -c $HOME/tm/train/corpus/fr-en-int-train.snt -m1 5 -m2 0 -m3 3 -m4 3 -model1dumpfrequency 1 -model4smoothfactor 0.4 -ncpus 2 -nodumps 1 -nsmooth 4 -o $HOME/tm/train/giza.fr-en/fr-en -onlyaldumps 1 -p0 0.999 -s $HOME/tm/train/corpus/en.vcb -t $HOME/tm/train/corpus/fr.vcb
died with signal 11, with coredump

GIZA++ on the other hand works as follows:

Executing: $HOME/mosesdecoder/training-tools/GIZA++ -CoocurrenceFile $HOME/tm/train/giza.fr-en/fr-en.cooc -c $HOME/tm/train/corpus/fr-en-int-train.snt -m1 5 -m2 0 -m3 3 -m4 3 -model1dumpfrequency 1 -model4smoothfactor 0.4 -nodumps 1 -nsmooth 4 -o $HOME/tm/train/giza.fr-en/fr-en -onlyaldumps 1 -p0 0.999 -s $HOME/tm/train/corpus/en.vcb -t $HOME/tm/train/corpus/fr.vcb
Reading vocabulary file from:$HOME/tm/train/corpus/en.vcb
Reading vocabulary file from:$HOME/tm/train/corpus/fr.vcb
10000
20000
...

What can I do to help determine where mgiza fails and get it up & running?
Sub-question: is it really worth running mgiza instead of GIZA++?
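
As a first diagnostic step for a crash like the one above, the failing command
can be rerun in isolation and its exit status decoded. The following is a
minimal sketch, not part of Moses or mgiza. The helper names are hypothetical,
and the command passed in would be the "Executing:" line from the log above.

```python
import signal
import subprocess


def describe_exit(returncode):
    """Turn a subprocess return code into a readable description.

    On POSIX, subprocess reports death by signal N as return code -N.
    """
    if returncode == 0:
        return "exited normally"
    if returncode < 0:
        signum = -returncode
        try:
            name = signal.Signals(signum).name
        except ValueError:
            name = "unknown signal"
        return f"killed by signal {signum} ({name})"
    return f"exited with status {returncode}"


def run_and_report(cmd):
    """Run cmd (a list of arguments) and return (description, combined output)."""
    proc = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # keep stdout and stderr in one stream
        text=True,
    )
    return describe_exit(proc.returncode), proc.stdout
```

Signal 11 is SIGSEGV on Linux, so describe_exit(-11) reports a segmentation
fault. A common next step, assuming core dumps are enabled in the shell
(ulimit -c unlimited), is to open the core file in gdb and ask for a backtrace.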

Best regards,
Matjaz

PS: I changed /home/... to $HOME in the above examples.


------------------------------

Message: 3
Date: Sat, 16 Jan 2016 22:18:44 +0800
From: Dingyuan Wang <abcdoyle888@gmail.com>
Subject: Re: [Moses-support] kbmira died with SIGABRT when tuning
To: Hieu Hoang <hieuhoang@gmail.com>, moses-support@mit.edu
Message-ID: <569A5144.10601@gmail.com>
Content-Type: text/plain; charset=utf-8

Sorry, but I can't reliably replicate the same problem when running
TUNING_tune.1 alone. There is no character '_' in the test set or top50
list.

I'm using sparse-features = "target-word-insertion top 50,
source-word-deletion top 50, word-translation top 50 50, phrase-length"

I've attached some related files from EMS and the EMS config.

https://mega.nz/#!xs0SFKxL!M_RTBp1JGX24-b4xlYYLP-bLXKiC_Sl-p96x55avAB4

--
Dingyuan Wang (gumblex)


------------------------------

Message: 4
Date: Sat, 16 Jan 2016 15:42:05 +0000
From: Hieu Hoang <hieuhoang@gmail.com>
Subject: Re: [Moses-support] kbmira died with SIGABRT when tuning
To: Dingyuan Wang <abcdoyle888@gmail.com>
Cc: moses-support <moses-support@mit.edu>
Message-ID:
<CAEKMkbhM510qTEBPz_C0HfQeBFLJ-NFfpHtTVPSW=Dx24F5Xug@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

The mert script prints out every command it runs. You should be able to
replicate the error by running the last command.
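
A small helper can pull that last command out of a saved log. This is only a
sketch. It assumes each command is logged on its own line with an
"Executing: " prefix, as in the train-model.perl output quoted in Message 2.
Whether mert.log uses exactly the same prefix is an assumption, and the
function name is hypothetical.

```python
def last_executed_command(log_text, prefix="Executing: "):
    """Return the last command logged with the given prefix, or None.

    The prefix is an assumption about the log format; adjust it to match
    whatever the script actually prints before each command.
    """
    last = None
    for line in log_text.splitlines():
        if line.startswith(prefix):
            last = line[len(prefix):].strip()
    return last


if __name__ == "__main__":
    # Made-up log excerpt for illustration only.
    sample_log = (
        "Executing: first-command --flag\n"
        "some output\n"
        "Executing: second-command input.dat\n"
        "Aborted\n"
    )
    print(last_executed_command(sample_log))
```

The returned string can then be pasted into a shell to rerun only the step
that failed.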

------------------------------

_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


End of Moses-support Digest, Vol 111, Issue 46
**********************************************
