Moses-support Digest, Vol 108, Issue 45

Send Moses-support mailing list submissions to
moses-support@mit.edu

To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
moses-support-request@mit.edu

You can reach the person managing the list at
moses-support-owner@mit.edu

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."


Today's Topics:

1. Re: Segmentation Fault during Tuning (Philipp Koehn)
2. tokenizer / detokenizer (Vincent Nguyen)
3. Compact lex reordering table on OSX/clang (Hieu Hoang)
4. Re: how to run moses through shell script in linux (Philipp Koehn)


----------------------------------------------------------------------

Message: 1
Date: Mon, 12 Oct 2015 09:09:06 -0700
From: Philipp Koehn <phi@jhu.edu>
Subject: Re: [Moses-support] Segmentation Fault during Tuning
To: Alex Martinez <cmxela@me.com>
Cc: "moses-support@mit.edu" <moses-support@mit.edu>
Message-ID:
<CAAFADDCCtfmQ6Sup2CUvxYkAJic-tKBrmeTKypzgfKhzCx_pDg@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

Hi,

In t2, you do not generate an output lemma factor, which may be the cause of
this problem (even though you do not seem to use the output lemma anywhere
else).

Does it still core dump if you change translation-factors to:

translation-factors = "lemma -> lemma, pos -> pos, word -> word + lemma + pos"

-phi
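
One way to inspect the point raised above is to list which output factors
each decoding path produces. The following is a minimal sketch, not part of
Moses or EMS. The function names outputs_of and path_factors are
hypothetical, and it assumes the EMS syntax quoted in this thread: mappings
separated by ",", factors joined by "+", and steps named tN or gN. It only
reports factor names; whether a mismatch between paths actually causes the
crash is the hypothesis being tested, not something the script proves.

```shell
#!/bin/bash
# Sketch: list the output factors produced by one decoding path.

# outputs_of "<mappings>" <index>
# Print the output factors (one per line) of the index-th mapping (0-based).
outputs_of() {
    printf '%s\n' "$1" | cut -d',' -f"$(($2 + 1))" \
        | sed 's/.*->//' | tr -d ' ' | tr '+' '\n'
}

# path_factors "<path>" "<translation-factors>" "<generation-factors>"
# Print the sorted, de-duplicated output factors of a path such as "t0,g0,t1,g1".
path_factors() {
    for step in $(printf '%s' "$1" | tr ',' ' '); do
        case "$step" in
            t*) outputs_of "$2" "${step#t}" ;;   # translation step
            g*) outputs_of "$3" "${step#g}" ;;   # generation step
        esac
    done | sort -u | paste -sd' ' -
}

# Settings copied from the EMS config quoted in this thread.
tf='lemma -> lemma, pos -> pos, word -> word + pos'
gf='lemma -> pos, lemma+pos -> word'
for path in 't0,g0,t1,g1' 't2'; do
    printf '%s => %s\n' "$path" "$(path_factors "$path" "$tf" "$gf")"
done
```

With the original settings the first path yields "lemma pos word" and the
backoff path t2 yields only "pos word". With the translation-factors line
suggested above, both paths yield the same three factors.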

On Sat, Oct 10, 2015 at 9:52 AM, Alex Martinez <cmxela@me.com> wrote:

> Hello,
> I'm trying to build a factored system using EMS based on this example from
> the tutorial:
> ---------------------------------------------------------------------
> % train-model.perl \
> --corpus factored-corpus/proj-syndicate.1000 \
> --root-dir morphgen-backoff \
> --f de --e en \
> --lm 0:3:factored-corpus/surface.lm:0 \
> --lm 2:3:factored-corpus/pos.lm:0 \
> --translation-factors 1-1+3-2+0-0,2 \
> --generation-factors 1-2+1,2-0 \
> --decoding-steps t0,g0,t1,g1:t2 \
> --external-bin-dir .../tools
> ----------------------------------------------------------------------
> I'm getting a segmentation fault during tuning and I have the feeling that
> the problem is related to the line defining the decoding-steps.
> What I have on my EMS config file to get a similar model is:
> --------------------------------------------------------------------
> ### factored training: specify here which factors used
> # if none specified, single factor training is assumed
> # (one translation step, surface to surface)
> #
> input-factors = word lemma pos
> output-factors = word lemma pos
> alignment-factors = "word+lemma -> word+lemma"
> translation-factors = "lemma -> lemma, pos -> pos, word -> word + pos"
> reordering-factors = "word -> word"
> generation-factors = "lemma -> pos, lemma+pos -> word"
> decoding-steps = "t0,g0,t1,g1:t2"
> generation-type = single
> prune-generation = "$moses-bin-dir/pruneGeneration 100"
> -------------------------------------------------------------------------
>
> The training fails in the tuning step and I'm getting this in the
> TUNING_tune.1.STDERR:
>
> Executing: /opt/moses/bin/moses -threads all -v 0 -config
> /mnt/a62/devel/en_es/processfin/model/moses.bin.ini.1 -weight-overwrite
> 'WordPenalty0= -0.128205 TranslationModel0= 0.025641 0.025641 0.025641
> 0.025641 LM2= 0.064103 LM0= 0.064103 GenerationModel1= 0.038462 0.000000
> TranslationModel2= 0.025641 0.025641 0.025641 0.025641 GenerationModel0=
> 0.038462 PhrasePenalty0= 0.025641 Distortion0= 0.038462 TranslationModel1=
> 0.025641 0.025641 0.025641 0.025641 LexicalReordering0= 0.038462 0.038462
> 0.038462 0.038462 0.038462 0.038462 LM1= 0.064103' -n-best-list
> run1.best100.out 100 distinct -input-file
> /mnt/a62/devel/en_es/data/corpora.tuning.en > run1.out
> Segmentation fault (core dumped)
> Exit code: 139
> The decoder died. CONFIG WAS -weight-overwrite 'WordPenalty0= -0.128205
> TranslationModel0= 0.025641 0.025641 0.025641 0.025641 LM2= 0.064103 LM0=
> 0.064103 GenerationModel1= 0.038462 0.000000 TranslationModel2= 0.025641
> 0.025641 0.025641 0.025641 GenerationModel0= 0.038462 PhrasePenalty0=
> 0.025641 Distortion0= 0.038462 TranslationModel1= 0.025641 0.025641
> 0.025641 0.025641 LexicalReordering0= 0.038462 0.038462 0.038462 0.038462
> 0.038462 0.038462 LM1= 0.064103'
> cp: cannot stat '/mnt/a62/devel/en_es/processfin/tuning/tmp.1/moses.ini':
> No such file or directory
> -------------------------------------------
>
> If I change this line in the config file from
>
> decoding-steps = "t0,g0,t1,g1:t2"
>
> to
>
> decoding-steps = "t0,g0,t1,g1"
>
> then the training ends without errors.
>
> I'd appreciate any suggestions on how to solve this.
>
> Alex
>
>
>
> _______________________________________________
> Moses-support mailing list
> Moses-support@mit.edu
> http://mailman.mit.edu/mailman/listinfo/moses-support
>
>

------------------------------

Message: 2
Date: Mon, 12 Oct 2015 18:10:12 +0200
From: Vincent Nguyen <vnguyen@neuf.fr>
Subject: [Moses-support] tokenizer / detokenizer
To: moses-support <moses-support@mit.edu>
Message-ID: <561BDB64.5010701@neuf.fr>
Content-Type: text/plain; charset=utf-8; format=flowed

Hello,

This is probably of no academic importance, but:

For the tokenizer, we have the -x option to skip XML/HTML tags.

The detokenizer, however, always skips any line that looks like a tag, with
no option to control it. See:

while(<STDIN>) {
  if (/^<.+>$/ || /^\s*$/) {
    #don't try to detokenize XML/HTML tag lines
    print $_;
  } elsif ($PENN) {
    print &detokenize_penn($_);
  } else {
    print &detokenize($_);
  }
}


I think that, to be consistent, there should be a -x option in the
detokenizer too.

Otherwise it skips entire lines that start with "<" and end with ">", even
when they contain text to detokenize.

Cheers,

Vincent
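
To make the behaviour described above concrete, here is a minimal sketch
that applies the same two line tests as the quoted Perl loop. It is not part
of the Moses scripts: the function name is_passed_through and the sample
lines are hypothetical, and [[:space:]] is used as the closest grep
equivalent of Perl's \s.

```shell
#!/bin/bash
# Sketch only: reproduces the two line tests from the Perl loop quoted above
# to show which input lines the detokenizer would print unchanged.

# Succeeds (exit status 0) when the line would be passed through untouched:
# it starts with "<" and ends with ">", or it is empty or whitespace only.
is_passed_through() {
    printf '%s\n' "$1" | grep -Eq '^<.+>$|^[[:space:]]*$'
}

# Hypothetical sample lines, not taken from any real corpus.
for line in '<doc id="1">' '<b> Hello , world ! </b>' 'Hello , <b> world </b> !'; do
    if is_passed_through "$line"; then
        printf 'unchanged:   %s\n' "$line"
    else
        printf 'detokenized: %s\n' "$line"
    fi
done
```

The second sample line is the case in question: it contains text between
the tags, yet it matches the tag-line pattern and so would be printed as is.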




------------------------------

Message: 3
Date: Mon, 12 Oct 2015 17:15:26 +0100
From: Hieu Hoang <hieuhoang@gmail.com>
Subject: [Moses-support] Compact lex reordering table on OSX/clang
To: moses-support <moses-support@mit.edu>, Marcin Junczys-Dowmunt
<junczys@amu.edu.pl>
Message-ID: <561BDC9E.9060106@gmail.com>
Content-Type: text/plain; charset=utf-8; format=flowed

I'm not sure if anyone else has encountered it, but the compact lexical
reordering table crashes for me on OSX/clang during loading.

The stack trace I have for this is:
LexicalReorderingTableCompact::LexicalReorderingTableCompact
LexicalReorderingTableCompact::Load line 180
StringVector::load line 2808
StringVector::loadCharArray line 247

--
Hieu Hoang
http://www.hoang.co.uk/hieu



------------------------------

Message: 4
Date: Mon, 12 Oct 2015 09:26:26 -0700
From: Philipp Koehn <phi@jhu.edu>
Subject: Re: [Moses-support] how to run moses through shell script in
linux
To: Apurva Joshi <apurvajoshi1992@gmail.com>
Cc: "moses-support@mit.edu" <moses-support@mit.edu>
Message-ID:
<CAAFADDC1yhAbhQe9284=SHFDcSANOtp4n3Xzjd3zfX5uajFKyg@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

Hi,

To translate a set of English sentences, you have to put them into a
separate file, say, "english-input.txt", and then run the decoder as
follows:

~/mosesdecoder-RELEASE-3.0/bin/moses -f ~/working/train/model/moses.ini < english-input.txt

-phi
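
The command above can be wrapped in a small script. This is a minimal
sketch under stated assumptions: the function name translate_file and the
environment variables MOSES_BIN and MOSES_INI are hypothetical, and the
default paths are simply the ones used in this thread. It relies only on the
behaviour described above, namely that the decoder reads sentences from
standard input, and on it writing translations to standard output.

```shell
#!/bin/bash
# Sketch: translate a file of sentences (one per line) in a single decoder run.
# MOSES_BIN and MOSES_INI are hypothetical overrides; defaults come from the thread.
MOSES_BIN="${MOSES_BIN:-$HOME/mosesdecoder-RELEASE-3.0/bin/moses}"
MOSES_INI="${MOSES_INI:-$HOME/working/train/model/moses.ini}"

# translate_file <input-file> <output-file>
translate_file() {
    if [ ! -r "$1" ]; then
        echo "cannot read input file: $1" >&2
        return 1
    fi
    # Feed the whole file on stdin and capture the translations from stdout.
    "$MOSES_BIN" -f "$MOSES_INI" < "$1" > "$2"
}

# Example call (file names are placeholders):
#   translate_file english-input.txt hindi-output.txt
```

Because all sentences in the file go through one decoder process, the
models are loaded once per file rather than once per sentence. This is also
why the script in the quoted message below does not work: a bare "hello" on
its own line is run as a separate command instead of being passed to the
decoder as input.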

On Thu, Oct 8, 2015 at 10:59 PM, Apurva Joshi <apurvajoshi1992@gmail.com>
wrote:

> my snapshot...
>
>
> On Fri, Oct 9, 2015 at 11:27 AM, Apurva Joshi <apurvajoshi1992@gmail.com>
> wrote:
>
>> Hello everyone. I have built an SMT system for English-Hindi using Moses.
>> I use the following command to run it:
>>
>> ~/mosesdecoder-RELEASE-3.0/bin/moses -f ~/working/train/model/moses.ini
>>
>> After running the above command, my LM and TM start loading. Then I type
>> an English word and I get the corresponding Hindi translation as output.
>>
>> It stays in that mode continuously, meaning it does not go back to the
>> Linux prompt '$'. I am attaching a snapshot of the output when I run the
>> above command.
>>
>> Since it stays in that mode, I wrote the following script:
>>
>> #!/bin/bash
>> ~/mosesdecoder-RELEASE-3.0/bin/moses -f ~/working/train/model/moses.ini
>> hello
>>
>>
>>
>> The above script first runs the moses command, but then the script cannot
>> accept "hello" as input for Hindi conversion.
>>
>>
>> Please tell me how to solve this problem. I have one command which takes
>> an English file as input and gives a Hindi output file, but every time it
>> loads the LM and TM as above and gives a file as output.
>>
>> I want to run the above-mentioned command once at the start, so that
>> afterwards I can give input any number of times without it loading the
>> LM and TM for each input.
>>
>>
>> Please help, and please reply to the same email address:
>>
>> apurvajoshi1992@gmail.com
>>
>> I am really waiting for your help.
>>
>>
>>
>

------------------------------

_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


End of Moses-support Digest, Vol 108, Issue 45
**********************************************
