Send Moses-support mailing list submissions to
moses-support@mit.edu
To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
moses-support-request@mit.edu
You can reach the person managing the list at
moses-support-owner@mit.edu
When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."
Today's Topics:
1. Re: Fuzzy Match Rule segfaults (Jon Olds)
2. Problem with syntactic baselines, rule tables are too small
!!!! (hxshi)
----------------------------------------------------------------------
Message: 1
Date: Tue, 20 Jan 2015 21:43:27 +0000
From: Jon Olds <joft_uk@yahoo.co.uk>
Subject: Re: [Moses-support] Fuzzy Match Rule segfaults
To: Hieu Hoang <Hieu.Hoang@ed.ac.uk>
Cc: "moses-support@mit.edu" <moses-support@mit.edu>
Message-ID: <54BECBFF.6040001@yahoo.co.uk>
Content-Type: text/plain; charset="utf-8"
Thanks Hieu. I'll try to get it working with a much smaller data set
first. I just wanted to make sure that my moses.ini file looks ok.
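One cheap check before rerunning is that every file named on a [feature] line actually exists. The following is a minimal sketch in Python, not part of Moses: `parse_feature_line` and `missing_files` are hypothetical helper names, and the set of keys treated as file paths is an assumption based on the config quoted below.

```python
import os

# Keys assumed to name files on disk (an assumption for illustration only).
PATH_KEYS = {"source", "target", "alignment", "path"}

def parse_feature_line(line):
    """Split 'Name key=value ...' into (name, {key: value})."""
    name, *pairs = line.split()
    options = dict(pair.split("=", 1) for pair in pairs)
    return name, options

def missing_files(line):
    """Return the path-valued options whose files do not exist."""
    _, options = parse_feature_line(line)
    return sorted(
        value for key, value in options.items()
        if key in PATH_KEYS and not os.path.exists(value)
    )

line = "KENLM lazyken=0 name=LM0 factor=0 path=/home/ubuntu/train/lm/base.blm.en order=3"
name, options = parse_feature_line(line)
print(name, options["order"])
print(missing_files(line))  # lists the LM path if it is absent on this machine
```

This only catches missing files; it says nothing about whether the fuzzy-match feature itself is configured correctly.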
On 20/01/2015 12:15, Hieu Hoang wrote:
> it should work, but no-one asked about it for a long time. It may not
> be used by many people and fallen into a state of disrepair.
>
> If you want to send me your data files, I can take a look at it when
> I've got time
>
> On 16 January 2015 at 15:51, Jon Olds <joft_uk@yahoo.co.uk> wrote:
>
> Hi,
>
> I've been taking another look at the fuzzy match rule for hierarchical
> models. I am really not sure how to set it up but seem to have got some
> response by using the moses.ini below.
>
> Unfortunately, it then segfaults every time after the following output.
>
> I'm probably doing something very stupid, but any assistance would be
> much appreciated.
>
> Cheers,
>
> Jon
>
> (6.6) consolidating the two halves @ Fri Jan 16 15:21:28 UTC 2015
> Executing: /home/ubuntu/tools/mosesdecoder/scripts/../bin/consolidate
> /tmp/moses.gpitcm/fuzzyMatchFile.pt.half.f2e.gz
> /tmp/moses.gpitcm/fuzzyMatchFile.pt.half.e2f.gz /dev/stdout
> --Hierarchical | gzip -c > /tmp/moses.gpitcm/fuzzyMatchFile.pt.gz
> Consolidate v2.0 written by Philipp Koehn
> consolidating direct and indirect rule tables
> processing hierarchical rules
> Executing: rm -f /tmp/moses.gpitcm/fuzzyMatchFile.pt.half.*
> Start loading fuzzy-match phrase model : [41.633] seconds
> Line 0: Initialize search took 0.116 seconds total
> Translating: <s> ce véhicule est à ce jour le plus important dédié à l'
> immobilier tertiaire en Ile - de - France . </s> ||| [0,0]=X (1)
> [0,1]=X
> (1) [0,2]=X (1) [0,3]=X (1) [0,4]=X (1) [0,5]=X (1) [0,6]=X (1)
> [0,7]=X
> (1) [0,8]=X (1) [0,9]=X (1) [0,10]=X (1) [0,11]=X (1) [0,12]=X (1)
> [0,13]=X (1) [0,14]=X (1) [0,15]=X (1) [0,16]=X (1) [0,17]=X (1)
> [0,18]=X (1) [0,19]=X (1) [0,20]=X (1) [0,21]=X (1) [0,22]=X (1)
> [1,1]=X
> (1) [1,2]=X (1) [1,3]=X (1) [1,4]=X (1) [1,5]=X (1) [1,6]=X (1)
> [1,7]=X
> (1) [1,8]=X (1) [1,9]=X (1) [1,10]=X (1) [1,11]=X (1) [1,12]=X (1)
> [1,13]=X (1) [1,14]=X (1) [1,15]=X (1) [1,16]=X (1) [1,17]=X (1)
> [1,18]=X (1) [1,19]=X (1) [1,20]=X (1) [1,21]=X (1) [1,22]=X (1)
> [2,2]=X
> (1) [2,3]=X (1) [2,4]=X (1) [2,5]=X (1) [2,6]=X (1) [2,7]=X (1)
> [2,8]=X
> (1) [2,9]=X (1) [2,10]=X (1) [2,11]=X (1) [2,12]=X (1) [2,13]=X (1)
> [2,14]=X (1) [2,15]=X (1) [2,16]=X (1) [2,17]=X (1) [2,18]=X (1)
> [2,19]=X (1) [2,20]=X (1) [2,21]=X (1) [2,22]=X (1) [3,3]=X (1)
> [3,4]=X
> (1) [3,5]=X (1) [3,6]=X (1) [3,7]=X (1) [3,8]=X (1) [3,9]=X (1)
> [3,10]=X
> (1) [3,11]=X (1) [3,12]=X (1) [3,13]=X (1) [3,14]=X (1) [3,15]=X (1)
> [3,16]=X (1) [3,17]=X (1) [3,18]=X (1) [3,19]=X (1) [3,20]=X (1)
> [3,21]=X (1) [3,22]=X (1) [4,4]=X (1) [4,5]=X (1) [4,6]=X (1) [4,7]=X
> (1) [4,8]=X (1) [4,9]=X (1) [4,10]=X (1) [4,11]=X (1) [4,12]=X (1)
> [4,13]=X (1) [4,14]=X (1) [4,15]=X (1) [4,16]=X (1) [4,17]=X (1)
> [4,18]=X (1) [4,19]=X (1) [4,20]=X (1) [4,21]=X (1) [4,22]=X (1)
> [5,5]=X
> (1) [5,6]=X (1) [5,7]=X (1) [5,8]=X (1) [5,9]=X (1) [5,10]=X (1)
> [5,11]=X (1) [5,12]=X (1) [5,13]=X (1) [5,14]=X (1) [5,15]=X (1)
> [5,16]=X (1) [5,17]=X (1) [5,18]=X (1) [5,19]=X (1) [5,20]=X (1)
> [5,21]=X (1) [5,22]=X (1) [6,6]=X (1) [6,7]=X (1) [6,8]=X (1) [6,9]=X
> (1) [6,10]=X (1) [6,11]=X (1) [6,12]=X (1) [6,13]=X (1) [6,14]=X (1)
> [6,15]=X (1) [6,16]=X (1) [6,17]=X (1) [6,18]=X (1) [6,19]=X (1)
> [6,20]=X (1) [6,21]=X (1) [6,22]=X (1) [7,7]=X (1) [7,8]=X (1) [7,9]=X
> (1) [7,10]=X (1) [7,11]=X (1) [7,12]=X (1) [7,13]=X (1) [7,14]=X (1)
> [7,15]=X (1) [7,16]=X (1) [7,17]=X (1) [7,18]=X (1) [7,19]=X (1)
> [7,20]=X (1) [7,21]=X (1) [7,22]=X (1) [8,8]=X (1) [8,9]=X (1)
> [8,10]=X
> (1) [8,11]=X (1) [8,12]=X (1) [8,13]=X (1) [8,14]=X (1) [8,15]=X (1)
> [8,16]=X (1) [8,17]=X (1) [8,18]=X (1) [8,19]=X (1) [8,20]=X (1)
> [8,21]=X (1) [8,22]=X (1) [9,9]=X (1) [9,10]=X (1) [9,11]=X (1)
> [9,12]=X
> (1) [9,13]=X (1) [9,14]=X (1) [9,15]=X (1) [9,16]=X (1) [9,17]=X (1)
> [9,18]=X (1) [9,19]=X (1) [9,20]=X (1) [9,21]=X (1) [9,22]=X (1)
> [10,10]=X (1) [10,11]=X (1) [10,12]=X (1) [10,13]=X (1) [10,14]=X (1)
> [10,15]=X (1) [10,16]=X (1) [10,17]=X (1) [10,18]=X (1) [10,19]=X (1)
> [10,20]=X (1) [10,21]=X (1) [10,22]=X (1) [11,11]=X (1) [11,12]=X (1)
> [11,13]=X (1) [11,14]=X (1) [11,15]=X (1) [11,16]=X (1) [11,17]=X (1)
> [11,18]=X (1) [11,19]=X (1) [11,20]=X (1) [11,21]=X (1) [11,22]=X (1)
> [12,12]=X (1) [12,13]=X (1) [12,14]=X (1) [12,15]=X (1) [12,16]=X (1)
> [12,17]=X (1) [12,18]=X (1) [12,19]=X (1) [12,20]=X (1) [12,21]=X (1)
> [12,22]=X (1) [13,13]=X (1) [13,14]=X (1) [13,15]=X (1) [13,16]=X (1)
> [13,17]=X (1) [13,18]=X (1) [13,19]=X (1) [13,20]=X (1) [13,21]=X (1)
> [13,22]=X (1) [14,14]=X (1) [14,15]=X (1) [14,16]=X (1) [14,17]=X (1)
> [14,18]=X (1) [14,19]=X (1) [14,20]=X (1) [14,21]=X (1) [14,22]=X (1)
> [15,15]=X (1) [15,16]=X (1) [15,17]=X (1) [15,18]=X (1) [15,19]=X (1)
> [15,20]=X (1) [15,21]=X (1) [15,22]=X (1) [16,16]=X (1) [16,17]=X (1)
> [16,18]=X (1) [16,19]=X (1) [16,20]=X (1) [16,21]=X (1) [16,22]=X (1)
> [17,17]=X (1) [17,18]=X (1) [17,19]=X (1) [17,20]=X (1) [17,21]=X (1)
> [17,22]=X (1) [18,18]=X (1) [18,19]=X (1) [18,20]=X (1) [18,21]=X (1)
> [18,22]=X (1) [19,19]=X (1) [19,20]=X (1) [19,21]=X (1) [19,22]=X (1)
> [20,20]=X (1) [20,21]=X (1) [20,22]=X (1) [21,21]=X (1) [21,22]=X (1)
> [22,22]=X (1)
>
>
> ### MOSES CONFIG FILE ###
> #########################
>
> # input factors
> [input-factors]
> 0
>
> # mapping steps
> [mapping]
> 0 T 0
>
> [cube-pruning-pop-limit]
> 1000
>
> [non-terminals]
> X
>
> [search-algorithm]
> 3
>
> [inputtype]
> 3
>
> [max-chart-span]
> 20
>
> # feature functions
> [feature]
> PhraseDictionaryFuzzyMatch source=/home/ubuntu/data/tok/base.clean.fr target=/home/ubuntu/data/tok/base.clean.en alignment=/home/ubuntu/train/model/aligned.grow-diag-final-and num-features=0
> KENLM lazyken=0 name=LM0 factor=0 path=/home/ubuntu/train/lm/base.blm.en order=3
>
> # dense weights for feature functions
>
>
> [weight]
> LM0= 0.0866615
>
> _______________________________________________
> Moses-support mailing list
> Moses-support@mit.edu
> http://mailman.mit.edu/mailman/listinfo/moses-support
>
>
>
>
> --
> Hieu Hoang
> Research Associate
> University of Edinburgh
> http://www.hoang.co.uk/hieu
>
------------------------------
Message: 2
Date: Wed, 21 Jan 2015 13:08:38 +0800
From: hxshi <hxshi@mtlab.hit.edu.cn>
Subject: [Moses-support] Problem with syntactic baselines, rule tables
are too small !!!!
To: moses-support <moses-support@mit.edu>
Cc: theshihuaxing <theshihuaxing@gmail.com>
Message-ID: <2015012113083807656322@mtlab.hit.edu.cn>
Content-Type: text/plain; charset="gb2312"
I am trying to build a syntactic baseline, using the FBIS data as the training set.
However, the rule table I get is far too small, and the system cannot translate anything.
The steps I used to build the baseline are as follows:
Training data: 234,348 lines on both the Chinese side and the English side of the FBIS data.
For example:
En: the effort against corruption has been intensified , echoing the antismuggling campaign .
Zh: ?????? ???? ???? , ?? ?????? ???? ???? ?? ?? ??
I followed the guideline:
Step 0: parse the English sentences with zpar
Output such as:
(S (NP (NP (DT the) (NN effort)) (PP (IN against) (NP (NN corruption)))) (VP (VBZ has) (VP (VBN been)
(VP (VBN intensified) (, ,) (S (VP (VBG echoing) (NP (DT the) (JJ antismuggling) (NN campaign))))))) (. .))
Step 1: wrap the syntactic trees with /moses/scripts/training/wrappers/berkeleyparsed2mosesxml.perl
Output such as:
<tree label="S"> <tree label="NP"> <tree label="NP"> <tree label="DT"> the </tree> <tree label="NN"> effort </tree>
</tree> <tree label="PP"> <tree label="IN"> against </tree> <tree label="NP"> <tree label="NN"> corruption </tree>
</tree> </tree> </tree> <tree label="VP"> <tree label="VBZ"> has </tree> <tree label="VP"> <tree label="VBN"> been
</tree> <tree label="VP"> <tree label="VBN"> intensified </tree> <tree label=","> , </tree> <tree label="S"> <tree label="VP">
<tree label="VBG"> echoing </tree> <tree label="NP"> <tree label="DT"> the </tree> <tree label="JJ"> antismuggling </tree>
<tree label="NN"> campaign </tree> </tree> </tree> </tree> </tree> </tree> </tree> <tree label="."> . </tree> </tree>
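One sanity check at this stage is to strip the tree markup from each wrapped line and confirm that the remaining tokens are exactly the tokenized sentence that was parsed. The sketch below is written in Python and is not part of the Moses toolkit. The helper names `tree_tokens` and `matches_sentence` are hypothetical, and the pattern assumes the `<tree label="...">` markup shown in the step 1 output above.

```python
import re

# Matches opening tags such as <tree label="NP"> and closing </tree> tags.
TREE_TAG = re.compile(r"</?tree\b[^>]*>")

def tree_tokens(line):
    """Return the terminal tokens of one XML-wrapped tree line."""
    return TREE_TAG.sub(" ", line).split()

def matches_sentence(tree_line, sentence):
    """True if the tree's terminals equal the whitespace-tokenized sentence."""
    return tree_tokens(tree_line) == sentence.split()

wrapped = (
    '<tree label="NP"> <tree label="DT"> the </tree> '
    '<tree label="NN"> effort </tree> </tree>'
)
print(tree_tokens(wrapped))
print(matches_sentence(wrapped, "the effort"))
```

Running this over the whole corpus and counting the mismatches would show whether the parsed side still lines up token for token with the text that was word-aligned.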
Step 2: train the model with the following command:
train-model.perl --source-syntax -max-phrase-length=999 --extract-options="--MaxSpan 999" -lm 0:5:${lm_dir}/lmsri.cn --corpus ${corpus_dir}/train_all --f en --e zh
-root-dir $train_dir -external-bin-dir /home/hxshi/moses/tools/bin -mgiza -mgiza-cpus 6 -cores 10 --alignment grow-diag-final-and -score-options ' --GoodTuring'
What I got:
234348 lines aligned.0.en
234348 lines aligned.0.zh
234348 lines aligned.grow-diag-final-and
3252 lines extract.inv.sorted.gz
3252 lines extract.sorted.gz
1724540 lines lex.e2f
1724540 lines lex.f2e
43 lines moses.ini
2935 lines rule-table.gz
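For scale, 3,252 extract lines over 234,348 sentence pairs is roughly 14 lines per 1,000 pairs. A small sketch for reproducing such counts follows, in Python and not part of Moses. `count_lines` and `lines_per_pair` are hypothetical helper names, and the demo writes its own temporary file rather than assuming any particular model directory.

```python
import gzip
import os
import tempfile

def count_lines(path):
    """Count lines in a plain or gzip-compressed file."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rb") as handle:
        return sum(1 for _ in handle)

def lines_per_pair(extract_lines, corpus_lines):
    """Average number of extract lines per training sentence pair."""
    return extract_lines / corpus_lines

# Demo on a throwaway gzip file so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    demo = os.path.join(tmp, "extract.sorted.gz")
    with gzip.open(demo, "wb") as out:
        out.write(b"rule 1\nrule 2\nrule 3\n")
    demo_count = count_lines(demo)

print(demo_count)
print(lines_per_pair(3252, 234348))  # figures taken from the listing above
```

Pointing `count_lines` at the real extract and rule-table files after each change to the training setup makes it easy to see whether a change had any effect on extraction.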
Step 3: tune with the command:
mert-moses.pl --inputtype 3 $d_s $d_ref /home/hxshi/moses/tools/moses/bin/moses $d_ini --working-dir ${tuning_dir} --batch-mira --return-best-dev
--decoder-flags " -threads 20 -v 0 " --rootdir /home/hxshi/moses/tools/moses/scripts -mertdir /home/hxshi/moses/tools/moses/bin --threads 20 --maximum-iterations 30
It stopped during the very first run.
Please find my moses.ini and my tuning output enclosed.
I tried both Tree2String (En2Ch) and String2Tree (Ch2En). The result is almost the same: nothing can be translated.
Thank you for your patience in reading this mail. I would be very grateful for a prompt response.
Shi Huaxing
MI&T Lab
School of Computer Science and Technology
Harbin Institute of Technology
-------------- next part --------------
A non-text attachment was scrubbed...
Name: log.turning
Type: application/octet-stream
Size: 2565 bytes
Desc: not available
Url : http://mailman.mit.edu/mailman/private/moses-support/attachments/20150121/508d4835/attachment.obj
-------------- next part --------------
A non-text attachment was scrubbed...
Name: moses.ini
Type: application/octet-stream
Size: 783 bytes
Desc: not available
Url : http://mailman.mit.edu/mailman/private/moses-support/attachments/20150121/508d4835/attachment-0001.obj
------------------------------
_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support
End of Moses-support Digest, Vol 99, Issue 46
*********************************************