Send Moses-support mailing list submissions to
moses-support@mit.edu
To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
moses-support-request@mit.edu
You can reach the person managing the list at
moses-support-owner@mit.edu
When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."
Today's Topics:
1. How to train a tree-to-tree model? (Steven Huang)
2. cfp for 20th International Conference on Application of
Natural Language to Information Systems (NLDB'15) (Michael Zock)
----------------------------------------------------------------------
Message: 1
Date: Thu, 4 Dec 2014 14:16:11 +0800
From: Steven Huang <d98922047@ntu.edu.tw>
Subject: [Moses-support] How to train a tree-to-tree model?
To: moses-support@mit.edu, ??? <farmer.tw@gmail.com>
Message-ID:
<CAG-iPUoWKSvVTNn4ricg5ueMpPzrO4r=s40kRzwTMX11JaST3w@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"
Hi,
I am trying to build a tree-to-tree model. Before that, I've successfully
built a string-to-string syntax model with the following configuration (the
training corpus is in surface form).
/mosesdecoder/scripts/training/train-model.perl \
--root-dir train \
--mgiza \
--mgiza-cpus 20 \
--corpus /corpus \
--f en \
--e ch \
--lm 0:3:/lm/en-ch-surface.arpa.ch:8 \
--hierarchical \
--glue-grammar \
--max-phrase-length 10 \
--alignment grow-diag-final-and \
--external-bin-dir /mosesdecoder/tools
However, I failed to build a tree-to-tree model using the following
configuration with 2 modifications:
1. I added the --target-syntax and --source-syntax arguments, and
2. I used syntax-annotated XML as the training corpus (see the attached file for
reference).
/mosesdecoder/scripts/training/train-model.perl \
--root-dir train \
--mgiza \
--mgiza-cpus 20 \
--corpus /tree_test/tree \
--f en \
--e ch \
--lm 0:3:/lm/en-ch-surface.arpa.ch:8 \
--hierarchical \
--target-syntax \
--source-syntax \
--glue-grammar \
--max-phrase-length 10 \
--alignment grow-diag-final-and \
--external-bin-dir /mosesdecoder/tools
During training, there are many warnings like this:
Sent No: 13 , No. Occurrences: 1
0
3
ERROR: Forbidden zero sentence length 0
And en.vcb is generated with 9 lines:
en.vcb
1 UNK 0
2 morning 1
3 class 1
4 bel="FRAG"> 1
5 Good 1
6 GAO 1
7 : 1
8 . 1
9 , 1
It seems that the XML is not correctly parsed and is taken as plain text.
Is there anything wrong with my training configuration or training corpus?
Thanks a lot.
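One way to narrow this down before training is to check the annotated corpus line by line. The sketch below is only an illustration and not part of Moses: it assumes each line uses <tree label="..."> ... </tree> markup (suggested by the bel="FRAG"> fragment in en.vcb above), and the function names surface_tokens, naive_tokens and check_corpus are hypothetical. It reports lines that are not well-formed XML or that contain no words once the markup is removed, and shows what a plain whitespace split of the same line would produce.

```python
import xml.etree.ElementTree as ET


def surface_tokens(line):
    """Return the words left after removing tree markup from one corpus line.

    Raises xml.etree.ElementTree.ParseError if the line is not well-formed XML.
    """
    # Wrap the line in a dummy root so that a line holding several
    # top-level trees (or no markup at all) still parses as one document.
    root = ET.fromstring("<root>" + line.strip() + "</root>")
    # itertext() yields only the text between tags, i.e. the surface words.
    return " ".join(root.itertext()).split()


def naive_tokens(line):
    """Whitespace split that ignores markup: the XML treated as plain text."""
    return line.split()


def check_corpus(lines):
    """Return (line_number, problem) pairs for lines that look suspicious."""
    problems = []
    for number, line in enumerate(lines, start=1):
        try:
            if not surface_tokens(line):
                problems.append((number, "no words after removing markup"))
        except ET.ParseError as error:
            problems.append((number, "not well-formed XML: %s" % error))
    return problems


if __name__ == "__main__":
    # Assumed example line; the real tree.en attachment is not reproduced here.
    sample = ('<tree label="FRAG"> <tree label="JJ"> Good </tree> '
              '<tree label="NN"> morning </tree> </tree>')
    print(surface_tokens(sample))
    print(naive_tokens(sample))
    print(check_corpus([sample, "", '<tree label="NP"> class']))
```

If the whitespace split yields tokens such as label="FRAG"> while the markup-aware version yields only words, that would be consistent with the vocabulary file above having been built from unparsed XML. Lines reported by check_corpus would be candidates for the zero-sentence-length error, though this check alone does not establish the cause.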
-------------- next part --------------
A non-text attachment was scrubbed...
Name: tree.en
Type: application/octet-stream
Size: 357 bytes
Desc: not available
Url : http://mailman.mit.edu/mailman/private/moses-support/attachments/20141204/1eaff41e/attachment-0001.obj
------------------------------
Message: 2
Date: Thu, 04 Dec 2014 13:30:12 +0100
From: Michael Zock <Michael.Zock@lif.univ-mrs.fr>
Subject: [Moses-support] cfp for 20th International Conference on
Application of Natural Language to Information Systems (NLDB'15)
To: corpora@uib.no, moses-support@mit.edu
Cc: Chris Bieman <biem@cs.tu-darmstadt.de>
Message-ID: <548053D4.1050502@lif.univ-mrs.fr>
Content-Type: text/plain; charset="us-ascii"
An HTML attachment was scrubbed...
URL: http://mailman.mit.edu/mailman/private/moses-support/attachments/20141204/d6bcd93e/attachment.htm
------------------------------
End of Moses-support Digest, Vol 98, Issue 18
*********************************************