Send Moses-support mailing list submissions to
moses-support@mit.edu
To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
moses-support-request@mit.edu
You can reach the person managing the list at
moses-support-owner@mit.edu
When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."
Today's Topics:
1. Tokenization problem (Ihab Ramadan)
2. ERROR: Lexical reordering scoring failed (Jon Olds)
3. SDL Language Technologies Summer Intern Programme 2015
(Bill Byrne)
----------------------------------------------------------------------
Message: 1
Date: Mon, 5 Jan 2015 10:09:09 +0200
From: "Ihab Ramadan" <i.ramadan@saudisoft.com>
Subject: [Moses-support] Tokenization problem
To: <moses-support@mit.edu>
Message-ID: <004001d028be$e3895570$aa9c0050$@saudisoft.com>
Content-Type: text/plain; charset="us-ascii"
Dear all,
Running the tokenizer on the training files replaces apostrophes with
"' s" (with a space), but when I use the same script to tokenize a single
sentence, it produces "'s" (without a space).
This mismatch confuses the decoder during translation.
How can I solve this problem?
Thanks
Best Regards
Ihab Ramadan | Senior Developer | Saudisoft - Egypt <http://www.saudisoft.com/> |
Tel +2 02 330 320 37 Ext- 0 | Mob +201007570826 | Fax +20233032036
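A minimal sketch for narrowing down the mismatch described above, assuming the tokenized training file and the tokenized input sentence are both available as plain text. The message does not say how the tokenizer was invoked in each case, so this only measures which clitic spelling dominates in each text; a common cause of this kind of difference is that the two runs did not use identical tokenizer options, but that is an assumption here. The function names and regular expressions are hypothetical helpers, not part of Moses. The `&apos;` alternative covers the case where the tokenizer has escaped the apostrophe.

```python
import re
from collections import Counter

# Illustrative patterns for the two spellings reported in the message.
# They are assumptions for this sketch, not rules taken from the Moses tokenizer.
SPLIT_CLITIC = re.compile(r"(?:'|&apos;) s\b")   # apostrophe and "s" as separate tokens
JOINED_CLITIC = re.compile(r"(?:'|&apos;)s\b")   # apostrophe attached to "s"


def clitic_style(lines):
    """Count how often each apostrophe spelling occurs in tokenized lines."""
    counts = Counter()
    for line in lines:
        counts["split"] += len(SPLIT_CLITIC.findall(line))
        counts["joined"] += len(JOINED_CLITIC.findall(line))
    return counts


def dominant_style(lines):
    """Return "split", "joined", or None when neither spelling dominates."""
    counts = clitic_style(lines)
    if counts["split"] == counts["joined"]:
        return None
    return "split" if counts["split"] > counts["joined"] else "joined"
```

If `dominant_style` differs between the training corpus and the sentences sent to the decoder, the two tokenizer runs are producing different output, and comparing the exact command lines used for each is a reasonable next step.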
------------------------------
Message: 2
Date: Mon, 05 Jan 2015 08:51:56 +0000
From: Jon Olds <joft_uk@yahoo.co.uk>
Subject: [Moses-support] ERROR: Lexical reordering scoring failed
To: "moses-support@mit.edu" <moses-support@mit.edu>
Message-ID: <54AA50AC.8020102@yahoo.co.uk>
Content-Type: text/plain; charset=utf-8; format=flowed
Hi,
I'm trying to build a hierarchical model using the same (cleaned) data
that I have used successfully to build a phrase model and I keep getting
the following error:
ERROR: Lexical reordering scoring failed at
/home/ubuntu/tools/mosesdecoder/scripts/training/train-model.perl line 1776.
The command I used was:
nohup nice ~/tools/mosesdecoder/scripts/training/train-model.perl
-root-dir train -corpus ~/data/tok/base.clean -f fr -e en -alignment
grow-diag-final-and -reordering msd-bidirectional-fe -lm
0:3:$HOME/data/lm/base.blm.en:8 -external-bin-dir ~/tools/trainingtools
-hierarchical -glue-grammar -max-phrase-length 5 >& training.out &
Any ideas?
Many thanks,
Jon
/home/ubuntu/tools/mosesdecoder/scripts/generic/score-parallel.perl 1
"sort " /home/ubuntu/tools/mosesdecoder/scripts/../bin/score
/home/ubuntu/working/train/model/extract.inv.sorted.gz
/home/ubuntu/working/train/model/lex.e2f
/home/ubuntu/working/train/model/rule-table.half.e2f.gz --Inverse
--Hierarchical 1
Executing:
/home/ubuntu/tools/mosesdecoder/scripts/generic/score-parallel.perl 1
"sort " /home/ubuntu/tools/mosesdecoder/scripts/../bin/score
/home/ubuntu/working/train/model/extract.inv.sorted.gz
/home/ubuntu/working/train/model/lex.e2f
/home/ubuntu/working/train/model/rule-table.half.e2f.gz --Inverse
--Hierarchical 1
Started Sun Jan 4 12:55:12 2015
ln -s /home/ubuntu/working/train/model/extract.inv.sorted.gz
/home/ubuntu/working/train/model/tmp.25446/extract.0.gz
/home/ubuntu/tools/mosesdecoder/scripts/../bin/score
/home/ubuntu/working/train/model/tmp.25446/extract.0.gz
/home/ubuntu/working/train/model/lex.e2f
/home/ubuntu/working/train/model/tmp.25446/phrase-table.half.0000000.gz
--Inverse --Hierarchical 2>> /dev/stderr
/home/ubuntu/working/train/model/tmp.25446/run.0.sh
gunzip -c
/home/ubuntu/working/train/model/tmp.25446/phrase-table.half.*.gz 2>>
/dev/stderr| LC_ALL=C sort -T
/home/ubuntu/working/train/model/tmp.25446 | gzip -c >
/home/ubuntu/working/train/model/rule-table.half.e2f.gz 2>> /dev/stderr
rm -rf /home/ubuntu/working/train/model/tmp.25446
Finished Sun Jan 4 13:15:56 2015
(6.6) consolidating the two halves @ Sun Jan 4 13:15:56 UTC 2015
Executing: /home/ubuntu/tools/mosesdecoder/scripts/../bin/consolidate
/home/ubuntu/working/train/model/rule-table.half.f2e.gz
/home/ubuntu/working/train/model/rule-table.half.e2f.gz /dev/stdout
--Hierarchical | gzip -c > /home/ubuntu/working/train/model/rule-table.gz
Consolidate v2.0 written by Philipp Koehn
consolidating direct and indirect rule tables
processing hierarchical rules
Executing: rm -f /home/ubuntu/working/train/model/rule-table.half.*
(7) learn reordering model @ Sun Jan 4 13:25:58 UTC 2015
(7.1) [no factors] learn reordering model @ Sun Jan 4 13:25:58 UTC 2015
(7.2) building tables @ Sun Jan 4 13:25:58 UTC 2015
Executing:
/home/ubuntu/tools/mosesdecoder/scripts/../bin/lexical-reordering-score
/home/ubuntu/working/train/model/extract.o.sorted.gz 0.5
/home/ubuntu/working/train/model/reordering-table. --model "wbe msd
wbe-msd-bidirectional-fe"
Lexical Reordering Scorer
scores lexical reordering models of several types (hierarchical,
phrase-based and word-based-extraction
terminate called after throwing an instance of 'util::ErrnoException'
what(): util/file.cc:68 in int util::OpenReadOrThrow(const char*)
threw ErrnoException because `-1 == (ret = open(name, 00))'.
No such file or directory while opening
/home/ubuntu/working/train/model/extract.o.sorted.gz
Aborted (core dumped)
Exit code: 134
ERROR: Lexical reordering scoring failed at
/home/ubuntu/tools/mosesdecoder/scripts/training/train-model.perl line 1776.
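The log above fails because lexical-reordering-score cannot open extract.o.sorted.gz, while the earlier scoring step did find extract.inv.sorted.gz in the same directory. A minimal sketch for confirming this from a saved log such as training.out, and for listing which extract files the run actually produced, might look as follows. Both function names are hypothetical, and the regular expression assumes the error wording shown in the log. One thing that may be worth checking afterwards is whether the -reordering option applies when -hierarchical is set; the message itself does not settle this.

```python
import re
from pathlib import Path

# Matches the error wording seen in the log; the path may be wrapped onto the next line.
MISSING_FILE = re.compile(r"No such file or directory while opening\s+(\S+)")


def missing_files_in_log(log_text):
    """Return every path the log reports as missing, in order of appearance."""
    return MISSING_FILE.findall(log_text)


def existing_extract_files(model_dir):
    """List the extract* files present in the model directory, sorted by name."""
    return sorted(p.name for p in Path(model_dir).glob("extract*"))
```

Comparing the two results shows whether the file was never written by the extraction step or was written under a different name.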
------------------------------
Message: 3
Date: Mon, 5 Jan 2015 11:22:00 +0000
From: Bill Byrne <bbyrne@sdl.com>
Subject: [Moses-support] SDL Language Technologies Summer Intern
Programme 2015
To: "moses-support@mit.edu" <moses-support@mit.edu>
Cc: Bill Byrne <bbyrne@sdl.com>
Message-ID: <00F85450-E247-489F-9C63-EB26F9C5DCA8@sdl.com>
Content-Type: text/plain; charset="us-ascii"
SDL Language Technologies is looking for outstanding PhD students for summer internships with our research teams in Los Angeles (USA) and Cambridge (UK). Interns will work one-on-one with a mentor on a three-month NLP research project.
Our projects are centred around machine translation and natural language processing. The exact nature of the project will depend on the interests and background of the intern. The objective will be novel work of commercial value, and the expectation is that successful projects will lead to a paper submission. Examples of recent projects are:
- Target dependency language models in phrase-based decoding
- Source-side preordering for translation using logistic regression and depth-first branch-and-bound search
- Translating into morphologically rich languages
- Online adaptation of MT
- Predicting MT customer ratings
If interested, please send an email to mt-jobs@sdl.com by February 15, 2015, with "2015 Internship Application -- *your name*" in the subject. Please include your CV and a statement that describes your research interests and experience. U.S. citizenship / EU passport is not required, but please indicate your preference for working in Los Angeles or in Cambridge.
SDL Language Technologies (previously: Language Weaver) is the first company to bring a statistical translation system to market. Its R&D team solves large-scale NLP problems with the goal of increasing MT's commercial impact.
Bill Byrne | Senior Research Scientist, Director UK R&D Office | SDL Research | (m) +44 (0)7852910371
------------------------------
_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support
End of Moses-support Digest, Vol 99, Issue 6
********************************************