Moses-support Digest, Vol 112, Issue 8

Send Moses-support mailing list submissions to
moses-support@mit.edu

To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
moses-support-request@mit.edu

You can reach the person managing the list at
moses-support-owner@mit.edu

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."

Today's Topics:

1. Re: Polysynthetic languages? (Marcin Junczys-Dowmunt)
2. Re: kbmira died with SIGABRT when tuning (Dingyuan Wang)

----------------------------------------------------------------------

Message: 1
Date: Mon, 01 Feb 2016 21:07:59 +0100
From: Marcin Junczys-Dowmunt <junczys@amu.edu.pl>
Subject: Re: [Moses-support] Polysynthetic languages?
To: moses-support@mit.edu
Message-ID: <56AFBB1F.9030100@amu.edu.pl>
Content-Type: text/plain; charset=UTF-8; format=flowed

Plain tokenized text is good enough. It may even work as a tokenizer(?)
if none is available. There is no specific notion of "infix themes"
though. The segmentation is purely frequency-based, no linguistic
motivation there, but it may just work.

It's easy enough: just run it and take a look at the results. Even if it
looks strange to you, it may be worth doing a test training run anyway. As
I said, for Russian->English I get a nice improvement on patent data.
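
To make "purely frequency-based" concrete, here is a minimal sketch of
merge learning in the style of byte-pair encoding, the idea behind the
subword-nmt tool. It is an illustration only, not the tool's actual code:
the function names, the "</w>" end-of-word marker as used here, the
tie-breaking rule, and the toy word counts are all assumptions made for
this example.

```python
from collections import Counter


def merge_pair(symbols, pair):
    """Replace every adjacent occurrence of `pair` in `symbols` with one merged symbol."""
    merged = []
    i = 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            merged.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            merged.append(symbols[i])
            i += 1
    return tuple(merged)


def learn_merges(word_counts, num_merges):
    """Learn up to `num_merges` merges from a {word: frequency} dictionary."""
    # Start from single characters plus an end-of-word marker (an assumption of this sketch).
    vocab = {tuple(word) + ("</w>",): count for word, count in word_counts.items()}
    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs, weighted by how often each word occurs.
        pairs = Counter()
        for symbols, count in vocab.items():
            for left, right in zip(symbols, symbols[1:]):
                pairs[(left, right)] += count
        if not pairs:
            break
        # Most frequent pair wins; ties are broken by the pair itself so runs are repeatable.
        best = max(pairs, key=lambda p: (pairs[p], p))
        merges.append(best)
        vocab = {merge_pair(symbols, best): count for symbols, count in vocab.items()}
    return merges


def segment(word, merges):
    """Segment a (possibly unseen) word by replaying the learned merges in order."""
    symbols = tuple(word) + ("</w>",)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return list(symbols)


if __name__ == "__main__":
    toy_counts = {"low": 5, "lower": 2, "newest": 6, "widest": 3}  # made-up data
    merges = learn_merges(toy_counts, 3)
    print(merges)
    print(segment("lowest", merges))
```

With these toy counts the three merges build up the frequent ending "est",
so the unseen word "lowest" comes out as l, o, w, est</w>. Nothing in the
procedure knows about morphology; the number of merges plays the role of
the segment-dictionary size mentioned further down in this thread.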

On 01.02.2016 19:30, Michael Joyner wrote:
> So how does that work?
>
> It just takes all the words from the corpus and guesses "infix themes"?
> Or do I have to supply pre-tagged data?
>
> On Mon, Feb 1, 2016 at 9:04 AM, Rico Sennrich <rico.sennrich@gmx.ch> wrote:
>
> Hi Mike,
>
> here's a link to the tool Marcin mentioned:
> https://github.com/rsennrich/subword-nmt
>
> I haven't tried it on phrase-based MT myself, but feel free to
> give it a try.
>
> You could also try other unsupervised morpheme segmenters like
> morfessor: https://github.com/aalto-speech/morfessor
>
> I don't know if there are any segmentation methods specific to
> Cherokee.
>
> best wishes,
> Rico
>
>
> On 01.02.2016 13:31, Marcin Junczys-Dowmunt wrote:
>>
>> Hi Mike,
>>
>> Maybe take a look at Rico's tool for handling unknown words in
>> neural machine translation. I have been playing around with that
>> for Russian-English and standard phrase-based SMT with some
>> success. I am just not sure if your small corpora will be enough
>> to learn useful segmentations though.
>>
>> It's an unsupervised method for word segmentation. For
>> Russian-English I created a code dictionary of the 100,000
>> most-frequent segments per language. Unseen tokens will get
>> segmented. The segmentation is not necessarily similar to a
>> linguistically correct segmentation, though. You will probably want
>> to try smaller numbers.
>>
>> Best,
>>
>> Marcin
>>
>> On 2016-02-01 14:12, Michael Joyner wrote:
>>
>>> I am trying to use Moses with Cherokee using the New Testament
>>> and Genesis as primary corpus. I am feeding it the WEB, BBE as
>>> source English texts at the moment.
>>>
>>> As Cherokee uses bound pronouns and no articles and has almost
>>> nil preposition analogues, (these features are mostly verb
>>> infixes), is there a technique for corpus adjustment that can be
>>> done to improve the phrase mapping between Cherokee and English?
>>>
>>> I am currently doing Cherokee => English.
>>> Thanks, Mike
>>> --
>>>
>>> WEB: World English Bible (Public Domain)
>>> BBE: Basic English Bible (Public Domain)
>>>
>>> * Learn to the Cherokee language: http://jalagigawoni.gnomio.com/
>>>
>>>
>>> _______________________________________________
>>> Moses-support mailing list
>>> Moses-support@mit.edu <mailto:Moses-support@mit.edu>
>>> http://mailman.mit.edu/mailman/listinfo/moses-support

------------------------------

Message: 2
Date: Tue, 2 Feb 2016 17:59:23 +0800
From: Dingyuan Wang <abcdoyle888@gmail.com>
Subject: Re: [Moses-support] kbmira died with SIGABRT when tuning
To: Barry Haddow <bhaddow@staffmail.ed.ac.uk>, Hieu Hoang
<hieuhoang@gmail.com>
Cc: moses-support <moses-support@mit.edu>
Message-ID: <56B07DFB.9030502@gmail.com>
Content-Type: text/plain; charset=utf-8

Hi all,

Sorry for posting on the old thread.

It can't be a memory problem. I ran a similar experiment on another
machine, and it failed the same way: some bytes in the WT feature scores
are corrupted.
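
For anyone who wants to look for this kind of corruption in their own
n-best lists, a minimal sketch follows. It is not the gist script linked
further down in this thread; the function names are made up for this
example, and it assumes the file is meant to be UTF-8 and is either plain
text or gzip-compressed with a ".gz" suffix.

```python
import gzip


def open_maybe_gzip(path):
    """Open a file in binary mode, handling a .gz suffix transparently."""
    return gzip.open(path, "rb") if path.endswith(".gz") else open(path, "rb")


def find_invalid_utf8_lines(path):
    """Return (line_number, byte_offset) for each line that is not valid UTF-8.

    byte_offset is the position of the first undecodable byte within the line.
    """
    bad = []
    with open_maybe_gzip(path) as handle:
        for line_number, raw in enumerate(handle, start=1):
            try:
                raw.decode("utf-8")
            except UnicodeDecodeError as err:
                bad.append((line_number, err.start))
    return bad


if __name__ == "__main__":
    import sys

    # Usage: python find_bad_lines.py <nbest-file> [<nbest-file> ...]
    for nbest_path in sys.argv[1:]:
        for line_number, offset in find_invalid_utf8_lines(nbest_path):
            print(f"{nbest_path}:{line_number}: invalid UTF-8 at byte {offset}")
```

A check like this only catches corruption that produces invalid UTF-8.
Corrupted bytes that happen to decode cleanly, such as a stray "@", would
pass it and need a separate check on the feature names.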

On 2016-01-20 18:20, Barry Haddow wrote:
> Hi Dingyuan
>
> What platform are you running on? I could not reproduce your error on
> Ubuntu 12.04, and valgrind is clean,
>
> cheers - Barry
>
> On 19/01/16 16:31, Barry Haddow wrote:
>> Hi Dingyuan
>>
>> I ran for over 200 iterations and saw no problem. I tried with your LANG
>> and LANGUAGE settings (I don't have the right packages for the other
>> settings) and still saw no failure.
>>
>> Maybe it is a random pointer/memory problem like you suggested. I have
>> started running your model with valgrind, but nothing so far,
>>
>> cheers - Barry
>>
>> On 19/01/16 14:26, Dingyuan Wang wrote:
>>> Hi Barry,
>>>
>>> It usually hits an error within about 1 to 10 iterations on my laptop. I
>>> don't know what triggers it, so the failure may be random.
>>>
>>> Disabling xml-input doesn't help. I think I should use verbose output.
>>>
>>> My locale settings are:
>>>
>>> LANG=zh_CN.UTF-8
>>> LANGUAGE=zh_CN.UTF-8:zh_TW.UTF-8:zh_HK.utf8:en_US.utf8
>>> LC_CTYPE="zh_CN.UTF-8"
>>> LC_NUMERIC="zh_CN.UTF-8"
>>> LC_TIME="zh_CN.UTF-8"
>>> LC_COLLATE="zh_CN.UTF-8"
>>> LC_MONETARY="zh_CN.UTF-8"
>>> LC_MESSAGES="zh_CN.UTF-8"
>>> LC_PAPER="zh_CN.UTF-8"
>>> LC_NAME="zh_CN.UTF-8"
>>> LC_ADDRESS="zh_CN.UTF-8"
>>> LC_TELEPHONE="zh_CN.UTF-8"
>>> LC_MEASUREMENT="zh_CN.UTF-8"
>>> LC_IDENTIFICATION="zh_CN.UTF-8"
>>> LC_ALL=
>>>
>>> On 2016-01-19 19:20, Barry Haddow wrote:
>>>> Hi Dingyuan
>>>>
>>>> I have your script and model running, but so far it has not reported
>>>> any errors. It's at iteration 27, and I'm using the latest Moses from
>>>> git.
>>>>
>>>> How long should I expect it to run before it hits an error? Could it be
>>>> affected by the locale setting?
>>>>
>>>> Have you tried running without xml-input to see if you still have the
>>>> problem?
>>>>
>>>> cheers - Barry
>>>>
>>>> On 19/01/16 05:43, Dingyuan Wang wrote:
>>>>> Hi Barry,
>>>>>
>>>>> I've uploaded the model:
>>>>> https://mega.nz/#!UsVSBCBJ!e5IATFvLqrCb5zhmDekLn8NOGw4PSD9RRQLGQeKEvNY
>>>>>
>>>>> To test the model, I included a script, 'repeatnbest.sh', which runs
>>>>> moses repeatedly until an encoding error occurs.
>>>>>
>>>>> The files run7.best100.out and run7.out in the archive are from the
>>>>> last run that produced the error.
>>>>>
>>>>> It seems that it is WordTranslationFeature that causes the problem.
>>>>>
>>>>> On 2016-01-19 00:03, Barry Haddow wrote:
>>>>>> Hi Dingyuan
>>>>>>
>>>>>> Something is going wrong with the construction or outputting of
>>>>>> feature names, and it looks like it's WordTranslationFeature that's
>>>>>> the problem. Does the problem go away if you do not use word
>>>>>> translation features?
>>>>>>
>>>>>> If you could make available a model that reproduces the nbest list
>>>>>> construction then I would have a chance to debug it,
>>>>>>
>>>>>> cheers - Barry
>>>>>>
>>>>>> On 18/01/16 15:32, Dingyuan Wang wrote:
>>>>>>> Hi Barry,
>>>>>>>
>>>>>>> I've checked all the models and corpora with the script, without
>>>>>>> finding any encoding problem.
>>>>>>>
>>>>>>> I also find that all such errors in the nbest list occur only in the
>>>>>>> feature list (3 different samples), without affecting the translation
>>>>>>> result. Therefore, the phrase table or training corpus may not be the
>>>>>>> problem.
>>>>>>>
>>>>>>> On 2016-01-18 23:04, Barry Haddow wrote:
>>>>>>>> Hi Dingyuan
>>>>>>>>
>>>>>>>> Are these encoding errors present in your phrase table? Are they
>>>>>>>> present in your training corpus? Since they appear in the word
>>>>>>>> translation features, and you are using a shortlist, are they in the
>>>>>>>> shortlist files in the model directory? (These have names with "topn"
>>>>>>>> in them afaik).
>>>>>>>>
>>>>>>>> File-system errors are unlikely, and for the most part Moses treats
>>>>>>>> text as byte strings so encoding errors usually trace back to the
>>>>>>>> source text.
>>>>>>>>
>>>>>>>> cheers - Barry
>>>>>>>>
>>>>>>>> On 18/01/16 14:56, Dingyuan Wang wrote:
>>>>>>>>> Hi Barry,
>>>>>>>>>
>>>>>>>>> "The ones starting with the "@"" are due to corrupted bytes in the
>>>>>>>>> nbest list.
>>>>>>>>>
>>>>>>>>> This kind of corruption occurs from time to time. I wonder if it
>>>>>>>>> comes from memory errors or filesystem failure or some kind of
>>>>>>>>> pointer/encoding problem in moses.
>>>>>>>>>
>>>>>>>>> I've written a script to find such corrupted lines:
>>>>>>>>>
>>>>>>>>> https://gist.github.com/gumblex/0d9d0848b435e4f9818f
>>>>>>>>>
>>>>>>>>> On 2016-01-18 20:42, Barry Haddow wrote:
>>>>>>>>>> Hi Dingyuan
>>>>>>>>>>
>>>>>>>>>> The extractor expects feature names to contain an underscore (not
>>>>>>>>>> sure exactly why) but some of yours don't, and Moses skips them,
>>>>>>>>>> interpreting their values as extra dense features.
>>>>>>>>>>
>>>>>>>>>> The attached screenshot shows my view of the offending names. The
>>>>>>>>>> ones starting with the "@" are the problem. So it does look like the
>>>>>>>>>> nbest list is corrupted. Can you run the decoder on just that
>>>>>>>>>> sentence, to create an uncompressed version of the nbest list?
>>>>>>>>>>
>>>>>>>>>> cheers - Barry
>>>>>>>>>>
>>>>>>>>>> On 18/01/16 12:02, Dingyuan Wang wrote:
>>>>>>>>>>> Hi Barry,
>>>>>>>>>>>
>>>>>>>>>>> Attached is the zgrep result.
>>>>>>>>>>> I found that in the middle of line 61 a few bytes are corrupted.
>>>>>>>>>>> Is that a moses problem, or does my memory have a problem?
>>>>>>>>>>>
>>>>>>>>>>> I also checked other files using iconv; they are all OK in UTF-8.
>>>>>>>>>>>
>>>>>>>>>>> On 2016-01-18 19:32, Barry Haddow wrote:
>>>>>>>>>>>> Hi Dingyuan
>>>>>>>>>>>>
>>>>>>>>>>>> Yes, that's very possible. The error could be in extracting
>>>>>>>>>>>> features.dat from the nbest list. Are you able to post the nbest
>>>>>>>>>>>> list? Or at least the entries for sentence 16?
>>>>>>>>>>>>
>>>>>>>>>>>> Run something like
>>>>>>>>>>>>
>>>>>>>>>>>> zgrep "^16 " tuning/tmp.1/run7.best100.out.gz
>>>>>>>>>>>>
>>>>>>>>>>>> cheers - Barry
>>>>>>>>>>>>
>>>>>>>>>>>> On 18/01/16 11:24, Dingyuan Wang wrote:
>>>>>>>>>>>>> Hi Barry,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I have rerun the EMS after the first email, and then posted the
>>>>>>>>>>>>> recent results, so the line changed.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I just use the latest code and the EMS script, with pretty much the
>>>>>>>>>>>>> default settings. The EMS setting is:
>>>>>>>>>>>>>
>>>>>>>>>>>>> sparse-features = "target-word-insertion top 50,
>>>>>>>>>>>>> source-word-deletion
>>>>>>>>>>>>> top 50, word-translation top 50 50, phrase-length"
>>>>>>>>>>>>>
>>>>>>>>>>>>> I suspect there is something unexpected in the extractor.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 2016-01-18 19:03, Barry Haddow wrote:
>>>>>>>>>>>>>> Hi Dingyuan
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> In fact it is neither the sparse features nor the Asian characters
>>>>>>>>>>>>>> that are the problem. The offending line has 17 dense features, yet
>>>>>>>>>>>>>> your model has 14 dense features.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The string "1 1 1" appears directly after the language model
>>>>>>>>>>>>>> feature in line 1694, in your attachment, adding the extra 3
>>>>>>>>>>>>>> features. Note that this is not the line you mentioned in your
>>>>>>>>>>>>>> earlier email.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I have no idea why there are extra features. Have you made changes
>>>>>>>>>>>>>> to any of the core Moses features?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> best wishes
>>>>>>>>>>>>>> Barry
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The offending line:
>>>>>>>>>>>>>> what(): Error in line "-5.44027 0 0 -5.34901 0 0 0
>>>>>>>>>>>>>> -224.872 1 1
>>>>>>>>>>>>>> 1 -39
>>>>>>>>>>>>>> 18 -26.2331 -40.6736 -44.3698 -82.5072 WT_?~?=3 WT_?~?=1
>>>>>>>>>>>>>> WT_?~?=1
>>>>>>>>>>>>>> WT_?~?=1 WT_?~?=1 PL_s3=5 PL_3,2=2 PL_3,3=3 PL_2,3=4
>>>>>>>>>>>>>> PL_t3=7
>>>>>>>>>>>>>> PL_s1=5
>>>>>>>>>>>>>> PL_1,2=2 PL_1,1=3 PL_t1=4 PL_2,2=3 PL_t2=7 PL_s2=8
>>>>>>>>>>>>>> PL_2,1=1 WT_
>>>>>>>>>>>>>> ?~?=1
>>>>>>>>>>>>>> WT_?~?=1 WT_?~?=1 WT_?~?=1 WT_?~?=1 WT_?~?=1 WT_?~
>>>>>>>>>>>>>> ?=1
>>>>>>>>>>>>>> WT_?~
>>>>>>>>>>>>>> ?=1 WT_??~?=1 WT_??~?=1 WT_?~?=1 WT_?~?=1 WT_?~?
>>>>>>>>>>>>>> ?=1
>>>>>>>>>>>>>> WT_?~
>>>>>>>>>>>>>> ?=1 WT_?~?=1 WT_?~??=1 WT_?~??=1 WT_?~?=1 WT_
>>>>>>>>>>>>>> ?~?=1
>>>>>>>>>>>>>> WT_
>>>>>>>>>>>>>> ?~?
>>>>>>>>>>>>>> ?=1 WT_?~?=1 WT_?~??=1 WT_?~?=1 WT_?~??=1 WT_?~?
>>>>>>>>>>>>>> ?=1
>>>>>>>>>>>>>> WT_?
>>>>>>>>>>>>>> ?~??=1 WT_?~??=1 WT_?~?=1 WT_?~?=1 WT_?~?=1
>>>>>>>>>>>>>> WT_?~?
>>>>>>>>>>>>>> ?=1 WT_
>>>>>>>>>>>>>> ?~??=1 WT_??~??=1 " of ...
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 18/01/16 10:37, Dingyuan Wang wrote:
>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I've attached that. The line number is 1694.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On 2016-01-18 16:43, Barry Haddow wrote:
>>>>>>>>>>>>>>>> Hi Dingyuan
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Is it possible to attach the features.dat file that is causing the
>>>>>>>>>>>>>>>> error? Almost certainly Moses is failing to parse the line because
>>>>>>>>>>>>>>>> of the Asian characters in the feature names,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> cheers - Barry
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On 16/01/16 15:58, Dingyuan Wang wrote:
>>>>>>>>>>>>>>>>> I ran
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> ~/software/moses/bin/kbmira -J 75 --dense-init run7.dense \
>>>>>>>>>>>>>>>>>   --sparse-init run7.sparse-weights \
>>>>>>>>>>>>>>>>>   --ffile run1.features.dat --ffile run2.features.dat \
>>>>>>>>>>>>>>>>>   --ffile run3.features.dat --ffile run4.features.dat \
>>>>>>>>>>>>>>>>>   --ffile run5.features.dat --ffile run6.features.dat \
>>>>>>>>>>>>>>>>>   --ffile run7.features.dat \
>>>>>>>>>>>>>>>>>   --scfile run1.scores.dat --scfile run2.scores.dat \
>>>>>>>>>>>>>>>>>   --scfile run3.scores.dat --scfile run4.scores.dat \
>>>>>>>>>>>>>>>>>   --scfile run5.scores.dat --scfile run6.scores.dat \
>>>>>>>>>>>>>>>>>   --scfile run7.scores.dat -o /tmp/mert.out
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> in the tuning/tmp.1 directory, which will certainly replicate the
>>>>>>>>>>>>>>>>> error.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On 2016-01-16 23:42, Hieu Hoang wrote:
>>>>>>>>>>>>>>>>>> The mert script prints out every command it runs. You should be
>>>>>>>>>>>>>>>>>> able to replicate the error by running the last command
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On 16 Jan 2016 14:18, "Dingyuan Wang" <abcdoyle888@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Sorry, but I can't reliably replicate the same problem when running
>>>>>>>>>>>>>>>>>> TUNING_tune.1 alone. There is no character '_' in the test set or
>>>>>>>>>>>>>>>>>> top50 list.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I'm using sparse-features = "target-word-insertion top 50,
>>>>>>>>>>>>>>>>>> source-word-deletion top 50, word-translation top 50 50,
>>>>>>>>>>>>>>>>>> phrase-length"
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I've attached some related files from EMS and the EMS config.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> https://mega.nz/#!xs0SFKxL!M_RTBp1JGX24-b4xlYYLP-bLXKiC_Sl-p96x55avAB4
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On 2016-01-16 02:45, Hieu Hoang wrote:
>>>>>>>>>>>>>>>>>> > could you make your model files available for download so I can
>>>>>>>>>>>>>>>>>> > replicate this problem.
>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>> > it seems like you're using a feature function with sparse scores. I
>>>>>>>>>>>>>>>>>> > think the character '_' must be escaped.
>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>> > On 12/01/16 04:00, Dingyuan Wang wrote:
>>>>>>>>>>>>>>>>>> >> Hi all,
>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>> >> I'm using EMS for doing experiments. Every time, kbmira died with
>>>>>>>>>>>>>>>>>> >> SIGABRT when tuning in one direction, while tuning in the opposite
>>>>>>>>>>>>>>>>>> >> direction (same config and test set) was successful.
>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>> >> The mert.log (stderr) shows the following:
>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>> >> kbmira with c=0.01 decay=0.999 no_shuffle=0
>>>>>>>>>>>>>>>>>> >> Initialising random seed from system clock
>>>>>>>>>>>>>>>>>> >> Found 15323 initial sparse features
>>>>>>>>>>>>>>>>>> >> ....terminate called after throwing an instance of
>>>>>>>>>>>>>>>>>> >> 'MosesTuning::FileFormatException'
>>>>>>>>>>>>>>>>>> >> what(): Error in line "-4.51933 0 0
>>>>>>>>>>>>>>>>>> -6.09733
>>>>>>>>>>>>>>>>>> 0 0 0
>>>>>>>>>>>>>>>>>> -121.556 2
>>>>>>>>>>>>>>>>>> -20 12
>>>>>>>>>>>>>>>>>> >> -31.6201 -38.5211 -26.5112 -60.6166
>>>>>>>>>>>>>>>>>> WT_?~?=2
>>>>>>>>>>>>>>>>>> WT_?~?=1
>>>>>>>>>>>>>>>>>> PL_s1=4
>>>>>>>>>>>>>>>>>> >> PL_s3=1 PL_3,3=1 PL_2,2=3 PL_1,2=1
>>>>>>>>>>>>>>>>>> PL_2,1=3
>>>>>>>>>>>>>>>>>> PL_t1=6
>>>>>>>>>>>>>>>>>> PL_t2=4
>>>>>>>>>>>>>>>>>> PL_t3=2
>>>>>>>>>>>>>>>>>> >> PL_2,3=1 PL_s2=7 PL_1,1=3 WT_?~??=1
>>>>>>>>>>>>>>>>>> WT_?~
>>>>>>>>>>>>>>>>>> ??=1
>>>>>>>>>>>>>>>>>> WT_?~
>>>>>>>>>>>>>>>>>> ?=1
>>>>>>>>>>>>>>>>>> WT_?~?
>>>>>>>>>>>>>>>>>> >> ?=1 WT_?~?=1 WT_?~?=2 WT_?~?=1 WT_
>>>>>>>>>>>>>>>>>> ?~?=1
>>>>>>>>>>>>>>>>>> WT_?~
>>>>>>>>>>>>>>>>>> ??=1
>>>>>>>>>>>>>>>>>> WT_
>>>>>>>>>>>>>>>>>> ?~?=1
>>>>>>>>>>>>>>>>>> >> WT_?~??=1 WT_?~?=1 WT_?~??=1
>>>>>>>>>>>>>>>>>> WT_?~
>>>>>>>>>>>>>>>>>> ??=1
>>>>>>>>>>>>>>>>>> WT_?~?
>>>>>>>>>>>>>>>>>> ?=1 WT_?~
>>>>>>>>>>>>>>>>>> >> ?=1 WT_?~??=1 " of run7.features.dat
>>>>>>>>>>>>>>>>>> >> Aborted
>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>> >> Since run7.scores.dat is generated by some scripts, I don't think I
>>>>>>>>>>>>>>>>>> >> am responsible for the bad format. Last time it also died; I removed
>>>>>>>>>>>>>>>>>> >> the likely offending line from the test set, but this time another
>>>>>>>>>>>>>>>>>> >> line appears.
>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>> >> --
>>>>>>>>>>>>>>>>>> >> Dingyuan Wang
>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>>> Dingyuan Wang (gumblex)
>>>>>>>>>>>>>>>>>>

--
Dingyuan Wang (gumblex)
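
The diagnosis quoted in this thread (a features.dat line carrying 17 dense
values where the model has 14, and sparse feature names that lack an
underscore) can be checked mechanically. The sketch below is one way to do
that. It is not how kbmira or the Moses extractor parse the file: the
function name is hypothetical, and the line layout (bare numbers for dense
values, name=value tokens for sparse features) is inferred from the error
messages quoted above.

```python
def check_feature_line(line, expected_dense):
    """Return a list of problems found in one features.dat-style line.

    Assumptions of this sketch: dense values are bare numbers, sparse
    features are name=value tokens, and sparse names contain an underscore.
    """
    problems = []
    dense_count = 0
    for token in line.split():
        name, sep, _value = token.rpartition("=")
        if sep:
            # Sparse feature: flag names without an underscore, since the
            # thread reports that such names are not treated as sparse.
            if "_" not in name:
                problems.append(f"sparse feature name without underscore: {name!r}")
        else:
            # No "=" in the token, so it should be a dense numeric value.
            try:
                float(token)
                dense_count += 1
            except ValueError:
                problems.append(f"unparseable token: {token!r}")
    if dense_count != expected_dense:
        problems.append(f"expected {expected_dense} dense values, found {dense_count}")
    return problems


if __name__ == "__main__":
    # Dense part of the offending line quoted in the thread, plus two of its
    # phrase-length features; the word translation features are omitted here.
    sample = ("-5.44027 0 0 -5.34901 0 0 0 -224.872 1 1 1 -39 18 "
              "-26.2331 -40.6736 -44.3698 -82.5072 PL_s3=5 PL_3,2=2")
    for problem in check_feature_line(sample, expected_dense=14):
        print(problem)
```

Running the check over every line of each runN.features.dat before
starting kbmira would point at the offending line without waiting for the
SIGABRT.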

------------------------------

_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support

End of Moses-support Digest, Vol 112, Issue 8
*********************************************
