Send Moses-support mailing list submissions to
moses-support@mit.edu
To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
moses-support-request@mit.edu
You can reach the person managing the list at
moses-support-owner@mit.edu
When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."
Today's Topics:
1. Fwd: xml-input is compatible with moses_chart? (Li Xiang)
2. Re: Error during Tuning (Barry Haddow)
3. Hierarical Model trace option (Mihee Ji)
----------------------------------------------------------------------
Message: 1
Date: Mon, 19 Aug 2013 11:42:59 +0800
From: Li Xiang <lixiang.ict@gmail.com>
Subject: [Moses-support] Fwd: xml-input is compatible with
moses_chart?
To: moses-support <moses-support@mit.edu>
Message-ID:
<CA+fVw+5AdDrHksQL3Tebbqw+yaGO9BFBuBGo9ucG6_T7XEC3_w@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"
Hi,
I ran the following command:
echo "<n translation="??" label="X" prob=?9999">moses</n>" | moses_chart -xml-input inclusive -f moses.ini
The attachment contains some files.
Even though I assign a high probability to the specified translation, the
final output is still the unknown word "moses", not the specified
translation.
And the detailed output:
max-chart-span: 20
max-chart-span: 1000
IO from STDOUT/STDIN
Created input-output object : [2.000] seconds
Translating: <s> moses </s> ||| [0,0]=X (1) [0,1]=X (1) [0,2]=X (1)
[1,1]=X (1) [1,2]=X (1) [2,2]=X (1)
0 1 2
1 1 0
1 0
1
BEST TRANSLATION: 3 S -> S </s> :0-0 : c=-2.203
core=(0.000,-1.000,1.000,0.000,0.000,0.000,0.000,0.000,0.000) [0..2] 2
[total=-104.318]
core=(-100.000,-3.000,4.000,0.000,0.000,0.000,0.000,1.000,-18.237)
moses
Translation took 0.000 seconds
End. : [2.000] seconds
Name:moses_chart VmPeak:157992 kB VmRSS:126356 kB RSSMax:0 kB
user:0.000 sys:0.000 CPU:0.000 real:2.250
So I am confused about the option "-xml-input inclusive". Could you point me
in the right direction?
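One thing worth ruling out is shell quoting: the nested double quotes in the echo command above may not reach the decoder intact. The sketch below builds the marked-up input line in Python with properly quoted attributes. The helper name make_xml_span, the placeholder translation and the probability value are assumptions for illustration only; the tag and attribute names are copied from the command above, and whether moses_chart honours them is the open question in this message.

```python
from xml.sax.saxutils import escape, quoteattr


def make_xml_span(word, translation, prob, label="X"):
    """Return one marked-up token (hypothetical helper, not part of Moses)."""
    # quoteattr adds the surrounding quotes and escapes special characters.
    attrs = " ".join([
        "translation=" + quoteattr(translation),
        "label=" + quoteattr(label),
        "prob=" + quoteattr(str(prob)),
    ])
    # escape protects &, < and > inside the source word itself.
    return "<n " + attrs + ">" + escape(word) + "</n>"


if __name__ == "__main__":
    # "TRANSLATION" stands in for the target-language string, which is
    # garbled in this archive; 0.8 is an arbitrary example probability.
    print(make_xml_span("moses", "TRANSLATION", 0.8))
```

Writing the printed line to a file and feeding that file to the decoder on standard input avoids a second round of shell quoting.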
Thanks.
--
Xiang Li
-------------- next part --------------
A non-text attachment was scrubbed...
Name: moses.ini
Type: application/octet-stream
Size: 930 bytes
Desc: not available
Url : http://mailman.mit.edu/mailman/private/moses-support/attachments/20130819/7ae5db79/attachment-0001.obj
------------------------------
Message: 2
Date: Mon, 19 Aug 2013 09:24:01 +0100
From: Barry Haddow <bhaddow@staffmail.ed.ac.uk>
Subject: Re: [Moses-support] Error during Tuning
To: Heidi Heweidy <heidi.heweidy@gmail.com>
Cc: moses-support@mit.edu
Message-ID: <20130819092401.18991sljvwyjf5tw@www.staffmail.ed.ac.uk>
Content-Type: text/plain; charset=UTF-8; DelSp="Yes"; format="flowed"
Hi Heidi
It's not a good idea to run Arabic through the truecaser - it is only
for languages written in Latin script. I'm not even sure that Arabic
has case.
Also, I noticed that there are a lot of blank lines in your data. This
could cause you problems so it would be worth removing them, making
sure that you don't get your data out of alignment.
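A minimal sketch of removing blank lines without losing alignment is shown here. It assumes the two sides of the corpus have already been read into lists of lines; the function name drop_blank_pairs is made up for this example. A pair is dropped whenever either side is blank, and a mismatch in line counts is reported instead of being silently truncated.

```python
def drop_blank_pairs(src_lines, tgt_lines):
    """Drop sentence pairs where either side is blank, keeping alignment."""
    src = [line.rstrip("\n") for line in src_lines]
    tgt = [line.rstrip("\n") for line in tgt_lines]
    # Parallel data must have the same number of lines on both sides.
    if len(src) != len(tgt):
        raise ValueError(
            "line counts differ: %d vs %d" % (len(src), len(tgt))
        )
    # Remove a pair as a unit so the remaining lines stay aligned.
    return [(s, t) for s, t in zip(src, tgt) if s.strip() and t.strip()]


if __name__ == "__main__":
    pairs = drop_blank_pairs(["a\n", "\n", "c\n"], ["x\n", "y\n", "z\n"])
    print(pairs)
```

Filtering both files in one pass like this is safer than deleting blank lines from each file separately, which shifts every later line on one side only.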
For your mert error, you should again look at the log files - mert.out
and mert.log.
cheers - Barry
Quoting Heidi Heweidy <heidi.heweidy@gmail.com> on Mon, 19 Aug 2013
00:12:37 +0200:
> There are 4 files, two before truecasing and two after; the line counts
> differ after truecasing.
> Please notice that when I make the truecased files equal again (by manually
> cutting them), I get an error again in the mert.out saying:
>
> Executing: /home/tjr/mosesdecoder/bin/mert -d 14 --scconfig case:true
> --ffile run1.features.dat --scfile run1.scores.dat --ifile run1.init.opt
> -n 20 > mert.out 2> mert.log
> Exit code: 134
> ERROR: Failed to run '/home/tjr/mosesdecoder/bin/mert -d 14 --scconfig
> case:true --ffile run1.features.dat --scfile run1.scores.dat --ifile
> run1.init.opt -n 20'. at /home/tjr/mosesdecoder/scripts/training/
> mert-moses.pl line 1554.
>
>
> On Sun, Aug 18, 2013 at 11:09 PM, Barry Haddow
> <bhaddow@staffmail.ed.ac.uk> wrote:
>
>> Hi Heidi
>>
>> If the truecaser changes the number of lines in the file then that's a
>> bug. Have you opened the files in a windows editor? Could you send me the
>> before and after truecase files?
>>
>> cheers - Barry
>>
>>
>> Quoting Heidi Heweidy <heidi.heweidy@gmail.com> on Sun, 18 Aug 2013
>> 20:10:10 +0200:
>>
>>> hmmm... well this does make sense..
>>> the problem is there is nothing else that might have changed the number of
>>> lines because after tokenizing, the lines were the same.. the only time the
>>> files were not the same anymore is right after the truecasing step.. i just
>>> cut the .true files to have the same number of lines and made sure they are
>>> properly aligned and i just hope that tuning finishes successfully coz if
>>> not, i don't know what might have caused the problem. fingers crossed.
>>> anyway, thanks a lot once again
>>>
>>>
>>> On Sun, Aug 18, 2013 at 7:50 PM, Barry Haddow <bhaddow@staffmail.ed.ac.uk> wrote:
>>>
>>> Hi Heidi
>>>>
>>>> Good to hear you found the problem. Tokenisation does not change the
>>>> number of lines, and neither does truecasing, so there must be a problem
>>>> elsewhere in your pre-processing pipeline,
>>>>
>>>> cheers - Barry
>>>>
>>>>
>>>> Quoting Heidi Heweidy <heidi.heweidy@gmail.com> on Sun, 18 Aug 2013
>>>> 19:47:29 +0200:
>>>>
>>>>> Yes! Problem found. Thanks a lot. There was one more line in one file than
>>>>> the other.
>>>>> The original tuning data had the exact same number of lines but maybe the
>>>>> lines changed after tokenizing.
>>>>>
>>>>>
>>>>>
>>>>> On Sun, Aug 18, 2013 at 7:34 PM, Barry Haddow <bhaddow@staffmail.ed.ac.uk> wrote:
>>>>>
>>>>> Hi Heidi
>>>>>
>>>>>>
>>>>>> Can you run
>>>>>>
>>>>>> wc -l ~/corpus/ar-en.tune.true.fr ~/corpus/ar-en.tune.true.en
>>>>>>
>>>>>>
>>>>>> cheers - Barry
>>>>>>
>>>>>>
>>>>>> Quoting Heidi Heweidy <heidi.heweidy@gmail.com> on Sun, 18 Aug 2013
>>>>>> 19:10:21 +0200:
>>>>>>
>>>>>>> cd ~/working
>>>>>>> nohup nice ~/mosesdecoder/scripts/training/mert-moses.pl \
>>>>>>> ~/corpus/ar-en.tune.true.fr ~/corpus/ar-en.tune.true.en \
>>>>>>> ~/mosesdecoder/bin/moses train/model/moses.ini --mertdir ~/mosesdecoder/bin/ \
>>>>>>> &> mert.out &
>>>>>>>
>>>>>>> P.S. I'm on the old system version, if that makes a difference.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Sun, Aug 18, 2013 at 7:05 PM, Barry Haddow <bhaddow@staffmail.ed.ac.uk> wrote:
>>>>>>>
>>>>>>> Hi Heidi
>>>>>>>
>>>>>>>
>>>>>>>> Can you give the exact argument that you use to run tuning?
>>>>>>>>
>>>>>>>> cheers - Barry
>>>>>>>>
>>>>>>>>
>>>>>>>> Quoting Heidi Heweidy <heidi.heweidy@gmail.com> on Sun, 18 Aug 2013
>>>>>>>> 18:55:59 +0200:
>>>>>>>>
>>>>>>>>> my training files have the same number of lines, same goes for my tuning
>>>>>>>>> files, but each set is not the same number of lines as the other. i don't
>>>>>>>>> see the problem because in the moses baseline tutorial, this is how it
>>>>>>>>> works too, am i wrong?
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Sun, Aug 18, 2013 at 6:53 PM, Barry Haddow <bhaddow@staffmail.ed.ac.uk> wrote:
>>>>>>>>>
>>>>>>>>> Hi Heidi
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> You have to supply an input set and a reference set to mert-moses.pl for
>>>>>>>>>> tuning. This error suggests that they have different numbers of lines in
>>>>>>>>>> them - so they are not parallel,
>>>>>>>>>>
>>>>>>>>>> cheers - Barry
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Quoting Heidi Heweidy <heidi.heweidy@gmail.com> on Sun, 18 Aug
>>>>>>>>>> 2013
>>>>>>>>>> 18:45:31 +0200:
>>>>>>>>>>
>>>>>>>>>>> Inside of it, i get:
>>>>>>>>>>>
>>>>>>>>>>> Binary write mode is NOT selected
>>>>>>>>>>> Scorer type: BLEU
>>>>>>>>>>> name: case value: true
>>>>>>>>>>> Loading reference from /home/tjr/corpus/ar-en.tune.true.en
>>>>>>>>>>> ............................Data::m_score_type BLEU
>>>>>>>>>>> Data::Scorer type from Scorer: BLEU
>>>>>>>>>>> loading nbest from run1.best100.out.gz
>>>>>>>>>>> Exception: Sentence id (2844) not found in reference set
>>>>>>>>>>>
>>>>>>>>>>> I do not understand the exception: which reference set is this referring
>>>>>>>>>>> to, and what does it mean that it is not found?
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Sun, Aug 18, 2013 at 6:39 PM, Barry Haddow <bhaddow@staffmail.ed.ac.uk> wrote:
>>>>>>>>>>>
>>>>>>>>>>> Hi Heidi
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>> Inside the mert working directory, there should be a file called
>>>>>>>>>>>> extract.err. Look at the error message in this file.
>>>>>>>>>>>>
>>>>>>>>>>>> It could be that the input and reference you are using for tuning are
>>>>>>>>>>>> mismatched,
>>>>>>>>>>>>
>>>>>>>>>>>> cheers - Barry
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Quoting Heidi Heweidy <heidi.heweidy@gmail.com> on Sun, 18 Aug
>>>>>>>>>>>> 2013
>>>>>>>>>>>> 16:12:33 +0200:
>>>>>>>>>>>>
>>>>>>>>>>>>> I have an Arabic to English system that works fine after training, but when
>>>>>>>>>>>>> I start tuning I end up with this in the mert.out file:
>>>>>>>>>>>>>
>>>>>>>>>>>>> ERROR: Failed to run '/home/tjr/working/mert-work/extractor.sh'. at
>>>>>>>>>>>>> /home/tjr/mosesdecoder/scripts/training/mert-moses.pl line 1554.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> --
>>>>>>>>>>>>>
>>>>>>>>>>>>> The University of Edinburgh is a charitable body, registered in
>>>>>>>>>>>> Scotland, with registration number SC005336.
--
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.
------------------------------
Message: 3
Date: Mon, 19 Aug 2013 17:48:12 +0900
From: "Mihee Ji" <mhji@e4net.net>
Subject: [Moses-support] Hierarical Model trace option
To: <moses-support@mit.edu>
Message-ID: <01be01ce9cb8$d6d434f0$847c9ed0$@e4net.net>
Content-Type: text/plain; charset="ks_c_5601-1987"
Hello
Is it possible to get trace information from a hierarchical model? Currently
the -T option doesn't work.
e.g. this is |0-1| a |2-2| small |3-3| house |4-4|
I need this result for inline tag handling using m4loc tools.
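For reference, a trace line in the format of the example above can be split into phrases and source spans with a short regular expression. This is only a sketch under the assumption that every phrase is followed by a |start-end| marker exactly as in that example; parse_trace is a made-up name, not an m4loc or Moses function.

```python
import re

# One target phrase followed by its source span marker, e.g. "this is |0-1|".
SPAN = re.compile(r"\s*(.+?)\s*\|(\d+)-(\d+)\|")


def parse_trace(line):
    """Return a list of (phrase, source_start, source_end) tuples."""
    return [
        (m.group(1), int(m.group(2)), int(m.group(3)))
        for m in SPAN.finditer(line)
    ]


if __name__ == "__main__":
    for phrase, start, end in parse_trace(
        "this is |0-1| a |2-2| small |3-3| house |4-4|"
    ):
        print(phrase, start, end)
```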
Thank you.
Mihee Ji
Engineer Team Manager
T&G Division, E4NET
Seoul, Korea
Office: +82-2-3465-8532
Mobile: +82-10-7341-0098
_____
DISCLAIMER: This email and its attachments are intended solely for its
recipient(s) and may contain legally privileged and/or confidential
information. Distribution, disclosure or any other unauthorized
manipulation with this email is prohibited. If you have received it by
mistake, delete it and notify the sender immediately. E4NET Co. cannot be
held responsible for unauthorized modification, incomplete and/or improper
transmission or delayed delivery of this email to the recipient(s).
------------------------------
_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support
End of Moses-support Digest, Vol 82, Issue 25
*********************************************