Moses-support Digest, Vol 122, Issue 28

Send Moses-support mailing list submissions to
moses-support@mit.edu

To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
moses-support-request@mit.edu

You can reach the person managing the list at
moses-support-owner@mit.edu

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."


Today's Topics:

1. Incremental Training (Adel Khalifa)
2. Re: SegFault using XML Markup (Hieu Hoang)
3. Re: Conversion of phrase model to PhraseDictionaryCompact
(Hieu Hoang)


----------------------------------------------------------------------

Message: 1
Date: Thu, 15 Dec 2016 12:26:52 +0200
From: Adel Khalifa <adelkhalifa9@gmail.com>
Subject: [Moses-support] Incremental Training
To: hieuhoang@gmail.com, moses-support@mit.edu
Message-ID:
<CAL3VmkfiHBTq9Y0travj0gpdRC3zTa41Mnu=yHg8X+rQwdu9Fw@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

Dear Hieu,

I just need clear steps for doing incremental training in Moses, as
the explanation in the manual is not clear enough.

Thanks

------------------------------

Message: 2
Date: Thu, 15 Dec 2016 10:40:20 +0000
From: Hieu Hoang <hieuhoang@gmail.com>
Subject: Re: [Moses-support] SegFault using XML Markup
To: Ryan ADLER <ryan.adler@unodc.org>
Cc: moses-support <moses-support@mit.edu>
Message-ID:
<CAEKMkbg7huxVRXH6Rz-t7sMzqsMFk9=S7XKQza-qqfA_t_UAJw@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

that's the only thing I can think of. It may be that XML doesn't support
multiple factors.

If you have model files for me to download and replicate the problem, I can
let you know for sure

Hieu Hoang
http://www.hoang.co.uk/hieu

On 15 December 2016 at 10:19, Ryan ADLER <ryan.adler@unodc.org> wrote:

> Dear Hieu,
>
> Unfortunately, this results in the same error. Am I possibly still
> missing something somewhere else?
>
> Kind regards,
> Ryan Adler
>
> From: Hieu Hoang <hieuhoang@gmail.com>
> To: Ryan ADLER/VIENNA/UNO@UNOV
> Cc: moses-support <moses-support@mit.edu>
> Date: 15/12/2016 10:36
>
> Subject: Re: [Moses-support] SegFault using XML Markup
> ------------------------------
>
>
>
> your output has 2 factors so the phrases in the xml should have 2 factors
> eg
> '<n translation="abc|NNS">def</n>'
>
>
> Hieu Hoang
> http://www.hoang.co.uk/hieu
>
> On 15 December 2016 at 07:28, Ryan ADLER <ryan.adler@unodc.org> wrote:
>
> Dear Hieu,
>
> Thank you for your reply. Attached, please find the requested
> information. I will try to build a mini-model where I can reproduce this
> as well.
> (See attached file: moses.mert.ini)
>
> Command line:
>
> echo '<n translation="abc">def</n>' | /data/smt/mosesdecoder3.0/bin/moses
> -f moses.mert.ini -xml-input exclusive
>
> I would appreciate any insight you can offer.
>
> Kind regards,
> Ryan Adler
>
> From: Hieu Hoang <hieuhoang@gmail.com>
> To: Ryan ADLER/VIENNA/UNO@UNOV
> Cc: moses-support <moses-support@mit.edu>
> Date: 14/12/2016 17:23
> Subject: Re: [Moses-support] SegFault using XML Markup
> ------------------------------
>
>
>
>
> can you attach the moses.ini file you use and the exact command you ran
>
> Hieu Hoang
> http://www.hoang.co.uk/hieu
>
> On 14 December 2016 at 14:19, Ryan ADLER <ryan.adler@unodc.org> wrote:
> Hi,
>
> When submitting text with XML markup like:
> Some <np translation="abc">def</np> stuff.
>
> I am getting a segfault at:
> Moses::LanguageModelKen<lm::ngram::QuantArrayTrieModel>::CalcScore
> (this=0xa570b0, phrase=..., fullScore=@0x7fffffffd110,
> ngramScore=@0x7fffffffd120, oovCount=@0x7fffffffd130) at moses/LM/Ken.h:77
> 77 std::size_t factor = word.GetFactor(m_factorType)->GetId();
>
> This is using the RELEASE-3.0 branch. Does anyone have any ideas
> here?
>
> Kind regards,
> Ryan Adler
>
> _______________________________________________
> Moses-support mailing list
> Moses-support@mit.edu
> http://mailman.mit.edu/mailman/listinfo/moses-support
>
>
>
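
Note on the XML markup suggestion in this message: the advice quoted above is that each XML translation should carry one value per output factor (for example abc|NNS when the output has 2 factors). The following is a minimal sketch of that markup shape only. The helper name xml_phrase and its parameters are hypothetical and not part of Moses, and the thread reports that the segfault persisted after this change, so this illustrates the suggested format rather than a confirmed fix.

```python
def xml_phrase(source, target_factors, tag="n", expected_factors=2):
    """Wrap `source` in XML markup whose translation has one value per factor.

    Hypothetical helper: it only formats a string and does not call Moses.
    """
    if len(target_factors) != expected_factors:
        raise ValueError(
            f"expected {expected_factors} factors, got {len(target_factors)}"
        )
    for factor in target_factors:
        # '|' separates factors and '"' would end the attribute, so reject both.
        if "|" in factor or '"' in factor:
            raise ValueError(f"unsupported character in factor: {factor!r}")
    translation = "|".join(target_factors)
    return f'<{tag} translation="{translation}">{source}</{tag}>'


# Two output factors (surface form and POS tag), as in the example from the thread.
print(xml_phrase("def", ["abc", "NNS"]))
```

The resulting line could then be piped to the decoder with the command quoted above (moses -f moses.mert.ini -xml-input exclusive). Checking the factor count up front at least rules out a mismatch between the markup and the model's output factors.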

------------------------------

Message: 3
Date: Thu, 15 Dec 2016 10:44:17 +0000
From: Hieu Hoang <hieuhoang@gmail.com>
Subject: Re: [Moses-support] Conversion of phrase model to
PhraseDictionaryCompact
To: Shubham Khandelwal <skhlnmiit@gmail.com>
Cc: moses-support <moses-support@mit.edu>
Message-ID:
<CAEKMkbhb6X3oOn1Sh9oK7G--biumuPNWYBxHnXx032DpxRocdA@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

there is no limit to the number of words.

if you are using the pre-made models, make sure you are using the tuned
moses.ini file in
tuning/moses.tuned.ini.?


Hieu Hoang
http://www.hoang.co.uk/hieu

On 13 December 2016 at 09:32, Shubham Khandelwal <skhlnmiit@gmail.com>
wrote:

> Thanks, Hieu. I understand now.
> Also, is there a limit on the number of words that can be translated? When
> I run this command: ~/mosesdecoder/bin/moses -f moses.ini
> it translates only the first few words. That is, after "Created input-output
> object" it does not consume all of the words in the input.
> Is there a way to control or remove this limit?
>
> Thanks.
>
> On Mon, Dec 12, 2016 at 7:17 PM, Hieu Hoang <hieuhoang@gmail.com> wrote:
>
>> there are actually 7 different configurations. You have to look at the
>> config file in
>> steps/?/config.?
>> For fr-en:
>> 1. phrase-based, truecased
>> 2. phrase-based, lowercased then recased
>> 3. hierarchical model, lowercased then recased
>> 4. phrase-based, lowercased then recased. Using target side word + pos
>> factors
>> 5. Like (2) but using batch-mira to tune
>> 6. Like (2) but using PRO to tune
>> 7. Like (2) but using CreateOnDiskPt to create binary phrase table
>> You can see the BLEU scores in
>> evaluation/report.*
>>
>> Hieu Hoang
>> http://www.hoang.co.uk/hieu
>>
>> On 12 December 2016 at 13:28, Shubham Khandelwal <skhlnmiit@gmail.com>
>> wrote:
>>
>>> Okay, thanks Hieu. I will try it on a machine with a 1TB hard disk.
>>> By the way, I can see there are 4 pre-made models available for fr-en and de-en (
>>> http://www.statmt.org/moses/RELEASE-3.0/models/fr-en/model/ and
>>> http://www.statmt.org/moses/RELEASE-3.0/models/de-en/model/). Can you
>>> please tell me which of these 4 is the best model (in terms of BLEU
>>> score), apart from the huge model that is already there in both? I cannot
>>> work out how the analysis is presented in the steps folder.
>>> Also, are all of these pre-made models hierarchical models?
>>>
>>>
>>> On Mon, Dec 12, 2016 at 6:09 PM, Hieu Hoang <hieuhoang@gmail.com> wrote:
>>>
>>>>
>>>>
>>>> Hieu Hoang
>>>> http://www.hoang.co.uk/hieu
>>>>
>>>> On 10 December 2016 at 14:06, Shubham Khandelwal <skhlnmiit@gmail.com>
>>>> wrote:
>>>>
>>>>> Yes, the CreateOnDiskPt command executed without any error.
>>>>>
>>>>> There are 5 files in phrase-table.3.folder: Misc.dat,
>>>>> Source.dat, TargetColl.dat, TargetInd.dat, Vocab.dat
>>>>> The Misc.dat and Vocab.dat files are empty.
>>>>> I just checked and my hard disk is full, as this folder has
>>>>> already taken 165G. So maybe that is why those 2 files are empty.
>>>>> But the CreateOnDiskPt command should throw a "No space left on
>>>>> machine" error when it stops.
>>>>> Please let me know whether the lack of disk space is the issue, so that I
>>>>> can move to a machine with more hard-disk space.
>>>>>
>>>> Good idea. Not sure who's going to do it but if you do it, please send
>>>> me a patch & I'll check it in
>>>>
>>>>>
>>>>> Also, may I know how much disk space phrase-table.3.folder generally
>>>>> takes when the CreateOnDiskPt command runs to completion, given
>>>>> that phrase-table.3.gz is only 23GB?
>>>>>
>>>> I'm not too sure. Try it on a disk with 1TB and please report back what
>>>> you find for future reference
>>>>
>>>>>
>>>>> Thanking You.
>>>>>
>>>>>
>>>>> On Sat, Dec 10, 2016 at 6:53 PM, Hieu Hoang <hieuhoang@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> strange, did the CreateOnDiskPt command execute ok, ie. with no error?
>>>>>>
>>>>>> Does this file exist:
>>>>>> /home/shubham/models/fr-en/phrase-table.3.folder/Misc.dat
>>>>>> If you do
>>>>>> cat Misc.dat
>>>>>> what does it say?
>>>>>>
>>>>>> Hieu Hoang
>>>>>> http://www.hoang.co.uk/hieu
>>>>>>
>>>>>> On 10 December 2016 at 11:30, Shubham Khandelwal <skhlnmiit@gmail.com
>>>>>> > wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> Thanks Hieu for your reply.
>>>>>>> I have used CreateOnDiskPt to binarize the model and stored it in
>>>>>>> phrase-table.3.folder using the following command:
>>>>>>>
>>>>>>> ~/mosesdecoder/bin/CreateOnDiskPt 1 1 4 100 2 phrase-table.3.gz phrase-table.3.folder
>>>>>>>
>>>>>>> I have also changed moses.ini.3, replacing
>>>>>>> PhraseDictionaryMemory with PhraseDictionaryOnDisk as follows:
>>>>>>>
>>>>>>> PhraseDictionaryOnDisk name=TranslationModel0 num-features=4
>>>>>>> path=/home/shubham/models/fr-en/phrase-table.3.folder
>>>>>>> input-factor=0 output-factor=0
>>>>>>>
>>>>>>> Now, when I run it using ~/mosesdecoder/bin/moses -f
>>>>>>> moses.ini.3, it gives the following error after "Created input-output
>>>>>>> object":
>>>>>>>
>>>>>>> terminate called after throwing an instance of 'util::Exception'
>>>>>>>   what(): OnDiskPt/OnDiskWrapper.cpp:217 in uint64_t
>>>>>>> OnDiskPt::OnDiskWrapper::GetMisc(const string&) const threw util::Exception
>>>>>>> because `iter == m_miscInfo.end()'.
>>>>>>> Couldn't find value for key NumSourceFactors
>>>>>>> Aborted (core dumped)
>>>>>>>
>>>>>>> I do not know what key value I should pass, or how. Can
>>>>>>> you please help me with this?
>>>>>>>
>>>>>>> Thank you so much for your help.
>>>>>>>
>>>>>>> Regards,
>>>>>>> Shubham
>>>>>>>
>>>>>>> On Fri, Dec 9, 2016 at 4:27 PM, Hieu Hoang <hieuhoang@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> This is a hierarchical model. You must binarize with CreateOnDiskPt
>>>>>>>> for this model
>>>>>>>>
>>>>>>>> Hieu Hoang
>>>>>>>> http://www.hoang.co.uk/hieu
>>>>>>>>
>>>>>>>> On 9 December 2016 at 08:18, Shubham Khandelwal <
>>>>>>>> skhlnmiit@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Hello,
>>>>>>>>>
>>>>>>>>> Thanks. It works now. I have created the compact phrase table.
>>>>>>>>> Now, when I run it using the following command:
>>>>>>>>>
>>>>>>>>> ~/mosesdecoder/bin/moses -f ~/Translate/models/de-en/model/moses.ini.2 -threads all
>>>>>>>>>
>>>>>>>>> then, after creating the input-output object, it gives the following
>>>>>>>>> segmentation fault:
>>>>>>>>>
>>>>>>>>> Created input-output object : [14.796] seconds
>>>>>>>>> Ich bin ein Student
>>>>>>>>> Line 0: Initialize search took 0.000 seconds total
>>>>>>>>> Translating: <s> Ich bin ein Student </s> ||| [0,0]=X (1) [0,1]=X
>>>>>>>>> (1) [0,2]=X (1) [0,3]=X (1) [0,4]=X (1) [0,5]=X (1) [1,1]=X (1) [1,2]=X (1)
>>>>>>>>> [1,3]=X (1) [1,4]=X (1) [1,5]=X (1) [2,2]=X (1) [2,3]=X (1) [2,4]=X (1)
>>>>>>>>> [2,5]=X (1) [3,3]=X (1) [3,4]=X (1) [3,5]=X (1) [4,4]=X (1) [4,5]=X (1)
>>>>>>>>> [5,5]=X (1)
>>>>>>>>>
>>>>>>>>> Segmentation fault (core dumped)
>>>>>>>>>
>>>>>>>>> My machine has 40GB of RAM, so I am confused about why it gives
>>>>>>>>> this error.
>>>>>>>>> Can you please help me with this? I have attached moses.ini.2
>>>>>>>>> for your reference.
>>>>>>>>>
>>>>>>>>> Thanks.
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>> Shubham
>>>>>>>>>
>>>>>>>>> On Fri, Dec 9, 2016 at 2:02 AM, Hieu Hoang <hieuhoang@gmail.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> maybe try
>>>>>>>>>>
>>>>>>>>>> -encoding None
>>>>>>>>>>
>>>>>>>>>> On 08/12/2016 19:44, Shubham Khandelwal wrote:
>>>>>>>>>>
>>>>>>>>>> Hi Hieu,
>>>>>>>>>>
>>>>>>>>>> Thanks for your reply.
>>>>>>>>>> Yes, I have used the absolute path and I also tried with -T, but
>>>>>>>>>> it did not work.
>>>>>>>>>> Is there any other solution to this problem?
>>>>>>>>>>
>>>>>>>>>> By the way, could anybody please upload compact versions of all the
>>>>>>>>>> pre-made models? They would take less space and would also be very
>>>>>>>>>> fast during decoding.
>>>>>>>>>>
>>>>>>>>>> Thanks.
>>>>>>>>>>
>>>>>>>>>> On Fri, Dec 9, 2016 at 12:50 AM, Hieu Hoang <hieuhoang@gmail.com>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> the previous email you referred to says that the directory
>>>>>>>>>>>
>>>>>>>>>>> binarised-model/
>>>>>>>>>>>
>>>>>>>>>>> must exist before you run it, otherwise it will segfault. I
>>>>>>>>>>> would also use absolute path to make sure, ie. not
>>>>>>>>>>> binarised-model/phrase-table
>>>>>>>>>>>
>>>>>>>>>>> but
>>>>>>>>>>>
>>>>>>>>>>> /home/shubham/moses/binarised-model/phrase-table
>>>>>>>>>>>
>>>>>>>>>>> The previous email exchange also says you should try to add the
>>>>>>>>>>> argument
>>>>>>>>>>>
>>>>>>>>>>> -T .
>>>>>>>>>>>
>>>>>>>>>>> Hieu Hoang
>>>>>>>>>>> http://www.hoang.co.uk/hieu
>>>>>>>>>>>
>>>>>>>>>>> On 8 December 2016 at 15:52, Shubham Khandelwal <
>>>>>>>>>>> skhlnmiit@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hello,
>>>>>>>>>>>>
>>>>>>>>>>>> This is just a reminder about my previous email.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanking You.
>>>>>>>>>>>>
>>>>>>>>>>>> Regards,
>>>>>>>>>>>> Shubham
>>>>>>>>>>>>
>>>>>>>>>>>> On Thu, Dec 8, 2016 at 9:04 AM, Shubham Khandelwal <
>>>>>>>>>>>> skhlnmiit@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hello,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I have just downloaded the phrase-table.2.gz (18GB) de-en model
>>>>>>>>>>>>> and the phrase-table.3.gz (22GB) fr-en model from the available
>>>>>>>>>>>>> pre-made models.
>>>>>>>>>>>>> Now, I am converting them to PhraseDictionaryCompact using
>>>>>>>>>>>>> the following command (for example):
>>>>>>>>>>>>>
>>>>>>>>>>>>> ~/mosesdecoder/bin/processPhraseTableMin -threads all -in ~/model/phrase-table.3.gz -nscores 4 -out binarised-model/phrase-table
>>>>>>>>>>>>>
>>>>>>>>>>>>> But during pass 1/3, it gave the following segmentation fault
>>>>>>>>>>>>> error:
>>>>>>>>>>>>>
>>>>>>>>>>>>> Pass 1/3: Creating hash function for rank assignment
>>>>>>>>>>>>> Segmentation fault (core dumped)
>>>>>>>>>>>>>
>>>>>>>>>>>>> I have found almost the same issue in this thread:
>>>>>>>>>>>>> http://comments.gmane.org/gmane.comp.nlp.moses.user/13033
>>>>>>>>>>>>> However, the binarised-model folder given in the command
>>>>>>>>>>>>> already exists. I also have write access to /tmp, but
>>>>>>>>>>>>> it still gave a segmentation fault.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Can you please tell me what could be wrong here?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanking You.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>> Shubham
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>> Moses-support mailing list
>>>>>>>>>>>> Moses-support@mit.edu
>>>>>>>>>>>> http://mailman.mit.edu/mailman/listinfo/moses-support
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>>
>>> --
>>> Yours Sincerely,
>>>
>>> Shubham Khandelwal
>>> Masters in Informatics (M2-MoSIG),
>>> University Joseph Fourier-Grenoble INP,
>>> Grenoble, France
>>> Webpage: https://sites.google.com/site/skhandelwl21/
>>>
>>
>>
>
>
> --
> Yours Sincerely,
>
> Shubham Khandelwal
> Masters in Informatics (M2-MoSIG),
> University Joseph Fourier-Grenoble INP,
> Grenoble, France
> Webpage: https://sites.google.com/site/skhandelwl21/
>
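
Note on the processPhraseTableMin advice quoted above (the output directory must exist, use an absolute path, and add -T .): the following is a minimal sketch that prepares the directory and assembles the command line using only the flags that appear in this thread. The function name build_compact_cmd and its parameters are hypothetical, and the sketch only builds the argument list without running anything. As the thread notes, a hierarchical model has to be binarized with CreateOnDiskPt instead, so this applies only to phrase-based tables.

```python
import os


def build_compact_cmd(moses_bin, phrase_table_gz, out_prefix, nscores=4, tmp_dir="."):
    """Create the output directory and return the processPhraseTableMin argv.

    Hypothetical helper: the flags are copied from the commands in the thread.
    """
    # Use an absolute output path, as recommended in the thread.
    out_prefix = os.path.abspath(out_prefix)
    # The directory must exist before the tool runs, so create it up front.
    os.makedirs(os.path.dirname(out_prefix), exist_ok=True)
    return [
        os.path.join(moses_bin, "processPhraseTableMin"),
        "-threads", "all",
        "-in", os.path.abspath(phrase_table_gz),
        "-nscores", str(nscores),
        "-out", out_prefix,
        "-T", tmp_dir,
    ]


cmd = build_compact_cmd(
    os.path.expanduser("~/mosesdecoder/bin"),
    os.path.expanduser("~/model/phrase-table.3.gz"),
    "binarised-model/phrase-table",
)
print(" ".join(cmd))
```

The returned list can be passed to subprocess.run(cmd, check=True) once the paths have been checked. Note that running the example creates the binarised-model directory under the current working directory.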
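
Note on the empty Misc.dat and Vocab.dat files reported above: in the thread, CreateOnDiskPt finished without an error on a full disk and left two of its five output files empty. The following is a minimal sketch of a sanity check to run before pointing moses.ini at the folder. The file names come from the thread; the function names check_ondisk_folder and free_gib are hypothetical, and a non-empty file is of course no guarantee that the table is complete.

```python
import os
import shutil

# The five files listed in the thread for a CreateOnDiskPt output folder.
EXPECTED_FILES = ("Misc.dat", "Source.dat", "TargetColl.dat", "TargetInd.dat", "Vocab.dat")


def check_ondisk_folder(folder):
    """Return a list of problems (missing or empty files) found in `folder`."""
    problems = []
    for name in EXPECTED_FILES:
        path = os.path.join(folder, name)
        if not os.path.isfile(path):
            problems.append(f"missing: {name}")
        elif os.path.getsize(path) == 0:
            problems.append(f"empty: {name}")
    return problems


def free_gib(path):
    """Free space, in GiB, on the filesystem that contains `path`."""
    return shutil.disk_usage(path).free / 2**30


if __name__ == "__main__":
    folder = "phrase-table.3.folder"  # folder name used in the thread
    for problem in check_ondisk_folder(folder):
        print(problem)
    print(f"free space here: {free_gib('.'):.1f} GiB")
```

Checking free space before starting is worthwhile because, per the thread, the on-disk table had already grown to 165G from a compressed input of roughly 23GB when the disk filled up.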

------------------------------

_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


End of Moses-support Digest, Vol 122, Issue 28
**********************************************
