Moses-support Digest, Vol 111, Issue 72

Send Moses-support mailing list submissions to
moses-support@mit.edu

To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
moses-support-request@mit.edu

You can reach the person managing the list at
moses-support-owner@mit.edu

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."

Today's Topics:

1. Re: Moses-support post from jasneet.sabharwal@sfu.ca requires
approval (amittai axelrod)
2. Re: Moses-support post from jasneet.sabharwal@sfu.ca requires
approval (amittai axelrod)
3. OSM and lmplz are both using -T as a parameter directive
which causes error (Ergun Bicici)

----------------------------------------------------------------------

Message: 1
Date: Sat, 23 Jan 2016 13:08:29 -0500
From: amittai axelrod <amittai@umiacs.umd.edu>
Subject: Re: [Moses-support] Moses-support post from
jasneet.sabharwal@sfu.ca requires approval
To: moses-support@mit.edu
Message-ID: <56A3C19D.60708@umiacs.umd.edu>
Content-Type: text/plain; charset=windows-1252; format=flowed

> The reason for using Witten-Bell was that
> Kneser-Ney wasn't able to cope with the counts being generated for
> coarse language models.

that is indeed an annoyance with kndiscount. however, you can now try
using "--discount_fallback" with kenlm. it works for me, even with tens
of classes.

cheers,
~amittai

On 1/23/16 07:11, Jasneet Sabharwal wrote:
> Thanks Ken & Hieu,
>
> I'll give KenLM a try. The reason for using Witten-Bell was that
> Kneser-Ney wasn't able to cope with the counts being generated for
> coarse language models. So, I'll train my LM using SRILM with n-gram
> order 8 and WB smoothing and use KenLM with order 8 in Moses.
>
> Best,
> Jasneet
>> On Jan 23, 2016, at 3:38 AM, Kenneth Heafield <moses@kheafield.com> wrote:
>>
>> Hi,
>>
>> You can compile with --max-kenlm-order=8 or change the setting in the
>> Eclipse files.
>>
>> The ARPA file format is interchangeable. You can build an ARPA using
>> SRILM and Witten-Bell (though Bob Moore once called me out at a
>> conference for suggesting that as an alternative to Kneser-Ney) then
>> load with KenLM.
>>
>> Kenneth
>>
>> On 01/23/2016 05:39 AM, Jasneet Sabharwal wrote:
>>> Thanks Hieu.
>>>
>>> I'm using the Eclipse project for development. I followed your video to
>>> set it up and I have linked the srilm and irstlm installations in the
>>> root directory of mosesdecoder. I first tried to compile the project,
>>> but neither the SRILM nor the IRSTLM LM cpp files get compiled. So, I
>>> added LM_IRST and included the "${workspace_loc}/../../irstlm/include" path
>>> in the C/C++ Build settings of the project. But I still cannot compile
>>> IRST.cpp.
>>>
>>> The reason I'm not using the included KenLM is that my new feature
>>> function requires an 8-gram language model with Witten-Bell smoothing,
>>> which is provided by SRILM. Since IRSTLM can use SRILM-generated language
>>> models, I decided to call IRSTLM code inside my feature function to
>>> get the score for a phrase.
>>>
>>> Any pointers on how I can debug the Eclipse project with IRSTLM/SRILM?
>>>
>>> Best,
>>> Jasneet
>>>
>>> PS: When I compile the whole project using "./bjam -j4
>>> --with-boost=<absolute path to boost> --with-cmph=<absolute path to cmph>
>>> --with-irstlm=<absolute path to irstlm>", it successfully compiles
>>> without any errors.
>>>
>>>
>>>> On Jan 19, 2016, at 4:39 PM, Hieu Hoang <hieuhoang@gmail.com> wrote:
>>>>
>>>> I believe Nadir Durrani's OSM uses KenLM inside it. You can look in
>>>> moses/FF/OSM-Feature
>>>> for tips
>>>>
>>>> On 20/01/16 00:31, Jasneet Sabharwal wrote:
>>>>> Thanks Hieu.
>>>>>
>>>>> One last question. What do you think is the best way to load the
>>>>> SRILM language model inside my custom feature function and to get a
>>>>> score for a string that my feature function created?
>>>>>
>>>>> Best,
>>>>> Jasneet
>>>>>> On Jan 17, 2016, at 3:45 AM, Hieu Hoang <hieuhoang@gmail.com> wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 17/01/16 04:05, Jasneet Sabharwal wrote:
>>>>>>> Thanks Hieu,
>>>>>>>
>>>>>>> I had subscribed to the mailing list and I'm getting the digest,
>>>>>>> but not sure why my email went for your approval. When I get the
>>>>>>> alignments from GetAlignTerm(), the index of the source word is
>>>>>>> relative? To get the index in the source sentence, I'm assuming
>>>>>>> that I would need to get the starting position of the source words
>>>>>>> from CurrSourceWordsRange().GetStartPos() from current hypothesis
>>>>>>> and offset the source alignment index with that value?
>>>>>> yep. And to get the index in the target sentence, use
>>>>>> GetCurrTargetWordsRange().GetStartPos()
>>>>>>>
>>>>>>> Regards,
>>>>>>> Jasneet
>>>>>>>> On Jan 15, 2016, at 3:43 AM, Hieu Hoang <hieuhoang@gmail.com> wrote:
>>>>>>>>
>>>>>>>> please subscribe to the Moses mailing list before posting to it.
>>>>>>>> You can subscribe here:
>>>>>>>> http://mailman.mit.edu/mailman/admin/moses-support
>>>>>>>> To answer your question - the target phrase has a method called
>>>>>>>> GetAlignTerm()
>>>>>>>> that contains the alignment for terminals. This comes from the
>>>>>>>> phrase-table, and ultimately from the word alignment.
>>>>>>>>
>>>>>>>> -------- Forwarded Message --------
>>>>>>>> Subject: Moses-support post from jasneet.sabharwal@sfu.ca requires
>>>>>>>> approval
>>>>>>>> Date: Wed, 13 Jan 2016 23:36:50 -0500
>>>>>>>> From: moses-support-owner@mit.edu
>>>>>>>> To: moses-support-owner@mit.edu
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> As list administrator, your authorization is requested for the
>>>>>>>> following mailing list posting:
>>>>>>>>
>>>>>>>> List: Moses-support@mit.edu
>>>>>>>> From: jasneet.sabharwal@sfu.ca
>>>>>>>> Subject: Getting alignments for current hypothesis in phrase
>>>>>>>> based model
>>>>>>>> Reason: Post by non-member to a members-only list
>>>>>>>>
>>>>>>>> At your convenience, visit:
>>>>>>>>
>>>>>>>> http://mailman.mit.edu/mailman/admindb/moses-support
>>>>>>>>
>>>>>>>> to approve or deny the request.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> Moses-support mailing list
>>>>>>> Moses-support@mit.edu
>>>>>>> http://mailman.mit.edu/mailman/listinfo/moses-support
>>>>>>
>>>>>> --
>>>>>> Hieu Hoang
>>>>>> http://www.hoang.co.uk/hieu
>>>>>
>>>>
>>>> --
>>>> Hieu Hoang
>>>> http://www.hoang.co.uk/hieu
>>>
>
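The alignment exchange quoted above comes down to offset arithmetic: the alignment points stored with a phrase pair are relative to that phrase, so a sentence-level index is the phrase's start position plus the relative index. Below is a minimal Python sketch of that arithmetic only. It is not Moses code; `to_sentence_alignments`, `src_start` and `tgt_start` are hypothetical names standing in for the values that GetStartPos() would return on the source and target word ranges.

```python
from typing import Iterable, List, Tuple


def to_sentence_alignments(
    phrase_alignments: Iterable[Tuple[int, int]],
    src_start: int,
    tgt_start: int,
) -> List[Tuple[int, int]]:
    """Shift phrase-relative (source, target) alignment points to sentence positions.

    src_start and tgt_start are assumed to be the 0-based start positions of
    the phrase in the source sentence and in the target hypothesis.
    """
    return [(src_start + s, tgt_start + t) for s, t in phrase_alignments]


# Hypothetical phrase pair starting at source word 3 and target word 5.
relative = [(0, 1), (1, 0)]
absolute = to_sentence_alignments(relative, src_start=3, tgt_start=5)
print(absolute)
```

With these made-up values, the relative point (0, 1) becomes (3, 6) and (1, 0) becomes (4, 5).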

------------------------------

Message: 2
Date: Sat, 23 Jan 2016 13:10:02 -0500
From: amittai axelrod <amittai@umiacs.umd.edu>
Subject: Re: [Moses-support] Moses-support post from
jasneet.sabharwal@sfu.ca requires approval
To: moses-support@mit.edu
Message-ID: <56A3C1FA.6020500@umiacs.umd.edu>
Content-Type: text/plain; charset=windows-1252; format=flowed

whoops, forgot link. see "class-based models" section in:
http://kheafield.com/code/kenlm/estimation/

~amittai
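For readers who want to try the suggestion, here is a hedged sketch of an lmplz call with the fallback switched on, written as a small Python helper that only builds the argument list. The helper name, the default binary path and the file names are assumptions for illustration; `--order`, `--text`, `--arpa` and `--discount_fallback` are lmplz options, but it is worth checking `lmplz --help` for your build.

```python
from typing import List


def lmplz_command(text_path: str, arpa_path: str, order: int = 8,
                  discount_fallback: bool = True,
                  lmplz: str = "mosesdecoder/bin/lmplz") -> List[str]:
    """Build (but do not run) an lmplz argument list for a class-based LM."""
    cmd = [lmplz, "--order", str(order), "--text", text_path, "--arpa", arpa_path]
    if discount_fallback:
        # Falls back to preset discounts when the closed-form
        # Kneser-Ney discount estimate fails on the counts.
        cmd.append("--discount_fallback")
    return cmd


# Hypothetical file names; pass the list to subprocess.run(cmd, check=True) to execute.
print(" ".join(lmplz_command("classes.txt", "classes.arpa")))
```

Building the command as a list rather than a single string avoids shell quoting problems if the paths contain spaces.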


------------------------------

Message: 3
Date: Sun, 24 Jan 2016 15:50:45 +0100
From: Ergun Bicici <ergunbicici@yahoo.com>
Subject: [Moses-support] OSM and lmplz are both using -T as a
parameter directive which causes error
To: moses-support@mit.edu
Message-ID:
<CAB59qTMap4SXPBTtM82Set8L_vssdwx8eqk5rUSd5diEyQ6GAw@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

Dear Moses Support,

An OSM training script invocation like the following:

mosesdecoder/scripts/OSM/OSM-Train.perl \
  --corpus-f SMT_de-en/training/corpus.1.de \
  --corpus-e SMT_de-en/training/corpus.1.en \
  --alignment SMT_de-en/model/aligned.1.grow-diag-final-and \
  --order 4 --out-dir SMT_de-en/model/OSM.1 \
  --moses-src-dir mosesdecoder --input-extension de \
  --output-extension en \
  -lmplz 'mosesdecoder/bin/lmplz -S 40% -T SMT_de-en/model/tmp'

calls lmplz like this:

Executing: mosesdecoder/bin/lmplz -S 40% -T SMT_de-en/model/tmp -T SMT_de-en/model/OSM.1 --order 4 --text SMT_de-en/model/OSM.1//opCorpus --arpa SMT_de-en/model/OSM.1//operationLM --prune 0 0 1

causing the following error:
option '--temp_prefix' cannot be specified more than once

This works ok:

mosesdecoder/scripts/OSM/OSM-Train.perl \
  --corpus-f SMT_de-en/training/corpus.1.de \
  --corpus-e SMT_de-en/training/corpus.1.en \
  --alignment SMT_de-en/model/aligned.1.grow-diag-final-and \
  --order 4 --out-dir SMT_de-en/model/OSM.1 \
  --moses-src-dir mosesdecoder --input-extension de \
  --output-extension en \
  -lmplz 'mosesdecoder/bin/lmplz -S 40% -T SMT_de-en/model/tmp'

Regards,
Ergun
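
The failure reported above is an option collision: the user-supplied lmplz command already carries -T, and the training script appends its own. One way a wrapper could avoid this is to add a default temp prefix only when the caller has not set one. The sketch below is in Python for illustration and is not the logic in OSM-Train.perl; `with_default_temp_prefix` is a hypothetical helper, and it only recognises the separated `-T value`, `--temp_prefix value` and `--temp_prefix=value` spellings.

```python
import shlex
from typing import List


def with_default_temp_prefix(user_cmd: str, default_prefix: str) -> List[str]:
    """Append '-T default_prefix' unless the command already sets a temp prefix."""
    args = shlex.split(user_cmd)
    already_set = any(
        a == "-T" or a == "--temp_prefix" or a.startswith("--temp_prefix=")
        for a in args
    )
    if not already_set:
        args += ["-T", default_prefix]
    return args


# The user command from the report keeps its own -T; no second one is added.
cmd = with_default_temp_prefix(
    "mosesdecoder/bin/lmplz -S 40% -T SMT_de-en/model/tmp",
    "SMT_de-en/model/OSM.1",
)
print(cmd)
```

Letting the user's value win keeps the command-line override meaningful, while callers who pass no -T still get the script's default directory.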

------------------------------

_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support

End of Moses-support Digest, Vol 111, Issue 72
**********************************************
