Moses-support Digest, Vol 111, Issue 84

Send Moses-support mailing list submissions to
moses-support@mit.edu

To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
moses-support-request@mit.edu

You can reach the person managing the list at
moses-support-owner@mit.edu

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."

Today's Topics:

1. Re: Moses-support post from jasneet.sabharwal@sfu.ca requires
approval (Kenneth Heafield)
2. Re: Segmentation fault on hierarchical model with moses in
server mode (Barry Haddow)
3. Re: Segmentation fault on hierarchical model with moses in
server mode (Hieu Hoang)

----------------------------------------------------------------------

Message: 1
Date: Fri, 29 Jan 2016 17:11:36 +0000
From: Kenneth Heafield <moses@kheafield.com>
Subject: Re: [Moses-support] Moses-support post from
jasneet.sabharwal@sfu.ca requires approval
To: Jasneet Sabharwal <jasneet.sabharwal@sfu.ca>
Cc: moses-support@mit.edu
Message-ID: <56AB9D48.7040002@kheafield.com>
Content-Type: text/plain; charset=windows-1252

Hi,

Read the comments in lm/model.hh and lm/left.hh. Short example:

#include "lm/model.hh"
#include <iostream>
#include <string>

int main() {
  using namespace lm::ngram;
  Model model("file.arpa");
  // Start from the beginning-of-sentence context.
  State state(model.BeginSentenceState()), out_state;
  const Vocabulary &vocab = model.GetVocabulary();
  std::string word;
  while (std::cin >> word) {
    // Score() returns the log10 probability of the word given the state.
    std::cout << model.Score(state, vocab.Index(word), out_state) << '\n';
    state = out_state;  // carry the context forward to the next word
  }
}

Kenneth

On 01/29/2016 04:10 PM, Jasneet Sabharwal wrote:
> Hi Ken,
>
> I have a language model trained on word classes. So, it is trained on
> sentences like "10 15 21 1 23". I now have a feature function that
> generates a phrase "1 7 10 20". I've been able to load my language model
> in my feature function. How can I get the score from my language model
> for this phrase?
>
> Regards,
> Jasneet
>> On Jan 23, 2016, at 7:11 AM, Jasneet Sabharwal
>> <jasneet.sabharwal@sfu.ca <mailto:jasneet.sabharwal@sfu.ca>> wrote:
>>
>> Thanks Ken & Hieu,
>>
>> I'll give KenLM a try. The reason for using Witten-Bell was because
>> Kneser-Ney wasn't able to cope with the counts being generated for
>> coarse language models. So, I'll train my LM using SRILM with ngram
>> order 8 and WB smoothing and use KenLM with order 8 in Moses.
>>
>> Best,
>> Jasneet
>>> On Jan 23, 2016, at 3:38 AM, Kenneth Heafield <moses@kheafield.com
>>> <mailto:moses@kheafield.com>> wrote:
>>>
>>> Hi,
>>>
>>> You can compile with --max-kenlm-order=8 or change the setting in the
>>> Eclipse files.
>>>
>>> The ARPA file format is interchangeable. You can build an ARPA using
>>> SRILM and Witten-Bell (though Bob Moore once called me out at a
>>> conference for suggesting that as an alternative to Kneser-Ney) then
>>> load with KenLM.
>>>
>>> Kenneth
>>>
>>> On 01/23/2016 05:39 AM, Jasneet Sabharwal wrote:
>>>> Thanks Hieu.
>>>>
>>>> I'm using the Eclipse project for development. I followed your video to
>>>> set it up and I have linked the srilm and irstlm installations in the
>>>> root directory of mosesdecoder. I first tried to compile the project,
>>>> but neither the SRILM nor the IRSTLM LM cpp files get compiled. So, I
>>>> added LM_IRST and included the "${workspace_loc}/../../irstlm/include" path
>>>> in the C/C++ Build settings of the project. But I still cannot compile
>>>> IRST.cpp.
>>>>
>>>> The reason I'm not using the included KenLM is because my new feature
>>>> function requires an 8-gram language model with Witten-Bell smoothing,
>>>> which is provided by SRILM. As IRSTLM can use SRILM-generated language
>>>> models, I decided to call IRSTLM code inside my feature function to
>>>> get the score for a phrase.
>>>>
>>>> Any pointers on how I can debug the Eclipse project with IRSTLM/SRILM?
>>>>
>>>> Best,
>>>> Jasneet
>>>>
>>>> PS: When I compile the whole project using "./bjam -j4
>>>> --with-boost=<absolute path to boost> --with-cmph=<absolute path to cmph>
>>>> --with-irstlm=<absolute path to irstlm>", it successfully compiles
>>>> without any errors.
>>>>
>>>>
>>>>> On Jan 19, 2016, at 4:39 PM, Hieu Hoang <hieuhoang@gmail.com
>>>>> <mailto:hieuhoang@gmail.com>
>>>>> <mailto:hieuhoang@gmail.com>> wrote:
>>>>>
>>>>> I believe Nadir Durrani's OSM uses KenLM inside it. You can look in
>>>>> moses/FF/OSM-Feature
>>>>> for tips
>>>>>
>>>>> On 20/01/16 00:31, Jasneet Sabharwal wrote:
>>>>>> Thanks Hieu.
>>>>>>
>>>>>> One last question. What do you think is the best way to load the
>>>>>> SRILM language model inside my custom feature function and to get a
>>>>>> score for a string that my feature function created?
>>>>>>
>>>>>> Best,
>>>>>> Jasneet
>>>>>>> On Jan 17, 2016, at 3:45 AM, Hieu Hoang
>>>>>>> <hieuhoang@gmail.com <mailto:hieuhoang@gmail.com>> wrote:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On 17/01/16 04:05, Jasneet Sabharwal wrote:
>>>>>>>> Thanks Hieu,
>>>>>>>>
>>>>>>>> I had subscribed to the mailing list and I'm getting the digest,
>>>>>>>> but not sure why my email went for your approval. When I get the
>>>>>>>> alignments from GetAlignTerm(), the index of the source word is
>>>>>>>> relative? To get the index in the source sentence, I'm assuming
>>>>>>>> that I would need to get the starting position of the source words
>>>>>>>> from CurrSourceWordsRange().GetStartPos() from current hypothesis
>>>>>>>> and offset the source alignment index with that value?
>>>>>>> yep. And to get the index in the target sentence, use
>>>>>>> GetCurrTargetWordsRange().GetStartPos()
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Jasneet
>>>>>>>>> On Jan 15, 2016, at 3:43 AM, Hieu Hoang <hieuhoang@gmail.com
>>>>>>>>> <mailto:hieuhoang@gmail.com>> wrote:
>>>>>>>>>
>>>>>>>>> please subscribe to the Moses mailing list before posting to it.
>>>>>>>>> You can subscribe here:
>>>>>>>>> http://mailman.mit.edu/mailman/admin/moses-support
>>>>>>>>> To answer your question - the target phrase has a method called
>>>>>>>>> GetAlignTerm()
>>>>>>>>> that contains the alignment for terminals. This comes from the
>>>>>>>>> phrase-table, and ultimately from the word alignment.
>>>>>>>>>
>>>>>>>>> -------- Forwarded Message --------
>>>>>>>>> Subject:Moses-support post from jasneet.sabharwal@sfu.ca
>>>>>>>>> <mailto:jasneet.sabharwal@sfu.ca> requires
>>>>>>>>> approval
>>>>>>>>> Date:Wed, 13 Jan 2016 23:36:50 -0500
>>>>>>>>> From:moses-support-owner@mit.edu
>>>>>>>>> <mailto:moses-support-owner@mit.edu>
>>>>>>>>> To:moses-support-owner@mit.edu <mailto:moses-support-owner@mit.edu>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> As list administrator, your authorization is requested for the
>>>>>>>>> following mailing list posting:
>>>>>>>>>
>>>>>>>>> List: Moses-support@mit.edu
>>>>>>>>> From: jasneet.sabharwal@sfu.ca
>>>>>>>>> Subject: Getting alignments for current hypothesis in phrase
>>>>>>>>> based model
>>>>>>>>> Reason: Post by non-member to a members-only list
>>>>>>>>>
>>>>>>>>> At your convenience, visit:
>>>>>>>>>
>>>>>>>>> http://mailman.mit.edu/mailman/admindb/moses-support
>>>>>>>>>
>>>>>>>>> to approve or deny the request.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> Moses-support mailing list
>>>>>>>> Moses-support@mit.edu <mailto:Moses-support@mit.edu>
>>>>>>>> http://mailman.mit.edu/mailman/listinfo/moses-support
>>>>>>>
>>>>>>> --
>>>>>>> Hieu Hoang
>>>>>>> http://www.hoang.co.uk/hieu
>>>>>>
>>>>>
>>>>> --
>>>>> Hieu Hoang
>>>>> http://www.hoang.co.uk/hieu
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> Moses-support mailing list
>>>> Moses-support@mit.edu <mailto:Moses-support@mit.edu>
>>>> http://mailman.mit.edu/mailman/listinfo/moses-support
>>>>
>>> _______________________________________________
>>> Moses-support mailing list
>>> Moses-support@mit.edu <mailto:Moses-support@mit.edu>
>>> http://mailman.mit.edu/mailman/listinfo/moses-support
>>
>> _______________________________________________
>> Moses-support mailing list
>> Moses-support@mit.edu <mailto:Moses-support@mit.edu>
>> http://mailman.mit.edu/mailman/listinfo/moses-support
>

------------------------------

Message: 2
Date: Fri, 29 Jan 2016 20:28:32 +0000
From: Barry Haddow <bhaddow@staffmail.ed.ac.uk>
Subject: Re: [Moses-support] Segmentation fault on hierarchical model
with moses in server mode
To: Hieu Hoang <hieuhoang@gmail.com>, ugermann@inf.ed.ac.uk, Martin
Baumgärtner <martin.baumgaertner@star-group.net>
Cc: "moses-support@mit.edu" <moses-support@mit.edu>
Message-ID: <56ABCB70.3060708@staffmail.ed.ac.uk>
Content-Type: text/plain; charset=windows-1252; format=flowed

Hi All

I think I see what happened now.

When you give the input "dies ist ein haus" to the sample model, the
"dies" is unknown, and there is no translation. The server did not check
for this condition, and got a seg fault. I have added a check, so if you
pull and try again it should not crash.

In the log pasted by Martin, he passed "das ist ein haus" to
command-line Moses, which works, and gives a translation.

I think ideally the sample models should handle unknown words, and give
a translation. Maybe adding a glue rule would be sufficient?

cheers - Barry

On 29/01/16 11:13, Barry Haddow wrote:
> Hi
>
> When I run command-line Moses, I get the output below - i.e. no best
> translation. The server crashes for me since it does not check for the
> null pointer, but the command-line version does.
>
> I think there should be a translation for this example.
>
> cheers - Barry
>
> [gna]bhaddow: echo 'dies ist ein haus' | ~/moses.new/bin/moses -f
> string-to-tree/moses.ini
> Defined parameters (per moses.ini or switch):
> config: string-to-tree/moses.ini
> cube-pruning-pop-limit: 1000
> feature: KENLM name=LM factor=0 order=3 num-features=1
> path=lm/europarl.srilm.gz WordPenalty UnknownWordPenalty
> PhraseDictionaryMemory input-factor=0 output-factor=0
> path=string-to-tree/rule-table num-features=1 table-limit=20
> input-factors: 0
> inputtype: 3
> mapping: 0 T 0
> max-chart-span: 20 1000
> non-terminals: X S
> search-algorithm: 3
> translation-details: translation-details.log
> weight: WordPenalty0= 0 LM= 0.5 PhraseDictionaryMemory0= 0.5
> line=KENLM name=LM factor=0 order=3 num-features=1 path=lm/europarl.srilm.gz
> Loading the LM will be faster if you build a binary file.
> Reading lm/europarl.srilm.gz
> ----5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
> **The ARPA file is missing <unk>. Substituting log10 probability -100.000.
> **************************************************************************************************
> FeatureFunction: LM start: 0 end: 0
> line=WordPenalty
> FeatureFunction: WordPenalty0 start: 1 end: 1
> line=UnknownWordPenalty
> FeatureFunction: UnknownWordPenalty0 start: 2 end: 2
> line=PhraseDictionaryMemory input-factor=0 output-factor=0
> path=string-to-tree/rule-table num-features=1 table-limit=20
> FeatureFunction: PhraseDictionaryMemory0 start: 3 end: 3
> Loading LM
> Loading WordPenalty0
> Loading UnknownWordPenalty0
> Loading PhraseDictionaryMemory0
> Start loading text phrase table. Moses format : [3.038] seconds
> Reading string-to-tree/rule-table
> ----5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
> ****************************************************************************************************
> max-chart-span: 20
> Created input-output object : [3.041] seconds
> Line 0: Initialize search took 0.000 seconds total
> Translating: <s> dies ist ein haus </s> ||| [0,0]=X (1) [0,1]=X (1)
> [0,2]=X (1) [0,3]=X (1) [0,4]=X (1) [0,5]=X (1) [1,1]=X (1) [1,2]=X (1)
> [1,3]=X (1) [1,4]=X (1) [1,5]=X (1) [2,2]=X (1) [2,3]=X (1) [2,4]=X (1)
> [2,5]=X (1) [3,3]=X (1) [3,4]=X (1) [3,5]=X (1) [4,4]=X (1) [4,5]=X (1)
> [5,5]=X (1)
>
> 0 1 2 3 4 5
> 0 1 2 2 1 0
> 0 0 0 2 0
> 0 0 4 0
> 0 0 0
> 0 0
> 0
> Line 0: Additional reporting took 0.000 seconds total
> Line 0: Translation took 0.002 seconds total
> Translation took 0.000 seconds
> Name:moses VmPeak:74024 kB VmRSS:11084 kB RSSMax:36832 kB
> user:2.972 sys:0.048 CPU:3.020 real:3.058
>
>
> On 29/01/16 00:40, Hieu Hoang wrote:
>> If it works ok on the command line but crashes when using the server,
>> then that suggests a server issue.
>>
>> I don't know much about the server code, to be honest.
>>
>

--
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.

------------------------------

Message: 3
Date: Fri, 29 Jan 2016 20:56:30 +0000
From: Hieu Hoang <hieuhoang@gmail.com>
Subject: Re: [Moses-support] Segmentation fault on hierarchical model
with moses in server mode
To: Barry Haddow <bhaddow@staffmail.ed.ac.uk>
Cc: "moses-support@mit.edu" <moses-support@mit.edu>, Martin
Baumgärtner <martin.baumgaertner@star-group.net>
Message-ID:
<CAEKMkbhGhjDAKqSGT8fE02U4SdXcV5th2_x=7mt2CdZ18U5QUg@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

yeah, that sample model was completely made up manually. Ideally it would
be taken from a small, trained system

Hieu Hoang
http://www.hoang.co.uk/hieu

On 29 January 2016 at 20:28, Barry Haddow <bhaddow@staffmail.ed.ac.uk>
wrote:

> Hi All
>
> I think I see what happened now.
>
> When you give the input "dies ist ein haus" to the sample model, the
> "dies" is unknown, and there is no translation. The server did not check
> for this condition, and got a seg fault. I have added a check, so if you
> pull and try again it should not crash.
>
> In the log pasted by Martin, he passed "das ist ein haus" to command-line
> Moses, which works, and gives a translation.
>
> I think ideally the sample models should handle unknown words, and give a
> translation. Maybe adding a glue rule would be sufficient?
>
> cheers - Barry
>
>
> On 29/01/16 11:13, Barry Haddow wrote:
>
>> Hi
>>
>> When I run command-line Moses, I get the output below - i.e. no best
>> translation. The server crashes for me since it does not check for the
>> null pointer, but the command-line version does.
>>
>> I think there should be a translation for this example.
>>
>> cheers - Barry
>>
>> [gna]bhaddow: echo 'dies ist ein haus' | ~/moses.new/bin/moses -f
>> string-to-tree/moses.ini
>> Defined parameters (per moses.ini or switch):
>> config: string-to-tree/moses.ini
>> cube-pruning-pop-limit: 1000
>> feature: KENLM name=LM factor=0 order=3 num-features=1
>> path=lm/europarl.srilm.gz WordPenalty UnknownWordPenalty
>> PhraseDictionaryMemory input-factor=0 output-factor=0
>> path=string-to-tree/rule-table num-features=1 table-limit=20
>> input-factors: 0
>> inputtype: 3
>> mapping: 0 T 0
>> max-chart-span: 20 1000
>> non-terminals: X S
>> search-algorithm: 3
>> translation-details: translation-details.log
>> weight: WordPenalty0= 0 LM= 0.5 PhraseDictionaryMemory0= 0.5
>> line=KENLM name=LM factor=0 order=3 num-features=1
>> path=lm/europarl.srilm.gz
>> Loading the LM will be faster if you build a binary file.
>> Reading lm/europarl.srilm.gz
>>
>> ----5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
>> **The ARPA file is missing <unk>. Substituting log10 probability
>> -100.000.
>>
>> **************************************************************************************************
>> FeatureFunction: LM start: 0 end: 0
>> line=WordPenalty
>> FeatureFunction: WordPenalty0 start: 1 end: 1
>> line=UnknownWordPenalty
>> FeatureFunction: UnknownWordPenalty0 start: 2 end: 2
>> line=PhraseDictionaryMemory input-factor=0 output-factor=0
>> path=string-to-tree/rule-table num-features=1 table-limit=20
>> FeatureFunction: PhraseDictionaryMemory0 start: 3 end: 3
>> Loading LM
>> Loading WordPenalty0
>> Loading UnknownWordPenalty0
>> Loading PhraseDictionaryMemory0
>> Start loading text phrase table. Moses format : [3.038] seconds
>> Reading string-to-tree/rule-table
>>
>> ----5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
>>
>> ****************************************************************************************************
>> max-chart-span: 20
>> Created input-output object : [3.041] seconds
>> Line 0: Initialize search took 0.000 seconds total
>> Translating: <s> dies ist ein haus </s> ||| [0,0]=X (1) [0,1]=X (1)
>> [0,2]=X (1) [0,3]=X (1) [0,4]=X (1) [0,5]=X (1) [1,1]=X (1) [1,2]=X (1)
>> [1,3]=X (1) [1,4]=X (1) [1,5]=X (1) [2,2]=X (1) [2,3]=X (1) [2,4]=X (1)
>> [2,5]=X (1) [3,3]=X (1) [3,4]=X (1) [3,5]=X (1) [4,4]=X (1) [4,5]=X (1)
>> [5,5]=X (1)
>>
>> 0 1 2 3 4 5
>> 0 1 2 2 1 0
>> 0 0 0 2 0
>> 0 0 4 0
>> 0 0 0
>> 0 0
>> 0
>> Line 0: Additional reporting took 0.000 seconds total
>> Line 0: Translation took 0.002 seconds total
>> Translation took 0.000 seconds
>> Name:moses VmPeak:74024 kB VmRSS:11084 kB RSSMax:36832 kB
>> user:2.972 sys:0.048 CPU:3.020 real:3.058
>>
>>
>> On 29/01/16 00:40, Hieu Hoang wrote:
>>
>>> If it works ok on the command line but crashes when using the server,
>>> then that suggests a server issue.
>>>
>>> I don't know much about the server code, to be honest.
>>>
>>>
>>
>
> --
> The University of Edinburgh is a charitable body, registered in
> Scotland, with registration number SC005336.
>
>

------------------------------

_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support

End of Moses-support Digest, Vol 111, Issue 84
**********************************************
