Moses-support Digest, Vol 111, Issue 85

Send Moses-support mailing list submissions to
moses-support@mit.edu

To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
moses-support-request@mit.edu

You can reach the person managing the list at
moses-support-owner@mit.edu

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."


Today's Topics:

1. Re: Segmentation fault on hierarchical model with moses in
server mode (Matthias Huck)
2. Re: Segmentation fault on hierarchical model with moses in
server mode (Hieu Hoang)
3. Re: Segmentation fault on hierarchical model with moses in
server mode (Matthias Huck)


----------------------------------------------------------------------

Message: 1
Date: Fri, 29 Jan 2016 21:15:49 +0000
From: Matthias Huck <mhuck@inf.ed.ac.uk>
Subject: Re: [Moses-support] Segmentation fault on hierarchical model
with moses in server mode
To: Barry Haddow <bhaddow@staffmail.ed.ac.uk>, Hieu Hoang
<hieuhoang@gmail.com>, ugermann@inf.ed.ac.uk, Martin Baumgärtner
<martin.baumgaertner@star-group.net>
Cc: "moses-support@mit.edu" <moses-support@mit.edu>
Message-ID: <1454102149.2154.13.camel@inf.ed.ac.uk>
Content-Type: text/plain; charset="UTF-8"

Hi,

It seems to me that this toy string-to-tree setup is either outdated,
or it always had issues. It should be replaced.

Under real-world conditions, the decoder should always be able to
produce some hypothesis. We would therefore usually extract a whole set
of glue rules. And we would typically also add an [unknown-lhs] section
to the moses.ini that would tell the decoder which left-hand side
non-terminal labels to use for out-of-vocabulary words. To my knowledge,
these two techniques are crucial for being able to parse any input
sentence provided to the chart decoder in syntax-based translation.
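
As a rough illustration of the out-of-vocabulary case, the Python sketch
below scans a rule table for source-side terminals and reports which input
tokens no rule covers. It is not part of Moses. The function names and the
sample rules are hypothetical, and it assumes that fields are separated by
"|||" and that bracketed tokens such as [X] or [X][NP] are non-terminals.

```python
def source_terminals(rule_lines):
    """Collect the source-side terminal words from rule-table lines."""
    vocab = set()
    for line in rule_lines:
        fields = line.split("|||")
        if len(fields) < 2:
            continue  # skip empty or malformed lines
        for token in fields[0].split():
            # Assumption: bracketed tokens ([X], [X][NP]) are non-terminals.
            if token.startswith("[") and token.endswith("]"):
                continue
            vocab.add(token)
    return vocab


def oov_tokens(sentence, vocab):
    """Return the input tokens that no rule covers, in input order."""
    return [tok for tok in sentence.split() if tok not in vocab]


# Hypothetical rule-table lines, for illustration only.
rules = [
    "das [X] ||| this [NP] ||| 1.0",
    "ist [X] ||| is [V] ||| 1.0",
    "ein [X] ||| a [DT] ||| 1.0",
    "haus [X] ||| house [NN] ||| 1.0",
    "[X][NP] [X][V] [X] ||| [X][NP] [X][V] [S] ||| 1.0",
]
vocab = source_terminals(rules)
print(oov_tokens("dies ist ein haus", vocab))  # ['dies']
print(oov_tokens("das ist ein haus", vocab))   # []
```

A non-empty result means the input contains words that only glue rules
and an unknown-word label could make parsable.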

So, in my opinion, the problem is most likely neither the server
implementation nor the syntax-based decoder, but a problematic setup.
I would consider it okay for the server to crash (or at least print a
warning) under such circumstances. You don't want it to silently not
translate complete sentences.

(I must admit that I didn't look into it in too much detail, but it
should be easy to confirm.)

Cheers,
Matthias


On Fri, 2016-01-29 at 20:28 +0000, Barry Haddow wrote:
> Hi All
>
> I think I see what happened now.
>
> When you give the input "dies ist ein haus" to the sample model, the
> "dies" is unknown, and there is no translation. The server did not check
> for this condition, and got a seg fault. I have added a check, so if you
> pull and try again it should not crash.
>
> In the log pasted by Martin, he passed "das ist ein haus" to
> command-line Moses, which works, and gives a translation.
>
> I think ideally the sample models should handle unknown words, and give
> a translation. Maybe adding a glue rule would be sufficient?
>
> cheers - Barry
>
> On 29/01/16 11:13, Barry Haddow wrote:
> > Hi
> >
> > When I run command-line Moses, I get the output below - i.e. no best
> > translation. The server crashes for me since it does not check for the
> > null pointer, but the command-line version does.
> >
> > I think there should be a translation for this example.
> >
> > cheers - Barry
> >
> > [gna]bhaddow: echo 'dies ist ein haus' | ~/moses.new/bin/moses -f
> > string-to-tree/moses.ini
> > Defined parameters (per moses.ini or switch):
> > config: string-to-tree/moses.ini
> > cube-pruning-pop-limit: 1000
> > feature: KENLM name=LM factor=0 order=3 num-features=1
> > path=lm/europarl.srilm.gz WordPenalty UnknownWordPenalty
> > PhraseDictionaryMemory input-factor=0 output-factor=0
> > path=string-to-tree/rule-table num-features=1 table-limit=20
> > input-factors: 0
> > inputtype: 3
> > mapping: 0 T 0
> > max-chart-span: 20 1000
> > non-terminals: X S
> > search-algorithm: 3
> > translation-details: translation-details.log
> > weight: WordPenalty0= 0 LM= 0.5 PhraseDictionaryMemory0= 0.5
> > line=KENLM name=LM factor=0 order=3 num-features=1 path=lm/europarl.srilm.gz
> > Loading the LM will be faster if you build a binary file.
> > Reading lm/europarl.srilm.gz
> > ----5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
> > **The ARPA file is missing <unk>. Substituting log10 probability -100.000.
> > **************************************************************************************************
> > FeatureFunction: LM start: 0 end: 0
> > line=WordPenalty
> > FeatureFunction: WordPenalty0 start: 1 end: 1
> > line=UnknownWordPenalty
> > FeatureFunction: UnknownWordPenalty0 start: 2 end: 2
> > line=PhraseDictionaryMemory input-factor=0 output-factor=0
> > path=string-to-tree/rule-table num-features=1 table-limit=20
> > FeatureFunction: PhraseDictionaryMemory0 start: 3 end: 3
> > Loading LM
> > Loading WordPenalty0
> > Loading UnknownWordPenalty0
> > Loading PhraseDictionaryMemory0
> > Start loading text phrase table. Moses format : [3.038] seconds
> > Reading string-to-tree/rule-table
> > ----5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
> > ****************************************************************************************************
> > max-chart-span: 20
> > Created input-output object : [3.041] seconds
> > Line 0: Initialize search took 0.000 seconds total
> > Translating: dies ist ein haus ||| [0,0]=X (1) [0,1]=X (1)
> > [0,2]=X (1) [0,3]=X (1) [0,4]=X (1) [0,5]=X (1) [1,1]=X (1) [1,2]=X (1)
> > [1,3]=X (1) [1,4]=X (1) [1,5]=X (1) [2,2]=X (1) [2,3]=X (1) [2,4]=X (1)
> > [2,5]=X (1) [3,3]=X (1) [3,4]=X (1) [3,5]=X (1) [4,4]=X (1) [4,5]=X (1)
> > [5,5]=X (1)
> >
> > 0 1 2 3 4 5
> > 0 1 2 2 1 0
> > 0 0 0 2 0
> > 0 0 4 0
> > 0 0 0
> > 0 0
> > 0
> > Line 0: Additional reporting took 0.000 seconds total
> > Line 0: Translation took 0.002 seconds total
> > Translation took 0.000 seconds
> > Name:moses VmPeak:74024 kB VmRSS:11084 kB RSSMax:36832 kB
> > user:2.972 sys:0.048 CPU:3.020 real:3.058
> >
> >
> > On 29/01/16 00:40, Hieu Hoang wrote:
> > > If it works ok on the command line but crashes when using the server,
> > > then that suggests a server issue.
> > >
> > > I don't know much about the server code, to be honest.
> > >
> >
>
>

--
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.



------------------------------

Message: 2
Date: Fri, 29 Jan 2016 21:26:10 +0000
From: Hieu Hoang <hieuhoang@gmail.com>
Subject: Re: [Moses-support] Segmentation fault on hierarchical model
with moses in server mode
To: Matthias Huck <mhuck@inf.ed.ac.uk>
Cc: "moses-support@mit.edu" <moses-support@mit.edu>, Barry Haddow
<bhaddow@staffmail.ed.ac.uk>, Martin Baumgärtner
<martin.baumgaertner@star-group.net>
Message-ID:
<CAEKMkbjim=XaSuL7T2NkfqFmmChBq8zy2Ue_29py2nr_a8aLFA@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

The decoder should handle no translation without falling over. But yes, the
model is too toy.

------------------------------

Message: 3
Date: Fri, 29 Jan 2016 21:37:37 +0000
From: Matthias Huck <mhuck@inf.ed.ac.uk>
Subject: Re: [Moses-support] Segmentation fault on hierarchical model
with moses in server mode
To: Hieu Hoang <hieuhoang@gmail.com>
Cc: "moses-support@mit.edu" <moses-support@mit.edu>, Barry Haddow
<bhaddow@staffmail.ed.ac.uk>, Martin Baumgärtner
<martin.baumgaertner@star-group.net>
Message-ID: <1454103457.2154.20.camel@inf.ed.ac.uk>
Content-Type: text/plain; charset="UTF-8"


On Fri, 2016-01-29 at 21:26 +0000, Hieu Hoang wrote:
> The decoder should handle no translation without falling over. But
> yes, the model is too toy

Normally the decoder would always produce some translation. (The translation could be an empty sentence, of course.) If it's misconfigured, it should tell you about it. But maybe not with a segmentation fault. :-)
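
A minimal Python sketch of that behaviour, assuming a hypothetical server
handler that receives either a list of output words or None when no
derivation was found. None of the names below come from the Moses server
code. An empty list is treated as a valid (empty) translation, while a
missing hypothesis is reported explicitly instead of being dereferenced.

```python
import logging
from typing import List, Optional

log = logging.getLogger("translation-server")  # hypothetical logger name


class NoTranslationError(Exception):
    """Raised when the decoder found no derivation for the input."""


def render_translation(best_hypothesis: Optional[List[str]], source: str) -> str:
    """Turn the best hypothesis into a string, or fail loudly if there is none."""
    if best_hypothesis is None:
        # Report the problem instead of crashing or silently returning nothing.
        log.warning("no derivation found for input: %r", source)
        raise NoTranslationError("no translation for input: " + source)
    # An empty hypothesis is a legitimate result: the empty sentence.
    return " ".join(best_hypothesis)


print(render_translation(["this", "is", "a", "house"], "das ist ein haus"))
```

A caller could catch NoTranslationError and return an error response to
the client, so the failure is visible without taking the server down.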






------------------------------

_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


End of Moses-support Digest, Vol 111, Issue 85
**********************************************
