Moses-support Digest, Vol 115, Issue 22

Send Moses-support mailing list submissions to
moses-support@mit.edu

To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
moses-support-request@mit.edu

You can reach the person managing the list at
moses-support-owner@mit.edu

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."


Today's Topics:

1. Re: Tokenizer (Philipp Koehn)
2. DEBUG_LEVEL:1 Error: lower order count-of-counts cannot be
estimated properly (Tomasz Gawryl)
3. Re: DEBUG_LEVEL:1 Error: lower order count-of-counts cannot
be estimated properly (Nicola Bertoldi)


----------------------------------------------------------------------

Message: 1
Date: Thu, 19 May 2016 17:54:15 -0400
From: Philipp Koehn <phi@jhu.edu>
Subject: Re: [Moses-support] Tokenizer
To: Adel Khalifa <adelkhalifa9@gmail.com>
Cc: "moses-support@mit.edu" <moses-support@mit.edu>
Message-ID:
<CAAFADDDoQNgbZakUEZnE0ooxO7vC9DJynqitnfSHhCvwD2yaCw@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

Hi,

this question is not very clear... What does the translation system
or tokenizer currently produce and what would you want it to
produce? An example would be helpful.

-phi

On Wed, May 18, 2016 at 9:24 AM, Adel Khalifa <adelkhalifa9@gmail.com>
wrote:

> Hello All,
>
> How can I fix the translation of " 's " so that it does not appear in the target?
>
> Regards,
> Adel
>
> _______________________________________________
> Moses-support mailing list
> Moses-support@mit.edu
> http://mailman.mit.edu/mailman/listinfo/moses-support
>
>

------------------------------

Message: 2
Date: Fri, 20 May 2016 11:31:33 +0200
From: "Tomasz Gawryl" <tomasz.gawryl@skrivanek.pl>
Subject: [Moses-support] DEBUG_LEVEL:1 Error: lower order
count-of-counts cannot be estimated properly
To: <moses-support@mit.edu>
Message-ID: <041f01d1b27a$65fd0510$31f70f30$@skrivanek.pl>
Content-Type: text/plain; charset="us-ascii"

Hi,

I'm trying to build a 10-gram model, but my training pipeline ends with the
error: "Error: lower order count-of-counts cannot be estimated properly".
The corpus has 33 million sentences.

I successfully trained on a much smaller corpus (around 5 million sentences)
using the same config file. Could you suggest how to fix this problem?



Regards,
Thomas

--------------

# more steps/2/LM_ACROSS-BIGMAMA-OPENSUB2016_train.2.STDERR

Generating successor statistics
level 2
level 3
level 4
level 5
level 6
level 7
level 8
level 9
level 10
level 1
computing statistics
n1: 1 n2: 0 n3: 0 n4: 0 unover3: 0
DEBUG_LEVEL:1 Error: lower order count-of-counts cannot be estimated properly
Hint: use another smoothing method with this corpus.

EXECUTING rm -rf /home/moses/working/experiments/NGRAM10-A/tmp/irstlm-build-tmp.6920
FINISH.
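
Editor's note: the log above does not say which smoothing method was configured. Assuming it is a Kneser-Ney variant that estimates its discounts from count-of-counts (n1 = number of n-grams seen once, n2 = seen twice, and so on), the reported values "n1: 1 n2: 0 n3: 0 n4: 0" would leave the standard modified Kneser-Ney discount formulas undefined, because they divide by n1, n2 and n3. The sketch below only illustrates that arithmetic. The function name is hypothetical and this is not IRSTLM's code.

```python
def modified_kn_discounts(n1, n2, n3, n4):
    """Estimate modified Kneser-Ney discounts (D1, D2, D3+) from count-of-counts.

    Uses the standard closed-form estimates:
        Y   = n1 / (n1 + 2*n2)
        D1  = 1 - 2*Y*n2/n1
        D2  = 2 - 3*Y*n3/n2
        D3+ = 3 - 4*Y*n4/n3
    Raises ValueError when a count-of-counts used as a divisor is zero.
    """
    if min(n1, n2, n3) <= 0:
        raise ValueError(
            "count-of-counts cannot be used to estimate discounts: "
            f"n1={n1} n2={n2} n3={n3} n4={n4}"
        )
    y = n1 / (n1 + 2 * n2)
    d1 = 1 - 2 * y * n2 / n1
    d2 = 2 - 3 * y * n3 / n2
    d3_plus = 3 - 4 * y * n4 / n3
    return d1, d2, d3_plus


if __name__ == "__main__":
    # Values taken from the log line "n1: 1 n2: 0 n3: 0 n4: 0".
    try:
        modified_kn_discounts(1, 0, 0, 0)
    except ValueError as err:
        print("estimation failed:", err)

    # Made-up, well-behaved counts for comparison (illustrative only).
    print(modified_kn_discounts(100, 40, 20, 10))
```

Under this assumption, the failure would come from the statistics at one n-gram level being nearly empty rather than from the corpus size as such, which fits the log's hint to try a smoothing method that does not rely on count-of-counts.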


------------------------------

Message: 3
Date: Fri, 20 May 2016 12:11:58 +0200
From: Nicola Bertoldi <bertoldi@fbk.eu>
Subject: Re: [Moses-support] DEBUG_LEVEL:1 Error: lower order
count-of-counts cannot be estimated properly
To: Tomasz Gawryl <tomasz.gawryl@skrivanek.pl>
Cc: moses-support@mit.edu
Message-ID: <0ED7338D-DAB7-41C3-982B-CE44592A3A42@fbk.eu>
Content-Type: text/plain; charset="utf-8"

Hi Thomas,

This is clearly a problem related to IRSTLM.

Could you please open a ticket in the IRSTLM GitHub repo, adding as much information as possible?

Please also include the actual command you ran.

I (as an IRSTLM developer) will reply as soon as possible.

Nicola


> On 20 May 2016, at 11:31, Tomasz Gawryl <tomasz.gawryl@skrivanek.pl> wrote:
>
> Hi,
> I'm trying to build a 10-gram model, but my training pipeline ends with the error: "Error: lower order count-of-counts cannot be estimated properly". The corpus has 33 million sentences.
> I successfully trained on a much smaller corpus (around 5 million sentences) using the same config file. Could you suggest how to fix this problem?
>
> Regards,
> Thomas
>
> --------------
>
> # more steps/2/LM_ACROSS-BIGMAMA-OPENSUB2016_train.2.STDERR
>
> Generating successor statistics
> level 2
> level 3
> level 4
> level 5
> level 6
> level 7
> level 8
> level 9
> level 10
> level 1
> computing statistics
> n1: 1 n2: 0 n3: 0 n4: 0 unover3: 0
> DEBUG_LEVEL:1 Error: lower order count-of-counts cannot be estimated properly
> Hint: use another smoothing method with this corpus.
>
> EXECUTING rm -rf /home/moses/working/experiments/NGRAM10-A/tmp/irstlm-build-tmp.6920
> FINISH.
> _______________________________________________
> Moses-support mailing list
> Moses-support@mit.edu
> http://mailman.mit.edu/mailman/listinfo/moses-support

------------------------------

_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


End of Moses-support Digest, Vol 115, Issue 22
**********************************************
