Moses-support Digest, Vol 107, Issue 49

Send Moses-support mailing list submissions to
moses-support@mit.edu

To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
moses-support-request@mit.edu

You can reach the person managing the list at
moses-support-owner@mit.edu

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."

Today's Topics:

1. Re: Blingual neural lm, log-likelihood: -nan (Nikolay Bogoychev)
2. Re: Blingual neural lm, log-likelihood: -nan (Barry Haddow)
3. Re: Blingual neural lm, log-likelihood: -nan (Rico Sennrich)

----------------------------------------------------------------------

Message: 1
Date: Mon, 21 Sep 2015 08:45:58 +0100
From: Nikolay Bogoychev <nheart@gmail.com>
Subject: Re: [Moses-support] Blingual neural lm, log-likelihood: -nan
To: jian zhang <zhangj@computing.dcu.ie>
Cc: "moses-support@mit.edu" <moses-support@mit.edu>
Message-ID:
<CAJzPUEzmH6DpGhG=cBBNd3wuDBp5Ly9D_nAc-wKGtZMudXeJ6Q@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

Hey Jian,

I have encountered this problem with nplm myself and couldn't really find a
solution that works every time.

Basically what happens is that some token occurs very frequently in the
same position, its weights become huge and eventually turn into NaN (not a
number), and the NaN then propagates to the rest of the data. This usually
happens with the beginning-of-sentence token, especially if your source and
target context sizes are big. One thing you could do is to decrease the
source and target context sizes (doesn't always work). Another thing you
could do is to lower the learning rate (always works, but you might need to
set it quite low, like 0.25).
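
To illustrate why the learning rate matters here, the following is a toy
sketch (not nplm code): plain gradient descent on a one-parameter quadratic
loss. The curvature of 3.0, the starting weight and the name run_sgd are
assumptions made up for this example. With a learning rate of 1 the weight
doubles in magnitude on every step until it overflows to infinity, and the
next update computes inf - inf, which is NaN. With 0.25 the same loop
converges.

```python
import math

CURVATURE = 3.0  # assumed toy value: loss(w) = 0.5 * CURVATURE * w**2


def run_sgd(learning_rate, steps, w=1.0):
    """Run plain gradient descent on the toy loss and return the final weight."""
    for _ in range(steps):
        grad = CURVATURE * w          # d(loss)/dw for the toy loss
        w = w - learning_rate * grad  # standard SGD update
    return w


if __name__ == "__main__":
    for lr in (1.0, 0.25):
        final = run_sgd(lr, steps=2000)
        print(f"learning rate {lr}: final weight {final}, "
              f"is NaN: {math.isnan(final)}")
```

Once a weight is NaN, every later update that reads it also yields NaN, which
is consistent with the -nan log-likelihood reported below.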

The proper solution to this, according to Ashish Vaswani, the creator of
nplm, is to use gradient clipping, which is commented out in his code. You
should contact him because this is an nplm issue.
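
For readers unfamiliar with gradient clipping, here is a generic sketch of
two common variants: limiting each gradient component to a fixed range, and
rescaling the whole gradient vector to a maximum norm. This is not the code
that is commented out in nplm, and which variant nplm uses is not stated in
this thread. The function names and thresholds are hypothetical.

```python
import math


def clip_by_value(grads, limit):
    """Clamp every gradient component to the range [-limit, limit]."""
    return [max(-limit, min(limit, g)) for g in grads]


def clip_by_norm(grads, max_norm):
    """Rescale the gradient so that its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grads))
    if norm <= max_norm:
        return list(grads)  # already small enough, leave unchanged
    scale = max_norm / norm
    return [g * scale for g in grads]


if __name__ == "__main__":
    # A single huge component is capped instead of blowing up the update.
    print(clip_by_value([0.5, -7.0, 1e30], limit=5.0))
    # The direction of the gradient is preserved, only its length shrinks.
    print(clip_by_norm([3.0, 4.0], max_norm=1.0))
```

Either variant bounds the size of a single update, so one very frequent
token cannot push its weights arbitrarily far in one step.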

Cheers,

Nick

On Sat, Sep 19, 2015 at 8:58 PM, jian zhang <zhangj@computing.dcu.ie> wrote:

> Hi all,
>
> I got
>
> Epoch xxxx
> Current learning rate: 1
> Training minibatches: Validation log-likelihood: -nan
> perplexity: nan
>
> during bilingual neural lm training.
>
> I used the command:
> /home/user/tools/nplm-master-rsennrich/src/trainNeuralNetwork --train_file
> work_dir/blm/train.numberized --num_epochs 30 --model_prefix
> work_dir/blm/train.10k.model.nplm --learning_rate 1 --minibatch_size 1000
> --num_noise_samples 100 --num_hidden 2 --input_embedding_dimension 512
> --output_embedding_dimension 192 --num_threads 6 --loss_function log
> --activation_function tanh --validation_file work_dir/blm/valid.numberized
> --validation_minibatch_size 10
>
> where the train.numberized and valid.numberized files are split from the
> file generated by the script
> ${moses}/scripts/training/bilingual-lm/extract_training.py.
>
> Training/Validation numbers are:
> Number of training instances: 4128195
> Number of validation instances: 217274
>
>
> Thanks,
>
> Jian
>
>
> Jian Zhang
> Centre for Next Generation Localisation (CNGL)
> <http://www.cngl.ie/index.html>
> Dublin City University <http://www.dcu.ie/>
>
> _______________________________________________
> Moses-support mailing list
> Moses-support@mit.edu
> http://mailman.mit.edu/mailman/listinfo/moses-support
>
>

------------------------------

Message: 2
Date: Mon, 21 Sep 2015 08:58:16 +0100
From: Barry Haddow <bhaddow@staffmail.ed.ac.uk>
Subject: Re: [Moses-support] Blingual neural lm, log-likelihood: -nan
To: Nikolay Bogoychev <nheart@gmail.com>, jian zhang
<zhangj@computing.dcu.ie>
Cc: "moses-support@mit.edu" <moses-support@mit.edu>
Message-ID: <55FFB898.7040805@staffmail.ed.ac.uk>
Content-Type: text/plain; charset="windows-1252"

Hi Jian

You could also try using dropout. Adding something like

--dropout 0.8 --input_dropout 0.9 --null_index 1

to nplm training can help. Look at your vocabulary file to see what the
null index should be set to. This works with the Moses version of nplm.

cheers - Barry



------------------------------

Message: 3
Date: Mon, 21 Sep 2015 10:08:11 +0100
From: Rico Sennrich <rico.sennrich@gmx.ch>
Subject: Re: [Moses-support] Blingual neural lm, log-likelihood: -nan
To: Barry Haddow <bhaddow@staffmail.ed.ac.uk>, Nikolay Bogoychev
<nheart@gmail.com>, jian zhang <zhangj@computing.dcu.ie>
Cc: "moses-support@mit.edu" <moses-support@mit.edu>
Message-ID: <371kglfno0xilkyrq01j9dtk.1442825862575@email.android.com>
Content-Type: text/plain; charset="utf-8"

Hi all,

Small correction: --dropout isn't on GitHub (yet). I never got gains from it, and thus didn't commit it. I'll have to double-check my implementation.

--input_dropout also didn't give me any gains, but it could make training more stable (helping against NaN), and it is helpful if you want to get probabilities with incomplete context (say, if you have a 5-gram nplm and want to score a bigram; this is common in hiero decoding).
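
As a rough illustration of the idea (not the nplm implementation), input dropout can be pictured as randomly replacing context words with the null index during training, so that the model also learns to predict from partly missing context. At scoring time, a short context can then be padded with the same null index. The names below, the drop_prob parameter and the null index of 1 (taken from the example flags earlier in this thread) are assumptions; the thread does not say whether the value passed to --input_dropout is a keep or a drop probability.

```python
import random

NULL_INDEX = 1  # assumption: mirrors "--null_index 1" from the example flags


def input_dropout(context, drop_prob, rng, null_index=NULL_INDEX):
    """Replace each context word id with the null index with probability drop_prob."""
    return [null_index if rng.random() < drop_prob else w for w in context]


def pad_context(context, order, null_index=NULL_INDEX):
    """Left-pad a short context with null ids so it fits an n-gram model of the given order."""
    missing = (order - 1) - len(context)
    return [null_index] * missing + list(context)


if __name__ == "__main__":
    rng = random.Random(0)
    # Training time: some of the four context words may be masked out.
    print(input_dropout([17, 42, 8, 99], drop_prob=0.1, rng=rng))
    # Scoring a bigram with a 5-gram model: one real context word, three nulls.
    print(pad_context([42], order=5))
```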

Best wishes,
Rico

Sent from my Hitchhiker's guide


------------------------------

_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support

End of Moses-support Digest, Vol 107, Issue 49
**********************************************
