Send Moses-support mailing list submissions to
moses-support@mit.edu
To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
moses-support-request@mit.edu
You can reach the person managing the list at
moses-support-owner@mit.edu
When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."
Today's Topics:
1. Devlin et al 2014 (Tom Hoar)
2. Re: how to test whether tcmalloc is used? (Barry Haddow)
3. Re: Devlin et al 2014 (Nikolay Bogoychev)
----------------------------------------------------------------------
Message: 1
Date: Wed, 26 Nov 2014 18:53:29 +0700
From: Tom Hoar <tahoar@precisiontranslationtools.com>
Subject: [Moses-support] Devlin et al 2014
To: moses-support@mit.edu
Message-ID: <5475BF39.5080800@precisiontranslationtools.com>
Content-Type: text/plain; charset="utf-8"
Hieu,
Sorry I missed you in Vancouver. I just reviewed your slide deck from
the MosesCore TAUS Round Table in Vancouver
(taus-moses-industry-roundtable-2014-changes-in-moses-hieu-hoang-university-of-edinburgh).
In particular, I'm interested in the "Bilingual Language Models" that
"replicate Devlin et al., 2014". A search on statmt.org/moses for
"devlin" doesn't show any hits. So: A) is the code finished? If so,
B) are there any instructions on how to enable/use this feature? If not,
C) what kind of help do you need to test the code for release?
--
Best regards,
Tom Hoar
Managing Director
*Precision Translation Tools Co., Ltd.*
Bangkok, Thailand
Web: www.precisiontranslationtools.com
Mobile: +66 87 345-1875
Skype: tahoar
------------------------------
Message: 2
Date: Wed, 26 Nov 2014 11:56:23 +0000
From: Barry Haddow <bhaddow@staffmail.ed.ac.uk>
Subject: Re: [Moses-support] how to test whether tcmalloc is used?
To: Hieu Hoang <Hieu.Hoang@ed.ac.uk>, Rico Sennrich
<rico.sennrich@gmx.ch>
Cc: moses-support <moses-support@mit.edu>
Message-ID: <5475BFE7.8030009@staffmail.ed.ac.uk>
Content-Type: text/plain; charset=windows-1252; format=flowed
How about:
nm -C moses | grep tcmalloc
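If tcmalloc is linked in statically, this should print matching symbols. Illustrative output only (addresses and the exact symbol set vary by build); for a dynamically linked binary the grep will usually come up empty, so use the ldd check below instead:

0000000000c4a1e0 T tcmalloc::ThreadCache::Scavenge()
0000000000c4b2f0 T tcmalloc::ThreadCache::ReleaseToCentralCache(tcmalloc::ThreadCache::FreeList*, unsigned long, int)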
On 26/11/14 11:34, Hieu Hoang wrote:
> Best to do what Rico says, but....
>
> If the tcmalloc library is dynamically linked to Moses, running ldd
> will show it is linked into moses:
> #ldd bin/moses
> .....
> libtcmalloc_minimal.so.4 => /usr/lib/libtcmalloc_minimal.so.4 (0x00007ff49f5a2000)
> ...
> You can force it to link statically by deleting the shared libraries:
> rm /usr/lib/libtcmalloc*.so*
>
> On 26 November 2014 at 10:50, Rico Sennrich <rico.sennrich@gmx.ch> wrote:
>
> Li Xiang <lixiang.ict@...> writes:
>
> > I compiled Moses with tcmalloc. How can I test whether tcmalloc is
> > used and evaluate the performance?
>
> There are probably many ways, but here are three:
>
> At compile time, you will see the following message if tcmalloc is not
> enabled:
>
> "Tip: install tcmalloc for faster threading. See BUILD-INSTRUCTIONS.txt
> for more information."
>
> You can also use '--without-tcmalloc' to disable tcmalloc and compare
> the speed to a binary that is compiled with tcmalloc.
>
> If you use profiling tools (such as 'perf'), you can see which malloc
> is being called. 'perf top' shows me this line, among others:
>
> 1.75% moses_chart moses [.] tcmalloc::ThreadCache::ReleaseToCentralCache(tcmalloc::ThreadCache::FreeList*, unsigned long, int)
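> As a rough end-to-end comparison, you can also time a decoding run with
> each binary. An illustrative command (config, test set and thread count
> are placeholders):
>
> time ./bin/moses -f moses.ini -threads 8 < test.src > /dev/null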
> --
> Hieu Hoang
> Research Associate
> University of Edinburgh
> http://www.hoang.co.uk/hieu
--
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.
------------------------------
Message: 3
Date: Wed, 26 Nov 2014 13:02:45 +0000
From: Nikolay Bogoychev <nheart@gmail.com>
Subject: Re: [Moses-support] Devlin et al 2014
To: Tom Hoar <tahoar@precisiontranslationtools.com>
Cc: moses-support@mit.edu
Message-ID:
<CAJzPUExQKVLnywCZR2NwZujEWKkVUsUOPQ5S9==ydBUzXXucXg@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"
Hey,
BilingualLM is implemented and, as of last week, resides in Moses master:
https://github.com/moses-smt/mosesdecoder/blob/master/moses/LM/BilingualLM.cpp
To compile it you need a neural network backend; currently two are
supported, OxLM and NPLM. Adding a new backend is relatively easy: you
need to implement the interface as shown here:
https://github.com/moses-smt/mosesdecoder/blob/master/moses/LM/bilingual-lm/BiLM_NPLM.h
To compile with the OxLM backend, build Moses with the switch
--with-oxlm=/path/to/oxlm.
To compile with the NPLM backend, build Moses with the switch
--with-nplm=/path/to/nplm (you need this fork of NPLM:
https://github.com/rsennrich/nplm).
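For example, the build might be invoked like this (path and job count are illustrative):

./bjam --with-nplm=/path/to/nplm -j8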
Unfortunately, documentation is not yet available, so here is a short
summary of how to train a model and use it with the NPLM backend.
Use the extract_training script to prepare the aligned bilingual corpus:
https://github.com/moses-smt/mosesdecoder/blob/master/scripts/training/bilingual-lm/extract_training.py
You need the following options (an example invocation is sketched after the list):
"-e", "--target-language", type="string", dest="target_language")
//Mandatory, for example es "-f", "--source-language", type="string",
dest="source_language") //Mandatory, for example en "-c", "--corpus",
type="string", dest="corpus_stem") // path/to/corpus In the directory you
have specified there should be files corpus.sourcelang and
corpus.targetlang "-t", "--tagged-corpus", type="string",
dest="tagged_stem") //Optional for backoff to pos tag "-a", "--align",
type="string", dest="align_file") //Mandatory alignemtn file "-w",
"--working-dir", type="string", dest="working_dir") //Output directory of
the model "-n", "--target-context", type="int", dest="n") / "-m",
"--source-context", type="int", dest="m") //The actual context size is 2*m
+ 1, this is the number of words on both left and right "-s",
"--prune-source-vocab", type="int", dest="sprune") //cutoff vocabulary
threshold "-p", "--prune-target-vocab", type="int", dest="tprune") //cutoff
vocabulary threshold
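For illustration, a hypothetical invocation (the language pair, paths and thresholds are made up):

extract_training.py -e fr -f en -c /path/to/corpus -a /path/to/aligned.grow-diag-final-and -w working-dir -n 5 -m 4 -s 16000 -p 16000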
Then, use the training script to train the model:
https://github.com/moses-smt/mosesdecoder/blob/master/scripts/training/bilingual-lm/train_nplm.py
An example execution:

train_nplm.py -w de-en-500250source/ -r de-en150nopos-source750 -n 16 -d 0
--nplm-home=/home/abmayne/code/deepathon/nplm_one_layer/ -c corpus.1.word
-i 750 -o 750

where:
-i and -o are the input and output embedding sizes
-n is the total n-gram size
-d is the number of hidden layers
-w and -c are the same as in the extract_training options
-r is the output directory of the model

Consult the Python script for a more detailed description of the options.
Once that has finished, the output directory should contain a trained
bilingual neural network language model.
To run it in Moses as a feature function, you need a line like the
following (in the [feature] section of moses.ini):

BilingualNPLM filepath=/mnt/gna0/nbogoych/new_nplm_german/de-en150nopos/train.10k.model.nplm.10 target_ngrams=4 source_ngrams=9 source_vocab=/mnt/gna0/nbogoych/new_nplm_german/de-enIWSLTnopos/vocab.source target_vocab=/mnt/gna0/nbogoych/new_nplm_german/de-enIWSLTnopos/vocab.target
The source and target vocabularies are located in the working directory
used to prepare the neural network language model.
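As a sketch of how this might sit in a complete moses.ini (the explicit name= and the weight value are illustrative assumptions, not verified defaults):

[feature]
BilingualNPLM name=BLM0 filepath=/path/to/working-dir/train.model.nplm target_ngrams=4 source_ngrams=9 source_vocab=/path/to/working-dir/vocab.source target_vocab=/path/to/working-dir/vocab.target

[weight]
BLM0= 0.1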
target_ngrams doesn't include the predicted word (so target_ngrams = 4
means 4 target context words plus 1 predicted word). The total order of
the model is target_ngrams + source_ngrams + 1.
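For instance, with the feature line above (target_ngrams=4, source_ngrams=9), the model scores 14-grams: 9 source words, 4 target history words, and the predicted word.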
I will write proper documentation in the coming weeks. If you have any
problems running it, please contact me.
Cheers,
Nick
------------------------------
_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support
End of Moses-support Digest, Vol 97, Issue 82
*********************************************