Moses-support Digest, Vol 161, Issue 4


Today's Topics:

1. Re: Tuning a factored model -> crash (Haukur Páll Jónsson)


----------------------------------------------------------------------

Message: 1
Date: Tue, 10 Mar 2020 11:43:27 +0000
From: Haukur Páll Jónsson <haukurpj@ru.is>
Subject: Re: [Moses-support] Tuning a factored model -> crash
To: Hieu Hoang <hieuhoang@gmail.com>
Cc: "moses-support@mit.edu" <moses-support@mit.edu>
Message-ID: <6a16b201f769438a808d1dbbfda01b76@ru.is>
Content-Type: text/plain; charset="iso-8859-1"

Dear Hieu,


Thank you for your reply. I appreciate that you still take time to answer questions on this mailing list.


It's good to hear that all datasets should be processed in the same way. This is now the case for my data. The `dev` data contains all three factors, just like the training and test data:


==> /work/haukurpj/data/mideind/dev/form-pos-lemma/data.en <==
for|IN|for less|JJR|less than|IN|than one|CD|one day|NN|day (|(|( up|IN|up to|TO|to 24|CD|24 hours|NNS|hour )|)|) :|:|:

==> /work/haukurpj/data/mideind/dev/form-pos-lemma/data.is <==
skemur|aam|skammt en|c|en einn|tfkeo|einn dag|nkeo|dagur (|(|( 24|ta|24 klst.|as|klst. eða|c|eða skemur|aam|skammt )|)|) :|:|:
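
One thing worth ruling out before looking at the decoder itself is a malformed token somewhere in the files. The following is a minimal Python sketch, assuming the `surface|pos|lemma` layout shown above with `|` as the factor separator and whitespace between tokens. The function name `find_malformed_tokens` is hypothetical and not part of Moses.

```python
from typing import Iterable, List, Tuple


def find_malformed_tokens(
    lines: Iterable[str], num_factors: int = 3, sep: str = "|"
) -> List[Tuple[int, str]]:
    """Return (line_number, token) pairs whose factor count is wrong
    or which contain an empty factor."""
    problems = []
    for line_no, line in enumerate(lines, start=1):
        for token in line.split():
            factors = token.split(sep)
            if len(factors) != num_factors or any(f == "" for f in factors):
                problems.append((line_no, token))
    return problems


sample = [
    "for|IN|for less|JJR|less than|IN|than",  # well-formed line
    "one|CD day|NN|day",                       # "one|CD" lacks a lemma
]
print(find_malformed_tokens(sample))
```

Running the same check over the training, dev and test files (for example by passing an open file object as `lines`) would show whether any of them deviates from the three-factor layout.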

I have two follow-up questions.


1. The crash still persists. I even increased the allocated memory (via Slurm) to 64GB to rule out memory allocation issues during decoding. I have attached the moses.ini file used during tuning. The filtered phrase tables look correct.
The crash:
Line Line 1: Collecting options took 0: Collecting options took 0.005 seconds at moses/Manager.cpp Line 141
0.004 seconds at moses/Manager.cpp Line 141
Segmentation fault
Exit code: 139
2. When tuning, `mert-moses.pl` attempts to optimize the BLEU score on the dev data. Is the BLEU score computed over complete factored tokens (i.e. both the surface form and the POS tag must match for a token to count as correct), or over each factor individually and then combined (e.g. by arithmetic mean) into an overall BLEU score?
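
To inspect or compare individual factors, a factored file can be projected onto a subset of its factors. The following is a minimal Python sketch, assuming `|` as the factor separator and whitespace between tokens; the helper name `project_factors` is hypothetical and not part of Moses. It does not settle how `mert-moses.pl` scores factored output. It only makes it easy to produce, say, a surface-only version of a reference.

```python
from typing import Sequence


def project_factors(line: str, keep: Sequence[int] = (0,), sep: str = "|") -> str:
    """Keep only the factors at the given indices for every token in a line."""
    projected = []
    for token in line.split():
        factors = token.split(sep)
        projected.append(sep.join(factors[i] for i in keep))
    return " ".join(projected)


line = "skemur|aam|skammt en|c|en einn|tfkeo|einn dag|nkeo|dagur"
print(project_factors(line))               # surface forms only
print(project_factors(line, keep=(0, 1)))  # surface form and POS tag
```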

________________________________
From: Hieu Hoang <hieuhoang@gmail.com>
Sent: 9 March 2020 20:22:08
To: Haukur Páll Jónsson
Cc: moses-support@mit.edu
Subject: Re: [Moses-support] Tuning a factored model -> crash


Hieu Hoang
http://statmt.org/hieu


On Mon, 9 Mar 2020 at 10:08, Haukur Páll Jónsson <haukurpj@ru.is> wrote:
Hi all,


For the last few weeks, I have been trying to train and tune a factored model. I have found it difficult to implement and I now seek some assistance.


I am trying to build a simple factored model from English to Icelandic with the translation steps T0-0 and T0,1-1, where factor 0 is the surface form and factor 1 is the POS tag. The training data (source and target) has three factors, `surface|pos|lemma`; the lemma is ignored for now.


When tuning a factored model I run into problems. My first question is: which factors should be in the tuning data? It seems that I can have all factors in the input/source, but I'm unsure about the output/target.
The tuning data also needs to have factors 0 and 1. You should pre-process the training, tuning and test data in the same way.

I run the tuning like so (using 10 threads):


"$MOSESDECODER"/scripts/training/mert-moses.pl \
"$DEV_DATA_IN"."$LANG_FROM" \
"$DEV_DATA_OUT"."$LANG_TO" \
"$MOSESDECODER"/bin/moses "$BASE_MOSES_INI" \
--mertdir "$MOSESDECODER"/bin \
--working-dir "$TUNE_DIR" \
--decoder-flags="-threads $THREADS"


But when decoding starts, the decoder crashes.


Line Line 9: Initialize search took leave|VBP0.103 seconds total
soft|JJ hands|NNS ,|, Jess|NNP you|PRP very|RB weak|JJ ,|, including|VBG 4: Initialize search took .|.
: Collecting options took 0.003severe|JJ ,|, his|PRP$I|PRP loved|VBD Sean|NNP .|. you|PRP know|VBP .|.
seconds at moses/Manager.cpp Line 0.129 seconds total
joint|JJ examination|NN Line 2: Initialize search took 141
Segmentation fault
Exit code: 139
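
Exit code 139 is consistent with the "Segmentation fault" message: shells conventionally report a process killed by signal N as 128 + N, and signal 11 is SIGSEGV on Linux. A small Python sketch of that convention (the helper name `describe_exit_code` is hypothetical):

```python
import signal


def describe_exit_code(code: int) -> str:
    """Interpret a shell exit status using the 128 + signal-number convention."""
    if code > 128:
        signum = code - 128
        try:
            return f"killed by {signal.Signals(signum).name}"
        except ValueError:
            # No signal with this number is known on the current platform.
            return f"killed by unknown signal {signum}"
    return f"exited normally with status {code}"


print(describe_exit_code(139))
```

This only identifies the kind of failure (an invalid memory access in the decoder process); it says nothing about which component triggered it.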


I am monitoring the memory usage, and the decoder is only using about 4GB of the 32GB allocated when it crashes. Why is the decoder crashing? Are there any recommended settings for training a factored model?



Haukur Páll Jónsson
Rannsóknarsérfræðingur | Tölvunarfræðideild
Research Specialist | School of Computer Science
Póstfang / E-mail: haukurpj@ru.is
Háskólinn í Reykjavík | Reykjavik University
Menntavegur 1 | 101 Reykjavík | Iceland
Sími/Tel: +354 599 6200
www.hr.is


_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support
-------------- next part --------------
A non-text attachment was scrubbed...
Name: moses.ini
Type: application/octet-stream
Size: 1405 bytes
Desc: moses.ini
Url : http://mailman.mit.edu/mailman/private/moses-support/attachments/20200310/5d14c0b4/attachment-0001.obj

------------------------------



End of Moses-support Digest, Vol 161, Issue 4
*********************************************
