Moses-support Digest, Vol 87, Issue 56

Send Moses-support mailing list submissions to
moses-support@mit.edu

To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
moses-support-request@mit.edu

You can reach the person managing the list at
moses-support-owner@mit.edu

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."


Today's Topics:

1. Re: EMS MML IndexError: list index out of range (jian zhang)
2. Re: Sparse features (Hieu Hoang)
3. Re: word alignment-words' indexes and sentences' length (Tom Hoar)


----------------------------------------------------------------------

Message: 1
Date: Fri, 24 Jan 2014 22:27:40 +0000
From: jian zhang <jianzhang09@gmail.com>
Subject: Re: [Moses-support] EMS MML IndexError: list index out of
range
To: Barry Haddow <bhaddow@staffmail.ed.ac.uk>
Cc: moses-support@mit.edu
Message-ID:
<CALA=z0B-EvQDNmjqz2VOinXSyiZN9C=dtPpOVZbAkrfOy=4VvA@mail.gmail.com>
Content-Type: text/plain; charset="iso-8859-1"

Hi Barry,

All the scores are 99999 in that file.

Thanks,


Jian
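
For readers following the thread, here is a minimal Python sketch of why a score file containing only 99999 can trigger this IndexError. It is not the actual mml-filter.py code. It assumes 99999 is a sentinel score whose lines are counted as ignored, and that the retained count is the proportion applied to the remaining lines. Both assumptions are inferred from the log line "Retaining at least 0 entries and ignoring 2075137". The helper names are hypothetical.

```python
def threshold_index(scores, proportion, ignore_score=99999.0):
    """Index into the descending-sorted scores, as in the traceback's
    sorted(..., reverse=True)[ignore_count + count] expression."""
    # Assumption: lines carrying the sentinel score are counted as ignored.
    ignore_count = sum(1 for s in scores if s == ignore_score)
    # Assumption: the proportion applies only to the non-ignored lines.
    count = int(proportion * (len(scores) - ignore_count))
    return ignore_count + count


def pick_threshold(scores, proportion, ignore_score=99999.0):
    """Return the threshold score, or raise a descriptive error instead of
    a bare IndexError when the index runs past the end of the list."""
    index = threshold_index(scores, proportion, ignore_score)
    ranked = sorted(scores, reverse=True)
    if index >= len(ranked):
        raise ValueError(
            "index %d is past the end of %d scores; "
            "are all scores equal to the ignore value?" % (index, len(ranked))
        )
    return ranked[index]


# With some usable scores the index stays inside the list.
healthy = [0.5, -1.2, 99999.0, 0.1, -0.3]
print(pick_threshold(healthy, 0.9))

# If every score is 99999, ignore_count equals the list length and count is 0,
# so the index is one past the last valid position.
degenerate = [99999.0] * 5
print(threshold_index(degenerate, 0.9), len(degenerate))
```

If this reading is right, the filter step is only reporting the problem. The place to look would be the earlier scoring step that wrote 99999 for every line.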


On Fri, Jan 24, 2014 at 3:51 PM, Barry Haddow <bhaddow@staffmail.ed.ac.uk> wrote:

> Hi Jian
>
> This is a bit suspect:
>
>
> 2014-01-24 14:17:26,276 Retaining at least 0 entries and ignoring 2075137
>
> Are the scores in this file sensible (or are they all the same?)
>
> /home/mml/mml-test/experiment/training/corpus-mml-score.1
>
> cheers - Barry
>
>
> On 24/01/14 14:53, jian zhang wrote:
>
>> Hi,
>>
>> I got an IndexError: list index out of range at the
>> TRAINING_mml-filter-before-wa step.
>>
>> I have read the post at
>> https://www.mail-archive.com/moses-support@mit.edu/msg08767.html, but I
>> still cannot figure out what is wrong.
>>
>> The full error is
>>
>> general:strategy = Score
>> general:source_language = fr
>> general:target_language = en
>> general:input_stem = /home/mml/mml-test/experiment/training/corpus.1
>> general:output_stem = /home/mml/mml-test/experiment/training/corpus-mml.1
>> general:domain_file = /home/mml/mml-test/experiment/model/domains.1
>> general:domain_file_out = /home/mml/mml-test/experiment/training/corpus-mml.1
>> score:score_file = /home/mml/mml-test/experiment/training/corpus-mml-score.1
>> score:proportion = 0.9
>>
>> 2014-01-24 14:17:26,276 Retaining at least 0 entries and ignoring 2075137
>> Traceback (most recent call last):
>>   File "/home/tools/mosesdecoder/scripts/ems/support/mml-filter.py", line 156, in <module>
>>     main()
>>   File "/home/tools/mosesdecoder/scripts/ems/support/mml-filter.py", line 111, in main
>>     strategy = strategy_class(config)
>>   File "/home/tools/mosesdecoder/scripts/ems/support/mml-filter.py", line 72, in __init__
>>     [float(line[:-1]) for line in open(self.score_file)], reverse=True)[ignore_count + count]
>> IndexError: list index out of range
>>
>> And my ems configuration file has:
>>
>> #################################################################
>> # PARALLEL CORPUS PREPARATION:
>> # create a tokenized, sentence-aligned corpus, ready for training
>>
>> [CORPUS]
>>
>> #in-domain parallel corpus
>> [CORPUS:in]
>> clean-stem = $training-in-domain-corpus
>>
>> [CORPUS:out]
>> #out-domain parallel corpus
>> clean-stem = $training-out-domain-corpus
>>
>>
>> #################################################################
>> # LANGUAGE MODEL TRAINING
>> [LM]
>> [LM:lm]
>> type = 8
>> lm = $language-model
>> #################################################################
>> # MODIFIED MOORE LEWIS FILTERING
>>
>> [MML]
>>
>> lm-training = $srilm-dir/ngram-count
>> lm-settings = "-interpolate -kndiscount -unk"
>> lm-binarizer = $moses-src-dir/bin/build_binary
>> lm-query = $moses-src-dir/bin/query
>> order = 5
>>
>> ### in-/out-of-domain source/target corpora to train the 4 language models
>> #
>> # in-domain parallel corpus
>> indomain-stem = [CORPUS:in:clean-split-stem]
>>
>> # out-of-domain parallel corpus
>> outdomain-stem = [CORPUS:out:clean-split-stem]
>>
>> # settings: number of lines sampled from the corpora to train each language model on
>> settings = "--line-count 100000"
>>
>> #################################################################
>> # TRANSLATION MODEL TRAINING
>> [TRAINING]
>> script = $moses-script-dir/training/train-model.perl
>> training-options = "-mgiza -mgiza-cpus 12 -sort-buffer-size 16G -sort-compress gzip -sort-parallel 12 -cores 12"
>> parallel = yes
>> alignment-symmetrization-method = grow-diag-final-and
>> lexicalized-reordering = msd-bidirectional-fe
>> score-settings = "--GoodTuring"
>> include-word-alignment-in-rules = yes
>>
>> # space-separated list of all out-of-domain corpora to be filtered
>> mml-filter-corpora = out
>> mml-before-wa = "-proportion 0.9"
>>
>> #####################################################
>>
>> Thanks.
>>
>>
>> Jian Zhang
>>
>>
>> _______________________________________________
>> Moses-support mailing list
>> Moses-support@mit.edu
>> http://mailman.mit.edu/mailman/listinfo/moses-support
>>
>
>
> --
> The University of Edinburgh is a charitable body, registered in
> Scotland, with registration number SC005336.
>
> --
> Jian Zhang
> Centre for Next Generation Localisation (CNGL)<http://www.cngl.ie/index.html>
> Dublin City University <http://www.dcu.ie/>
>
>
>
>

------------------------------

Message: 2
Date: Sat, 25 Jan 2014 07:23:30 +0000
From: Hieu Hoang <hieuhoang@gmail.com>
Subject: Re: [Moses-support] Sparse features
To: Marcin Junczys-Dowmunt <junczys@amu.edu.pl>, moses-support
<moses-support@mit.edu>
Message-ID: <52E36672.2000407@gmail.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed

I suppose you've read the page
http://www.statmt.org/moses/?n=Moses.SparseFeatures

A small, fairly clear feature function that uses sparse features is
TargetBigramFeature.
The main things to note:
1. It has no dense features, as shown by the 0 in the constructor
TargetBigramFeature::TargetBigramFeature(const std::string &line)
:StatefulFeatureFunction(0, line)

2. In Evaluate(), adding a score to the score component is done by
calling
PlusEquals(this, sparse-feature-name, score);
For dense features, it would have been
PlusEquals(this, vector-of-scores);
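
The difference between the two calls can be pictured with a toy model. The Python sketch below is not the Moses API. The class, method names, and feature names are hypothetical and exist only for illustration. It assumes a dense feature function owns a fixed number of score slots, while a sparse one adds to named entries created on demand.

```python
from collections import defaultdict


class ScoreAccumulator:
    """Toy stand-in for a per-hypothesis score collection (not a Moses class)."""

    def __init__(self, dense_sizes):
        # dense_sizes maps a feature-function name to its number of dense
        # scores, playing the role of the 0 passed in the constructor above.
        self.dense = {name: [0.0] * n for name, n in dense_sizes.items()}
        # Sparse scores are keyed by name and default to 0.0 on first use.
        self.sparse = defaultdict(float)

    def plus_equals_dense(self, ff_name, scores):
        """Add a full vector of scores to a feature function's fixed slots."""
        slots = self.dense[ff_name]
        if len(scores) != len(slots):
            raise ValueError(
                "expected %d dense scores, got %d" % (len(slots), len(scores))
            )
        for i, value in enumerate(scores):
            slots[i] += value

    def plus_equals_sparse(self, ff_name, feature_name, score):
        """Add one score to a named sparse feature, creating it if needed."""
        self.sparse[ff_name + "_" + feature_name] += score


acc = ScoreAccumulator({"Distortion": 1, "TargetBigram": 0})
acc.plus_equals_dense("Distortion", [-2.0])
acc.plus_equals_sparse("TargetBigram", "the:house", 1.0)
acc.plus_equals_sparse("TargetBigram", "the:house", 1.0)
print(acc.dense, dict(acc.sparse))
```

In this picture a feature function declared with 0 dense scores owns no fixed slots at all, so everything it contributes goes through the named, sparse path.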

On 21/01/2014 09:12, Marcin Junczys-Dowmunt wrote:
> Hi,
> I am increasingly interested in trying out sparse features in Moses and
> now I am wondering how to approach this. There is not much information
> on the webpage. Can somebody share a nice article about the benefits and
> maybe some basic implementation as a working example?
>
> Thanks,
> Marcin



------------------------------

Message: 3
Date: Sat, 25 Jan 2014 21:08:37 +0700
From: Tom Hoar <tahoar@precisiontranslationtools.com>
Subject: Re: [Moses-support] word alignment-words' indexes and
sentences' length
To: moses-support@mit.edu
Message-ID: <52E3C565.9080105@precisiontranslationtools.com>
Content-Type: text/plain; charset="iso-8859-1"

You might check tokenizer.perl's new argument: -protected. This option
reads simple regex search patterns from a file and protects the patterns
from tokenization. I've never used it so you'll need to study how it works.
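
As a rough illustration of the idea only, and not of how tokenizer.perl implements it, the Python sketch below protects regex matches from a toy tokenizer. It swaps each match for a placeholder, tokenizes, and then restores the original text. The function name, the placeholder format, and the punctuation-splitting rule are all assumptions made for this example.

```python
import re


def tokenize_protected(text, patterns):
    """Split punctuation from words, leaving spans matched by any of the
    regex `patterns` untouched."""
    protected = []

    def stash(match):
        # Remember the matched span and leave a word-like placeholder behind.
        protected.append(match.group(0))
        return " PROTECTED%04d " % (len(protected) - 1)

    if patterns:
        # One combined pass, so a later pattern cannot match inside a placeholder.
        combined = "|".join("(?:%s)" % p for p in patterns)
        text = re.sub(combined, stash, text)

    # Toy tokenizer: put spaces around every non-word, non-space character.
    tokens = re.sub(r"([^\w\s])", r" \1 ", text).split()

    restored = []
    for token in tokens:
        placeholder = re.fullmatch(r"PROTECTED(\d+)", token)
        if placeholder and int(placeholder.group(1)) < len(protected):
            restored.append(protected[int(placeholder.group(1))])
        else:
            restored.append(token)
    return restored


# Hypothetical pattern: keep URLs in one piece.
print(tokenize_protected("See http://www.statmt.org/moses/, ok?",
                         [r"https?://[^\s,]+"]))
```

A real input that already contains a placeholder-like token would be restored incorrectly by this sketch, so a production version would need a placeholder that cannot occur in the data.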



On 01/24/2014 03:58 PM, amir haghighi wrote:
> I use the built-in tokenizer in Moses.
> How can I change this tokenizer? Should I change the source code?
>
> Regards
> Amir
>
>


------------------------------

_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


End of Moses-support Digest, Vol 87, Issue 56
*********************************************
