Moses-support Digest, Vol 113, Issue 30

Send Moses-support mailing list submissions to
moses-support@mit.edu

To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
moses-support-request@mit.edu

You can reach the person managing the list at
moses-support-owner@mit.edu

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."


Today's Topics:

1. Re: Training: ferility limit (Masa Taka)
2. (no subject) (BIRENDRA CHAUHAN SINGH)
3. Re: apostrophe: detokenization or corpus issue ? (Philipp Koehn)
4. Re: (no subject) (Philipp Koehn)


----------------------------------------------------------------------

Message: 1
Date: Thu, 10 Mar 2016 03:21:14 +0000
From: Masa Taka <grantaka36@gmail.com>
Subject: Re: [Moses-support] Training: ferility limit
To: Philipp Koehn <phi@jhu.edu>
Cc: "moses-support@mit.edu" <moses-support@mit.edu>
Message-ID:
<CAEJVhXeCQYqKRq-7jPfb3L4g26C8TYF0dXr4qH0u4S8tbMa6vA@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

Hi Philipp,

I agree, and I will filter out such sentence pairs before training. Thank you.

Regards,
Masataka



On Tue, Mar 8, 2016 at 5:51, Philipp Koehn <phi@jhu.edu> wrote:

> Hi,
>
> it generally makes little sense to align a sentence with 1 word to a
> sentence with 28 words.
>
> It may be possible to change the hard-coded limit of 9 words, but for all
> practical purposes it is better to filter out sentence pairs that
> exceed the default ratio.
>
> -phi
>
> On Mon, Mar 7, 2016 at 4:43 AM, Masa Taka <grantaka36@gmail.com> wrote:
>
>> > the maximum allowed limit for a source word fertility
>> > source length = 1 target length = 28 ratio 28 ferility limit : 9
>>
>> During GIZA++ training, we got warnings like the one above (see the
>> attached .txt file). The following mailing list archive post explains
>> them, so I understand what the warning means and why it appears.
>>
>> https://www.mail-archive.com/moses-support%40mit.edu/msg00603.html
>>
>> Q1) Can we change the preset value "9" to another value?
>> Q2) Why is the initial value set to "9"?
>> Q3) I think the message "ferility limit" should be "fertility limit," so
>> how do I report this? (Or is there a repository that accepts pull
>> requests?)
>>
>> Regards,
>> Masataka
>>
>>
>> _______________________________________________
>> Moses-support mailing list
>> Moses-support@mit.edu
>> http://mailman.mit.edu/mailman/listinfo/moses-support
>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mailman.mit.edu/mailman/private/moses-support/attachments/20160309/cbe4b9b2/attachment-0001.html
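
The filtering suggested in Message 1 can be sketched as follows. This is a
minimal illustration, not the check GIZA++ itself performs: it assumes
whitespace-tokenized sentences and reuses the ratio of 9 from the warning
text. The names within_ratio and filter_pairs are hypothetical. Moses also
ships a corpus-cleaning script (scripts/training/clean-corpus-n.perl) that is
normally used for this step before training.

```python
def within_ratio(src: str, tgt: str, max_ratio: float = 9.0) -> bool:
    """Return True if neither side is more than max_ratio times longer.

    Lengths are counted in whitespace-separated tokens (an assumption).
    """
    n_src, n_tgt = len(src.split()), len(tgt.split())
    if n_src == 0 or n_tgt == 0:
        return False  # drop pairs with an empty side
    return n_src <= max_ratio * n_tgt and n_tgt <= max_ratio * n_src


def filter_pairs(pairs, max_ratio: float = 9.0):
    """Keep only the (source, target) pairs that pass within_ratio."""
    return [(s, t) for s, t in pairs if within_ratio(s, t, max_ratio)]
```

With these definitions, the pair from the warning (source length 1, target
length 28) is dropped, while a 1-word to 9-word pair is kept.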

------------------------------

Message: 2
Date: Thu, 10 Mar 2016 10:27:28 +0530
From: BIRENDRA CHAUHAN SINGH <birendrachauhan3@gmail.com>
Subject: [Moses-support] (no subject)
To: moses-support@mit.edu
Message-ID:
<CAKogDtz9HuU-e+gk0weJe9sWZUWwKG2o_UYMVMrA7bswrp3tOQ@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

I ran these commands:

mkdir ~/working
cd ~/working
nohup nice ~/mosesdecoder/scripts/training/train-model.perl -root-dir train \
-corpus ~/corpus/news-commentary-v8.fr-en.clean \
-f fr -e en -alignment grow-diag-final-and -reordering msd-bidirectional-fe \
-lm 0:3:$HOME/lm/news-commentary-v8.fr-en.blm.en:8 \
-external-bin-dir ~/mosesdecoder/tools >& training.out &


output:
[1] 2217
and this output appears within a second. Training should take hours, so I
think it is not actually running for some reason. The directory
~/working/train/model is not created, so no moses.ini file is created
either.

Can anyone tell me what is happening and why?
Thanks.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mailman.mit.edu/mailman/private/moses-support/attachments/20160309/fe2d6f0c/attachment-0001.html

------------------------------

Message: 3
Date: Thu, 10 Mar 2016 07:00:32 -0500
From: Philipp Koehn <phi@jhu.edu>
Subject: Re: [Moses-support] apostrophe: detokenization or corpus
issue ?
To: Vincent Nguyen <vnguyen@neuf.fr>
Cc: moses-support <moses-support@mit.edu>
Message-ID:
<CAAFADDBt_9vOZTjaX-mH6ALB-K15jV6UrvcL-stsKxZzbWzqAw@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

Hi,

I do not think that the detokenizer would cause conversion of ' to ".
You can check the raw output of the decoder, and see how it is
changed by the detokenizer.

-phi

On Wed, Mar 9, 2016 at 11:44 AM, Vincent Nguyen <vnguyen@neuf.fr> wrote:

> Hi,
>
> I got the following situation:
>
> This group age
> is translated sometimes in:
> ce groupe d'âge (correct)
> ce groupe d" âge (incorrect)
> ce groupe d "âge (incorrect)
>
> I am wondering if this is more a detokenizer issue or a corpus issue, or
> both.
>
> Technically in French, there shouldn't be any space before or after the
> apostrophe.
> In the Europarl Corpus, as well as in the News2014 one, there are some
> instances with a space before or after.
>
> My feeling is that the decoder outputs a &apos; with surrounding
> spaces, which leads the detokenizer to transform it into "
>
> Anyone with a similar issue ?
>
> thanks.
> _______________________________________________
> Moses-support mailing list
> Moses-support@mit.edu
> http://mailman.mit.edu/mailman/listinfo/moses-support
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mailman.mit.edu/mailman/private/moses-support/attachments/20160310/f0dde20d/attachment-0001.html
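
One way to check the corpus side of the question in Message 3 is to list
lines where an apostrophe has whitespace next to it. The sketch below is a
rough heuristic under stated assumptions: it is meant for raw (untokenized)
text, it treats ', ’ and the escaped form &apos; alike, and it will also flag
single quotes used as quotation marks. The name spaced_apostrophes is
hypothetical and not part of Moses.

```python
import re

# An apostrophe (', ’ or &apos;) between two word characters,
# with whitespace on at least one side of it.
SPACED_APOSTROPHE = re.compile(
    r"\w(?:\s+(?:'|’|&apos;)\s*|(?:'|’|&apos;)\s+)\w"
)


def spaced_apostrophes(lines):
    """Return (line_number, line) for every line with a spaced apostrophe."""
    return [
        (i, line.rstrip("\n"))
        for i, line in enumerate(lines, start=1)
        if SPACED_APOSTROPHE.search(line)
    ]
```

Running the same function over the raw decoder output and over the
detokenized output shows at which stage the spacing appears.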

------------------------------

Message: 4
Date: Thu, 10 Mar 2016 07:20:57 -0500
From: Philipp Koehn <phi@jhu.edu>
Subject: Re: [Moses-support] (no subject)
To: BIRENDRA CHAUHAN SINGH <birendrachauhan3@gmail.com>
Cc: "moses-support@mit.edu" <moses-support@mit.edu>
Message-ID:
<CAAFADDDYXSKr1ndVSpku8R3GEbgBVa0+O4W9158Zy4m83KrANw@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

Hi,

it's probably a good idea to use the full path instead of "-root-dir
train". But what is in "training.out"? That should give some clues. Are any
files created?

-phi

On Wed, Mar 9, 2016 at 11:57 PM, BIRENDRA CHAUHAN SINGH <
birendrachauhan3@gmail.com> wrote:

> I ran these commands:
>
> mkdir ~/working
> cd ~/working
> nohup nice ~/mosesdecoder/scripts/training/train-model.perl -root-dir train \
> -corpus ~/corpus/news-commentary-v8.fr-en.clean \
> -f fr -e en -alignment grow-diag-final-and -reordering msd-bidirectional-fe \
> -lm 0:3:$HOME/lm/news-commentary-v8.fr-en.blm.en:8 \
> -external-bin-dir ~/mosesdecoder/tools >& training.out &
>
>
> output:
> [1] 2217
> and this output appears within a second. Training should take hours, so I
> think it is not actually running for some reason. The directory
> ~/working/train/model is not created, so no moses.ini file is created
> either.
>
> Can anyone tell me what is happening and why?
> Thanks.
>
> _______________________________________________
> Moses-support mailing list
> Moses-support@mit.edu
> http://mailman.mit.edu/mailman/listinfo/moses-support
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mailman.mit.edu/mailman/private/moses-support/attachments/20160310/55d48733/attachment.html
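
The checks asked for in Message 4 (what is in training.out, and whether any
files were created) can be sketched as below. This is only an illustration:
inspect_training is a hypothetical helper, not part of Moses, and the
location model/moses.ini under the root directory is taken from the original
question.

```python
from pathlib import Path


def inspect_training(root_dir, log_path, tail_lines=20):
    """Summarise what a training run left behind.

    Returns the resolved root directory, the last lines of the log,
    the paths created under the root, and whether model/moses.ini exists.
    """
    root = Path(root_dir).expanduser().resolve()
    log = Path(log_path).expanduser()

    log_tail = []
    if log.is_file():
        # The end of the log usually holds the error that stopped the run.
        log_tail = log.read_text(errors="replace").splitlines()[-tail_lines:]

    created = []
    if root.is_dir():
        created = sorted(str(p.relative_to(root)) for p in root.rglob("*"))

    return {
        "root_dir": str(root),
        "log_tail": log_tail,
        "created": created,
        "has_moses_ini": (root / "model" / "moses.ini").is_file(),
    }
```

For the run above this would be called as
inspect_training("~/working/train", "~/working/training.out"). The resolved
"root_dir" value is the full path that could be passed to -root-dir instead
of the relative "train".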

------------------------------

_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


End of Moses-support Digest, Vol 113, Issue 30
**********************************************
