Moses-support Digest, Vol 99, Issue 61

Send Moses-support mailing list submissions to
moses-support@mit.edu

To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
moses-support-request@mit.edu

You can reach the person managing the list at
moses-support-owner@mit.edu

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."


Today's Topics:

1. Re: Bulk download of very large files from statmt.org
(Christian Buck)
2. Problem With Factored Training (Haydar İmren)
3. Re: Problem With Factored Training (Haydar İmren)


----------------------------------------------------------------------

Message: 1
Date: Mon, 26 Jan 2015 03:49:54 +0000
From: Christian Buck <cbuck@lantis.de>
Subject: Re: [Moses-support] Bulk download of very large files from
statmt.org
To: moses-support@mit.edu
Message-ID: <54C5B962.7000702@lantis.de>
Content-Type: text/plain; charset=windows-1252; format=flowed

Hi Lane,

Thanks for your interest in large files!

As Marcin said, HTTP using aria2c is the preferred method of download. I
also produced zsync [1] files. zsync works roughly like rsync over HTTP,
using a static metadata file, so it should be able to repair a broken
download or resume a partial one.
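
For readers unfamiliar with resumable downloads, the general idea can be sketched in Python with an HTTP Range request. This is only an illustration of the principle, not how aria2c or zsync are implemented; the function names are hypothetical, and it assumes the server honours Range headers.

```python
import os
import urllib.request


def resume_headers(partial_path):
    """Return HTTP headers asking only for the bytes we do not have yet."""
    have = os.path.getsize(partial_path) if os.path.exists(partial_path) else 0
    # An empty dict means "start from the beginning".
    return {"Range": f"bytes={have}-"} if have else {}


def download_with_resume(url, dest, chunk_size=1 << 20):
    """Fetch `url` into `dest`, continuing from an existing partial file.

    Assumption: if `dest` is already complete, the server will likely
    answer 416 and urlopen raises HTTPError; that case is not handled here.
    """
    request = urllib.request.Request(url, headers=resume_headers(dest))
    with urllib.request.urlopen(request) as response:
        # 206 = server sent only the missing range, so append.
        # Any other success status = full body, so overwrite.
        mode = "ab" if response.status == 206 else "wb"
        with open(dest, mode) as out:
            while True:
                chunk = response.read(chunk_size)
                if not chunk:
                    break
                out.write(chunk)
```

Unlike zsync, this sketch cannot detect or repair corrupted bytes in the part already on disk; it only continues from where the file ends.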

Raw English text is still available here:
http://statmt.org/ngrams/raw_en/ but I may move the files around
occasionally because they clog up our fileservers. If possible, let me
know before downloading so I can make sure they stay in place for that
time.

cheers,
Christian

[1] http://zsync.moria.org.uk/

On 25/01/2015 17:27, Marcin Junczys-Dowmunt wrote:
> Hi,
> I managed to download the 5.5 TB monster using just aria2c, worked
> splendidly. BTW, what happened to the English language data text files?
>
> On 25.01.2015 at 17:06, Lane Schwartz wrote:
>> I'm interested in downloading some of the pre-trained models that are
>> hosted at statmt.org, including the 5.5 TB English language model
>> (http://statmt.org/ngrams/lm/).
>>
>> My University cluster strongly recommends that I make use of Globus
>> Connect (https://www.globus.org/globus-connect-personal) to manage
>> large file transfers. It appears that software takes care of a lot of
>> the details (reconnections, etc) when transferring large files.
>>
>> I was wondering if it might be possible to set up that software (at
>> least on a temporary basis) on whatever server is hosting the statmt
>> files. Who would be the person for me to talk to about that?
>>
>> Thanks,
>> Lane
>> _______________________________________________
>> Moses-support mailing list
>> Moses-support@mit.edu
>> http://mailman.mit.edu/mailman/listinfo/moses-support
>



------------------------------

Message: 2
Date: Mon, 26 Jan 2015 12:06:49 +0100
From: Haydar İmren <haydarimren@gmail.com>
Subject: [Moses-support] Problem With Factored Training
To: moses-support@mit.edu
Message-ID:
<CAP3OHV4ghhFwwA0fBqco=+d5MrjJsr2FjxnJtuxKPP4pLXLQcQ@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

Hi,

I am trying to build a factored translation system for the English-Turkish
language pair, but I have run into some problems. I would be grateful if
you could answer my questions.

1. My parallel data is formatted like:

y+POS|y|POS -----> English side (source)
x+morph|x|morph ----> Turkish side (target)

2. I am training the system with these options:

-translation-factors 0-0+1-1+2-2
-reordering-factors 0,1,2-0,1,2
-generation-factors 0-1,2+1-2+1,2+0
-decoding-steps t0,g0:t1,g1,t2,g2

So the first path translates surface to surface and then generates the
target lemma and morphology from the surface form. The second path
translates lemma to lemma, generates morphology from the lemma, translates
POS to morphology, and finally generates the surface form from lemma and
morphology.

3. I do not know how to add reordering to this setup, so I just added
"-reordering-factors 0,1,2-0,1,2" together with the "-alignment
grow-diag-final-and -reordering msd-bidirectional-fe" options.
Is this right?

4. After training, I try to run testing but immediately get a
segmentation fault.
The segmentation fault occurs when the decoder tries to translate the first
sentence, while calculating options.
I feel I have done something wrong, but I don't know what.

5. Another problem is that if I use generation models, tuning also throws a
segmentation fault, so I cannot tune with MERT. The error occurs while
loading the generation table. What should I do to make it run?
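
One way to sanity-check the factor options from item 2 is a small Python sketch like the one below. It assumes the usual Moses training syntax (each mapping written as `in-out`, mappings joined by `+`, factors joined by `,`, alternative decoding paths separated by `:`); the function names are hypothetical and this is not part of Moses itself. Under that assumption, the last generation mapping as posted, `1,2+0`, does not parse, and `1,2-0` may have been intended, which would match the description of generating the surface form from lemma and morphology. Whether this relates to the segmentation faults is not established.

```python
import re

# Assumed grammar: "in-out" mappings joined by "+", factors joined by ",".
MAPPING = re.compile(r"^\d+(,\d+)*-\d+(,\d+)*$")


def parse_mappings(spec):
    """Split a factor spec into (input_factors, output_factors) pairs."""
    mappings = []
    for part in spec.split("+"):
        if not MAPPING.match(part):
            raise ValueError(f"malformed mapping {part!r} in {spec!r}")
        source, target = part.split("-")
        mappings.append((source.split(","), target.split(",")))
    return mappings


def describe_paths(decoding_steps, translation, generation):
    """Resolve every step (t0, g1, ...) of every decoding path to its mapping."""
    tables = {"t": parse_mappings(translation), "g": parse_mappings(generation)}
    paths = []
    for path in decoding_steps.split(":"):
        steps = []
        for step in path.split(","):
            kind, index = step[0], int(step[1:])
            steps.append((step, tables[kind][index]))
        paths.append(steps)
    return paths


if __name__ == "__main__":
    # The generation spec exactly as posted is rejected by this parser.
    try:
        parse_mappings("0-1,2+1-2+1,2+0")
    except ValueError as error:
        print(error)
    # With the assumed correction "1,2-0", both paths resolve.
    for path in describe_paths("t0,g0:t1,g1,t2,g2",
                               "0-0+1-1+2-2",
                               "0-1,2+1-2+1,2-0"):
        print(path)
```

With the assumed correction, the second path ends in `g2`, which maps factors 1 and 2 (lemma and morphology) to factor 0 (surface), as described in item 2.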

I would be really pleased if you can help me.

Kind regards,
Haydar Imren

------------------------------

Message: 3
Date: Mon, 26 Jan 2015 13:20:37 +0100
From: Haydar İmren <haydarimren@gmail.com>
Subject: Re: [Moses-support] Problem With Factored Training
To: moses-support@mit.edu
Message-ID:
<CAP3OHV6x6gdB5B0VgdonLVvnZGNpkgSDWw7-ay+s1N_KfvNdxA@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

Hi,

I am attaching mert.out and moses.ini. It seems like there is a problem
with one of the generation models, but I cannot understand what it is.

Kind regards,

Haydar Imren

-------------- next part --------------
A non-text attachment was scrubbed...
Name: mert.out
Type: application/octet-stream
Size: 5587 bytes
Desc: not available
Url : http://mailman.mit.edu/mailman/private/moses-support/attachments/20150126/c1112204/attachment.obj
-------------- next part --------------
A non-text attachment was scrubbed...
Name: moses.ini
Type: application/octet-stream
Size: 2139 bytes
Desc: not available
Url : http://mailman.mit.edu/mailman/private/moses-support/attachments/20150126/c1112204/attachment-0001.obj

------------------------------

_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


End of Moses-support Digest, Vol 99, Issue 61
*********************************************
