Send Moses-support mailing list submissions to
moses-support@mit.edu
To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
moses-support-request@mit.edu
You can reach the person managing the list at
moses-support-owner@mit.edu
When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."
Today's Topics:
1. Tokenization (Justin Cunningham)
2. Re: Tokenization (Hieu Hoang)
3. Re: Tokenization (Justin Cunningham)
----------------------------------------------------------------------
Message: 1
Date: Sun, 12 Apr 2020 17:23:06 +0000
From: Justin Cunningham <just1brill@outlook.com>
Subject: [Moses-support] Tokenization
To: "moses-support@mit.edu" <moses-support@mit.edu>
Message-ID:
<DB6PR0601MB2149847F821248D5B88BC64B8CDC0@DB6PR0601MB2149.eurprd06.prod.outlook.com>
Content-Type: text/plain; charset="utf-8"
Hi,
I'm currently working on a Neural Machine Translator but I am quite new to it all. I am trying to tokenise my files in Linux using the following shell script (https://github.com/JustCunn/IrishNMT/blob/master/GaeilgePrepare.sh) and these files:
http://opus.nlpl.eu/download.php?f=EUbookshop/v2/moses/en-ga.txt.zip
http://opus.nlpl.eu/download.php?f=QED/v2.0a/moses/en-ga.txt.zip
But it just won't work. Sometimes it will skip it, others it will just be stuck on the "Tokenizer... number of threads...". For context, they are all plain text files. Am I not formatting the text correctly?
I'd appreciate if someone could help me with this as it would be a huge help in my understanding of it all.
Thanks,
Justin
------------------------------
Message: 2
Date: Sun, 12 Apr 2020 13:20:43 -0700
From: Hieu Hoang <hieuhoang@gmail.com>
Subject: Re: [Moses-support] Tokenization
To: Justin Cunningham <just1brill@outlook.com>
Cc: "moses-support@mit.edu" <moses-support@mit.edu>
Message-ID:
<CAEKMkbhNP6rcRC9+HvSai8FE_vFb0jO5oYkweCzqD7FDG9KtyQ@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"
The Moses tokenizer expects its input on standard input (stdin).
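As an illustration of the point about standard input, here is a minimal Python sketch that connects a corpus file to the tokenizer's stdin and captures stdout in an output file. The checkout path `mosesdecoder/scripts/tokenizer/tokenizer.perl`, the file names, and the helper names `build_command` and `tokenize_file` are assumptions made for this example and do not come from the thread. If the tokenizer is started without anything connected to stdin, it would simply wait for terminal input, which may look like a hang right after its startup banner.

```python
import subprocess
from pathlib import Path

# Assumed location of a Moses checkout; adjust to your own setup.
TOKENIZER = Path("mosesdecoder/scripts/tokenizer/tokenizer.perl")


def build_command(lang, tokenizer=TOKENIZER):
    """Return the argv list for tokenizer.perl.

    No input file name appears here: the text arrives on stdin.
    -l selects the language, -q suppresses the startup banner.
    """
    return ["perl", str(tokenizer), "-l", lang, "-q"]


def tokenize_file(src, dst, command):
    """Run `command` with `src` connected to stdin and `dst` to stdout."""
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        # check=True raises CalledProcessError if the command exits non-zero.
        subprocess.run(command, stdin=fin, stdout=fout, check=True)
```

A call such as `tokenize_file("corpus.en", "corpus.tok.en", build_command("en"))` corresponds to the shell form `perl tokenizer.perl -l en -q < corpus.en > corpus.tok.en`, where `corpus.en` and `corpus.tok.en` are placeholder names.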
Hieu Hoang
http://statmt.org/hieu
On Sun, 12 Apr 2020 at 10:27, Justin Cunningham <just1brill@outlook.com>
wrote:
> Hi,
>
> I'm currently working on a Neural Machine Translator but I am quite new to
> it all. I am trying to tokenise my files in Linux using the following shell
> script (https://github.com/JustCunn/IrishNMT/blob/master/GaeilgePrepare.sh)
> and these files:
>
> http://opus.nlpl.eu/download.php?f=EUbookshop/v2/moses/en-ga.txt.zip
> http://opus.nlpl.eu/download.php?f=QED/v2.0a/moses/en-ga.txt.zip
>
> But it just won't work. Sometimes it will skip it, others it will just be
> stuck on the "Tokenizer... number of threads...". For context, they are all
> plain text files. Am I not formatting the text correctly?
>
> I'd appreciate if someone could help me with this as it would be a huge
> help in my understanding of it all.
>
> Thanks,
> Justin
------------------------------
Message: 3
Date: Sun, 12 Apr 2020 20:39:03 +0000
From: Justin Cunningham <just1brill@outlook.com>
Subject: Re: [Moses-support] Tokenization
To: Hieu Hoang <hieuhoang@gmail.com>
Cc: "moses-support@mit.edu" <moses-support@mit.edu>
Message-ID:
<DB6PR0601MB2149FDD259C5BC003DB579AF8CDC0@DB6PR0601MB2149.eurprd06.prod.outlook.com>
Content-Type: text/plain; charset="us-ascii"
Thanks for replying! It actually ended up being a spelling error in the code.
Thanks,
Justin
------------------------------
_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support
End of Moses-support Digest, Vol 162, Issue 7
*********************************************