Send Moses-support mailing list submissions to
moses-support@mit.edu
To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
moses-support-request@mit.edu
You can reach the person managing the list at
moses-support-owner@mit.edu
When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."
Today's Topics:
1. Moses + BPE ? (Noe Casas)
----------------------------------------------------------------------
Message: 1
Date: Sat, 16 Mar 2019 12:40:10 +0100
From: Noe Casas <noe.casas@gmail.com>
Subject: [Moses-support] Moses + BPE ?
To: moses-support@mit.edu
Message-ID:
<CAM+h5g0E4+GxyFH5hWsB=BdPxPEko5gCkLk1anxow6bN+UBDpA@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"
Dear Moses Community,
I want to train Moses with byte-pair encoding tokenization (BPE,
https://github.com/rsennrich/subword-nmt). I plan to do it "by hand"
without the EMS.
Is there any problem with the idea?
Would it be OK to simply apply BPE after tokenization, truecasing, etc., and
then continue with the rest of the usual steps?
Is there any gotcha I should take into account?
The only potential pitfall I have identified so far is that I have to clean
the corpus with clean-corpus-n.perl after applying BPE, so as not to reach the
maximum fertility of 9 for mgiza.
Any success/failure experiences doing similar stuff are also very welcome.
Thanks,
Noe.
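
The cleaning step mentioned in the message can be illustrated with a minimal
sketch. It is not the clean-corpus-n.perl script itself, only a Python
approximation of the kind of filter it applies: drop sentence pairs that are
empty, too long, or too unbalanced in token count. The function names and the
thresholds (min_len, max_len, max_ratio) are assumptions chosen for
illustration; the ratio of 9 echoes the figure given in the message. The point
is that lengths must be counted on the BPE-segmented text, since segmentation
increases the number of tokens per sentence.

```python
def keep_pair(src, tgt, min_len=1, max_len=80, max_ratio=9.0):
    """Return True if a sentence pair passes the length and ratio checks.

    Lengths are counted in whitespace-separated tokens, so after BPE each
    subword unit (e.g. "un@@") counts as one token.
    All thresholds are illustrative assumptions.
    """
    n_src = len(src.split())
    n_tgt = len(tgt.split())
    # Reject empty or too-short sides.
    if n_src < min_len or n_tgt < min_len:
        return False
    # Reject sides that exceed the maximum length.
    if n_src > max_len or n_tgt > max_len:
        return False
    # Reject pairs where one side is more than max_ratio times the other.
    return n_src <= max_ratio * n_tgt and n_tgt <= max_ratio * n_src


def filter_corpus(pairs, **thresholds):
    """Keep only the (source, target) pairs accepted by keep_pair."""
    return [(s, t) for s, t in pairs if keep_pair(s, t, **thresholds)]


# Hypothetical BPE-segmented sentence pairs.
pairs = [
    ("the un@@ believ@@ able story", "la historia increíble"),
    ("a b c d e f g h i j", "x"),  # 10 tokens vs 1: ratio above 9
    ("", "vacío"),                 # empty source side
]
kept = filter_corpus(pairs)
```

With these example pairs, only the first one survives the filter. Running the
filter before BPE instead could let through pairs that become too long or too
unbalanced once the words are split into subword units.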
------------------------------
End of Moses-support Digest, Vol 149, Issue 10
**********************************************