Send Moses-support mailing list submissions to
moses-support@mit.edu
To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
moses-support-request@mit.edu
You can reach the person managing the list at
moses-support-owner@mit.edu
When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."
Today's Topics:
1. factored tuning time (Tomasz Gawryl)
2. Re: Issue with alignment (gang tang)
3. Re: NPLM with Europarl ? (Rico Sennrich)
4. Re: factored tuning time (Hieu Hoang)
----------------------------------------------------------------------
Message: 1
Date: Thu, 22 Oct 2015 08:28:04 +0200
From: "Tomasz Gawryl" <tomasz.gawryl@skrivanek.pl>
Subject: [Moses-support] factored tuning time
To: <moses-support@mit.edu>
Message-ID: <000301d10c92$cee72b70$6cb58250$@gawryl@skrivanek.pl>
Content-Type: text/plain; charset="us-ascii"
Hi,
I have a question about how long factored tuning takes. How many times
longer does it take than phrase-based tuning?
I'm asking because this is the 7th day and it is still tuning (3.3 million
corpus sentences). Phrase-based tuning took around 3 hours for the same corpus.
Top shows moses using nearly 100% CPU, so the speed is the same.
Regards,
Tomek Gawryl
------------------------------
Message: 2
Date: Thu, 22 Oct 2015 15:56:32 +0800 (CST)
From: "gang tang" <gangtang2014@126.com>
Subject: Re: [Moses-support] Issue with alignment
To: "Per Tunedal" <per.tunedal@operamail.com>
Cc: moses-support@mit.edu
Message-ID: <425338cb.aa18.1508e8ca163.Coremail.gangtang2014@126.com>
Content-Type: text/plain; charset="gbk"
Dear Per,
Thanks for your kind suggestions. I am digging into my data and the source code of giza++ to find out what happened to my precious pair of "sandalo vernice" and "vernice sandal". I will certainly look into how to utilize Hunalign to advance my cause later on.
Thanks again, and best regards,
Gang
At 2015-10-21 22:03:36, "Per Tunedal" <per.tunedal@operamail.com> wrote:
Dear Gang,
I don't know of any tool for word alignment that uses a dictionary. However, Hunalign does sentence alignment with the help of dictionaries.
I have done some promising experiments using dictionaries to clean sentence aligned corpora. I found that:
- dictionaries with domain specific vocabulary are very beneficial
- bad dictionaries, e.g. ones created with GIZA++, are somewhat beneficial
- dictionaries are best used to prevent suspicious sentence pairs from being unduly removed. Using them the other way around, to remove pairs, may discard a lot of good pairs that contain uncommon words.
Yours,
Per Tunedal
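The cleaning strategy Per describes could be sketched roughly as follows. This is only an illustration under stated assumptions: the length-ratio test used to flag "suspicious" pairs, the thresholds, and the function names are all hypothetical, not taken from Hunalign or from Per's experiments. The point is the direction of use: the dictionary rescues flagged pairs and is never used to remove pairs on its own.

```python
def is_suspicious(src, tgt, max_ratio=2.0):
    """Hypothetical heuristic: flag pairs whose token counts differ too much."""
    a, b = len(src.split()), len(tgt.split())
    if a == 0 or b == 0:
        return True
    return max(a, b) / min(a, b) > max_ratio


def dictionary_support(src, tgt, dictionary):
    """Fraction of dictionary-covered source words with a translation in tgt."""
    tgt_tokens = set(tgt.lower().split())
    covered = [w for w in src.lower().split() if w in dictionary]
    if not covered:
        return 0.0
    hits = sum(1 for w in covered if dictionary[w] & tgt_tokens)
    return hits / len(covered)


def clean(pairs, dictionary, max_ratio=2.0, min_support=0.5):
    """Keep unsuspicious pairs, plus suspicious ones the dictionary vouches for."""
    kept = []
    for src, tgt in pairs:
        suspicious = is_suspicious(src, tgt, max_ratio)
        if not suspicious or dictionary_support(src, tgt, dictionary) >= min_support:
            kept.append((src, tgt))
    return kept


# Toy data (invented for illustration): dictionary maps source word -> set of translations.
DICTIONARY = {"sandalo": {"sandal"}, "vernice": {"patent", "vernice"}}
PAIRS = [
    ("sandalo vernice", "vernice sandal"),
    ("sandalo", "a patent leather sandal for hot summer days"),
    ("sandalo", "click here to subscribe to our newsletter"),
]
CLEANED = clean(PAIRS, DICTIONARY)
```

With this toy input, the second pair is flagged by the length ratio but kept because the dictionary finds "sandal" in the target, while the third pair has no dictionary support and is dropped.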
On Fri, Oct 9, 2015, at 13:02, gang tang wrote:
Dear All,
Since there are no answers to my questions, I assume that there are no easy fixes to the alignment problem. However, just out of curiosity, shouldn't there be alignment tools that take lexical considerations into account while aligning a parallel corpus? I mean alignment tools that look up translations for specific words in a domain-specific dictionary during alignment. Is there any reason why this is not an interesting area to explore?
Best Regards,
Gang
At 2015-09-25 19:34:13, "gang tang" <gangtang2014@126.com> wrote:
Dear all,
I have a problem with alignment. I'd greatly appreciate it if anyone could help solve my issue.
I have the following corpus:
"sandalo camufluge" -> "camufluge sandal"
"sandalo daino" -> "daino sandal"
"sandalo madras" -> "madras sandal"
"sandalo vernice" -> "vernice sandal"
The alignment software I used was GIZA++, and the alignment result was always 0-0 1-1, which meant that "sandalo" wasn't aligned with "sandal". After training, phrase.translation.table always had entries such as "sandalo" -> "camufluge", "sandalo" -> "daino", "sandalo" -> "madras", and "sandalo" -> "vernice", but no "sandalo" -> "sandal". Is there any way to solve this problem? Could I add more data so that "sandalo" is aligned with "sandal" and translated as "sandal"? How should I tune the system?
Thanks for your attention,
Gang
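To make the lexical side of this question concrete, here is a minimal sketch of IBM Model 1 EM training on the four pairs quoted above. IBM Model 1 is the first model in the GIZA++ training sequence, but this is not GIZA++ itself: there is no NULL word, no Model 2 to 4 or HMM stage, and no symmetrization, and the function names are made up for illustration. It only shows what pure co-occurrence statistics suggest for this corpus and says nothing about why the actual GIZA++ run produced 0-0 1-1.

```python
from collections import defaultdict

# The four pairs from the message, tokenized: (source words, target words).
CORPUS = [
    (["sandalo", "camufluge"], ["camufluge", "sandal"]),
    (["sandalo", "daino"], ["daino", "sandal"]),
    (["sandalo", "madras"], ["madras", "sandal"]),
    (["sandalo", "vernice"], ["vernice", "sandal"]),
]


def train_model1(pairs, iterations=10):
    """Return t[(e, f)], an estimate of P(target word e | source word f)."""
    t = defaultdict(lambda: 1.0)  # uniform start; only ratios matter
    for _ in range(iterations):
        count = defaultdict(float)
        total = defaultdict(float)
        for f_sent, e_sent in pairs:
            for e in e_sent:
                # E-step: share one unit of count for e among the source words.
                z = sum(t[(e, f)] for f in f_sent)
                for f in f_sent:
                    c = t[(e, f)] / z
                    count[(e, f)] += c
                    total[f] += c
        # M-step: renormalize the expected counts per source word.
        t = defaultdict(float, {(e, f): c / total[f] for (e, f), c in count.items()})
    return t


def viterbi(f_sent, e_sent, t):
    """Link each target word to its most probable source word (source-target format)."""
    links = []
    for j, e in enumerate(e_sent):
        i = max(range(len(f_sent)), key=lambda k: t[(e, f_sent[k])])
        links.append((i, j))
    return " ".join(f"{i}-{j}" for i, j in sorted(links))


T = train_model1(CORPUS)
ALIGNMENT = viterbi(["sandalo", "vernice"], ["vernice", "sandal"], T)  # "0-1 1-0"
```

In this sketch "sandalo" ends up linked to "sandal", because "sandal" is the only target word that co-occurs with "sandalo" in every pair. That suggests the lexical evidence for the desired alignment is present in the data, although a real GIZA++ run involves further models that the sketch does not cover.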
_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support
------------------------------
Message: 3
Date: Thu, 22 Oct 2015 10:57:09 +0100
From: Rico Sennrich <rico.sennrich@gmx.ch>
Subject: Re: [Moses-support] NPLM with Europarl ?
To: "moses-support@mit.edu" <moses-support@mit.edu>
Message-ID: <5628B2F5.5030500@gmx.ch>
Content-Type: text/plain; charset=windows-1252; format=flowed
Hi Vincent,
here are some results on WMT data (not just Europarl):
+0.6 BLEU for English->German
(https://tacl2013.cs.columbia.edu/ojs/index.php/tacl/article/download/510/120)
+/-0 for French->English, +0.7 for English->French,
+0.4 for Finnish->English, +0.4 for English->Finnish
(http://www.statmt.org/wmt15/pdf/WMT13.pdf)
best wishes,
Rico
On 21.10.2015 14:53, Vincent Nguyen wrote:
> Hi,
>
> Before spending obviously a lot of machine time in this, I would like to
> know if someone ran EMS with NPLM on Europarl
> (ie European languages duh ...)
> and if so, what are the results in potential BLEU improvements.
>
>
> alternatively, I spent some time in ASR and saw some major improvements
> WER-wise when switching to DNN.
> Do you guys think it could potentially benefit to MT ?
>
> read this, interesting ....
>
> http://www.wsj.com/articles/businesses-try-to-fix-machine-translation-1444788434
>
> Cheers,
>
> Vincent
> _______________________________________________
> Moses-support mailing list
> Moses-support@mit.edu
> http://mailman.mit.edu/mailman/listinfo/moses-support
>
------------------------------
Message: 4
Date: Thu, 22 Oct 2015 11:21:48 +0100
From: Hieu Hoang <hieuhoang@gmail.com>
Subject: Re: [Moses-support] factored tuning time
To: Tomasz Gawryl <tomasz.gawryl@skrivanek.pl>
Cc: moses-support <moses-support@mit.edu>
Message-ID:
<CAEKMkbii1Ly6SW6SN9Q8cxw+W80rqhq+v-2ovHrvn72ArXG_0Q@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"
Can I have a look at your moses.ini file?
You should be careful trying to use complicated factored models as they can
take a long time to run.
Also, you can use multithreading to make it run faster.
Hieu Hoang
http://www.hoang.co.uk/hieu
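As an illustration of the multithreading suggestion, the sketch below assembles a tuning command that passes a thread count to the decoder through the `--decoder-flags` option of mert-moses.pl. All paths, the working directory name, and the thread count are placeholders assumed for the example, and the helper function is hypothetical; adjust them to your own installation. The command is only built and printed here, not executed.

```python
import shlex


def build_mert_command(mert_script, dev_source, dev_reference, decoder,
                       moses_ini, mertdir, working_dir="mert-work", threads=4):
    """Build an argv list for mert-moses.pl with a multithreaded decoder."""
    return [
        mert_script,
        dev_source,        # tuning set, source side
        dev_reference,     # tuning set, reference translation
        decoder,           # path to the moses binary
        moses_ini,         # configuration of the model being tuned
        "--mertdir", mertdir,
        "--working-dir", working_dir,
        f"--decoder-flags=-threads {threads}",
    ]


# Placeholder paths, assumed for illustration only.
COMMAND = build_mert_command(
    mert_script="mosesdecoder/scripts/training/mert-moses.pl",
    dev_source="dev/tune.src",
    dev_reference="dev/tune.ref",
    decoder="mosesdecoder/bin/moses",
    moses_ini="train/model/moses.ini",
    mertdir="mosesdecoder/bin/",
    threads=8,
)
print(shlex.join(COMMAND))
```

Building the command as a list keeps the decoder flags in a single argument, so it can later be passed to `subprocess.run(COMMAND)` without shell quoting problems.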
On 22 October 2015 at 07:28, Tomasz Gawryl <tomasz.gawryl@skrivanek.pl>
wrote:
> Hi,
>
>
>
> I have a question about how long factored tuning takes. How many times
> longer does it take than phrase-based tuning?
>
> I'm asking because this is the 7th day and it is still tuning (3.3 million
> corpus sentences). Phrase-based tuning took around 3 hours for the same corpus.
>
> Top shows moses using nearly 100% CPU, so the speed is the same.
>
>
>
> Regards,
>
> Tomek Gawryl
>
>
>
> _______________________________________________
> Moses-support mailing list
> Moses-support@mit.edu
> http://mailman.mit.edu/mailman/listinfo/moses-support
>
>
------------------------------
_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support
End of Moses-support Digest, Vol 108, Issue 61
**********************************************