Moses-support Digest, Vol 103, Issue 16

Send Moses-support mailing list submissions to
moses-support@mit.edu

To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
moses-support-request@mit.edu

You can reach the person managing the list at
moses-support-owner@mit.edu

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."


Today's Topics:

1. Re: Europarl monolingual corpus (joerg)
2. Re: Fwd: Fwd: Server development (Tomas Fulajtar)


----------------------------------------------------------------------

Message: 1
Date: Wed, 6 May 2015 09:18:56 +0200
From: joerg <tiedeman@gmail.com>
Subject: Re: [Moses-support] Europarl monolingual corpus
To: Hieu Hoang <hieuhoang@gmail.com>
Cc: moses-support <moses-support@mit.edu>
Message-ID: <6DB67BBB-F960-48B0-932D-5EC4625DC91E@gmail.com>
Content-Type: text/plain; charset="iso-8859-1"


Go to http://opus.lingfil.uu.se
Select the language pair you're interested in and click on the language IDs in the column "mono" (or "raw" next to it) to download the data you would like to use. Europarl is version 7 in the list. You can also take them from here:
http://opus.lingfil.uu.se/Europarl/mono/

Best,
Jörg

**********************************************************************************
Jörg Tiedemann http://stp.lingfil.uu.se/~joerg/



On May 6, 2015, at 7:25 AM, Hieu Hoang wrote:

> ah thx. The tgz file only has data for a subset of the languages.
>
> It would be useful to be able to download them all, or at least know how to extract them from the raw data.
>
>
>
> Hieu Hoang
> Researcher
> New York University, Abu Dhabi
> http://www.hoang.co.uk/hieu
>
> On 6 May 2015 at 03:04, Ulrich Germann <ulrich.germann@gmail.com> wrote:
> Extract it from commoncrawl, of course! ;-)
>
> ... or get it here: http://www.statmt.org/wmt13/training-monolingual-europarl-v7.tgz
>
> - Uli
>
> On Mon, May 4, 2015 at 5:46 AM, Hieu Hoang <hieuhoang@gmail.com> wrote:
> What's the easiest way to get the single-language data from the Europarl corpus as described in the 1st table in:
> http://statmt.org/europarl/
>
> I tried downloading the xml source
> http://statmt.org/europarl/v7/europarl.tgz
> stripping the xml and running split-sentence.perl, but this takes an unfathomably long time
>
> Hieu Hoang
> Researcher
> New York University, Abu Dhabi
> http://www.hoang.co.uk/hieu
>
> _______________________________________________
> Moses-support mailing list
> Moses-support@mit.edu
> http://mailman.mit.edu/mailman/listinfo/moses-support
>
>
>
>
> --
> Ulrich Germann
> Senior Researcher
> School of Informatics
> University of Edinburgh
>
> _______________________________________________
> Moses-support mailing list
> Moses-support@mit.edu
> http://mailman.mit.edu/mailman/listinfo/moses-support
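
The XML-stripping step described in the quoted message above could be sketched roughly as follows. This is only an illustration, not the procedure anyone in the thread actually used: it assumes that markup in the Europarl source files sits on lines of its own as a single tag (for example <CHAPTER ...>, <SPEAKER ...> or <P>), and the name strip_markup is hypothetical.

```python
import re
from typing import Iterable, Iterator

# Assumption: a markup line consists of exactly one tag and nothing else,
# e.g. <CHAPTER ID=1>, <SPEAKER ID=1 NAME="...">, <P>.
_MARKUP_LINE = re.compile(r"^<[^>]*>$")


def strip_markup(lines: Iterable[str]) -> Iterator[str]:
    """Yield non-empty text lines, skipping lines that are a single tag."""
    for line in lines:
        text = line.strip()
        if not text or _MARKUP_LINE.match(text):
            continue
        yield text
```

Because the function is a generator, it processes a file line by line without loading it into memory, and its output can be written out and passed to the sentence splitter. It does not make the sentence-splitting step itself any faster.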


------------------------------

Message: 2
Date: Wed, 6 May 2015 07:38:51 +0000
From: Tomas Fulajtar <TomasFu@moravia.com>
Subject: Re: [Moses-support] Fwd: Fwd: Server development
To: Barry Haddow <bhaddow@inf.ed.ac.uk>, Hieu Hoang
<hieuhoang@gmail.com>, moses-support <moses-support@mit.edu>
Message-ID:
<BY1PR0201MB0965AF43E5EB5B92E76DCA64ADD00@BY1PR0201MB0965.namprd02.prod.outlook.com>

Content-Type: text/plain; charset="utf-8"

Hi Barry,

Thanks for the explanation - I was referring to the "[Moses-support] mosesserver parallelization issue" thread <https://www.mail-archive.com/search?l=moses-support@mit.edu&q=subject:%22%5C%5BMoses%5C-support%5C%5D+mosesserver+parallelization+issue%22&o=newest> from April 2014.

Do you think the thread pool would also be the solution for this issue, which was reported on the 2.1.1 branch?

Thanks,

Tomas


From: Barry Haddow [mailto:bhaddow@inf.ed.ac.uk]
Sent: Tuesday, May 5, 2015 9:27 PM
To: Hieu Hoang; moses-support
Subject: Re: [Moses-support] Fwd: Fwd: Server development

Hi Tomas

There were some issues in v2 with the way that caching was done in the binarised phrase table. It used a cache per thread, and mosesserver used a thread per request, so caching was effectively broken in the server. Since last autumn, mosesserver uses a thread pool ... and the binarised phrase table is gone now anyway.

cheers - Barry
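
To illustrate why a cache per thread combined with a thread per request defeats caching, here is a minimal Python sketch. It is not Moses code: the lookup function, the dictionary cache, and the uppercase "translation" are hypothetical stand-ins for a phrase-table lookup. It only shows the general effect described above, namely that a fresh thread always starts with an empty cache, while a long-lived pool thread keeps its cache between requests.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for a per-thread phrase-table cache.
_local = threading.local()
_stats_lock = threading.Lock()


def lookup(phrase, stats):
    """Look up `phrase` in the calling thread's private cache and count hits/misses."""
    cache = getattr(_local, "cache", None)
    if cache is None:
        cache = _local.cache = {}
    hit = phrase in cache
    if not hit:
        cache[phrase] = phrase.upper()  # placeholder for an expensive lookup
    with _stats_lock:
        stats["hits" if hit else "misses"] += 1


def thread_per_request(phrases):
    """Serve every request on a fresh thread, so every cache starts empty."""
    stats = {"hits": 0, "misses": 0}
    for phrase in phrases:
        worker = threading.Thread(target=lookup, args=(phrase, stats))
        worker.start()
        worker.join()
    return stats


def thread_pool(phrases, workers=1):
    """Serve requests on long-lived pool threads, whose caches persist."""
    stats = {"hits": 0, "misses": 0}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(lambda p: lookup(p, stats), phrases))
    return stats
```

Sending the same phrase several times through thread_per_request produces only misses, whereas thread_pool with a single worker misses once and then hits the cache on every later request.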



On 05/05/15 18:27, Hieu Hoang wrote:

What limitations are you referring to?
---------- Forwarded message ----------
From: "Ulrich Germann" <ulrich.germann@gmail.com>
Date: 5 May 2015 19:49
Subject: [Moses-support] Fwd: Server development
To: "moses-support@mit.edu" <moses-support@mit.edu>
Cc:

This response was meant to go to moses-support as well, Tomas.

---------- Forwarded message ----------
From: Tomas Fulajtar <TomasFu@moravia.com>
Date: Fri, Apr 3, 2015 at 5:03 PM
Subject: RE: [Moses-support] Server development
To: "ugermann@inf.ed.ac.uk" <ugermann@inf.ed.ac.uk>

Hi Ulrich,

Thanks for the thorough explanation - the idea of merging the server code back to moses is great.
Apart from this (and I know it is a huge workload), were there any changes in the thread support? I know this part had some limitations - as discussed on the forum.

Kind regards,
Tomas


From: Ulrich Germann [mailto:ulrich.germann@gmail.com]
Sent: Thursday, April 2, 2015 12:57 AM
To: Tomas Fulajtar
Subject: Re: [Moses-support] Server development

Hi Tomas,

the plan is to fold server capabilities into the main moses executable. In fact, that has already happened (in the sense that you can run the main moses executable in server mode), but functional equivalence with the old code has not been tested.

There are currently no server tests included in the regression tests, so I left the old code mostly intact (adjusting only for changes in the API of functions called) for legacy reasons, but adding new functionality to mosesserver is extremely strongly DIScouraged.

Supplying regression tests for server functionality, on the other hand, is equally strongly ENcouraged. In a nutshell, what you get back from calling mosesserver and moses --server should be identical.

The long-term plan is to offer through RPC calls (almost) everything that moses offers in batch mode (i.e., send search and output parameters through JSON/RPC calls and have them noticed and respected). Notice the "long-term" there.

So mosesserver is on its way out, and moses --server-port=<port> --server will replace the old call to mosesserver.

Best regards - Uli
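
A regression check of the kind encouraged above, comparing what the old and the new server return for the same input, might look like the sketch below. It is an illustration under stated assumptions rather than an existing Moses test: it assumes both servers expose an XML-RPC method named translate that accepts a struct with a "text" field and returns a struct with a "text" field, and the helper names xmlrpc_translator and find_mismatches are hypothetical.

```python
import xmlrpc.client
from typing import Callable, Iterable, List, Tuple

Translate = Callable[[str], str]


def xmlrpc_translator(url: str) -> Translate:
    """Build a translate function backed by an XML-RPC endpoint.

    Assumption: the server exposes a `translate` method that takes a struct
    with a "text" field and returns a struct with a "text" field.
    """
    proxy = xmlrpc.client.ServerProxy(url)

    def translate(sentence: str) -> str:
        return proxy.translate({"text": sentence})["text"]

    return translate


def find_mismatches(
    old: Translate, new: Translate, sentences: Iterable[str]
) -> List[Tuple[str, str, str]]:
    """Return (input, old_output, new_output) wherever the two outputs differ."""
    mismatches = []
    for sentence in sentences:
        old_out, new_out = old(sentence), new(sentence)
        if old_out != new_out:
            mismatches.append((sentence, old_out, new_out))
    return mismatches
```

With mosesserver and moses --server running on two different ports (the ports and the endpoint path would have to match your setup), one translator can be built per server URL and both passed to find_mismatches together with a list of test sentences. An empty result means the two servers agreed on every input.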

On Wed, Apr 1, 2015 at 9:48 AM, Tomas Fulajtar <TomasFu@moravia.com> wrote:
Dear all,

I have noticed numerous commits in the server-side development - could the developers share the news/goals with the forum? I think it might be interesting for more users - especially those outside the core team.

Thank you,

Tomáš Fulajtár | Researcher
T: +420-545-552-340
tomasfu@moravia.com | moravia.com <http://www.moravia.com/> | Skype: tomasfulajtar


_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support



--
Ulrich Germann
Senior Researcher
School of Informatics
University of Edinburgh

_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


-------------- next part --------------
An embedded message was scrubbed...
From: Hieu Hoang <Hieu.Hoang@ed.ac.uk>
Subject: Re: [Moses-support] mosesserver parallelization issue
Date: Mon, 16 Jun 2014 20:13:33 +0100
Size: 4995
Url: http://mailman.mit.edu/mailman/private/moses-support/attachments/20150506/92dbbce2/attachment.eml

------------------------------

_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


End of Moses-support Digest, Vol 103, Issue 16
**********************************************