Moses-support Digest, Vol 112, Issue 34

Send Moses-support mailing list submissions to
moses-support@mit.edu

To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
moses-support-request@mit.edu

You can reach the person managing the list at
moses-support-owner@mit.edu

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."


Today's Topics:

1. Training recaser in EMS (Jeremy Gwinnup)
2. Is ProcessLexicalTableMin multi threads ? (Vincent Nguyen)
3. Re: Is ProcessLexicalTableMin multi threads ?
(Marcin Junczys-Dowmunt)
4. Re: Is ProcessLexicalTableMin multi threads ?
(Marcin Junczys-Dowmunt)
5. Re: Is ProcessLexicalTableMin multi threads ? (Vincent Nguyen)
6. Re: Is ProcessLexicalTableMin multi threads ?
(Marcin Junczys-Dowmunt)
7. 2nd CfP EAMT 2016: 19th Annual Conference of the European
Association for Machine Translation (Antonio Toral)


----------------------------------------------------------------------

Message: 1
Date: Wed, 17 Feb 2016 16:17:14 -0500
From: Jeremy Gwinnup <jeremy@gwinnup.org>
Subject: [Moses-support] Training recaser in EMS
To: moses-support@mit.edu
Message-ID: <C9217E64-D073-4278-8917-A6C808053A9E@gwinnup.org>
Content-Type: text/plain; charset=utf-8

Hi,

I'm using EMS to train up a system that has multiple input corpora and language models - I'd like to be able to specify all of these data sources when training a recaser (currently I only have 'tokenized = [LM:foo:tokenized-corpus]' in my config). Is there an easy way to consolidate all of my target-language input sources?

Thanks!
-Jeremy
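
One possible workaround, sketched here under assumptions rather than taken from any reply in this digest: concatenate the tokenized target-language files into a single file and point the recaser's 'tokenized' setting at the result. Whether EMS offers a built-in way to do this is not answered in the thread. The function name `concatenate_corpora` and all file names are hypothetical, and the sketch assumes plain UTF-8 text files with one sentence per line.

```python
from pathlib import Path
from typing import Iterable


def concatenate_corpora(sources: Iterable[Path], destination: Path) -> int:
    """Copy every line of each source file into destination; return the line count."""
    total_lines = 0
    with destination.open("w", encoding="utf-8") as out:
        for source in sources:
            with source.open("r", encoding="utf-8") as handle:
                for line in handle:
                    # Guarantee a trailing newline so the last sentence of one
                    # file is never glued to the first sentence of the next.
                    out.write(line if line.endswith("\n") else line + "\n")
                    total_lines += 1
    return total_lines
```

A call such as `concatenate_corpora([Path("news.tok.en"), Path("europarl.tok.en")], Path("recaser-input.tok.en"))` (placeholder file names) would produce one file that a single 'tokenized' entry could reference.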


------------------------------

Message: 2
Date: Wed, 17 Feb 2016 22:44:35 +0100
From: Vincent Nguyen <vnguyen@neuf.fr>
Subject: [Moses-support] Is ProcessLexicalTableMin multi threads ?
To: moses-support <moses-support@mit.edu>
Message-ID: <56C4E9C3.6050004@neuf.fr>
Content-Type: text/plain; charset=utf-8; format=flowed


I have the feeling it's not.


------------------------------

Message: 3
Date: Wed, 17 Feb 2016 23:07:37 +0100
From: Marcin Junczys-Dowmunt <junczys@amu.edu.pl>
Subject: Re: [Moses-support] Is ProcessLexicalTableMin multi threads ?
To: moses-support@mit.edu
Message-ID: <56C4EF29.9090801@amu.edu.pl>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed

It is, just not very well done. It generally does not make sense to have
more than 8-10 threads. That should however be somewhat faster than only
a single thread.

On 17.02.2016 22:44, Vincent Nguyen wrote:
> I have the feeling it's not.



------------------------------

Message: 4
Date: Wed, 17 Feb 2016 23:16:27 +0100
From: Marcin Junczys-Dowmunt <junczys@amu.edu.pl>
Subject: Re: [Moses-support] Is ProcessLexicalTableMin multi threads ?
To: moses-support@mit.edu
Message-ID: <56C4F13B.2050100@amu.edu.pl>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed

I just checked, it's really weirdly slow now. Apparently using more than
4 threads is a bad idea. But 4 threads seems to be about 2 times faster
than just one. I remember that used to work better. Maybe because I
haven't tcmalloc linked?
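
A minimal sketch for measuring this kind of scaling on your own machine, assuming Python 3. It times one run of a command per thread count and reports the speedup relative to a single thread. The helper names `time_command`, `benchmark` and `speedups` are made up for illustration, and the command in the main block is a stand-in that does nothing, so the script runs anywhere.

```python
import subprocess
import sys
import time
from typing import Callable, Dict, Iterable, List


def time_command(command: List[str]) -> float:
    """Run command to completion and return its wall-clock time in seconds."""
    start = time.perf_counter()
    subprocess.run(command, check=True)
    return time.perf_counter() - start


def benchmark(make_command: Callable[[int], List[str]],
              thread_counts: Iterable[int]) -> Dict[int, float]:
    """Time one run per thread count; make_command builds the argv for a count."""
    return {n: time_command(make_command(n)) for n in thread_counts}


def speedups(timings: Dict[int, float], baseline: int = 1) -> Dict[int, float]:
    """Express each timing as a speedup relative to the baseline thread count."""
    reference = timings[baseline]
    return {n: reference / seconds for n, seconds in timings.items()}


if __name__ == "__main__":
    # Stand-in command (ignores n); replace it with the real tool invocation.
    def dummy(n: int) -> List[str]:
        return [sys.executable, "-c", "pass"]

    results = benchmark(dummy, [1, 2, 4, 8])
    for n, factor in sorted(speedups(results).items()):
        print(f"{n} threads: {results[n]:.2f}s, speedup x{factor:.2f}")
```

For the tool discussed here, `make_command` might return something like `["processLexicalTableMin", "-in", "reordering-table.gz", "-out", "reordering-table", "-threads", str(n)]`. The file names are placeholders and the option spelling is an assumption that should be checked against the binary's usage message. A result matching the report above would show a speedup of roughly 2 at 4 threads.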

On 17.02.2016 23:07, Marcin Junczys-Dowmunt wrote:
> It is, just not very well done. It generally does not make sense to have
> more than 8-10 threads. That should however be somewhat faster than only
> a single thread.



------------------------------

Message: 5
Date: Thu, 18 Feb 2016 08:59:57 +0100
From: Vincent Nguyen <vnguyen@neuf.fr>
Subject: Re: [Moses-support] Is ProcessLexicalTableMin multi threads ?
To: moses-support@mit.edu
Message-ID: <56C579FD.6010407@neuf.fr>
Content-Type: text/plain; charset=windows-1252; format=flowed


Yeah, but then if in EMS we want to use processPhraseTableMin with 8 threads
and processLexicalTableMin with 4 threads, that is difficult, right?

Just letting you know: with 8 threads, processLexicalTableMin seems
to run with 1 thread only.



On 17/02/2016 23:16, Marcin Junczys-Dowmunt wrote:
> I just checked, it's really weirdly slow now. Apparently using more than
> 4 threads is a bad idea. But 4 threads seems to be about 2 times faster
> than just one. I remember that used to work better. Maybe because I
> haven't tcmalloc linked?


------------------------------

Message: 6
Date: Thu, 18 Feb 2016 09:15:22 +0100
From: Marcin Junczys-Dowmunt <junczys@amu.edu.pl>
Subject: Re: [Moses-support] Is ProcessLexicalTableMin multi threads ?
To: moses-support@mit.edu
Message-ID: <56C57D9A.4040400@amu.edu.pl>
Content-Type: text/plain; charset=windows-1252; format=flowed

Again just badly written multithreading. It is still much faster than
processPhraseTableMin, isn't EMS running them in parallel or something?
(I don't use EMS).

On 18.02.2016 08:59, Vincent Nguyen wrote:
> Yeah, but then if in EMS we want to use processPhraseTableMin with 8 threads
> and processLexicalTableMin with 4 threads, that is difficult, right?
>
> Just letting you know: with 8 threads, processLexicalTableMin seems
> to run with 1 thread only.




------------------------------

Message: 7
Date: Thu, 18 Feb 2016 11:52:27 +0000
From: Antonio Toral <atoral@computing.dcu.ie>
Subject: [Moses-support] 2nd CfP EAMT 2016: 19th Annual Conference of
the European Association for Machine Translation
To: atoral@computing.dcu.ie, apertium-stuff@lists.sourceforge.net,
moses-support@mit.edu
Message-ID: <56C5B07B.9080208@computing.dcu.ie>
Content-Type: text/plain; charset=utf-8; format=flowed

**************************************************************************
Second Call for Papers
EAMT 2016, Riga, Latvia - http://eamt2016.tilde.com/
19th Conference of the European Association for Machine Translation
**************************************************************************


***** New in this second call for papers ******
1. Proceedings will be published as a special issue of the Baltic
Journal of Modern Computing, an open access journal indexed by Thomson
Reuters Web of Science Core Collection. See more below under "Publications".
2. Selected papers will be published (in an extended version) in the
Machine Translation journal. See more below under "Publications".
3. Students of Computational Linguistics and Natural Language Processing
in the Baltic countries can avail of a reduced registration fee. Contact
local organisers for details.



The European Association for Machine Translation (EAMT,
http://www.eamt.org) invites everyone interested in machine translation,
translation-related tools and resources to participate in this
conference - developers, researchers, users, professional translators
and translation/localisation managers: anyone who has a stake in the
vision of an information world in which language barriers and issues
become less visible to the information consumer. We especially invite
researchers to describe the state of the art and demonstrate their
cutting-edge results, and professional MT users to share their experiences.

EAMT 2016, the 19th Annual Conference of the European Association for
Machine Translation, will be held in Riga, Latvia, from May 30th to June
1st, 2016.

We expect to receive manuscripts in these three categories:


-------------------
(R) Research papers
-------------------

Long-paper submissions (12 pages) are invited for reports of significant
research results in any aspect of machine translation and related areas.
Such reports should include a substantial evaluation component, or have
a strong theoretical and/or methodological contribution where results
and in-depth evaluations may not be appropriate. Papers are welcome on
all topics in the areas of machine translation and translation-related
technologies, including:

* Advances in various MT paradigms: data-driven, rule-based, and hybrid
approaches
* Technologies for MT deployment: quality estimation, domain adaptation,
etc.
* MT in special settings: low resources, massive resources, high volume,
low computing resources
* MT applications: translation/localisation aids, speech-to-speech,
speech-to-text, OCR, MT for user generated content (blogs, social
networks), etc.
* Linguistic resources for MT: dictionaries, terminology, corpora, etc.
* MT evaluation techniques and evaluation results
* Human factors in MT and user interfaces
* Related multilingual technologies: natural language generation,
information retrieval, text categorisation, text summarisation,
information extraction, etc.

Papers should describe original work. They should emphasise completed
work rather than intended work, and should indicate clearly the state of
completion of the reported results. Where appropriate, concrete
evaluation results should be included.


----------------
(U) User studies
----------------

Short-paper submissions (3-6 pages) are invited for reports on users'
experiences with MT, be it in small or medium size business (SMB),
enterprise, government, or NGOs. Contributions are welcome on:

* Integrating MT and computer-assisted translation into a translation
production workflow (e.g. transforming terminology glossaries into MT
resources, optimizing TM/MT thresholds, mixing online and offline tools,
using interactive MT, dealing with MT confidence scores);
* Use of MT to improve translation or localisation workflows (e.g.
reducing turnaround times, improving translation consistency, increasing
the scope of globalisation projects);
* Managing change when implementing and using MT (e.g. switching between
multiple MT systems, limiting degradations when updating or upgrading an
MT system);
* Implementing open-source MT in the SMB or enterprise (e.g. strategies
to get support, reports on taking pilot results into full deployment,
examples of advanced customisation sought and obtained thanks to the
open-source paradigm, collaboration within open-source MT projects);
* Evaluation of MT in a real-world setting (e.g. error detection
strategies employed, metrics used, productivity or translation quality
gains achieved);
* Post-editing strategies and tools (e.g. limitations of traditional
translation quality assurance tools, challenges associated with
post-editing guidelines);
* Legal issues associated with MT, especially MT in the cloud (e.g.
copyright, privacy);
* Use of MT in social networking or real-time communication (e.g.
enterprise support chat, multilingual content for social media);
* Use of MT to process multilingual content for assimilation purposes
(e.g. cross-lingual information retrieval, MT for e-discovery or spam
detection, MT for highly dynamic content);
* Use of standards for MT.

Papers should highlight problems and solutions and not merely describe
MT integration process or project settings. Where solutions do not seem
to exist, suggestions for MT researchers and developers should be
clearly emphasised. For user papers produced by academics, we require
co-authorship with the actual users.


-------------------------------
(P) Project/Product description
-------------------------------

Abstract submissions (1 page) are invited to report new, interesting:

* Tools for machine translation, computer aided translation, and the
like (including commercial products and open-source software). The
authors should be ready to present the tools in the form of demos or
posters during the conference.
* Research projects related to machine translation. The authors should
be ready to present the projects in the form of posters during the
conference. This follows on from the successful "project villages" held
at the last EAMT conferences.


---------
Programme
---------

The programme will include oral presentations and poster sessions.
Accepted papers may be assigned to an oral or poster session, but no
differentiation will be made in the conference proceedings.


---------------
Important Dates
---------------

* Paper submission: March 25th, 2016
* Notification to authors: April 22nd, 2016
* Camera-ready deadline: May 2nd, 2016
* Conference: May 30th-June 1st, 2016


------------
Publications
------------

The conference proceedings will be published as a special issue of the
Baltic Journal of Modern Computing (BJMC, http://www.bjmc.lu.lv/), a
scholarly open access electronic quarterly journal, which is indexed by
Thomson Reuters Web of Science Core Collection (Emerging Sources
Citation Index), EBSCO, ProQuest, Directory of Open Access Journals
(DOAJ), Google Scholar, VINITI, Directory of Research Journal Indexing
(DRJI) and Open J-Gate, and has applied to be indexed in Scopus.

In addition, the best accepted papers will be selected to be published,
in an extended version, and with a lighter reviewing process, as regular
papers in the Springer Machine Translation journal
(http://link.springer.com/journal/10590).


-----------
Submissions
-----------

Submissions will be judged on correctness, originality, technical
strength, significance and relevance to the conference, and potential
interest to all attendees. They should mostly contain new material that
has not been presented at any other meeting with publicly available
proceedings.

EAMT 2016 will use electronic submission through the EasyChair
conference tool. To submit a paper, go to the submission website at:
https://easychair.org/conferences/?conf=eamt2016 and follow the
instructions.

Papers that are being submitted in parallel to other conferences or
workshops and papers that contain significant overlap with previously
published work must indicate this on the title page and in the abstract
submitted through EasyChair (using capital letters). In case of
acceptance the paper will only be included in the proceedings if it is
not published in any other conference or workshop to which it was submitted.

Papers should be anonymised (no authors, affiliations or addresses, and
no explicit self-references), be no longer than 12 pages (A4 size) for
research papers, and no longer than 6 pages (A4 size) for user papers,
all in PDF format. Papers must conform to the format defined by the BJMC
template: http://www.bjmc.lu.lv/instructions-to-authors/.
Project/product descriptions do not need to be anonymised and should use
the 1-page template given in
http://eamt2016.tilde.com/sites/eamt2016.tilde.com/files/eamt-2016-product-template.doc.

For further information about this call for papers or if you encounter
any problem regarding submission please contact the track chairs at
eamt2016chairs@tilde.com and put in the subject "[user]" or "[research]"
depending on which track your question is related to. For questions
about the organisation (venue, registration, accommodation, visa,
payments, etc.) please contact the local organisers at eamt2016@tilde.com.


----------------------
EAMT Best Thesis Award
----------------------

The EAMT Best Thesis Award for PhD theses submitted during 2015 will be
awarded at the conference, together with a presentation of the winner's
work. Information for candidates to the award is available at:
http://www.eamt.org/news/news_best_thesis2015.php. The deadline is the
same as for the paper submission.


------------------
Conference website
------------------

Please visit the conference web page (http://eamt2016.tilde.com/) for
the most up-to-date information about the calendar, the call for papers
and formatting requirements, the programme, invited speakers, related
conference activities, the venue, travel and registration.


---------------------
Conference organisers
---------------------

General Chair: Mikel Forcada (Universitat d'Alacant)

Track Chairs:
* Antonio Toral, Research programme chair (Dublin City University, Ireland)
* Tony O'Dowd, User programme co-chair (KantanMT, Ireland)
* Alexandru Ceausu, User programme co-chair (Amplexor, Luxembourg)

Local Organisation Chair: Andrejs Vasiļjevs (Tilde, Latvia)

Local host: Juris Borzovs (University of Latvia)





------------------------------

_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


End of Moses-support Digest, Vol 112, Issue 34
**********************************************
