Moses-support Digest, Vol 107, Issue 3

Send Moses-support mailing list submissions to
moses-support@mit.edu

To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
moses-support-request@mit.edu

You can reach the person managing the list at
moses-support-owner@mit.edu

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."


Today's Topics:

1. Re: g++: error: unrecognized command line option
'-no-cpp-precomp' (Hieu Hoang)
2. Re: Failure to Open Output when using Chart Decoder
(Rico Sennrich)
3. Re: clarification CBPT vs MMSAPT (Ulrich Germann)


----------------------------------------------------------------------

Message: 1
Date: Tue, 1 Sep 2015 15:21:33 +0300
From: Hieu Hoang <hieuhoang@gmail.com>
Subject: Re: [Moses-support] g++: error: unrecognized command line
option '-no-cpp-precomp'
To: Joerg Tiedemann <tiedeman@gmail.com>
Cc: moses-support <moses-support@mit.edu>
Message-ID:
<CAEKMkbjeu8+m1juvYBqC8H4=MbF269SB46b=8U-rcEHQTuighw@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

My advice on OS X is: don't install GCC. Clang is the ordained compiler now;
with GCC you'll be fighting Apple every step of the way. Don't think different!

Hieu Hoang
Sent while bumping into things
On 31 Aug 2015 5:14 pm, "Jorg Tiedemann" <tiedeman@gmail.com> wrote:

>
> Well, I have /opt/local/ search paths in various environment variables to
> get macports to work.
> I deleted all these paths and tried again, but I still get the same problem.
>
> I am confused. And why is gcc not working anymore when installed via
> macports? I also installed boost with macports. Is that a problem as well?
>
> I also have some problems with kenlm, but part of it compiles and links
> fine. build_binary and query seem to compile fine, but lmplz does not link
> because of some undefined symbols:
> Undefined symbols for architecture x86_64:
>
> "boost::program_options::value_semantic_codecvt_helper<char>::parse(boost::any&,
> std::vector<std::__cxx11::basic_string<char, std::char_traits<char>,
> std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char,
> std::char_traits<char>, std::allocator<char> > > > const&, bool) const",
> referenced from:
> ...
>
> I also had to link /opt/local/lib to /opt/local/lib64 (which didn't exist
> in my setup).
> I am afraid that I started to make quite a mess on my system but what did
> I do wrong?
>
> Is macports not working properly anymore?
> As I said, I have gcc 5.2.0 and boost 1.59.0 via macports on my system. Is
> that bad?
>
> Thanks for helping!
> Jörg
>
>
>
>
> On 31 Aug 2015, at 16:19, Hieu Hoang <hieuhoang@gmail.com> wrote:
>
> The errors for clang look like they're coming from the STL. Have you
> fiddled with the PATH variable or otherwise tried to make gcc work on OS X?
> You shouldn't do that; it will just mess up the compilation environment on
> your machine.
>
> On 31/08/2015 10:28, Jorg Tiedemann wrote:
>
>
> Unfortunately, this didn't work for me either. I attach both log files, one
> for clang and one for gcc (which I installed via macports).
> What can I do? Thanks!
>
> Jörg
>
>
>
>
>
>
>
>
> On 30 Aug 2015, at 11:33, Hieu Hoang <hieuhoang@gmail.com> wrote:
>
> Add
> toolset=clang
> to the bjam compile command. OS X no longer ships with GCC.
>
> Hieu Hoang
> Sent while bumping into things
> On 29 Aug 2015 11:56 pm, "Jorg Tiedemann" <tiedeman@gmail.com> wrote:
>
>> Hi,
>>
>> I tried to make a fresh install of Moses on my new Mac and I get the
>> following error
>> g++: error: unrecognized command line option '-no-cpp-precomp'
>>
>> What's wrong? I have gcc5 and boost 1.59 on my machine via macports ...
>>
>> Thanks for your help!
>> Jörg
>>
>>
>>
>>
>> _______________________________________________
>> Moses-support mailing list
>> Moses-support@mit.edu
>> http://mailman.mit.edu/mailman/listinfo/moses-support
>>
>
>
>
> --
> Hieu Hoang
> Researcher
> New York University, Abu Dhabi
> http://www.hoang.co.uk/hieu
>
>
>

------------------------------

Message: 2
Date: Tue, 1 Sep 2015 13:27:17 +0100
From: Rico Sennrich <rico.sennrich@gmx.ch>
Subject: Re: [Moses-support] Failure to Open Output when using Chart
Decoder
To: "moses-support@mit.edu" <moses-support@mit.edu>
Message-ID: <55E599A5.7070609@gmx.ch>
Content-Type: text/plain; charset="windows-1252"

Hello Shyam,

This is probably not a bug in the code (the exception comes from a check on
the state of the output stream, a std::ostream), but a problem with the
location you're trying to write to. Can you double-check that the path to
the n-best list is correct and that you can write to it?
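
A minimal sketch of that double-check, written in Python for illustration. The function name `diagnose_output_path` is hypothetical and not part of Moses; it only inspects the filesystem the way one would by hand before rerunning the decoder.

```python
import os


def diagnose_output_path(path):
    """Explain why `path` might fail to open for writing, or return "ok"."""
    directory = os.path.dirname(os.path.abspath(path))
    # The parent directory must exist; an output stream will not create it.
    if not os.path.isdir(directory):
        return "missing directory: " + directory
    # The target itself must not be a directory.
    if os.path.isdir(path):
        return "path is a directory: " + path
    # Creating a file needs write and search permission on the directory.
    if not os.access(directory, os.W_OK | os.X_OK):
        return "directory not writable: " + directory
    # An existing file must itself be writable.
    if os.path.exists(path) and not os.access(path, os.W_OK):
        return "file not writable: " + path
    return "ok"
```

Calling it with the value passed to --n-best-list (for example `diagnose_output_path("myout/hyp.mrl.nbest")`) from the directory where the decoder is started should point to the most likely cause.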

best wishes,
Rico


On 01.09.2015 00:36, Shyam Upadhyay wrote:
> I am new to using moses and I am trying to use the chart decoder to
> obtain 100 best decodings as follows,
>
> moses/bin/moses_chart -f mymodel/moses.ini --drop-unknown
> --n-best-list myout/hyp.mrl.nbest 100
>
> I encounter the following error,
>
> Start loading text phrase table. Moses format : [0.009] seconds
> Reading
> /home/upadhya3/smt-semparse-fresh/work/2015-08-30T21.06.43/model/glue-grammar
> ----5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
> max-chart-span: 20
> max-chart-span: 1000
> Created input-output object : [0.009] seconds
> Exception: ./moses/OutputCollector.h:64 in
> Moses::OutputCollector::OutputCollector(std::string, std::string)
> threw util::Exception because `!m_outStream->good()'.
> Failed to open output
> file/home/upadhya3/smt-semparse-fresh/work/2015-08-30T19.44.07/hyp.mrl.nbest
>
> My moses.ini file is, (this was generated automatically by previous steps)
>
> #########################
> ### MOSES CONFIG FILE ###
> #########################
>
> # input factors
> [input-factors]
> 0
>
> # mapping steps
> [mapping]
> 0 T 0
> 1 T 1
>
> [cube-pruning-pop-limit]
> 1000
>
> [non-terminals]
> X
>
> [search-algorithm]
> 3
>
> [inputtype]
> 3
>
> [max-chart-span]
> 20
> 1000
>
> # feature functions
> [feature]
> UnknownWordPenalty
> WordPenalty
> PhrasePenalty
> PhraseDictionaryMemory name=TranslationModel0 num-features=4 path=/home/upadhya3/smt-semparse-fresh/work/2015-08-30T21.06.43/model/rule-table.gz input-factor=0 output-factor=0
> PhraseDictionaryMemory name=TranslationModel1 num-features=1 path=/home/upadhya3/smt-semparse-fresh/work/2015-08-30T21.06.43/model/glue-grammar input-factor=0 output-factor=0 tuneable=true
>
> KENLM name=LM0 factor=0 path=/home/upadhya3/smt-semparse-fresh/work/2015-08-30T21.06.43/mrl.arpa order=3
>
> # dense weights for feature functions
> [weight]
> # The default weights are NOT optimized for translation quality. You MUST tune the weights.
> # Documentation for tuning is here: http://www.statmt.org/moses/?n=FactoredTraining.Tuning
> UnknownWordPenalty0= 1
> WordPenalty0= -1
> PhrasePenalty0= 0.2
> TranslationModel0= 0.2 0.2 0.2 0.2
> TranslationModel1= 1
> LM0= 0.5
>
>


------------------------------

Message: 3
Date: Tue, 1 Sep 2015 14:32:22 +0100
From: Ulrich Germann <ulrich.germann@gmail.com>
Subject: Re: [Moses-support] clarification CBPT vs MMSAPT
To: Vincent Nguyen <vnguyen@neuf.fr>
Cc: "moses-support@mit.edu" <moses-support@mit.edu>
Message-ID:
<CAHQSRUr7szcikMe_xnQifqTXU5S-jv15Q6QSL+4McksQs=3wUg@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

Hi Vincent,


1. To seed the foreground corpus at start-up, you need to provide three
files. (I use ${L1} and ${L2} to indicate language tags: ${L1} is the source
language, ${L2} the target language. These tags must match those given in
the L1 and L2 parameters of the Mmsapt line in moses.ini.)

/some/path/[basename.]${L1}.txt.gz
/some/path/[basename.]${L2}.txt.gz
/some/path/[basename.]${L1}-${L2}.symal.gz

Then, in the Mmsapt line in moses.ini, add the parameter
extra=/some/path/[basename.]

Note that the extra specification (like the path parameter) must end
either in '.' (if the files have a prefix) or '/' (if they don't). Files
must be gzipped and end in .txt.gz or .symal.gz, respectively.

2. To add material dynamically:


- with the moses server, use the update interface of the xmlrpc server;
see scripts/contrib/sim-pe.py for an example.
- to simulate post-editing with moses in batch mode, specify
--spe-src /path/to/source --spe-trg /path/to/target --spe-aln
/path/to/word-alignment-file.
E.g.

moses -f moses.ini --spe-src input.en --spe-trg reference.de
--spe-aln en-de.symal

This will translate one sentence, then add the input sentence, the
reference (as read from file), and the pre-computed word alignment to
the parallel data.
In this case (in contrast to the parameter 'extra' in the Mmsapt
line, which mandates that the text files are gzipped), the files
should be plain, uncompressed text files.
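
The two procedures above can be sketched in Python as follows. This is an illustration under stated assumptions, not part of Moses: the helper names `mmsapt_seed_files` and `spe_command` are hypothetical, and the line-count check assumes the three --spe-* files are line-parallel (one sentence or alignment per line), which the description implies but does not state outright.

```python
def mmsapt_seed_files(extra, l1, l2):
    """List the three files expected for a given extra= value (point 1)."""
    # The extra value must end in '.' (prefixed files) or '/' (no prefix).
    if not extra.endswith((".", "/")):
        raise ValueError("extra must end in '.' or '/'")
    return [
        f"{extra}{l1}.txt.gz",
        f"{extra}{l2}.txt.gz",
        f"{extra}{l1}-{l2}.symal.gz",
    ]


def count_lines(path):
    """Count lines in a plain, uncompressed text file."""
    with open(path, "rb") as handle:
        return sum(1 for _ in handle)


def spe_command(ini, src, trg, aln, moses="moses"):
    """Build the batch-mode command line from point 2 as an argument list."""
    # Assumption: source, reference and alignment files are line-parallel.
    counts = {path: count_lines(path) for path in (src, trg, aln)}
    if len(set(counts.values())) != 1:
        raise ValueError(f"line counts differ: {counts}")
    return [moses, "-f", ini,
            "--spe-src", src, "--spe-trg", trg, "--spe-aln", aln]
```

For example, `mmsapt_seed_files("/some/path/fg.", "en", "de")` lists the gzipped files to prepare for extra=/some/path/fg. (the prefix `fg.` is made up for the example), and the list returned by `spe_command` can be passed to `subprocess.run`.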

- Uli

On Tue, Sep 1, 2015 at 1:11 PM, Vincent Nguyen <vnguyen@neuf.fr> wrote:

> Hi Uli,
>
> For your point 3, here is what I would like to do / understand:
>
> I have an LM and a TM built with EMS, but with alignment done by
> FastAlign. So there are no vcb files for the baseline.
>
> In this context I don't see whether I can integrate a new incremental corpus
> into the previous baseline corpus.
>
> hope this is clearer.
>
> Vincent
>
>
>
> On 23/08/2015 00:36, Ulrich Germann wrote:
>
> Hi Vincent,
>
> 1. I don't use EMS, so I'm the wrong person to ask.
> 2. Please always post questions to the moses-support mailing list, so that
> others can benefit from questions and answers as well.
> 3. Can you briefly explain what you are trying to accomplish? I don't
> think I understand what you are actually trying to do.
>
> Best regards - Uli
>
> On Sat, Aug 22, 2015 at 10:45 PM, Vincent Nguyen <vnguyen@neuf.fr> wrote:
>
>>
>> I kept reading this again and again:
>> http://www.statmt.org/moses/?n=Advanced.Incremental
>> but it is not clear enough for a newbie like me for use with EMS.
>> I also see a section in the EMS config file:
>> use of baseline alignment model (incremental training)
>> and I don't really see how it fits with the rest of the parameters.
>>
>>
>>
>> On 22/08/2015 16:31, vnguyen@neuf.fr wrote:
>>
>> Oops.
>> Using EMS, I built the phrase table with the mmsapt=
>> option and it went through.
>> But I had not added the training-options
>> -final-alignment-model hmm
>>
>> Do I need to start again?
>>
>> The thing is, I use Dyer's aligner because of the Giga corpus, and I am not
>> sure that training option is compatible, since the tutorial mentions a
>> modified GIZA++...
>>
>>
>>
>> ____________________
>>
>> From: "Ulrich Germann"
>> Date: 21 August 2015 15:54:08
>> To: Vincent Nguyen
>> Cc: prashant@fbk.eu, moses-support@mit.edu
>> Subject: Re: [Moses-support] clarification CBPT vs MMSAPT
>>
>>
>>
>> On Thu, Aug 20, 2015 at 5:40 PM, Vincent Nguyen <vnguyen@neuf.fr> wrote:
>>
>>> Thanks to both of you. I will give both solutions a try.
>>>
>>> For MMSAPT :
>>> Will I be able to make it work with the Giga corpus fr-en ? If
>>> everything is loaded in memory I may be short of ram rather quickly.
>>>
>>
>> For the WMT-15 fr-en data, mmsapt's files are about 20GB in total, but
>> not all of it will normally be kept in memory. Mmsapt degrades gracefully;
>> it just gets slow if the VM manager has to drop memory pages and re-load
>> them. The LM is about 40GB, so for optimal performance you should plan for
>> 60+GB of RAM. Provided you have enough RAM, cat all model files to
>> /dev/null prior to starting moses. Sequential disk access is much faster
>> than random disk access, and the cat to /dev/null will push them into the
>> OS's file cache.
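
The cat-to-/dev/null step can also be sketched in Python, for cases where the warm-up is scripted together with other preparation. The function name `warm_file_cache` is hypothetical; it simply reads each file sequentially and discards the bytes, which has the same effect as `cat file > /dev/null`. Whether the pages stay cached depends on the available RAM and the operating system.

```python
def warm_file_cache(paths, chunk_size=1 << 20):
    """Read every file front to back so the OS can cache its pages.

    Returns the total number of bytes read.
    """
    total = 0
    for path in paths:
        with open(path, "rb") as handle:
            while True:
                chunk = handle.read(chunk_size)
                if not chunk:
                    break
                # The data is discarded; only the read itself matters.
                total += len(chunk)
    return total
```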
>>
>>
>>
>>> Plus I was using Dyer's fast_align ... so do I need to realign the whole
>>> corpus with the modified version of giza++ ?
>>>
>> You need word alignments in the output format produced by symal (i.e.
>> row-column pairs 1-1 2-2 3-4 etc.). How these alignments are produced
>> doesn't matter for Mmsapt's ability to handle them. It may, of course,
>> affect the alignment quality, but that's independent of which phrase table
>> implementation you use.
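
As an illustration of that format, here is a minimal Python sketch that parses one alignment line into integer pairs. The function name `parse_symal_line` is hypothetical, and the sketch makes no assumption about whether the indices are 0-based or 1-based; it only shows the shape of the data.

```python
def parse_symal_line(line):
    """Parse a line such as "1-1 2-2 3-4" into a list of (row, column) pairs."""
    pairs = []
    for token in line.split():
        # Each token is two integers joined by a single hyphen.
        row, _, column = token.partition("-")
        pairs.append((int(row), int(column)))
    return pairs
```

A line that fails to parse (for example a token without a hyphen) raises ValueError, which makes the function usable as a quick sanity check on aligner output before it is handed to Mmsapt.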
>>
>> - Uli
>>
>>
>>
>>> For CBPT:
>>> I would like to give the MT adaptive server a try, but I don't really
>>> understand how to adapt the given "adaptive model" and "updater model"
>>> in a context where my language pair is different. These preliminary
>>> steps are not part of the tutorial (especially the
>>> updater_models/alignment folders ...).
>>>
>>> The only glitch I see in CBPT is that adaptive changes cannot be
>>> made permanent.
>>>
>>>
>>>
>>>
>>> On 20/08/2015 16:17, Ulrich Germann wrote:
>>>
>>> Memory-mapped phrase tables are an alternative to conventional phrase
>>> tables. They are much, much faster to build, only slightly slower than
>>> CompactPT at runtime, and at the very least competitive in terms of BLEU
>>> performance. I usually observe slightly higher BLEU scores, but for each
>>> individual evaluation, the difference is usually not significant. They
>>> support only phrase-based MT, but not syntax-based MT.
>>>
>>> Both Mmsapt and CBPT also cater to post-editing scenarios (CBPT were
>>> specifically developed for this purpose). They allow adding new material to
>>> the phrase tables at run time. I can't say much about CBPT (apparently you
>>> add phrase table entries, and there is a decay function that rewards more
>>> recent choices approved by the translator). In the case of Mmsapt, since it
>>> samples at lookup time anyway, you can add new word-aligned parallel text
>>> to the training data at run time, or additional material at start-up.
>>> Additions are currently not stored on disk by the server (do NOT use
>>> mosesserver, use moses --server --port ...) and are lost when the server
>>> exits, but they can be loaded at start-up from text files if these are
>>> available. In other words: it's currently up to the user/client who
>>> submits the additions to also store them on disk if they are meant to be
>>> permanent. Mmsapt offers numerous configuration options (separate scores
>>> or joint scores for background and foreground corpus, a provenance feature,
>>> etc.) that affect the number of features, and there is no established best
>>> practice for use in interactive MT (unless Michael Denkowski has advice to
>>> offer in this respect).
>>>
>>> For phrase-based MT I recommend Mmsapt (see also my paper in the coming
>>> issue of PBML), as it saves you a lot of phrase table building agony. For
>>> interactive use, the infrastructure is there but additional research is
>>> required to figure out the optimal configuration of feature functions and
>>> associated parameters.
>>>
>>> Best regards - Uli Germann
>>>
>>> On Thu, Aug 20, 2015 at 12:56 AM, Prashant Mathur <prashant@fbk.eu> wrote:
>>>
>>>> Hi Vincent,
>>>>
>>>> The goal is incremental adaptation, but these two are different
>>>> techniques in principle.
>>>> CBPT adds an additional dynamic phrase table (with 1 additional feature)
>>>> which allows deletion and insertion of phrase pairs at any given time. For
>>>> incremental adaptation, CBPT can be used in conjunction with constraint-
>>>> based decoding as in [1], or by cascading onlineMgiza++ and the normal
>>>> phrase extractor as in [2].
>>>> I don't know much about the memory-mapped suffix array implementation,
>>>> but afaik with MMSAPT (which uses 7 features) you can do incremental
>>>> updates to your model by adding a stream of parallel data along with the
>>>> alignments.
>>>>
>>>> --Prashant
>>>>
>>>> [1] http://www.cl.uni-heidelberg.de/~riezler/publications/papers/MTJOURNAL2014.pdf
>>>> [2] http://mt4cat.org/software/adaptive-mt-server
>>>>
>>>>
>>>> On Wed, Aug 19, 2015 at 6:53 PM, Vincent Nguyen <vnguyen@neuf.fr> wrote:
>>>>
>>>>> Hello support,
>>>>>
>>>>> Going into advanced features of Moses, I am a bit confused by the
>>>>> differences and therefore which path to follow, regarding the 2
>>>>> features
>>>>> CBPT and MMSAPT.
>>>>>
>>>>> I have the feeling the ultimate goal of both is the same but maybe I am
>>>>> wrong.
>>>>>
>>>>> Can someone explain the actual difference ?
>>>>>
>>>>> By the way, which one is the "update" feature of this page
>>>>> http://demo.statmt.org/ based on?
>>>>>
>>>>> Thanks
>>>>>
>>>>> Vincent.


--
Ulrich Germann
Senior Researcher
School of Informatics
University of Edinburgh

------------------------------

_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


End of Moses-support Digest, Vol 107, Issue 3
*********************************************
