Moses-support Digest, Vol 103, Issue 11

Send Moses-support mailing list submissions to
moses-support@mit.edu

To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
moses-support-request@mit.edu

You can reach the person managing the list at
moses-support-owner@mit.edu

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."


Today's Topics:

1. Re: Transliteration model is using processPhraseTable, which
is not found in Moses version 3.0 (Ergun Bicici)


----------------------------------------------------------------------

Message: 1
Date: Tue, 5 May 2015 15:33:16 +0100
From: Ergun Bicici <Ergun.Bicici@computing.dcu.ie>
Subject: Re: [Moses-support] Transliteration model is using
processPhraseTable, which is not found in Moses version 3.0
To: Nadir Durrani <nadir.durrani@nu.edu.pk>
Cc: moses-support <moses-support@mit.edu>, p.j.williams-2@sms.ed.ac.uk
Message-ID:
<CAB2pGncotF485NZR6NzTnxShMrxBU_gewU3GH=Ubg3=Qo9gpWA@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

Hi Nadir,

I am using Moses 3.0. To get transliteration working, I copied
scripts/Transliteration/ from the latest version into my Moses 3.0 path,
re-ran, and obtained translation results.
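
One way to check whether a given checkout still needs this workaround is to
look for references to processPhraseTable in the transliteration scripts.
The following is a minimal Python sketch, not an official Moses tool: the
function names are hypothetical, and it only does a plain substring search.

```python
import sys
from pathlib import Path
from typing import List

# Name of the binarizer that, per this thread, is no longer built by default.
REMOVED_BINARIZER = "processPhraseTable"


def find_stale_binarizer_lines(script_text: str) -> List[int]:
    """Return 1-based line numbers that still mention the removed binarizer."""
    return [
        lineno
        for lineno, line in enumerate(script_text.splitlines(), start=1)
        if REMOVED_BINARIZER in line
    ]


def check_script(path: Path) -> List[int]:
    """Read a script from disk and report the lines that look stale."""
    text = path.read_text(encoding="utf-8", errors="replace")
    return find_stale_binarizer_lines(text)


if __name__ == "__main__":
    # Each command-line argument is treated as a path to a script to scan.
    for arg in sys.argv[1:]:
        hits = check_script(Path(arg))
        status = "stale" if hits else "ok"
        print(f"{arg}: {status} {hits}")
```

Saved under a hypothetical name such as check_binarizer.py, it could be run
against scripts/Transliteration/in-decoding-transliteration.pl in the Moses
tree; any reported line is a candidate for the CreateOnDiskPt change quoted
below.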


Best Regards,
Ergun

Ergun Biçici, CNGL, School of Computing, DCU, www.cngl.ie
http://www.computing.dcu.ie/~ebicici/


On Mon, May 4, 2015 at 7:32 AM, Nadir Durrani <nadir.durrani@nu.edu.pk>
wrote:

> Hi Ergun,
>
> processPhraseTable is no longer supported by Moses. But I see that
> Phil Williams has already fixed this problem in the transliteration
> module by changing
>
> `$MOSES_SRC/scripts/training/filter-model-given-input.pl
> $TRANSLIT_MODEL/evaluation/$eval_file.filtered
> $TRANSLIT_MODEL/evaluation/$eval_file.moses.table.ini
> $TRANSLIT_MODEL/evaluation/$eval_file -Binarizer
> "$MOSES_SRC/bin/processPhraseTable"`;
>
> to
>
> `$MOSES_SRC/scripts/training/filter-model-given-input.pl
> $TRANSLIT_MODEL/evaluation/$eval_file.filtered
> $TRANSLIT_MODEL/evaluation/$eval_file.moses.table.ini
> $TRANSLIT_MODEL/evaluation/$eval_file -Binarizer
> "$MOSES_SRC/bin/CreateOnDiskPt 1 1 4 100 2"`;
>
> in
>
> path-to-moses/scripts/Transliteration/in-decoding-transliteration.pl
>
> Here's the commit
>
>
> https://github.com/moses-smt/mosesdecoder/commit/7e54e23fe234ac48f44beeee0e473d09a5b4d5f6
>
> Maybe you pulled an in-between version in which processPhraseTable
> had been removed but the transliteration scripts had not yet been fixed.
>
> Cheers,
> Nadir
>
>
> On Mon, May 4, 2015 at 7:46 AM, <moses-support-request@mit.edu> wrote:
> > Today's Topics:
> >
> > 1. Re: 12-gram language model ARPA file for 16GB (liling tan)
> > 2. Transliteration model is using processPhraseTable, which is
> > not found in Moses version 3.0 (Ergun Bicici)
> > 3. Re: Transliteration model is using processPhraseTable, which
> > is not found in Moses version 3.0 (Hieu Hoang)
> > 4. Europarl monolingual corpus (Hieu Hoang)
> >
> >
> > ----------------------------------------------------------------------
> >
> > Message: 1
> > Date: Sun, 3 May 2015 19:44:12 +0200
> > From: liling tan <alvations@gmail.com>
> > Subject: Re: [Moses-support] 12-gram language model ARPA file for 16GB
> > To: moses-support <moses-support@mit.edu>
> > Message-ID:
> > <CAKzPaJJ7fY=9C89POact542vu32d+H3=0i_Dnaj=YfizbFA+cQ@mail.gmail.com>
> > Content-Type: text/plain; charset="utf-8"
> >
> > Dear Moses devs/users,
> >
> > For now, I only know that it takes more than 250GB: I have 250GB of free
> > space, and KenLM got "poisoned" by insufficient space...
> >
> > Does anyone have an idea how big a 12-gram language model ARPA file
> > trained on 16GB of text would become?
> >
> > STDERR:
> >
> > === 1/5 Counting and sorting n-grams ===
> > Reading /media/2tb/wmt15/corpus.truecase/train-lm.en
> > ----5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
> > tcmalloc: large alloc 7846035456 bytes == 0x10f4000 @
> > tcmalloc: large alloc 73229664256 bytes == 0x1d542e000 @
> > ****************************************************************************************************
> > Unigram tokens 3038737446 types 5924314
> > === 2/5 Calculating and sorting adjusted counts ===
> > Chain sizes: 1:71091768 2:804524736 3:1508483968 4:2413574144 5:3519795968
> > 6:4827148288 7:6335632384 8:8045247488 9:9955993600 10:12067871744
> > 11:14380880896 12:16895020032
> > tcmalloc: large alloc 16895025152 bytes == 0x1d542e000 @
> > tcmalloc: large alloc 2413576192 bytes == 0x8f2a0000 @
> > tcmalloc: large alloc 3519799296 bytes == 0x5c4488000 @
> > tcmalloc: large alloc 4827152384 bytes == 0x696146000 @
> > tcmalloc: large alloc 6335635456 bytes == 0x7b5cce000 @
> > tcmalloc: large alloc 8045248512 bytes == 0x92f6f0000 @
> > tcmalloc: large alloc 9955999744 bytes == 0xb0ef7c000 @
> > tcmalloc: large alloc 12067872768 bytes == 0xd60644000 @
> > tcmalloc: large alloc 14380883968 bytes == 0x12f616e000 @
> > Last input should have been poison.
> > Last input should have been poison.util/file.cc:196 in void
> > util::WriteOrThrow(int, const void*, std::size_t) threw FDException because
> > `ret < 1'.
> > No space left on device in /tmp/PC2o3z (deleted) while writing 5301120368
> > bytes
> >
> > Last input should have been poison.util/file.cc:196 in void
> > util::WriteOrThrow(int, const void*, std::size_t) threw FDException because
> > `ret < 1'.
> > No space left on device in /tmp/PftXeo (deleted) while writing 1941075872
> > bytesLast input should have been poison.
> >
> > util/file.cc:196 in void util::WriteOrThrow(int, const void*, std::size_t)
> > threw FDException because `ret < 1'.
> > No space left on device in /tmp/CuZcPM (deleted) while writing 2984722272
> > bytes
> >
> > util/file.cc:196 in void util::WriteOrThrow(int, const void*, std::size_t)
> > threw FDException because `ret < 1'.
> > No space left on device in /tmp/F2bE8A (deleted) while writing 389439488
> > bytes
> >
> > Regards,
> > Liling
> >
> > ------------------------------
> >
> > Message: 2
> > Date: Sun, 3 May 2015 22:42:22 +0100
> > From: Ergun Bicici <Ergun.Bicici@computing.dcu.ie>
> > Subject: [Moses-support] Transliteration model is using
> > processPhraseTable, which is not found in Moses version 3.0
> > To: moses-support <moses-support@mit.edu>
> > Message-ID:
> > <CAB2pGncpvc4roLXwLcFcXytZHKEqSZvzaX2L16Yfo=P-vq1jBA@mail.gmail.com>
> > Content-Type: text/plain; charset="utf-8"
> >
> > binarizing...gzip -cd
> > en-ru_path/model/Transliteration.8/tuning/filtered/phrase-table.0-0.1.1.gz
> > | LC_ALL=C sort -T en-ru_path/model/Transliteration.8/tuning/filtered |
> > moses_3.0/mosesdecoder/bin/processPhraseTable -ttable 0 0 - -nscores 4 -out
> > en-ru_path/model/Transliteration.8/tuning/filtered/phrase-table.0-0.1.1
> > sh: moses_3.0/mosesdecoder/bin/processPhraseTable: No such file or directory
> > sort: write failed: standard output: Broken pipe
> > sort: write error
> >
> > How can I have processPhraseTable built?
> >
> > Best Regards,
> > Ergun
> >
> > Ergun Biçici, CNGL, School of Computing, DCU, www.cngl.ie
> > http://www.computing.dcu.ie/~ebicici/
> >
> > ------------------------------
> >
> > Message: 3
> > Date: Mon, 04 May 2015 08:31:18 +0400
> > From: Hieu Hoang <hieuhoang@gmail.com>
> > Subject: Re: [Moses-support] Transliteration model is using
> > processPhraseTable, which is not found in Moses version 3.0
> > To: Ergun Bicici <Ergun.Bicici@computing.dcu.ie>, moses-support
> > <moses-support@mit.edu>
> > Message-ID: <5546F616.4000007@gmail.com>
> > Content-Type: text/plain; charset="windows-1252"
> >
> > Do you know where the processPhraseTable executable is being called from?
> >
> > It would be helpful, so we can make sure it uses something else.
> >
> > If you really want processPhraseTable back, uncomment 3 lines in
> > misc/Jamfile:
> >
> > +++ b/misc/Jamfile
> > @@ -1,8 +1,8 @@
> > -#exe processPhraseTable : GenerateTuples.cpp processPhraseTable.cpp ..//boost_filesystem ../moses//moses ;
> > +exe processPhraseTable : GenerateTuples.cpp processPhraseTable.cpp ..//boost_filesystem ../moses//moses ;
> >
> > exe processLexicalTable : processLexicalTable.cpp ..//boost_filesystem ../moses//moses ;
> >
> > -#exe queryPhraseTable : queryPhraseTable.cpp ..//boost_filesystem ../moses//moses ;
> > +exe queryPhraseTable : queryPhraseTable.cpp ..//boost_filesystem ../moses//moses ;
> >
> > exe queryLexicalTable : queryLexicalTable.cpp ..//boost_filesystem ../moses//moses ;
> >
> > @@ -46,6 +46,6 @@ $(TOP)//boost_iostreams
> > $(TOP)//boost_program_options
> > ;
> >
> > -alias programs : 1-1-Extraction TMining generateSequences processLexicalTable queryLexicalTable programsMin programsProbing merge-sorted prunePhraseTable ;
> > -#processPhraseTable queryPhraseTable
> > +alias programs : 1-1-Extraction TMining generateSequences processLexicalTable queryLexicalTable programsMin programsProbing merge-sorted prunePhraseTable processPhraseTable queryPhraseTable ;
> >
> > On 04/05/2015 01:42, Ergun Bicici wrote:
> >>
> >> binarizing...gzip -cd
> >> en-ru_path/model/Transliteration.8/tuning/filtered/phrase-table.0-0.1.1.gz
> >> | LC_ALL=C sort -T en-ru_path/model/Transliteration.8/tuning/filtered
> >> | moses_3.0/mosesdecoder/bin/processPhraseTable -ttable 0 0 - -nscores
> >> 4 -out
> >> en-ru_path/model/Transliteration.8/tuning/filtered/phrase-table.0-0.1.1
> >> sh: moses_3.0/mosesdecoder/bin/processPhraseTable: No such file or
> >> directory
> >> sort: write failed: standard output: Broken pipe
> >> sort: write error
> >>
> >> How can I have processPhraseTable built?
> >>
> >> Best Regards,
> >> Ergun
> >>
> >> Ergun Biçici, CNGL, School of Computing, DCU, www.cngl.ie
> >> <http://www.cngl.ie>
> >> http://www.computing.dcu.ie/~ebicici/
> >> <http://www.computing.dcu.ie/%7Eebicici/>
> >>
> >>
> >>
> >> _______________________________________________
> >> Moses-support mailing list
> >> Moses-support@mit.edu
> >> http://mailman.mit.edu/mailman/listinfo/moses-support
> >
> > --
> > Hieu Hoang
> > Researcher
> > New York University, Abu Dhabi
> > http://www.hoang.co.uk/hieu
> >
> > ------------------------------
> >
> > Message: 4
> > Date: Mon, 4 May 2015 08:46:15 +0400
> > From: Hieu Hoang <hieuhoang@gmail.com>
> > Subject: [Moses-support] Europarl monolingual corpus
> > To: moses-support <moses-support@mit.edu>
> > Message-ID:
> > <CAEKMkbiO64F_m20RwNXyDOj60FHEZ_oo+BY+hzkW3TBFukPfAQ@mail.gmail.com>
> > Content-Type: text/plain; charset="utf-8"
> >
> > What's the easiest way to get the single-language data from the Europarl
> > corpus, as described in the 1st table at:
> > http://statmt.org/europarl/
> >
> > I tried downloading the XML source
> > http://statmt.org/europarl/v7/europarl.tgz
> > stripping the XML and running split-sentence.perl, but this takes an
> > unfathomably long time.
> >
> > Hieu Hoang
> > Researcher
> > New York University, Abu Dhabi
> > http://www.hoang.co.uk/hieu
> >
> > ------------------------------
> >
> > End of Moses-support Digest, Vol 103, Issue 5
> > *********************************************
>

------------------------------

_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


End of Moses-support Digest, Vol 103, Issue 11
**********************************************
