Send Moses-support mailing list submissions to
moses-support@mit.edu
To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
moses-support-request@mit.edu
You can reach the person managing the list at
moses-support-owner@mit.edu
When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."
Today's Topics:
1. Re: CeateOnDiskPt throws 'Already Saved' Exception on phrase
table in sample_models/phrase-model (kwame porter robinson)
2. Re: Duplicated source files (Ulrich Germann)
----------------------------------------------------------------------
Message: 1
Date: Tue, 5 May 2015 10:38:43 -0400
From: kwame porter robinson <k.porter.robinson@gmail.com>
Subject: Re: [Moses-support] CeateOnDiskPt throws 'Already Saved'
Exception on phrase table in sample_models/phrase-model
To: Hieu Hoang <hieuhoang@gmail.com>
Cc: moses-support <moses-support@mit.edu>
Message-ID:
<CA+RrxgXf22akA6MJFumzROt=YhOKqboKLQLzBGa2r-1SAfOYug@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"
Note for others: After converting the phrase table I had to update my
moses.ini [weight] section to read PhraseDictionaryOnDisk0= 1 so that moses
would weight the new feature (and not the PhraseDictionaryMemory one, which
I commented out).
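
The [weight] change described above can be sketched as a small text rewrite. This is only an illustration: the feature names PhraseDictionaryMemory0 and PhraseDictionaryOnDisk0 follow the note above, while the other weight lines, the helper name switch_weight, and the use of '#' to comment a line out are assumptions, not taken from the thread.

```python
# Hypothetical [weight] section; only the PhraseDictionary* names come from the thread.
EXAMPLE = (
    "[weight]\n"
    "Distortion0= 0.3\n"
    "PhraseDictionaryMemory0= 1\n"
    "WordPenalty0= -1\n"
)


def switch_weight(ini_text, old="PhraseDictionaryMemory0",
                  new="PhraseDictionaryOnDisk0", weight="1"):
    """Comment out the old weight line inside [weight] and add the new one after it."""
    out = []
    in_weight = False
    for line in ini_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("[") and stripped.endswith("]"):
            # Section header: track whether we are inside [weight].
            in_weight = stripped == "[weight]"
        elif in_weight and stripped.startswith(old + "="):
            # Keep the old line as a comment (assumes '#' starts a comment).
            out.append("# " + line)
            line = f"{new}= {weight}"
        out.append(line)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    print(switch_weight(EXAMPLE), end="")
```

Lines outside the [weight] section are left untouched, so a feature line that happens to share the same prefix is not rewritten.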
On Tue, May 5, 2015 at 10:17 AM, Hieu Hoang <hieuhoang@gmail.com> wrote:
> Yes, it processes the table a bit at a time, so you don't need a big PC to run it.
>
> Hieu Hoang
> Researcher
> New York University, Abu Dhabi
> http://www.hoang.co.uk/hieu
>
> On 5 May 2015 at 18:10, kwame porter robinson <k.porter.robinson@gmail.com> wrote:
>
>> Ah, that was my error. Thanks for pointing that out. The exception is no
>> longer thrown.
>>
>> Quick follow up question: does CreateOnDiskPt process phrase tables
>> lazily? (that is, in chunks so that the entire table is not loaded into
>> memory).
>>
>> On Tue, May 5, 2015 at 9:33 AM, Hieu Hoang <hieuhoang@gmail.com> wrote:
>>
>>> Is your phrase-table sorted? You can sort it like this:
>>> LC_ALL=C sort big-pt > big.pt.sorted
>>>
>>>
>>> Hieu Hoang
>>> Researcher
>>> New York University, Abu Dhabi
>>> http://www.hoang.co.uk/hieu
>>>
>>> On 5 May 2015 at 17:22, kwame porter robinson <k.porter.robinson@gmail.com> wrote:
>>>
>>>> Hmm, so I tried that and received the same exception:
>>>>
>>>> $ CreateOnDiskPt 1 1 1 10 2 big-pt ondiskpt
>>>>
>>>> Starting : [0] seconds
>>>> terminate called after throwing an instance of 'util::Exception'
>>>> what(): OnDiskPt/PhraseNode.cpp:97 in void
>>>> OnDiskPt::PhraseNode::Save(OnDiskPt::OnDiskWrapper&, size_t, size_t) threw
>>>> util::Exception because `m_saved'.
>>>> Already saved
>>>> Aborted (core dumped)
>>>>
>>>> - Kwame
>>>>
>>>>
>>>> On Tue, May 5, 2015 at 12:48 AM, Hieu Hoang <hieuhoang@gmail.com>
>>>> wrote:
>>>>
>>>>>
>>>>>
>>>>> On 05/05/2015 02:46, kwame porter robinson wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> I am attempting to binarize a large ASCII phrase table using
>>>>> CreateOnDiskPt. Each phrase table row contains the source phrase, target
>>>>> phrase and a single score. I am getting an 'Already Saved' exception for
>>>>> phrase tables larger than 9 lines. I am using Moses release 2.1.
>>>>>
>>>>> The exception is reproduced below:
>>>>>
>>>>> 1) Using the 10 line phrase table below*, from http://www.statmt.org/moses/download/sample-models.tgz.
>>>>>
>>>>> 2) Truncate the phrase-table with 'head phrase-table -n9 > small-pt'
>>>>>
>>>>> 3) Truncate another version of the phrase-table with 'head phrase-table
>>>>> -n10 > big-pt'
>>>>>
>>>>> 4) CreateOnDiskPt works with 'CreateOnDiskPt 0 0 1 10 2 small-pt
>>>>> myphrasetable'
>>>>>
>>>>> [Inline reply from Hieu Hoang:] It should be
>>>>> CreateOnDiskPt 1 1 ....
>>>>> The first two arguments are the NUMBER of source and target factors.
>>>>>
>>>>> 5) CreateOnDiskPt throws the following exception with 'CreateOnDiskPt 0 0 1
>>>>> 10 2 big-pt myphrasetable'
>>>>>
>>>>> ---
>>>>> Starting : [0] seconds
>>>>> terminate called after throwing an instance of 'util::Exception'
>>>>> what(): OnDiskPt/PhraseNode.cpp:97 in void
>>>>> OnDiskPt::PhraseNode::Save(OnDiskPt::OnDiskWrapper&, size_t, size_t) threw
>>>>> util::Exception because `m_saved'.
>>>>> Already saved
>>>>> Aborted (core dumped)
>>>>> ---
>>>>>
>>>>> Any thoughts on how to fix this? For hints I've looked at:
>>>>> https://www.mail-archive.com/moses-support%40mit.edu/msg11999.html and
>>>>> https://www.mail-archive.com/moses-support%40mit.edu/msg10602.html
>>>>> but was unable to resolve this.
>>>>>
>>>>> Thank you for your time.
>>>>>
>>>>> * The phrase table
>>>>> ----
>>>>> der ||| the ||| 0.3 ||| |||
>>>>> das ||| the ||| 0.4 ||| |||
>>>>> das ||| it ||| 0.1 ||| |||
>>>>> das ||| this ||| 0.1 ||| |||
>>>>> die ||| the ||| 0.3 ||| |||
>>>>> ist ||| is ||| 1.0 ||| |||
>>>>> ist ||| 's ||| 1.0 ||| |||
>>>>> das ist ||| it is ||| 0.2 ||| |||
>>>>> das ist ||| this is ||| 0.8 ||| |||
>>>>> es ist ||| it is ||| 0.8 ||| |||
>>>>> ---
>>>>>
>>>>> - Kwame
>>>>>
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> Moses-support mailing list
>>>>> Moses-support@mit.edu
>>>>> http://mailman.mit.edu/mailman/listinfo/moses-support
>>>>>
>>>>>
>>>>> --
>>>>> Hieu Hoang
>>>>> Researcher
>>>>> New York University, Abu Dhabi
>>>>> http://www.hoang.co.uk/hieu
>>>>>
>>>>>
>>>>
>>>
>>
>
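
The two fixes from the thread above (sort the phrase table in byte order, and pass the NUMBER of factors as the first two arguments) can be checked before running the tool. The sketch below is an illustration under assumptions: the helper names are hypothetical, the parameter names for the third to fifth arguments are a guess at what those positions mean (the thread only explains the first two), and the check relies on Python comparing bytes by unsigned byte value, which is the ordering LC_ALL=C sort is understood to use.

```python
# The 10-line sample phrase table quoted in the thread, as raw bytes.
SAMPLE = [
    b"der ||| the ||| 0.3 ||| |||\n",
    b"das ||| the ||| 0.4 ||| |||\n",
    b"das ||| it ||| 0.1 ||| |||\n",
    b"das ||| this ||| 0.1 ||| |||\n",
    b"die ||| the ||| 0.3 ||| |||\n",
    b"ist ||| is ||| 1.0 ||| |||\n",
    b"ist ||| 's ||| 1.0 ||| |||\n",
    b"das ist ||| it is ||| 0.2 ||| |||\n",
    b"das ist ||| this is ||| 0.8 ||| |||\n",
    b"es ist ||| it is ||| 0.8 ||| |||\n",
]


def first_unsorted_line(lines):
    """Return the 1-based number of the first line that sorts before its
    predecessor in byte order, or None if the input is already sorted.
    Only the previous line is kept, so a large file can be streamed."""
    prev = None
    for number, line in enumerate(lines, start=1):
        key = line.rstrip(b"\n")
        if prev is not None and key < prev:
            return number
        prev = key
    return None


def create_on_disk_pt_args(pt_path, out_dir, num_source_factors=1,
                           num_target_factors=1, num_scores=1,
                           table_limit=10, sort_score_index=2):
    """Build the argument list used in the thread: CreateOnDiskPt 1 1 1 10 2 IN OUT.
    Names after the first two are assumptions about the positional arguments."""
    numbers = [num_source_factors, num_target_factors, num_scores,
               table_limit, sort_score_index]
    return ["CreateOnDiskPt"] + [str(n) for n in numbers] + [pt_path, out_dir]


if __name__ == "__main__":
    print(first_unsorted_line(SAMPLE))
    print(" ".join(create_on_disk_pt_args("big.pt.sorted", "ondiskpt")))
```

For a real table, pass a file opened in binary mode, for example first_unsorted_line(open("big-pt", "rb")), so the lines are read one at a time rather than loaded into memory.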
------------------------------
Message: 2
Date: Tue, 5 May 2015 16:26:34 +0100
From: Ulrich Germann <ulrich.germann@gmail.com>
Subject: Re: [Moses-support] Duplicated source files
To: Kenneth Heafield <moses@kheafield.com>
Cc: "moses-support@mit.edu" <moses-support@mit.edu>
Message-ID:
<CAHQSRUq_WhYcYCq=VzTX2Q-L7RTnbgBXDJDUs8bdTWg5YprCcg@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"
I'm very strongly in favor of using boost's filtering streams.
See here:
https://github.com/moses-smt/mosesdecoder/blob/master/moses/TranslationModel/UG/generic/file_io/ug_stream.cpp
- Uli
On Sun, May 3, 2015 at 12:08 PM, Kenneth Heafield <moses@kheafield.com>
wrote:
> http://www.boost.org/doc/libs/1_57_0/libs/iostreams/doc/classes/gzip.html
> (best to scroll to the bottom first).
>
> On 05/03/2015 04:08 AM, Jeroen Vermeulen wrote:
> > On May 2, 2015 2:58:08 AM GMT+07:00, Kenneth Heafield <moses@kheafield.com> wrote:
> >> Is this comment accurate that gzfilebuf is only used for writing?
> >>
> >> /** wrapper around gzip input stream. Unknown parentage
> >> * @todo replace with boost version - output stream already uses it
> >> */
> >>
> >> If so I'll just extend util/fake_ofstream.hh to have gzip support.
> >>
> >> Time to print a bunch of integers:
> >>
> >> FakeOFStream:
> >>
> >> real 0m3.460s
> >> user 0m3.459s
> >> sys 0m0.004s
> >>
> >> std::cout
> >>
> >> real 0m23.010s
> >> user 0m22.895s
> >> sys 0m0.134s
> >>
> >> Time to print a bunch of floats:
> >>
> >> FakeOFStream:
> >>
> >> real 0m34.871s
> >> user 0m34.894s
> >> sys 0m0.006s
> >>
> >> std::cout
> >>
> >> real 1m56.628s
> >> user 1m56.690s
> >> sys 0m0.037s
> >>
> >> The conversion is done by https://github.com/miloyip/itoa-benchmark/
> >> and Google double conversion.
> >>
> >> Kenneth
> >>
> >> On 05/01/15 14:37, Barry Haddow wrote:
> >>> What about the util directory?
> >>>
> >>> On 1 May 2015 19:13:26 BST, Hieu Hoang <hieuhoang@gmail.com> wrote:
> >>>
> >>> i suppose everything should reference the moses lib.
> >>>
> >>> that's getting a bit bloated, one day we should look at splitting it up
> >>>
> >>> On 30/04/2015 10:24, Jeroen Vermeulen wrote:
> >>>
> >>> Any chance we could re-unify the gzfilebuf and InputFileStream
> >>> modules? Looks like we're carrying around 4 copies of each, and
> >>> they're starting to diverge.
> >>>
> >>> I'd be happy to make the change, if we know a good reusable
> >>> place to put it.
> >>>
> >>>
> >>> Jeroen
> >>>
> >>>
> >>>
> >>>
> >
> > I don't have the code at hand but I'm fairly sure I saw it being used
> > for reading.
> >
> > Maybe Boost has an equivalent that we can drop in?
> >
> >
> > Jeroen
> >
>
>
--
Ulrich Germann
Senior Researcher
School of Informatics
University of Edinburgh
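
The idea discussed in this thread, a single reader that handles plain and gzip-compressed files behind one interface, can be illustrated outside C++. The sketch below uses Python's standard gzip module as a stand-in; it is not the Moses gzfilebuf/InputFileStream code and not Boost.Iostreams. The helper name open_text and the choice to detect gzip by its two magic bytes (rather than by file extension) are assumptions made for this example.

```python
import gzip

# Every gzip stream starts with these two bytes.
GZIP_MAGIC = b"\x1f\x8b"


def open_text(path, encoding="utf-8"):
    """Open path for reading text, decompressing on the fly if it is gzipped.
    Callers iterate over lines the same way in both cases."""
    with open(path, "rb") as probe:
        is_gzip = probe.read(2) == GZIP_MAGIC
    if is_gzip:
        return gzip.open(path, "rt", encoding=encoding)
    return open(path, "r", encoding=encoding)
```

With one such helper in a shared location, callers would not need their own copy of the decompression logic, which is the duplication the thread is trying to remove.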
------------------------------
_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support
End of Moses-support Digest, Vol 103, Issue 12
**********************************************