Send Moses-support mailing list submissions to
moses-support@mit.edu
To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
moses-support-request@mit.edu
You can reach the person managing the list at
moses-support-owner@mit.edu
When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."
Today's Topics:
1. Re: Weird code in Hypothesis::RecombineCompare() (Barry Haddow)
2. Re: Weird code in Hypothesis::RecombineCompare()
(Jeroen Vermeulen)
3. Re: Weird code in Hypothesis::RecombineCompare() (Hieu Hoang)
4. Fwd: A small typo in Moses manual (Guchun Zhang)
5. Re: Fwd: A small typo in Moses manual (Rico Sennrich)
----------------------------------------------------------------------
Message: 1
Date: Thu, 25 Jun 2015 09:55:41 +0100
From: Barry Haddow <bhaddow@staffmail.ed.ac.uk>
Subject: Re: [Moses-support] Weird code in
Hypothesis::RecombineCompare()
To: Jeroen Vermeulen <jtv@precisiontranslationtools.com>,
"moses-support@mit.edu" <moses-support@mit.edu>
Message-ID: <558BC20D.8000006@staffmail.ed.ac.uk>
Content-Type: text/plain; charset=utf-8; format=flowed
Hi Jeroen
> Am I right in thinking this comparison is more or less arbitrary, as
> long as the result is consistent and only zero if the two pointers are
> both null? If so, would anyone mind if I made it compare just the
> nullness of the two pointers?
From memory, I think you are correct. For recombination we only care if
the FF states are equal or not equal, the actual order does not matter.
The hypotheses are added to an (ordered) set when they get created,
where the orderer uses the RecombineCompare methods in WordsBitmap and
Hypothesis. If the insert() does not result in adding a new Hypothesis
to the set, then it is recombined. Look at the AddPrune() method in
HypothesisStackNormal.
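
As a rough illustration of the mechanism described above (not the actual Moses code), the sketch below uses a std::set with a custom ordering and treats a failed insert() as a recombination. The names Hyp, RecombineOrder, HypSet and AddOrRecombine are hypothetical, and the integer fields are simplified stand-ins for WordsBitmap and the FF states.

```cpp
#include <cassert>
#include <set>
#include <utility>

// Hypothetical stand-in for a hypothesis: only the fields that take part
// in recombination, plus a score that does not.
struct Hyp {
  int coverage;  // stand-in for the WordsBitmap
  int ffState;   // stand-in for the feature-function states
  double score;
};

// Ordering used only to detect equivalence; the order itself is arbitrary.
struct RecombineOrder {
  bool operator()(const Hyp *a, const Hyp *b) const {
    if (a->coverage != b->coverage) return a->coverage < b->coverage;
    return a->ffState < b->ffState;
  }
};

typedef std::set<const Hyp *, RecombineOrder> HypSet;

// Returns true if hypo was added as a new entry, false if an equivalent
// hypothesis was already present (i.e. hypo would be recombined).
bool AddOrRecombine(HypSet &stack, const Hyp *hypo) {
  assert(hypo);
  std::pair<HypSet::iterator, bool> result = stack.insert(hypo);
  return result.second;
}
```

Two hypotheses with the same coverage and state but different scores count as equivalent here, so the second insert() is rejected; choosing which of the two to keep is left out of this sketch.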
> Looking at replacing WordsBitmap's implementation with std::vector<bool>
> (less code, less memory)
WordsBitmap can be a major performance hog (e.g. scanning for first
zero) so if you can find something faster (yet still allows arbitrarily
large distortion limit) that would be great,
cheers - Barry
On 25/06/15 09:09, Jeroen Vermeulen wrote:
> Looking at replacing WordsBitmap's implementation with std::vector<bool>
> (less code, less memory) I came across this function:
>
> /** check, if two hypothesis can be recombined.
>     this is actually a sorting function that allows us to
>     keep an ordered list of hypotheses. This makes recombination
>     much quicker.
> */
> int
> Hypothesis::
> RecombineCompare(const Hypothesis &compare) const
> {
>   // -1 = this < compare
>   // +1 = this > compare
>   // 0 = this ==compare
>   int comp = m_sourceCompleted.Compare(compare.m_sourceCompleted);
>   if (comp != 0)
>     return comp;
>
>   for (unsigned i = 0; i < m_ffStates.size(); ++i) {
>     if (m_ffStates[i] == NULL || compare.m_ffStates[i] == NULL) {
>       comp = m_ffStates[i] - compare.m_ffStates[i];
>     } else {
>       comp = m_ffStates[i]->Compare(*compare.m_ffStates[i]);
>     }
>     if (comp != 0) return comp;
>   }
>
>   return 0;
> }
>
>
> My problem is with this conditional subtraction:
>
>   if (m_ffStates[i] == NULL || compare.m_ffStates[i] == NULL) {
>     comp = m_ffStates[i] - compare.m_ffStates[i];
>
> The result of that subtraction looks technically undefined to me, in
> which case _theoretically_ I could replace it with anything I liked
> including code to recite Homer in Morse code on the hard-disk light.
> But what is it meant to do in practice?
>
> The assignment to comp casts the value from std::ptrdiff_t to int. On a
> two's-complement system with 64-bit pointers, 32-bit ints, and a zero
> null pointer, the code would boil down to: take either m_ffStates[i] or
> ~m_ffStates[i] + 1 depending on which one is non-null, divide it by the
> size of FFState, drop the most-significant 32 bits, and compare the
> least-significant 32 bits to zero. Occasionally you may get a completely
> unexpected zero even though one of the pointers is non-null, but usually
> you'll get an arbitrary positive or negative result. On the other hand
> it wouldn't surprise me if the optimizer were allowed to make convenient
> assumptions about the truncation, and just re-use the sign of the
> original ptrdiff_t.
>
> Am I right in thinking this comparison is more or less arbitrary, as
> long as the result is consistent and only zero if the two pointers are
> both null? If so, would anyone mind if I made it compare just the
> nullness of the two pointers?
>
> The actual code may not end up looking like this, but I'm thinking along
> the lines of:
>
>   comp = (
>     int(m_ffStates[i] == NULL) - int(compare.m_ffStates[i] == NULL)
>   );
>
> This way, a null pointer would deterministically compare as "greater
> than" a non-null pointer.
>
>
> Jeroen
> _______________________________________________
> Moses-support mailing list
> Moses-support@mit.edu
> http://mailman.mit.edu/mailman/listinfo/moses-support
--
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.
------------------------------
Message: 2
Date: Thu, 25 Jun 2015 16:58:53 +0700
From: Jeroen Vermeulen <jtv@precisiontranslationtools.com>
Subject: Re: [Moses-support] Weird code in
Hypothesis::RecombineCompare()
To: Barry Haddow <bhaddow@staffmail.ed.ac.uk>, "moses-support@mit.edu"
<moses-support@mit.edu>
Message-ID: <558BD0DD.9040104@precisiontranslationtools.com>
Content-Type: text/plain; charset=utf-8
On 25/06/15 15:55, Barry Haddow wrote:
> From memory, I think you are correct. For recombination we only care if
> the FF states are equal or not equal, the actual order does not matter.
> The hypotheses are added to an (ordered) set when they get created,
> where the orderer uses the RecombineCompare methods in WordsBitmap and
> Hypothesis. If the insert() does not result in adding a new Hypothesis
> to the set, then it is recombined. Look at the AddPrune() method in
> HypothesisStackNormal.
What I'm seeing so far suggests that all we really need is a "less"
operator. So simpler than what we have now, but still an asymmetric
comparison.
How about I rename RecombineCompare to RecombineCompareLess, and make it
return a bool?
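
A minimal sketch of what such a bool-returning "less" could look like, assuming each state offers its own less-than test. State, StateLess and the free function RecombineCompareLess below are hypothetical simplifications, not the Moses API, and the choice to sort null before non-null is arbitrary. Unlike the pointer subtraction, this only ever compares pointers against null.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for FFState with a single integer payload.
struct State {
  int value;
  bool Less(const State &other) const { return value < other.value; }
};

// Orders two possibly-null state pointers: null sorts before non-null,
// and two nulls are equivalent (neither is less than the other).
bool StateLess(const State *a, const State *b) {
  if (a == NULL || b == NULL) return a == NULL && b != NULL;
  return a->Less(*b);
}

// Lexicographic "less" over two state vectors of equal length.
bool RecombineCompareLess(const std::vector<const State *> &lhs,
                          const std::vector<const State *> &rhs) {
  assert(lhs.size() == rhs.size());
  for (std::size_t i = 0; i < lhs.size(); ++i) {
    if (StateLess(lhs[i], rhs[i])) return true;
    if (StateLess(rhs[i], lhs[i])) return false;
  }
  return false;  // equivalent: neither side is less
}
```

Because the function returns false in both directions for equivalent inputs, it satisfies the strict weak ordering that ordered containers such as std::set require.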
> WordsBitmap can be a major performance hog (e.g. scanning for first
> zero) so if you can find something faster (yet still allows arbitrarily
> large distortion limit) that would be great,
Unfortunately it doesn't look as if gcc 4.9 specializes <algorithm> for
vector<bool>. So as it stands, a std::find() is going to be slower.
Better to go with vector<char> for now, which is essentially what the
current layout is.
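
For illustration only, the "scan for first zero" on a vector<char> layout could be written with std::find roughly as follows. Coverage, FirstGap and MarkCovered are hypothetical names, not the WordsBitmap interface, and no performance claim is made for this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical coverage vector: one char per source word, 1 = translated.
typedef std::vector<char> Coverage;

// Marks the word at position pos as translated.
void MarkCovered(Coverage &covered, std::size_t pos) {
  assert(pos < covered.size());
  covered[pos] = 1;
}

// Returns the index of the first untranslated word, or covered.size()
// if every word has been translated.
std::size_t FirstGap(const Coverage &covered) {
  Coverage::const_iterator it =
      std::find(covered.begin(), covered.end(), char(0));
  return static_cast<std::size_t>(it - covered.begin());
}
```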
Jeroen
------------------------------
Message: 3
Date: Thu, 25 Jun 2015 14:07:26 +0400
From: Hieu Hoang <hieuhoang@gmail.com>
Subject: Re: [Moses-support] Weird code in
Hypothesis::RecombineCompare()
To: Jeroen Vermeulen <jtv@precisiontranslationtools.com>
Cc: "moses-support@mit.edu" <moses-support@mit.edu>, Barry Haddow
<bhaddow@staffmail.ed.ac.uk>
Message-ID:
<CAEKMkbjcAagAFwhf66xPCiMGGduN-aCwV+JnDeUdY2tJL7muAg@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"
Hieu Hoang
Researcher
New York University, Abu Dhabi
http://www.hoang.co.uk/hieu
On 25 June 2015 at 13:58, Jeroen Vermeulen <
jtv@precisiontranslationtools.com> wrote:
> On 25/06/15 15:55, Barry Haddow wrote:
>
> > From memory, I think you are correct. For recombination we only care if
> > the FF states are equal or not equal, the actual order does not matter.
> > The hypotheses are added to an (ordered) set when they get created,
> > where the orderer uses the RecombineCompare methods in WordsBitmap and
> > Hypothesis. If the insert() does not result in adding a new Hypothesis
> > to the set, then it is recombined. Look at the AddPrune() method in
> > HypothesisStackNormal.
>
> What I'm seeing so far suggests that all we really need is a "less"
> operator. So simpler than what we have now, but still an asymmetric
> comparison.
>
that's correct. It's my fault that it compares 'more than' too. I didn't
know what was going to be needed in the beginning so it was defensive
programming. I also thought it was slightly easier to understand.
> How about I rename RecombineCompare to RecombineCompareLess, and make it
> return a bool?
>
sure
>
>
> > WordsBitmap can be a major performance hog (e.g. scanning for first
> > zero) so if you can find something faster (yet still allows arbitrarily
> > large distortion limit) that would be great,
>
> Unfortunately it doesn't look as if gcc 4.9 specializes <algorithm> for
> vector<bool>. So as it stands, a std::find() is going to be slower.
> Better to go with vector<char> for now, which is essentially what the
> current layout is.
>
>
> Jeroen
------------------------------
Message: 4
Date: Thu, 25 Jun 2015 16:05:26 +0100
From: Guchun Zhang <gzhang@alphacrc.com>
Subject: [Moses-support] Fwd: A small typo in Moses manual
To: "moses-support@MIT.EDU" <Moses-support@mit.edu>
Message-ID:
<CA+cfSV+LtJaSFUt7o1RD_GBdRaATdxDnAGa_tMpC0jJJ00Yh0Q@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"
Hi there,
In Section 5.13.7 NPLM on Page 267, the option "--words_file" passed to
prepareNeuralLM expects an existing file containing the words to be added
in the vocabulary. Considering the line right below the command, I guess
what you really mean is "--write_words_file".
Also, "--n_vocab" has been replaced by "--vocab_size".
Regards,
Guchun
------------------------------
Message: 5
Date: Thu, 25 Jun 2015 16:19:18 +0100
From: Rico Sennrich <rico.sennrich@gmx.ch>
Subject: Re: [Moses-support] Fwd: A small typo in Moses manual
To: moses-support@mit.edu
Message-ID: <558C1BF6.7000902@gmx.ch>
Content-Type: text/plain; charset="utf-8"
Hi Guchun.
thanks - fixed.
best wishes,
Rico
On 25.06.2015 16:05, Guchun Zhang wrote:
> Hi there,
>
> In Section 5.13.7 NPLM on Page 267, the option "--words_file" passed
> to prepareNeuralLM expects an existing file containing the words to be
> added in the vocabulary. Considering the line right below the command,
> I guess what you really mean is "--write_words_file".
>
> Also, "--n_vocab" has been replaced by "--vocab_size".
>
> Regards,
> Guchun
------------------------------
_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support
End of Moses-support Digest, Vol 104, Issue 88
**********************************************