Send Moses-support mailing list submissions to
moses-support@mit.edu
To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
moses-support-request@mit.edu
You can reach the person managing the list at
moses-support-owner@mit.edu
When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."
Today's Topics:
1. Re: Getting counts in Moses instead of probabilities (Hieu Hoang)
2. Re: Getting counts in Moses instead of probabilities
(Harshit Gupta)
----------------------------------------------------------------------
Message: 1
Date: Thu, 9 Jul 2015 14:43:31 +0400
From: Hieu Hoang <hieuhoang@gmail.com>
Subject: Re: [Moses-support] Getting counts in Moses instead of
probabilities
To: Harshit Gupta <harshitgupta165@gmail.com>
Cc: moses-support@mit.edu
Message-ID: <559E5053.8040909@gmail.com>
Content-Type: text/plain; charset="utf-8"
On 09/07/2015 14:19, Harshit Gupta wrote:
> Hi Hieu, thanks for the reply. However, I have some further questions
> about this.
> By the count of a phrase, I mean how many times the phrase occurs in
> the corpora. Can I get these counts from the cpp source file you
> mentioned?
> Also, in the phrase tables, the first four columns are the lexical
> weights and phrase translation probabilities, followed by the
> alignments between the source and target language. Is it possible to
> get the counts of the phrases from there as well?
Yes, the next column (after the alignments) holds the counts. In your png
file, the column '1 3 1' holds the counts for the 1st translation rule.
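
As a rough illustration (not code from Moses itself), the counts column can be pulled out of a phrase-table line by splitting on the '|||' delimiter. The function name parse_phrase_table_line is made up for this sketch, and the sample line is the second 'be' entry quoted in Message 2 of this digest, whose target side shows up as '??' in the archive.

```python
def parse_phrase_table_line(line):
    """Split one phrase-table line on the '|||' delimiter (hypothetical helper)."""
    fields = [field.strip() for field in line.split("|||")]
    return {
        "source": fields[0],
        "target": fields[1],
        "scores": [float(x) for x in fields[2].split()],
        "alignment": fields[3],
        # 5th field: the counts column discussed in this thread.
        # Parsed as floats on the assumption that counts need not be integers.
        "counts": [float(x) for x in fields[4].split()],
    }


# Sample line copied from Message 2 of this digest.
line = "be ||| ?? ||| 0.0606061 0.1 0.285714 0.375 ||| 0-0 ||| 33 7 2 ||| |||"
entry = parse_phrase_table_line(line)
print(entry["counts"])
```

The sketch assumes every line has at least five '|||'-separated fields; a phrase table written without the counts column would need a length check first.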
>
> Regards
> Harshit
>
> On Thu, Jul 9, 2015 at 1:29 PM, Hieu Hoang <hieuhoang@gmail.com
> <mailto:hieuhoang@gmail.com>> wrote:
>
> The counts are written in the 5th column in the phrase table.
> http://www.statmt.org/moses/?n=FactoredTraining.ScorePhrases
> These are for debugging purposes only; they don't influence decoding
> in any way.
>
> If you want to know more about how it works: the counts are
> stored in the files extract.*.sorted.gz and
> extract.*.inv.sorted.gz. The counts are summed and the probability
> is calculated by the score program. The source code for the score
> program is in
> phrase-extract/score-main.cpp
>
>
> On 08/07/2015 18:05, Harshit Gupta wrote:
>> Hi, I am currently working with Moses, and in the phrase tables I am
>> interested in the counts of phrases rather than the phrase
>> translation probabilities. Is there a way to get these counts?
>> The Moses manual, in the part of the training process that
>> calculates phrase scores, says:
>> "To estimate the phrase translation probability φ(e|f) we proceed
>> as follows: First, the extract file is sorted. This ensures that
>> all English phrase translations for an foreign phrase are next to
>> each other in the file. Thus, we can process the file, one
>> foreign phrase at a time, *collect counts* and compute φ(e|f) for
>> that foreign phrase f."
>>
>> Where are these counts collected? Where can I get them?
>>
>> Regards
>> Harshit
>>
>> --
>> Harshit Gupta
>> Third Year Undergraduate
>> Electrical Engineering
>> IIT Madras
>>
>>
>> _______________________________________________
>> Moses-support mailing list
>> Moses-support@mit.edu <mailto:Moses-support@mit.edu>
>> http://mailman.mit.edu/mailman/listinfo/moses-support
>
> --
> Hieu Hoang
> Researcher
> New York University, Abu Dhabi
> http://www.hoang.co.uk/hieu
>
>
>
>
> --
> Harshit Gupta
> Third Year Undergraduate
> Electrical Engineering
> IIT Madras
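
The procedure quoted from the manual above (sort the extracted phrase pairs, then collect counts one foreign phrase at a time) can be sketched roughly as follows. This is a simplified illustration, not the implementation in phrase-extract/score-main.cpp: it assumes the input is already a list of (foreign, english) phrase pairs, one per extracted occurrence, and it ignores alignments, lexical weights and any smoothing. The function name and the toy data are made up for the example.

```python
from collections import Counter
from itertools import groupby


def estimate_phrase_probs(extracted_pairs):
    """Relative-frequency estimate of phi(e|f) from (foreign, english) pairs."""
    table = {}
    # Sorting puts all translations of one foreign phrase next to each other.
    pairs = sorted(extracted_pairs)
    for foreign, group in groupby(pairs, key=lambda pair: pair[0]):
        # Collect counts for this foreign phrase only.
        joint_counts = Counter(english for _, english in group)
        total = sum(joint_counts.values())
        for english, count in joint_counts.items():
            # Store (joint count, foreign phrase count, phi(e|f)).
            table[(foreign, english)] = (count, total, count / total)
    return table


# Toy data (hypothetical, not from a real corpus).
pairs = [("haus", "house"), ("haus", "home"), ("haus", "house"), ("das", "the")]
table = estimate_phrase_probs(pairs)
for key, value in sorted(table.items()):
    print(key, value)
```

Because the pairs are sorted, only the counts for the current foreign phrase are held in memory at any time, which is the point of the sorting step in the quoted passage.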
--
Hieu Hoang
Researcher
New York University, Abu Dhabi
http://www.hoang.co.uk/hieu
------------------------------
Message: 2
Date: Thu, 9 Jul 2015 17:19:49 +0530
From: Harshit Gupta <harshitgupta165@gmail.com>
Subject: Re: [Moses-support] Getting counts in Moses instead of
probabilities
To: Hieu Hoang <hieuhoang@gmail.com>
Cc: moses-support@mit.edu
Message-ID:
<CAHgj_vvR1UXCJgaf2r58mK5WVqwouXgXf46c5XFPEZCPJrDeKQ@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"
Hi Hieu, sorry, but I didn't understand exactly what the counts mean. As an
example, here are a few lines from my png file that share the same English
phrase (be):
be ||| ?? ??? ?? ||| 1 0.1 0.142857 2.63535e-05 ||| 0-2 ||| 1 7 1 ||| |||
be ||| ?? ||| 0.0606061 0.1 0.285714 0.375 ||| 0-0 ||| 33 7 2 ||| |||
be ||| ??? ?? ||| 1 0.0238095 0.142857 0.00162337 ||| 0-1 ||| 1 7 1 ||| |||
be ||| ???? ??? ?? ||| 1 0.0238095 0.142857 1.40552e-05 ||| 0-2 ||| 1 7 1 ||| |||
be ||| ??? ?? ||| 1 0.1 0.142857 0.000811687 ||| 0-1 ||| 1 7 1 ||| |||
be ||| ?? ||| 0.0196078 0.0238095 0.142857 0.125 ||| 0-0 ||| 51 7 1 ||| |||
The column after the alignment column shows the counts. Why are these counts
different for the same English phrase? And what do the three numbers
'1 7 1', '51 7 1' or '33 7 2' represent? Do they represent the number of
times the source/target phrase occurs in the corpora, or are they
calculated using some rule/function in Moses?
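
One way to probe what the three numbers mean is to compare them with the scores on the same line. The sketch below assumes the counts are ordered as target-phrase count, source-phrase count, joint count; that ordering is an assumption, checked here only against the six lines above. Under it, the 1st score should equal joint/target and the 3rd score should equal joint/source. The function name is made up for the example.

```python
import math

# (scores, counts) copied from the six "be" lines above.
rows = [
    ((1, 0.1, 0.142857, 2.63535e-05), (1, 7, 1)),
    ((0.0606061, 0.1, 0.285714, 0.375), (33, 7, 2)),
    ((1, 0.0238095, 0.142857, 0.00162337), (1, 7, 1)),
    ((1, 0.0238095, 0.142857, 1.40552e-05), (1, 7, 1)),
    ((1, 0.1, 0.142857, 0.000811687), (1, 7, 1)),
    ((0.0196078, 0.0238095, 0.142857, 0.125), (51, 7, 1)),
]


def counts_match_scores(scores, counts, rel_tol=1e-4):
    """Check two scores against ratios of the counts (assumed ordering)."""
    target_count, source_count, joint_count = counts
    inverse_prob = joint_count / target_count  # compared with the 1st score
    direct_prob = joint_count / source_count   # compared with the 3rd score
    return (math.isclose(scores[0], inverse_prob, rel_tol=rel_tol)
            and math.isclose(scores[2], direct_prob, rel_tol=rel_tol))


results = [counts_match_scores(scores, counts) for scores, counts in rows]
print(results)

# The joint counts of these six lines add up to the shared source count.
print(sum(counts[2] for _, counts in rows))
```

If the assumption holds, the first number differs from line to line because it counts the target phrase of that particular line, while the 7 is shared because every line has the same source phrase 'be'. For example, 2/33 = 0.0606061 and 2/7 = 0.285714 on the second line.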
Thanks
Regards
Harshit
--
Harshit Gupta
Third Year Undergraduate
Electrical Engineering
IIT Madras
------------------------------
End of Moses-support Digest, Vol 105, Issue 19
**********************************************