Send Moses-support mailing list submissions to
moses-support@mit.edu
To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
moses-support-request@mit.edu
You can reach the person managing the list at
moses-support-owner@mit.edu
When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."
Today's Topics:
1. Re: mismatch in alignment file when using
-alignment-output-file (Hieu Hoang)
----------------------------------------------------------------------
Message: 1
Date: Mon, 2 Nov 2015 21:58:07 +0000
From: Hieu Hoang <hieuhoang@gmail.com>
Subject: Re: [Moses-support] mismatch in alignment file when using
-alignment-output-file
To: Arefeh Kazemi <kazemina@dcu.ie>, moses-support@mit.edu
Message-ID: <5637DC6F.1090206@gmail.com>
Content-Type: text/plain; charset="utf-8"
On 02/11/2015 19:12, Arefeh Kazemi wrote:
> Hi Hieu
>
> On 2 November 2015 at 21:37, Hieu Hoang <hieuhoang@gmail.com
> <mailto:hieuhoang@gmail.com>> wrote:
>
> ok. I can't find the input sentence you gave me as an example
>
> However, I think I know the issue. It's not a bug, but we haven't
> made clear that input and output sentences in the phrase-based and
> chart decoders are slightly different.
>
>
> In the chart decoder, there are implied <s> and </s> at the
> beginning and end of each input and output sentence. They are
> not displayed, but the alignment still refers to them. So the
> input sentence
> "darya alexandrovna went alone to her room ."
> has 8 words, but in the decoder, it's actually
> "<s> darya alexandrovna went alone to her room . </s>"
> which has 10 tokens, indexed 0 to 9.
>
> does that explain your problem?
>
>
> I think so! Thanks!
> I noticed that for all input sentences the alignment covers
> sentence length + 2 positions, so the highest index is length + 1.
> If I change the alignment file and remove every alignment point with 0
> or (source sentence's length + 1 on the left or target sentence's
> length + 1 on the right), and shift the remaining indices down by one,
> the problem will be solved, right?
yep
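
A minimal Python sketch of that clean-up, assuming the alignment file holds one line per sentence of space-separated `src-tgt` pairs, with index 0 standing for the implied <s> and index length + 1 for the implied </s>. The function name `strip_boundary_alignments` is hypothetical and not part of Moses, and the target length of 7 is only inferred from the highest target index in the example (8), since the translated sentence itself is not shown in this thread.

```python
def strip_boundary_alignments(align_line, src_len, tgt_len):
    """Drop alignment points that touch the implied <s>/</s> and
    shift the rest down by one so they index the visible words.

    Assumes index 0 is <s> and index len + 1 is </s> on each side.
    """
    kept = []
    for point in align_line.split():
        src, tgt = (int(i) for i in point.split("-"))
        # Skip points that sit on either sentence-boundary marker.
        if src in (0, src_len + 1) or tgt in (0, tgt_len + 1):
            continue
        kept.append(f"{src - 1}-{tgt - 1}")
    return " ".join(kept)


source = "darya alexandrovna went alone to her room ."
raw = "0-0 1-1 2-1 3-6 4-3 5-2 6-5 7-4 8-7 9-8"
# Target length 7 is an assumption inferred from the example alignment.
print(strip_boundary_alignments(raw, len(source.split()), 7))
```

After this step every source index falls in the range 0 to 7, matching the 8 visible words of the input sentence.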
>
> and another question:
> for using reordering evaluation metrics such as kendall or hamming, we
> need aligned dev file. Have you considered this type of alignment for
> that too? Should I change the input alignment file for tuning with
> these metrics?
I don't think the metrics have been developed with hiero models in mind.
You may need to add an extra argument to let them know the alignments
need to be shifted.
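
To illustrate what the shift would mean in practice, here is a sketch in Python that removes the boundary points, shifts the indices by one, and then computes a simple Kendall's tau style distance. This is a textbook pairwise-inversion count written for illustration, not the reordering scorer shipped with Moses. Both function names are hypothetical, and the alignment format (space-separated `src-tgt` pairs with <s> at index 0) is assumed from the example in this thread.

```python
from itertools import combinations


def shifted_points(align_line, src_len, tgt_len):
    """Parse 'src-tgt' pairs, drop <s>/</s> points, shift indices by -1."""
    points = []
    for point in align_line.split():
        src, tgt = (int(i) for i in point.split("-"))
        # Keep only points that fall on visible words (1..len on each side).
        if 1 <= src <= src_len and 1 <= tgt <= tgt_len:
            points.append((src - 1, tgt - 1))
    return points


def kendall_tau_distance(points):
    """Fraction of source-ordered pairs whose target order is inverted.

    Tied target positions are counted as not inverted.
    """
    targets = [tgt for _, tgt in sorted(points)]
    pairs = list(combinations(targets, 2))
    if not pairs:
        return 0.0
    discordant = sum(1 for a, b in pairs if a > b)
    return discordant / len(pairs)


raw = "0-0 1-1 2-1 3-6 4-3 5-2 6-5 7-4 8-7 9-8"
# Lengths 8 and 7 are assumptions taken from the example alignment.
print(kendall_tau_distance(shifted_points(raw, 8, 7)))
```

Doing the shift as a preprocessing step keeps the metric code unaware of the decoder type. An extra argument to the metric, as suggested above, would achieve the same thing inside the scorer.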
>
>
> Thanks
> Arefeh
>
>
>
> On 02/11/2015 17:19, Arefeh Kazemi wrote:
>>
>> Hi Hieu
>>
>>
>> This is the command:
>>
>> (I've not tuned the system, so I use the initial moses.ini file)
>>
>> the moses.ini is attached.
>>
>>
>> nohup nice ~/mosesdecoder/bin/moses_chart \
>>   -f ~/moses.ini -alignment-output-file ~/align.txt \
>>   < ~/toyCorpus/mizan-test-toy.en \
>>   > ~/mizan-translated.fa \
>>   2> ~/mizan-test.out
>>
>> ~/mosesdecoder/scripts/generic/multi-bleu.perl \
>>   -lc ~/toyCorpus/mizan-test-toy.fa \
>>   < ~/mizan-translated.fa
>>
>>
>> Thanks again
>>
>> Arefeh
>>
>>
>> On 2 November 2015 at 20:07, Hieu Hoang <hieuhoang@gmail.com
>> <mailto:hieuhoang@gmail.com>> wrote:
>>
>> err. What exactly do I run to reproduce the problem? What is
>> the input? Which ini file? I don't need the extract file or
>> the corpus
>>
>>
>> On 01/11/2015 11:41, Arefeh Kazemi wrote:
>>> Hi Hieu
>>> Thanks for the reply.
>>> My original files are very large, so I have attached a toy model
>>> for which the mismatch also happens.
>>>
>>> Thanks again.
>>> Arefeh
>>>
>>> extract.inv.sorted.gz
>>> <https://drive.google.com/a/dcu.ie/file/d/0B37sY2C6IhcjLVp6X2ZQNm5TUHM/view?usp=drive_web>
>>> extract.sorted.gz
>>> <https://drive.google.com/a/dcu.ie/file/d/0B37sY2C6IhcjaHF6Nms5dEtJWFE/view?usp=drive_web>
>>> other.zip
>>> <https://drive.google.com/a/dcu.ie/file/d/0B37sY2C6IhcjdFh2NkhIaVFEQVU/view?usp=drive_web>
>>> toyCorpus.zip
>>> <https://drive.google.com/a/dcu.ie/file/d/0B37sY2C6IhcjS0dSSnA2NjVBTnc/view?usp=drive_web>
>>> toyLM.zip
>>> <https://drive.google.com/a/dcu.ie/file/d/0B37sY2C6IhcjM2xiVkVYaDY3ZFk/view?usp=drive_web>
>>>
>>> On 1 November 2015 at 01:10, Hieu Hoang <hieuhoang@gmail.com
>>> <mailto:hieuhoang@gmail.com>> wrote:
>>>
>>> that should never happen. Can you please make available
>>> the model and input files for download so I can check it
>>>
>>>
>>> On 31/10/2015 10:30, Arefeh Kazemi wrote:
>>>> Hi
>>>>
>>>> I needed the word alignment between the source and the
>>>> output translation and I used -alignment-output-file
>>>> parameter. It gives me an alignment file but there are
>>>> some mismatches between the source sentences' length
>>>> and the alignment so that the highest index in the
>>>> alignment is greater than the sentence length.
>>>> for example, for the source sentence
>>>> "darya alexandrovna went alone to her room ."
>>>> the alignment is :
>>>> 0-0 1-1 2-1 3-6 4-3 5-2 6-5 7-4 8-7 9-8
>>>>
>>>> I checked the sentences but there is no strange string
>>>> in them.
>>>>
>>>> Does anyone know why this happens?!
>>>>
>>>> Regards
>>>> Arefeh
>>>>
>>>> /
>>>>
>>>> *Email Disclaimer*
>>>>
>>>> /"This e-mail and any files transmitted with it are
>>>> confidential and are intended solely for use by the
>>>> addressee. Any unauthorised dissemination, distribution
>>>> or copying of this message and any attachments is
>>>> strictly prohibited. If you have received this e-mail
>>>> in error, please notify the sender and delete the
>>>> message. Any views or opinions presented in this e-mail
>>>> may solely be the views of the author and cannot be
>>>> relied upon as being those of Dublin City University.
>>>> E-mail communications such as this cannot be guaranteed
>>>> to be virus-free, timely, secure or error-free and
>>>> Dublin City University does not accept liability for
>>>> any such matters or their consequences. Please
>>>> consider the environment before printing this e-mail."/
>>>>
>>>> /
>>>>
>>>>
>>>> _______________________________________________
>>>> Moses-support mailing list
>>>> Moses-support@mit.edu <mailto:Moses-support@mit.edu>
>>>> http://mailman.mit.edu/mailman/listinfo/moses-support
>>>
>>> --
>>> Hieu Hoang
>>> http://www.hoang.co.uk/hieu
>>>
>>>
>>>
>>
>> --
>> Hieu Hoang
>> http://www.hoang.co.uk/hieu
>>
>>
>>
>
> --
> Hieu Hoang
> http://www.hoang.co.uk/hieu
>
>
>
--
Hieu Hoang
http://www.hoang.co.uk/hieu
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mailman.mit.edu/mailman/private/moses-support/attachments/20151102/fb158017/attachment.html
------------------------------
_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support
End of Moses-support Digest, Vol 109, Issue 8
*********************************************