Send Moses-support mailing list submissions to
moses-support@mit.edu
To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
moses-support-request@mit.edu
You can reach the person managing the list at
moses-support-owner@mit.edu
When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."
Today's Topics:
1. Re: MT Marathon 2010 page hacked. (liling tan)
2. Re: Factored instead of Phrase-based Model? (Ondrej Bojar)
----------------------------------------------------------------------
Message: 1
Date: Wed, 6 Jan 2016 19:36:31 +0100
From: liling tan <alvations@gmail.com>
Subject: Re: [Moses-support] MT Marathon 2010 page hacked.
To: Ventsislav Zhechev <ventsislavzhechev@me.com>
Cc: moses-support <moses-support@mit.edu>
Message-ID:
<CAKzPaJ+QMacgOQOtmxwuwPs2AirA81BeKQEH6u38RPR44iErJQ@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"
Phew, then it must be on my side. I was checking google cache and it's
having similar problems:
http://webcache.googleusercontent.com/search?q=cache:MSlNXwPXIsQJ:www.mtmarathon2010.info/web/Program_files/art-tyers-et-al.pdf+&cd=1&hl=en&ct=clnk&gl=uk
Regards,
Liling
On Wed, Jan 6, 2016 at 7:34 PM, Ventsislav Zhechev <ventsislavzhechev@me.com> wrote:
> Hi Liling,
>
> The problem seems to be somewhere on your side, i.e. your computer or
> network. The MT Marathon 2010 page works fine for me and is retrieving the
> correct documents both on my company's network and via my phone's network.
>
> Can anyone else confirm getting fraudulent content from the
> http://mtmarathon2010.info website?
>
>
> Cheers,
>
> Ventzi
>
> *Dr. Ventsislav Zhechev*
> Computational Linguist, Certified ScrumMaster
>
> *http://VentsislavZhechev.eu <http://VentsislavZhechev.eu>*
>
>
> On 6.01.2016, at 10:26, liling tan <alvations@gmail.com> wrote:
>
> Dear Moses dev and MT Marathon organizer,
>
> Whoops, I might have been mistaken. I'm not sure what happened but there's
> also this page:
> http://www.mtmarathon2010.info/web/Program_files/art-tyers-et-al.pdf that
> leads to the screen shot.
>
>
> Regards,
> Liling
>
> On Wed, Jan 6, 2016 at 7:05 PM, Ventsislav Zhechev <contact@ventsislavzhechev.eu> wrote:
>
>> Hi Liling,
>> I just got to the office and checked the page. I'm still looking after
>> hosting the page and can confirm that http://mtmarathon2010.info has not
>> been hacked.
>>
>> The presence of Russian text on the page is a side effect of the tool I
>> used to build the website at the time and most probably got switched from
>> English to Russian when I had to switch hosts around 2012. If I get a
>> chance, I'll try to find some time to fix that next week.
>>
>>
>> As for baiting users, I'm not quite sure what you mean by that. The link
>> you provide is a genuine link to lecture material from the MT Marathon.
>>
>>
>> Cheers,
>>
>> Ventzi
>>
>> *Dr. Ventsislav Zhechev*
>> Computational Linguist, Certified ScrumMaster
>>
>> *http://VentsislavZhechev.eu <http://ventsislavzhechev.eu/>*
>>
>>
>> On 6.01.2016, at 9:23, Philipp Koehn <phi@jhu.edu> wrote:
>>
>> Hi,
>>
>> I did not find anything hacked about the page, but it is maintained
>> by Ventsislav Zhechev.
>>
>> -phi
>>
>> On Wed, Jan 6, 2016 at 8:07 AM, liling tan <alvations@gmail.com> wrote:
>>
>>> Dear Moses / MT Marathon organizers,
>>>
>>> I'm not sure whether this is the right place to report this.
>>>
>>> I was trying to retrieve a page from MT Marathon 2010 and it seems like
>>> a Russian hacker hacked the page and took over it:
>>> http://www.mtmarathon2010.info/ (see the lower right corner).
>>>
>>> And it's using the high google pagerank index to bait people onto pages
>>> like: http://www.mtmarathon2010.info/web/Program_files/survey.pdf
>>>
>>> Regards,
>>> Liling
>>>
>>>
>>> _______________________________________________
>>> Moses-support mailing list
>>> Moses-support@mit.edu
>>> http://mailman.mit.edu/mailman/listinfo/moses-support
>>>
>>>
>>
>>
> <Screenshot from 2016-01-06 19:25:37.png>
>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mailman.mit.edu/mailman/private/moses-support/attachments/20160106/e59439cc/attachment-0001.html
------------------------------
Message: 2
Date: Thu, 7 Jan 2016 09:33:15 +0100 (CET)
From: Ondrej Bojar <bojar@ufal.mff.cuni.cz>
Subject: Re: [Moses-support] Factored instead of Phrase-based Model?
To: Shaimaa Marzouk <marzouk_s@yahoo.de>
Cc: moses-support@mit.edu
Message-ID:
<1255409735.2292.1452155595111.JavaMail.zimbra@ufal.mff.cuni.cz>
Content-Type: text/plain; charset="utf-8"
Dear Shaimaa,
I don't understand which files are supposed to form the word-aligned sentence-parallel training corpus. I expected three files with the exact same number of lines, but the alignment file has only 4 lines while the corpora have 29 lines:
4 aligned.grow-diag-final
29 car-ready2016-2.de
29 car-ready2016-2.en
45 phrase-table.gz
37 verbose.docx
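The mismatch above (4 alignment lines against 29 corpus lines) is the kind of thing a quick script can catch before training. A minimal sketch in Python, assuming plain-text files with one sentence per line; the function names `count_lines` and `check_parallel` are made up for illustration and are not part of Moses:

```python
def count_lines(path):
    """Return the number of lines in a file (read as bytes, so encoding does not matter)."""
    with open(path, "rb") as handle:
        return sum(1 for _ in handle)


def check_parallel(paths):
    """Return (counts, ok): line count per file, and whether all counts agree.

    A word-aligned, sentence-parallel corpus needs the same number of lines
    in the source file, the target file and the alignment file.
    """
    counts = {str(path): count_lines(path) for path in paths}
    ok = len(set(counts.values())) == 1
    return counts, ok
```

Calling `check_parallel(["car-ready2016-2.en", "car-ready2016-2.de", "aligned.grow-diag-final"])` on the files listed above would be expected to return `ok == False`, since the counts are 29, 29 and 4.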
I'm attaching my script to visually present alignments like this:
this   * - - - -
car    - * - - -
was    - - * - -
stolen - - - * -
.      - - - - *
       dieses  .
         auto
           wurde
             gestohlen
To get the output, use:
paste car-ready2016-2.en car-ready2016-2.de aligned.grow-diag-final | alitextview.pl | less
(I might have swapped the languages.)
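For readers without the attached script, the idea behind such a view can be sketched in a few lines of Python. This is not alitextview.pl; it is a rough illustration that assumes the alignment line uses space-separated `i-j` index pairs, that the first index refers to the row tokens, and that alignment points are drawn as `*`. The names `parse_alignment` and `render_grid` are hypothetical.

```python
def parse_alignment(line):
    """Parse space-separated 'i-j' pairs into a set of (row, column) tuples."""
    points = set()
    for pair in line.split():
        row, col = pair.split("-")
        points.add((int(row), int(col)))
    return points


def render_grid(row_tokens, col_tokens, alignment_line):
    """Render a text grid: one row per row token, '*' where a pair is aligned."""
    points = parse_alignment(alignment_line)
    width = max((len(tok) for tok in row_tokens), default=0)
    lines = []
    for i, tok in enumerate(row_tokens):
        cells = " ".join(
            "*" if (i, j) in points else "-" for j in range(len(col_tokens))
        )
        lines.append(f"{tok.ljust(width)} {cells}")
    # Column labels: one per line, each indented to start under its own column.
    for j, tok in enumerate(col_tokens):
        lines.append(" " * (width + 1 + 2 * j) + tok)
    return "\n".join(lines)
```

If the two languages turn out to be swapped in the alignment file, exchanging `row_tokens` and `col_tokens` in the call is enough to flip the view.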
If your training corpus now consists of the files car-ready2016-2.en, car-ready2016-2.de and aligned.grow-diag-final, then obviously "ich verkaufe" cannot be translated: it is never aligned to anything in the training data.
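Whether a source phrase such as "ich verkaufe" made it into the phrase table at all can also be checked directly. A minimal sketch in Python, assuming a gzipped text phrase table whose fields are separated by `|||` with the source phrase first and the target phrase second; the helper names `find_source_phrase` and `search_phrase_table` are made up for illustration:

```python
import gzip


def find_source_phrase(lines, phrase):
    """Yield (source, target) for entries whose source side equals `phrase`.

    Matching is exact on whitespace-separated tokens and case-sensitive.
    """
    wanted = phrase.split()
    for line in lines:
        fields = [field.strip() for field in line.split("|||")]
        if len(fields) >= 2 and fields[0].split() == wanted:
            yield fields[0], fields[1]


def search_phrase_table(path, phrase):
    """Open a gzipped phrase table and return all matches as a list."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        return list(find_source_phrase(handle, phrase))
```

An empty result for `search_phrase_table("phrase-table.gz", "ich verkaufe")` would agree with the diagnosis above that the phrase was never extracted.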
Cheers, O.
----- Original Message -----
> From: "Shaimaa Marzouk" <marzouk_s@yahoo.de>
> To: Moses-support@mit.edu, "Ondrej Bojar" <bojar@ufal.mff.cuni.cz>
> Sent: Wednesday, 6 January, 2016 18:27:02
> Subject: Re: [Moses-support] Factored instead of Phrase-based Model?
> Dear Ondrej & Moses-Team,
>
> @Ondrej: thanks a lot for your quick feedback.
>
> The phrase "ich habe" does not appear in the phrase table. The word alignment
> file includes only the first 4 sentences of the training data.
>
> I have put the sentence (ich habe das auto verkauft) in a separate "in"
> file, but got the same result. I also tried another sentence (ich verkaufe das
> auto); here, too, "ich verkaufe" cannot be translated. I repeated the exact
> sentence (ich verkaufe das auto) many times in the training data and still get
> the same result.
> I attach the word alignment, phrase table, training data and verbose result,
> and would be very grateful to receive any tip.
>
> I would also highly appreciate it if you could let me know where I can find
> information about
> 1. how to prepare the training data with additional factors, before training
> the Factored Model?
> 2. how to train the Language Model that considers the POS?
>
> I think that sooner or later, the sentences will get more complex and I would
> need to work with a Factored Model.
>
>
> Many Thanks
> Shaimaa
>
>
>
>
>
>
> --------------------------------------------
> Ondrej Bojar <bojar@ufal.mff.cuni.cz> wrote on Wed, 6.1.2016:
>
> Subject: Re: [Moses-support] Factored instead of Phrase-based Model?
> To: "Shaimaa Marzouk" <marzouk_s@yahoo.de>, "Shaimaa Marzouk"
> <marzouk_s@yahoo.de>, Moses-support@mit.edu
> Date: Wednesday, 6 January 2016, 08:42
>
> Dear Shaimaa,
>
> Adding factors can only increase any out-of-vocabulary issues.
>
> Use -v (perhaps even a higher verbosity level) in moses to see all the
> translation options that are considered for the problematic sentence. There
> could be some unfortunate weight settings that for some reason prefer
> identity translation. (The identity translation must however appear in the
> data, or the source word must not appear in the data, otherwise Moses would
> not produce identity translation at all.)
>
> And then go back to the phrase table and manually search for the lines that
> are supposed to cover the missing words. Here you may find the identity
> entries.
>
> And then go back and check the word alignment this (test) sentence got in the
> training data. There are most likely some issues with the alignment that
> prevented proper translations from being extracted.
>
> Best, Ondrej.
>
>
> On January 6, 2016 4:48:26 AM CET, Shaimaa Marzouk <marzouk_s@yahoo.de> wrote:
> >Dear Moses-Team,
> >
> >I am trying to translate two short sentences included in the same file
> >from German into English using a "Phrase-based Model". The first
> >sentence (das auto wurde verkauft) is translated correctly, while the
> >second is only partly translated:
> >
> >I receive as a result for "ich habe das auto verkauft"
> >Ich|UNK|UNK|UNK habe|UNK|UNK|UNK the car sold [11111]
> >[total=-203.330] core=(-200.000, -5.000, 5.000, 0.000, 0.000, 0.000,
> >0.000, 0.000, -18.660)
> >
> >I tried to modify the training data in different ways, and at last
> >included the exact sentence (along with its translation) in the
> >training data (see attachment). But I still get the same result.
> >
> >Do I need to use a "Factored Translation Model" instead of the
> >"Phrase-based Model" to be able to translate this sentence? If yes, I
> >found an explanation of how to train Factored Models here:
> >http://www.statmt.org/moses/?n=Moses.FactoredTutorial
> >Could you please tell me where I can find information about
> >1. how to prepare the training data with additional factors, before
> >training the Factored Model?
> >2. how to train the Language Model that considers the POS?
> >
> >I currently use KenLM and Giza++.
> >
> >Thanks a lot for your support.
> >
> >Kind regards,
> >Shaimaa
> >
>
> --
> Ondrej Bojar (mailto:obo@cuni.cz / bojar@ufal.mff.cuni.cz)
> http://www.cuni.cz/~obo
--
Ondrej Bojar (mailto:obo@cuni.cz / bojar@ufal.mff.cuni.cz)
http://www.cuni.cz/~obo
-------------- next part --------------
A non-text attachment was scrubbed...
Name: alitextview.pl
Type: application/x-perl
Size: 3754 bytes
Desc: not available
Url : http://mailman.mit.edu/mailman/private/moses-support/attachments/20160107/44967551/attachment.bin
------------------------------
_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support
End of Moses-support Digest, Vol 111, Issue 6
*********************************************