Moses-support Digest, Vol 86, Issue 29

Send Moses-support mailing list submissions to
moses-support@mit.edu

To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
moses-support-request@mit.edu

You can reach the person managing the list at
moses-support-owner@mit.edu

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."

Today's Topics:

1. Re: using Moses in Monolingual dialogue setting (Read, James C)

----------------------------------------------------------------------

Message: 1
Date: Mon, 9 Dec 2013 17:14:17 +0000
From: "Read, James C" <jcread@essex.ac.uk>
Subject: Re: [Moses-support] using Moses in Monolingual dialogue
setting
To: Kevin Gimpel <kgimpel@cs.cmu.edu>
Cc: "moses-support@mit.edu" <moses-support@mit.edu>
Message-ID:
<F00840E41983C645928E21E3C35F4EB1012CF80520@mbx1-node2.essex.ac.uk>
Content-Type: text/plain; charset="iso-8859-1"

I'm guessing he wants to make a conversational agent that produces a most likely response based on the stimulus.

In any case, the distinction between 1 and 2 is probably redundant if GIZA++ is being used to train in both directions. The two phrase tables could be merged, I guess. The advantage of 2 over 1 is that you don't need to worry about the merging logic, at the cost of more training time.

I'm not sure I understand the question of A1~B3. Unless I'm reading his question wrong, I don't see how this could happen.

I suppose my main concern would be the inordinate amount of training data you would need to get something useful up and running.

James

________________________________
From: kgimpel@gmail.com [kgimpel@gmail.com] on behalf of Kevin Gimpel [kgimpel@cs.cmu.edu]
Sent: 09 December 2013 15:17
To: Read, James C
Cc: Andrew; moses-support@mit.edu
Subject: Re: [Moses-support] using Moses in Monolingual dialogue setting

Hi Andrew, it's an interesting idea. I would guess that it would depend on what the data look like. If the A's and B's are of fundamentally different types (e.g., they are utterances in an automatic dialogue system, where A's are always questions and B's are always responses), then approach 2 seems a bit odd, as it will conflate A's and B's utterances. However, if the A's and B's are just part of a conversation, e.g., in IM chats, then they are of the same "type" and approach 2 would make sense. In fact, I think approach 2 would make more sense than approach 1 in that case. It also of course depends on how you want to use the resulting translation system.
Kevin

On Mon, Dec 9, 2013 at 5:18 AM, Read, James C <jcread@essex.ac.uk> wrote:
Are you trying to figure out the probability of a response given a stimulus?

Given that GIZA++ aligns words and makes heavy use of co-occurrence statistics, I doubt this is likely to produce very fruitful results. How big is your data set?

Give it a whirl and see what happens. I would be interested to hear what comes of it.

James

________________________________
From: moses-support-bounces@mit.edu [moses-support-bounces@mit.edu] on behalf of Andrew [ravenyj@hotmail.com]
Sent: 08 December 2013 20:10
To: moses-support@mit.edu
Subject: [Moses-support] using Moses in Monolingual dialogue setting

Hi,

I'm using Moses in a monolingual dialogue setting, as in http://aritter.github.io/mt_chat.pdf,
where source and target are both in English and target is a response to source.
I'd like to propose a little thought experiment in this setting, and hear what you think would happen.

Suppose we have a conversation with six utterances, A1,B1,A2,B2,A3,B3, where A and B indicate speakers,
and the number indicates the n-th statement by that speaker. They all belong to one conversation on a continuous topic.

Now suppose we train it using Moses in two different ways, as follows:
1) Source file contains A1, A2, A3 and target contains B1, B2, B3, so that A1-B1 is a pair and so on.
2) Source contains A1,B1,A2,B2,A3 and target contains B1,A2,B2,A3,B3, taking advantage of the fact that each response is also the stimulus for the next response.

Then, how will the results be different, and why?
Since GIZA++ computes alignments in both directions, will 2) result in any of A1~B3 being the translation of any other?
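
To make the two setups concrete, here is a minimal Python sketch of how the source/target pairs would be built from one conversation. The function names approach_1 and approach_2 are made up for illustration and are not part of Moses or GIZA++; the sketch only builds the pairs and does not write the parallel files or run any training.

```python
def approach_1(utterances):
    """Setup 1: pair each A utterance with the B reply that follows it."""
    source = utterances[0::2]  # A1, A2, A3
    target = utterances[1::2]  # B1, B2, B3
    return list(zip(source, target))


def approach_2(utterances):
    """Setup 2: pair every utterance with the one that follows it."""
    return list(zip(utterances[:-1], utterances[1:]))


# Placeholder labels stand in for the actual utterance text.
conversation = ["A1", "B1", "A2", "B2", "A3", "B3"]

pairs_1 = approach_1(conversation)
pairs_2 = approach_2(conversation)

# Utterances that occur on both the source and the target side in setup 2.
both_sides = sorted({s for s, _ in pairs_2} & {t for _, t in pairs_2})

print(pairs_1)
print(pairs_2)
print(both_sides)
```

In this sketch, setup 2 contains every pair from setup 1 plus the B-to-A pairs (B1-A2, B2-A3), and every utterance except the first and last appears once as source and once as target.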

This may be a strange question, but I would really like to get your insight.

_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support

------------------------------


End of Moses-support Digest, Vol 86, Issue 29
*********************************************
