Moses-support Digest, Vol 94, Issue 3

Send Moses-support mailing list submissions to
moses-support@mit.edu

To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
moses-support-request@mit.edu

You can reach the person managing the list at
moses-support-owner@mit.edu

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."


Today's Topics:

1. Re: Are there source generation steps in Moses? (Ondrej Bojar)
2. Re: Are there source generation steps in Moses?
(Marcin Junczys-Dowmunt)
3. Re: Are there source generation steps in Moses? (Hieu Hoang)
4. Re: problem with translation output (ULStudent:GIOVANNI.GALLO)


----------------------------------------------------------------------

Message: 1
Date: Fri, 01 Aug 2014 21:05:44 +0200
From: Ondrej Bojar <bojar@ufal.mff.cuni.cz>
Subject: Re: [Moses-support] Are there source generation steps in
Moses?
To: Marcin Junczys-Dowmunt <junczys@amu.edu.pl>, moses-support@mit.edu
Message-ID: <ccff4066-2831-4f2b-9f14-cbe0e9dec932@email.android.com>
Content-Type: text/plain; charset=UTF-8

Hi, Marcin,

If I understand your intentions correctly, the non-determinism in g0 is indeed the critical issue. I have not used confusion networks with factors (so it might not work out of the box), but if the source-language POS ambiguity you want to feed in is not too large, why not just try them?

Shamefully, I have also never used OSM so far, so I have no idea whether it conflicts with your planned setup in any way.

I'm sending a couple of students to MTM this September, so I'll try to talk them into this. (Not all of them are reading moses-support ;-) )

Cheers, O.
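
To make the confusion-network suggestion above concrete, here is a minimal sketch that writes ambiguous POS tags as one column per source word. It assumes the plain-text confusion-network layout of one column per line with alternating token and probability, and it assumes factored tokens of the form word|POS are accepted there, which is exactly what the message says may not work out of the box. `TagOption`, `AmbiguousWord` and `ToConfusionNetwork` are hypothetical names, not part of Moses.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// One candidate POS tag with its probability (hypothetical tagger output).
struct TagOption {
  std::string pos;
  double prob;
};

// A source word together with every POS tag the tagger considers possible.
struct AmbiguousWord {
  std::string surface;
  std::vector<TagOption> tags;
};

// Writes one confusion-network column per word: "word|POS prob word|POS prob ...".
// ASSUMPTION: factored tokens are accepted in this position; untested against Moses.
std::string ToConfusionNetwork(const std::vector<AmbiguousWord> &sentence) {
  std::ostringstream out;
  for (const AmbiguousWord &w : sentence) {
    assert(!w.tags.empty());  // every word needs at least one tag
    for (std::size_t i = 0; i < w.tags.size(); ++i) {
      if (i > 0) out << ' ';
      out << w.surface << '|' << w.tags[i].pos << ' ' << w.tags[i].prob;
    }
    out << '\n';
  }
  return out.str();
}
```

The number of alternatives per column is what drives the search cost, which is why this only looks attractive when the POS ambiguity is small.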

On August 1, 2014 8:00:09 PM CEST, Marcin Junczys-Dowmunt <junczys@amu.edu.pl> wrote:
>On the other hand ... that sounds like a small MT-Marathon project.
>
>On 01.08.2014 at 19:52, Marcin Junczys-Dowmunt wrote:
>
>Well, I agree :)
>Anyone want to tackle this with me? At least something basic that
>emulates multiple input factors on the input sentence level should not
>be that hard... non-determinism is maybe an issue, but that would then
>look like an input confusion network?
>
>On 01.08.2014 at 19:45, Philipp Koehn wrote:
>
>Hi,
>
>
>there are not, but there should be.
>
>
>-phi
>
>
>
>On Fri, Aug 1, 2014 at 1:31 PM, Marcin Junczys-Dowmunt
><junczys@amu.edu.pl> wrote:
>
>Hi,
>does Moses support source generation steps before translation? I would
>like to accomplish something like that incredible ASCII art below,
>where t0 is a surface form phrase table, g0 is the source POS
>generation model, g1 is the target POS generation model, lm0 is a surface
>language model, lm1 is a POS language model, and osm0 is an OSM defined
>over the generated POS tags. Is that possible?
>
>
>0 src_word --t0--> trg_word --> lm0
>     |                |
>     g0               g1
>     |                |
>     v                v
>1 src_pos          trg_pos --> lm1
>        \          /
>         \        /
>          - osm0 -
>
>
>
>_______________________________________________
>Moses-support mailing list
>Moses-support@mit.edu
>http://mailman.mit.edu/mailman/listinfo/moses-support

--
Ondrej Bojar (mailto:obo@cuni.cz / bojar@ufal.mff.cuni.cz)
http://www.cuni.cz/~obo




------------------------------

Message: 2
Date: Fri, 01 Aug 2014 21:12:32 +0200
From: Marcin Junczys-Dowmunt <junczys@amu.edu.pl>
Subject: Re: [Moses-support] Are there source generation steps in
Moses?
To: Ondrej Bojar <bojar@ufal.mff.cuni.cz>, moses-support@mit.edu
Message-ID: <53DBE6A0.2060800@amu.edu.pl>
Content-Type: text/plain; charset=utf-8; format=flowed

Hi Ondrej,
Hieu has probably solved that already, using just his phone and a shoelace
on the way home :)




------------------------------

Message: 3
Date: Fri, 1 Aug 2014 22:55:22 +0100
From: Hieu Hoang <Hieu.Hoang@ed.ac.uk>
Subject: Re: [Moses-support] Are there source generation steps in
Moses?
To: Marcin Junczys-Dowmunt <junczys@amu.edu.pl>
Cc: moses-support <moses-support@mit.edu>
Message-ID:
<CAEKMkbhLaz6Mg+8W=y5PXRwUhD3Kkc0EgGMMqb0MGXcm20n49g@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

I've added another virtual method to FeatureFunction:
    virtual void ChangeSource(InputType *&input) const
Override it in your own FF to change the input sentence. You can:
  1. Add or change a factor. This could be used as a wrapper for a POS
     tagger or a simple lookup.
  2. Delete or add a word.
  3. Delete the entire sentence and replace it with another one, or replace
     it with another input type (e.g. replace a sentence with a tree). This
     could be used as a wrapper for a parser that parses the input sentence.

I've added an example feature function:
    moses/FF/SkeletonChangeInput.cpp

PS: Not tested in any way. When you run it, it will probably assert that
you don't have the right factors. Comment out that code.
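
A minimal, self-contained sketch of use case 1 (adding a POS factor by simple lookup) might look as follows. Only the `ChangeSource` signature is taken from the message above. `Word`, `InputType` and `FeatureFunction` here are simplified stand-ins written for this example, not the real Moses classes, and `LookupPosTagger`, `MakeSentence` and the "UNK" fallback tag are hypothetical.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Stand-in types for illustration only; these are NOT the real Moses classes.
struct Word {
  std::vector<std::string> factors;  // factors[0] = surface form, factors[1] = POS
};

struct InputType {
  std::vector<Word> words;
  virtual ~InputType() = default;
};

// Mirrors the hook described above: by default the input is left unchanged.
class FeatureFunction {
public:
  virtual ~FeatureFunction() = default;
  virtual void ChangeSource(InputType *&input) const { (void)input; }
};

// Hypothetical feature function that fills in factor 1 from a lexicon.
class LookupPosTagger : public FeatureFunction {
public:
  explicit LookupPosTagger(std::map<std::string, std::string> lexicon)
      : m_lexicon(std::move(lexicon)) {}

  void ChangeSource(InputType *&input) const override {
    assert(input != nullptr);
    for (Word &w : input->words) {
      assert(!w.factors.empty());  // the surface factor must already be present
      auto it = m_lexicon.find(w.factors[0]);
      if (w.factors.size() < 2) w.factors.resize(2);
      // Words missing from the lexicon get a placeholder tag (an assumption).
      w.factors[1] = (it == m_lexicon.end()) ? "UNK" : it->second;
    }
  }

private:
  std::map<std::string, std::string> m_lexicon;
};

// Helper that builds a sentence carrying only the surface factor.
InputType MakeSentence(const std::vector<std::string> &tokens) {
  InputType sentence;
  for (const std::string &t : tokens) {
    Word w;
    w.factors.push_back(t);
    sentence.words.push_back(w);
  }
  return sentence;
}
```

This sketch assigns exactly one tag per word, so it sidesteps the non-determinism discussed earlier in the thread. The pointer is passed by reference so that an override could also swap in a different input object, as in use case 3.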




--
Hieu Hoang
Research Associate
University of Edinburgh
http://www.hoang.co.uk/hieu

------------------------------

Message: 4
Date: Sat, 2 Aug 2014 11:12:46 +0000
From: "ULStudent:GIOVANNI.GALLO" <12064866@studentmail.ul.ie>
Subject: Re: [Moses-support] problem with translation output
To: Philipp Koehn <pkoehn@inf.ed.ac.uk>
Cc: "moses-support@mit.edu" <moses-support@mit.edu>
Message-ID: <1406977965665.62451@studentmail.ul.ie>
Content-Type: text/plain; charset="iso-8859-1"

Hi Philipp,


Thank you for your reply. Here is an example:


Source sentence in Spanish

Es un local climatizado con capacidad para 150 personas, con una terraza para 30 personas y rodeado de una zona ajardinada.

Sentence translated into Italian

Es un locale climatizzato capienza 150 personas,|UNK|UNK|UNK con terrazza per 30 persone e circondato una zona ajardinada. |UNK|UNK|UNK


The strange thing is that if I remove the comma and the period from the source sentence, the |UNK|UNK|UNK disappear and personas and ajardinada get translated:


Es un locale climatizzato capienza 150 persone con terrazza per 30 persone e circondato una giardino


Any idea? Does it have to do with the fact that my model is unfactored?

Thank you in advance.


GG
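
One reading of the symptom above, offered as an assumption rather than a confirmed diagnosis: if the input is not tokenized, "personas," (with the comma attached) is a different token from "personas" and will not match the phrase table. Moses ships a tokenizer script for this purpose. The sketch below only illustrates the idea of detaching trailing punctuation, and `SplitPunctuation` is a hypothetical helper, not a Moses function.

```cpp
#include <sstream>
#include <string>
#include <vector>

// True for the handful of punctuation marks this sketch knows about.
bool IsTrailingPunct(char c) {
  return std::string(",.;:!?").find(c) != std::string::npos;
}

// Splits on whitespace, then detaches trailing punctuation into separate tokens.
// Deliberately naive: abbreviations such as "etc." are split as well.
std::vector<std::string> SplitPunctuation(const std::string &line) {
  std::vector<std::string> tokens;
  std::istringstream in(line);
  std::string tok;
  while (in >> tok) {
    std::vector<std::string> trailing;
    while (tok.size() > 1 && IsTrailingPunct(tok.back())) {
      // Insert at the front so the marks keep their original order.
      trailing.insert(trailing.begin(), std::string(1, tok.back()));
      tok.pop_back();
    }
    tokens.push_back(tok);
    tokens.insert(tokens.end(), trailing.begin(), trailing.end());
  }
  return tokens;
}
```

The same tokenization would have to be applied to the training data and to the decoder input for the tokens to match.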


________________________________
From: phkoehn@gmail.com <phkoehn@gmail.com> on behalf of Philipp Koehn <pkoehn@inf.ed.ac.uk>
Sent: Friday, 1 August 2014 16:59
To: ULStudent:GIOVANNI.GALLO
Cc: moses-support@mit.edu
Subject: Re: [Moses-support] problem with translation output

Hi,

this should not happen - are they really identical (uppercase/lowercase etc.)?

You can run the decoder with more verbose output (-v 2 or even -v 3) and the
trace options (-t) to dig a bit deeper on what is going on.

-phi



On Fri, Aug 1, 2014 at 7:24 AM, ULStudent:GIOVANNI.GALLO <12064866@studentmail.ul.ie> wrote:

Hi everyone,


I'm running some experiments with Moses and I noticed that in translated sentences there are always one or two words/phrases that don't get translated (they appear as in the source sentence). I checked the phrase table and there are many entries corresponding to those words/phrases that might be used during decoding. Do you have any idea what's happening here? Maybe something I need to modify in the moses.ini file?

Thank you in advance for your help.


Regards,

Giovanni Gallo.




------------------------------

_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


End of Moses-support Digest, Vol 94, Issue 3
********************************************
