Send Moses-support mailing list submissions to
moses-support@mit.edu
To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
moses-support-request@mit.edu
You can reach the person managing the list at
moses-support-owner@mit.edu
When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."
Today's Topics:
1. Re: Moses-support Digest, Vol 91, Issue 52 (Miles Osborne)
2. Re: Moses-support Digest, Vol 91, Issue 52 (Hieu Hoang)
3. Re: Moses-support Digest, Vol 91, Issue 52 (Miles Osborne)
4. Re: Moses-support Digest, Vol 91, Issue 52 (Hieu Hoang)
----------------------------------------------------------------------
Message: 1
Date: Fri, 30 May 2014 12:22:26 -0400
From: Miles Osborne <miles@inf.ed.ac.uk>
Subject: Re: [Moses-support] Moses-support Digest, Vol 91, Issue 52
To: "moses-support@mit.edu" <moses-support@mit.edu>
Message-ID:
<CAPRfTYq1CebBWkeb6rP02ePm0n_yorp2FCGvsceDRV0qjMWrPg@mail.gmail.com>
Content-Type: text/plain; charset=UTF-8
This Perl snippet:
$line =~ tr/\040-\176/ /c;
On 30 May 2014 12:17, <moses-support-request@mit.edu> wrote:
> Message: 1
> Date: Fri, 30 May 2014 16:24:30 +0100
> From: Hieu Hoang <Hieu.Hoang@ed.ac.uk>
> Subject: [Moses-support] removing non-printing character
> To: moses-support <moses-support@mit.edu>
> Message-ID:
> <CAEKMkbj4tEDZYVGeAStmg51+w-5SYE5YGRmibcYPC2j8YbKGfg@mail.gmail.com>
> Content-Type: text/plain; charset="utf-8"
>
> Does anyone have a script/program that can remove all non-printing
> characters?
>
> I don't care if it's fast or slow, as long as it ABSOLUTELY removes all
> non-printing chars.
>
> --
> Hieu Hoang
> Research Associate
> University of Edinburgh
> http://www.hoang.co.uk/hieu
--
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.
------------------------------
Message: 2
Date: Fri, 30 May 2014 17:43:33 +0100
From: Hieu Hoang <Hieu.Hoang@ed.ac.uk>
Subject: Re: [Moses-support] Moses-support Digest, Vol 91, Issue 52
To: Miles Osborne <miles@inf.ed.ac.uk>
Cc: "moses-support@mit.edu" <moses-support@mit.edu>
Message-ID:
<CAEKMkbghfb5L3X=VspyNUr_95PDvJvGxxkAA56dR+vZPeAtLRA@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"
Forgot to say: the input is UTF-8. The snippet turns
gonzález
to
gonz lez
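A hedged illustration of why the accent disappears, written in Python rather than the thread's Perl. The pattern below keeps only printable ASCII (octal 040-176, i.e. 0x20-0x7E), so any accented letter falls outside the range and is blanked. The function names are made up for this sketch. On decoded text, á is one character and becomes one space; on raw UTF-8 bytes it is two bytes and becomes two spaces.

```python
import re


def blank_non_ascii_text(text: str) -> str:
    """Replace every character outside printable ASCII with a space."""
    return re.sub(r"[^\x20-\x7e]", " ", text)


def blank_non_ascii_bytes(raw: bytes) -> bytes:
    """Same rule applied to undecoded bytes, one space per byte."""
    return re.sub(rb"[^\x20-\x7e]", b" ", raw)


if __name__ == "__main__":
    word = "gonz\u00e1lez"  # "gonzález"
    # Decoded text: the single character á is replaced by a single space.
    print(repr(blank_non_ascii_text(word)))
    # Raw UTF-8: á is encoded as two bytes (0xC3 0xA1), so two spaces appear.
    print(repr(blank_non_ascii_bytes(word.encode("utf-8"))))
```

Either way the letter is lost, which is why an ASCII-range filter cannot be used unchanged on UTF-8 input that contains legitimate non-ASCII text.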
------------------------------
Message: 3
Date: Fri, 30 May 2014 12:51:22 -0400
From: Miles Osborne <miles@inf.ed.ac.uk>
Subject: Re: [Moses-support] Moses-support Digest, Vol 91, Issue 52
To: Hieu Hoang <Hieu.Hoang@ed.ac.uk>
Cc: "moses-support@mit.edu" <moses-support@mit.edu>
Message-ID:
<CAPRfTYp2hDuvSRk2YRVK3js49U5iXpgQy7wZgbHMmOn-Z3hUOg@mail.gmail.com>
Content-Type: text/plain; charset=UTF-8
It is trivial to change it to emit, say, a ? mark instead.
But I'm not sure what you want as output now. The original request
was for removing non-printable characters, which the Perl does.
Miles
------------------------------
Message: 4
Date: Fri, 30 May 2014 18:01:42 +0100
From: Hieu Hoang <Hieu.Hoang@ed.ac.uk>
Subject: Re: [Moses-support] Moses-support Digest, Vol 91, Issue 52
To: Miles Osborne <miles@inf.ed.ac.uk>
Cc: "moses-support@mit.edu" <moses-support@mit.edu>
Message-ID:
<CAEKMkbj4qa11r_ByXAahd68qeXgwpFAqVXdLA4+jEQhThyc87A@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"
In the attached file, there are 2 or more non-printing chars on the 1st
line, between the words 'place' and 'binding'. They should be
removed or replaced with a space. Those chars are deleted by parsers, which
makes the word alignments incorrect and crashes extract.
The 2nd line is perfectly good UTF-8. It shouldn't be touched.
Just another Friday NLP malaise.
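One possible way to meet this requirement, sketched in Python under a stated assumption: "non-printing" is taken to mean any character whose Unicode general category starts with C (control, format, surrogate, private-use, unassigned). The contents of the attached file are not shown in the thread, so the sample string below is hypothetical, as are the function name `clean_line` and its `keep` parameter.

```python
import unicodedata


def clean_line(line: str, keep: str = "\n") -> str:
    """Replace each Unicode 'Other' (C*) character with a space.

    Assumption: "non-printing" means general categories Cc, Cf, Cs, Co, Cn.
    Characters listed in `keep` (by default the newline) pass through as-is.
    """
    return "".join(
        ch if ch in keep or not unicodedata.category(ch).startswith("C") else " "
        for ch in line
    )


if __name__ == "__main__":
    # Hypothetical input: a NUL and a zero-width space between two words.
    sample = "place\u0000\u200bbinding\n"
    print(repr(clean_line(sample)))
    # Valid accented text is left untouched.
    print(clean_line("gonz\u00e1lez"))
```

Each offending character is replaced by a space rather than deleted, so token boundaries are preserved. The input must be decoded as UTF-8 before calling the function. Tabs are also category Cc, so they are blanked unless added to `keep`.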
-------------- next part --------------
A non-text attachment was scrubbed...
Name: baa
Type: application/octet-stream
Size: 446 bytes
Desc: not available
Url : http://mailman.mit.edu/mailman/private/moses-support/attachments/20140530/824a4d31/attachment.obj
------------------------------
_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support
End of Moses-support Digest, Vol 91, Issue 53
*********************************************