Moses-support Digest, Vol 128, Issue 26

Send Moses-support mailing list submissions to
moses-support@mit.edu

To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
moses-support-request@mit.edu

You can reach the person managing the list at
moses-support-owner@mit.edu

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."


Today's Topics:

1. Error display japanese from output of mecab, please me!
(Ng? Th? Vinh)
2. Re: Error display japanese from output of mecab, please me!
(Hieu Hoang)
3. Re: Error display japanese from output of mecab, please me!
(Ng? Th? Vinh)
4. Re: Error display japanese from output of mecab, please me!
(Ng? Th? Vinh)


----------------------------------------------------------------------

Message: 1
Date: Thu, 22 Jun 2017 11:44:09 +0700
From: Ng? Th? Vinh <ntvinh@ictu.edu.vn>
Subject: [Moses-support] Error display japanese from output of mecab,
please me!
To: moses-support@mit.edu
Message-ID:
<CA+VYTDhBvZOed5+R6tUjyiDmH=mQ4JVVrvnFNUfK6QL5mmcgMA@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

Hi all,
Has anyone run Japanese-English experiments using MeCab to tokenize the
Japanese side?
I tested MeCab on a small file, temp, from the terminal like this:
mecab -O wakati temp -o temp_rs
but the output file temp_rs displays as garbled characters, even though I
have installed Japanese fonts.
Please help me solve this problem if you know how.
Thank you!

--
*Ng? Thi? Vinh*
Faculty of Electronics and Communications,
Thai Nguyen University of Information and Communication Technology (ICTU).
TEL: 0987 706 830
Email: *ntvinh@ictu.edu.vn <ptnghia@ictu.edu.vn>*
-------------- next part --------------
A non-text attachment was scrubbed...
Name: temp
Type: application/octet-stream
Size: 93 bytes
Desc: not available
Url : http://mailman.mit.edu/mailman/private/moses-support/attachments/20170622/f82abbbd/attachment-0002.obj
-------------- next part --------------
A non-text attachment was scrubbed...
Name: temp_rs
Type: application/octet-stream
Size: 114 bytes
Desc: not available
Url : http://mailman.mit.edu/mailman/private/moses-support/attachments/20170622/f82abbbd/attachment-0003.obj

------------------------------

Message: 2
Date: Thu, 22 Jun 2017 08:05:24 +0100
From: Hieu Hoang <hieuhoang@gmail.com>
Subject: Re: [Moses-support] Error display japanese from output of
mecab, please me!
To: Ng? Th? Vinh <ntvinh@ictu.edu.vn>
Cc: moses-support <moses-support@mit.edu>
Message-ID:
<CAEKMkbi5a4E+arV24g4bHGrcR6=mczCafHCqCfev7mibYRaBZQ@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

You have to find out what encoding the output is in. This page says MeCab
outputs EUC-JP:
https://forum.koohii.com/thread-8952.html
But running
iconv -f EUCJP -t UTF-8 temp_rs
doesn't seem to fix it.

You should ask the MeCab developers to get a definitive answer.
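
One rough way to narrow down the encoding is to ask iconv whether the file
decodes cleanly under each candidate. The sketch below is only a heuristic:
a clean decode does not prove the encoding is right. The helper name
is_valid_encoding and the file sample_eucjp.txt are made up for illustration;
the sample stands in for temp_rs.

```shell
# Hypothetical helper (not part of MeCab or Moses): succeeds if file $2
# decodes without errors when read as encoding $1.
is_valid_encoding() {
    iconv -f "$1" -t UTF-8 "$2" > /dev/null 2>&1
}

# Build a small EUC-JP sample to stand in for temp_rs.
printf '日本語 の テキスト\n' | iconv -f UTF-8 -t EUC-JP > sample_eucjp.txt

# Try each candidate encoding and report whether the bytes are valid in it.
for enc in UTF-8 EUC-JP; do
    if is_valid_encoding "$enc" sample_eucjp.txt; then
        echo "$enc: decodes cleanly"
    else
        echo "$enc: invalid byte sequence"
    fi
done
```

On this sample the UTF-8 check should fail and the EUC-JP check should pass.
If a real output file fails under every candidate, the bytes may have been
damaged during tokenization rather than simply written in another encoding.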

Hieu Hoang
http://moses-smt.org/



------------------------------

Message: 3
Date: Thu, 22 Jun 2017 14:55:16 +0700
From: Ng? Th? Vinh <ntvinh@ictu.edu.vn>
Subject: Re: [Moses-support] Error display japanese from output of
mecab, please me!
To: Hieu Hoang <hieuhoang@gmail.com>
Cc: moses-support <moses-support@mit.edu>
Message-ID:
<CA+VYTDib2s-T7gJtTDqbKWnxeoXVHjQVY8Tp_W3=cCFDJ34yPw@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

Thank you, Hieu Hoang!


------------------------------

Message: 4
Date: Thu, 22 Jun 2017 16:31:44 +0700
From: Ng? Th? Vinh <ntvinh@ictu.edu.vn>
Subject: Re: [Moses-support] Error display japanese from output of
mecab, please me!
To: moses-support <moses-support@mit.edu>
Message-ID:
<CA+VYTDjMV9WBFAWuk6Z4wARNxPaf8a2+PHRdCk+N4a5_3TttwA@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

I have solved my problem by following this guide:
http://qiita.com/junpooooow/items/0a7d13addc0acad10606
Anyone who has the same problem can refer to it.

$ tar zxfv mecab-ipadic-2.7.0-XXXX.tar.gz
$ cd mecab-ipadic-2.7.0-XXXX
$ nkf -w --overwrite *.csv
$ nkf -w --overwrite *.def
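
If nkf is not available, the same in-place conversion could probably be done
with iconv. This is a swapped-in alternative, not what the guide uses, and it
assumes the dictionary source files are EUC-JP, as the config-charset line in
dicrc suggests. The directory ipadic_demo and the row in sample.csv are
invented stand-ins, not real dictionary data.

```shell
# Stand-in for one dictionary source file, encoded as EUC-JP.
mkdir -p ipadic_demo
printf '東京,名詞\n' | iconv -f UTF-8 -t EUC-JP > ipadic_demo/sample.csv

# Convert every .csv and .def file to UTF-8, replacing the original
# only when the conversion succeeded.
for f in ipadic_demo/*.csv ipadic_demo/*.def; do
    [ -e "$f" ] || continue   # skip a pattern that matched no files
    iconv -f EUC-JP -t UTF-8 "$f" > "$f.tmp" && mv "$f.tmp" "$f"
done

cat ipadic_demo/sample.csv
```

Unlike nkf, iconv does not detect the input encoding, so run the loop only
once; a second pass would treat the UTF-8 files as EUC-JP.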

Edit dicrc:

- config-charset = EUC-JP --> replace with config-charset = UTF-8
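
The dicrc change can also be scripted. The sketch below works on a made-up
file, dicrc.sample, that holds only a fragment of a dicrc; to apply it for
real, point sed at the dicrc in the unpacked dictionary directory.

```shell
# Stand-in dicrc fragment; the real file ships with mecab-ipadic.
cat > dicrc.sample <<'EOF'
cost-factor = 800
config-charset = EUC-JP
EOF

# Rewrite only the config-charset line; the original is kept as dicrc.sample.bak.
sed -i.bak 's/^config-charset *= *EUC-JP$/config-charset = UTF-8/' dicrc.sample

# Show the resulting setting.
grep '^config-charset' dicrc.sample
```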

Installation

$ ./configure --with-charset=utf8
$ make
$ sudo make install




------------------------------

_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


End of Moses-support Digest, Vol 128, Issue 26
**********************************************
