Moses-support Digest, Vol 138, Issue 14

Send Moses-support mailing list submissions to
moses-support@mit.edu

To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
moses-support-request@mit.edu

You can reach the person managing the list at
moses-support-owner@mit.edu

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."


Today's Topics:

1. Re: https://www.mail-archive.com/moses-support@mit.edu/ is
still not reached from Turkey after 3 years (Hieu Hoang)
2. Re: NCv12 number of lines mismatch (Vincent Nguyen)


----------------------------------------------------------------------

Message: 1
Date: Mon, 23 Apr 2018 14:12:51 +0100
From: Hieu Hoang <hieuhoang@gmail.com>
Subject: Re: [Moses-support]
https://www.mail-archive.com/moses-support@mit.edu/ is still not
reached from Turkey after 3 years
To: Ergun Bicici <bicici@gmail.com>
Cc: moses-support <moses-support@mit.edu>
Message-ID:
<CAEKMkbhMw2an-KM8mfwkfgFwZKfbNY18P4VitSfiG_LuUTEM+A@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

Hieu Hoang
http://moses-smt.org/


On 23 April 2018 at 14:02, Ergun Bicici <bicici@gmail.com> wrote:

>
> Dear Hieu,
>
> Thank you for your message. Is there a mirror website to access?
>
External websites mirror the moses-support mailing list archive, e.g.
http://moses-support.mit.narkive.com/#slidedown_90


> Or could the archives and current group be moved to another domain that may
> be more secure against impersonation or internet access cutoff? I found
> that http://www.mail-archive.com/ is blocked for some reason. For
> instance, googlegroups might be a possibility, as it might be more focused
> on personal communication rather than forums / blogs, which may be the
> source of the issue raised. My main concern is that I would like to be
> able to send messages to moses-support and see that they are there at least
> for some time, so that even if there is some man-in-the-middle attack, the
> messages can still find their way.
>
> Anyway, I did the reporting part from my side to gain access to
> moses-support from Turkey without internet tunnelling. I don't know the
> exact reason for blocking Wikipedia, but I heard that it is due to some
> content. My current suggestion is to block only that part of Wikipedia,
> since the full block is damaging internet search, where Wikipedia entries
> usually come up.
>
> Also, when I tried to download the CzEng dataset, we encountered a
> difficulty with Ondřej Bojar; because of that we talked about md5sum usage
> to verify downloaded files. Maybe there was some interception of the network
> download. Therefore, md5sums could be provided with shared datasets.
>
> Regards,
> Ergun
>
> On Mon, Apr 23, 2018 at 1:43 PM, Hieu Hoang <hieuhoang@gmail.com> wrote:
>
>> Sorry to hear that. I was in Turkey a few weeks ago and saw they blocked
>> innocuous sites such as Wikipedia, so I'm not surprised mail-archive is also
>> blocked for whatever reason.
>>
>> I'll PM you a link to mit.edu's internal archive. I don't share this
>> publicly as it has people's raw emails, and I don't want it to be
>> harvested by spammers.
>>
>> On 23/04/18 11:28, Ergun Bicici wrote:
>>
>>
>> Dear Moses mailing list,
>>
>> I have not been able to reach
>> https://www.mail-archive.com/moses-support@mit.edu/ from Turkey for the
>> last 3 years, and why that is so is still a mystery to me. I am able to
>> post/send messages but can only check them through internet tunneling.
>>
>> Do you know why / have some explanation? Thank you.
>>
>> Here are the steps I took:
>> - I verified that it is blocked
>> - I asked BTK about this (https://www.btk.gov.tr/), but they did not
>> solve a more specific issue I asked about
>> - I reported it to https://turkeyblocks.org
>>
>> https://www.comparitech.com/privacy-security-tools/blockedinturkey/
>>
>> DOMAIN TO CHECK
>>
>> Istanbul - www.mail-archive.com/moses-support@mit.edu/ *Not Working* in
>> Turkey.
>>
>> Ankara - www.mail-archive.com/moses-support@mit.edu/ *Not Working* in
>> Turkey.
>> ------------------------------
>> This URL appears to be blocked in Turkey.
>>
>>
>>
>> Best Regards,
>> Ergun
>>
>> Ergun Biçici
>> http://bicici.github.com/ <http://ergunbicici.blogspot.com/>
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "Workshop on Statistical Machine Translation" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to wmt-tasks+unsubscribe@googlegroups.com.
>> For more options, visit https://groups.google.com/d/optout.
>>
>>
>> --
>> Hieu Hoang
>> http://moses-smt.org/
>>
>>
>> _______________________________________________
>> Moses-support mailing list
>> Moses-support@mit.edu
>> http://mailman.mit.edu/mailman/listinfo/moses-support
>>
>>
>
>
> --
>
> Regards,
> Ergun
>
>
>
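
On the md5sum suggestion raised earlier in this message, here is a minimal Python sketch of checking a downloaded file against a published checksum. The helper names (md5_of_file, matches_published) and the assumption that the dataset provider publishes a hex MD5 string alongside the file are illustrative only; they are not part of any Moses or CzEng tooling. MD5 is enough to catch accidental corruption during download, though it is not collision-resistant against deliberate tampering.

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, reading it in chunks.

    Chunked reading keeps memory use flat for multi-gigabyte corpora.
    """
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # iter() with a sentinel stops when read() returns empty bytes (EOF).
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path: str, published_md5: str) -> bool:
    """Compare a file's MD5 with a published value (hypothetical input).

    The published string is normalised for case and surrounding whitespace,
    since checksum files often carry a trailing newline.
    """
    return md5_of_file(path) == published_md5.strip().lower()
```

A call such as matches_published("some-download.tgz", checksum_from_provider) returning False would suggest the file changed somewhere between the server and the local disk, without saying why.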

------------------------------

Message: 2
Date: Mon, 23 Apr 2018 16:06:10 +0200
From: Vincent Nguyen <vnguyen@neuf.fr>
Subject: Re: [Moses-support] NCv12 number of lines mismatch
To: moses-support@mit.edu, Barry Haddow <bhaddow@staffmail.ed.ac.uk>
Message-ID: <c1b96cf4-3c95-b322-aa70-cf1a2be14714@neuf.fr>
Content-Type: text/plain; charset=windows-1252; format=flowed

Hello

just to get back to this issue since I bumped into it again:

$ tr -d -c '\n' < news-commentary-v12.de-en.de | wc -c
270769
$ tr -d -c '\r' < news-commentary-v12.de-en.de | wc -c
3920
$ tr -d -c '\n' < news-commentary-v12.de-en.en | wc -c
270769
$ tr -d -c '\r' < news-commentary-v12.de-en.en | wc -c
4099

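So v12 is broken in some sense: both files have 270769 newlines, but they
also contain carriage returns (3920 in the .de file, 4099 in the .en file).
Tools / primitives that treat '\r' as a line break will presumably see
mismatched line counts, while those that split only on '\n' work fine.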

Just to let you know.
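
To make the difference between the two conventions concrete, here is a small Python sketch. It is an illustration under assumptions, not a diagnosis of the real files: newline_stats is a hypothetical helper, the sample bytes are made up, and the thread does not say whether the carriage returns in NCv12 are lone '\r' characters or part of '\r\n' pairs. It relies on the documented behaviour of bytes.splitlines(), which breaks on '\n', '\r' and '\r\n'.

```python
def newline_stats(data: bytes) -> dict:
    """Count line terminators the way different tools might see them."""
    lf = data.count(b"\n")      # same figure as: tr -d -c '\n' < file | wc -c
    cr = data.count(b"\r")      # same figure as: tr -d -c '\r' < file | wc -c
    crlf = data.count(b"\r\n")  # CRs that belong to a Windows-style line ending
    return {
        "lf": lf,
        "cr": cr,
        "lone_cr": cr - crlf,
        # Number of lines seen by a reader that also breaks on lone '\r'.
        "universal_lines": len(data.splitlines()),
    }


def file_newline_stats(path: str) -> dict:
    """Apply newline_stats to a file read in binary mode (no newline translation)."""
    with open(path, "rb") as f:
        return newline_stats(f.read())


# Made-up two-line "parallel" sample: the German side hides a lone '\r'.
sample_de = b"eins\rzwei\ndrei\n"
sample_en = b"one two\nthree\n"

for name, sample in (("de", sample_de), ("en", sample_en)):
    print(name, newline_stats(sample))
```

In the sample, both sides have the same "lf" count, so a wc-style count agrees, but "universal_lines" differs because one side contains a lone carriage return. Running file_newline_stats on each side of a parallel corpus and comparing "lone_cr" would show whether the same effect is at play.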



On 14/09/2017 at 08:48, Vincent Nguyen wrote:
> Okay, really weird.
> wc gives me the same numbers as you, but gedit gives another 2 different
> numbers, one for each file. There must be special characters somewhere.
>
>
> On 13/09/2017 at 18:52, Barry Haddow wrote:
>> Hi Vincent
>>
>> Looks fine to me:
>>
>>> wc -l news-commentary-v12.de-en.*
>>>   270769 news-commentary-v12.de-en.de
>>>   270769 news-commentary-v12.de-en.en
>>>   541538 total
>> What are you running that shows you different line numbers?
>>
>> cheers - Barry
>>
>> On 12/09/17 10:06, Vincent Nguyen wrote:
>>> Hi,
>>> Is there an updated version of NCv12 for this
>>> http://data.statmt.org/wmt17/translation-task/training-parallel-nc-v12.tgz
>>>
>>>
>>> the number of lines for de-en is not the same in the 2 languages.
>>>
>>> Cheers,
>>> Vincent
>>>
>>



------------------------------

_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


End of Moses-support Digest, Vol 138, Issue 14
**********************************************
