Moses-support Digest, Vol 103, Issue 73

Send Moses-support mailing list submissions to
moses-support@mit.edu

To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
moses-support-request@mit.edu

You can reach the person managing the list at
moses-support-owner@mit.edu

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."


Today's Topics:

1. Re: Factored-model (Marwa Refaie)
2. Re: New source formatter & checker (Christian Buck)
3. Brahmi-Net: Transliteration and script conversion for Indian
languages (Anoop)


----------------------------------------------------------------------

Message: 1
Date: Wed, 27 May 2015 19:42:17 +0300
From: Marwa Refaie <basmallah@hotmail.com>
Subject: Re: [Moses-support] Factored-model
To: toty qa <toty2009qaisi@hotmail.com>, <moses-support@mit.edu>
Message-ID: <DUB406-EAS277740B7B4DBEA42A11465EBACB0@phx.gbl>
Content-Type: text/plain; charset="utf-8"

Hi
You need to train the language model twice.
.........................................
Once on just the target-language factor (here, the POS tags):
VB DT NN
NN VB VB NN
DT VB NN

And once on the surface forms of the target language:
We will go cinema
Kids are wonderful
The spring showers
.............................................

The translation model: use the factors you need (POS in both languages, or just in the target language) and adjust the factor numbers in the training script and in moses.ini.
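
A minimal sketch of the data preparation implied above, assuming the two-factor `word|POS` format from the original question. It splits each factored line into a surface-only line and a POS-only line, so that each stream can be passed to a language model toolkit separately. The function name `split_factors` and the sample sentence with its tags are hypothetical and not part of Moses.

```python
def split_factors(line, sep="|"):
    """Split one factored line into (surface, pos) strings.

    Assumes every token carries exactly two factors, e.g. "spring|NN".
    This is an illustrative helper, not a Moses script.
    """
    surfaces, tags = [], []
    for token in line.split():
        parts = token.split(sep)
        if len(parts) != 2:
            raise ValueError(f"expected word{sep}POS, got {token!r}")
        surfaces.append(parts[0])
        tags.append(parts[1])
    return " ".join(surfaces), " ".join(tags)


if __name__ == "__main__":
    # Made-up sample line; the tags are for illustration only.
    surface, pos = split_factors("The|DT spring|NN showers|NNS")
    print(surface)  # surface-form stream for the word LM
    print(pos)      # POS stream for the POS LM
```

Writing the two outputs to separate files, one line per sentence, would give one training corpus per language model. The factored file itself stays as the input to translation model training.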

--- Original Message ---

From: "toty qa" <toty2009qaisi@hotmail.com>
Sent: 27 May 2015 19:26
To: moses-support@mit.edu
Subject: [Moses-support] Factored-model

Hi,
I am trying to build an English-Arabic factored baseline MT system. I have data formatted with POS as:


word1|POS1 word2|POS2......wordn|POSn

as well as POS-only and surface-form-only data.



I could not follow the pipeline from the website, as it is unclear to me.

My questions are:
1- In the LM training step, do I have to train the LM twice, once with POS
and once with surface forms?

2- Do I have to include lemmas in the data?

3- How do I train the translation model?



By the way, the data format on the website is:
word0factor0|word0factor1|word0factor2 word1factor0|word1factor1|word1factor2 ...

Regards
Taghreed


------------------------------

Message: 2
Date: Wed, 27 May 2015 13:46:10 -0400
From: Christian Buck <cbuck@lantis.de>
Subject: Re: [Moses-support] New source formatter & checker
To: moses-support@mit.edu
Message-ID: <556602E2.6060505@lantis.de>
Content-Type: text/plain; charset=windows-1252; format=flowed

Hi,

[My opinion should not count much because I don't contribute a lot of
code to Moses.]

However, I enjoy clang-format [1] which can be integrated [2] with
Eclipse CDT (and of course vim/emacs/sublime text) and suggest you give
it a try. clang-format supports different styles so we can still have
the war about whitespace :)

[1] http://clang.llvm.org/docs/ClangFormat.html
[2] https://github.com/schulmar/clang_format_eclipsecdt

On 05/27/2015 01:25 PM, Lane Schwartz wrote:
> Never mind - I see the difference reading down farther. In that case,
> please change my vote from Java to Google.
>
> On Wed, May 27, 2015 at 12:24 PM, Lane Schwartz <dowobeha@gmail.com> wrote:
>
> Hieu,
>
> What's the difference between Java and Google styles? They look the
> same to me in the examples. I voted Java, but I'd be fine with
> Google, since it looks basically the same.
>
> Lane
>
>
> On Wed, May 27, 2015 at 9:58 AM, Hieu Hoang <hieuhoang@gmail.com> wrote:
>
> seems like referendums are all the rage at the moment so we're
> gonna have 1 to decide coding style.
> https://www.surveymonkey.com/s/BN38LBQ
> If you work with the Moses code, please vote. This will give us
> a clear mandate for future development.
>
> Polls will be open until Sunday evening. All results are public.
>
> On 27/05/2015 05:16, Jeroen Vermeulen wrote:
>> On May 27, 2015 2:47:02 AM GMT+07:00, Ulrich Germann
>> <ulrich.germann@gmail.com> wrote:
>>
>> Well, the second document does indeed give some darn good
>> arguments for trailing braces: 1. "as shown to us by the
>> prophets Kernighan and Ritchie"; 2. "K&R are _right_ and [...]
>> K&R are right"; 3. "think 25-line terminal screens here".
>>
>> The document also has other great suggestions that we may
>> want to adopt: "Tabs are 8 characters, and thus indentations
>> are also 8 characters".
>>
>> The fact is: with trailing braces and steadfast refusal to
>> use vertical space to improve
>> code readability, you do need a tab width of 8 or so to be
>> able to recognize code blocks.
>>
>> Incidentally, there's actually one recommendation in that
>> document that IS worth
>> adopting *(and in which pretty much all style guides agree
>> plus-minus 5 characters or so):*
>>
>> The limit on the length of lines is 80 columns and this is
>> a strongly preferred limit.
>>
>> Let me repeat that:
>>
>> The limit on the length of lines is 80 columns and this is
>> a strongly preferred limit.
>>
>> And again:
>>
>> The limit on the length of lines is 80 columns and this is
>> a strongly preferred limit.
>>
>> That is the only thing worth really considering from that
>> document.
>>
>> We can safely consider the 25-line limit a thing of the
>> past unless you program on
>> your mobile phone. However, there is a good justification
>> for the 80-column
>> limit: merging code, where it's useful to see the code
>> side by side. When
>> resolving merge conflicts in git, we are usually dealing
>> with 3 versions that need
>> to be compared. Those three columns won't fit on my screen
>> when lines are 120+
>> characters long.
>>
>> I digress. Back to the braces issue. Unlike the arguments
>> for trailing braces
>> (K&R did it in a printed book to save paper; I'm a retro
>> programmer who likes to
>> do things old school; it obscures my code so people will
>> think I'm really awesome
>> because they can't really understand what I'm doing
>> without lots of effort, just like
>> Derrida and Piaget didn't write to be understood --- they
>> wrote NOT to be understood),
>> so unlike these arguments, there are very good arguments
>> for having opening braces
>> on a new line and vertically aligned with the closing bracket.
>>
>> 1. It's much easier to recognize logical code blocks. It
>> is. Try it out. Read code that you
>> haven't written and be honest with yourself. It may
>> seem unfamiliar at first, but once
>> you are used to it, there's no turning back.
>>
>> 2. If you want to temporarily disable a condition (e.g. in
>> debugging), you just comment
>> out the if statement. You do not have to THEN put an
>> opening brace at the beginning
>> of the line (extra editing) and remove it later (extra
>> editing again) --- which also
>> FORCES you to be inconsistent. (You can't always remove
>> the braces because
>> of variable scoping). You have the same issue when you
>> want to use preprocessor
>> commands to have conditions only under certain
>> circumstances, so instead of
>> #ifdef ENABLE_CONDITION
>> if (condition)
>>
