Send Moses-support mailing list submissions to
moses-support@mit.edu
To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
moses-support-request@mit.edu
You can reach the person managing the list at
moses-support-owner@mit.edu
When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."
Today's Topics:
1. Concatenating large files for LM (Per Tunedal)
2. Re: Concatenating large files for LM (Per Tunedal)
3. ERROR: Can't generate symmetrized alignment file
(cyrine.nasri@univ-lorraine.fr)
4. Re: ERROR: Can't generate symmetrized alignment file
(Massinissa Ahmim)
----------------------------------------------------------------------
Message: 1
Date: Wed, 26 Feb 2014 15:11:33 +0100
From: Per Tunedal <per.tunedal@operamail.com>
Subject: [Moses-support] Concatenating large files for LM
To: moses-support@mit.edu
Message-ID:
<1393423893.3605.88043917.619E7C65@webmail.messagingengine.com>
Content-Type: text/plain
Hi,
I've collected tons of text to estimate a new language model, but ran
into trouble when trying to combine some large files.
E.g., if I combine Europarl7 (1 880 000 lines) with Wikipedia (15 070 000
lines), I get only 15 644 000 lines, not the expected 16 950 000
lines.
I tried:
    cat /home/per/corpora/Europarl7.fr-sv.fr \
        /home/per/corpora_mono/Franska/frWikipedia.cleaned.text \
        > /home/per/corpora_mono/Franska/Corpus1.fr
I was surprised by the small size of the new file and counted the
lines with:
wc -l /home/per/corpora_mono/Franska/Corpus1.fr etc.
I'm running Debian Squeeze 64-bit.
Yours,
Per Tunedal
------------------------------
Message: 2
Date: Wed, 26 Feb 2014 16:38:01 +0100
From: Per Tunedal <per.tunedal@operamail.com>
Subject: Re: [Moses-support] Concatenating large files for LM
To: moses-support@mit.edu
Message-ID:
<1393429081.28121.88085313.70025AFB@webmail.messagingengine.com>
Content-Type: text/plain
Oops! Now it works OK. I must have made a typo.
I thought the files might be too large to handle with cat.
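As far as I know, cat simply streams its input and imposes no file-size limit of its own on a 64-bit system, so a shortfall like the one described is more likely to come from the command line than from the tool. A minimal sketch for concatenating corpus files and confirming that no lines were lost; concat_checked is a hypothetical helper name, not part of Moses or any standard toolkit:

```shell
#!/bin/sh
# Sketch only: concat_checked is a hypothetical helper, not a Moses script.
# Usage: concat_checked OUTPUT INPUT...
# Concatenates the inputs into OUTPUT and fails if the line count of
# OUTPUT differs from the sum of the input line counts.
concat_checked() {
    out=$1
    shift
    expected=0
    for f in "$@"; do
        # wc -l counts newline characters; abort if a file cannot be read
        n=$(wc -l < "$f") || return 1
        expected=$((expected + n))
    done
    cat -- "$@" > "$out" || return 1
    actual=$(( $(wc -l < "$out") ))
    if [ "$actual" -ne "$expected" ]; then
        echo "line count mismatch: expected $expected, got $actual" >&2
        return 1
    fi
    # Print the verified line count of the combined file
    echo "$actual"
}

# Example with the paths from the message above:
# concat_checked /home/per/corpora_mono/Franska/Corpus1.fr \
#     /home/per/corpora/Europarl7.fr-sv.fr \
#     /home/per/corpora_mono/Franska/frWikipedia.cleaned.text
```

Note that wc -l counts newlines, so an input whose last line lacks a trailing newline will be glued to the first line of the next file without changing the total.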
Yours,
Per Tunedal
------------------------------
Message: 3
Date: Wed, 26 Feb 2014 16:47:00 +0100
From: "cyrine.nasri@univ-lorraine.fr" <cyrine.nasri@gmail.com>
Subject: [Moses-support] ERROR: Can't generate symmetrized alignment
file
To: "moses-support@mit.edu" <moses-support@MIT.EDU>
Message-ID:
<CAPg_V0iKVnqJftf7KOOxcxsd9_xdQA-uBAe=M_nL+9Jg8fBvFg@mail.gmail.com>
Content-Type: text/plain; charset="iso-8859-1"
Hello,
I am trying to train a phrase model with this command:
" train-model.perl -corpus
/home/corpus/Apprentissage/train..tok.true.clean.low1
/home/corpus/Apprentissage/train..tok.true.clean.low1 -f en -e fr
-alignment grow-diag-final-and -reordering msd-bidirectional-fe -lm
0:3:/home/corpus/Apprentissage/FR3GR.lm:8 -external-bin-dir /home/moses/bin"
But it prints this message:
Executing: mkdir -p ./model
Executing: /home/moses/mosesdecoder/scripts/training/giza2bal.pl -d "gzip
-cd ./giza.fr-en/fr-en.A3.final.gz" -i "gzip -cd
./giza.en-fr/en-fr.A3.final.gz"
|/home/moses/mosesdecoder/scripts/../bin/symal -alignment="grow"
-diagonal="yes" -final="yes" -both="yes" >
./model/aligned.grow-diag-final-and
sh: 1: /home/moses/mosesdecoder/scripts/../bin/symal: not found
Exit code: 127
ERROR: Can't generate symmetrized alignment file
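Exit code 127 is what sh reports when it cannot find the command it was asked to run, so the log above says that no symal executable exists at /home/moses/mosesdecoder/scripts/../bin/symal. A minimal sketch for checking this before training; check_tool is a hypothetical helper name, not part of Moses:

```shell
#!/bin/sh
# Sketch only: check_tool is a hypothetical helper, not a Moses script.
# Succeeds if the given path is an executable file, fails otherwise.
check_tool() {
    if [ -x "$1" ] && [ ! -d "$1" ]; then
        echo "ok: $1"
    else
        echo "missing or not executable: $1" >&2
        return 1
    fi
}

# Example with the path taken from the log above:
# check_tool /home/moses/mosesdecoder/scripts/../bin/symal
```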
Any idea how to solve this problem?
Thank you in advance
--
*Cyrine Ph.D. Student in Computer Science*
------------------------------
Message: 4
Date: Wed, 26 Feb 2014 17:46:56 +0100
From: Massinissa Ahmim <massinissa.ahmim@linguacustodia.com>
Subject: Re: [Moses-support] ERROR: Can't generate symmetrized
alignment file
To: "cyrine.nasri@univ-lorraine.fr" <cyrine.nasri@gmail.com>
Cc: "moses-support@mit.edu" <moses-support@mit.edu>
Message-ID:
<CANN0mWaFyQbFw=Ew34z17vJsm8YXna9ybeAz4XOW2xEni0gs0Q@mail.gmail.com>
Content-Type: text/plain; charset="iso-8859-1"
Hi Cyrine,
Try running this command instead:
...train-model.perl -corpus /home/corpus/Apprentissage/train..tok.true.clean.low1
-f en -e fr -alignment grow-diag-final-and -reordering msd-bidirectional-fe
-lm 0:3:/home/corpus/Apprentissage/FR3GR.lm:8 -external-bin-dir /home/moses/bin
You don't have to give the file name twice, only the extensions. For
instance, if your parallel files are named corpus.en and corpus.fr, use the
path ~/whatever/corpus -f en -e fr, and make sure both files are located in
the same directory.
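The stem-plus-extension convention can be checked before starting a long training run. A minimal sketch, assuming a sentence-aligned corpus in which both sides must have the same number of lines; check_corpus_stem is a hypothetical helper name, not part of Moses:

```shell
#!/bin/sh
# Sketch only: check_corpus_stem is a hypothetical helper, not a Moses script.
# Usage: check_corpus_stem STEM F_EXT E_EXT
# Verifies that STEM.F_EXT and STEM.E_EXT are readable and have the same
# number of lines, as expected for a sentence-aligned parallel corpus.
check_corpus_stem() {
    stem=$1
    f=$2
    e=$3
    for ext in "$f" "$e"; do
        if [ ! -r "$stem.$ext" ]; then
            echo "cannot read $stem.$ext" >&2
            return 1
        fi
    done
    nf=$(( $(wc -l < "$stem.$f") ))
    ne=$(( $(wc -l < "$stem.$e") ))
    if [ "$nf" -ne "$ne" ]; then
        echo "line counts differ: $stem.$f has $nf, $stem.$e has $ne" >&2
        return 1
    fi
    # Print the shared line count
    echo "$nf"
}

# Example matching the explanation above:
# check_corpus_stem ~/whatever/corpus en fr
```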
Regards
Massinissa
------------------------------
_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support
End of Moses-support Digest, Vol 88, Issue 59
*********************************************