Moses-support Digest, Vol 106, Issue 38

Send Moses-support mailing list submissions to
moses-support@mit.edu

To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
moses-support-request@mit.edu

You can reach the person managing the list at
moses-support-owner@mit.edu

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."


Today's Topics:

1. sigtest filtering reordering (Vincent Nguyen)
2. Re: sigtest filtering reordering (Marcin Junczys-Dowmunt)


----------------------------------------------------------------------

Message: 1
Date: Wed, 19 Aug 2015 13:44:18 +0200
From: Vincent Nguyen <vnguyen@neuf.fr>
Subject: [Moses-support] sigtest filtering reordering
To: moses-support <moses-support@mit.edu>
Message-ID: <55D46C12.5090707@neuf.fr>
Content-Type: text/plain; charset="utf-8"

Hi,

the sigtest filtering of the reordering table crashed (whereas the
sigtest filtering of the ttable is still running ...), with no message
about disk space or running out of memory.

There is just a plain "Killed" at the end of stderr. Any clue?


-l = a+e
P(f|e) filter limit: 50
Loading Vocabulary...
Loading existing vocabulary file:
/home/moses/working/training/corpus.2.en.id_voc
Total 1891716 word types loaded
Max VocID=1891716
Vocabulary loaded in 25 seconds.
Loading corpus...
Corpus loaded in 3 seconds.
Loading suffix...
Initialize level-1 buckets...
Suffix loaded in 54 seconds.
Loading offset...
Offset loaded in 2 seconds.
Total: 23400935 sentences loaded.
Loading Vocabulary...
Loading existing vocabulary file:
/home/moses/working/training/corpus.2.fr.id_voc
Total 1952991 word types loaded
Max VocID=1952991
Vocabulary loaded in 26 seconds.
Loading corpus...
Corpus loaded in 14 seconds.
Loading suffix...
Initialize level-1 buckets...
Suffix loaded in 97 seconds.
Loading offset...
Offset loaded in 2 seconds.
Total: 23400935 sentences loaded.
Training corpus: 23400935 lines
\alpha = 16.9683
Sig filter threshold is = 16.9693
..................................................[n:500000]
..................................................[n:1000000]
..................................................[n:1500000]
..................................................[n:2000000]
..................................................[n:2500000]
..................................................[n:3000000]
..................................................[n:3500000]
..................................................[n:4000000]
..................................................[n:4500000]
..................................................[n:5000000]
..................................................[n:5500000]
..................................................[n:6000000]
..................................................[n:6500000]
..................................................[n:7000000]
..................................................[n:7500000]
..................................................[n:8000000]
..................................................[n:8500000]
..................................................[n:9000000]
..................................................[n:9500000]
..................................................[n:10000000]
------------------------------------------------------
unfiltered phrases pairs: 10000000
P(f|e) filter [first]: 1749725 (17.4972%)
significance filter: 6426892 (64.2689%)
TOTAL FILTERED: 8176617 (81.7662%)
FILTERED phrase pairs: 1823383 (18.2338%)
------------------------------------------------------
..................................................[n:10500000]
..................................................[n:11000000]
..................................................[n:11500000]
..................................................[n:12000000]
..................................................[n:12500000]
..................................................[n:13000000]
..................................................[n:13500000]
..................................................[n:14000000]
..................................................[n:14500000]
..................................................[n:15000000]
..................................................[n:15500000]
..................................................[n:16000000]
..................................................[n:16500000]
..................................................[n:17000000]
..................................................[n:17500000]
..................................................[n:18000000]
..................................................[n:18500000]
..................................................[n:19000000]
..................................................[n:19500000]
..................................................[n:20000000]
------------------------------------------------------
unfiltered phrases pairs: 20000000
P(f|e) filter [first]: 3905428 (19.5271%)
significance filter: 12645364 (63.2268%)
TOTAL FILTERED: 16550792 (82.754%)
FILTERED phrase pairs: 3449208 (17.246%)
------------------------------------------------------
..................................................[n:20500000]
..................................................[n:21000000]
..................................................[n:21500000]
..................................................[n:22000000]
..................................................[n:22500000]
..................................................[n:23000000]
..................................................[n:23500000]
..................................................[n:24000000]
..................................................[n:24500000]
..................................................[n:25000000]
..................................................[n:25500000]
..................................................[n:26000000]
..................................................[n:26500000]
..................................................[n:27000000]
..................................................[n:27500000]
..................................................[n:28000000]
..................................................[n:28500000]
..................................................[n:29000000]
..................................................[n:29500000]
..................................................[n:30000000]
------------------------------------------------------
unfiltered phrases pairs: 30000000
P(f|e) filter [first]: 6941355 (23.1378%)
significance filter: 18163654 (60.5455%)
TOTAL FILTERED: 25105009 (83.6834%)
FILTERED phrase pairs: 4894991 (16.3166%)
------------------------------------------------------
..................................................[n:30500000]
..................................................[n:31000000]
..................................................[n:31500000]
..................................................[n:32000000]
..................................................[n:32500000]
..................................................[n:33000000]
..................................................[n:33500000]
..................................................[n:34000000]
..................................................[n:34500000]
..................................................[n:35000000]
..................................................[n:35500000]
..................................................[n:36000000]
..................................................[n:36500000]
..................................................[n:37000000]
..................................................[n:37500000]
..................................................[n:38000000]
..................................................[n:38500000]
..................................................[n:39000000]
..................................................[n:39500000]
..................................................[n:40000000]
------------------------------------------------------
unfiltered phrases pairs: 40000000
P(f|e) filter [first]: 10132773 (25.3319%)
significance filter: 23513247 (58.7831%)
TOTAL FILTERED: 33646020 (84.1151%)
FILTERED phrase pairs: 6353980 (15.8849%)
------------------------------------------------------
..................................................[n:40500000]
..................................................[n:41000000]
..................................................[n:41500000]
..................................................[n:42000000]
..................................................[n:42500000]
..................................................[n:43000000]
..................................................[n:43500000]
..................................................[n:44000000]
..................................................[n:44500000]
..................................................[n:45000000]
..................................................[n:45500000]
..................................................[n:46000000]
..................................................[n:46500000]
..................................................[n:47000000]
..................................................[n:47500000]
..................................................[n:48000000]
..................................................[n:48500000]
.....................................Killed
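
For reference, the two threshold lines in the log above are consistent with
"-l a+e" meaning alpha plus a small epsilon, where alpha is the natural
logarithm of the number of training sentence pairs. A minimal Python check,
assuming epsilon = 0.001 (this value is inferred from the difference between
the two logged numbers; the log itself does not state it):

```python
import math

n_sentences = 23400935         # "Training corpus: 23400935 lines" in the log
epsilon = 0.001                # assumption: inferred from 16.9693 - 16.9683

alpha = math.log(n_sentences)  # natural logarithm of the corpus size
threshold = alpha + epsilon    # what "-l a+e" appears to select

print(f"alpha     = {alpha:.4f}")
print(f"threshold = {threshold:.4f}")
```

The printed values should agree with the "alpha" and "Sig filter threshold"
lines logged by the filter.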

------------------------------

Message: 2
Date: Wed, 19 Aug 2015 14:10:34 +0200
From: Marcin Junczys-Dowmunt <junczys@amu.edu.pl>
Subject: Re: [Moses-support] sigtest filtering reordering
To: Vincent Nguyen <vnguyen@neuf.fr>
Cc: moses-support <moses-support@mit.edu>
Message-ID: <2a24f5fffb67c238647d9f99b4a8b7cb@amu.edu.pl>
Content-Type: text/plain; charset="utf-8"



Hi,

I guess that was the operating system killing the process due to lack of
memory. Do you have a filtered phrase-table already? If yes, you can
just remove the spurious reordering entries with the script
remove-orphaned-reordering-entries.perl (something like that, I am
writing this from memory). You give the pruned phrase-table and the
unpruned reordering model to the script, and the script takes care that
the contents match. The good thing is that it hardly requires any RAM.
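
The idea described above can be sketched in Python. This is not the Perl
script itself, only an illustration of why such a pass needs almost no RAM.
It assumes that both tables use " ||| " as the field separator with source
and target phrase in the first two fields, that both are sorted in the same
order, and that every pair in the pruned phrase-table also occurs in the
reordering table. The names pair_key and drop_orphans, and the sample
lines, are made up for the example:

```python
import io

def pair_key(line):
    """Return the (source, target) phrase pair of a Moses table line."""
    fields = line.rstrip("\n").split(" ||| ")
    return fields[0], fields[1]

def drop_orphans(phrase_table, reordering_table, out):
    """Write only those reordering lines whose pair is in the phrase table.

    Both inputs are read once, line by line, so memory use stays constant.
    If a phrase-table pair is missing from the reordering table, the
    reordering input is exhausted and nothing further is written.
    """
    kept = 0
    reordering = iter(reordering_table)
    for pt_line in phrase_table:
        wanted = pair_key(pt_line)
        # Resume scanning the reordering table where the last match ended.
        for ro_line in reordering:
            if pair_key(ro_line) == wanted:
                out.write(ro_line)
                kept += 1
                break
    return kept

# Tiny made-up tables: the pruned phrase table kept 2 of 4 pairs.
pruned = io.StringIO(
    "a ||| un ||| 0.5 0.5 0.5 0.5 ||| 0-0 ||| 2 2 1\n"
    "house ||| maison ||| 0.8 0.7 0.9 0.6 ||| 0-0 ||| 5 5 4\n"
)
unpruned_reordering = io.StringIO(
    "a ||| un ||| 0.2 0.2 0.6 0.2 0.2 0.6\n"
    "a ||| une ||| 0.3 0.3 0.4 0.3 0.3 0.4\n"
    "house ||| maison ||| 0.1 0.1 0.8 0.1 0.1 0.8\n"
    "the ||| le ||| 0.2 0.3 0.5 0.2 0.3 0.5\n"
)
result = io.StringIO()
kept = drop_orphans(pruned, unpruned_reordering, result)
print(kept)
print(result.getvalue(), end="")
```

With real tables, the StringIO objects would be replaced by open file
handles (for example from gzip.open in text mode if the tables are
gzipped).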

Best,

Marcin

On 2015-08-19 13:44, Vincent Nguyen wrote:

> Hi,
>
> the sigtest filtering of the reordering table crashed (whereas the sigtest filtering of the ttable is still running ...), with no message about disk space or running out of memory.
>
> There is just a plain "Killed" at the end of stderr. Any clue?

------------------------------

_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


End of Moses-support Digest, Vol 106, Issue 38
**********************************************