Send Moses-support mailing list submissions to
moses-support@mit.edu
To subscribe or unsubscribe via the World Wide Web, visit
http://mailman.mit.edu/mailman/listinfo/moses-support
or, via email, send a message with subject or body 'help' to
moses-support-request@mit.edu
You can reach the person managing the list at
moses-support-owner@mit.edu
When replying, please edit your Subject line so it is more specific
than "Re: Contents of Moses-support digest..."
Today's Topics:
1. Re: Fwd: Different translations are obtained from the same
decoder without alignment information (Ergun Bicici)
----------------------------------------------------------------------
Message: 1
Date: Fri, 24 Aug 2018 17:27:14 +0300
From: Ergun Bicici <bicici@gmail.com>
Subject: Re: [Moses-support] Fwd: Different translations are obtained
from the same decoder without alignment information
To: Hieu Hoang <hieuhoang@gmail.com>
Cc: moses-support <moses-support@mit.edu>
Message-ID:
<CAB59qTOgxvwytDLxu3k1MVLS7iQBQLHGXMQRU2w6=Nn2gPb9mA@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"
The tuning step is not repeated. Decoding uses the same moses.ini and the same
input but different parameters:
moses/mosesdecoder/65c75ff/bin/moses -search-algorithm 1
-cube-pruning-pop-limit 5000 -s 5000 -threads 8 -text-type "test" -v 0 -f
wmt18_en-de/evaluation/test.filtered.ini.7 <
wmt18_en-de/evaluation/test.input.tc.1 >
wmt18_en-de/evaluation/test.output.7
vs. with alignment:
moses/mosesdecoder/65c75ff/bin/moses -search-algorithm 1
-cube-pruning-pop-limit 5000 -s 5000 -threads 8 --mark-unknown
--unknown-word-prefix UNK_ --print-alignment-info -text-type "test" -v 0 -f
wmt18_en-de/evaluation/test.filtered.ini.7 <
wmt18_en-de/evaluation/test.input.tc.1 >
wmt18_en-de/evaluation/test.output.9
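The two invocations above differ only in three flags (one of them with a value) and in the output file. A minimal Python sketch that makes the difference explicit follows. The command strings are copied from the two commands above with the binary path shortened to `moses` and the redirections left out; the helper name `extra_tokens` is illustrative and not part of Moses.

```python
import shlex

# Command without the alignment flags (redirections omitted).
BASE = ('moses -search-algorithm 1 -cube-pruning-pop-limit 5000 -s 5000 '
        '-threads 8 -text-type "test" -v 0 '
        '-f wmt18_en-de/evaluation/test.filtered.ini.7')

# Command with the alignment flags (redirections omitted).
ALIGN = ('moses -search-algorithm 1 -cube-pruning-pop-limit 5000 -s 5000 '
         '-threads 8 --mark-unknown --unknown-word-prefix UNK_ '
         '--print-alignment-info -text-type "test" -v 0 '
         '-f wmt18_en-de/evaluation/test.filtered.ini.7')


def extra_tokens(base, other):
    """Return the tokens of `other` that do not occur in `base`, in order."""
    base_tokens = set(shlex.split(base))
    return [tok for tok in shlex.split(other) if tok not in base_tokens]


print(extra_tokens(BASE, ALIGN))
```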
Both are followed by the same post-processing steps:
moses/mosesdecoder/scripts/ems/support/remove-segmentation-markup.perl <
wmt18_en-de/evaluation/test.output.7 > wmt18_en-de/evaluation/test.cleaned.7
moses/mosesdecoder/scripts/recaser/detruecase.perl <
wmt18_en-de/evaluation/test.cleaned.7 >
wmt18_en-de/evaluation/test.truecased.7
and equivalently with:
moses/mosesdecoder/scripts/ems/support/remove-segmentation-markup.perl <
wmt18_en-de/evaluation/test.output.9 > wmt18_en-de/evaluation/test.cleaned.9
moses/mosesdecoder/scripts/recaser/detruecase.perl <
wmt18_en-de/evaluation/test.cleaned.9 >
wmt18_en-de/evaluation/test.truecased.9
The scoring step uses test.truecased.7 and test.truecased.9.
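One way to check whether the decoder actually chose different translations, or whether the two outputs differ only in what the extra flags print, is to normalise both decoder outputs before comparing them. The sketch below assumes that `--print-alignment-info` appends the alignment after a `|||` separator and that `--mark-unknown` with `--unknown-word-prefix UNK_` prepends `UNK_` to unknown tokens. Both are assumptions about the output format and should be checked against the actual files. The function names are illustrative.

```python
import sys


def normalise(line, unk_prefix="UNK_", sep="|||"):
    """Drop an assumed trailing alignment field and assumed unknown-word prefixes."""
    # Keep only the text before the first separator (assumed alignment field after it).
    text = line.split(sep, 1)[0]
    # Strip the assumed unknown-word prefix from each token.
    tokens = [tok[len(unk_prefix):] if tok.startswith(unk_prefix) else tok
              for tok in text.split()]
    return " ".join(tokens)


def differing_lines(lines_a, lines_b):
    """Return 1-based indices of sentences that still differ after normalisation.

    Sentences are compared pairwise; if the inputs have different lengths,
    the extra lines of the longer one are ignored.
    """
    return [i for i, (a, b) in enumerate(zip(lines_a, lines_b), start=1)
            if normalise(a) != normalise(b)]


if __name__ == "__main__" and len(sys.argv) == 3:
    with open(sys.argv[1], encoding="utf-8") as fa, \
         open(sys.argv[2], encoding="utf-8") as fb:
        print(differing_lines(fa.readlines(), fb.readlines()))
```

It could be run as `python compare_outputs.py test.output.7 test.output.9` (the script name is hypothetical). If no lines differ after normalisation, that would suggest the score gap comes from the printed markers rather than from the search itself; if lines do differ, the decoder really produced different translations.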
Ergun
On Fri, Aug 24, 2018 at 5:15 PM Ergun Bicici <bicici@gmail.com> wrote:
>
> Only the evaluation decoding steps are repeated, i.e. steps 10, 9, and
> 7 in the following EMS output:
> 48 TRAINING:consolidate -> re-using (1)
> 47 TRAINING:prepare-data -> re-using (1)
> 46 TRAINING:run-giza -> re-using (1)
> 45 TRAINING:run-giza-inverse -> re-using (1)
> 44 TRAINING:symmetrize-giza -> re-using (1)
> 43 TRAINING:build-lex-trans -> re-using (1)
> 40 TRAINING:build-osm -> re-using (1)
> 39 TRAINING:extract-phrases -> re-using (1)
> 38 TRAINING:build-reordering -> re-using (1)
> 37 TRAINING:build-ttable -> re-using (1)
> 34 TRAINING:create-config -> re-using (1)
> 28 TUNING:truecase-input -> re-using (1)
> 24 TUNING:truecase-reference -> re-using (1)
> 21 TUNING:filter -> re-using (1)
> 20 TUNING:apply-filter -> re-using (1)
> 19 TUNING:tune -> re-using (1)
> 18 TUNING:apply-weights -> re-using (1)
> 15 EVALUATION:test:truecase-input -> re-using (1)
> 12 EVALUATION:test:filter -> re-using (1)
> 11 EVALUATION:test:apply-filter -> re-using (1)
> 10 EVALUATION:test:decode -> run
> 9 EVALUATION:test:remove-markup -> run
> 7 EVALUATION:test:detruecase-output -> run
> 3 EVALUATION:test:multi-bleu-c -> run
> 2 EVALUATION:test:analysis-coverage -> re-using (1)
> 1 EVALUATION:test:analysis-precision -> run
>
>
> On Fri, Aug 24, 2018 at 4:39 PM Hieu Hoang <hieuhoang@gmail.com> wrote:
>
>> Are you rerunning tuning for each case? Or are you using exactly the same
>> moses.ini file for the with and without alignment experiments?
>>
>> Hieu Hoang
>> http://statmt.org/hieu
>>
>>
>> On Fri, 24 Aug 2018 at 14:34, Ergun Bicici <bicici@gmail.com> wrote:
>>
>>>
>>> Dear Moses maintainers,
>>>
>>> I discovered that the translations obtained differ when the alignment flags (--mark-unknown
>>> --unknown-word-prefix UNK --print-alignment-info) are used. A comparison
>>> table is attached (en-ru and ru-en are being recomputed). We expect them to
>>> be the same, since the alignment flags only print additional information and
>>> are not supposed to alter decoding. In both cases, the same EMS system was
>>> re-run, once with and once without the alignment information flags.
>>>
>>> - The average of the absolute difference is 0.0094 BLEU (about 1 BLEU
>>> point).
>>> - The average of the signed difference is 0.0051 BLEU (about 0.5 BLEU points;
>>> results are better with alignment flags).
>>>
>>> /opt/Programs/SMT/moses/mosesdecoder/bin/moses --version
>>>
>>> Moses code version (git tag or commit hash):
>>> mmt-mvp-v0.12.1-2775-g65c75ff07-dirty
>>> Libraries used:
>>> Boost version 1.62.0
>>>
>>> git status
>>> On branch RELEASE-4.0
>>> Your branch is up to date with 'origin/RELEASE-4.0'.
>>>
>>>
>>> Note: Using alignment information to recase tokens was tried in [1] for
>>> en-fi and en-tr, with positive results claimed. We tried this method in all
>>> translation directions we considered. As can be seen in the align row,
>>> it only improves the performance for tr-en and en-tr, and for tr-en Moses
>>> provides better translations without the alignment flags.
>>> [1] The JHU Machine Translation Systems for WMT 2016
>>> Shuoyang Ding, Kevin Duh, Huda Khayrallah, Philipp Koehn and Matt Post
>>> http://www.statmt.org/wmt16/pdf/W16-2310.pdf
>>>
>>>
>>> Best Regards,
>>> Ergun
>>>
>>> Ergun Biçici
>>> http://bicici.github.com/ <http://ergunbicici.blogspot.com/>
>>>
>>> _______________________________________________
>>> Moses-support mailing list
>>> Moses-support@mit.edu
>>> http://mailman.mit.edu/mailman/listinfo/moses-support
>>>
>>
>
> --
>
> Regards,
> Ergun
>
>
>
--
Regards,
Ergun
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image.png
Type: image/png
Size: 59618 bytes
Desc: not available
Url : http://mailman.mit.edu/mailman/private/moses-support/attachments/20180824/d619defc/attachment.png
------------------------------
End of Moses-support Digest, Vol 142, Issue 14
**********************************************