//Leverage translation assets for MT customization//
Prepare existing data (previous translations & terminology)
– Is the format suitable?
– Is the content suitable?
- are there any gaps in the aligned TM pairs?
- are there any duplicates in the translation memory?
- how are translation units defined? (word level, sentence level, paragraph level)
- are TMs free of formatting data?
- are sentences too long?
- is terminology properly identified?
- are term equivalents correctly established?
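Several of the content checks above can be automated before any MT training. The following is a minimal sketch, assuming the TM has already been exported as a list of (source, target) string pairs. The function name `audit_tm`, the tag pattern and the 40-word length threshold are illustrative assumptions, not part of any TM tool.

```python
import re

# Assumption: formatting residue looks like XML-style tags or {1}-style placeholders.
TAG_PATTERN = re.compile(r"<[^>]+>|\{\d+\}")


def audit_tm(units, max_words=40):
    """Return a dict mapping an issue name to the indices of offending units.

    units: list of (source, target) string pairs.
    max_words: hypothetical threshold above which a segment counts as too long.
    """
    issues = {"gap": [], "duplicate": [], "formatting": [], "too_long": []}
    seen = set()
    for i, (src, tgt) in enumerate(units):
        src_n, tgt_n = src.strip(), tgt.strip()
        # Gap: one side of the aligned pair is empty.
        if not src_n or not tgt_n:
            issues["gap"].append(i)
            continue
        # Duplicate: the same pair (ignoring case) was already seen.
        key = (src_n.lower(), tgt_n.lower())
        if key in seen:
            issues["duplicate"].append(i)
        seen.add(key)
        # Formatting data left inside either segment.
        if TAG_PATTERN.search(src_n) or TAG_PATTERN.search(tgt_n):
            issues["formatting"].append(i)
        # Overly long segments on either side.
        if len(src_n.split()) > max_words or len(tgt_n.split()) > max_words:
            issues["too_long"].append(i)
    return issues


sample = [
    ("Press the button.", "Drücken Sie die Taste."),
    ("Press the button.", "Drücken Sie die Taste."),
    ("Open the <b>File</b> menu.", "Öffnen Sie das Menü <b>Datei</b>."),
    ("Save your work.", ""),
]
report = audit_tm(sample)
print(report)
```

The terminology questions (term identification and equivalents) usually need a termbase and human review, so they are left out of this sketch.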
This preparation can make a real difference in MT output quality: for example, an increase in BLEU score from 28% to 33% (30% is rated as “understandable” content). Some reports indicate scores of up to 64.2% (an experiment on the German-English pair, Schnaider 2012).
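For readers unfamiliar with BLEU, the score compares word n-grams in the MT output against a reference translation. The following is a simplified sentence-level sketch with a single reference and whitespace tokenization. Reported BLEU figures such as those above are normally computed over a whole test corpus with a standard tool, so this illustrates the idea only; `sentence_bleu` here is a hypothetical helper, not a library function.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_bleu(candidate, reference, max_n=4):
    """Simplified BLEU for one candidate and one reference, in the range 0..1."""
    cand = candidate.split()
    ref = reference.split()
    if not cand:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngrams(cand, n)
        ref_counts = ngrams(ref, n)
        total = sum(cand_counts.values())
        # Clipped matches: a candidate n-gram counts at most as often
        # as it occurs in the reference.
        clipped = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        if total == 0 or clipped == 0:
            return 0.0  # no smoothing in this sketch
        log_precisions.append(math.log(clipped / total))
    # Brevity penalty: penalize candidates shorter than the reference.
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / len(cand))
    return bp * math.exp(sum(log_precisions) / max_n)


exact = sentence_bleu("the cat sat on the mat", "the cat sat on the mat")
partial = sentence_bleu("the cat sat on the mat today", "the cat sat on the mat")
print(f"exact match: {exact * 100:.1f}%, partial match: {partial * 100:.1f}%")
```

Multiplying the result by 100 gives the percentage form used in the figures above.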