Statistical Error Analysis of Machine Translation: The Case of Arabic
Abstract
In this paper, we present a study of automatic error analysis in the context of machine translation into Arabic. We have built a pipeline tool that evaluates machine translation outputs and identifies their errors. We also perform a statistical analysis based on cumulative link models, both to obtain a global overview of the errors made by statistical machine translation from English to Arabic and to investigate the relationship between the errors encountered and the human perception of machine translation quality. As expected, this analysis shows that lexical, semantic and reordering errors have a more significant impact than other errors related to the fluency of the machine translation outputs.
Keywords
Machine translation evaluation, error analysis, cumulative link models, Arabic NLP
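For readers unfamiliar with cumulative link models, the following is a minimal sketch of the logit form, P(Y <= j) = logistic(theta_j - x . beta), applied to an ordinal quality rating. It is only an illustration and not the paper's fitted model. The error categories, weights, thresholds and the 5-point rating scale are all hypothetical values chosen for the example.

```python
import math


def logistic(z):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


def category_probabilities(error_counts, weights, thresholds):
    """Return P(Y = j) for each ordinal rating j under a logit cumulative link model.

    The model is P(Y <= j) = logistic(theta_j - eta), with eta = sum_k weights[k] * error_counts[k].
    `thresholds` must be strictly increasing; K thresholds give K + 1 rating categories.
    """
    eta = sum(w * x for w, x in zip(weights, error_counts))
    # Cumulative probabilities for every category; the last one is always 1.
    cumulative = [logistic(theta - eta) for theta in thresholds] + [1.0]
    probabilities = []
    previous = 0.0
    for c in cumulative:
        probabilities.append(c - previous)
        previous = c
    return probabilities


# Hypothetical values, NOT taken from the paper:
# weights for (lexical, reordering, fluency) error counts; negative weights
# push probability mass toward lower quality ratings.
WEIGHTS = [-0.8, -0.5, -0.2]
# Four thresholds define a hypothetical 5-point rating scale (1 = worst, 5 = best).
THRESHOLDS = [-2.0, -1.0, 0.0, 1.0]

probs_clean = category_probabilities([0, 0, 0], WEIGHTS, THRESHOLDS)
probs_bad = category_probabilities([3, 2, 1], WEIGHTS, THRESHOLDS)

print("no errors:   ", [round(p, 3) for p in probs_clean])
print("many errors: ", [round(p, 3) for p in probs_bad])
```

In this sketch, a larger absolute weight for an error type means that each additional error of that type shifts the predicted rating distribution more strongly, which is the sense in which one error category can have a more significant impact than another.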