Towards the automatic detection of the source language of a literary translation
Citation:
Lynch, Gerard and Carl Vogel, Towards the automatic detection of the source language of a literary translation, The 24th International Conference on Computational Linguistics (COLING2012), Mumbai, India, December 8-15, 2012, Martin Kay and Christian Boitet, Volume 1: Posters, ACL, 2012, 775-784
Download Item:
LVColingCamReady.pdf (Published (author's copy) - Peer Reviewed) 164.0Kb
Abstract:
Experiments on the detection of the source language of literary translations are described. Two feature types are exploited, n-gram based features and document-level statistics. Cross-validation results on a corpus of twenty 19th-century texts including translations from Russian, French, German and texts written in English are promising: single feature classifiers yield significant gains on the baseline, although classifiers containing a combination of feature types outperform these, bringing L1 detection accuracy to ~80% using ten-fold training set cross validation. Average test set results are slightly lower but still comparable to the cross-validation results. Relative frequencies of a number of salient features are studied, including several English contractions (I'll, that's, etc.) and uncontracted forms; we articulate hypotheses, anchored in source languages, towards explaining differences.
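As an illustration of the relative-frequency comparison mentioned in the abstract, the following is a minimal sketch and not the authors' implementation. The feature list, the pairing of contractions with uncontracted forms, the tokenizer, and the normalisation by total token count are all assumptions made for this example.

```python
import re
from collections import Counter

# Hypothetical feature list: the two contractions named in the abstract,
# plus uncontracted counterparts (the pairing is an assumption).
FEATURES = ["i'll", "that's", "i will", "that is"]


def tokenize(text):
    # Lowercase, normalise curly apostrophes, and keep apostrophes inside
    # words so that contractions stay single tokens.
    text = text.lower().replace("\u2019", "'")
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text)


def ngram_counts(tokens, n):
    # Count word n-grams, joined with single spaces.
    return Counter(" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def relative_frequencies(text, features=FEATURES):
    # Relative frequency = occurrences of the feature / total tokens.
    tokens = tokenize(text)
    if not tokens:
        return {f: 0.0 for f in features}
    counts = {}
    for n in {len(f.split()) for f in features}:
        counts.update(ngram_counts(tokens, n))
    return {f: counts.get(f, 0) / len(tokens) for f in features}
```

Calling `relative_frequencies` on each text in a corpus would give one small feature vector per document, which could then be compared across source languages or passed to a classifier of the reader's choice.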
Sponsor:
Science Foundation Ireland (SFI)
Grant Number:
07/CE/I1142
Author's Homepage:
http://people.tcd.ie/vogel
Description:
PUBLISHED
Mumbai, India
Author: VOGEL, CARL
Other Titles:
The 24th International Conference on Computational Linguistics (COLING2012)
Publisher:
ACL
Type of material:
Conference Paper
Collections:
Series/Report no:
Volume 1: Posters
Availability:
Full text available
Licences: