Identification and interpretation of figurative language with computational semantic models
Citation:
Aaron Gerow, 'Identification and interpretation of figurative language with computational semantic models', [thesis], Trinity College (Dublin, Ireland). School of Computer Science & Statistics, 2014, pp 240
Download Item:
Gerow, Aaron_TCD-SCSS-PHD-2014-04.pdf (PDF) 15.61Mb
Gerow TCD THESIS 10776 Identification and.pdf (Scan of TCD Library print copy) 149.0Mb
Abstract:
This thesis is about the automatic extraction of metaphors as they appear in English text. This
task is important to research in information retrieval, corpus linguistics and computational linguistics.
The work was motivated by theories of metaphor comprehension and statistical semantics
and contributes to areas of natural language processing (NLP) and information extraction where
figurative language continues to present a challenge. Chapter 2 reviews related psychological and
computational work and provides a foundation for a method described in chapter 3. Chapter 4
describes my implementation of this method – a system called MetID. Chapter 5 evaluates MetID
on three increasingly difficult tasks: identification, interpretation and extraction of figurative language.
The final chapter describes the contribution of this research, contextualising it in light of
the research goals, and concludes with a discussion of future work.
Methods and techniques of the project were inspired by research on how people comprehend
metaphors, by linguistic research on how metaphor is used in text, and by NLP techniques for
extracting particular types of metaphor. The goal was to build and test a system for automatically
finding and providing interpretations of figurative language. A central task is representing
word associations that account for the semantics of figurative language. Specifically, three types
of lexical models were evaluated: WordNet, distributional semantic models and co-occurrence
likelihood estimation. The method also uses a number of heuristics that typically mark linguistic
metaphor, such as selectional violation and predication. The system can be used to analyse
individual phrases, a corpus (which can simultaneously be used to build the lexical model) or a
collection using pre-built models. The output is a ranked list of candidate metaphors by which
to interpret a statement. For example, analysing “my heart is on fire” produces the interpretation
AFFECTION AS WARMTH. The system attempts to account for two common forms: noun- and
verb-based metaphors. Evaluation results suggest that the method performs significantly above
chance on noun-based statements but not on verb-based statements. The choice of lexical model has a significant
effect when analysing noun-based statements, but not verb-based ones. The results on an interpretation
task, which were validated with participant ratings, showed that 1) noun-based statements were more
easily interpreted, 2) the system was better at interpreting figurative statements than literal statements
and 3) in some configurations, the system’s scores correlate strongly with participant ratings.
Additionally, an interesting interaction was found: the literal / non-literal distinction mediated the
role of a statement’s grammatical form when considering the quality of interpretation. Finally, a case
study was used to aid a corpus-based terminological analysis of the word contagion in finance and
economics, where it has been adopted with a number of figurative features.
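Illustrative sketch (not taken from the thesis): the abstract describes output as a ranked list of candidate metaphors scored with a lexical model. The Python below shows one simple way such a ranking could work, assuming a distributional model reduced to toy three-dimensional vectors and a fixed candidate list. The vectors, the candidate list, and the names TOY_VECTORS, CANDIDATES, cosine and rank_candidates are all invented for illustration; this is not MetID's actual algorithm or data.

```python
import math

# Toy word vectors, invented purely for illustration (assumption: a real
# distributional model would learn these from a corpus).
TOY_VECTORS = {
    "heart":     (0.9, 0.1, 0.2),
    "fire":      (0.1, 0.9, 0.1),
    "affection": (0.8, 0.2, 0.1),
    "warmth":    (0.2, 0.8, 0.2),
    "argument":  (0.1, 0.2, 0.9),
    "war":       (0.3, 0.3, 0.8),
}

# Hypothetical candidate metaphors as (topic concept, vehicle concept) pairs.
CANDIDATES = [("argument", "war"), ("affection", "warmth")]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_candidates(topic_word, vehicle_word,
                    candidates=CANDIDATES, vectors=TOY_VECTORS):
    """Rank candidate metaphors for a topic/vehicle word pair.

    Each candidate is scored by the mean of two similarities:
    topic word to topic concept, and vehicle word to vehicle concept.
    """
    scored = []
    for topic_concept, vehicle_concept in candidates:
        score = (cosine(vectors[topic_word], vectors[topic_concept])
                 + cosine(vectors[vehicle_word], vectors[vehicle_concept])) / 2
        label = f"{topic_concept.upper()} AS {vehicle_concept.upper()}"
        scored.append((label, score))
    # Highest-scoring candidate first.
    return sorted(scored, key=lambda item: item[1], reverse=True)


# "my heart is on fire": topic word "heart", vehicle word "fire".
ranking = rank_candidates("heart", "fire")
for label, score in ranking:
    print(f"{label}: {score:.3f}")
```

With these toy vectors the top-ranked candidate is AFFECTION AS WARMTH, matching the example in the abstract only because the vectors were chosen to make it so. Swapping in a different lexical model would mean replacing the similarity function while keeping the ranking step unchanged.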
Author: Gerow, Aaron
Advisor: Ahmad, Khurshid
Qualification name: Doctor of Philosophy (Ph.D.)
Publisher: Trinity College (Dublin, Ireland). School of Computer Science & Statistics
Note: TARA (Trinity's Access to Research Archive) has a robust takedown policy. Please contact us if you have any concerns: rssadmin@tcd.ie
Type of material: thesis
Availability: Full text available