Investigating the application of structured representations of unstructured content in personalisation tasks
Citation: MC GOVERN, AONGHUS, Investigating the application of structured representations of unstructured content in personalisation tasks, Trinity College Dublin. School of Computer Science & Statistics, 2019
Thesis_to_submit.pdf (PhD thesis, examined and approved) 825.6Kb
For personalisation approaches that analyse unstructured content, a common task is converting that unstructured content to a structured representation. Each structured representation has strengths and weaknesses, and the choice of representation should be made with respect to the personalisation task at hand. However, the way in which the choice of structured representation affects the personalisation that can be performed using that representation has not been clearly articulated. This is because personalisation approaches tend to focus on the success of their chosen personalisation task (e.g. recommendation accuracy) without examining how the characteristics of their chosen structured representation influenced this success. This motivates an investigation of the characteristics of structured representations in the context of different personalisation tasks. This investigation is the subject of this thesis, and is carried out as a series of experiments. Each of these experiments examines the effect of a single characteristic of structured representations on personalisation performance. The first experiment investigates how the inability of the Named Entity and Bag-of-Words representations to capture context limits their ability to fully represent different forms of user expression. This limitation can be overcome by leveraging the contextual information contained in the conceptual hierarchy of an external linguistic resource. The second experiment describes a comparison between the conceptual hierarchies of two different kinds of external linguistic resource: a purely lexical resource (WordNet) and a general knowledge base (DBpedia). The comparison takes the form of an investigation of the ability of each resource to represent the differences between users' descriptions of their interests and knowledge on Twitter and their descriptions of the same characteristics on LinkedIn.
The results of this experiment indicate that the DBpedia-based approach is the more effective of the two. Another finding of this experiment is that a structured representation's inability to accurately reflect category distinctions affects its ability to provide accurate recommendations. The third experiment investigates this finding through a series of recommendation tasks spanning multiple domains, user models and recommendation methods. This experiment yields a test to determine whether a structured representation accurately reflects domain category distinctions. Furthermore, this experiment reveals that structured representations that pass this test will facilitate accurate recommendation, while those that do not pass it will not. The contributions of this thesis consist of indicative guidelines as to the limitations of particular structured representations, as well as guidance on methods for addressing these limitations.
Author: MC GOVERN, AONGHUS
Publisher: Trinity College Dublin. School of Computer Science & Statistics. Discipline of Computer Science
Type of material: Thesis
Availability: Full text available