Information Visualisation Applied to Corpus Linguistic Methodologies
Citation:
Sheehan, Shane, Information Visualisation Applied to Corpus Linguistic Methodologies, Trinity College Dublin, School of Computer Science & Statistics, Computer Science, 2023

Abstract:
This thesis uses established visualisation design methods to characterize problems in corpus linguistics. The identified problem areas are concordance collocation patterns, frequency list comparison, and concordance meta-data analysis. The identification of these problems required collaboration with researchers from corpus linguistics. These collaborations explored example methodologies and research questions in the domain.
Each of the three identified problem areas was addressed by designing visualisation tools. The three visualisations described in the thesis are:
Mosaic visualisation of positional collocation patterns in concordance.
ComFre visualisation for frequency list comparison.
MetaFacet visualisation for exploring meta-data facet distributions of concordance lists.
A mix of encoding justifications, methodological impact and adoption, and a laboratory study is used to validate the visualisations.
Mosaic effectively visualized collocation patterns, showing improved speed and accuracy over established methods. Concordance Mosaic's methodological impact was also high, as corpus linguistics researchers adopted it to improve the efficiency of their analysis. The ComFre visualisation was effective in comparing frequency lists, even where the lists are of vastly different sizes. Its methodological impact was assessed as low, since the only evidence of methodological adoption was an example method created by a domain expert to demonstrate the tool's usefulness. MetaFacet was not available during the methodological review process. It does, however, show clear advantages in task time for methodologies revealed during the review.
Sponsor: ADAPT: Centre for Digital Content Platform Research
Grant Number: 13/RC/2106
Author: Sheehan, Shane
Advisor: Luz, Saturnino; Emms, Martin
Type of material: Thesis
Availability: Full text available
Keywords: Visualisation, Corpus, Linguistics, NLP, Concordance, Text analysis