Browsing by Subject "Natural Language Processing"
Analysis and Insights from the PARSEME Shared Task dataset
(Language Science Press, 2018) The PARSEME Shared Task on the automatic identification of verbal multiword expressions (VMWEs) was the first collaborative study on the subject to cover a wide and diverse range of languages. One observation that emerged ...
Assessing Human-Parity in Machine Translation on the Segment Level
(Association for Computational Linguistics, 2020) Recent machine translation shared tasks have shown top-performing systems to tie or in some cases even outperform human translation. Such conclusions about system and human performance are, however, based on estimates ...
Automatic Extraction of Data Governance Knowledge from Slack Chat Channels
(2018) This paper describes a data governance knowledge extraction prototype for Slack channels based on an OWL ontology abstracted from the Collibra data governance operating model and the application of statistical techniques ...
C-HTS: A Concept-based Hierarchical Text Segmentation Approach
(2018) Hierarchical Text Segmentation is the task of building a hierarchical structure out of text to reflect its sub-topic hierarchy. Current text segmentation approaches are based upon using lexical and/or syntactic similarity ...
Findings of the 2021 Conference on Machine Translation (WMT21)
(Association for Computational Linguistics, 2021) This paper presents the results of the news translation task, the multilingual low-resource translation for Indo-European languages, the triangular translation task, and the automatic post-editing task organised as part ...
The Impact of Training Data Bias on Automatic Generation of Video Captions
(2019) A major issue in machine learning is availability of training data. While this historically referred to the availability of a sufficient volume of training data, recently this has shifted to the availability of sufficient ...
Improving Document-level Sentiment Analysis with User and Product Context
(Association for Computational Linguistics, 2020) Past work that improves document-level sentiment analysis by encoding user and product information has been limited to considering only the text of the current review. We investigate incorporating additional review text ...
Improving Unsupervised Question Answering via Summarization-Informed Question Generation
(Association for Computational Linguistics, 2021) Question Generation (QG) is the task of generating a plausible question for a given <passage, answer> pair. Template-based QG uses linguistically-informed heuristics to transform declarative sentences into interrogatives, ...
Moral Sentiment: Investigating the Roles of Ethics and Affect in Determining Asset Returns
(Trinity College Dublin. School of Computer Science & Statistics. Discipline of Computer Science, 2017) Research in behavioural finance claims that markets are inefficient, in the sense that market prices can deviate from their fundamental value. Proponents of this theory suggest that investors succumb to emotional and ...
News, Sentiment, and Financial Markets: A Computational System to Evaluate the Influence of Text Sentiment on Financial Assets.
(Trinity College Dublin, 2016) With the advent of the internet and digitisation of news and books, the volume of unstructured text has increased dramatically in recent years. This deluge of information is set to grow and come from new and unconventional ...
Quantification of Mutual Understanding in Task-Based Human-Human Interactions
(Trinity College Dublin. School of Computer Science & Statistics. Discipline of Computer Science, 2021) This thesis explores the quantification of mutual understanding in task-based interactions by observing the relation between patterns of repetitions and measures of communicative success. Two important characteristics of ...
Semantic reranking of CRF label sequences for verbal multiword expression identification
(Language Science Press, 2018) Verbal multiword expressions (VMWE) identification can be addressed successfully as a sequence labelling problem via conditional random fields (CRFs) by returning the one label sequence with maximal probability. This work ...
A Semi-Automatic Indexing System for Cell Images
(IEEE, 2008) A method is described that can be used for annotating and indexing an arbitrary set of images with texts collateral to the images. The collateral texts comprise digitised texts, e.g. journal papers and newspapers in which ...
Statistical Power and Translationese in Machine Translation Evaluation
(Association for Computational Linguistics, 2020) The term translationese has been used to describe features of translated text, and in this paper, we provide detailed analysis of potential adverse effects of translationese on machine translation evaluation. Our analysis ...
Stylochronometry: Timeline Prediction in Stylometric Analysis
(Springer, 2015) We examine stylochronometry, the question of measuring change in linguistic style over time within an authorial canon and in relation to change in language in general use over a contemporaneous period. We take the works ...
The Third Multilingual Surface Realisation Shared Task (SR’20): Overview and Evaluation Results
(2020) This paper presents results from the Third Shared Task on Multilingual Surface Realisation (SR’20) which was organised as part of the COLING’20 Workshop on Multilingual Surface Realisation. As in SR’18 and SR’19, the shared ...
Towards efficient string processing of annotated events
(2017) This paper explores the use of strings as models to effectively represent event data such as might be found in a document annotated with ISO-TimeML. We describe the translation of such data to strings, as well as a number ...