Interaction-based information retrieval in multimodal, online, artefact-focused meeting recordings
Citation:
Matt-Mouley Bouamrane, 'Interaction-based information retrieval in multimodal, online, artefact-focused meeting recordings', [thesis], Trinity College (Dublin, Ireland). School of Computer Science & Statistics, 2007, pp 204
Download Item:
Bouamrane, Matt-Mouley_TCD-SCSS-PHD-2007-06.pdf (PDF) 5.401Mb
Abstract:
Traditional search, topic detection, or summarisation of meeting recordings is generally performed using segmentation and indexing techniques originally developed in the field of Multimedia Information Retrieval. However, many assumptions that hold in a media production environment are often inadequate when applied to spontaneous meeting recordings, as features such as sudden changes in sound energy levels or high motion can be sparse or even nonexistent in typical meetings. As a result, the dominant paradigm for meeting browsing currently consists of performing text-based information retrieval operations on automatic speech recognition (ASR) transcripts. However, meetings produce another type of information not readily available in other multimedia recordings: interactions between participants. Although there is growing interest in using this rich source of information, it remains difficult to harness due to the current limitations of (speech, gesture or higher-level action) recognition technologies.
In computer-mediated online meetings, in which a space-based artefact (a shared text or graphical document) acts as the focal point of the meeting, it is possible to generate metadata describing low-level actions of participants during the meeting. The semantics of these actions are manifold: they are defined by the person who performed the actions, and by the nature, content, timing, and finally the context (or target) of these actions. We explore a number of segmentation, indexing and search techniques based specifically on information collected about participants' actions. We developed a temporal model in which navigation of meeting recordings is performed according to actions' content or actions' context. We investigate the relationships between the timing and content of actions and concurrent speech communications, and whether the temporal distance between the content of certain actions can be used as a reliable indication of semantic relatedness (topic) between neighbouring actions. We explore visualisation of meeting information centred on the concept of data objects with persisting histories, rather than the more traditional multimedia concept of media streams. A meeting browsing tool called the "Meeting Miner" was implemented. The Meeting Miner was evaluated through an analytic evaluation, a usability study, and a task-oriented information retrieval experiment. We complement the emerging Browser Evaluation Test (BET) framework with additional performance metrics. Our results showed that interaction-based techniques incorporated into meeting browsing systems can indeed be used efficiently for navigating multimodal meeting recordings.
Author: Bouamrane, Matt-Mouley
Advisor: Luz, Saturnino
Qualification name: Doctor of Philosophy (Ph.D.)
Publisher: Trinity College (Dublin, Ireland). School of Computer Science & Statistics
Type of material: thesis
Availability: Full text available