Using broad phonetic group experts for improved speech recognition
Citation:
Scanlon, P. and Ellis, D. and Reilly, R. B. 'Using broad phonetic group experts for improved speech recognition' in IEEE Transactions on Audio Speech and Language Processing, 15, (3), 2007, pp. 803–812.
Download Item:
broad phonetics.pdf (publisher pdf) 1.094Mb
Abstract:
In phoneme recognition experiments, it was found
that approximately 75% of misclassified frames were assigned
labels within the same broad phonetic group (BPG). While the
phoneme can be described as the smallest distinguishable unit
of speech, phonemes within BPGs contain very similar characteristics
and can be easily confused. However, different BPGs,
such as vowels and stops, possess very different spectral and temporal
characteristics. In order to accommodate the full range of
phonemes, acoustic models of speech recognition systems calculate
input features from all frequencies over a large temporal context
window. A new phoneme classifier is proposed consisting of a
modular arrangement of experts, with one expert assigned to each
BPG and focused on discriminating between phonemes within that
BPG. Due to the different temporal and spectral structure of each
BPG, novel feature sets are extracted using mutual information, to
select a relevant time-frequency (TF) feature set for each expert.
To construct a phone recognition system, the output of each expert
is combined with a baseline classifier under the guidance of a
separate BPG detector. Considering phoneme recognition experiments
using the TIMIT continuous speech corpus, the proposed
architecture afforded significant error rate reductions up to 5%
relative.
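The abstract does not state how the expert outputs, the baseline classifier, and the BPG detector are combined. As a rough illustration only, the sketch below assumes one simple possibility: each expert's within-group posterior is scaled by the detector's probability for that group and then averaged with the baseline posterior. The function name, the two-group phoneme inventory, the toy probabilities, and the equal 0.5/0.5 weighting are all hypothetical and are not taken from the paper.

```python
# Hypothetical sketch: names, toy numbers, and the blending rule are assumptions
# for illustration, not the combination scheme used in the paper.

def combine_posteriors(baseline, experts, bpg_probs, groups, weight=0.5):
    """Blend baseline phoneme posteriors with per-BPG expert posteriors.

    baseline:  dict phoneme -> P(phoneme | frame) from the baseline classifier
    experts:   dict bpg -> dict phoneme -> P(phoneme | frame, bpg)
    bpg_probs: dict bpg -> P(bpg | frame) from the BPG detector
    groups:    dict bpg -> list of phonemes in that broad phonetic group
    weight:    share given to the expert path (arbitrary illustrative value)
    """
    combined = {}
    for bpg, phonemes in groups.items():
        for ph in phonemes:
            # Expert's within-group posterior, scaled by the detector's
            # belief that the frame belongs to this group.
            expert_score = bpg_probs[bpg] * experts[bpg][ph]
            combined[ph] = (1.0 - weight) * baseline[ph] + weight * expert_score
    # Renormalise so the result is a proper distribution over phonemes.
    total = sum(combined.values())
    return {ph: score / total for ph, score in combined.items()}


# Toy frame: the baseline slightly prefers "iy", but the vowel expert,
# which only has to separate vowels from each other, prefers "ih".
groups = {"vowel": ["iy", "ih"], "stop": ["p", "b"]}
baseline = {"iy": 0.40, "ih": 0.35, "p": 0.15, "b": 0.10}
experts = {
    "vowel": {"iy": 0.3, "ih": 0.7},
    "stop": {"p": 0.5, "b": 0.5},
}
bpg_probs = {"vowel": 0.8, "stop": 0.2}

result = combine_posteriors(baseline, experts, bpg_probs, groups)
best = max(result, key=result.get)
print(best, round(result[best], 3))
```

In this toy frame the baseline and the combined system disagree only within the vowel group, which mirrors the abstract's observation that most frame errors stay inside the correct BPG.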
Sponsor:
Enterprise Ireland
Grant Number:
Author's Homepage:
http://people.tcd.ie/reillyri
Description:
PUBLISHED
Author: REILLY, RICHARD
Publisher:
IEEE
Type of material:
Journal Article
Collections:
Series/Report no:
153
Availability:
Full text available
ISSN:
5184251842
Licences: