HRTF Clustering for Robust Training of a DNN for Sound Source Localization

Boland, Francis

This item is covered by a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International licence.

File Type:

PDF

Item Type:

Journal Article

Date:

2022

Access:

openAccess

Citation:

H. O'Dwyer and F. Boland, HRTF Clustering for Robust Training of a DNN for Sound Source Localization, Journal Audio Engineering Society, 2022, 70, 12, 1015-1026

Download Item:

(HRTF Clustering for Robust Trainng o DNN for SSL.pdf) 3.233Mb

Abstract:

This study shows how spherical sound source localization of binaural audio signals in the mismatched head-related transfer function (HRTF) condition can be improved by implementing HRTF clustering when using machine learning. A new feature set of cross-correlation function, interaural level difference, and Gammatone cepstral coefficients is introduced and shown to outperform state-of-the-art methods in vertical localization in the mismatched HRTF condition by up to 5%. By examining the performance of Deep Neural Networks trained on single HRTF sets from the CIPIC database on other HRTFs, it is shown that HRTF sets can be clustered into groups of similar HRTFs. This results in the formulation of central HRTF sets representative of their specific cluster. By training a machine learning algorithm on these central HRTFs, it is shown that a more robust algorithm can be trained capable of improving sound source localization accuracy by up to 13% in the mismatched HRTF condition. Concurrently, localization accuracy is decreased by approximately 6% in the matched HRTF condition, which accounts for less than 9% of all test conditions. Results demonstrate that HRTF clustering can vastly improve the robustness of binaural sound source localization to unseen HRTF conditions.
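
As a rough illustration of the clustering step described in the abstract, the following Python sketch groups HRTF sets by how well a model trained on one set localizes with another, then picks one central set per group. The abstract does not state which clustering algorithm or distance measure the authors used. The k-medoids procedure, the "1 - accuracy" dissimilarity, the function names, and the toy accuracy matrix below are all assumptions made for illustration, not the paper's method or data.

```python
import numpy as np


def dissimilarity_from_accuracy(acc):
    """Convert a cross-accuracy matrix into a symmetric dissimilarity matrix.

    acc[i, j] is assumed to be the localization accuracy (0..1) of a model
    trained on HRTF set i and tested on HRTF set j (hypothetical layout).
    """
    acc = np.asarray(acc, dtype=float)
    sym = (acc + acc.T) / 2.0      # average both train/test directions
    dist = 1.0 - sym               # high mutual accuracy -> small distance
    np.fill_diagonal(dist, 0.0)    # a set is at distance 0 from itself
    return dist


def k_medoids(dist, k, n_iter=100):
    """Simple alternating k-medoids; returns (medoid indices, labels)."""
    # Deterministic start: the k sets with the smallest total distance.
    medoids = np.argsort(dist.sum(axis=1), kind="stable")[:k]
    for _ in range(n_iter):
        # Assign every set to its nearest medoid.
        labels = np.argmin(dist[:, medoids], axis=1)
        new_medoids = medoids.copy()
        for c in range(k):
            members = np.flatnonzero(labels == c)
            if members.size == 0:
                continue  # keep the old medoid for an empty cluster
            # The new medoid minimizes total distance to its cluster members.
            within = dist[np.ix_(members, members)].sum(axis=1)
            new_medoids[c] = members[np.argmin(within)]
        if np.array_equal(new_medoids, medoids):
            break
        medoids = new_medoids
    labels = np.argmin(dist[:, medoids], axis=1)
    return medoids, labels


# Toy data (invented): six HRTF sets forming two groups of three.
acc = np.full((6, 6), 0.5)
acc[:3, :3] = 0.9
acc[3:, 3:] = 0.9
np.fill_diagonal(acc, 1.0)

dist = dissimilarity_from_accuracy(acc)
medoids, labels = k_medoids(dist, k=2)
print("central sets:", medoids, "cluster labels:", labels)
```

In a real experiment the accuracy matrix would come from training one network per HRTF set and testing it on every other set. The medoids would then play the role of the central HRTF sets used for the final training run.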

URI:

http://hdl.handle.net/2262/102004

Sponsor:

Science Foundation Ireland (SFI)

Grant Number:

13/IA/1900

Author's Homepage:

http://people.tcd.ie/fboland

Series/Report no:

Journal Audio Engineering Society; 70; 12

Availability:

Full text available

Subject (TCD):

Creative Technologies , Digital Engagement , Telecommunications

DOI:

https://doi.org/10.17743/jaes.2022.0051
