dc.contributor.author | Smolic, Aljosa | |
dc.contributor.author | Rana, Aakanksha | |
dc.contributor.author | Ozcinar, Cagri | |
dc.date.accessioned | 2019-11-08T16:02:13Z | |
dc.date.available | 2019-11-08T16:02:13Z | |
dc.date.issued | 2019 | |
dc.date.submitted | 2019 | en |
dc.identifier.citation | Rana, A., Ozcinar, C. & Smolic, A., Towards Generating Ambisonics Using Audio-Visual Cue for Virtual Reality, ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, 44th International Conference on Acoustics, Speech, and Signal Processing, (ICASSP), Forthcoming., 2019, 2012-2016 | en |
dc.identifier.other | Y | |
dc.identifier.uri | https://ieeexplore.ieee.org/document/8683318 | |
dc.identifier.uri | http://hdl.handle.net/2262/90360 | |
dc.description | PUBLISHED | en |
dc.description.abstract | Ambisonics, i.e., full-sphere surround sound, is quintessential with 360◦ visual content to provide a realistic virtual reality (VR) experience. While 360◦ visual content capture has gained a tremendous boost recently, the estimation of corresponding spatial sound is still challenging due to the required sound-field microphones or information about the sound-source locations. In this paper, we introduce the novel problem of generating Ambisonics in 360◦ videos using audio-visual cues. To this end, firstly, a novel 360◦ audio-visual video dataset of 265 videos with annotated sound-source locations is introduced. Secondly, a pipeline is designed for the automatic Ambisonics estimation problem. Benefiting from deep-learning-based audio-visual feature-embedding and prediction modules, our pipeline estimates the 3D sound-source locations and further uses these locations to encode the audio into the B-format. To benchmark our dataset and pipeline, we additionally propose evaluation criteria to investigate the performance using different 360◦ input representations. Our results demonstrate the efficacy of the proposed pipeline and open up a new area of research in 360◦ audio-visual analysis for future investigations. | en |
dc.format.extent | 2012-2016 | en |
dc.language.iso | en | en |
dc.rights | Y | en |
dc.subject | Virtual reality | en |
dc.subject | 360 video | en |
dc.subject | Spatial sound | en |
dc.subject | Ambisonics | en |
dc.subject | Multi-modal | en |
dc.subject | Deep learning | en |
dc.title | Towards Generating Ambisonics Using Audio-Visual Cue for Virtual Reality | en |
dc.title.alternative | ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings | en |
dc.title.alternative | 44th International Conference on Acoustics, Speech, and Signal Processing, (ICASSP), Forthcoming. | en |
dc.type | Conference Paper | en |
dc.type.supercollection | scholarly_publications | en |
dc.type.supercollection | refereed_publications | en |
dc.identifier.peoplefinderurl | http://people.tcd.ie/smolica | |
dc.identifier.rssinternalid | 199013 | |
dc.rights.ecaccessrights | openAccess | |
dc.subject.TCDTheme | Creative Technologies | en |
dc.subject.TCDTag | Multimedia & Creativity | en |
dc.identifier.rssuri | https://v-sense.scss.tcd.ie/wp-content/uploads/2019/02/ICASSP2019_multimodal.pdf | |
dc.status.accessible | N | en |
dc.contributor.sponsor | Science Foundation Ireland (SFI) | en |
dc.contributor.sponsorGrantNumber | 15/RP/2776 | en |