Show simple item record

dc.contributor.author: Smolic, Aljosa
dc.contributor.author: Rana, Aakanksha
dc.contributor.author: Ozcinar, Cagri
dc.date.accessioned: 2019-11-08T16:02:13Z
dc.date.available: 2019-11-08T16:02:13Z
dc.date.issued: 2019
dc.date.submitted: 2019 [en]
dc.identifier.citation: Rana, A., Ozcinar, C. & Smolic, A., Towards Generating Ambisonics Using Audio-Visual Cue for Virtual Reality, IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2019, pp. 2012-2016 [en]
dc.identifier.other: Y
dc.identifier.uri: https://ieeexplore.ieee.org/document/8683318
dc.identifier.uri: http://hdl.handle.net/2262/90360
dc.description: PUBLISHED [en]
dc.description.abstract: Ambisonics, i.e., full-sphere surround sound, is quintessential with 360° visual content to provide a realistic virtual reality (VR) experience. While 360° visual content capture has gained a tremendous boost recently, the estimation of the corresponding spatial sound is still challenging due to the required sound-field microphones or information about the sound-source locations. In this paper, we introduce the novel problem of generating Ambisonics in 360° videos using audio-visual cues. To this end, firstly, a novel 360° audio-visual video dataset of 265 videos with annotated sound-source locations is introduced. Secondly, a pipeline is designed for the automatic Ambisonics estimation problem. Benefiting from deep-learning-based audio-visual feature-embedding and prediction modules, our pipeline estimates the 3D sound-source locations and further uses such locations to encode to the B-format. To benchmark our dataset and pipeline, we additionally propose evaluation criteria to investigate the performance using different 360° input representations. Our results demonstrate the efficacy of the proposed pipeline and open up a new area of research in 360° audio-visual analysis for future investigations. [en]
dc.format.extent: 2012-2016 [en]
dc.language.iso: en [en]
dc.rights: Y [en]
dc.subject: Virtual reality [en]
dc.subject: 360 video [en]
dc.subject: Spatial sound [en]
dc.subject: Ambisonics [en]
dc.subject: Multi-modal [en]
dc.subject: Deep learning [en]
dc.title: Towards Generating Ambisonics Using Audio-Visual Cue for Virtual Reality [en]
dc.title.alternative: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings [en]
dc.title.alternative: 44th International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Forthcoming [en]
dc.type: Conference Paper [en]
dc.type.supercollection: scholarly_publications [en]
dc.type.supercollection: refereed_publications [en]
dc.identifier.peoplefinderurl: http://people.tcd.ie/smolica
dc.identifier.rssinternalid: 199013
dc.rights.ecaccessrights: openAccess
dc.subject.TCDTheme: Creative Technologies [en]
dc.subject.TCDTag: Multimedia & Creativity [en]
dc.identifier.rssuri: https://v-sense.scss.tcd.ie/wp-content/uploads/2019/02/ICASSP2019_multimodal.pdf
dc.status.accessible: N [en]
dc.contributor.sponsor: Science Foundation Ireland (SFI) [en]
dc.contributor.sponsorGrantNumber: 15/RP/2776 [en]
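The abstract describes encoding estimated 3D sound-source locations into B-format Ambisonics. As an illustration only, not the paper's implementation, the standard first-order B-format (FuMa convention) encoding of a mono signal at a given azimuth and elevation can be sketched in Python; the function name and interface are assumptions:

```python
import math

def encode_b_format(sample, azimuth, elevation):
    """Encode one mono sample into first-order Ambisonics B-format
    (W, X, Y, Z) using the FuMa convention; angles are in radians."""
    w = sample / math.sqrt(2)                              # omnidirectional, -3 dB FuMa weight
    x = sample * math.cos(azimuth) * math.cos(elevation)   # front-back axis
    y = sample * math.sin(azimuth) * math.cos(elevation)   # left-right axis
    z = sample * math.sin(elevation)                       # up-down axis
    return w, x, y, z
```

A pipeline like the one described would apply such an encoding per estimated source location, summing the resulting four-channel signals across sources.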

