Aesthetic Image Captioning from Weakly-Labelled Photographs

Smolic, Aljosa; Ghosal, Koustav; Rana, Aakanksha

dc.contributor.author	Smolic, Aljosa
dc.contributor.author	Ghosal, Koustav
dc.contributor.author	Rana, Aakanksha
dc.date.accessioned	2020-02-18T17:17:33Z
dc.date.available	2020-02-18T17:17:33Z
dc.date.issued	2019
dc.date.submitted	2019	en
dc.identifier.citation	K. Ghosal, A. Rana and A. Smolic, "Aesthetic Image Captioning From Weakly-Labelled Photographs," 2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW), Seoul, Korea (South), 2019, pp. 4550-4560	en
dc.identifier.other	Y
dc.identifier.uri	https://v-sense.scss.tcd.ie/wp-content/uploads/2019/08/ICCVW_CROMOL_2019.pdf
dc.identifier.uri	http://hdl.handle.net/2262/91579
dc.description	PUBLISHED	en
dc.description.abstract	Aesthetic image captioning (AIC) refers to the multimodal task of generating critical textual feedback for photographs. While in natural image captioning (NIC), deep models are trained in an end-to-end manner using large curated datasets such as MS-COCO, no such large-scale, clean dataset exists for AIC. Towards this goal, we propose an automatic cleaning strategy to create a benchmarking AIC dataset, by exploiting the images and noisy comments easily available from photography websites. We propose a probabilistic caption-filtering method for cleaning the noisy web-data, and compile a large-scale, clean dataset ‘AVACaptions’ (∼230,000 images with ∼5 captions per image). Additionally, by exploiting the latent associations between aesthetic attributes, we propose a strategy for training a convolutional neural network (CNN) based visual feature extractor, typically the first component of an AIC framework. The strategy is weakly supervised and can be effectively used to learn rich aesthetic representations, without requiring expensive ground-truth annotations. We finally showcase a thorough analysis of the proposed contributions using automatic metrics and subjective evaluations.	en
dc.language.iso	en	en
dc.rights	Y	en
dc.subject	Aesthetic image captioning	en
dc.subject	Natural image captioning	en
dc.subject	Convolutional neural networks	en
dc.title	Aesthetic Image Captioning from Weakly-Labelled Photographs	en
dc.type	Conference Paper	en
dc.type.supercollection	scholarly_publications	en
dc.type.supercollection	refereed_publications	en
dc.identifier.peoplefinderurl	http://people.tcd.ie/smolica
dc.identifier.rssinternalid	212561
dc.identifier.doi	10.1109/ICCVW.2019.00556	en
dc.rights.ecaccessrights	openAccess
dc.subject.TCDTheme	Creative Technologies	en
dc.subject.TCDTag	Multimedia & Creativity	en
dc.subject.darat_impairment	Other	en
dc.status.accessible	N	en
dc.contributor.sponsor	SFI stipend	en
dc.contributor.sponsorGrantNumber	15/RP/2776	en

Files in this item

Name: Ghosal_Aesthetic_Image_Caption ...
Size: 915.2Kb
Format: PDF

Name: license.txt
Size: 3.499Kb
Format: Text file

This item appears in the following Collection(s)

Computer Science (Scholarly Publications)