Show, Prefer and Tell: Incorporating User Preferences into Image Captioning

Image Captioning (IC) is the task of generating natural language descriptions for images. Models encode the image using a convolutional neural network (CNN) and generate the caption with a recurrent model or a multi-modal transformer. Success is measured by the similarity between generated captions and human-written “ground-truth” captions, using metrics such as CIDEr [14], SPICE [1], and METEOR [2]. While incremental gains have been made on these metrics, little attention has been paid to end-user opinions on how much content a caption should contain. Studies with blind and low-vision participants have found that a lack of detail is a problem [6, 13, 17], that the preferred amount of content varies between individuals [13], and that individuals also differ in how they weigh correctness against additional, lower-confidence content [9]. We propose a more user-centered approach in which the amount of content is adjustable via the number of image regions to be described.
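
As a concrete illustration of the encoder-decoder pipeline described above, the following is a minimal PyTorch captioner: a CNN backbone produces a grid of image features, and a transformer decoder attends over them while generating the caption token by token. This is a generic sketch, not the model from the paper; the ResNet-50 backbone, the hyperparameters, and the omission of positional encodings are simplifying assumptions made here for brevity.

    import torch
    import torch.nn as nn
    import torchvision.models as models

    class CaptioningModel(nn.Module):
        """Minimal CNN-encoder / transformer-decoder captioner (illustrative only)."""

        def __init__(self, vocab_size, d_model=512, nhead=8, num_layers=3):
            super().__init__()
            # CNN encoder: ResNet-50 with its pooling and classification head
            # removed, so it outputs a spatial grid of features, not class logits.
            backbone = models.resnet50(weights=None)
            self.encoder = nn.Sequential(*list(backbone.children())[:-2])
            self.proj = nn.Linear(2048, d_model)  # CNN channels -> decoder width
            self.embed = nn.Embedding(vocab_size, d_model)
            layer = nn.TransformerDecoderLayer(d_model, nhead, batch_first=True)
            self.decoder = nn.TransformerDecoder(layer, num_layers)
            self.lm_head = nn.Linear(d_model, vocab_size)

        def forward(self, images, captions):
            feats = self.encoder(images)                           # (B, 2048, H, W)
            memory = self.proj(feats.flatten(2).transpose(1, 2))   # (B, H*W, d_model)
            tgt = self.embed(captions)                             # (B, T, d_model)
            # Causal mask so each position only attends to earlier tokens.
            mask = nn.Transformer.generate_square_subsequent_mask(captions.size(1))
            out = self.decoder(tgt, memory, tgt_mask=mask)
            return self.lm_head(out)                               # (B, T, vocab) logits

    # Smoke test with random inputs and a hypothetical 10k-word vocabulary.
    model = CaptioningModel(vocab_size=10_000)
    images = torch.randn(2, 3, 224, 224)
    captions = torch.randint(0, 10_000, (2, 12))
    print(model(images, captions).shape)  # torch.Size([2, 12, 10000])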