A Deep Learning Approach to Automatic Caption Generation for News Images

Author: Batra Vishwash
He Yulan
Vogiatzis George
Publication venue: European Language Resources Association (ELRA)
Publication date: 01/01/2019

Automatic caption generation for images has gained significant interest. It gives rise to many interesting image-related applications. For example, it could help in image/video retrieval and in the management of the vast amounts of multimedia data available on the Internet. It could also help in the development of tools that can aid visually impaired individuals in accessing multimedia content. In this paper, we focus on news images and propose a methodology for automatically generating captions for newspaper articles consisting of a text paragraph and an image. We propose several deep neural network architectures built upon Recurrent Neural Networks. Results on a BBC News dataset show that our proposed approach outperforms a traditional method based on Latent Dirichlet Allocation under both automatic evaluation based on BLEU scores and human evaluation.

Aston Publications Explorer

Using remote vision: The effects of video image frame rate on visual object recognition performance

Author: Balachandran W
Garaj V
Hunaiti Z
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/07/2010

This is the author's accepted manuscript. The final published article is available from the link below. Copyright © 2010 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works for resale or redistribution to servers or lists, or reuse of any copyrighted components of this work in other works.

The process of using remote vision was simulated in order to determine the effects of video image frame rate on performance in the visual recognition of stationary environmental hazards in dynamic video footage of the pedestrian travel environment. Recognition performance was assessed at two video image frame rates: 25 and 2 fps. The assessment included a range of objective and subjective criteria. The results show that the effects of the frame rate variations on performance are statistically insignificant. This paper is part of the development of a novel navigation system for visually impaired pedestrians. The navigation system includes a remote vision facility, and the visual recognition of environmental hazards by the sighted human guide is a basic activity in aiding the mobility of the system's visually impaired user.

Crossref

Brunel University Research Archive

An exploration of the potential of Automatic Speech Recognition to assist and enable receptive communication in higher education

Author: Wald Mike
Publication venue: 'Informa UK Limited'
Publication date: 01/01/2006

The potential use of Automatic Speech Recognition to assist receptive communication is explored. The opportunities and challenges that this technology presents to students and staff are also discussed and evaluated: providing captioning of speech online or in classrooms for deaf or hard of hearing students, and helping blind, visually impaired or dyslexic learners to read and search learning material more readily by augmenting synthetic speech with natural recorded real speech. The automatic provision of online lecture notes, synchronised with speech, enables staff and students to focus on learning and teaching issues, while also benefiting learners who are unable to attend the lecture or who find it difficult or impossible to take notes while listening, watching and thinking.

Southampton (e-Prints Soton)

Crossref

ALT Open Access Repository

Directory of Open Access Journals

Accessibility-based reranking in multimedia search engines

Author: Anastasios Drosou
Dimitrios Tzovaras
DS Friedman
EM Fine
F Liu
H Brettel
H Hirvelä
H Kim
I Kalamaras
Ilias Kalamaras
IY Kim
J Liu
J Sang
JR Lavery
KW-T Leung
L Zhang
M Wang
Nikolaos Dimitriou
NJ Belkin
PK Atrey
S Lawrence
S Tajima
S Yang
T-L Ji
Y Nikulin
Z Wang
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 17/08/2016

Traditional multimedia search engines retrieve results based mostly on the query submitted by the user, or use a log of previous searches to provide personalized results, without considering the accessibility of the results for users with vision or other types of impairments. In this paper, a novel approach is presented which incorporates the accessibility of images for users with various vision impairments, such as color blindness, cataract and glaucoma, in order to rerank the results of an image search engine. The accessibility of individual images is measured through the use of vision simulation filters. Multi-objective optimization techniques utilizing the image accessibility scores are used to handle users with multiple vision impairments, while the impairment profile of a specific user is used to select one of the Pareto-optimal solutions. The proposed approach has been tested with two image datasets, using both simulated and real impaired users, and the results verify its applicability. Although the proposed method has been used for vision accessibility-based reranking, it can also be extended to other types of personalization context.
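
The selection step described in this abstract, choosing one Pareto-optimal solution according to a user's impairment profile, can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes each candidate is described by one accessibility score per impairment (higher is better) and that the profile is a vector of non-negative weights. The function names `dominates`, `pareto_front` and `pick_for_profile`, and the weighted-sum selection rule, are hypothetical choices made for illustration only.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is at least as good as b on every objective
    and strictly better on at least one (higher is better)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(scores: List[Sequence[float]]) -> List[int]:
    """Indices of candidates that no other candidate dominates."""
    return [
        i
        for i, s in enumerate(scores)
        if not any(dominates(t, s) for j, t in enumerate(scores) if j != i)
    ]


def pick_for_profile(scores: List[Sequence[float]], profile: Sequence[float]) -> int:
    """Assumed selection rule: among the Pareto-optimal candidates,
    return the one with the highest profile-weighted score."""
    front = pareto_front(scores)
    return max(front, key=lambda i: sum(w * s for w, s in zip(profile, scores[i])))


# Hypothetical scores: (color-blindness accessibility, cataract accessibility)
candidates = [(0.9, 0.2), (0.5, 0.5), (0.2, 0.9), (0.4, 0.4)]
front = pareto_front(candidates)
mostly_color_blind = pick_for_profile(candidates, (0.8, 0.2))
mostly_cataract = pick_for_profile(candidates, (0.1, 0.9))
```

In this toy data the last candidate is dominated by `(0.5, 0.5)` and drops out of the front, and the two profiles select different non-dominated candidates. The paper may define both the solutions and the profile-based choice differently.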

Crossref

Springer - Publisher Connector

Spiral - Imperial College Digital Repository

A survey of comics research in computer science

Author: Augereau Olivier
Iwata Motoi
Kise Koichi
Publication venue
Publication date: 15/04/2018

Graphic novels such as comics and manga are well known all over the world. The digital transition has started to change the way people read comics: more and more on smartphones and tablets, and less and less on paper. In recent years, a wide variety of research on comics has been proposed that might change the way comics are created, distributed and read in the future. Early work focused on low-level document image analysis: comic books are complex, containing text, drawings, balloons, panels, onomatopoeia, etc. Different fields of computer science, such as multimedia, artificial intelligence and human-computer interaction, have covered research on user interaction and content generation, each with a different set of values. In this paper we review previous research on comics in computer science, state what has been done, and give some insights into the main outlooks.

arXiv.org e-Print Archive

Directory of Open Access Journals

ICT-related skills and needs of blind and visually impaired people

Author: Puffelen Carolina van
Publication venue: ACM
Publication date: 01/01/2009

This study focuses on the relationship between the ICT-related training offered to blind and visually impaired people and their actual, self-reported and demonstrated, competencies for online activities and information processing. The findings of the study can shed light on how people with severe visual disabilities are prepared to access the web for educational, institutional and social participation. The study also gives insight into the validity of instruments for measuring ICT-linked skills in the target group and creates an empirical foundation for improving ICT-related training. The first phase of the study investigated how blind and visually impaired people perceive their participation in society through ICT. An extensive interview showed how this audience perceives the frequency and quality of their Internet use (or absence thereof) and how they acquired these skills.

University of Twente Research Information

Comparing objective visual quality impairment detection in 2D and 3D video sequences

Author: Staelens Nicolas
Boussaer Arnaud
Vercammen Nick
Van hoogenbemt Geert
Vermeulen Brecht
Demeester Piet
Publication venue: Ghent University, Department of Information technology
Publication date: 01/01/2012

The skill level of the teleoperator plays a key role in telerobotic operation. However, a conventional assessment requires many experiments to evaluate the skill level. In this paper, a novel brain-based method of skill assessment is introduced, and the relationship between the teleoperator's brain states and skill level is first investigated based on a kernel canonical correlation analysis (KCCA) method. The skill of teleoperator (SoT) is defined by a statistical method using the cumulative probability function (CDF). Five indicators are extracted from the electroencephalograph (EEG) of the teleoperator to represent the brain states during the telerobotic operation. By using the KCCA algorithm to model the relationship between the SoT and the brain states, the correlation has been proved. During the telerobotic operation, the skill level of the teleoperator can be well predicted through the brain states. © 2013 IEEE.

Crossref

Ghent University Academic Bibliography

HKU Scholars Hub