10,536 research outputs found
An exploration of the potential of Automatic Speech Recognition to assist and enable receptive communication in higher education
The potential of Automatic Speech Recognition (ASR) to assist receptive communication is explored. The paper discusses and evaluates the opportunities and challenges this technology presents to students and staff: providing captioning of speech, online or in classrooms, for deaf or hard-of-hearing students, and helping blind, visually impaired or dyslexic learners to read and search learning material more readily by augmenting synthetic speech with naturally recorded real speech. The automatic provision of online lecture notes, synchronised with speech, enables staff and students to focus on learning and teaching issues, while also benefiting learners who are unable to attend the lecture or who find it difficult or impossible to take notes at the same time as listening, watching and thinking.
Harnessing AI for Speech Reconstruction using Multi-view Silent Video Feed
Speechreading, or lipreading, is the technique of understanding speech and extracting phonetic features from a speaker's visual features, such as the movement of the lips, face, teeth and tongue. It has a wide range of multimedia applications, such as in surveillance, Internet telephony, and as an aid to people with hearing impairments. However, most work in speechreading has been limited to text generation from silent videos. Recently, research has started venturing into generating (audio) speech from silent video sequences, but there have been no developments thus far in dealing with divergent views and poses of a speaker: although multiple camera feeds of a speaker's speech may be available, these multiple video feeds have not been used to deal with the different poses. To this end, this paper presents the world's first multi-view speechreading and reconstruction system. This work pushes the boundaries of multimedia research by putting forth a model that leverages silent video feeds from multiple cameras recording the same subject to generate intelligible speech for a speaker. Initial results confirm the usefulness of exploiting multiple camera views in building an efficient speechreading and reconstruction system, and further indicate the optimal placement of cameras for maximum intelligibility of the reconstructed speech. Finally, the paper lays out various innovative applications of the proposed system, focusing on its potentially prodigious impact not just in the security arena but in many other multimedia analytics problems.

Comment: 2018 ACM Multimedia Conference (MM '18), October 22--26, 2018, Seoul, Republic of Korea
Synote: Multimedia Annotation 'Designed for all'
This paper describes the development and evaluation of Synote, a freely available web-based application that makes multimedia web resources (e.g. podcasts) easier to access, search, manage, and exploit for all learners, teachers and other users through the creation of notes, bookmarks, tags, links, images and text captions synchronized to any part of the recording. Synote uniquely enables users to easily find, or associate their notes or resources with, any part of a podcast or video recording available on the web, and the students surveyed would like to be able to access all their lectures through Synote.
Synote: Designed for all Advanced Learning Technology for Disabled and Non-Disabled People
This paper describes the development and evaluation of Synote, a freely available accessible web-based application that makes multimedia web resources (e.g. podcasts) easier to access, search, manage, and exploit for all learners, teachers and other users through the creation of accessible notes, bookmarks, tags, links, images and text captions synchronized to any part of the recording.
TwNC: a Multifaceted Dutch News Corpus
This contribution describes the Twente News Corpus (TwNC), a multifaceted corpus for Dutch that is being deployed in a number of NLP research projects, among which are tracks within the Dutch national research programme MultimediaN, the NWO programme CATCH, and the Dutch-Flemish programme STEVIN.
The development of the corpus started in 1998 within a predecessor project, DRUID, and the corpus currently has a size of 530M words. The text part has been built from four different sources: Dutch national newspapers, television subtitles, teleprompter (auto-cue) files, and both manually and automatically generated broadcast news transcripts along with the broadcast news audio. TwNC plays a crucial role in the development and evaluation of a wide range of tools and applications for the domain of multimedia indexing, such as large-vocabulary speech recognition, cross-media indexing, and cross-language information retrieval. Part of the corpus was fed into the Dutch written text corpus in the context of the Dutch-Belgian STEVIN project D-COI, which was completed in 2007. The sections below describe the rationale that was the starting point for the corpus development, outline the cross-media linking approach adopted within MultimediaN, and finally provide some facts and figures about the corpus.
A model for hypermedia learning environments based on electronic books
Designers of hypermedia learning environments could take advantage of a theoretical scheme which takes into account various kinds of learning activities and solves some of the problems associated with them. In this paper, we present a model which inherits a number of characteristics from hypermedia and electronic books. It can provide designers with the tools for creating hypermedia learning systems, by allowing the elements and functions involved in the definition of a specific application to be formally represented. A practical example, CESAR, a hypermedia learning environment for hearing-impaired children, is presented, and some conclusions derived from the use of the model are also shown.
Concurrent collaborative captioning
Captioned text transcriptions of the spoken word can benefit hearing-impaired people, non-native speakers, anyone when no audio is available (e.g. watching TV at an airport), and anyone who needs to review recordings of what has been said (e.g. at lectures, presentations, meetings, etc.). In this paper, a tool is described that facilitates concurrent collaborative captioning through correction of speech recognition errors, providing a sustainable method of making videos accessible to people who find it difficult to understand speech through hearing alone. The tool stores all the edits of all the users and uses a matching algorithm to compare users' edits to check whether they are in agreement.
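The edit-agreement idea in this abstract can be illustrated with a minimal sketch: collect each user's correction for a caption segment, normalize away trivial differences, and accept a correction once enough independent users converge on the same text. All names and the agreement threshold here are illustrative assumptions; the paper's actual matching algorithm may differ.

```python
# Hypothetical sketch of comparing users' caption edits for agreement.
# Not the paper's algorithm; an illustration of the general idea.
from collections import Counter

def normalize(text: str) -> str:
    """Case-fold and collapse whitespace so trivial differences don't block agreement."""
    return " ".join(text.lower().split())

def agreed_correction(edits, min_agree=2):
    """Return the correction that at least `min_agree` users agree on, else None."""
    if not edits:
        return None
    counts = Counter(normalize(e) for e in edits)
    text, n = counts.most_common(1)[0]
    return text if n >= min_agree else None

# Three users submit corrections for the same caption segment; two agree
# once case and spacing are normalized, so that correction is accepted.
edits = ["speech recognition errors",
         "Speech  recognition errors",
         "speach recognition errors"]
print(agreed_correction(edits))  # -> speech recognition errors
```

A real system would also need to align edits that cover overlapping but non-identical time spans of the recording; this sketch assumes edits have already been grouped by segment.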
BIBS: A Lecture Webcasting System
The Berkeley Internet Broadcasting System (BIBS) is a lecture webcasting system developed and operated by the Berkeley Multimedia Research Center. The system offers live remote viewing and on-demand replay of course lectures using streaming audio and video over the Internet. During the Fall 2000 semester, 14 classes were webcast, including several large lower-division classes, with a total enrollment of over 4,000 students. Lectures were played over 15,000 times per month during the semester. The primary use of the webcasts is to study for examinations. Students report they watch BIBS lectures because they did not understand material presented in lecture, because they wanted to review what the instructor said about selected topics, because they missed a lecture, and/or because they had difficulty understanding the speaker (e.g., non-native English speakers). Analysis of various survey data suggests that more than 50% of the students enrolled in some large classes view lectures and that as many as 75% of the lectures are played by members of the Berkeley community. Faculty attitudes about the virtues of lecture webcasting vary: some question the use of this technology, while others believe it is a valuable aid to education. Further study is required to accurately assess the pedagogical impact that lecture webcasts have on student learning.