Geometric deep learning: going beyond Euclidean data
Many scientific fields study data with an underlying structure that is a
non-Euclidean space. Some examples include social networks in computational
social sciences, sensor networks in communications, functional networks in
brain imaging, regulatory networks in genetics, and meshed surfaces in computer
graphics. In many applications, such geometric data are large and complex (in
the case of social networks, on the scale of billions), and are natural targets
for machine learning techniques. In particular, we would like to use deep
neural networks, which have recently proven to be powerful tools for a broad
range of problems from computer vision, natural language processing, and audio
analysis. However, these tools have been most successful on data with an
underlying Euclidean or grid-like structure, and in cases where the invariances
of these structures are built into networks used to model them. Geometric deep
learning is an umbrella term for emerging techniques attempting to generalize
(structured) deep neural models to non-Euclidean domains such as graphs and
manifolds. The purpose of this paper is to overview different examples of
geometric deep learning problems and present available solutions, key
difficulties, applications, and future research directions in this nascent
field.
Visual Speech Recognition
Lip reading is used to understand or interpret speech without hearing it, a
technique especially mastered by people with hearing difficulties. The ability
to lip read enables a person with a hearing impairment to communicate with
others and to engage in social activities, which otherwise would be difficult.
Recent advances in the fields of computer vision, pattern recognition, and
signal processing have led to a growing interest in automating this challenging
task of lip reading. Indeed, automating the human ability to lip read, a
process referred to as visual speech recognition (VSR) (or sometimes speech
reading), could open the door for other novel related applications. VSR has
received a great deal of attention in the last decade for its potential use in
applications such as human-computer interaction (HCI), audio-visual speech
recognition (AVSR), speaker recognition, talking heads, sign language
recognition and video surveillance. Its main aim is to recognise spoken word(s)
by using only the visual signal that is produced during speech. Hence, VSR
deals with the visual domain of speech and involves image processing,
artificial intelligence, object detection, pattern recognition, statistical
modelling, etc.
Comment: Speech and Language Technologies (Book), Prof. Ivo Ipsic (Ed.), ISBN: 978-953-307-322-4, InTech (2011)
Using Active Shape Modeling Based on MRI to Study Morphologic and Pitch-Related Functional Changes Affecting Vocal Structures and the Airway
Copyright © 2013 The Voice Foundation. Published by Mosby, Inc. All rights reserved. Peer reviewed. Postprint.