178 research outputs found
Evidential Reasoning for Multimodal Fusion in Human Computer Interaction
Fusion of information from multiple modalities in Human Computer Interfaces
(HCI) has gained a lot of attention in recent years, and has far reaching
implications in many areas of human-machine interaction. However, a major
limitation of current HCI fusion systems is that the fusion process tends to
ignore the semantic nature of modalities, which may reinforce, complement or
contradict each other over time. Also, most systems are not robust in
representing the ambiguity inherent in human gestures. In this work, we
investigate an evidential reasoning based approach for intelligent multimodal
fusion, and apply this algorithm to a proposed multimodal system consisting of
a Hand Gesture sensor and a Brain-Computer Interface (BCI). There are three
major contributions of this work to the area of human computer interaction.
First, we propose an algorithm for reconstruction of the 3D hand pose given a
2D input video. Second, we develop a BCI using Steady State Visually Evoked
Potentials, and show how a multimodal system consisting of the two sensors can
improve the efficiency and the complexity of the system, while retaining the
same levels of accuracy. Finally, we propose a semantic fusion algorithm based
on Transferable Belief Models, which can successfully fuse information from
these two sensors, to form meaningful concepts and resolve ambiguity. We also
analyze this system for robustness under various operating scenarios.
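The Transferable Belief Model combines sensor evidence with the unnormalized conjunctive rule, letting mass accumulate on the empty set as a direct measure of conflict between sources. A minimal sketch of that combination (the frame of discernment, sensor readings, and mass values below are hypothetical, not taken from the paper):

```python
from itertools import product

def tbm_conjunctive(m1, m2):
    """Unnormalized conjunctive combination (TBM): mass may land on
    the empty set, quantifying conflict between the two sources."""
    out = {}
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b  # frozenset intersection of focal elements
        out[inter] = out.get(inter, 0.0) + wa * wb
    return out

# Frame of discernment: two candidate commands.
L, R = frozenset({"left"}), frozenset({"right"})
LR = L | R  # total ignorance

# Hypothetical sensor readings: the gesture sensor is ambiguous,
# while the BCI weakly favours "left".
m_gesture = {L: 0.5, R: 0.3, LR: 0.2}
m_bci     = {L: 0.6, LR: 0.4}

fused = tbm_conjunctive(m_gesture, m_bci)
print(fused)  # mass on frozenset() measures inter-sensor conflict
```

Normalizing out the empty-set mass would recover Dempster's rule; keeping it, as the TBM does, preserves an explicit measure of how strongly the gesture and BCI readings contradict each other.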
Emotion Profiling: Ingredient for Rule based Emotion Recognition Engine
Emotions are considered a reflection of the human thinking and decision-making process, improving performance by producing intelligent outcomes. Hence it is a challenging task to embed emotional intelligence in machines as well, so that they can respond appropriately. However, present human computer interfaces still don't fully utilize emotion feedback to create a more natural environment, because the performance of emotion recognition is still not robust and reliable, and far from real-life experience. In this paper, we present an attempt at addressing this aspect and identifying the major challenges in the process. We introduce the concept of an 'emotion profile' to evaluate each individual feature, as each feature, irrespective of its modality, has a different capability for differentiating among the various subsets of emotions. To capture the discrimination across target emotions w.r.t. each feature, we propose a framework for emotion recognition built around if-then rules, using certainty factors to represent the uncertainty and unreliability of individual features. This technique appears to be simple and effective for this kind of problem.
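Certainty factors from multiple rules are classically merged with the MYCIN combination rules: agreeing evidence reinforces belief, while contradicting evidence pulls it back toward zero. A minimal sketch assuming MYCIN-style combination (the feature certainty values and the target emotion are hypothetical, not from the paper):

```python
def combine_cf(cf1, cf2):
    """MYCIN-style combination of two certainty factors in [-1, 1]."""
    if cf1 >= 0 and cf2 >= 0:
        return cf1 + cf2 * (1 - cf1)       # both support: reinforce
    if cf1 < 0 and cf2 < 0:
        return cf1 + cf2 * (1 + cf1)       # both refute: reinforce negatively
    return (cf1 + cf2) / (1 - min(abs(cf1), abs(cf2)))  # mixed evidence

# Hypothetical rules for the emotion "joy": two features support it,
# a third speaks against it.
cf = combine_cf(0.6, 0.4)   # 0.76: agreement strengthens belief
cf = combine_cf(cf, -0.3)   # contradiction weakens the belief
print(round(cf, 3))
```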
Fusion of Multimodal Information in Music Content Analysis
Music is often processed through its acoustic realization. This is restrictive in the sense that music is clearly a highly multimodal concept, where various types of heterogeneous information can be associated with a given piece of music (a musical score, musicians' gestures, lyrics, user-generated metadata, etc.). This has recently led researchers to apprehend music through its various facets, giving rise to "multimodal music analysis" studies. This article gives a synthetic overview of methods that have been successfully employed in multimodal signal analysis. In particular, their use in music content processing is discussed in more detail through five case studies that highlight different multimodal integration techniques. The case studies include an example of cross-modal correlation for music video analysis, an audiovisual drum transcription system, a description of the concept of informed source separation, a discussion of multimodal dance-scene analysis, and an example of user-interactive music analysis. In the light of these case studies, some perspectives of multimodality in music processing are finally suggested.
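Cross-modal correlation, as in the music-video case study, can be illustrated by correlating an audio energy envelope with a video motion-energy curve sampled on the same frame grid; the per-frame values below are made up purely for illustration:

```python
import math

def pearson(x, y):
    """Pearson correlation between two equally sampled feature streams."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical per-frame features around two drum hits: the audio RMS
# energy and the video motion magnitude peak together, so the
# cross-modal correlation is high.
audio_rms    = [0.1, 0.9, 0.2, 0.1, 0.8, 0.2]
video_motion = [0.2, 1.0, 0.3, 0.1, 0.9, 0.2]
print(round(pearson(audio_rms, video_motion), 3))
```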
Unimodal and multimodal biometric sensing systems: a review
Biometric systems are used for the verification and identification of individuals using their physiological or behavioral features. These features can be categorized into unimodal and multimodal systems, in which the former have several deficiencies that reduce the accuracy of the system, such as noisy data, inter-class similarity, intra-class variation, spoofing, and non-universality. However, multimodal biometric sensing and processing systems, which make use of the detection and processing of two or more behavioral or physiological traits, have proved to improve the success rate of identification and verification significantly. This paper provides a detailed survey of the various unimodal and multimodal biometric sensing types, presenting their strengths and weaknesses. It discusses the stages involved in the biometric recognition process and further discusses multimodal systems in terms of their architecture, mode of operation, and the algorithms used to develop them. It also touches on the levels and methods of fusion involved in biometric systems, giving researchers in this area a better understanding of multimodal biometric sensing and processing systems and of research trends in the field. It furthermore identifies open issues in the various unimodal biometric systems as directions for future research.
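A fusion method commonly covered in such surveys is score-level fusion: each matcher's raw score is normalized (here via min-max) and the normalized scores are combined with a weighted sum. A minimal sketch (the matchers, score ranges, weights, and decision threshold below are illustrative assumptions, not taken from the paper):

```python
def min_max_norm(score, lo, hi):
    """Map a raw matcher score into [0, 1] given its observed range."""
    return (score - lo) / (hi - lo)

def weighted_sum_fusion(scores, ranges, weights):
    """Score-level fusion: normalize each matcher's score, then
    combine with weights that sum to 1."""
    assert abs(sum(weights) - 1.0) < 1e-9
    normed = [min_max_norm(s, lo, hi) for s, (lo, hi) in zip(scores, ranges)]
    return sum(w * s for w, s in zip(weights, normed))

# Hypothetical face + fingerprint matchers with different native
# score ranges; the fingerprint matcher is weighted more heavily.
fused = weighted_sum_fusion(
    scores=[72.0, 0.81],
    ranges=[(0.0, 100.0), (0.0, 1.0)],
    weights=[0.4, 0.6],
)
accept = fused >= 0.7  # illustrative decision threshold
print(round(fused, 3), accept)
```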
Contributions to Robust Multi-view 3D Action Recognition
This thesis focuses on human action recognition using volumetric reconstructions
obtained from multiple monocular cameras. The problem of action recognition has been
addressed using different approaches, both in the 2D and 3D domains, and using one or
multiple views. However, the development of robust recognition methods, independent
of the view employed, remains an open problem.
Multi-view approaches make it possible to exploit 3D information to improve recognition
performance. Nevertheless, manipulating the large amount of information in 3D
representations poses a major problem. As a consequence, standard dimensionality
reduction techniques must be applied prior to the use of machine learning approaches.
The first contribution of this work is a new descriptor of volumetric information that
can be further reduced using standard dimensionality reduction techniques in both
holistic and sequential recognition approaches. Moreover, the descriptor itself reduces
the amount of data by up to an order of magnitude (compared to previous descriptors)
without affecting classification performance.
The descriptor represents the volumetric information obtained by Shape-from-Silhouette
(SfS) techniques. However, this family of techniques is highly influenced by errors in
the segmentation process (e.g., undersegmentation causes false negatives in the
reconstructed volumes), so the recognition performance is highly affected by this first
step. The second contribution of this work is a new SfS technique (named SfSDS) that
employs Dempster-Shafer theory to fuse evidence provided by multiple cameras. The
central idea is to consider the relative position between cameras so as to deal with
inconsistent silhouettes and obtain robust volumetric reconstructions.
The basic SfS technique still has a main drawback: it requires the whole volume to be
analyzed in order to obtain the reconstruction. In contrast, octree-based representations
save memory and time by employing a dynamic tree structure in which only occupied
nodes are stored. Nevertheless, applying the SfS method to octree-based representations
is not straightforward. The final contribution of this work is a method for generating
octrees using our proposed SfSDS technique so as to obtain
robust and compact volumetric representations.
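The robustness idea behind evidential carving can be sketched by contrasting the hard silhouette intersection of basic SfS with a reliability-weighted vote per voxel. Note that this is a simplified sketch of the motivation only: the thesis's actual SfSDS algorithm uses Dempster-Shafer combination and the relative positions of the cameras, and the reliabilities and threshold below are invented for illustration:

```python
def sfs_vanilla(silhouette_hits):
    """Basic Shape-from-Silhouette: a voxel survives only if every
    camera's silhouette covers its projection (hard intersection)."""
    return all(silhouette_hits)

def sfs_weighted(silhouette_hits, reliabilities):
    """Reliability-weighted variant (sketch of the evidential idea):
    each camera casts a discounted vote, so a single unreliable
    silhouette (e.g. undersegmentation) cannot erase the voxel."""
    support = sum(r if hit else 1 - r
                  for hit, r in zip(silhouette_hits, reliabilities))
    return support / len(silhouette_hits) >= 0.5

# Four cameras observe one voxel; camera 2 missed the subject
# (a false negative in its silhouette).
hits = [True, False, True, True]
rel  = [0.9, 0.6, 0.9, 0.9]   # hypothetical per-camera reliability

print(sfs_vanilla(hits))        # one bad silhouette carves the voxel away
print(sfs_weighted(hits, rel))  # the weighted vote keeps it
```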
A Review Of Multilevel Multibiometric Fusion System
Biometric systems allow automatic person recognition and authentication based on physical or behavioral characteristics. In recent years, researchers have paid close attention to the design of efficient multi-modal biometric systems due to their ability to withstand spoof attacks. Single biometric traits sometimes fail to extract the information needed to verify a person's identity; therefore, by combining multiple modalities, enhanced performance reliability can be achieved. When a higher security level is required, multi-level fusion techniques are used. This paper discusses the many fusion levels: the algorithms, the levels of fusion, the methods used for integrating the multiple verifiers, and their applications.
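At the decision level, one of the simplest fusion methods covered by such reviews is majority voting over the individual verifiers' accept/reject outputs. A minimal sketch (the three verifiers and the tie-breaking policy are illustrative assumptions):

```python
from collections import Counter

def majority_vote(decisions):
    """Decision-level fusion: each unimodal verifier outputs "accept"
    or "reject"; the fused decision is the strict majority, so ties
    reject and the system fails safe."""
    tally = Counter(decisions)
    return tally["accept"] > len(decisions) / 2

# Hypothetical verifiers: face, fingerprint, iris.
print(majority_vote(["accept", "reject", "accept"]))
print(majority_vote(["accept", "reject", "reject"]))
```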