Analyzing First-Person Stories Based on Socializing, Eating and Sedentary Patterns
First-person stories can be analyzed by means of egocentric pictures acquired
throughout the whole active day with wearable cameras. This manuscript presents
an egocentric dataset with more than 45,000 pictures from four people in
different environments such as working or studying. All the images were
manually labeled to identify three patterns of interest regarding people's
lifestyle: socializing, eating and sedentary. Additionally, two different
approaches are proposed to classify egocentric images into one of the 12 target
categories defined to characterize these three patterns. The approaches are
based on machine learning and deep learning techniques, including traditional
classifiers and state-of-the-art convolutional neural networks. The experimental
results obtained when applying these methods to the egocentric dataset
demonstrated their adequacy for the problem at hand.
Comment: Accepted at First International Workshop on Social Signal Processing
and Beyond, 19th International Conference on Image Analysis and Processing
(ICIAP), September 201
Similarity-Aware Spectral Sparsification by Edge Filtering
In recent years, spectral graph sparsification techniques that can compute
ultra-sparse graph proxies have been extensively studied for accelerating
various numerical and graph-related applications. Prior nearly-linear-time
spectral sparsification methods first extract a low-stretch spanning tree from
the original graph to form the backbone of the sparsifier, and then recover
small portions of spectrally-critical off-tree edges to the spanning tree to
significantly improve the approximation quality. However, it is not clear how
many off-tree edges should be recovered for achieving a desired spectral
similarity level within the sparsifier. Motivated by recent graph signal
processing techniques, this paper proposes a similarity-aware spectral graph
sparsification framework that leverages efficient spectral off-tree edge
embedding and filtering schemes to construct spectral sparsifiers with
guaranteed spectral similarity (relative condition number) level. An iterative
graph densification scheme is introduced to facilitate efficient and effective
filtering of off-tree edges for highly ill-conditioned problems. The proposed
method has been validated using various kinds of graphs obtained from public
domain sparse matrix collections relevant to VLSI CAD, finite element analysis,
as well as social and data networks frequently studied in many machine learning
and data mining applications.
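The spectral similarity measure named in the abstract, the relative condition number, can be illustrated on a toy graph. The sketch below is not the paper's edge-filtering method: it uses a maximum-weight spanning tree from Kruskal's algorithm as a simple stand-in for a low-stretch spanning tree, and dense NumPy linear algebra that only suits very small graphs. The function names and the example graph are assumptions made for illustration.

```python
import numpy as np

def laplacian(n, edges):
    """Weighted graph Laplacian built from (u, v, weight) triples."""
    L = np.zeros((n, n))
    for u, v, w in edges:
        L[u, u] += w
        L[v, v] += w
        L[u, v] -= w
        L[v, u] -= w
    return L

def max_weight_spanning_tree(n, edges):
    """Kruskal's algorithm; an assumed stand-in for a low-stretch spanning tree."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    tree, off_tree = [], []
    for edge in sorted(edges, key=lambda e: -e[2]):
        root_u, root_v = find(edge[0]), find(edge[1])
        if root_u != root_v:
            parent[root_u] = root_v
            tree.append(edge)
        else:
            off_tree.append(edge)
    return tree, off_tree

def relative_condition_number(L_G, L_P):
    """Largest over smallest nonzero generalized eigenvalue of (L_G, L_P).

    Assumes both graphs are connected, so the only shared null vector
    is the all-ones vector.
    """
    eig = np.linalg.eigvals(np.linalg.pinv(L_P) @ L_G).real
    nonzero = eig[eig > 1e-9]
    return nonzero.max() / nonzero.min()

# Hypothetical example: a 4-cycle with one weaker chord.
edges = [(0, 1, 1.0), (1, 2, 1.0), (2, 3, 1.0), (3, 0, 1.0), (0, 2, 0.5)]
tree, off_tree = max_weight_spanning_tree(4, edges)
L_G = laplacian(4, edges)
kappa_tree = relative_condition_number(L_G, laplacian(4, tree))
kappa_full = relative_condition_number(L_G, L_G)
```

Here `kappa_full` is 1 up to rounding because the "sparsifier" is the graph itself, while `kappa_tree` is larger. Deciding which entries of `off_tree` to add back so that the ratio meets a target is the question the abstract addresses.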
Guest Editorial: Non-Euclidean Machine Learning
Over the past decade, deep learning has had a revolutionary impact on a broad range of fields, including computer vision and image processing, computational photography, medical imaging, and speech and language analysis and synthesis. Deep learning technologies are estimated to have added billions in business value, created new markets, and transformed entire industrial segments. Most of today’s successful deep learning methods, such as Convolutional Neural Networks (CNNs), rely on classical signal processing models that limit their applicability to data with an underlying Euclidean grid-like structure, e.g., images or acoustic signals. Yet many applications deal with non-Euclidean (graph- or manifold-structured) data. For example, in social network analysis the users and their attributes are generally modeled as signals on the vertices of graphs. In biology, protein-to-protein interactions are modeled as graphs. In computer vision and graphics, 3D objects are modeled as meshes or point clouds. Furthermore, a graph representation is a very natural way to describe interactions between objects or signals. The classical deep learning paradigm on Euclidean domains falls short in providing appropriate tools for such data. Until recently, the lack of deep learning models capable of correctly dealing with non-Euclidean data has been a major obstacle in these fields. This special section addresses the need to bring together leading efforts in non-Euclidean deep learning across all communities. From the papers that the special section received, twelve were selected for publication. The selected papers fall naturally into three distinct categories: (a) methodologies that advance machine learning on data that are represented as graphs, (b) methodologies that advance machine learning on manifold-valued data, and (c) applications of machine learning methodologies on non-Euclidean spaces in computer vision and medical imaging. We briefly review the accepted papers in each of the groups.
Nonverbal Vocalizations as Speech: Characterizing Natural-Environment Audio from Nonverbal Individuals with Autism
The study of nonverbal vocalizations, such as sighs, grunts, and monosyllabic sounds, has largely revolved around the social and affective implications of these sounds within typical speech. However, for individuals who do not use any traditional speech, including those with non- or minimally verbal (nv/mv) autism, these vocalizations contain important, individual-specific affective and communicative information. This paper outlines the methodology, analysis, and technology to investigate the production, perception, and meaning of nonverbal vocalizations from nv/mv individuals in natural environments. We are developing novel signal processing and machine learning methods that will help enable augmentative communication technology, and we are producing a nonverbal vocalization dataset for public release. We hope this work will expand the scientific understanding of these exceptional individuals’ language development and the field of communication more generally.
Adaptive multiple importance sampling for Gaussian processes and its application in social signal processing
Social signal processing aims to automatically understand and interpret social signals (e.g. facial expressions and prosody) generated during human-human and human-machine interactions. Automatic interpretation of social signals involves two fundamentally important aspects: feature extraction and machine learning. So far, machine learning approaches applied to social signal processing have mainly focused on parametric approaches (e.g. linear regression) or non-parametric models such as support vector machines (SVMs). However, these approaches either fail to account for uncertainty arising from model misspecification or lack the interpretability needed to analyse scenarios in social signal processing. Consequently, they are less able to understand and interpret human behaviours effectively.
Gaussian processes (GPs), which have gained popularity in data analysis, offer a solution to these limitations through their attractive properties: being non-parametric enables them to model data flexibly, and being probabilistic makes them capable of quantifying uncertainty. In addition, a proper parametrisation of the covariance function makes it possible to gain insights into the application under study.
However, these appealing properties of GP models hinge on an accurate characterisation of the posterior distribution with respect to the covariance parameters. This is normally done by means of standard MCMC algorithms, which require repeated expensive calculations involving the marginal likelihood. Motivated by the desire to avoid the inefficiencies of MCMC algorithms rejecting a considerable number of expensive proposals, this thesis has developed an alternative inference framework based on adaptive multiple importance sampling (AMIS).
In particular, this thesis studies the application of AMIS for Gaussian processes in the case of a Gaussian likelihood, and proposes a novel pseudo-marginal-based AMIS (PM-AMIS) algorithm for non-Gaussian likelihoods, where the marginal likelihood is unbiasedly estimated.
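The sampling scheme named above can be sketched on a toy problem. The code below is a minimal illustration of adaptive multiple importance sampling with deterministic-mixture weights, not the thesis's implementation: the one-dimensional target `log_target` is an assumed stand-in for a posterior over GP covariance parameters, the proposal family is Gaussian, and adaptation is by moment matching. All names and default values are hypothetical.

```python
import numpy as np

def log_target(x):
    """Unnormalised log density; an assumed stand-in for a GP
    hyperparameter posterior (here a Gaussian with mean 3, std 0.5)."""
    return -0.5 * ((x - 3.0) / 0.5) ** 2

def normal_pdf(x, mean, std):
    return np.exp(-0.5 * ((x - mean) / std) ** 2) / (std * np.sqrt(2.0 * np.pi))

def amis(n_iters=10, n_per_iter=500, mean=0.0, std=2.0, seed=0):
    rng = np.random.default_rng(seed)
    samples = np.empty(0)
    proposals = []  # (mean, std) of every proposal used so far
    for _ in range(n_iters):
        proposals.append((mean, std))
        samples = np.concatenate([samples, rng.normal(mean, std, n_per_iter)])
        # Deterministic-mixture weights: every sample, old or new, is
        # re-weighted against the equal-weight mixture of all proposals
        # used so far (equal because each draws n_per_iter samples).
        mixture = np.mean([normal_pdf(samples, m, s) for m, s in proposals], axis=0)
        weights = np.exp(log_target(samples)) / mixture
        weights /= weights.sum()
        # Adapt the next proposal by matching weighted moments.
        mean = float(np.sum(weights * samples))
        std = max(float(np.sqrt(np.sum(weights * (samples - mean) ** 2))), 1e-3)
    return samples, weights

samples, weights = amis()
posterior_mean = float(np.sum(weights * samples))
```

Unlike an MCMC chain, no draw is rejected: each target evaluation is kept and simply re-weighted as the proposal adapts, which is the efficiency argument made above. In the GP setting each evaluation of `log_target` would involve the marginal likelihood, or an unbiased estimate of it in the pseudo-marginal variant.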
Experiments on benchmark data sets show that the proposed framework outperforms the MCMC-based inference of GP covariance parameters in a wide range of scenarios.
The PM-AMIS classifier, based on Gaussian processes with a newly designed group-automatic relevance determination (G-ARD) kernel, has been applied to predict whether a Flickr user is perceived to be above the median or not with respect to each of the Big-Five personality traits. The results show that, apart from the high prediction accuracies achieved (up to 79% depending on the trait), the parameters of the G-ARD kernel allow the identification of the groups of features that better account for the classification outcome and provide indications about cultural effects through their weight differences. This demonstrates the value of the proposed non-parametric probabilistic framework for social signal processing.
Feature extraction in signal processing is dominated by methods based on the short-time Fourier transform (STFT). Recently, Hilbert spectral analysis (HSA), a new signal representation that is fundamentally different from the STFT, has been proposed. This thesis is also the first attempt to investigate the extraction of features from HSA and their application in social signal processing. The experimental results reveal that, using features extracted from the Hilbert spectrum of voice data from female speakers, a prediction accuracy of up to 81% can be achieved when predicting their Big-Five personality traits, and hence show that HSA can work as an effective alternative to STFT for feature extraction in social signal processing.
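To make the contrast with the STFT concrete, the following sketch computes the two quantities a Hilbert spectrum is built from: instantaneous amplitude and instantaneous frequency of the analytic signal. It is an assumption-laden illustration rather than the thesis's feature pipeline. In particular, Hilbert spectral analysis is usually applied to components obtained from a prior decomposition step, which is skipped here, and the function names are hypothetical.

```python
import numpy as np

def analytic_signal(x):
    """Analytic signal via the FFT: keep DC (and Nyquist), double the
    positive frequencies, zero the negative ones."""
    n = len(x)
    gain = np.zeros(n)
    gain[0] = 1.0
    if n % 2 == 0:
        gain[n // 2] = 1.0
        gain[1:n // 2] = 2.0
    else:
        gain[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(np.fft.fft(x) * gain)

def instantaneous_amplitude(x):
    return np.abs(analytic_signal(x))

def instantaneous_frequency(x, fs):
    """Per-sample frequency in Hz from the derivative of the unwrapped phase."""
    phase = np.unwrap(np.angle(analytic_signal(x)))
    return np.diff(phase) * fs / (2.0 * np.pi)

# Hypothetical test tone: 50 Hz cosine sampled at 1 kHz for one second.
fs = 1000
t = np.arange(fs) / fs
tone = np.cos(2.0 * np.pi * 50.0 * t)
freq = instantaneous_frequency(tone, fs)
amp = instantaneous_amplitude(tone)
```

Where the STFT averages frequency content over a fixed window, these estimates are defined at every sample, which is the sense in which the representation differs from the STFT.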