38,610 research outputs found
Multi-view constrained clustering with an incomplete mapping between views
Multi-view learning algorithms typically assume a complete bipartite mapping
between the different views in order to exchange information during the
learning process. However, many applications provide only a partial mapping
between the views, creating a challenge for current methods. To address this
problem, we propose a multi-view algorithm based on constrained clustering that
can operate with an incomplete mapping. Given a set of pairwise constraints in
each view, our approach propagates these constraints using a local similarity
measure to those instances that can be mapped to the other views, allowing the
propagated constraints to be transferred across views via the partial mapping.
It uses co-EM to iteratively estimate the propagation within each view based on
the current clustering model, transfer the constraints across views, and then
update the clustering model. By alternating the learning process between views,
this approach produces a unified clustering model that is consistent with all
views. We show that this approach significantly improves clustering performance
over several other methods for transferring constraints and allows multi-view
clustering to be reliably applied when given a limited mapping between the
views. Our evaluation reveals that the propagated constraints have high
precision with respect to the true clusters in the data, explaining their
benefit to clustering performance in both single- and multi-view learning
scenarios
Semi-supervised Deep Generative Modelling of Incomplete Multi-Modality Emotional Data
There are threefold challenges in emotion recognition. First, it is difficult
to recognize human's emotional states only considering a single modality.
Second, it is expensive to manually annotate the emotional data. Third,
emotional data often suffers from missing modalities due to unforeseeable
sensor malfunction or configuration issues. In this paper, we address all these
problems under a novel multi-view deep generative framework. Specifically, we
propose to model the statistical relationships of multi-modality emotional data
using multiple modality-specific generative networks with a shared latent
space. By imposing a Gaussian mixture assumption on the posterior approximation
of the shared latent variables, our framework can learn the joint deep
representation from multiple modalities and evaluate the importance of each
modality simultaneously. To solve the labeled-data-scarcity problem, we extend
our multi-view model to semi-supervised learning scenario by casting the
semi-supervised classification problem as a specialized missing data imputation
task. To address the missing-modality problem, we further extend our
semi-supervised multi-view model to deal with incomplete data, where a missing
view is treated as a latent variable and integrated out during inference. This
way, the proposed overall framework can utilize all available (both labeled and
unlabeled, as well as both complete and incomplete) data to improve its
generalization ability. The experiments conducted on two real multi-modal
emotion datasets demonstrated the superiority of our framework.Comment: arXiv admin note: text overlap with arXiv:1704.07548, 2018 ACM
Multimedia Conference (MM'18
3D Shape Segmentation with Projective Convolutional Networks
This paper introduces a deep architecture for segmenting 3D objects into
their labeled semantic parts. Our architecture combines image-based Fully
Convolutional Networks (FCNs) and surface-based Conditional Random Fields
(CRFs) to yield coherent segmentations of 3D shapes. The image-based FCNs are
used for efficient view-based reasoning about 3D object parts. Through a
special projection layer, FCN outputs are effectively aggregated across
multiple views and scales, then are projected onto the 3D object surfaces.
Finally, a surface-based CRF combines the projected outputs with geometric
consistency cues to yield coherent segmentations. The whole architecture
(multi-view FCNs and CRF) is trained end-to-end. Our approach significantly
outperforms the existing state-of-the-art methods in the currently largest
segmentation benchmark (ShapeNet). Finally, we demonstrate promising
segmentation results on noisy 3D shapes acquired from consumer-grade depth
cameras.Comment: This is an updated version of our CVPR 2017 paper. We incorporated
new experiments that demonstrate ShapePFCN performance under the case of
consistent *upright* orientation and an additional input channel in our
rendered images for encoding height from the ground plane (upright axis
coordinate values). Performance is improved in this settin
Eye-CU: Sleep Pose Classification for Healthcare using Multimodal Multiview Data
Manual analysis of body poses of bed-ridden patients requires staff to
continuously track and record patient poses. Two limitations in the
dissemination of pose-related therapies are scarce human resources and
unreliable automated systems. This work addresses these issues by introducing a
new method and a new system for robust automated classification of sleep poses
in an Intensive Care Unit (ICU) environment. The new method,
coupled-constrained Least-Squares (cc-LS), uses multimodal and multiview (MM)
data and finds the set of modality trust values that minimizes the difference
between expected and estimated labels. The new system, Eye-CU, is an affordable
multi-sensor modular system for unobtrusive data collection and analysis in
healthcare. Experimental results indicate that the performance of cc-LS matches
the performance of existing methods in ideal scenarios. This method outperforms
the latest techniques in challenging scenarios by 13% for those with poor
illumination and by 70% for those with both poor illumination and occlusions.
Results also show that a reduced Eye-CU configuration can classify poses
without pressure information with only a slight drop in its performance.Comment: Ten-page manuscript including references and ten figure
Automatic 3D bi-ventricular segmentation of cardiac images by a shape-refined multi-task deep learning approach
Deep learning approaches have achieved state-of-the-art performance in
cardiac magnetic resonance (CMR) image segmentation. However, most approaches
have focused on learning image intensity features for segmentation, whereas the
incorporation of anatomical shape priors has received less attention. In this
paper, we combine a multi-task deep learning approach with atlas propagation to
develop a shape-constrained bi-ventricular segmentation pipeline for short-axis
CMR volumetric images. The pipeline first employs a fully convolutional network
(FCN) that learns segmentation and landmark localisation tasks simultaneously.
The architecture of the proposed FCN uses a 2.5D representation, thus combining
the computational advantage of 2D FCNs networks and the capability of
addressing 3D spatial consistency without compromising segmentation accuracy.
Moreover, the refinement step is designed to explicitly enforce a shape
constraint and improve segmentation quality. This step is effective for
overcoming image artefacts (e.g. due to different breath-hold positions and
large slice thickness), which preclude the creation of anatomically meaningful
3D cardiac shapes. The proposed pipeline is fully automated, due to network's
ability to infer landmarks, which are then used downstream in the pipeline to
initialise atlas propagation. We validate the pipeline on 1831 healthy subjects
and 649 subjects with pulmonary hypertension. Extensive numerical experiments
on the two datasets demonstrate that our proposed method is robust and capable
of producing accurate, high-resolution and anatomically smooth bi-ventricular
3D models, despite the artefacts in input CMR volumes
A Concise yet Effective model for Non-Aligned Incomplete Multi-view and Missing Multi-label Learning
In reality, learning from multi-view multi-label data inevitably confronts
three challenges: missing labels, incomplete views, and non-aligned views.
Existing methods mainly concern the first two and commonly need multiple
assumptions to attack them, making even state-of-the-arts involve at least two
explicit hyper-parameters such that model selection is quite difficult. More
roughly, they will fail in handling the third challenge, let alone addressing
the three jointly. In this paper, we aim at meeting these under the least
assumption by building a concise yet effective model with just one
hyper-parameter. To ease insufficiency of available labels, we exploit not only
the consensus of multiple views but also the global and local structures hidden
among multiple labels. Specifically, we introduce an indicator matrix to tackle
the first two challenges in a regression form while aligning the same
individual labels and all labels of different views in a common label space to
battle the third challenge. In aligning, we characterize the global and local
structures of multiple labels to be high-rank and low-rank, respectively.
Subsequently, an efficient algorithm with linear time complexity in the number
of samples is established. Finally, even without view-alignment, our method
substantially outperforms state-of-the-arts with view-alignment on five real
datasets.Comment: 15 pages, 7 figure
- …