Probabilistic models for multi-view semi-supervised learning and coding
Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2009. Cataloged from PDF version of thesis. Includes bibliographical references (p. 146-160). This thesis investigates the problem of classification from multiple noisy sensors or modalities. Examples include speech and gesture interfaces and multi-camera distributed sensor networks. Reliable recognition in such settings hinges upon the ability to learn accurate classification models in the face of limited supervision and to cope with the relatively large amount of potentially redundant information transmitted by each sensor or modality (i.e., view). We investigate and develop novel multi-view learning algorithms capable of learning from semi-supervised noisy sensor data, automatically adapting to new users and working conditions, and performing distributed feature selection on bandwidth-limited sensor networks. We propose probabilistic models built upon multi-view Gaussian Processes (GPs) for solving this class of problems, and demonstrate our approaches on audio-visual speech and gesture classification and on multi-view object classification problems. Multi-modal tasks are good candidates for multi-view learning, since each modality provides a potentially redundant view to the learning algorithm. On audio-visual speech unit classification, and on user agreement recognition using spoken utterances and head gestures, we demonstrate that multi-modal co-training can be used to learn from only a few labeled examples in one or both of the audio-visual modalities. We also propose a co-adaptation algorithm, which adapts existing audio-visual classifiers to a particular user or noise condition by leveraging the redundancy in the unlabeled data. Existing methods typically assume constant per-channel noise models.
In contrast, we develop co-training algorithms that are able to learn from noisy sensor data corrupted by complex per-sample noise processes, e.g., the occlusion common to multi-sensor classification problems. We propose a probabilistic heteroscedastic approach to co-training that simultaneously discovers the amount of noise on a per-sample basis while solving the classification task. This results in accurate performance in the presence of occlusion or other complex noise processes. We also investigate an extension of this idea for supervised multi-view learning, where we develop a Bayesian multiple kernel learning algorithm that can learn a local weighting over each view of the input space. We additionally consider the problem of distributed object recognition or indexing from multiple cameras, where the computational power available at each camera sensor is limited and communication between cameras is prohibitively expensive. In this scenario, it is desirable to avoid sending redundant visual features from multiple views. Traditional supervised feature selection approaches are inapplicable because the class label is unknown at each camera. In this thesis, we propose an unsupervised multi-view feature selection algorithm based on a distributed coding approach. With our method, a Gaussian Process model of the joint view statistics is used at the receiver to obtain a joint encoding of the views without directly sharing information across encoders. We demonstrate our approach on recognition and indexing tasks with multi-view image databases and show that our method compares favorably to an independent encoding of the features from each camera. by C. Mario Christoudias. Ph.D.
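The co-training idea mentioned in the abstract above can be illustrated with a minimal sketch. This is not the thesis's Gaussian Process method: it assumes two views with 1-D features, swaps in a simple nearest-centroid classifier per view, and uses hypothetical names (`centroids`, `predict`, `co_train`) and toy data throughout. Each view's classifier labels the unlabeled sample it is most confident about, and that label then becomes training data for the other view.

```python
def centroids(points, labels):
    """Mean feature value per class (1-D features)."""
    sums, counts = {}, {}
    for x, y in zip(points, labels):
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}


def predict(cents, x):
    """Return (label, margin) for a nearest-centroid classifier.

    The margin is the distance to the runner-up centroid minus the
    distance to the nearest one; it serves as a crude confidence score.
    """
    ranked = sorted(cents, key=lambda y: abs(x - cents[y]))
    best = ranked[0]
    if len(ranked) == 1:
        return best, 0.0
    return best, abs(x - cents[ranked[1]]) - abs(x - cents[best])


def co_train(view_a, view_b, seed_labels, rounds=10):
    """Toy co-training: the two views take turns labeling one sample each."""
    labels = dict(seed_labels)  # sample index -> class label
    for _ in range(rounds):
        for view in (view_a, view_b):
            unlabeled = [i for i in range(len(view)) if i not in labels]
            if not unlabeled:
                return labels
            idx = sorted(labels)
            cents = centroids([view[i] for i in idx], [labels[i] for i in idx])
            # Label the sample this view is most confident about; the shared
            # label pool passes it on to the other view's classifier.
            pick = max(unlabeled, key=lambda i: predict(cents, view[i])[1])
            labels[pick] = predict(cents, view[pick])[0]
    return labels


# Toy data (hypothetical): six samples, two views, one seed label per class.
view_a = [0.0, 0.2, 0.1, 5.0, 5.2, 4.9]
view_b = [1.0, 1.1, 0.9, 9.0, 9.2, 8.8]
result = co_train(view_a, view_b, {0: "neg", 3: "pos"})
```

A per-sample noise model, as in the heteroscedastic approach the abstract describes, would replace the margin used here with a learned estimate of how reliable each view is for each sample.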
Structure fusion based on graph convolutional networks for semi-supervised classification
Because multi-view data are diverse and complex, most existing graph
convolutional networks for semi-supervised classification focus on network
architecture construction or on preserving the salient graph structure, and
ignore the contribution of the complete graph structure to semi-supervised
classification. To mine a more complete distribution structure from multi-view
data while accounting for both specificity and commonality, we propose
structure fusion based on graph convolutional networks (SF-GCN) to improve
the performance of semi-supervised classification. SF-GCN not only retains the
special characteristics of each view through spectral embedding, but also
captures the common style of multi-view data through a distance metric between
multi-graph structures. Assuming a linear relationship between the multi-graph
structures, we construct the optimization function of the structure fusion
model by balancing the specificity loss and the commonality loss. By solving
this function, we simultaneously obtain the fused spectral embedding of the
multi-view data and the fused structure, which serves as the adjacency matrix
input to the graph convolutional network for semi-supervised classification.
Experiments demonstrate that SF-GCN outperforms state-of-the-art methods on
three challenging citation-network datasets: Cora, Citeseer, and Pubmed.
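As a rough illustration of how a fused graph structure can feed a graph convolutional layer, the sketch below combines per-view adjacency matrices linearly and applies one standard GCN propagation step, ReLU(D^{-1/2}(A + I)D^{-1/2} H W). It is not the SF-GCN optimization itself: the fusion weights are assumed to be fixed rather than learned, and the function names and the tiny three-node graphs are hypothetical.

```python
import math


def fuse_graphs(adjs, weights):
    """Linear combination of per-view adjacency matrices (weights assumed given)."""
    n = len(adjs[0])
    return [[sum(w * a[i][j] for a, w in zip(adjs, weights)) for j in range(n)]
            for i in range(n)]


def normalize(adj):
    """Symmetric normalization with self-loops: D^{-1/2} (A + I) D^{-1/2}."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]


def matmul(a, b):
    """Plain-Python matrix product of two nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def gcn_layer(adj_norm, feats, weight):
    """One propagation step: ReLU(adj_norm @ feats @ weight)."""
    z = matmul(matmul(adj_norm, feats), weight)
    return [[max(0.0, v) for v in row] for row in z]


# Two hypothetical views of the same three nodes, each seeing a different edge.
view1 = [[0.0, 1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
view2 = [[0.0, 0.0, 0.0], [0.0, 0.0, 1.0], [0.0, 1.0, 0.0]]
fused = fuse_graphs([view1, view2], [0.5, 0.5])
adj_norm = normalize(fused)

identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
hidden = gcn_layer(adj_norm, identity, identity)
```

With identity features and weights, `hidden` equals the normalized fused adjacency, which makes it easy to see that node 1 now aggregates from both neighbors even though each view on its own contains only one of the two edges.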