
    Multi-view Data Analysis

    Multi-view data analysis is a key technology for making effective decisions by leveraging information from multiple data sources. The process of data acquisition across various sensory modalities gives rise to the heterogeneous property of data. In my thesis, multi-view data representations are studied towards exploiting the enriched information encoded in different domains or feature types, and novel algorithms are formulated to enhance feature discriminability. Extracting informative data representation is a critical step in visual recognition and data mining tasks. Multi-view embeddings provide a new way of representation learning to bridge the semantic gap between the low-level observations and high-level human-comprehensible knowledge, benefitting from enriched information in multiple modalities. Recent advances in multi-view learning have introduced a new paradigm in jointly modeling cross-modal data. Subspace learning, which extracts compact features by exploiting a common latent space and fuses multi-view information, has emerged as prominent among the different categories of multi-view learning techniques. This thesis provides novel solutions for learning compact and discriminative multi-view data representations by exploiting the data structures in low-dimensional subspaces. We also demonstrate the performance of the learned representation scheme on a number of challenging tasks in recognition, retrieval and ranking problems. The major contribution of the thesis is a unified solution for subspace learning methods, which is extensible to multiple views, supervised learning, and non-linear transformations. Traditional statistical learning techniques including Canonical Correlation Analysis, Partial Least Squares regression and Linear Discriminant Analysis are studied by constructing graphs of specific forms under the same framework.
Methods using non-linear transforms based on kernels and (deep) neural networks are derived, which lead to superior performance compared to the linear ones. A novel multi-view discriminant embedding method is proposed by taking the view difference into consideration. Secondly, a multi-view nonparametric discriminant analysis method is introduced by exploiting the class boundary structure and discrepancy information of the available views. This allows for multiple projection directions, by relaxing the Gaussian distribution assumption of related methods. Thirdly, we propose a composite ranking method by keeping a close correlation with the individual rankings for optimal rank fusion. We propose a multi-objective solution to ranking problems by capturing inter-view and intra-view information using autoencoder-like networks. Finally, a novel end-to-end solution is introduced to enhance joint ranking with minimum view-specific ranking loss, so that we can achieve the maximum global view agreements within a single optimization process. In summary, this thesis aims to address the challenges in representing multi-view data across different tasks. The proposed solutions have shown superior performance in numerous tasks, including object recognition, cross-modal image retrieval, face recognition and object ranking.
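The abstract's idea of a composite ranking that stays closely correlated with the individual per-view rankings can be illustrated with a small sketch. It does not reproduce the thesis's method, which the abstract does not specify. It swaps in a classical Borda count for the fusion step and Spearman rank correlation as the agreement measure. The function names `borda_fuse` and `spearman` and the toy data are hypothetical.

```python
from typing import Dict, List


def borda_fuse(rankings: List[List[str]]) -> List[str]:
    """Fuse per-view rankings (best item first) with a Borda count.

    Assumption: every view ranks the same set of items, without ties.
    """
    items = sorted(rankings[0])
    n = len(items)
    score: Dict[str, int] = {item: 0 for item in items}
    for ranking in rankings:
        for position, item in enumerate(ranking):
            score[item] += n - 1 - position  # top item earns n-1 points
    # Highest total score first; ties broken alphabetically for determinism.
    return sorted(items, key=lambda item: (-score[item], item))


def spearman(a: List[str], b: List[str]) -> float:
    """Spearman rank correlation between two tie-free rankings of the same items."""
    n = len(a)
    pos_b = {item: i for i, item in enumerate(b)}
    d2 = sum((i - pos_b[item]) ** 2 for i, item in enumerate(a))
    return 1.0 - 6.0 * d2 / (n * (n * n - 1))


# Three hypothetical views ranking the same four objects.
views = [
    ["a", "b", "c", "d"],
    ["b", "a", "c", "d"],
    ["a", "c", "b", "d"],
]
fused = borda_fuse(views)
agreement = [spearman(fused, v) for v in views]
print(fused, agreement)
```

A learned fusion, as described in the abstract, would optimize this kind of agreement jointly with the view-specific ranking losses rather than computing it after the fact.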

    Coupled Deep Learning for Heterogeneous Face Recognition

    Heterogeneous face matching is a challenging issue in face recognition due to the large domain difference as well as insufficient pairwise images in different modalities during training. This paper proposes a coupled deep learning (CDL) approach for heterogeneous face matching. CDL seeks a shared feature space in which the heterogeneous face matching problem can be approximately treated as a homogeneous face matching problem. The objective function of CDL mainly includes two parts. The first part contains a trace norm and a block-diagonal prior as relevance constraints, which not only cause unpaired images from multiple modalities to be clustered and correlated, but also regularize the parameters to alleviate overfitting. An approximate variational formulation is introduced to deal with the difficulty of optimizing the low-rank constraint directly. The second part contains a cross-modal ranking among triplets of domain-specific images to maximize the margin between different identities and to augment the small number of training samples. In addition, an alternating minimization method is employed to iteratively update the parameters of CDL. Experimental results show that CDL achieves better performance on the challenging CASIA NIR-VIS 2.0 face recognition database, the IIIT-D Sketch database, the CUHK Face Sketch (CUFS), and the CUHK Face Sketch FERET (CUFSF), significantly outperforming state-of-the-art heterogeneous face recognition methods. Comment: AAAI 201
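The cross-modal ranking term described above can be pictured with a generic hinge-style triplet loss. The abstract does not give the exact formulation used in CDL, so the sketch below assumes the standard form: an anchor embedding from one modality, a positive from the other modality with the same identity, and a negative from the other modality with a different identity. The names `sq_dist` and `triplet_ranking_loss`, the margin value, and the 2-D embeddings are all hypothetical.

```python
from typing import Sequence


def sq_dist(u: Sequence[float], v: Sequence[float]) -> float:
    """Squared Euclidean distance between two embeddings."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def triplet_ranking_loss(
    anchor: Sequence[float],
    positive: Sequence[float],
    negative: Sequence[float],
    margin: float = 1.0,
) -> float:
    """Hinge loss that is zero once the negative is at least `margin`
    farther from the anchor than the positive (in squared distance)."""
    return max(0.0, margin + sq_dist(anchor, positive) - sq_dist(anchor, negative))


# Hypothetical 2-D embeddings: a near-infrared anchor and two visible-light faces.
nir_anchor = [0.0, 0.0]
vis_same_identity = [0.0, 1.0]
vis_other_far = [3.0, 0.0]    # well separated, so no penalty
vis_other_close = [1.0, 0.0]  # as close as the positive, so penalized

easy = triplet_ranking_loss(nir_anchor, vis_same_identity, vis_other_far)
hard = triplet_ranking_loss(nir_anchor, vis_same_identity, vis_other_close)
print(easy, hard)
```

Because one pair of matched images can be combined with many different negatives, forming triplets also multiplies the number of training examples, which is consistent with the abstract's remark about increasing data for small training sets.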