Canonical Correlation Analysis of Datasets with a Common Source Graph
Canonical correlation analysis (CCA) is a powerful technique for discovering
whether or not hidden sources are commonly present in two (or more) datasets.
Its well-appreciated merits include dimensionality reduction, clustering,
classification, feature selection, and data fusion. The standard CCA, however,
does not exploit the geometry of the common sources, which may be available
from the given data or can be deduced from (cross-) correlations. In this
paper, this extra information provided by the common sources generating the
data is encoded in a graph, and is invoked as a graph regularizer. This leads
to a novel graph-regularized CCA approach, termed graph (g) CCA. The
novel gCCA accounts for the graph-induced knowledge of common sources, while
minimizing the distance between the wanted canonical variables. Tailored for
diverse practical settings where the number of data samples is smaller than the data
vector dimensions, the dual formulation of gCCA is also developed. One such
setting includes kernels that are incorporated to account for nonlinear data
dependencies. The resultant graph-kernel (gk) CCA is also obtained in closed
form. Finally, corroborating image classification tests over several real
datasets are presented to showcase the merits of the novel linear, dual, and
kernel approaches relative to competing alternatives. Comment: 10 pages, 7 figures
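To make the graph regularizer concrete, the following is one plausible formulation consistent with the abstract (a sketch under assumed notation, not necessarily the paper's exact objective): the two views X and Y are coupled through a common-source matrix S, which is penalized by the Laplacian L of the source graph with weight γ ≥ 0,

```latex
\min_{\mathbf{W}_x,\,\mathbf{W}_y,\,\mathbf{S}}\;
\|\mathbf{W}_x^{\top}\mathbf{X}-\mathbf{S}\|_F^2
+\|\mathbf{W}_y^{\top}\mathbf{Y}-\mathbf{S}\|_F^2
+\gamma\,\operatorname{tr}\!\left(\mathbf{S}\mathbf{L}\mathbf{S}^{\top}\right)
\quad\text{s.t.}\quad \mathbf{S}\mathbf{S}^{\top}=\mathbf{I},
```

where W_x and W_y hold the canonical loadings; the trace term is small when sources that are connected in the graph receive similar low-dimensional representations, which is the graph-induced knowledge the abstract refers to.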
Graph Multiview Canonical Correlation Analysis
Multiview canonical correlation analysis (MCCA) seeks latent low-dimensional
representations encountered with multiview data of shared entities (a.k.a.
common sources). However, existing MCCA approaches do not exploit the geometry
of the common sources, which may be available \emph{a priori}, or can be
constructed using certain domain knowledge. This prior information about the
common sources can be encoded by a graph, and be invoked as a regularizer to
enrich the maximum variance MCCA framework. In this context, the present
paper's novel graph-regularized (G) MCCA approach minimizes the distance
between the wanted canonical variables and the common low-dimensional
representations, while accounting for graph-induced knowledge of the common
sources. Relying on a function that captures the extent to which the low-dimensional
representations of the multiple views are similar, a generalization bound of
GMCCA is established based on Rademacher complexity. Tailored for setups
where the number of data pairs is smaller than the data vector dimensions, a
graph-regularized dual MCCA approach is also developed. To further deal with
nonlinearities present in the data, graph-regularized kernel MCCA variants are
put forward too. Interestingly, solutions of the graph-regularized linear,
dual, and kernel MCCA are all provided in terms of generalized eigenvalue
decomposition. Several corroborating numerical tests using real datasets are
provided to showcase the merits of the graph-regularized MCCA variants relative
to several competing alternatives including MCCA, Laplacian-regularized MCCA,
and (graph-regularized) PCA.
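As a rough illustration of how such a graph-regularized, MAX-VAR-style MCCA reduces to an eigenvalue problem, the sketch below penalizes the common representation S with a graph Laplacian L and recovers S from the top eigenvectors of a regularized sum of per-view projection matrices. The function name gmcca_sketch, the ridge term eps, and the exact objective are assumptions for illustration, not the authors' implementation.

```python
import numpy as np
from scipy.linalg import eigh

def gmcca_sketch(views, L, gamma=0.1, d=2, eps=1e-6):
    """Illustrative graph-regularized multiview CCA (MAX-VAR style).

    views : list of (D_m, N) arrays, one per view, columns are the N samples
    L     : (N, N) graph Laplacian encoding similarity of the N common sources
    gamma : graph-regularization weight
    d     : number of canonical dimensions to extract
    """
    N = views[0].shape[1]
    views = [X - X.mean(axis=1, keepdims=True) for X in views]   # center each view
    C = np.zeros((N, N))
    for X in views:
        # X^T (X X^T + eps I)^{-1} X: projection-like matrix of this view
        C += X.T @ np.linalg.solve(X @ X.T + eps * np.eye(X.shape[0]), X)
    M = C - gamma * L                     # Laplacian term discourages "rough" S on the graph
    vals, vecs = eigh(M)                  # symmetric eigendecomposition, ascending order
    S = vecs[:, -d:].T                    # top-d eigenvectors -> common sources (d x N)
    # per-view canonical loadings via least squares: min ||W^T X - S||_F^2
    Ws = [np.linalg.solve(X @ X.T + eps * np.eye(X.shape[0]), X @ S.T) for X in views]
    return S, Ws
```

The dual and kernel variants described in the abstract would replace the Gram-type matrices with kernel matrices; this sketch only covers the linear case.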
An Overview of Cross-media Retrieval: Concepts, Methodologies, Benchmarks and Challenges
Multimedia retrieval plays an indispensable role in big data utilization.
Past efforts mainly focused on single-media retrieval. However, user
requirements are highly flexible, such as retrieving relevant
audio clips with an image query. So challenges stemming from the "media
gap", which means that representations of different media types are
inconsistent, have attracted increasing attention. Cross-media retrieval is
designed for the scenarios where the queries and retrieval results are of
different media types. As a relatively new research topic, its concepts,
methodologies and benchmarks are still not clear in the literature. To address
these issues, we review more than 100 references, give an overview including
the concepts, methodologies, major challenges and open issues, as well as build
up the benchmarks including datasets and experimental results. Researchers can
directly adopt the benchmarks to promptly evaluate their proposed methods. This
will help them focus on algorithm design rather than on the time-consuming
reproduction of compared methods and results. It is noted that we have constructed a new
dataset XMedia, which is the first publicly available dataset with up to five
media types (text, image, video, audio and 3D model). We believe this overview
will attract more researchers to focus on cross-media retrieval and be helpful
to them. Comment: 14 pages, accepted by IEEE Transactions on Circuits and Systems for
Video Technology
Linked Component Analysis from Matrices to High Order Tensors: Applications to Biomedical Data
With the increasing availability of various sensor technologies, we now have
access to large amounts of multi-block (also called multi-set,
multi-relational, or multi-view) data that need to be jointly analyzed to
explore their latent connections. Various component analysis methods have
played an increasingly important role for the analysis of such coupled data. In
this paper, we first provide a brief review of existing matrix-based (two-way)
component analysis methods for the joint analysis of such data with a focus on
biomedical applications. Then, we discuss their important extensions and
generalization to multi-block multiway (tensor) data. We show how constrained
multi-block tensor decomposition methods are able to extract similar or
statistically dependent common features that are shared by all blocks, by
incorporating the multiway nature of data. Special emphasis is given to the
flexible common and individual feature analysis of multi-block data with the
aim to simultaneously extract common and individual latent components with
desired properties and types of diversity. Illustrative examples are given to
demonstrate their effectiveness for biomedical data analysis. Comment: 20 pages, 11 figures, Proceedings of the IEEE, 201
Multi-View Bayesian Correlated Component Analysis
Correlated component analysis as proposed by Dmochowski et al. (2012) is a
tool for investigating brain process similarity in the responses to multiple
views of a given stimulus. Correlated components are identified under the
assumption that the involved spatial networks are identical. Here we propose a
hierarchical probabilistic model that can infer the level of universality in
such multi-view data, from completely unrelated representations, corresponding
to canonical correlation analysis, to identical representations as in
correlated component analysis. This new model, which we denote Bayesian
correlated component analysis, evaluates favourably against three relevant
algorithms in simulated data. A well-established benchmark EEG dataset is used
to further validate the new model and infer the variability of spatial
representations across multiple subjects.
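For orientation, the non-Bayesian correlated component analysis that the abstract refers to (Dmochowski et al., 2012) can be computed from a generalized eigenvalue problem. The snippet below is a minimal sketch of that classical baseline under the assumption of views with equal numbers of samples and channels; it is not the hierarchical Bayesian model proposed in the paper.

```python
import numpy as np
from scipy.linalg import eigh

def correlated_components(views, d=3, ridge=1e-6):
    """Classical correlated component analysis (shared spatial filters).

    views : list of (T, D) arrays, one per subject/view, with identical T and D.
    Returns a (D, d) filter matrix maximizing between-view correlation.
    """
    D = views[0].shape[1]
    centered = [X - X.mean(axis=0) for X in views]
    Rw = np.zeros((D, D))                      # pooled within-view covariance
    Rb = np.zeros((D, D))                      # pooled between-view covariance
    for i, Xi in enumerate(centered):
        Rw += Xi.T @ Xi / len(Xi)
        for j, Xj in enumerate(centered):
            if i != j:
                Rb += Xi.T @ Xj / len(Xi)
    # generalized eigenproblem: directions with high between- vs. within-view covariance
    vals, vecs = eigh(Rb, Rw + ridge * np.eye(D))
    return vecs[:, np.argsort(-vals)[:d]]
```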
SMILES Enumeration as Data Augmentation for Neural Network Modeling of Molecules
The Simplified Molecular Input Line Entry System (SMILES) is a single-line text
representation of a molecule. One molecule can, however, have multiple valid
SMILES strings, which is why canonical SMILES have been defined to
ensure a one-to-one correspondence between SMILES string and molecule.
Here the fact that multiple SMILES represent the same molecule is explored as a
technique for data augmentation of a molecular QSAR dataset modeled by a long
short term memory (LSTM) cell based neural network. The augmented dataset was
130 times bigger than the original. The network trained with the augmented
dataset shows better performance on a test set when compared to a model built
with only one canonical SMILES string per molecule. The correlation coefficient
R2 on the test set was improved from 0.56 to 0.66 when using SMILES
enumeration, and the root mean square error (RMS) likewise fell from 0.62 to
0.55. The technique also works in the prediction phase. By averaging, per
molecule, the predictions for the enumerated SMILES, a further improvement
to a correlation coefficient of 0.68 and an RMS of 0.52 was obtained.
Alternating Diffusion Map Based Fusion of Multimodal Brain Connectivity Networks for IQ Prediction
To explain individual differences in development, behavior, and cognition,
most previous studies focused on projecting resting-state functional MRI (fMRI)
based functional connectivity (FC) data into a low-dimensional space via linear
dimensionality reduction techniques, followed by executing analysis operations.
However, linear dimensionality reduction techniques may fail to capture the
nonlinearity of brain neural activity. Moreover, besides resting-state FC, FC
based on task fMRI can be expected to provide complementary information.
Motivated by these considerations, we nonlinearly fuse resting-state and
task-based FC networks (FCNs) to seek a better representation in this paper. We
propose a framework based on alternating diffusion map (ADM), which extracts
geometry-preserving low-dimensional embeddings that successfully parameterize
the intrinsic variables driving the phenomenon of interest. Specifically, we
first separately build resting-state and task-based FCNs by symmetric positive
definite matrices using sparse inverse covariance estimation for each subject,
and then utilize the ADM to fuse them in order to extract significant
low-dimensional embeddings, which are used as fingerprints to identify
individuals. The proposed framework is validated on the Philadelphia
Neurodevelopmental Cohort data, where we conduct extensive experimental study
on resting-state and fractal n-back task fMRI for the classification of
intelligence quotient (IQ). The fusion of resting-state and n-back task fMRI
by the proposed framework achieves better classification accuracy than any
single fMRI, and the proposed framework is shown to outperform several other
data fusion methods. To our knowledge, this paper is the first to demonstrate a
successful extension of the ADM to fuse resting-state and task-based fMRI data
for accurate prediction of IQ.
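The fusion step can be summarized with the standard alternating-diffusion construction: build a diffusion operator from each view, compose them, and embed with the leading eigenvectors of the product. The code below illustrates that generic construction with assumed Gaussian kernels over vectorized connectivity features; it is not the authors' pipeline, which also involves sparse inverse covariance estimation of the FCNs.

```python
import numpy as np
from scipy.linalg import eig
from scipy.spatial.distance import cdist

def alternating_diffusion_embedding(F_rest, F_task, sigma=1.0, d=3):
    """Illustrative alternating-diffusion fusion of two per-subject feature sets.

    F_rest, F_task : (N, p) arrays of vectorized connectivity features,
                     one row per subject, from the two fMRI conditions.
    Returns an (N, d) nonlinear joint embedding used as a "fingerprint".
    """
    def diffusion_operator(F):
        D2 = cdist(F, F, metric="sqeuclidean")
        K = np.exp(-D2 / (2.0 * sigma ** 2))           # Gaussian affinity
        return K / K.sum(axis=1, keepdims=True)        # row-stochastic kernel

    P = diffusion_operator(F_rest) @ diffusion_operator(F_task)  # alternating step
    vals, vecs = eig(P)
    order = np.argsort(-vals.real)[1:d + 1]            # skip the trivial constant eigenvector
    return vecs[:, order].real * vals[order].real      # diffusion-map coordinates
```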
Unsupervised Multi-modal Hashing for Cross-modal retrieval
With the advantage of low storage cost and high efficiency, hashing learning
has received much attention in the domain of Big Data. In this paper, we
propose a novel unsupervised hashing method that directly preserves the
manifold structure of the data in the learned hash codes. To this end, both the
semantic correlation in the textual space and the local geometric structure in
the visual space are explored simultaneously in our framework. Besides, an
ℓ2,1-norm constraint is imposed on the projection
matrices to learn the discriminative hash function for each modality. Extensive
experiments are performed to evaluate the proposed method on the three publicly
available datasets and the experimental results show that our method can
achieve superior performance over the state-of-the-art methods. Comment: 4 pages, 4 figures
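Two of the ingredients mentioned in the abstract are easy to state in code. The sketch below shows how an ℓ2,1 penalty on a projection matrix is computed and how binary codes are typically obtained from a learned linear projection by sign thresholding; the joint optimization over both modalities is not shown and the function names are illustrative.

```python
import numpy as np

def l21_norm(W):
    """Sum of row-wise Euclidean norms; penalizing it drives whole rows of W
    to zero, yielding a discriminative, feature-selecting projection."""
    return np.sqrt((W ** 2).sum(axis=1)).sum()

def hash_codes(X, W):
    """Binary codes from a learned linear projection: one bit per column of W."""
    return (X @ W > 0).astype(np.uint8)       # N x n_bits matrix of {0, 1}
```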
Modality-specific Cross-modal Similarity Measurement with Recurrent Attention Network
Nowadays, cross-modal retrieval plays an indispensable role in flexibly finding
information across different modalities of data. Effectively measuring the
similarity between different modalities of data is the key of cross-modal
retrieval. Different modalities such as image and text have imbalanced and
complementary relationships, which contain unequal amounts of information when
describing the same semantics. For example, images often contain more details
that cannot be demonstrated by textual descriptions and vice versa. Existing
works based on Deep Neural Networks (DNNs) mostly construct one common space for
different modalities to find the latent alignments between them, which loses
their exclusive modality-specific characteristics. Different from the existing
works, we propose a modality-specific cross-modal similarity measurement (MCSM)
approach by constructing an independent semantic space for each modality, which
adopts an end-to-end framework to directly generate modality-specific cross-modal
similarity without explicit common representation. For each semantic space,
modality-specific characteristics within one modality are fully exploited by a
recurrent attention network, while the data of the other modality is projected
into this space with attention-based joint embedding, using the learned
attention weights to guide fine-grained cross-modal correlation
learning; this captures the imbalanced and complementary relationships
between different modalities. Finally, the complementarity between the semantic
spaces for different modalities is explored by adaptive fusion of the
modality-specific cross-modal similarities to perform cross-modal retrieval.
Experiments on the widely-used Wikipedia and Pascal Sentence datasets as well
as our constructed large-scale XMediaNet dataset verify the effectiveness of
our proposed approach, outperforming 9 state-of-the-art methods. Comment: 13 pages, submitted to IEEE Transactions on Image Processing
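The final fusion step described above, combining the two modality-specific similarities, can be summarized compactly. The sketch below assumes the per-space embeddings have already been produced by the (not shown) recurrent attention networks, and the fixed weight alpha stands in for the adaptive fusion the paper learns; all names are illustrative.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def fused_similarity(img_emb_img_space, txt_emb_img_space,
                     txt_emb_txt_space, img_emb_txt_space, alpha=0.5):
    """Fusion of modality-specific cross-modal similarities (illustrative)."""
    s_img = cosine(img_emb_img_space, txt_emb_img_space)   # similarity in the image space
    s_txt = cosine(txt_emb_txt_space, img_emb_txt_space)   # similarity in the text space
    return alpha * s_img + (1.0 - alpha) * s_txt            # weighted, modality-aware score
```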
CM-GANs: Cross-modal Generative Adversarial Networks for Common Representation Learning
It is known that the inconsistent distribution and representation of
different modalities, such as image and text, cause the heterogeneity gap that
makes it challenging to correlate such heterogeneous data. Generative
adversarial networks (GANs) have shown a strong ability to model data
distributions and learn discriminative representations, but existing GAN-based
works mainly focus on the generative problem of synthesizing new data. Our goal
is different: we aim to correlate heterogeneous data by utilizing the power of
GANs to model the cross-modal joint distribution. Thus, we propose Cross-modal GANs
to learn a discriminative common representation for bridging the heterogeneity gap.
The main contributions are: (1) Cross-modal GANs architecture is proposed to
model the joint distribution over data of different modalities. The inter-modality
and intra-modality correlations can be explored simultaneously in the generative and
discriminative models, which compete with each other to promote cross-modal
correlation learning. (2) Cross-modal convolutional autoencoders with a
weight-sharing constraint are proposed to form the generative model. They can not
only exploit cross-modal correlation for learning common representation, but
also preserve reconstruction information for capturing semantic consistency
within each modality. (3) Cross-modal adversarial mechanism is proposed, which
utilizes two kinds of discriminative models to simultaneously conduct
intra-modality and inter-modality discrimination. They mutually boost each other,
making the common representation more discriminative through the adversarial training process.
To the best of our knowledge, our proposed CM-GANs approach is the first to
utilize GANs to perform cross-modal common representation learning. Experiments
are conducted to verify the performance of our proposed approach on cross-modal
retrieval paradigm, compared with 10 methods on 3 cross-modal datasets.
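Schematically, and only as one plausible reading of the three contributions above (the symbols and trade-off weights λ1, λ2 are assumptions, not the paper's notation), the training objective combines reconstruction by the weight-sharing autoencoder generators G_x, G_y with the two kinds of adversarial discrimination:

```latex
\min_{G_x,\,G_y}\;\max_{D_{\mathrm{intra}},\,D_{\mathrm{inter}}}\;
\mathcal{L}_{\mathrm{rec}}(G_x,G_y)
+\lambda_1\,\mathcal{L}_{\mathrm{intra}}\bigl(G_x,G_y,D_{\mathrm{intra}}\bigr)
+\lambda_2\,\mathcal{L}_{\mathrm{inter}}\bigl(G_x,G_y,D_{\mathrm{inter}}\bigr),
```

where the intra-modality term asks each discriminator to distinguish reconstructed from original data within a modality, and the inter-modality term asks a discriminator to tell which modality a common representation came from, pushing the generators toward modality-invariant, semantically consistent representations.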