14,899 research outputs found
A Comprehensive Survey on Cross-modal Retrieval
In recent years, cross-modal retrieval has drawn much attention due to the
rapid growth of multimodal data. It takes one type of data as the query to
retrieve relevant data of another type. For example, a user can use a text to
retrieve relevant pictures or videos. Since the query and its retrieved results
can be of different modalities, how to measure the content similarity between
different modalities of data remains a challenge. Various methods have been
proposed to deal with such a problem. In this paper, we first review a number
of representative methods for cross-modal retrieval and classify them into two
main groups: 1) real-valued representation learning, and 2) binary
representation learning. Real-valued representation learning methods aim to
learn real-valued common representations for different modalities of data. To
speed up the cross-modal retrieval, a number of binary representation learning
methods are proposed to map different modalities of data into a common Hamming
space. Then, we introduce several multimodal datasets in the community, and
show the experimental results on two commonly used multimodal datasets. The
comparison reveals the characteristic of different kinds of cross-modal
retrieval methods, which is expected to benefit both practical applications and
future research. Finally, we discuss open problems and future research
directions.Comment: 20 pages, 11 figures, 9 table
Why Can't We Predict RNA Structure At Atomic Resolution?
No existing algorithm can start with arbitrary RNA sequences and return the
precise, three-dimensional structures that ensures their biological function.
This chapter outlines current algorithms for automated RNA structure prediction
(including our own FARNA-FARFAR), highlights their successes, and dissects
their limitations, using a tetraloop and the sarcin/ricin motif as examples.
The barriers to future advances are considered in light of three particular
challenges: improving computational sampling, reducing reliance on
experimentally solved structures, and avoiding coarse-grained representations
of atomic-level interactions. To help meet these challenges and better
understand the current state of the field, we propose an ongoing community-wide
CASP-style experiment for evaluating the performance of current structure
prediction algorithms.Comment: K. Beauchamp & P. Sripakdeevong are equally contributing authors.
Submission for book: RNA 3D Structure Analysis and Prediction, editors: N.
Leontis & E. Westho
L0-norm Sparse Graph-regularized SVD for Biclustering
Learning the "blocking" structure is a central challenge for high dimensional
data (e.g., gene expression data). Recently, a sparse singular value
decomposition (SVD) has been used as a biclustering tool to achieve this goal.
However, this model ignores the structural information between variables (e.g.,
gene interaction graph). Although typical graph-regularized norm can
incorporate such prior graph information to get accurate discovery and better
interpretability, it fails to consider the opposite effect of variables with
different signs. Motivated by the development of sparse coding and
graph-regularized norm, we propose a novel sparse graph-regularized SVD as a
powerful biclustering tool for analyzing high-dimensional data. The key of this
method is to impose two penalties including a novel graph-regularized norm
() and -norm () on singular
vectors to induce structural sparsity and enhance interpretability. We design
an efficient Alternating Iterative Sparse Projection (AISP) algorithm to solve
it. Finally, we apply our method and related ones to simulated and real data to
show its efficiency in capturing natural blocking structures.Comment: 8 pages, 2 figure
DNA Steganalysis Using Deep Recurrent Neural Networks
Recent advances in next-generation sequencing technologies have facilitated
the use of deoxyribonucleic acid (DNA) as a novel covert channels in
steganography. There are various methods that exist in other domains to detect
hidden messages in conventional covert channels. However, they have not been
applied to DNA steganography. The current most common detection approaches,
namely frequency analysis-based methods, often overlook important signals when
directly applied to DNA steganography because those methods depend on the
distribution of the number of sequence characters. To address this limitation,
we propose a general sequence learning-based DNA steganalysis framework. The
proposed approach learns the intrinsic distribution of coding and non-coding
sequences and detects hidden messages by exploiting distribution variations
after hiding these messages. Using deep recurrent neural networks (RNNs), our
framework identifies the distribution variations by using the classification
score to predict whether a sequence is to be a coding or non-coding sequence.
We compare our proposed method to various existing methods and biological
sequence analysis methods implemented on top of our framework. According to our
experimental results, our approach delivers a robust detection performance
compared to other tools
Kernelized Low Rank Representation on Grassmann Manifolds
Low rank representation (LRR) has recently attracted great interest due to
its pleasing efficacy in exploring low-dimensional subspace structures embedded
in data. One of its successful applications is subspace clustering which means
data are clustered according to the subspaces they belong to. In this paper, at
a higher level, we intend to cluster subspaces into classes of subspaces. This
is naturally described as a clustering problem on Grassmann manifold. The
novelty of this paper is to generalize LRR on Euclidean space onto an LRR model
on Grassmann manifold in a uniform kernelized framework. The new methods have
many applications in computer vision tasks. Several clustering experiments are
conducted on handwritten digit images, dynamic textures, human face clips and
traffic scene sequences. The experimental results show that the proposed
methods outperform a number of state-of-the-art subspace clustering methods.Comment: 13 page
Multi-view Low-rank Sparse Subspace Clustering
Most existing approaches address multi-view subspace clustering problem by
constructing the affinity matrix on each view separately and afterwards propose
how to extend spectral clustering algorithm to handle multi-view data. This
paper presents an approach to multi-view subspace clustering that learns a
joint subspace representation by constructing affinity matrix shared among all
views. Relying on the importance of both low-rank and sparsity constraints in
the construction of the affinity matrix, we introduce the objective that
balances between the agreement across different views, while at the same time
encourages sparsity and low-rankness of the solution. Related low-rank and
sparsity constrained optimization problem is for each view solved using the
alternating direction method of multipliers. Furthermore, we extend our
approach to cluster data drawn from nonlinear subspaces by solving the
corresponding problem in a reproducing kernel Hilbert space. The proposed
algorithm outperforms state-of-the-art multi-view subspace clustering
algorithms on one synthetic and four real-world datasets
Video Compressive Sensing for Dynamic MRI
We present a video compressive sensing framework, termed kt-CSLDS, to
accelerate the image acquisition process of dynamic magnetic resonance imaging
(MRI). We are inspired by a state-of-the-art model for video compressive
sensing that utilizes a linear dynamical system (LDS) to model the motion
manifold. Given compressive measurements, the state sequence of an LDS can be
first estimated using system identification techniques. We then reconstruct the
observation matrix using a joint structured sparsity assumption. In particular,
we minimize an objective function with a mixture of wavelet sparsity and joint
sparsity within the observation matrix. We derive an efficient convex
optimization algorithm through alternating direction method of multipliers
(ADMM), and provide a theoretical guarantee for global convergence. We
demonstrate the performance of our approach for video compressive sensing, in
terms of reconstruction accuracy. We also investigate the impact of various
sampling strategies. We apply this framework to accelerate the acquisition
process of dynamic MRI and show it achieves the best reconstruction accuracy
with the least computational time compared with existing algorithms in the
literature.Comment: 30 pages, 9 figure
Set-Difference Range Queries
We introduce the problem of performing set-difference range queries, where
answers to queries are set-theoretic symmetric differences between sets of
items in two geometric ranges. We describe a general framework for answering
such queries based on a novel use of data-streaming sketches we call signed
symmetric-difference sketches. We show that such sketches can be realized using
invertible Bloom filters (IBFs), which can be composed, differenced, and
searched so as to solve set-difference range queries in a wide range of
scenarios
Learning Invariant Color Features for Person Re-Identification
Matching people across multiple camera views known as person
re-identification, is a challenging problem due to the change in visual
appearance caused by varying lighting conditions. The perceived color of the
subject appears to be different with respect to illumination. Previous works
use color as it is or address these challenges by designing color spaces
focusing on a specific cue. In this paper, we propose a data driven approach
for learning color patterns from pixels sampled from images across two camera
views. The intuition behind this work is that, even though pixel values of same
color would be different across views, they should be encoded with the same
values. We model color feature generation as a learning problem by jointly
learning a linear transformation and a dictionary to encode pixel values. We
also analyze different photometric invariant color spaces. Using color as the
only cue, we compare our approach with all the photometric invariant color
spaces and show superior performance over all of them. Combining with other
learned low-level and high-level features, we obtain promising results in
ViPER, Person Re-ID 2011 and CAVIAR4REID datasets
- …