328 research outputs found
Composite Correlation Quantization for Efficient Multimodal Retrieval
Efficient similarity retrieval from large-scale multimodal databases is
pervasive in modern search engines and social networks. To support queries
across content modalities, the system should enable cross-modal correlation and
computation-efficient indexing. While hashing methods have shown great
potential in achieving this goal, current attempts generally fail to learn
isomorphic hash codes in a seamless scheme, that is, they embed multiple
modalities in a continuous isomorphic space and separately threshold embeddings
into binary codes, which incurs substantial loss of retrieval accuracy. In this
paper, we approach seamless multimodal hashing by proposing a novel Composite
Correlation Quantization (CCQ) model. Specifically, CCQ jointly finds
correlation-maximal mappings that transform different modalities into
isomorphic latent space, and learns composite quantizers that convert the
isomorphic latent features into compact binary codes. An optimization framework
is devised to preserve both intra-modal similarity and inter-modal correlation
through minimizing both reconstruction and quantization errors, which can be
trained from both paired and partially paired data in linear time. A
comprehensive set of experiments clearly shows the superior effectiveness and
efficiency of CCQ against state-of-the-art hashing methods for both
unimodal and cross-modal retrieval.
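The two-step baseline that the abstract contrasts with CCQ (embed into a shared space, then threshold separately into binary codes) can be sketched as follows. This is not the CCQ model itself: the toy embeddings, the zero threshold, and the function names `binarize` and `hamming_rank` are assumptions made for illustration only.

```python
import numpy as np

def binarize(embeddings):
    """Threshold real-valued embeddings at zero into 0/1 codes (assumed threshold)."""
    return (np.asarray(embeddings) > 0).astype(np.uint8)

def hamming_rank(query_code, db_codes):
    """Return database indices sorted by Hamming distance to the query, plus the distances."""
    distances = np.count_nonzero(db_codes != query_code, axis=1)
    return np.argsort(distances, kind="stable"), distances

# Toy embeddings, assumed to be already mapped into a shared latent space.
image_embeddings = np.array([[0.9, -0.2, 0.4, -0.7],
                             [-0.5, 0.3, -0.1, 0.8],
                             [0.6, -0.4, -0.3, -0.9]])
text_query = np.array([0.7, -0.1, 0.2, -0.5])

db_codes = binarize(image_embeddings)
query_code = binarize(text_query)
order, distances = hamming_rank(query_code, db_codes)
print(order.tolist(), distances.tolist())
```

Thresholding discards the magnitude of each coordinate, which is the source of the accuracy loss the abstract attributes to separate binarization.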
A Brief Introduction to Machine Learning for Engineers
This monograph aims at providing an introduction to key concepts, algorithms,
and theoretical results in machine learning. The treatment concentrates on
probabilistic models for supervised and unsupervised learning problems. It
introduces fundamental concepts and algorithms by building on first principles,
while also exposing the reader to more advanced topics with extensive pointers
to the literature, within a unified notation and mathematical framework. The
material is organized according to clearly defined categories, such as
discriminative and generative models, frequentist and Bayesian approaches,
exact and approximate inference, as well as directed and undirected models.
This monograph is meant as an entry point for researchers with a background in
probability and linear algebra.
Comment: This is an expanded and improved version of the original posting.
Feedback is welcome.
A Neighborhood-preserving Graph Summarization
We introduce in this paper a new summarization method for large graphs. Our
summarization approach retains only a user-specified proportion of the
neighbors of each node in the graph. Our main aim is to simplify large graphs
so that they can be analyzed and processed effectively while preserving as many
of the node neighborhood properties as possible. Since many graph algorithms
are based on the neighborhood information available for each node, the idea is
to produce a smaller graph which can be used to allow these algorithms to
handle large graphs and run faster while providing good approximations.
Moreover, our compression allows users to control the size of the compressed
graph by adjusting the amount of information loss that can be tolerated. The
experiments conducted on various real and synthetic graphs show that our
compression considerably reduces the size of the graphs. Moreover, we conducted
several experiments on the obtained summaries using various graph algorithms
and applications, such as node embedding, graph classification and shortest
path approximations. The obtained results show interesting trade-offs between
the algorithms' runtime speed-up and the precision loss.
Comment: 17 pages, 10 figures
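A minimal sketch of the idea of retaining a user-specified proportion of each node's neighbors is given below. The abstract does not say how neighbors are selected, so uniform random sampling is an assumption here, as are the function name `sample_neighbors` and the `keep_ratio` and `seed` parameters.

```python
import math
import random

def sample_neighbors(adjacency, keep_ratio, seed=0):
    """Keep ceil(keep_ratio * degree) neighbors per node, chosen uniformly at random.

    Random selection is an illustrative assumption, not the paper's method.
    """
    if not 0.0 <= keep_ratio <= 1.0:
        raise ValueError("keep_ratio must lie in [0, 1]")
    rng = random.Random(seed)
    summary = {}
    for node, neighbors in adjacency.items():
        ordered = sorted(neighbors)  # fixed order so the seed gives repeatable output
        k = math.ceil(keep_ratio * len(ordered))
        summary[node] = set(rng.sample(ordered, k))
    return summary

graph = {
    "a": {"b", "c", "d", "e"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
    "e": {"a"},
}
summary = sample_neighbors(graph, keep_ratio=0.5)
kept_edges = sum(len(v) for v in summary.values())
total_edges = sum(len(v) for v in graph.values())
print(kept_edges, "of", total_edges, "adjacency entries kept")
```

Because each node samples independently, the result of this sketch may keep an edge in one direction only; a symmetric summary would need an extra reconciliation step.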
Effective AER Object Classification Using Segmented Probability-Maximization Learning in Spiking Neural Networks
Address event representation (AER) cameras have recently attracted more
attention due to the advantages of high temporal resolution and low power
consumption, compared with traditional frame-based cameras. Since AER cameras
record the visual input as asynchronous discrete events, they are inherently
suitable to coordinate with the spiking neural network (SNN), which is
biologically plausible and energy-efficient on neuromorphic hardware. However,
using SNNs to perform AER object classification is still challenging, due to
the lack of effective learning algorithms for this new representation. To
tackle this issue, we propose an AER object classification model using a novel
segmented probability-maximization (SPA) learning algorithm. Technically, 1)
the SPA learning algorithm iteratively maximizes the probability of the classes
that samples belong to, in order to improve the reliability of neuron responses
and effectiveness of learning; 2) a peak detection (PD) mechanism is introduced
in SPA to locate informative time points segment by segment, based on which
information within the whole event stream can be fully utilized by the
learning. Extensive experimental results show that, compared to
state-of-the-art methods, our model is not only more effective but also
requires less information to reach a certain level of accuracy.
Comment: AAAI 2020 (Oral)
Taking the bite out of automated naming of characters in TV video
We investigate the problem of automatically labelling appearances of characters in TV or film material with their names. This is tremendously challenging due to the huge variation in imaged appearance of each character and the weakness and ambiguity of available annotation. However, we demonstrate that high precision can be achieved by combining multiple sources of information, both visual and textual. The principal novelties that we introduce are: (i) automatic generation of time stamped character annotation by aligning subtitles and transcripts; (ii) strengthening the supervisory information by identifying when characters are speaking. In addition, we incorporate complementary cues of face matching and clothing matching to propose common annotations for face tracks, and consider choices of classifier which can potentially correct errors made in the automatic extraction of training data from the weak textual annotation. Results are presented on episodes of the TV series "Buffy the Vampire Slayer".
Resource Allocation, Scheduling and Feedback Reduction in Multiple Input Multiple Output (MIMO) Orthogonal Frequency-Division Multiplexing (OFDM) Systems
The number of wireless systems, services, and users is constantly increasing, and bandwidth requirements have therefore become higher. One of the most robust modulations is Orthogonal Frequency-Division Multiplexing (OFDM). It has been considered an attractive solution for future broadband wireless communications.
This dissertation investigates bit and power allocation, joint resource allocation, user scheduling, and the limited feedback problem in multi-user OFDM systems. It contributes to improved OFDM systems in the following ways. (1) A low complexity sub-carrier, power, and bit allocation algorithm is proposed. This algorithm has lower computational complexity and results in performance that is comparable to that of the existing algorithms. (2) Variations of the proportional fair scheduling scheme are proposed and analyzed. The proposed scheme improves system throughput and delay time, and achieves higher throughput without sacrificing fairness, which makes it a better scheme in terms of both efficiency and fairness. (3) A DCT feedback compression algorithm based on sorting is proposed. This algorithm uses sorting to increase the correlation between feedback channel quality information of frequency selective channels. The feedback overhead of the system is successfully reduced.
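For orientation, the textbook proportional fair rule that contribution (2) builds on can be sketched as follows: in each slot, serve the user with the largest ratio of instantaneous rate to average throughput, then update the averages with an exponential moving window. This is the standard baseline, not the dissertation's proposed variations; the function name and the `window` and `init_avg` parameters are assumptions.

```python
def proportional_fair_schedule(rates, window=10.0, init_avg=1e-6):
    """Standard proportional fair scheduling over a sequence of slots.

    rates[t][u] is the achievable rate of user u in slot t (hypothetical input).
    Returns the user served in each slot and the final average throughputs.
    """
    num_users = len(rates[0])
    avg = [init_avg] * num_users  # small positive start avoids division by zero
    chosen = []
    for slot_rates in rates:
        # PF metric: instantaneous rate divided by average throughput so far.
        metrics = [slot_rates[u] / avg[u] for u in range(num_users)]
        winner = max(range(num_users), key=lambda u: metrics[u])
        chosen.append(winner)
        for u in range(num_users):
            served = slot_rates[u] if u == winner else 0.0
            avg[u] = (1.0 - 1.0 / window) * avg[u] + served / window
    return chosen, avg

# Toy example: user 0 always has twice the rate of user 1.
toy_rates = [[2.0, 1.0]] * 20
chosen, avg = proportional_fair_schedule(toy_rates)
print(chosen)
```

Even though user 0 always has the higher rate, the metric rises for any user that has not been served recently, so user 1 still receives slots; a pure max-rate scheduler would starve it.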