Dimensionality Reduction using Similarity-induced Embeddings
The vast majority of Dimensionality Reduction (DR) techniques rely on
second-order statistics to define their optimization objective. Even though
this provides adequate results in most cases, it comes with several
shortcomings. The methods require carefully designed regularizers and they are
usually prone to outliers. In this work, a new DR framework, that can directly
model the target distribution using the notion of similarity instead of
distance, is introduced. The proposed framework, called Similarity Embedding
Framework, can overcome the aforementioned limitations and provides a
conceptually simpler way to express optimization targets similar to existing DR
techniques. Deriving a new DR technique using the Similarity Embedding
Framework becomes simply a matter of choosing an appropriate target similarity
matrix. A variety of classical tasks, such as performing supervised
dimensionality reduction and providing out-of-sample extensions, as well as
novel techniques, such as providing fast linear embeddings for complex
techniques, are demonstrated in this paper using the proposed framework. Six
datasets from a diverse range of domains are used to evaluate the proposed
method and it is demonstrated that it can outperform many existing DR
techniques.
Comment: Accepted in IEEE Transactions on Neural Networks and Learning System
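As a rough illustration of the idea that a DR technique is defined by its target similarity matrix, the sketch below scores an embedding by how closely its pairwise similarities match a target. The Gaussian similarity, the mean squared mismatch, and all function names are assumptions made for this example, not the formulation used in the paper.

```python
import math

def gaussian_similarity(a, b, sigma=1.0):
    # Similarity in (0, 1]: close to 1 for nearby points, close to 0 for distant ones.
    sq_dist = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-sq_dist / sigma)

def supervised_target(labels):
    # Hypothetical supervised target: same-class pairs should be fully similar.
    n = len(labels)
    return [[1.0 if labels[i] == labels[j] else 0.0 for j in range(n)]
            for i in range(n)]

def similarity_loss(embedding, target, sigma=1.0):
    # Mean squared mismatch between embedding similarities and the target matrix
    # (an assumed objective, chosen for simplicity).
    n = len(embedding)
    total = 0.0
    for i in range(n):
        for j in range(n):
            p = gaussian_similarity(embedding[i], embedding[j], sigma)
            total += (p - target[i][j]) ** 2
    return total / (n * n)

labels = [0, 0, 1, 1]
target = supervised_target(labels)

# Two candidate 2D embeddings of the same four samples.
separated = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]]
mixed = [[0.0, 0.0], [5.0, 5.0], [0.1, 0.0], [5.1, 5.0]]

loss_separated = similarity_loss(separated, target)
loss_mixed = similarity_loss(mixed, target)
print(loss_separated < loss_mixed)
```

Swapping `supervised_target` for a different target matrix changes the behaviour of the embedding without touching the loss, which is the property the abstract emphasizes.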
Learning Bag-of-Features Pooling for Deep Convolutional Neural Networks
Convolutional Neural Networks (CNNs) are well established models capable of
achieving state-of-the-art classification accuracy for various computer vision
tasks. However, they are becoming increasingly larger, using millions of
parameters, while they are restricted to handling images of fixed size. In this
paper, a quantization-based approach, inspired by the well-known
Bag-of-Features model, is proposed to overcome these limitations. The proposed
approach, called Convolutional BoF (CBoF), uses RBF neurons to quantize the
information extracted from the convolutional layers and it is able to natively
classify images of various sizes as well as to significantly reduce the number
of parameters in the network. In contrast to other global pooling operators and
CNN compression techniques, the proposed method utilizes a trainable pooling
layer that is end-to-end differentiable, allowing the network to be trained
using regular back-propagation and to achieve greater distribution shift
invariance than competitive methods. The ability of the proposed method to
reduce the parameters of the network and increase the classification accuracy
over other state-of-the-art techniques is demonstrated using three image
datasets.
Comment: Accepted at ICCV 201
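The quantization step can be pictured with a small sketch: each feature vector is softly assigned to a set of codewords by RBF responses, and the assignments are averaged into a histogram whose length depends only on the number of codewords. The squared-distance kernel, the fixed `sigma`, and the function names are assumptions for illustration; in the proposed method the pooling layer is trained end to end, which this sketch does not attempt.

```python
import math

def rbf_memberships(feature, codewords, sigma=1.0):
    # Response of each RBF neuron (codeword) to one feature vector,
    # normalized so the memberships sum to 1.
    responses = [
        math.exp(-sum((f - c) ** 2 for f, c in zip(feature, codeword)) / sigma)
        for codeword in codewords
    ]
    total = sum(responses)
    return [r / total for r in responses]

def bof_pool(features, codewords, sigma=1.0):
    # Average the soft assignments over all feature vectors, giving a
    # fixed-length histogram regardless of how many vectors were extracted.
    histogram = [0.0] * len(codewords)
    for feature in features:
        for k, m in enumerate(rbf_memberships(feature, codewords, sigma)):
            histogram[k] += m
    return [h / len(features) for h in histogram]

codewords = [[0.0, 0.0], [1.0, 1.0]]  # hypothetical 2-word codebook

# Feature maps of two differently sized images, flattened to lists of vectors.
small_image = [[0.1, 0.0], [0.9, 1.0], [0.0, 0.2], [1.1, 0.9]]
large_image = small_image + [[0.2, 0.1], [0.8, 0.8], [0.0, 0.0],
                             [1.0, 1.2], [0.5, 0.5]]

hist_small = bof_pool(small_image, codewords)
hist_large = bof_pool(large_image, codewords)
print(len(hist_small), len(hist_large))
```

Because both histograms have the same length, the layers that follow the pooling step can be shared across image sizes.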
Deep Supervised Hashing leveraging Quadratic Spherical Mutual Information for Content-based Image Retrieval
Several deep supervised hashing techniques have been proposed to allow for
efficiently querying large image databases. However, deep supervised image
hashing techniques are developed, to a great extent, heuristically, often
leading to suboptimal results. Contrary to this, we propose an efficient deep
supervised hashing algorithm that optimizes the learned codes using an
information-theoretic measure, the Quadratic Mutual Information (QMI). The
proposed method is adapted to the needs of large-scale hashing and information
retrieval leading to a novel information-theoretic measure, the Quadratic
Spherical Mutual Information (QSMI). Apart from demonstrating the effectiveness
of the proposed method under different scenarios and outperforming existing
state-of-the-art image hashing techniques, this paper provides a structured way
to model the process of information retrieval and develop novel methods adapted
to the needs of each application.
Decoding Generic Visual Representations From Human Brain Activity using Machine Learning
Among the most impressive recent applications of neural decoding is the
visual representation decoding, where the category of an object that a subject
either sees or imagines is inferred by observing his/her brain activity. Even
though there is an increasing interest in the aforementioned visual
representation decoding task, there is no extensive study of the effect of
using different machine learning models on the decoding accuracy. In this paper
we provide an extensive evaluation of several machine learning models, along
with different similarity metrics, for the aforementioned task, drawing many
interesting conclusions. That way, this paper a) paves the way for developing
more advanced and accurate methods and b) provides an extensive and easily
reproducible baseline for the aforementioned decoding task.
Comment: Accepted at 1st Workshop on Brain-Driven Computer Vision - ECCV 201
deepsing: Generating Sentiment-aware Visual Stories using Cross-modal Music Translation
In this paper we propose a deep learning method for performing
attribute-based music-to-image translation. The proposed method is applied for
synthesizing visual stories according to the sentiment expressed by songs. The
generated images aim to induce the same feelings in the viewers as the
original song does, reinforcing the primary aim of music, i.e., communicating
feelings. The process of music-to-image translation poses unique challenges,
mainly due to the unstable mapping between the different modalities involved in
this process. In this paper, we employ a trainable cross-modal translation
method to overcome this limitation, leading to the first, to the best of our
knowledge, deep learning method for generating sentiment-aware visual stories.
Various aspects of the proposed method are extensively evaluated and discussed
using different songs.
Learning Deep Representations with Probabilistic Knowledge Transfer
Knowledge Transfer (KT) techniques tackle the problem of transferring the
knowledge from a large and complex neural network into a smaller and faster
one. However, existing KT methods are tailored towards classification tasks and
they cannot be used efficiently for other representation learning tasks. In
this paper a novel knowledge transfer technique, that is capable of training a
student model that maintains the same amount of mutual information between the
learned representation and a set of (possibly unknown) labels as the teacher
model, is proposed. Apart from outperforming existing KT techniques, the
proposed method allows for overcoming several limitations of existing methods
providing new insight into KT as well as novel KT applications, ranging from
knowledge transfer from handcrafted feature extractors to cross-modal KT from
the textual modality into the representation extracted from the visual modality
of the data.
Comment: Accepted at ECCV201
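One way to picture matching representations rather than class outputs is to compare how samples relate to each other in the teacher and student feature spaces. The sketch below turns pairwise cosine similarities into per-sample probability distributions and measures their divergence. The cosine kernel, the exclusion of self-pairs, and the function names are assumptions for this example and should not be read as the exact loss of the paper.

```python
import math

def cosine_probabilities(features, eps=1e-7):
    # Row i holds the probability of picking sample j as a neighbour of i,
    # derived from cosine similarities shifted into [0, 1].
    normed = []
    for f in features:
        norm = math.sqrt(sum(x * x for x in f)) + eps
        normed.append([x / norm for x in f])
    rows = []
    for i, a in enumerate(normed):
        sims = [
            (sum(x * y for x, y in zip(a, b)) + 1.0) / 2.0 if i != j else 0.0
            for j, b in enumerate(normed)
        ]
        total = sum(sims) + eps
        rows.append([s / total for s in sims])
    return rows

def transfer_loss(teacher_features, student_features, eps=1e-7):
    # KL divergence from the teacher's neighbour distributions to the student's,
    # averaged over samples.
    t_rows = cosine_probabilities(teacher_features, eps)
    s_rows = cosine_probabilities(student_features, eps)
    total = 0.0
    for t_row, s_row in zip(t_rows, s_rows):
        for t, s in zip(t_row, s_row):
            total += t * math.log((t + eps) / (s + eps))
    return total / len(t_rows)

# The teacher uses 3-D features and the student 2-D features; only the
# pairwise structure has to agree, not the dimensionality.
teacher = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.0, 1.0, 0.0]]
student_good = [[2.0, 0.0], [1.8, 0.2], [0.0, 2.0]]
student_bad = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]

loss_good = transfer_loss(teacher, student_good)
loss_bad = transfer_loss(teacher, student_bad)
print(loss_good < loss_bad)
```

Since the loss depends only on relations between samples, no labels are required, and the teacher's features could in principle come from a handcrafted extractor or another modality.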
Heterogeneous Knowledge Distillation using Information Flow Modeling
Knowledge Distillation (KD) methods are capable of transferring the knowledge
encoded in a large and complex teacher into a smaller and faster student. Early
methods were usually limited to transferring the knowledge only between the
last layers of the networks, while later approaches were capable of performing
multi-layer KD, further increasing the accuracy of the student. However,
despite their improved performance, these methods still suffer from several
limitations that restrict both their efficiency and flexibility. First,
existing KD methods typically ignore that neural networks go through
different learning phases during the training process, which often requires
different types of supervision for each one. Furthermore, existing multi-layer
KD methods are usually unable to effectively handle networks with significantly
different architectures (heterogeneous KD). In this paper we propose a novel KD
method that works by modeling the information flow through the various layers
of the teacher model and then training a student model to mimic this information
flow. The proposed method is capable of overcoming the aforementioned
limitations by using an appropriate supervision scheme during the different
phases of the training process, as well as by designing and training an
appropriate auxiliary teacher model that acts as a proxy model capable of
"explaining" the way the teacher works to the student. The effectiveness of the
proposed method is demonstrated using four image datasets and several different
evaluation setups.
Comment: Accepted at CVPR 202
Interactive dimensionality reduction using similarity projections
Recent advances in machine learning allow us to analyze and describe the
content of high-dimensional data like text, audio, images or other signals. In
order to visualize that data in 2D or 3D, usually Dimensionality Reduction (DR)
techniques are employed. Most of these techniques, e.g., PCA or t-SNE, produce
static projections without taking into account corrections from humans or other
data exploration scenarios. In this work, we propose the interactive Similarity
Projection (iSP), a novel interactive DR framework based on similarity
embeddings, where we form a differentiable objective based on the user
interactions and perform learning using gradient descent, with an end-to-end
trainable architecture. Two interaction scenarios are evaluated. First, a
common methodology in multidimensional projection is to project a subset of
data, arrange them in classes or clusters, and project the rest of the unseen dataset
based on that manipulation, in a kind of semi-supervised interpolation. We
report results that outperform competitive baselines in a wide range of metrics
and datasets. Second, we explore the scenario of manipulating some classes,
while enriching the optimization with high-dimensional neighbor information.
Apart from improving classification precision and clustering on images and text
documents, the new emerging structure of the projection unveils semantic
manifolds. For example, on the Head Pose dataset, by just dragging the faces
looking far left to the left and those looking far right to the right, all
faces are re-arranged on a continuum even on the vertical axis (face up and
down). This end-to-end framework can be used for fast, visual semi-supervised
learning, manifold exploration, interactive domain adaptation of neural
embeddings and transfer learning.
Comment: Accepted at Knowledge-Based System
Style Decomposition for Improved Neural Style Transfer
Universal Neural Style Transfer (NST) methods are capable of performing style
transfer of arbitrary styles in a style-agnostic manner via feature transforms
in (almost) real-time. Even though their unimodal parametric style modeling
approach has been proven adequate to transfer a single style from relatively
simple images, they are usually not capable of effectively handling more
complex styles, producing significant artifacts, as well as reducing the
quality of the synthesized textures in the stylized image. To overcome these
limitations, in this paper we propose a novel universal NST approach that
separately models each sub-style that exists in a given style image (or a
collection of style images). This allows for better modeling the subtle style
differences within the same style image and then using the most appropriate
sub-style (or mixtures of different sub-styles) to stylize the content image.
The ability of the proposed approach to a) perform a wide range of different
stylizations using the sub-styles that exist in one style image, while giving
the user the ability to appropriately mix the different sub-styles, b)
automatically match the most appropriate sub-style to different semantic
regions of the content image, improving existing state-of-the-art universal NST
approaches, and c) detect and transfer the sub-styles from collections
of images is demonstrated through extensive experiments.
Using Deep Learning for price prediction by exploiting stationary limit order book features
The recent surge in Deep Learning (DL) research of the past decade has
successfully provided solutions to many difficult problems. The field of
quantitative analysis has been slowly adapting the new methods to its problems,
but due to problems such as the non-stationary nature of financial data,
significant challenges must be overcome before DL is fully utilized. In this
work a new method to construct stationary features, that allows DL models to be
applied effectively, is proposed. These features are thoroughly tested on the
task of predicting mid price movements of the Limit Order Book. Several DL
models are evaluated, such as recurrent Long Short Term Memory (LSTM) networks
and Convolutional Neural Networks (CNN). Finally, a novel model that combines
the ability of CNNs to extract useful features and the ability of LSTMs to
analyze time series is proposed and evaluated. The combined model is able to
outperform the individual LSTM and CNN models in the prediction horizons that
are tested.
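For readers unfamiliar with the prediction target, the sketch below computes the mid price as the average of the best bid and best ask and labels its movement over a horizon. The relative-change threshold, the horizon of one step, the label encoding, and the sample quotes are assumptions made for illustration; the abstract does not specify how the stationary features or the labels are constructed.

```python
def mid_prices(best_bids, best_asks):
    # Mid price: average of the best bid and the best ask at each time step.
    return [(bid + ask) / 2.0 for bid, ask in zip(best_bids, best_asks)]

def movement_labels(mids, horizon=1, threshold=0.0005):
    # Label each step by the relative mid-price change over the horizon:
    # 1 = up, -1 = down, 0 = stationary (assumed encoding and threshold).
    labels = []
    for t in range(len(mids) - horizon):
        change = (mids[t + horizon] - mids[t]) / mids[t]
        if change > threshold:
            labels.append(1)
        elif change < -threshold:
            labels.append(-1)
        else:
            labels.append(0)
    return labels

# Hypothetical best bid/ask quotes for four consecutive time steps.
bids = [99.9, 100.0, 100.2, 100.2]
asks = [100.1, 100.2, 100.4, 100.4]

mids = mid_prices(bids, asks)
labels = movement_labels(mids)
print(labels)
```

Working with relative changes rather than raw prices is one simple way to reduce the dependence on the price level, which is the kind of non-stationarity the abstract refers to.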