Supervised learning with quantum enhanced feature spaces
Machine learning and quantum computing are two technologies each with the
potential for altering how computation is performed to address previously
untenable problems. Kernel methods for machine learning are ubiquitous for
pattern recognition, with support vector machines (SVMs) being the most
well-known method for classification problems. However, there are limitations
to the successful solution of such problems when the feature space becomes
large, and the kernel functions become computationally expensive to estimate. A
core element of the computational speed-ups afforded by quantum algorithms is the
exploitation of an exponentially large quantum state space through controllable
entanglement and interference. Here, we propose and experimentally implement
two novel methods on a superconducting processor. Both methods represent the
feature space of a classification problem by a quantum state, taking advantage
of the large dimensionality of quantum Hilbert space to obtain an enhanced
solution. One method, the quantum variational classifier, builds on [1,2] and
operates by using a variational quantum circuit to classify a training set
in direct analogy to conventional SVMs. In the second, a quantum kernel
estimator, we estimate the kernel function and optimize the classifier
directly. The two methods present a new class of tools for exploring the
applications of noisy intermediate scale quantum computers [3] to machine
learning.
Comment: Fixed typos, added figures and discussion about quantum error mitigation
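As a rough illustration of the kernel-estimation idea in the abstract above, the sketch below computes a kernel as the squared overlap between two feature states and assembles a Gram matrix that a conventional SVM could consume. The single-qubit feature map `feature_state`, the squared-overlap form of the kernel, and all function names are assumptions made for this example; the abstract does not specify the feature map used on hardware, and here the overlap is computed classically rather than estimated on a quantum processor.

```python
import math

def feature_state(x):
    """Toy feature map (assumption): encode a scalar x as the real
    single-qubit state (cos(x/2), sin(x/2))."""
    return (math.cos(x / 2.0), math.sin(x / 2.0))

def overlap_kernel(x, z):
    """Kernel value taken as the squared inner product of the two
    feature states (assumed form of the kernel)."""
    a = feature_state(x)
    b = feature_state(z)
    inner = a[0] * b[0] + a[1] * b[1]
    return inner ** 2

def gram_matrix(xs):
    """Gram matrix K[i][j] = overlap_kernel(xs[i], xs[j])."""
    return [[overlap_kernel(x, z) for z in xs] for x in xs]
```

On a quantum device the overlap would be estimated from measurement statistics and would therefore carry sampling noise; the classical computation here only shows how the resulting matrix is structured.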
Recognising the Clothing Categories from Free-Configuration Using Gaussian-Process-Based Interactive Perception
In this paper, we propose a Gaussian Process-based interactive perception approach for recognising highly-wrinkled clothes. We have integrated this recognition method within a clothes sorting pipeline for the pre-washing stage of an autonomous laundering process. Our approach differs from reported clothing manipulation approaches by allowing the robot to update its perception confidence via numerous interactions with the garments. The classifiers predominantly reported in clothing perception studies (e.g. SVM, Random Forest) do not provide true classification probabilities, due to their inherent structure. In contrast, probabilistic classifiers (of which the Gaussian Process is a popular example) are able to provide predictive probabilities. In our approach, we employ multi-class Gaussian Process classification, using the Laplace approximation for posterior inference and optimising hyper-parameters via marginal likelihood maximisation. Our experimental results show that our approach is able to recognise unknown garments from highly-occluded and wrinkled configurations and demonstrates a substantial improvement over non-interactive perception approaches.
Liquid State Machine with Dendritically Enhanced Readout for Low-power, Neuromorphic VLSI Implementations
In this paper, we describe a new neuro-inspired, hardware-friendly readout
stage for the liquid state machine (LSM), a popular model for reservoir
computing. Compared to the parallel perceptron architecture trained by the
p-delta algorithm, which is the state of the art in terms of performance of
readout stages, our readout architecture and learning algorithm can attain
better performance with significantly fewer synaptic resources, making it
attractive for VLSI implementation. Inspired by the nonlinear properties of
dendrites in biological neurons, our readout stage incorporates neurons having
multiple dendrites with a lumped nonlinearity. The number of synaptic
connections on each branch is significantly lower than the total number of
connections from the liquid neurons and the learning algorithm tries to find
the best 'combination' of input connections on each branch to reduce the error.
Hence, the learning involves network rewiring (NRW) of the readout network
similar to structural plasticity observed in its biological counterparts. We
show that compared to a single perceptron using analog weights, this
architecture for the readout can attain, even by using the same number of
binary valued synapses, up to 3.3 times less error for a two-class spike train
classification problem and 2.4 times less error for an input rate approximation
task. Even with 60 times larger synapses, a group of 60 parallel perceptrons
cannot attain the performance of the proposed dendritically enhanced readout.
An additional advantage of this method for hardware implementations is that the
'choice' of connectivity can be easily implemented exploiting address event
representation (AER) protocols commonly used in current neuromorphic systems
where the connection matrix is stored in memory. Also, due to the use of binary
synapses, our proposed method is more robust against statistical variations.
Comment: 14 pages, 19 figures, Journa
Machine Learning and Integrative Analysis of Biomedical Big Data.
Recent developments in high-throughput technologies have accelerated the accumulation of massive amounts of omics data from multiple sources: genome, epigenome, transcriptome, proteome, metabolome, etc. Traditionally, data from each source (e.g., genome) are analyzed in isolation using statistical and machine learning (ML) methods. Integrative analysis of multi-omics and clinical data is key to new biomedical discoveries and advancements in precision medicine. However, data integration poses new computational challenges and exacerbates those associated with single-omics studies. Specialized computational approaches are required to effectively and efficiently perform integrative analysis of biomedical data acquired from diverse modalities. In this review, we discuss state-of-the-art ML-based approaches for tackling five specific computational challenges associated with integrative analysis: curse of dimensionality, data heterogeneity, missing data, class imbalance and scalability issues.
Training Support Vector Machines Using Frank-Wolfe Optimization Methods
Training a Support Vector Machine (SVM) requires the solution of a quadratic
programming problem (QP) whose computational complexity becomes prohibitively
expensive for large scale datasets. Traditional optimization methods cannot be
directly applied in these cases, mainly due to memory restrictions.
By adopting a slightly different objective function and under mild conditions
on the kernel used within the model, efficient algorithms to train SVMs have
been devised under the name of Core Vector Machines (CVMs). This framework
exploits the equivalence of the resulting learning problem with the task of
solving a Minimal Enclosing Ball (MEB) problem in a feature space, where data
are implicitly embedded by a kernel function.
In this paper, we improve on the CVM approach by proposing two novel methods
to build SVMs based on the Frank-Wolfe algorithm, recently revisited as a fast
method to approximate the solution of a MEB problem. In contrast to CVMs, our
algorithms do not require computing the solutions of a sequence of
increasingly complex QPs and are defined by using only analytic optimization
steps. Experiments on a large collection of datasets show that our methods
scale better than CVMs in most cases, sometimes at the price of a slightly
lower accuracy. Like CVMs, the proposed methods can be easily extended to machine
learning problems other than binary classification. Unlike CVMs, however, they
also yield effective classifiers with kernels which do not satisfy the condition
required by CVMs, and can thus be used for a wider set of problems.
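To make the "only analytic optimization steps" remark in the abstract above concrete, here is a minimal sketch of the simplest Frank-Wolfe-style iteration for the MEB problem (the Badoiu-Clarkson update): at each step the centre moves a fixed fraction of the way toward the current farthest point, with no QP solved. It is not the paper's algorithm; it works directly on input-space points (in effect a linear kernel), and the function name, step-size rule `1/(k+1)`, and iteration count are assumptions made for illustration.

```python
import math

def meb_frank_wolfe(points, iterations=200):
    """Approximate the minimal enclosing ball of `points`.

    Each iteration takes an analytic step toward the farthest point
    with step size 1/(k+1); no quadratic program is solved.
    """
    center = list(points[0])  # start from an arbitrary data point
    for k in range(1, iterations + 1):
        # Farthest point from the current centre (the Frank-Wolfe vertex).
        farthest = max(points, key=lambda p: math.dist(p, center))
        step = 1.0 / (k + 1)
        center = [c + step * (f - c) for c, f in zip(center, farthest)]
    # Radius that is guaranteed to enclose every point from this centre.
    radius = max(math.dist(p, center) for p in points)
    return center, radius
```

For example, calling `meb_frank_wolfe` on the four corners of a square returns a centre near the square's midpoint. A kernelized variant would track the centre as a convex combination of training points and evaluate distances through the kernel instead of `math.dist`.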
Kernel-Based Just-In-Time Learning for Passing Expectation Propagation Messages
We propose an efficient nonparametric strategy for learning a message
operator in expectation propagation (EP), which takes as input the set of
incoming messages to a factor node, and produces an outgoing message as output.
This learned operator replaces the multivariate integral required in classical
EP, which may not have an analytic expression. We use kernel-based regression,
which is trained on a set of probability distributions representing the
incoming messages, and the associated outgoing messages. The kernel approach
has two main advantages: first, it is fast, as it is implemented using a novel
two-layer random feature representation of the input message distributions;
second, it has principled uncertainty estimates, and can be cheaply updated
online, meaning it can request and incorporate new training data when it
encounters inputs on which it is uncertain. In experiments, our approach is
able to solve learning problems where a single message operator is required for
multiple, substantially different data sets (logistic regression for a variety
of classification problems), where it is essential to accurately assess
uncertainty and to efficiently and robustly update the message operator.
Comment: accepted to UAI 2015. Correct typos. Add more content to the
appendix. Main results unchanged