128 research outputs found
Quantum machine learning: a classical perspective
Recently, increased computational power and data availability, as well as
algorithmic advances, have led machine learning techniques to impressive
results in regression, classification, data-generation and reinforcement
learning tasks. Despite these successes, the proximity to the physical limits
of chip fabrication alongside the increasing size of datasets are motivating a
growing number of researchers to explore the possibility of harnessing the
power of quantum computation to speed-up classical machine learning algorithms.
Here we review the literature in quantum machine learning and discuss
perspectives for a mixed readership of classical machine learning and quantum
computation experts. Particular emphasis will be placed on clarifying the
limitations of quantum algorithms, how they compare with their best classical
counterparts and why quantum resources are expected to provide advantages for
learning problems. Learning in the presence of noise and certain
computationally hard problems in machine learning are identified as promising
directions for the field. Practical questions, like how to upload classical
data into quantum form, will also be addressed.Comment: v3 33 pages; typos corrected and references adde
Identification of SNP interactions using logic regression
Interactions of single nucleotide polymorphisms (SNPs) are assumed to be responsible for complex diseases such as sporadic breast cancer. Important goals of studies concerned with such genetic data are thus to identify combinations of SNPs that lead to a higher risk of developing a disease and to measure the importance of these interactions. There are many approaches based on classification methods such as CART and Random Forests that allow measuring the importance of single variables. But with none of these methods the importance of combinations of variables can be quantified directly. In this paper, we show how logic regression can be employed to identify SNP interactions explanatory for the disease status in a case- control study and propose two measures for quantifying the importance of these interactions for classification. These approaches are then applied, on the one hand, to simulated data sets, and on the other hand, to the SNP data of the GENICA study, a study dedicated to the identification of genetic and gene-environment interactions associated with sporadic breast cancer. --Single Nucleotide Polymorphism,Feature Selection,Variable Importance Measure,GENICA
Approximate In-memory computing on RERAMs
Computing systems have seen tremendous growth over the past few decades in their capabilities, efficiency, and deployment use cases. This growth has been driven by progress in lithography techniques, improvement in synthesis tools, architectures and power management. However, there is a growing disparity between computing power and the demands on modern computing systems. The standard Von-Neuman architecture has separate data storage and data processing locations. Therefore, it suffers from a memory-processor communication bottleneck, which is commonly referred to as the \u27memory wall\u27. The relatively slower progress in memory technology compared with processing units has continued to exacerbate the memory wall problem. As feature sizes in the CMOS logic family reduce further, quantum tunneling effects are becoming more prominent. Simultaneously, chip transistor density is already so high that all transistors cannot be powered up at the same time without violating temperature constraints, a phenomenon characterized as dark-silicon. Coupled with this, there is also an increase in leakage currents with smaller feature sizes, resulting in a breakdown of \u27Dennard\u27s\u27 scaling. All these challenges cannot be met without fundamental changes in current computing paradigms. One viable solution is in-memory computing, where computing and storage are performed alongside each other. A number of emerging memory fabrics such as ReRAMS, STT-RAMs, and PCM RAMs are capable of performing logic in-memory. ReRAMs possess high storage density, have extremely low power consumption and a low cost of fabrication. These advantages are due to the simple nature of its basic constituting elements which allow nano-scale fabrication. We use flow-based computing on ReRAM crossbars for computing that exploits natural sneak paths in those crossbars. Another concurrent development in computing is the maturation of domains that are error resilient while being highly data and power intensive. These include machine learning, pattern recognition, computer vision, image processing, and networking, etc. This shift in the nature of computing workloads has given weight to the idea of approximate computing , in which device efficiency is improved by sacrificing tolerable amounts of accuracy in computation. We present a mathematically rigorous foundation for the synthesis of approximate logic and its mapping to ReRAM crossbars using search based and graphical methods
An Analysis of MCMC Sampling Methods for Estimating Weighted Sums in Winnow
Chawla et al. introduced a way to use the Markov chain Monte Carlo method to estimate weighted sums in multiplicative weight update algorithms when the number of inputs is exponential. But their algorithm still required extensive simulation of the Markov chain in order to get accurate estimates of the weighted sums. We propose an optimized version of Chawla et al.’s algorithm, which produces exactly the same classifications while often using fewer Markov chain simulations. We also apply three other sampling techniques and empirically compare them with Chawla et al.’sMetropolis sampler to determine how effective each is in drawing good samples in the least amount of time, in terms of accuracy of weighted sum estimates and in terms of Winnow’s prediction accuracy
Explainable AI using expressive Boolean formulas
We propose and implement an interpretable machine learning classification
model for Explainable AI (XAI) based on expressive Boolean formulas. Potential
applications include credit scoring and diagnosis of medical conditions. The
Boolean formula defines a rule with tunable complexity (or interpretability),
according to which input data are classified. Such a formula can include any
operator that can be applied to one or more Boolean variables, thus providing
higher expressivity compared to more rigid rule-based and tree-based
approaches. The classifier is trained using native local optimization
techniques, efficiently searching the space of feasible formulas. Shallow rules
can be determined by fast Integer Linear Programming (ILP) or Quadratic
Unconstrained Binary Optimization (QUBO) solvers, potentially powered by
special purpose hardware or quantum devices. We combine the expressivity and
efficiency of the native local optimizer with the fast operation of these
devices by executing non-local moves that optimize over subtrees of the full
Boolean formula. We provide extensive numerical benchmarking results featuring
several baselines on well-known public datasets. Based on the results, we find
that the native local rule classifier is generally competitive with the other
classifiers. The addition of non-local moves achieves similar results with
fewer iterations, and therefore using specialized or quantum hardware could
lead to a speedup by fast proposal of non-local moves.Comment: 28 pages, 16 figures, 4 table
- …