Properties of spatial coupling in compressed sensing
In this paper we address a series of open questions about the construction of
spatially coupled measurement matrices in compressed sensing. For hardware
implementations one is forced to depart from the limiting regime of parameters
in which the proofs of the so-called threshold saturation work. We investigate
quantitatively the behavior under finite coupling range, the dependence on the
shape of the coupling interaction, and optimization of the so-called seed to
minimize distance from optimality. Our analysis explains some of the properties
observed empirically in previous works and provides new insight into spatially
coupled compressed sensing.
Comment: 5 pages, 6 figures
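The block-banded structure of a spatially coupled measurement matrix can be sketched as follows. The block size, coupling window, and variance normalization are illustrative placeholders, not the paper's actual construction (which also involves a seed and a tuned coupling shape):

```python
import numpy as np

def coupled_matrix(L, n, w, rng=None):
    """Sketch of a spatially coupled Gaussian measurement matrix.

    The coordinates are split into L blocks of size n; block (r, c)
    holds i.i.d. Gaussian entries only when |r - c| <= w (the coupling
    window), so the matrix is block-banded. All parameter choices here
    are illustrative, not taken from the paper.
    """
    rng = np.random.default_rng(rng)
    A = np.zeros((L * n, L * n))
    scale = 1.0 / np.sqrt(n * (2 * w + 1))  # keep row norms O(1)
    for r in range(L):
        for c in range(L):
            if abs(r - c) <= w:
                A[r*n:(r+1)*n, c*n:(c+1)*n] = scale * rng.standard_normal((n, n))
    return A

A = coupled_matrix(L=6, n=16, w=1, rng=0)
near_block_nonzero = bool(A[:16, 16:32].any())   # |r-c| = 1 <= w: coupled
far_block_is_zero = not A[:16, 32:48].any()      # |r-c| = 2 >  w: zero block
```

A finite coupling range w means information from the seed propagates only through adjacent blocks, which is exactly the regime whose departure from the asymptotic proofs the paper quantifies.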
An empirical evaluation of imbalanced data strategies from a practitioner's point of view
This research tested the following well-known strategies for dealing with binary
imbalanced data on 82 different real life data sets (sampled to imbalance rates
of 5%, 3%, 1%, and 0.1%): class weight, SMOTE, Underbagging, and a baseline
(just the base classifier). As base classifiers we used SVM with RBF kernel,
random forests, and gradient boosting machines and we measured the quality of
the resulting classifier using 6 different metrics (Area under the curve,
Accuracy, F-measure, G-mean, Matthews correlation coefficient, and Balanced
accuracy). The best strategy strongly depends on the metric used to measure the
quality of the classifier: for AUC and accuracy, class weight and the baseline
perform better; for F-measure and MCC, SMOTE performs better; and for G-mean
and balanced accuracy, underbagging performs better.
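The threshold-sensitive metrics named above can all be computed directly from a binary confusion matrix. A minimal sketch (the counts are made up for illustration):

```python
import math

def imbalance_metrics(tp, fp, fn, tn):
    """F-measure, G-mean, balanced accuracy, and Matthews correlation
    coefficient (MCC) from a binary confusion matrix, with the minority
    class as positive. Zero denominators are mapped to 0.0."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0          # TPR / sensitivity
    specificity = tn / (tn + fp) if tn + fp else 0.0     # TNR
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    gmean = math.sqrt(recall * specificity)
    bal_acc = (recall + specificity) / 2
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return f1, gmean, bal_acc, mcc

# Illustrative counts for a 10%-positive data set of 100 examples.
f1, gmean, bal_acc, mcc = imbalance_metrics(tp=8, fp=4, fn=2, tn=86)
```

Note how F-measure and MCC depend on precision (and hence on false positives on the majority class), whereas G-mean and balanced accuracy depend only on the two per-class recalls; this asymmetry is one plausible reason the best strategy differs by metric.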
Impact of Biases in Big Data
The underlying paradigm of big data-driven machine learning reflects the
desire of deriving better conclusions from simply analyzing more data, without
the necessity of looking at theory and models. Is simply having more data
always helpful? In 1936, The Literary Digest collected 2.3M completed
questionnaires to predict the outcome of that year's US presidential election.
The outcome of this big data prediction proved to be entirely wrong, whereas
George Gallup only needed 3K handpicked people to make an accurate prediction.
Generally, biases occur in machine learning whenever the distributions of
training set and test set are different. In this work, we provide a review of
different sorts of biases in (big) data sets in machine learning. We provide
definitions and discussions of the most commonly appearing biases in machine
learning: class imbalance and covariate shift. We also show how these biases
can be quantified and corrected. This work is an introductory text for both
researchers and practitioners to become more aware of this topic and thus to
derive more reliable models for their learning problems.
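One standard correction for covariate shift is importance weighting: reweight each training example by the density ratio p_test(x) / p_train(x). A toy sketch in one dimension, where both densities are assumed Gaussian so the ratio has a closed form (the distributions are illustrative, not from the text):

```python
import math
import random

def importance_weights(xs, mu_train=0.0, mu_test=0.5, sigma=1.0):
    """Density-ratio weights w(x) = p_test(x) / p_train(x) for two
    Gaussians with equal variance. The Gaussian assumption is an
    illustrative simplification; in practice the ratio must be
    estimated from data."""
    def logratio(x):
        return ((x - mu_train) ** 2 - (x - mu_test) ** 2) / (2 * sigma ** 2)
    return [math.exp(logratio(x)) for x in xs]

random.seed(0)
xs = [random.gauss(0.0, 1.0) for _ in range(50_000)]   # draws from p_train
w = importance_weights(xs)

# Self-normalized weighted mean under p_train estimates E_test[x] = 0.5.
est = sum(wi * xi for wi, xi in zip(w, xs)) / sum(w)
```

The same reweighting applied inside a loss function turns training under p_train into (approximately) training under p_test, which is the sense in which covariate shift can be "corrected".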
Towards higher precision and operational use of optical homodyne tomograms
We present the results of an operational use of experimentally measured
optical tomograms to determine state characteristics (purity) avoiding any
reconstruction of quasiprobabilities. We also develop a natural way to
estimate the errors (including both statistical and systematic ones) by an
analysis of the experimental data themselves. Precision of the experiment can
be increased by postselecting the data with minimal (systematic) errors. We
demonstrate those techniques by considering coherent and photon-added coherent
states measured via the time-domain improved homodyne detection. The
operational use and precision of the data allowed us to check for the first
time purity-dependent uncertainty relations and uncertainty relations for
Shannon and R\'{e}nyi entropies.
Comment: 11 pages, 6 figures, 1 table, some results are extended
Using RBF nets in rubber industry process control
This paper describes the use of a radial basis function (RBF) neural network to approximate the process parameters for the extrusion of a rubber profile used in tyre production. After introducing the problem, we describe the RBF net algorithm and the modeling of the industrial problem. The algorithm shows good results even when only a few training samples are used. It turns out that the "curse of dimensionality" plays an important role in the model. The paper concludes with a discussion of possible systematic error influences and improvements.
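A minimal sketch of the general RBF-network technique (Gaussian basis functions with fixed centers and width, output weights fitted by linear least squares); the target function, centers, and width below are illustrative and not the paper's actual process model or training algorithm:

```python
import numpy as np

def fit_rbf(x, y, centers, width):
    """Fit the output weights of a 1-D Gaussian RBF network by linear
    least squares, with centers and width held fixed."""
    Phi = np.exp(-((x[:, None] - centers[None, :]) ** 2) / (2 * width ** 2))
    w, *_ = np.linalg.lstsq(Phi, y, rcond=None)
    return w

def predict_rbf(x, centers, width, w):
    Phi = np.exp(-((x[:, None] - centers[None, :]) ** 2) / (2 * width ** 2))
    return Phi @ w

# Few training samples, echoing the paper's small-sample setting.
x = np.linspace(0.0, 1.0, 15)
y = np.sin(2 * np.pi * x)                 # illustrative stand-in target
centers = np.linspace(0.0, 1.0, 8)
w = fit_rbf(x, y, centers, width=0.15)
err = float(np.max(np.abs(predict_rbf(x, centers, width=0.15, w=w) - y)))
```

Because only the output weights are trained, fitting reduces to a single linear least-squares solve, which is what makes RBF nets attractive when training data are scarce; the number of centers needed, however, grows quickly with input dimension, which is one face of the "curse of dimensionality" mentioned above.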
