2,911 research outputs found

    Properties of spatial coupling in compressed sensing

    In this paper we address a series of open questions about the construction of spatially coupled measurement matrices in compressed sensing. Hardware implementations force one to depart from the limiting regime of parameters in which the proofs of so-called threshold saturation hold. We quantitatively investigate the behavior under finite coupling range, the dependence on the shape of the coupling interaction, and the optimization of the so-called seed to minimize the distance from optimality. Our analysis explains some of the properties observed empirically in previous works and provides new insight into spatially coupled compressed sensing. Comment: 5 pages, 6 figures

    An empirical evaluation of imbalanced data strategies from a practitioner's point of view

    This research tested the following well-known strategies for dealing with binary imbalanced data on 82 different real-life data sets (sampled to imbalance rates of 5%, 3%, 1%, and 0.1%): class weight, SMOTE, underbagging, and a baseline (just the base classifier). As base classifiers we used SVM with an RBF kernel, random forests, and gradient boosting machines, and we measured the quality of the resulting classifier using 6 different metrics (area under the curve (AUC), accuracy, F-measure, G-mean, Matthews correlation coefficient (MCC), and balanced accuracy). The best strategy strongly depends on the metric used to measure the quality of the classifier. For AUC and accuracy, class weight and the baseline perform better; for F-measure and MCC, SMOTE performs better; and for G-mean and balanced accuracy, underbagging
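    The dependence on the metric can be seen in a small worked example. The sketch below is not taken from the paper: it computes five of the six listed metrics from confusion-matrix counts, assuming the common definition of G-mean as the geometric mean of sensitivity and specificity. The function name `imbalance_metrics` and the counts (1,000 samples at a 1% imbalance rate, with a classifier that always predicts the majority class) are hypothetical.

```python
import math


def imbalance_metrics(tp, fp, tn, fn):
    """Compute threshold-based metrics from binary confusion-matrix counts.

    Ratios with an empty denominator are reported as 0.0 (a convention
    chosen for this sketch, not one stated in the abstract).
    """
    sensitivity = tp / (tp + fn) if tp + fn else 0.0  # true positive rate
    specificity = tn / (tn + fp) if tn + fp else 0.0  # true negative rate
    precision = tp / (tp + fp) if tp + fp else 0.0

    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f_measure = (
        2 * precision * sensitivity / (precision + sensitivity)
        if precision + sensitivity
        else 0.0
    )
    g_mean = math.sqrt(sensitivity * specificity)
    balanced_accuracy = (sensitivity + specificity) / 2
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    return {
        "accuracy": accuracy,
        "f_measure": f_measure,
        "g_mean": g_mean,
        "balanced_accuracy": balanced_accuracy,
        "mcc": mcc,
    }


# Hypothetical counts: 10 positives among 1,000 samples, all predicted negative.
always_negative = imbalance_metrics(tp=0, fp=0, tn=990, fn=10)
```

    With these counts, accuracy is 0.99 while balanced accuracy is 0.5 and F-measure, G-mean, and MCC are all 0, which illustrates why different metrics can favor different strategies.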

    Impact of Biases in Big Data

    The underlying paradigm of big data-driven machine learning reflects the desire to derive better conclusions simply by analyzing more data, without needing to look at theory and models. Is simply having more data always helpful? In 1936, The Literary Digest collected 2.3M filled-in questionnaires to predict the outcome of that year's US presidential election. The outcome of this big data prediction proved to be entirely wrong, whereas George Gallup needed only 3K handpicked people to make an accurate prediction. In general, biases occur in machine learning whenever the distributions of the training set and the test set differ. In this work, we review different sorts of biases in (big) data sets in machine learning. We provide definitions and discussions of the most commonly appearing biases in machine learning: class imbalance and covariate shift. We also show how these biases can be quantified and corrected. This work is an introductory text for both researchers and practitioners to become more aware of this topic and thus to derive more reliable models for their learning problems.
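    The abstract does not say which correction methods the paper covers. As one illustration, a standard technique for a mismatch between sample and target distributions is importance weighting, in which each sample is weighted by the ratio of its target proportion to its sample proportion. The sketch below assumes a single discrete covariate (a group label) and known target proportions; the function names and all numbers are hypothetical.

```python
from collections import Counter


def importance_weights(sample_groups, target_props):
    """Weight each sample by target proportion / observed sample proportion."""
    counts = Counter(sample_groups)
    n = len(sample_groups)
    return [target_props[g] / (counts[g] / n) for g in sample_groups]


def weighted_mean(values, weights):
    """Self-normalized weighted average."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Hypothetical biased sample: group "a" is over-represented (80% vs. 50%).
groups = ["a"] * 8 + ["b"] * 2
answers = [1] * 8 + [0] * 2          # group "a" answers 1, group "b" answers 0
target = {"a": 0.5, "b": 0.5}        # assumed true population proportions

naive_estimate = sum(answers) / len(answers)
corrected_estimate = weighted_mean(answers, importance_weights(groups, target))
```

    Here the naive estimate is 0.8, while the reweighted estimate recovers 0.5, the value expected under the assumed target proportions. The correction works only if every group in the target population appears in the sample.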

    Towards higher precision and operational use of optical homodyne tomograms

    We present the results of an operational use of experimentally measured optical tomograms to determine state characteristics (purity) while avoiding any reconstruction of quasiprobabilities. We also develop a natural way to estimate the errors (both statistical and systematic) by analyzing the experimental data themselves. The precision of the experiment can be increased by postselecting the data with minimal (systematic) errors. We demonstrate these techniques by considering coherent and photon-added coherent states measured via time-domain improved homodyne detection. The operational use and precision of the data allowed us to check, for the first time, purity-dependent uncertainty relations and uncertainty relations for Shannon and Rényi entropies. Comment: 11 pages, 6 figures, 1 table, some results are extende

    Using RBF nets in rubber industry process control

    This paper describes the use of a radial basis function (RBF) neural network to approximate the process parameters for the extrusion of a rubber profile used in tyre production. After introducing the problem, we describe the RBF net algorithm and the modeling of the industrial problem. The algorithm shows good results even with only a few training samples. It turns out that the "curse of dimensions" plays an important role in the model. The paper concludes with a discussion of possible systematic error influences and improvements.
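    The abstract gives no details of the RBF net algorithm used. For orientation only, the sketch below shows a generic exact-interpolation RBF network with Gaussian basis functions, one center per training sample, and output weights obtained by solving a linear system. The width parameter, the function names, and the toy one-dimensional data are assumptions and have no connection to the rubber extrusion process.

```python
import math


def gaussian(r, width):
    """Gaussian radial basis function of the distance r."""
    return math.exp(-((r / width) ** 2))


def distance(a, b):
    """Euclidean distance between two points given as tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def solve(matrix, rhs):
    """Solve a square linear system by Gaussian elimination with pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        s = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - s) / aug[row][row]
    return x


def fit_rbf(inputs, targets, width=1.0):
    """Use every training input as a center and solve for the output weights."""
    design = [[gaussian(distance(x, c), width) for c in inputs] for x in inputs]
    return solve(design, targets)


def predict_rbf(x, centers, weights, width=1.0):
    """Network output: weighted sum of basis functions centered on `centers`."""
    return sum(w * gaussian(distance(x, c), width) for w, c in zip(weights, centers))


# Toy data (hypothetical): four samples of y = x^2.
train_x = [(0.0,), (1.0,), (2.0,), (3.0,)]
train_y = [0.0, 1.0, 4.0, 9.0]
rbf_weights = fit_rbf(train_x, train_y)
fitted = [predict_rbf(x, train_x, rbf_weights) for x in train_x]
```

    With one center per sample the network reproduces the training targets, which is why such a model can be fit from only a few samples. The number of samples needed to cover the input space grows quickly with the input dimension, which is the usual meaning of the curse of dimensionality.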