82 research outputs found

    FADO: A Deterministic Detection/Learning Algorithm

    This paper proposes and studies a detection technique for adversarial scenarios (dubbed deterministic detection). This technique provides an alternative detection methodology in case the usual stochastic methods are not applicable: this can be because the studied phenomenon does not follow a stochastic sampling scheme, samples are high-dimensional and subsequent multiple-testing corrections render results overly conservative, sample sizes are too low for asymptotic results (such as the central limit theorem) to kick in, or one cannot allow for the small probability of failure inherent to stochastic approaches. This paper instead designs a method based on insights from machine learning and online learning theory: this detection algorithm, named Online FAult Detection (FADO), comes with theoretical guarantees of its detection capabilities. A version of the margin is found to regulate the detection performance of FADO. A precise expression is derived for bounding the performance, and experimental results are presented assessing the influence of the quantities involved. A case study of scene detection is used to illustrate the approach. The technology is closely related to the linear perceptron rule, and inherits its computational attractiveness and flexibility towards various extensions.

    MINLIP for the Identification of Monotone Wiener Systems

    This paper studies the MINLIP estimator for the identification of Wiener systems consisting of a sequence of a linear FIR dynamical model and a monotonically increasing (or decreasing) static function. Given $T$ observations, this algorithm boils down to solving a convex quadratic program with $O(T)$ variables and inequality constraints, implementing an inference technique which is based entirely on model complexity control. The resulting estimates of the linear submodel are found to be almost consistent when no noise is present in the data, under a condition of smoothness of the true nonlinearity and local Persistency of Excitation (local PE) of the data. This result is novel as it neither relies on classical tools such as a 'linearization' using a Taylor decomposition, nor exploits stochastic properties of the data. It is indicated how to extend the method to cope with noisy data, and empirical evidence contrasts performance of the estimator against other recently proposed techniques.

    On the Nuclear Norm heuristic for a Hankel matrix Recovery Problem

    This note addresses the question of whether and why the nuclear norm heuristic can recover an impulse response generated by a stable single-real-pole system when elements of the upper triangle of the associated Hankel matrix are given. Since the setting is deterministic, theories based on stochastic assumptions for low-rank matrix recovery do not apply here. A 'certificate' which guarantees the completion is constructed by exploring the structural information of the hidden matrix. Experimental results and discussions regarding the nuclear norm heuristic applied to a more general setting are also given.

    Sparse Estimation From Noisy Observations of an Overdetermined Linear System

    This note studies a method for the efficient estimation of a finite number of unknown parameters from linear equations, which are perturbed by Gaussian noise. In case the unknown parameters have only few nonzero entries, the proposed estimator performs more efficiently than a traditional approach. The method consists of three steps: (1) a classical Least Squares Estimate (LSE), (2) the support is recovered through a Linear Programming (LP) optimization problem which can be computed using a soft-thresholding step, (3) a de-biasing step using a LSE on the estimated support set. The main contribution of this note is a formal derivation of an associated ORACLE property of the final estimate. That is, when the number of samples is large enough, the estimate is shown to equal the LSE based on the support of the true parameters. Comment: This paper is provisionally accepted by Automatic
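
The three steps listed in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the helper names (`least_squares`, `soft_threshold`, `sparse_estimate`), the threshold `lam`, and the toy data are all hypothetical, and the abstract does not say how the threshold should be chosen.

```python
def least_squares(A, b):
    """Least-squares solution via the normal equations (A^T A) x = A^T b."""
    n = len(A[0])
    # Augmented matrix [A^T A | A^T b], reduced by Gauss-Jordan elimination.
    M = [[sum(r[i] * r[j] for r in A) for j in range(n)]
         + [sum(r[i] * bi for r, bi in zip(A, b))] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [mr - factor * mc for mr, mc in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def soft_threshold(v, lam):
    """Shrink v towards zero by lam; values with |v| <= lam become 0."""
    return max(abs(v) - lam, 0.0) * (1.0 if v > 0 else -1.0)


def sparse_estimate(A, b, lam):
    # Step 1: classical least-squares estimate on all parameters.
    x_ls = least_squares(A, b)
    # Step 2: estimate the support as the entries surviving soft-thresholding
    # (lam is an assumed, user-chosen threshold).
    support = [j for j, v in enumerate(x_ls) if soft_threshold(v, lam) != 0.0]
    # Step 3: de-bias by refitting least squares on the estimated support only.
    x = [0.0] * len(x_ls)
    if support:
        A_sub = [[row[j] for j in support] for row in A]
        for j, v in zip(support, least_squares(A_sub, b)):
            x[j] = v
    return x, support


# Toy overdetermined system (6 equations, 3 unknowns) with a sparse truth.
A = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0],
     [1.0, 1.0, 0.0], [0.0, 1.0, 1.0], [1.0, 0.0, 1.0]]
x_true = [2.0, 0.0, -1.5]
noise = [0.05, -0.04, 0.03, -0.02, 0.01, -0.03]  # fixed small perturbations
b = [sum(a * xj for a, xj in zip(row, x_true)) + e for row, e in zip(A, noise)]

x_hat, support = sparse_estimate(A, b, lam=0.5)
```

The refit in step 3 matters because soft-thresholding shrinks the surviving coefficients by `lam`; using the thresholded values directly would leave a systematic bias in the nonzero entries.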

    On the Randomized Kaczmarz Algorithm

    The Randomized Kaczmarz Algorithm is a randomized method which aims at solving a consistent system of overdetermined linear equations. This note discusses how to find an optimized randomization scheme for this algorithm, which is related to the question raised by \cite{c2}. Illustrative experiments are conducted to support the findings. Comment: This paper will appear in IEEE Signal Processing Letters, vol. 21, no. 3, March 201
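
For readers unfamiliar with the method, a minimal sketch of the basic randomized Kaczmarz iteration is given below. It samples rows with probability proportional to their squared norm, which is a common textbook choice and an assumption here; it is not the optimized randomization scheme this note studies. The function name `randomized_kaczmarz` and the toy system are hypothetical.

```python
import random


def randomized_kaczmarz(A, b, iterations=2000, seed=0):
    """Approximately solve a consistent system A x = b by random row projections."""
    rng = random.Random(seed)
    # Squared row norms double as sampling weights (assumed scheme, see lead-in).
    weights = [sum(a * a for a in row) for row in A]
    indices = list(range(len(A)))
    x = [0.0] * len(A[0])
    for _ in range(iterations):
        i = rng.choices(indices, weights=weights)[0]
        row = A[i]
        # Project the current iterate onto the hyperplane {x : <row, x> = b[i]}.
        residual = b[i] - sum(a * xj for a, xj in zip(row, x))
        step = residual / weights[i]
        x = [xj + step * a for xj, a in zip(x, row)]
    return x


# Consistent overdetermined toy system: 4 equations, 2 unknowns.
A = [[2.0, 1.0], [1.0, 3.0], [1.0, -1.0], [3.0, 4.0]]
x_true = [1.0, -2.0]
b = [sum(a * xj for a, xj in zip(row, x_true)) for row in A]

x_hat = randomized_kaczmarz(A, b)
```

Each update touches a single equation, so the cost per iteration is proportional to the number of unknowns rather than to the size of the whole system.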

    A machine-learning approach to measuring the escape of ionizing radiation from galaxies in the reionization epoch

    Recent observations of galaxies at $z \gtrsim 7$, along with the low value of the electron scattering optical depth measured by the Planck mission, make galaxies plausible as dominant sources of ionizing photons during the epoch of reionization. However, scenarios of galaxy-driven reionization hinge on the assumption that the average escape fraction of ionizing photons is significantly higher for galaxies in the reionization epoch than in the local Universe. The NIRSpec instrument on the James Webb Space Telescope (JWST) will enable spectroscopic observations of large samples of reionization-epoch galaxies. While the leakage of ionizing photons will not be directly measurable from these spectra, the leakage is predicted to have an indirect effect on the spectral slope and the strength of nebular emission lines in the rest-frame ultraviolet and optical. Here, we apply a machine learning technique known as lasso regression on mock JWST/NIRSpec observations of simulated $z=7$ galaxies in order to obtain a model that can predict the escape fraction from JWST/NIRSpec data. Barring systematic biases in the simulated spectra, our method is able to retrieve the escape fraction with a mean absolute error of $\Delta f_{\mathrm{esc}} \approx 0.12$ for spectra with $S/N \approx 5$ at a rest-frame wavelength of 1500 Å for our fiducial simulation. This prediction accuracy represents a significant improvement over previous similar approaches. Comment: 13 pages, 11 figures. Accepted for publication in Ap

    Support and Quantile Tubes

    This correspondence studies an estimator of the conditional support of a distribution underlying a set of i.i.d. observations. The relation with mutual information is shown via an extension of Fano's theorem in combination with a generalization bound based on a compression argument. Extensions to estimating the conditional quantile interval, and statistical guarantees on the minimal convex hull, are given.

    Componentwise Least Squares Support Vector Machines

    This chapter describes componentwise Least Squares Support Vector Machines (LS-SVMs) for the estimation of additive models consisting of a sum of nonlinear components. The primal-dual derivations characterizing LS-SVMs for the estimation of the additive model result in a single set of linear equations with size growing in the number of data points. The derivation is elaborated for the classification as well as the regression case. Furthermore, different techniques are proposed to discover structure in the data by looking for sparse components in the model, based on dedicated regularization schemes on the one hand and fusion of the componentwise LS-SVMs training with a validation criterion on the other hand. (keywords: LS-SVMs, additive models, regularization, structure detection) Comment: 22 pages. Accepted for publication in Support Vector Machines: Theory and Applications, ed. L. Wang, 200

    An efficient method for sorting and selecting for social behaviour

    In this article we provide a systematic experimental method for sorting animals according to socially relevant traits, without assaying them or even tagging them individually. Instead, they are repeatedly subjected to behavioural assays in groups, between which the group memberships are rearranged, in order to test the effect of many different combinations of individuals on a group-level property or feature. We analyse this method using a general model for the group feature, and simulate a variety of specific cases to track how individuals are sorted in each case. We find that in the case where the members of a group contribute equally to the group feature, the sorting procedure increases the between-group behavioural variation well above what is expected for groups randomly sampled from a population. For a wide class of group feature models, the individual phenotypes are efficiently sorted across the groups and thus become available for further analysis on how individual properties affect group behaviour. We also show that the experimental data can be used to estimate the individual-level repeatability of the underlying traits. Comment: 16 pages, 3 figures + supplementary information (3 pages

    Identifying reionization-epoch galaxies with extreme levels of Lyman continuum leakage in James Webb Space Telescope surveys

    The James Webb Space Telescope (JWST) NIRSpec instrument will allow rest-frame ultraviolet/optical spectroscopy of galaxies in the epoch of reionization (EoR). Some galaxies may exhibit significant leakage of hydrogen-ionizing photons into the intergalactic medium, resulting in faint nebular emission lines. We present a machine learning framework for identifying cases of very high hydrogen-ionizing photon escape from galaxies based on the data quality expected from potential NIRSpec observations of EoR galaxies in lensed fields. We train our algorithm on mock samples of JWST/NIRSpec data for galaxies at redshifts $z=6$--10. To make the samples more realistic, we combine synthetic galaxy spectra based on cosmological galaxy simulations with observational noise relevant for $z \gtrsim 6$ objects of a brightness similar to EoR galaxy candidates uncovered in Frontier Fields observations of galaxy cluster Abell-2744 and MACS-J0416. We find that ionizing escape fractions ($f_\mathrm{esc}$) of galaxies brighter than $m_\mathrm{AB,1500} \approx 27$ mag may be retrieved with mean absolute error $\Delta f_\mathrm{esc} \approx 0.09$ (0.12) for 24h (1.5h) JWST/NIRSpec exposures at resolution $R=100$. For 24h exposure time, even fainter galaxies ($m_\mathrm{AB,1500} < 28.5$ mag) can be processed with $\Delta f_\mathrm{esc} \approx 0.14$. This framework simultaneously estimates the redshift of these galaxies with a relative error less than 0.03 for both 24h ($m_\mathrm{AB,1500} < 28.5$ mag) and 1.5h ($m_\mathrm{AB,1500} < 27$ mag) exposure times. We also consider scenarios where just a minor fraction of galaxies attain high $f_\mathrm{esc}$ and present the conditions required for detecting a subpopulation of high $f_\mathrm{esc}$ galaxies within the dataset. Comment: 10 pages, 7 figures. Accepted to be published in MNRA