
    Linear regression for numeric symbolic variables: an ordinary least squares approach based on Wasserstein Distance

    In this paper we present a linear regression model for modal symbolic data. The observed variables are histogram variables, as defined in the framework of Symbolic Data Analysis, and the parameters of the model are estimated using the classic Least Squares method. An appropriate metric is introduced to measure the error between the observed and the predicted distributions; in particular, the Wasserstein distance is proposed. Some properties of this metric are exploited to predict the response variable as a direct linear combination of other independent histogram variables. Measures of goodness of fit are discussed. An application on real data corroborates the proposed method
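
As a rough illustration of the metric named in this abstract, the sketch below computes the squared L2 Wasserstein distance between two one-dimensional histograms from their quantile functions, using the identity W2^2 = integral over t in [0, 1] of (Q_a(t) - Q_b(t))^2. It assumes uniform density within each bin and approximates the integral with a midpoint rule. The function names and the histogram representation (bin edges plus weights summing to 1) are hypothetical and not taken from the paper, and the regression model itself is not reproduced here.

```python
from bisect import bisect_left
from itertools import accumulate


def histogram_quantile(edges, weights, t):
    """Quantile function of a histogram, assuming uniform density in each bin.

    edges   -- bin boundaries, len(edges) == len(weights) + 1 (assumed layout)
    weights -- bin probabilities, assumed to sum to 1
    """
    cdf = [0.0] + list(accumulate(weights))
    # Locate the bin whose cumulative range (cdf[i], cdf[i + 1]] contains t.
    i = min(max(bisect_left(cdf, t) - 1, 0), len(weights) - 1)
    if weights[i] == 0:
        return edges[i]
    frac = (t - cdf[i]) / weights[i]
    return edges[i] + frac * (edges[i + 1] - edges[i])


def wasserstein2_squared(hist_a, hist_b, n=1000):
    """Midpoint-rule approximation of the squared L2 Wasserstein distance."""
    total = 0.0
    for k in range(n):
        t = (k + 0.5) / n
        diff = histogram_quantile(*hist_a, t) - histogram_quantile(*hist_b, t)
        total += diff * diff
    return total / n


# Two single-bin histograms: uniform on [0, 1] and uniform on [1, 2].
shifted = wasserstein2_squared(([0.0, 1.0], [1.0]), ([1.0, 2.0], [1.0]))
```

Because the second histogram is the first one shifted by 1, the quantile functions differ by exactly 1 everywhere, so `shifted` should come out very close to 1.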

    Approximation of probability density functions for PDEs with random parameters using truncated series expansions

    The probability density function (PDF) of a random variable associated with the solution of a partial differential equation (PDE) with random parameters is approximated using a truncated series expansion. The random PDE is solved using two stochastic finite element methods, Monte Carlo sampling and the stochastic Galerkin method with global polynomials. The random variable is a functional of the solution of the random PDE, such as the average over the physical domain. The truncated series are obtained considering a finite number of terms in the Gram-Charlier or Edgeworth series expansions. These expansions approximate the PDF of a random variable in terms of another PDF, and involve coefficients that are functions of the known cumulants of the random variable. To the best of our knowledge, their use in the framework of PDEs with random parameters has not yet been explored
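
For orientation, here is a minimal sketch of a Gram-Charlier type A expansion truncated after the fourth cumulant. It approximates a PDF by a Gaussian density multiplied by a correction built from the probabilists' Hermite polynomials He3 and He4. The function name, the argument names, and the truncation order are illustrative assumptions; the truncation used in the paper, and its Edgeworth variant, may differ.

```python
import math


def gram_charlier_pdf(x, mean, variance, kappa3=0.0, kappa4=0.0):
    """Gram-Charlier A series truncated at the fourth cumulant (illustrative).

    kappa3 and kappa4 are the third and fourth cumulants of the random
    variable; with both set to zero the result is the plain Gaussian PDF.
    """
    sigma = math.sqrt(variance)
    z = (x - mean) / sigma
    gaussian = math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))
    # Probabilists' Hermite polynomials of degree 3 and 4.
    he3 = z**3 - 3.0 * z
    he4 = z**4 - 6.0 * z**2 + 3.0
    correction = (
        1.0
        + kappa3 / (6.0 * sigma**3) * he3
        + kappa4 / (24.0 * sigma**4) * he4
    )
    return gaussian * correction


# With zero higher cumulants the expansion reduces to the standard normal PDF.
baseline = gram_charlier_pdf(0.0, mean=0.0, variance=1.0)
```

A truncated expansion of this kind is not guaranteed to be non-negative everywhere, so it is best read as an approximation rather than a proper density.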

    Evading Classifiers by Morphing in the Dark

    Learning-based systems have been shown to be vulnerable to evasion through adversarial data manipulation. These attacks have been studied under the assumption that the adversary has some knowledge of the target model internals, its training dataset, or at least the classification scores it assigns to input samples. In this paper, we investigate a much more constrained and realistic attack scenario wherein the target classifier is minimally exposed to the adversary, revealing only its final classification decision (e.g., reject or accept an input sample). Moreover, the adversary can only manipulate malicious samples using a blackbox morpher. That is, the adversary has to evade the target classifier by morphing malicious samples "in the dark". We present a scoring mechanism that assigns each sample a real-valued score reflecting evasion progress, based on the limited information available. Leveraging this scoring mechanism, we propose an evasion method, EvadeHC, and evaluate it against two PDF malware detectors, namely PDFRate and Hidost. The experimental evaluation demonstrates that the proposed evasion attacks are effective, attaining a 100% evasion rate on the evaluation dataset. Interestingly, EvadeHC outperforms the known classifier evasion technique that operates on the classification scores output by the classifiers. Although our evaluations are conducted on PDF malware classifiers, the proposed approaches are domain-agnostic and are applicable to other learning-based systems

    Model for Estimation of Bounds in Digital Coding of Seabed Images

    This paper proposes a novel model for the estimation of bounds in digital coding of images. Entropy coding of images is exploited to measure the useful information content of the data. The bit rate achieved by reversible compression using the rate-distortion theory approach takes into account the contribution of the observation noise and the intrinsic information of a hypothetical noise-free image. Assuming a Laplacian probability density function for the quantizer input signal, SQNR gains are calculated for an image predictive coding system with a non-adaptive quantizer, for white and correlated noise, respectively. The proposed model is evaluated on seabed images. However, the model presented in this paper can be applied to any signal with a Laplacian distribution

    Recommendations and illustrations for the evaluation of photonic random number generators

    The never-ending quest to improve the security of digital information combined with recent improvements in hardware technology has caused the field of random number generation to undergo a fundamental shift from relying solely on pseudo-random algorithms to employing optical entropy sources. Despite these significant advances on the hardware side, commonly used statistical measures and evaluation practices remain ill-suited to understand or quantify the optical entropy that underlies physical random number generation. We review the state of the art in the evaluation of optical random number generation and recommend a new paradigm: quantifying entropy generation and understanding the physical limits of the optical sources of randomness. In order to do this, we advocate for the separation of the physical entropy source from deterministic post-processing in the evaluation of random number generators and for the explicit consideration of the impact of the measurement and digitization process on the rate of entropy production. We present the Cohen-Procaccia estimate of the entropy rate $h(\epsilon,\tau)$ as one way to do this. In order to provide an illustration of our recommendations, we apply the Cohen-Procaccia estimate as well as the entropy estimates from the new NIST draft standards for physical random number generators to evaluate and compare three common optical entropy sources: single photon time-of-arrival detection, chaotic lasers, and amplified spontaneous emission