Search CORE

29,392 research outputs found

Linear regression for numeric symbolic variables: an ordinary least squares approach based on Wasserstein Distance

Author: A Irpino
Antonio Irpino
B Efron
CL Lawson
CL Mallows
E Diday
EAL Neto
G Dall’Aglio
H Bock
J Arroyo
L Billard
L Kantorovich
L Wasserstein
M Noirhomme-Fraiture
P Bertrand
P Bickel
R Tibshirani
Rosanna Verde
WG Gilchrist
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 20/07/2012

In this paper we present a linear regression model for modal symbolic data. The observed variables are histogram variables, as defined in the framework of Symbolic Data Analysis, and the parameters of the model are estimated using the classic Least Squares method. An appropriate metric is introduced to measure the error between the observed and the predicted distributions; in particular, the Wasserstein distance is proposed. Some properties of this metric are exploited to predict the response variable as a direct linear combination of other independent histogram variables. Measures of goodness of fit are discussed. An application to real data corroborates the proposed method.
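As a rough illustration of the error metric named in this abstract, the squared L2 Wasserstein distance between two one-dimensional distributions can be written as the integral over t in (0, 1) of the squared difference of their quantile functions. The sketch below is not the authors' implementation: it assumes raw samples stand in for histogram variables, and the function name `wasserstein2_squared` and the grid size `n_grid` are hypothetical choices made for this example.

```python
import numpy as np

def wasserstein2_squared(sample_a, sample_b, n_grid=1000):
    """Approximate the squared L2 Wasserstein distance between two 1-D samples.

    Assumption: empirical samples replace the histogram variables of the
    abstract. The integral of (Q_a(t) - Q_b(t))^2 over t in (0, 1) is
    approximated with a midpoint rule on n_grid points.
    """
    # Midpoints of n_grid equal-width cells of (0, 1)
    t = (np.arange(n_grid) + 0.5) / n_grid
    # Empirical quantile functions evaluated on the grid
    q_a = np.quantile(sample_a, t)
    q_b = np.quantile(sample_b, t)
    # Mean of squared quantile differences approximates the integral
    return float(np.mean((q_a - q_b) ** 2))
```

In this form, shifting a distribution by a constant c changes every quantile by c, so the squared distance between a sample and its shifted copy is c squared.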

arXiv.org e-Print Archive

CiteSeerX

Crossref

Archivio Istituzionale della Ricerca - Università degli Studi della Campania "Luigi Vanvitelli"

Approximation of probability density functions for PDEs with random parameters using truncated series expansions

Author: Capodaglio Giacomo
Gunzburger Max
Wynn Henry P.
Publication date: 23/09/2020

The probability density function (PDF) of a random variable associated with the solution of a partial differential equation (PDE) with random parameters is approximated using a truncated series expansion. The random PDE is solved using two stochastic finite element methods: Monte Carlo sampling and the stochastic Galerkin method with global polynomials. The random variable is a functional of the solution of the random PDE, such as the average over the physical domain. The truncated series are obtained by considering a finite number of terms in the Gram-Charlier or Edgeworth series expansions. These expansions approximate the PDF of a random variable in terms of another PDF, and involve coefficients that are functions of the known cumulants of the random variable. To the best of our knowledge, their use in the framework of PDEs with random parameters has not yet been explored.
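To make the idea of a truncated cumulant-based expansion concrete, here is a minimal sketch of the standard Gram-Charlier A series around a Gaussian reference PDF, truncated after the fourth cumulant. It is an illustration under assumptions, not the paper's code: the function name `gram_charlier_pdf` and its parameters are hypothetical, and the choice to stop at the fourth cumulant is made here only for brevity.

```python
import math

def gram_charlier_pdf(x, mean, var, k3=0.0, k4=0.0):
    """Gram-Charlier A series truncated after the 4th cumulant.

    mean, var, k3, k4 are the first four cumulants of the target variable.
    The reference PDF is a Gaussian with the same mean and variance.
    """
    sigma = math.sqrt(var)
    z = (x - mean) / sigma
    # Standard normal density at z
    phi = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    # Probabilists' Hermite polynomials He_3 and He_4
    he3 = z**3 - 3.0 * z
    he4 = z**4 - 6.0 * z**2 + 3.0
    # Correction factor built from the standardized 3rd and 4th cumulants
    correction = 1.0 + k3 / (6.0 * sigma**3) * he3 + k4 / (24.0 * sigma**4) * he4
    return phi * correction / sigma
```

With `k3 = k4 = 0` the function reduces to the Gaussian density. A truncated series of this kind is not guaranteed to be non-negative everywhere, so its output should be checked before being treated as a PDF.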

arXiv.org e-Print Archive

LSE Research Online

Evading Classifiers by Morphing in the Dark

Author: Chang Ee-Chien
Dang Hung
Huang Yue
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 23/08/2017

Learning-based systems have been shown to be vulnerable to evasion through adversarial data manipulation. These attacks have been studied under the assumption that the adversary has some knowledge of the target model's internals, its training dataset, or at least the classification scores it assigns to input samples. In this paper, we investigate a much more constrained and realistic attack scenario in which the target classifier is minimally exposed to the adversary, revealing only its final classification decision (e.g., reject or accept an input sample). Moreover, the adversary can only manipulate malicious samples using a blackbox morpher. That is, the adversary has to evade the target classifier by morphing malicious samples "in the dark". We present a scoring mechanism that assigns each sample a real-valued score reflecting its evasion progress, based on the limited information available. Leveraging this scoring mechanism, we propose an evasion method, EvadeHC, and evaluate it against two PDF malware detectors, namely PDFRate and Hidost. The experimental evaluation demonstrates that the proposed evasion attacks are effective, attaining a 100% evasion rate on the evaluation dataset. Interestingly, EvadeHC outperforms the known classifier evasion technique that operates on the classification scores output by the classifiers. Although our evaluations are conducted on PDF malware classifiers, the proposed approaches are domain-agnostic and applicable to other learning-based systems.

arXiv.org e-Print Archive

Crossref

Model for Estimation of Bounds in Digital Coding of Seabed Images

Author: Samcovic A.
Publication venue: 'Brno University of Technology'
Publication date: 01/09/2015

This paper proposes a novel model for the estimation of bounds in digital coding of images. Entropy coding of images is exploited to measure the useful information content of the data. The bit rate achieved by reversible compression, using the rate-distortion theory approach, takes into account the contribution of the observation noise and the intrinsic information of a hypothetical noise-free image. Assuming a Laplacian probability density function for the quantizer input signal, SQNR gains are calculated for an image predictive coding system with a non-adaptive quantizer, for white and correlated noise, respectively. The proposed model is evaluated on seabed images. However, the model presented in this paper can be applied to any signal with a Laplacian distribution.
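For a sense of how entropy can measure the information content of image data, the sketch below computes a first-order (memoryless) Shannon entropy of pixel values, in bits per pixel. This is a generic estimate and not the model proposed in the paper: it ignores spatial correlation and noise, and the function name `first_order_entropy_bits` is a hypothetical one chosen for this example.

```python
import numpy as np

def first_order_entropy_bits(image):
    """First-order Shannon entropy of pixel values, in bits per pixel.

    Assumption: pixels are treated as independent draws from the empirical
    distribution of values, so inter-pixel correlation is ignored.
    """
    # Count how often each distinct pixel value occurs
    _, counts = np.unique(np.asarray(image).ravel(), return_counts=True)
    p = counts / counts.sum()
    # H = -sum p * log2(p); only occurring values are counted, so p > 0
    return float(-np.sum(p * np.log2(p)))
```

Under the independence assumption, this value is a lower bound on the average bits per pixel of a lossless code that treats each pixel separately; predictive coders can go below it by exploiting correlation between neighbouring pixels.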

Directory of Open Access Journals

Digital library of Brno University of Technology

Recommendations and illustrations for the evaluation of photonic random number generators

Author: Baumgartner Gerald B.
Hart Joseph D.
Murphy Thomas E.
Roy Rajarshi
Terashima Yuta
Uchida Atsushi
Publication venue: 'AIP Publishing'
Publication date: 15/08/2017

The never-ending quest to improve the security of digital information combined with recent improvements in hardware technology has caused the field of random number generation to undergo a fundamental shift from relying solely on pseudo-random algorithms to employing optical entropy sources. Despite these significant advances on the hardware side, commonly used statistical measures and evaluation practices remain ill-suited to understand or quantify the optical entropy that underlies physical random number generation. We review the state of the art in the evaluation of optical random number generation and recommend a new paradigm: quantifying entropy generation and understanding the physical limits of the optical sources of randomness. In order to do this, we advocate for the separation of the physical entropy source from deterministic post-processing in the evaluation of random number generators and for the explicit consideration of the impact of the measurement and digitization process on the rate of entropy production. We present the Cohen-Procaccia estimate of the entropy rate h(ε, τ) as one way to do this. In order to provide an illustration of our recommendations, we apply the Cohen-Procaccia estimate as well as the entropy estimates from the new NIST draft standards for physical random number generators to evaluate and compare three common optical entropy sources: single photon time-of-arrival detection, chaotic lasers, and amplified spontaneous emission.

arXiv.org e-Print Archive

Directory of Open Access Journals