Linear regression for numeric symbolic variables: an ordinary least squares approach based on Wasserstein Distance
In this paper we present a linear regression model for modal symbolic data.
The observed variables are histogram variables according to the definition
given in the framework of Symbolic Data Analysis and the parameters of the
model are estimated using the classic Least Squares method. An appropriate
metric is introduced in order to measure the error between the observed and the
predicted distributions. In particular, the Wasserstein distance is proposed.
Some properties of this metric are exploited to predict the response variable
as a direct linear combination of other independent histogram variables. Measures
of goodness of fit are discussed. An application on real data corroborates the
proposed method.
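As a rough illustration of the error measure mentioned in this abstract: for one-dimensional distributions, the squared L2 Wasserstein distance equals the integrated squared difference between the two quantile functions. The sketch below approximates it from raw samples. The function name `wasserstein2_squared`, the use of raw samples rather than histogram variables, and the quantile grid size are all assumptions made for illustration, not the paper's implementation.

```python
import numpy as np

def wasserstein2_squared(sample_a, sample_b, n_levels=1000):
    """Approximate the squared L2 Wasserstein distance between two 1-D samples.

    Hypothetical helper: compares empirical quantile functions on a shared
    grid of probability levels (midpoint rule on the interval (0, 1)).
    """
    levels = (np.arange(n_levels) + 0.5) / n_levels
    quantiles_a = np.quantile(sample_a, levels)
    quantiles_b = np.quantile(sample_b, levels)
    # Mean of squared quantile differences approximates the integral over (0, 1).
    return float(np.mean((quantiles_a - quantiles_b) ** 2))

if __name__ == "__main__":
    observed = np.array([0.0, 1.0, 2.0, 3.0])
    predicted = observed + 2.0  # a pure shift of the observed distribution
    print(wasserstein2_squared(observed, predicted))
```

A pure shift by a constant c moves every quantile by c, so the squared distance in that case is c squared; this gives a simple sanity check for the helper.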
Approximation of probability density functions for PDEs with random parameters using truncated series expansions
The probability density function (PDF) of a random variable associated with
the solution of a partial differential equation (PDE) with random parameters is
approximated using a truncated series expansion. The random PDE is solved using
two stochastic finite element methods, Monte Carlo sampling and the stochastic
Galerkin method with global polynomials. The random variable is a functional of
the solution of the random PDE, such as the average over the physical domain.
The truncated series are obtained considering a finite number of terms in the
Gram-Charlier or Edgeworth series expansions. These expansions approximate the
PDF of a random variable in terms of another PDF, and involve coefficients that
are functions of the known cumulants of the random variable. To the best of our
knowledge, their use in the framework of PDEs with random parameters has not
yet been explored.
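To make the idea of a truncated expansion concrete, here is a minimal sketch of a Gram-Charlier type A series cut off after the fourth cumulant, using a Gaussian reference density and probabilists' Hermite polynomials. The function name `gram_charlier_pdf` and the choice of truncation order are assumptions for illustration; the paper may use a different number of terms or the Edgeworth ordering.

```python
import math

def gram_charlier_pdf(x, mean, variance, kappa3=0.0, kappa4=0.0):
    """Truncated Gram-Charlier A approximation of a PDF at point x.

    Hypothetical helper: kappa3 and kappa4 are the third and fourth
    cumulants of the random variable being approximated.
    """
    sigma = math.sqrt(variance)
    z = (x - mean) / sigma
    # Standard normal density evaluated at the standardized point.
    phi = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    # Probabilists' Hermite polynomials He_3 and He_4.
    he3 = z**3 - 3.0 * z
    he4 = z**4 - 6.0 * z**2 + 3.0
    correction = (1.0
                  + kappa3 / (6.0 * sigma**3) * he3
                  + kappa4 / (24.0 * sigma**4) * he4)
    return phi / sigma * correction

if __name__ == "__main__":
    # With zero higher cumulants the series reduces to the Gaussian density.
    print(gram_charlier_pdf(0.0, mean=0.0, variance=1.0))
```

Truncated series of this kind are not guaranteed to stay non-negative, so in practice the result may need checking in the tails.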
Evading Classifiers by Morphing in the Dark
Learning-based systems have been shown to be vulnerable to evasion through
adversarial data manipulation. These attacks have been studied under
assumptions that the adversary has certain knowledge of either the target model
internals, its training dataset or at least classification scores it assigns to
input samples. In this paper, we investigate a much more constrained and
realistic attack scenario wherein the target classifier is minimally exposed to
the adversary, revealing only its final classification decision (e.g., reject or
accept an input sample). Moreover, the adversary can only manipulate malicious
samples using a blackbox morpher. That is, the adversary has to evade the
target classifier by morphing malicious samples "in the dark". We present a
scoring mechanism that can assign a real-valued score which reflects evasion
progress to each sample based on the limited information available. Leveraging
this scoring mechanism, we propose an evasion method -- EvadeHC -- and
evaluate it against two PDF malware detectors, namely PDFRate and Hidost. The
experimental evaluation demonstrates that the proposed evasion attacks are
effective, attaining evasion rate on the evaluation dataset.
Interestingly, EvadeHC outperforms the known classifier evasion technique that
operates based on classification scores output by the classifiers. Although our
evaluations are conducted on PDF malware classifiers, the proposed approaches
are domain-agnostic and are of wider application to other learning-based
systems.
Model for Estimation of Bounds in Digital Coding of Seabed Images
This paper proposes a novel model for estimating bounds in the digital coding of images. Entropy coding of images is exploited to measure the useful information content of the data. The bit rate achieved by reversible compression using the rate-distortion theory approach takes into account the contribution of the observation noise and the intrinsic information of a hypothetical noise-free image. Assuming a Laplacian probability density function for the quantizer input signal, SQNR gains are calculated for an image predictive coding system with a non-adaptive quantizer for white and correlated noise, respectively. The proposed model is evaluated on seabed images. However, the model presented in this paper can be applied to any signal with a Laplacian distribution.
Recommendations and illustrations for the evaluation of photonic random number generators
The never-ending quest to improve the security of digital information
combined with recent improvements in hardware technology has caused the field
of random number generation to undergo a fundamental shift from relying solely
on pseudo-random algorithms to employing optical entropy sources. Despite these
significant advances on the hardware side, commonly used statistical measures
and evaluation practices remain ill-suited to understand or quantify the
optical entropy that underlies physical random number generation. We review the
state of the art in the evaluation of optical random number generation and
recommend a new paradigm: quantifying entropy generation and understanding the
physical limits of the optical sources of randomness. In order to do this, we
advocate for the separation of the physical entropy source from deterministic
post-processing in the evaluation of random number generators and for the
explicit consideration of the impact of the measurement and digitization
process on the rate of entropy production. We present the Cohen-Procaccia
estimate of the entropy rate as one way to do this. In order
to provide an illustration of our recommendations, we apply the Cohen-Procaccia
estimate as well as the entropy estimates from the new NIST draft standards for
physical random number generators to evaluate and compare three common optical
entropy sources: single photon time-of-arrival detection, chaotic lasers, and
amplified spontaneous emission.