Robustness Verification of Support Vector Machines
We study the problem of formally verifying the robustness to adversarial
examples of support vector machines (SVMs), a major machine learning model for
classification and regression tasks. Following a recent stream of works on
formal robustness verification of (deep) neural networks, our approach relies
on a sound abstract version of a given SVM classifier, which is used to check
its robustness. This methodology is parametric in the underlying numerical abstraction
of real values and, analogously to the case of neural networks, needs neither
abstract least upper bounds nor widening operators on this abstraction. The
standard interval domain provides a simple instantiation of our abstraction
technique, which we enhance with the domain of reduced affine forms, an
efficient abstraction of the zonotope abstract domain. This robustness
verification technique has been fully implemented and experimentally evaluated
on SVMs based on linear and nonlinear (polynomial and radial basis function)
kernels, which have been trained on the popular MNIST dataset of images and on
the recent and more challenging Fashion-MNIST dataset. The experimental results
of our prototype SVM robustness verifier are encouraging: the automated
verification is fast and scalable, and it proves robustness for a high
percentage of the MNIST test set, in particular when compared with the
analogous provable robustness of neural networks.
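
To make the interval-domain instantiation concrete, here is a minimal sketch that certifies local robustness of a linear SVM under an L-infinity perturbation of radius eps by propagating interval bounds through the decision function. The linear kernel, the variable names, and the helper verify_linear_svm are illustrative assumptions, not the paper's implementation.

import numpy as np

def verify_linear_svm(w, b, x, eps):
    """Sound robustness check for a linear SVM f(x) = w.x + b.

    Propagates the box [x - eps, x + eps] through f exactly: the extrema
    of w.x' over the box are attained coordinate-wise. Returns True only
    if every point in the box keeps the sign of f(x), i.e. a sound (and,
    for linear kernels, exact) robustness certificate.
    """
    center = float(np.dot(w, x) + b)
    # Worst-case shift of w.x' over the L-inf ball of radius eps.
    slack = eps * np.sum(np.abs(w))
    lower, upper = center - slack, center + slack
    # Robust iff the output interval does not straddle the decision boundary.
    return (lower > 0) if center > 0 else (upper < 0)

# Toy usage: a 2D classifier, point at distance greater than eps from the boundary.
w, b = np.array([1.0, -2.0]), 0.5
x = np.array([2.0, 0.0])
print(verify_linear_svm(w, b, x, eps=0.3))  # True: certified robust

For nonlinear (polynomial or RBF) kernels the intervals are no longer exact, which is where tighter abstractions such as reduced affine forms pay off.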
Linear Dimensionality Reduction for Margin-Based Classification: High-Dimensional Data and Sensor Networks
Low-dimensional statistics of measurements play an important role in detection problems, including those encountered in sensor networks. In this work, we focus on learning low-dimensional linear statistics of high-dimensional measurement data, along with decision rules defined in the low-dimensional space, in the case when the probability density of the measurements and class labels is not given, but a training set of samples from this distribution is. We pose a joint optimization problem for linear dimensionality reduction and margin-based classification, and develop a coordinate descent algorithm on the Stiefel manifold for its solution. Although the coordinate descent is not guaranteed to find the globally optimal solution, crucially, its alternating structure enables us to extend it to sensor networks with a message-passing approach requiring little communication. Linear dimensionality reduction prevents overfitting when learning from finite training data. In the sensor network setting, dimensionality reduction not only prevents overfitting but also reduces power consumption due to communication. The learned reduced-dimensional space and decision rule are shown to be consistent, and their Rademacher complexity is characterized. Experimental results are presented for a variety of datasets, including those from existing sensor networks, demonstrating the potential of our methodology in comparison with other dimensionality reduction approaches.

Funding: National Science Foundation (U.S.) Graduate Research Fellowship Program; United States Army Research Office (MURI funded through ARO Grant W911NF-06-1-0076); United States Air Force Office of Scientific Research (Award FA9550-06-1-0324); Shell International Exploration and Production B.V.
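
A minimal sketch of the alternating scheme described above, assuming a projection A with orthonormal columns (a point on the Stiefel manifold) and a hinge-loss classifier in the reduced space. The QR-based retraction, fixed step size, and toy data are illustrative choices, not the authors' algorithm.

import numpy as np

def project_stiefel(A):
    # Retract onto the Stiefel manifold: orthonormalize columns via QR.
    Q, _ = np.linalg.qr(A)
    return Q

def joint_train(X, y, d, iters=200, lr=0.05, seed=0):
    """Alternate between the margin classifier (w, b) in the reduced
    space and the projection A, keeping A's columns orthonormal."""
    rng = np.random.default_rng(seed)
    n, D = X.shape
    A = project_stiefel(rng.standard_normal((D, d)))
    w, b = np.zeros(d), 0.0
    for _ in range(iters):
        Z = X @ A                            # project data to d dimensions
        active = y * (Z @ w + b) < 1         # points violating the margin
        # Subgradient step on hinge loss + 0.5||w||^2, with A fixed.
        g_w = w - (y[active, None] * Z[active]).sum(axis=0) / n
        w -= lr * g_w
        b += lr * y[active].sum() / n
        # Subgradient step on A with (w, b) fixed, then retract via QR.
        s = (y[active, None] * X[active]).sum(axis=0)   # sum of y_i x_i
        A = project_stiefel(A + lr * np.outer(s, w) / n)
    return A, w, b

# Toy usage: 2 informative dimensions hidden among 20.
rng = np.random.default_rng(1)
X = rng.standard_normal((400, 20))
y = np.sign(X[:, 0] + 0.5 * X[:, 1])
A, w, b = joint_train(X, y, d=2)
print(np.mean(np.sign(X @ A @ w + b) == y))  # training accuracy

The alternating structure is what matters for the sensor-network extension: the classifier update and the projection update each depend on the other only through low-dimensional quantities, which keeps the messages small.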
An optimal randomized algorithm for d-variate zonoid depth
A randomized linear expected-time algorithm for computing the zonoid depth [R. Dyckerhoff, G. Koshevoy, K. Mosler, Zonoid data depth: Theory and computation, in: A. Prat (Ed.), COMPSTAT 1996, Proceedings in Computational Statistics, Physica-Verlag, Heidelberg, 1996, pp. 235–240; K. Mosler, Multivariate Dispersion, Central Regions and Depth: The Lift Zonoid Approach, Lecture Notes in Statistics, vol. 165, Springer-Verlag, New York, 2002] of a point with respect to a fixed-dimensional point set is presented.
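
The abstract does not spell out the computation, but zonoid depth has a direct linear-programming characterization that makes the notion concrete: the depth of p with respect to x_1, ..., x_n is 1/(n t*), where t* is the smallest t such that p is a convex combination of the x_i with all weights at most t. The sketch below solves this LP with scipy.optimize.linprog; it is a hedged illustration of the definition, not the paper's randomized linear expected-time algorithm.

import numpy as np
from scipy.optimize import linprog

def zonoid_depth(p, X):
    """Zonoid depth of point p w.r.t. the rows of X, via a direct LP.

    Solve: minimize t  s.t.  sum_i l_i x_i = p,  sum_i l_i = 1,
                             0 <= l_i <= t.
    Variables are (l_1, ..., l_n, t); infeasible => p outside hull => 0.
    """
    n, d = X.shape
    c = np.zeros(n + 1)
    c[-1] = 1.0                               # minimize t
    # Equality constraints: X^T l = p and 1^T l = 1.
    A_eq = np.zeros((d + 1, n + 1))
    A_eq[:d, :n] = X.T
    A_eq[d, :n] = 1.0
    b_eq = np.concatenate([p, [1.0]])
    # Inequalities: l_i - t <= 0.
    A_ub = np.hstack([np.eye(n), -np.ones((n, 1))])
    b_ub = np.zeros(n)
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
                  bounds=[(0, None)] * (n + 1))
    if not res.success:
        return 0.0                            # p not in the convex hull
    return min(1.0, 1.0 / (n * res.x[-1]))

# The deepest point (the mean) has depth 1.
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
print(zonoid_depth(np.array([0.5, 0.5]), X))  # 1.0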
Set-based State Estimation with Probabilistic Consistency Guarantee under Epistemic Uncertainty
Consistent state estimation is challenging, especially under the epistemic
uncertainties arising from learned (nonlinear) dynamic and observation models.
In this work, we propose a set-based estimation algorithm, named Gaussian
Process-Zonotopic Kalman Filter (GP-ZKF), that produces zonotopic state
estimates while respecting both the epistemic uncertainties in the learned
models and the aleatoric uncertainties. Our method guarantees probabilistic
consistency, in the sense that the true states are bounded by sets (zonotopes)
across all time steps, with high probability. We formally relate GP-ZKF to
the corresponding stochastic approach, GP-EKF, in the case of learned
(nonlinear) models. In particular, when linearization errors and aleatoric
uncertainties are omitted and epistemic uncertainties are simplified, GP-ZKF
reduces to GP-EKF. We empirically demonstrate our method's efficacy in both a
simulated pendulum domain and a real-world robot-assisted dressing domain,
where GP-ZKF produced more consistent and less conservative set-based estimates
than all baseline stochastic methods.

Comment: Published at IEEE Robotics and Automation Letters, 2022. Video: https://www.youtube.com/watch?v=CvIPJlALaFU. © 2022 IEEE.
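
To ground the zonotope machinery, here is a minimal sketch of the set-based prediction step: propagating a zonotope Z = <c, G> (center plus generator matrix) through linear(ized) dynamics and adding bounded process noise as extra generators via a Minkowski sum. The dynamics matrix and noise bound are toy assumptions; GP-ZKF's treatment of learned-model epistemic uncertainty is not reproduced here.

import numpy as np

class Zonotope:
    """Z = {c + G @ a : a in [-1, 1]^m} with center c and generators G."""
    def __init__(self, c, G):
        self.c, self.G = np.asarray(c, float), np.asarray(G, float)

    def interval_hull(self):
        # Tight per-coordinate bounds: |G| summed over generators.
        r = np.abs(self.G).sum(axis=1)
        return self.c - r, self.c + r

def predict(Z, A, w_gen):
    """Linear prediction x' = A x + w, with w bounded by a zonotope whose
    generators are the columns of w_gen (Minkowski sum of zonotopes)."""
    return Zonotope(A @ Z.c, np.hstack([A @ Z.G, w_gen]))

# Toy usage: a slowly rotating system with a small additive disturbance.
A = np.array([[0.99, 0.1], [-0.1, 0.99]])
Z = Zonotope([1.0, 0.0], 0.05 * np.eye(2))     # initial uncertainty box
for _ in range(10):
    Z = predict(Z, A, w_gen=0.01 * np.eye(2))  # disturbance in [-0.01, 0.01]^2
print(Z.interval_hull())

Consistency in the paper's sense means the true state stays inside such sets at every step with high probability, which is exactly what the growing generator set is tracking.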
Frugal hypothesis testing and classification
Thesis (Ph.D.), Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2010. Cataloged from the PDF version of the thesis. Includes bibliographical references (p. 157-175).

The design and analysis of decision rules using detection theory and statistical learning theory is important because decision making under uncertainty is pervasive. Three perspectives on limiting the complexity of decision rules are considered in this thesis: geometric regularization, dimensionality reduction, and quantization or clustering. Controlling complexity often reduces resource usage in decision making and improves generalization when learning decision rules from noisy samples. A new margin-based classifier with decision boundary surface area regularization and optimization via variational level set methods is developed; this novel classifier is termed the geometric level set (GLS) classifier. A method for joint dimensionality reduction and margin-based classification with optimization on the Stiefel manifold is developed, and this dimensionality reduction approach is extended for information fusion in sensor networks. A new distortion measure, named the mean Bayes risk error (MBRE), is proposed for the quantization or clustering of the prior probabilities appearing in the thresholds of likelihood ratio tests. The quantization framework is extended to model human decision making and discrimination in segregated populations.

By Kush R. Varshney, Ph.D.
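
As a concrete illustration of the quantized-prior idea, the sketch below measures the Bayes risk penalty incurred when a likelihood ratio test's threshold is set from a quantized prior instead of the true one, for a scalar Gaussian detection problem. The Gaussian setup, the crude uniform quantizer, and the function names are illustrative assumptions, not the thesis's MBRE-optimal quantizer design.

import numpy as np
from scipy.stats import norm

def bayes_risk(p1, threshold_prior, mu=2.0):
    """Bayes risk (equal costs) of the LRT for H1: N(mu,1) vs H0: N(0,1),
    when the threshold is set as if the prior on H1 were threshold_prior
    but the true prior is p1. The LRT reduces to comparing x with eta."""
    p0t = 1.0 - threshold_prior
    eta = mu / 2.0 + np.log(p0t / threshold_prior) / mu  # decision boundary
    p_fa = norm.sf(eta)            # false alarm: P(x > eta | H0)
    p_md = norm.cdf(eta - mu)      # missed detection: P(x < eta | H1)
    return (1.0 - p1) * p_fa + p1 * p_md

# Mean Bayes risk error of a crude 2-cell uniform quantizer of the prior:
# the average excess risk over matched (unquantized) thresholds.
priors = np.linspace(0.01, 0.99, 99)
quantized = np.where(priors < 0.5, 0.25, 0.75)   # cell representatives
mbre = np.mean([bayes_risk(p, q) - bayes_risk(p, p)
                for p, q in zip(priors, quantized)])
print(f"mean Bayes risk error: {mbre:.4f}")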
Verification of RoboChart Models with Neural Network Components
Current software engineering frameworks for robotics treat artificial neural network (ANN) components as black boxes, and existing white-box techniques consider either component-level properties or properties involving a specific case study. A method to establish properties that may depend on all components in such a system is, as yet, undefined. Our work consists of defining such a method. First, we developed a component whose behaviour is defined by an ANN and that acts as a robotic controller. Given our application to robotics, we focus on pre-trained ANNs used for control. We define our component in the context of RoboChart, where we define a modelling notation involving a meta-model and well-formedness conditions, and a process-algebraic semantics. To further support our framework, we implemented these semantics in Java and CSPM to enable validation and discretised verification. Given these components, we then developed an approach to verify software systems involving our ANN components. This approach involves replacing existing memoryless, cyclic controller components with ANN components and proving that the new system does not deviate in behaviour by more than a constant ε from the original system. Moreover, we describe a strategy for automating these proofs based on Isabelle and Marabou, combining ANN-specific verification tools with general verification tools. We demonstrate our framework using a case study involving a Segway robot in which we replace a PID controller with an ANN component. Our contributions can be summarised as follows: we have developed a framework that enables the modelling, validation, and verification of robotic software involving neural network components. This work represents progress towards establishing the safety and reliability of autonomous robotics.
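
A minimal sketch of the conformance notion described above: checking that an ANN controller stays within ε of the memoryless controller it replaces, over sampled inputs. The tiny ReLU network, the PD law, and the sampling box are illustrative assumptions; sampling only suggests a candidate ε, whereas the paper's strategy discharges such bounds soundly with an ANN verifier such as Marabou inside Isabelle proofs.

import numpy as np

def pid(err, derr, kp=2.0, kd=0.5):
    """Memoryless PD controller (standing in for the cyclic controller)."""
    return kp * err + kd * derr

def ann(err, derr, W1, b1, w2, b2):
    """A tiny pre-trained ReLU network standing in for the ANN component."""
    h = np.maximum(0.0, W1 @ np.array([err, derr]) + b1)
    return float(w2 @ h + b2)

def max_deviation(W1, b1, w2, b2, n=10_000, box=1.0, seed=0):
    """Empirical sup |ann - pid| over a sampled input box (not a proof)."""
    rng = np.random.default_rng(seed)
    pts = rng.uniform(-box, box, size=(n, 2))
    return max(abs(ann(e, de, W1, b1, w2, b2) - pid(e, de)) for e, de in pts)

# Toy weights that reconstruct the PD law exactly: ReLU(x) - ReLU(-x) = x.
W1 = np.array([[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]])
b1 = np.zeros(4)
w2 = np.array([2.0, -2.0, 0.5, -0.5])   # yields 2*err + 0.5*derr
b2 = 0.0
print(f"candidate epsilon: {max_deviation(W1, b1, w2, b2):.3f}")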
Case Studies for Computing Density of Reachable States for Safe Autonomous Motion Planning
The density of reachable states can help in understanding the risk of
safety-critical systems, especially in situations where worst-case reachability
is too conservative. Recent work provides a data-driven approach to compute the
density distribution of autonomous systems' forward reachable states online. In
this paper, we study the use of such an approach in combination with model
predictive control for verifiable safe path planning under uncertainties. We
first use the learned density distribution to compute the risk of collision
online. If such risk exceeds the acceptable threshold, our method will plan for
a new path around the previous trajectory, with the risk of collision below the
threshold. Our method is well-suited to handle systems with uncertainties and
complicated dynamics as our data-driven approach does not need an analytical
form of the systems' dynamics and can estimate forward state density with an
arbitrary initial distribution of uncertainties. We design two challenging
scenarios (autonomous driving and hovercraft control) for safe motion planning
in environments with obstacles under system uncertainties. We first show that
our density estimation approach can reach accuracy similar to that of the
Monte Carlo-based method while using only 0.01x as many training samples. By
leveraging the estimated risk, our algorithm achieves the highest success rate
in goal reaching when enforcing a safety rate above 0.99.

Comment: NASA Formal Methods 2022
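
A minimal sketch of the risk-checking step described above, assuming the learned density is available as weighted samples of forward-reachable states: the collision risk is the probability mass of samples landing in the obstacle, compared against a threshold before replanning. The sampled representation, axis-aligned obstacle, and threshold are illustrative assumptions, not the paper's learned density model.

import numpy as np

def collision_risk(states, densities, obstacle_lo, obstacle_hi):
    """Estimate P(state in obstacle) from forward-reachable samples.

    states:    (n, d) sampled reachable states at a query time
    densities: (n,) learned density values at those samples, used as
               weights (normalized to sum to 1)
    obstacle:  an axis-aligned box [lo, hi] for simplicity
    """
    w = densities / densities.sum()
    inside = np.all((states >= obstacle_lo) & (states <= obstacle_hi), axis=1)
    return float(w[inside].sum())

def needs_replan(states, densities, obstacle_lo, obstacle_hi, thresh=0.01):
    # Replan whenever the estimated collision risk exceeds the threshold.
    return collision_risk(states, densities, obstacle_lo, obstacle_hi) > thresh

# Toy usage: 2-D positions near the origin, obstacle box off to the right.
rng = np.random.default_rng(0)
states = rng.normal(0.0, 0.7, size=(5000, 2))
densities = np.exp(-0.5 * (states ** 2).sum(axis=1) / 0.7 ** 2)
print(needs_replan(states, densities,
                   np.array([1.0, -0.5]), np.array([2.0, 0.5])))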