
    Multi-party Poisoning through Generalized $p$-Tampering

    In a poisoning attack against a learning algorithm, an adversary tampers with a fraction of the training data $T$ with the goal of increasing the classification error of the constructed hypothesis/model over the final test distribution. In the distributed setting, $T$ might be gathered gradually from $m$ data providers $P_1,\dots,P_m$ who generate and submit their shares of $T$ in an online way. In this work, we initiate a formal study of $(k,p)$-poisoning attacks in which an adversary controls $k\in[n]$ of the parties, and even for each corrupted party $P_i$, the adversary submits some poisoned data $T'_i$ on behalf of $P_i$ that is still "$(1-p)$-close" to the correct data $T_i$ (e.g., $1-p$ fraction of $T'_i$ is still honestly generated). For $k=m$, this model becomes the traditional notion of poisoning, and for $p=1$ it coincides with the standard notion of corruption in multi-party computation. We prove that if there is an initial constant error for the generated hypothesis $h$, there is always a $(k,p)$-poisoning attacker who can decrease the confidence of $h$ (to have a small error), or alternatively increase the error of $h$, by $\Omega(p \cdot k/m)$. Our attacks can be implemented in polynomial time given samples from the correct data, and they use no wrong labels if the original distributions are not noisy. At a technical level, we prove a general lemma about biasing bounded functions $f(x_1,\dots,x_n)\in[0,1]$ through an attack model in which each block $x_i$ might be controlled by an adversary with marginal probability $p$ in an online way. When the probabilities are independent, this coincides with the model of $p$-tampering attacks, thus we call our model generalized $p$-tampering. We prove the power of such attacks by incorporating ideas from the context of coin-flipping attacks into the $p$-tampering model and generalize the results in both of these areas.
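
To make the independent $p$-tampering model concrete, the following Python sketch computes exactly how far an online adversary can raise the expected value of a bounded function over uniform bits when each bit is, independently with probability p, chosen by the adversary instead of sampled honestly. It is an illustrative brute-force recursion over all prefixes (exponential in n), not the polynomial-time attack described in the abstract; the names `tampered_expectation`, `majority`, and `average` are hypothetical and do not come from the paper.

```python
from functools import lru_cache


def tampered_expectation(f, n, p):
    """Expected value of f over n bits under an optimal online p-tampering attack.

    Assumption (illustrative model): each bit is uniform when honest, and with
    independent probability p the adversary picks it after seeing the prefix.
    """
    @lru_cache(maxsize=None)
    def value(prefix):
        if len(prefix) == n:
            return float(f(prefix))
        v0 = value(prefix + (0,))
        v1 = value(prefix + (1,))
        honest = 0.5 * (v0 + v1)   # the block is sampled honestly
        tampered = max(v0, v1)     # the adversary picks the better continuation
        return (1.0 - p) * honest + p * tampered

    return value(())


def majority(bits):
    """1 if strictly more than half of the bits are 1, else 0."""
    return 1 if 2 * sum(bits) > len(bits) else 0


def average(bits):
    """Fraction of bits equal to 1, a bounded function with values in [0, 1]."""
    return sum(bits) / len(bits)


if __name__ == "__main__":
    for p in (0.0, 0.1, 0.2):
        print(p, tampered_expectation(majority, 9, p), tampered_expectation(average, 9, p))
```

For the averaging function the recursion gives exactly 1/2 + p/2, since every tampered bit is simply set to 1. This is a bias that grows linearly in p, in the spirit of the lemma stated in the abstract.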

    Sharp analysis of low-rank kernel matrix approximations

    We consider supervised learning problems within the positive-definite kernel framework, such as kernel ridge regression, kernel logistic regression or the support vector machine. With kernels leading to infinite-dimensional feature spaces, a common practical limiting difficulty is the necessity of computing the kernel matrix, which most frequently leads to algorithms with running time at least quadratic in the number of observations n, i.e., O(n^2). Low-rank approximations of the kernel matrix are often considered as they allow the reduction of running time complexities to O(p^2 n), where p is the rank of the approximation. The practicality of such methods thus depends on the required rank p. In this paper, we show that in the context of kernel ridge regression, for approximations based on a random subset of columns of the original kernel matrix, the rank p may be chosen to be linear in the degrees of freedom associated with the problem, a quantity which is classically used in the statistical analysis of such methods, and is often seen as the implicit number of parameters of non-parametric estimators. This result enables simple algorithms that have sub-quadratic running time complexity, but provably exhibit the same predictive performance as existing algorithms, for any given problem instance, and not only for worst-case situations.
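
The column-sampling idea can be sketched in plain Python as follows. This is a minimal illustration, assuming a one-dimensional Gaussian kernel and the convention that exact kernel ridge regression solves (K + n·λ·I)α = y; the function names, the bandwidth, and the regularization value are hypothetical choices, not taken from the paper. With C the n × p block of sampled kernel columns and W the p × p block among the sampled points, the sketch solves (CᵀC + n·λ·W)β = Cᵀy, which yields the same fitted values as kernel ridge regression run on the low-rank kernel C W⁻¹ Cᵀ, and forming CᵀC costs O(p² n).

```python
import math
import random


def gaussian_kernel(x, z, bandwidth=1.0):
    return math.exp(-((x - z) ** 2) / (2.0 * bandwidth ** 2))


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def nystrom_krr_fit(xs, ys, landmarks, lam, kernel=gaussian_kernel):
    """Fit coefficients on the sampled columns (indices in `landmarks`)."""
    n, p = len(xs), len(landmarks)
    C = [[kernel(x, xs[j]) for j in landmarks] for x in xs]             # n x p
    W = [[kernel(xs[i], xs[j]) for j in landmarks] for i in landmarks]  # p x p
    # Normal equations of the reduced problem: (C^T C + n*lam*W) beta = C^T y.
    A = [[sum(C[t][a] * C[t][b] for t in range(n)) + n * lam * W[a][b]
          for b in range(p)] for a in range(p)]
    rhs = [sum(C[t][a] * ys[t] for t in range(n)) for a in range(p)]
    return solve(A, rhs)


def nystrom_krr_predict(x, xs, landmarks, coef, kernel=gaussian_kernel):
    return sum(c * kernel(x, xs[j]) for c, j in zip(coef, landmarks))


if __name__ == "__main__":
    rng = random.Random(0)
    xs = [0.25 * i for i in range(30)]
    ys = [math.sin(x) + 0.1 * rng.gauss(0.0, 1.0) for x in xs]
    landmarks = rng.sample(range(len(xs)), 6)  # uniformly sampled columns
    coef = nystrom_krr_fit(xs, ys, landmarks, lam=1e-3)
    print(nystrom_krr_predict(2.0, xs, landmarks, coef))
```

When every column is sampled (p = n) and the kernel matrix is invertible, the coefficients coincide with those of exact kernel ridge regression, which gives a simple sanity check for the sketch.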

    Stochastic Nonlinear Model Predictive Control with Efficient Sample Approximation of Chance Constraints

    This paper presents a stochastic model predictive control approach for nonlinear systems subject to time-invariant probabilistic uncertainties in model parameters and initial conditions. The stochastic optimal control problem entails a cost function in terms of expected values and higher moments of the states, and chance constraints that ensure probabilistic constraint satisfaction. The generalized polynomial chaos framework is used to propagate the time-invariant stochastic uncertainties through the nonlinear system dynamics, and to efficiently sample from the probability densities of the states to approximate the satisfaction probability of the chance constraints. To increase computational efficiency by avoiding excessive sampling, a statistical analysis is proposed to systematically determine a priori the least conservative constraint tightening required at a given sample size to guarantee a desired feasibility probability of the sample-approximated chance-constrained optimization problem. In addition, a method is presented for sample-based approximation of the analytic gradients of the chance constraints, which increases the optimization efficiency significantly. The proposed stochastic nonlinear model predictive control approach is applicable to a broad class of nonlinear systems with the sufficient condition that each term is analytic with respect to the states, and separable with respect to the inputs, states and parameters. The closed-loop performance of the proposed approach is evaluated using the Williams-Otto reactor with seven states, and ten uncertain parameters and initial conditions. The results demonstrate the efficiency of the approach for real-time stochastic model predictive control and its capability to systematically account for probabilistic uncertainties, in contrast to nonlinear model predictive control approaches. Comment: Submitted to Journal of Process Control

    Learning to Discover Sparse Graphical Models

    We consider structure discovery of undirected graphical models from observational data. Inferring likely structures from few examples is a complex task often requiring the formulation of priors and sophisticated inference procedures. Popular methods rely on estimating a penalized maximum likelihood of the precision matrix. However, in these approaches structure recovery is an indirect consequence of the data-fit term, the penalty can be difficult to adapt for domain-specific knowledge, and the inference is computationally demanding. By contrast, it may be easier to generate training samples of data that arise from graphs with the desired structure properties. We propose here to leverage this latter source of information as training data to learn a function, parametrized by a neural network, that maps empirical covariance matrices to estimated graph structures. Learning this function brings two benefits: it implicitly models the desired structure or sparsity properties to form suitable priors, and it can be tailored to the specific problem of edge structure discovery, rather than maximizing data likelihood. Applying this framework, we find that our learnable graph-discovery method trained on synthetic data generalizes well, identifying relevant edges in both synthetic and real data that were completely unknown at training time. We find that on genetics, brain imaging, and simulation data we obtain performance generally superior to analytical methods.

    Forecasting Time Series with VARMA Recursions on Graphs

    Graph-based techniques have emerged as a choice to deal with the dimensionality issues in modeling multivariate time series. However, there is as yet no complete understanding of how the underlying structure could be exploited to ease this task. This work provides contributions in this direction by considering the forecasting of a process evolving over a graph. We make use of the (approximate) time-vertex stationarity assumption, i.e., time-varying graph signals whose first and second order statistical moments are invariant over time and correlated to a known graph topology. The latter is combined with VAR and VARMA models to tackle the dimensionality issues present in predicting the temporal evolution of multivariate time series. We find that by projecting the data to the graph spectral domain: (i) the multivariate model estimation reduces to that of fitting a number of uncorrelated univariate ARMA models and (ii) an optimal low-rank data representation can be exploited so as to further reduce the estimation costs. In the case that the multivariate process can be observed at a subset of nodes, the proposed models extend naturally to Kalman filtering on graphs, allowing for optimal tracking. Numerical experiments with both synthetic and real data validate the proposed approach and highlight its benefits over state-of-the-art alternatives. Comment: Submitted to the IEEE Transactions on Signal Processing