
    Response Spectrum Estimation Using Support Vector Machines

    This study investigates the applicability and efficiency of support vector machines for the problem of estimating earthquake response spectra from the Fourier amplitude spectra of the ground motion acceleration. Two methods are commonly used for this purpose: time domain simulations and random vibration theory. Time domain simulations offer high accuracy at high computational cost, while random vibration theory, although not computationally intensive, requires knowledge of the statistical distribution of the response amplitudes. This study treats the estimation of response spectra from Fourier spectra as a nonlinear regression problem and constructs a supervised machine learning algorithm with minimal sensitivity to noise and outliers. In this method, pairs of vectors consisting of Fourier amplitude spectra and pseudo-velocity response spectra are transformed into a high-dimensional feature space where the nonlinear relationship between them can be represented as a linear one. No assumptions regarding the probability density function of the response amplitudes are required. A practical application is presented using artificially generated accelerograms, and it is shown that support vector machines can predict the response spectra over a wide range of vibration periods.
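    The abstract does not spell out the feature encoding, kernel, or hyperparameters, so the following is only a minimal sketch of the regression task it describes (Fourier amplitude spectra in, response-spectrum ordinates out), using scikit-learn's epsilon-insensitive SVR on synthetic stand-in data; every name and setting here is an illustrative assumption.

    ```python
    # Minimal sketch of the FAS -> response-spectrum regression described above.
    # The synthetic data, RBF kernel, and hyperparameters are illustrative
    # assumptions; the paper's exact setup is not given in the abstract.
    import numpy as np
    from sklearn.multioutput import MultiOutputRegressor
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import SVR

    rng = np.random.default_rng(0)
    n, n_freqs, n_periods = 500, 64, 32
    X = rng.normal(size=(n, n_freqs))          # stand-in for log Fourier amplitude spectra
    W = rng.normal(size=(n_freqs, n_periods))
    Y = np.tanh(X @ W) + 0.05 * rng.normal(size=(n, n_periods))  # stand-in for log PSV spectra

    # One epsilon-insensitive SVR per vibration period; the epsilon tube is what
    # gives the low sensitivity to noise and outliers mentioned in the abstract.
    model = make_pipeline(
        StandardScaler(),
        MultiOutputRegressor(SVR(kernel="rbf", C=10.0, epsilon=0.01)),
    )
    model.fit(X[:400], Y[:400])
    psv_pred = model.predict(X[400:])          # predicted spectra for held-out records
    ```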

    From robust tests to Bayes-like posterior distributions

    In the Bayes paradigm and for a given loss function, we propose the construction of a new type of posterior distribution, extending the classical Bayes one, for estimating the law of an $n$-sample. The loss functions we have in mind are based on the total variation and Hellinger distances as well as some $\mathbb{L}_{j}$ ones. We prove that, with probability close to one, this new posterior distribution concentrates its mass in a neighbourhood of the law of the data, for the chosen loss function, provided that this law belongs to the support of the prior or at least lies close enough to it. We therefore establish that the new posterior distribution enjoys some robustness properties with respect to a possible misspecification of the prior or, more precisely, of its support. For the total variation and squared Hellinger losses, we also show that the posterior distribution keeps its concentration properties when the data are only independent, hence not necessarily i.i.d., provided that most of their marginals, or the average of these, are close enough to some probability distribution around which the prior puts enough mass. The posterior distribution is therefore also stable with respect to the equidistribution assumption. We illustrate these results with several applications. We consider the problems of estimating a location parameter, or both the location and the scale of a density, in a nonparametric framework. Finally, we also tackle the problem of estimating a density, with the squared Hellinger loss, in a high-dimensional parametric model under sparsity conditions. The results established in this paper are non-asymptotic and provide, as much as possible, explicit constants.
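    The construction itself (built from robust tests) is not given in the abstract, so the sketch below is not the paper's estimator; it only illustrates the broader idea of a Bayes-like posterior driven by a loss function rather than a likelihood, here a generic Gibbs-type pseudo-posterior for a location parameter with a robust L1 loss on an outlier-contaminated sample.

    ```python
    # Generic loss-based (Gibbs-type) pseudo-posterior on a grid:
    # pi_lambda(theta | x) proportional to pi(theta) * exp(-lambda * loss_n(theta)).
    # This is an illustration of the loss-based idea, not the paper's construction.
    import numpy as np

    rng = np.random.default_rng(1)
    x = np.concatenate([rng.standard_normal(190) + 0.7,   # i.i.d. sample, location 0.7
                        rng.standard_normal(10) + 8.0])   # a few gross outliers

    grid = np.linspace(-2.0, 3.0, 1001)                   # candidate locations
    loss = np.array([np.mean(np.abs(x - t)) for t in grid])  # robust L1 empirical loss
    log_prior = -0.5 * grid**2                            # standard normal prior, up to a constant
    lam = float(len(x))                                   # temperature: a tuning choice

    log_post = log_prior - lam * loss
    post = np.exp(log_post - log_post.max())
    post /= post.sum() * (grid[1] - grid[0])              # normalize on the grid
    print("pseudo-posterior mode:", grid[np.argmax(post)])
    ```

    Despite the contamination, the L1 loss keeps the pseudo-posterior mass near the true location, a toy analogue of the robustness properties the abstract establishes for its loss-based posteriors.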

    Estimating differential entropy using recursive copula splitting

    A method for estimating the Shannon differential entropy of multidimensional random variables using independent samples is described. The method is based on decomposing the distribution into a product of the marginal distributions and the joint dependency, also known as the copula. The entropy of the marginals is estimated using one-dimensional methods. The entropy of the copula, which always has compact support, is estimated recursively by splitting the data along statistically dependent dimensions. Numerical examples demonstrate that the method is accurate for distributions with compact and non-compact supports, which is imperative when the support is not known or is of mixed type (in different dimensions). In high dimensions (larger than 20), our method is not only more accurate but also significantly more efficient than existing approaches.
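    The paper's exact splitting rule, stopping criteria, and bias corrections are not given in the abstract; the sketch below only illustrates the decomposition it describes, H(X) = sum of marginal entropies + copula entropy, with a crude m-spacing marginal estimator, a pairwise Spearman-correlation dependence test, and a median split along the most dependent dimension, all of which are stand-in assumptions.

    ```python
    # Simplified sketch of entropy estimation by recursive copula splitting.
    # Decomposition: H(X) = sum_i H(X_i) + H(copula); the copula entropy is
    # estimated by splitting along a dependent dimension and recursing.
    import numpy as np
    from scipy.stats import spearmanr

    def entropy_1d(x):
        """m-spacing (Vasicek-style) differential entropy estimate, 1-D sample."""
        x = np.sort(x)
        n = len(x)
        m = max(1, int(round(np.sqrt(n))))
        spac = np.maximum(x[m:] - x[:-m], 1e-12)
        return float(np.mean(np.log(n * spac / m)))

    def to_copula(x):
        """Rank-transform each column to (0, 1): an empirical copula sample."""
        return (np.argsort(np.argsort(x, axis=0), axis=0) + 0.5) / len(x)

    def copula_entropy(u, min_n=100, tol=0.1):
        """Recursive copula-splitting estimate of the copula's entropy."""
        n, d = u.shape
        if n < min_n:
            return 0.0                     # too few points: treat as uniform
        rho = np.array([[abs(spearmanr(u[:, i], u[:, j])[0]) for j in range(d)]
                        for i in range(d)])
        np.fill_diagonal(rho, 0.0)
        if rho.max() < tol:
            return 0.0                     # ~independent: copula ~ uniform, H = 0
        j = int(rho.max(axis=1).argmax())  # most dependent dimension
        h = 0.0
        for lo, hi in ((0.0, 0.5), (0.5, 1.0)):
            half = u[(u[:, j] >= lo) & (u[:, j] < hi)].copy()
            half[:, j] = (half[:, j] - lo) * 2.0   # rescale split axis to [0, 1]
            # Each half contributes its marginal entropies plus a recursive term.
            h += 0.5 * (sum(entropy_1d(half[:, k]) for k in range(d))
                        + copula_entropy(to_copula(half), min_n, tol))
        return h

    def entropy(x):
        """H(X) = sum of marginal entropies + copula entropy."""
        return sum(entropy_1d(x[:, k]) for k in range(x.shape[1])) \
            + copula_entropy(to_copula(x))
    ```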

    On Convergence of Epanechnikov Mean Shift

    Epanechnikov Mean Shift is a simple yet empirically very effective algorithm for clustering. It localizes the centroids of data clusters by estimating the modes of the probability distribution that generates the data points, using the 'optimal' Epanechnikov kernel density estimator. However, since the procedure involves non-smooth kernel density functions, the convergence behavior of Epanechnikov Mean Shift lacks theoretical support as of this writing: most existing analyses are based on smooth functions and thus cannot be applied to Epanechnikov Mean Shift. In this work, we first show that the original Epanechnikov Mean Shift may indeed terminate at a non-critical point, due to this non-smoothness. Based on our analysis, we propose a simple remedy. The modified Epanechnikov Mean Shift is guaranteed to terminate, within a finite number of iterations, at a local maximum of the estimated density, which corresponds to a cluster centroid. We also propose a way to avoid running the Mean Shift iterates from every data point while maintaining good clustering accuracy under non-overlapping spherical Gaussian mixture models. This further enables Epanechnikov Mean Shift to handle very large and high-dimensional data sets. Experiments show surprisingly good performance compared to Lloyd's K-means algorithm and the EM algorithm.
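    The paper's modified iterate and its termination argument are not reproduced in the abstract; as a point of reference, here is a minimal sketch of the standard Epanechnikov mean-shift update, which, because the kernel's shadow is the flat (uniform) kernel, reduces to averaging the sample points inside the bandwidth window.

    ```python
    # Minimal sketch of the standard Epanechnikov mean-shift iterate (not the
    # paper's modified variant). With the Epanechnikov kernel, each update is
    # the mean of the sample points inside the bandwidth-h window around x.
    import numpy as np

    def epanechnikov_mean_shift(X, x0, h, max_iter=200, tol=1e-8):
        x = np.asarray(x0, dtype=float)
        for _ in range(max_iter):
            inside = np.sum((X - x) ** 2, axis=1) < h * h  # points in the window
            if not inside.any():
                break                                      # empty window: stop
            x_new = X[inside].mean(axis=0)
            if np.linalg.norm(x_new - x) < tol:            # window mean is a fixed point
                break
            x = x_new
        return x

    # Toy usage: two spherical Gaussian blobs; iterates started from a subsample
    # of the data should collapse onto the two cluster centroids (modes).
    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(0.0, 0.3, (200, 2)), rng.normal(3.0, 0.3, (200, 2))])
    modes = np.array([epanechnikov_mean_shift(X, x, h=1.0) for x in X[::40]])
    ```

    The non-smoothness issue the paper analyzes shows up exactly at the window boundary: points entering or leaving the ball can stall the iterate at a non-critical point, which is what the modified algorithm is designed to rule out.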