
    On existence, uniqueness and scalability of adversarial robustness measures for AI classifiers

    Simply-verifiable mathematical conditions for the existence, uniqueness and explicit analytical computation of minimal adversarial paths (MAP) and minimal adversarial distances (MAD) are formulated and proven for (locally) uniquely-invertible classifiers, for generalized linear models (GLM), and for entropic AI (EAI). Practical computation of MAP and MAD, their comparison, and their interpretation for various classes of AI tools (neural networks, boosted random forests, GLM and EAI) are demonstrated on common synthetic benchmarks (a double Swiss roll spiral and its extensions) as well as on two biomedical data problems (health insurance claim prediction and heart attack lethality classification). For the biomedical applications, it is demonstrated how MAP provides unique, minimal, patient-specific risk-mitigating interventions within predefined subsets of accessible control variables.
    Comment: 16 pages, 3 figures
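
    As a hedged illustration of the GLM case only (not of the paper's EAI formulation), the sketch below uses the closed-form result for a linear decision function w·x + b: the minimal adversarial distance is |w·x + b| / ||w||, and the minimal adversarial path ends at the orthogonal projection of x onto the decision boundary. All numbers and names are illustrative assumptions.

    ```python
    import numpy as np

    # Minimal sketch (not the paper's implementation): for a linear, GLM-type
    # classifier with decision boundary w.x + b = 0, the minimal adversarial
    # distance (MAD) in the Euclidean norm is |w.x + b| / ||w||, and the
    # minimal adversarial path (MAP) ends at the orthogonal projection of x
    # onto that boundary.

    def mad_and_map_linear(w, b, x):
        """Return (MAD, boundary point) for a linear decision function w.x + b."""
        w = np.asarray(w, dtype=float)
        x = np.asarray(x, dtype=float)
        margin = w @ x + b                      # signed distance times ||w||
        mad = abs(margin) / np.linalg.norm(w)   # minimal adversarial distance
        x_boundary = x - margin / (w @ w) * w   # orthogonal projection onto boundary
        return mad, x_boundary

    if __name__ == "__main__":
        w, b = np.array([2.0, -1.0]), 0.5       # illustrative classifier
        x = np.array([1.0, 3.0])                # illustrative data point
        mad, x_adv = mad_and_map_linear(w, b, x)
        print(f"MAD = {mad:.4f}, MAP endpoint = {x_adv}")
    ```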

    Non-stationary data-driven computational portfolio theory and algorithms

    The aim of the dissertation is the development of a data-driven portfolio optimization framework beyond standard assumptions. Investment decisions are either based on the opinion of a human expert, who evaluates information about companies, or on statistical models. The most prominent statistical methods are the Markowitz portfolio model and utility maximization. All statistical methods assume certain knowledge of the underlying distribution of the returns, either by imposing Gaussianity, by expecting complete knowledge of the distribution, or by inferring sufficiently good estimators of its parameters. Yet in practice, all methods suffer from incomplete knowledge, small sample sizes and parameters that may vary over time. A new, model-free approach to the portfolio optimization problem allowing for time-varying dynamics in the price processes is presented. The methods proposed in this work are designed to solve the problem with fewer a priori assumptions than standard methods, such as assumptions on the distribution of the price processes or on time-invariant statistical properties. The new approach introduces two new parameters and a method to choose them based on principles of information theory. An analysis of different approaches to incorporating additional information is performed before a straightforward approach to the out-of-sample application is introduced. The structure of the numerical problem is obtained directly from the problem of portfolio optimization, resulting in a system of objective function and constraints known from non-stationary time series analysis. The incorporation of transaction costs naturally yields a regularization that is normally included only for numerical reasons. The applicability and the numerical feasibility of the method are demonstrated on a low-dimensional example in-sample and on a high-dimensional example in- and out-of-sample in an environment with mixed transaction costs. The performance in both examples is measured and compared to standard methods, such as the Markowitz approach, and to methods based on techniques for analysing non-stationary data, like Hidden Markov Models.
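
    For orientation only, the sketch below implements the standard Markowitz-style baseline that the dissertation compares against, with a proportional transaction-cost term acting as an L1 regularizer relative to the previous allocation. The synthetic returns and the risk-aversion and cost parameters are assumptions for illustration; this is not the dissertation's non-stationary method.

    ```python
    import numpy as np
    from scipy.optimize import minimize

    # Minimal sketch of a mean-variance (Markowitz-type) baseline with a
    # proportional transaction-cost penalty. All data and parameters below
    # are synthetic and purely illustrative.

    rng = np.random.default_rng(0)
    T, n = 250, 5                                  # sample size, number of assets
    returns = rng.normal(0.0005, 0.01, size=(T, n))

    mu = returns.mean(axis=0)                      # estimated mean returns
    cov = np.cov(returns, rowvar=False)            # estimated covariance
    w_prev = np.full(n, 1.0 / n)                   # current (pre-trade) portfolio
    risk_aversion, cost_rate = 5.0, 0.002          # illustrative tuning parameters

    def objective(w):
        variance = w @ cov @ w
        turnover_cost = cost_rate * np.abs(w - w_prev).sum()  # L1-type regularizer
        return -mu @ w + risk_aversion * variance + turnover_cost

    constraints = [{"type": "eq", "fun": lambda w: w.sum() - 1.0}]  # fully invested
    bounds = [(0.0, 1.0)] * n                                       # long-only

    result = minimize(objective, w_prev, bounds=bounds, constraints=constraints)
    print("optimal weights:", np.round(result.x, 3))
    ```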

    Data-based analysis of extreme events: inference, numerics and applications

    The concept of extreme events describes the above-average behavior of a process, for instance, heat waves in climate or weather research, earthquakes in geology, and financial crashes in economics. Studying the behavior of extremes is important in order to reduce their negative impacts. Key objectives include the identification of the appropriate mathematical/statistical model, the description of the underlying dependence structure in the multivariate or spatial case, and the investigation of the most relevant external factors. Extreme value analysis (EVA), based on Extreme Value Theory, provides the necessary statistical tools. Assuming that all relevant covariates are known and observed, EVA often deploys statistical regression analysis to study changes in the model parameters. Modeling of the dependence structure implies a priori assumptions such as Gaussian, locally stationary or isotropic behavior. Based on EVA and advanced time-series analysis methodology, this thesis introduces a semiparametric, nonstationary and non-homogeneous framework for statistical regression analysis of spatio-temporal extremes. The involved regression analysis accounts explicitly for systematically missing covariates; their influence is reduced to an additive nonstationary offset. The nonstationarity is resolved by the Finite Element Time Series Analysis Methodology (FEM). FEM approximates the underlying nonstationarity by a set of locally stationary models and a nonstationary hidden switching process with bounded variation (BV). The resulting FEM-BV-EVA approach goes beyond the a priori assumptions of standard methods based, for instance, on Bayesian statistics, Hidden Markov Models or Local Kernel Smoothing. The multivariate/spatial extension of FEM-BV-EVA describes the underlying spatial variability through the model parameters, in the spirit of hierarchical modeling. The spatio-temporal behavior of the model parameters is approximated by locally stationary models and a spatially nonstationary switching process. Further, it is shown that the resulting spatial FEM-BV-EVA formulation is consistent with the max-stability postulate and describes the underlying dependence structure in a nonparametric way. The proposed FEM-BV-EVA methodology was integrated into the existing FEM MATLAB toolbox. The FEM-BV-EVA framework is computationally efficient, as it deploys gradient-free MCMC-based optimization methods and numerical solvers for constrained, large, structured quadratic and linear problems. In order to demonstrate its performance, FEM-BV-EVA was applied to various test cases and real data and compared to standard methods. It was shown that parametric approaches lead to biased results if significant covariates are unresolved. Comparison to nonparametric methods based on smoothing regression revealed their weaknesses: the locality property and the inability to resolve discontinuous functions. Spatial FEM-BV-EVA was applied to study the dynamics of extreme precipitation over Switzerland. The analysis identified, among other findings, three major spatially dependent regions.
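
    As a point of reference, the sketch below shows only the classical stationary EVA baseline (a GEV fit to block maxima and a return-level estimate); the synthetic data and parameter values are assumptions. FEM-BV-EVA goes beyond this by letting the model parameters switch between locally stationary regimes and by accounting for systematically missing covariates.

    ```python
    import numpy as np
    from scipy.stats import genextreme

    # Minimal sketch of a stationary GEV fit to annual block maxima on
    # synthetic data. This is the classical EVA baseline, not FEM-BV-EVA.

    rng = np.random.default_rng(1)
    daily = rng.gumbel(loc=20.0, scale=5.0, size=(50, 365))  # 50 "years" of daily values
    block_maxima = daily.max(axis=1)                         # one maximum per year

    # Fit the GEV distribution (scipy's shape parameter c corresponds to -xi).
    c, loc, scale = genextreme.fit(block_maxima)

    # 100-year return level: the value exceeded once per 100 blocks on average.
    return_level_100 = genextreme.ppf(1.0 - 1.0 / 100.0, c, loc=loc, scale=scale)
    print(f"shape={c:.3f}, loc={loc:.2f}, scale={scale:.2f}, "
          f"100-year return level={return_level_100:.2f}")
    ```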

    Linearly-scalable learning of smooth low-dimensional patterns with permutation-aided entropic dimension reduction

    In many data science applications, the objective is to extract appropriately-ordered smooth low-dimensional data patterns from high-dimensional data sets. This is challenging since common sorting algorithms primarily aim at finding monotonic orderings in low-dimensional data, whereas typical dimension reduction and feature extraction algorithms are not primarily designed for extracting smooth low-dimensional data patterns. We show that when selecting the Euclidean smoothness as a pattern quality criterion, both of these problems (finding the optimal 'crisp' data permutation and extracting the sparse set of permuted low-dimensional smooth patterns) can be efficiently solved numerically as one unsupervised entropy-regularized iterative optimization problem. We formulate and prove the conditions for monotonicity and convergence of this linearly-scalable (in dimension) numerical procedure, with an iteration cost scaling of $\mathcal{O}(DT^2)$, where $T$ is the size of the data statistics and $D$ is the feature space dimension. The efficacy of the proposed method is demonstrated through the examination of synthetic examples as well as a real-world application involving the identification of smooth bankruptcy-risk-minimizing transition patterns from high-dimensional economic data. The results show that the measured overall time complexity of the method exhibits linear scaling in the dimensionality $D$ within the specified confidence intervals.
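
    A minimal sketch of the smoothness criterion is given below, under the assumption of a one-dimensional pattern and the sum of squared consecutive differences as the Euclidean (non-)smoothness measure; the greedy reordering is only a crude stand-in for the paper's entropy-regularized optimization over 'crisp' permutations.

    ```python
    import numpy as np

    # Minimal sketch: quantify pattern smoothness by the sum of squared
    # consecutive differences, and reorder a scrambled 1-D pattern with a
    # greedy nearest-neighbour heuristic (illustrative only).

    def roughness(x):
        """Sum of squared consecutive differences; smaller means smoother."""
        return float(np.sum(np.diff(x) ** 2))

    def greedy_smooth_order(x):
        """Order points so that each next value is the closest remaining one."""
        remaining = list(range(len(x)))
        order = [remaining.pop(0)]
        while remaining:
            last = x[order[-1]]
            nxt = min(remaining, key=lambda i: abs(x[i] - last))
            remaining.remove(nxt)
            order.append(nxt)
        return np.array(order)

    rng = np.random.default_rng(2)
    pattern = np.sin(np.linspace(0, 2 * np.pi, 40))   # smooth underlying pattern
    shuffled = rng.permutation(pattern)               # observed in scrambled order

    order = greedy_smooth_order(shuffled)
    print("roughness shuffled :", round(roughness(shuffled), 3))
    print("roughness reordered:", round(roughness(shuffled[order]), 3))
    ```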

    A scalable approach to the computation of invariant measures for high-dimensional Markovian systems

    The Markovian invariant measure is a central concept in many disciplines. Conventional numerical techniques for data-driven computation of invariant measures rely on the estimation and further numerical processing of a transition matrix. Here we show how the quality of data-driven estimation of a transition matrix crucially depends on the validity of the statistical independence assumption for transition probabilities. Moreover, the cost of the invariant measure computation in general scales cubically with the dimension and is usually infeasible for realistic high-dimensional systems. We introduce a method that relaxes the independence assumption of transition probabilities and scales quadratically in situations with latent variables. Applications of the method are illustrated on the Lorenz-63 system and on molecular dynamics (MD) simulation data of the α-synuclein protein. We demonstrate that the conventional methodologies do not provide good estimates of the invariant measure based upon the available α-synuclein MD data. Applying the introduced approach to these MD data, we detect two robust meta-stable states of α-synuclein and a linear transition between them, involving transient formation of secondary structure, qualitatively consistent with previous purely experimental reports.
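
    The sketch below reproduces only the conventional pipeline that the abstract critiques, under the assumption of a synthetic, already-discretized trajectory: count transitions, normalize rows, and read off the invariant measure as the leading left eigenvector of the transition matrix.

    ```python
    import numpy as np

    # Minimal sketch of the conventional transition-matrix pipeline on a
    # synthetic 3-state Markov chain (illustrative only; not the paper's
    # relaxed-independence method).

    def estimate_transition_matrix(states, n_states):
        counts = np.zeros((n_states, n_states))
        for a, b in zip(states[:-1], states[1:]):
            counts[a, b] += 1.0
        row_sums = counts.sum(axis=1, keepdims=True)
        return counts / np.where(row_sums == 0, 1.0, row_sums)

    def invariant_measure(P):
        """Left eigenvector of P for eigenvalue 1, normalized to a probability."""
        eigvals, eigvecs = np.linalg.eig(P.T)
        pi = np.real(eigvecs[:, np.argmin(np.abs(eigvals - 1.0))])
        pi = np.abs(pi)
        return pi / pi.sum()

    rng = np.random.default_rng(3)
    P_true = np.array([[0.80, 0.15, 0.05],
                       [0.10, 0.80, 0.10],
                       [0.05, 0.15, 0.80]])
    states = [0]
    for _ in range(20000):
        states.append(rng.choice(3, p=P_true[states[-1]]))

    P_hat = estimate_transition_matrix(states, 3)
    print("estimated invariant measure:", np.round(invariant_measure(P_hat), 3))
    ```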

    Gauge-optimal approximate learning for small data classification problems

    Small data learning problems are characterized by a significant discrepancy between the limited number of response variable observations and the large feature space dimension. In this setting, common learning tools struggle to separate the features important for the classification task from those that bear no relevant information, and cannot derive an appropriate learning rule that allows discrimination between the different classes. As a potential solution to this problem, here we exploit the idea of reducing and rotating the feature space in a lower-dimensional gauge and propose the Gauge-Optimal Approximate Learning (GOAL) algorithm, which provides an analytically tractable joint solution to the dimension reduction, feature segmentation and classification problems for small data learning problems. We prove that the optimal solution of the GOAL algorithm consists of piecewise-linear functions in the Euclidean space, and that it can be approximated through a monotonically convergent algorithm that presents, under the assumption of a discrete segmentation of the feature space, a closed-form solution for each optimization substep and an overall linear iteration cost scaling. The GOAL algorithm has been compared to other state-of-the-art machine learning (ML) tools on both synthetic data and challenging real-world applications from climate science and bioinformatics (i.e., prediction of the El Niño Southern Oscillation and inference of epigenetically-induced gene-activity networks from limited experimental data). The experimental results show that the proposed algorithm outperforms the reported best competitors for these problems in both learning performance and computational cost.
    Comment: 47 pages, 4 figures
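
    To make the small-data setting concrete, the sketch below chains a PCA rotation (a stand-in for the learned lower-dimensional gauge) with a nearest-centroid rule on synthetic data; it is an assumption-laden baseline, not the GOAL algorithm, which solves the reduction, segmentation and classification problems jointly.

    ```python
    import numpy as np

    # Minimal sketch: few samples in a large feature space, a PCA rotation
    # into a low-dimensional "gauge", then a nearest-centroid classifier.
    # All data and dimensions are synthetic and illustrative.

    rng = np.random.default_rng(4)
    n_per_class, n_features, n_components = 15, 200, 2   # few samples, many features

    # Two classes that differ only along a single hidden direction.
    direction = rng.normal(size=n_features)
    X0 = rng.normal(size=(n_per_class, n_features))
    X1 = rng.normal(size=(n_per_class, n_features)) + 0.8 * direction
    X = np.vstack([X0, X1])
    y = np.array([0] * n_per_class + [1] * n_per_class)

    # "Gauge" step: rotate into the leading principal components.
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    Z = Xc @ Vt[:n_components].T

    # Classification step: nearest centroid in the reduced space.
    centroids = np.array([Z[y == c].mean(axis=0) for c in (0, 1)])
    pred = np.argmin(((Z[:, None, :] - centroids[None]) ** 2).sum(-1), axis=1)
    print("training accuracy of the chained baseline:", (pred == y).mean())
    ```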

    Low-cost probabilistic 3D denoising with applications for ultra-low-radiation computed tomography

    We propose a pipeline for the synthetic generation of personalized Computed Tomography (CT) images, with a radiation exposure evaluation and a lifetime attributable risk (LAR) assessment. We perform a patient-specific performance evaluation for a broad range of denoising algorithms (including the most popular deep learning denoising approaches, wavelet-based methods, methods based on Mumford-Shah denoising, etc.), focusing both on assessing the capability to reduce the patient-specific CT-induced LAR and on computational cost scalability. We introduce a parallel Probabilistic Mumford-Shah denoising model (PMS) and show that it markedly outperforms the compared common denoising methods in denoising quality and cost scaling. In particular, we show that it allows an approximately 22-fold robust patient-specific LAR reduction for infants and a 10-fold LAR reduction for adults. Using a normal laptop, the proposed PMS algorithm allows cheap and robust (with a multiscale structural similarity index >90%) denoising of very large 2D videos and 3D images (with over $10^7$ voxels) that are subject to ultra-strong noise (Gaussian and non-Gaussian) at signal-to-noise ratios far below 1.0. The code is provided for open access.
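
    The sketch below illustrates only the evaluation setting (heavily corrupted volumetric data, signal-to-noise ratio below 1, comparison of denoisers by a simple quality metric) on a synthetic phantom with two classical baselines; it does not implement the proposed PMS model, and all sizes and noise levels are assumptions.

    ```python
    import numpy as np
    from scipy import ndimage

    # Minimal sketch: compare two classical denoisers on a synthetic 3-D
    # phantom corrupted by Gaussian noise stronger than the signal (SNR < 1).

    rng = np.random.default_rng(5)

    # Synthetic 3-D "phantom": a bright sphere inside a dark cube.
    grid = np.indices((64, 64, 64)) - 32
    clean = (np.sqrt((grid ** 2).sum(axis=0)) < 16).astype(float)

    # Ultra-strong noise: noise power exceeds signal power.
    noisy = clean + rng.normal(0.0, 1.5, size=clean.shape)

    def psnr(reference, estimate):
        """Peak signal-to-noise ratio in dB, a simple stand-in quality metric."""
        mse = np.mean((reference - estimate) ** 2)
        return 10.0 * np.log10(reference.max() ** 2 / mse)

    denoised = {
        "gaussian": ndimage.gaussian_filter(noisy, sigma=2.0),
        "median": ndimage.median_filter(noisy, size=5),
    }
    print(f"noisy    PSNR: {psnr(clean, noisy):.1f} dB")
    for name, volume in denoised.items():
        print(f"{name:8s} PSNR: {psnr(clean, volume):.1f} dB")
    ```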