On existence, uniqueness and scalability of adversarial robustness measures for AI classifiers
Simply-verifiable mathematical conditions for the existence, uniqueness and explicit analytical computation of minimal adversarial paths (MAP) and minimal adversarial distances (MAD) are formulated and proven for (locally) uniquely-invertible classifiers, for generalized linear models (GLM), and for entropic AI (EAI). Practical computation of MAP and MAD, their comparison and interpretation for various classes of AI tools (neural networks, boosted random forests, GLM and EAI) are demonstrated on common synthetic benchmarks (a double Swiss roll spiral and its extensions) as well as on two biomedical data problems (health insurance claim prediction and heart attack lethality classification). For the biomedical applications, it is demonstrated how MAP provides unique, minimal, patient-specific risk-mitigating interventions within predefined subsets of accessible control variables.
Comment: 16 pages, 3 figures
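For the GLM case mentioned in the abstract, a linear decision boundary admits a classical closed form: the minimal adversarial distance is the point-to-hyperplane distance, and the minimal adversarial path ends at the orthogonal projection onto the boundary. A minimal numpy sketch under this linear assumption (an illustration only, not the paper's general construction):

```python
import numpy as np

def mad_map_linear(w, b, x):
    """Minimal adversarial distance (MAD) and minimal adversarial path (MAP)
    endpoint for a linear decision boundary w . x + b = 0.
    Classical point-to-hyperplane projection; illustrative only."""
    w = np.asarray(w, dtype=float)
    x = np.asarray(x, dtype=float)
    margin = w @ x + b                      # signed distance times ||w||
    mad = abs(margin) / np.linalg.norm(w)   # minimal adversarial distance
    x_map = x - (margin / (w @ w)) * w      # orthogonal projection onto the boundary
    return mad, x_map

w, b = np.array([3.0, 4.0]), -5.0
x = np.array([3.0, 4.0])                    # decision value: 9 + 16 - 5 = 20
mad, x_map = mad_map_linear(w, b, x)
print(mad)                                  # 4.0
print(w @ x_map + b)                        # ~0: the MAP endpoint lies on the boundary
```

For a patient-specific intervention restricted to a subset of control variables, the same projection would be carried out only in the accessible coordinates.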
Non-stationary data-driven computational portfolio theory and algorithms
The aim of the dissertation is the development of a data-driven portfolio optimization framework beyond standard assumptions. Investment decisions are either based on the opinion of a human expert, who evaluates information about companies, or on statistical models. The most famous statistical methods are the Markowitz portfolio model and utility maximization. All statistical methods assume some knowledge of the underlying distribution of the returns, either by imposing Gaussianity, by expecting complete knowledge of the distribution, or by inferring sufficiently good parameter estimates. Yet in practice, all methods suffer from incomplete knowledge, small sample sizes and the problem that parameters might vary over time. A new, model-free approach to the portfolio optimization problem allowing for time-varying dynamics in the price processes is presented. The methods proposed in this work are designed to solve the problem with fewer a priori assumptions than standard methods, such as assumptions on the distribution of the price processes or on time-invariant statistical properties. The new approach introduces two new parameters and a method to choose them based on principles of information theory. An analysis of different approaches to incorporating additional information is performed before a straightforward approach to the out-of-sample application is introduced. The structure of the numerical problem is obtained directly from the portfolio optimization problem, resulting in a system of objective function and constraints known from non-stationary time series analysis. The incorporation of transaction costs naturally yields a regularization that is normally included only for numerical reasons. The applicability and numerical feasibility of the method are demonstrated in a low-dimensional example in-sample and in a high-dimensional example in- and out-of-sample, in an environment with mixed transaction costs.
The performance in both examples is measured and compared to standard methods, such as the Markowitz approach, and to methods based on techniques for analysing non-stationary data, such as Hidden Markov Models.
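The Markowitz baseline used for comparison has a well-known closed form in the unconstrained global minimum-variance case, w = Σ⁻¹1 / (1ᵀΣ⁻¹1), with Σ estimated from the sample. A hedged numpy sketch of that baseline (no short-sale or transaction-cost constraints, which the dissertation's framework handles):

```python
import numpy as np

def min_variance_weights(returns):
    """Closed-form global minimum-variance Markowitz portfolio,
    w = Sigma^{-1} 1 / (1' Sigma^{-1} 1), with Sigma the sample
    covariance. Illustrates the standard baseline only; it assumes
    a time-invariant return distribution, exactly the assumption
    the dissertation relaxes."""
    sigma = np.cov(returns, rowvar=False)   # sample covariance estimate
    ones = np.ones(sigma.shape[0])
    w = np.linalg.solve(sigma, ones)        # Sigma^{-1} 1
    return w / w.sum()                      # normalize to full investment

rng = np.random.default_rng(0)
returns = rng.normal(0.001, 0.02, size=(250, 4))  # one synthetic year, 4 assets
w = min_variance_weights(returns)
print(w.sum())   # 1.0: fully invested
```

In-sample, these weights attain a portfolio variance no larger than that of any other fully-invested allocation, e.g. equal weighting.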
Data-based analysis of extreme events: inference, numerics and applications
The concept of extreme events describes the above-average behavior of a process, for instance heat waves in climate or weather research, earthquakes in geology and financial crashes in economics. Studying the behavior of extremes is important in order to reduce their negative impacts. Key objectives include the identification of the appropriate mathematical/statistical model, the description of the underlying dependence structure in the multivariate or spatial case, and the investigation of the most relevant external factors. Extreme value analysis (EVA), based on Extreme Value Theory, provides the necessary statistical tools. Assuming that all relevant covariates are known and observed, EVA often deploys statistical regression analysis to study changes in the model parameters. Modeling of the dependence structure implies a priori assumptions such as Gaussian, locally stationary or isotropic behavior. Based on EVA and advanced time-series analysis methodology, this thesis introduces a semiparametric, nonstationary and non-homogeneous framework for statistical regression analysis of spatio-temporal extremes. The regression analysis accounts explicitly for systematically missing covariates, whose influence is reduced to an additive nonstationary offset. The nonstationarity is resolved by the Finite Element Time Series Analysis Methodology (FEM), which approximates the underlying nonstationarity by a set of locally stationary models and a nonstationary hidden switching process with bounded variation (BV). The resulting FEM-BV-EVA approach goes beyond the a priori assumptions of standard methods based, for instance, on Bayesian statistics, Hidden Markov Models or local kernel smoothing. The multivariate/spatial extension of FEM-BV-EVA describes the underlying spatial variability through the model parameters, in the spirit of hierarchical modeling.
The spatio-temporal behavior of the model parameters is approximated by locally stationary models and a spatially nonstationary switching process. Further, it is shown that the resulting spatial FEM-BV-EVA formulation is consistent with the max-stability postulate and describes the underlying dependence structure in a nonparametric way. The proposed FEM-BV-EVA methodology was integrated into the existing FEM MATLAB toolbox. The FEM-BV-EVA framework is computationally efficient, as it deploys gradient-free MCMC-based optimization methods and numerical solvers for constrained, large, structured quadratic and linear problems. To demonstrate its performance, FEM-BV-EVA was applied to various test cases and real data and compared to standard methods. It was shown that parametric approaches lead to biased results if significant covariates are unresolved. Comparison to nonparametric methods based on smoothing regression revealed their weaknesses: the locality property and the inability to resolve discontinuous functions. Spatial FEM-BV-EVA was applied to study the dynamics of extreme precipitation over Switzerland; among other findings, the analysis identified three major spatially dependent regions.
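The EVA pipeline described above typically starts from block maxima (e.g. annual maxima of daily observations), to which a Generalized Extreme Value (GEV) distribution is then fitted. A minimal numpy sketch of that first, stationary step (an illustration of classical EVA only, not of the nonstationary FEM-BV-EVA machinery):

```python
import numpy as np

def block_maxima(series, block_len):
    """Split a time series into consecutive blocks and return each
    block's maximum -- the sample to which EVA fits a Generalized
    Extreme Value (GEV) distribution. Minimal stationary illustration."""
    series = np.asarray(series, dtype=float)
    n_blocks = len(series) // block_len
    trimmed = series[:n_blocks * block_len]     # drop an incomplete final block
    return trimmed.reshape(n_blocks, block_len).max(axis=1)

rng = np.random.default_rng(1)
daily_precip = rng.gamma(shape=2.0, scale=5.0, size=3650)  # ten synthetic "years"
annual_max = block_maxima(daily_precip, block_len=365)
print(len(annual_max))  # 10 annual maxima
```

In the nonstationary regression setting of the thesis, the GEV parameters fitted to such maxima would themselves vary in time and space.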
Linearly-scalable learning of smooth low-dimensional patterns with permutation-aided entropic dimension reduction
In many data science applications, the objective is to extract appropriately-ordered smooth low-dimensional data patterns from high-dimensional data sets. This is challenging since common sorting algorithms primarily aim at finding monotonic orderings in low-dimensional data, whereas typical dimension reduction and feature extraction algorithms are not primarily designed for extracting smooth low-dimensional data patterns. We show that when selecting Euclidean smoothness as the pattern quality criterion, both of these problems (finding the optimal 'crisp' data permutation and extracting the sparse set of permuted low-dimensional smooth patterns) can be efficiently solved numerically as one unsupervised entropy-regularized iterative optimization problem. We formulate and prove the conditions for monotonicity and convergence of this linearly-scalable (in dimension) numerical procedure, with an iteration cost scaling of O(DT), where T is the size of the data statistics and D is the feature space dimension. The efficacy of the proposed method is demonstrated through the examination of synthetic examples as well as a real-world application involving the identification of smooth bankruptcy-risk-minimizing transition patterns from high-dimensional economic data. The results showcase that the statistical properties of the overall time complexity of the method exhibit linear scaling in the dimensionality within the specified confidence intervals.
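The Euclidean smoothness criterion mentioned above can be made concrete as the sum of squared differences between consecutive (permuted) data points; for one-dimensional data the smoothest 'crisp' permutation is simply the monotone ordering. A toy numpy check of that criterion (an illustration of the quality measure only, not of the entropic optimization algorithm itself):

```python
import numpy as np

def euclidean_roughness(pattern):
    """Sum of squared consecutive differences -- the inverse of the
    Euclidean smoothness used as the pattern-quality criterion.
    Lower values mean a smoother ordering."""
    d = np.diff(np.asarray(pattern, dtype=float), axis=0)
    return float((d ** 2).sum())

rng = np.random.default_rng(2)
x = rng.normal(size=200)
shuffled = rng.permutation(x)
sorted_x = np.sort(x)   # for 1-D data, monotone sorting is the smoothest permutation
print(euclidean_roughness(sorted_x) < euclidean_roughness(shuffled))  # True
```

For higher-dimensional patterns no such simple sort exists, which is where the joint permutation-and-pattern optimization of the paper comes in.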
A scalable approach to the computation of invariant measures for high-dimensional Markovian systems
The Markovian invariant measure is a central concept in many disciplines. Conventional numerical techniques for data-driven computation of invariant measures rely on estimation and further numerical processing of a transition matrix. Here we show how the quality of data-driven estimation of a transition matrix crucially depends on the validity of the statistical independence assumption for transition probabilities. Moreover, the cost of the invariant measure computation in general scales cubically with the dimension and is usually infeasible for realistic high-dimensional systems. We introduce a method relaxing the independence assumption of transition probabilities that scales quadratically in situations with latent variables. Applications of the method are illustrated on the Lorenz-63 system and for molecular dynamics (MD) simulation data of the α-synuclein protein. We demonstrate how the conventional methodologies fail to provide good estimates of the invariant measure from the available α-synuclein MD data. Applying the introduced approach to these MD data, we detect two robust meta-stable states of α-synuclein and a linear transition between them, involving transient formation of secondary structure, qualitatively consistent with previous purely experimental reports.
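The conventional pipeline this paper critiques can be sketched in a few lines: count observed transitions between discrete states, row-normalize the counts into a transition matrix P, and obtain the invariant measure as the leading left eigenvector, μP = μ. A minimal numpy illustration of that standard approach (which silently relies on the independence assumption discussed above):

```python
import numpy as np

def invariant_measure(states, n_states):
    """Conventional pipeline: count transitions, row-normalize into a
    transition matrix P, then power-iterate mu = mu P for the invariant
    measure. Assumes statistically independent transition probabilities --
    exactly the assumption the paper relaxes."""
    counts = np.zeros((n_states, n_states))
    for a, b in zip(states[:-1], states[1:]):
        counts[a, b] += 1
    P = counts / counts.sum(axis=1, keepdims=True)
    mu = np.full(n_states, 1.0 / n_states)
    for _ in range(500):            # power iteration on the left eigenvector
        mu = mu @ P
    return mu / mu.sum()

rng = np.random.default_rng(3)
# two-state chain: stay with probability 0.9, switch with probability 0.1
seq = [0]
for _ in range(20000):
    seq.append(seq[-1] if rng.random() < 0.9 else 1 - seq[-1])
mu = invariant_measure(seq, 2)
print(mu)  # close to [0.5, 0.5] for this symmetric chain
```

The cubic cost the abstract refers to arises when such eigenproblems are solved directly for dense matrices over many states.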
Gauge-optimal approximate learning for small data classification problems
Small data learning problems are characterized by a significant discrepancy between the limited number of response variable observations and the large feature space dimension. In this setting, common learning tools struggle to separate the features important for the classification task from those that bear no relevant information, and cannot derive an appropriate learning rule that discriminates between the different classes. As a potential solution to this problem, here we exploit the idea of reducing and rotating the feature space in a lower-dimensional gauge and propose the Gauge-Optimal Approximate Learning (GOAL) algorithm, which provides an analytically tractable joint solution to the dimension reduction, feature segmentation and classification problems for small data learning problems. We prove that the optimal solution of the GOAL algorithm consists of piecewise-linear functions in the Euclidean space, and that it can be approximated through a monotonically convergent algorithm which presents -- under the assumption of a discrete segmentation of the feature space -- a closed-form solution for each optimization substep and an overall linear iteration cost scaling. The GOAL algorithm has been compared to other state-of-the-art machine learning (ML) tools on both synthetic data and challenging real-world applications from climate science and bioinformatics (i.e., prediction of the El Niño Southern Oscillation and inference of epigenetically-induced gene-activity networks from limited experimental data). The experimental results show that the proposed algorithm outperforms the reported best competitors for these problems in both learning performance and computational cost.
Comment: 47 pages, 4 figures
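The reduce-and-rotate idea can be caricatured with off-the-shelf tools: rotate the features into a few leading principal directions and classify in that low-dimensional space. The sketch below is such a crude stand-in (PCA plus a nearest-centroid rule), shown only to make the small-data setting tangible; GOAL itself jointly optimizes the rotation, segmentation and classification rule, which this sketch does not:

```python
import numpy as np

def pca_rotate(X, k):
    """Rotate the feature space into its top-k principal directions --
    a crude, unsupervised stand-in for the learned low-dimensional
    gauge of GOAL (illustration only)."""
    Xc = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ vt[:k].T

rng = np.random.default_rng(4)
n, d = 40, 200                      # "small data": far fewer samples than features
y = rng.integers(0, 2, n)
# class signal lives in only 5 of the 200 features
X = rng.normal(size=(n, d)) + 4.0 * y[:, None] * np.r_[np.ones(5), np.zeros(d - 5)]
Z = pca_rotate(X, k=2)              # 200 -> 2 dimensions
# nearest-centroid rule in the reduced gauge
c0, c1 = Z[y == 0].mean(axis=0), Z[y == 1].mean(axis=0)
pred = (np.linalg.norm(Z - c1, axis=1) < np.linalg.norm(Z - c0, axis=1)).astype(int)
print((pred == y).mean())           # in-sample accuracy well above chance
```

With 40 samples in 200 dimensions, a classifier trained in the full feature space would overfit badly; the rotation into a low-dimensional gauge is what makes the rule learnable.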
Low-cost probabilistic 3D denoising with applications for ultra-low-radiation computed tomography
We propose a pipeline for the synthetic generation of personalized Computed Tomography (CT) images, with a radiation exposure evaluation and a lifetime attributable risk (LAR) assessment. We perform a patient-specific performance evaluation for a broad range of denoising algorithms (including the most popular deep learning denoising approaches, wavelet-based methods, methods based on Mumford-Shah denoising, etc.), focusing both on assessing the capability to reduce the patient-specific CT-induced LAR and on computational cost scalability. We introduce a parallel Probabilistic Mumford-Shah denoising model (PMS) and show that it markedly outperforms the compared common denoising methods in denoising quality and cost scaling. In particular, we show that it allows an approximately 22-fold robust patient-specific LAR reduction for infants and a 10-fold LAR reduction for adults. Using a normal laptop, the proposed algorithm for PMS allows cheap and robust denoising (with a multiscale structural similarity index > 90%) of very large 2D videos and 3D images (with over 10^7 voxels) that are subject to ultra-strong noise (Gaussian and non-Gaussian) at signal-to-noise ratios far below 1.0. The code is provided for open access.
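To make the "SNR far below 1" regime concrete, even the simplest smoothing filter recovers a smooth signal buried under noise twice its amplitude. The numpy sketch below uses a plain moving average as a deliberately naive stand-in for PMS (which is a far more sophisticated, edge-preserving probabilistic model); it only illustrates the denoising setting, not the paper's method:

```python
import numpy as np

def moving_average_denoise(signal, width):
    """Box-filter smoothing along the last axis -- a deliberately simple
    stand-in for the probabilistic Mumford-Shah (PMS) model, used only
    to illustrate denoising at SNR < 1. Unlike PMS, it blurs edges."""
    kernel = np.ones(width) / width
    return np.apply_along_axis(lambda v: np.convolve(v, kernel, mode="same"),
                               -1, signal)

rng = np.random.default_rng(5)
t = np.linspace(0, 2 * np.pi, 2000)
clean = np.sin(t)                               # smooth ground truth, amplitude 1
noisy = clean + rng.normal(0, 2.0, t.size)      # noise std 2.0: SNR well below 1
denoised = moving_average_denoise(noisy, width=101)

mse = lambda a: float(((a - clean) ** 2).mean())
print(mse(denoised) < mse(noisy))               # True: most of the noise is removed
```

For piecewise-smooth images with sharp organ boundaries, such a box filter would destroy edges; that is precisely the failure mode Mumford-Shah-type models are designed to avoid.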