    Robustness of Generalized Learning Vector Quantization Models against Adversarial Attacks

    Adversarial attacks and the development of (deep) neural networks robust against them are currently two widely researched topics. The robustness of Learning Vector Quantization (LVQ) models against adversarial attacks has, however, not yet been studied to the same extent. We therefore present an extensive evaluation of three LVQ models: Generalized LVQ, Generalized Matrix LVQ and Generalized Tangent LVQ. The evaluation suggests that both Generalized LVQ and Generalized Tangent LVQ have a high base robustness, on par with the current state of the art in robust neural network methods. In contrast, Generalized Matrix LVQ shows a high susceptibility to adversarial attacks, scoring consistently behind all other models. Additionally, our numerical evaluation indicates that increasing the number of prototypes per class improves the robustness of the models. Comment: to be published in 13th International Workshop on Self-Organizing Maps and Learning Vector Quantization, Clustering and Data Visualization
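
    As a rough illustration of the kind of model evaluated above, the sketch below shows nearest-prototype classification together with the relative distance used in the standard GLVQ formulation, mu = (d+ - d-) / (d+ + d-). The abstract does not state this formula; it is the textbook definition, and the function names, toy prototypes and the use of squared Euclidean distance are assumptions made only for this example.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def predict(x, prototypes):
    """Return the class label of the prototype closest to x."""
    return min(prototypes, key=lambda p: squared_distance(x, p[0]))[1]


def glvq_relative_distance(x, label, prototypes):
    """Standard GLVQ relative distance mu in [-1, 1].

    d_plus:  distance to the closest prototype of the correct class
    d_minus: distance to the closest prototype of any other class
    mu < 0 means x is classified correctly.
    """
    d_plus = min(squared_distance(x, w) for w, c in prototypes if c == label)
    d_minus = min(squared_distance(x, w) for w, c in prototypes if c != label)
    return (d_plus - d_minus) / (d_plus + d_minus)


# Hypothetical toy setup: one prototype per class in 2D.
prototypes = [((0.0, 0.0), "a"), ((2.0, 0.0), "b")]
x = (0.5, 0.0)

print(predict(x, prototypes))
print(glvq_relative_distance(x, "a", prototypes))
```

    A value of mu close to -1 means the sample lies much closer to a correct prototype than to any wrong one. This is only an informal proxy for distance to the decision boundary, not the robustness measure used in the paper.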

    Optimal approximation of piecewise smooth functions using deep ReLU neural networks

    We study the necessary and sufficient complexity of ReLU neural networks---in terms of depth and number of weights---which is required for approximating classifier functions in $L^2$. As a model class, we consider the set $\mathcal{E}^\beta(\mathbb R^d)$ of possibly discontinuous piecewise $C^\beta$ functions $f : [-1/2, 1/2]^d \to \mathbb R$, where the different smooth regions of $f$ are separated by $C^\beta$ hypersurfaces. For dimension $d \geq 2$, regularity $\beta > 0$, and accuracy $\varepsilon > 0$, we construct artificial neural networks with ReLU activation function that approximate functions from $\mathcal{E}^\beta(\mathbb R^d)$ up to $L^2$ error of $\varepsilon$. The constructed networks have a fixed number of layers, depending only on $d$ and $\beta$, and they have $O(\varepsilon^{-2(d-1)/\beta})$ many nonzero weights, which we prove to be optimal. In addition to the optimality in terms of the number of weights, we show that in order to achieve the optimal approximation rate, one needs ReLU networks of a certain depth. Precisely, for piecewise $C^\beta(\mathbb R^d)$ functions, this minimal depth is given---up to a multiplicative constant---by $\beta/d$. Up to a log factor, our constructed networks match this bound. This partly explains the benefits of depth for ReLU networks by showing that deep networks are necessary to achieve efficient approximation of (piecewise) smooth functions. Finally, we analyze approximation in high-dimensional spaces where the function $f$ to be approximated can be factorized into a smooth dimension reducing feature map $\tau$ and classifier function $g$---defined on a low-dimensional feature space---as $f = g \circ \tau$. We show that in this case the approximation rate depends only on the dimension of the feature space and not the input dimension. Comment: Generalized some estimates to $L^p$ norms for $0<p<\infty$
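
    As a worked illustration of the weight bound stated above, the exponent can be evaluated for two sample settings. The symbol $W(\varepsilon)$ for the number of nonzero weights and the particular values of $d$ and $\beta$ are assumptions chosen only for this example; constants hidden in the $O$-notation are ignored.

```latex
\[
  W(\varepsilon) = O\!\left(\varepsilon^{-2(d-1)/\beta}\right)
\]
\[
  d = 2,\ \beta = 2:\quad \frac{2(d-1)}{\beta} = 1
  \;\Longrightarrow\; W(\varepsilon) = O\!\left(\varepsilon^{-1}\right),
  \qquad
  d = 3,\ \beta = 1:\quad \frac{2(d-1)}{\beta} = 4
  \;\Longrightarrow\; W(\varepsilon) = O\!\left(\varepsilon^{-4}\right).
\]
```

    The exponent grows with the dimension $d$ and shrinks with the regularity $\beta$, so smoother interfaces require fewer weights at a given accuracy.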

    Essays on Machine Learning in Risk Management, Option Pricing, and Insurance Economics

    Dealing with uncertainty is at the heart of financial risk management and asset pricing. This cumulative dissertation consists of four independent research papers that study various aspects of uncertainty, ranging from estimation and model risk through the volatility risk premium to the measurement of unobservable variables. In the first paper, a non-parametric estimator of conditional quantiles is proposed that builds on methods from the machine learning literature. The so-called leveraging estimator is discussed in detail and analyzed in an extensive simulation study. Subsequently, the estimator is used to quantify the estimation risk of Value-at-Risk and Expected Shortfall models. The results suggest that there are significant differences in the estimation risk of various GARCH-type models, while in general estimation risk is higher for the Expected Shortfall than for the Value-at-Risk. In the second paper, the leveraging estimator is applied to realized and implied volatility estimates of US stock options to test empirically whether the volatility risk premium is priced in the cross-section of option returns. A trading strategy that is long (short) in a portfolio with low (high) implied volatility conditional on the realized volatility yields average monthly returns that are economically and statistically significant. The third paper investigates the model risk of multivariate Value-at-Risk and Expected Shortfall models in a comprehensive empirical study on copula GARCH models. The paper finds that model risk is economically significant, especially high during periods of financial turmoil, and mainly due to the choice of the copula. In the fourth paper, the relation between digitalization and the market value of US insurers is analyzed. To this end, a text-based measure of digitalization building on Latent Dirichlet Allocation is proposed. 
It is shown that a rise in digitalization efforts is associated with an increase in market valuations.

    1 Introduction 1.1 Motivation 1.2 Conditional quantile estimation via leveraging optimal quantization 1.3 Cross-section of option returns and the volatility risk premium 1.4 Marginals versus copulas: Which account for more model risk in multivariate risk forecasting? 1.5 Estimating the relation between digitalization and the market value of insurers
    2 Conditional Quantile Estimation via Leveraging Optimal Quantization 2.1 Introduction 2.2 Optimal quantization 2.3 Conditional quantiles through leveraging optimal quantization 2.4 The hyperparameters N, λ, and γ 2.5 Simulation study 2.6 Empirical application 2.7 Conclusion
    3 Cross-Section of Option Returns and the Volatility Risk Premium 3.1 Introduction 3.2 Capturing the volatility risk premium 3.3 Empirical study 3.4 Robustness checks 3.5 Conclusion
    4 Marginals Versus Copulas: Which Account for More Model Risk in Multivariate Risk Forecasting? 4.1 Introduction 4.2 Market risk models and model risk 4.3 Data 4.4 Analysis of model risk 4.5 Model risk for models in the model confidence set 4.6 Model risk and backtesting 4.7 Conclusion
    5 Estimating the Relation Between Digitalization and the Market Value of Insurers 5.1 Introduction 5.2 Measuring digitalization using LDA 5.3 Financial data & empirical strategy 5.4 Estimation results 5.5 Conclusion
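
    For readers unfamiliar with the two risk measures named in this abstract, the sketch below computes Value-at-Risk and Expected Shortfall by plain historical simulation. This is not the leveraging estimator or any GARCH-type model from the dissertation; the function name, the index convention and the toy loss sample are assumptions made only for illustration.

```python
import math


def historical_var_es(losses, alpha=0.95):
    """Empirical Value-at-Risk and Expected Shortfall of a loss sample.

    VaR is taken as the ceil(alpha * n)-th smallest loss; ES is the mean of
    all losses at or above that VaR. Other conventions exist (for example,
    interpolated quantiles or a strictly-greater tail).
    """
    ordered = sorted(losses)
    n = len(ordered)
    idx = min(n - 1, max(0, math.ceil(alpha * n) - 1))
    var = ordered[idx]
    tail = ordered[idx:]
    es = sum(tail) / len(tail)
    return var, es


# Hypothetical sample: losses 1, 2, ..., 20 (positive numbers are losses).
losses = list(range(1, 21))
var, es = historical_var_es(losses, alpha=0.75)
print(var, es)
```

    By construction the Expected Shortfall is never below the Value-at-Risk at the same level, since it averages only the losses in the tail at or beyond the VaR.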

    Model-driven and Data-driven Approaches for some Object Recognition Problems

    Recognizing objects from images and videos has been a long-standing problem in computer vision. The recent surge in the prevalence of visual cameras has given rise to two main challenges: (i) understanding different sources of object variation in more unconstrained scenarios, and (ii) developing efficient learning methods that model object-scene `contextual' relations, rather than describing an object in isolation, to resolve visual ambiguities. This dissertation addresses some aspects of these challenges and consists of two parts. The first part of the work focuses on obtaining object descriptors that are largely preserved across certain sources of variation, by utilizing models for image formation and local image features. Given a single instance of an object, we investigate the following three problems. (i) Representing a 2D projection of a 3D non-planar shape in a manner invariant to articulations, when there are no self-occlusions. We propose an articulation-invariant distance that is preserved across piece-wise affine transformations of a non-rigid object's `parts', under a weak perspective imaging model, and then obtain a shape context-like descriptor to perform recognition; (ii) Understanding the space of `arbitrary' blurred images of an object, by representing an unknown blur kernel of a known maximum size using a complete set of orthonormal basis functions spanning that space, and showing that subspaces resulting from convolving a clean object and its blurred versions with these basis functions are equal under some assumptions. We then view the invariant subspaces as points on a Grassmann manifold, and use statistical tools that account for the underlying non-Euclidean nature of the space of these invariants to perform recognition across blur; (iii) Analyzing the robustness of local feature descriptors to different illumination conditions. 
We perform an empirical study of these descriptors for the problem of face recognition under lighting change, and show that the direction of the image gradient largely preserves object properties across varying lighting conditions. The second part of the dissertation utilizes information conveyed by large quantities of data to learn contextual information shared by an object (or an entity) with its surroundings. (i) We first consider a supervised two-class problem of detecting lane markings from road video sequences, where we learn relevant feature-level contextual information through a machine learning algorithm based on boosting. We then focus on unsupervised object classification scenarios, where (ii) we perform clustering using maximum margin principles, by deriving some basic properties on the affinity of `a pair of points' belonging to the same cluster using the information conveyed by `all' points in the system, and (iii) then consider correspondence-free adaptation of statistical classifiers across domain shifting transformations, by generating meaningful `intermediate domains' that incrementally convey potential information about the domain change.

    Distributed signal processing using nested lattice codes

    Multi-Terminal Source Coding (MTSC) addresses the problem of compressing correlated sources without communication links among them. In this thesis, a constructive approach to this problem is considered in an algebraic framework, and a system design is provided that is applicable in a variety of settings. The Wyner-Ziv problem is investigated first: coding of an independent and identically distributed (i.i.d.) Gaussian source with side information available only at the decoder in the form of a noisy version of the source to be encoded. Theoretical models are first established for calculating distortion-rate functions. Then a few novel practical code implementations are proposed using the strategy of multi-dimensional nested lattice/trellis coding. By investigating various lattices in the dimensions considered, an analysis is given of how lattice properties affect performance. Methods for choosing good sublattices in multiple dimensions are also proposed. By introducing scaling factors, the relationship between distortion and scaling factor is examined for various rates. The best high-dimensional lattice using our scale-rotate method can achieve performance within 1 dB of the Wyner-Ziv limit at low rates, and random nested ensembles can achieve a 1.87 dB gap to the limit. Moreover, the code design is extended to incorporate distributed compressive sensing (DCS). A theoretical framework is proposed and practical designs using nested lattice/trellis codes are presented for various scenarios. Using a nested trellis, the simulation shows a 3.42 dB gap from our derived bound for the DCS plus Wyner-Ziv framework.
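
    The nested-lattice idea behind the Wyner-Ziv coding described above can be sketched in one dimension. The example uses a scalar fine lattice with spacing `step` and a coarse lattice with spacing `nesting * step`. The encoder sends only the coset index, and the decoder resolves the ambiguity with its side information. This is not the multi-dimensional lattice/trellis design of the thesis; the function names and parameter values are assumptions made only for illustration.

```python
def encode(x, step, nesting):
    """Quantize x to the fine lattice and return only its coset index.

    The index takes one of `nesting` values, so it costs log2(nesting) bits
    instead of the bits needed for the full fine-lattice index.
    """
    fine_index = round(x / step)
    return fine_index % nesting


def decode(coset, y, step, nesting):
    """Pick the fine-lattice point in the given coset closest to y.

    y is the decoder's side information (a noisy version of x).
    """
    shift = round((y / step - coset) / nesting)
    return (coset + nesting * shift) * step


# Hypothetical values: source sample x and correlated side information y.
step, nesting = 0.5, 8
x, y = 3.37, 3.1

coset = encode(x, step, nesting)
x_hat = decode(coset, y, step, nesting)
print(coset, x_hat)
```

    Decoding succeeds as long as the side information stays within half a coarse cell of the quantized source value. With a poor `y` (for example `y = 0.0` here) the decoder selects the wrong coset member, which is the trade-off between rate and correlation noise that the choice of sublattice controls.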

    Understanding perceived quality through visual representations

    The formatting of images can be considered an optimization problem whose cost function is a quality assessment algorithm. There is a trade-off between bit budget per pixel and quality. To maximize the quality and minimize the bit budget, we need to measure the perceived quality. In this thesis, we focus on understanding perceived quality through visual representations that are based on visual system characteristics and color perception mechanisms. Specifically, we use the contrast sensitivity mechanisms in retinal ganglion cells and the suppression mechanisms in cortical neurons. We utilize color difference equations and color name distances to mimic pixel-wise color perception and a bio-inspired model to formulate center-surround effects. Based on these formulations, we introduce two novel image quality estimators, PerSIM and CSV, and a new image quality-assistance method, BLeSS. We combine our findings from the visual system and color perception with data-driven methods to generate visual representations and measure their quality. The majority of existing data-driven methods require subjective scores or degraded images. In contrast, we follow an unsupervised approach that only utilizes generic images. We introduce a novel unsupervised image quality estimator, UNIQUE, and extend it with multiple models and layers to obtain MS-UNIQUE and DMS-UNIQUE. In addition to introducing quality estimators, we analyze the role of spatial pooling and boosting in image quality assessment. Ph.D.
