
    A One-Sample Test for Normality with Kernel Methods

    We propose a new one-sample test for normality in a Reproducing Kernel Hilbert Space (RKHS). Namely, we test the null hypothesis of belonging to a given family of Gaussian distributions. Hence our procedure may be applied either to test data for normality or to test parameters (mean and covariance) if the data are assumed Gaussian. Our test is based on the same principle as the Maximum Mean Discrepancy (MMD), which is usually used for two-sample problems such as homogeneity or independence testing. Our method makes use of a special kind of parametric bootstrap (typical of goodness-of-fit tests) which is computationally more efficient than the standard parametric bootstrap. Moreover, an upper bound for the Type-II error highlights the dependence on influential quantities. Experiments illustrate the practical improvement allowed by our test in high-dimensional settings, where common normality tests are known to fail. We also consider an application to covariance rank selection through a sequential procedure.
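    The MMD principle underlying the test can be illustrated with a toy sketch. This is not the authors' procedure (which avoids sampling via a fast parametric bootstrap): here we simply fit a Gaussian to the data, draw a reference sample from the fit, and compute an unbiased estimate of the squared MMD between the two samples. The function names, reference-sample size, and the fixed Gaussian-kernel bandwidth are all illustrative assumptions.

```python
import numpy as np

def gaussian_kernel(a, b, sigma=1.0):
    # pairwise Gaussian (RBF) kernel between the rows of a and b
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def mmd2_unbiased(x, y, sigma=1.0):
    # unbiased estimate of the squared Maximum Mean Discrepancy
    kxx = gaussian_kernel(x, x, sigma)
    kyy = gaussian_kernel(y, y, sigma)
    kxy = gaussian_kernel(x, y, sigma)
    n, m = len(x), len(y)
    # drop the diagonal terms for unbiasedness
    sx = (kxx.sum() - np.trace(kxx)) / (n * (n - 1))
    sy = (kyy.sum() - np.trace(kyy)) / (m * (m - 1))
    return sx + sy - 2.0 * kxy.mean()

def normality_stat(x, n_ref=500, seed=0):
    # fit a Gaussian to the sample, then measure the MMD to draws from the fit
    rng = np.random.default_rng(seed)
    mu, cov = x.mean(axis=0), np.atleast_2d(np.cov(x.T))
    y = rng.multivariate_normal(mu, cov, size=n_ref)
    return mmd2_unbiased(x, y)
```

    A calibrated test would compare this statistic against its bootstrap null distribution; the raw statistic merely separates Gaussian from non-Gaussian samples.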

    Assessing Univariate and Multivariate Normality, A Guide For Non-Statisticians

    Most parametric methods rely on the assumption of normality, and results obtained from these methods are more powerful than their non-parametric counterparts. However, for valid inference, the assumptions underlying the use of these methods must be satisfied. Many published statistical articles that rely on the assumption of normality fail to verify it; hence, quite a number of published statistical results are presented with errors. As a way to reduce this, various approaches to assessing the assumption of normality are presented and illustrated in this paper. Several methods have been proposed for assessing both univariate and multivariate normality. In the univariate setting, the Q-Q plot, histogram, box plot, stem-and-leaf plot, and dot plot are graphical methods that can be used; the properties of the normal distribution provide an alternative approach. The Kolmogorov-Smirnov (K-S) test, Lilliefors-corrected K-S test, Shapiro-Wilk test, Anderson-Darling test, Cramer-von Mises test, D'Agostino skewness test, Anscombe-Glynn kurtosis test, D'Agostino-Pearson omnibus test, and Jarque-Bera test are also used to test for normality. Among these, the K-S, Shapiro-Wilk, Anderson-Darling, and Cramer-von Mises tests are the most widely used in practice and are implemented in many statistical applications. For multivariate normal data, the marginal distributions and all linear combinations should also be normal, which provides a starting point for assessing normality in the multivariate setting. A scatter plot for each pair of variables together with a Gamma plot (Chi-squared Q-Q plot) is used to assess bivariate normality; for more than two variables, a Gamma plot can still be used to check the assumption of multivariate normality. Among the many tests proposed for multivariate normality, Royston's and Mardia's tests are used most often and are implemented in many statistical packages.
    When the normality assumption is not justifiable, techniques for non-normal data can be used; likewise, transformation to near-normality is another alternative.
    Keywords: univariate normal, multivariate normal, Q-Q plot, Gamma plot, Kolmogorov-Smirnov test, Shapiro-Wilk test, Mardia's test, Royston's test
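    Several of the univariate tests listed above are available directly in SciPy. A minimal sketch (the sample and the reporting format are illustrative):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x = rng.standard_normal(200)       # illustrative sample, genuinely normal here

# Shapiro-Wilk: strong power for small-to-moderate sample sizes
w, p_sw = stats.shapiro(x)

# K-S against a normal with parameters estimated from the data; note this
# plain form is anti-conservative -- the Lilliefors correction addresses that
d, p_ks = stats.kstest(x, "norm", args=(x.mean(), x.std(ddof=1)))

# Anderson-Darling: returns a statistic plus critical values, not a p-value
ad = stats.anderson(x, dist="norm")

print(f"Shapiro-Wilk p={p_sw:.3f}, K-S p={p_ks:.3f}, A-D stat={ad.statistic:.3f}")
```

    In practice one would also inspect a Q-Q plot rather than rely on any single p-value.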

    Simulation assisted performance optimization of large-scale multiparameter technical systems

    During the past two decades the role of dynamic process simulation in the research and development of process and control solutions has grown tremendously. As simulation assisted working practices have become more popular, the accuracy requirements on the simulation results have also tightened. Improving the accuracy of complex, plant-wide models via parameter tuning necessitates practical, scalable methods and tools operating at the correct level of abstraction. In modern integrated process plants, it is not only the performance of individual controllers but also their interactions that determine the overall performance of large-scale control systems. In practice, however, it has become customary to split large-scale problems into smaller pieces and to use traditional analytical control engineering approaches, which inevitably leads to suboptimal solutions. The performance optimization problems related to large control systems and to plant-wide process models are essentially connected in the context of new simulation assisted process and control design practices: the accuracy of the model obtained with data-based parameter tuning determines the quality of the simulation assisted controller tuning results. In this doctoral thesis both problems are formulated in the same framework, depicted in the title of the thesis. To solve the optimization problem, a novel method called Iterative Regression Tuning (IRT), applying numerical optimization and multivariate regression, is presented. The IRT method has been designed especially for large-scale systems, and it allows the incorporation of domain expertise into the optimization goals. The thesis introduces different variations on the IRT method, technical details related to their application, and various use cases of the algorithm.
    The simulation assisted use case is presented through a number of application examples of control performance and model accuracy optimization.
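    The idea of coupling numerical optimization with multivariate regression can be caricatured in a few lines. This is a loose sketch under strong assumptions (a cheap synthetic simulator with a quadratic score, a purely linear local model), not the thesis's IRT algorithm: perturb the tunable parameters, score each candidate with the simulator, regress score on parameters, and step along the fitted slope.

```python
import numpy as np

def simulate(theta):
    # stand-in for an expensive plant simulation returning a scalar
    # performance score; the "true" optimum below is an illustrative choice
    target = np.array([1.0, -2.0, 0.5])
    return -np.sum((theta - target) ** 2)

def irt_step(theta, rng, n_samples=30, radius=0.5, lr=0.1):
    # 1) perturb the tunable parameters around the current point
    candidates = theta + radius * rng.standard_normal((n_samples, theta.size))
    # 2) score every candidate with the simulator
    scores = np.array([simulate(c) for c in candidates])
    # 3) fit a multivariate linear regression of score on parameters
    x = np.column_stack([np.ones(n_samples), candidates])
    coef, *_ = np.linalg.lstsq(x, scores, rcond=None)
    # 4) step along the fitted slope, a local gradient estimate
    return theta + lr * coef[1:]

rng = np.random.default_rng(0)
theta = np.zeros(3)
for _ in range(200):
    theta = irt_step(theta, rng)
```

    The regression surrogate is what lets such a loop work when the simulator is noisy and gradients are unavailable, which is the regime large plant-wide models live in.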

    Towards an interpretation of the Portuguese cultural tourist: motivations and constraints

    The importance of cultural tourism for the development of tourist destinations has created a need for knowledge on both the supply and demand sides, related to the motivations and constraints which can affect the decisions of the cultural tourist. The aim of this study is to develop and validate, for Portuguese tourists, two scales measuring travel motivations and constraints, considering theories of needs and constraints, and to examine differences according to tourists' socio-demographic characteristics. The scales were determined using a non-linear principal component analysis followed by bootstrap confirmatory factor analysis, and differences were examined using parametric tests. A six-dimensional model for motivations and a five-dimensional model for constraints were validated, both with good overall fit. Motives linked to culture, intellectual curiosity and cultural knowledge stand out with high levels of relevance. Lack of resources and other commitments are the most important constraints. Significant differences were found in almost all characteristics. The results reveal influential dimensions of travel decisions which are of utmost importance for the design of the cultural offer of destinations.

    VI Workshop on Computational Data Analysis and Numerical Methods: Book of Abstracts

    The VI Workshop on Computational Data Analysis and Numerical Methods (WCDANM) will be held on June 27-29, 2019, in the Department of Mathematics of the University of Beira Interior (UBI), Covilhã, Portugal. It is a unique opportunity to disseminate scientific research related to the areas of Mathematics in general, with particular relevance to Computational Data Analysis and Numerical Methods in theoretical and/or practical fields, using new techniques and giving special emphasis to applications in Medicine, Biology, Biotechnology, Engineering, Industry, Environmental Sciences, Finance, Insurance, Management and Administration. The meeting will provide a forum for the discussion and debate of ideas of interest to the scientific community in general, and new scientific collaborations among colleagues, namely in Masters and PhD projects, are expected. The event is open to the entire scientific community (with or without a communication/poster).

    Signal Processing in Arrayed MIMO Systems

    Multiple-Input Multiple-Output (MIMO) systems, using antenna arrays at both the receiver and transmitter, have shown great potential to provide high bandwidth utilization efficiency. Unlike other reported research on MIMO systems, which often assumes independent antennas, in this thesis an arrayed MIMO system framework is proposed, which provides a richer description of the channel characteristics and additional degrees of freedom in designing communication systems. Firstly, the spatially correlated MIMO system is studied as an array-to-array system with each array (Tx or Rx) having a predefined constrained aperture. The MIMO system is completely characterized by its transmit and receive array manifolds, and a new spatial correlation model, distinct from the Kronecker-based model, is proposed. As this model is based on array manifolds, it enables the study of the effect of array geometry on the capacity of correlated MIMO channels. Secondly, to generalize the proposed arrayed MIMO model to a frequency selective fading scenario, the framework of uplink MIMO DS-CDMA (Direct-Sequence Code Division Multiple Access) systems is developed. DOD estimation is developed based on transmit beam rotation, and a subspace-based joint DOA/TOA estimation scheme as well as various spatial-temporal reception algorithms are also proposed. Finally, the downlink MIMO-CDMA systems in multiple-access multipath fading channels are investigated. Linear precoder and decoder optimization problems are studied under different criteria, optimization approaches with different power allocation schemes are investigated, and sub-optimal approaches with closed-form solutions and thus lower computational complexity are also proposed.
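    As a point of reference for the correlation-capacity discussion, the baseline Kronecker model that the thesis moves beyond can be sketched as follows; the exponential correlation profile, SNR, array sizes, and trial count are illustrative assumptions, not values from the thesis.

```python
import numpy as np

def psd_sqrt(a):
    # matrix square root of a Hermitian PSD matrix via eigendecomposition
    w, v = np.linalg.eigh(a)
    return (v * np.sqrt(np.clip(w, 0.0, None))) @ v.conj().T

def exp_corr(n, rho):
    # exponential correlation profile: [R]_ij = rho^|i-j|
    idx = np.arange(n)
    return rho ** np.abs(idx[:, None] - idx[None, :])

def kronecker_channel(r_tx, r_rx, rng):
    # H = R_rx^{1/2} H_w R_tx^{1/2} with i.i.d. complex Gaussian H_w
    nr, nt = r_rx.shape[0], r_tx.shape[0]
    hw = (rng.standard_normal((nr, nt)) + 1j * rng.standard_normal((nr, nt))) / np.sqrt(2.0)
    return psd_sqrt(r_rx) @ hw @ psd_sqrt(r_tx)

def capacity(h, snr):
    # equal-power capacity in bits/s/Hz: log2 det(I + (snr / Nt) H H^H)
    nr, nt = h.shape
    _, logdet = np.linalg.slogdet(np.eye(nr) + (snr / nt) * (h @ h.conj().T))
    return logdet / np.log(2.0)

rng = np.random.default_rng(0)
snr, n, trials = 10.0, 4, 2000
c_iid = np.mean([capacity(kronecker_channel(exp_corr(n, 0.0), exp_corr(n, 0.0), rng), snr)
                 for _ in range(trials)])
c_cor = np.mean([capacity(kronecker_channel(exp_corr(n, 0.9), exp_corr(n, 0.9), rng), snr)
                 for _ in range(trials)])
```

    Under this model, strong antenna correlation typically lowers the ergodic equal-power capacity; tying that loss back to the physical array geometry is what the manifold-based formulation enables.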

    Fault Detection, Diagnosis and Fault Tolerance Approaches in Dynamic Systems based on Black-Box Models

    In this dissertation new contributions to the research area of fault detection and diagnosis in dynamic systems are presented. The main research effort has been devoted to the development of new on-line model-based fault detection and diagnosis (FDD) approaches based on black-box models (linear ARX models and neural nonlinear ARX models). From a theoretical point of view a white-box model is more desirable for performing the FDD tasks, but in most cases it is very hard, or even impossible, to obtain. When systems are complex or difficult to model, modelling based on black-box models is usually a good, and often the only, alternative. The performance of the system identification methods plays a crucial role in the proposed FDD methods. Great research effort has been devoted to the development of linear and nonlinear FDD approaches to detect and diagnose multiplicative (parametric) faults, since most past research has focused on additive faults on sensors and actuators. The main prerequisites for the FDD methods developed are: a) on-line application in a real-time environment for systems under closed-loop control; b) the algorithms must be implemented in discrete time, while the plants are systems in continuous time; c) a two- or three-dimensional space for visualization and interpretation of the fault symptoms. An engineering and pragmatic view of FDD approaches has been followed, and some new theoretical contributions are presented in this dissertation. The fault tolerance problem and fault tolerant control (FTC) have been investigated, and some ideas of the new FDD approaches have been incorporated in the FTC context. One of the main ideas underlying this work is to detect and diagnose faults occurring in continuous time systems via their effect on the parameters of discrete time black-box ARX models or associated features.
    In the FDD methods proposed, models for nominal operation and models for each faulty situation are constructed off-line, and used a posteriori in on-line operation. The state of the art and background concepts used for the research come from many scientific areas; the main concepts related to data mining, multivariate statistics (principal component analysis, PCA), linear and nonlinear dynamic systems, black-box models, system identification, fault detection and diagnosis (FDD), pattern recognition and discriminant analysis, and fault tolerant control (FTC) are briefly described. A sliding window version of the principal components regression algorithm, termed SW-PCR, is proposed for parameter estimation; sliding window parameter estimation algorithms are more appropriate for fault detection and diagnosis than recursive algorithms. For linear SISO systems, a new fault detection and diagnosis approach based on dynamic features (static gain and bandwidth) of ARX models is proposed, using a pattern classification approach based on neural nonlinear discriminant analysis (NNLDA). A new approach for fault detection (FDE) is proposed based on the application of the PCA method to the parameter space of ARX models; this allows a dimensional reduction and the definition of thresholds based on multivariate statistics. This FDE method has been combined with a fault diagnosis (FDG) method based on an influence matrix (IMX); the combined FDD method (PCA & IMX) is suitable for SISO or MIMO linear systems. Most research in the fault detection and diagnosis area has been done for linear systems, and few investigations exist on FDD approaches for nonlinear systems. In this work, two new nonlinear approaches to FDD are proposed that are appropriate for SISO or MISO systems.
    A new architecture for a neural recurrent output predictor (NROP) is proposed, incorporating an embedded neural parallel model, an external feedback and an adjustable gain (design parameter). A new fault detection and diagnosis (FDD) approach for nonlinear systems is proposed based on a bank of neural recurrent output predictors (NROPs), where each NROP predictor is tuned to a specific fault. Also, a new FDD method based on the application of neural nonlinear PCA to ARX model parameters is proposed, combined with a pattern classification approach based on neural nonlinear discriminant analysis. In order to evaluate the performance of the proposed FDD methodologies, many experiments have been done using simulation models and a real setup. All the algorithms have been developed in discrete time, except the process models. The process models considered for the validation and testing of the FDD approaches are: a) a first order linear SISO system; b) a second order SISO model of a DC motor; c) a MIMO system model, the three-tank benchmark. A real nonlinear DC motor setup has also been used. A fault tolerant control (FTC) approach has been proposed to solve the typical reconfiguration problem formulated for the three-tank benchmark; this FTC approach incorporates the FDD method based on a bank of NROP predictors and an adaptive optimal linear quadratic Gaussian controller.
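    The PCA-on-ARX-parameters idea can be sketched in a few lines; this is a minimal illustration with a hypothetical first-order plant and a simplified Hotelling T² score, not the dissertation's implementation: estimate ARX parameters on windows of nominal data, fit PCA to those parameter vectors, and flag windows whose score-space T² is far outside the nominal range.

```python
import numpy as np

def fit_arx(u, y, na=2, nb=2):
    # least-squares ARX(na, nb): y[k] = sum_i a_i y[k-i] + sum_j b_j u[k-j]
    k0 = max(na, nb)
    phi = np.array([np.r_[y[k - na:k][::-1], u[k - nb:k][::-1]]
                    for k in range(k0, len(y))])
    theta, *_ = np.linalg.lstsq(phi, y[k0:], rcond=None)
    return theta

def t2_detector(nominal_thetas, n_pc=2):
    # PCA on nominal parameter vectors, Hotelling T^2 in the score space
    mu = nominal_thetas.mean(axis=0)
    _, s, vt = np.linalg.svd(nominal_thetas - mu, full_matrices=False)
    p = vt[:n_pc].T                                # principal loadings
    var = s[:n_pc] ** 2 / (len(nominal_thetas) - 1)
    return lambda theta: float(np.sum(((theta - mu) @ p) ** 2 / var))

# toy demonstration on a hypothetical first-order plant; a fault shifts the pole
rng = np.random.default_rng(0)

def simulate(a, b, n=400):
    u = rng.standard_normal(n)
    y = np.zeros(n)
    for k in range(1, n):
        y[k] = a * y[k - 1] + b * u[k - 1] + 0.01 * rng.standard_normal()
    return u, y

nominal = np.array([fit_arx(*simulate(0.7, 0.5), na=1, nb=1) for _ in range(50)])
t2 = t2_detector(nominal)
nominal_max = max(t2(th) for th in nominal)
fault_score = t2(fit_arx(*simulate(0.3, 0.5), na=1, nb=1))
```

    A parametric fault barely visible in the raw signals shows up as a large excursion in the parameter space, which is why sliding-window estimates feed the detector better than slowly forgetting recursive ones.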

    Topological event history analysis for random fields with an application to global wind intensities

    PhD Thesis. Realisations of simulated climate variables from the CESM Large Ensemble (Kay et al., 2015) are frequently assumed to be independent and identically distributed random fields (Castruccio and Stein, 2013; Castruccio and Genton, 2014, 2016). Using concepts from survival analysis and topological data analysis, we propose a methodology for the comparison of these realisations, with specific application to global wind intensities. Topological data analysis (TDA) is becoming more widely used as the data available in many applications grows considerably, both in volume and complexity. Where computer science and machine learning often lean heavily on clustering techniques (Gan et al., 2007; Schaeffer, 2007), TDA, and more specifically persistent homology, allows a similar analysis with greater robustness to perturbations in the data. We extend ideas from topological data analysis using an event history approach. Survival analysis has wide-ranging applications, particularly in manufacturing and the medical sciences, where time-to-event data is common. We are interested in how event history methods can be used as tools for the comparison of topological features in random fields. Drawing on work from these two areas of research, we consider specific topological features, connected components, on a random field, and show that the number of these features differs between fields with different distributions or correlation structures. We use nonparametric survival models to model the rate of emergence of such features, achieving this through a reformulation of homological births and deaths as survival events. We evaluate methods for modelling the covariance of our wind intensities data on the surface of a sphere, comparing several common stationary models utilising a selection of distance measures. Our data is unusual in that we have multiple realisations of the same dataset, allowing us to examine the empirical correlation between each pair of points.
    We look at nonstationary approaches to modelling, including the incorporation of large-scale geographic descriptors, such as land, coast and ocean, and consider the challenge of obtaining accurate covariance matrices on a single replicate. We demonstrate how our proposed methods are informative for the assessment of Gaussianity in spatial data sets, comparing standard Gaussian data simulation packages. Finally, we apply our topological event history methods to multiple realisations from our large climate data set, identifying anomalous realisations.
    Keywords: survival analysis, topological data analysis, statistical topology, climate, spherical correlation, persistent homology
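    The reformulation of homological births and deaths as survival events starts from 0-dimensional persistence: connected components of sublevel sets. A minimal one-dimensional sketch using a union-find sweep (a real analysis would work on a 2-D spherical field, typically via a library such as GUDHI; the example field is illustrative):

```python
import numpy as np

def sublevel_persistence_1d(f):
    # (birth, death) pairs of connected components of the sublevel sets
    # {f <= t} of a 1-D field, via a union-find sweep over sorted values
    order = np.argsort(f)
    parent, birth, pairs = {}, {}, []

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]      # path compression
            i = parent[i]
        return i

    for i in order:
        parent[i], birth[i] = i, f[i]          # activate grid point i
        for j in (i - 1, i + 1):               # try to merge with neighbours
            if j in parent:
                ri, rj = find(i), find(j)
                if ri != rj:
                    # elder rule: the younger component dies at the merge level
                    old, young = (ri, rj) if birth[ri] <= birth[rj] else (rj, ri)
                    pairs.append((birth[young], f[i]))
                    parent[young] = old
    # the surviving component is born at the global minimum and never dies
    pairs.append((f[order[0]], np.inf))
    # keep only pairs with positive persistence (plus the essential one)
    return [(b, d) for b, d in pairs if d > b]
```

    Each (birth, death) pair can then be treated as a time-to-event record, with the essential class (death = infinity) entering the survival model as a censored observation.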

    The connected brain: Causality, models and intrinsic dynamics

    Recently, there have been several concerted international efforts - the BRAIN Initiative, the European Human Brain Project and the Human Connectome Project, to name a few - that hope to revolutionize our understanding of the connected brain. Over the past two decades, functional neuroimaging has emerged as the predominant technique in systems neuroscience. This is reflected in an ever-increasing number of publications on functional connectivity, causal modeling, connectomics, and multivariate analyses of distributed patterns of brain responses. In this article, we summarize pedagogically the (deep) history of brain mapping. We highlight the theoretical advances made in the (dynamic) causal modelling of brain function - which may have escaped the wider audience of this article - and provide a brief overview of recent developments and interesting clinical applications. We hope that this article will engage the signal processing community by showcasing the inherently multidisciplinary nature of this important topic and the intriguing questions that are being addressed.