11 research outputs found

    Directional outlyingness applied to distances between genomic words

    The detection of outlier curves/images is crucial in many areas, such as environmental, meteorological, medical, or economic contexts. In the functional framework, outlying observations are not only those that contain atypically high or low values, but also curves that present a different shape or pattern from the rest of the curves in the sample. In this short paper, we mention some recent methods for outlier detection in functional data and apply a recently proposed measure, the directional outlyingness, and the functional outlier map to detect words with outlying distance distribution in the human genome.

    Robust functional regression based on principal components

    Functional data analysis is a fast-evolving branch of modern statistics and the functional linear model has become popular in recent years. However, most estimation methods for this model rely on generalized least squares procedures and therefore are sensitive to atypical observations. To remedy this, we propose a two-step estimation procedure that combines robust functional principal components and robust linear regression. Moreover, we propose a transformation that reduces the curvature of the estimators and can be advantageous in many settings. For these estimators we prove Fisher-consistency at elliptical distributions and consistency under mild regularity conditions. The influence function of the estimators is investigated as well. Simulation experiments show that the proposed estimators have reasonable efficiency, protect against outlying observations, produce smooth estimates and perform well in comparison to existing approaches. Comment: 33 pages, including the appendix and references.

    Automated data inspection in jet engines

    Rolls Royce accumulate a large amount of sensor data throughout the testing and deployment of their engines. The availability of this rich source of data offers exciting opportunities to automate the monitoring and testing of the engines. In this thesis we have developed statistical models to draw meaningful insights from engine test data. We have built a classification model to identify different types of engine running in Pass-Off tests. The labels can be used for post-analysis and to highlight problematic engine tests. The model has been applied to two different types of engines, for which it gives close to perfect classification accuracy. We have also created an unsupervised approach for when there are no defined classes of engine running. These models have been incorporated into Rolls Royce systems. Early warnings of potential issues can enable relatively cheap maintenance to be performed and reduce the risk of irreparable engine damage. We have therefore developed an outlier detection model to identify abnormal temperature behaviour. The capabilities of the model are shown theoretically and tested on experimental and real data. Lastly, during a test, engineers make decisions to ensure that the engine complies with certain standards. To support the engineers we have developed a predictive model to identify segments of the engine test that should be retested. The model is tested against the current decision making of the engineers, and gives good predictive performance. The model highlights the possibility of automating the decision-making process within a test.

    Novel Methods for the Detection of Emergent Phenomena in Streaming Data

    In the fast-paced and data-rich world of today there is an increased demand for methods that analyse a stream of data in real time. In particular, there is a desire for methods that can identify phenomena in the data stream as they are emerging. These emergent phenomena can be viewed as incoming observations that are surprising when compared to the history of the data. Motivated by challenges in the telecommunications sector, we develop methods that operate when the stream does not follow classical assumptions. This includes cases where the data are not independent or identically distributed, or where the phenomena occur gradually over time. This thesis makes three contributions to the field of anomaly detection for streaming data. The first, Non-Parametric Unbounded Change (NUNC), provides a non-parametric method for identifying changes in the distribution of a data stream. The second, Functional Anomaly Sequential Test (FAST), provides a method for identifying deviations from an expected shape in a stream of partially observed functional data. The third, mvFAST, extends FAST to the multivariate functional data setting.

    CLADAG 2021 BOOK OF ABSTRACTS AND SHORT PAPERS

    The book collects the short papers presented at the 13th Scientific Meeting of the Classification and Data Analysis Group (CLADAG) of the Italian Statistical Society (SIS). The meeting has been organized by the Department of Statistics, Computer Science and Applications of the University of Florence, under the auspices of the Italian Statistical Society and the International Federation of Classification Societies (IFCS). CLADAG is a member of the IFCS, a federation of national, regional, and linguistically-based classification societies. It is a non-profit, non-political scientific organization, whose aims are to further classification research.