16 research outputs found

Benchmarking Nearest Neighbor Search: Influence of Local Intrinsic Dimensionality and Result Diversity in Real-World Datasets

Authors: Aumüller Martin; Ceccarello Matteo
Publication date: 01/01/2019

The IT University of Copenhagen's Repository

Archivio istituzionale della ricerca - Università di Padova

ATHMoS: Automated Telemetry Health Monitoring System at GSOC using Outlier Detection and Supervised Machine Learning

Authors: Faltenbacher Luisa; O'Meara Corey; Schlag Leonard; Wickler Martin
Publication venue: 'American Institute of Aeronautics and Astronautics (AIAA)'
Publication date: 01/05/2016

Knowing which telemetry parameters are behaving as expected and which are behaving out of the ordinary is vital information for continued mission success. For a large number of different parameters, it is not possible to monitor all of them manually. One of the simplest methods of monitoring the behavior of telemetry is the Out Of Limit (OOL) check, which monitors whether a value exceeds its upper or lower limit. A fundamental problem occurs when a telemetry parameter is showing signs of abnormal behavior, yet the values are not extreme enough for the OOL check to detect the problem. By the time the OOL threshold is reached, it could be too late for the operators to react. To solve this problem, the Automated Telemetry Health Monitoring System (ATHMoS) is in development at the German Space Operations Center (GSOC). At the heart of the framework is a novel algorithm for statistical outlier detection which makes use of the so-called Intrinsic Dimensionality (ID) of a data set. Using an ID measure as the core data mining technique allows us to not only run ATHMoS on a parameter by parameter basis, but also monitor and flag anomalies for multi-parameter interactions. By aggregating past telemetry data and employing these techniques, ATHMoS employs a supervised machine learning approach to construct three databases: Historic Nominal data, Recent Nominal data and past Anomaly data. Once new telemetry is received, the algorithm makes a distinction between nominal behaviour and new potentially dangerous behaviour, the latter of which is then flagged to mission engineers. ATHMoS continually learns to distinguish between new nominal behavior and true anomaly events throughout the mission lifetime. To this end, we present an overview of the algorithms ATHMoS uses as well as an example where we successfully detected both previously unknown and known anomalies for an ongoing mission at GSOC.
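
As a rough illustration of the Out Of Limit check described in this abstract, the following Python sketch flags samples outside fixed limits and shows how a slow drift can stay unflagged until the limit is finally crossed. The function name, the limits and the sample values are hypothetical and are not taken from ATHMoS.

```python
from typing import List, Sequence


def out_of_limit(values: Sequence[float], lower: float, upper: float) -> List[int]:
    """Return the indices of samples that fall outside [lower, upper]."""
    return [i for i, v in enumerate(values) if v < lower or v > upper]


# Hypothetical telemetry: a temperature that drifts upward but stays in limits,
# followed by one sample that finally exceeds the upper limit.
telemetry = [20.0, 20.1, 20.0, 24.0, 27.5, 29.9, 31.2]
flagged = out_of_limit(telemetry, lower=10.0, upper=30.0)
print(flagged)  # → [6]
```

Only the last sample is reported, although the drift starts at index 3; this is the gap that, according to the abstract, the outlier-detection approach is meant to close.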

Institute of Transport Research:Publications

Crossref

Dimension Estimation Using Random Connection Models

Authors: Mandjes Michel; Serra Paulo
Publication date: 01/11/2017

Information about intrinsic dimension is crucial to perform dimensionality reduction, compress information, design efficient algorithms, and do statistical adaptation. In this paper we propose an estimator for the intrinsic dimension of a data set. The estimator is based on binary neighbourhood information about the observations in the form of two adjacency matrices, and does not require any explicit distance information. The underlying graph is modelled according to a subset of a specific random connection model, sometimes referred to as the Poisson blob model. Computationally, the estimator scales like n log n, and we specify its asymptotic distribution and rate of convergence. A simulation study on both real and simulated data shows that our approach compares favourably with some competing methods from the literature, including approaches that rely on distance information.

arXiv.org e-Print Archive

Pure OAI Repository

UvA-DARE

International Migration, Integration and Social Cohesion online publications

The generalized ratios intrinsic dimension estimator

Authors: Denti Francesco (ORCID:0000-0003-2978-4702); Doimo Diego; Laio Alessandro; Mira Antonietta
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2022

Modern datasets are characterized by numerous features related by complex dependency structures. To deal with these data, dimensionality reduction techniques are essential. Many of these techniques rely on the concept of intrinsic dimension (id), a measure of the complexity of the dataset. However, the estimation of this quantity is not trivial: often, the id depends rather dramatically on the scale of the distances among data points. At short distances, the id can be grossly overestimated due to the presence of noise, becoming smaller and approximately scale-independent only at large distances. An immediate approach to examining the scale dependence consists in decimating the dataset, which unavoidably induces non-negligible statistical errors at large scale. This article introduces a novel statistical method, Gride, that allows estimating the id as an explicit function of the scale without performing any decimation. Our approach is based on rigorous distributional results that enable the quantification of uncertainty of the estimates. Moreover, our method is simple and computationally efficient since it relies only on the distances among data points. Through simulation studies, we show that Gride is asymptotically unbiased, provides comparable estimates to other state-of-the-art methods, and is more robust to short-scale noise than other likelihood-based approaches.
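
To make the idea of a purely distance-based id estimate concrete, here is a minimal Python sketch of the two-nearest-neighbour ratio estimator, which corresponds to the simplest case of ratios between neighbour distances (neighbour ranks 1 and 2). It is not the authors' Gride implementation: the function name `two_nn_id`, the brute-force neighbour search and the synthetic data are assumptions made for illustration only.

```python
import math
import random
from typing import List, Sequence


def two_nn_id(points: Sequence[Sequence[float]]) -> float:
    """Maximum-likelihood id estimate from ratios of 2nd to 1st neighbour distances."""
    log_ratios: List[float] = []
    for i, p in enumerate(points):
        # Brute-force distances to all other points (O(n^2), fine for a sketch).
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        r1, r2 = dists[0], dists[1]
        if r1 > 0.0:  # skip exact duplicates, whose ratio is undefined
            log_ratios.append(math.log(r2 / r1))
    # Under the model, log(r2 / r1) is exponential with rate equal to the id.
    return len(log_ratios) / sum(log_ratios)


# Synthetic data (assumed for illustration): a flat 2-dimensional sheet
# embedded in 3-dimensional space.
rng = random.Random(0)
sheet = []
for _ in range(400):
    u, v = rng.random(), rng.random()
    sheet.append((u, v, 0.5 * u + 0.25 * v))

estimate = two_nn_id(sheet)
print(round(estimate, 2))  # expected to be close to 2 for this sheet
```

Because only the two closest neighbours are used, this sketch probes the shortest distance scale, which is where the abstract says noise can inflate the estimate; using higher neighbour ranks is the direction the abstract describes for examining larger scales.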

PubliCatt

Archivio istituzionale della ricerca - Università dell'Insubria

PubMed Central

Sissa Digital Library

A Modern Approach to Visualise Structured and Unstructured Space Missions' Data

Authors: Braun Armin; Dauth Matthias; Del Moro Agnese; Filip Vlad; Göttfert Tobias; Lesch Tobias; Schefels Clemens
Publication date: 01/01/2023

In this paper, the Visualisation and Data Analysis (ViDA) project, currently being developed at the German Space Operations Center (GSOC), is presented. ViDA is a modern, interactive, web-based frontend tool designed to efficiently explore various types of data generated by space missions. It is more than just a telemetry display tool and, as such, includes features from business intelligence, data science and AI tools, while being focused on the multi-spacecraft operations use case. The paper describes how the big data challenges (volume, variety, variability, complexity, value) in the context of spacecraft operations have been addressed and how the adopted solutions have been integrated into ViDA. It also highlights the importance of contextual knowledge as a crucial point for the design and implementation of ViDA. The techniques used for creating appropriate visual representations of the data and their relations are described. Such visualisations are specifically designed to deliver interpretable results to the users, thus helping them to quickly extract knowledge during their analytical process. Finally, the integration of ViDA into the ground system and its connections to the other tools in the telemetry/telecommand chain are discussed.

Institute of Transport Research:Publications

Image Classification with Deep Learning in the Presence of Noisy Labels: A Survey

Authors: Algan Görkem; Ulusoy Ilkay
Publication venue: 'Elsevier BV'
Publication date: 11/01/2021

Image classification systems recently made a giant leap with the advancement of deep neural networks. However, these systems require an excessive amount of labeled data to be adequately trained. Gathering a correctly annotated dataset is not always feasible due to several factors, such as the expense of the labeling process or the difficulty of correctly classifying data, even for experts. Because of these practical challenges, label noise is a common problem in real-world datasets, and numerous methods to train deep neural networks with label noise have been proposed in the literature. Although deep neural networks are known to be relatively robust to label noise, their tendency to overfit data makes them vulnerable to memorizing even random noise. Therefore, it is crucial to consider the existence of label noise and develop counter algorithms that mitigate its adverse effects in order to train deep neural networks efficiently. Even though an extensive survey of machine learning techniques under label noise exists, the literature lacks a comprehensive survey of methodologies centered explicitly around deep learning in the presence of noisy labels. This paper aims to present these algorithms while categorizing them into one of two subgroups: noise model based and noise model free methods. Algorithms in the first group aim to estimate the noise structure and use this information to avoid the adverse effects of noisy labels. In contrast, methods in the second group try to come up with inherently noise-robust algorithms by using approaches like robust losses, regularizers or other learning paradigms.

arXiv.org e-Print Archive

OpenMETU (Middle East Technical University)