A Unique Pipeline Model to Improve Anomaly Detection in High Dimensional Data

Author: Upasana Gupta et al.
Publication venue: Auricle Global Society of Education and Research
Publication date: 07/11/2023

This paper presents a comprehensive method for dimension reduction and anomaly detection in high-dimensional data (healthcare datasets) using R. Recognizing that traditional linear methods such as Principal Component Analysis (PCA) often ignore the non-linear manifold structure of the data, our approach exploits iterative learning, on the assumption that high-dimensional data lies largely on a low-dimensional manifold. The methodology starts by preparing the data using R libraries such as Keras, dplyr, and ggplot2, addressing challenges like missing values and visualizing meaningful information. Using the Mahalanobis distance, the paper identifies and removes country-specific outliers. The pipelined model integrates PCA for data transformation and combines an Autoencoder with t-SNE for dimensionality reduction. This refined dataset is then used to train a Multi-Layer Perceptron (MLP) artificial neural network, which facilitates anomaly detection based on reconstruction errors, illustrated by the point cloud. Additionally, the paper explores metric multidimensional scaling using artificial neural networks, tests large datasets such as healthcare and wine, and compares the results with those of conventional techniques. This study highlights the effectiveness of integrating various pre-processing, visualization, and artificial neural network strategies through R for anomaly detection.
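The two scoring ideas named in this abstract, Mahalanobis-distance outlier removal and anomaly detection by reconstruction error, can be sketched as follows. This is a minimal illustration, not the paper's pipeline. It is written in Python with NumPy rather than R, runs on synthetic data, and uses PCA reconstruction error as a simple stand-in for the Autoencoder and MLP stages. The function names, the 0.95 quantile cut-off, and the planted anomaly are all assumptions made for the example.

```python
import numpy as np

def mahalanobis_distances(X):
    """Mahalanobis distance of each row of X from the sample mean."""
    mu = X.mean(axis=0)
    cov = np.cov(X, rowvar=False)
    cov_inv = np.linalg.pinv(cov)  # pseudo-inverse tolerates a singular covariance
    diff = X - mu
    d2 = np.einsum("ij,jk,ik->i", diff, cov_inv, diff)
    return np.sqrt(np.maximum(d2, 0.0))  # clip tiny negative rounding errors

def fit_pca(X, n_components):
    """Return the mean and the top principal directions (as columns)."""
    mu = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mu, full_matrices=False)
    return mu, vt[:n_components].T

def reconstruction_error(X, mu, W):
    """Squared distance between each row and its projection onto span(W)."""
    Xc = X - mu
    return np.sum((Xc - Xc @ W @ W.T) ** 2, axis=1)

# Synthetic data (assumption): 200 points near a 2-D plane embedded in 10-D.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
basis = rng.normal(size=(2, 10))
X = latent @ basis + 0.05 * rng.normal(size=(200, 10))
X[0] += rng.normal(size=10)  # planted anomaly pushed off the plane

# Step 1: drop the most extreme rows by Mahalanobis distance (hypothetical cut-off).
d = mahalanobis_distances(X)
keep = d <= np.quantile(d, 0.95)

# Step 2: fit the low-dimensional model on the retained rows, then score every row.
mu, W = fit_pca(X[keep], n_components=2)
err = reconstruction_error(X, mu, W)
suspect = int(np.argmax(err))
print("row with largest reconstruction error:", suspect)
```

Fitting on the filtered rows and scoring all rows keeps extreme points from shaping the model they are later judged against. A non-linear model would replace `fit_pca` and `reconstruction_error` while leaving the scoring logic unchanged.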

International Journal on Recent and Innovation Trends in Computing and Communication

Principal manifolds and graphs in practice: from molecular biology to dynamical systems

Author: Alexander N. Gorban
Andrei Zinovyev
Publication venue: World Scientific Pub Co Pte Lt
Publication date: 25/07/2010

We present several applications of non-linear data modeling, using principal manifolds and principal graphs constructed using the metaphor of elasticity (the elastic principal graph approach). These approaches are generalizations of Kohonen's self-organizing maps, a class of artificial neural networks. Using several examples, we show the advantages of non-linear objects over linear ones for data approximation. We propose four numerical criteria for comparing linear and non-linear mappings of datasets into lower-dimensional spaces. The examples are taken from comparative political science, from the analysis of high-throughput data in molecular biology, and from the analysis of dynamical systems. Comment: 12 pages, 9 figures

arXiv.org e-Print Archive

CiteSeerX

Crossref