Outlier Mining Methods Based on Graph Structure Analysis

Author: Almeira Nahuel
Amil Marletti Pablo
Masoller Alonso Cristina
Publication venue: 'Frontiers Media SA'
Publication date: 01/01/2019

Outlier detection in high-dimensional datasets is a fundamental and challenging problem across disciplines that also has practical implications, as removing outliers from the training set improves the performance of machine learning algorithms. While many outlier mining algorithms have been proposed in the literature, they tend to be valid or efficient for specific types of datasets (time series, images, videos, etc.). Here we propose two methods that can be applied to generic datasets, as long as there is a meaningful measure of distance between pairs of elements of the dataset. Both methods start by defining a graph, where the nodes are the elements of the dataset, and the links have associated weights that are the distances between the nodes. Then, the first method assigns an outlier score based on the percolation (i.e., the fragmentation) of the graph. The second method uses the popular IsoMap non-linear dimensionality reduction algorithm, and assigns an outlier score by comparing the geodesic distances with the distances in the reduced space. We test these algorithms on real and synthetic datasets and show that they either outperform, or perform on par with, other popular outlier detection methods. A main advantage of the percolation method is that it is parameter-free and therefore does not require any training; on the other hand, the IsoMap method has two integer parameters, and when they are appropriately selected, the method performs similarly to or better than all the other methods tested. Peer reviewed. Postprint (published version).
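The percolation idea in the abstract above can be illustrated with a small sketch. This is one plausible reading of the abstract, not the authors' implementation: it assumes Euclidean distance, removes links from longest to shortest, and scores each point by the link weight at which it first falls out of the largest connected component. The names `components` and `percolation_scores` are hypothetical, and the quadratic recomputation is only suitable for toy inputs.

```python
from itertools import combinations
from math import dist


def components(n, edges):
    """Connected components of an undirected graph on nodes 0..n-1 (union-find)."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i, j in edges:
        parent[find(i)] = find(j)
    groups = {}
    for v in range(n):
        groups.setdefault(find(v), []).append(v)
    return list(groups.values())


def percolation_scores(points):
    """Assumed scoring rule: weight of the removed link at which a point
    first leaves the largest component. Larger score = more outlying."""
    n = len(points)
    # Complete graph, links sorted from longest to shortest.
    links = sorted(
        ((dist(points[i], points[j]), i, j) for i, j in combinations(range(n), 2)),
        reverse=True,
    )
    scores = [0.0] * n
    scored = set()
    for k, (weight, _, _) in enumerate(links):
        # Keep only the links shorter than (or tied after) the k-th one.
        kept = [(i, j) for _, i, j in links[k + 1:]]
        giant = set(max(components(n, kept), key=len))
        for v in range(n):
            if v not in giant and v not in scored:
                scores[v] = weight
                scored.add(v)
    # The last surviving node is never disconnected and keeps score 0.0.
    return scores
```

On a tight cluster plus one distant point, the distant point detaches first, at its shortest link to the cluster, and therefore receives the largest score. No parameter is tuned, which matches the parameter-free property the abstract claims for the percolation method.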

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

CONICET Digital

STWalk: Learning Trajectory Representations in Temporal Graphs

Author: Balasubramanian Vineeth N
Gupta Manish
Mittal Himangi
Pandhre Supriya
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 11/11/2017

Analyzing the temporal behavior of nodes in time-varying graphs is useful for many applications such as targeted advertising, community evolution and outlier detection. In this paper, we present a novel approach, STWalk, for learning trajectory representations of nodes in temporal graphs. The proposed framework makes use of structural properties of graphs at current and previous time-steps to learn effective node trajectory representations. STWalk performs random walks on a graph at a given time step (called space-walk) as well as on graphs from past time-steps (called time-walk) to capture the spatio-temporal behavior of nodes. We propose two variants of STWalk to learn trajectory representations. In one algorithm, we perform space-walk and time-walk as part of a single step. In the other variant, we perform space-walk and time-walk separately and combine the learned representations to get the final trajectory embedding. Extensive experiments on three real-world temporal graph datasets validate the effectiveness of the learned representations when compared to three baseline methods. We also show the goodness of the learned trajectory embeddings for change point detection, as well as demonstrate that arithmetic operations on these trajectory representations yield interesting and interpretable results. Comment: 10 pages, 5 figures, 2 tables
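To make the space-walk and time-walk idea concrete, here is a minimal sketch of a combined walk over graph snapshots. It is an illustration under stated assumptions, not the STWalk algorithm itself: the snapshot format (a list of adjacency dicts), the function name `space_time_walk`, and the `p_back` probability of stepping to the previous time-step are all hypothetical choices made for this example.

```python
import random


def space_time_walk(snapshots, start, t, length, p_back=0.3, rng=None):
    """Random walk over (node, time) pairs.

    snapshots: list of dicts mapping node -> list of neighbours, one per time-step.
    A space-walk step moves to a neighbour in the current snapshot; a time-walk
    step keeps the node and moves to the previous snapshot (assumed rule).
    """
    rng = rng or random.Random()
    node, time = start, t
    walk = [(node, time)]
    while len(walk) < length:
        neighbours = snapshots[time].get(node, [])
        can_go_back = time > 0 and node in snapshots[time - 1]
        if can_go_back and (not neighbours or rng.random() < p_back):
            time -= 1                      # time-walk step
        elif neighbours:
            node = rng.choice(neighbours)  # space-walk step
        else:
            break                          # dead end: no neighbours, no past
        walk.append((node, time))
    return walk
```

Walks produced this way could then be passed to a skip-gram style embedding model, as is common for random-walk node embeddings; that step is omitted here.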

arXiv.org e-Print Archive

Research Archive of Indian Institute of Technology Hyderabad

Spatial Data Quality in the IoT Era: Management and Exploitation

Author: Cheema Muhammad Aamir
Jensen Christian S.
Li Huan
Lu Hua
Tang Bo
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2022

Within the rapidly expanding Internet of Things (IoT), growing amounts of spatially referenced data are being generated. Due to the dynamic, decentralized, and heterogeneous nature of the IoT, spatial IoT data (SID) quality has attracted considerable attention in academia and industry. Inventing and using technologies for managing spatial data quality and exploiting low-quality spatial data are key challenges in the IoT. In this tutorial, we highlight the SID consumption requirements in applications and offer an overview of spatial data quality in the IoT setting. In addition, we review pertinent technologies for quality management and low-quality data exploitation, and we identify trends and future directions for quality-aware SID management and utilization. The tutorial aims not only to help researchers and practitioners better comprehend SID quality challenges and solutions, but also to offer insights that may enable innovative research and applications.

Roskilde Universitet

ZENODO

VBN

NEUROSURGERY ENTHUSIASTIC WOMEN SOCIETY

Towards Real-Time Detection and Tracking of Spatio-Temporal Features: Blob-Filaments in Fusion Plasma

Author: Chang Cs
Choi Jong Y.
Churchill Michael
Klasky Scott
Sim Alex
Stathopoulos Andreas
Wu Kesheng
Wu Lingfei
Publication date: 01/07/2016

A novel algorithm and implementation for real-time identification and tracking of blob-filaments in fusion reactor data is presented. Similar spatio-temporal features are important in many other applications, for example, ignition kernels in combustion and tumor cells in a medical image. This work presents an approach for extracting these features by dividing the overall task into three steps: local identification of feature cells, grouping feature cells into extended features, and tracking the movement of features through their overlap in space. Through our extensive work in parallelization, we demonstrate that this approach can effectively make use of a large number of compute nodes to detect and track blob-filaments in real time in fusion plasma. On a set of 30GB fusion simulation data, we observed linear speedup on 1024 processes and completed blob detection in less than three milliseconds using Edison, a Cray XC30 system at NERSC. Comment: 14 pages, 40 figures
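The three steps named in the abstract above (identify feature cells, group them into extended features, track features by spatial overlap) can be sketched serially as follows. This is a toy illustration only: the fixed intensity threshold, the 4-connectivity rule, and the function names `find_blobs` and `track` are assumptions for the example, and nothing here reflects the parallel implementation the abstract describes.

```python
def find_blobs(frame, threshold):
    """Steps 1 and 2: mark cells above the threshold, then group
    4-connected marked cells into blobs (sets of (row, col) cells)."""
    rows, cols = len(frame), len(frame[0])
    hot = {(r, c) for r in range(rows) for c in range(cols)
           if frame[r][c] > threshold}
    blobs = []
    while hot:
        seed = hot.pop()
        blob, stack = {seed}, [seed]
        while stack:  # flood fill from the seed cell
            r, c = stack.pop()
            for nb in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                if nb in hot:
                    hot.remove(nb)
                    blob.add(nb)
                    stack.append(nb)
        blobs.append(blob)
    return blobs


def track(prev_blobs, next_blobs):
    """Step 3: link blobs in consecutive frames that share at least one cell."""
    return [(i, j)
            for i, a in enumerate(prev_blobs)
            for j, b in enumerate(next_blobs)
            if a & b]
```

Calling `find_blobs` on two consecutive frames and passing the results to `track` yields index pairs linking each blob to its overlapping successor. The order of the returned blobs is not guaranteed, so sort them if stable indices are needed.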

arXiv.org e-Print Archive

eScholarship - University of California

Using Spatial Methods to Better Understand Food Insecurity and SNAP Under-Participation in Texas

Author: Ramphul Ryan
Publication venue: DigitalCommons@TMC
Publication date: 01/07/2020

The overall objective of this research is to use spatial methods to better understand food insecurity and SNAP under-participation in Texas. Paper 1 assesses whether a sample of community-dwelling Medicare and Medicaid beneficiaries, who screen positive for food insecurity at healthcare locations in Harris County, exhibit a spatial pattern in terms of where they live. In other words, it tests whether or not there are statistically significant neighborhood hot spots or cold spots of food insecurity against a null hypothesis of complete spatial randomness. This approach is novel because it uses address-level data on patients who report being food insecure to test for statistically significant neighborhood hot spots or cold spots, instead of relying on extant factors like neighborhood poverty rates or the presence of grocery stores. Using address-level food insecurity screening data is often difficult because few organizations screen for food insecurity, and even fewer are willing to share their data due to privacy concerns. Paper 2 utilizes geographical information systems (GIS) to map census tract-level clusters and outliers of households that are eligible but not enrolled (EBNE) in the SNAP program. The implications of this analysis are vast. Knowing the locations of neighborhood-level clusters and outliers of SNAP EBNE households can inform interventions to address the “SNAP GAP” more effectively. Additionally, this method of identifying neighborhood-level clusters and outliers of SNAP EBNE households can be applied to other safety net programs including Medicaid, the Children’s Health Insurance Program (CHIP), Healthy Texas Women, and the Women, Infant, and Children (WIC) Program.

DigitalCommons@The Texas Medical Center