2,266 research outputs found

A survey of outlier detection methodologies

Author: Austin J.
Hodge V.J.
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2004

Outlier detection has been used for centuries to detect and, where appropriate, remove anomalous observations from data. Outliers arise due to mechanical faults, changes in system behaviour, fraudulent behaviour, human error, instrument error or simply through natural deviations in populations. Their detection can identify system faults and fraud before they escalate with potentially catastrophic consequences. It can also identify errors and remove their contaminating effect on the data set, thereby purifying the data for processing. The original outlier detection methods were arbitrary, but principled and systematic techniques are now used, drawn from the full gamut of Computer Science and Statistics. In this paper, we introduce a survey of contemporary techniques for outlier detection. We identify their respective motivations and distinguish their advantages and disadvantages in a comparative review.

CiteSeerX

Crossref

White Rose Research Online

Furniture models learned from the WWW: using web catalogs to locate and categorize unknown furniture pieces in 3D laser scans

Author: Beetz Michael
Martinez Mozos Oscar
Marton Zoltan-Csaba
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2011

In this article, we investigate how autonomous robots can exploit the high-quality information already available from the WWW concerning 3-D models of office furniture. Apart from the hobbyist effort in Google 3-D Warehouse, many companies providing office furnishings already have the models for considerable portions of the objects found in our workplaces and homes. In particular, we present an approach that allows a robot to learn generic models of typical office furniture using examples found on the Web. These generic models are then used by the robot to locate and categorize unknown furniture in real indoor environments.

University of Lincoln Institutional Repository

CiteSeerX

Scaling associative classification for very large datasets

Author: Baralis Elena
Garza Paolo
Venturini Luca
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2017

Supervised learning algorithms are nowadays successfully scaling up to datasets that are very large in volume, leveraging the potential of in-memory cluster-computing Big Data frameworks. Still, massive datasets with a number of large-domain categorical features are a difficult challenge for any classifier. Most off-the-shelf solutions cannot cope with this problem. In this work we introduce DAC, a Distributed Associative Classifier. DAC exploits ensemble learning to distribute the training of an associative classifier among parallel workers and improve the final quality of the model. Furthermore, it adopts several novel techniques to reach high scalability without sacrificing quality, among which a preventive pruning of classification rules in the extraction phase based on Gini impurity. We ran experiments on Apache Spark, on a real large-scale dataset with more than 4 billion records and 800 million distinct categories. The results showed that DAC improves on a state-of-the-art solution in both prediction quality and execution time. Since the generated model is human-readable, it can not only classify new records, but also allow understanding both the logic behind the prediction and the properties of the model, becoming a useful aid for decision makers
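The abstract mentions pruning classification rules by Gini impurity. The following is only a minimal sketch of that general idea, not DAC's actual procedure: the function names `gini_impurity` and `keep_rule` and the threshold `max_impurity` are hypothetical, and the sketch assumes a rule is judged by the class labels of the training records it covers.

```python
from collections import Counter


def gini_impurity(labels):
    """Return the Gini impurity 1 - sum(p_c^2) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((count / n) ** 2 for count in counts.values())


def keep_rule(covered_labels, max_impurity=0.3):
    """Hypothetical pruning test: keep a rule only if the records it
    covers are sufficiently pure. The threshold is an assumption."""
    return gini_impurity(covered_labels) <= max_impurity


if __name__ == "__main__":
    mixed = ["a", "a", "a", "b"]   # impurity 0.375, pruned at threshold 0.3
    pure = ["a"] * 9 + ["b"]       # impurity about 0.18, kept
    print(keep_rule(mixed), keep_rule(pure))
```

A rule covering records of a single class has impurity 0, while a rule covering an even two-class mix has impurity 0.5, so a lower threshold prunes more aggressively.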

arXiv.org e-Print Archive

Directory of Open Access Journals

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

NEUROSURGERY ENTHUSIASTIC WOMEN SOCIETY

PORTO Publications Open Repository TOrino

Recurrent Pixel Embedding for Instance Grouping

Author: Fowlkes Charless
Kong Shu
Publication date: 21/12/2017

We introduce a differentiable, end-to-end trainable framework, consisting of two novel components, for solving pixel-level grouping problems such as instance segmentation. First, we regress pixels into a hyper-spherical embedding space so that pixels from the same group have high cosine similarity while those from different groups have similarity below a specified margin. We analyze the choice of embedding dimension and margin, relating them to theoretical results on the problem of distributing points uniformly on the sphere. Second, to group instances, we utilize a variant of mean-shift clustering, implemented as a recurrent neural network parameterized by kernel bandwidth. This recurrent grouping module is differentiable and enjoys convergent dynamics and probabilistic interpretability. Backpropagating the group-weighted loss through this module allows learning to focus on only correcting embedding errors that won't be resolved during subsequent clustering. Our framework, while conceptually simple and theoretically abundant, is also practically effective and computationally efficient. We demonstrate substantial improvements over state-of-the-art instance segmentation for object proposal generation, as well as demonstrating the benefits of grouping loss for classification tasks such as boundary detection and semantic segmentation.
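The embedding objective described in the abstract (high cosine similarity within a group, similarity below a margin across groups) can be illustrated with a small pairwise loss. This is a sketch under stated assumptions rather than the paper's loss: the hinge form, the function name `pair_loss`, and the default `margin=0.5` are all hypothetical.

```python
import math


def normalize(v):
    """Project a non-zero vector onto the unit sphere."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def cosine_similarity(u, v):
    """Cosine similarity of two non-zero vectors."""
    u, v = normalize(u), normalize(v)
    return sum(a * b for a, b in zip(u, v))


def pair_loss(u, v, same_group, margin=0.5):
    """Assumed hinge-style pair loss: pull same-group embeddings
    together, and push different-group embeddings apart only while
    their similarity still exceeds the margin."""
    s = cosine_similarity(u, v)
    if same_group:
        return 1.0 - s
    return max(0.0, s - margin)


if __name__ == "__main__":
    print(pair_loss([1.0, 0.0], [0.0, 1.0], same_group=False))  # orthogonal pair, no penalty
    print(pair_loss([1.0, 0.0], [1.0, 0.0], same_group=False))  # identical pair, penalised
```

Under this form, pairs from different groups stop contributing to the loss once their similarity drops below the margin, which matches the abstract's description of a specified margin.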

arXiv.org e-Print Archive

Crossref

Predicting Mental Health Crisis in Veterans: Early Warning Signs, Precursors and Protective Factors

Author: Annapureddy Priyanka
Publication venue: e-Publications@Marquette
Publication date: 01/10/2022

Mental Health (MH) conditions have recently increased to a large extent due to socio-demographic changes. Posttraumatic Stress Disorder (PTSD) is one of the most common mental health disorders in the US. PTSD is even more troubling among combat veterans leaving their service, in whom it occurs at double the rate of the general population. Severity of PTSD is associated with risk-taking behaviors such as substance abuse, non-suicidal self-injury, and sexual risk behaviors. Psychological disorders are often preceded by early warning signs, and recognizing the early warning signs of PTSD will help in preventing the return or worsening of PTSD symptoms. Ecological momentary assessment (EMA) studies are more sophisticated in tracking fluctuations of symptoms in real time, and they are effective in monitoring for crisis events in veterans. Mobile applications are a commonly used means of gathering such EMA information from participants. Our research focuses on developing interpretable machine learning (ML) models using socio-demographic data and EMA data from natural settings to predict high PTSD risk in veterans and those who engage in risky behaviors. Findings from these models can be integrated with existing m-health frameworks to generate text alerts to mentors when crisis patterns are observed in their mentees. Such an integrated crisis prediction and alerting system would help peer mentors plan interventions.

epublications@Marquette

k-Nearest Neighbour Classifiers: 2nd Edition (with Python examples)

Author: Cunningham Padraig
Delany Sarah Jane
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 29/04/2020

Perhaps the most straightforward classifier in the arsenal of machine learning techniques is the Nearest Neighbour Classifier -- classification is achieved by identifying the nearest neighbours to a query example and using those neighbours to determine the class of the query. This approach to classification is of particular importance because issues of poor run-time performance are not such a problem these days with the computational power that is available. This paper presents an overview of techniques for Nearest Neighbour classification focusing on: mechanisms for assessing similarity (distance), computational issues in identifying nearest neighbours and mechanisms for reducing the dimension of the data. This paper is the second edition of a paper previously published as a technical report. Sections on similarity measures for time-series, retrieval speed-up and intrinsic dimensionality have been added. An Appendix is included providing access to Python code for the key methods. Comment: 22 pages, 15 figures: An updated edition of an older tutorial on kN
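The classification step described in the abstract (find the nearest neighbours of a query, then let them determine its class) can be sketched in a few lines. This is an illustrative brute-force version and not the code from the paper's appendix; the function name `knn_classify`, the Euclidean distance, and the majority vote are assumptions.

```python
import math
from collections import Counter


def knn_classify(train, query, k=3):
    """Classify `query` by majority vote among its k nearest
    training examples under Euclidean distance.

    `train` is a list of (feature_tuple, label) pairs.
    """
    # Brute-force search: sort all training examples by distance to the query.
    neighbours = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    train = [
        ((0.0, 0.0), "a"), ((0.0, 1.0), "a"), ((1.0, 0.0), "a"),
        ((5.0, 5.0), "b"), ((5.0, 6.0), "b"), ((6.0, 5.0), "b"),
    ]
    print(knn_classify(train, (0.2, 0.1)))
    print(knn_classify(train, (5.5, 5.5)))
```

Sorting the whole training set costs O(n log n) per query, which is the run-time concern the abstract alludes to; the retrieval speed-up techniques it mentions aim to avoid this full scan.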

arXiv.org e-Print Archive

Arrow@TUDublin

An EMG Gesture Recognition System with Flexible High-Density Sensors and Brain-Inspired High-Dimensional Classifier

Author: Arias Ana C.
Benatti Simone
Benini Luca
Burghardt Fred
Khan Yasser
Menon Alisha
Moin Ali
Rabaey Jan M.
Rahimi Abbas
Tamakloe Senam
Ting Jonathan
Yamamoto Natasha
Zhou Andy
Publication date: 01/01/2018

EMG-based gesture recognition shows promise for human-machine interaction. Systems are often afflicted by signal and electrode variability, which degrades performance over time. We present an end-to-end system combating this variability using a large-area, high-density sensor array and a robust classification algorithm. EMG electrodes are fabricated on a flexible substrate and interfaced to a custom wireless device for 64-channel signal acquisition and streaming. We use brain-inspired high-dimensional (HD) computing for processing EMG features in one-shot learning. The HD algorithm is tolerant to noise and electrode misplacement and can quickly learn from few gestures without gradient descent or back-propagation. We achieve an average classification accuracy of 96.64% for five gestures, with only 7% degradation when training and testing across different days. Our system maintains this accuracy when trained with only three trials of gestures; it also demonstrates accuracy comparable to the state of the art when trained with one trial.

arXiv.org e-Print Archive

Crossref

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna