How is a data-driven approach better than random choice in label space division for multi-label classification?
We propose using five data-driven community detection approaches from social
networks to partition the label space for the task of multi-label
classification as an alternative to random partitioning into equal subsets as
performed by RAkELd: modularity-maximizing fastgreedy and leading eigenvector,
infomap, walktrap and label propagation algorithms. We construct a label
co-occurrence graph (in both weighted and unweighted versions) from the
training data and perform community detection to partition the label set. We include
Binary Relevance and Label Powerset classification methods for comparison. We
use Gini-index-based Decision Trees as the base classifier. We compare educated
approaches to label space divisions against random baselines on 12 benchmark
data sets over five evaluation measures. We show that in almost all cases the
seven educated-guess approaches are more likely than not to outperform RAkELd
on all measures except Hamming Loss. Fastgreedy and walktrap community
detection on weighted label co-occurrence graphs are 85-92% more likely to
yield better F1 scores than random partitioning. Infomap on unweighted label
co-occurrence graphs beats random partitioning 90% of the time on average in
terms of Subset Accuracy and 89% in terms of Jaccard similarity. Weighted
fastgreedy is also better than RAkELd on average in terms of Hamming Loss.
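The label-space partitioning step can be sketched as follows. This is a minimal illustration, not the paper's exact pipeline: the toy label matrix `Y` is hypothetical, and networkx's `greedy_modularity_communities` stands in for the igraph fastgreedy algorithm the abstract names (both maximize modularity greedily).

```python
import numpy as np
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

# Toy training label matrix: rows = samples, columns = labels (hypothetical).
Y = np.array([
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
])

# Weighted label co-occurrence graph: edge weight = number of training
# samples in which the two labels appear together.
co = Y.T @ Y
n_labels = Y.shape[1]
G = nx.Graph()
G.add_nodes_from(range(n_labels))
for i in range(n_labels):
    for j in range(i + 1, n_labels):
        if co[i, j] > 0:
            G.add_edge(i, j, weight=int(co[i, j]))

# Modularity-maximizing community detection partitions the label set;
# each community then becomes one Label Powerset subproblem.
communities = greedy_modularity_communities(G, weight="weight")
partition = [sorted(c) for c in communities]
print(partition)
```

Each label subset in `partition` would then be handled by its own multi-class classifier, exactly as RAkELd does with its random subsets.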
Frame Coherence and Sparse Signal Processing
The sparse signal processing literature often uses random sensing matrices to
obtain performance guarantees. Unfortunately, in the real world, sensing
matrices do not always come from random processes. It is therefore desirable to
evaluate whether an arbitrary matrix, or frame, is suitable for sensing sparse
signals. To this end, the present paper investigates two parameters that
measure the coherence of a frame: worst-case and average coherence. We first
provide several examples of frames that have small spectral norm, worst-case
coherence, and average coherence. Next, we present a new lower bound on
worst-case coherence and compare it to the Welch bound. Later, we propose an
algorithm that decreases the average coherence of a frame without changing its
spectral norm or worst-case coherence. Finally, we use worst-case and average
coherence, as opposed to the Restricted Isometry Property, to garner
near-optimal probabilistic guarantees on both sparse signal detection and
reconstruction in the presence of noise. This contrasts with recent results
that only guarantee noiseless signal recovery from arbitrary frames, and which
further assume independence across the nonzero entries of the signal---in a
sense, requiring small average coherence replaces the need for such an
assumption.
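The two coherence parameters are cheap to compute from a frame's Gram matrix. A minimal sketch, assuming a random real frame with unit-norm columns (the average-coherence formula follows the standard definition: the largest, over indices i, of the magnitude of the mean inner product of column i with the other columns):

```python
import numpy as np

# Hypothetical M x N frame with unit-norm columns (M < N).
rng = np.random.default_rng(0)
M, N = 4, 8
F = rng.standard_normal((M, N))
F /= np.linalg.norm(F, axis=0)  # normalize each column

# Gram matrix of pairwise inner products; zero out the diagonal.
G = F.T @ F
off_diag = G - np.eye(N)

# Worst-case coherence: largest |<f_i, f_j>| over distinct pairs.
mu = np.max(np.abs(off_diag))

# Average coherence: max_i |(1/(N-1)) * sum_{j != i} <f_i, f_j>|.
nu = np.max(np.abs(off_diag.sum(axis=1))) / (N - 1)

# Welch lower bound on worst-case coherence for an M x N unit-norm frame.
welch = np.sqrt((N - M) / (M * (N - 1)))

print(f"mu={mu:.3f}, nu={nu:.3f}, Welch bound={welch:.3f}")
```

For any unit-norm frame, `mu` is at least the Welch bound, and `nu` never exceeds `mu` (an average of magnitudes is bounded by their maximum).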
Multi-criteria Anomaly Detection using Pareto Depth Analysis
We consider the problem of identifying patterns in a data set that exhibit
anomalous behavior, often referred to as anomaly detection. In most anomaly
detection algorithms, the dissimilarity between data samples is calculated by a
single criterion, such as Euclidean distance. However, in many cases there may
not exist a single dissimilarity measure that captures all possible anomalous
patterns. In such a case, multiple criteria can be defined, and one can test
for anomalies by scalarizing the multiple criteria using a linear combination
of them. If the importance of the different criteria is not known in advance,
the algorithm may need to be executed multiple times with different choices of
weights in the linear combination. In this paper, we introduce a novel
non-parametric multi-criteria anomaly detection method using Pareto depth
analysis (PDA). PDA uses the concept of Pareto optimality to detect anomalies
under multiple criteria without having to run an algorithm multiple times with
different choices of weights. The proposed PDA approach scales linearly in the
number of criteria and is provably better than linear combinations of the
criteria.
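The Pareto-front peeling at the heart of this idea can be sketched as follows. This is a simplified illustration with hypothetical two-criterion dissimilarity dyads, not the full PDA algorithm: points on deeper fronts are dominated by more of the data and, in the PDA framework, correspond to more anomalous relationships.

```python
import numpy as np

def pareto_fronts(points):
    """Peel successive Pareto fronts (minimization in every coordinate).

    Returns a depth for each point: depth 1 = first Pareto front, etc.
    """
    pts = np.asarray(points, dtype=float)
    n = len(pts)
    depth = np.zeros(n, dtype=int)
    remaining = np.arange(n)
    d = 0
    while remaining.size:
        d += 1
        sub = pts[remaining]
        # A point is dominated if some other remaining point is <= in
        # every coordinate and strictly < in at least one.
        dominated = np.array([
            any(np.all(q <= p) and np.any(q < p) for q in sub)
            for p in sub
        ])
        front = remaining[~dominated]  # non-dominated points form this front
        depth[front] = d
        remaining = remaining[dominated]
    return depth

# Toy dyads of (criterion-1, criterion-2) dissimilarities (hypothetical).
dyads = np.array([[0.1, 0.9], [0.5, 0.5], [0.9, 0.1], [0.6, 0.7], [0.8, 0.8]])
print(pareto_fronts(dyads))  # first three points are mutually non-dominated
```

No scalarization weights appear anywhere: the fronts capture every trade-off between the criteria at once, which is what lets PDA avoid re-running the detector for each weight choice.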
Tuning Windowed Chi-Squared Detectors for Sensor Attacks
A model-based windowed chi-squared procedure is proposed for identifying
falsified sensor measurements. We employ the widely-used static chi-squared and
the dynamic cumulative sum (CUSUM) fault/attack detection procedures as
benchmarks to compare the performance of the windowed chi-squared detector. In
particular, we characterize the state degradation that a class of attacks can
induce to the system while enforcing that the detectors do not raise alarms
(zero-alarm attacks). We quantify the advantage of using dynamic detectors
(windowed chi-squared and CUSUM detectors), which leverage the history of the
state, over a static detector (chi-squared) which uses a single measurement at
a time. Simulations using a chemical reactor are presented to illustrate the
performance of our tools.
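A windowed chi-squared detector of the kind described can be sketched as follows. This is a minimal illustration, not the paper's LTI/estimator setup: it assumes scalar, i.i.d. Gaussian residuals under attack-free operation, and the bias-attack values are hypothetical.

```python
import numpy as np
from scipy.stats import chi2

def windowed_chi2_alarms(residuals, sigma, window, alpha=0.01):
    """Alarm when the sum of the last `window` normalized squared
    residuals exceeds the chi-squared threshold with `window` degrees
    of freedom at false-alarm rate alpha."""
    threshold = chi2.ppf(1 - alpha, df=window)
    z = (np.asarray(residuals) / sigma) ** 2
    alarms = []
    for k in range(window - 1, len(z)):
        stat = z[k - window + 1 : k + 1].sum()
        alarms.append(stat > threshold)
    return np.array(alarms)

rng = np.random.default_rng(1)
sigma = 1.0
nominal = rng.standard_normal(200) * sigma   # attack-free residuals
attacked = nominal.copy()
attacked[100:] += 3.0                        # constant sensor-bias attack

rate_nom = windowed_chi2_alarms(nominal, sigma, window=10).mean()
rate_att = windowed_chi2_alarms(attacked, sigma, window=10).mean()
print(f"alarm rate nominal={rate_nom:.3f}, under attack={rate_att:.3f}")
```

With `window=1` this reduces to the static chi-squared detector; larger windows accumulate evidence over the state history, which is the advantage over the static test that the abstract quantifies.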