32,742 research outputs found

Bayesian outlier detection in Capital Asset Pricing Model

Author: De Giuli Maria Elena
Maggi Mario Alessandro
Tarantola Claudia
Publication date: 23/04/2009

We propose a novel Bayesian optimisation procedure for outlier detection in the Capital Asset Pricing Model. We use a parametric product partition model to robustly estimate the systematic risk of an asset. We assume that the returns follow independent normal distributions and we impose a partition structure on the parameters of interest. The partition structure imposed on the parameters induces a corresponding clustering of the returns. We identify, via an optimisation procedure, the partition that best separates standard observations from atypical ones. The methodology is illustrated with reference to a real data set, for which we also provide a microeconomic interpretation of the detected outliers.
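To make the setting of this abstract concrete, the sketch below estimates the systematic risk (beta) of an asset by ordinary least squares and flags atypical returns from the regression residuals. It is not the authors' Bayesian product partition model: the residual rule based on the median absolute deviation is a simple stand-in, and the function names `capm_beta` and `flag_outliers`, the `threshold` default, and the sample data are all assumptions made for illustration.

```python
from statistics import mean, median


def capm_beta(asset_excess, market_excess):
    """OLS fit of asset_excess = alpha + beta * market_excess + error."""
    m_bar = mean(market_excess)
    a_bar = mean(asset_excess)
    cov = sum((m - m_bar) * (a - a_bar)
              for m, a in zip(market_excess, asset_excess))
    var = sum((m - m_bar) ** 2 for m in market_excess)
    beta = cov / var
    alpha = a_bar - beta * m_bar
    return alpha, beta


def flag_outliers(asset_excess, market_excess, threshold=3.5):
    """Return indices whose residual lies far from the median residual.

    Illustrative rule only (not the method of the paper): a residual is
    flagged when it deviates from the median by more than `threshold`
    times a robust scale estimate (1.4826 * median absolute deviation).
    """
    alpha, beta = capm_beta(asset_excess, market_excess)
    residuals = [a - alpha - beta * m
                 for m, a in zip(market_excess, asset_excess)]
    med = median(residuals)
    scale = 1.4826 * median([abs(r - med) for r in residuals])
    if scale == 0:
        return []  # no spread in the residuals, nothing to compare against
    return [i for i, r in enumerate(residuals)
            if abs(r - med) > threshold * scale]


# Hypothetical excess returns: true beta 1.5, small noise, one shock at index 3.
market = [-0.03, -0.02, -0.01, 0.0, 0.01, 0.02, 0.03]
noise = [0.001, -0.001, 0.002, 0.10, -0.002, 0.001, -0.001]
asset = [1.5 * m + e for m, e in zip(market, noise)]
print(capm_beta(asset, market))
print(flag_outliers(asset, market))
```

Because the beta estimate here is plain OLS, a large outlier can still pull the fitted line; the abstract's point is that estimating the partition and the risk parameter jointly avoids that contamination.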

arXiv.org e-Print Archive

Archivio Istituzionale della Ricerca - Università degli Studi di Pavia

A survey of outlier detection methodologies

Author: Austin J.
Hodge V.J.
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2004

Outlier detection has been used for centuries to detect and, where appropriate, remove anomalous observations from data. Outliers arise due to mechanical faults, changes in system behaviour, fraudulent behaviour, human error, instrument error or simply through natural deviations in populations. Their detection can identify system faults and fraud before they escalate with potentially catastrophic consequences. It can also identify errors and remove their contaminating effect on the data set, thereby purifying the data for processing. The original outlier detection methods were arbitrary, but principled and systematic techniques, drawn from the full gamut of Computer Science and Statistics, are now used. In this paper, we introduce a survey of contemporary techniques for outlier detection. We identify their respective motivations and distinguish their advantages and disadvantages in a comparative review.

CiteSeerX

Crossref

White Rose Research Online

Preprocessing Among the Infalling Galaxy Population of EDisCS Clusters

Author: Aragon-Salamanca Alfonso
Bian Fu-Yan
Clowe Douglas
Cool Richard
De Lucia Gabriella
Desai Vandana
Desjardins Tyler
Finn Rose
Halliday Claire
Jablonka Pascale
Just Dennis W.
Kirby Matthew
Liebst Kelley
Mann Justin
Moustakas John
Poggianti Bianca
Rudnick Gregory
Zaritsky Dennis
Publication venue: 'American Astronomical Society'
Publication date: 01/01/2019

We present results from a low-resolution spectroscopic survey for 21 galaxy clusters at $0.4 < z < 0.8$ selected from the ESO Distant Cluster Survey. We measured spectra using the low-dispersion prism in IMACS on the Magellan Baade telescope and calculate redshifts with an accuracy of $\sigma_z = 0.007$. We find 1763 galaxies that are brighter than $R = 22.9$ in the large-scale cluster environs. We identify the galaxies expected to be accreted by the clusters as they evolve to $z = 0$ using spherical infall models and find that $\sim30\%$ to $\sim70\%$ of the $z = 0$ cluster population lies outside the virial radius at $z \sim 0.6$. For analogous clusters at $z = 0$, we calculate that the ratio of galaxies that have fallen into the clusters since $z \sim 0.6$ to those that were already in the core at that redshift is typically between $\sim0.3$ and $1.5$. This wide range of ratios is due to intrinsic scatter and is not a function of velocity dispersion, so a variety of infall histories is to be expected for clusters with current velocity dispersions of $300 \lesssim \sigma \lesssim 1200$ km s$^{-1}$. Within the infall regions of $z \sim 0.6$ clusters, we find a larger red fraction of galaxies than in the field and greater clustering among red galaxies than blue. We interpret these findings as evidence of "preprocessing", where galaxies in denser local environments have their star formation rates affected prior to their aggregation into massive clusters, although the possibility of backsplash galaxies complicates the interpretation. Comment: Accepted for publication in Ap

arXiv.org e-Print Archive

Infoscience - École polytechnique fédérale de Lausanne

Repository@Nottingham

KU ScholarWorks

HAL-INSU

OA@INAF - Istituto Nazionale di Astrofisica

The University of Arizona

Caltech Authors

HAL-OBSPM

Distributed Low-rank Subspace Segmentation

Author: Chang Shih-Fu
Jordan Michael I.
Mackey Lester
Mu Yadong
Talwalkar Ameet
Publication date: 15/10/2013

Vision problems ranging from image clustering to motion segmentation to semi-supervised learning can naturally be framed as subspace segmentation problems, in which one aims to recover multiple low-dimensional subspaces from noisy and corrupted input data. Low-Rank Representation (LRR), a convex formulation of the subspace segmentation problem, is provably and empirically accurate on small problems but does not scale to the massive sizes of modern vision datasets. Moreover, past work aimed at scaling up low-rank matrix factorization is not applicable to LRR given its non-decomposable constraints. In this work, we propose a novel divide-and-conquer algorithm for large-scale subspace segmentation that can cope with LRR's non-decomposable constraints and maintains LRR's strong recovery guarantees. This has immediate implications for the scalability of subspace segmentation, which we demonstrate on a benchmark face recognition dataset and in simulations. We then introduce novel applications of LRR-based subspace segmentation to large-scale semi-supervised learning for multimedia event detection, concept detection, and image tagging. In each case, we obtain state-of-the-art results and order-of-magnitude speedups.

arXiv.org e-Print Archive

Crossref

A taxonomy framework for unsupervised outlier detection techniques for multi-type data sets

Author: Havinga P.J.M.
Meratnia N.
Zhang Yang
Publication venue: Centre for Telematics and Information Technology, University of Twente
Publication date: 01/01/2007

The term "outlier" can generally be defined as an observation that is significantly different from the other values in a data set. Outliers may be instances of error or may indicate events. The task of outlier detection aims at identifying such outliers in order to improve the analysis of data and to discover interesting and useful knowledge about unusual events within numerous application domains. In this paper, we report on contemporary unsupervised outlier detection techniques for multiple types of data sets and provide a comprehensive taxonomy framework and two decision trees to select the most suitable technique based on the data set. Furthermore, we highlight the advantages, disadvantages and performance issues of each class of outlier detection techniques under this taxonomy framework.
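As one illustration of the definition above, the following sketch scores each observation by its distance to its k-th nearest neighbour, a common distance-based unsupervised approach. It is not taken from the paper's taxonomy or decision trees; the function name `knn_outlier_scores`, the choice `k=2`, and the sample points are assumptions for illustration only.

```python
from math import dist


def knn_outlier_scores(points, k=2):
    """Score each point by the Euclidean distance to its k-th nearest neighbour.

    Larger scores mean the point sits further from the rest of the data.
    Requires at least k + 1 points.
    """
    scores = []
    for i, p in enumerate(points):
        # Distances from p to every other point, nearest first.
        distances = sorted(dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(distances[k - 1])
    return scores


# Hypothetical 2-D data: four points on a unit square and one far away.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]
scores = knn_outlier_scores(points, k=2)
most_isolated = max(range(len(points)), key=lambda i: scores[i])
print(most_isolated)
```

The brute-force loop compares every pair of points, so it is only suitable for small data sets; it needs no labels, which is what makes it an unsupervised technique.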

University of Twente Research Information