A taxonomy framework for unsupervised outlier detection techniques for multi-type data sets
The term "outlier" can generally be defined as an observation that is significantly different from
the other values in a data set. The outliers may be instances of error or indicate events. The
task of outlier detection aims at identifying such outliers in order to improve the analysis of
data and to discover interesting and useful knowledge about unusual events in numerous
application domains. In this paper, we report on contemporary unsupervised outlier detection
techniques for multiple types of data sets and provide a comprehensive taxonomy framework and
two decision trees for selecting the most suitable technique for a given data set. Furthermore, we
highlight the advantages, disadvantages and performance issues of each class of outlier detection
techniques under this taxonomy framework.
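As a concrete illustration of what an unsupervised outlier detector does, here is a minimal sketch of one simple statistical technique, the modified z-score based on the median absolute deviation (MAD). It is not taken from the paper; the function name `mad_outliers`, the sample `readings`, and the conventional constants 0.6745 and 3.5 are assumptions chosen for the example.

```python
from statistics import median


def mad_outliers(values, threshold=3.5):
    """Flag values whose modified z-score exceeds `threshold`.

    The modified z-score is 0.6745 * (x - median) / MAD, where MAD is the
    median absolute deviation from the median. No labels are needed, so the
    method is unsupervised.
    """
    values = list(values)
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # More than half the values are identical; the score is undefined,
        # so this sketch simply flags nothing.
        return [False] * len(values)
    return [abs(0.6745 * (v - med) / mad) > threshold for v in values]


# Hypothetical sensor readings with one obviously unusual value.
readings = [10.1, 9.8, 10.3, 10.0, 9.9, 42.0, 10.2]
flags = mad_outliers(readings)
outliers = [v for v, is_out in zip(readings, flags) if is_out]
```

Because the median and MAD are barely affected by the extreme value itself, this kind of detector tends to be more robust than one based on the mean and standard deviation, at the cost of assuming roughly symmetric, univariate data.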
A multi-scale filament extraction method: getfilaments
Far-infrared imaging surveys of Galactic star-forming regions with Herschel
have shown that a substantial part of the cold interstellar medium appears as a
fascinating web of omnipresent filamentary structures. This highly anisotropic
ingredient of the interstellar material further complicates the difficult
problem of the systematic detection and measurement of dense cores in the
strongly variable but (relatively) isotropic backgrounds. Observational
evidence that stars form in dense filaments creates severe problems for
automated source extraction methods that must reliably distinguish sources not
only from fluctuating backgrounds and noise, but also from the filamentary
structures. A previous paper presented the multi-scale, multi-wavelength source
extraction method getsources based on a fine spatial scale decomposition and
filtering of irrelevant scales from images. In this paper, a multi-scale,
multi-wavelength filament extraction method getfilaments is presented that
solves this problem, substantially improving the robustness of source
extraction with getsources in filamentary backgrounds. The main difference is
that the filaments extracted by getfilaments are now subtracted by getsources
from detection images during source extraction, greatly reducing the chances of
contaminating catalogs with spurious sources. The intimate physical
relationship between forming stars and filaments seen in Herschel observations
demands that accurate filament extraction methods must remove the contribution
of sources and that accurate source extraction methods must be able to remove
underlying filamentary structures. Source extraction with getsources now
provides researchers also with clean images of filaments, free of sources,
noise, and isotropic backgrounds.
Comment: 15 pages, 19 figures, to be published in Astronomy & Astrophysics; language polished for better readability
Advances in Kth nearest-neighbour clutter removal
We consider the problem of feature detection in the presence of clutter in
spatial point processes. Classification methods have been developed in previous
studies. Among these, Byers and Raftery (1998) model the observed Kth nearest
neighbour distances as a mixture distribution and classify the clutter and
feature points accordingly. In this paper, we enhance this approach in two
ways. First, we propose an automatic procedure for selecting the number of
nearest neighbours to consider in the classification method by means of
segmented regression models. Second, with the aim of applying the procedure
multiple times to get a "better" end result, we propose a stopping criterion
that minimizes the overall entropy measure of cluster separation between
clutter and feature points. The proposed procedures are suitable for a feature
and clutter modelled as two superimposed Poisson processes on any space, including
linear networks. We present simulations and two case studies of environmental
data to illustrate the method.
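The baseline idea of Kth nearest-neighbour clutter removal can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it works in the plane only, keeps K fixed at a user-chosen value (the segmented-regression selection of K and the entropy-based stopping criterion from the abstract are not implemented), and ignores edge effects. It uses the fact that, for a planar Poisson process of intensity lambda, the Kth nearest-neighbour distance D satisfies lambda * pi * D^2 ~ Gamma(K, 1), and fits a two-component mixture of such distributions by EM. All function names and the simulated data are hypothetical.

```python
import math
import random


def kth_nn_distances(points, k):
    """Brute-force distance from each point to its k-th nearest neighbour."""
    result = []
    for i, (xi, yi) in enumerate(points):
        dists = sorted(math.hypot(xi - xj, yi - yj)
                       for j, (xj, yj) in enumerate(points) if j != i)
        result.append(dists[k - 1])
    return result


def log_density(d, lam, k):
    """Log density of the k-th NN distance for a planar Poisson process:
    f(d) = 2 (lam*pi)^k d^(2k-1) exp(-lam*pi*d^2) / Gamma(k)."""
    return (math.log(2.0) + k * math.log(lam * math.pi)
            + (2 * k - 1) * math.log(d) - lam * math.pi * d * d
            - math.lgamma(k))


def classify_feature(points, k=5, iterations=50):
    """Return True for points assigned to the dense (feature) component."""
    d = kth_nn_distances(points, k)
    n = len(d)
    cut = sorted(d)[n // 2]
    # Initial guess: distances below the median belong to the feature.
    resp = [1.0 if x < cut else 0.0 for x in d]
    for _ in range(iterations):
        # M-step: closed-form updates for the two intensities and the weight.
        s_feat = sum(resp)
        s_clut = n - s_feat
        lam_feat = k * s_feat / (math.pi * sum(r * x * x for r, x in zip(resp, d)))
        lam_clut = k * s_clut / (math.pi * sum((1 - r) * x * x for r, x in zip(resp, d)))
        p = min(max(s_feat / n, 1e-9), 1 - 1e-9)
        # E-step: posterior probability that each point is a feature point.
        resp = []
        for x in d:
            a = math.log(p) + log_density(x, lam_feat, k)
            b = math.log(1 - p) + log_density(x, lam_clut, k)
            m = max(a, b)
            ea, eb = math.exp(a - m), math.exp(b - m)
            resp.append(ea / (ea + eb))
    return [r > 0.5 for r in resp]


# Simulated example: sparse clutter on the unit square plus a dense patch.
random.seed(0)
clutter = [(random.random(), random.random()) for _ in range(200)]
feature = [(0.4 + 0.1 * random.random(), 0.4 + 0.1 * random.random())
           for _ in range(100)]
labels = classify_feature(clutter + feature, k=5)
false_alarm_rate = sum(labels[:200]) / 200
recall = sum(labels[200:]) / 100
```

Clutter points that happen to fall inside or right next to the dense patch are indistinguishable from feature points by distance alone, so a small false-alarm rate is expected even when the mixture fits well.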
Cumulative hard X-ray spectrum of local AGN: a link to the cosmic X-ray background
We determine the cumulative spectral energy distribution (SED) of local AGN
in the 3-300 keV band and compare it with the spectrum of the cosmic X-ray
background (CXB) in order to test the widely accepted paradigm that the CXB is
a superposition of AGN and to place constraints on AGN evolution. We performed
a stacking analysis of the hard X-ray spectra of AGN detected in two recent
all-sky surveys, performed by the IBIS/ISGRI instrument aboard INTEGRAL and by
the PCA instrument aboard RXTE, taking into account the space densities of AGN
with different luminosities and absorption column densities. We derived the
collective SED of local AGN in the 3-300 keV energy band. Those AGN with
luminosities below 10^43.5 erg/s (17-60 keV) provide the main contribution to
the local volume hard X-ray emissivity, at least 5 times more than more
luminous objects. The cumulative spectrum exhibits (although with marginal
significance) a cutoff at energies above 100-200 keV and is consistent with the
CXB spectrum if AGN evolve over cosmic time in such a way that the SED of their
collective high-energy emission has a constant shape and the relative fraction
of obscured AGN remains nearly constant, while the AGN luminosity density
undergoes strong evolution between z~1 and z=0, a scenario broadly consistent
with results from recent deep X-ray surveys. The first direct comparison
between the collective hard X-ray SED of local AGN and the CXB spectrum
demonstrates that the popular concept of the CXB being a superposition of AGN
is generally correct. By repeating this test using improved AGN statistics from
current and future hard X-ray surveys, it should be possible to tighten the
constraints on the cosmic history of black hole growth.
Comment: 12 pages, 9 figures. Revised version accepted for publication in A&A
Ultra-deep catalog of X-ray groups in the Extended Chandra Deep Field South
Ultra-deep observations of ECDF-S with Chandra and XMM-Newton enable a search
for extended X-ray emission down to an unprecedented flux of
ergs s cm. We present the search for the extended emission on
spatial scales of 32 in both Chandra and XMM data, covering
0.3 square degrees and model the extended emission on scales of arcminutes. We
present a catalog of 46 spectroscopically identified groups, reaching a
redshift of 1.6. We show that the statistical properties of ECDF-S, such as
logN-logS and X-ray luminosity function are broadly consistent with LCDM, with
the exception that dn/dz/d test reveals that a redshift range of
in ECDF-S is sparsely populated. The lack of nearby structure,
however, makes studies of high-redshift groups particularly easy both in
X-rays and lensing, due to a lower level of clustered foreground. We present
one and two point statistics of the galaxy groups as well as weak-lensing
analysis to show that the detected low-luminosity systems are indeed low-mass
systems. We verify the applicability of the scaling relations between the X-ray
luminosity and the total mass of the group, derived for the COSMOS survey to
lower masses and higher redshifts probed by ECDF-S by means of stacked weak
lensing and clustering analysis, constraining any possible departures to be
within 30% in mass. Abridged.
Comment: 20 pages, 21 figures, 3 tables, to match the journal version
Comparing Star Formation on Large Scales in the c2d Legacy Clouds: Bolocam 1.1 mm Dust Continuum Surveys of Serpens, Perseus, and Ophiuchus
We have undertaken an unprecedentedly large 1.1 millimeter continuum survey
of three nearby star forming clouds using Bolocam at the Caltech Submillimeter
Observatory. We mapped the largest areas in each cloud at millimeter or
submillimeter wavelengths to date: 7.5 sq. deg in Perseus (Paper I), 10.8 sq.
deg in Ophiuchus (Paper II), and 1.5 sq. deg in Serpens with a resolution of
31", detecting 122, 44, and 35 cores, respectively. Here we report on results
of the Serpens survey and compare the three clouds. Average measured angular
core sizes and their dependence on resolution suggest that many of the observed
sources are consistent with power-law density profiles. Tests of the effects of
cloud distance reveal that linear resolution strongly affects measured source
sizes and densities, but not the shape of the mass distribution. Core mass
distribution slopes in Perseus and Ophiuchus (alpha=2.1+/-0.1 and
alpha=2.1+/-0.3) are consistent with recent measurements of the stellar IMF,
whereas the Serpens distribution is flatter (alpha=1.6+/-0.2). We also compare
the relative mass distribution shapes to predictions from turbulent
fragmentation simulations. Dense cores constitute less than 10% of the total
cloud mass in all three clouds, consistent with other measurements of low
star-formation efficiencies. Furthermore, most cores are found at high column
densities; more than 75% of 1.1 mm cores are associated with Av>8 mag in
Perseus, 15 mag in Serpens, and 20-23 mag in Ophiuchus.
Comment: 32 pages, including 18 figures, accepted for publication in ApJ
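To make the quoted slopes concrete: a core mass distribution of the form dN/dM proportional to M^(-alpha) above some minimum mass can be characterised by a single exponent. The sketch below estimates alpha with the standard maximum-likelihood formula for a continuous power law, alpha = 1 + n / sum(ln(M_i / M_min)), with approximate standard error (alpha - 1) / sqrt(n). This is a swapped-in illustration, not necessarily the fitting procedure used in the paper, and the masses are synthetic draws rather than survey data; `powerlaw_mle`, `sample_powerlaw`, and all parameter values are assumptions.

```python
import math
import random


def powerlaw_mle(masses, m_min):
    """Maximum-likelihood slope for dN/dM ~ M^(-alpha), using M >= m_min.

    Returns (alpha, approximate standard error).
    """
    tail = [m for m in masses if m >= m_min]
    n = len(tail)
    alpha = 1.0 + n / sum(math.log(m / m_min) for m in tail)
    return alpha, (alpha - 1.0) / math.sqrt(n)


def sample_powerlaw(alpha, m_min, n, rng):
    """Draw n masses from a pure power law by inverse-transform sampling."""
    return [m_min * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0))
            for _ in range(n)]


# Synthetic check: draw from a known slope and recover it.
rng = random.Random(1)
true_alpha = 2.1
masses = sample_powerlaw(true_alpha, m_min=0.5, n=5000, rng=rng)
alpha_hat, alpha_err = powerlaw_mle(masses, m_min=0.5)
```

With samples as small as the 35 to 122 cores reported per cloud, the same formula gives uncertainties of order 0.1 to 0.2, which is comparable to the errors quoted in the abstract.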
Spitzer Imaging of the Nearby Rich Young Cluster, Cep OB3b
We map the full extent of a rich massive young cluster in the Cep OB3b
association with the IRAC and MIPS instruments aboard the Spitzer Space
Telescope and the ACIS instrument aboard the Chandra X-Ray Observatory.
At 700 pc, it is revealed to be the second nearest large ( member),
young ( Myr) cluster known. In contrast to the nearest large cluster, the
Orion Nebula Cluster, Cep OB3b is only lightly obscured and is mostly located
in a large cavity carved out of the surrounding molecular cloud. Our infrared
and X-ray datasets, as well as visible photometry from the literature, are used
to take a census of the young stars in Cep OB3b. We find that the young stars
within the cluster are concentrated in two sub-clusters; an eastern
sub-cluster, near the Cep B molecular clump, and a western sub-cluster, near
the Cep F molecular clump. Using our census of young stars, we examine the
fraction of young stars with infrared excesses indicative of circumstellar
disks. We create a map of the disk fraction throughout the cluster and find
that it is spatially variable. Due to these spatial variations, the two
sub-clusters exhibit substantially different average disk fractions from each
other: and . We discuss whether the discrepant disk
fractions are due to the photodestruction of disks by the high mass members of
the cluster or whether they result from differences in the ages of the
sub-clusters. We conclude that the discrepant disk fractions are most likely
due to differences in the ages.
Comment: 48 pages, 12 figures, 6 tables