A Novel Subspace Outlier Detection Approach in High Dimensional Data Sets
Many real applications are required to detect outliers in high dimensional data sets. The major difficulty of mining outliers lies in the fact that outliers are often embedded in subspaces. No efficient methods are available in general for subspace-based outlier detection. Most existing subspace-based outlier detection methods identify outliers by searching for abnormal sparse density units in subspaces. In this paper, we present a novel approach for finding outliers in the 'interesting' subspaces. The interesting subspaces are strongly correlated with 'good' clusters. This approach aims to group the meaningful subspaces and then identify outliers in the projected subspaces. In doing so, an extension to the subspace-based clustering algorithm is proposed so as to find the 'good' subspaces, and then outliers are identified in the projected subspaces using some classical outlier detection techniques such as distance-based and density-based algorithms. Comprehensive case studies are conducted using various types of subspace clustering and outlier detection algorithms. The experimental results demonstrate that the proposed method can detect outliers effectively and efficiently in high dimensional data sets.
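The last step the abstract describes, scoring points inside an already selected subspace with a distance-based detector, can be sketched as follows. This is a minimal illustration and not the authors' algorithm: the function name `knn_outlier_scores`, the use of the k-th nearest-neighbour distance as the score, the toy data, and the premise that the subspace (a list of column indices) has already been chosen by a subspace clustering step are all assumptions made here.

```python
import math

def knn_outlier_scores(points, subspace, k=2):
    """Score each point by the distance to its k-th nearest neighbour,
    measured only along the dimensions listed in `subspace`."""
    # Project every point onto the chosen dimensions.
    projected = [[p[d] for d in subspace] for p in points]
    scores = []
    for i, a in enumerate(projected):
        # Distances from point i to every other projected point, ascending.
        dists = sorted(math.dist(a, b) for j, b in enumerate(projected) if j != i)
        scores.append(dists[k - 1])
    return scores

# Hypothetical data: the last point looks ordinary in dimension 2
# but lies far from the others in dimensions 0 and 1.
data = [
    (1.0, 1.0, 5.0),
    (1.1, 0.9, 9.0),
    (0.9, 1.1, 1.0),
    (1.0, 1.2, 7.0),
    (4.0, 4.0, 5.0),
]
scores = knn_outlier_scores(data, subspace=[0, 1], k=2)
most_outlying = max(range(len(data)), key=scores.__getitem__)
```

In this toy case the point that stands out in the projected subspace gets the largest score, even though its value in the remaining dimension falls inside the range of the other points.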
A new method to unveil embedded stellar clusters
In this paper we present a novel method to identify and characterize stellar
clusters deeply embedded in a dark molecular cloud. The method is based on
measuring stellar surface density in wide-field infrared images using star
counting techniques. It takes advantage of the differing H-band luminosity
functions (HLFs) of field stars and young stellar populations and is able to
statistically associate each star in an image as a member of either the
background stellar population or a young stellar population projected on or
near the cloud. Moreover, the technique corrects for the effects of
differential extinction toward each individual star. We have tested this method
against simulations as well as observations. In particular, we have applied the
method to 2MASS point sources observed in the Orion A and B complexes, and the
results obtained compare very well with those obtained from deep Spitzer and
Chandra observations where presence of infrared excess or X-ray emission
directly determines membership status for every star. Additionally, our method
also identifies unobscured clusters and a low resolution version of the Orion
stellar surface density map shows clearly the relatively unobscured and diffuse
OB 1a and 1b sub-groups and provides useful insights on their spatial
distribution.
Comment: A&A, in press; 13 pages, multi-layer figures can be displayed with
Adobe Acrobat Reader
The Quantity of Intracluster Light: Comparing Theoretical and Observational Measurement Techniques Using Simulated Clusters
Using a suite of N-body simulations of galaxy clusters specifically tailored
to study the intracluster light (ICL) component, we measure the quantity of ICL
using a number of different methods previously employed in the literature for
both observational and simulation data sets. By measuring the ICL of the
clusters using multiple techniques, we identify systematic differences in how
each detection method identifies the ICL. We find that techniques which define
the ICL solely based on the current position of the cluster luminosity, such as
a surface brightness or local density threshold, tend to find less ICL than
methods utilizing time or velocity information, including stellar particles'
density history or binding energy. The ICL fractions (the fraction of
the clusters' total luminosity found in the ICL component) we measure at z=0
across all our clusters using any definition range from 9% to 36%, and
even within a single cluster different methods can change the measured ICL
fraction by up to a factor of two. Separating the cluster's central galaxy from
the surrounding ICL component is a challenge for all ICL techniques, and
because the ICL is centrally concentrated within the cluster, the differences
in the measured ICL quantity between techniques are largely a consequence of
this central galaxy/ICL separation. We thoroughly explore the free parameters
involved with each measurement method, and find that adjusting these parameters
can change the measured ICL fraction by up to a factor of two. While for all
definitions the quantity of ICL tends to increase with time, the ICL fraction
does not grow at a uniform rate, nor even monotonically under some definitions.
Thus, the ICL can be used as a rough indicator of dynamical age, where more
dynamically advanced clusters will on average have higher ICL fractions.
Comment: 18 pages, 11 figures. Accepted for publication in ApJ
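One of the position-based definitions mentioned in the abstract, a surface brightness threshold, can be sketched as below. This is only an illustration under stated assumptions: the function name `icl_fraction`, the per-pixel flux representation, the photometric zero point, and the threshold value are hypothetical and are not taken from the paper.

```python
import math

def icl_fraction(pixel_fluxes, pixel_area_arcsec2, zero_point, mu_threshold):
    """Fraction of the total flux found in pixels fainter than a
    surface brightness cut (all inputs are illustrative assumptions)."""
    total = sum(pixel_fluxes)
    icl = 0.0
    for flux in pixel_fluxes:
        if flux <= 0:
            continue  # no light in this pixel
        # Surface brightness in mag/arcsec^2; a larger value means fainter.
        mu = zero_point - 2.5 * math.log10(flux / pixel_area_arcsec2)
        if mu > mu_threshold:
            icl += flux
    return icl / total

# Hypothetical image: two bright pixels (galaxy) and two faint ones (ICL).
fraction = icl_fraction([100.0, 10.0, 1.0, 1.0],
                        pixel_area_arcsec2=1.0,
                        zero_point=25.0,
                        mu_threshold=24.0)
```

Because the cut looks only at the current brightness of each pixel, any diffuse light projected onto the bright central region is assigned to the galaxy, which is consistent with the abstract's remark that such definitions tend to find less ICL.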
Stochastic density functional theory
Linear-scaling implementations of density functional theory (DFT) reach their
intended efficiency regime only when applied to systems having a physical size
larger than the range of their Kohn-Sham density matrix (DM). This causes a
problem since many types of large systems of interest have a rather broad DM
range and are therefore not amenable to analysis using DFT methods. For this
reason, the recently proposed stochastic DFT (sDFT), avoiding exhaustive DM
evaluations, is emerging as an attractive alternative linear-scaling approach.
This review develops a general formulation of sDFT in terms of a
(non)orthogonal basis representation and offers an analysis of the statistical
errors (SEs) involved in the calculation. Using a new Gaussian-type basis-set
implementation of sDFT, applied to water clusters and silicon nanocrystals, it
demonstrates and explains how the standard deviation and the bias depend on the
sampling rate and the system size in various types of calculations. We also
develop basis-set embedded-fragments theory, demonstrating its utility for
reducing the SEs for energy, density of states and nuclear force calculations.
Finally, we discuss the algorithmic complexity of sDFT, showing it has CPU
wall-time linear-scaling. The method parallelizes well over distributed
processors with good scalability and therefore may find use in the upcoming
exascale computing architectures.
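The trade-off between sampling and statistical error that the abstract discusses can be illustrated with a generic stochastic trace estimate, in which a trace is replaced by an average over random vectors. This is a minimal sketch, not the Gaussian basis-set sDFT implementation: the function name `stochastic_trace`, the small test matrix, and the choice of random vectors with independent +/-1 entries are assumptions made here.

```python
import random

def stochastic_trace(matrix, n_samples, seed=0):
    """Estimate trace(matrix) as the average of chi^T A chi over random
    vectors chi whose entries are independently +1 or -1."""
    rng = random.Random(seed)
    n = len(matrix)
    total = 0.0
    for _ in range(n_samples):
        chi = [rng.choice((-1.0, 1.0)) for _ in range(n)]
        # Matrix-vector product A chi, then the inner product chi . (A chi).
        a_chi = [sum(matrix[i][j] * chi[j] for j in range(n)) for i in range(n)]
        total += sum(chi[i] * a_chi[i] for i in range(n))
    return total / n_samples

# Hypothetical symmetric matrix with small off-diagonal elements.
A = [
    [2.0, 0.1, 0.0, 0.1],
    [0.1, 3.0, 0.1, 0.0],
    [0.0, 0.1, 1.0, 0.1],
    [0.1, 0.0, 0.1, 4.0],
]
exact = sum(A[i][i] for i in range(4))
estimate = stochastic_trace(A, n_samples=2000)
```

Only the off-diagonal elements contribute to the fluctuations of this estimator, and the error of the average falls off as the inverse square root of the number of samples.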
Energy benchmarks for water clusters and ice structures from an embedded many-body expansion
We show how an embedded many-body expansion (EMBE) can be used to calculate
accurate \emph{ab initio} energies of water clusters and ice structures using
wavefunction-based methods. We use the EMBE described recently by Bygrave
\emph{et al.} (J. Chem. Phys. \textbf{137}, 164102 (2012)), in which the terms
in the expansion are obtained from calculations on monomers, dimers, etc. acted
on by an approximate representation of the embedding field due to all other
molecules in the system, this field being a sum of Coulomb and
exchange-repulsion fields. Our strategy is to separate the total energy of the
system into Hartree-Fock and correlation parts, using the EMBE only for the
correlation energy, with the Hartree-Fock energy calculated using standard
molecular quantum chemistry for clusters and plane-wave methods for crystals.
Our tests on a range of different water clusters up to the 16-mer show that for
the second-order M\o{}ller-Plesset (MP2) method the EMBE truncated at 2-body
level reproduces to better than 0.1 m$E_\mathrm{h}$/monomer the correlation energy
from standard methods. The use of EMBE for computing coupled-cluster energies
of clusters is also discussed. For the ice structures Ih, II and VIII, we find
that MP2 energies near the complete basis-set limit reproduce very well the
experimental values of the absolute and relative binding energies, but that the
use of coupled-cluster methods for many-body correlation (non-additive
dispersion) is essential for a full description. Possible future applications
of the EMBE approach are suggested.
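The 2-body truncation mentioned in the abstract can be sketched for a plain (non-embedded) many-body expansion. This is an illustration only: it omits the embedding field that defines the EMBE, and the names `two_body_expansion`, `toy_energy`, and `POSITIONS` as well as the toy energy model are hypothetical.

```python
from itertools import combinations

def two_body_expansion(monomers, energy):
    """Many-body expansion truncated at the 2-body level:
    E ~ sum_i E(i) + sum_{i<j} [E(i,j) - E(i) - E(j)]."""
    one_body = {m: energy((m,)) for m in monomers}
    total = sum(one_body.values())
    for a, b in combinations(monomers, 2):
        # 2-body correction: dimer energy minus its two monomer energies.
        total += energy((a, b)) - one_body[a] - one_body[b]
    return total

# Hypothetical toy model: three monomers on a line with a purely
# pairwise interaction, so the 2-body truncation is exact here.
POSITIONS = {"w1": 0.0, "w2": 1.0, "w3": 2.5}

def toy_energy(group):
    e = -1.0 * len(group)  # same energy for every isolated monomer
    for a, b in combinations(group, 2):
        e += -0.1 / abs(POSITIONS[a] - POSITIONS[b])
    return e

approx = two_body_expansion(list(POSITIONS), toy_energy)
exact = toy_energy(tuple(POSITIONS))
```

For a real correlation energy the 3-body and higher terms do not vanish, so the quality of the truncation has to be checked against reference calculations, as the abstract reports for clusters up to the 16-mer.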
Cluster Disruption: From infant mortality to long term survival
How stellar clusters disrupt, and over what timescales, is intimately linked
with how they form. Here, we review the theory and observations of cluster
disruption, both the suggested initial rapid dissolution phase (infant
mortality) and the longer timescale processes that affect clusters after they
emerge from their progenitor GMCs. Over the past decade, the standard paradigm
that has developed is that all/most stars are formed in clusters and that the
vast majority of these groups are disrupted over short timescales (< 10 Myr).
This is thought to be due to the removal of the left over gas from the
star-formation process, known as infant mortality. However, recent results have
suggested that the fraction of stars that form in clusters has been
overestimated, with the majority being formed in unbound groups (i.e.
associations) which expand and disrupt without the need of invoking gas
removal. Dynamical measurements of young massive clusters in the Galaxy suggest
that clusters reach a stable equilibrium at very young (<3 Myr) ages,
suggesting that gas expulsion has little effect on the cluster. After the early
dynamical phase, clusters appear to be long lived and stable objects. We use
the recent WFC3 image of the cluster population in M83 to test empirical
disruption laws and find that the lifetime of clusters strongly depends on
their ambient environment. While the role of cluster mass is less well
constrained (due to the added parameter of the form of the cluster mass
function), we find evidence suggesting that higher mass clusters survive
longer, and that the cluster mass function (at least in M83, outside the
nuclear region) is truncated above ~10^5 Msun.
Comment: 18 pages, 6 Figures, invited review for the proceedings of "Stellar
Clusters and Associations - A RIA workshop on GAIA", 23-27 May 2011, Granada,
Spain