
    A Novel Subspace Outlier Detection Approach in High Dimensional Data Sets

    Many real applications require the detection of outliers in high-dimensional data sets. The major difficulty of mining outliers lies in the fact that outliers are often embedded in subspaces. No efficient methods are available in general for subspace-based outlier detection. Most existing subspace-based outlier detection methods identify outliers by searching for abnormal sparse density units in subspaces. In this paper, we present a novel approach for finding outliers in the ‘interesting’ subspaces. The interesting subspaces are strongly correlated with ‘good’ clusters. This approach aims to group the meaningful subspaces and then identify outliers in the projected subspaces. In doing so, an extension to the subspace-based clustering algorithm is proposed so as to find the ‘good’ subspaces, and then outliers are identified in the projected subspaces using some classical outlier detection techniques such as distance-based and density-based algorithms. Comprehensive case studies are conducted using various types of subspace clustering and outlier detection algorithms. The experimental results demonstrate that the proposed method can detect outliers effectively and efficiently in high-dimensional data sets.
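
    The second stage described in this abstract, applying a classical distance-based detector inside a projected subspace, can be sketched as follows. This is a minimal illustration, not the paper's algorithm: the subspace is assumed to be already chosen (passed in as `dims`), the k-th-nearest-neighbour distance stands in for "distance-based" techniques, and all function names and the toy data are hypothetical.

```python
import math

def project(points, dims):
    """Keep only the coordinates listed in dims (the chosen subspace)."""
    return [tuple(p[d] for d in dims) for p in points]

def knn_distance_scores(points, k):
    """Score each point by the distance to its k-th nearest neighbour."""
    scores = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(dists[k - 1])
    return scores

def subspace_outliers(points, dims, k, n_outliers):
    """Return indices of the n_outliers highest-scoring points in the subspace."""
    scores = knn_distance_scores(project(points, dims), k)
    ranked = sorted(range(len(points)), key=lambda i: scores[i], reverse=True)
    return ranked[:n_outliers]

# Hypothetical toy data: dimension 2 is noise; the last point is unusual
# only in the subspace spanned by dimensions 0 and 1.
data = [
    (1.0, 1.0, 9.0),
    (1.1, 0.9, 0.5),
    (0.9, 1.1, 4.0),
    (1.0, 1.2, 7.5),
    (1.2, 1.0, 2.0),
    (5.0, 5.0, 4.5),
]
top = subspace_outliers(data, dims=(0, 1), k=2, n_outliers=1)
```

    Scoring in the projected coordinates rather than the full space keeps the noisy third dimension from masking the point that is isolated in dimensions 0 and 1.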

    A new method to unveil embedded stellar clusters

    In this paper we present a novel method to identify and characterize stellar clusters deeply embedded in a dark molecular cloud. The method is based on measuring stellar surface density in wide-field infrared images using star counting techniques. It takes advantage of the differing H-band luminosity functions (HLFs) of field stars and young stellar populations and is able to statistically classify each star in an image as a member of either the background stellar population or a young stellar population projected on or near the cloud. Moreover, the technique corrects for the effects of differential extinction toward each individual star. We have tested this method against simulations as well as observations. In particular, we have applied the method to 2MASS point sources observed in the Orion A and B complexes, and the results obtained compare very well with those obtained from deep Spitzer and Chandra observations, where the presence of infrared excess or X-ray emission directly determines membership status for every star. Additionally, our method also identifies unobscured clusters, and a low-resolution version of the Orion stellar surface density map clearly shows the relatively unobscured and diffuse OB 1a and 1b sub-groups and provides useful insights on their spatial distribution. Comment: A&A, in press; 13 pages, multi-layer figures can be displayed with Adobe Acrobat Reader

    The Quantity of Intracluster Light: Comparing Theoretical and Observational Measurement Techniques Using Simulated Clusters

    Using a suite of N-body simulations of galaxy clusters specifically tailored to study the intracluster light (ICL) component, we measure the quantity of ICL using a number of different methods previously employed in the literature for both observational and simulation data sets. By measuring the ICL of the clusters using multiple techniques, we identify systematic differences in how each detection method identifies the ICL. We find that techniques which define the ICL solely based on the current position of the cluster luminosity, such as a surface brightness or local density threshold, tend to find less ICL than methods utilizing time or velocity information, including stellar particles' density history or binding energy. The ICL fractions (the fraction of the clusters' total luminosity found in the ICL component) we measure at z=0 across all our clusters using any definition span the range 9-36%, and even within a single cluster different methods can change the measured ICL fraction by up to a factor of two. Separating the cluster's central galaxy from the surrounding ICL component is a challenge for all ICL techniques, and because the ICL is centrally concentrated within the cluster, the differences in the measured ICL quantity between techniques are largely a consequence of this central galaxy/ICL separation. We thoroughly explore the free parameters involved with each measurement method, and find that adjusting these parameters can change the measured ICL fraction by up to a factor of two. While for all definitions the quantity of ICL tends to increase with time, the ICL fraction does not grow at a uniform rate, nor even monotonically under some definitions. Thus, the ICL can be used as a rough indicator of dynamical age, where more dynamically advanced clusters will on average have higher ICL fractions. Comment: 18 pages, 11 figures. Accepted for publication in Ap
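
    A surface-brightness-threshold definition of the ICL, one of the position-only techniques this abstract mentions, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the pixel fluxes, the zero point, the pixel area and the threshold values are all hypothetical, and light is counted as ICL when its surface brightness is fainter (numerically larger in magnitudes) than the threshold.

```python
import math

def surface_brightness(flux, pixel_area, zero_point):
    """Surface brightness in magnitudes per unit area for one pixel (flux > 0)."""
    return zero_point - 2.5 * math.log10(flux / pixel_area)

def icl_fraction(fluxes, mu_threshold, pixel_area=1.0, zero_point=0.0):
    """Fraction of the total flux in pixels fainter than mu_threshold."""
    total = sum(fluxes)
    icl = sum(
        f for f in fluxes
        if f > 0 and surface_brightness(f, pixel_area, zero_point) > mu_threshold
    )
    return icl / total

# Hypothetical pixel fluxes: two bright galaxy pixels and four faint ones.
fluxes = [100.0, 10.0, 1.0, 1.0, 0.5, 0.5]

# The measured fraction depends on the chosen threshold (a free parameter).
frac_strict = icl_fraction(fluxes, mu_threshold=-1.0)  # only the faint pixels
frac_loose = icl_fraction(fluxes, mu_threshold=-3.0)   # also the 10.0 pixel
```

    Moving the threshold from -1.0 to -3.0 reassigns the intermediate pixel to the ICL and raises the fraction from 3/113 to 13/113, a toy version of the parameter sensitivity the abstract reports.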

    Stochastic density functional theory

    Linear-scaling implementations of density functional theory (DFT) reach their intended efficiency regime only when applied to systems having a physical size larger than the range of their Kohn-Sham density matrix (DM). This causes a problem since many types of large systems of interest have a rather broad DM range and are therefore not amenable to analysis using DFT methods. For this reason, the recently proposed stochastic DFT (sDFT), avoiding exhaustive DM evaluations, is emerging as an attractive alternative linear-scaling approach. This review develops a general formulation of sDFT in terms of a (non)orthogonal basis representation and offers an analysis of the statistical errors (SEs) involved in the calculation. Using a new Gaussian-type basis-set implementation of sDFT, applied to water clusters and silicon nanocrystals, it demonstrates and explains how the standard deviation and the bias depend on the sampling rate and the system size in various types of calculations. We also develop basis-set embedded-fragments theory, demonstrating its utility for reducing the SEs for energy, density of states and nuclear force calculations. Finally, we discuss the algorithmic complexity of sDFT, showing that its CPU wall time scales linearly. The method parallelizes well over distributed processors with good scalability and therefore may find use in the upcoming exascale computing architectures.

    Energy benchmarks for water clusters and ice structures from an embedded many-body expansion

    We show how an embedded many-body expansion (EMBE) can be used to calculate accurate ab initio energies of water clusters and ice structures using wavefunction-based methods. We use the EMBE described recently by Bygrave et al. (J. Chem. Phys. 137, 164102 (2012)), in which the terms in the expansion are obtained from calculations on monomers, dimers, etc. acted on by an approximate representation of the embedding field due to all other molecules in the system, this field being a sum of Coulomb and exchange-repulsion fields. Our strategy is to separate the total energy of the system into Hartree-Fock and correlation parts, using the EMBE only for the correlation energy, with the Hartree-Fock energy calculated using standard molecular quantum chemistry for clusters and plane-wave methods for crystals. Our tests on a range of different water clusters up to the 16-mer show that for the second-order Møller-Plesset (MP2) method the EMBE truncated at 2-body level reproduces to better than 0.1 mE_h/monomer the correlation energy from standard methods. The use of EMBE for computing coupled-cluster energies of clusters is also discussed. For the ice structures Ih, II and VIII, we find that MP2 energies near the complete basis-set limit reproduce very well the experimental values of the absolute and relative binding energies, but that the use of coupled-cluster methods for many-body correlation (non-additive dispersion) is essential for a full description. Possible future applications of the EMBE approach are suggested.
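
    The bookkeeping of a many-body expansion truncated at the 2-body level, as mentioned in this abstract, can be sketched as follows. This is a plain (non-embedded) expansion only: the embedding field that distinguishes the EMBE is omitted, and `toy_energy` is a hypothetical pairwise-additive stand-in for a real quantum-chemistry calculation on a fragment.

```python
from itertools import combinations

def mbe_two_body(monomers, energy):
    """Many-body expansion truncated at the 2-body level.

    energy(fragment) returns the energy of a tuple of monomers.
    E ~= sum_i E(i) + sum_{i<j} [E(i, j) - E(i) - E(j)]
    """
    e1 = {m: energy((m,)) for m in monomers}
    total = sum(e1.values())
    for a, b in combinations(monomers, 2):
        total += energy((a, b)) - e1[a] - e1[b]
    return total

def toy_energy(fragment):
    """Hypothetical energy: -1.0 per monomer plus a -0.1/r pair attraction."""
    e = -1.0 * len(fragment)
    for a, b in combinations(fragment, 2):
        e += -0.1 / abs(a - b)
    return e

# Monomers are represented here simply by their positions on a line.
positions = (0.0, 1.0, 2.5, 4.0)
approx = mbe_two_body(positions, toy_energy)
exact = toy_energy(positions)
```

    Because the toy energy is strictly pairwise additive, the 2-body truncation reproduces the full-system value here; for real correlation energies a residual from 3-body and higher terms remains, which is the quantity the abstract's 0.1 mE_h/monomer figure bounds for MP2.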

    Cluster Disruption: From infant mortality to long term survival

    How stellar clusters disrupt, and over what timescales, is intimately linked with how they form. Here, we review the theory and observations of cluster disruption, both the suggested initial rapid dissolution phase (infant mortality) and the longer timescale processes that affect clusters after they emerge from their progenitor GMCs. Over the past decade, the standard paradigm that has developed is that all or most stars are formed in clusters and that the vast majority of these groups are disrupted over short timescales (< 10 Myr). This rapid disruption, known as infant mortality, is thought to be due to the removal of the leftover gas from the star-formation process. However, recent results have suggested that the fraction of stars that form in clusters has been overestimated, with the majority being formed in unbound groups (i.e. associations) which expand and disrupt without the need to invoke gas removal. Dynamical measurements of young massive clusters in the Galaxy suggest that clusters reach a stable equilibrium at very young (< 3 Myr) ages, suggesting that gas expulsion has little effect on the cluster. After the early dynamical phase, clusters appear to be long-lived and stable objects. We use the recent WFC3 image of the cluster population in M83 to test empirical disruption laws and find that the lifetime of clusters strongly depends on their ambient environment. While the role of cluster mass is less well constrained (due to the added parameter of the form of the cluster mass function), we find evidence suggesting that higher mass clusters survive longer, and that the cluster mass function (at least in M83, outside the nuclear region) is truncated above ~10^5 Msun. Comment: 18 pages, 6 Figures, invited review for the proceedings of "Stellar Clusters and Associations - A RIA workshop on GAIA", 23-27 May 2011, Granada, Spain