
    Redshift Weights for Baryon Acoustic Oscillations : Application to Mock Galaxy Catalogs

    Large redshift surveys capable of measuring the Baryon Acoustic Oscillation (BAO) signal have proven to be an effective way of measuring the distance-redshift relation in cosmology. Building off the work in Zhu et al. (2015), we develop a technique to directly constrain the distance-redshift relation from BAO measurements without splitting the sample into redshift bins. We parametrize the distance-redshift relation, relative to a fiducial model, as a quadratic expansion. We measure its coefficients and reconstruct the distance-redshift relation from the expansion. We apply the redshift weighting technique in Zhu et al. (2015) to the clustering of galaxies from 1000 QuickPM (QPM) mock simulations after reconstruction and achieve a 0.75% measurement of the angular diameter distance $D_A$ at $z=0.64$ and the same precision for the Hubble parameter $H$ at $z=0.29$. These QPM mock catalogs are designed to mimic the clustering and noise level of the Baryon Oscillation Spectroscopic Survey (BOSS) Data Release 12 (DR12). We compress the correlation functions in the redshift direction onto a set of weighted correlation functions. These estimators give unbiased $D_A$ and $H$ measurements at all redshifts within the range of the combined sample. We demonstrate the effectiveness of redshift weighting in improving the distance and Hubble parameter estimates. Instead of measuring at a single 'effective' redshift as in traditional analyses, we report our $D_A$ and $H$ measurements at all redshifts. The measured fractional error of $D_A$ ranges from 1.53% at $z=0.2$ to 0.75% at $z=0.64$. The fractional error of $H$ ranges from 0.75% at $z=0.29$ to 2.45% at $z=0.7$. Our measurements are consistent with a Fisher forecast to within 10% to 20% depending on the pivot redshift. We further show the results are robust against the choice of fiducial cosmologies, galaxy bias models, and RSD streaming parameters. Comment: 13 pages, 8 figures, submitted to MNRA
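
    The quadratic expansion mentioned in the abstract above can be illustrated with a short sketch. The abstract does not give the exact form, so the parametrization below (coefficients `a0`, `a1`, `a2` and an expansion variable `x` built from the fiducial distance at a pivot redshift) is an assumption chosen for illustration, as are the flat LCDM fiducial parameters and the pivot value `z_pivot=0.57`.

```python
import math

C_KM_S = 299792.458  # speed of light in km/s


def fiducial_distance(z, h0=70.0, omega_m=0.3, steps=2000):
    """Comoving distance in Mpc for a flat LCDM model (trapezoid rule).

    h0 and omega_m are illustrative values, not taken from the paper.
    """
    def inv_e(zp):
        return 1.0 / math.sqrt(omega_m * (1.0 + zp) ** 3 + (1.0 - omega_m))

    dz = z / steps
    total = 0.5 * (inv_e(0.0) + inv_e(z))
    for i in range(1, steps):
        total += inv_e(i * dz)
    return C_KM_S / h0 * total * dz


def expanded_distance(z, a0, a1, a2, z_pivot=0.57):
    """Distance as a quadratic expansion around the fiducial model.

    Assumed form: chi(z) = chi_fid(z) * a0 * (1 + a1*x + 0.5*a2*x**2),
    with x = chi_fid(z) / chi_fid(z_pivot) - 1.
    """
    chi_fid = fiducial_distance(z)
    x = chi_fid / fiducial_distance(z_pivot) - 1.0
    return chi_fid * a0 * (1.0 + a1 * x + 0.5 * a2 * x * x)


# With a0 = 1 and a1 = a2 = 0 the expansion reduces to the fiducial model.
same_as_fiducial = expanded_distance(0.64, 1.0, 0.0, 0.0)
# At the pivot redshift x = 0, so only a0 rescales the distance.
rescaled_at_pivot = expanded_distance(0.57, 1.01, 0.3, -0.2)
```

    In this form, fitting the three coefficients to the whole sample yields a distance estimate at any redshift in range, which is the idea behind reporting measurements at all redshifts rather than at one effective redshift.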

    A Modified Overlapping Partitioning Clustering Algorithm for Categorical Data Clustering

    Clustering enables the grouping of unlabeled data by partitioning data into clusters with similar patterns. Over the past decades, many clustering algorithms have been developed for various clustering problems. An overlapping partitioning clustering (OPC) algorithm can only handle numerical data. Hence, novel clustering algorithms have been studied extensively to overcome this issue. By increasing the number of objects belonging to one cluster and the distance between cluster centers, the study aimed to cluster textual data without losing the main functions of the algorithm. The study used the twenty newsgroups dataset, which consists of approximately 20,000 textual documents. By introducing some modifications to the traditional algorithm, clusters with an acceptable level of homogeneity and completeness were generated. Modifications were made to the pre-processing phase and data representation, along with a number of methods that influence the primary function of the algorithm. Subsequently, the results were evaluated and compared with those of the k-means algorithm on the training and test datasets. The results indicated that the modified algorithm could successfully handle categorical data and produce satisfactory clusters.

    GScluster: Network-weighted gene-set clustering analysis

    Background: Gene-set analysis (GSA) has been commonly used to identify significantly altered pathways or functions from omics data. However, GSA often yields a long list of gene-sets, necessitating efficient post-processing for improved interpretation. Existing methods cluster the gene-sets based on the extent of their overlap to summarize the GSA results without considering interactions between gene-sets. Results: Here, we presented a novel network-weighted gene-set clustering that incorporates both the gene-set overlap and protein-protein interaction (PPI) networks. Three examples were demonstrated for microarray gene expression, GWAS summary, and RNA-sequencing data to which different GSA methods were applied. These examples as well as a global analysis show that the proposed method increases PPI densities and functional relevance of the resulting clusters. Additionally, distinct properties of gene-set distance measures were compared. The methods are implemented as an R/Shiny package GScluster that provides gene-set clustering and diverse functions for visualization of gene-sets and PPI networks. Conclusions: Network-weighted gene-set clustering provides functionally more relevant gene-set clusters and related network analysis

    Cosmic Acceleration from Causal Backreaction with Recursive Nonlinearities

    We revisit the causal backreaction paradigm, in which the need for Dark Energy is eliminated via the generation of an apparent cosmic acceleration from the causal flow of inhomogeneity information coming in towards each observer from distant structure-forming regions. This second-generation formalism incorporates "recursive nonlinearities": the process by which already-established metric perturbations will then act to slow down all future flows of inhomogeneity information. Here, the long-range effects of causal backreaction are now damped, weakening its impact for models that were previously best-fit cosmologies. Nevertheless, we find that causal backreaction can be recovered as a replacement for Dark Energy via the adoption of larger values for the dimensionless 'strength' of the clustering evolution functions being modeled -- a change justified by the hierarchical nature of clustering and virialization in the universe, occurring on multiple cosmic length scales simultaneously. With this, and with one new model parameter representing the slowdown of clustering due to astrophysical feedback processes, an alternative cosmic concordance can once again be achieved for a matter-only universe in which the apparent acceleration is generated entirely by causal backreaction effects. One drawback is a new degeneracy which broadens our predicted range for the observed jerk parameter $j_{0}^{\mathrm{Obs}}$, thus removing what had appeared to be a clear signature for distinguishing causal backreaction from Cosmological Constant $\Lambda$CDM. As for the long-term fate of the universe, incorporating recursive nonlinearities appears to make the possibility of an 'eternal' acceleration due to causal backreaction far less likely; though this does not take into account gravitational nonlinearities or the large-scale breakdown of cosmological isotropy, effects not easily modeled within this formalism. Comment: 53 pages, 7 figures, 3 tables.
This paper is an advancement of previous research on Causal Backreaction; the earlier work is available at arXiv:1109.4686 and arXiv:1109.515

    Nearest Neighbor Networks: clustering expression data based on gene neighborhoods

    Background: The availability of microarrays measuring thousands of genes simultaneously across hundreds of biological conditions represents an opportunity to understand both individual biological pathways and the integrated workings of the cell. However, translating this amount of data into biological insight remains a daunting task. An important initial step in the analysis of microarray data is clustering of genes with similar behavior. A number of classical techniques are commonly used to perform this task, particularly hierarchical and K-means clustering, and many novel approaches have been suggested recently. While these approaches are useful, they are not without drawbacks; these methods can find clusters in purely random data, and even clusters enriched for biological functions can be skewed towards a small number of processes (e.g. ribosomes). Results: We developed Nearest Neighbor Networks (NNN), a graph-based algorithm to generate clusters of genes with similar expression profiles. This method produces clusters based on overlapping cliques within an interaction network generated from mutual nearest neighborhoods. This focus on nearest neighbors rather than on absolute distance measures allows us to capture clusters with high connectivity even when they are spatially separated, and requiring mutual nearest neighbors allows genes with no sufficiently similar partners to remain unclustered. We compared the clusters generated by NNN with those generated by eight other clustering methods. NNN was particularly successful at generating functionally coherent clusters with high precision, and these clusters generally represented a much broader selection of biological processes than those recovered by other methods. Conclusion: The Nearest Neighbor Networks algorithm is a valuable clustering method that effectively groups genes that are likely to be functionally related. It is particularly attractive due to its simplicity, its success in the analysis of large datasets, and its ability to span a wide range of biological functions with high precision.
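
    The mutual nearest neighbor idea described in the abstract above can be sketched in a few lines. This is only the graph-building step, not the NNN algorithm itself (the clique-merging stage is omitted), and the squared Euclidean distance and the function name `mutual_nearest_neighbors` are assumptions made for illustration; the abstract does not say which similarity measure the authors use.

```python
def mutual_nearest_neighbors(profiles, k):
    """Return index pairs (i, j), i < j, that are in each other's k nearest neighbors."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    n = len(profiles)
    neighbors = []
    for i in range(n):
        # Rank every other profile by distance to profile i and keep the closest k.
        others = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: sq_dist(profiles[i], profiles[j]),
        )
        neighbors.append(set(others[:k]))

    # Keep an edge only when the neighbor relation holds in both directions.
    return {
        (i, j)
        for i in range(n)
        for j in neighbors[i]
        if i < j and i in neighbors[j]
    }


# Two tight pairs plus one outlier (hypothetical toy expression profiles).
toy_profiles = [[0, 0], [0, 1], [10, 10], [10, 11], [5, 50]]
edges = mutual_nearest_neighbors(toy_profiles, k=1)
```

    In the toy data the last profile picks a nearest neighbor, but that neighbor does not pick it back, so it gets no edge. This mirrors the abstract's point that genes without sufficiently similar partners can remain unclustered.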

    A directed isoperimetric inequality with application to Bregman near neighbor lower bounds

    Bregman divergences $D_\phi$ are a class of divergences parametrized by a convex function $\phi$ and include well known distance functions like $\ell_2^2$ and the Kullback-Leibler divergence. There has been extensive research on algorithms for problems like clustering and near neighbor search with respect to Bregman divergences; in all cases, the algorithms depend not just on the data size $n$ and dimensionality $d$, but also on a structure constant $\mu \ge 1$ that depends solely on $\phi$ and can grow without bound independently. In this paper, we provide the first evidence that this dependence on $\mu$ might be intrinsic. We focus on the problem of approximate near neighbor search for Bregman divergences. We show that under the cell probe model, any non-adaptive data structure (like locality-sensitive hashing) for $c$-approximate near-neighbor search that admits $r$ probes must use space $\Omega(n^{1 + \frac{\mu}{c r}})$. In contrast, for LSH under $\ell_1$ the best bound is $\Omega(n^{1+\frac{1}{cr}})$. Our new tool is a directed variant of the standard boolean noise operator. We show that a generalization of the Bonami-Beckner hypercontractivity inequality exists "in expectation" or upon restriction to certain subsets of the Hamming cube, and that this is sufficient to prove the desired isoperimetric inequality that we use in our data structure lower bound. We also present a structural result reducing the Hamming cube to a Bregman cube. This structure allows us to obtain lower bounds for problems under Bregman divergences from their $\ell_1$ analog. In particular, we get a (weaker) lower bound for approximate near neighbor search of the form $\Omega(n^{1 + \frac{1}{cr}})$ for an $r$-query non-adaptive data structure, and new cell probe lower bounds for a number of other near neighbor questions in Bregman space. Comment: 27 page
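
    As a small illustration of the definition used in the abstract above, the standard Bregman divergence $D_\phi(p, q) = \phi(p) - \phi(q) - \langle \nabla\phi(q), p - q \rangle$ can be evaluated directly for the two examples it names. The function and variable names below are illustrative choices, not taken from the paper.

```python
import math


def bregman(phi, grad_phi, p, q):
    """D_phi(p, q) = phi(p) - phi(q) - <grad phi(q), p - q>."""
    g = grad_phi(q)
    return phi(p) - phi(q) - sum(gi * (pi - qi) for gi, pi, qi in zip(g, p, q))


# phi(x) = ||x||^2 yields the squared Euclidean distance.
def sq_norm(x):
    return sum(v * v for v in x)


def sq_norm_grad(x):
    return [2.0 * v for v in x]


# phi(x) = sum x_i log x_i yields the Kullback-Leibler divergence
# when p and q are probability vectors with positive entries.
def neg_entropy(x):
    return sum(v * math.log(v) for v in x)


def neg_entropy_grad(x):
    return [math.log(v) + 1.0 for v in x]


p = [0.2, 0.8]
q = [0.5, 0.5]
squared_l2 = bregman(sq_norm, sq_norm_grad, p, q)
kl_pq = bregman(neg_entropy, neg_entropy_grad, p, q)
kl_qp = bregman(neg_entropy, neg_entropy_grad, q, p)
```

    Comparing `kl_pq` with `kl_qp` shows that a Bregman divergence is generally asymmetric in its arguments, unlike the squared Euclidean case.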