
    The Beylkin-Cramer Summation Rule and A New Fast Algorithm of Cosmic Statistics for Large Data Sets

    Based on the Beylkin-Cramer summation rule, we introduce a new fast algorithm that enables us to explore high-order statistics efficiently in large data sets. Central to this technique is the decomposition of both fields and operators within the framework of multi-resolution analysis (MRA), and the realization of their discrete representations. Accordingly, a homogeneous point process can be equivalently described by the operation of a Toeplitz matrix on a vector, which is accomplished by means of the fast Fourier transform. The algorithm can be applied widely in cosmic statistics to tackle large data sets. In particular, we demonstrate this technique using spherical, cubic, and cylindrical counts in cells. Numerical tests show that the algorithm produces excellent agreement with the expected results. Moreover, the algorithm naturally introduces a sharp filter, which is capable of suppressing shot noise in weak signals. In its numerical procedures, the algorithm is somewhat similar to the particle-mesh (PM) method in N-body simulations. Scaling as O(N log N), it is significantly faster than current particle-based methods, and its computational cost does not depend on the shape or size of the sampling cells. In addition, based on this technique, we propose a simple fast scheme to compute second-order statistics of cosmic density fields and validate it using simulation samples. The technique developed here should allow a comprehensive study of the non-Gaussianity of cosmic fields in high-precision cosmology. A specific implementation of the algorithm is publicly available upon request to the author.
    Comment: 27 pages, 9 figures included. Revised version; changes include (a) a new fast algorithm for second-order statistics, (b) more numerical tests, including counts in asymmetric cells, two-point correlation functions, and second-order variances, (c) more discussion of technical details.
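    The key computational step of the abstract, multiplying a Toeplitz matrix by a vector in O(N log N) via the FFT, can be sketched as follows. This is a minimal NumPy illustration of the standard circulant-embedding trick, not the authors' released implementation:

```python
import numpy as np
from scipy.linalg import toeplitz

def toeplitz_matvec(c, r, x):
    """Multiply the Toeplitz matrix with first column c and first row r by x
    in O(N log N), by embedding it in a 2N-point circulant matrix whose
    action is a circular convolution, evaluated with the FFT."""
    n = len(x)
    # First column of the circulant embedding: [c_0..c_{n-1}, 0, r_{n-1}..r_1]
    col = np.concatenate([c, [0.0], r[:0:-1]])
    y = np.fft.ifft(np.fft.fft(col) * np.fft.fft(np.concatenate([x, np.zeros(n)])))
    return y[:n].real

# Check against the dense O(N^2) product
c = np.array([1.0, 2.0, 3.0, 4.0])
r = np.array([1.0, 0.5, 0.25, 0.125])
x = np.random.default_rng(0).normal(size=4)
assert np.allclose(toeplitz_matvec(c, r, x), toeplitz(c, r) @ x)
```

    The same idea underlies PM-style codes: once the counts-in-cells operator is expressed as a (block-)Toeplitz convolution, its cost no longer depends on the shape or size of the sampling cell, only on the grid size N.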

    A Phase Field Model for Continuous Clustering on Vector Fields

    A new method for the simplification of flow fields is presented. It is based on continuous clustering. A well-known physical clustering model, the Cahn-Hilliard model, which describes phase separation, is modified to reflect the properties of the data to be visualized. Clusters are defined implicitly as connected components of the positivity set of a density function. An evolution equation for this function is obtained as a suitable gradient flow of an underlying anisotropic energy functional. Here, time serves as the scale parameter. The evolution is characterized by a successive coarsening of patterns (the actual clustering), during which the underlying simulation data specifies preferable pattern boundaries. We introduce specific physical quantities in the simulation to control the shape, orientation, and distribution of the clusters as a function of the underlying flow field. In addition, the model is extended to include elastic effects. In the early stages of the evolution, shear-layer-type representations of the flow field can thereby be generated, whereas, for later stages, the distribution of clusters can be influenced. Furthermore, we incorporate upwind ideas to give the clusters an oriented, drop-shaped appearance. Here, we discuss the applicability of this new type of approach mainly for flow fields, where the cluster energy penalizes cross-streamline boundaries. However, the method also carries provisions for other types of fields. The clusters can be displayed directly as a flow texture. Alternatively, the clusters can be visualized by iconic representations, which are positioned by using a skeletonization algorithm.
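    The coarsening dynamics described above can be illustrated with the plain Cahn-Hilliard equation, u_t = Δ(u³ - u) - ε²Δ²u. The sketch below is a minimal 1D semi-implicit spectral scheme on a periodic domain with the isotropic double-well energy; the paper's anisotropic, flow-adapted functional and elastic terms are more elaborate:

```python
import numpy as np

def cahn_hilliard_1d(u, dt=1e-4, eps=0.05, steps=500):
    """Evolve u by the 1D Cahn-Hilliard equation on the periodic unit interval,
    treating the stiff fourth-order term implicitly in Fourier space."""
    n = len(u)
    k = 2 * np.pi * np.fft.fftfreq(n, d=1.0 / n)   # integer wavenumbers * 2*pi
    for _ in range(steps):
        w_hat = np.fft.fft(u**3 - u)               # derivative of the double well
        # (u+ - u)/dt = -k^2 w - eps^2 k^4 u+   (4th-order term implicit)
        u_hat = (np.fft.fft(u) - dt * k**2 * w_hat) / (1 + dt * eps**2 * k**4)
        u = np.fft.ifft(u_hat).real
    return u

rng = np.random.default_rng(1)
u0 = 0.1 * rng.standard_normal(128)
u = cahn_hilliard_1d(u0)
assert np.isfinite(u).all()
assert np.isclose(u.mean(), u0.mean())  # the k = 0 mode is untouched: mass conserved
```

    Starting from small random noise, the field separates into plateaus near the well minima whose boundaries coarsen over time, which is exactly the role time plays as the scale parameter in the clustering above.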

    A Density-Based Approach to the Retrieval of Top-K Spatial Textual Clusters

    Keyword-based web queries with local intent retrieve web content that is relevant to supplied keywords and that represents points of interest near the query location. Two broad categories of such queries exist. The first encompasses queries that retrieve single spatial web objects that each satisfy the query arguments. Most proposals belong to this category. The second category, to which this paper's proposal belongs, encompasses queries that support exploratory user behavior and retrieve sets of objects that represent regions of space that may be of interest to the user. Specifically, the paper proposes a new type of query, namely the top-k spatial textual clusters (k-STC) query, which returns the top-k clusters that (i) lie closest to a given query location, (ii) contain the most relevant objects with regard to given query keywords, and (iii) have an object density that exceeds a given threshold. To compute this query, we propose a basic algorithm that relies on on-line density-based clustering and exploits an early stop condition. To improve the response time, we design an advanced approach that includes three techniques: (i) an object skipping rule, (ii) spatially gridded posting lists, and (iii) a fast range query algorithm. An empirical study on real data demonstrates that the paper's proposals offer scalability and are capable of excellent performance.
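    The query semantics can be sketched in a few lines: filter objects by textual relevance, group the survivors with density-based clustering, and rank the clusters by distance to the query location. The toy below uses a naive O(n^2) DBSCAN over an in-memory set; the paper's algorithms instead cluster on-line over inverted lists with skipping and gridding, so this is an illustration of the query, not of those optimizations:

```python
import numpy as np

def naive_dbscan(pts, eps, min_pts):
    """Label points with cluster ids (-1 = noise) via a naive DBSCAN."""
    n = len(pts)
    dist = np.linalg.norm(pts[:, None] - pts[None, :], axis=2)
    neighbors = [np.flatnonzero(dist[i] <= eps) for i in range(n)]
    labels = np.full(n, -1)
    cid = 0
    for i in range(n):
        if labels[i] != -1 or len(neighbors[i]) < min_pts:
            continue
        stack, labels[i] = [i], cid          # grow a new cluster from core point i
        while stack:
            j = stack.pop()
            if len(neighbors[j]) >= min_pts:  # expand only through core points
                for m in neighbors[j]:
                    if labels[m] == -1:
                        labels[m] = cid
                        stack.append(m)
        cid += 1
    return labels

def top_k_stc(pts, rel, query, k, eps=1.0, min_pts=2, rel_min=0.5):
    """Return the k densest-enough clusters of relevant objects closest to query."""
    keep = rel >= rel_min                     # textual relevance threshold
    labels = np.full(len(pts), -1)
    labels[keep] = naive_dbscan(pts[keep], eps, min_pts)
    clusters = [np.flatnonzero(keep)[labels[keep] == c]
                for c in range(labels.max() + 1)]
    clusters.sort(key=lambda idx: np.linalg.norm(pts[idx].mean(0) - query))
    return clusters[:k]

pts = np.array([[0, 0], [0.5, 0], [0, 0.5], [5, 5], [5.2, 5], [5, 5.2], [9, 9]])
rel = np.array([0.9, 0.8, 0.7, 0.9, 0.9, 0.6, 0.9])
res = top_k_stc(pts, rel, np.array([0.2, 0.2]), k=2)
assert sorted(res[0].tolist()) == [0, 1, 2]   # cluster nearest the query ranks first
assert sorted(res[1].tolist()) == [3, 4, 5]   # the isolated point at (9, 9) is noise
```

    Ranking by distance to the cluster centroid is one simple choice of ranking function; the actual k-STC scoring combines proximity and textual relevance.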