
    Streaming Similarity Self-Join

    We introduce and study the problem of computing the similarity self-join in a streaming context (SSSJ), where the input is an unbounded stream of items arriving continuously. The goal is to find all pairs of items in the stream whose similarity is greater than a given threshold. The simplest formulation of the problem requires unbounded memory and is therefore intractable. To make the problem feasible, we introduce the notion of time-dependent similarity: the similarity of two items decreases with the difference in their arrival time. By leveraging the properties of this time-dependent similarity function, we design two algorithmic frameworks to solve the SSSJ problem. The first one, MiniBatch (MB), uses existing index-based filtering techniques for the static version of the problem and combines them in a pipeline. The second framework, Streaming (STR), adds time filtering to the existing indexes and integrates new time-based bounds deeply into the workings of the algorithms. We also introduce a new indexing technique (L2), which is based on an existing state-of-the-art indexing technique (L2AP) but is optimized for the streaming case. Extensive experiments show that the STR algorithm, when instantiated with the L2 index, is the most scalable option across a wide array of datasets and parameters.
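
To illustrate why a time-dependent similarity makes the streaming self-join tractable, here is a minimal brute-force sketch in Python. It is not the MB or STR framework and uses no index. The abstract only states that similarity decreases with the arrival-time difference; the exponential decay `exp(-lam * dt)`, the use of cosine similarity, and all names (`cosine`, `time_horizon`, `streaming_self_join`, `theta`, `lam`) are assumptions made for illustration. Under that assumed decay, two items further apart than `log(1 / theta) / lam` can never exceed the threshold, so older items can be discarded and memory stays bounded.

```python
import math
from collections import deque


def cosine(x, y):
    """Cosine similarity of two sparse vectors stored as {feature: weight} dicts."""
    dot = sum(w * y.get(k, 0.0) for k, w in x.items())
    nx = math.sqrt(sum(w * w for w in x.values()))
    ny = math.sqrt(sum(w * w for w in y.values()))
    return dot / (nx * ny) if nx and ny else 0.0


def time_horizon(theta, lam):
    """Largest time gap at which a pair can still reach the threshold.

    Assumes sim_t = sim * exp(-lam * dt) with sim <= 1 (an assumption,
    not taken from the abstract).
    """
    return math.log(1.0 / theta) / lam


def streaming_self_join(stream, theta, lam):
    """Yield (t_old, t_new, similarity) for pairs whose decayed similarity exceeds theta.

    `stream` is an iterable of (timestamp, vector) in non-decreasing time order.
    """
    tau = time_horizon(theta, lam)
    window = deque()  # items that can still match future arrivals
    for t, vec in stream:
        # Evict items too old to ever pass the threshold again.
        while window and t - window[0][0] > tau:
            window.popleft()
        # Brute-force comparison against the remaining window.
        for t_old, old in window:
            s = cosine(vec, old) * math.exp(-lam * (t - t_old))
            if s > theta:
                yield (t_old, t, s)
        window.append((t, vec))
```

With `theta = 0.5` and `lam = 0.1`, the horizon is about 6.9 time units, so an item arriving at time 100 is never compared against items that arrived at times 0 and 1.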

    Structure of supercritically dried calcium silicate hydrates (C-S-H) and structural changes induced by weathering

    The nanostructure of supercritically dried calcium silicate hydrates was investigated. This particular drying procedure was used to avoid the nanostructure modifications caused by conventional drying processes. Thus, in this study, the as-precipitated cementitious C-S-H structure was obtained for the first time. The measured specific surface area was 20% larger than that of conventionally dried C-S-H. Given the importance of this nanostructured phase for the properties of hydrated cements, especially when in contact with CO2-rich environments, the supercritically dried C-S-H was weathered for 2 weeks. The structural effects of this weathering process on the C-S-H were investigated, and calcium carbonate microcrystal precipitation and the presence of a silica by-product are reported. Calcite and aragonite polymorphs were observed, as well as nanoporous silica forming globular arrangements. In addition, 2 weeks of weathering was not enough to carbonate the entire C-S-H sample. Funding: Junta de Andalucía TEP11

    Scalable Online Betweenness Centrality in Evolving Graphs

    Betweenness centrality is a classic measure that quantifies the importance of a graph element (vertex or edge) according to the fraction of shortest paths passing through it. This measure is notoriously expensive to compute, and the best known algorithm runs in O(nm) time. The problems of efficiency and scalability are exacerbated in a dynamic setting, where the input is an evolving graph seen edge by edge, and the goal is to keep the betweenness centrality up to date. In this paper we propose the first truly scalable algorithm for online computation of betweenness centrality of both vertices and edges in an evolving graph where new edges are added and existing edges are removed. Our algorithm is carefully engineered with out-of-core techniques and tailored for modern parallel stream processing engines that run on clusters of shared-nothing commodity hardware. Hence, it is amenable to real-world deployment. We experiment on graphs that are two orders of magnitude larger than those used in previous studies. Our method is able to keep the betweenness centrality measures up to date online, i.e., the time to update the measures is smaller than the inter-arrival time between two consecutive updates. Comment: 15 pages, 9 figures, accepted for publication in IEEE Transactions on Knowledge and Data Engineering
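
For reference, the static O(nm) baseline mentioned above can be sketched as follows. This is Brandes' algorithm for vertex betweenness on an unweighted, undirected graph, not the online, out-of-core method the paper proposes. The function name `betweenness` and the adjacency-dict input format are assumptions for illustration, and the scores are raw shortest-path dependency sums rather than normalized fractions.

```python
from collections import deque


def betweenness(adj):
    """Vertex betweenness for an unweighted, undirected graph.

    `adj` maps every vertex to a list of its neighbours (each edge listed
    in both directions). Runs one BFS per source, i.e. O(nm) overall.
    """
    bc = {v: 0.0 for v in adj}
    for s in adj:
        sigma = {v: 0 for v in adj}   # number of shortest s-v paths
        sigma[s] = 1
        dist = {s: 0}                 # BFS distance from s
        preds = {v: [] for v in adj}  # predecessors on shortest paths
        order = []                    # vertices in non-decreasing distance
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:     # first time w is reached
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:  # v lies on a shortest path to w
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate dependencies from the farthest vertices back to s.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each unordered pair was counted once from each endpoint.
    return {v: c / 2.0 for v, c in bc.items()}
```

On the path graph a-b-c, only `b` lies on a shortest path between two other vertices, so it is the only vertex with a non-zero score. Recomputing this from scratch after every edge update is what makes the dynamic setting expensive.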