Streaming Similarity Self-Join
We introduce and study the problem of computing the similarity self-join in a
streaming context (SSSJ), where the input is an unbounded stream of items
arriving continuously. The goal is to find all pairs of items in the stream
whose similarity is greater than a given threshold. The simplest formulation of
the problem requires unbounded memory, and thus, it is intractable. To make the
problem feasible, we introduce the notion of time-dependent similarity: the
similarity of two items decreases with the difference in their arrival time. By
leveraging the properties of this time-dependent similarity function, we design
two algorithmic frameworks to solve the SSSJ problem. The first one, MiniBatch
(MB), uses existing index-based filtering techniques for the static version of
the problem, and combines them in a pipeline. The second framework, Streaming
(STR), adds time filtering to the existing indexes, and integrates new
time-based bounds deeply in the working of the algorithms. We also introduce a
new indexing technique (L2), which is based on an existing state-of-the-art
indexing technique (L2AP), but is optimized for the streaming case. Extensive
experiments show that the STR algorithm, when instantiated with the L2 index,
is the most scalable option across a wide array of datasets and parameters.
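The time-dependent similarity described in this abstract can be illustrated with a brute-force sketch. The abstract only states that similarity decreases with the difference in arrival time; the exponential decay form, the parameter names `theta` and `lam`, and the use of cosine similarity on unit-normalized sparse vectors are assumptions made here for illustration. This is not the MB or STR framework, and it uses no L2 or L2AP index. It only shows why a decaying similarity bounds memory: once an item is older than a fixed horizon, it can no longer reach the threshold and can be forgotten.

```python
import math
from collections import deque


def dot(x, y):
    """Dot product of two sparse vectors stored as {dimension: weight} dicts."""
    if len(x) > len(y):
        x, y = y, x  # iterate over the shorter vector
    return sum(w * y.get(d, 0.0) for d, w in x.items())


def streaming_self_join(stream, theta, lam):
    """Yield (older_id, newer_id, sim) for pairs with decayed similarity > theta.

    Assumptions (not taken from the abstract):
      * stream yields (item_id, timestamp, vector) in nondecreasing time order;
      * vectors are unit-normalized, so the plain dot product is at most 1;
      * decayed similarity = dot(x, y) * exp(-lam * |t_x - t_y|).
    """
    # Beyond this age gap the decay factor alone drops below theta,
    # so no pair can pass the threshold and old items can be evicted.
    horizon = math.log(1.0 / theta) / lam
    window = deque()  # items still young enough to match something
    for item_id, t, vec in stream:
        while window and t - window[0][1] > horizon:
            window.popleft()
        for other_id, other_t, other_vec in window:
            sim = dot(vec, other_vec) * math.exp(-lam * (t - other_t))
            if sim > theta:
                yield other_id, item_id, sim
        window.append((item_id, t, vec))
```

The window here is scanned linearly; the frameworks in the abstract replace that scan with index-based filtering, which this sketch does not attempt to reproduce.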
Structure of supercritically dried calcium silicate hydrates (C-S-H) and structural changes induced by weathering
The nanostructure of supercritically dried calcium silicate hydrates was investigated. Supercritical drying was used to avoid the nanostructure modifications caused by conventional drying processes. Thus, in this study, the as-precipitated cementitious C-S-H structure was obtained for the first time. The measured specific surface area was 20% larger than that of conventionally dried C-S-H. Given the importance of this nanostructured phase for the properties of hydrated cements, especially when in contact with CO2-rich environments, the supercritically dried C-S-H was weathered for 2 weeks. The structural effects of this weathering on the C-S-H were examined; precipitation of calcium carbonate microcrystals and the presence of a silica by-product are reported. Calcite and aragonite polymorphs were observed, as well as nanoporous silica forming globular arrangements. In addition, 2 weeks of weathering was not enough to carbonate the entire C-S-H sample.
Junta de Andalucía TEP11
Scalable Online Betweenness Centrality in Evolving Graphs
Betweenness centrality is a classic measure that quantifies the importance of
a graph element (vertex or edge) according to the fraction of shortest paths
passing through it. This measure is notoriously expensive to compute, and the
best known algorithm runs in O(nm) time. The problems of efficiency and
scalability are exacerbated in a dynamic setting, where the input is an
evolving graph seen edge by edge, and the goal is to keep the betweenness
centrality up to date. In this paper we propose the first truly scalable
algorithm for online computation of betweenness centrality of both vertices and
edges in an evolving graph where new edges are added and existing edges are
removed. Our algorithm is carefully engineered with out-of-core techniques and
tailored for modern parallel stream processing engines that run on clusters of
shared-nothing commodity hardware. Hence, it is amenable to real-world
deployment. We experiment on graphs that are two orders of magnitude larger
than previous studies. Our method is able to keep the betweenness centrality
measures up to date online, i.e., the time to update the measures is smaller
than the inter-arrival time between two consecutive updates.
Comment: 15 pages, 9 figures, accepted for publication in IEEE Transactions on
Knowledge and Data Engineering
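For context on the O(nm) bound quoted in this abstract: that is the running time of Brandes' algorithm on unweighted graphs. The sketch below is a minimal version of that static baseline for vertex betweenness only, not the online, out-of-core algorithm the paper proposes. The function name `betweenness` and the adjacency-dict input format are choices made here for illustration, and the code assumes every neighbour also appears as a key of `adj`.

```python
from collections import deque


def betweenness(adj):
    """Vertex betweenness of an unweighted graph, {vertex: list of neighbours}.

    Scores sum over ordered (source, target) pairs, so for an undirected
    graph each unordered pair is counted twice; halve the result if needed.
    """
    bc = {v: 0.0 for v in adj}
    for s in adj:
        # Phase 1: BFS from s, counting shortest paths (sigma) and
        # recording shortest-path predecessors of every reached vertex.
        sigma = {v: 0 for v in adj}
        sigma[s] = 1
        dist = {s: 0}
        preds = {v: [] for v in adj}
        order = []  # vertices in nondecreasing distance from s
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:  # first time w is seen
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:  # v lies on a shortest path to w
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Phase 2: accumulate pair dependencies, farthest vertices first.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                bc[w] += delta[w]
    return bc
```

Each source costs one BFS plus one backward pass, which is where the O(nm) total comes from. Recomputing this from scratch after every edge update is what makes the dynamic setting described above expensive.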