22 research outputs found

Continuous Queries and Real-time Analysis of Social Semantic Data with C-SPARQL

Author: Barbieri Davide Francesco
Braga Daniele Maria
Ceri Stefano
Della Valle Emanuele
M. Grossniklaus
Publication date: 01/01/2009

Abstract. Social semantic data are becoming a reality, but their streaming nature has apparently been ignored so far. Streams, being unbounded sequences of time-varying data elements, should not be treated as persistent data to be stored “forever” and queried on demand, but rather as transient data to be consumed on the fly by queries that are registered once and for all and keep analyzing such streams, producing answers triggered by the streaming data and not by explicit invocation. In this paper, we propose an approach to continuous queries and real-time analysis of social semantic data with C-SPARQL, an extension of SPARQL for querying RDF streams.

KOPS - The Institutional Repository of the University of Konstanz

CiteSeerX

Archivio istituzionale della ricerca - Politecnico di Milano

Stream Mining for Network Management

Author: Ano Shigehiro
Katsuno Satoshi
Tsuru Masato
Yamazaki Katsuyuki
Yoshida Kenichi
Publication venue: 'Oxford University Press (OUP)'
Publication date: 01/06/2006

Network management is essential to maintaining the Internet as an important social infrastructure. Detecting excessive consumption of network bandwidth caused by P2P mass flows is especially important, and detecting Internet viruses is an important security issue. Although stream mining techniques seem promising for finding P2P flows and Internet viruses, the vast volume of network flows prevents the simple application of such techniques. What is required is a mining technique that works well with extremely limited memory and supports real-time analysis. In this paper, we propose a cache-based mining method to realize such a technique. By analyzing the characteristics of the proposed method with real Internet backbone flow data, we show its advantages, i.e., lower memory consumption together with real-time analysis capability. We also show that the proposed method can be used to find mass flow information in Internet backbone flow data.

Kyutacar : Kyushu Institute of Technology Academic Repository

Streaming Temporal Graphs: Subgraph Matching

Author: Goodman Eric L.
Grunwald Dirk
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 31/03/2020

We investigate solutions to subgraph matching within a temporal stream of data. We present a high-level language for describing temporal subgraphs of interest, the Streaming Analytics Language (SAL). SAL programs are translated into C++ code that is run in parallel on a cluster. We call this implementation of SAL the Streaming Analytics Machine (SAM). SAL programs are succinct, requiring about 20 times fewer lines of code than using the SAM library directly or writing an implementation using Apache Flink. To benchmark SAM, we find temporal triangles within streaming netflow data. We also compare SAM to an implementation written for Flink. We find that SAM is able to scale to 128 nodes or 2560 cores, while Apache Flink reaches maximum throughput at 32 nodes and degrades thereafter. Apache Flink has an advantage when triangles are rare, with maximum aggregate throughput for Flink at 32 nodes greater than the maximum achievable rate of SAM. In our experiments, when triangles occurred at a rate faster than five per second per node, SAM performed better. Both frameworks may miss results due to latencies in network communication. SAM consistently reported an average of 93.7% of expected results, while Flink's rate decreased from 83.7% to 52.1% as we increased to the maximum size of the cluster. Overall, SAM can obtain rates of 91.8 billion netflows per day. Comment: Big Data 201

arXiv.org e-Print Archive

Crossref

An evaluation of streaming algorithms for distinct counting over a sliding window

Author: Singh Sneha
Tirthapura Srikanta
Publication venue: Iowa State University Digital Repository
Publication date: 01/01/2015

Counting the number of distinct elements in a data stream (distinct counting) is a fundamental aggregation task in database query processing, query optimization, and network monitoring. On a stream of elements, it is often necessary to compute an aggregate over only the most recent elements, leading to the problem of distinct counting over a “sliding window” of the stream. We present a detailed experimental study of the performance of different algorithms for distinct counting over a sliding window. We observe that the performance of an algorithm depends on the basic method used, as well as on aspects such as the hash function, the mix of queries and updates, and the method used to boost accuracy. We compare the performance of prominent algorithms and evaluate the influence of these factors, leading to practical recommendations for implementation. To the best of our knowledge, this is the first detailed experimental study of distinct counting over a sliding window.

Digital Repository @ Iowa State University (ISU)

Crossref

Directory of Open Access Journals

Frontiers - Publisher Connector

Achieving Intelligent Traffic-aware Consolidation of Virtual Machines in a Data Center Using Learning Automata

Author: Begnum Kyrre
Jobava Akaki
Oommen John
Yazidi Anis
Publication date: 01/01/2016

Agder University Research Archive