901 research outputs found
A Selectivity based approach to Continuous Pattern Detection in Streaming Graphs
Cyber security is one of the most significant technical challenges in current
times. Detecting adversarial activities, prevention of theft of intellectual
properties and customer data is a high priority for corporations and government
agencies around the world. Cyber defenders need to analyze massive-scale,
high-resolution network flows to identify, categorize, and mitigate attacks
involving networks spanning institutional and national boundaries. Many of the
cyber attacks can be described as subgraph patterns, with prominent examples
being insider infiltrations (path queries), denial of service (parallel paths)
and malicious spreads (tree queries). This motivates us to explore subgraph
matching on streaming graphs in a continuous setting. The novelty of our work
lies in using the subgraph distributional statistics collected from the
streaming graph to determine the query processing strategy. We introduce a
"Lazy Search" algorithm where the search strategy is decided on a
vertex-to-vertex basis depending on the likelihood of a match in the vertex
neighborhood. We also propose a metric named "Relative Selectivity" that is
used to select between different query processing strategies. Our experiments
performed on real online news, network traffic stream and a synthetic social
network benchmark demonstrate 10-100x speedups over selectivity agnostic
approaches.Comment: in 18th International Conference on Extending Database Technology
(EDBT) (2015
When Things Matter: A Data-Centric View of the Internet of Things
With the recent advances in radio-frequency identification (RFID), low-cost
wireless sensor devices, and Web technologies, the Internet of Things (IoT)
approach has gained momentum in connecting everyday objects to the Internet and
facilitating machine-to-human and machine-to-machine communication with the
physical world. While IoT offers the capability to connect and integrate both
digital and physical entities, enabling a whole new class of applications and
services, several significant challenges need to be addressed before these
applications and services can be fully realized. A fundamental challenge
centers around managing IoT data, typically produced in dynamic and volatile
environments, which is not only extremely large in scale and volume, but also
noisy, and continuous. This article surveys the main techniques and
state-of-the-art research efforts in IoT from data-centric perspectives,
including data stream processing, data storage models, complex event
processing, and searching in IoT. Open research issues for IoT data management
are also discussed
Corporate Smart Content Evaluation
Nowadays, a wide range of information sources are available due to the
evolution of web and collection of data. Plenty of these information are
consumable and usable by humans but not understandable and processable by
machines. Some data may be directly accessible in web pages or via data feeds,
but most of the meaningful existing data is hidden within deep web databases
and enterprise information systems. Besides the inability to access a wide
range of data, manual processing by humans is effortful, error-prone and not
contemporary any more. Semantic web technologies deliver capabilities for
machine-readable, exchangeable content and metadata for automatic processing
of content. The enrichment of heterogeneous data with background knowledge
described in ontologies induces re-usability and supports automatic processing
of data. The establishment of “Corporate Smart Content” (CSC) - semantically
enriched data with high information content with sufficient benefits in
economic areas - is the main focus of this study. We describe three actual
research areas in the field of CSC concerning scenarios and datasets
applicable for corporate applications, algorithms and research. Aspect-
oriented Ontology Development advances modular ontology development and
partial reuse of existing ontological knowledge. Complex Entity Recognition
enhances traditional entity recognition techniques to recognize clusters of
related textual information about entities. Semantic Pattern Mining combines
semantic web technologies with pattern learning to mine for complex models by
attaching background knowledge. This study introduces the afore-mentioned
topics by analyzing applicable scenarios with economic and industrial focus,
as well as research emphasis. Furthermore, a collection of existing datasets
for the given areas of interest is presented and evaluated. The target
audience includes researchers and developers of CSC technologies - people
interested in semantic web features, ontology development, automation,
extracting and mining valuable information in corporate environments. The aim
of this study is to provide a comprehensive and broad overview over the three
topics, give assistance for decision making in interesting scenarios and
choosing practical datasets for evaluating custom problem statements. Detailed
descriptions about attributes and metadata of the datasets should serve as
starting point for individual ideas and approaches
Data semantic enrichment for complex event processing over IoT Data Streams
This thesis generalizes techniques for processing IoT data streams, semantically enrich data with contextual information, as well as complex event processing in IoT applications. A case study for ECG anomaly detection and signal classification was conducted to validate the knowledge foundation
Pattern Discovery from Event Data
Events are ubiquitous in real-life. With the rapid rise of the popularity of social media channels, massive amounts of event data, such as information about festivals, concerts, or meetings, are increasingly created and shared by users on the Internet. Deriving insights or knowledge from such social media data provides a semantically rich basis for many applications, for instance, social media marketing, service recommendation, sales promotion, or enrichment of existing data sources. In spite of substantial research on discovering valuable knowledge from various types of social media data such as microblog data, check-in data, or GPS trajectories, interestingly there has been only little work on mining event data for useful patterns.
In this thesis, we focus on the discovery of interesting, useful patterns from datasets of events, where information about these events is shared by and spread across social media platforms. To deal with the existence of heterogeneous event data sources, we propose a comprehensive framework to model events for pattern mining purposes, where each event is described by three components: context, time, and location. This framework allows one to easily define how events are related in terms of conceptual, temporal, and spatial (geographic) relationships. Moreover, we also take into account hierarchies for contexts, time, and locations of events, which naturally exist as useful background knowledge to derive patterns at different levels of abstraction and granularity. Based on this framework, we focus on the following problems: (i) mining interval-based event sequence patterns, (ii) mining periodic event patterns, and (iii) extracting semantic annotations for locations of events. Generally, the first two problems consider correlations of events whereas the last one takes correlations of event components into account.
In particular, the first problem is a generalization of mining sequential patterns from traditional data, where patterns representing complex temporal relationships among events can be discovered at different levels of abstraction and granularity. The second problem is to find periodic event patterns, where a notion of relaxed periodicity is formulated for events as well as for groups of events that co-occur. The third~problem is to extract semantic annotations for locations on the basis of exploiting correlations of contexts, time, and locations of events. For the three problems above, we respectively propose novel and efficient approaches. Our experiments clearly indicate that extracted patterns and knowledge can be well utilized in various useful tasks, such as event prediction, semantic search for locations, or topic-based clustering of locations
Requirements and Use Cases ; Report I on the sub-project Smart Content Enrichment
In this technical report, we present the results of the first milestone phase
of the Corporate Smart Content sub-project "Smart Content Enrichment". We
present analyses of the state of the art in the fields concerning the three
working packages defined in the sub-project, which are aspect-oriented
ontology development, complex entity recognition, and semantic event pattern
mining. We compare the research approaches related to our three research
subjects and outline briefly our future work plan
Knowledge Graphs Evolution and Preservation -- A Technical Report from ISWS 2019
One of the grand challenges discussed during the Dagstuhl Seminar "Knowledge
Graphs: New Directions for Knowledge Representation on the Semantic Web" and
described in its report is that of a: "Public FAIR Knowledge Graph of
Everything: We increasingly see the creation of knowledge graphs that capture
information about the entirety of a class of entities. [...] This grand
challenge extends this further by asking if we can create a knowledge graph of
"everything" ranging from common sense concepts to location based entities.
This knowledge graph should be "open to the public" in a FAIR manner
democratizing this mass amount of knowledge." Although linked open data (LOD)
is one knowledge graph, it is the closest realisation (and probably the only
one) to a public FAIR Knowledge Graph (KG) of everything. Surely, LOD provides
a unique testbed for experimenting and evaluating research hypotheses on open
and FAIR KG. One of the most neglected FAIR issues about KGs is their ongoing
evolution and long term preservation. We want to investigate this problem, that
is to understand what preserving and supporting the evolution of KGs means and
how these problems can be addressed. Clearly, the problem can be approached
from different perspectives and may require the development of different
approaches, including new theories, ontologies, metrics, strategies,
procedures, etc. This document reports a collaborative effort performed by 9
teams of students, each guided by a senior researcher as their mentor,
attending the International Semantic Web Research School (ISWS 2019). Each team
provides a different perspective to the problem of knowledge graph evolution
substantiated by a set of research questions as the main subject of their
investigation. In addition, they provide their working definition for KG
preservation and evolution
- …