901 research outputs found

    A Selectivity based approach to Continuous Pattern Detection in Streaming Graphs

    Cyber security is one of the most significant technical challenges of our time. Detecting adversarial activities and preventing the theft of intellectual property and customer data are high priorities for corporations and government agencies around the world. Cyber defenders need to analyze massive-scale, high-resolution network flows to identify, categorize, and mitigate attacks involving networks spanning institutional and national boundaries. Many cyber attacks can be described as subgraph patterns, with prominent examples being insider infiltrations (path queries), denial of service (parallel paths) and malicious spreads (tree queries). This motivates us to explore subgraph matching on streaming graphs in a continuous setting. The novelty of our work lies in using the subgraph distributional statistics collected from the streaming graph to determine the query processing strategy. We introduce a "Lazy Search" algorithm where the search strategy is decided on a vertex-to-vertex basis depending on the likelihood of a match in the vertex neighborhood. We also propose a metric named "Relative Selectivity" that is used to select between different query processing strategies. Our experiments, performed on real online news and network traffic streams and a synthetic social network benchmark, demonstrate 10-100x speedups over selectivity-agnostic approaches. Comment: in 18th International Conference on Extending Database Technology (EDBT) (2015)
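    The idea of letting stream statistics steer the search can be illustrated with a minimal sketch. It is not the paper's "Lazy Search" algorithm or its "Relative Selectivity" metric, whose definitions are not given in the abstract. It assumes a simplified setting in which each streamed edge carries a type label, and it estimates selectivity as the fraction of edges seen so far that have that type. The names `EdgeTypeStats` and `most_selective_step` are hypothetical.

```python
from collections import Counter


class EdgeTypeStats:
    """Running counts of edge types observed in the stream (illustrative only)."""

    def __init__(self):
        self.counts = Counter()
        self.total = 0

    def observe(self, edge_type):
        """Record one streamed edge of the given type."""
        self.counts[edge_type] += 1
        self.total += 1

    def selectivity(self, edge_type):
        """Fraction of observed edges with this type; lower means rarer."""
        if self.total == 0:
            return 1.0  # no evidence yet, so treat every type as unselective
        return self.counts[edge_type] / self.total


def most_selective_step(stats, path_query):
    """Return the index of the rarest edge type in a path query.

    `path_query` is a list of edge-type labels, one per step of the path.
    Ties are resolved in favour of the earliest step.
    """
    return min(range(len(path_query)),
               key=lambda i: stats.selectivity(path_query[i]))
```

    Under these assumptions, a matcher would start its search from the rarest query edge, because few streamed edges can trigger it, and extend partial matches outward from there.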

    When Things Matter: A Data-Centric View of the Internet of Things

    With the recent advances in radio-frequency identification (RFID), low-cost wireless sensor devices, and Web technologies, the Internet of Things (IoT) approach has gained momentum in connecting everyday objects to the Internet and facilitating machine-to-human and machine-to-machine communication with the physical world. While IoT offers the capability to connect and integrate both digital and physical entities, enabling a whole new class of applications and services, several significant challenges need to be addressed before these applications and services can be fully realized. A fundamental challenge centers around managing IoT data, which is typically produced in dynamic and volatile environments and is not only extremely large in scale and volume, but also noisy and continuous. This article surveys the main techniques and state-of-the-art research efforts in IoT from data-centric perspectives, including data stream processing, data storage models, complex event processing, and searching in IoT. Open research issues for IoT data management are also discussed.

    Corporate Smart Content Evaluation

    Nowadays, a wide range of information sources is available due to the evolution of the web and the collection of data. Much of this information can be consumed and used by humans but cannot be understood or processed by machines. Some data may be directly accessible in web pages or via data feeds, but most of the meaningful existing data is hidden within deep web databases and enterprise information systems. Besides the inability to access a wide range of data, manual processing by humans is laborious, error-prone and outdated. Semantic web technologies deliver capabilities for machine-readable, exchangeable content and metadata for automatic processing of content. The enrichment of heterogeneous data with background knowledge described in ontologies promotes reusability and supports automatic processing of data. The establishment of “Corporate Smart Content” (CSC) - semantically enriched data with high information content and sufficient economic benefit - is the main focus of this study. We describe three current research areas in the field of CSC concerning scenarios and datasets applicable to corporate applications, algorithms and research. Aspect-oriented Ontology Development advances modular ontology development and partial reuse of existing ontological knowledge. Complex Entity Recognition enhances traditional entity recognition techniques to recognize clusters of related textual information about entities. Semantic Pattern Mining combines semantic web technologies with pattern learning to mine for complex models by attaching background knowledge. This study introduces the aforementioned topics by analyzing applicable scenarios with an economic and industrial focus, as well as a research emphasis. Furthermore, a collection of existing datasets for the given areas of interest is presented and evaluated.
The target audience includes researchers and developers of CSC technologies - people interested in semantic web features, ontology development, automation, and extracting and mining valuable information in corporate environments. The aim of this study is to provide a comprehensive and broad overview of the three topics and to assist in decision making for relevant scenarios and in choosing practical datasets for evaluating custom problem statements. Detailed descriptions of the attributes and metadata of the datasets should serve as a starting point for individual ideas and approaches.

    Data semantic enrichment for complex event processing over IoT Data Streams

    This thesis generalizes techniques for processing IoT data streams, semantically enriching data with contextual information, and complex event processing in IoT applications. A case study on ECG anomaly detection and signal classification was conducted to validate the knowledge foundation.

    Pattern Discovery from Event Data

    Events are ubiquitous in real life. With the rapid rise in popularity of social media channels, massive amounts of event data, such as information about festivals, concerts, or meetings, are increasingly created and shared by users on the Internet. Deriving insights or knowledge from such social media data provides a semantically rich basis for many applications, for instance, social media marketing, service recommendation, sales promotion, or enrichment of existing data sources. In spite of substantial research on discovering valuable knowledge from various types of social media data such as microblog data, check-in data, or GPS trajectories, there has been little work on mining event data for useful patterns. In this thesis, we focus on the discovery of interesting, useful patterns from datasets of events, where information about these events is shared by and spread across social media platforms. To deal with heterogeneous event data sources, we propose a comprehensive framework to model events for pattern mining purposes, where each event is described by three components: context, time, and location. This framework allows one to easily define how events are related in terms of conceptual, temporal, and spatial (geographic) relationships. Moreover, we also take into account hierarchies for contexts, time, and locations of events, which naturally exist as useful background knowledge to derive patterns at different levels of abstraction and granularity. Based on this framework, we focus on the following problems: (i) mining interval-based event sequence patterns, (ii) mining periodic event patterns, and (iii) extracting semantic annotations for locations of events. Generally, the first two problems consider correlations of events whereas the last one takes correlations of event components into account.
In particular, the first problem is a generalization of mining sequential patterns from traditional data, where patterns representing complex temporal relationships among events can be discovered at different levels of abstraction and granularity. The second problem is to find periodic event patterns, where a notion of relaxed periodicity is formulated for events as well as for groups of events that co-occur. The third problem is to extract semantic annotations for locations by exploiting correlations of contexts, time, and locations of events. For each of the three problems above, we propose a novel and efficient approach. Our experiments indicate that the extracted patterns and knowledge can be used in various tasks, such as event prediction, semantic search for locations, or topic-based clustering of locations.
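The three-component event model and the notion of relaxed periodicity can be sketched as follows. This is only an illustration: the abstract does not state the thesis's formal definitions, so the sketch assumes that "relaxed" means each gap between consecutive events may deviate from the nominal period by at most a fixed tolerance. The names `Event` and `is_relaxed_periodic` are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Event:
    """An event described by the three components named in the abstract."""
    context: str
    time: datetime
    location: str


def is_relaxed_periodic(events, period, tolerance):
    """Check whether events recur roughly every `period`.

    Assumed reading of "relaxed periodicity": every gap between
    consecutive events (ordered by time) lies within `tolerance`
    of `period`. Fewer than two events cannot form a period.
    """
    times = sorted(e.time for e in events)
    if len(times) < 2:
        return False
    gaps = (later - earlier for earlier, later in zip(times, times[1:]))
    return all(abs(gap - period) <= tolerance for gap in gaps)
```

For example, three concerts at the same venue held 7 and then 8 days apart would count as weekly under a one-day tolerance, but not under a zero tolerance.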

    Requirements and Use Cases ; Report I on the sub-project Smart Content Enrichment

    In this technical report, we present the results of the first milestone phase of the Corporate Smart Content sub-project "Smart Content Enrichment". We analyze the state of the art in the fields covered by the three work packages defined in the sub-project: aspect-oriented ontology development, complex entity recognition, and semantic event pattern mining. We compare the research approaches related to our three research subjects and briefly outline our future work plan.

    Knowledge Graphs Evolution and Preservation -- A Technical Report from ISWS 2019

    One of the grand challenges discussed during the Dagstuhl Seminar "Knowledge Graphs: New Directions for Knowledge Representation on the Semantic Web" and described in its report is that of a "Public FAIR Knowledge Graph of Everything: We increasingly see the creation of knowledge graphs that capture information about the entirety of a class of entities. [...] This grand challenge extends this further by asking if we can create a knowledge graph of "everything" ranging from common sense concepts to location based entities. This knowledge graph should be "open to the public" in a FAIR manner democratizing this mass amount of knowledge." Although linked open data (LOD) is only one knowledge graph, it is the closest realisation (and probably the only one) of a public FAIR Knowledge Graph (KG) of everything. LOD therefore provides a unique testbed for experimenting with and evaluating research hypotheses on open and FAIR KGs. One of the most neglected FAIR issues about KGs is their ongoing evolution and long-term preservation. We want to investigate this problem, that is, to understand what preserving and supporting the evolution of KGs means and how these problems can be addressed. The problem can be approached from different perspectives and may require the development of different approaches, including new theories, ontologies, metrics, strategies, procedures, etc. This document reports a collaborative effort performed by 9 teams of students, each guided by a senior researcher as their mentor, attending the International Semantic Web Research School (ISWS 2019). Each team provides a different perspective on the problem of knowledge graph evolution, substantiated by a set of research questions as the main subject of their investigation. In addition, they provide their working definitions of KG preservation and evolution.