662 research outputs found

    When Things Matter: A Data-Centric View of the Internet of Things

    With the recent advances in radio-frequency identification (RFID), low-cost wireless sensor devices, and Web technologies, the Internet of Things (IoT) approach has gained momentum in connecting everyday objects to the Internet and facilitating machine-to-human and machine-to-machine communication with the physical world. While IoT offers the capability to connect and integrate both digital and physical entities, enabling a whole new class of applications and services, several significant challenges need to be addressed before these applications and services can be fully realized. A fundamental challenge centers around managing IoT data, which is typically produced in dynamic and volatile environments and is not only extremely large in scale and volume, but also noisy and continuous. This article surveys the main techniques and state-of-the-art research efforts in IoT from data-centric perspectives, including data stream processing, data storage models, complex event processing, and searching in IoT. Open research issues for IoT data management are also discussed.

    A Survey on Behavioral Pattern Mining from Sensor Data in Internet of Things

    The deployment of large-scale wireless sensor networks (WSNs) for Internet of Things (IoT) applications is increasing day by day, especially with the emergence of smart city services. The sensor data streams generated by these applications are largely dynamic, heterogeneous, and often geographically distributed over large areas. For high-value use in business, industry, and services, these data streams must be mined to extract insightful knowledge, for instance for monitoring (e.g., discovering certain behaviors over a deployed area) or network diagnostics (e.g., predicting faulty sensor nodes). However, due to the inherent constraints of sensor networks and application requirements, traditional data mining techniques cannot be directly used to mine IoT data streams efficiently and accurately in real time. In the last decade, a number of works have been reported in the literature proposing behavioral pattern mining algorithms for sensor networks. This paper presents the technical challenges that need to be considered for mining sensor data. It then provides a thorough review of the mining techniques proposed in the recent literature to mine behavioral patterns from sensor data in IoT, highlighting and comparing their characteristics and differences. We also propose a behavioral pattern mining framework for IoT and discuss possible future research directions in this area. © 2013 IEEE

    Intrusion detection system alert correlation with operating system level logs

    Thesis (Master)--Izmir Institute of Technology, Computer Engineering, Izmir, 2009. Includes bibliographical references (leaves: 63-66). Text in English; Abstract: Turkish and English. vii, 67 leaves. The Internet is a global public network. More and more people connect to the Internet every day to take advantage of internetwork connectivity. This also brings considerable risk, because the Internet has both harmless and harmful users. When an organization makes its information system available to harmless Internet users, the same information becomes available to malicious users as well. Most organizations deploy firewalls to protect their private network from the public network. However, no network can be one hundred percent secure, because connectivity requires that some kind of access to internal systems be granted to Internet users. The firewall provides security by allowing only specific services through it, applying defined rules to each packet that reaches its network interface. The IDS complements the firewall by detecting whether someone tries to break in through the firewall, or manages to breach it and tries to gain access to a system in the trusted site, and by alerting the system administrator when there is a breach in security. However, at present, IDSs suffer from several limitations. To address these limitations and learn about network security threats, it is necessary to perform alert correlation. Alert correlation focuses on discovering various relationships between individual alerts. Intrusion alert correlation techniques correlate alerts into meaningful groups or attack scenarios so that human analysts can understand them more easily. To verify that alert correlation works properly, this thesis proposes to build attack scenarios by correlating alerts on the basis of the prerequisites and consequences of intrusions. 
With the architecture of the experimental environment based on the prerequisites and consequences of different types of attacks, the proposed approach correlates alerts by matching the consequences of earlier alerts against the prerequisites of later ones, using OS-level logs. The results demonstrate the accuracy of the proposed method and the advantage of building IDS alert correlation on OS-level logs in information security systems.
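
The matching of consequences to prerequisites described in this abstract could be sketched roughly as follows. This is a minimal illustration, not the thesis's implementation: the `Alert` class, the `correlate` function, and the alert names and conditions are all hypothetical, and the OS-level log step is left out.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Tuple


@dataclass(frozen=True)
class Alert:
    """Hypothetical alert record; field names are assumptions for this sketch."""
    name: str
    timestamp: float
    prerequisites: FrozenSet[str]  # conditions the attack needs in order to succeed
    consequences: FrozenSet[str]   # conditions the attack establishes if it succeeds


def correlate(alerts: List[Alert]) -> List[Tuple[str, str]]:
    """Link an earlier alert to a later one when they share a condition.

    An edge (a, b) is emitted when a happened strictly before b and at
    least one consequence of a is a prerequisite of b.
    """
    ordered = sorted(alerts, key=lambda a: a.timestamp)
    edges = []
    for i, earlier in enumerate(ordered):
        for later in ordered[i + 1:]:
            if (earlier.timestamp < later.timestamp
                    and earlier.consequences & later.prerequisites):
                edges.append((earlier.name, later.name))
    return edges


# Invented alerts, for illustration only.
alerts = [
    Alert("port_scan", 1.0, frozenset(), frozenset({"service_known"})),
    Alert("buffer_overflow", 2.0, frozenset({"service_known"}), frozenset({"root_shell"})),
    Alert("log_tampering", 3.0, frozenset({"root_shell"}), frozenset()),
    Alert("ping_sweep", 4.0, frozenset(), frozenset({"host_alive"})),
]
scenario = correlate(alerts)
```

Here the first three alerts chain into one candidate attack scenario, while the unrelated `ping_sweep` alert stays uncorrelated.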

    Knowledge discovery in data streams

    Knowing what to do with the massive amount of data collected has long been an issue for many organizations. While data mining has been touted as the solution, it has failed to deliver the promised impact despite its successes in many areas. One reason is that data mining algorithms were not designed for the real world, i.e., they usually assume a static view of the data and a stable execution environment where resources are abundant. The reality, however, is that data are constantly changing and the execution environment is dynamic. Hence, it becomes difficult for data mining to truly deliver timely and relevant results. Recently, the processing of stream data has received much attention. What is interesting is that the methodology used to design stream-based algorithms may well be the solution to the above problem. In this entry, we discuss this issue and present an overview of recent works.

    Algorithmic Techniques for Processing Data Streams

    We survey some algorithmic techniques for processing data streams. After covering the basic methods of sampling and sketching, we present more evolved procedures that build on those basic ones. In particular, we examine algorithmic schemes for similarity mining, the concept of group testing, and techniques for clustering and summarizing data streams.
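
    As one concrete instance of stream sampling, the sketch below shows classic reservoir sampling, which keeps a uniform random sample of fixed size from a stream whose length is not known in advance. It is offered only as an illustration of the general idea; the abstract does not say which sampling methods the survey covers, and the function name `reservoir_sample` is an assumption.

```python
import random
from typing import Iterable, List, Optional, TypeVar

T = TypeVar("T")


def reservoir_sample(stream: Iterable[T], k: int,
                     rng: Optional[random.Random] = None) -> List[T]:
    """Return up to k items chosen uniformly at random from a stream."""
    rng = rng or random.Random()
    reservoir: List[T] = []
    for i, item in enumerate(stream):
        if i < k:
            # The first k items fill the reservoir directly.
            reservoir.append(item)
        else:
            # Item i (0-based) replaces a random slot with probability k / (i + 1).
            j = rng.randint(0, i)  # randint is inclusive on both ends
            if j < k:
                reservoir[j] = item
    return reservoir


# A seeded generator makes the run repeatable; the seed value is arbitrary.
sample = reservoir_sample(range(10_000), k=5, rng=random.Random(0))
short = reservoir_sample(range(3), k=5)
```

    The sample needs only O(k) memory and a single pass, which is why this kind of technique suits streams that cannot be stored in full.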

    Large-Scale Indexing, Discovery, and Ranking for the Internet of Things (IoT)

    Network-enabled sensing and actuation devices are key enablers for connecting real-world objects to the cyber world. The Internet of Things (IoT) consists of the network-enabled devices and communication technologies that allow connectivity and integration of physical objects (Things) into the digital world (Internet). Enormous amounts of dynamic IoT data are collected from Internet-connected devices. IoT data are usually multi-variant streams that are heterogeneous, sporadic, multi-modal, and spatio-temporal. IoT data can be disseminated at different granularities and have diverse structures, types, and qualities. Dealing with the data deluge from heterogeneous IoT resources and services imposes new challenges on the indexing, discovery, and ranking mechanisms needed to build applications that require on-line access to and retrieval of ad-hoc IoT data. However, the existing IoT data indexing and discovery approaches are complex or centralised, which hinders their scalability. The primary objective of this article is to provide a holistic overview of the state of the art in indexing, discovery, and ranking of IoT data. The article aims to pave the way for researchers to design, develop, implement, and evaluate techniques and approaches for on-line large-scale distributed IoT applications and services.

    Monitoring Network Data Streams

    Ph.D (Doctor of Philosophy)

    A Survey on Index Support for Item Set Mining

    It is very difficult to handle the huge amount of information stored in modern databases. To manage these databases, association rule mining is currently used, which is a costly process that involves a significant amount of time and memory. Therefore, it is necessary to develop an approach to overcome these difficulties. Suitable data structures and algorithms must be developed to perform item set mining effectively. An index includes all the characteristics potentially needed during the mining task, so the extraction can be executed with the help of the index, without accessing the database. A database index is a data structure that speeds up information retrieval operations on a database table at very low cost, in exchange for increased storage space. Using the index permits user interaction, in which the user can specify different attributes for item set extraction. The index also supports reuse: item sets can be mined from it with any support threshold. This paper surveys the index-support approaches for item set mining proposed by various authors.
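
    The idea of mining item sets from an index rather than from the database itself could look roughly like the sketch below. It assumes a simple vertical index (each item mapped to the set of transaction IDs containing it) and a brute-force enumeration; the surveyed index structures are not described in the abstract and are likely more elaborate. The function names and the sample transactions are invented for illustration.

```python
from itertools import combinations
from typing import Dict, FrozenSet, Iterable, Set


def build_index(transactions: Iterable[Iterable[str]]) -> Dict[str, Set[int]]:
    """Scan the database once and map each item to its transaction IDs."""
    index: Dict[str, Set[int]] = {}
    for tid, items in enumerate(transactions):
        for item in items:
            index.setdefault(item, set()).add(tid)
    return index


def frequent_itemsets(index: Dict[str, Set[int]], min_support: int,
                      max_size: int = 3) -> Dict[FrozenSet[str], int]:
    """Find item sets with support >= min_support using only the index."""
    result: Dict[FrozenSet[str], int] = {}
    # Items below the threshold cannot appear in any frequent item set.
    candidates = sorted(i for i, tids in index.items() if len(tids) >= min_support)
    for size in range(1, max_size + 1):
        for combo in combinations(candidates, size):
            # Support of an item set = size of the intersection of its ID sets.
            tids = set.intersection(*(index[i] for i in combo))
            if len(tids) >= min_support:
                result[frozenset(combo)] = len(tids)
    return result


# Invented transactions, for illustration only.
transactions = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter", "milk"],
    ["bread", "butter"],
]
index = build_index(transactions)
loose = frequent_itemsets(index, min_support=2)
strict = frequent_itemsets(index, min_support=3)
```

    The index is built in one pass and then reused for both thresholds, which mirrors the reuse property the abstract describes.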

    Design and implementation of an efficient data stream processing system

    In standard database scenarios, an end-user assumes that all data (e.g., sensor readings) is stored in a database. Therefore, one can simply submit arbitrarily complex processing in the form of SQL queries or stored procedures to a database server. Data stream oriented applications typically deal with huge volumes of data. Storing this data and performing off-line processing on such a huge dataset can be costly, time consuming, and impractical. This work describes our research results from designing and implementing an efficient data management system for online and off-line processing of data streams in the field of environmental monitoring. Our target data sources are wireless sensor networks. Although our focus is on a specific application domain, the results of this thesis are designed in a generic way, so that they can be applied to a wide variety of data stream oriented applications. This thesis starts by presenting the state of the art in data stream processing research, specifically window processing concepts, continuous queries, stream filtering query languages, and in-network data processing (with a particular focus on TinyOS-based approaches). We present key existing data stream processing engines, their internal architecture, and how they compare to our platform, namely the Global Sensor Network (GSN) middleware. GSN middleware enables fast and flexible deployment and interconnection of sensor networks. It provides simple and uniform access to a comprehensive set of heterogeneous technologies. Additionally, GSN offers zero-programming deployment and data-oriented integration of sensor networks, and supports dynamic re-configuration and adaptation at runtime. We present the virtual sensor concept, which offers a high-level view of arbitrary stream data sources, together with its powerful declarative specification and query tools. Furthermore, we describe the design, conceptual, architectural, and optimization decisions of the GSN platform in detail. 
In order to achieve high efficiency while processing large volumes of streaming data using window-based continuous queries, we present a set of optimization algorithms and techniques to intelligently group and process different types of continuous queries. While adapting GSN to large-scale sensor network deployments, we encountered several performance bottlenecks. One of the challenges we faced was the scalable delivery of streaming data for high data rate streams. We found that we could dramatically improve the performance of a query processor by performing simple grouping of user queries, hence sharing both the processing and memory costs among similar queries. Moreover, we encountered a similar performance issue while scheduling continuous queries. The problem of efficiently scheduling the execution of continuous queries with window and sliding parameters is not addressed in depth in the literature. This problem becomes severe when one considers large volumes of high data rate streams. In these cases, an efficient query scheduler not only increases performance by at least an order of magnitude but also decreases the response time and memory requirements. Finally, we present how our GSN platform can be integrated with an external data sharing and visualization framework, namely Microsoft's SenseWeb platform. Microsoft's SenseWeb platform provides a sensor network data gathering and visualization infrastructure that is globally accessible to end users. This integration (which was initiated by the Swiss Experiment project and demanded by GSN users) not only shows the scalability of the GSN platform when combined with optimized algorithms, but also demonstrates its flexibility.
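
The grouping of similar window-based queries mentioned in this abstract might be sketched as follows. This is only an illustration under simplifying assumptions: windows are count-based, queries count as "similar" when they use the same window size, and the class `SharedWindowProcessor` and the query names are hypothetical. It does not reproduce GSN's actual algorithms or API.

```python
from collections import defaultdict, deque
from typing import Callable, Deque, Dict, List, Sequence, Tuple

# A query is (name, window size in tuples, aggregate over the window).
Query = Tuple[str, int, Callable[[Sequence[float]], float]]


class SharedWindowProcessor:
    """Evaluate continuous queries, sharing one buffer per window size."""

    def __init__(self, queries: List[Query]) -> None:
        # Group queries by window size so each group shares a single buffer.
        self.groups: Dict[int, List[Query]] = defaultdict(list)
        for query in queries:
            self.groups[query[1]].append(query)
        # deque(maxlen=n) drops the oldest element automatically when full.
        self.windows: Dict[int, Deque[float]] = {
            size: deque(maxlen=size) for size in self.groups
        }

    def push(self, value: float) -> Dict[str, float]:
        """Feed one stream element and return every query's current result."""
        results: Dict[str, float] = {}
        for size, window in self.windows.items():
            window.append(value)
            snapshot = list(window)  # built once, reused by the whole group
            for name, _, aggregate in self.groups[size]:
                results[name] = aggregate(snapshot)
        return results


# Invented queries and readings, for illustration only.
queries: List[Query] = [
    ("avg_3", 3, lambda w: sum(w) / len(w)),
    ("max_3", 3, max),
    ("min_2", 2, min),
]
processor = SharedWindowProcessor(queries)
latest: Dict[str, float] = {}
for reading in [10.0, 20.0, 30.0, 40.0]:
    latest = processor.push(reading)
```

Three queries are served by two buffers here, so the memory and window-maintenance cost grows with the number of distinct window sizes rather than with the number of queries.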