19 research outputs found

    Overhead Management Strategies for Internet of Things Devices

    Get PDF
    Overhead (time and energy) management is paramount for IoT edge devices considering their typically resource-constrained nature. In this thesis we present two contributions for lowering resource consumption of IoT devices. The first contribution is minimizing the overhead of the Transport Layer Security (TLS) authentication protocol in the context of IoT networks by selecting a lightweight cipher suite configuration. TLS is the de facto authentication protocol for secure communication in Internet of Things (IoT) applications. However, the processing and energy demands of this protocol are the two essential parameters that must be taken into account with respect to the resource-constraint nature of IoT devices. For the first contribution, we study these parameters using a testbed in which an IoT board (Cypress CYW43907) communicates with a server over an 802.11 wireless link. Although TLS supports a wide-array of cipher suites, in this paper we focus on DHE RSA, ECDHE RSA, and ECDHE ECDSA, which are among the most popular ciphers used due to their robustness. Our studies show that ciphers using Elliptic Curve Diffie Hellman (ECDHE) key exchange are considerably more efficient than ciphers using Diffie Hellman (DHE). Furthermore, ECDSA signature verification consumes more time and energy than RSA signature verification for ECDHE key exchange. This study helps IoT designers choose an appropriate TLS cipher suite based on application demands, computational capabilities, and energy resources available. The second contribution of this thesis is deploying supervised machine learning anomaly detection algorithms on an IoT edge device to reduce data transmission overhead and cloud storage requirements. With continuous monitoring and sensing, millions of Internet of Things sensors all over the world generate tremendous amounts of data every minute. As a result, recent studies start to raise the question as whether to send all the sensing data directly to the cloud (i.e., direct transmission), or to preprocess such data at the network edge and only send necessary data to the cloud (i.e., preprocessing at the edge). Anomaly detection is particularly useful as an edge mining technique to reduce the transmission overhead in such a context when the frequently monitored activities contain only a sparse set of anomalies. This paper analyzes the potential overhead-savings of machine learning based anomaly detection models on the edge in three different IoT scenarios. Our experimental results prove that by choosing the appropriate anomaly detection models, we are able to effectively reduce the total amount of transmission energy as well as minimize required cloud storage. We prove that Random Forest, Multilayer Perceptron, and Discriminant Analysis models can viably save time and energy on the edge device during data transmission. K-Nearest Neighbors, although reliable in terms of prediction accuracy, demands exorbitant overhead and results in net time and energy loss on the edge device. In addition to presenting our model results for the different IoT scenarios, we provide guidelines for potential model selections through analysis of involved tradeoffs such as training overhead, prediction overhead, and classification accuracy

    Real-Time Data Analytics in Sensor Networks

    Full text link
    The proliferation of Wireless Sensor Networks (WSNs) in the past decade has provided the bridge between the physical and digital worlds, enabling the monitoring and study of physical phenomena at a granularity and level of detail that was never before possible. In this study, we review the efforts of the research community with respect to two important problems in the context of WSNs: real-time collection of the sensed data, and real-time processing of these data series.

    Outlier detection in wireless sensor network based on time series approach

    Get PDF
    Sensory data in Wireless Sensor Networks (WSN) is not always reliable because of open environmental factors such as noise, weak received signal strength, or intrusion attacks. The process of detecting highly noisy data and noisy sensor nodes is called outlier detection. Outlier detection is one of the fundamental tasks of time series analysis, relating to predictive modeling, cluster analysis, and association analysis, and it has been widely researched in various disciplines besides WSN. The challenge of noise detection in WSN is that it has to be done inside a sensor with limited computational and communication capabilities. Furthermore, there are only a few outlier detection techniques for WSNs, and there are no algorithms that detect outliers on real data with a high level of accuracy locally and select the most effective neighbors for collaborative detection globally. Hence, this research designed local and global time series outlier detection for WSN. First, the Local Outlier Detection Algorithm (LODA) was developed as a decentralized noise detection algorithm that runs on each sensor node, identifying intrinsic features, determining the memory size of the data histogram to make effective use of the available memory, and classifying data to predict outliers. Next, the Global Outlier Detection Algorithm (GODA) was developed, using adaptive Gray coding and entropy techniques to select the best neighbors for spatial correlation among sensor nodes; GODA also adopts an Adaptive Random Forest algorithm for best results. Finally, this research developed a Compromised Sensor Node Detection Algorithm (CSDA), a centralized algorithm processed at the base station that detects compromised sensor nodes regardless of the specific cause of the anomalies. To measure the effectiveness and accuracy of these algorithms, a comprehensive scenario was simulated in which noisy data were injected randomly into the data and the sensor nodes. The results showed that LODA achieved 89% accuracy in predicting outliers, GODA detected anomalies with up to 99% accuracy, and CSDA accurately identified up to 80% of the sensor nodes that had been compromised. In conclusion, the proposed algorithms demonstrate local and global anomaly detection as well as compromised sensor node detection in WSN.
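    As an illustration of the kind of in-node, memory-bounded detection that LODA targets (a simplified sketch, not the thesis's algorithm), a sensor node can maintain a small histogram over a bounded window of recent readings and flag values that land in rarely populated bins:

```python
from collections import deque


class LocalOutlierDetector:
    """Illustrative histogram-based detector for a resource-constrained node.

    Not the thesis's LODA; a simplified sketch of the same idea: keep a bounded
    histogram of recent readings and flag values that fall into rare bins.
    """

    def __init__(self, lo: float, hi: float, bins: int = 16,
                 window: int = 128, rarity: float = 0.02):
        self.lo, self.hi, self.bins = lo, hi, bins
        self.window = deque(maxlen=window)   # bounded memory, as on a sensor node
        self.rarity = rarity                 # bin frequency below this => outlier

    def _bin(self, x: float) -> int:
        x = min(max(x, self.lo), self.hi)
        return min(int((x - self.lo) / (self.hi - self.lo) * self.bins), self.bins - 1)

    def update(self, x: float) -> bool:
        """Insert a reading; return True if it looks like an outlier."""
        counts = [0] * self.bins
        for v in self.window:
            counts[self._bin(v)] += 1
        warmed_up = len(self.window) >= self.window.maxlen // 2
        is_outlier = (warmed_up and
                      counts[self._bin(x)] / max(len(self.window), 1) < self.rarity)
        self.window.append(x)
        return is_outlier


# Example: a steady temperature stream with one injected spike at the end.
det = LocalOutlierDetector(lo=0.0, hi=50.0)
readings = [22.0 + 0.1 * (i % 5) for i in range(200)] + [48.5]
flags = [det.update(r) for r in readings]
print("spike flagged:", flags[-1])
```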

    Machine learning for Internet of Things data analysis: A survey

    Get PDF
    Rapid developments in hardware, software, and communication technologies have allowed the emergence of Internet-connected sensory devices that provide observation and data measurement from the physical world. By 2020, it is estimated that the total number of Internet-connected devices in use will be between 25 and 50 billion. As the numbers grow and technologies become more mature, the volume of data published will increase. Internet-connected device technology, referred to as the Internet of Things (IoT), continues to extend the current Internet by providing connectivity and interaction between the physical and cyber worlds. In addition to increased volume, the IoT generates Big Data characterized by its velocity in terms of time and location dependency, with a variety of multiple modalities and varying data quality. Intelligent processing and analysis of this Big Data is the key to developing smart IoT applications. This article assesses the different machine learning methods that deal with the challenges in IoT data by considering smart cities as the main use case. The key contribution of this study is the presentation of a taxonomy of machine learning algorithms explaining how different techniques are applied to the data in order to extract higher-level information. The potential and challenges of machine learning for IoT data analytics are also discussed. A use case of applying a Support Vector Machine (SVM) to Aarhus Smart City traffic data is presented for a more detailed exploration. Comment: Digital Communications and Networks (2017).
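    As a rough illustration of the SVM use case (using synthetic stand-in data rather than the Aarhus traffic records, and toy feature names), the sketch below trains a scikit-learn SVM to label traffic intervals as congested or not:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n = 1000
vehicle_count = rng.integers(0, 120, n)        # vehicles observed per interval (synthetic)
avg_speed = rng.uniform(5, 90, n)              # average speed in km/h (synthetic)
X = np.column_stack([vehicle_count, avg_speed])
y = ((vehicle_count > 60) & (avg_speed < 30)).astype(int)  # 1 = congested (toy labeling rule)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, gamma="scale"))
clf.fit(X_tr, y_tr)
print(f"held-out accuracy: {clf.score(X_te, y_te):.2f}")
```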

    An anomaly mitigation framework for IoT using fog computing

    Get PDF
    The advancement of IoT has prompted its application in areas such as smart homes, smart cities, etc., and this has aided its exponential growth. However, alongside this development, IoT networks are experiencing a rise in security challenges such as botnet attacks, which often appear as network anomalies. Similarly, providing security solutions has been challenging due to the low resources that characterize the devices in IoT networks. To overcome these challenges, the fog computing paradigm provides an enabling environment that offers additional resources for deploying security solutions such as anomaly mitigation schemes. In this paper, we propose a hybrid anomaly mitigation framework for IoT using fog computing to ensure faster and more accurate anomaly detection. The framework employs signature-based and anomaly-based detection methodologies for its two modules, respectively. The signature-based module utilizes a database of attack sources (blacklisted IP addresses) to ensure faster detection when attacks are executed from a blacklisted IP address, while the anomaly-based module uses an extreme gradient boosting algorithm for accurate classification of network traffic flows as normal or abnormal. We evaluated the performance of both modules using an IoT-based dataset, in terms of response time for the signature-based module and accuracy in binary and multiclass classification for the anomaly-based module. The results show that the signature-based module detects attacks at least six times faster than the anomaly-based module for each number of instances evaluated. The anomaly-based module using the XGBoost classifier detects attacks with an accuracy of 99% and at least 97% average recall, average precision, and average F1 score for binary and multiclass classification. Additionally, it recorded a false-positive rate of 0.05.
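    The two-stage design can be sketched as follows; the blacklist entries, feature layout, and toy labels below are illustrative placeholders rather than the paper's dataset, but the control flow mirrors the framework: a constant-time signature lookup first, then an XGBoost classifier for flows from unknown sources.

```python
import numpy as np
from xgboost import XGBClassifier

BLACKLIST = {"203.0.113.7", "198.51.100.23"}        # example (documentation-range) IPs

# Train the anomaly-based module on stand-in flow features with a toy labeling rule.
rng = np.random.default_rng(1)
X = rng.random((2000, 6))                           # stand-in network-flow features
y = (X[:, 0] + X[:, 3] > 1.2).astype(int)           # 1 = anomalous flow (toy rule)
model = XGBClassifier(n_estimators=100, max_depth=4, eval_metric="logloss")
model.fit(X, y)


def classify_flow(src_ip: str, features: np.ndarray) -> str:
    if src_ip in BLACKLIST:                          # signature-based module: O(1) set lookup
        return "blocked (blacklisted source)"
    pred = model.predict(features.reshape(1, -1))[0]  # anomaly-based module
    return "anomalous" if pred == 1 else "normal"


print(classify_flow("203.0.113.7", X[0]))   # caught by the signature module
print(classify_flow("192.0.2.10", X[1]))    # falls through to the XGBoost classifier
```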

    Security architecture for Fog-To-Cloud continuum system

    Get PDF
    Nowadays, with the number of devices connected to the Internet increasing rapidly, cloud computing cannot handle all the real-time processing on its own. Fog computing therefore emerged to provide data processing, filtering, aggregation, storage, networking, and computation closer to the users. Fog computing offers real-time processing with lower latency than the cloud. However, fog computing did not come to compete with the cloud; it comes to complement it. Therefore, a hierarchical Fog-to-Cloud (F2C) continuum system was introduced. The F2C system brings collaboration between distributed fogs and the centralized cloud. In F2C systems, one of the main challenges is security. The traditional cloud as security provider is not suitable for the F2C system because it constitutes a single point of failure, and the increasing number of devices at the edge of the network brings scalability issues. Furthermore, traditional cloud security cannot be applied to fog devices because their computational power is lower than the cloud's. On the other hand, considering fog nodes as security providers for the edge of the network brings Quality of Service (QoS) issues, because security algorithms consume a large share of the fog devices' computational power. There are some security solutions for fog computing, but they do not consider the hierarchical fog-to-cloud characteristics, which can lead to insecure collaboration between fog and cloud. In this thesis, the security considerations, attacks, challenges, requirements, and existing solutions are analyzed and reviewed in depth. Finally, a decoupled security architecture is proposed to provide the demanded security in a hierarchical and distributed fashion with less impact on QoS.

    Mining complex data in highly streaming environments

    Get PDF
    Data is growing at a rapid rate because of advanced hardware and software technologies and platforms such as e-health systems, sensor networks, and social media. One of the challenging problems is storing, processing, and transferring this big data in an efficient and effective way. One solution to tackle these challenges is to construct synopses by means of data summarization techniques. Motivated by the fact that without summarization, processing, analyzing, and communicating this vast amount of data is inefficient, this thesis introduces new summarization frameworks with the main goals of reducing communication costs and accelerating data mining processes in different application scenarios. Specifically, we study the following big data summarization techniques: (i) dimensionality reduction, (ii) clustering, and (iii) histograms, considering their importance and wide use in various areas and domains. In our work, we propose three different frameworks using these summarization techniques to cover three different aspects of big data, "Volume", "Velocity", and "Variety", in centralized and decentralized platforms. We use dimensionality reduction techniques for summarizing large 2D arrays, and clustering and histograms for processing multiple data streams. With respect to the importance and rapid growth of emerging e-health applications such as tele-radiology and tele-medicine, which require fast, low-cost, and often lossless access to massive amounts of medical images and data over band-limited channels, our first framework attempts to summarize streams of large-volume medical images (e.g. X-rays) for the purpose of compression. Significant amounts of correlation and redundancy exist across different medical images. These can be extracted and used as a data summary to achieve better compression, and consequently less storage and communication overhead on the network. We propose a novel memory-assisted compression framework as a learning-based universal coding scheme, which can be used to complement any existing algorithm to further eliminate redundancies and similarities across images. This approach is motivated by the fact that, often in medical applications, massive amounts of correlated images from the same family are available as training data for learning the dependencies and deriving appropriate reference or synopsis models. The models can then be used for compression of any new image from the same family. In particular, dimensionality reduction techniques such as Principal Component Analysis (PCA) and Non-negative Matrix Factorization (NMF) are applied to a set of images from the training data to form the required reference models. The proposed memory-assisted compression allows each image to be processed independently of other images, and hence allows individual image access and transmission. In the second part of our work, we investigate the problem of summarizing distributed multidimensional data streams using clustering. We devise a distributed clustering framework, DistClusTree, that extends the centralized ClusTree approach. The main difficulty in distributed clustering is balancing communication costs and clustering quality. We tackle this in DistClusTree by combining spatial index summaries and online tracking for efficient local and global incremental clustering. We demonstrate through extensive experiments the efficacy of the framework in terms of communication costs and approximate clustering quality.
    In the last part, we use a multidimensional index structure to merge distributed summaries into a centralized histogram, another widely used summarization technique, with application to approximate range query answering. We propose the index-based Distributed Mergeable Summaries (iDMS) framework based on kd-trees, which addresses these challenges with data generative models: Gaussian mixture models (GMMs) and a Generative Adversarial Network (GAN). iDMS maintains a global approximate kd-tree at a central site via GMMs or GANs upon new arrivals of streaming data at local sites. Experimental results validate the effectiveness and efficiency of iDMS against baseline distributed settings in terms of approximation error and communication costs.
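    The memory-assisted compression idea from the first framework can be sketched compactly (with synthetic stand-in images rather than medical X-rays): fit a PCA reference model on a family of correlated training images, then represent a new image from the same family by a few projection coefficients plus a small residual.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
base = rng.random((32, 32))                               # shared structure of the image family
train = np.stack([base + 0.05 * rng.standard_normal((32, 32)) for _ in range(100)])
new_image = base + 0.05 * rng.standard_normal((32, 32))   # a new image from the same family

pca = PCA(n_components=8).fit(train.reshape(100, -1))     # reference/synopsis model from training data
coeffs = pca.transform(new_image.reshape(1, -1))          # 8 coefficients instead of 1024 pixels
approx = pca.inverse_transform(coeffs).reshape(32, 32)
residual = new_image - approx                             # small residual, hence cheap to encode

print("coefficients kept:", coeffs.size)
print("residual energy / image energy:",
      float(np.sum(residual ** 2) / np.sum(new_image ** 2)))
```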

    An Evolutionary Algorithm to Generate Ellipsoid Detectors for Negative Selection

    Get PDF
    Negative selection is a process from the biological immune system that can be applied to two-class (self and nonself) classification problems. Negative selection uses only one class (self) for training, which results in detectors for the other class (nonself). This paradigm is especially useful for problems in which only one class is available for training, such as network intrusion detection. Previous work has investigated hyper-rectangles and hyper-spheres as geometric detectors; this work proposes ellipsoids. First, the author establishes a mathematical model for ellipsoids and develops an algorithm to generate ellipsoids by training on only one class of data. Ellipsoid mutation operators, an objective function, and a convergence technique are described for the evolutionary algorithm that generates ellipsoid detectors. Testing on several data sets validates this approach by showing that the algorithm generates good ellipsoid detectors. Against artificial data sets, the detectors generated by the algorithm match more than 90% of nonself data with no false alarms. Against a subset of data from the 1999 DARPA MIT intrusion detection data, the ellipsoids generated by the algorithm detected approximately 98% of nonself data (intrusions) with an approximately 0% false alarm rate.
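    A minimal sketch of the general approach, not the author's algorithm, is shown below: an axis-aligned ellipsoid detector (center plus radii) is evolved with a simple (1+1) strategy so that it grows as large as possible while matching none of the self training samples.

```python
import numpy as np

rng = np.random.default_rng(0)
self_data = rng.normal(loc=0.0, scale=0.3, size=(300, 2))   # only the "self" class is available


def covers(center, radii, points):
    """Rows are inside the axis-aligned ellipsoid when sum(((x - c) / r)^2) <= 1."""
    return np.sum(((points - center) / radii) ** 2, axis=1) <= 1.0


def fitness(center, radii):
    if covers(center, radii, self_data).any():   # invalid: detector matches self data
        return -np.inf
    return float(np.sum(np.log(radii)))          # reward large (log-)volume


# Simple (1+1) evolution strategy: mutate center and radii, keep non-worse candidates.
center = rng.uniform(-2, 2, size=2)
radii = np.full(2, 0.05)
best = fitness(center, radii)
for _ in range(5000):
    cand_center = center + rng.normal(0, 0.05, size=2)
    cand_radii = np.abs(radii * np.exp(rng.normal(0, 0.1, size=2))) + 1e-6
    f = fitness(cand_center, cand_radii)
    if f >= best:
        center, radii, best = cand_center, cand_radii, f

nonself_probes = rng.uniform(-2, 2, size=(1000, 2))          # synthetic probes, mostly nonself
print("detector center:", np.round(center, 2), "radii:", np.round(radii, 2))
print("fraction of probes matched:", covers(center, radii, nonself_probes).mean())
```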