9 research outputs found

    SLDPC: Towards Second Order Learning for Detecting Persistent Clusters in Data Streams

    Get PDF
    The main attention of research on data stream clustering algorithms so far has been focused on the adaptation of the algorithms for static datasets to the data streams and improvements of the existing adapted algorithms. Such algorithms fulfil the purpose of the first-order learning from data to clusters. This paper prompts a new question on second-order learning of cluster models from data streams and presents a learning algorithm that detects persistent clusters from consecutive clustering snapshots in data streams. In this work, we first collect a sequence of cluster snapshots as the output clusters at selected query points and then identify the persistent clusters within a given timeframe. The algorithm is evaluated on collections of synthetic datasets. The experimental results have demonstrated the effectiveness of the algorithm in detecting such persistent clusters

    A metaheuristic optimization approach for energy efficiency in the IoT networks

    Get PDF
    © 2020 John Wiley & Sons, Ltd. Recently Internet of Things (IoT) is being used in several fields like smart city, agriculture, weather forecasting, smart grids, waste management, etc. Even though IoT has huge potential in several applications, there are some areas for improvement. In the current work, we have concentrated on minimizing the energy consumption of sensors in the IoT network that will lead to an increase in the network lifetime. In this work, to optimize the energy consumption, most appropriate Cluster Head (CH) is chosen in the IoT network. The proposed work makes use of a hybrid metaheuristic algorithm, namely, Whale Optimization Algorithm (WOA) with Simulated Annealing (SA). To select the optimal CH in the clusters of IoT network, several performance metrics such as the number of alive nodes, load, temperature, residual energy, cost function have been used. The proposed approach is then compared with several state-of-the-art optimization algorithms like Artificial Bee Colony algorithm, Genetic Algorithm, Adaptive Gravitational Search algorithm, WOA. The results prove the superiority of the proposed hybrid approach over existing approaches

    Distributed Fog computing for Internet of Things (IoT) based Ambient Data Processing and Analysis

    Get PDF
    Urban centers across the globe are under immense environmental distress due to an increase in air pollution, industrialization, and elevated living standards. The unmanageable and mushroom growth of industries and an exponential soar in population has made the ascent of air pollution intractable. To this end, the solutions that are based on the latest technologies, such as the Internet of things (IoT) and Artificial Intelligence (AI) are becoming increasingly popular and they have capabilities to monitor the extent and scale of air contaminants and would be subsequently useful for containing them. With centralized cloud-based IoT platforms, the ubiquitous and continuous monitoring of air quality and data processing can be facilitated for the identification of air pollution hot spots. However, owing to the inherent characteristics of cloud, such as large end-to-end delay and bandwidth constraint, handling the high velocity and large volume of data that are generated by distributed IoT sensors would not be feasible in the longer run. To address these issues, fog computing is a powerful paradigm, where the data are processed and filtered near the end of the IoT nodes and it is useful for improving the quality of service (QoS) of IoT network. To further improve the QoS, a conceptual model of distributed fog computing and a machine learning based data processing and analysis model is proposed for the optimal utilization of cloud resources. The proposed model provides a classification accuracy of 99% while using a Support Vector Machines (SVM) classifier. This model is also simulated in iFogSim toolkit. It affords many advantages, such as reduced load on the central server by locally processing the data and reporting the quality of air. Additionally, it would offer the scalability of the system by integrating more air quality monitoring nodes in the IoT network

    Data Stream Clustering: A Review

    Full text link
    Number of connected devices is steadily increasing and these devices continuously generate data streams. Real-time processing of data streams is arousing interest despite many challenges. Clustering is one of the most suitable methods for real-time data stream processing, because it can be applied with less prior information about the data and it does not need labeled instances. However, data stream clustering differs from traditional clustering in many aspects and it has several challenging issues. Here, we provide information regarding the concepts and common characteristics of data streams, such as concept drift, data structures for data streams, time window models and outlier detection. We comprehensively review recent data stream clustering algorithms and analyze them in terms of the base clustering technique, computational complexity and clustering accuracy. A comparison of these algorithms is given along with still open problems. We indicate popular data stream repositories and datasets, stream processing tools and platforms. Open problems about data stream clustering are also discussed.Comment: Has been accepted for publication in Artificial Intelligence Revie

    Enhanced non-parametric sequence learning scheme for internet of things sensory data in cloud infrastructure

    Get PDF
    The Internet of Things (IoT) Cloud is an emerging technology that enables machine-to-machine, human-to-machine and human-to-human interaction through the Internet. IoT sensor devices tend to generate sensory data known for their dynamic and heterogeneous nature. Hence, it makes it elusive to be managed by the sensor devices due to their limited computation power and storage space. However, the Cloud Infrastructure as a Service (IaaS) leverages the limitations of the IoT devices by making its computation power and storage resources available to execute IoT sensory data. In IoT-Cloud IaaS, resource allocation is the process of distributing optimal resources to execute data request tasks that comprise data filtering operations. Recently, machine learning, non-heuristics, multi-objective and hybrid algorithms have been applied for efficient resource allocation to execute IoT sensory data filtering request tasks in IoT-enabled Cloud IaaS. However, the filtering task is still prone to some challenges. These challenges include global search entrapment of event and error outlier detection as the dimension of the dataset increases in size, the inability of missing data recovery for effective redundant data elimination and local search entrapment that leads to unbalanced workloads on available resources required for task execution. In this thesis, the enhancement of Non-Parametric Sequence Learning (NPSL), Perceptually Important Point (PIP) and Efficient Energy Resource Ranking- Virtual Machine Selection (ERVS) algorithms were proposed. The Non-Parametric Sequence-based Agglomerative Gaussian Mixture Model (NPSAGMM) technique was initially utilized to improve the detection of event and error outliers in the global space as the dimension of the dataset increases in size. Then, Perceptually Important Points K-means-enabled Cosine and Manhattan (PIP-KCM) technique was employed to recover missing data to improve the elimination of duplicate sensed data records. Finally, an Efficient Resource Balance Ranking- based Glow-warm Swarm Optimization (ERBV-GSO) technique was used to resolve the local search entrapment for near-optimal solutions and to reduce workload imbalance on available resources for task execution in the IoT-Cloud IaaS platform. Experiments were carried out using the NetworkX simulator and the results of N-PSAGMM, PIP-KCM and ERBV-GSO techniques with N-PSL, PIP, ERVS and Resource Fragmentation Aware (RF-Aware) algorithms were compared. The experimental results showed that the proposed NPSAGMM, PIP-KCM, and ERBV-GSO techniques produced a tremendous performance improvement rate based on 3.602%/6.74% Precision, 9.724%/8.77% Recall, 5.350%/4.42% Area under Curve for the detection of event and error outliers. Furthermore, the results indicated an improvement rate of 94.273% F1-score, 0.143 Reduction Ratio, and with minimum 0.149% Root Mean Squared Error for redundant data elimination as well as the minimum number of 608 Virtual Machine migrations, 47.62% Resource Utilization and 41.13% load balancing degree for the allocation of desired resources deployed to execute sensory data filtering tasks respectively. Therefore, the proposed techniques have proven to be effective for improving the load balancing of allocating the desired resources to execute efficient outlier (Event and Error) detection and eliminate redundant data records in the IoT-based Cloud IaaS Infrastructure

    Incremental algorithm for Decision Rule generation in data stream contexts

    Get PDF
    Actualmente, la ciencia de datos está ganando mucha atención en diferentes sectores. Concretamente en la industria, muchas aplicaciones pueden ser consideradas. Utilizar técnicas de ciencia de datos en el proceso de toma de decisiones es una de esas aplicaciones que pueden aportar valor a la industria. El incremento de la disponibilidad de los datos y de la aparición de flujos continuos en forma de data streams hace emerger nuevos retos a la hora de trabajar con datos cambiantes. Este trabajo presenta una propuesta innovadora, Incremental Decision Rules Algorithm (IDRA), un algoritmo que, de manera incremental, genera y modifica reglas de decisión para entornos de data stream para incorporar cambios que puedan aparecer a lo largo del tiempo. Este método busca proponer una nueva estructura de reglas que busca mejorar el proceso de toma de decisiones, planteando una base de conocimiento descriptiva y transparente que pueda ser integrada en una herramienta decisional. Esta tesis describe la lógica existente bajo la propuesta de IDRA, en todas sus versiones, y propone una variedad de experimentos para compararlas con un método clásico (CREA) y un método adaptativo (VFDR). Conjuntos de datos reales, juntamente con algunos escenarios simulados con diferentes tipos y ratios de error, se utilizan para comparar estos algoritmos. El estudio prueba que IDRA, específicamente la versión reactiva de IDRA (RIDRA), mejora la precisión de VFDR y CREA en todos los escenarios, tanto reales como simulados, a cambio de un incremento en el tiempo.Nowadays, data science is earning a lot of attention in many different sectors. Specifically in the industry, many applications might be considered. Using data science techniques in the decision-making process is a valuable approach among the mentioned applications. Along with this, the growth of data availability and the appearance of continuous data flows in the form of data stream arise other challenges when dealing with changing data. This work presents a novel proposal of an algorithm, Incremental Decision Rules Algorithm (IDRA), that incrementally generates and modify decision rules for data stream contexts to incorporate the changes that could appear over time. This method aims to propose new rule structures that improve the decision-making process by providing a descriptive and transparent base of knowledge that could be integrated in a decision tool. This work describes the logic underneath IDRA, in all its versions, and proposes a variety of experiments to compare them with a classical method (CREA) and an adaptive method (VFDR). Some real datasets, together with some simulated scenarios with different error types and rates are used to compare these algorithms. The study proved that IDRA, specifically the reactive version of IDRA (RIDRA), improves the accuracies of VFDR and CREA in all the studied scenarios, both real and simulated, in exchange of more time

    Feature Papers "Age-Friendly Cities & Communities: State of the Art and Future Perspectives"

    Get PDF
    The "Age-Friendly Cities & Communities: States of the Art and Future Perspectives" publication presents contemporary, innovative, and insightful narratives, debates, and frameworks based on an international collection of papers from scholars spanning the fields of gerontology, social sciences, architecture, computer science, and gerontechnology. This extensive collection of papers aims to move the narrative and debates forward in this interdisciplinary field of age-friendly cities and communities

    Adaptive Clustering for Dynamic IoT Data Streams

    No full text
    The emergence of the Internet of Things (IoT) has led to the production of huge volumes of real-world streaming data. We need effective techniques to process IoT data streams and to gain insights and actionable information from realworld observations and measurements. Most existing approaches are application or domain dependent. We propose a method which determines how many different clusters can be found in a stream based on the data distribution. After selecting the number of clusters, we use an online clustering mechanism to cluster the incoming data from the streams. Our approach remains adaptive to drifts by adjusting itself as the data changes. We benchmark our approach against state-of-the-art stream clustering algorithms on data streams with data drift. We show how our method can be applied in a use case scenario involving near real-time traffic data. Our results allow to cluster, label and interpret IoT data streams dynamically according to the data distribution. This enables to adaptively process large volumes of dynamic data online based on the current situation. We show how our method adapts itself to the changes. We demonstrate how the number of clusters in a real-world data stream can be determined by analysing the data distributions

    Adaptive Clustering for Dynamic IoT Data Streams

    No full text
    The emergence of the Internet of Things (IoT) has led to the production of huge volumes of real-world streaming data. We need effective techniques to process IoT data streams and to gain insights and actionable information from realworld observations and measurements. Most existing approaches are application or domain dependent. We propose a method which determines how many different clusters can be found in a stream based on the data distribution. After selecting the number of clusters, we use an online clustering mechanism to cluster the incoming data from the streams. Our approach remains adaptive to drifts by adjusting itself as the data changes. We benchmark our approach against state-of-the-art stream clustering algorithms on data streams with data drift. We show how our method can be applied in a use case scenario involving near real-time traffic data. Our results allow to cluster, label and interpret IoT data streams dynamically according to the data distribution. This enables to adaptively process large volumes of dynamic data online based on the current situation. We show how our method adapts itself to the changes. We demonstrate how the number of clusters in a real-world data stream can be determined by analysing the data distributions
    corecore