262 research outputs found

    IoT Data Analytics in Dynamic Environments: From An Automated Machine Learning Perspective

    Full text link
    With the wide spread of sensors and smart devices in recent years, the data generation speed of the Internet of Things (IoT) systems has increased dramatically. In IoT systems, massive volumes of data must be processed, transformed, and analyzed on a frequent basis to enable various IoT services and functionalities. Machine Learning (ML) approaches have shown their capacity for IoT data analytics. However, applying ML models to IoT data analytics tasks still faces many difficulties and challenges, specifically, effective model selection, design/tuning, and updating, which have brought massive demand for experienced data scientists. Additionally, the dynamic nature of IoT data may introduce concept drift issues, causing model performance degradation. To reduce human efforts, Automated Machine Learning (AutoML) has become a popular field that aims to automatically select, construct, tune, and update machine learning models to achieve the best performance on specified tasks. In this paper, we conduct a review of existing methods in the model selection, tuning, and updating procedures in the area of AutoML in order to identify and summarize the optimal solutions for every step of applying ML algorithms to IoT data analytics. To justify our findings and help industrial users and researchers better implement AutoML approaches, a case study of applying AutoML to IoT anomaly detection problems is conducted in this work. Lastly, we discuss and classify the challenges and research directions for this domain.Comment: Published in Engineering Applications of Artificial Intelligence (Elsevier, IF:7.8); Code/An AutoML tutorial is available at Github link: https://github.com/Western-OC2-Lab/AutoML-Implementation-for-Static-and-Dynamic-Data-Analytic

    A Survey on Graph Neural Networks for Time Series: Forecasting, Classification, Imputation, and Anomaly Detection

    Full text link
    Time series are the primary data type used to record dynamic system measurements and generated in great volume by both physical sensors and online processes (virtual sensors). Time series analytics is therefore crucial to unlocking the wealth of information implicit in available data. With the recent advancements in graph neural networks (GNNs), there has been a surge in GNN-based approaches for time series analysis. Approaches can explicitly model inter-temporal and inter-variable relationships, which traditional and other deep neural network-based methods struggle to do. In this survey, we provide a comprehensive review of graph neural networks for time series analysis (GNN4TS), encompassing four fundamental dimensions: Forecasting, classification, anomaly detection, and imputation. Our aim is to guide designers and practitioners to understand, build applications, and advance research of GNN4TS. At first, we provide a comprehensive task-oriented taxonomy of GNN4TS. Then, we present and discuss representative research works and, finally, discuss mainstream applications of GNN4TS. A comprehensive discussion of potential future research directions completes the survey. This survey, for the first time, brings together a vast array of knowledge on GNN-based time series research, highlighting both the foundations, practical applications, and opportunities of graph neural networks for time series analysis.Comment: 27 pages, 6 figures, 5 table

    A Systematic Review of Data Quality in CPS and IoT for Industry 4.0

    Get PDF
    The Internet of Things (IoT) and Cyber-Physical Systems (CPS) are the backbones of Industry 4.0, where data quality is crucial for decision support. Data quality in these systems can deteriorate due to sensor failures or uncertain operating environments. Our objective is to summarize and assess the research efforts that address data quality in data-centric CPS/IoT industrial applications. We systematically review the state-of-the-art data quality techniques for CPS and IoT in Industry 4.0 through a systematic literature review (SLR) study. We pose three research questions, define selection and exclusion criteria for primary studies, and extract and synthesize data from these studies to answer our research questions. Our most significant results are (i) the list of data quality issues, their sources, and application domains, (ii) the best practices and metrics for managing data quality, (iii) the software engineering solutions employed to manage data quality, and (iv) the state of the data quality techniques (data repair, cleaning, and monitoring) in the application domains. The results of our SLR can help researchers obtain an overview of existing data quality issues, techniques, metrics, and best practices. We suggest research directions that require attention from the research community for follow-up work.acceptedVersio

    Personalized data analytics for internet-of-things-based health monitoring

    Get PDF
    The Internet-of-Things (IoT) has great potential to fundamentally alter the delivery of modern healthcare, enabling healthcare solutions outside the limits of conventional clinical settings. It can offer ubiquitous monitoring to at-risk population groups and allow diagnostic care, preventive care, and early intervention in everyday life. These services can have profound impacts on many aspects of health and well-being. However, this field is still at an infancy stage, and the use of IoT-based systems in real-world healthcare applications introduces new challenges. Healthcare applications necessitate satisfactory quality attributes such as reliability and accuracy due to their mission-critical nature, while at the same time, IoT-based systems mostly operate over constrained shared sensing, communication, and computing resources. There is a need to investigate this synergy between the IoT technologies and healthcare applications from a user-centered perspective. Such a study should examine the role and requirements of IoT-based systems in real-world health monitoring applications. Moreover, conventional computing architecture and data analytic approaches introduced for IoT systems are insufficient when used to target health and well-being purposes, as they are unable to overcome the limitations of IoT systems while fulfilling the needs of healthcare applications. This thesis aims to address these issues by proposing an intelligent use of data and computing resources in IoT-based systems, which can lead to a high-level performance and satisfy the stringent requirements. For this purpose, this thesis first delves into the state-of-the-art IoT-enabled healthcare systems proposed for in-home and in-hospital monitoring. The findings are analyzed and categorized into different domains from a user-centered perspective. The selection of home-based applications is focused on the monitoring of the elderly who require more remote care and support compared to other groups of people. In contrast, the hospital-based applications include the role of existing IoT in patient monitoring and hospital management systems. Then, the objectives and requirements of each domain are investigated and discussed. This thesis proposes personalized data analytic approaches to fulfill the requirements and meet the objectives of IoT-based healthcare systems. In this regard, a new computing architecture is introduced, using computing resources in different layers of IoT to provide a high level of availability and accuracy for healthcare services. This architecture allows the hierarchical partitioning of machine learning algorithms in these systems and enables an adaptive system behavior with respect to the user's condition. In addition, personalized data fusion and modeling techniques are presented, exploiting multivariate and longitudinal data in IoT systems to improve the quality attributes of healthcare applications. First, a real-time missing data resilient decision-making technique is proposed for health monitoring systems. The technique tailors various data resources in IoT systems to accurately estimate health decisions despite missing data in the monitoring. Second, a personalized model is presented, enabling variations and event detection in long-term monitoring systems. The model evaluates the sleep quality of users according to their own historical data. Finally, the performance of the computing architecture and the techniques are evaluated in this thesis using two case studies. The first case study consists of real-time arrhythmia detection in electrocardiography signals collected from patients suffering from cardiovascular diseases. The second case study is continuous maternal health monitoring during pregnancy and postpartum. It includes a real human subject trial carried out with twenty pregnant women for seven months

    A Review on the Role of Nano-Communication in Future Healthcare Systems: A Big Data Analytics Perspective

    Get PDF
    This paper presents a first-time review of the open literature focused on the significance of big data generated within nano-sensors and nano-communication networks intended for future healthcare and biomedical applications. It is aimed towards the development of modern smart healthcare systems enabled with P4, i.e. predictive, preventive, personalized and participatory capabilities to perform diagnostics, monitoring, and treatment. The analytical capabilities that can be produced from the substantial amount of data gathered in such networks will aid in exploiting the practical intelligence and learning capabilities that could be further integrated with conventional medical and health data leading to more efficient decision making. We have also proposed a big data analytics framework for gathering intelligence, form the healthcare big data, required by futuristic smart healthcare to address relevant problems and exploit possible opportunities in future applications. Finally, the open challenges, future directions for researchers in the evolving healthcare domain, are presented

    Data analysis and machine learning approaches for time series pre- and post- processing pipelines

    Get PDF
    157 p.En el ámbito industrial, las series temporales suelen generarse de forma continua mediante sensores quecaptan y supervisan constantemente el funcionamiento de las máquinas en tiempo real. Por ello, esimportante que los algoritmos de limpieza admitan un funcionamiento casi en tiempo real. Además, amedida que los datos evolución, la estrategia de limpieza debe cambiar de forma adaptativa eincremental, para evitar tener que empezar el proceso de limpieza desde cero cada vez.El objetivo de esta tesis es comprobar la posibilidad de aplicar flujos de aprendizaje automática a lasetapas de preprocesamiento de datos. Para ello, este trabajo propone métodos capaces de seleccionarestrategias óptimas de preprocesamiento que se entrenan utilizando los datos históricos disponibles,minimizando las funciones de perdida empíricas.En concreto, esta tesis estudia los procesos de compresión de series temporales, unión de variables,imputación de observaciones y generación de modelos subrogados. En cada uno de ellos se persigue laselección y combinación óptima de múltiples estrategias. Este enfoque se define en función de lascaracterísticas de los datos y de las propiedades y limitaciones del sistema definidas por el usuario

    Interconnected Services for Time-Series Data Management in Smart Manufacturing Scenarios

    Get PDF
    xvii, 218 p.The rise of Smart Manufacturing, together with the strategic initiatives carried out worldwide, have promoted its adoption among manufacturers who are increasingly interested in boosting data-driven applications for different purposes, such as product quality control, predictive maintenance of equipment, etc. However, the adoption of these approaches faces diverse technological challenges with regard to the data-related technologies supporting the manufacturing data life-cycle. The main contributions of this dissertation focus on two specific challenges related to the early stages of the manufacturing data life-cycle: an optimized storage of the massive amounts of data captured during the production processes and an efficient pre-processing of them. The first contribution consists in the design and development of a system that facilitates the pre-processing task of the captured time-series data through an automatized approach that helps in the selection of the most adequate pre-processing techniques to apply to each data type. The second contribution is the design and development of a three-level hierarchical architecture for time-series data storage on cloud environments that helps to manage and reduce the required data storage resources (and consequently its associated costs). Moreover, with regard to the later stages, a thirdcontribution is proposed, that leverages advanced data analytics to build an alarm prediction system that allows to conduct a predictive maintenance of equipment by anticipating the activation of different types of alarms that can be produced on a real Smart Manufacturing scenario

    Big data reduction framework for value creation in sustainable enterprises

    No full text
    Value creation is a major sustainability factor for enterprises, in addition to profit maximization and revenue generation. Modern enterprises collect big data from various inbound and outbound data sources. The inbound data sources handle data generated from the results of business operations, such as manufacturing, supply chain management, marketing, and human resource management, among others. Outbound data sources handle customer-generated data which are acquired directly or indirectly from customers, market analysis, surveys, product reviews, and transactional histories. However, cloud service utilization costs increase because of big data analytics and value creation activities for enterprises and customers. This article presents a novel concept of big data reduction at the customer end in which early data reduction operations are performed to achieve multiple objectives, such as a) lowering the service utilization cost, b) enhancing the trust between customers and enterprises, c) preserving privacy of customers, d) enabling secure data sharing, and e) delegating data sharing control to customers. We also propose a framework for early data reduction at customer end and present a business model for end-to-end data reduction in enterprise applications. The article further presents a business model canvas and maps the future application areas with its nine components. Finally, the article discusses the technology adoption challenges for value creation through big data reduction in enterprise applications

    Optimized and Automated Machine Learning Techniques Towards IoT Data Analytics and Cybersecurity

    Get PDF
    The Internet-of-Things (IoT) systems have emerged as a prevalent technology in our daily lives. With the wide spread of sensors and smart devices in recent years, the data generation volume and speed of IoT systems have increased dramatically. In most IoT systems, massive volumes of data must be processed, transformed, and analyzed on a frequent basis to enable various IoT services and functionalities. Machine Learning (ML) approaches have shown their capacity for IoT data analytics. However, applying ML models to IoT data analytics tasks still faces many difficulties and challenges. The first challenge is to process large amounts of dynamic IoT data to make accurate and informed decisions. The second challenge is to automate and optimize the data analytics process. The third challenge is to protect IoT devices and systems against various cyber threats and attacks. To address the IoT data analytics challenges, this thesis proposes various ML-based frameworks and data analytics approaches in several applications. Specifically, the first part of the thesis provides a comprehensive review of applying Automated Machine Learning (AutoML) techniques to IoT data analytics tasks. It discusses all procedures of the general ML pipeline. The second part of the thesis proposes several supervised ML-based novel Intrusion Detection Systems (IDSs) to improve the security of the Internet of Vehicles (IoV) systems and connected vehicles. Optimization techniques are used to obtain optimized ML models with high attack detection accuracy. The third part of the thesis developed unsupervised ML algorithms to identify network anomalies and malicious network entities (e.g., attacker IPs, compromised machines, and polluted files/content) to protect Content Delivery Networks (CDNs) from service targeting attacks, including distributed denial of service and cache pollution attacks. The proposed framework is evaluated on real-world CDN access log data to illustrate its effectiveness. The fourth part of the thesis proposes adaptive online learning algorithms for addressing concept drift issues (i.e., data distribution changes) and effectively handling dynamic IoT data streams in order to provide reliable IoT services. The development of drift adaptive learning methods can effectively adapt to data distribution changes and avoid data analytics model performance degradation
    corecore