
    Clustering, prediction and ordinal classification for time series using machine learning techniques: applications

    In recent years, a growing number of fields have improved their standard processes by using machine learning (ML) techniques. The main reason is that the vast amount of data generated by these processes is difficult for humans to process. Therefore, the development of automatic methods to process and extract relevant information from these data is of great necessity, given that such approaches could increase the economic benefit of enterprises or reduce the workload of some current jobs. Specifically, in this Thesis, ML approaches are applied to problems concerning time series data. A time series is a special kind of data in which the data points are collected chronologically. Time series are present in a wide variety of fields, such as atmospheric events or engineering applications. Moreover, depending on the main objective to be satisfied, different tasks are applied to time series in the literature. This Thesis mainly focuses on some of them: clustering, classification, prediction and, in general, analysis. Generally, the amount of data to be processed is huge, giving rise to the need for methods able to reduce the dimensionality of time series without losing information. In this sense, applying time series segmentation procedures that divide the time series into different subsequences is a good option, given that each segment defines a specific behaviour. Once the different segments are obtained, characterising them with statistical features is an excellent way to preserve the information of the time series while considerably reducing their dimensionality. In time series clustering, the objective is to find groups of similar time series with the idea of discovering interesting patterns in time series datasets. In this Thesis, we have developed a novel time series clustering technique. The aim of this proposal is twofold: to reduce the dimensionality as much as possible and to develop a time series clustering approach able to outperform current state-of-the-art techniques. For the first objective, the time series are segmented in order to identify different behaviours. Then, these segments are projected onto a vector of statistical features, aiming to reduce the dimensionality of the time series. Once this preprocessing step is done, the clustering of the time series is carried out with a significantly lower computational load. This novel approach has been tested on all the time series datasets available in the University of East Anglia and University of California Riverside (UEA/UCR) time series classification (TSC) repository. Regarding time series classification, two main paths can be differentiated. Firstly, nominal TSC is a well-known field involving a wide variety of proposals and transformations applied to time series. Concretely, one of the most popular transformations is the shapelet transform (ST), which has been widely used in this field. The original method extracts shapelets from the original time series and uses them for classification purposes. Nevertheless, the full enumeration of all possible shapelets is very time-consuming. Therefore, in this Thesis, we have developed a hybrid method that starts with the best shapelets extracted by the original approach under a time constraint and then tunes these shapelets by means of a convolutional neural network (CNN) model.
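    As a hedged illustration of the segment-then-featurize idea described above, the following Python sketch splits each series into segments, summarises every segment with a few statistics, and clusters the resulting short vectors. The equal-length segmentation, the choice of statistics, and the k-means step are illustrative assumptions, not the thesis's actual algorithm.

```python
# A minimal sketch, assuming equal-length segments stand in for the
# behaviour-based segmentation described in the abstract.
import numpy as np
from scipy.stats import skew, kurtosis
from sklearn.cluster import KMeans

def segment_features(series: np.ndarray, n_segments: int = 10) -> np.ndarray:
    """Split a series into segments and describe each with a few statistics,
    shrinking the series to a short feature vector."""
    feats = []
    for seg in np.array_split(series, n_segments):
        feats.extend([seg.mean(), seg.std(), skew(seg), kurtosis(seg)])
    return np.asarray(feats)

# Toy dataset: 30 random-walk series of length 500.
rng = np.random.default_rng(0)
dataset = rng.standard_normal((30, 500)).cumsum(axis=1)

# Each length-500 series becomes a vector of 10 * 4 = 40 features.
X = np.vstack([segment_features(s) for s in dataset])
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)
print(labels)
```

    Clustering then operates on the 40-dimensional feature vectors rather than the raw 500-point series, which is where the reduction in computational load comes from.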
    Secondly, time series ordinal classification (TSOC) is an unexplored field whose study begins with this Thesis. Accordingly, we have adapted the original ST to the ordinal classification (OC) paradigm by proposing several shapelet quality measures that take advantage of the ordinal information of the time series. This methodology leads to better results than state-of-the-art TSC techniques on ordinal time series datasets. All these proposals have been tested on all the time series datasets available in the UEA/UCR TSC repository. With respect to time series prediction, the goal is to estimate the next value or values of the time series by considering the previous ones. In this Thesis, several different approaches have been considered depending on the problem to be solved. Firstly, the prediction of low-visibility events produced by fog conditions is carried out by means of hybrid autoregressive models (ARs) combining fixed-size and dynamic windows, which adapt to the dynamics of the time series. Secondly, the prediction of convective cloud formation (a highly imbalanced problem, given that the number of convective cloud events is much lower than that of non-convective situations) is performed in two completely different ways: 1) tackling the problem as a multi-objective classification task using multi-objective evolutionary artificial neural networks (MOEANNs), in which the two conflicting objectives are the accuracy of the minority class and the global accuracy, and 2) tackling the problem from the OC point of view, in which, in order to reduce the degree of imbalance, an oversampling approach is proposed along with the use of OC techniques. Thirdly, the prediction of solar radiation is carried out by means of evolutionary artificial neural networks (EANNs) with different combinations of basis functions in the hidden and output layers. Finally, the last challenging problem is the prediction of energy flux from waves and tides. For this, a multitask EANN has been proposed, aiming to predict the energy flux at several prediction time horizons (from 6h to 48h). All these proposals and techniques have been corroborated and discussed according to physical and atmospheric models. The work developed in this Thesis is supported by 11 JCR-indexed papers in international journals (7 Q1, 3 Q2, 1 Q3), 11 papers in international conferences, and 4 papers in national conferences.
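    The prediction task described above, estimating the next value from the previous ones, reduces to supervised learning over lagged windows. The sketch below shows that reduction with a fixed-size window and a small neural regressor; the window length, the synthetic series, and the MLP are illustrative assumptions, not the thesis's hybrid AR or EANN models.

```python
# A minimal sketch of window-based time series prediction: the model
# estimates the next value from the previous w observations.
import numpy as np
from sklearn.neural_network import MLPRegressor

def make_windows(series: np.ndarray, w: int = 24):
    """Turn a series into (lagged inputs, next value) training pairs."""
    X = np.stack([series[i:i + w] for i in range(len(series) - w)])
    y = series[w:]
    return X, y

rng = np.random.default_rng(1)
series = np.sin(np.linspace(0, 60, 2000)) + 0.1 * rng.standard_normal(2000)

X, y = make_windows(series, w=24)
model = MLPRegressor(hidden_layer_sizes=(32,), max_iter=500, random_state=0)
model.fit(X[:-200], y[:-200])            # train on the past...
print(model.score(X[-200:], y[-200:]))   # ...evaluate on the most recent points
```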

    Sensor-Based Locomotion Data Mining for Supporting the Diagnosis of Neurodegenerative Disorders: A Survey

    Locomotion characteristics and movement patterns are reliable indicators of neurodegenerative diseases (NDDs). This survey provides a systematic literature review of locomotion data mining systems for supporting NDD diagnosis. We discuss techniques for discovering low-level locomotion indicators, sensor data acquisition and processing methods, and NDD detection algorithms. The survey presents a comprehensive discussion of the main challenges in this active area, including the addressed diseases, locomotion data types, duration of monitoring, employed algorithms, and experimental validation strategies. We also identify prominent open challenges and research directions regarding ethics and privacy issues, technological and usability aspects, and the availability of public benchmarks.

    Automatic Environmental Sound Recognition: Performance versus Computational Cost

    In the context of the Internet of Things (IoT), sound sensing applications are required to run on embedded platforms where notions of product pricing and form factor impose hard constraints on the available computing power. Whereas Automatic Environmental Sound Recognition (AESR) algorithms are most often developed with limited consideration for computational cost, this article seeks the AESR algorithm that can make the most of a limited amount of computing power by comparing sound classification performance as a function of computational cost. Results suggest that Deep Neural Networks yield the best classification accuracy across a range of computational costs, while Gaussian Mixture Models offer reasonable accuracy at a consistently small cost, and Support Vector Machines stand between the two in terms of the compromise between accuracy and computational cost.
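    A hedged sketch of this kind of accuracy-versus-cost comparison follows, timing the three model families named above on the same synthetic features. The data, model settings, and timing scheme are illustrative assumptions, not the paper's experimental setup.

```python
# A minimal sketch: fit each classifier family, record accuracy and
# wall-clock cost on identical data.
import time
import numpy as np
from sklearn.datasets import make_classification
from sklearn.mixture import GaussianMixture
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC

X, y = make_classification(n_samples=2000, n_features=40, n_classes=4,
                           n_informative=20, random_state=0)
Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)

# Discriminative models: fit then score, timing both steps together.
for name, clf in [("SVM", SVC()),
                  ("DNN", MLPClassifier(hidden_layer_sizes=(64, 64),
                                        max_iter=300, random_state=0))]:
    t0 = time.perf_counter()
    clf.fit(Xtr, ytr)
    acc = clf.score(Xte, yte)
    print(f"{name}: accuracy={acc:.3f}, cost={time.perf_counter() - t0:.2f}s")

# GMMs are generative: fit one mixture per class, classify by max likelihood.
t0 = time.perf_counter()
classes = np.unique(ytr)
gmms = [GaussianMixture(n_components=2, random_state=0).fit(Xtr[ytr == c])
        for c in classes]
loglik = np.column_stack([g.score_samples(Xte) for g in gmms])
acc = (classes[loglik.argmax(axis=1)] == yte).mean()
print(f"GMM: accuracy={acc:.3f}, cost={time.perf_counter() - t0:.2f}s")
```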

    Automated Intelligent Cueing Device to Improve Ambient Gait Behaviors for Patients with Parkinson's Disease

    Freezing of gait (FoG) is a common motor dysfunction in individuals with Parkinson's disease (PD). FoG impairs walking and is associated with increased fall risk. Although pharmacological treatments have shown promise during ON-medication periods, FoG remains difficult to treat during the medication OFF state and in advanced stages of the disease. External cueing therapy, in visual, auditory, and vibrotactile forms, has been effective in treating gait deviations. Intelligent (or on-demand) cueing devices are novel systems that analyze gait patterns in real time and activate cues only at moments when specific gait alterations are detected. In this study, we developed methods to analyze gait signals collected through wearable sensors and accurately identify FoG episodes. We also investigated the potential of predicting the symptoms before their actual occurrence. We collected data from seven participants with PD using two Inertial Measurement Units (IMUs) on the ankles. In our first study, we extracted engineered features from the signals and used machine learning (ML) methods to identify FoG episodes. We tested the performance of the models using patient-dependent and patient-independent paradigms. The former models achieved 92.5% and 89.0% average sensitivity and specificity, respectively. However, conventional binary classification methods fail to accurately classify data if only data from normal gait periods are available. In order to identify FoG episodes in participants who did not freeze during data collection sessions, we developed a Deep Gait Anomaly Detector (DGAD) to identify anomalies (i.e., FoG) in the signals. The DGAD was formed of convolutional layers and trained to automatically learn features from the signals. The convolutional layers are followed by fully connected layers that reduce the dimensions of the features. A k-nearest neighbors (kNN) classifier is then used to classify the data as normal or FoG. The models identified 87.4% of FoG onsets, with 21.9% being predicted on average for each participant. This study demonstrates our algorithm's potential for the delivery of preventive cues. The DGAD algorithm was then implemented in an Android application to monitor the gait patterns of PD patients in ambient environments. The phone triggered vibrotactile and auditory cues on a connected smartwatch if an FoG episode was identified. A 6-week in-home study showed the potential for effective treatment of FoG severity in ambient environments using intelligent cueing devices.
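    The first stage described above, sliding windows over IMU signals with engineered features feeding a supervised classifier, is sketched below under stated assumptions: the window length, the particular statistics, the random stand-in signal and labels, and the random forest are all illustrative, not the study's pipeline (the DGAD combines learned convolutional features with a kNN classifier).

```python
# A minimal sketch of windowed feature extraction from one IMU channel
# followed by a supervised classifier.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def window_features(signal: np.ndarray, win: int = 128, hop: int = 64):
    """Slide a window over the signal and compute simple per-window features."""
    feats = []
    for start in range(0, len(signal) - win + 1, hop):
        w = signal[start:start + win]
        fft_mag = np.abs(np.fft.rfft(w))
        feats.append([w.mean(), w.std(), np.ptp(w),
                      fft_mag[1:].argmax()])  # dominant non-DC frequency bin
    return np.asarray(feats)

rng = np.random.default_rng(2)
signal = rng.standard_normal(10_000)   # stand-in for one ankle IMU channel
X = window_features(signal)
y = rng.integers(0, 2, len(X))         # stand-in FoG / normal-gait labels

clf = RandomForestClassifier(random_state=0).fit(X, y)
print(clf.predict(X[:5]))
```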

    Machine Learning for Internet of Things Data Analysis: A Survey

    Rapid developments in hardware, software, and communication technologies have facilitated the emergence of Internet-connected sensory devices that provide observations and data measurements from the physical world. By 2020, it is estimated that the total number of Internet-connected devices in use will be between 25 and 50 billion. As these numbers grow and technologies become more mature, the volume of data being published will increase. The technology of Internet-connected devices, referred to as the Internet of Things (IoT), continues to extend the current Internet by providing connectivity and interactions between the physical and cyber worlds. In addition to an increased volume, the IoT generates big data characterized by its velocity in terms of time and location dependency, with a variety of multiple modalities and varying data quality. Intelligent processing and analysis of this big data are the key to developing smart IoT applications. This article assesses the various machine learning methods that deal with the challenges presented by IoT data by considering smart cities as the main use case. The key contribution of this study is the presentation of a taxonomy of machine learning algorithms explaining how different techniques are applied to the data in order to extract higher-level information. The potential and challenges of machine learning for IoT data analytics are also discussed. A use case of applying a Support Vector Machine (SVM) to Aarhus smart city traffic data is presented for a more detailed exploration.
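    To make the SVM use case concrete, here is a hedged sketch that classifies traffic conditions from simple sensor-style features. The feature columns, the labeling rule, and the synthetic data are illustrative assumptions, not the Aarhus dataset or the article's experiment.

```python
# A minimal sketch: an RBF-kernel SVM on scaled traffic-style features.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(3)
# Columns: vehicle count, average speed (km/h), occupancy (%).
X = np.column_stack([rng.poisson(30, 500),
                     rng.normal(50, 15, 500),
                     rng.uniform(0, 100, 500)])
# Hypothetical label: congested when counts are high and speeds are low.
y = ((X[:, 0] > 30) & (X[:, 1] < 45)).astype(int)

model = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
model.fit(X, y)
print(model.predict([[45, 30.0, 80.0]]))  # high count, low speed: likely congested
```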

    Data Mining for Fog Prediction and Low Clouds Detection

    This paper describes our contribution to research on parametrized models and methods for the detection and prediction of significant meteorological phenomena, especially fog and low cloud cover. The project covered methods for integrating the distributed meteorological data necessary for running the prediction models, for training the models, and for mining the data in order to be able to predict even sparsely occurring phenomena quickly and efficiently. The detection and prediction methods are based on knowledge discovery -- data mining of meteorological data using neural networks and decision trees. The mined data were mainly METAR aerodrome messages, meteorological data from specialized stations, and cloud data from special airport sensors -- laser ceilometers.
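    A hedged sketch of the decision-tree side of such an approach follows: predicting a rare fog event from station-style measurements. The feature names, the labeling rule, the synthetic data, and the class weighting are illustrative assumptions, not the project's models or data.

```python
# A minimal sketch: a shallow decision tree with class weighting to cope
# with a sparsely occurring phenomenon (fog).
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(4)
n = 5000
# Columns: temperature-dewpoint spread (°C), relative humidity (%), wind speed (m/s).
X = np.column_stack([rng.gamma(2.0, 2.0, n),
                     rng.uniform(40, 100, n),
                     rng.gamma(2.0, 1.5, n)])
# Hypothetical rule: fog when the spread is tiny, humidity high, wind calm.
y = ((X[:, 0] < 1.5) & (X[:, 1] > 90) & (X[:, 2] < 2)).astype(int)

# class_weight="balanced" upweights the rare fog class during training.
tree = DecisionTreeClassifier(max_depth=4, class_weight="balanced",
                              random_state=0).fit(X, y)
print(f"fog base rate: {y.mean():.3%}")
print(tree.predict([[0.5, 97.0, 1.0]]))  # conditions favouring fog
```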

    Mobile Health in Remote Patient Monitoring for Chronic Diseases: Principles, Trends, and Challenges

    Chronic diseases are becoming more widespread. Treatment and monitoring of these diseases require frequent hospital visits, which increases the burden on both hospitals and patients. Presently, advancements in wearable sensors and communication protocols are enriching the healthcare system in a way that will reshape healthcare services in the near future. Remote patient monitoring (RPM) is the foremost of these advancements. RPM systems are based on the collection of patient vital signs, extracted using invasive and noninvasive techniques, which are then sent to physicians in real time. These data may help physicians make the right decision at the right time. The main objective of this paper is to outline research directions on remote patient monitoring, explain the role of AI in building RPM systems, and provide an overview of the state of the art of RPM, its advantages, its challenges, and its probable future directions. For studying the literature, five databases were chosen (ScienceDirect, IEEE Xplore, Springer, PubMed, and science.gov). We followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA), a standard methodology for systematic reviews and meta-analyses. A total of 56 articles were reviewed based on a combination of selected search terms including RPM, data mining, clinical decision support system, electronic health record, cloud computing, internet of things, and wireless body area network. The results of this study confirm the effectiveness of RPM in improving healthcare delivery, increasing diagnosis speed, and reducing costs. To this end, we also present a chronic disease monitoring system as a case study to provide enhanced solutions for RPMs. This research work was partially supported by the Sejong University Research Faculty Program (2021-2023).
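    As a hedged illustration of the RPM data flow described above, the sketch below models a single wearable reading being screened against clinical thresholds before being flagged to a physician. The dataclass, the threshold values, and the alert format are illustrative assumptions, not a specific system from the survey.

```python
# A minimal sketch: screen one vital-signs reading against fixed thresholds.
from dataclasses import dataclass

@dataclass
class VitalReading:
    patient_id: str
    heart_rate: int    # beats per minute
    spo2: float        # blood oxygen saturation, %
    systolic_bp: int   # mmHg

def screen(reading: VitalReading) -> list[str]:
    """Return alerts for out-of-range vitals; an empty list means all clear."""
    alerts = []
    if not 40 <= reading.heart_rate <= 130:
        alerts.append(f"heart rate {reading.heart_rate} bpm out of range")
    if reading.spo2 < 92.0:
        alerts.append(f"SpO2 {reading.spo2}% below threshold")
    if reading.systolic_bp > 180:
        alerts.append(f"systolic BP {reading.systolic_bp} mmHg critical")
    return alerts

print(screen(VitalReading("patient-001", heart_rate=142, spo2=90.5,
                          systolic_bp=120)))
```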