
    Machine Learning for Synthetic Data Generation: A Review

    Data plays a crucial role in machine learning. However, real-world data pose several problems: data may be of low quality; a limited number of data points leads to under-fitting of the machine learning model; and data can be hard to access because of privacy, safety, and regulatory concerns. Synthetic data generation offers a promising new avenue, as synthetic data can be shared and used in ways that real-world data cannot. This paper systematically reviews existing works that leverage machine learning models for synthetic data generation. Specifically, we discuss synthetic data generation from several perspectives: (i) applications, including computer vision, speech, natural language, healthcare, and business; (ii) machine learning methods, particularly neural network architectures and deep generative models; (iii) privacy and fairness issues. In addition, we identify the challenges and opportunities in this emerging field and suggest future research directions.

    End-to-end anomaly detection in stream data

    Nowadays, huge volumes of data are generated with increasing velocity through various systems, applications, and activities. This increases the demand for stream and time series analysis that reacts to changing conditions in real time, for enhanced efficiency and quality of service delivery as well as improved safety and security in the private and public sectors. Despite its very rich history, time series anomaly detection is still one of the vital topics in machine learning research and is receiving increasing attention. Identifying hidden patterns and selecting a model that fits the observed data well and also carries over to unobserved data is not a trivial task. Due to the increasing diversity of data sources and associated stochastic processes, this pivotal data analysis topic is loaded with challenges such as complex latent patterns, concept drift, and overfitting, which may mislead the model and cause a high false alarm rate. Handling these challenges leads advanced anomaly detection methods to develop sophisticated decision logic, which turns them into inexplicable black boxes. Contrary to this trend, end-users expect transparency and verifiability before they trust a model and the outcomes it produces. Also, pointing users to the most anomalous or malicious areas of a time series and to the causal features could save them time, energy, and money. For these reasons, this thesis addresses the crucial challenges in an end-to-end pipeline of stream-based anomaly detection through the three essential phases of behavior prediction, inference, and interpretation. The first step focuses on devising a time series model that achieves high average accuracy as well as small error deviation. On this basis, we propose higher-quality anomaly detection and scoring techniques that use the related contexts to reclassify the observations and post-prune unjustified events. Last but not least, we make the predictive process transparent and verifiable by providing meaningful reasoning behind its generated results, based on concepts understandable to a human. The provided insight can pinpoint the anomalous regions of a time series and explain why the current status of a system has been flagged as anomalous. Stream-based anomaly detection research is a principal area of innovation to support our economy, security, and even the safety and health of societies worldwide. We believe our proposed analysis techniques can contribute to building a situational awareness platform and open new perspectives in a variety of domains such as cybersecurity and health.
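
    The prediction and scoring phases described in this abstract can be illustrated with a minimal sketch. It assumes a moving-average predictor and a fixed threshold on the normalized prediction error; both are stand-ins chosen for brevity and are not the models proposed in the thesis. The function name, window size, threshold, and sample stream are all hypothetical.

```python
from statistics import mean, stdev


def score_stream(values, window=5, threshold=3.0):
    """Return (index, value, score, is_anomaly) tuples for a numeric stream.

    Assumption: the "prediction" is just the mean of the last `window`
    observations, and the score is the absolute prediction error divided
    by the standard deviation of that same window.
    """
    results = []
    for t in range(window, len(values)):
        history = values[t - window:t]
        predicted = mean(history)
        # Guard against a zero spread on a perfectly flat window.
        spread = stdev(history) or 1e-9
        score = abs(values[t] - predicted) / spread
        results.append((t, values[t], score, score > threshold))
    return results


# Illustrative stream with one injected spike at index 7.
stream = [10, 11, 10, 12, 11, 10, 11, 30, 11, 10]
flagged = [t for t, _, _, is_anomaly in score_stream(stream) if is_anomaly]
print(flagged)
```

    Note that the spike inflates the spread of the windows that follow it, so the points right after it score low; this kind of context effect is one reason a scoring stage may need to reclassify observations after the fact.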

    Analyzing Daily Behavioral Data for Personalized Health Management

    Emerging wearable and environmental sensor technologies provide health professionals with unprecedented capacity to continuously collect human behavior data for health monitoring and management. This enables new solutions to mitigate globally emerging health problems such as obesity. With such an outburst of dynamic sensor data, it is critical to develop appropriate mathematical models and computational analytic methods that translate the collected data into an accurate characterization of the underlying health dynamics, enabling more reliable personalized monitoring, prediction, and intervention of health status changes. However, several challenges arise in translating the data effectively into personalized activity plans. Besides the common analytic challenges that come from the missing values and outliers often seen in sensor behavior data, modeling complex health dynamics that may be influenced by daily human behaviors also poses significant challenges. We address these challenges as follows. We first explore existing missing value imputation and outlier detection preprocessing methods. We compare these methods with a recently developed dynamic system learning method (SSMO) that learns a personalized behavior model from real-world sensor data while simultaneously estimating missing values and detecting outliers. We then focus on modeling heterogeneous dynamics to better capture health status changes under different conditions, which may lead to more effective state-dependent intervention strategies. We implement switching-state dynamic models with different complexity levels on real-world daily behavior data. Finally, we conduct evaluation experiments on these models to demonstrate the importance of modeling the dynamic heterogeneity, and of simultaneously conducting missing value imputation and outlier detection, in achieving better prediction of health status changes.

    Energy Disaggregation Using Elastic Matching Algorithms

    © 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/). In this article, an energy disaggregation architecture using elastic matching algorithms is presented. The architecture uses a database of reference energy consumption signatures and compares them with incoming energy consumption frames using template matching. In contrast to machine learning-based approaches, which require a significant amount of data to train a model, elastic matching-based approaches have no model training process and perform recognition using template matching. Five different elastic matching algorithms were evaluated across different datasets, and the experimental results showed that the minimum variance matching algorithm outperforms all other evaluated matching algorithms. The best-performing minimum variance matching algorithm improved the energy disaggregation accuracy by 2.7% compared to the baseline dynamic time warping algorithm. Peer reviewed. Final published version.
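
    The template-matching idea in this abstract can be sketched with the baseline it names, dynamic time warping. The snippet below is a minimal illustration under stated assumptions, not the article's architecture: it uses textbook DTW with an absolute-difference cost, does not implement minimum variance matching, and the appliance labels, signatures, and incoming frame are hypothetical values.

```python
from math import inf


def dtw_distance(a, b):
    """Classic dynamic time warping distance between two numeric sequences."""
    n, m = len(a), len(b)
    # cost[i][j] holds the minimal accumulated cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def match_frame(frame, references):
    """Return the label of the reference signature closest to the frame."""
    return min(references, key=lambda label: dtw_distance(frame, references[label]))


# Hypothetical reference signatures (power in watts), for illustration only.
references = {
    "kettle": [0, 2000, 2000, 2000, 0],
    "fridge": [0, 100, 100, 100, 100, 0],
}
frame = [0, 0, 1950, 2050, 2000, 0]
print(match_frame(frame, references))
```

    No training step appears anywhere in the sketch: recognition is a nearest-template lookup, so adding an appliance only means adding a reference signature.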

    Review of constraints on vision-based gesture recognition for human–computer interaction

    The ability of computers to recognise hand gestures visually is essential for progress in human-computer interaction. Gesture recognition has applications ranging from sign language to medical assistance to virtual reality. However, gesture recognition is extremely challenging, not only because of its diverse contexts, multiple interpretations, and spatio-temporal variations but also because of the complex non-rigid properties of the hand. This study surveys major constraints on vision-based gesture recognition that occur in detection and pre-processing, representation and feature extraction, and recognition. Current challenges are explored in detail.

    Visitor-art interaction by motion path detection

    This paper describes a method for video-based motion path detection that is applied in the creation of an interactive artwork. The proposed algorithm, based on the Hough transform, detects parametric motion trajectories in real time (10 fps). To detect people's motion when they are occluded by non-static background objects, we have also developed a video segmentation technique. The proposed interaction system adopts a top-down camera view to extract spatiotemporal motion trajectories and discern predefined patterns of movement, thus enabling the creation of new artistic choreographies. We present test results that illustrate the effectiveness of our method and discuss the practical applicability of our approach in other domains.
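
    As a rough illustration of Hough-based trajectory detection, the sketch below votes for straight-line paths through a set of tracked positions. It assumes the simplest parametric family, a line in normal form rho = x cos(theta) + y sin(theta); the paper's actual trajectory parameterization, segmentation step, and real-time implementation are not reproduced here, and the function name, bin sizes, and sample track are hypothetical.

```python
import math
from collections import Counter


def hough_line(points, n_theta=180, rho_step=1.0):
    """Vote in (theta, rho) space and return (theta_degrees, rho, votes)
    for the most supported straight line through the given points."""
    votes = Counter()
    for x, y in points:
        for k in range(n_theta):
            theta = math.pi * k / n_theta
            rho = x * math.cos(theta) + y * math.sin(theta)
            # Quantize rho so nearby lines fall into the same accumulator bin.
            votes[(k, round(rho / rho_step))] += 1
    (k, r), count = votes.most_common(1)[0]
    return 180.0 * k / n_theta, r * rho_step, count


# Hypothetical top-down track: a walker moving along y = 5, plus two stray detections.
track = [(10 * i, 5) for i in range(10)] + [(3, 40), (70, 22)]
theta_deg, rho, count = hough_line(track)
print(theta_deg, rho, count)
```

    The accumulator peak is insensitive to the two stray detections, which is the property that makes voting schemes attractive when parts of a trajectory are missing or corrupted by occlusion.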

    Advances in Robotics, Automation and Control

    The book presents an overview of recent developments in the different areas of Robotics, Automation and Control. Through its 24 chapters, this book presents topics related to control and robot design; it also introduces new mathematical tools and techniques devoted to improving system modeling and control. An important point is the use of rational agents and heuristic techniques to cope with the computational complexity required for controlling complex systems. The book also covers navigation and vision algorithms, automatic handwritten comprehension, and speech recognition systems that will be included in the next generation of productive systems developed by man.