3 research outputs found

    Human action recognition from multi-sensor stream data by genetic programming

    This paper presents an approach to recognition of human actions such as sitting, standing, walking or running by analysing the data produced by the sensors of a smart phone. The data comes as streams of parallel time series from 21 sensors. We have used genetic programming to evolve detectors for a number of actions and compared the detection accuracy of the evolved detectors with detectors built from the classical machine learning methods including Decision Trees, Naïve Bayes, Nearest Neighbour and Support Vector Machines. The evolved detectors were considerably more accurate. We conclude that the proposed GP method can capture complex interaction of variables in parallel time series without using predefined features.

    Data mining of vehicle telemetry data

    Driving is a safety-critical task that requires a high level of attention and workload from the driver. Despite this, people often perform secondary tasks such as eating or using a mobile phone, which increase workload levels and divert cognitive and physical attention from the primary task of driving. As well as these distractions, the driver may also be overloaded for other reasons, such as dealing with an incident on the road or holding conversations in the car. One solution to this distraction problem is to limit the functionality of in-car devices while the driver is overloaded. This can take the form of withholding an incoming phone call or delaying the display of a non-urgent piece of information about the vehicle. In order to design and build these adaptations in the car, we must first have an understanding of the driver's current level of workload. Traditionally, driver workload has been monitored using physiological sensors or camera systems in the vehicle. However, physiological systems are often intrusive and camera systems can be expensive and are unreliable in poor light conditions. It is important, therefore, to use methods that are non-intrusive, inexpensive and robust, such as sensors already installed on the car and accessible via the Controller Area Network (CAN)-bus. This thesis presents a data mining methodology for this problem, as well as for others in domains with similar types of data, such as human activity monitoring. It focuses on the variable selection stage of the data mining process, where inputs are chosen for models to learn from and make inferences. Selecting inputs from vehicle telemetry data is challenging because there are many irrelevant variables with a high level of redundancy. Furthermore, data in this domain often contains biases because only relatively small amounts can be collected and processed, leading to some variables appearing more relevant to the classification task than they really are.
Over the course of this thesis, a detailed variable selection framework that addresses these issues for telemetry data is developed. A novel blocked permutation method is developed and applied to mitigate biases when selecting variables from potentially biased temporal data. This approach is computationally infeasible when variable redundancies are also considered, and so a novel permutation redundancy measure with similar properties is proposed. Finally, a known redundancy structure between features in telemetry data is used to enhance the feature selection process in two ways. First, the benefits of performing raw signal selection, feature extraction, and feature selection in different orders are investigated. Second, a two-stage variable selection framework is proposed and the two permutation-based methods are combined. Throughout the thesis, it is shown through classification evaluations and inspection of the features that these permutation-based selection methods are appropriate for use in selecting features from CAN-bus data.

    Event and state detection in time series by genetic programming

    Event and state detection in time series has significant value in scientific areas and real-world applications. The aim of detecting time series event and state patterns is to identify particular variations of user interest in one or more channels of time series streams. For example, dangerous driving behaviours such as sudden braking and harsh acceleration can be detected from continuous recordings from inertial sensors. However, the existing methods are highly dependent on domain knowledge such as the size of the time series pattern and a set of effective features. Furthermore, they are not directly suitable for multi-channel time series data. In this study, we establish a genetic programming based method which can perform classification on multi-channel time series data. It does not need the domain knowledge required by the existing methods. The investigation consists of four parts: the methodology, an evaluation on event detection tasks, an evaluation on state detection tasks and an analysis of the suitability for real-world applications. In the methodology, a GP-based method is proposed for processing and analysing multi-channel time series streams. The function set includes basic mathematical operations. In addition, specific functions and terminals are introduced to preserve historical information, capture temporal dependency across time points and handle dependency between channels. These functions and terminals help the GP-based method to automatically find the pattern size and extract features. This study also investigates two different fitness functions: accuracy and area under the curve. The proposed method is investigated on a range of event detection tasks. The investigation starts from synthetic tasks such as detecting complete sine waves. The performance of the GP-based method is compared to traditional classification methods.
On the raw data, the GP-based method achieves 100 percent accuracy, which outperforms all the non-GP methods. The performance of the non-GP methods is comparable to the GP-based method only with suitable features. In addition, the GP-based method is investigated on two complex real-world event detection tasks: dangerous driving behaviour detection and video shot detection. In the task of detecting three dangerous driving behaviours from 21-channel time series data, the GP-based method performs consistently better than the non-GP classifiers even when features are provided. In the video shot detection task, the GP-based method achieves comparable performance on 11200-channel time series to the non-GP classifiers on 28 features. The GP-based method is more accurate than a commercial product. The GP-based method has also been investigated on state detection tasks. This involves synthetic tasks such as detecting concurrent high values in four of five channels and a real-world activity recognition problem. The results also show that the GP-based method consistently outperforms the non-GP methods even in the presence of manually constructed features. As part of the investigation, a mobile phone based activity recognition data set was collected as there was no existing publicly available data set. The suitability of the GP-based method for solving real-world problems is further analysed. Our analysis shows that the GP-based method can be successfully extended for multi-class classification. The analysis of the evolved programs demonstrates that they do capture time series patterns. On synthetic data sets, the injected regularities are revealed in understandable individuals. The best programs for three real-world problems are more difficult to explain but still provide some insight. The selection of relevant channels and data points by the programs is consistent with domain knowledge.
In addition, the analysis shows that the proposed method still performs well for time series patterns of different sizes. The effective window sizes of the evolved GP programs are close to the pattern size. Finally, our study of the execution performance of the evolved programs shows that these programs are fast in execution and are suitable for real-time applications. In summary, the GP-based method is suitable for the kinds of real-world applications studied in this thesis. This thesis concludes that, with a suitable representation, genetic programming can be an effective method for event and state detection in multi-channel time series for a range of synthetic and real-world tasks. This method does not require much domain knowledge such as the pattern size and suitable features. It offers an effective classification method for tasks similar to those studied in this thesis.
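
The abstract above names two fitness functions, accuracy and area under the curve, without giving their exact form. As a rough illustration only, the sketch below scores the raw numeric output of a hypothetical evolved program against binary labels. The function names, the zero threshold and the pairwise AUC formulation are assumptions made for this example and are not taken from the thesis.

```python
from typing import Sequence


def accuracy_fitness(outputs: Sequence[float], labels: Sequence[int],
                     threshold: float = 0.0) -> float:
    """Fraction of time points classified correctly.

    Assumption: an output above `threshold` means "positive" (label 1).
    """
    predictions = [1 if o > threshold else 0 for o in outputs]
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def auc_fitness(outputs: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the ROC curve, computed pairwise.

    Equals the probability that a randomly chosen positive example receives
    a higher output than a randomly chosen negative one (ties count half).
    """
    positives = [o for o, y in zip(outputs, labels) if y == 1]
    negatives = [o for o, y in zip(outputs, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


# Hypothetical program outputs for four time points and their true labels.
program_outputs = [0.2, -0.1, 0.4, -0.3]
true_labels = [0, 1, 1, 0]
print(accuracy_fitness(program_outputs, true_labels))
print(auc_fitness(program_outputs, true_labels))
```

Unlike accuracy, the AUC measure does not depend on a fixed decision threshold, which is one common reason for considering both as fitness measures.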