7 research outputs found

    MLGaze: Machine Learning-Based Analysis of Gaze Error Patterns in Consumer Eye Tracking Systems

    Analyzing the gaze accuracy characteristics of an eye tracker is a critical task because its gaze data are frequently affected by non-ideal operating conditions in various consumer eye tracking applications. In this study, gaze error patterns produced by a commercial eye tracking device were studied with the help of machine learning algorithms, such as classifiers and regression models. Gaze data were collected from a group of participants under multiple conditions that commonly affect eye trackers operating on desktop and handheld platforms. These conditions (referred to here as error sources) include user distance, head pose, and eye-tracker pose variations, and the collected gaze data were used to train the classifier and regression models. While the impact of the different error sources on gaze data characteristics was nearly impossible to distinguish by visual inspection or from data statistics, machine learning models were successful in identifying the impact of the different error sources and predicting the variability in gaze error levels due to these conditions. The objective of this study was to investigate the efficacy of machine learning methods for the detection and prediction of gaze error patterns, which would enable an in-depth understanding of the data quality and reliability of eye trackers under unconstrained operating conditions. Coding resources for all the machine learning methods adopted in this study were included in an open repository named MLGaze to allow researchers to replicate the principles presented here using data from their own eye trackers. Comment: https://github.com/anuradhakar49/MLGaz

    Machine Learning Applications for Dynamic Security Assessment in presence of Renewable Generation and Load Induced Variability

    Large-scale blackouts that have occurred across North America in the past few decades have paved the way for a substantial amount of research in the field of security assessment of the grid. With the aid of advanced technology such as phasor measurement units (PMUs), considerable work has been done involving voltage stability analysis and power system dynamic behavior analysis to ensure security and reliability of the grid. Online dynamic security assessment (DSA) analysis has been developed and applied in several power system control centers. Existing applications of DSA are limited by the assumption of simplistic load profiles, which often take a normative day to represent an entire year. To overcome these challenges, this research developed a novel DSA scheme to provide security prediction in real-time for load profiles corresponding to different seasons. The major contributions of this research are to (1) develop a DSA scheme incorporated with PMU data, (2) consider a comprehensive seasonal load profile, (3) account for varying penetrations of renewable generation, and (4) compare the accuracy of different machine learning (ML) algorithms for DSA. The ML algorithms that will be the focus of this study include decision trees (DTs), support vector machines (SVMs), random forests (RFs), and multilayer neural networks (MLNNs). This thesis describes the development of a novel DSA scheme using synchrophasor measurements that accounts for the load variability occurring across different seasons in a year. Different amounts of solar generation have also been incorporated in this study to account for the increasing percentage of renewables in the modern grid. To assess the security of the operating conditions, different ML algorithms have been trained and tested.
A database of cases for different operating conditions has been developed offline that contains secure as well as insecure cases, and the ML models have been trained to classify the security or insecurity of a particular operating condition in real-time. Multiple scenarios are generated every 15 minutes for different seasons and stored in the database. The performance of this approach is tested on the IEEE-118 bus system. Dissertation/Thesis: Masters Thesis, Electrical Engineering 201

    Efficient Group K Nearest-Neighbor Spatial Query Processing in Apache Spark

    The design and implementation of new distributed spatial query algorithms is a current challenge for spatial query processing in distributed computing systems. Apache Spark is a memory-based framework suitable for real-time and batch processing. Spark-based systems allow users to work on distributed in-memory data, without worrying about the data distribution mechanism and fault-tolerance. Given two datasets of points (called Query and Training), the group K nearest-neighbor (GKNN) query retrieves the K points of Training with the smallest sum of distances to every point of Query. This spatial query has been actively studied in centralized environments, and several performance-improving techniques and pruning heuristics have also been proposed; a distributed algorithm in Apache Hadoop was recently proposed by our team. Since, in general, Apache Hadoop exhibits lower performance than Spark, in this paper, we present the first distributed GKNN query algorithm in Apache Spark and compare it against the one in Apache Hadoop. This algorithm incorporates programming features and facilities that are specific to Apache Spark. Moreover, techniques that improve performance and are applicable in Apache Spark are also incorporated. The results of an extensive set of experiments with real-world spatial datasets are presented, demonstrating that our Apache Spark GKNN solution, with its improvements, is efficient and a clear winner in comparison to processing this query in Apache Hadoop.
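
To make the GKNN definition in the abstract above concrete, here is a minimal single-machine sketch in plain Python. It is a brute-force illustration of the query semantics only, not the distributed Spark algorithm the paper presents. The function name `group_knn`, the use of Euclidean distance, and the sample points are assumptions made for illustration.

```python
import heapq
import math


def group_knn(query, training, k):
    """Return the k training points with the smallest sum of
    Euclidean distances to every point in the query set.

    Brute force: scores every training point, with no pruning
    and no distribution (both assumptions of this sketch).
    """
    def total_distance(point):
        # Sum of distances from one training point to all query points.
        return sum(math.dist(point, q) for q in query)

    return heapq.nsmallest(k, training, key=total_distance)


# Hypothetical toy data: two query points and four candidates.
query = [(0.0, 0.0), (2.0, 0.0)]
training = [(1.0, 0.0), (1.0, 5.0), (10.0, 10.0), (0.0, 1.0)]

result = group_knn(query, training, 2)
print(result)
```

Here the midpoint `(1.0, 0.0)` has a distance sum of 2 and `(0.0, 1.0)` a sum of about 3.24, so those two points are returned. A practical implementation would avoid scoring every training point, for example by applying pruning heuristics of the kind the abstract mentions.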

    Object detection, distributed cloud computing and parallelization techniques for autonomous driving systems.

    Autonomous vehicles are increasingly becoming a necessary trend towards building the smart cities of the future. Numerous proposals have been presented in recent years to tackle particular aspects of the working pipeline towards creating a functional end-to-end system, such as object detection, tracking, path planning, and sentiment or intent detection, amongst others. Nevertheless, few efforts have been made to systematically compile all of these systems into a single proposal that also considers the real challenges these systems will face on the road, such as real-time computation and hardware capabilities. This paper reviews the latest techniques towards creating our own end-to-end autonomous vehicle system, considering the state-of-the-art methods for object detection, and the possible incorporation of distributed systems and parallelization to deploy these methods. Our findings show that while techniques such as convolutional neural networks, recurrent neural networks, and long short-term memory can effectively handle the initial detection and path planning tasks, more efforts are required to implement cloud computing to reduce the computational time that these methods demand. Additionally, we have mapped different strategies to handle the parallelization task, both within and between the networks.

    Effective Features and Machine Learning Methods for Document Classification

    Document classification has been involved in a variety of applications, such as phishing and fraud detection, news categorisation, and information retrieval. This thesis aims to provide novel solutions to several important problems presented by document classification. First, an improved Principal Components Analysis (PCA), based on similarity and correlation criteria instead of covariance, is proposed, which aims to capture a low-dimensional feature subset that facilitates improved performance in text classification. The experimental results have demonstrated the advantages and usefulness of the proposed method for text classification in high-dimensional feature space in terms of the number of features required to achieve the best classification accuracy. Second, two hybrid feature-subset selection methods are proposed based on the combination (via either union or intersection) of the results of both supervised (in one method) and unsupervised (in the other method) filter approaches prior to the use of a wrapper, leading to a low-dimensional feature subset that can achieve both high classification accuracy and good interpretability, while spending less processing time than most current methods. The experimental results have demonstrated the effectiveness of the proposed methods for feature subset selection in high-dimensional feature space in terms of the number of selected features and the processing time spent to achieve the best classification accuracy. Third, a class-specific (supervised) pre-trained approach based on a sparse autoencoder is proposed for acquiring an interesting low-dimensional structure of relevant features, which can be used for high-performance document classification. The experimental results have demonstrated the merit of this proposed method for document classification in high-dimensional feature space, in terms of the limited number of features required to achieve good classification accuracy.
Finally, deep classifier structures associated with a stacked autoencoder (SAE) for higher-level feature extraction are investigated, aiming to overcome the difficulties experienced in training deep neural networks with limited training data in high-dimensional feature space, such as overfitting and vanishing/exploding gradients. This investigation has resulted in a three-stage learning algorithm for training deep neural networks. In comparison with support vector machines (SVMs) combined with SAE and Deep Multilayer Perceptron (DMLP) with random weight initialisation, the experimental results have shown the advantages and effectiveness of the proposed three-stage learning algorithm.
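
The hybrid feature-subset selection idea in the abstract above (combine filter results by union or intersection, then run a wrapper) can be sketched roughly as follows. This is an illustrative sketch, not the thesis's method: the function names, the greedy forward-selection wrapper, the toy scoring function, and the example feature names are all assumptions. A real wrapper would score each subset by training and validating a classifier.

```python
def combine_filter_results(selected_a, selected_b, mode="union"):
    """Merge the feature sets chosen by two filter methods.

    mode="union" keeps features picked by either filter;
    mode="intersection" keeps only features picked by both.
    """
    a, b = set(selected_a), set(selected_b)
    merged = a | b if mode == "union" else a & b
    return sorted(merged)  # sorted for a deterministic order


def forward_wrapper(candidates, score):
    """Greedy forward selection (an assumed wrapper strategy):
    repeatedly add the single feature that most improves score()
    and stop when no remaining feature improves it."""
    chosen, best = [], score([])
    improved = True
    while improved:
        improved = False
        for feature in candidates:
            if feature in chosen:
                continue
            s = score(chosen + [feature])
            if s > best:
                best, pick, improved = s, feature, True
        if improved:
            chosen.append(pick)
    return chosen, best


# Hypothetical stand-in for classifier accuracy: two features help,
# and every selected feature costs a small penalty.
USEFUL = {"free": 0.3, "winner": 0.2}


def toy_score(features):
    return 0.5 + sum(USEFUL.get(f, 0.0) for f in features) - 0.01 * len(features)


filter_a = ["free", "winner", "the"]    # hypothetical output of one filter
filter_b = ["winner", "click", "free"]  # hypothetical output of another filter

union_set = combine_filter_results(filter_a, filter_b, "union")
inter_set = combine_filter_results(filter_a, filter_b, "intersection")
chosen, best = forward_wrapper(union_set, toy_score)
print(union_set, inter_set, chosen)
```

Combining the filter outputs first shrinks the candidate pool, so the wrapper, which is the expensive step, evaluates far fewer subsets. That matches the reduced processing time the abstract reports.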