Decoding the cognitive states of attention and distraction in a real-life setting using EEG.
Lapses in attention can have serious consequences in situations such as driving a car, so there is considerable interest in tracking attention using neural measures. However, as most such studies have been done in highly controlled and artificial laboratory settings, we explored whether it is also possible to determine attention and distraction using electroencephalogram (EEG) data collected in a natural setting with machine/deep learning. Twenty-four participants volunteered for the study. Data were collected from pairs of participants simultaneously while they engaged in Tibetan monastic debate, a practice that is interesting because it is a real-life situation that generates substantial variability in attention states. We found that attention was on average associated with increased left frontal alpha, increased left parietal theta, and decreased central delta compared to distraction. In an attempt to predict attention and distraction, we found that a Long Short-Term Memory (LSTM) model classified attention and distraction with maximum accuracies of 95.86% and 95.4%, corresponding to the delta and theta bands respectively. This study demonstrates that EEG data collected in a real-life setting can be used to predict attention states with good accuracy, opening the door to Brain-Computer Interfaces that track attention in real time using data collected in daily-life settings, rendering them much more usable.
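The band-specific features underlying this kind of classifier can be illustrated with a minimal sketch. The naive DFT, the band edges, and the synthetic epoch below are illustrative assumptions, not the authors' pipeline; in practice the per-band powers would feed a sequence model such as the LSTM mentioned in the abstract.

```python
# Illustrative sketch: extracting EEG band-power features (delta, theta, alpha)
# from one epoch via a naive DFT. All names and parameters here are assumptions.
import math

def band_power(signal, fs, lo, hi):
    """Power in [lo, hi) Hz via a naive DFT (illustrative, O(n^2))."""
    n = len(signal)
    total = 0.0
    for k in range(1, n // 2):
        freq = k * fs / n
        if lo <= freq < hi:
            re = sum(signal[t] * math.cos(2 * math.pi * k * t / n) for t in range(n))
            im = sum(-signal[t] * math.sin(2 * math.pi * k * t / n) for t in range(n))
            total += (re * re + im * im) / n
    return total

# Conventional EEG band edges in Hz (an assumption; definitions vary slightly).
BANDS = {"delta": (0.5, 4.0), "theta": (4.0, 8.0), "alpha": (8.0, 13.0)}

def band_features(signal, fs):
    """One feature per band for a single epoch."""
    return {name: band_power(signal, fs, lo, hi) for name, (lo, hi) in BANDS.items()}

# Synthetic 1-second epoch dominated by a 10 Hz (alpha-band) oscillation.
fs = 128
epoch = [math.sin(2 * math.pi * 10 * t / fs) for t in range(fs)]
feats = band_features(epoch, fs)
```

For real data one would use an FFT-based power spectral density estimate rather than this quadratic-time loop; the sketch only shows how a raw epoch reduces to the per-band features the abstract refers to.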
Self-Updating Models with Error Remediation
Many environments currently employ machine learning models for data
processing and analytics that were built using a limited number of training
data points. Once deployed, the models are exposed to significant amounts of
previously-unseen data, not all of which is representative of the original,
limited training data. However, updating these deployed models can be difficult
due to logistical, bandwidth, time, hardware, and/or data sensitivity
constraints. We propose a framework, Self-Updating Models with Error
Remediation (SUMER), in which a deployed model updates itself as new data
becomes available. SUMER uses techniques from semi-supervised learning and
noise remediation to iteratively retrain a deployed model using
intelligently-chosen predictions from the model as the labels for new training
iterations. A key component of SUMER is the notion of error remediation as
self-labeled data can be susceptible to the propagation of errors. We
investigate the use of SUMER across various data sets and iterations. We find
that self-updating models (SUMs) generally perform better than models that do
not attempt to self-update when presented with additional previously-unseen
data. This performance gap is accentuated in cases where only limited
amounts of initial training data are available. We also find that the performance of SUMER is
generally better than the performance of SUMs, demonstrating a benefit in
applying error remediation. Consequently, SUMER can autonomously enhance the
operational capabilities of existing data processing systems by intelligently
updating models in dynamic environments.
Comment: 17 pages, 13 figures, published in the proceedings of the Artificial
Intelligence and Machine Learning for Multi-Domain Operations Applications II
conference in the SPIE Defense + Commercial Sensing, 2020 symposium.
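The core loop described in the abstract — a deployed model pseudo-labeling new data, with error remediation filtering out unreliable self-labels — can be sketched as follows. The 1-D nearest-centroid "model" and the margin-based confidence threshold are illustrative assumptions, not the actual SUMER implementation.

```python
# Illustrative sketch of SUMER-style self-updating: the model labels new data,
# and only confident predictions are kept for retraining (error remediation).
# The tiny 1-D nearest-centroid model below is an assumption for illustration.

def fit(points, labels):
    """Nearest-centroid model over 1-D points: returns (centroid_0, centroid_1)."""
    c0 = [p for p, y in zip(points, labels) if y == 0]
    c1 = [p for p, y in zip(points, labels) if y == 1]
    return (sum(c0) / len(c0), sum(c1) / len(c1))

def predict_with_confidence(model, x):
    """Predict a label plus a margin-based confidence in [0, 1]."""
    d0, d1 = abs(x - model[0]), abs(x - model[1])
    label = 0 if d0 < d1 else 1
    conf = abs(d1 - d0) / (d0 + d1 + 1e-9)
    return label, conf

def self_update(model, labeled, new_points, threshold=0.5, iterations=3):
    """Iteratively retrain on confident self-labels; discard low-confidence ones."""
    points, labels = list(labeled[0]), list(labeled[1])
    for _ in range(iterations):
        for x in new_points:
            y, conf = predict_with_confidence(model, x)
            if conf >= threshold:  # remediation: keep confident pseudo-labels only
                points.append(x)
                labels.append(y)
        model = fit(points, labels)
    return model

model = fit([0.0, 1.0], [0, 1])
model = self_update(model, ([0.0, 1.0], [0, 1]), [0.1, 0.9, 0.5])
```

The ambiguous point 0.5 has zero margin and is never self-labeled, which is the kind of error propagation the remediation step is meant to prevent; a real system would use a stronger base model and a tuned confidence criterion.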
Noise Models in Classification: Unified Nomenclature, Extended Taxonomy and Pragmatic Categorization
This paper presents the first review of noise models in classification covering both label and
attribute noise. The study reveals the lack of a unified nomenclature in this field. In order to address
this problem, a tripartite nomenclature based on the structural analysis of existing noise models is
proposed. Additionally, a revision of their current taxonomies is carried out, which are combined
and updated to better reflect the nature of any model. Finally, a categorization of noise models is
proposed from a practical point of view depending on the characteristics of noise and the study
purpose. These contributions provide a variety of models to introduce noise, their characteristics
according to the proposed taxonomy and a unified way of naming them, which will facilitate their
identification and study, as well as the reproducibility of future research.
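Two of the most common mechanisms such taxonomies distinguish — label noise and attribute noise — can be sketched concretely. The function names, the symmetric flipping scheme, and the Gaussian perturbation below are illustrative assumptions, not the paper's nomenclature.

```python
# Illustrative sketch of two noise-introduction mechanisms: symmetric (uniform)
# label noise and Gaussian attribute noise. Names and parameters are assumptions.
import random

def uniform_label_noise(labels, classes, rate, rng):
    """Flip each label to a different class with probability `rate`."""
    noisy = []
    for y in labels:
        if rng.random() < rate:
            noisy.append(rng.choice([c for c in classes if c != y]))
        else:
            noisy.append(y)
    return noisy

def gaussian_attribute_noise(rows, sigma, rng):
    """Perturb every numeric attribute with additive N(0, sigma) noise."""
    return [[x + rng.gauss(0, sigma) for x in row] for row in rows]

rng = random.Random(0)
labels = [0, 1, 2] * 100
noisy = uniform_label_noise(labels, classes=[0, 1, 2], rate=0.2, rng=rng)
rows = gaussian_attribute_noise([[1.0, 2.0]], sigma=0.1, rng=rng)
```

A pragmatic categorization of the kind the paper proposes would pin down exactly such choices: which variable is corrupted (label vs. attribute), the corruption distribution, and whether the noise depends on the instance or the class.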
Towards Addressing Key Visual Processing Challenges in Social Media Computing
Visual processing in social media platforms is a key step in gathering and understanding information in the era of the Internet and big data. Online data is rich in content, but its processing faces many challenges, including varying scales for objects of interest, unreliable and/or missing labels, the inadequacy of single-modal data, and difficulty in analyzing high-dimensional data. Towards facilitating the processing and understanding of online data, this dissertation primarily focuses on three challenges that I feel are of great practical importance: handling scale differences in computer vision tasks, such as facial component detection and face retrieval; developing efficient classifiers using partially labeled data and noisy data; and employing multi-modal models and feature selection to improve multi-view data analysis. For the first challenge, I propose a scale-insensitive algorithm to expedite and accurately detect facial landmarks. For the second challenge, I propose two algorithms that can be used to learn from partially labeled data and noisy data respectively. For the third challenge, I propose a new framework that incorporates feature selection modules into LDA models.
Doctoral Dissertation, Computer Science
Contributions to evaluation of machine learning models. Applicability domain of classification models
Artificial intelligence (AI) and machine learning (ML) present application opportunities and
challenges that can be framed as learning problems. The performance of machine learning models
depends on both the algorithms and the data. Learning algorithms create a model of reality by
learning from and being tested on data, and their performance reflects the degree of agreement
between the assumed model and reality. ML algorithms have been successfully used in numerous
classification problems. With the growing popularity of ML models across many domains, the
validation of such predictive models is now required more formally. Traditionally, many studies have
addressed model evaluation, robustness, reliability, and the quality of data and data-driven models.
However, those studies do not yet consider the concept of the applicability domain (AD). The issue
is that the AD is often not well defined, or not defined at all, in many fields. This work investigates
the robustness of ML classification models from the applicability-domain perspective. A standard
definition of the applicability domain regards the spaces in which the model provides results with
specific reliability.
The main aim of this study is to investigate the connection between the applicability-domain approach
and classification model performance. We examine the usefulness of assessing the AD for a
classification model, i.e., the reliability, reuse, and robustness of classifiers. The work is carried
out in three approaches: first, assessing the applicability domain for the classification model;
second, investigating the robustness of the classification model based on the applicability-domain
approach; and third, selecting an optimal model using Pareto optimality. The experiments are
illustrated by applying different machine learning algorithms to binary and multi-class classification
on healthcare datasets from public benchmark data repositories. In the first approach, the decision
tree (DT) algorithm is used for classification, with a feature selection method applied to choose the
features. The obtained classifiers are reused in the third approach for model selection via Pareto
optimality. The second approach is implemented in three steps: building the classification model,
generating synthetic data, and evaluating the obtained results.
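The basic applicability-domain check these approaches rely on can be sketched in a few lines. The 1-D distance criterion and the radius parameter `r` below are illustrative assumptions, echoing the r values reported later in the abstract rather than reproducing the thesis's actual AD definition.

```python
# Illustrative sketch of an applicability-domain (AD) check: a test instance is
# in-domain if it lies within radius r of some training instance, and only
# in-domain predictions are treated as reliable. The 1-D setup is an assumption.

def in_applicability_domain(train_points, x, r):
    """Return True if x is within distance r of any training point."""
    return any(abs(x - p) <= r for p in train_points)

train = [0.1, 0.2, 0.8, 0.9]
inside = in_applicability_domain(train, 0.15, r=0.10)   # near training data
outside = in_applicability_domain(train, 0.50, r=0.10)  # far from training data
```

In higher dimensions the same idea applies with a suitable distance or density estimate; tightening r shrinks the domain, trading coverage for reliability, which is exactly the accuracy-versus-threshold trade-off the experiments explore.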
The results obtained from the study provide an understanding of how the proposed approach can help
to define the model's robustness and the applicability domain, for providing reliable outputs. These
approaches open opportunities for classification data and model management. The proposed
algorithms are evaluated through a set of experiments on the classification accuracy of instances
that fall within the domain of the model. For the first approach, considering all the features, the
highest accuracy obtained is 0.98, with a thresholds average of 0.34, for the Breast Cancer dataset.
After applying the recursive feature elimination (RFE) method, the accuracy is 0.96 with a thresholds
average of 0.27. For the robustness of the classification model based on the applicability-domain
approach, the minimum accuracy is 0.62 for the Indian Liver Patient dataset at r = 0.10, and the
maximum accuracy is 0.99 for the Thyroid dataset at r = 0.10. For the selection of an optimal model
using Pareto optimality, the optimally selected classifier gives an accuracy of 0.94 with a thresholds
average of 0.35.
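Pareto-optimal model selection over the two criteria reported here — accuracy (maximized) and the thresholds average (minimized) — can be sketched as follows. The candidate models and their scores are hypothetical, loosely patterned on the numbers in the abstract, not the thesis's actual results.

```python
# Illustrative sketch of Pareto-optimal model selection over two criteria:
# maximize accuracy, minimize thresholds average. Candidates are hypothetical.

def pareto_front(models):
    """models: list of (name, accuracy, threshold_avg); keep non-dominated ones.

    Model A dominates model B if A is at least as good on both criteria and
    strictly better on at least one.
    """
    front = []
    for name, acc, thr in models:
        dominated = any(
            (a >= acc and t <= thr) and (a > acc or t < thr)
            for n, a, t in models
            if n != name
        )
        if not dominated:
            front.append(name)
    return front

candidates = [("DT", 0.94, 0.35), ("kNN", 0.90, 0.30), ("SVM", 0.88, 0.40)]
front = pareto_front(candidates)
```

Here the hypothetical SVM is dominated (worse on both criteria than DT), while DT and kNN each win on one criterion, so both remain on the front; a final choice among front members requires a preference between the two objectives.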
This research investigates critical aspects of the applicability domain as related to the robustness of
classification ML algorithms. The performance of machine learning techniques depends on the degree
to which the model's predictions are reliable. In the literature, the robustness of an ML model is
defined as its ability to keep the testing error close to the training error; this property also
describes the stability of model performance when tested on new datasets. In conclusion, this thesis
introduces the concept of the applicability domain for classifiers and tests it in case studies on
health-related public benchmark datasets.
Ministry of Higher Education in Libya