3,875 research outputs found
Fairness-aware Machine Learning in Educational Data Mining
Fairness is an essential requirement of every educational system, which is reflected in a variety of educational activities. With the extensive use of Artificial Intelligence (AI) and Machine Learning (ML) techniques in education, researchers and educators can analyze educational (big) data and propose new (technical) methods in order to support teachers, students, or administrators of (online) learning systems in the organization of teaching and learning. Educational data mining (EDM) is the result of the application and development of data mining (DM) and ML techniques to deal with educational problems, such as student performance prediction and student grouping. However, ML-based decisions in education can be based on protected attributes, such as race or gender, leading to discrimination against individual students or subgroups of students. Therefore, ensuring fairness in ML models also contributes to equity in educational systems. On the other hand, bias can also appear in the data obtained from learning environments. Hence, bias-aware exploratory educational data analysis is important to support unbiased decision-making in EDM.
In this thesis, we address the aforementioned issues and propose methods that mitigate discriminatory outcomes of ML algorithms in EDM tasks. Specifically, we make the following contributions:
We perform bias-aware exploratory analysis of educational datasets using Bayesian networks to identify the relationships among attributes in order to understand bias in the datasets. We focus the exploratory data analysis on features having a direct or indirect relationship with the protected attributes w.r.t. prediction outcomes.
We perform a comprehensive evaluation of the sufficiency of various group fairness measures in predictive models for student performance prediction problems. A variety of experiments on various educational datasets with different fairness measures are performed to provide users with a broad view of unfairness from diverse aspects.
We deal with the student grouping problem in collaborative learning. We introduce the fair-capacitated clustering problem that takes into account cluster fairness and cluster cardinalities. We propose two approaches, namely hierarchical clustering and partitioning-based clustering, to obtain fair-capacitated clustering.
We introduce the multi-fair capacitated (MFC) students-topics grouping problem that satisfies students' preferences while ensuring balanced group cardinalities and maximizing the diversity of members regarding the protected attribute. We propose three approaches: a greedy heuristic approach, a knapsack-based approach using vanilla maximal 0-1 knapsack formulation, and an MFC knapsack approach based on group fairness knapsack formulation.
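As a rough illustration of the group fairness measures evaluated in the contributions above, the following minimal sketch computes two commonly used ones, statistical parity difference and equal opportunity difference, for a binary pass/fail predictor. The function names, the 0/1 encoding of the protected attribute, and the toy data are assumptions made for illustration; they are not taken from the thesis.

```python
def statistical_parity_difference(y_pred, protected):
    """Difference in positive-prediction rates: group 0 minus group 1."""
    def positive_rate(group_value):
        preds = [p for p, g in zip(y_pred, protected) if g == group_value]
        return sum(preds) / len(preds)
    return positive_rate(0) - positive_rate(1)


def equal_opportunity_difference(y_true, y_pred, protected):
    """Difference in true-positive rates: group 0 minus group 1."""
    def true_positive_rate(group_value):
        # Keep predictions only for students who actually passed (y_true == 1).
        hits = [p for t, p, g in zip(y_true, y_pred, protected)
                if g == group_value and t == 1]
        return sum(hits) / len(hits)
    return true_positive_rate(0) - true_positive_rate(1)


# Hypothetical toy data: 1 = "pass", protected = 1 marks the protected group.
y_true    = [1, 1, 0, 1, 1, 0, 1, 0]
y_pred    = [1, 1, 0, 1, 1, 0, 0, 0]
protected = [0, 0, 0, 0, 1, 1, 1, 1]

spd = statistical_parity_difference(y_pred, protected)
eod = equal_opportunity_difference(y_true, y_pred, protected)
```

A value of zero for either measure would indicate parity between the two groups on that criterion; different measures can disagree on the same predictions, which is one reason to report several.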
In short, the findings described in this thesis demonstrate the importance of fairness-aware ML in educational settings. We show that bias-aware data analysis, fairness measures, and fairness-aware ML models are essential aspects to ensure fairness in EDM and the educational environment.
Ministry of Science and Culture of Lower Saxony/LernMINT/51410078/E
Natural and Technological Hazards in Urban Areas
Natural hazard events and technological accidents are separate causes of environmental impacts. Natural hazards are physical phenomena active in geological times, whereas technological hazards result from actions or facilities created by humans. In our time, combined natural and man-made hazards have been induced. Overpopulation and urban development in areas prone to natural hazards increase the impact of natural disasters worldwide. Additionally, urban areas are frequently characterized by intense industrial activity and rapid, poorly planned growth that threatens the environment and degrades the quality of life. Therefore, proper urban planning is crucial to minimize fatalities and reduce the environmental and economic impacts that accompany both natural and technological hazardous events.
3D Innovations in Personalized Surgery
Current practice involves the use of 3D surgical planning and patient-specific solutions in multiple surgical areas of expertise. Patient-specific solutions have been endorsed for several years in numerous publications due to their associated benefits around accuracy, safety, and predictability of surgical outcome. The basis of 3D surgical planning is the use of high-quality medical images (e.g., CT, MRI, or PET-scans). The translation from 3D digital planning toward surgical applications was developed hand in hand with a rise in 3D printing applications of multiple biocompatible materials. These technical aspects of medical care require engineers’ or technical physicians’ expertise for optimal safe and effective implementation in daily clinical routines. The aim and scope of this Special Issue is high-tech solutions in personalized surgery, based on 3D technology and, more specifically, bone-related surgery. Full papers, highly innovative technical notes, or (systematic) reviews that relate to innovative personalized surgery are invited. This can include optimization of imaging for 3D VSP, optimization of 3D VSP workflow and its translation toward the surgical procedure, or optimization of personalized implants or devices in relation to bone surgery.
Classifier Calibration: A survey on how to assess and improve predicted class probabilities
This paper provides both an introduction to and a detailed overview of the
principles and practice of classifier calibration. A well-calibrated classifier
correctly quantifies the level of uncertainty or confidence associated with its
instance-wise predictions. This is essential for critical applications, optimal
decision making, cost-sensitive classification, and for some types of context
change. Calibration research has a rich history which predates the birth of
machine learning as an academic field by decades. However, a recent increase in
interest in calibration has led to new methods and the extension from
binary to the multiclass setting. The space of options and issues to consider
is large, and navigating it requires the right set of concepts and tools. We
provide both introductory material and up-to-date technical details of the main
concepts and methods, including proper scoring rules and other evaluation
metrics, visualisation approaches, a comprehensive account of post-hoc
calibration methods for binary and multiclass classification, and several
advanced topics.
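To make the evaluation side of this abstract concrete, the sketch below computes the Brier score (a proper scoring rule) and a simple equal-width binned calibration error for a binary classifier. The function names, the bin count, and the toy data are illustrative assumptions rather than definitions taken from the survey.

```python
def brier_score(probs, labels):
    """Mean squared difference between predicted probability and 0/1 label."""
    return sum((p - y) ** 2 for p, y in zip(probs, labels)) / len(probs)


def binned_calibration_error(probs, labels, n_bins=10):
    """Weighted average gap between mean predicted probability and the
    observed positive frequency, over equal-width probability bins."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        index = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[index].append((p, y))

    total = len(probs)
    error = 0.0
    for members in bins:
        if not members:
            continue  # empty bins contribute nothing
        mean_prob = sum(p for p, _ in members) / len(members)
        frequency = sum(y for _, y in members) / len(members)
        error += (len(members) / total) * abs(mean_prob - frequency)
    return error


# Hypothetical predictions for the positive class and their true labels.
probs = [0.8, 0.8, 0.2, 0.2]
labels = [1, 0, 0, 0]

bs = brier_score(probs, labels)
ece = binned_calibration_error(probs, labels)
```

In this toy case the classifier is overconfident in both occupied bins: it predicts 0.8 where only half the instances are positive, and 0.2 where none are.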
Sensitivity of NEXT-100 detector to neutrinoless double beta decay
This thesis studies the sensitivity of the NEXT-100 detector to neutrinoless double
beta decay. There is great interest in the search for this decay, since it
could answer fundamental questions in neutrino physics. The detector constitutes
the third phase of the NEXT experiment, the collaboration within which this thesis was carried out.
A summary of each of the chapters into which the thesis is divided is included
below. It begins by introducing the theoretical and experimental framework in the sections
Neutrino physics, The search for neutrinoless double beta decay, and The NEXT
experiment. The main part of the analysis of the thesis is then described in Detector
simulation, Data processing, and Sensitivity of the NEXT-100 detector.
Seamless Multimodal Biometrics for Continuous Personalised Wellbeing Monitoring
Artificially intelligent perception is increasingly present in the lives of
every one of us. Vehicles are no exception, (...) In the near future, pattern
recognition will have an even stronger role in vehicles, as self-driving cars
will require automated ways to understand what is happening around (and within)
them and act accordingly. (...) This doctoral work focused on advancing
in-vehicle sensing through the research of novel computer vision and pattern
recognition methodologies for both biometrics and wellbeing monitoring. The
main focus has been on electrocardiogram (ECG) biometrics, a trait well-known
for its potential for seamless driver monitoring. Major efforts were devoted to
achieving improved performance in identification and identity verification in
off-the-person scenarios, well-known for increased noise and variability. Here,
end-to-end deep learning ECG biometric solutions were proposed and important
topics were addressed such as cross-database and long-term performance,
waveform relevance through explainability, and interlead conversion. Face
biometrics, a natural complement to the ECG in seamless unconstrained
scenarios, was also studied in this work. The open challenges of masked face
recognition and interpretability in biometrics were tackled in an effort to
evolve towards algorithms that are more transparent, trustworthy, and robust to
significant occlusions. Within the topic of wellbeing monitoring, improved
solutions to multimodal emotion recognition in groups of people and
activity/violence recognition in in-vehicle scenarios were proposed. Lastly,
we also proposed a novel way to learn template security within end-to-end
models, dismissing additional separate encryption processes, and a
self-supervised learning approach tailored to sequential data, in order to
ensure data security and optimal performance. (...)
Comment: Doctoral thesis presented and approved on the 21st of December 2022
to the University of Port
Robust Out-of-Distribution Detection in Deep Classifiers
Over the past decade, deep learning has gone from a fringe discipline of computer science
to a major driver of innovation across a large number of industries. The deployment of such
rapidly developing technology in safety-critical applications necessitates the careful study and
mitigation of potential failure modes. Indeed, many deep learning models are overconfident in
their predictions, are unable to flag out-of-distribution examples that are clearly unrelated to
the task they were trained on, and are vulnerable to adversarial perturbations, where a small
change in the input leads to a large change in the model’s prediction. In this dissertation, we
study the relation between these issues in deep learning based vision classifiers.
First, we benchmark various methods that have been proposed to enable deep learning
methods to detect out-of-distribution examples and we show that a classifier’s predictive confidence
is well-suited for this task, if the classifier has had access to a large and diverse out-distribution
at train time. We theoretically investigate how different out-of-distribution detection methods
are related and show that several seemingly different approaches are actually modeling the
same core quantities.
In the second part we study the adversarial robustness of a classifier’s confidence on
out-of-distribution data. Concretely, we show that several previous techniques for adversarial
robustness can be combined to create a model that inherits each method’s strength while
significantly reducing their respective drawbacks. In addition, we demonstrate that the
enforcement of adversarially robust low confidence on out-of-distribution data enhances the inherent
interpretability of the model by imbuing the classifier with certain generative properties that
can be used to query the model for counterfactual explanations for its decisions.
In the third part of this dissertation we will study the problem of issuing mathematically
provable certificates for the adversarial robustness of a model’s confidence on out-of-distribution
data. We develop two different approaches to this problem and show that they have
complementary strengths and weaknesses. The first method is easy to train, puts no restrictions on
the architecture that our classifier can use and provably ensures that the classifier will have
low confidence on data very far away. However, it only provides guarantees for very specific
types of adversarial perturbations and only for data that is very easy to distinguish from the
in-distribution. The second approach works for more commonly studied sets of adversarial
perturbations and on much more challenging out-distribution data, but puts heavy restrictions
on the architecture that can be used and thus the achievable accuracy. It also does not
guarantee low confidence on asymptotically far away data. In the final chapter of this dissertation
we show how ideas from both of these techniques can be combined in a way that preserves all
of their strengths while inheriting none of their weaknesses. Thus, this thesis outlines how to
develop high-performing classifiers that provably know when they do not know.
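The idea of using a classifier's predictive confidence to flag out-of-distribution inputs can be sketched as a simple threshold on the maximum softmax probability. This is only the generic confidence baseline, not the robust or certified methods developed in the dissertation; the function names, the threshold value, and the example logits are assumptions made for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of class logits."""
    shift = max(logits)
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def confidence(logits):
    """Predictive confidence: the largest class probability."""
    return max(softmax(logits))


def flag_out_of_distribution(logits, threshold=0.5):
    """Flag an input as out-of-distribution when confidence is below the
    threshold. The threshold is a hypothetical value; in practice it would
    be chosen on held-out data."""
    return confidence(logits) < threshold


# Hypothetical logits from a 3-class classifier.
confident_logits = [4.0, 0.0, 0.0]    # one class clearly dominates
uncertain_logits = [0.1, 0.0, -0.1]   # close to uniform over the classes

flag_confident = flag_out_of_distribution(confident_logits)
flag_uncertain = flag_out_of_distribution(uncertain_logits)
```

A plain threshold like this offers no guarantee against adversarially perturbed inputs, which is the gap the robustness and certification results described above are meant to address.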
- …