Acceptance and profitability modelling for consumer loans
This thesis explores and models the relationships between offers of credit products,
credit scores, consumers' acceptance decisions and expected profits generated using
data that records actual choices made by customers and their monthly account status
after being accepted. Based on Keeney and Oliver's theoretical work, this thesis estimates
the expected profits for the lender at the time of application, draws the iso-profit
curves and iso-preference curves, derives optimal policy decisions subject to various
constraints and compares the economic benefits after the segmentation analysis.
This thesis also addresses other research issues that have emerged during the exploration
into profitability and acceptance. We used a Bivariate Sample Selection model to
test for the existence of sample selection bias and found that acceptance inference may not
be necessary for our data. We compared the predictive performance of Support Vector
Machines (SVMs) and Logistic Regression (LR) on default data as well as on acceptance
data, and did not find that SVMs outperform LR. We applied different Survival
Analysis models to two events of interest, default and early repayment. Our results
favoured semi-parametric PH-Cox models separately estimated for each hazard
Deep audio-visual speech recognition
Decades of research in acoustic speech recognition have led to systems that we use in our everyday life. However, even the most advanced speech recognition systems fail in the presence of noise. The degraded performance can be compensated by introducing visual speech information. However, Visual Speech Recognition (VSR) in naturalistic conditions is very challenging, in part due to the lack of architectures and annotations.
This thesis contributes towards the problem of Audio-Visual Speech Recognition (AVSR) from different aspects. Firstly, we develop AVSR models for isolated words. In contrast to previous state-of-the-art methods that consist of a two-step approach, feature extraction and recognition, we present an End-to-End (E2E) approach inside a deep neural network, and this has led to a significant improvement in audio-only, visual-only and audio-visual experiments. We further replace the Bi-directional Gated Recurrent Unit (BGRU) with Temporal Convolutional Networks (TCN) to greatly simplify the training procedure.
Secondly, we extend our AVSR model for continuous speech by presenting a hybrid Connectionist Temporal Classification (CTC)/Attention model, that can be trained in an end-to-end manner. We then propose the addition of prediction-based auxiliary tasks to a VSR model and highlight the importance of hyper-parameter optimisation and appropriate data augmentations.
Next, we present a self-supervised framework, Learning visual speech Representations from Audio via self-supervision (LiRA). Specifically, we train a ResNet+Conformer model to predict acoustic features from unlabelled visual speech, and find that this pre-trained model can be leveraged towards word-level and sentence-level lip-reading.
We also investigate the Lombard effect influence in an end-to-end AVSR system, which is the first work using end-to-end deep architectures and presents results on unseen speakers. We show that even if a relatively small amount of Lombard speech is added to the training set then the performance in a real scenario, where noisy Lombard speech is present, can be significantly improved.
Lastly, we propose a detection method against adversarial examples in an AVSR system, where the strong correlation between audio and visual streams is leveraged. The synchronisation confidence score is leveraged as a proxy for audio-visual correlation and based on it, we can detect adversarial attacks. We apply recent adversarial attacks on two AVSR models and the experimental results demonstrate that the proposed approach is an effective way for detecting such attacks.
PerfCE: Performance Debugging on Databases with Chaos Engineering-Enhanced Causality Analysis
Debugging performance anomalies in real-world databases is challenging.
Causal inference techniques enable qualitative and quantitative root cause
analysis of performance downgrade. Nevertheless, causality analysis is
practically challenging, particularly due to limited observability. Recently,
chaos engineering has been applied to test complex real-world software systems.
Chaos frameworks like Chaos Mesh mutate a set of chaos variables to inject
catastrophic events (e.g., network slowdowns) to "stress" software systems. The
systems under chaos stress are then tested using methods like differential
testing to check if they retain their normal functionality (e.g., SQL query
output is always correct under stress). Despite its ubiquity in the industry,
chaos engineering is now employed mostly to aid software testing rather than for
performance debugging.
This paper identifies a novel use of chaos engineering: helping developers
diagnose performance anomalies in databases. Our presented framework, PERFCE,
comprises an offline phase and an online phase. The offline phase learns the
statistical models of the target database system, whilst the online phase
diagnoses the root cause of monitored performance anomalies on the fly. During
the offline phase, PERFCE leverages both passive observations and proactive
chaos experiments to constitute accurate causal graphs and structural equation
models (SEMs). When observing performance anomalies during the online phase,
causal graphs enable qualitative root cause identification (e.g., high CPU
usage) and SEMs enable quantitative counterfactual analysis (e.g., determining
"when CPU usage is reduced to 45%, performance returns to normal"). PERFCE
notably outperforms prior works on common synthetic datasets, and our
evaluation on real-world databases, MySQL and TiDB, shows that PERFCE is highly
accurate and moderately expensive
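
The counterfactual query quoted in the abstract above ("when CPU usage is reduced to 45%, performance returns to normal") can be illustrated with a toy structural equation. This is a minimal sketch and not PERFCE's implementation: the single linear equation, the variable names `cpu` and `latency`, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def fit_linear_sem(cause, effect):
    """Least-squares fit of effect = slope * cause + intercept + noise."""
    slope, intercept = np.polyfit(cause, effect, deg=1)
    return slope, intercept

def counterfactual(slope, intercept, cause_obs, effect_obs, cause_new):
    """Abduction-action-prediction for one additive-noise linear equation."""
    # Abduction: recover the noise term that explains the observation.
    noise = effect_obs - (slope * cause_obs + intercept)
    # Action and prediction: set the cause to its new value, keep the noise.
    return slope * cause_new + intercept + noise

# Hypothetical offline data: CPU usage (%) and query latency (ms).
rng = np.random.default_rng(0)
cpu = rng.uniform(10, 95, size=500)
latency = 2.0 * cpu + 20.0 + rng.normal(0.0, 1.0, size=500)

slope, intercept = fit_linear_sem(cpu, latency)

# Hypothetical online anomaly: 90% CPU with 205 ms latency.
# Ask what the latency would have been at 45% CPU.
cf_latency = counterfactual(slope, intercept,
                            cause_obs=90.0, effect_obs=205.0, cause_new=45.0)
print(f"estimated slope: {slope:.3f}, counterfactual latency: {cf_latency:.1f} ms")
```

A real system would involve many variables and a learned causal graph; the sketch only shows how a fitted equation turns an observed anomaly into a "what if" estimate.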
Investigating the Lombard Effect Influence on End-to-End Audio-Visual Speech Recognition
Several audio-visual speech recognition models have been recently proposed
which aim to improve the robustness over audio-only models in the presence of
noise. However, almost all of them ignore the impact of the Lombard effect,
i.e., the change in speaking style in noisy environments which aims to make
speech more intelligible and affects both the acoustic characteristics of
speech and the lip movements. In this paper, we investigate the impact of the
Lombard effect in audio-visual speech recognition. To the best of our
knowledge, this is the first work which does so using end-to-end deep
architectures and presents results on unseen speakers. Our results show that
properly modelling Lombard speech is always beneficial. Even if a relatively
small amount of Lombard speech is added to the training set then the
performance in a real scenario, where noisy Lombard speech is present, can be
significantly improved. We also show that the standard approach followed in the
literature, where a model is trained and tested on noisy plain speech, provides
a correct estimate of the video-only performance and slightly underestimates
the audio-visual performance. In case of audio-only approaches, performance is
overestimated for SNRs higher than -3dB and underestimated for lower SNRs.
Comment: Accepted for publication at Interspeech 201
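
The SNR conditions mentioned in the abstract above can be made concrete with a short sketch of mixing noise into a signal at a target signal-to-noise ratio. This is a generic illustration under assumed inputs (a sine wave standing in for speech, white noise, a hypothetical helper `mix_at_snr`); it is not the paper's data pipeline.

```python
import numpy as np

def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so that speech power / noise power equals snr_db (in dB)."""
    speech_power = np.mean(speech ** 2)
    noise_power = np.mean(noise ** 2)
    # SNR_dB = 10 * log10(P_speech / P_noise)  =>  P_noise = P_speech / 10^(SNR_dB / 10)
    target_noise_power = speech_power / (10.0 ** (snr_db / 10.0))
    scale = np.sqrt(target_noise_power / noise_power)
    return speech + scale * noise, scale

# Hypothetical stand-ins for a speech waveform and a noise recording.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 16000, endpoint=False)
speech = np.sin(2.0 * np.pi * 220.0 * t)
noise = rng.normal(size=t.shape)

noisy, scale = mix_at_snr(speech, noise, snr_db=-3.0)

# Check the achieved SNR from the scaled noise component.
achieved_db = 10.0 * np.log10(np.mean(speech ** 2) / np.mean((scale * noise) ** 2))
print(f"achieved SNR: {achieved_db:.2f} dB")
```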
Towards Practical Federated Causal Structure Learning
Understanding causal relations is vital in scientific discovery. The process
of causal structure learning involves identifying causal graphs from
observational data to understand such relations. Usually, a central server
performs this task, but sharing data with the server poses privacy risks.
Federated learning can solve this problem, but existing solutions for federated
causal structure learning make unrealistic assumptions about data and lack
convergence guarantees. FedC2SL is a federated constraint-based causal
structure learning scheme that learns causal graphs using a federated
conditional independence test, which examines conditional independence between
two variables under a condition set without collecting raw data from clients.
FedC2SL requires weaker and more realistic assumptions about data and offers
stronger resistance to data variability among clients. FedPC and FedFCI are the
two variants of FedC2SL for causal structure learning in causal sufficiency and
causal insufficiency, respectively. The study evaluates FedC2SL using both
synthetic datasets and real-world data against existing solutions and finds it
demonstrates encouraging performance and strong resilience to data
heterogeneity among clients
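
The idea of testing conditional independence without collecting raw data can be sketched as follows. This is not FedC2SL's federated test: it swaps in a simpler, well-known technique (a Fisher z-test on partial correlations) and assumes continuous, roughly Gaussian data that is identically distributed across clients, so it does not address the heterogeneity the abstract targets. Sharing summary statistics is also not a formal privacy guarantee. All function names are hypothetical.

```python
import math
import numpy as np

def client_summary(data):
    """Local statistics only: sample count, column sums, sum of outer products."""
    return data.shape[0], data.sum(axis=0), data.T @ data

def pooled_covariance(summaries):
    """Server side: combine client summaries into one sample covariance matrix."""
    n = sum(s[0] for s in summaries)
    total = sum(s[1] for s in summaries)
    outer = sum(s[2] for s in summaries)
    mean = total / n
    return n, (outer - n * np.outer(mean, mean)) / (n - 1)

def fisher_z_pvalue(n, cov, i, j, cond):
    """Two-sided p-value for H0: variables i and j are independent given `cond`."""
    idx = [i, j] + list(cond)
    prec = np.linalg.inv(cov[np.ix_(idx, idx)])
    r = -prec[0, 1] / math.sqrt(prec[0, 0] * prec[1, 1])  # partial correlation
    r = max(min(r, 0.999999), -0.999999)                  # keep the log finite
    z = 0.5 * math.log((1.0 + r) / (1.0 - r))
    stat = math.sqrt(n - len(cond) - 3) * abs(z)
    return math.erfc(stat / math.sqrt(2.0))

# Hypothetical clients, each holding samples from the chain X -> Z -> Y.
rng = np.random.default_rng(1)
clients = []
for _ in range(3):
    x = rng.normal(size=400)
    z = x + 0.5 * rng.normal(size=400)
    y = z + 0.5 * rng.normal(size=400)
    clients.append(np.column_stack([x, y, z]))  # columns: 0 = X, 1 = Y, 2 = Z

n, cov = pooled_covariance([client_summary(c) for c in clients])
p_marginal = fisher_z_pvalue(n, cov, 0, 1, [])   # X vs Y, no conditioning
p_given_z = fisher_z_pvalue(n, cov, 0, 1, [2])   # X vs Y given Z
print(f"p(X indep Y) = {p_marginal:.3g}, p(X indep Y | Z) = {p_given_z:.3g}")
```

A constraint-based learner such as PC would call a test like this repeatedly over different variable pairs and condition sets; only the summaries, never the rows, leave each client.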