Rare-Event Estimation and Calibration for Large-Scale Stochastic Simulation Models
Stochastic simulation has been widely applied in many domains. More recently, however, the rapid surge of sophisticated problems such as safety evaluation of intelligent systems has posed various challenges to conventional statistical methods. Motivated by these challenges, in this thesis, we develop novel methodologies with theoretical guarantees and numerical applications to tackle them from different perspectives.
In particular, our works can be categorized into two areas: (1) rare-event estimation (Chapters 2 to 5) where we develop approaches to estimating the probabilities of rare events via simulation; (2) model calibration (Chapters 6 and 7) where we aim at calibrating the simulation model so that it is close to reality.
In Chapter 2, we study rare-event simulation for a class of problems where the target hitting sets of interest are defined via modern machine learning tools such as neural networks and random forests. We investigate an importance sampling scheme that integrates the dominating point machinery in large deviations and sequential mixed integer programming to locate the underlying dominating points. We provide efficiency guarantees and numerical demonstration of our approach.
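As a toy illustration of the dominating-point idea (a minimal sketch for a one-dimensional Gaussian tail, not this chapter's sequential mixed-integer-programming scheme for machine-learning-defined sets): exponentially tilting the sampling distribution so that its mean sits at the dominating point a makes the rare set {x > a} frequently hit, and a likelihood ratio reweights each sample back to the original measure.

```python
import math
import random

def crude_mc(a, n=100_000, seed=0):
    """Crude Monte Carlo estimate of P(X > a) for X ~ N(0, 1)."""
    rng = random.Random(seed)
    return sum(rng.gauss(0.0, 1.0) > a for _ in range(n)) / n

def is_tilted(a, n=100_000, seed=0):
    """Importance sampling estimate of P(X > a) for X ~ N(0, 1).

    The sampling distribution is exponentially tilted to N(a, 1), centering
    it at the dominating point a; each sample landing in the rare set is
    reweighted by the likelihood ratio
    dN(0,1)/dN(a,1)(x) = exp(-a*x + a*a/2).
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        x = rng.gauss(a, 1.0)
        if x > a:
            total += math.exp(-a * x + a * a / 2.0)
    return total / n
```

For a = 4 (true probability about 3.2e-5), the crude estimator sees only a handful of hits in 10^5 draws, while the tilted estimator hits the rare set on roughly half of its draws, giving a far smaller relative error at the same budget.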
In Chapter 3, we propose a new efficiency criterion for importance sampling, which we call probabilistic efficiency. Conventionally, an estimator is regarded as efficient if its relative error is sufficiently controlled. It is widely known that when a rare-event set contains multiple "important regions" encoded by the dominating points, importance sampling needs to account for all of them via mixing to achieve efficiency. We argue that the traditional analysis recipe could suffer from intrinsic looseness by using relative error as an efficiency criterion. Thus, we propose the new efficiency notion to tighten this gap. In particular, we show that under the standard Gärtner-Ellis large deviations regime, an importance sampling scheme that uses only the most significant dominating points is sufficient to attain this efficiency notion.
In Chapter 4, we consider the estimation of rare-event probabilities using sample proportions output by crude Monte Carlo. Due to the recent surge of sophisticated rare-event problems, efficiency-guaranteed variance reduction may face implementation challenges, which motivate one to look at naive estimators. In this chapter we construct confidence intervals for the target probability using this naive estimator from various techniques, and then analyze their validity as well as tightness respectively quantified by the coverage probability and relative half-width.
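As one concrete example of an interval such an analysis could consider (an illustrative sketch; the chapter's actual collection of techniques is not reproduced here), the Wilson score interval stays informative even when crude Monte Carlo records zero hits, unlike the naive Wald interval, which collapses to zero width:

```python
import math

def wilson_interval(hits, n, z=1.96):
    """Wilson score confidence interval for a binomial proportion.

    For rare events the Wald interval p_hat +/- z*sqrt(p_hat*(1-p_hat)/n)
    degenerates to [0, 0] when hits == 0; the Wilson interval still yields
    a meaningful upper bound of roughly z*z/n on the target probability.
    """
    p_hat = hits / n
    denom = 1.0 + z * z / n
    center = (p_hat + z * z / (2 * n)) / denom
    half = (z / denom) * math.sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n))
    return center - half, center + half
```

The coverage probability and relative half-width of such intervals are exactly the validity and tightness criteria the chapter quantifies.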
In Chapter 5, we propose the use of extreme value analysis, in particular the peaks-over-threshold method widely employed for extremal estimation on real datasets, in the simulation setting. More specifically, we view crude Monte Carlo samples as data to which a generalized Pareto distribution is fitted. We test this idea on several numerical examples. The results show that, in the absence of efficient variance reduction schemes, this approach appears to offer potential benefits for enhancing crude Monte Carlo estimates.
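A minimal sketch of the peaks-over-threshold idea, assuming a method-of-moments fit of the generalized Pareto distribution for simplicity (practical implementations typically use maximum likelihood): exceedances over a threshold u are fitted with a GPD, whose survival function then extrapolates the tail beyond the largest observed sample.

```python
import math

def pot_tail_estimate(samples, u, x):
    """Peaks-over-threshold estimate of P(X > x) for x above the threshold u.

    Exceedances over u are fitted with a generalized Pareto distribution via
    method-of-moments (a simple stand-in for maximum likelihood); the fitted
    survival function extrapolates the tail beyond the observed data.
    """
    exc = [s - u for s in samples if s > u]
    n_u, n = len(exc), len(samples)
    m = sum(exc) / n_u
    v = sum((e - m) ** 2 for e in exc) / (n_u - 1)
    xi = 0.5 * (1.0 - m * m / v)         # GPD shape (moment estimator)
    sigma = 0.5 * m * (m * m / v + 1.0)  # GPD scale (moment estimator)
    y = x - u
    if abs(xi) < 1e-9:                   # exponential-tail limit as xi -> 0
        sf = math.exp(-y / sigma)
    else:
        sf = max(0.0, 1.0 + xi * y / sigma) ** (-1.0 / xi)
    return (n_u / n) * sf
```

On exponential data, for instance, the fitted shape is close to zero and the extrapolated tail probability tracks the true one, even at levels where crude Monte Carlo alone would see almost no hits.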
In Chapter 6, we investigate a framework for developing calibration schemes in parametric settings that satisfies rigorous frequentist statistical guarantees via a basic notion we call the eligibility set, designed to bypass non-identifiability through set-based estimation. We investigate a feature extraction-then-aggregation approach to construct these sets that targets multivariate outputs. We demonstrate our methodology on several numerical examples, including an application to the calibration of a limit order book market simulator.
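The eligibility-set notion can be caricatured in a few lines (a hypothetical sketch with an illustrative scalar summary statistic and tolerance, not the chapter's feature extraction-then-aggregation construction): instead of a single point estimate, keep every parameter whose simulated output is compatible with the observation.

```python
def eligibility_set(candidate_params, simulate_stat, observed_stat, tolerance):
    """Set-based calibration sketch.

    Returns every candidate parameter whose simulated summary statistic is
    within `tolerance` of the observed one. Returning a set rather than a
    point estimate sidesteps non-identifiability: several parameters may be
    equally compatible with the data, and all of them are retained.
    """
    return [p for p in candidate_params
            if abs(simulate_stat(p) - observed_stat) <= tolerance]
```

A toy non-identifiable model makes the point: if the simulator's output depends only on the square of the parameter, both signs survive calibration, and the eligibility set reports exactly that ambiguity.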
In Chapter 7, we study a methodology to tackle the NASA Langley Uncertainty Quantification Challenge, a model calibration problem under both aleatory and epistemic uncertainties. Our methodology is based on an integration of distributionally robust optimization and importance sampling. The main computational machinery in this integrated methodology amounts to solving sampled linear programs. We present theoretical statistical guarantees of our approach via connections to nonparametric hypothesis testing, and numerical performance on parameter calibration and downstream decision and risk evaluation tasks.
Calibrated Explanations for Regression
Artificial Intelligence (AI) is often an integral part of modern decision
support systems (DSSs). The best-performing predictive models used in AI-based
DSSs lack transparency. Explainable Artificial Intelligence (XAI) aims to
create AI systems that can explain their rationale to human users. Local
explanations in XAI can provide information about the causes of individual
predictions in terms of feature importance. However, a critical drawback of
existing local explanation methods is their inability to quantify the
uncertainty associated with a feature's importance. This paper introduces an
extension of a feature importance explanation method, Calibrated Explanations
(CE), previously only supporting classification, with support for standard
regression and probabilistic regression, i.e., the probability that the target
is above an arbitrary threshold. The extension for regression keeps all the
benefits of CE, such as calibration of the prediction from the underlying model
with confidence intervals, uncertainty quantification of feature importance,
and allows both factual and counterfactual explanations. CE for standard
regression provides fast, reliable, stable, and robust explanations. CE for
probabilistic regression provides an entirely new way of creating probabilistic
explanations from any ordinary regression model and with a dynamic selection of
thresholds. The performance of CE for probabilistic regression regarding
stability and speed is comparable to LIME. The method is model agnostic with
easily understood conditional rules. An implementation in Python is freely
available on GitHub and for installation using pip, making the results in this
paper easily replicable.
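As a generic illustration of how a probability that the target exceeds a threshold can be derived from any ordinary point-prediction regression model plus a calibration set (a conformal-predictive-system-style sketch, not the actual Calibrated Explanations implementation), the empirical distribution of calibration residuals can be shifted onto a new prediction:

```python
import bisect

def exceedance_probability(model, x, threshold, cal_x, cal_y):
    """Estimate P(y > threshold | x) from a point-prediction regressor.

    Calibration residuals y_i - model(x_i) are added to the new prediction
    to form an empirical predictive distribution; the mass beyond the
    threshold is returned as the exceedance probability.
    """
    residuals = sorted(y - model(cx) for cx, y in zip(cal_x, cal_y))
    pred = model(x)
    # Empirical CDF of the predictive distribution evaluated at `threshold`
    k = bisect.bisect_right(residuals, threshold - pred)
    return 1.0 - k / len(residuals)
```

Because the threshold is only applied at query time, it can be chosen dynamically per explanation, which is the flexibility the abstract highlights.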
Operational Modal Analysis of Near-Infrared Spectroscopy Measure of 2-Month Exercise Intervention Effects in Sedentary Older Adults with Diabetes and Cognitive Impairment
The Global Burden of Disease Study (GBD 2019 Diseases and Injuries Collaborators) found that diabetes significantly increases the overall burden of disease, leading to a 24.4% increase in disability-adjusted life years. Persistently high glucose levels in diabetes can cause structural and functional changes in proteins throughout the body, and the accumulation of protein aggregates in the brain can be associated with the progression of Alzheimer's Disease (AD). To address this burden in type 2 diabetes mellitus (T2DM), a combined aerobic and resistance exercise program was developed based on the recommendations of the American College of Sports Medicine. The prospectively registered clinical trials (NCT04626453, NCT04812288) involved two groups: an Intervention group of older sedentary adults with T2DM and a Control group of healthy older adults who could be either active or sedentary. The completion rate for the 2-month exercise program was high, with participants completing on average 89.14% of the exercise sessions. This indicated that the program was practical, feasible, and well tolerated, even during the COVID-19 pandemic. It was also safe, requiring minimal equipment and no supervision. Our paper presents portable near-infrared spectroscopy (NIRS) based measures showing that the drop in muscle oxygen saturation (SmO2), i.e., the balance between oxygen delivery and oxygen consumption in muscle, during the bilateral heel rise task (BHR) and the 6 min walk task (6MWT) changed significantly (p < 0.05) at the post-intervention follow-up from the pre-intervention baseline in the T2DM Intervention group participants. Moreover, post-intervention changes from the pre-intervention baseline in prefrontal activation (both oxyhemoglobin and deoxyhemoglobin) showed a statistically significant (p < 0.05, q < 0.05) effect at the right dorsolateral superior frontal gyrus during the Mini-Cog task.
Here, operational modal analysis provided further insights into the effects of the 2-month exercise intervention on the very-low-frequency oscillations (<0.05 Hz) during the Mini-Cog task, which improved post-intervention in the sedentary T2DM Intervention group from their pre-intervention baseline when compared to the active healthy Control group. The 6MWT distance also improved significantly (p < 0.01) in the T2DM Intervention group at the post-intervention follow-up from the pre-intervention baseline, indicating improved aerobic capacity and endurance. Our portable NIRS-based measures have practical implications at the point of care for therapists, who can monitor muscle and brain oxygenation changes during physical and cognitive tests to prescribe personalized physical exercise doses without triggering an individual stress response, thereby enhancing vascular health in T2DM.
2023-2024 Boise State University Undergraduate Catalog
This catalog is primarily for and directed at students. However, it serves many audiences, such as high school counselors, academic advisors, and the public. In this catalog you will find an overview of Boise State University and information on admission, registration, grades, tuition and fees, financial aid, housing, student services, and other important policies and procedures. However, most of this catalog is devoted to describing the various programs and courses offered at Boise State.
Perceptual Requirements for World-Locked Rendering in AR and VR
Stereoscopic, head-tracked display systems can show users realistic,
world-locked virtual objects and environments. However, discrepancies between
the rendering pipeline and physical viewing conditions can lead to perceived
instability in the rendered content resulting in reduced realism, immersion,
and, potentially, visually-induced motion sickness. The requirements to achieve
perceptually stable world-locked rendering are unknown due to the challenge of
constructing a wide field of view, distortion-free display with highly accurate
head- and eye-tracking. In this work we introduce new hardware and software,
built upon recently introduced hardware, and present a system capable of
rendering virtual objects over real-world references without perceivable drift
under such constraints. The platform is used to study acceptable errors in
render camera position for world-locked rendering in augmented and virtual
reality scenarios, where we find an order of magnitude difference in perceptual
sensitivity between them. We conclude by comparing study results with an
analytic model which examines changes to apparent depth and visual heading in
response to camera displacement errors. We identify visual heading as an
important consideration for world-locked rendering alongside depth errors from
incorrect disparity.
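The geometric quantities such an analytic model reasons about can be sketched from basic viewing geometry (a simplified illustration under assumed small-angle stereo geometry, not the paper's actual model): a lateral render-camera displacement shifts an object's apparent direction, and the vergence angle between the two eyes' lines of sight implies an apparent depth.

```python
import math

def heading_error_deg(camera_offset_m, viewing_distance_m):
    """Angular shift of an object's apparent direction (visual heading error)
    caused by a lateral render-camera displacement, from simple geometry."""
    return math.degrees(math.atan2(camera_offset_m, viewing_distance_m))

def depth_from_vergence(ipd_m, vergence_angle_rad):
    """Apparent depth implied by the binocular vergence angle for a given
    interpupillary distance; an incorrect render-camera separation changes
    this depth cue and thus the perceived disparity."""
    return (ipd_m / 2.0) / math.tan(vergence_angle_rad / 2.0)
```

Under this toy geometry a 1 cm camera error at 1 m viewing distance shifts the apparent heading by a bit over half a degree, the kind of error magnitude a perceptual threshold study can probe.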
On the Principles of Evaluation for Natural Language Generation
Natural language processing is concerned with the ability of computers to understand natural language texts, which is, arguably, one of the major bottlenecks in the quest for general Artificial Intelligence. Given the unprecedented success of deep learning technology, the natural language processing community has focused almost entirely on practical applications, with state-of-the-art systems emerging and competing for human-parity performance at an ever-increasing pace. For that reason, fair and adequate evaluation and comparison, responsible for ensuring trustworthy, reproducible and unbiased results, have long occupied the scientific community, not only in natural language processing but also in other fields. A popular example is the ISO-9126 evaluation standard for software products, which outlines a wide range of evaluation concerns, such as cost, reliability, scalability, security, and so forth. The European project EAGLES-1996, an acclaimed extension of ISO-9126, set out the fundamental principles specifically for evaluating natural language technologies, which underpin succeeding methodologies in the evaluation of natural language.
Natural language processing encompasses an enormous range of applications, each with its own evaluation concerns, criteria and measures. This thesis cannot hope to be comprehensive, but particularly addresses evaluation in natural language generation (NLG), arguably one of the most human-like natural language applications. In this context, research on quantifying day-to-day progress with evaluation metrics lays the foundation of the fast-growing NLG community. However, previous works have failed to deliver high-quality metrics in several scenarios, such as evaluating long texts and evaluating when human references are not available; more prominently, these studies are limited in scope, lacking a holistic view of principled NLG evaluation.
In this thesis, we aim for a holistic view of NLG evaluation from three complementary perspectives, driven by the evaluation principles in EAGLES-1996: (i) high-quality evaluation metrics, (ii) rigorous comparison of NLG systems for properly tracking the progress, and (iii) understanding evaluation metrics. To this end, we identify the current state of challenges derived from the inherent characteristics of these perspectives, and then present novel metrics, rigorous comparison approaches, and explainability techniques for metrics to address the identified issues.
We hope that our work on evaluation metrics, system comparison and explainability for metrics inspires more research towards principled NLG evaluation, and contributes to the fair and adequate evaluation and comparison in natural language processing.
Machine Learning approach for TWA detection relying on ensemble data design
Background and objective: T-wave alternans (TWA) is a fluctuation of the ST–T complex of
the surface electrocardiogram (ECG) on an every–other–beat basis. It has been shown to be
clinically helpful for sudden cardiac death stratification, though the lack of a gold standard to
benchmark detection methods limits its application and impairs the development of alternative
techniques. In this work, a novel approach based on machine learning for TWA detection is
proposed. Additionally, a complete experimental setup is presented for TWA detection methods
benchmarking.
Methods: The proposed experimental setup is based on the use of open-source databases to
enable experiment replication and the use of real ECG signals with added TWA episodes. Also,
intra-patient overfitting and class imbalance have been carefully avoided. The Spectral Method
(SM), the Modified Moving Average Method (MMA), and the Time Domain Method (TM) are used
to obtain input features to the Machine Learning (ML) algorithms, namely, K Nearest Neighbor,
Decision Trees, Random Forest, Support Vector Machine and Multi-Layer Perceptron.
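As a hypothetical simplification of the Spectral Method feature (the actual SM operates on aligned ST-T segments over a window of beats, not on a single amplitude series), the alternans power at 0.5 cycles/beat in a per-beat T-wave amplitude series can be compared against the noise power at nearby frequencies to form a K-score, with K > 3 a common detection rule:

```python
import math

def twa_k_score(t_amplitudes):
    """Spectral-Method-style TWA index from a per-beat T-wave amplitude series.

    Power at 0.5 cycles/beat (the alternans frequency, i.e. every other beat)
    is compared with the mean and spread of the power at nearby "noise"
    frequencies to yield a K-score.
    """
    n = len(t_amplitudes)

    def power(f):  # DFT power at frequency f, in cycles per beat
        re = sum(a * math.cos(2 * math.pi * f * k)
                 for k, a in enumerate(t_amplitudes))
        im = sum(a * math.sin(2 * math.pi * f * k)
                 for k, a in enumerate(t_amplitudes))
        return (re * re + im * im) / n

    p_alt = power(0.5)
    noise = [power(f) for f in (0.44, 0.46, 0.48)]
    mu = sum(noise) / len(noise)
    sd = math.sqrt(sum((p - mu) ** 2 for p in noise) / len(noise)) or 1e-12
    return (p_alt - mu) / sd
```

A strictly alternating amplitude series produces a very large K-score, while a flat series does not; scores like this are the kind of scalar feature that can be fed to the ML classifiers listed above.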
Results: No large differences were found in the performance of the different ML algorithms.
Decision Trees showed the best overall performance (accuracy 0.88 ± 0.04, precision 0.89 ± 0.05,
recall 0.90 ± 0.05, F1 score 0.89 ± 0.03). Compared to the SM (accuracy 0.79, precision 0.93,
recall 0.64, F1 score 0.76), there was an improvement in every metric except precision.
Conclusions: In this work, a realistic database to test the presence of TWA using ML algorithms
was assembled. The ML algorithms overall outperformed the SM used as a gold standard. Learning
from data to identify alternans yields a substantial increase in detection at the expense of a small
increase in the false alarm rate.