9 research outputs found
Optimization methods for parameter identifications in settings with only partial knowledge
Thesis (Ph.D.)--University of Washington, 2023. This work summarizes two projects focused on incorporating prior knowledge into machine learning models. In the first project, a universal feature selection method for linear mixed-effect models is developed. Namely, Sparse Relaxed Regularized Regression (SR3) is extended to the case of Linear Mixed-Effect (LME) likelihoods, and we prove that one can minimize such likelihoods with proximal gradient descent. Theoretical underpinnings of the proposed extension are also presented, including consistency results, variational properties, implementability of optimization methods, and convergence results. In particular, convergence analyses are provided for a basic implementation of SR3 for LME and an accelerated hybrid algorithm. Numerical results show the utility and speed of these algorithms on real and simulated datasets. Finally, both algorithms are implemented in an open-source Python package, pysr3. Conveniently, this package offers complete compatibility with scikit-learn, so all pysr3 models can be used in a pipeline with classic modeling blocks such as data pre-processors, randomized grid search, cross-validation, and quality metrics. The second line of work develops a framework for training Reduced-Order Models (ROMs) with Physics-Informed Neural Ordinary Differential Equations (PINODE). In particular, a classic technique of collocation points is adapted to transfer knowledge from a known equation to a model that approximates solutions of that equation. The addition of a physics-informed loss allows for exceptional data supply strategies that improve the performance of ROMs in data-scarce settings, where training high-quality data-driven models is impossible. The resulting ROMs extrapolate forward in time more accurately, perform better for unseen initial conditions, and exhibit less sensitivity to noise.
Finally, I show how such ROMs can be used as strong regularizers in single-pixel imaging (SPI), enabling the reduction of samples-per-frame rate by an order of magnitude relative to other state-of-the-art algorithms.
Physics-informed neural ODE (PINODE): embedding physics into models using collocation points
Building reduced-order models (ROMs) is essential for efficient forecasting and control of complex dynamical systems. Recently, autoencoder-based methods for building such models have gained significant traction, but their demand for data limits their use when the data is scarce and expensive. We propose aiding a model’s training with the knowledge of physics using a collocation-based physics-informed loss term. Our innovation builds on ideas from classical collocation methods of numerical analysis to embed knowledge from a known equation into the latent-space dynamics of a ROM. We show that the addition of our physics-informed loss allows for exceptional data supply strategies that improve the performance of ROMs in data-scarce settings, where training high-quality data-driven models is impossible. Namely, for a problem of modeling a high-dimensional nonlinear PDE, our experiments show 5× performance gains, measured by prediction error, in a low-data regime, 10× performance gains in tasks of high-noise learning, 100× gains in the efficiency of utilizing the latent-space dimension, and 200× gains in tasks of far-out out-of-distribution forecasting relative to purely data-driven models. These improvements pave the way for broader adoption of network-based physics-informed ROMs in compressive sensing and control applications.
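A minimal sketch of what a collocation-based physics loss might look like. It rests on assumptions not stated in the abstract: an encoder φ maps a state x to a latent state z, the latent dynamics are dz/dt = h(z), and the known full-state equation is dx/dt = f(x). Under those assumptions the chain rule suggests penalizing the mismatch between J_φ(x) f(x) and h(φ(x)) at sampled collocation points, which needs no trajectory data. For brevity the encoder and latent dynamics here are linear rather than neural networks, and all names (`physics_loss`, `P`, `B`) are hypothetical.

```python
import numpy as np


def physics_loss(P, B, f, collocation_points):
    """Mean squared mismatch between encoded physics and latent dynamics.

    Assumes a linear encoder z = P x (so its Jacobian is P) and linear
    latent dynamics dz/dt = B z; f is the known right-hand side dx/dt = f(x).
    """
    residuals = []
    for x in collocation_points:
        encoded_velocity = P @ f(x)    # d/dt of the encoded state, via the chain rule
        latent_velocity = B @ (P @ x)  # what the latent model predicts at z = P x
        residuals.append(encoded_velocity - latent_velocity)
    return float(np.mean(np.square(residuals)))


# Known physics (illustrative only): a damped linear oscillator dx/dt = A x.
A = np.array([[0.0, 1.0], [-1.0, -0.1]])


def f(x):
    return A @ x


# Collocation points are sampled from the state space; no simulation is needed.
rng = np.random.default_rng(0)
points = rng.uniform(-1.0, 1.0, size=(64, 2))

P = np.array([[1.0, 1.0], [0.0, 2.0]])    # invertible linear encoder
B_consistent = P @ A @ np.linalg.inv(P)   # latent dynamics that agree with the physics
B_wrong = np.zeros((2, 2))                # latent dynamics that ignore the physics

loss_consistent = physics_loss(P, B_consistent, f, points)
loss_wrong = physics_loss(P, B_wrong, f, points)
```

In this toy setting the loss vanishes (up to rounding) only when the latent dynamics are consistent with the known equation, which is the property a training procedure would exploit when data is scarce.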
A Relaxation Approach to Feature Selection for Linear Mixed Effects Models
Linear Mixed-Effects (LME) models are a fundamental tool for modeling
correlated data, including cohort studies, longitudinal data analysis, and
meta-analysis. Design and analysis of variable selection methods for LMEs is
more difficult than for linear regression because LME models are nonlinear. In
this work we propose a relaxation strategy and optimization methods that enable
a wide range of variable selection methods for LMEs using both convex and
nonconvex regularizers, including $\ell_1$, Adaptive-$\ell_1$, CAD, and
$\ell_0$. The computational framework only requires the proximal operator for
each regularizer to be available, and the implementation is available in an
open source python package pysr3, consistent with the sklearn standard. The
numerical results on simulated data sets indicate that the proposed strategy
improves on the state of the art for both accuracy and compute time. The
variable selection techniques are also validated on a real example using a data
set on bullying victimization.
Comment: 29 pages, 6 figures
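The remark that the framework only needs a proximal operator per regularizer can be illustrated with a minimal sketch. This is not the pysr3 implementation and it does not use an LME likelihood: as a simplifying assumption, it applies plain proximal gradient descent to ordinary least squares with an ℓ1 penalty on synthetic data. The function names `prox_l1` and `proximal_gradient` are hypothetical.

```python
import numpy as np


def prox_l1(v, threshold):
    """Proximal operator of threshold * ||.||_1 (soft-thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - threshold, 0.0)


def proximal_gradient(X, y, lam, prox=prox_l1, n_iter=500):
    """Minimize 0.5 * ||X b - y||^2 + lam * R(b); R enters only through `prox`."""
    # Step size 1/L, where L is the Lipschitz constant of the smooth gradient.
    step = 1.0 / np.linalg.norm(X, 2) ** 2
    b = np.zeros(X.shape[1])
    for _ in range(n_iter):
        grad = X.T @ (X @ b - y)               # gradient of the least-squares term
        b = prox(b - step * grad, step * lam)  # regularizer handled by its prox
    return b


# Illustrative synthetic data: only the first two of five features matter.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
true_b = np.array([2.0, -3.0, 0.0, 0.0, 0.0])
y = X @ true_b + 0.01 * rng.normal(size=200)

b_hat = proximal_gradient(X, y, lam=5.0)
```

Swapping in a different regularizer only means passing a different `prox` function; the loop itself does not change, which is the design point the abstract highlights.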
Global prevalence and burden of depressive and anxiety disorders in 204 countries and territories in 2020 due to the COVID-19 pandemic
Before 2020, mental disorders were leading causes of the global health-related burden, with depressive and anxiety disorders being leading contributors to this burden. The emergence of the COVID-19 pandemic has created an environment where many determinants of poor mental health are exacerbated. The need for up-to-date information on the mental health impacts of COVID-19 in a way that informs health system responses is imperative. In this study, we aimed to quantify the impact of the COVID-19 pandemic on the prevalence and burden of major depressive disorder and anxiety disorders globally in 2020.
We conducted a systematic review of data reporting the prevalence of major depressive disorder and anxiety disorders during the COVID-19 pandemic, published between Jan 1, 2020, and Jan 29, 2021. We used the assembled data in a meta-regression to estimate the change in the prevalence of major depressive disorder and anxiety disorders between pre-pandemic and mid-pandemic (using periods as defined by each study) via COVID-19 impact indicators (human mobility, daily SARS-CoV-2 infection rate, and daily excess mortality rate), by age, sex, and location. Final prevalence estimates and disability weights were used to estimate years lived with disability and disability-adjusted life-years (DALYs) for major depressive disorder and anxiety disorders.
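The final step can be illustrated with the standard prevalence-based burden arithmetic, in which years lived with disability (YLDs) are prevalent cases multiplied by a disability weight, and DALYs are YLDs plus years of life lost (YLLs). This is a simplified sketch: the numbers are made up for illustration and are not results from the study, and the function names are hypothetical.

```python
def years_lived_with_disability(prevalent_cases, disability_weight):
    """Prevalence-based YLDs: cases times a weight between 0 (full health) and 1."""
    return prevalent_cases * disability_weight


def disability_adjusted_life_years(ylds, ylls=0.0):
    """DALYs are the sum of YLDs and years of life lost (YLLs)."""
    return ylds + ylls


# Hypothetical inputs, chosen only to make the arithmetic easy to follow.
ylds = years_lived_with_disability(prevalent_cases=1_000_000, disability_weight=0.25)
dalys = disability_adjusted_life_years(ylds, ylls=0.0)
```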
Estimating global, regional, and national daily and cumulative infections with SARS-CoV-2 through Nov 14, 2021: a statistical analysis
Timely, accurate, and comprehensive estimates of SARS-CoV-2 daily infection rates, cumulative infections, the proportion of the population that has been infected at least once, and the effective reproductive number (R_effective) are essential for understanding the determinants of past infection, current transmission patterns, and a population’s susceptibility to future infection with the same variant. Although several studies have estimated cumulative SARS-CoV-2 infections in select locations at specific points in time, all of these analyses have relied on biased data inputs that were not adequately corrected for. In this study, we aimed to provide a novel approach to estimating past SARS-CoV-2 daily infections, cumulative infections, and the proportion of the population infected, for 190 countries and territories from the start of the pandemic to Nov 14, 2021. This approach combines data from reported cases, reported deaths, excess deaths attributable to COVID-19, hospitalisations, and seroprevalence surveys to produce more robust estimates that minimise constituent biases.
Pandemic preparedness and COVID-19: an exploratory analysis of infection and fatality rates, and contextual factors associated with preparedness in 177 countries, from Jan 1, 2020, to Sept 30, 2021
National rates of COVID-19 infection and fatality have varied dramatically since the onset of the pandemic. Understanding the conditions associated with this cross-country variation is essential to guiding investment in more effective preparedness and response for future pandemics.
Daily SARS-CoV-2 infections and COVID-19 deaths for 177 countries and territories and 181 subnational locations were extracted from the Institute for Health Metrics and Evaluation's modelling database. Cumulative infection rate and infection-fatality ratio (IFR) were estimated and standardised for environmental, demographic, biological, and economic factors. For infections, we included factors associated with environmental seasonality (measured as the relative risk of pneumonia), population density, gross domestic product (GDP) per capita, proportion of the population living below 100 m, and a proxy for previous exposure to other betacoronaviruses. For IFR, factors were age distribution of the population, mean body-mass index (BMI), exposure to air pollution, smoking rates, the proxy for previous exposure to other betacoronaviruses, population density, age-standardised prevalence of chronic obstructive pulmonary disease and cancer, and GDP per capita. These were standardised using indirect age standardisation and multivariate linear models. Standardised national cumulative infection rates and IFRs were tested for associations with 12 pandemic preparedness indices, seven health-care capacity indicators, and ten other demographic, social, and political conditions using linear regression. To investigate pathways by which important factors might affect infections with SARS-CoV-2, we also assessed the relationship between interpersonal and governmental trust and corruption and changes in mobility patterns and COVID-19 vaccination rates.
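As a rough illustration of indirect age standardisation in general (not the study's exact procedure, which the abstract does not spell out), one can compare observed events with the number expected if reference age-specific rates applied to the local age structure. The age bands, rates, and counts below are invented for the example, and the function name is hypothetical.

```python
def indirectly_standardised_ratio(observed_events, population_by_age, reference_rate_by_age):
    """Observed events divided by events expected under reference age-specific rates."""
    expected_events = sum(
        population_by_age[age] * reference_rate_by_age[age]
        for age in population_by_age
    )
    return observed_events / expected_events


# Hypothetical country: population counts and reference fatality rates per age band.
population = {"0-39": 600_000, "40-69": 300_000, "70+": 100_000}
reference_rates = {"0-39": 0.0001, "40-69": 0.002, "70+": 0.03}

# Expected deaths are 60 + 600 + 3000 = 3660, so 4026 observed deaths give about 1.1.
ratio = indirectly_standardised_ratio(4026, population, reference_rates)
```

A ratio above 1 indicates more events than the age structure alone would predict, which is what makes countries with very different age profiles comparable.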
The factors that explained the most variation in cumulative rates of SARS-CoV-2 infection between Jan 1, 2020, and Sept 30, 2021, included the proportion of the population living below 100 m (5·4% [4·0–7·9] of variation), GDP per capita (4·2% [1·8–6·6] of variation), and the proportion of infections attributable to seasonality (2·1% [95% uncertainty interval 1·7–2·7] of variation). Most cross-country variation in cumulative infection rates could not be explained. The factors that explained the most variation in COVID-19 IFR over the same period were the age profile of the country (46·7% [18·4–67·6] of variation), GDP per capita (3·1% [0·3–8·6] of variation), and national mean BMI (1·1% [0·2–2·6] of variation). 44·4% (29·2–61·7) of cross-national variation in IFR could not be explained. Pandemic-preparedness indices, which aim to measure health security capacity, were not meaningfully associated with standardised infection rates or IFRs. Measures of trust in the government and interpersonal trust, as well as less government corruption, had larger, statistically significant associations with lower standardised infection rates. High levels of government and interpersonal trust, as well as less government corruption, were also associated with higher COVID-19 vaccine coverage among middle-income and high-income countries where vaccine availability was more widespread, and lower corruption was associated with greater reductions in mobility. If these modelled associations were to be causal, an increase in trust of governments such that all countries had societies that attained at least the amount of trust in government or interpersonal trust measured in Denmark, which is in the 75th percentile across these spectrums, might have reduced global infections by 12·9% (5·7–17·8) for government trust and 40·3% (24·3–51·4) for interpersonal trust. 
Similarly, if all countries had a national BMI equal to or less than that of the 25th percentile, our analysis suggests global standardised IFR would be reduced by 11·1%.
Efforts to improve pandemic preparedness and response for the next pandemic might benefit from greater investment in risk communication and community engagement strategies to boost the confidence that individuals have in public health guidance. Our results suggest that increasing health promotion for key modifiable risks is associated with a reduction of fatalities in such a scenario.
Estimating excess mortality due to the COVID-19 pandemic: a systematic analysis of COVID-19-related mortality, 2020–21
The full impact of the pandemic has been much greater than what is indicated by reported deaths due
to COVID-19 alone. Strengthening death registration systems around the world, long understood to be crucial to
global public health strategy, is necessary for improved monitoring of this pandemic and future pandemics. In
addition, further research is warranted to help distinguish the proportion of excess mortality that was directly caused
by SARS-CoV-2 infection and the changes in causes of death as an indirect consequence of the pandemic.
Estimated Global Proportions of Individuals With Persistent Fatigue, Cognitive, and Respiratory Symptom Clusters Following Symptomatic COVID-19 in 2020 and 2021
IMPORTANCE Some individuals experience persistent symptoms after initial symptomatic SARS-CoV-2 infection (often referred to as Long COVID).
OBJECTIVE To estimate the proportion of males and females with COVID-19, younger or older than 20 years of age, who had Long COVID symptoms in 2020 and 2021 and their Long COVID symptom duration.
DESIGN, SETTING, AND PARTICIPANTS Bayesian meta-regression and pooling of 54 studies and 2 medical record databases with data for 1.2 million individuals (from 22 countries) who had symptomatic SARS-CoV-2 infection. Of the 54 studies, 44 were published and 10 were collaborating cohorts (conducted in Austria, the Faroe Islands, Germany, Iran, Italy, the Netherlands, Russia, Sweden, Switzerland, and the US). The participant data were derived from the 44 published studies (10 501 hospitalized individuals and 42 891 nonhospitalized individuals), the 10 collaborating cohort studies (10 526 and 1906), and the 2 US electronic medical record databases (250 928 and 846 046). Data collection spanned March 2020 to January 2022.
EXPOSURES Symptomatic SARS-CoV-2 infection.
MAIN OUTCOMES AND MEASURES Proportion of individuals with at least 1 of the 3 self-reported Long COVID symptom clusters (persistent fatigue with bodily pain or mood swings; cognitive problems; or ongoing respiratory problems) 3 months after SARS-CoV-2 infection in 2020 and 2021, estimated separately for hospitalized and nonhospitalized individuals aged 20 years or older by sex and for both sexes of nonhospitalized individuals younger than 20 years of age.
RESULTS A total of 1.2 million individuals who had symptomatic SARS-CoV-2 infection were included (mean age, 4-66 years; males, 26%-88%). In the modeled estimates, 6.2% (95% uncertainty interval [UI], 2.4%-13.3%) of individuals who had symptomatic SARS-CoV-2 infection experienced at least 1 of the 3 Long COVID symptom clusters in 2020 and 2021, including 3.2% (95% UI, 0.6%-10.0%) for persistent fatigue with bodily pain or mood swings, 3.7% (95% UI, 0.9%-9.6%) for ongoing respiratory problems, and 2.2% (95% UI, 0.3%-7.6%) for cognitive problems after adjusting for health status before COVID-19, comprising an estimated 51.0% (95% UI, 16.9%-92.4%), 60.4% (95% UI, 18.9%-89.1%), and 35.4% (95% UI, 9.4%-75.1%), respectively, of Long COVID cases. The Long COVID symptom clusters were more common in women aged 20 years or older (10.6% [95% UI, 4.3%-22.2%]) 3 months after symptomatic SARS-CoV-2 infection than in men aged 20 years or older (5.4% [95% UI, 2.2%-11.7%]). Both sexes younger than 20 years of age were estimated to be affected in 2.8% (95% UI, 0.9%-7.0%) of symptomatic SARS-CoV-2 infections. The estimated mean Long COVID symptom cluster duration was 9.0 months (95% UI, 7.0-12.0 months) among hospitalized individuals and 4.0 months (95% UI, 3.6-4.6 months) among nonhospitalized individuals. Among individuals with Long COVID symptoms 3 months after symptomatic SARS-CoV-2 infection, an estimated 15.1% (95% UI, 10.3%-21.1%) continued to experience symptoms at 12 months.
CONCLUSIONS AND RELEVANCE This study presents modeled estimates of the proportion of individuals with at least 1 of 3 self-reported Long COVID symptom clusters (persistent fatigue with bodily pain or mood swings; cognitive problems; or ongoing respiratory problems) 3 months after symptomatic SARS-CoV-2 infection.