Asymptotic Characterisation of Robust Empirical Risk Minimisation Performance in the Presence of Outliers
We study robust linear regression in high-dimension, when both the dimension $d$
and the number of data points $n$ diverge with a fixed ratio $\alpha = n/d$,
and study a data model that includes outliers. We provide exact asymptotics for
the performances of the empirical risk minimisation (ERM) using
$\ell_2$-regularised $\ell_2$, $\ell_1$, and Huber losses, which are the standard
approach to such problems. We focus on two metrics for the performance: the
generalisation error to similar datasets with outliers, and the estimation
error of the original, unpolluted function. Our results are compared with the
information theoretic Bayes-optimal estimation bound. For the generalisation
error, we find that optimally-regularised ERM is asymptotically consistent in
the large sample complexity limit if one performs a simple calibration, and
compute the rates of convergence. For the estimation error however, we show
that due to a norm calibration mismatch, the consistency of the estimator
requires an oracle estimate of the optimal norm, or the presence of a
cross-validation set not corrupted by the outliers. We examine in detail how
performance depends on the loss function and on the degree of outlier
corruption in the training set and identify a region of parameters where the
optimal performance of the Huber loss is identical to that of the $\ell_2$
loss, offering insights into the use cases of different loss functions.
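As a concrete illustration of the estimator discussed in the abstract above, the following is a minimal finite-sample sketch of ERM with a Huber loss and a ridge penalty, fitted by plain gradient descent. It is not the authors' asymptotic analysis; the function names `huber_grad` and `fit_huber_ridge`, the parameters `lam` and `delta`, and the choice of optimiser are all assumptions made here for illustration.

```python
import numpy as np


def huber_grad(residual, delta):
    """Derivative of the Huber loss with respect to the residual.

    The Huber loss is quadratic for |r| <= delta and linear beyond it,
    so its derivative is the residual clipped to [-delta, delta].
    """
    return np.clip(residual, -delta, delta)


def fit_huber_ridge(X, y, lam=0.1, delta=1.0, n_iter=2000):
    """Minimise (1/n) * sum_i huber(y_i - x_i . w) + (lam / 2) * ||w||^2.

    Hypothetical helper: plain gradient descent with a fixed step size
    derived from an upper bound on the smoothness of the objective.
    """
    n, d = X.shape
    w = np.zeros(d)
    # The Huber loss has curvature at most 1, so the objective is
    # L-smooth with L <= ||X||_2^2 / n + lam.
    smoothness = np.linalg.norm(X, 2) ** 2 / n + lam
    step = 1.0 / smoothness
    for _ in range(n_iter):
        residual = y - X @ w
        grad = -(X.T @ huber_grad(residual, delta)) / n + lam * w
        w = w - step * grad
    return w
```

Large residuals contribute a bounded gradient, which is what limits the influence of outliers compared with a purely quadratic loss; setting `delta` very large recovers ordinary ridge regression.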
Thermodynamics and phonon dispersion of pyrope and grossular silicate garnets from ab initio simulations
Imaging-based representation and stratification of intra-tumor heterogeneity via tree-edit distance
Personalized medicine is the future of medical practice. In oncology, tumor heterogeneity assessment represents a pivotal step for effective treatment planning and prognosis prediction. Despite new procedures for DNA sequencing and analysis, non-invasive methods for tumor characterization are needed to have an impact on daily routine. To this end, imaging texture analysis is growing rapidly, holding the promise of serving as a surrogate for histopathological assessment of tumor lesions. In this work, we propose a tree-based representation strategy for describing intra-tumor heterogeneity of patients affected by metastatic cancer. We leverage radiomics information extracted from PET/CT imaging and we provide an exhaustive and easily readable summary of the disease spreading. We exploit this novel patient representation to perform cancer subtyping by hierarchical clustering. To this purpose, a new heterogeneity-based distance between trees is defined and applied to a case study of prostate cancer. Cluster interpretation is explored in terms of concordance with severity status, tumor burden and biological characteristics. Results are promising, as the proposed method outperforms current literature approaches. Ultimately, the proposed method outlines a general analysis framework that would allow knowledge to be extracted from routinely acquired imaging data of patients and provide insights for effective treatment planning.
Plasma Dynamics
Contains table of contents for Section 2 and reports on three research projects. U.S. Navy - Office of Naval Research Grant N00014-90-J-4130; Princeton University/Tokamak Physics Experiment Grant S-03688-G; U.S. Department of Energy Grant DE-FG02-91-ER-54109; National Science Foundation Grant ATM 94-2428.
Impact of COVID-19 on cardiovascular testing in the United States versus the rest of the world
Objectives: This study sought to quantify and compare the decline in volumes of cardiovascular procedures between U.S. and non-U.S. institutions during the early phase of the coronavirus disease-2019 (COVID-19) pandemic.
Background: The COVID-19 pandemic has disrupted the care of many non-COVID-19 illnesses. Reductions in diagnostic cardiovascular testing around the world have led to concerns over the implications of reduced testing for cardiovascular disease (CVD) morbidity and mortality.
Methods: Data were submitted to the INCAPS-COVID (International Atomic Energy Agency Non-Invasive Cardiology Protocols Study of COVID-19), a multinational registry comprising 909 institutions in 108 countries (including 155 facilities in 40 U.S. states), assessing the impact of the COVID-19 pandemic on volumes of diagnostic cardiovascular procedures. Data were obtained for April 2020 and compared with volumes of baseline procedures from March 2019. We compared laboratory characteristics, practices, and procedure volumes between U.S. and non-U.S. facilities and between U.S. geographic regions and identified factors associated with volume reduction in the United States.
Results: Reductions in the volumes of procedures in the United States were similar to those in non-U.S. facilities (68% vs. 63%, respectively; p = 0.237), although U.S. facilities reported greater reductions in invasive coronary angiography (69% vs. 53%, respectively; p < 0.001). Significantly more U.S. facilities than non-U.S. facilities reported increased use of telehealth and of patient screening measures such as temperature checks, symptom screenings, and COVID-19 testing. Reductions in volumes of procedures differed between U.S. regions, with larger declines observed in the Northeast (76%) and Midwest (74%) than in the South (62%) and West (44%). Prevalence of COVID-19, staff redeployments, outpatient centers, and urban centers were associated with greater reductions in volume in U.S. facilities in a multivariable analysis.
Conclusions: We observed marked reductions in U.S. cardiovascular testing in the early phase of the pandemic and significant variability between U.S. regions. The association between reductions of volumes and COVID-19 prevalence in the United States highlighted the need for proactive efforts to maintain access to cardiovascular testing in areas most affected by outbreaks of COVID-19 infection.
The Dyck bound in the concave 1-dimensional random assignment model
We consider models of assignment for random $N$ blue points and $N$ red points on an interval of length $2N$, in which the cost for connecting a blue point in $x$ to a red point in $y$ is the concave function $|x - y|^p$, for $0 < p < 1$. In contrast to the convex case $p > 1$, where the optimal matching is trivially determined, here the optimization is non-trivial. The purpose of this paper is to introduce a special configuration, that we call the Dyck matching, and to study its statistical properties. We compute exactly the average cost, in the asymptotic limit of large $N$, together with the first subleading correction. The scaling is remarkable: it is of order $N$ for $p < \frac{1}{2}$, order $N \ln N$ for $p = \frac{1}{2}$, and $N^{\frac{1}{2}+p}$ for $p > \frac{1}{2}$, and it is universal for a wide class of models. We conjecture that the average cost of the Dyck matching has the same scaling in $N$ as the cost of the optimal matching, and we produce numerical data in support of this conjecture. We hope to produce a proof of this claim in future work.
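The abstract above does not spell out how the Dyck matching is built. The sketch below assumes it is the parenthesis-style pairing obtained by scanning the points in sorted order and matching each point to the most recent unmatched point of the opposite colour. That construction, and the helper names `dyck_matching` and `matching_cost`, are assumptions made here for illustration rather than the paper's own definition.

```python
def dyck_matching(blue, red):
    """Pair blue and red points like balanced parentheses.

    Assumed construction: scan all points left to right, keeping a stack
    of unmatched points. A point of the opposite colour to the top of
    the stack closes that pair; otherwise the point is pushed.
    """
    points = sorted([(x, "B") for x in blue] + [(x, "R") for x in red])
    stack = []   # unmatched points; always of a single colour
    pairs = []
    for x, colour in points:
        if stack and stack[-1][1] != colour:
            y, _ = stack.pop()
            pairs.append((y, x))
        else:
            stack.append((x, colour))
    return pairs


def matching_cost(pairs, p):
    """Total cost sum |x - y|^p over the matched pairs."""
    return sum(abs(x - y) ** p for x, y in pairs)
```

With equally many blue and red points the stack is empty at the end of the scan, so every point is matched, and the resulting pairs are nested or disjoint but never crossing.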