    Comparison of First-Line Dual Combination Treatments in Hypertension: Real-World Evidence from Multinational Heterogeneous Cohorts

    BACKGROUND AND OBJECTIVES: The 2018 ESC/ESH hypertension guideline recommends a 2-drug combination as initial anti-hypertensive therapy. However, real-world evidence for the effectiveness of the recommended regimens remains limited. We aimed to compare the effectiveness of first-line anti-hypertensive treatment combining 2 of the following classes: angiotensin-converting enzyme (ACE) inhibitors/angiotensin-receptor blockers (A), calcium channel blockers (C), and thiazide-type diuretics (D). METHODS: Treatment-naïve hypertensive adults without cardiovascular disease (CVD) who initiated dual anti-hypertensive medications were identified in 5 databases from the US and Korea. Patients were matched for each comparison set by large-scale propensity score matching. The primary endpoint was all-cause mortality; myocardial infarction, heart failure, stroke, and major adverse cardiac and cerebrovascular events as a composite outcome were the secondary measures. RESULTS: A total of 987,983 patients met the eligibility criteria. After matching, 222,686, 32,344, and 38,513 patients were allocated to the A+C vs. A+D, C+D vs. A+C, and C+D vs. A+D comparisons, respectively. There was no significant difference in mortality over a total of 1,806,077 person-years: A+C vs. A+D (hazard ratio [HR], 1.08; 95% confidence interval [CI], 0.97-1.20; p=0.127), C+D vs. A+C (HR, 0.93; 95% CI, 0.87-1.01; p=0.067), and C+D vs. A+D (HR, 1.18; 95% CI, 0.95-1.47; p=0.104). A+C was associated with a slightly higher risk of heart failure (HR, 1.09; 95% CI, 1.01-1.18; p=0.040) and stroke (HR, 1.08; 95% CI, 1.01-1.17; p=0.040) than A+D. CONCLUSIONS: There was no significant difference in mortality among the A+C, A+D, and C+D combination treatments in patients without previous CVD. This finding was consistent across multinational heterogeneous cohorts in real-world practice.
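    The study design, stripped to its skeleton — fit a propensity model, match on the score, then estimate the hazard ratio with a Cox model — might be sketched as below. This is a minimal illustration, not the paper's large-scale propensity score procedure: it assumes a pandas DataFrame with treatment/time/event columns plus baseline covariates, and substitutes naive 1:1 nearest-neighbour matching with replacement.

        import numpy as np
        import pandas as pd
        from lifelines import CoxPHFitter
        from sklearn.linear_model import LogisticRegression
        from sklearn.neighbors import NearestNeighbors

        def matched_hazard_ratio(df, covariates):
            # 1. Propensity model: P(treatment = 1 | baseline covariates).
            ps = LogisticRegression(max_iter=1000).fit(df[covariates], df["treatment"])
            df = df.assign(ps=ps.predict_proba(df[covariates])[:, 1])
            treated = df[df["treatment"] == 1]
            control = df[df["treatment"] == 0]
            # 2. Naive 1:1 nearest-neighbour matching on the score (with replacement).
            nn = NearestNeighbors(n_neighbors=1).fit(control[["ps"]])
            _, idx = nn.kneighbors(treated[["ps"]])
            matched = pd.concat([treated, control.iloc[idx.ravel()]])
            # 3. Cox model on the matched cohort; exp(coefficient) is the HR.
            cph = CoxPHFitter().fit(matched[["time", "event", "treatment"]],
                                    duration_col="time", event_col="event")
            return np.exp(cph.params_["treatment"])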

    Scalable Sparse Cox's Regression for Large-Scale Survival Data via Broken Adaptive Ridge

    This paper develops a new scalable sparse Cox regression tool for sparse high-dimensional massive sample size (sHDMSS) survival data. The method is a local L0-penalized Cox regression obtained by repeatedly performing reweighted L2-penalized Cox regression. We show that the resulting estimator enjoys the best of L0- and L2-penalized Cox regression while overcoming their limitations. Specifically, the estimator is selection consistent, oracle for parameter estimation, and possesses a grouping property for highly correlated covariates. Simulation results suggest that when the sample size is large, the proposed method with pre-specified tuning parameters performs comparably to or better than some popular penalized regression methods. More importantly, because the method naturally enables adaptation of efficient algorithms for massive L2-penalized optimization and does not require costly data-driven tuning parameter selection, it has a significant computational advantage for sHDMSS data, offering an average 5-fold speedup over its closest competitor in empirical studies.
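    The core iteration — approximating an L0 penalty by repeatedly solving reweighted L2 (ridge) Cox fits — can be sketched as follows. This is a minimal illustration, not the paper's algorithm: it assumes lifelines as the inner ridge solver and a pandas DataFrame with time/event columns, and it exploits the fact that a ridge penalty with per-coefficient weight w_j = 1/beta_j^2 equals a plain ridge fit after rescaling column j by |beta_j|. The iteration count, floor eps, and zero threshold are assumptions.

        import numpy as np
        from lifelines import CoxPHFitter

        def bar_cox(df, covariates, lam=1.0, n_iter=20, eps=1e-8):
            beta = np.ones(len(covariates))
            for _ in range(n_iter):
                scale = np.maximum(np.abs(beta), eps)        # 1/sqrt(w_j) = |beta_j|
                scaled = df.copy()
                scaled[covariates] = scaled[covariates] * scale
                cph = CoxPHFitter(penalizer=lam, l1_ratio=0.0)   # pure L2 penalty
                cph.fit(scaled[covariates + ["time", "event"]],
                        duration_col="time", event_col="event")
                beta = cph.params_[covariates].values * scale    # back-transform
            beta[np.abs(beta) < 1e-6] = 0.0                      # effectively L0-sparse
            return beta

    Because each pass is just an L2-penalized Cox fit, any fast ridge solver scales the whole procedure, which is the source of the computational advantage the abstract describes.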

    Risk of hydroxychloroquine alone and in combination with azithromycin in the treatment of rheumatoid arthritis: a multinational, retrospective study

    Background: Hydroxychloroquine, a drug commonly used in the treatment of rheumatoid arthritis, has received much negative publicity for adverse events associated with its authorisation for emergency use to treat patients with COVID-19 pneumonia. We studied the safety of hydroxychloroquine, alone and in combination with azithromycin, to determine the risk associated with its use in routine care in patients with rheumatoid arthritis. Methods: In this multinational, retrospective study, new user cohort studies in patients with rheumatoid arthritis aged 18 years or older and initiating hydroxychloroquine were compared with those initiating sulfasalazine and followed up over 30 days, with 16 severe adverse events studied. Self-controlled case series were done to further establish safety in wider populations, and included all users of hydroxychloroquine regardless of rheumatoid arthritis status or indication. Separately, severe adverse events associated with hydroxychloroquine plus azithromycin (compared with hydroxychloroquine plus amoxicillin) were studied. Data comprised 14 sources of claims data or electronic medical records from Germany, Japan, the Netherlands, Spain, the UK, and the USA. Propensity score stratification and calibration using negative control outcomes were used to address confounding. Cox models were fitted to estimate calibrated hazard ratios (HRs) according to drug use. Estimates were pooled where the I² value was less than 0·4. Findings: The study included 956 374 users of hydroxychloroquine, 310 350 users of sulfasalazine, 323 122 users of hydroxychloroquine plus azithromycin, and 351 956 users of hydroxychloroquine plus amoxicillin. No excess risk of severe adverse events was identified when 30-day hydroxychloroquine and sulfasalazine use were compared. Self-controlled case series confirmed these findings. However, long-term use of hydroxychloroquine appeared to be associated with increased cardiovascular mortality (calibrated HR 1·65 [95% CI 1·12–2·44]). Addition of azithromycin appeared to be associated with an increased risk of 30-day cardiovascular mortality (calibrated HR 2·19 [95% CI 1·22–3·95]), chest pain or angina (1·15 [1·05–1·26]), and heart failure.
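    The pooling rule (report a combined estimate only where I² < 0·4) amounts to inverse-variance fixed-effect meta-analysis of the database-specific log HRs. A minimal sketch, assuming per-database HRs with 95% CIs as inputs; the function name and the CI-based standard-error recovery are illustrative, and the paper's negative-control calibration step is not reproduced here.

        import numpy as np

        def pool_hazard_ratios(hrs, ci_lo, ci_hi, i2_threshold=0.4):
            log_hr = np.log(hrs)
            se = (np.log(ci_hi) - np.log(ci_lo)) / (2 * 1.96)   # SE recovered from 95% CI
            w = 1.0 / se ** 2                                   # inverse-variance weights
            pooled = np.sum(w * log_hr) / np.sum(w)
            q = np.sum(w * (log_hr - pooled) ** 2)              # Cochran's Q
            i2 = max(0.0, 1.0 - (len(hrs) - 1) / q) if q > 0 else 0.0
            if i2 >= i2_threshold:                              # too heterogeneous: do not pool
                return None, i2
            half = 1.96 * np.sqrt(1.0 / np.sum(w))
            ci = (np.exp(pooled - half), np.exp(pooled + half))
            return (np.exp(pooled), ci), i2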

    A model not a prophet: Operationalising patient-level prediction using observational data networks

    Improving prediction model development and evaluation processes using observational health data.

    High-Performance Statistical Computing in the Computing Environments of the 2020s

    Technological advances in the past decade, hardware and software alike, have made access to high-performance computing (HPC) easier than ever. We review these advances from a statistical computing perspective. Cloud computing makes access to supercomputers affordable. Deep learning software libraries make programming statistical algorithms easy and enable users to write code once and run it anywhere -- from a laptop to a workstation with multiple graphics processing units (GPUs) or a supercomputer in a cloud. Highlighting how these developments benefit statisticians, we review recent optimization algorithms that are useful for high-dimensional models and can harness the power of HPC. Code snippets are provided to demonstrate the ease of programming. We also provide an easy-to-use distributed matrix data structure suitable for HPC. Employing this data structure, we illustrate various statistical applications including large-scale positron emission tomography and ℓ1-regularized Cox regression. Our examples easily scale up to an 8-GPU workstation and a 720-CPU-core cluster in a cloud. As a case in point, we analyze the onset of type-2 diabetes from the UK Biobank with 200,000 subjects and about 500,000 single nucleotide polymorphisms using the HPC ℓ1-regularized Cox regression. Fitting this half-million-variate model takes less than 45 minutes and reconfirms known associations. To our knowledge, this is the first demonstration of the feasibility of penalized regression of survival outcomes at this scale. (Accepted for publication in Statistical Science.)
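    A flavor of the write-once-run-anywhere claim: a proximal-gradient step for ℓ1-regularized Cox regression written in PyTorch runs unchanged on CPU or GPU by swapping the device string. This is a minimal single-device sketch, not the paper's distributed implementation; it assumes Breslow handling of ties, rows pre-sorted by descending follow-up time, and toy sizes, step size, and penalty.

        import torch

        def neg_cox_loglik(beta, X, event):
            # Rows sorted by descending time, so each risk set is a prefix
            # and risk-set sums become cumulative sums.
            eta = X @ beta
            log_risk = torch.logcumsumexp(eta, dim=0)
            return -(event * (eta - log_risk)).sum()

        def l1_cox_step(beta, X, event, lam=0.1, step=1e-3):
            loss = neg_cox_loglik(beta, X, event)
            (grad,) = torch.autograd.grad(loss, beta)
            z = beta - step * grad                                # gradient step
            return torch.sign(z) * torch.clamp(z.abs() - step * lam, min=0.0)  # soft-threshold

        device = "cuda" if torch.cuda.is_available() else "cpu"   # the only line that changes
        X = torch.randn(1000, 50, device=device)
        event = (torch.rand(1000, device=device) < 0.3).float()
        beta = torch.zeros(50, device=device, requires_grad=True)
        for _ in range(200):
            beta = l1_cox_step(beta, X, event).detach().requires_grad_(True)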

    Easily Parallelizable Statistical Computing Methods and Their Application to Modern High-Performance Computing Environments

    Ph.D. dissertation, Department of Statistics, College of Natural Sciences, Seoul National University Graduate School, August 2020. Advisor: Joong-Ho Won.
    Technological advances in the past decade, hardware and software alike, have made access to high-performance computing (HPC) easier than ever. In this dissertation, easily parallelizable, inversion-free, and variable-separated algorithms and their implementation in statistical computing are discussed. The first part considers statistical estimation problems under structured sparsity posed as minimization of a sum of two or three convex functions, one of which is a composition of non-smooth and linear functions. Examples include the graph-guided sparse fused lasso and the overlapping group lasso. Two classes of inversion-free primal-dual algorithms are considered and unified from the perspective of monotone operator theory. From this unification, a continuum of preconditioned forward-backward operator splitting algorithms amenable to parallel and distributed computing is proposed. The unification is further exploited to introduce a continuum of accelerated algorithms that attain the theoretically optimal asymptotic rate of convergence. For the second part, easy-to-use distributed matrix data structures in PyTorch and Julia are presented. They enable users to write code once and run it anywhere, from a laptop to a workstation with multiple graphics processing units (GPUs) or a supercomputer in a cloud. With these data structures, various parallelizable statistical applications, including nonnegative matrix factorization, positron emission tomography, multidimensional scaling, and ℓ1-regularized Cox regression, are demonstrated. The examples scale up to an 8-GPU workstation and a 720-CPU-core cluster in a cloud. As a case in point, the onset of type-2 diabetes in the UK Biobank, with 400,000 subjects and about 500,000 single nucleotide polymorphisms, is analyzed using the HPC ℓ1-regularized Cox regression. Fitting the half-million-variate model took about 50 minutes, reconfirming known associations. To my knowledge, this is the first demonstration of the feasibility of a joint genome-wide association analysis of survival outcomes at this scale.
    Contents: Chapter 1, Prologue (accessible high-performance computing systems; highly parallelizable algorithms: MM, proximal gradient descent, proximal distance, primal-dual methods). Chapter 2, Easily Parallelizable and Distributable Class of Algorithms for Structured Sparsity, with Optimal Acceleration (unification of Algorithms LV and CV; convergence analysis; optimal and stochastic optimal acceleration; numerical experiments). Chapter 3, Towards Unified Programming for High-Performance Statistical Computing Environments (related software; deep learning libraries and HPC; PyTorch and Julia; distributed matrix data structures distmat and MPIArray; examples: nonnegative matrix factorization, positron emission tomography, multidimensional scaling, L1-regularized Cox regression, genome-wide survival analysis of the UK Biobank dataset). Chapter 4, Conclusion. Appendices: Monotone Operator Theory; Proofs for Chapter 2; AWS EC2 and ParallelCluster; Code for the memory-efficient L1-regularized Cox proportional hazards model; Details of SNPs selected in L1-regularized Cox regression. Bibliography; abstract in Korean.
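    The dissertation's central iteration — inversion-free primal-dual forward-backward splitting for structured sparsity — can be illustrated on a toy graph-guided fused lasso, min_x 0.5*||x - y||^2 + lam*||D x||_1. A minimal unpreconditioned, unaccelerated sketch (the chain-graph difference matrix, step sizes, and function name fused_lasso_pd are illustrative assumptions); note that no matrix inversion appears, only multiplications by D and D.T, which is what makes the scheme parallelizable.

        import numpy as np

        def fused_lasso_pd(y, D, lam=1.0, tau=0.25, sigma=0.25, n_iter=500):
            # Primal-dual iteration; requires tau * sigma * ||D||^2 <= 1.
            x, x_bar = y.copy(), y.copy()
            u = np.zeros(D.shape[0])                        # dual variable for D x
            for _ in range(n_iter):
                # Dual step: projection onto the l-infinity ball of radius lam.
                u = np.clip(u + sigma * (D @ x_bar), -lam, lam)
                # Primal step: closed-form prox of the quadratic data term.
                x_new = (x - tau * (D.T @ u) + tau * y) / (1 + tau)
                x_bar = 2 * x_new - x                       # extrapolation
                x = x_new
            return x

        n = 100
        D = np.eye(n - 1, n, k=1) - np.eye(n - 1, n)        # chain-graph differences
        y = np.concatenate([np.zeros(50), np.ones(50)]) + 0.1 * np.random.randn(n)
        x_hat = fused_lasso_pd(y, D, lam=0.5)               # piecewise-constant fit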