Search CORE

3,373 research outputs found

Controlling the Precision-Recall Tradeoff in Differential Dependency Network Analysis

Author: Clark Vincent P.
Niculescu-Mizil Alexandru
Ostroff Rachel
Oyen Diane
Stewart Alex
Publication date: 09/07/2013

Graphical models have gained considerable attention recently as a tool for learning and representing dependencies among variables in multivariate data. Often, domain scientists are looking specifically for differences among the dependency networks of different conditions or populations (e.g., differences between regulatory networks of different species, or differences between dependency networks of diseased versus healthy populations). The standard method for finding these differences is to learn the dependency networks for each condition independently and compare them. We show that this approach is prone to high false discovery rates (low precision) that can render the analysis useless. We then show that by imposing a bias towards learning similar dependency networks for each condition, the false discovery rates can be reduced to acceptable levels, at the cost of finding a reduced number of differences. Algorithms developed in the transfer learning literature can be used to vary the strength of the imposed similarity bias and provide a natural mechanism to smoothly adjust this differential precision-recall tradeoff to suit the requirements of the analysis conducted. We present real case studies (oncological and neurological) where domain experts use the proposed technique to extract useful differential networks that shed light on the biological processes involved in cancer and brain function.
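To make the precision and recall of a differential analysis concrete, here is a minimal sketch of the "learn independently, then compare" baseline described in the abstract. It is not the authors' code: the function names and the toy networks are hypothetical, and edges are assumed to be undirected.

```python
def edge_set(edges):
    """Treat edges as undirected: (a, b) and (b, a) are the same edge."""
    return {frozenset(edge) for edge in edges}


def differential_edges(net_a, net_b):
    """Edges present in exactly one of the two networks."""
    return edge_set(net_a) ^ edge_set(net_b)


def precision_recall(predicted, actual):
    """Precision and recall of predicted differences against the true ones."""
    true_positives = len(predicted & actual)
    precision = true_positives / len(predicted) if predicted else 1.0
    recall = true_positives / len(actual) if actual else 1.0
    return precision, recall


# Toy ground truth (hypothetical): the two conditions differ in two edges.
true_a = [("x", "y"), ("y", "z")]
true_b = [("x", "y"), ("z", "w")]

# Networks learned independently, each containing one spurious edge.
learned_a = [("x", "y"), ("y", "z"), ("x", "w")]
learned_b = [("y", "x"), ("z", "w"), ("x", "z")]

precision, recall = precision_recall(
    differential_edges(learned_a, learned_b),
    differential_edges(true_a, true_b),
)
print(precision, recall)
```

In this toy case both true differences are recovered (recall 1.0), but each spurious edge also shows up as a difference, so precision drops to 0.5. This is the kind of false discovery the abstract attributes to independent learning.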

arXiv.org e-Print Archive

CiteSeerX

Mortality modelling and forecasting: a review of methods

Author: Booth Heather
Tickle Leonie
Publication venue: 'Cambridge University Press (CUP)'
Publication date: 09/12/2015

The Australian National University


Computational framework for longevity risk management

Author: A Renshaw
AE Renshaw
AM Alonso
B Efron
E Choi
E Paparoditis
FC Palm
FT Denton
Gabriella Piscopo
H Booth
J Kreiss
JP Kreiss
K Dowd
M Denuit
Maria Russolillo
MC Koissi
N Brouhns
P Bühlmann
P Hatzopoulos
RD Lee
RJ Hyndman
SN Lahiri
Steven Haberman
T Amemiya
V D’Amato
Valeria D’Amato
Y Chang
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2014

Longevity risk threatens the financial stability of private and government sponsored defined benefit pension systems as well as social security schemes, in an environment already characterized by persistent low interest rates and heightened financial uncertainty. The mortality experience of countries in the industrialized world would suggest a substantial age-time interaction, with the two dominant trends affecting different age groups at different times. From a statistical point of view, this indicates a dependence structure. It is observed that mortality improvements are similar for individuals of contiguous ages (Wills and Sherris, Integrating financial and demographic longevity risk models: an Australian model for financial applications, Discussion Paper PI-0817, 2008). Moreover, considering the dataset by single ages, the correlations between the residuals for adjacent age groups tend to be high (as noted in Denton et al., J Population Econ 18:203-227, 2005). This suggests that there is value in exploring the dependence structure across time as well, in other words the inter-period correlation. In this research, we focus on the projections of mortality rates, contravening the most commonly encountered dependence property, which is the "lack of dependence" (Denuit et al., Actuarial theory for dependent risks: measures, orders and models, Wiley, New York, 2005). By taking into account the presence of dependence across age and time, which leads to systematic over-estimation or under-estimation of uncertainty in the estimates (Liu and Braun, J Probability Stat, 813583:15, 2010), the paper analyzes a tailor-made bootstrap methodology for capturing the spatial dependence in deriving confidence intervals for mortality projection rates. We propose a method which leads to a prudent measure of longevity risk, avoiding the structural incompleteness of the ordinary simulation bootstrap methodology, which involves the assumption of independence.
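The contrast drawn above, between an ordinary bootstrap that assumes independence and a resampling scheme that respects dependence, can be illustrated with a generic moving-block bootstrap. This is a swapped-in textbook technique, not the paper's tailor-made procedure, which the abstract does not specify. The function name, block length, and toy residual series are all assumptions.

```python
import random


def moving_block_bootstrap(series, block_len, rng):
    """Resample a series by concatenating randomly chosen contiguous blocks.

    Keeping blocks intact preserves short-range dependence between
    neighbouring observations, which resampling single points would destroy.
    """
    n = len(series)
    starts = range(n - block_len + 1)  # every valid block start index
    resampled = []
    while len(resampled) < n:
        start = rng.choice(starts)
        resampled.extend(series[start:start + block_len])
    return resampled[:n]  # trim the final block to the original length


# Hypothetical residual series; consecutive integers make the blocks visible.
residuals = list(range(10))
sample = moving_block_bootstrap(residuals, block_len=3, rng=random.Random(0))
print(sample)
```

Repeating the resampling many times and recomputing the statistic of interest on each replicate would give a bootstrap distribution. The block length controls how much of the dependence is retained.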

City Research Online

Crossref

Archivio della ricerca - Università degli studi di Napoli Federico II

Archivio della Ricerca - Università di Salerno

Archivio istituzionale della ricerca - Università di Genova

Machine learning approaches to optimise the management of patients with sepsis

Author: Komorowski Matthieu
Publication venue: Surgery and Cancer, Imperial College London
Publication date: 01/03/2019

The goal of this PhD was to generate novel tools to improve the management of patients with sepsis, by applying machine learning techniques on routinely collected electronic health records. Machine learning is an application of artificial intelligence (AI), where a machine analyses data and becomes able to execute complex tasks without being explicitly programmed. Sepsis is the third leading cause of death worldwide and the main cause of mortality in hospitals, but the best treatment strategy remains uncertain. In particular, evidence suggests that current practices in the administration of intravenous fluids and vasopressors are suboptimal and likely induce harm in a proportion of patients. This represents a key clinical challenge and a top research priority. The main contribution of the research has been the development of a reinforcement learning framework and algorithms, in order to tackle this sequential decision-making problem. The model was built and then validated on three large non-overlapping intensive care databases, containing data collected from adult patients in the U.S.A and the U.K. Our agent extracted implicit knowledge from an amount of patient data that exceeds many-fold the life-time experience of human clinicians and learned optimal treatment by having analysed myriads of (mostly sub-optimal) treatment decisions. We used state-of-the-art evaluation techniques (called high confidence off-policy evaluation) and demonstrated that the value of the treatment strategy of the AI agent was on average reliably higher than the human clinicians. In two large validation cohorts independent from the training data, mortality was the lowest in patients where clinicians’ actual doses matched the AI policy. We also gained insight into the model representations and confirmed that the AI agent relied on clinically and biologically meaningful parameters when making its suggestions. 
We conducted extensive testing and exploration of the behaviour of the AI agent down to the level of individual patient trajectories, identified potential sources of inappropriate behaviour and offered suggestions for future model refinements. If validated, our model could provide individualized and clinically interpretable treatment decisions for sepsis that may improve patient outcomes.

Spiral - Imperial College Digital Repository

Pathophysiological characterization of traumatic brain injury using novel analytical methods

Author: Åkerlund Cecilia
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 26/05/2023

Severity of traumatic brain injury is usually classified by the Glasgow Coma Scale (GCS) as “mild”, “moderate” or “severe”, which does not capture the heterogeneity of the disease. According to current guidelines, intracranial pressure (ICP) should not exceed 22 mmHg, with no further recommendations concerning individualization or tolerable duration of intracranial hypertension. The aims of this thesis were to identify subgroups of patients beyond characterization using GCS, and to investigate the impact of duration and magnitude of intracranial hypertension on outcome, using data from the observational prospective study Collaborative European NeuroTrauma Effectiveness Research in TBI (CENTER-TBI). To investigate the temporal aspect of tolerable ICP elevations, we examined the correlation between dose of ICP and outcome represented by the 6-month Glasgow Outcome Scale Extended (GOSE). ICP dose was represented both by the number of events above thresholds for ICP magnitude and duration and by area under the ICP curve (i.e., “pressure time dose” (PTD)). A variation in tolerable ICP thresholds of 18 mmHg +/- 4 mmHg (2 standard deviations (SD)) for events with duration longer than five minutes was identified using a bootstrapping technique. PTD was correlated to both mortality and unfavorable outcome. A cerebrovascular autoregulation (CA) dependent ICP tolerability was identified. If CA was impaired, no tolerable ICP magnitude and duration thresholds were identified, while if CA was intact, both 19 mmHg for 5 minutes or longer and 15 mmHg for 50 minutes or longer were correlated to worse outcome. While no significant difference in PTD was seen between favorable and unfavorable outcome if CA was intact, there was a significant difference if CA was impaired. In a multivariable analysis, PTD did not remain a significant predictor of outcome when adjusting for other known predictors in TBI. 
In a causal inference analysis, both cerebrovascular autoregulation status and ICP-lowering therapies represented by the therapy intensity level (TIL) had a directional relationship with outcome. However, no direct causal relationship of ICP towards outcome was found. By applying an unsupervised clustering method, we identified six distinct admission clusters defined by GCS, lactate, oxygen saturation (SpO2), creatinine, glucose, base excess, pH, PaCO2, and body temperature. These clusters can be summarized in clinical presentation and metabolic profile. When clustering longitudinal features during the first week in the intensive care unit (ICU), no optimal number of clusters could be seen. However, glucose variation, a panel of brain biomarkers, and creatinine consistently described trajectories. Although no information on outcome was included in the models, both admission clusters and trajectories showed clear outcome differences, with mortality from 7 to 40% in the admission clusters and 4 to 85% in the trajectories. Adding cluster or trajectory labels to the established outcome prediction IMPACT model significantly improved outcome predictions. The results in this thesis support the importance of cerebrovascular autoregulation status, as CA status was found to be more informative towards outcome than ICP magnitude and duration. There was a variation in tolerable ICP intensity and duration dependent on whether CA was intact. Distinct clusters defined by GCS and metabolic profiles related to outcome suggest the importance of an extracranial evaluation in addition to GCS in TBI patients. Longitudinal trajectories of TBI patients in the ICU are highly characterized by glucose variation, brain biomarkers and creatinine.
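The two ICP dose measures mentioned in this abstract, events above magnitude and duration thresholds and the area under the ICP curve, can be sketched as follows. This is only an illustration under stated assumptions: uniform sampling, a rectangle-rule area counted above a chosen threshold, and a made-up ICP trace. The thesis may define both quantities differently, and the function names are hypothetical.

```python
def pressure_time_dose(icp, threshold, dt_minutes=1.0):
    """Area (mmHg * min) between the ICP trace and `threshold`, where ICP is above it."""
    return sum(max(p - threshold, 0.0) for p in icp) * dt_minutes


def count_events(icp, threshold, min_duration, dt_minutes=1.0):
    """Count runs of consecutive samples above `threshold` lasting >= `min_duration` minutes."""
    events = 0
    run = 0  # length of the current run above threshold, in samples
    for p in icp:
        if p > threshold:
            run += 1
        else:
            if run > 0 and run * dt_minutes >= min_duration:
                events += 1
            run = 0
    if run > 0 and run * dt_minutes >= min_duration:  # run reaching the end of the trace
        events += 1
    return events


# Made-up trace in mmHg, one sample per minute (assumption, not study data).
icp_trace = [15, 20, 21, 22, 23, 24, 18, 25, 16]

ptd = pressure_time_dose(icp_trace, threshold=19)
events = count_events(icp_trace, threshold=19, min_duration=5)
print(ptd, events)
```

For this trace the excess over 19 mmHg sums to 21 mmHg·min, and only the five-minute run counts as an event. The isolated one-minute spike contributes to the dose but not to the event count, which shows why the two measures can disagree.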

Publications from Karolinska Institutet

Data-driven modelling of biological multi-scale processes

Author: Hasenauer Jan
Hross Sabrina
Jagiella Nick
Theis Fabian J.
Publication date: 01/01/2015

Biological processes involve a variety of spatial and temporal scales. A holistic understanding of many biological processes therefore requires multi-scale models which capture the relevant properties on all these scales. In this manuscript we review mathematical modelling approaches used to describe the individual spatial scales and how they are integrated into holistic models. We discuss the relation between spatial and temporal scales and its implications for multi-scale modelling. Based upon this overview of state-of-the-art modelling approaches, we formulate key challenges in mathematical and computational modelling of biological multi-scale and multi-physics processes. In particular, we consider the availability of analysis tools for multi-scale models and model-based multi-scale data integration. We provide a compact review of methods for model-based data integration and model-based hypothesis testing. Furthermore, novel approaches and recent trends are discussed, including computation time reduction using reduced order and surrogate models, which contribute to the solution of inference problems. We conclude the manuscript by providing a few ideas for the development of tailored multi-scale inference methods. Comment: This manuscript will appear in the Journal of Coupled Systems and Multiscale Dynamics (American Scientific Publishers).

arXiv.org e-Print Archive

PuSH

Cardiovascular risk prediction in the acute care setting: a mixed methods evaluation using machine learning, real world evidence and qualitative methods

Author: Reynard Charles
Publication date: 31/12/2023

The University of Manchester - Institutional Repository

A blood gene expression marker of early Alzheimer's disease.

Author: Coppola G
dNeuroMed Consortium
Dobson R
Furney SJ
Geschwind D
Hodges A
Johnston C
Kłoszewska I
Lourdusamy A
Lovestone Simon
Lunnon Katie
Lupton MK
Mecocci P
Proitsi P
Sattlecker M
Simmons A
Soininen H
Tsolaki M
Vellas B
Publication venue: 'IOS Press'
Publication date: 11/02/2016

Published. Journal Article; Research Support, N.I.H., Extramural; Research Support, Non-U.S. Gov't.
A marker of Alzheimer's disease (AD) that can accurately diagnose disease at the earliest stage would significantly support efforts to develop treatments for early intervention. We have sought to determine the sensitivity and specificity of peripheral blood gene expression as a diagnostic marker of AD using data generated on HT-12v3 BeadChips. We first developed an AD diagnostic classifier in a training cohort of 78 AD and 78 control blood samples and then tested its performance in a validation group of 26 AD and 26 control and 118 mild cognitive impairment (MCI) subjects who were likely to have an AD-endpoint. A 48 gene classifier achieved an accuracy of 75% in the AD and control validation group. Comparisons were made with a classifier developed using structural MRI measures, where both measures were available in the same individuals. In AD and control subjects, the gene expression classifier achieved an accuracy of 70% compared to 85% using MRI. Bootstrapping validation produced expression and MRI classifiers with mean accuracies of 76% and 82%, respectively, demonstrating better concordance between these two classifiers than achieved in a single validation population. We conclude there is potential for blood expression to be a marker for AD. The classifier also predicts a large number of people with MCI, who are likely to develop AD, are more AD-like than normal with 76% of subjects classified as AD rather than control. Many of these people do not have overt brain atrophy, which is known to emerge around the time of AD diagnosis, suggesting the expression classifier may detect AD earlier in the prodromal phase. 
However, we accept these results could also represent a marker of diseases sharing common etiology.
Funding: InnoMed, European Union of the Sixth Framework program; Alzheimer’s Research UK; John and Lucille van Geest Foundation; NIHR Biomedical Research Centre for Mental Health, South London and Maudsley NHS Foundation Trust; Institute of Psychiatry Kings College London; NIA/NIH RC

Open Research Exeter

DM-PhyClus: A Bayesian phylogenetic algorithm for infectious disease transmission cluster inference

Author: Brenner Bluma
Labbe Aurélie
Roger Michel
Stephens David A.
Villandré Luc
Publication date: 08/08/2017

Background. Conventional phylogenetic clustering approaches rely on arbitrary cutpoints applied a posteriori to phylogenetic estimates. Although in practice, Bayesian and bootstrap-based clustering tend to lead to similar estimates, they often produce conflicting measures of confidence in clusters. The current study proposes a new Bayesian phylogenetic clustering algorithm, which we refer to as DM-PhyClus, that identifies sets of sequences resulting from quick transmission chains, thus yielding easily-interpretable clusters, without using any ad hoc distance or confidence requirement. Results. Simulations reveal that DM-PhyClus can outperform conventional clustering methods, as well as the Gap procedure, a pure distance-based algorithm, in terms of mean cluster recovery. We apply DM-PhyClus to a sample of real HIV-1 sequences, producing a set of clusters whose inference is in line with the conclusions of a previous thorough analysis. Conclusions. DM-PhyClus, by eliminating the need for cutpoints and producing sensible inference for cluster configurations, can facilitate transmission cluster detection. Future efforts to reduce incidence of infectious diseases, like HIV-1, will need reliable estimates of transmission clusters. It follows that algorithms like DM-PhyClus could serve to better inform public health strategies.

arXiv.org e-Print Archive

Directory of Open Access Journals

FigShare

A semi-supervised approach for rapidly creating clinical biomarker phenotypes in the UK Biobank using different primary care EHR and clinical terminology systems

Author: Denaxas S
Fatemifar G
Fitzpatrick N
Hemingway H
Kuan V
Mateen BA
Quint JK
Shah AD
Torralbo A
Publication venue: 'Oxford University Press (OUP)'
Publication date: 05/12/2020

Objectives: The UK Biobank (UKB) is making primary care electronic health records (EHRs) for 500 000 participants available for COVID-19-related research. Data are extracted from four sources, recorded using five clinical terminologies and stored in different schemas. The aims of our research were to: (a) develop a semi-supervised approach for bootstrapping EHR phenotyping algorithms in UKB EHR, and (b) evaluate our approach by implementing and evaluating phenotypes for 31 common biomarkers. Materials and Methods: We describe an algorithmic approach to phenotyping biomarkers in primary care EHR involving (a) bootstrapping definitions using existing phenotypes, (b) excluding generic, rare, or semantically distant terms, (c) forward-mapping terminology terms, (d) expert review, and (e) data extraction. We evaluated the phenotypes by assessing the ability to reproduce known epidemiological associations with all-cause mortality using Cox proportional hazards models. Results: We created and evaluated phenotyping algorithms for 31 biomarkers, many of which are directly related to COVID-19 complications, for example diabetes, cardiovascular disease, and respiratory disease. Our algorithm identified 1651 Read v2 and Clinical Terms Version 3 terms and automatically excluded 1228 terms. Clinical review excluded 103 terms and included 44 terms, resulting in 364 terms for data extraction (sensitivity 0.89, specificity 0.92). We extracted 38 190 682 events and identified 220 978 participants with at least one biomarker measured. Discussion and conclusion: Bootstrapping phenotyping algorithms from similar EHR can potentially address pre-existing methodological concerns that undermine the outputs of biomarker discovery pipelines and provide research-quality phenotyping algorithms.
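The term counts reported in the Results can be checked with a few lines of arithmetic. The sketch below assumes that terms included at clinical review are added to the pool that survives automatic exclusion. That reading is not stated explicitly in the abstract, but it reproduces the reported total. The variable names are illustrative.

```python
# Counts taken from the abstract.
identified = 1651        # Read v2 and Clinical Terms Version 3 terms found by the algorithm
auto_excluded = 1228     # terms excluded automatically
review_excluded = 103    # terms excluded at clinical review
review_included = 44     # terms added at clinical review

# Assumed flow: automatic exclusion first, then the review adjustments.
after_auto = identified - auto_excluded
final_terms = after_auto - review_excluded + review_included

print(after_auto, final_terms)
```

Under this reading, 423 terms remain after automatic exclusion and 364 go forward to data extraction, matching the figure in the abstract.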

UCL Discovery