Analyzing Patient Trajectories With Artificial Intelligence
In digital medicine, patient data typically record health events over time (eg, through electronic health records, wearables, or other sensing technologies) and thus form unique patient trajectories. Patient trajectories are highly predictive of the future course of diseases and therefore facilitate effective care. However, digital medicine often uses only limited patient data, consisting of health events from a single time point or a small number of time points, while ignoring additional information encoded in patient trajectories. To analyze such rich longitudinal data, new artificial intelligence (AI) solutions are needed. In this paper, we provide an overview of recent efforts to develop trajectory-aware AI solutions and offer suggestions for future directions. Specifically, we examine the implications for developing disease models from patient trajectories along the typical workflow in AI: problem definition, data processing, modeling, evaluation, and interpretation. We conclude with a discussion of how such AI solutions will allow the field to build robust models for personalized risk scoring, subtyping, and disease pathway discovery.
Federated Learning on Heterogenous Data using Chest CT
Large data have accelerated advances in AI. While it is well known that
population differences from genetics, sex, race, diet, and various
environmental factors contribute significantly to disease, AI studies in
medicine have largely focused on locoregional patient cohorts with less diverse
data sources. This limitation stems from barriers to large-scale data sharing
in medicine and ethical concerns over data privacy. Federated learning (FL) is
one potential pathway for AI development that enables learning across hospitals
without data sharing. In this study, we show the results of various FL
strategies on one of the largest and most diverse COVID-19 chest CT datasets:
21 participating hospitals across five continents that comprise >10,000
patients with >1 million images. We present three techniques: Federated
Averaging (FedAvg), Incremental Institutional Learning (IIL), and Cyclical
Incremental Institutional Learning (CIIL). We also propose an FL strategy that
leverages synthetically generated data to overcome class imbalances and data
size disparities across centers. We show that FL can achieve performance
comparable to Centralized Data Sharing (CDS) while maintaining high performance
across sites with small, underrepresented data. We investigate the strengths
and weaknesses of all technical approaches on this heterogeneous dataset,
including their robustness to non-independent and identically distributed
(non-IID) data. We also describe the sources of data heterogeneity, such as
age, sex, and site location, in the context of FL and show how, even among the
correctly labeled populations, disparities can arise due to these biases.
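
As a rough illustration of the FedAvg idea named in this abstract, the sketch below averages client model parameters, weighting each client by its number of local samples. It is a minimal sketch of the generic technique, not the study's implementation: the function name `fed_avg`, the dictionary-of-lists parameter layout, and the sample-count weighting are assumptions made for this example.

```python
from typing import Dict, List

# Hypothetical parameter layout: parameter name -> flat list of floats.
Weights = Dict[str, List[float]]


def fed_avg(client_weights: List[Weights], client_sizes: List[int]) -> Weights:
    """Average client parameters, weighting each client by its sample count."""
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one sample count per client and at least one client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total number of samples must be positive")

    averaged: Weights = {}
    for name, reference in client_weights[0].items():
        # Weighted mean of each coordinate across all clients.
        averaged[name] = [
            sum(w[name][i] * n for w, n in zip(client_weights, client_sizes)) / total
            for i in range(len(reference))
        ]
    return averaged


if __name__ == "__main__":
    # Two hypothetical hospitals; the second holds three times as much data.
    small_site = {"w": [1.0, 2.0]}
    large_site = {"w": [3.0, 4.0]}
    print(fed_avg([small_site, large_site], [1, 3]))
```

Weighting by sample count means that large sites dominate the average, which is one reason small, underrepresented sites are a concern in the setting the abstract describes.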
An Experimental Study of Class Imbalance in Federated Learning
Federated learning is a distributed machine learning paradigm that trains a
global model for prediction based on a number of local models at clients while
local data privacy is preserved. Class imbalance is believed to be one of the
factors that degrade the global model performance. However, there has been
very little research on whether and how class imbalance affects the global
performance. Class imbalance in federated learning is much more complex than
in traditional non-distributed machine learning because the class imbalance
situation differs across local clients. Class imbalance therefore needs to be
redefined for distributed learning environments. In this paper, we first
propose two new metrics to define class imbalance: the global class imbalance
degree (MID) and the local difference of class imbalance among clients (WCS).
Then, we conduct extensive experiments to analyze the impact of class imbalance
on the global performance in various scenarios based on our definition. Our
results show that a higher MID and a larger WCS further degrade the performance
of the global model. In addition, WCS is shown to slow down the convergence of
the global model by misdirecting the optimization.
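
To make concrete why class imbalance is harder to characterize in the federated setting, the sketch below compares a simple imbalance ratio computed on the pooled labels with the same ratio computed per client. This is only an illustration under stated assumptions: the max-to-min count ratio used here is a generic measure chosen for the example and is not the MID or WCS metric proposed in the paper, whose definitions are not given in this abstract. The names `imbalance_ratio` and `summarize` are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple


def imbalance_ratio(counts: Counter, classes: Iterable[int]) -> float:
    """Most frequent class count divided by least frequent class count.

    Returns infinity when some class is missing entirely.
    """
    values = [counts.get(c, 0) for c in classes]
    if min(values) == 0:
        return float("inf")
    return max(values) / min(values)


def summarize(clients: Dict[str, List[int]]) -> Tuple[float, Dict[str, float]]:
    """Return the pooled imbalance ratio and the per-client ratios."""
    classes = sorted({label for labels in clients.values() for label in labels})
    pooled: Counter = Counter()
    local: Dict[str, float] = {}
    for name, labels in clients.items():
        counts = Counter(labels)
        local[name] = imbalance_ratio(counts, classes)
        pooled.update(counts)
    return imbalance_ratio(pooled, classes), local


if __name__ == "__main__":
    # Hypothetical labels: each client is skewed, but in opposite directions.
    clients = {"A": [0, 0, 0, 1], "B": [1, 1, 1, 0]}
    pooled_ratio, local_ratios = summarize(clients)
    print(pooled_ratio, local_ratios)
```

In this toy case the pooled labels are perfectly balanced while every client is skewed, so a single global measure would miss the imbalance that each local model actually sees.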
Systems Biology in ELIXIR: modelling in the spotlight
The Use and Misuse of Biomedical Data: Is Bigger Really Better?
Very large biomedical research databases, containing electronic health records (EHR) and genomic data from millions of patients, have been heralded recently for their potential to accelerate scientific discovery and produce dramatic improvements in medical treatments. Research enabled by these databases may also lead to profound changes in law, regulation, social policy, and even litigation strategies. Yet, is “big data” necessarily better data?
This paper makes an original contribution to the legal literature by focusing on what can go wrong in the process of biomedical database research and what precautions are necessary to avoid critical mistakes. We address three main reasons for a cautious approach to such research and to relying on its outcomes for purposes of public policy or litigation. First, the data contained in databases is surprisingly likely to be incorrect or incomplete. Second, systematic biases, arising from both the nature of the data and the preconceptions of investigators, are serious threats to the validity of biomedical database research, especially in answering causal questions. Third, data mining of biomedical databases makes it easier for individuals with political, social, or economic agendas to generate ostensibly scientific but misleading research findings for the purpose of manipulating public opinion and swaying policy makers.
In short, this paper sheds much-needed light on the problems of credulous and uninformed uses of biomedical databases. An understanding of the pitfalls of big data analysis is of critical importance to anyone who will rely on or dispute its outcomes, including lawyers, policy makers, and the public at large. The article also recommends technical, methodological, and educational interventions to combat the dangers of database errors and abuses.