Probabilistic integration of large Brazilian socioeconomic and clinical databases
The integration of disparate large and heterogeneous socioeconomic and clinical databases is considered essential to capture and model longitudinal and social aspects of diseases. However, such integration is challenging: databases are stored in disparate locations, make use of different identifiers, have variable data quality, record information in bespoke purpose-specific formats and have different levels of metadata. Novel computational methods are required to integrate them and enable their statistical analysis for epidemiological research purposes. In this paper, we describe a probabilistic approach for constructing a very large population-based cohort comprising 114 million individuals using linkages between clinical databases from the National Health System and administrative databases from governmental social programmes. We present our data integration model for creating data marts (epidemiological data) and discuss our evaluation results in controlled and uncontrolled scenarios, which demonstrate that our model and tools achieve high accuracy (minimum of 91%) in different probabilistic data integration scenarios.
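As a rough illustration of what probabilistic linkage between two databases without a shared identifier can look like, the sketch below scores a record pair with Fellegi-Sunter-style agreement weights. The abstract does not state which scoring method the authors used, so the technique, the field names, the m/u probabilities and the thresholds are all assumptions chosen for illustration.

```python
import math

# Hypothetical per-field probabilities (not taken from the paper):
#   m = P(field agrees | records belong to the same person)
#   u = P(field agrees | records belong to different people)
FIELDS = {
    "name": (0.95, 0.01),
    "birth_date": (0.98, 0.003),
    "municipality": (0.90, 0.05),
}


def match_weight(rec_a, rec_b, fields=FIELDS):
    """Sum log2 likelihood ratios over fields (Fellegi-Sunter style)."""
    total = 0.0
    for field, (m, u) in fields.items():
        if rec_a.get(field) == rec_b.get(field):
            total += math.log2(m / u)              # agreement weight
        else:
            total += math.log2((1 - m) / (1 - u))  # disagreement weight
    return total


def classify(weight, lower=0.0, upper=10.0):
    """Map a weight to a decision; both thresholds are illustrative."""
    if weight >= upper:
        return "match"
    if weight <= lower:
        return "non-match"
    return "possible match"


# Hypothetical records from a clinical and an administrative database.
clinical = {"name": "MARIA SILVA", "birth_date": "1980-02-11", "municipality": "Salvador"}
social = {"name": "MARIA SILVA", "birth_date": "1980-02-11", "municipality": "Salvador"}
other = {"name": "JOAO SOUZA", "birth_date": "1975-07-30", "municipality": "Recife"}

same_person = classify(match_weight(clinical, social))
different_person = classify(match_weight(clinical, other))
```

In practice the pairs falling between the two thresholds are the ones that drive accuracy figures such as the minimum of 91% quoted above, since they need either manual review or a tie-breaking rule.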
The effects of social determinants of health on acquired immune deficiency syndrome in a low-income population of Brazil: a retrospective cohort study of 28.3 million individuals.
BACKGROUND: Social determinants of health (SDH) include factors such as income, education, and race that could significantly affect the human immunodeficiency virus and acquired immunodeficiency syndrome (HIV/AIDS). Studies on the effects of SDH on HIV/AIDS are limited, and do not yet provide a systematic understanding of how the various SDH act on important indicators of HIV/AIDS progression. We aimed to evaluate the effects of SDH on AIDS morbidity and mortality. METHODS: A retrospective cohort of 28.3 million individuals was evaluated over a 9-year period (from 2007 to 2015). Multivariable Poisson regression, with a hierarchical approach, was used to estimate the effects of SDH, at the individual and familial level, on AIDS incidence, mortality, and case-fatality rates. FINDINGS: A total of 28,318,532 individuals, representing the low-income Brazilian population, were assessed. Their mean age was 36.18 (SD: 16.96) years; 52.69% (14,920,049) were female, 57.52% (15,360,569) were pardos, 34.13% (9,113,222) were white/Asian, 7.77% (2,075,977) were black, and 0.58% (154,146) were indigenous. Specific socioeconomic, household, and geographic factors were significantly associated with AIDS-related outcomes. Less wealth was strongly associated with a higher AIDS incidence (rate ratio, RR: 1.55; 95% confidence interval, CI: 1.43-1.68) and mortality (RR: 1.99; 95% CI: 1.70-2.34). Lower educational attainment was also strongly associated with higher AIDS incidence (RR: 1.46; 95% CI: 1.26-1.68), mortality (RR: 2.76; 95% CI: 1.99-3.82) and case-fatality rates (RR: 2.30; 95% CI: 1.31-4.01). Being black was associated with a higher AIDS incidence (RR: 1.53; 95% CI: 1.45-1.61), mortality (RR: 1.69; 95% CI: 1.57-1.83) and case-fatality rates (RR: 1.16; 95% CI: 1.03-1.32). Overall, also considering the other SDH, individuals experiencing greater levels of socioeconomic deprivation were, by far, more likely to acquire AIDS, and to die from it.
INTERPRETATION: In the population studied, SDH related to poverty and social vulnerability are strongly associated with a higher burden of HIV/AIDS, most notably less wealth, illiteracy, and being black. In the absence of relevant social protection policies, the current worldwide increase in poverty and inequalities, due to the consequences of the COVID-19 pandemic and the effects of the war in Ukraine, could reverse progress made in the fight against HIV/AIDS in low- and middle-income countries (LMIC). FUNDING: National Institute of Allergy and Infectious Diseases (NIAID), National Institutes of Health (NIH), US Grant Number: 1R01AI152938.
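To make the "RR; 95% CI" notation in this abstract concrete, the following sketch computes a crude (unadjusted) rate ratio with a Wald confidence interval on the log scale. It is not the study's multivariable hierarchical Poisson model, and the case counts and person-years are hypothetical numbers chosen only so that the point estimate lands near 1.55.

```python
import math


def rate_ratio_ci(cases_exposed, pt_exposed, cases_unexposed, pt_unexposed, z=1.96):
    """Crude rate ratio with a Wald interval computed on the log scale.

    pt_* are person-time denominators; z=1.96 gives an approximate 95% CI.
    """
    rr = (cases_exposed / pt_exposed) / (cases_unexposed / pt_unexposed)
    # Standard error of log(RR) for two independent Poisson counts.
    se_log_rr = math.sqrt(1 / cases_exposed + 1 / cases_unexposed)
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


# Hypothetical counts: 310 vs 200 AIDS cases over equal person-time.
rr, ci_low, ci_high = rate_ratio_ci(310, 1_000_000, 200, 1_000_000)
```

An interval that excludes 1, as every interval reported above does, is what supports describing the association as statistically significant; the adjusted estimates in the study additionally account for the other SDH in the model.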
fabH deletion increases DHA production in Escherichia coli expressing Pfa genes
Background: Some marine bacteria, such as Moritella marina, produce the nutraceutical docosahexaenoic acid (DHA) thanks to a specific enzymatic complex called Pfa synthase. Escherichia coli heterologously expressing the pfa gene cluster from M. marina also produces DHA. The aim of this study was to find genetic or metabolic conditions to increase DHA production in E. coli. Results: First, we analysed the effect of the antibiotic cerulenin, showing that DHA production increased twofold. Then, we tested a series of single gene knockout mutations affecting fatty acid biosynthesis, in order to optimize the synthesis of DHA. The most effective mutant, fabH, showed a threefold increase compared to the wild-type strain. The combination of cerulenin inhibition and fabH deletion rendered a 6.5-fold improvement compared to the control strain. Both strategies seem to have the same mechanism of action, in which fatty acid synthesis via the canonical pathway (fab pathway) is affected in its first catalytic step, which allows the substrates to be used by the heterologous pathway to synthesize DHA. Conclusions: A DHA-producing E. coli strain that carries a fabH gene deletion boosts DHA production by tuning down the competing canonical biosynthesis pathway. Our approach can be used for optimization of DHA production in different organisms. Funding: The work in the FdlC and GM laboratories was financed by the Spanish Ministry of Economy, Industry and Competitiveness Grant BFU2014-55534-C2.
Ethnoracial disparities in childhood growth trajectories in Brazil: a longitudinal nationwide study of four million children.
BACKGROUND: The literature contains scarce data on inequalities in growth trajectories among children born to mothers of diverse ethnoracial background in the first 5 years of life. OBJECTIVE: We aimed to investigate child growth according to maternal ethnoracial group using a nationwide Brazilian database. METHODS: A population-based retrospective cohort study employed linked data from the CIDACS Birth Cohort and the Brazilian Food and Nutrition Surveillance System (SISVAN). Children born at term, aged 5 years or younger, who had two or more measurements of length/height (cm) and weight (kg) were followed up between 2008 and 2017. Prevalence of stunting, underweight, wasting, and thinness were estimated. Nonlinear mixed effect models were used to estimate childhood growth trajectories, among different maternal ethnoracial groups (White, Asian descent, Black, Pardo, and Indigenous), using the raw measures of weight (kg) and height (cm) and the length/height-for-age (L/HAZ) and weight-for-age z-scores (WAZ). The analyses were also adjusted for mother's age, educational level, and marital status. RESULTS: A total of 4,090,271 children were included in the study. Children of Indigenous mothers exhibited higher rates of stunting (26.74%) and underweight (5.90%). Wasting and thinness were more prevalent among children of Pardo, Asian, Black, and Indigenous mothers than those of White mothers. Regarding children's weight (kg) and length/height (cm), those of Indigenous, Pardo, Black, and Asian descent mothers were on average shorter and weighed less than White ones. Regarding WAZ and L/HAZ growth trajectories, a sharp decline in average z-scores was evidenced in the first weeks of life, followed by a period of recovery. Over time, z-scores for most of the subgroups analyzed trended below zero. Children of mothers in greater social vulnerability showed less favorable growth.
CONCLUSION: We observed racial disparities in nutritional status and childhood growth trajectories, with children of Indigenous mothers presenting less favorable outcomes compared to their White counterparts. The strengthening of policies aimed at protecting Indigenous children should be urgently undertaken to address systematic ethnoracial health inequalities.
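For readers unfamiliar with the L/HAZ and WAZ indicators, the sketch below shows how an anthropometric z-score can be derived with the LMS (Box-Cox) transformation used by growth references, and how stunting is conventionally flagged at a length/height-for-age z-score below -2. The abstract does not describe the computation the authors used, and the L, M and S values here are hypothetical placeholders rather than real reference-table entries.

```python
import math


def lms_zscore(value, L, M, S):
    """z-score from LMS parameters: Box-Cox power L, median M, coefficient of variation S."""
    if L == 0:
        return math.log(value / M) / S
    return ((value / M) ** L - 1) / (L * S)


def is_stunted(haz, cutoff=-2.0):
    """Conventional definition: length/height-for-age z-score below -2."""
    return haz < cutoff


# Hypothetical reference values for one age/sex stratum (not real table entries).
L_REF, M_REF, S_REF = 1.0, 87.0, 0.035

haz_short_child = lms_zscore(80.0, L_REF, M_REF, S_REF)    # child measuring 80 cm
haz_median_child = lms_zscore(87.0, L_REF, M_REF, S_REF)   # child at the reference median
```

Prevalence figures such as the 26.74% stunting rate above are then simply the share of children whose z-score falls below the cut-off.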
What scans we will read: imaging instrumentation trends in clinical oncology
Oncological diseases account for a significant portion of the burden on public healthcare systems with associated costs driven primarily by complex and long-lasting therapies. Through the visualization of patient-specific morphology and functional-molecular pathways, cancerous tissue can be detected and characterized non-invasively, so as to provide referring oncologists with essential information to support therapy management decisions. Following the onset of stand-alone anatomical and functional imaging, we witness a push towards integrating molecular image information through various methods, including anato-metabolic imaging (e.g., PET/CT), advanced MRI, optical or ultrasound imaging.
This perspective paper highlights a number of key technological and methodological advances in imaging instrumentation related to anatomical, functional, molecular medicine and hybrid imaging, which is understood as the hardware-based combination of complementary anatomical and molecular imaging. These include novel detector technologies for ionizing radiation used in CT and nuclear medicine imaging, and novel system developments in MRI and optical as well as opto-acoustic imaging. We will also highlight new data processing methods for improved non-invasive tissue characterization. Following a general introduction to the role of imaging in oncology patient management, we introduce imaging methods with well-defined clinical applications and potential for clinical translation. For each modality, we report first on the status quo and point to perceived technological and methodological advances in a subsequent status go section. Considering the breadth and dynamics of these developments, this perspective ends with a critical reflection on where the authors, with the majority of them being imaging experts with a background in physics and engineering, believe imaging methods will be a few years from now.
Overall, methodological and technological medical imaging advances are geared towards increased image contrast, the derivation of reproducible quantitative parameters, an increase in volume sensitivity and a reduction in overall examination time. To ensure full translation to the clinic, this progress in technologies and instrumentation is complemented by progress in relevant acquisition and image-processing protocols and improved data analysis. To this end, we should accept diagnostic images as “data”, and – through the wider adoption of advanced analysis, including machine learning approaches and a “big data” concept – move to the next stage of non-invasive tumor phenotyping. The scans we will be reading 10 years from now will likely be composed of highly diverse multi-dimensional data from multiple sources, which mandate the use of advanced and interactive visualization and analysis platforms powered by Artificial Intelligence (AI) for real-time data handling by cross-specialty clinical experts with a domain knowledge that will need to go beyond that of plain imaging.
The trans-ancestral genomic architecture of glycemic traits
Glycemic traits are used to diagnose and monitor type 2 diabetes and cardiometabolic health. To date, most genetic studies of glycemic traits have focused on individuals of European ancestry. Here we aggregated genome-wide association studies comprising up to 281,416 individuals without diabetes (30% non-European ancestry) for whom fasting glucose, 2-h glucose after an oral glucose challenge, glycated hemoglobin and fasting insulin data were available. Trans-ancestry and single-ancestry meta-analyses identified 242 loci (99 novel; P < 5 × 10⁻⁸), 80% of which had no significant evidence of between-ancestry heterogeneity. Analyses restricted to individuals of European ancestry with equivalent sample size would have led to 24 fewer new loci. Compared with single-ancestry analyses, equivalent-sized trans-ancestry fine-mapping reduced the number of estimated variants in 99% credible sets by a median of 37.5%. Genomic-feature, gene-expression and gene-set analyses revealed distinct biological signatures for each trait, highlighting different underlying biological pathways. Our results increase our understanding of diabetes pathophysiology by using trans-ancestry studies for improved power and resolution.
Genetic drivers of heterogeneity in type 2 diabetes pathophysiology.
Type 2 diabetes (T2D) is a heterogeneous disease that develops through diverse pathophysiological processes [1,2] and molecular mechanisms that are often specific to cell type [3,4]. Here, to characterize the genetic contribution to these processes across ancestry groups, we aggregate genome-wide association study data from 2,535,601 individuals (39.7% not of European ancestry), including 428,452 cases of T2D. We identify 1,289 independent association signals at genome-wide significance (P < 5 × 10⁻⁸) that map to 611 loci, of which 145 loci are, to our knowledge, previously unreported. We define eight non-overlapping clusters of T2D signals that are characterized by distinct profiles of cardiometabolic trait associations. These clusters are differentially enriched for cell-type-specific regions of open chromatin, including pancreatic islets, adipocytes, endothelial cells and enteroendocrine cells. We build cluster-specific partitioned polygenic scores [5] in a further 279,552 individuals of diverse ancestry, including 30,288 cases of T2D, and test their association with T2D-related vascular outcomes. Cluster-specific partitioned polygenic scores are associated with coronary artery disease, peripheral artery disease and end-stage diabetic nephropathy across ancestry groups, highlighting the importance of obesity-related processes in the development of vascular outcomes. Our findings show the value of integrating multi-ancestry genome-wide association study data with single-cell epigenomics to disentangle the aetiological heterogeneity that drives the development and progression of T2D. This might offer a route to optimize global access to genetically informed diabetes care.
Interethnic analyses of blood pressure loci in populations of East Asian and European descent
Blood pressure (BP) is a major risk factor for cardiovascular disease and more than 200 genetic loci associated with BP are known. Here, we perform a multi-stage genome-wide association study for BP (max N = 289,038) principally in East Asians and meta-analysis in East Asians and Europeans. We report 19 new genetic loci and ancestry-specific BP variants, conforming to a common ancestry-specific variant association model. At 10 unique loci, distinct non-rare ancestry-specific variants colocalize within the same linkage disequilibrium block despite the significantly discordant effects for the proxy shared variants between the ethnic groups. The genome-wide transethnic correlation of causal-variant effect-sizes is 0.898 and 0.851 for systolic and diastolic BP, respectively. Some of the ancestry-specific association signals are also influenced by a selective sweep. Our results provide new evidence for the role of common ancestry-specific variants and natural selection in ethnic differences in complex traits such as BP.
Implicating genes, pleiotropy, and sexual dimorphism at blood lipid loci through multi-ancestry meta-analysis.
BACKGROUND: Genetic variants within nearly 1000 loci are known to contribute to modulation of blood lipid levels. However, the biological pathways underlying these associations are frequently unknown, limiting understanding of these findings and hindering downstream translational efforts such as drug target discovery. RESULTS: To expand our understanding of the underlying biological pathways and mechanisms controlling blood lipid levels, we leverage a large multi-ancestry meta-analysis (N = 1,654,960) of blood lipids to prioritize putative causal genes for 2286 lipid associations using six gene prediction approaches. Using phenome-wide association (PheWAS) scans, we identify relationships of genetically predicted lipid levels to other diseases and conditions. We confirm known pleiotropic associations with cardiovascular phenotypes and determine novel associations, notably with cholelithiasis risk. We perform sex-stratified GWAS meta-analysis of lipid levels and show that 3-5% of autosomal lipid-associated loci demonstrate sex-biased effects. Finally, we report 21 novel lipid loci identified on the X chromosome. Many of the sex-biased autosomal and X chromosome lipid loci show pleiotropic associations with sex hormones, emphasizing the role of hormone regulation in lipid metabolism. CONCLUSIONS: Taken together, our findings provide insights into the biological mechanisms through which associated variants lead to altered lipid levels and potentially cardiovascular disease risk.