54 research outputs found
Novel pattern recognition approaches for transcriptomics data analysis
We proposed a family of methods for transcriptomics and genomics data analysis based on a multi-level thresholding approach, such as OMTG for sub-grid and spot detection in DNA microarrays, and OMT for detecting significant regions based on next-generation sequencing data. Extensive experiments on real-life datasets and a comparison to other methods show that the proposed methods perform these tasks fully automatically and with a very high degree of accuracy. Moreover, unlike previous methods, the proposed approaches can be used in various types of transcriptome analysis problems, such as microarray image gridding with different resolutions and spot sizes, as well as finding the interacting regions of DNA with a protein of interest using ChIP-Seq data, without any need for parameter adjustment. We also developed constrained multi-level thresholding (CMT), an algorithm used to detect enriched regions on ChIP-Seq data with the ability to target regions within a specific range. We show that CMT has higher accuracy in detecting enriched regions (peaks) by objectively assessing its performance relative to other previously proposed peak finders. This is shown by testing three algorithms on the well-known FoxA1 dataset, four transcription factors (with a total of six antibodies) for Drosophila melanogaster and the H3K4ac antibody dataset. Finally, we propose a tree-based approach that conducts gene selection and builds a classifier simultaneously, in order to select the minimal number of genes that would reliably predict a given breast cancer subtype. Our results support that this modified approach to gene selection yields a small subset of genes that can predict subtypes with greater than 95% overall accuracy. In addition to providing a valuable list of targets for diagnostic purposes, the gene ontologies of the selected genes suggest that these methods have isolated a number of potential genes involved in breast cancer biology, etiology and potentially novel therapeutics.
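The multi-level thresholding idea in the abstract above can be illustrated with a small dynamic program. This is a minimal sketch, not the OMT, OMTG or CMT algorithms themselves: the abstract does not state the optimisation criterion, so the code assumes an Otsu-style criterion (minimum total within-class squared deviation), and the function name `multilevel_thresholds` is hypothetical.

```python
from typing import List, Sequence


def multilevel_thresholds(hist: Sequence[float], k: int) -> List[int]:
    """Split histogram bins 0..n-1 into k contiguous classes.

    Assumed criterion (not taken from the abstract): minimise the total
    within-class weighted squared deviation. Returns the k-1 bin indices
    at which classes 2..k start.
    """
    n = len(hist)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and the number of bins")

    # Prefix sums of weight, weight*x and weight*x^2 give O(1) class costs.
    w = [0.0] * (n + 1)
    s = [0.0] * (n + 1)
    q = [0.0] * (n + 1)
    for i, h in enumerate(hist):
        w[i + 1] = w[i] + h
        s[i + 1] = s[i] + h * i
        q[i + 1] = q[i] + h * i * i

    def cost(a: int, b: int) -> float:
        """Weighted squared deviation of bins a..b-1 around their mean."""
        weight = w[b] - w[a]
        if weight == 0:
            return 0.0
        total = s[b] - s[a]
        return (q[b] - q[a]) - total * total / weight

    inf = float("inf")
    # best[c][b]: minimal cost of splitting the first b bins into c classes.
    best = [[inf] * (n + 1) for _ in range(k + 1)]
    cut = [[0] * (n + 1) for _ in range(k + 1)]
    best[0][0] = 0.0
    for c in range(1, k + 1):
        for b in range(c, n + 1):
            for a in range(c - 1, b):
                candidate = best[c - 1][a] + cost(a, b)
                if candidate < best[c][b]:
                    best[c][b] = candidate
                    cut[c][b] = a

    # Walk back through the stored cut points to recover class starts.
    starts = []
    b = n
    for c in range(k, 1, -1):
        b = cut[c][b]
        starts.append(b)
    return starts[::-1]
```

The triple loop runs in O(k·n²) time, which is one way a thresholding search can be both exhaustive and polynomial-time. When empty bins separate the classes, several threshold positions are equally good and the sketch returns the leftmost.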
Predicting Outcomes of Hormone and Chemotherapy in the Molecular Taxonomy of Breast Cancer International Consortium (METABRIC) Study by Biochemically-inspired Machine Learning
Genomic aberrations and gene expression-defined subtypes in the large METABRIC patient cohort have been used to stratify and predict survival. The present study used normalized gene expression signatures of paclitaxel drug response to predict outcome for different survival times in METABRIC patients receiving hormone (HT) and, in some cases, chemotherapy (CT) agents. This machine learning method, which distinguishes sensitivity vs. resistance in breast cancer cell lines and validates predictions in patients, was also used to derive gene signatures of other HT (tamoxifen) and CT agents (methotrexate, epirubicin, doxorubicin, and 5-fluorouracil) used in METABRIC. Paclitaxel gene signatures exhibited the best performance; however, the other agents also predicted survival with acceptable accuracies. A support vector machine (SVM) model of paclitaxel response containing genes ABCB1, ABCB11, ABCC1, ABCC10, BAD, BBC3, BCL2, BCL2L1, BMF, CYP2C8, CYP3A4, MAP2, MAP4, MAPT, NR1I2, SLCO1B3, TUBB1, TUBB4A, and TUBB4B was 78.6% accurate in predicting survival of 84 patients treated with both HT and CT (median survival ≥ 4.4 yr). Accuracy was lower (73.4%) in 304 untreated patients. The performance of other machine learning approaches was also evaluated at different survival thresholds. Minimum redundancy maximum relevance feature selection of a paclitaxel-based SVM classifier based on expression of genes BCL2L1, BBC3, FGF2, FN1, and TWIST1 was 81.1% accurate in 53 CT patients. In addition, a random forest (RF) classifier using a gene signature (ABCB1, ABCB11, ABCC1, ABCC10, BAD, BBC3, BCL2, BCL2L1, BMF, CYP2C8, CYP3A4, MAP2, MAP4, MAPT, NR1I2, SLCO1B3, TUBB1, TUBB4A, and TUBB4B) predicted >3-year survival with 85.5% accuracy in 420 HT patients. A similar RF gene signature showed 82.7% accuracy in 504 patients treated with CT and/or HT.
These results suggest that tumor gene expression signatures refined by machine learning techniques can be useful for predicting survival after drug therapies.
A fully automatic gridding method for cDNA microarray images
Background: Processing cDNA microarray images is a crucial step in gene expression analysis, since any errors in early stages affect subsequent steps, leading to possibly erroneous biological conclusions. When processing the underlying images, accurately separating the sub-grids and spots is extremely important for subsequent steps that include segmentation, quantification, normalization and clustering. Results: We propose a parameterless and fully automatic approach that first detects the sub-grids given the entire microarray image, and then detects the locations of the spots in each sub-grid. The approach first detects and corrects rotations in the images by applying an affine transformation, followed by a polynomial-time optimal multi-level thresholding algorithm used to find the positions of the sub-grids in the image and the positions of the spots in each sub-grid. Additionally, a new validity index is proposed in order to find the correct number of sub-grids in the image, and the correct number of spots in each sub-grid. Moreover, a refinement procedure is used to correct possible misalignments and increase the accuracy of the method. Conclusions: Extensive experiments on real-life microarray images and a comparison to other methods show that the proposed method performs these tasks fully automatically and with a very high degree of accuracy. Moreover, unlike previous methods, the proposed approach can be used in various types of microarray images with different resolutions and spot sizes and does not need any parameter to be adjusted.
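To make the gridding step described above more concrete, the sketch below locates spot columns from a projection profile. It is an illustration under stated assumptions rather than the authors' method: it assumes the image is already rotation-corrected and stored as a list of rows, and it uses the profile mean as a simple cutoff in place of the optimal multi-level thresholding and validity index named in the abstract. The names `column_profile` and `bright_runs` are hypothetical.

```python
from typing import List, Sequence, Tuple


def column_profile(image: Sequence[Sequence[float]]) -> List[float]:
    """Sum pixel intensities down each column of a row-major image."""
    return [sum(column) for column in zip(*image)]


def bright_runs(profile: Sequence[float], cutoff: float) -> List[Tuple[int, int]]:
    """Return inclusive (start, end) index pairs where profile > cutoff.

    Assumption: the cutoff is a stand-in for a proper thresholding step.
    """
    runs = []
    start = None
    for i, value in enumerate(profile):
        if value > cutoff and start is None:
            start = i                      # a bright run begins
        elif value <= cutoff and start is not None:
            runs.append((start, i - 1))    # the run ended on the previous index
            start = None
    if start is not None:
        runs.append((start, len(profile) - 1))
    return runs


# Toy image (illustrative values only): two spots separated by dark columns.
toy_image = [
    [0, 9, 9, 0, 0, 8, 8, 0],
    [0, 9, 9, 0, 0, 8, 8, 0],
]
profile = column_profile(toy_image)
spot_columns = bright_runs(profile, sum(profile) / len(profile))
```

Applying the same two functions to the transposed image (`list(zip(*toy_image))`) gives the row-wise extents, so the spot bounding boxes follow from the two sets of runs.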
The unfinished agenda of communicable diseases among children and adolescents before the COVID-19 pandemic, 1990-2019: a systematic analysis of the Global Burden of Disease Study 2019
BACKGROUND: Communicable disease control has long been a focus of global health policy. There have been substantial reductions in the burden and mortality of communicable diseases among children younger than 5 years, but we know less about this burden in older children and adolescents, and it is unclear whether current programmes and policies remain aligned with targets for intervention. This knowledge is especially important for policy and programmes in the context of the COVID-19 pandemic. We aimed to use the Global Burden of Disease (GBD) Study 2019 to systematically characterise the burden of communicable diseases across childhood and adolescence. METHODS: In this systematic analysis of the GBD study from 1990 to 2019, all communicable diseases and their manifestations as modelled within GBD 2019 were included, categorised as 16 subgroups of common diseases or presentations. Data were reported for absolute count, prevalence, and incidence across measures of cause-specific mortality (deaths and years of life lost), disability (years lived with disability [YLDs]), and disease burden (disability-adjusted life-years [DALYs]) for children and adolescents aged 0-24 years. Data were reported across the Socio-demographic Index (SDI) and across time (1990-2019), and for 204 countries and territories. For HIV, we reported the mortality-to-incidence ratio (MIR) as a measure of health system performance. FINDINGS: In 2019, there were 3·0 million deaths and 30·0 million years of healthy life lost to disability (as measured by YLDs), corresponding to 288·4 million DALYs from communicable diseases among children and adolescents globally (57·3% of total communicable disease burden across all ages). 
Over time, there has been a shift in communicable disease burden from young children to older children and adolescents (largely driven by the considerable reductions in children younger than 5 years and slower progress elsewhere), although children younger than 5 years still accounted for most of the communicable disease burden in 2019. Disease burden and mortality were predominantly in low-SDI settings, with high and high-middle SDI settings also having an appreciable burden of communicable disease morbidity (4·0 million YLDs in 2019 alone). Three cause groups (enteric infections, lower-respiratory-tract infections, and malaria) accounted for 59·8% of the global communicable disease burden in children and adolescents, with tuberculosis and HIV both emerging as important causes during adolescence. HIV was the only cause for which disease burden increased over time, particularly in children and adolescents older than 5 years, and especially in females. Excess MIRs for HIV were observed for males aged 15-19 years in low-SDI settings. INTERPRETATION: Our analysis supports continued policy focus on enteric infections and lower-respiratory-tract infections, with orientation to children younger than 5 years in settings of low socioeconomic development. However, efforts should also be targeted to other conditions, particularly HIV, given its increased burden in older children and adolescents. Older children and adolescents also experience a large burden of communicable disease, further highlighting the need for efforts to extend beyond the first 5 years of life. Our analysis also identified substantial morbidity caused by communicable diseases affecting child and adolescent health across the world. FUNDING: The Australian National Health and Medical Research Council Centre for Research Excellence for Driving Investment in Global Adolescent Health and the Bill & Melinda Gates Foundation
Global, regional, and national burden of colorectal cancer and its risk factors, 1990–2019: a systematic analysis for the Global Burden of Disease Study 2019
Funding: F Carvalho and E Fernandes acknowledge support from Fundação para a Ciência e a Tecnologia, I.P. (FCT), in the scope of the project UIDP/04378/2020 and UIDB/04378/2020 of the Research Unit on Applied Molecular Biosciences UCIBIO and the project LA/P/0140/2020 of the Associate Laboratory Institute for Health and Bioeconomy i4HB; FCT/MCTES through the project UIDB/50006/2020. J Conde acknowledges the European Research Council Starting Grant (ERC-StG-2019-848325). V M Costa acknowledges the grant SFRH/BHD/110001/2015, received by Portuguese national funds through Fundação para a Ciência e Tecnologia (FCT), IP, under the Norma Transitória DL57/2016/CP1334/CT0006.
Global burden of chronic respiratory diseases and risk factors, 1990–2019: an update from the Global Burden of Disease Study 2019
Background: Updated data on chronic respiratory diseases (CRDs) are vital in their prevention, control, and treatment in the path to achieving the third UN Sustainable Development Goal (SDG), a one-third reduction in premature mortality from non-communicable diseases by 2030. We provided global, regional, and national estimates of the burden of CRDs and their attributable risks from 1990 to 2019. Methods: Using data from the Global Burden of Diseases, Injuries, and Risk Factors Study (GBD) 2019, we estimated mortality, years lived with disability, years of life lost, disability-adjusted life years (DALYs), prevalence, and incidence of CRDs, i.e. chronic obstructive pulmonary disease (COPD), asthma, pneumoconiosis, interstitial lung disease and pulmonary sarcoidosis, and other CRDs, from 1990 to 2019 by sex, age, region, and Socio-demographic Index (SDI) in 204 countries and territories. Deaths and DALYs from CRDs attributable to each risk factor were estimated according to relative risks, risk exposure, and the theoretical minimum risk exposure level input. Findings: In 2019, CRDs were the third leading cause of death responsible for 4.0 million deaths (95% uncertainty interval 3.6–4.3) with a prevalence of 454.6 million cases (417.4–499.1) globally. While the total deaths and prevalence of CRDs have increased by 28.5% and 39.8%, the age-standardised rates have dropped by 41.7% and 16.9% from 1990 to 2019, respectively. COPD, with 212.3 million (200.4–225.1) prevalent cases, was the primary cause of deaths from CRDs, accounting for 3.3 million (2.9–3.6) deaths. With 262.4 million (224.1–309.5) prevalent cases, asthma had the highest prevalence among CRDs. The age-standardised rates of all burden measures of COPD, asthma, and pneumoconiosis have reduced globally from 1990 to 2019. Nevertheless, the age-standardised rates of incidence and prevalence of interstitial lung disease and pulmonary sarcoidosis have increased throughout this period.
Low- and low-middle SDI countries had the highest age-standardised death and DALYs rates while the high SDI quintile had the highest prevalence rate of CRDs. The highest deaths and DALYs from CRDs were attributed to smoking globally, followed by air pollution and occupational risks. Non-optimal temperature and high body-mass index were additional risk factors for COPD and asthma, respectively. Interpretation: Although the age-standardised prevalence, death, and DALYs rates of CRDs have decreased, they still cause a substantial burden and deaths worldwide. The high death and DALYs rates in low and low-middle SDI countries highlight the urgent need for improved preventive, diagnostic, and therapeutic measures. Global strategies for tobacco control, enhancing air quality, reducing occupational hazards, and fostering clean cooking fuels are crucial steps in reducing the burden of CRDs, especially in low- and lower-middle income countries.
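The abstract above reports rising death counts alongside falling age-standardised rates. The sketch below shows how both can hold at once, using direct age standardisation (a weighted average of age-specific rates with fixed standard-population weights). It assumes direct standardisation and uses invented toy numbers and weights; none of the figures come from GBD, and the function name is hypothetical.

```python
from typing import Dict


def age_standardised_rate(
    events_by_age: Dict[str, float],
    population_by_age: Dict[str, float],
    standard_weights: Dict[str, float],
) -> float:
    """Direct standardisation: weight each age-specific rate by a fixed
    standard population share, then normalise by the total weight."""
    total_weight = sum(standard_weights.values())
    weighted = sum(
        standard_weights[age] * events_by_age[age] / population_by_age[age]
        for age in standard_weights
    )
    return weighted / total_weight


# Illustrative toy data only: the population ages between the two years.
weights = {"young": 0.8, "old": 0.2}           # hypothetical standard population
deaths_early = {"young": 10, "old": 10}
pop_early = {"young": 1000, "old": 100}
deaths_late = {"young": 8, "old": 24}
pop_late = {"young": 1000, "old": 300}

rate_early = age_standardised_rate(deaths_early, pop_early, weights)
rate_late = age_standardised_rate(deaths_late, pop_late, weights)
```

In this toy case total deaths rise from 20 to 32 because the older group has tripled in size, while every age-specific rate has fallen, so the standardised rate falls as well.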
Mapping local patterns of childhood overweight and wasting in low- and middle-income countries between 2000 and 2017
A double burden of malnutrition occurs when individuals, household members or communities experience both undernutrition and overweight. Here, we show geospatial estimates of overweight and wasting prevalence among children under 5 years of age in 105 low- and middle-income countries (LMICs) from 2000 to 2017 and aggregate these to policy-relevant administrative units. Wasting decreased overall across LMICs between 2000 and 2017, from 8.4% (62.3 (55.1–70.8) million) to 6.4% (58.3 (47.6–70.7) million), but is predicted to remain above the World Health Organization’s Global Nutrition Target of <5% in over half of LMICs by 2025. Prevalence of overweight increased from 5.2% (30 (22.8–38.5) million) in 2000 to 6.0% (55.5 (44.8–67.9) million) children aged under 5 years in 2017. Areas most affected by the double burden of malnutrition were located in Indonesia, Thailand, southeastern China, Botswana, Cameroon and central Nigeria. Our estimates provide a new perspective to researchers, policy makers and public health agencies in their efforts to address this global childhood syndemic.
The global burden of adolescent and young adult cancer in 2019 : a systematic analysis for the Global Burden of Disease Study 2019
Background In estimating the global burden of cancer, adolescents and young adults with cancer are often overlooked, despite being a distinct subgroup with unique epidemiology, clinical care needs, and societal impact. Comprehensive estimates of the global cancer burden in adolescents and young adults (aged 15-39 years) are lacking. To address this gap, we analysed results from the Global Burden of Diseases, Injuries, and Risk Factors Study (GBD) 2019, with a focus on the outcome of disability-adjusted life-years (DALYs), to inform global cancer control measures in adolescents and young adults. Methods Using the GBD 2019 methodology, international mortality data were collected from vital registration systems, verbal autopsies, and population-based cancer registry inputs modelled with mortality-to-incidence ratios (MIRs). Incidence was computed with mortality estimates and corresponding MIRs. Prevalence estimates were calculated using modelled survival and multiplied by disability weights to obtain years lived with disability (YLDs). Years of life lost (YLLs) were calculated as age-specific cancer deaths multiplied by the standard life expectancy at the age of death. The main outcome was DALYs (the sum of YLLs and YLDs). Estimates were presented globally and by Socio-demographic Index (SDI) quintiles (countries ranked and divided into five equal SDI groups), and all estimates were presented with corresponding 95% uncertainty intervals (UIs). For this analysis, we used the age range of 15-39 years to define adolescents and young adults. Findings There were 1.19 million (95% UI 1.11-1.28) incident cancer cases and 396 000 (370 000-425 000) deaths due to cancer among people aged 15-39 years worldwide in 2019. 
The highest age-standardised incidence rates occurred in high SDI (59.6 [54.5-65.7] per 100 000 person-years) and high-middle SDI countries (53.2 [48.8-57.9] per 100 000 person-years), while the highest age-standardised mortality rates were in low-middle SDI (14.2 [12.9-15.6] per 100 000 person-years) and middle SDI (13.6 [12.6-14.8] per 100 000 person-years) countries. In 2019, adolescent and young adult cancers contributed 23.5 million (21.9-25.2) DALYs to the global burden of disease, of which 2.7% (1.9-3.6) came from YLDs and 97.3% (96.4-98.1) from YLLs. Cancer was the fourth leading cause of death and tenth leading cause of DALYs in adolescents and young adults globally. Interpretation Adolescent and young adult cancers contributed substantially to the overall adolescent and young adult disease burden globally in 2019. These results provide new insights into the distribution and magnitude of the adolescent and young adult cancer burden around the world. With notable differences observed across SDI settings, these estimates can inform global and country-level cancer control efforts. Copyright (C) 2021 The Author(s). Published by Elsevier Ltd.
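The burden arithmetic summarised in the Methods of the abstract above (incidence from mortality and MIRs, YLDs from prevalence and disability weights, YLLs from deaths and standard life expectancy, DALYs as their sum) can be written out as a short sketch. The function names and every number below are illustrative assumptions, not GBD inputs or results, and the full GBD pipeline involves modelling steps not shown here.

```python
from typing import Dict


def incidence_from_mortality(deaths: float, mir: float) -> float:
    """MIR = deaths / incident cases, so incident cases = deaths / MIR."""
    return deaths / mir


def years_of_life_lost(
    deaths_by_age: Dict[int, float],
    standard_life_expectancy: Dict[int, float],
) -> float:
    """Sum of age-specific deaths times standard life expectancy at that age."""
    return sum(
        deaths * standard_life_expectancy[age]
        for age, deaths in deaths_by_age.items()
    )


def years_lived_with_disability(prevalent_cases: float, disability_weight: float) -> float:
    """Prevalent cases multiplied by a disability weight between 0 and 1."""
    return prevalent_cases * disability_weight


def dalys(yll: float, yld: float) -> float:
    """Disability-adjusted life-years are the sum of YLLs and YLDs."""
    return yll + yld


# Hypothetical inputs for illustration only.
toy_deaths = {20: 100, 30: 200}            # deaths by age at death
toy_life_expectancy = {20: 60.0, 30: 50.0}  # remaining standard life expectancy
toy_yll = years_of_life_lost(toy_deaths, toy_life_expectancy)
toy_yld = years_lived_with_disability(1000, 0.25)
toy_dalys = dalys(toy_yll, toy_yld)
toy_incidence = incidence_from_mortality(300, 0.25)
```

Because deaths at young ages are multiplied by long remaining life expectancies, YLLs tend to dominate DALYs in this age group, which is consistent with the 97.3% YLL share reported in the abstract.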
The global burden of cancer attributable to risk factors, 2010-19 : a systematic analysis for the Global Burden of Disease Study 2019
Background Understanding the magnitude of cancer burden attributable to potentially modifiable risk factors is crucial for development of effective prevention and mitigation strategies. We analysed results from the Global Burden of Diseases, Injuries, and Risk Factors Study (GBD) 2019 to inform cancer control planning efforts globally. Methods The GBD 2019 comparative risk assessment framework was used to estimate cancer burden attributable to behavioural, environmental and occupational, and metabolic risk factors. A total of 82 risk-outcome pairs were included on the basis of the World Cancer Research Fund criteria. Estimated cancer deaths and disability-adjusted life-years (DALYs) in 2019 and change in these measures between 2010 and 2019 are presented. Findings Globally, in 2019, the risk factors included in this analysis accounted for 4.45 million (95% uncertainty interval 4.01-4.94) deaths and 105 million (95.0-116) DALYs for both sexes combined, representing 44.4% (41.3-48.4) of all cancer deaths and 42.0% (39.1-45.6) of all DALYs. There were 2.88 million (2.60-3.18) risk-attributable cancer deaths in males (50.6% [47.8-54.1] of all male cancer deaths) and 1.58 million (1.36-1.84) risk-attributable cancer deaths in females (36.3% [32.5-41.3] of all female cancer deaths). The leading risk factors at the most detailed level globally for risk-attributable cancer deaths and DALYs in 2019 for both sexes combined were smoking, followed by alcohol use and high BMI. Risk-attributable cancer burden varied by world region and Socio-demographic Index (SDI), with smoking, unsafe sex, and alcohol use being the three leading risk factors for risk-attributable cancer DALYs in low SDI locations in 2019, whereas DALYs in high SDI locations mirrored the top three global risk factor rankings. 
From 2010 to 2019, global risk-attributable cancer deaths increased by 20.4% (12.6-28.4) and DALYs by 16.8% (8.8-25.0), with the greatest percentage increase in metabolic risks (34.7% [27.9-42.8] and 33.3% [25.8-42.0]). Interpretation The leading risk factors contributing to global cancer burden in 2019 were behavioural, whereas metabolic risk factors saw the largest increases between 2010 and 2019. Reducing exposure to these modifiable risk factors would decrease cancer mortality and DALY rates worldwide, and policies should be tailored appropriately to local cancer risk factor burden. Copyright (C) 2022 The Author(s). Published by Elsevier Ltd. This is an Open Access article under the CC BY 4.0 license.
Global burden of 288 causes of death and life expectancy decomposition in 204 countries and territories and 811 subnational locations, 1990–2021: a systematic analysis for the Global Burden of Disease Study 2021
BACKGROUND Regular, detailed reporting on population health by underlying cause of death is fundamental for public health decision making. Cause-specific estimates of mortality and the subsequent effects on life expectancy worldwide are valuable metrics to gauge progress in reducing mortality rates. These estimates are particularly important following large-scale mortality spikes, such as the COVID-19 pandemic. When systematically analysed, mortality rates and life expectancy allow comparisons of the consequences of causes of death globally and over time, providing a nuanced understanding of the effect of these causes on global populations. METHODS The Global Burden of Diseases, Injuries, and Risk Factors Study (GBD) 2021 cause-of-death analysis estimated mortality and years of life lost (YLLs) from 288 causes of death by age-sex-location-year in 204 countries and territories and 811 subnational locations for each year from 1990 until 2021. The analysis used 56 604 data sources, including data from vital registration and verbal autopsy as well as surveys, censuses, surveillance systems, and cancer registries, among others. As with previous GBD rounds, cause-specific death rates for most causes were estimated using the Cause of Death Ensemble model (a modelling tool developed for GBD to assess the out-of-sample predictive validity of different statistical models and covariate permutations and combine those results to produce cause-specific mortality estimates), with alternative strategies adapted to model causes with insufficient data, substantial changes in reporting over the study period, or unusual epidemiology. YLLs were computed as the product of the number of deaths for each cause-age-sex-location-year and the standard life expectancy at each age. As part of the modelling process, uncertainty intervals (UIs) were generated using the 2·5th and 97·5th percentiles from a 1000-draw distribution for each metric.
We decomposed life expectancy by cause of death, location, and year to show cause-specific effects on life expectancy from 1990 to 2021. We also used the coefficient of variation and the fraction of population affected by 90% of deaths to highlight concentrations of mortality. Findings are reported in counts and age-standardised rates. Methodological improvements for cause-of-death estimates in GBD 2021 include the expansion of the under-5-years age group to include four new age groups, enhanced methods to account for stochastic variation of sparse data, and the inclusion of COVID-19 and other pandemic-related mortality, which includes excess mortality associated with the pandemic, excluding COVID-19, lower respiratory infections, measles, malaria, and pertussis. For this analysis, 199 new country-years of vital registration cause-of-death data, 5 country-years of surveillance data, 21 country-years of verbal autopsy data, and 94 country-years of other data types were added to those used in previous GBD rounds. FINDINGS The leading causes of age-standardised deaths globally were the same in 2019 as they were in 1990; in descending order, these were ischaemic heart disease, stroke, chronic obstructive pulmonary disease, and lower respiratory infections. In 2021, however, COVID-19 replaced stroke as the second-leading age-standardised cause of death, with 94·0 deaths (95% UI 89·2-100·0) per 100 000 population. The COVID-19 pandemic shifted the rankings of the leading five causes, lowering stroke to the third-leading and chronic obstructive pulmonary disease to the fourth-leading position. In 2021, the highest age-standardised death rates from COVID-19 occurred in sub-Saharan Africa (271·0 deaths [250·1-290·7] per 100 000 population) and Latin America and the Caribbean (195·4 deaths [182·1-211·4] per 100 000 population).
The lowest age-standardised death rates from COVID-19 were in the high-income super-region (48·1 deaths [47·4-48·8] per 100 000 population) and southeast Asia, east Asia, and Oceania (23·2 deaths [16·3-37·2] per 100 000 population). Globally, life expectancy steadily improved between 1990 and 2019 for 18 of the 22 investigated causes. Decomposition of global and regional life expectancy showed the positive effect that reductions in deaths from enteric infections, lower respiratory infections, stroke, and neonatal deaths, among others have contributed to improved survival over the study period. However, a net reduction of 1·6 years occurred in global life expectancy between 2019 and 2021, primarily due to increased death rates from COVID-19 and other pandemic-related mortality. Life expectancy was highly variable between super-regions over the study period, with southeast Asia, east Asia, and Oceania gaining 8·3 years (6·7-9·9) overall, while having the smallest reduction in life expectancy due to COVID-19 (0·4 years). The largest reduction in life expectancy due to COVID-19 occurred in Latin America and the Caribbean (3·6 years). Additionally, 53 of the 288 causes of death were highly concentrated in locations with less than 50% of the global population as of 2021, and these causes of death became progressively more concentrated since 1990, when only 44 causes showed this pattern. The concentration phenomenon is discussed heuristically with respect to enteric and lower respiratory infections, malaria, HIV/AIDS, neonatal disorders, tuberculosis, and measles. INTERPRETATION Long-standing gains in life expectancy and reductions in many of the leading causes of death have been disrupted by the COVID-19 pandemic, the adverse effects of which were spread unevenly among populations. Despite the pandemic, there has been continued progress in combatting several notable causes of death, leading to improved global life expectancy over the study period. 
Each of the seven GBD super-regions showed an overall improvement between 1990 and 2021, obscuring the negative effect in the years of the pandemic. Additionally, our findings regarding regional variation in causes of death driving increases in life expectancy hold clear policy utility. Analyses of shifting mortality trends reveal that several causes, once widespread globally, are now increasingly concentrated geographically. These changes in mortality concentration, alongside further investigation of changing risks, interventions, and relevant policy, present an important opportunity to deepen our understanding of mortality-reduction strategies. Examining patterns in mortality concentration might reveal areas where successful public health interventions have been implemented. Translating these successes to locations where certain causes of death remain entrenched can inform policies that work to improve life expectancy for people everywhere. FUNDING Bill & Melinda Gates Foundation.
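Two of the summary statistics named in the Methods of the abstract above lend themselves to a short illustration: an uncertainty interval taken from the 2·5th and 97·5th percentiles of a set of draws, and the coefficient of variation. This is a minimal sketch under assumptions the abstract does not settle: percentiles use linear interpolation between ranks, and the coefficient of variation uses the population standard deviation. The function names are hypothetical.

```python
import statistics
from typing import Sequence, Tuple


def percentile(sorted_values: Sequence[float], p: float) -> float:
    """p-th percentile (0-100) of pre-sorted values.

    Assumption: linear interpolation between the two closest ranks.
    """
    if not sorted_values:
        raise ValueError("at least one value is required")
    position = (len(sorted_values) - 1) * p / 100.0
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] + (sorted_values[upper] - sorted_values[lower]) * fraction


def uncertainty_interval(draws: Sequence[float]) -> Tuple[float, float]:
    """95% UI as the 2.5th and 97.5th percentiles of the draws."""
    ordered = sorted(draws)
    return percentile(ordered, 2.5), percentile(ordered, 97.5)


def coefficient_of_variation(values: Sequence[float]) -> float:
    """Population standard deviation divided by the mean (an assumed convention)."""
    return statistics.pstdev(values) / statistics.mean(values)
```

With 1000 draws per metric, as in the abstract, `uncertainty_interval` discards roughly the 25 lowest and 25 highest draws. A higher coefficient of variation across locations indicates that deaths from a cause are more unevenly spread.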
- …