31 research outputs found

    Polychrome: Creating and Assessing Qualitative Palettes with Many Colors

    Although R includes numerous tools for creating color palettes to display continuous data, facilities for displaying categorical data primarily use the RColorBrewer package, which is, by default, limited to 12 colors. The colorspace package can produce more colors, but it is not immediately clear how to use it to produce colors that can be reliably distinguished in different kinds of plots. However, applications to genomics would be enhanced by the ability to display at least the 24 human chromosomes in distinct colors, as is common in technologies like spectral karyotyping. In this article, we describe the Polychrome package, which can be used to construct palettes with at least 24 colors that can be distinguished by most people with normal color vision. Polychrome includes a variety of visualization methods allowing users to evaluate the proposed palettes. In addition, we review the history of attempts to construct qualitative color palettes with many colors.

    Commutator Leavitt path algebras

    For any field K and directed graph E, we completely describe the elements of the Leavitt path algebra L_K(E) which lie in the commutator subspace [L_K(E),L_K(E)]. We then use this result to classify all Leavitt path algebras L_K(E) that satisfy L_K(E)=[L_K(E),L_K(E)]. We also show that these Leavitt path algebras have the additional (unusual) property that all their Lie ideals are (ring-theoretic) ideals, and construct examples of such rings with various ideal structures. Comment: 24 pages.

    A protocol to evaluate RNA sequencing normalization methods

    Background: RNA sequencing technologies have allowed researchers to gain a better understanding of how the transcriptome affects disease. However, sequencing technologies often unintentionally introduce experimental error into RNA sequencing data. To counteract this, normalization methods are standardly applied with the intent of reducing the non-biologically derived variability inherent in transcriptomic measurements. However, the comparative efficacy of the various normalization techniques has not been tested in a standardized manner. Here we propose tests that evaluate numerous normalization techniques and apply them to a large-scale standard data set. These tests comprise a protocol that allows researchers to measure the amount of non-biological variability that remains in any data set after normalization has been performed, a crucial step in assessing the biological validity of data following normalization. Results: In this study we present two tests to assess the validity of normalization methods applied to a large-scale data set collected for systematic evaluation purposes. We tested various RNASeq normalization procedures and concluded that transcripts per million (TPM) was the best performing normalization method based on its preservation of biological signal as compared to the other methods tested. Conclusion: Normalization is of vital importance to accurately interpret the results of genomic and transcriptomic experiments. More work, however, needs to be performed to optimize normalization methods for RNASeq data. The present effort helps pave the way for more systematic evaluations of normalization methods across different platforms. With our proposed schema, researchers can evaluate their own or future normalization methods to further improve the field of RNASeq normalization.
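
The abstract above names transcripts per million (TPM) as the best performing method but does not spell out the calculation. As a minimal sketch of the standard TPM definition (not the authors' pipeline), the function below divides each read count by the transcript length in kilobases and then rescales so the values sum to one million. The function name `tpm` and the example counts and lengths are illustrative assumptions.

```python
def tpm(counts, lengths_kb):
    """Convert raw read counts to transcripts per million (TPM).

    counts     -- read counts, one per transcript (hypothetical input)
    lengths_kb -- transcript lengths in kilobases, same order as counts
    """
    if len(counts) != len(lengths_kb):
        raise ValueError("counts and lengths_kb must have the same length")
    if any(length <= 0 for length in lengths_kb):
        raise ValueError("transcript lengths must be positive")

    # Step 1: reads per kilobase (RPK) removes the transcript-length effect.
    rpk = [count / length for count, length in zip(counts, lengths_kb)]

    # Step 2: divide by the per-million scaling factor so the sample sums to 1e6.
    scale = sum(rpk) / 1_000_000
    if scale == 0:
        return [0.0] * len(rpk)
    return [value / scale for value in rpk]


# Made-up example: three transcripts whose counts scale with their lengths.
example = tpm([10, 20, 30], [1.0, 2.0, 3.0])
```

Because every sample is rescaled to the same total, TPM values are comparable as proportions within a sample; whether they are comparable across samples is the kind of question the protocol described above is meant to test.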

    Plasma microRNA levels following resection of metastatic melanoma

    Melanoma remains the leading cause of skin cancer–related deaths. Surgical resection and adjuvant therapies can result in disease-free intervals for stage III and stage IV disease; however, recurrence is common. Understanding microRNA (miR) dynamics following surgical resection of melanomas is critical to accurately interpret miR changes suggestive of melanoma recurrence. Plasma of 6 patients with stage III (n = 2) and stage IV (n = 4) melanoma was evaluated using the NanoString platform to determine pre- and postsurgical miR expression profiles, enabling analysis of more than 800 miRs simultaneously in 12 samples. Principal component analysis detected underlying patterns of miR expression between pre- vs postsurgical patients. Group A contained 3 of 4 patients with stage IV disease (pre- and postsurgical samples) and 2 patients with stage III disease (postsurgical samples only). The corresponding preoperative samples to both individuals with stage III disease were contained in group B along with 1 individual with stage IV disease (pre- and postsurgical samples). Group A was distinguished from group B by statistically significant analysis of variance changes in miR expression (P < .0001). This analysis revealed that group A vs group B had downregulation of let-7b-5p, miR-520f, miR-720, miR-4454, miR-21-5p, miR-22-3p, miR-151a-3p, miR-378e, and miR-1283 and upregulation of miR-126-3p, miR-223-3p, miR-451a, let-7a-5p, let-7g-5p, miR-15b-5p, miR-16-5p, miR-20a-5p, miR-20b-5p, miR-23a-3p, miR-26a-5p, miR-106a-5p, miR-17-5p, miR-130a-3p, miR-142-3p, miR-150-5p, miR-191-5p, miR-199a-3p, miR-199b-3p, and miR-1976. Changes in miR expression were not readily evident in individuals with distant metastatic disease (stage IV) as these individuals may have prolonged inflammatory responses. Thus, inflammatory-driven miRs coinciding with tumor-derived miRs can blunt anticipated changes in expression profiles following surgical resection.

    Electronic health record data quality assessment and tools: A systematic review

    OBJECTIVE: We extended a 2013 literature review on electronic health record (EHR) data quality assessment approaches and tools to determine recent improvements or changes in EHR data quality assessment methodologies. MATERIALS AND METHODS: We completed a systematic review of PubMed articles from 2013 to April 2023 that discussed the quality assessment of EHR data. We screened and reviewed papers for the dimensions and methods defined in the original 2013 manuscript. We categorized papers as data quality outcomes of interest, tools, or opinion pieces. We abstracted and defined additional themes and methods through an iterative review process. RESULTS: We included 103 papers in the review, of which 73 were data quality outcomes of interest papers, 22 were tools, and 8 were opinion pieces. The most common dimension of data quality assessed was completeness, followed by correctness, concordance, plausibility, and currency. We abstracted conformance and bias as 2 additional dimensions of data quality and structural agreement as an additional methodology. DISCUSSION: There has been an increase in EHR data quality assessment publications since the original 2013 review. Consistent dimensions of EHR data quality continue to be assessed across applications. Despite consistent patterns of assessment, there is still no standard approach for assessing EHR data quality. CONCLUSION: Guidelines are needed for EHR data quality assessment to improve the efficiency, transparency, comparability, and interoperability of data quality assessment. These guidelines must be both scalable and flexible. Automation could be helpful in generalizing this process.

    Pattern recognition in lymphoid malignancies using CytoGPS and Mercator

    BACKGROUND: There have been many recent breakthroughs in processing and analyzing large-scale data sets in biomedical informatics. For example, the CytoGPS algorithm has enabled the use of text-based karyotypes by transforming them into a binary model. However, such advances are accompanied by new problems of data sparsity, heterogeneity, and noisiness that are magnified by the large-scale multidimensional nature of the data. To address these problems, we developed the Mercator R package, which processes and visualizes binary biomedical data. We use Mercator to address biomedical questions of cytogenetic patterns relating to lymphoid hematologic malignancies, which include a broad set of leukemias and lymphomas. Karyotype data are one of the most common forms of genetic data collected on lymphoid malignancies, because karyotyping is part of the standard of care in these cancers. RESULTS: In this paper we combine the analytic power of CytoGPS and Mercator to perform a large-scale multidimensional pattern recognition study on 22,741 karyotype samples in 47 different hematologic malignancies obtained from the public Mitelman database. CONCLUSION: Our findings indicate that Mercator was able to identify both known and novel cytogenetic patterns across different lymphoid malignancies, furthering our understanding of the genetics of these diseases.

    Can We Modify the Intrauterine Environment to Halt the Intergenerational Cycle of Obesity?

    Child obesity is a global epidemic whose development is rooted in complex and multi-factorial interactions. Once established, obesity is difficult to reverse, and epidemiological, animal model, and experimental studies have provided strong evidence implicating the intrauterine environment in downstream obesity. This review focuses on the interplay between maternal obesity, gestational weight gain and lifestyle behaviours, which may act independently or in combination, to perpetuate the intergenerational cycle of obesity. The gestational period is a crucial time of growth, development and physiological change in mother and child. This provides a window of opportunity for intervention via maternal nutrition and/or physical activity that may induce beneficial physiological alterations in the fetus that are mediated through favourable adaptations to in utero environmental stimuli. Evidence in the emerging field of epigenetics suggests that chronic, sub-clinical perturbations during pregnancy may affect fetal phenotype, and long-term human data from ongoing randomized controlled trials will further aid in establishing the science behind one's predisposition to positive energy balance.

    Global age-sex-specific fertility, mortality, healthy life expectancy (HALE), and population estimates in 204 countries and territories, 1950-2019 : a comprehensive demographic analysis for the Global Burden of Disease Study 2019

    Background: Accurate and up-to-date assessment of demographic metrics is crucial for understanding a wide range of social, economic, and public health issues that affect populations worldwide. The Global Burden of Diseases, Injuries, and Risk Factors Study (GBD) 2019 produced updated and comprehensive demographic assessments of the key indicators of fertility, mortality, migration, and population for 204 countries and territories and selected subnational locations from 1950 to 2019. Methods: 8078 country-years of vital registration and sample registration data, 938 surveys, 349 censuses, and 238 other sources were identified and used to estimate age-specific fertility. Spatiotemporal Gaussian process regression (ST-GPR) was used to generate age-specific fertility rates for 5-year age groups between ages 15 and 49 years. With extensions to age groups 10–14 and 50–54 years, the total fertility rate (TFR) was then aggregated using the estimated age-specific fertility between ages 10 and 54 years. 7417 sources were used for under-5 mortality estimation and 7355 for adult mortality. ST-GPR was used to synthesise data sources after correction for known biases. Adult mortality was measured as the probability of death between ages 15 and 60 years based on vital registration, sample registration, and sibling histories, and was also estimated using ST-GPR. HIV-free life tables were then estimated using estimates of under-5 and adult mortality rates using a relational model life table system created for GBD, which closely tracks observed age-specific mortality rates from complete vital registration when available. Independent estimates of HIV-specific mortality generated by an epidemiological analysis of HIV prevalence surveys and antenatal clinic serosurveillance and other sources were incorporated into the estimates in countries with large epidemics. 
Annual and single-year age estimates of net migration and population for each country and territory were generated using a Bayesian hierarchical cohort component model that analysed estimated age-specific fertility and mortality rates along with 1250 censuses and 747 population registry years. We classified location-years into seven categories on the basis of the natural rate of increase in population (calculated by subtracting the crude death rate from the crude birth rate) and the net migration rate. We computed healthy life expectancy (HALE) using years lived with disability (YLDs) per capita, life tables, and standard demographic methods. Uncertainty was propagated throughout the demographic estimation process, including fertility, mortality, and population, with 1000 draw-level estimates produced for each metric. Findings: The global TFR decreased from 2·72 (95% uncertainty interval [UI] 2·66–2·79) in 2000 to 2·31 (2·17–2·46) in 2019. Global annual livebirths increased from 134·5 million (131·5–137·8) in 2000 to a peak of 139·6 million (133·0–146·9) in 2016. Global livebirths then declined to 135·3 million (127·2–144·1) in 2019. Of the 204 countries and territories included in this study, in 2019, 102 had a TFR lower than 2·1, which is considered a good approximation of replacement-level fertility. All countries in sub-Saharan Africa had TFRs above replacement level in 2019 and accounted for 27·1% (95% UI 26·4–27·8) of global livebirths. Global life expectancy at birth increased from 67·2 years (95% UI 66·8–67·6) in 2000 to 73·5 years (72·8–74·3) in 2019. The total number of deaths increased from 50·7 million (49·5–51·9) in 2000 to 56·5 million (53·7–59·2) in 2019. Under-5 deaths declined from 9·6 million (9·1–10·3) in 2000 to 5·0 million (4·3–6·0) in 2019. Global population increased by 25·7%, from 6·2 billion (6·0–6·3) in 2000 to 7·7 billion (7·5–8·0) in 2019. 
In 2019, 34 countries had negative natural rates of increase; in 17 of these, the population declined because immigration was not sufficient to counteract the negative rate of decline. Globally, HALE increased from 58·6 years (56·1–60·8) in 2000 to 63·5 years (60·8–66·1) in 2019. HALE increased in 202 of 204 countries and territories between 2000 and 2019.
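
The abstract above defines the natural rate of increase as the crude birth rate minus the crude death rate, and notes that a population declines when net migration does not offset a negative natural rate. A minimal sketch of that arithmetic follows. The function names and the example rates (per 1000 population) are illustrative assumptions, and the study's seven-category classification is not reproduced here because the abstract does not state its thresholds.

```python
def natural_rate_of_increase(crude_birth_rate, crude_death_rate):
    """Natural rate of increase: crude birth rate minus crude death rate."""
    return crude_birth_rate - crude_death_rate


def population_growth_rate(crude_birth_rate, crude_death_rate, net_migration_rate):
    """Overall growth rate: natural increase plus net migration (same units)."""
    natural = natural_rate_of_increase(crude_birth_rate, crude_death_rate)
    return natural + net_migration_rate


# Hypothetical location-year, all rates per 1000 population.
natural = natural_rate_of_increase(9.0, 11.0)        # negative natural increase
declining = population_growth_rate(9.0, 11.0, 1.5)   # migration too small to offset
growing = population_growth_rate(9.0, 11.0, 3.0)     # migration large enough to offset
```

In this made-up example the natural rate is negative in both cases, and only the size of net migration decides whether the population shrinks or grows, which mirrors the 17-of-34 split described in the findings.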

    Global burden of 369 diseases and injuries in 204 countries and territories, 1990–2019: a systematic analysis for the Global Burden of Disease Study 2019

    Background: In an era of shifting global agendas and expanded emphasis on non-communicable diseases and injuries along with communicable diseases, sound evidence on trends by cause at the national level is essential. The Global Burden of Diseases, Injuries, and Risk Factors Study (GBD) provides a systematic scientific assessment of published, publicly available, and contributed data on incidence, prevalence, and mortality for a mutually exclusive and collectively exhaustive list of diseases and injuries. Methods: GBD estimates incidence, prevalence, mortality, years of life lost (YLLs), years lived with disability (YLDs), and disability-adjusted life-years (DALYs) due to 369 diseases and injuries, for two sexes, and for 204 countries and territories. Input data were extracted from censuses, household surveys, civil registration and vital statistics, disease registries, health service use, air pollution monitors, satellite imaging, disease notifications, and other sources. Cause-specific death rates and cause fractions were calculated using the Cause of Death Ensemble model and spatiotemporal Gaussian process regression. Cause-specific deaths were adjusted to match the total all-cause deaths calculated as part of the GBD population, fertility, and mortality estimates. Deaths were multiplied by standard life expectancy at each age to calculate YLLs. A Bayesian meta-regression modelling tool, DisMod-MR 2.1, was used to ensure consistency between incidence, prevalence, remission, excess mortality, and cause-specific mortality for most causes. Prevalence estimates were multiplied by disability weights for mutually exclusive sequelae of diseases and injuries to calculate YLDs. We considered results in the context of the Socio-demographic Index (SDI), a composite indicator of income per capita, years of schooling, and fertility rate in females younger than 25 years. 
Uncertainty intervals (UIs) were generated for every metric using the 25th and 975th ordered 1000 draw values of the posterior distribution. Findings: Global health has steadily improved over the past 30 years as measured by age-standardised DALY rates. After taking into account population growth and ageing, the absolute number of DALYs has remained stable. Since 2010, the pace of decline in global age-standardised DALY rates has accelerated in age groups younger than 50 years compared with the 1990–2010 time period, with the greatest annualised rate of decline occurring in the 0–9-year age group. Six infectious diseases were among the top ten causes of DALYs in children younger than 10 years in 2019: lower respiratory infections (ranked second), diarrhoeal diseases (third), malaria (fifth), meningitis (sixth), whooping cough (ninth), and sexually transmitted infections (which, in this age group, is fully accounted for by congenital syphilis; ranked tenth). In adolescents aged 10–24 years, three injury causes were among the top causes of DALYs: road injuries (ranked first), self-harm (third), and interpersonal violence (fifth). Five of the causes that were in the top ten for ages 10–24 years were also in the top ten in the 25–49-year age group: road injuries (ranked first), HIV/AIDS (second), low back pain (fourth), headache disorders (fifth), and depressive disorders (sixth). In 2019, ischaemic heart disease and stroke were the top-ranked causes of DALYs in both the 50–74-year and 75-years-and-older age groups. Since 1990, there has been a marked shift towards a greater proportion of burden due to YLDs from non-communicable diseases and injuries. In 2019, there were 11 countries where non-communicable disease and injury YLDs constituted more than half of all disease burden. 
Decreases in age-standardised DALY rates have accelerated over the past decade in countries at the lower end of the SDI range, while improvements have started to stagnate or even reverse in countries with higher SDI. Interpretation: As disability becomes an increasingly large component of disease burden and a larger component of health expenditure, greater research and development investment is needed to identify new, more effective intervention strategies. With a rapidly ageing global population, the demands on health services to deal with disabling outcomes, which increase with age, will require policy makers to anticipate these changes. The mix of universal and more geographically specific influences on health reinforces the need for regular reporting on population health in detail and by underlying cause to help decision makers to identify success stories of disease control to emulate, as well as opportunities to improve. Funding: Bill & Melinda Gates Foundation. © 2020 The Author(s). Published by Elsevier Ltd. This is an Open Access article under the CC BY 4.0 license.
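
The abstract above states that uncertainty intervals were taken as the 25th and 975th ordered values of 1000 posterior draws, which corresponds to the 2.5th and 97.5th percentiles. A minimal sketch of that rule follows. The function name `uncertainty_interval` and the synthetic draws are assumptions for illustration, not GBD code or data.

```python
import random


def uncertainty_interval(draws):
    """Return the 25th and 975th ordered values of exactly 1000 draws."""
    if len(draws) != 1000:
        raise ValueError("expected exactly 1000 draws")
    ordered = sorted(draws)
    # Positions are 1-indexed in the text, so subtract 1 for Python indexing.
    return ordered[25 - 1], ordered[975 - 1]


# Synthetic draws: the integers 1..1000 in random order (illustration only).
synthetic_draws = list(range(1, 1001))
random.Random(0).shuffle(synthetic_draws)
lower, upper = uncertainty_interval(synthetic_draws)
```

With the integers 1 to 1000 as draws, the ordered position and the value coincide, which makes the indexing easy to check by eye.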

    Convalescent plasma in patients admitted to hospital with COVID-19 (RECOVERY): a randomised controlled, open-label, platform trial

    Background: Azithromycin has been proposed as a treatment for COVID-19 on the basis of its immunomodulatory actions. We aimed to evaluate the safety and efficacy of azithromycin in patients admitted to hospital with COVID-19. Methods: In this randomised, controlled, open-label, adaptive platform trial (Randomised Evaluation of COVID-19 Therapy [RECOVERY]), several possible treatments were compared with usual care in patients admitted to hospital with COVID-19 in the UK. The trial is underway at 176 hospitals in the UK. Eligible and consenting patients were randomly allocated to either usual standard of care alone or usual standard of care plus azithromycin 500 mg once per day by mouth or intravenously for 10 days or until discharge (or allocation to one of the other RECOVERY treatment groups). Patients were assigned via web-based simple (unstratified) randomisation with allocation concealment and were twice as likely to be randomly assigned to usual care than to any of the active treatment groups. Participants and local study staff were not masked to the allocated treatment, but all others involved in the trial were masked to the outcome data during the trial. The primary outcome was 28-day all-cause mortality, assessed in the intention-to-treat population. The trial is registered with ISRCTN, 50189673, and ClinicalTrials.gov, NCT04381936. Findings: Between April 7 and Nov 27, 2020, of 16 442 patients enrolled in the RECOVERY trial, 9433 (57%) were eligible and 7763 were included in the assessment of azithromycin. The mean age of these study participants was 65·3 years (SD 15·7) and approximately a third were women (2944 [38%] of 7763). 2582 patients were randomly allocated to receive azithromycin and 5181 patients were randomly allocated to usual care alone. Overall, 561 (22%) patients allocated to azithromycin and 1162 (22%) patients allocated to usual care died within 28 days (rate ratio 0·97, 95% CI 0·87–1·07; p=0·50).
No significant difference was seen in duration of hospital stay (median 10 days [IQR 5 to >28] vs 11 days [5 to >28]) or the proportion of patients discharged from hospital alive within 28 days (rate ratio 1·04, 95% CI 0·98–1·10; p=0·19). Among those not on invasive mechanical ventilation at baseline, no significant difference was seen in the proportion meeting the composite endpoint of invasive mechanical ventilation or death (risk ratio 0·95, 95% CI 0·87–1·03; p=0·24). Interpretation: In patients admitted to hospital with COVID-19, azithromycin did not improve survival or other prespecified clinical outcomes. Azithromycin use in patients admitted to hospital with COVID-19 should be restricted to patients in whom there is a clear antimicrobial indication.