
    Lost in translation: On the impact of data coding on penalized regression with interactions

    Penalized regression approaches are standard tools in quantitative genetics. It is known that the fit of an ordinary least squares (OLS) regression is independent of certain transformations of the coding of the predictor variables, and that the standard mixed model ridge regression best linear unbiased prediction (RRBLUP) is affected neither by translations of the variable coding nor by global scaling. However, it has been reported that an extended version of this mixed model, which incorporates interactions by including products of markers as additional predictor variables, is affected by translations of the marker coding. In this work, we identify the cause of this loss of invariance in a general context of penalized regression on polynomials in the predictor variables. We show that in most cases, translating the coding of the predictor variables has an impact on effect estimates, the exception being when only the sizes of the coefficients of monomials of highest total degree are penalized. The invariance of RRBLUP can thus be considered a special case of this setting, with a polynomial of total degree 1, where the size of the fixed effect (total degree 0) is not penalized but all coefficients of monomials of total degree 1 are. The extended RRBLUP, which includes interactions, is not invariant to translations because it penalizes not only interactions (total degree 2) but also additive effects (total degree 1). Our observations are not restricted to ridge regression but are generally valid for penalized regressions, for instance also for the ℓ1 penalty of the LASSO.
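The invariance claims in this abstract can be checked numerically on a toy example. The sketch below is not the authors' code: it assumes two markers, made-up phenotypes, a ridge penalty with an arbitrary weight of 1, and compares the common 0/1/2 coding against the same data shifted by -1. All function names are hypothetical.

```python
def solve(a, b):
    """Solve the square system a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def penalized_fit(rows, y, penalty, lam=1.0):
    """Fitted values of argmin ||y - Xb||^2 + lam * sum(penalty[j] * b[j]**2)."""
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) + (lam * penalty[i] if i == j else 0.0)
            for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    beta = solve(xtx, xty)
    return [sum(bj * rj for bj, rj in zip(beta, r)) for r in rows]


def design(markers, shift, interaction):
    """Columns: intercept, marker 1, marker 2, and optionally their product."""
    rows = []
    for m1, m2 in markers:
        a, b = m1 + shift, m2 + shift
        rows.append([1.0, a, b] + ([a * b] if interaction else []))
    return rows


# Illustrative data only: marker codes in {0, 1, 2} and invented phenotypes.
markers = [(0, 0), (0, 2), (1, 1), (2, 0), (2, 2), (1, 2), (0, 1), (2, 1)]
y = [1.0, 2.5, 2.0, 3.0, 7.5, 4.0, 1.5, 4.5]


def translation_gap(interaction, penalty):
    """Largest change in fitted values when the coding is shifted by -1."""
    fit_012 = penalized_fit(design(markers, 0.0, interaction), y, penalty)
    fit_shifted = penalized_fit(design(markers, -1.0, interaction), y, penalty)
    return max(abs(u - v) for u, v in zip(fit_012, fit_shifted))


gap_additive = translation_gap(False, [0, 1, 1])     # degree 1, intercept unpenalized
gap_extended = translation_gap(True, [0, 1, 1, 1])   # additive and interaction penalized
gap_top_only = translation_gap(True, [0, 0, 0, 1])   # only the highest degree penalized
print(gap_additive, gap_extended, gap_top_only)
```

Under these assumptions, the first and third gaps should vanish up to rounding, while the second should not, which matches the pattern the abstract describes.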

    Cooperative binding: a multiple personality

    Cooperative binding has been described in many publications and has been related to, or defined by, several different properties of the binding behavior of the ligand to the target molecule. In addition to the commonly used Hill coefficient, other characteristics have been used to define or quantify cooperative binding: a sigmoidal shape of the overall titration curve in a linear plot, a change in ligand affinity of the other binding sites when a site of the target molecule becomes occupied, or complex roots of the binding polynomial. In this work, we analyze how these properties are related in the most general model for binding curves based on the grand canonical partition function, and we present several examples that highlight differences between the cooperativity-characterizing properties discussed. Our results mainly show that no two of the presented definitions fully coincide. Moreover, this work poses the question of whether it makes sense to distinguish between positive and negative cooperativity based on the macroscopic binding isotherm alone. This article emphasizes that scientists who investigate cooperative effects in biological systems could help avoid misunderstandings by stating clearly which kind of cooperativity they discuss. Facultad de Ciencias Exactas; Centro Regional de Estudios Genómico
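As a small illustration of one of the properties named above, the sketch below evaluates a Hill-type slope from a binding polynomial. It assumes the mean occupancy is x·P'(x)/P(x) for a binding polynomial P, and takes the Hill slope to be the derivative of log(θ/(1−θ)) with respect to log x, where θ is the fractional saturation. Both the coefficients and the function names are illustrative and not taken from the paper.

```python
import math


def occupancy(coeffs, x):
    """Mean number of bound ligands, x * P'(x) / P(x), for P(x) = sum_k coeffs[k] * x**k."""
    p = sum(a * x ** k for k, a in enumerate(coeffs))
    x_dp = sum(k * a * x ** k for k, a in enumerate(coeffs))  # equals x * P'(x)
    return x_dp / p


def hill_slope(coeffs, x, h=1e-5):
    """Central-difference slope of log(theta / (1 - theta)) against log(x)."""
    n_sites = len(coeffs) - 1

    def logit(log_x):
        theta = occupancy(coeffs, math.exp(log_x)) / n_sites
        return math.log(theta / (1.0 - theta))

    log_x = math.log(x)
    return (logit(log_x + h) - logit(log_x - h)) / (2.0 * h)


# Two identical, independent sites: P(x) = (1 + x)^2.
independent = [1.0, 2.0, 1.0]
# Hypothetical two-site target in which the doubly bound state is favoured.
coupled = [1.0, 0.2, 1.0]

# Both polynomials reach half saturation at x = 1.
slope_independent = hill_slope(independent, 1.0)
slope_coupled = hill_slope(coupled, 1.0)
print(slope_independent, slope_coupled)
```

The slope is only one of the criteria listed in the abstract. The sketch makes no claim about how it relates to the other definitions for targets with more binding sites.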

    Integrating Gene Expression Data Into Genomic Prediction

    Gene expression profiles potentially hold valuable information for the prediction of breeding values and phenotypes. In this study, the utility of transcriptome data for phenotype prediction was tested with 185 inbred lines of Drosophila melanogaster for nine traits in two sexes. We incorporated the transcriptome data into genomic prediction via two methods, GTBLUP and GRBLUP, both of which combine single nucleotide polymorphisms (SNPs) and transcriptome data. The genotypic data were used to construct the common additive genomic relationship, which was used in genomic best linear unbiased prediction (GBLUP) or jointly in a linear mixed model with a transcriptome-based linear kernel (GTBLUP) or with a transcriptome-based Gaussian kernel (GRBLUP). We studied the predictive ability of the models and discuss a concept of “omics-augmented broad sense heritability” for the multi-omics era. For most traits, GRBLUP and GBLUP provided similar predictive abilities, but GRBLUP explained more of the phenotypic variance. There was only one trait (olfactory perception of ethyl butyrate in females) for which the predictive ability of GRBLUP (0.23) was significantly higher than that of GBLUP (0.21). Our results suggest that accounting for transcriptome data has the potential to improve genomic predictions if transcriptome data can be included on a larger scale.

    Editorial: Genomic selection: Lessons learned and perspectives

    Genomic selection (GS) has been one of the most prominent Research Topics in breeding science in the last two decades, following the milestone paper by Meuwissen et al. (2001). Its huge potential for increasing the efficiency of breeding programs attracted scientific curiosity and research funding. Many different statistical prediction methods have been tested, and different use cases have been explored. We organized this Research Topic to look both back and forward. The objectives were to review the developments of the last 20 years, to provide a snapshot of current hot topics, and potentially also to define areas on which more (or less) focus should be put in the future, thereby supporting readers in formulating and prioritizing their ideas for future research. Several questions were brought up when organizing this Research Topic, including: How did GS change breeding schemes? What impact did GS have on realized selection gain? What, considering the particularities of different crops, may be optimal breeding schemes to leverage the full potential of GS? What has been the impact, and what is the potential, of hybrid prediction, statistical epistasis models, deep learning and other methods? What are the long-term effects of GS? Can predictive breeding approaches also be used to harness genetic resources from germplasm banks in a more efficient way?

    Genome and Environment Based Prediction Models and Methods of Complex Traits Incorporating Genotype × Environment Interaction

    Genomic-enabled prediction models are of paramount importance for the successful implementation of genomic selection (GS) based on breeding values. As opposed to animal breeding, plant breeding includes extensive multienvironment and multiyear field trial data. Hence, genomic-enabled prediction models should include genotype × environment (G × E) interaction, which most of the time increases the prediction performance when the response of lines differs from environment to environment. In this chapter, we describe a historical timeline since 2012 of advances in GS models that take G × E interaction into account. We describe theoretical and practical aspects of those GS models, including the gains in prediction performance when including G × E structures for both complex continuous and categorical scale traits. Then, we detail and explain the main G × E genomic prediction models for complex traits measured on continuous and noncontinuous (categorical) scales. In relation to G × E interaction models, this review also examines the analysis of information generated from high-throughput phenotype data (phenomics) and the joint analysis of multitrait and multienvironment field trial data, which is also employed in the general assessment of multitrait G × E interaction. The inclusion of nongenomic data to increase the accuracy and biological reliability of the G × E approach is also outlined. We show the recent advances in large-scale envirotyping (enviromics), and how mechanistic computational modeling can derive the crop growth and development aspects useful for predicting phenotypes and explaining G × E.

    Allowance for Shareholder Equity - Implementing a Neutral Corporate Income Tax in the European Union

    This paper proposes the introduction of a consumption-based corporate income tax in the European Union. Our proposal would guarantee neutrality regarding investment decisions and at the same time increase cost-efficiency. The proposal is based on the S-base cash flow tax, where transactions within the corporate sector are not taxable at all and only transactions between shareholders and corporations are subject to tax. In contrast to existing S-base cash flow tax systems, tax deductibility of investments is deferred. Rather, the acquisition costs and capital endowments are compounded at the capital market rate and are set off against future capital gains. Dividends and withdrawals are fully taxable at the shareholder level. Because of the similarities to the Allowance for Corporate Equity (ACE) tax, our proposal is called Allowance for Shareholder Equity (ASE tax). The ASE tax exhibits the same neutrality properties as the traditional cash flow tax. Moreover, the compounded inter-temporal credit method ensures that it is neutral with respect to the decision between domestic and foreign investment. To increase acceptance of the ASE tax, current taxpayers' documentation requirements will be reduced rather than extended. Our proposal is shaped in a way that it could be realized in a single EU country or in all member states of the EU.

    Global incidence, prevalence, years lived with disability (YLDs), disability-adjusted life-years (DALYs), and healthy life expectancy (HALE) for 371 diseases and injuries in 204 countries and territories and 811 subnational locations, 1990–2021: a systematic analysis for the Global Burden of Disease Study 2021

    Background: Detailed, comprehensive, and timely reporting on population health by underlying causes of disability and premature death is crucial to understanding and responding to complex patterns of disease and injury burden over time and across age groups, sexes, and locations. The availability of disease burden estimates can promote evidence-based interventions that enable public health researchers, policy makers, and other professionals to implement strategies that can mitigate diseases. It can also facilitate more rigorous monitoring of progress towards national and international health targets, such as the Sustainable Development Goals. For three decades, the Global Burden of Diseases, Injuries, and Risk Factors Study (GBD) has filled that need. A global network of collaborators contributed to the production of GBD 2021 by providing, reviewing, and analysing all available data. GBD estimates are updated routinely with additional data and refined analytical methods. GBD 2021 presents, for the first time, estimates of health loss due to the COVID-19 pandemic. Methods: The GBD 2021 disease and injury burden analysis estimated years lived with disability (YLDs), years of life lost (YLLs), disability-adjusted life-years (DALYs), and healthy life expectancy (HALE) for 371 diseases and injuries using 100 983 data sources. Data were extracted from vital registration systems, verbal autopsies, censuses, household surveys, disease-specific registries, health service contact data, and other sources. YLDs were calculated by multiplying cause-age-sex-location-year-specific prevalence of sequelae by their respective disability weights, for each disease and injury. YLLs were calculated by multiplying cause-age-sex-location-year-specific deaths by the standard life expectancy at the age that death occurred. DALYs were calculated by summing YLDs and YLLs. 
HALE estimates were produced using YLDs per capita and age-specific mortality rates by location, age, sex, year, and cause. 95% uncertainty intervals (UIs) were generated for all final estimates as the 2·5th and 97·5th percentiles values of 500 draws. Uncertainty was propagated at each step of the estimation process. Counts and age-standardised rates were calculated globally, for seven super-regions, 21 regions, 204 countries and territories (including 21 countries with subnational locations), and 811 subnational locations, from 1990 to 2021. Here we report data for 2010 to 2021 to highlight trends in disease burden over the past decade and through the first 2 years of the COVID-19 pandemic. Findings: Global DALYs increased from 2·63 billion (95% UI 2·44–2·85) in 2010 to 2·88 billion (2·64–3·15) in 2021 for all causes combined. Much of this increase in the number of DALYs was due to population growth and ageing, as indicated by a decrease in global age-standardised all-cause DALY rates of 14·2% (95% UI 10·7–17·3) between 2010 and 2019. Notably, however, this decrease in rates reversed during the first 2 years of the COVID-19 pandemic, with increases in global age-standardised all-cause DALY rates since 2019 of 4·1% (1·8–6·3) in 2020 and 7·2% (4·7–10·0) in 2021. In 2021, COVID-19 was the leading cause of DALYs globally (212·0 million [198·0–234·5] DALYs), followed by ischaemic heart disease (188·3 million [176·7–198·3]), neonatal disorders (186·3 million [162·3–214·9]), and stroke (160·4 million [148·0–171·7]). However, notable health gains were seen among other leading communicable, maternal, neonatal, and nutritional (CMNN) diseases. Globally between 2010 and 2021, the age-standardised DALY rates for HIV/AIDS decreased by 47·8% (43·3–51·7) and for diarrhoeal diseases decreased by 47·0% (39·9–52·9). Non-communicable diseases contributed 1·73 billion (95% UI 1·54–1·94) DALYs in 2021, with a decrease in age-standardised DALY rates since 2010 of 6·4% (95% UI 3·5–9·5). 
Between 2010 and 2021, among the 25 leading Level 3 causes, age-standardised DALY rates increased most substantially for anxiety disorders (16·7% [14·0–19·8]), depressive disorders (16·4% [11·9–21·3]), and diabetes (14·0% [10·0–17·4]). Age-standardised DALY rates due to injuries decreased globally by 24·0% (20·7–27·2) between 2010 and 2021, although improvements were not uniform across locations, ages, and sexes. Globally, HALE at birth improved slightly, from 61·3 years (58·6–63·6) in 2010 to 62·2 years (59·4–64·7) in 2021. However, despite this overall increase, HALE decreased by 2·2% (1·6–2·9) between 2019 and 2021. Interpretation: Putting the COVID-19 pandemic in the context of a mutually exclusive and collectively exhaustive list of causes of health loss is crucial to understanding its impact and ensuring that health funding and policy address needs at both local and global levels through cost-effective and evidence-based interventions. A global epidemiological transition remains underway. Our findings suggest that prioritising non-communicable disease prevention and treatment policies, as well as strengthening health systems, continues to be crucially important. The progress on reducing the burden of CMNN diseases must not stall; although global trends are improving, the burden of CMNN diseases remains unacceptably high. Evidence-based interventions will help save the lives of young children and mothers and improve the overall health and economic conditions of societies across the world. Governments and multilateral organisations should prioritise pandemic preparedness planning alongside efforts to reduce the burden of diseases and injuries that will strain resources in the coming decades. Funding: Bill & Melinda Gates Foundation
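The Methods description above (YLDs as prevalence times disability weight, YLLs as deaths times standard life expectancy, DALYs as their sum, and uncertainty intervals taken as the 2·5th and 97·5th percentiles of the draws) can be sketched as follows. The input numbers are invented for illustration, and the linear-interpolation percentile rule is an assumption, since the abstract does not say which rule the study uses.

```python
def ylds(prevalence, disability_weight):
    """Years lived with disability: prevalent cases times disability weight."""
    return prevalence * disability_weight


def ylls(deaths, standard_life_expectancy):
    """Years of life lost: deaths times standard life expectancy at age of death."""
    return deaths * standard_life_expectancy


def dalys(prevalence, disability_weight, deaths, standard_life_expectancy):
    """Disability-adjusted life-years: YLDs plus YLLs."""
    return ylds(prevalence, disability_weight) + ylls(deaths, standard_life_expectancy)


def percentile(sorted_values, q):
    """Percentile with linear interpolation between order statistics (assumed rule)."""
    pos = (len(sorted_values) - 1) * q / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    return sorted_values[lower] + (sorted_values[upper] - sorted_values[lower]) * (pos - lower)


def uncertainty_interval(draws):
    """95% uncertainty interval as the 2.5th and 97.5th percentiles of the draws."""
    ordered = sorted(draws)
    return percentile(ordered, 2.5), percentile(ordered, 97.5)


# Hypothetical single cause-age-sex-location-year cell with 500 draws in which
# only the prevalence varies; real draws would vary every input.
draws = [dalys(1000 + i, 0.1, 10, 30.0) for i in range(500)]
ui_low, ui_high = uncertainty_interval(draws)
print(ui_low, ui_high)
```

In practice each draw would come from the full estimation pipeline, so this only shows the arithmetic of the final summarisation step.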

    Global burden and strength of evidence for 88 risk factors in 204 countries and 811 subnational locations, 1990–2021: a systematic analysis for the Global Burden of Disease Study 2021

    Background: Understanding the health consequences associated with exposure to risk factors is necessary to inform public health policy and practice. To systematically quantify the contributions of risk factor exposures to specific health outcomes, the Global Burden of Diseases, Injuries, and Risk Factors Study (GBD) 2021 aims to provide comprehensive estimates of exposure levels, relative health risks, and attributable burden of disease for 88 risk factors in 204 countries and territories and 811 subnational locations, from 1990 to 2021. Methods: The GBD 2021 risk factor analysis used data from 54 561 total distinct sources to produce epidemiological estimates for 88 risk factors and their associated health outcomes for a total of 631 risk–outcome pairs. Pairs were included on the basis of data-driven determination of a risk–outcome association. Age-sex-location-year-specific estimates were generated at global, regional, and national levels. Our approach followed the comparative risk assessment framework predicated on a causal web of hierarchically organised, potentially combinative, modifiable risks. Relative risks (RRs) of a given outcome occurring as a function of risk factor exposure were estimated separately for each risk–outcome pair, and summary exposure values (SEVs), representing risk-weighted exposure prevalence, and theoretical minimum risk exposure levels (TMRELs) were estimated for each risk factor. These estimates were used to calculate the population attributable fraction (PAF; ie, the proportional change in health risk that would occur if exposure to a risk factor were reduced to the TMREL). The product of PAFs and disease burden associated with a given outcome, measured in disability-adjusted life-years (DALYs), yielded measures of attributable burden (ie, the proportion of total disease burden attributable to a particular risk factor or combination of risk factors). 
Adjustments for mediation were applied to account for relationships involving risk factors that act indirectly on outcomes via intermediate risks. Attributable burden estimates were stratified by Socio-demographic Index (SDI) quintile and presented as counts, age-standardised rates, and rankings. To complement estimates of RR and attributable burden, newly developed burden of proof risk function (BPRF) methods were applied to yield supplementary, conservative interpretations of risk–outcome associations based on the consistency of underlying evidence, accounting for unexplained heterogeneity between input data from different studies. Estimates reported represent the mean value across 500 draws from the estimate's distribution, with 95% uncertainty intervals (UIs) calculated as the 2·5th and 97·5th percentile values across the draws. Findings: Among the specific risk factors analysed for this study, particulate matter air pollution was the leading contributor to the global disease burden in 2021, contributing 8·0% (95% UI 6·7–9·4) of total DALYs, followed by high systolic blood pressure (SBP; 7·8% [6·4–9·2]), smoking (5·7% [4·7–6·8]), low birthweight and short gestation (5·6% [4·8–6·3]), and high fasting plasma glucose (FPG; 5·4% [4·8–6·0]). For younger demographics (ie, those aged 0–4 years and 5–14 years), risks such as low birthweight and short gestation and unsafe water, sanitation, and handwashing (WaSH) were among the leading risk factors, while for older age groups, metabolic risks such as high SBP, high body-mass index (BMI), high FPG, and high LDL cholesterol had a greater impact. 
From 2000 to 2021, there was an observable shift in global health challenges, marked by a decline in the number of all-age DALYs broadly attributable to behavioural risks (decrease of 20·7% [13·9–27·7]) and environmental and occupational risks (decrease of 22·0% [15·5–28·8]), coupled with a 49·4% (42·3–56·9) increase in DALYs attributable to metabolic risks, all reflecting ageing populations and changing lifestyles on a global scale. Age-standardised global DALY rates attributable to high BMI and high FPG rose considerably (15·7% [9·9–21·7] for high BMI and 7·9% [3·3–12·9] for high FPG) over this period, with exposure to these risks increasing annually at rates of 1·8% (1·6–1·9) for high BMI and 1·3% (1·1–1·5) for high FPG. By contrast, the global risk-attributable burden and exposure to many other risk factors declined, notably for risks such as child growth failure and unsafe water source, with age-standardised attributable DALYs decreasing by 71·5% (64·4–78·8) for child growth failure and 66·3% (60·2–72·0) for unsafe water source. We separated risk factors into three groups according to trajectory over time: those with a decreasing attributable burden, due largely to declining risk exposure (eg, diet high in trans-fat and household air pollution) but also to proportionally smaller child and youth populations (eg, child and maternal malnutrition); those for which the burden increased moderately in spite of declining risk exposure, due largely to population ageing (eg, smoking); and those for which the burden increased considerably due to both increasing risk exposure and population ageing (eg, ambient particulate matter air pollution, high BMI, high FPG, and high SBP). Interpretation: Substantial progress has been made in reducing the global disease burden attributable to a range of risk factors, particularly those related to maternal and child health, WaSH, and household air pollution. 
Maintaining efforts to minimise the impact of these risk factors, especially in low SDI locations, is necessary to sustain progress. Successes in moderating the smoking-related burden by reducing risk exposure highlight the need to advance policies that reduce exposure to other leading risk factors such as ambient particulate matter air pollution and high SBP. Troubling increases in high FPG, high BMI, and other risk factors related to obesity and metabolic syndrome indicate an urgent need to identify and implement interventions
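The attributable-burden step described in the Methods above (the product of the PAF and the burden of an outcome) can be illustrated with a small sketch. The abstract does not give the PAF formula, so the categorical-exposure form below, with relative risks expressed against the TMREL category, is an assumption. The exposure shares, relative risks, and DALY total are invented.

```python
def population_attributable_fraction(exposure_shares, relative_risks):
    """Assumed categorical PAF: (sum(p_i * RR_i) - 1) / sum(p_i * RR_i).

    exposure_shares must sum to 1, and the TMREL category has RR = 1.
    """
    mean_rr = sum(p * rr for p, rr in zip(exposure_shares, relative_risks))
    return (mean_rr - 1.0) / mean_rr


def attributable_burden(paf, outcome_dalys):
    """Burden attributable to the risk factor: PAF times the outcome's DALYs."""
    return paf * outcome_dalys


# Hypothetical risk factor with three exposure levels; the first is the TMREL.
shares = [0.5, 0.3, 0.2]
rrs = [1.0, 1.5, 2.0]

paf = population_attributable_fraction(shares, rrs)
burden = attributable_burden(paf, 1000.0)
print(round(paf, 4), round(burden, 1))
```

This leaves out the mediation adjustments and the draw-level uncertainty that the abstract mentions.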