4 research outputs found

    Improved high-dimensional prediction with Random Forests by the use of co-data

    Background: Prediction in high-dimensional settings is difficult due to the large number of variables relative to the sample size. We demonstrate how auxiliary 'co-data' can be used to improve the performance of a Random Forest in such a setting. Results: Co-data are incorporated in the Random Forest by replacing the uniform sampling probabilities that are used to draw candidate variables with co-data moderated sampling probabilities. Co-data here are defined as any type of information that is available on the variables of the primary data but that does not use its response labels. Inspired by empirical Bayes, these moderated sampling probabilities are learned from the data at hand. We demonstrate the co-data moderated Random Forest (CoRF) with two examples. In the first example we aim to predict the presence of a lymph node metastasis with gene expression data. We demonstrate how a set of external p-values, a gene signature, and the correlation between gene expression and DNA copy number can improve the predictive performance. In the second example we demonstrate how the prediction of cervical (pre-)cancer with methylation data can be improved by including the location of the probe relative to the known CpG islands, the number of CpG sites targeted by a probe, and a set of p-values from a related study. Conclusion: The proposed method is able to utilize auxiliary co-data to improve the performance of a Random Forest.
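The core idea in the abstract above, drawing candidate variables with non-uniform probabilities instead of uniformly, can be illustrated with a minimal Python sketch. It is not the authors' implementation and does not reproduce the empirical-Bayes step that learns the probabilities from the data. The function names, the `floor` parameter, and the example scores are hypothetical, and the scores are assumed to be non-negative summaries of the co-data (for instance, a transformed external p-value).

```python
import random


def codata_sampling_probs(codata_scores, floor=1e-6):
    """Normalize non-negative co-data scores into sampling probabilities.

    `floor` (an assumption of this sketch) keeps every variable selectable,
    so no variable is excluded outright by a zero score.
    """
    clipped = [max(score, floor) for score in codata_scores]
    total = sum(clipped)
    return [score / total for score in clipped]


def draw_candidates(probs, mtry, rng):
    """Draw `mtry` distinct variable indices, weighted by `probs`.

    This stands in for the uniform candidate draw that a standard
    Random Forest performs at each split.
    """
    remaining = list(range(len(probs)))
    weights = list(probs)
    chosen = []
    for _ in range(min(mtry, len(remaining))):
        # Weighted draw of one position, then remove it (no replacement).
        pick = rng.choices(range(len(remaining)), weights=weights, k=1)[0]
        chosen.append(remaining.pop(pick))
        weights.pop(pick)
    return chosen


# Hypothetical co-data scores for four variables (larger = more promising).
example_probs = codata_sampling_probs([5.0, 1.0, 1.0, 0.0])
example_candidates = draw_candidates(example_probs, mtry=2, rng=random.Random(0))
```

With uniform scores the draw reduces to the usual Random Forest candidate sampling, so the co-data only shift which variables are offered to each split more often.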

    Animal-Based Factors Prior to Infection Predict Histological Disease Outcome in Porcine Reproductive and Respiratory Syndrome Virus- and Actinobacillus pleuropneumoniae-Infected Pigs

    A wide variety of clinical manifestations occurs in individual pigs after infection with pathogens involved in porcine respiratory disease complex (PRDC). Some pigs are less prone to develop respiratory disease symptoms. The variation in clinical impact after infection and the recovery capacity of an individual animal are measures of its resilience. In this paper, we examined which of a range of animal-based factors (rectal temperature, body weight, skin lesion scores, behavior, natural antibody serum levels, serum levels of white blood cells, and type of T and granulocyte subsets), when measured prior to infection, are related to disease severity. These animal-based factors and their interaction with the housing regimen of the piglets (conventional or enriched) were modeled using linear regression to predict disease severity, using a dataset acquired from a previous study that used a well-established experimental coinfection model of porcine reproductive and respiratory syndrome virus (PRRSV) and Actinobacillus pleuropneumoniae. Both PRRSV and A. pleuropneumoniae are often involved in PRDC. The histological lung lesion score of each animal was used as a measure of PRDC severity after infection. Prior to infection, higher serum levels of lymphocytes (CD3+), naïve T helper (CD3+CD4+CD8−), CD8+ (as well as higher relative levels of CD8+), and memory T helper (CD3+CD4+CD8+) cells and higher relative levels of granulocytes (CD172a) were related to reduced disease severity in both housing systems. Raised serum concentrations of natural IgM antibodies binding to keyhole limpet hemocyanin (KLH) were also related to reduced disease severity after infection. Increased levels of skin lesions at the central body part (after weaning and before infection) were related to increased disease severity in conventional housing systems only. High resisters showed a lower histological lung lesion score, which appeared unrelated to sex.
Body temperature, behavior, and growth prior to infection were influenced by housing regimen but could not explain the variation in lung lesion scores after infection. Raised basal lymphocyte counts and lower skin lesion scores are related to reduced disease severity independent of or dependent on housing system, respectively. In conclusion, our study uses linear regression analysis to identify intrinsic animal-based measures that predict resilience to infections in pigs.

    TRY plant trait database, enhanced coverage and open access

    Plant traits (the morphological, anatomical, physiological, biochemical and phenological characteristics of plants) determine how plants respond to environmental factors, affect other trophic levels, and influence ecosystem properties and their benefits and detriments to people. Plant trait data thus represent the basis for a vast area of research spanning from evolutionary biology, community and functional ecology, to biodiversity conservation, ecosystem and landscape management, restoration, biogeography and earth system modelling. Since its foundation in 2007, the TRY database of plant traits has grown continuously. It now provides unprecedented data coverage under an open access data policy and is the main plant trait database used by the research community worldwide. Increasingly, the TRY database also supports new frontiers of trait-based plant research, including the identification of data gaps and the subsequent mobilization or measurement of new data. To support this development, in this article we evaluate the extent of the trait data compiled in TRY and analyse emerging patterns of data coverage and representativeness. Best species coverage is achieved for categorical traits, with almost complete coverage for 'plant growth form'. However, most traits relevant for ecology and vegetation modelling are characterized by continuous intraspecific variation and trait-environmental relationships. These traits have to be measured on individual plants in their respective environment. Despite unprecedented data coverage, we observe a humbling lack of completeness and representativeness of these continuous traits in many aspects. We, therefore, conclude that reducing data gaps and biases in the TRY database remains a key challenge and requires a coordinated approach to data mobilization and trait measurements. This can only be achieved in collaboration with other initiatives.