37 research outputs found

    Semi-supervised empirical Bayes group-regularized factor regression

    The features in high-dimensional biomedical prediction problems are often well described by lower-dimensional manifolds. An example is genes that are organised in smaller functional networks. The outcome can then be described with the factor regression model. A benefit of the factor model is that it allows for straightforward inclusion of unlabeled observations in the estimation of the model, i.e., semi-supervised learning. In addition, the high-dimensional features in biomedical prediction problems are often well characterised. Examples are genes, for which annotation is available, and metabolites, for which p-values from a previous study are available. In this paper, the extra information on the features is included in the prior model for the features. The extra information is weighted and included in the estimation through empirical Bayes, with variational approximations to speed up the computation. The method is demonstrated in simulations and two applications. One application considers influenza vaccine efficacy prediction based on microarray data. The second application predicts oral cancer metastasis from RNAseq data. Comment: 19 pages, 5 figures, submitted to Biometrical Journal

    An omics-based machine learning approach to predict diabetes progression: a RHAPSODY study

    Aims/hypothesis: People with type 2 diabetes are heterogeneous in their disease trajectory, with some progressing more quickly to insulin initiation than others. Although classical biomarkers such as age, HbA1c and diabetes duration are associated with glycaemic progression, it is unclear how well such variables predict insulin initiation or requirement and whether newly identified markers have added predictive value. Methods: In two prospective cohort studies as part of IMI-RHAPSODY, we investigated whether clinical variables and three types of molecular markers (metabolites, lipids, proteins) can predict time to insulin requirement using different machine learning approaches (lasso, ridge, GRridge, random forest). Clinical variables included age, sex, HbA1c, HDL-cholesterol and C-peptide. Models were run with unpenalised clinical variables (i.e. always included in the model without weights), with penalised clinical variables, or without clinical variables. Model development was performed in one cohort and the model was applied in a second cohort. Model performance was evaluated using Harrell’s C statistic. Results: Of the 585 individuals from the Hoorn Diabetes Care System (DCS) cohort, 69 required insulin during follow-up (1.0–11.4 years); of the 571 individuals in the Genetics of Diabetes Audit and Research in Tayside Scotland (GoDARTS) cohort, 175 required insulin during follow-up (0.3–11.8 years). Overall, the clinical variables and proteins were selected in the different models most often, followed by the metabolites. The most frequently selected clinical variables were HbA1c (18 of the 36 models, 50%), age (15 models, 41.2%) and C-peptide (15 models, 41.2%). Base models (age, sex, BMI, HbA1c) including only clinical variables performed moderately in both the DCS discovery cohort (C statistic 0.71 [95% CI 0.64, 0.79]) and the GoDARTS replication cohort (C 0.71 [95% CI 0.69, 0.75]). A more extensive model including HDL-cholesterol and C-peptide performed better in both cohorts (DCS, C 0.74 [95% CI 0.67, 0.81]; GoDARTS, C 0.73 [95% CI 0.69, 0.77]). Two proteins, lactadherin and proto-oncogene tyrosine-protein kinase receptor, were most consistently selected and slightly improved model performance. Conclusions/interpretation: Using machine learning approaches, we show that insulin requirement risk can be modestly well predicted by predominantly clinical variables. Inclusion of molecular markers improves the prognostic performance beyond that of clinical variables by up to 5%. Such prognostic models could be useful for identifying people with diabetes at high risk of progressing quickly to treatment intensification. Data availability: Summary statistics of lipidomic, proteomic and metabolomic data are available from a Shiny dashboard at https://rhapdata-app.vital-it.ch.
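    The abstract above evaluates models with Harrell's C statistic. As a rough illustration of what that number measures, the following is a minimal pure-Python sketch, not the study's code. The function name `harrell_c` and the toy data are hypothetical. It assumes right-censored follow-up times, counts a pair as comparable only when the subject with the shorter time had the event, and scores ties in predicted risk as half-concordant.

```python
def harrell_c(times, events, risk_scores):
    """Harrell's C for right-censored data (illustrative sketch).

    times       -- follow-up time per subject
    events      -- 1/True if the event was observed, 0/False if censored
    risk_scores -- model output; a higher score means a higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when subject i has the shorter time
            # and that shorter time ended in an observed event.
            if times[i] < times[j] and events[i]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0   # earlier event had the higher risk
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (hypothetical): four subjects, the third one is censored.
toy_times = [2, 4, 6, 8]
toy_events = [1, 1, 0, 1]
toy_risk = [0.9, 0.7, 0.8, 0.1]
toy_c = harrell_c(toy_times, toy_events, toy_risk)
```

    A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which is the scale on which the reported values of 0.71 to 0.74 should be read.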

    Learning from a lot: Empirical Bayes for high-dimensional model-based prediction

    Empirical Bayes is a versatile approach to “learn from a lot” in two ways: first, from a large number of variables and, second, from a potentially large amount of prior information, for example, stored in public repositories. We review applications of a variety of empirical Bayes methods to several well-known model-based prediction methods, including penalized regression, linear discriminant analysis, and Bayesian models with sparse or dense priors. We discuss “formal” empirical Bayes methods that maximize the marginal likelihood but also more informal approaches based on other data summaries. We contrast empirical Bayes to cross-validation and full Bayes and discuss hybrid approaches. To study the relation between the quality of an empirical Bayes estimator and p, the number of variables, we consider a simple empirical Bayes estimator in a linear model setting. We argue that empirical Bayes is particularly useful when the prior contains multiple parameters, which model a priori information on variables termed “co-data”. In particular, we present two novel examples that allow for co-data: first, a Bayesian spike-and-slab setting that facilitates inclusion of multiple co-data sources and types and, second, a hybrid empirical Bayes–full Bayes ridge regression approach for estimation of the posterior predictive interval

    Diarrhoea of unknown cause: medical treatment in a stepwise manner: Management of Idiopathic Diarrhoea Based on Experience of Step-Up Medical Treatment

    The basic principle for the treatment of idiopathic diarrhoea (functional diarrhoea K59.1) is to delay transit through the gut in order to promote the absorption of electrolytes and water. Under mild conditions, bulking agents may suffice. With increasing severity, antidiarrhoeal pharmaceuticals may be added in a stepwise manner. In diarrhoea of unknown aetiology, peripherally-acting opioid receptor agonists, such as loperamide, are the first-line treatment and form the pharmaceutical basis of antidiarrhoeal treatment. As second-line treatment, opium drops have an approved indication for severe diarrhoea when other treatment options fail. Beyond this, various treatment options are built on experience with more advanced treatments using clonidine, octreotide, as well as GLP-1 and GLP-2 analogs, which require specialist knowledge of the field.

    Mucosa associated invariant T and natural killer cells in active and budesonide treated collagenous colitis patients

    Introduction: Collagenous colitis (CC) is an inflammatory bowel disease, which usually responds to budesonide treatment. Our aim was to study the immunological background of the disease. Methods: Analyses of peripheral and mucosal MAIT (mucosa associated invariant T) cells and NK (natural killer) cells were performed with flow cytometry. Numbers of mucosal cells were calculated using immunohistochemistry. We studied the same patients with active untreated CC (au-CC) and again while in remission on budesonide treatment. Budesonide refractory patients and healthy controls were also included. The memory marker CD45RO and the activation markers CD154 and CD69 were used to further study the cells. Finally, B cells, CD4(+) and CD8(+) T cells were also analysed. Results: The percentages of circulating CD56(dim)CD16(+) NK cells as well as MAIT cells (CD3(+)TCRVa7.2(+)CD161(+)) were decreased in au-CC compared to healthy controls. This difference was not seen in the mucosa, where we instead found increased numbers of mucosal CD4(+) T cells and CD8(+) T cells in au-CC. Mucosal immune cell numbers were not affected by budesonide treatment. In refractory CC we found increased mucosal numbers of MAIT cells, CD4(+) and CD8(+) T cells compared to au-CC. Discussion: Patients with active collagenous colitis have lower percentages of circulating MAIT and NK cells. However, there was no change of these cells in the colonic mucosa. Most mucosal cell populations were increased in budesonide refractory as compared to au-CC patients, particularly the number of MAIT cells. This may indicate that T cell targeting therapy could be an alternative in budesonide refractory CC.

    Adaptive group-regularized logistic elastic net regression

    In high-dimensional data settings, additional information on the features is often available. Examples of such external information in omics research are: (i) p-values from a previous study and (ii) omics annotation. The inclusion of this information in the analysis may enhance classification performance and feature selection but is not straightforward. We propose a group-regularized (logistic) elastic net regression method, where each penalty parameter corresponds to a group of features based on the external information. The method, termed gren, makes use of the Bayesian formulation of logistic elastic net regression to estimate both the model and penalty parameters in an approximate empirical-variational Bayes framework. Simulations and applications to three cancer genomics studies and one Alzheimer metabolomics study show that, if the partitioning of the features is informative, classification performance and feature selection are indeed enhanced.
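    The idea of one penalty parameter per feature group can be illustrated with a small sketch. This is not the gren method: it swaps in a linear (squared-error) elastic net fitted by plain coordinate descent instead of the logistic, empirical-variational Bayes approach of the abstract, and it takes the group penalties as given rather than estimating them. The names `group_elastic_net`, `groups`, `group_penalties` and the toy data are hypothetical.

```python
def soft_threshold(value, threshold):
    """Shrink value towards zero by threshold (the lasso proximal step)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def group_elastic_net(X, y, groups, group_penalties, alpha=0.5,
                      n_sweeps=200, tol=1e-10):
    """Coordinate descent for a linear elastic net with group-specific penalties.

    Minimises (1 / 2n) * ||y - X b||^2
              + sum_j lam_j * (alpha * |b_j| + (1 - alpha) / 2 * b_j^2),
    where lam_j = group_penalties[groups[j]] (assumed known, not estimated).
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # equals y - X beta while beta is all zeros
    col_scale = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_sweeps):
        max_change = 0.0
        for j in range(p):
            lam = group_penalties[groups[j]]
            # Covariance of feature j with the residual that excludes feature j.
            rho = (sum(X[i][j] * residual[i] for i in range(n)) / n
                   + col_scale[j] * beta[j])
            denom = col_scale[j] + lam * (1.0 - alpha)
            new_beta = soft_threshold(rho, lam * alpha) / denom if denom > 0 else 0.0
            delta = new_beta - beta[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                beta[j] = new_beta
            max_change = max(max_change, abs(delta))
        if max_change < tol:
            break
    return beta


# Toy data (hypothetical): feature 0 is in group 0, feature 1 in group 1.
X_toy = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
y_toy = [2.0, 1.0, 2.0, 1.0]
beta_grouped = group_elastic_net(X_toy, y_toy, groups=[0, 1],
                                 group_penalties=[0.0, 10.0])
beta_unpenalised = group_elastic_net(X_toy, y_toy, groups=[0, 1],
                                     group_penalties=[0.0, 0.0])
```

    In the toy data the heavily penalised group is shrunk to exactly zero while the unpenalised group keeps its least-squares coefficient, which is the kind of differential shrinkage an informative partitioning of the features is meant to produce.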