Developing Biomarker Combinations in Multicenter Studies via Direct Maximization and Penalization
When biomarker studies involve patients at multiple centers and the goal is to develop biomarker combinations for diagnosis, prognosis, or screening, we consider evaluating the predictive capacity of a given combination with the center-adjusted AUC (aAUC), a summary of conditional performance. Rather than using a general method, such as logistic regression, to construct the biomarker combination, we propose estimating the combination by directly maximizing the aAUC. Furthermore, it may be desirable to have a biomarker combination with similar predictive capacity across centers. To that end, we allow for penalization of the variability in center-specific performance. We demonstrate good asymptotic properties of the resulting combinations. Simulations provide small-sample evidence that maximizing the aAUC can lead to combinations with greater predictive capacity than combinations constructed via logistic regression. We further illustrate the utility of constructing combinations by maximizing the aAUC while penalizing variability. We apply these methods to data from a study of acute kidney injury after cardiac surgery.
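To make the idea concrete, here is a minimal sketch, not the authors' code, of direct aAUC maximization with a variability penalty: the AUC indicator is smoothed with a sigmoid so the objective is differentiable, center-specific AUCs are averaged with weights proportional to each center's cases (one common choice), and a penalty term shrinks the variance of center-specific performance. The data layout (arrays X, y, center) and all names are hypothetical.

```python
# Illustrative sketch only: direct maximization of a smoothed center-adjusted
# AUC (aAUC) over linear biomarker combinations, with an optional penalty on
# the variability of center-specific performance.
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit  # sigmoid surrogate for the AUC indicator

def smooth_auc(case_scores, ctrl_scores, h=0.1):
    """Sigmoid-smoothed AUC: mean of expit((case - control) / h) over pairs."""
    diff = case_scores[:, None] - ctrl_scores[None, :]
    return expit(diff / h).mean()

def neg_penalized_aauc(theta, X, y, center, lam=0.0):
    """Negative of (aAUC - lam * variance of center-specific smoothed AUCs)."""
    theta = theta / np.linalg.norm(theta)  # the AUC is scale-invariant in theta
    scores = X @ theta
    aucs, weights = [], []
    for c in np.unique(center):
        in_c = center == c
        s_case = scores[in_c & (y == 1)]
        s_ctrl = scores[in_c & (y == 0)]
        if s_case.size and s_ctrl.size:  # need both cases and controls
            aucs.append(smooth_auc(s_case, s_ctrl))
            weights.append(s_case.size)  # weight centers by their cases
    aucs = np.asarray(aucs)
    w = np.asarray(weights, dtype=float)
    aauc = np.sum(w / w.sum() * aucs)
    return -(aauc - lam * aucs.var())

# Hypothetical usage: start from logistic-regression coefficients theta0.
# res = minimize(neg_penalized_aauc, theta0, args=(X, y, center, 0.5),
#                method="Nelder-Mead")
```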
Biomarker Combinations for Diagnosis and Prognosis in Multicenter Studies: Principles and Methods
Many investigators are interested in combining biomarkers to predict an outcome of interest or detect underlying disease. This endeavor is complicated by the fact that many biomarker studies involve data from multiple centers. Depending upon the relationship among center, the biomarkers, and the target of prediction, care must be taken when constructing and evaluating combinations of biomarkers. We introduce a taxonomy to describe the role of center and consider how a biomarker combination should be constructed and evaluated. We show that ignoring center, which is frequently done by clinical researchers, is often not appropriate. The limited statistical literature proposes using random intercept logistic regression models, an approach that we demonstrate is generally inadequate and may be misleading. We instead propose using fixed intercept logistic regression, which appropriately accounts for center without relying on untenable assumptions. After constructing the biomarker combination, we recommend using performance measures that account for the multicenter nature of the data, namely the center-adjusted area under the receiver operating characteristic curve (AUC). We apply these methods to data from a multicenter study of acute kidney injury after cardiac surgery. Appropriately accounting for center, both in construction and evaluation, may increase the likelihood of identifying clinically useful biomarker combinations.
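As a concrete illustration (a sketch under assumed data structures, not the authors' code), fixed-intercept logistic regression amounts to giving each center its own intercept while sharing the biomarker coefficients, and the center-adjusted AUC averages within-center AUCs; here centers are weighted by their number of cases, one common choice. The DataFrame layout and column names are hypothetical.

```python
# Illustrative sketch only: a fixed-intercept (center-specific intercept)
# logistic regression and evaluation via the center-adjusted AUC.
import numpy as np
import pandas as pd
import statsmodels.api as sm
from sklearn.metrics import roc_auc_score

def fit_fixed_intercept(df, biomarkers, outcome="y", center="center"):
    """One fixed intercept per center (no global intercept), so center is
    accounted for without random-effects assumptions."""
    X = pd.concat([pd.get_dummies(df[center], prefix="ctr"),
                   df[biomarkers]], axis=1).astype(float)
    fit = sm.Logit(df[outcome], X).fit(disp=0)
    return fit.params[biomarkers]  # the biomarker combination coefficients

def center_adjusted_auc(df, coefs, biomarkers, outcome="y", center="center"):
    """Average of within-center AUCs, weighted by each center's cases."""
    scored = df.assign(score=df[biomarkers].values @ coefs.values)
    aucs, w = [], []
    for _, g in scored.groupby(center):
        if g[outcome].nunique() == 2:  # need both cases and controls
            aucs.append(roc_auc_score(g[outcome], g["score"]))
            w.append(int(g[outcome].sum()))
    return np.average(aucs, weights=w)
```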
Using Multilevel Outcomes to Construct and Select Biomarker Combinations for Single-level Prediction
Biomarker studies may involve a multilevel outcome, such as no, mild, or severe disease. There is often interest in predicting one particular level of the outcome due to its clinical significance. The standard approach to constructing biomarker combinations in this context involves dichotomizing the outcome and using a binary logistic regression model. We assessed whether information can be usefully gained by instead using more sophisticated regression methods. Furthermore, it is often necessary to select among several candidate biomarker combinations. One strategy involves selecting a combination on the basis of its ability to predict the outcome level of interest. We propose an algorithm that leverages the multilevel outcome to inform combination selection. We apply this algorithm to data from a study of acute kidney injury after cardiac surgery, where the kidney injury may be absent, mild, or severe. Using more sophisticated modeling approaches to construct combinations provided gains over the binary logistic regression approach in specific settings. In the examples considered, the proposed algorithm for combination selection tended to reduce the impact of selection bias and to provide combinations with improved performance. Methods that utilize the multilevel nature of the outcome in the construction and/or selection of biomarker combinations have the potential to yield better combinations.
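For instance (a hedged sketch, not the authors' algorithm; variable names and outcome levels are hypothetical), a proportional-odds model uses all outcome levels to estimate a single set of biomarker coefficients, whereas the standard approach dichotomizes first:

```python
# Illustrative sketch only: constructing a combination from a multilevel
# outcome (none < mild < severe) with a proportional-odds model, versus the
# standard dichotomize-then-binary-logistic approach.
import pandas as pd
import statsmodels.api as sm
from statsmodels.miscmodels.ordinal_model import OrderedModel

LEVELS = ["none", "mild", "severe"]  # hypothetical severity levels

def combination_ordinal(df, biomarkers, outcome="aki"):
    """Proportional odds: one shared set of biomarker coefficients across
    all outcome levels defines the combination."""
    y = df[outcome].astype(pd.CategoricalDtype(categories=LEVELS, ordered=True))
    fit = OrderedModel(y, df[biomarkers].astype(float),
                       distr="logit").fit(method="bfgs", disp=0)
    return fit.params[biomarkers]

def combination_binary(df, biomarkers, outcome="aki", target="severe"):
    """Standard approach: dichotomize at the level of interest, then fit
    a binary logistic regression."""
    y = (df[outcome] == target).astype(int)
    X = sm.add_constant(df[biomarkers].astype(float))
    return sm.Logit(y, X).fit(disp=0).params[biomarkers]
```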
Combining Biomarkers by Maximizing the True Positive Rate for a Fixed False Positive Rate
Biomarkers abound in many areas of clinical research, and investigators are often interested in combining them for diagnosis, prognosis, and screening. In many applications, the true positive rate for a biomarker combination at a prespecified, clinically acceptable false positive rate is the most relevant measure of predictive capacity. We propose a distribution-free method for constructing biomarker combinations by maximizing the true positive rate while constraining the false positive rate. Theoretical results demonstrate good operating characteristics for the resulting combination. In simulations, the biomarker combination provided by our method demonstrated improved operating characteristics in a variety of scenarios when compared with more traditional methods for constructing combinations.
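A minimal sketch of the idea (not the paper's exact estimator; data names are hypothetical): smooth the TPR with a sigmoid and threshold the linear combination at the (1 - t) quantile of the control scores, which pins the FPR near the acceptable level t:

```python
# Illustrative sketch only: choose a linear combination to maximize a
# smoothed true positive rate while holding the false positive rate at a
# prespecified level t.
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit

def neg_smooth_tpr(theta, X_case, X_ctrl, t=0.10, h=0.1):
    """Negative sigmoid-smoothed TPR at the control-quantile threshold."""
    theta = theta / np.linalg.norm(theta)  # ROC criteria are scale-invariant
    s_case, s_ctrl = X_case @ theta, X_ctrl @ theta
    delta = np.quantile(s_ctrl, 1 - t)  # enforces FPR approximately equal to t
    return -expit((s_case - delta) / h).mean()

# Hypothetical usage: start from logistic-regression coefficients theta0 and
# refine by direct maximization.
# res = minimize(neg_smooth_tpr, theta0, args=(X_case, X_ctrl, 0.10),
#                method="Nelder-Mead")
```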
Testing for improvement in prediction model performance
New methodology has been proposed in recent years for evaluating the improvement in prediction performance gained by adding a new predictor, Y, to a risk model containing a set of baseline predictors, X, for a binary outcome D. We prove theoretically that null hypotheses concerning no improvement in performance are equivalent to the simple null hypothesis that the coefficient for Y is zero in the risk model P(D = 1 | X, Y). Therefore, testing for improvement in prediction performance is redundant if Y has already been shown to be a risk factor. We investigate properties of tests through simulation studies, focusing on the change in the area under the ROC curve (AUC). An unexpected finding is that standard testing procedures that do not adjust for variability in estimated regression coefficients are extremely conservative. This may explain why the AUC is widely considered insensitive to improvements in prediction performance, and suggests that the problem of insensitivity has to do with the use of invalid procedures for inference rather than with the measure itself. To avoid redundant testing and the use of potentially problematic methods for inference, we recommend that hypothesis testing for no improvement be limited to evaluation of Y as a risk factor, for which methods are well developed and widely available. Analyses of measures of prediction performance should focus on estimation rather than on testing.
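In practice, the recommended test reduces to standard regression machinery. A minimal sketch (hypothetical variable names), using a likelihood ratio test that the coefficient of Y is zero in a logistic model for P(D = 1 | X, Y):

```python
# Illustrative sketch only: test "no prediction increment" as a test that
# Y's coefficient is zero, via a likelihood ratio test of nested logistic
# models.
import numpy as np
import statsmodels.api as sm
from scipy.stats import chi2

def lr_test_increment(D, X, Y):
    """LRT of H0: coefficient of Y is 0 in logit P(D = 1 | X, Y)."""
    X0 = sm.add_constant(X)               # baseline model: X only
    X1 = np.column_stack([X0, Y])         # expanded model: X plus Y
    ll0 = sm.Logit(D, X0).fit(disp=0).llf
    ll1 = sm.Logit(D, X1).fit(disp=0).llf
    stat = 2 * (ll1 - ll0)
    return stat, chi2.sf(stat, df=1)      # statistic and p-value
```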
What Is the Best Reference RNA? And Other Questions Regarding the Design and Analysis of Two-Color Microarray Experiments
The reference design is a practical and popular choice for microarray studies using two-color platforms. In the reference design, the reference RNA uses half of all array resources, leading investigators to ask: What is the best reference RNA? We propose a novel method for evaluating reference RNAs and present the results of an experiment that was specially designed to evaluate three common choices of reference RNA. We found no compelling evidence in favor of any particular reference. In particular, a commercial reference showed no advantage in our data. Our experimental design also enabled a new way to test the effectiveness of pre-processing methods for two-color arrays. Our results favor using an intensity normalization and forgoing background subtraction. Finally, we evaluate the sensitivity and specificity of data quality filters, and we propose a new filter that can be applied to any experimental design and does not rely on replicate hybridizations.
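As a rough sketch of the favored pre-processing (not the paper's code; a lowess smoother is one standard way to carry out intensity normalization for two-color arrays), the M = log-ratio values are regressed on the A = average log-intensity values and the fitted intensity-dependent trend is removed, with no background subtraction:

```python
# Illustrative sketch only: within-array intensity (MA-lowess) normalization
# for one two-color array, given red/green channel intensities R and G.
import numpy as np
from statsmodels.nonparametric.smoothers_lowess import lowess

def ma_lowess_normalize(R, G, frac=0.3):
    """Return intensity-normalized M values for one two-color array."""
    M = np.log2(R) - np.log2(G)           # log-ratio
    A = 0.5 * (np.log2(R) + np.log2(G))   # average log-intensity
    trend = lowess(M, A, frac=frac, return_sorted=False)  # fitted M at each A
    return M - trend                       # remove intensity-dependent trend
```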
Net Reclassification Indices for Evaluating Risk Prediction Instruments: A Critical Review
Background: Net Reclassification Indices (NRI) have recently become popular statistics for measuring the prediction increment of new biomarkers.
Methods: In this review, we examine the various types of NRI statistics and their correct interpretations, and we evaluate the advantages and disadvantages of the NRI approach. For pre-defined risk categories, we relate the NRI to existing measures of the prediction increment. We also consider statistical methodology for constructing confidence intervals for NRI statistics and evaluate the merits of NRI-based hypothesis testing.
Conclusions: Investigators using NRI statistics should report them separately for events (cases) and nonevents (controls). When there are two risk categories, the NRI components are the same as the changes in the true and false positive rates. We advocate the use of true and false positive rates and suggest it is more useful for investigators to retain these existing, descriptive terms. When there are three or more risk categories, we recommend against NRI statistics because they do not adequately account for clinically important differences in movements among risk categories. The category-free NRI is a new descriptive device designed to avoid pre-defined risk categories; however, it suffers from many of the same problems as other measures, such as the area under the receiver operating characteristic curve, and it can mislead investigators by overstating the incremental value of a biomarker, even in independent validation data. When investigators want to test a null hypothesis of no prediction increment, the well-established tests for coefficients in the regression model are superior to the NRI. If investigators want to use NRI measures, their confidence intervals should be calculated using bootstrap methods rather than published variance formulas. The preferred single-number summary of the prediction increment is the improvement in the Net Benefit.
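The two-category identity is easy to verify directly. A minimal sketch (hypothetical names): at a single risk threshold r, the event NRI is exactly the change in the true positive rate, and the nonevent NRI is minus the change in the false positive rate:

```python
# Illustrative sketch only: NRI components with two risk categories, given
# old and new risk estimates (arrays) and event indicator D (1 = event).
import numpy as np

def two_category_nri(risk_old, risk_new, D, r=0.2):
    """Event/nonevent NRI at a single risk threshold r."""
    up = (risk_new >= r) & (risk_old < r)    # reclassified upward
    down = (risk_new < r) & (risk_old >= r)  # reclassified downward
    nri_events = up[D == 1].mean() - down[D == 1].mean()      # = delta TPR
    nri_nonevents = down[D == 0].mean() - up[D == 0].mean()   # = -delta FPR
    return nri_events, nri_nonevents
```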
Effortless awareness: using real time neurofeedback to investigate correlates of posterior cingulate cortex activity in meditators' self-report
Neurophenomenological studies seek to utilize first-person self-report to elucidate cognitive processes related to physiological data. Grounded theory offers an approach to the qualitative analysis of self-report, whereby theoretical constructs are derived from empirical data. Here we used grounded theory methodology to assess how the first-person experience of meditation relates to neural activity in a core region of the default mode network, the posterior cingulate cortex. We analyzed first-person data consisting of meditators' accounts of their subjective experience during runs of a real-time fMRI neurofeedback study of meditation, and third-person data consisting of the corresponding feedback graphs of posterior cingulate cortex activity during the same runs. We found that for meditators, the subjective experiences of 'undistracted awareness', such as 'concentration' and 'observing sensory experience', and of 'effortless doing', such as 'observing sensory experience', 'not efforting', and 'contentment', correspond with posterior cingulate cortex deactivation. Further, the subjective experiences of 'distracted awareness', such as 'distraction' and 'interpreting', and of 'controlling', such as 'efforting' and 'discontentment', correspond with posterior cingulate cortex activation. Moreover, we derived several novel hypotheses about how specific qualities of cognitive processes during meditation relate to posterior cingulate cortex activity, such as the difference between meditation and 'trying to meditate'. These findings offer novel insights into the relationship between meditation, self-related thinking, and neural activity in the default mode network, driven by the first-person experience.