
    Imputation of continuous variables missing at random using the method of simulated scores

    For multivariate datasets with missing values, we present a procedure of statistical inference and state its "optimal" properties. Two main assumptions are needed: (1) data are missing at random (MAR); (2) the data generating process is a multivariate normal linear regression. After disentangling the problem of convergence of the iterative estimation/imputation procedure, we show that the estimator is a "method of simulated scores" (a particular case of McFadden's "method of simulated moments"). The estimator is therefore equivalent to maximum likelihood if the number of replications is sufficiently large, and the whole procedure can be considered an optimal parametric technique for imputation of missing data.
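
The two assumptions in this abstract (MAR and a normal linear regression) can be illustrated with a small sketch. This is plain single stochastic regression imputation, not the method of simulated scores described above, and the simulated data, the coefficients, and the names `fit_ols` and `impute` are all assumptions made for illustration.

```python
import random


def fit_ols(xs, ys):
    """Simple least squares of y on x: returns (intercept, slope, residual_sd)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((y - intercept - slope * x) ** 2 for x, y in zip(xs, ys))
    return intercept, slope, (rss / (n - 2)) ** 0.5


def impute(xs, ys, rng):
    """Replace each None in ys with a regression prediction plus a normal draw."""
    observed = [(x, y) for x, y in zip(xs, ys) if y is not None]
    a, b, sd = fit_ols([x for x, _ in observed], [y for _, y in observed])
    return [
        y if y is not None else a + b * x + rng.gauss(0.0, sd)
        for x, y in zip(xs, ys)
    ]


# Hypothetical data: y = 1 + 2x + noise (values chosen only for the demo).
rng = random.Random(0)
xs = [rng.gauss(0.0, 1.0) for _ in range(2000)]
ys_full = [1.0 + 2.0 * x + rng.gauss(0.0, 1.0) for x in xs]

# MAR: whether y is missing depends only on the always-observed x.
ys = [None if x > 0.5 else y for x, y in zip(xs, ys_full)]

completed = impute(xs, ys, rng)
complete_case_mean = sum(y for y in ys if y is not None) / sum(y is not None for y in ys)
imputed_mean = sum(completed) / len(completed)
full_mean = sum(ys_full) / len(ys_full)
```

In this setup the complete-case mean is pulled well below the full-data mean because large `x` values are the ones with missing `y`, while the mean after imputation should land close to it. A procedure like the one in the abstract would iterate estimation and imputation over many replications rather than impute once.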

    The prognostic value of a nomogram for exercise capacity in women

    BACKGROUND: Recent studies have demonstrated that exercise capacity is an independent predictor of mortality in women. Normative values of exercise capacity for age in women have not been well established. Our objectives were to construct a nomogram to permit determination of predicted exercise capacity for age in women and to assess the predictive value of the nomogram with respect to survival. METHODS: A total of 5721 asymptomatic women underwent a symptom-limited, maximal stress test. Exercise capacity was measured in metabolic equivalents (MET). Linear regression was used to estimate the mean MET achieved for age. A nomogram was established to allow the percentage of predicted exercise capacity to be estimated on the basis of age and the exercise capacity achieved. The nomogram was then used to determine the percentage of predicted exercise capacity for both the original cohort and a referral population of 4471 women with cardiovascular symptoms who underwent a symptom-limited stress test. Survival data were obtained for both cohorts, and Cox survival analysis was used to estimate the rates of death from any cause and from cardiac causes in each group. RESULTS: The linear regression equation for predicted exercise capacity (in MET) on the basis of age in the cohort of asymptomatic women was as follows: predicted MET = 14.7 - (0.13 × age). The risk of death among asymptomatic women whose exercise capacity was less than 85 percent of the predicted value for age was twice that among women whose exercise capacity was at least 85 percent of the age-predicted value (P<0.001). Results were similar in the cohort of symptomatic women. CONCLUSIONS: We have established a nomogram for predicted exercise capacity on the basis of age that is predictive of survival among both asymptomatic and symptomatic women. These findings could be incorporated into the interpretation of exercise stress tests, providing additional prognostic information for risk stratification.
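
The regression equation and the 85 percent cut-off reported in this abstract can be written out as a short calculation. The sketch below only restates that arithmetic; the function names and the example patient are assumptions for illustration, and it is not a clinical tool.

```python
def predicted_met(age_years):
    """Age-predicted exercise capacity in MET, using the equation from the abstract."""
    return 14.7 - 0.13 * age_years


def percent_of_predicted(age_years, achieved_met):
    """Achieved exercise capacity as a percentage of the age-predicted value."""
    return 100.0 * achieved_met / predicted_met(age_years)


def below_cutoff(age_years, achieved_met, cutoff_percent=85.0):
    """True when achieved capacity is under the 85 percent threshold cited above."""
    return percent_of_predicted(age_years, achieved_met) < cutoff_percent


# Hypothetical example: a 50-year-old woman who achieves 6.0 MET.
age = 50
achieved = 6.0
expected = predicted_met(age)                 # roughly 8.2 MET
share = percent_of_predicted(age, achieved)   # roughly 73 percent
flagged = below_cutoff(age, achieved)         # under the 85 percent cut-off
```

For this hypothetical patient the predicted capacity is about 8.2 MET, so 6.0 MET falls into the group that the abstract associates with roughly twice the risk of death.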

    Segmentation and intensity estimation for microarray images with saturated pixels

    Background: Microarray image analysis processes scanned digital images of hybridized arrays to produce the spot-level input data for downstream analysis, so it can have a potentially large impact on that and subsequent analyses. Signal saturation is an optical effect that occurs when some pixel values for highly expressed genes or peptides exceed the upper detection threshold of the scanner software (2^16 - 1 = 65,535 for 16-bit images). In practice, spots with a sizable number of saturated pixels are often flagged and discarded. Alternatively, the saturated values are used without adjustment for estimating spot intensities. The resulting expression data tend to be biased downwards and can distort high-level analysis that relies on these data. Hence, it is crucial to correct effectively for signal saturation. Results: We developed a flexible mixture model-based segmentation and spot intensity estimation procedure that accounts for saturated pixels by incorporating a censored component in the mixture model. As demonstrated with biological data and simulation, our method extends the dynamic range of expression data beyond the saturation threshold and is effective in correcting saturation-induced bias when the amount of lost information is not too large. We further illustrate the impact of image processing on downstream classification, showing that the proposed method can increase diagnostic accuracy using data from a lymphoma cancer diagnosis study. Conclusions: The presented method adjusts for signal saturation at the segmentation stage, which identifies a pixel as part of the foreground, background or other. As a result, a pixel's cluster membership can differ from the one it would receive if saturated values were treated as truly observed. Thus, the resulting spot intensity estimates may be more accurate than those obtained from existing methods that correct for saturation based on already segmented data. As a model-based segmentation method, our procedure is able to identify inner holes, fuzzy edges and blank spots that are common in microarray images. The approach is independent of microarray platform and applicable to both single- and dual-channel microarrays.
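
The downward bias described in this abstract can be seen in a small simulation. The sketch below only demonstrates the effect of clipping at the 16-bit threshold; it does not implement the censored mixture model, and the spot mean, spread, pixel count, and the name `scan` are assumptions chosen for the demo.

```python
import random

SATURATION = 2 ** 16 - 1  # 65,535, the upper limit for 16-bit images


def scan(true_intensities):
    """Mimic a scanner that records nothing above the saturation threshold."""
    return [min(value, SATURATION) for value in true_intensities]


# Hypothetical bright spot: true pixel intensities centred near the threshold.
rng = random.Random(1)
true_pixels = [rng.gauss(60000.0, 5000.0) for _ in range(5000)]
observed_pixels = scan(true_pixels)

true_mean = sum(true_pixels) / len(true_pixels)
naive_mean = sum(observed_pixels) / len(observed_pixels)
saturated_fraction = (
    sum(value >= SATURATION for value in observed_pixels) / len(observed_pixels)
)
```

Here `naive_mean`, which treats clipped values as truly observed, comes out below `true_mean`, and `saturated_fraction` gives the share of pixels that lost information. A censored-data model would instead treat each clipped pixel as "at least 65,535" when estimating the spot intensity.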

    Elements of statistical computing: numerical computation
