35 research outputs found

    Network modeling of the transcriptional effects of copy number aberrations in glioblastoma

    DNA copy number aberrations (CNAs) are a characteristic feature of cancer genomes. In this work, Rebecka Jörnsten, Sven Nelander and colleagues combine network modeling and experimental methods to analyze the systems-level effects of CNAs in glioblastoma.

    Monitoring international migration flows in Europe. Towards a statistical data base combining data from different sources

    The paper reviews techniques developed in demography, geography and statistics that are useful for bridging the gap between available data on international migration flows and the information required for policy making and research. The basic idea of the paper is that, to establish a coherent and consistent data base containing sufficiently detailed, up-to-date and accurate information, data from several sources should be combined. That raises issues of definition and measurement, and of how to combine data from different origins properly. These issues may be tackled more easily if the statistics being compiled are viewed as different outcomes or manifestations of underlying stochastic processes governing migration. The link between the processes and their outcomes is described by models, the parameters of which must be estimated from the available data. That may be done within the context of socio-demographic accounting. The paper discusses the experience of the U.S. Bureau of the Census in combining migration data from several sources. It also summarizes the many efforts in Europe to establish a coherent and consistent data base on international migration. The paper was written at IIASA. It is part of the Migration Estimation Study, a collaborative IIASA-University of Groningen project funded by the Netherlands Organization for Scientific Research (NWO). The project aims at developing techniques to obtain improved estimates of international migration flows by country of origin and country of destination.

    Missing value imputation improves clustering and interpretation of gene expression microarray data

    BACKGROUND: Missing values frequently pose problems in gene expression microarray experiments as they can hinder downstream analysis of the datasets. While several missing value imputation approaches are available to the microarray users and new ones are constantly being developed, there is no general consensus on how to choose between the different methods since their performance seems to vary drastically depending on the dataset being used. RESULTS: We show that this discrepancy can mostly be attributed to the way in which imputation methods have traditionally been developed and evaluated. By comparing a number of advanced imputation methods on recent microarray datasets, we show that even when there are marked differences in the measurement-level imputation accuracies across the datasets, these differences become negligible when the methods are evaluated in terms of how well they can reproduce the original gene clusters or their biological interpretations. Regardless of the evaluation approach, however, imputation always gave better results than ignoring missing data points or replacing them with zeros or average values, emphasizing the continued importance of using more advanced imputation methods. CONCLUSION: The results demonstrate that, while missing values are still severely complicating microarray data analysis, their impact on the discovery of biologically meaningful gene groups can – up to a certain degree – be reduced by using readily available and relatively fast imputation methods, such as the Bayesian Principal Components Algorithm (BPCA).

    Bayesian profiling of molecular signatures to predict event times

    BACKGROUND: It is of particular interest to identify cancer-specific molecular signatures for early diagnosis, monitoring effects of treatment and predicting patient survival time. Molecular information about patients is usually generated from high throughput technologies such as microarray and mass spectrometry. Statistically, we are challenged by the large number of candidates but only a small number of patients in the study, and the right-censored clinical data further complicate the analysis. RESULTS: We present a two-stage procedure to profile molecular signatures for survival outcomes. First, we group closely-related molecular features into linkage clusters, each portraying either similar or opposite functions and playing similar roles in prognosis; second, a Bayesian approach is developed to rank the centroids of these linkage clusters and provide a list of the main molecular features closely related to the outcome of interest. A simulation study showed the superior performance of our approach. When it was applied to data on diffuse large B-cell lymphoma (DLBCL), we were able to identify some new candidate signatures for disease prognosis. CONCLUSION: This multivariate approach provides researchers with a more reliable list of molecular features profiled in terms of their prognostic relationship to the event times, and generates dependable information for subsequent identification of prognostic molecular signatures through either biological procedures or further data analysis.