
    Statistical Approaches in Genome-Wide Association Studies

    Genome-wide association studies (GWAS) typically contain hundreds of thousands of single nucleotide polymorphisms (SNPs) genotyped for a small number of samples. The aim of these studies is to identify regions harboring SNPs or to predict the outcomes of interest. Since the number of predictors in a GWAS far exceeds the number of samples, it is impossible to analyze the data with classical statistical methods. In current GWAS, the widely applied methods are based on single marker analysis, which assesses the association of each SNP with the complex trait independently. Because of the low power of this analysis for detecting true associations, simultaneous analysis has recently received more attention. The new statistical methods for simultaneous analysis in high-dimensional settings are limited by the disparity between the number of predictors and the number of samples. Therefore, reducing the dimensionality of the set of SNPs is required. This thesis reviews single marker analysis and simultaneous analysis with a focus on Bayesian methods. It addresses the weaknesses of these approaches with reference to recent literature and illustrative simulation studies. To bypass these problems, we first attempt to reduce the dimension of the set of SNPs with a random projection technique. Since this method does not improve the predictive performance of the model, we present a new two-stage approach that is a hybrid of single and simultaneous analyses. This fully Bayesian approach selects the most promising SNPs in the first stage by evaluating the impact of each marker independently. In the second stage, we develop a hierarchical Bayesian model to analyze the impact of the selected markers simultaneously. The model, which accounts for related samples, places a local-global shrinkage prior on marker effects in order to shrink small effects to zero while keeping large effects relatively large. The prior specification on marker effects, which is a hierarchical representation of the generalized double Pareto distribution, improves the predictive performance. Finally, we present the results of a real SNP-data analysis using the single-marker study and the new two-stage approach.
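
For readers unfamiliar with the prior named in this abstract, the following is a sketch of the standard scale-mixture form of the generalized double Pareto distribution. The abstract does not give the thesis's own parameterization, so the symbols $\beta_j$ (marker effect), $\tau_j$, $\lambda_j$, $\alpha$, and $\eta$ are assumptions chosen for illustration, and any residual-variance scaling the thesis may use is omitted.

```latex
\begin{align*}
\beta_j \mid \tau_j &\sim \mathcal{N}(0, \tau_j), \\
\tau_j \mid \lambda_j &\sim \operatorname{Exp}\!\left(\lambda_j^2 / 2\right), \\
\lambda_j &\sim \operatorname{Gamma}(\alpha, \eta)
  \quad \text{(shape } \alpha \text{, rate } \eta\text{)}.
\end{align*}
% Integrating out \tau_j gives a Laplace density with rate \lambda_j;
% integrating out \lambda_j then gives the marginal
\[
p(\beta_j) = \frac{\alpha}{2\eta}
  \left(1 + \frac{|\beta_j|}{\eta}\right)^{-(\alpha + 1)},
\]
% i.e. a generalized double Pareto density with scale \xi = \eta / \alpha
% and shape \alpha.
```

The marginal has a spike at zero and polynomial tails, which is consistent with the behavior the abstract describes: small effects are shrunk toward zero while large effects are left relatively large.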

    Quantile Regression in Survival Analysis: Comparing Check-Based Modeling and the Minimum Distance Approach

    Introduction: Quantile regression is a valuable alternative for survival data analysis, enabling flexible evaluations of covariate effects on survival outcomes with intuitive interpretations. It offers practical computation and reliability. However, challenges arise when applying quantile regression to censored data, particularly for upper quantiles. The minimum distance approach, utilizing dual-kernel estimation and the inverse cumulative distribution function, shows promise in addressing these challenges, especially with Methods: This study contrasts two methods within the realm of quantile linear regression for survival analysis: check-based modeling and the minimum distance approach. Effectiveness is assessed across various scenarios through comprehensive simulation. Results: The simulation results showed that using the quantile regression model with the minimum distance approach reduces the percentage of root mean square error in parameter estimation compared to the quantile regression models based on the check loss function. Additionally, a larger sample size and reduced censoring percentage led to decreased root mean square error in parameter estimation. Conclusion: The research highlights the benefits of using the minimum distance approach for quantile regression. It reduces errors, improves model predictions, captures patterns, and optimizes parameters even with complete data. However, this approach has limitations. The accuracy of estimated quantiles can be influenced by the choice of distance metric and weighting scheme. The assumption of independence between censoring mechanism and survival time may not hold in real-world scenarios. Additionally, dealing with large datasets can be computationally complex.
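
As background for the check loss function mentioned in this abstract, here is a minimal Python sketch of the standard check (pinball) loss for uncensored data. It is not the study's implementation: the function names `check_loss` and `quantile_by_check_loss` are hypothetical, censoring is ignored entirely, and the model is reduced to a single constant so that the link to the ordinary sample quantile is visible.

```python
import numpy as np


def check_loss(residuals, tau):
    """Check (pinball) loss rho_tau(u) = u * (tau - 1[u < 0]), elementwise."""
    residuals = np.asarray(residuals, dtype=float)
    indicator = (residuals < 0).astype(float)
    return residuals * (tau - indicator)


def quantile_by_check_loss(y, tau):
    """Return the observation that minimizes the total check loss.

    Illustrative assumption: an intercept-only model with no censoring,
    searched over the observed values only.
    """
    y = np.asarray(y, dtype=float)
    risks = [check_loss(y - candidate, tau).sum() for candidate in y]
    return y[int(np.argmin(risks))]


if __name__ == "__main__":
    times = [1.0, 2.0, 3.0, 4.0, 5.0]
    print(quantile_by_check_loss(times, tau=0.5))
```

Positive residuals are weighted by `tau` and negative residuals by `1 - tau`, so minimizing the summed loss targets the `tau`-th quantile rather than the mean. Handling censored survival times requires modifications that this sketch does not attempt.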

    From classical mendelian randomization to causal networks for systematic integration of multi-omics

    The number of studies with information at multiple biological levels of granularity, such as genomics, proteomics, and metabolomics, is increasing each year, and a biomedical question is how to systematically integrate these data to discover new biological mechanisms that have the potential to elucidate the processes of health and disease. Causal frameworks, such as Mendelian randomization (MR), provide a foundation to begin integrating data for new biological discoveries. Despite the growing number of MR applications in a wide variety of biomedical studies, there are few approaches for the systematic analysis of omic data. The large number and diverse types of molecular components involved in complex diseases interact through complex networks, and classical MR approaches targeting individual components do not consider the underlying relationships. In contrast, causal network models grounded in the principles of MR offer significant improvements to the classical MR framework for understanding omic data. Integration of these mostly distinct branches of statistics is a recent development, and here we review the current progress. To set the stage for causal network models, we review some recent progress in the classical MR framework. We then explain how to transition from the classical MR framework to causal networks. We discuss the identification of causal networks and evaluate the underlying assumptions. We also introduce some tests for sensitivity analysis and stability assessment of causal networks. We then review practical details for performing real data analysis and identifying causal networks, and highlight some of the utility of causal networks. These utilities, with validated novel findings, reveal the full potential of causal networks as a systems approach that will become necessary to integrate large-scale omic data.