4 research outputs found

    South Dakota State University 1969-1970 Catalog

    Modeling of various biological networks via LCMARS

    In systems biology, the interactions between components such as genes and proteins can be represented by a network. Constructing these networks plays a crucial role in understanding the molecular mechanisms of complex biological systems. However, estimating biological networks is challenging because of their high-dimensional and sparse structures, and several statistical methods have been proposed to address this. Conic Multivariate Adaptive Regression Splines (CMARS) is a recent nonparametric method developed for high-dimensional and correlated data. It was proposed to improve the performance of the Multivariate Adaptive Regression Splines (MARS) approach, a complex model within the class of generalized additive models. Previous studies have shown that MARS can be a promising model for describing the steady-state activations of biological networks if it is modified as a lasso-type regression via the main effects. In this study, we convert the full description of CMARS into a loop-based approach, called LCMARS, that includes both main and second-order interaction effects, since this description has performed better on benchmark real datasets. We generate various scenarios based on distinct distributions and dimensions to compare the performance of LCMARS with MARS and the Gaussian Graphical Model (GGM) in terms of accuracy measures via Monte Carlo runs. Additionally, different real biological datasets are used to assess the performance of these methods.

    Estimation of large-scale networks via statistical approaches (original title: Geniş ölçekli ağların istatistiksel yaklaşımlarla tahmini)

    In systems biology, the interactions between components such as genes and proteins can be represented by a network. Constructing these networks plays a crucial role in understanding the molecular mechanisms of complex biological systems. However, estimating these networks is challenging because of their high-dimensional and sparse structures. The Gaussian graphical model (GGM) is a widely used approach for constructing undirected networks. GGM defines the interactions between species through their conditional dependencies under the assumption of multivariate normality. However, when the dimension of the system is high, the model becomes computationally demanding, and the accuracy of GGM decreases when the observations are far from normal. In this thesis, we suggest conic multivariate adaptive regression splines (CMARS) as an alternative to GGM to overcome both problems. CMARS is a recent nonparametric method developed for high-dimensional and correlated data. We adapted CMARS to describe biological systems and called it “LCMARS” because of its loop-based description. We generate various scenarios based on distinct distributions and dimensions to compare the performance of LCMARS with MARS and GGM in terms of accuracy measures via Monte Carlo runs. Additionally, different real biological datasets are used to assess the performance of these methods. Furthermore, we apply various outlier detection methods as a pre-processing step before modeling the networks, in order to investigate whether outlier detection can improve the accuracy of the model. In the analysis, several synthetic and real benchmark biological datasets are used.
    Thesis (Ph.D.) -- Graduate School of Natural and Applied Sciences. Statistics
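
The GGM idea mentioned in the abstract above, reading network edges from conditional dependencies under multivariate normality, can be illustrated with a minimal sketch. This is not the thesis's implementation: it assumes far more samples than variables, so that the sample covariance can simply be inverted, whereas high-dimensional settings would normally need a sparse estimator. The function names, the chain-shaped toy network, the coupling value, and the 0.2 threshold are all hypothetical choices made for illustration.

```python
import random


def covariance(data):
    """Sample covariance matrix of rows in `data` (n_samples x n_vars)."""
    n, p = len(data), len(data[0])
    means = [sum(row[j] for row in data) / n for j in range(p)]
    return [[sum((row[i] - means[i]) * (row[j] - means[j]) for row in data) / (n - 1)
             for j in range(p)] for i in range(p)]


def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    p = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(p)]
           for i, row in enumerate(matrix)]
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(p):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[p:] for row in aug]


def partial_correlations(data):
    """Partial correlations from the precision (inverse covariance) matrix."""
    prec = invert(covariance(data))
    p = len(prec)
    return [[1.0 if i == j else -prec[i][j] / (prec[i][i] * prec[j][j]) ** 0.5
             for j in range(p)] for i in range(p)]


def estimate_edges(data, threshold):
    """Undirected edges (i, j), i < j, whose |partial correlation| exceeds threshold."""
    pcor = partial_correlations(data)
    p = len(pcor)
    return {(i, j) for i in range(p) for j in range(i + 1, p)
            if abs(pcor[i][j]) > threshold}


def simulate_chain(n, p=4, coupling=0.6, seed=0):
    """Toy Gaussian data in which each variable depends only on the previous one."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        row = [rng.gauss(0, 1)]
        for _ in range(p - 1):
            row.append(coupling * row[-1] + rng.gauss(0, 1))
        data.append(row)
    return data


samples = simulate_chain(5000)
edges = estimate_edges(samples, threshold=0.2)  # threshold is an arbitrary choice
print(sorted(edges))
```

In this toy chain, variables that are two or more steps apart are marginally correlated but conditionally independent given the variables between them, so only neighbouring pairs should survive the threshold. Direct inversion stops working once the number of variables approaches the number of samples, which is the high-dimensional regime the abstract describes as problematic.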