7 research outputs found

    Optimal Sparsity Selection Based on an Information Criterion for Accurate Gene Regulatory Network Inference

    Accurate inference of gene regulatory networks (GRNs) is important to unravel unknown regulatory mechanisms and processes, which can lead to the identification of treatment targets for genetic diseases. A variety of GRN inference methods have been proposed that, under suitable data conditions, perform well in benchmarks that consider the entire spectrum of false positives and false negatives. However, it is very challenging to predict which single network sparsity gives the most accurate GRN. Lacking criteria for sparsity selection, a simplistic solution is to pick the GRN that has a certain number of links per gene, which is guessed to be reasonable. However, this does not guarantee finding the GRN that has the correct sparsity or is the most accurate one. In this study, we provide a general approach for identifying the most accurate and sparsity-wise relevant GRN within the entire space of possible GRNs. The algorithm, called SPA, applies a “GRN information criterion” (GRNIC) that is inspired by two commonly used model selection criteria, the Akaike and Bayesian Information Criteria (AIC and BIC), but adapted to GRN inference. The results show that the approach can, in most cases, find a GRN whose sparsity is close to the true sparsity and whose accuracy is close to the best achievable with the given GRN inference method and data. The datasets and source code can be found at https://bitbucket.org/sonnhammergrni/spa/
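    The abstract does not define GRNIC, so the following is only a sketch of the general idea it builds on: scoring candidate networks of different sparsity with the textbook least-squares forms of AIC and BIC and keeping the one with the lowest score. It is not the SPA algorithm. The function names, the candidate list, and the residual values are all hypothetical and chosen for illustration.

```python
import math

def aic(rss, n_obs, n_links):
    # Textbook least-squares AIC: goodness of fit plus 2 per free parameter.
    return n_obs * math.log(rss / n_obs) + 2 * n_links

def bic(rss, n_obs, n_links):
    # Textbook least-squares BIC: the penalty per parameter grows with log(n_obs).
    return n_obs * math.log(rss / n_obs) + n_links * math.log(n_obs)

def select_sparsity(candidates, n_obs, criterion=bic):
    """Return the (n_links, rss) pair with the lowest criterion value.

    candidates: list of (n_links, rss) tuples, one per candidate network,
    where rss is the residual sum of squares of that network's fit.
    """
    return min(candidates, key=lambda c: criterion(c[1], n_obs, c[0]))

# Hypothetical candidates: denser networks fit slightly better,
# but the improvement flattens out after 10 links.
candidates = [(5, 40.0), (10, 12.0), (20, 10.5), (40, 10.0)]
best_bic = select_sparsity(candidates, n_obs=50)
best_aic = select_sparsity(candidates, n_obs=50, criterion=aic)
print(best_bic, best_aic)
```

    In this made-up example both criteria pick the 10-link network, because the extra links of the denser candidates reduce the residual too little to pay for their penalty.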

    Deterministic modeling and inference of biological networks.

    No full text
    The mathematical description of biological networks can be performed mainly by stochastic and deterministic models. The former give more information about the system but require very detailed measurements. The latter are less informative, but their data are easier to collect, which makes them the preferable modeling approach. In this study, we implement the deterministic modeling of biological systems because of this advantage. Among many alternatives, we use the Gaussian graphical model (GGM) and evaluate its performance against the random forest algorithm, which we suggest as an alternative to the GGM. We estimate the model parameters, i.e., the structure of the networks, and then assess the results by their accuracy. Finally, we extend the study by using copulas to describe the data and apply the same modeling approaches to assess their effects. M.S. - Master of Science
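    As a reminder of how a GGM encodes network structure: two variables are linked when their partial correlation, read off the precision (inverse covariance) matrix, is nonzero. The sketch below assumes the precision matrix is already known; how the thesis estimates it is not stated in the abstract. The function names and the example matrix are hypothetical.

```python
import math

def partial_correlations(precision):
    # Partial correlation of i and j given all other variables:
    # -P[i][j] / sqrt(P[i][i] * P[j][j]), with 1.0 on the diagonal.
    p = len(precision)
    pcor = [[1.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(p):
            if i != j:
                scale = math.sqrt(precision[i][i] * precision[j][j])
                pcor[i][j] = -precision[i][j] / scale
    return pcor

def ggm_edges(precision, threshold=1e-8):
    # Keep an undirected edge (i, j) when the partial correlation is non-negligible.
    pcor = partial_correlations(precision)
    p = len(precision)
    return [(i, j) for i in range(p) for j in range(i + 1, p)
            if abs(pcor[i][j]) > threshold]

# Hypothetical precision matrix of a three-node chain 0 - 1 - 2.
precision = [[2.0, -1.0, 0.0],
             [-1.0, 2.0, -1.0],
             [0.0, -1.0, 2.0]]
network = ggm_edges(precision)
print(network)
```

    Nodes 0 and 2 are conditionally independent given node 1, so no edge is reported between them.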

    Improving the accuracy of gene regulatory network inference from noisy data

    No full text
    Gene regulatory networks (GRNs) control physiological and pathological processes in a living organism, and their accurate inference from measured gene expression can identify therapeutic mechanisms for complex diseases such as cancers. The biggest obstacle to accurate reconstruction of GRNs is noise, which considerably alters the measured gene expression because it generally dominates the biological signal. This situation needs to be addressed carefully so that GRN inference methods do not fit the noise instead of the underlying biological signal. Noise compensation approaches are a must if the goal is to reconstruct the true system. To this end, within the scope of this doctoral thesis, I developed two methods that, in different ways, overcome the obstacles introduced by noise in gene expression data. Method 1 allows the collection of more informative subsets of genes whose expression is not as highly affected as those which cause the system to be overall uninformative. Method 2 infers a perturbation design that is better suited to the gene expression data than the originally intended design, and therefore produces more accurate GRNs at high noise levels. Furthermore, a benchmark study was carried out which compares the methodological backgrounds of GRN inference methods in terms of whether they utilize knowledge of the perturbation design or not, which clearly shows that utilization of the perturbation design is essential for accurate inference of GRNs. Finally, a method is presented to improve GRN inference accuracy by selecting the GRN with the optimal sparsity based on information theoretical criteria. The three new methods (PAPERS I, II and IV) can also be used together, which is shown in this thesis to improve the GRN inference accuracy considerably more than the methods separately. As inference of accurate GRNs is a major challenge in gene regulation, the methods presented in this thesis represent an important contribution to move the field forward.

    Alternative approaches, parameter estimation, and copulas in the deterministic modeling of complex biological systems

    No full text
    The project proposes new models for the deterministic modeling and parameter estimation of complex biological networks, as alternatives to the Gaussian graphical model and the ordinary differential equation model, both of which are established and frequently used in the literature. It also offers a remedy for the shortcomings and limitations seen in the computational assumptions and results of these common models. To this end, the project proposes to modify nonparametric algorithms and models that are used in other disciplines and for other purposes so that they can be applied to the estimation of biological systems, and to construct new model selection criteria. The validity of the proposed models will be investigated in detail on simulated and real systems of different dimensions, under different distributional assumptions, including copulas.