19 research outputs found

    hclimente/martini v0.1-alpha

    User-friendly version of gin

    martini: an R package for genome-wide association studies using SNP networks

    Systems biology shows that genes that are related to the same phenotype are often functionally related. We can take advantage of this to discover new genes that affect a phenotype. However, the natural unit of analysis in genome-wide association studies (GWAS) is not the gene, but the single nucleotide polymorphism, or SNP. We introduce martini, an R package to build SNP co-function networks and use them to conduct GWAS. In SNP networks, two SNPs are connected if there is evidence they jointly contribute to the same biological function. By leveraging such information in GWAS, we search for SNPs that are not only strongly associated with a phenotype, but also functionally related. This, in turn, boosts discovery and interpretability. Martini builds such networks using three sources of information: genomic position, gene annotations, and gene-gene interactions. The resulting SNP networks involve hundreds of thousands of nodes and millions of edges, making their exploration computationally intensive. Martini implements two network-guided biomarker discovery algorithms based on graph cuts that can handle such large networks: SConES and SigMod. Both seek a small subset of SNPs that have high association scores with the phenotype of interest and are densely interconnected in the network. Both algorithms use parameters that control the relative importance of the SNPs' association scores, the number of SNPs selected, and their interconnection. Martini includes a cross-validation procedure to set these parameters automatically. Lastly, martini includes tools to visualize the selected SNPs' network and association properties. Martini is available on GitHub (https://github.com/hclimente/martini) and Bioconductor (https://www.bioconductor.org/packages/release/bioc/html/martini.html).
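As an illustration of how a SNP network could be assembled from the three sources named in the abstract (genomic position, gene annotations, gene-gene interactions), here is a minimal Python sketch. It is not martini's implementation or API: the function name `build_snp_network`, the input formats, and the specific linking rules (adjacent SNPs on a chromosome, SNPs sharing a gene, SNPs in interacting genes) are assumptions made for this toy example.

```python
from itertools import combinations


def build_snp_network(snp_positions, snp_to_genes, gene_interactions):
    """Hypothetical helper: return undirected SNP-SNP edges as frozensets.

    snp_positions:     dict SNP id -> (chromosome, base-pair position)
    snp_to_genes:      dict SNP id -> set of gene names the SNP maps to
    gene_interactions: iterable of (gene_a, gene_b) pairs
    """
    edges = set()

    # Source 1 (assumed rule): link SNPs that are adjacent on a chromosome.
    by_chrom = {}
    for snp, (chrom, pos) in snp_positions.items():
        by_chrom.setdefault(chrom, []).append((pos, snp))
    for snps in by_chrom.values():
        snps.sort()
        for (_, a), (_, b) in zip(snps, snps[1:]):
            edges.add(frozenset((a, b)))

    # Source 2 (assumed rule): link every pair of SNPs mapped to the same gene.
    gene_to_snps = {}
    for snp, genes in snp_to_genes.items():
        for gene in genes:
            gene_to_snps.setdefault(gene, set()).add(snp)
    for snps in gene_to_snps.values():
        for a, b in combinations(sorted(snps), 2):
            edges.add(frozenset((a, b)))

    # Source 3 (assumed rule): link SNPs whose genes are reported to interact.
    for gene_a, gene_b in gene_interactions:
        for a in gene_to_snps.get(gene_a, ()):
            for b in gene_to_snps.get(gene_b, ()):
                if a != b:
                    edges.add(frozenset((a, b)))

    return edges


# Toy data, invented for illustration only.
positions = {
    "rs1": ("1", 100),
    "rs2": ("1", 200),
    "rs3": ("1", 300),
    "rs4": ("2", 50),
}
annotations = {"rs1": {"A"}, "rs3": {"A"}, "rs4": {"B"}}
interactions = [("A", "B")]

edges = build_snp_network(positions, annotations, interactions)
for edge in sorted(tuple(sorted(e)) for e in edges):
    print(edge)
```

In this toy network, rs2 is connected only through genomic adjacency, while rs1, rs3 and rs4 are also tied together through gene membership and the assumed A-B interaction. Real SNP networks of the size the abstract mentions would need a sparse graph representation rather than a Python set.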

    A network-guided protocol to discover susceptibility genes in genome-wide association studies using stability selection

    We present a network-based protocol to discover susceptibility genes in case-control genome-wide association studies (GWASs). In short, this protocol looks for biomarkers that are informative of disease status and interconnected in an underlying biological network. This boosts discovery and interpretability. Moreover, the protocol tackles the instability of network methods, producing a stable set of genes most likely to replicate in external cohorts. To apply the procedure to a provided GWAS dataset, install the required software and execute our command-line tool.
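The general idea behind stability selection, which the title refers to, can be sketched as follows: run a feature selector on many random subsamples and keep only the features chosen in a large fraction of runs. The Python sketch below is not the protocol's command-line tool. The base selector `select_top_k` (largest difference in means between cases and controls), the function names, the subsample size, and the 0.8 threshold are all assumptions chosen for this toy example; the protocol itself uses network methods as the base selector.

```python
import random


def select_top_k(cases, controls, k):
    """Toy base selector (assumption): the k features with the largest
    absolute difference in mean value between cases and controls."""
    n_features = len(cases[0])

    def mean(rows, j):
        return sum(row[j] for row in rows) / len(rows)

    scores = [abs(mean(cases, j) - mean(controls, j)) for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return set(ranked[:k])


def stability_selection(cases, controls, k, n_rounds=50, threshold=0.8, seed=0):
    """Keep features selected in at least `threshold` of the subsampling rounds."""
    rng = random.Random(seed)
    counts = [0] * len(cases[0])
    for _ in range(n_rounds):
        # Draw half of each group without replacement, then rerun the selector.
        sub_cases = rng.sample(cases, len(cases) // 2)
        sub_controls = rng.sample(controls, len(controls) // 2)
        for j in select_top_k(sub_cases, sub_controls, k):
            counts[j] += 1
    return {j for j, c in enumerate(counts) if c / n_rounds >= threshold}


# Toy data, invented for illustration: feature 0 separates the groups
# perfectly, features 1-4 are random noise.
data_rng = random.Random(42)
cases = [[1] + [data_rng.randint(0, 1) for _ in range(4)] for _ in range(20)]
controls = [[0] + [data_rng.randint(0, 1) for _ in range(4)] for _ in range(20)]

stable = stability_selection(cases, controls, k=1)
print(sorted(stable))
```

With `k=1`, feature 0 wins every round on this toy data, so it is the only feature that passes the frequency threshold; the noise features never accumulate enough selections to be reported.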

    Block HSIC Lasso

    Motivation: Finding non-linear relationships between biomolecules and a biological outcome is computationally expensive and statistically challenging. Existing methods have important drawbacks, including among others lack of parsimony, non-convexity and computational overhead. Here we propose block HSIC Lasso, a non-linear feature selector that does not present the previous drawbacks. Results: We compare block HSIC Lasso to other state-of-the-art feature selection techniques in both synthetic and real data, including experiments over three common types of genomic data: gene-expression microarrays, single-cell RNA sequencing and genome-wide association studies. In all cases, we observe that features selected by block HSIC Lasso retain more information about the underlying biology than those selected by other techniques. As a proof of concept, we applied block HSIC Lasso to a single-cell RNA sequencing experiment on mouse hippocampus. We discovered that many genes linked in the past to brain development and function are involved in the biological differences between the types of neurons.

    The functional impact of alternative splicing in cancer

    Alternative splicing changes are frequently observed in cancer and are starting to be recognized as important signatures for tumor progression and therapy. However, their functional impact and relevance to tumorigenesis remain mostly unknown. We carried out a systematic analysis to characterize the potential functional consequences of alternative splicing changes in thousands of tumor samples. This analysis revealed that a subset of alternative splicing changes affect protein domain families that are frequently mutated in tumors and potentially disrupt protein-protein interactions in cancer-related pathways. Moreover, there was a negative correlation between the number of these alternative splicing changes in a sample and the number of somatic mutations in drivers. We propose that a subset of the alternative splicing changes observed in tumors may represent independent oncogenic processes that could be relevant to explain the functional transformations in cancer, and some of them could potentially be considered alternative splicing drivers (AS drivers). H.C.-G. and E.E. were supported by the MINECO and FEDER (BIO2014-52566-R), Consolider RNAREG (CSD2009-00080), AGAUR (SGR2014-1121), the European ITN Network RNP-Net (ID: 289007), and the Sandra Ibarra Foundation for Cancer (FSI2013). E.P.-P. and A.G. were supported by the SBP CC grant (P30 CA030199). All authors thank The Cancer Genome Atlas project for making their data publicly available. The Computational RNA Biology Group is part of the Research Programme on Biomedical Informatics (GRIB), which is a member of ELIXIR-Excelerate of the European Union Horizon 2020 Programme 2014-2020 (No. 676559) and of the Spanish National Bioinformatics Institute (INB), PRB2-ISCIII and is supported by grant PT13/0001/0023 of the PE I+D+I 2013-2016, funded by ISCIII and FEDE