
    RNA-Based Strategies for Cancer Therapy: In Silico Design and Evaluation of ASOs for Targeted Exon Skipping

    Precision medicine in oncology has made significant progress in recent years by approving drugs that target specific genetic mutations. However, many cancer driver genes remain challenging to target pharmacologically ("undruggable"). To tackle this issue, RNA-based methods such as antisense oligonucleotides (ASOs) that induce targeted exon skipping (ES) could provide a promising alternative. In this work, a comprehensive computational procedure is presented, focused on the development of ES-based cancer treatments. The procedure aims to produce specific protein variants, including inactivated oncogenes and partially restored tumor suppressors. This novel computational procedure encompasses target-exon selection, in silico prediction of ES products, and identification of the best candidate ASOs for further experimental validation. The method was effectively employed on extensively mutated cancer genes, prioritized according to their suitability for ES-based interventions. Notable genes, such as NRAS and VHL, exhibited potential for this therapeutic approach, as specific target exons were identified and optimal ASO sequences were devised to induce their skipping. To the best of our knowledge, this is the first computational procedure that encompasses all necessary steps for designing ASO sequences tailored for targeted ES, contributing a versatile and innovative approach to addressing the challenges posed by undruggable cancer driver genes and beyond.
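The design logic sketched in the abstract (pick a skippable exon, then enumerate antisense candidates against it) can be illustrated with a toy example. This is a minimal sketch under stated assumptions, not the published procedure: the sequence, ASO length, and candidate enumeration are hypothetical, and real pipelines add off-target, chemistry, and efficacy scoring.

```python
# Illustrative sketch of two steps in an exon-skipping ASO design pipeline.
# All sequences and parameters below are hypothetical assumptions.

def complement(base: str) -> str:
    return {"A": "U", "U": "A", "G": "C", "C": "G"}[base]

def reverse_complement_rna(seq: str) -> str:
    """Antisense (reverse complement) of an RNA sequence."""
    return "".join(complement(b) for b in reversed(seq))

def is_frame_preserving(exon_length: int) -> bool:
    """Skipping an exon leaves the downstream reading frame intact
    only when the exon length is a multiple of three."""
    return exon_length % 3 == 0

def candidate_asos(target_region: str, length: int = 18):
    """Slide a window over a pre-mRNA target region (e.g. a splice
    site plus flanking sequence) and emit antisense candidates."""
    for start in range(len(target_region) - length + 1):
        window = target_region[start:start + length]
        yield start, reverse_complement_rna(window)

# Toy 21-nt region around a hypothetical 5' splice site.
region = "AAGGUAAGUCCUGGAGCAGGA"
print(is_frame_preserving(84))  # an 84-nt exon is frame-preserving
for start, aso in candidate_asos(region):
    print(start, aso)
```

The frame check matters for the two therapeutic aims stated above: skipping a frame-preserving exon yields an internally deleted protein (e.g. an inactivated oncogene), whereas frame-disrupting skips typically abolish the product.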

    Ten simple rules for making training materials FAIR

    Author summary: Everything we do today is becoming more and more reliant on the use of computers. The field of biology is no exception, but most biologists receive little or no formal preparation for the increasingly computational aspects of their discipline. In consequence, informal training courses are often needed to plug the gaps, and the demand for such training is growing worldwide. To meet this demand, some training programs are being expanded, and new ones are being developed. Key to both scenarios is the creation of new course materials. Rather than starting from scratch, however, it’s sometimes possible to repurpose materials that already exist. Yet finding suitable materials online can be difficult: They’re often widely scattered across the internet or hidden in their home institutions, with no systematic way to find them. This is a common problem for all digital objects. The scientific community has attempted to address this issue by developing a set of rules (the Findable, Accessible, Interoperable and Reusable [FAIR] principles) to make such objects more findable and reusable. Here, we show how to apply these rules to help make training materials easier to find, (re)use, and adapt, for the benefit of all.

    Coding potential of the products of alternative splicing in human

    Background: Analysis of the human genome has revealed that as much as an order of magnitude more of the genomic sequence is transcribed than is accounted for by the predicted and characterized genes. A number of these transcripts are alternatively spliced forms of known protein-coding genes; however, it is becoming clear that many of them do not necessarily correspond to a functional protein. Results: In this study we analyze alternative splicing isoforms of human gene products that are unambiguously identified by mass spectrometry and compare their properties with those of isoforms of the same genes for which no peptide was found in publicly available mass spectrometry datasets. We analyze them in detail for the presence of uninterrupted functional domains and active sites, as well as for the plausibility of their predicted structure. We report how well each of these strategies, and their combination, can correctly identify translated isoforms, and we derive a lower limit for their specificity, that is, their ability to correctly identify non-translated products. Conclusions: The most effective strategy for correctly identifying translated products relies on the conservation of active sites, but it can only be applied to a small fraction of isoforms, while a reasonably high coverage, sensitivity, and specificity can be achieved by analyzing the presence of non-truncated functional domains. Combining the latter with an assessment of the plausibility of the modeled structure of the isoform increases both coverage and specificity at a moderate cost in terms of sensitivity.
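The decision rule described in the Conclusions can be sketched as a small classifier. This is an illustrative assumption-laden sketch, not the authors' implementation: the feature names and the precedence given to active-site conservation are inferred from the abstract's ranking of the strategies.

```python
# Hedged sketch of the combined isoform-classification strategy:
# active-site conservation first (most specific, low coverage), then
# intact domains plus structural plausibility. Features are hypothetical.
from dataclasses import dataclass

@dataclass
class Isoform:
    name: str
    has_intact_domain: bool      # no functional domain is truncated
    structure_plausible: bool    # modeled 3D structure looks native-like
    active_site_conserved: bool  # all active-site residues present

def likely_translated(iso: Isoform) -> bool:
    """Call an isoform translated if its active site is conserved;
    otherwise require both an intact domain and a plausible structure."""
    if iso.active_site_conserved:
        return True
    return iso.has_intact_domain and iso.structure_plausible

for iso in [Isoform("ENST_A", True, True, False),
            Isoform("ENST_B", False, True, False)]:
    print(iso.name, likely_translated(iso))
```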

    A framework to assess the quality and impact of bioinformatics training across ELIXIR.

    ELIXIR is a pan-European intergovernmental organisation for life science that aims to coordinate bioinformatics resources in a single infrastructure across Europe; bioinformatics training is central to its strategy, which seeks to develop a training community spanning all ELIXIR member states. In an evidence-based approach to strengthening bioinformatics training programmes across Europe, the ELIXIR Training Platform, led by the ELIXIR EXCELERATE Quality and Impact Assessment Subtask in collaboration with the ELIXIR Training Coordinators Group, has implemented an assessment strategy to measure the quality and impact of its entire training portfolio. Here, we present ELIXIR's framework for assessing training quality and impact, which includes specifying assessment aims, determining what data to collect in order to address these aims, and our strategy for centralised data collection to allow for ELIXIR-wide analyses. In addition, we present an overview of the ELIXIR training data collected over the past 4 years. We highlight the importance of a coordinated and consistent data collection approach, and the relevance of defining specific metrics and answer scales for consortium-wide analyses as well as for comparing data across iterations of the same course.

    Unraveling the complexity of the human transcriptome: analysis and integration of high-throughput data

    BACKGROUND: An undoubted outcome of the last decade of genomics research is the evidence of how little we know about the complexity of the human transcriptome. Researchers found, at first, that there were far fewer genes than expected (<25,000 protein-coding genes), only to discover later that there were far more non-protein-coding transcripts than expected (~30,000). Surprisingly, transcriptome diversity is also due to a wider occurrence of alternative splicing than previously suspected: it is now clear that about 95% of human multi-exon genes undergo alternative splicing events. At the same time, a number of studies from others and from our group have shown that a significant fraction of all generated transcripts, or isoforms, are unlikely to be translated into functional protein products. Although the emerging picture of transcriptome complexity is exciting, and genome-scale information is freely accessible from public repositories, approaches permitting a comprehensive characterization and efficient validation of the data still need to be developed. In this context, the progress and contributions of next-generation sequencing techniques have been crucial. By generating high-throughput experimental data and applying bioinformatics analyses, it becomes possible to investigate and compare the behaviour of both protein-coding and non-protein-coding elements in specific biological and biomedical problems. Both the large-scale data generated by international consortium projects and the high-throughput experimental data produced in specific experiments challenge us to translate this massive information into meaningful biological insights. AIM: My thesis focuses on the identification and characterization of functional protein-coding and non-protein-coding products of the human transcriptome, by integrating a combination of computational methods and biological data.
The first part of the study aims at defining a strategy to assess the protein-coding potential of alternative splicing products, by analyzing all genome-scale transcripts available in public repositories. Sequence and structure features of known proteins are investigated and combined using bioinformatics approaches. This study assesses the ability of different criteria to detect most of the actually translated isoforms and can be of great help in estimating the real size of the human proteome (information that is still missing). The second part of the study aims at detecting functional protein-coding and non-protein-coding transcripts by analyzing high-throughput RNA-sequencing expression data in specific biomedical problems. The analysis investigates mRNA expression profiles, taking into account all possible isoforms, as well as microRNA expression profiles. In order to elucidate and assess a putative key regulatory role of microRNAs in the biological processes under study, a procedure is developed to identify enriched microRNA/mRNA-target associations. RESULTS: The first analysis focused on human protein-coding genes having at least one protein isoform unequivocally identified by mass-spectrometry peptide data (Positive dataset) and at least one other isoform with no evidence of translation (Unknown dataset). A number of sequence and structural features of typical functional proteins were used to compare the properties of the two isoform datasets. We found that Positive isoforms are predicted to be structurally more plausible than Unknown isoforms; functional domains are more often truncated in Unknown isoforms than in Positive ones; and functional features such as active sites are rarely disrupted in Positive isoforms. Combining the presence of non-truncated functional domains with an assessment of the plausibility of the modelled structure, the estimated percentage of non-plausible protein-coding transcripts is about 45% of the Unknown isoforms.
To capture further differences between functional and non-functional transcripts, the following features were also investigated: selection pressure, conservation among multiple species, and stringent comparison between human and mouse. The second analysis aimed at detecting all the putative functional transcripts involved in biological processes investigated by RNA sequencing. Specialized bioinformatics procedures were set up to analyze both mRNA and microRNA expression profiles, and were applied to specific biomedical problems in collaboration with experimental groups. Additionally, by also integrating microRNA/target-gene prediction algorithms, the method allows the automatic identification of differentially expressed microRNAs predicted to bind mRNAs that are inversely regulated in the same experiment. This combined analysis of microRNA and target-mRNA expression data is useful for isolating putative regulatory circuits involved in the investigated biological processes.
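The combined microRNA/mRNA step described above can be sketched as a simple filter. This is a minimal illustration under assumptions, not the thesis's pipeline: the log2 fold-change convention, the threshold, and the example gene/miRNA names and values are hypothetical.

```python
# Illustrative sketch: keep differentially expressed microRNAs whose
# predicted target mRNAs change in the opposite direction in the same
# experiment. Data structures and threshold are assumptions.

def inverse_pairs(mirna_log2fc, mrna_log2fc, predicted_targets, threshold=1.0):
    """Return (miRNA, target-mRNA) pairs where both pass the
    fold-change threshold and the directions of change are opposite."""
    pairs = []
    for mirna, fc_mi in mirna_log2fc.items():
        if abs(fc_mi) < threshold:
            continue  # microRNA not differentially expressed
        for target in predicted_targets.get(mirna, []):
            fc_m = mrna_log2fc.get(target)
            if fc_m is None or abs(fc_m) < threshold:
                continue  # target absent or not differentially expressed
            if fc_mi * fc_m < 0:  # opposite signs => inverse regulation
                pairs.append((mirna, target))
    return pairs

mirna_fc = {"miR-21": 2.1, "miR-155": -1.5}
mrna_fc = {"PTEN": -1.8, "SOCS1": 1.2, "TP53": 0.2}
targets = {"miR-21": ["PTEN", "TP53"], "miR-155": ["SOCS1"]}
print(inverse_pairs(mirna_fc, mrna_fc, targets))
# [('miR-21', 'PTEN'), ('miR-155', 'SOCS1')]
```

Real analyses would add statistical testing for enrichment of each association rather than a plain threshold, as the abstract's mention of "enriched microRNA/mRNA-target associations" implies.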

    AI applications in functional genomics

    We review the current applications of artificial intelligence (AI) in functional genomics. The recent explosion of AI follows the remarkable achievements made possible by "deep learning", along with a burst of "big data" that can meet its hunger. Biology is about to overthrow astronomy as the paradigmatic big-data producer. This has been made possible by huge advancements in high-throughput technologies, applied to determine how the individual components of a biological system work together to accomplish different processes. The disciplines contributing to this bulk of data are collectively known as functional genomics. They consist of studies of: i) the information contained in the DNA (genomics); ii) the modifications that DNA can reversibly undergo (epigenomics); iii) the RNA transcripts originated by a genome (transcriptomics); iv) the ensemble of chemical modifications decorating different types of RNA transcripts (epitranscriptomics); v) the products of protein-coding transcripts (proteomics); and vi) the small molecules produced by cell metabolism (metabolomics) present in an organism or system at a given time, in physiological or pathological conditions. After reviewing the main applications of AI in functional genomics, we discuss important accompanying issues, including ethical, legal, and economic concerns, and the importance of explainability. (C) 2021 Published by Elsevier B.V. on behalf of Research Network of Computational and Structural Biotechnology.