5,223 research outputs found
Recommended from our members
Clinical metagenomics.
Clinical metagenomic next-generation sequencing (mNGS), the comprehensive analysis of microbial and host genetic material (DNA and RNA) in samples from patients, is rapidly moving from research to clinical laboratories. This emerging approach is changing how physicians diagnose and treat infectious disease, with applications spanning a wide range of areas, including antimicrobial resistance, the microbiome, human host gene expression (transcriptomics) and oncology. Here, we focus on the challenges of implementing mNGS in the clinical laboratory and address potential solutions for maximizing its impact on patient care and public health
Distributed gene clinical decision support system based on cloud computing
Background: The clinical decision support system can effectively break the limitations of doctors’ knowledge and reduce the possibility of misdiagnosis to enhance health care. The traditional genetic data storage and analysis methods based on stand-alone environment are hard to meet the computational requirements with the rapid genetic data growth for the limited scalability.
Methods: In this paper, we propose a distributed gene clinical decision support system, which is named GCDSS. And a prototype is implemented based on cloud computing technology. At the same time, we present CloudBWA which is a novel distributed read mapping algorithm leveraging batch processing strategy to map reads on Apache Spark.
Results: Experiments show that the distributed gene clinical decision support system GCDSS and the distributed read mapping algorithm CloudBWA have outstanding performance and excellent scalability. Compared with state-of-the-art distributed algorithms, CloudBWA achieves up to 2.63 times speedup over SparkBWA. Compared with stand-alone algorithms, CloudBWA with 16 cores achieves up to 11.59 times speedup over BWA-MEM with 1 core.
Conclusions: GCDSS is a distributed gene clinical decision support system based on cloud computing techniques. In particular, we incorporated a distributed genetic data analysis pipeline framework in the proposed GCDSS system. To boost the data processing of GCDSS, we propose CloudBWA, which is a novel distributed read mapping algorithm to leverage batch processing technique in mapping stage using Apache Spark platform.
Keywords: Clinical decision support system, Cloud computing, Spark, Alluxio, Genetic data analysis, Read mappin
Integrative OMICS Data-Driven Procedure Using a Derivatized Meta-Analysis Approach
The wealth of high-throughput data has opened up new opportunities to analyze and describe biological processes at higher resolution, ultimately leading to a significant acceleration of scientific output using high-throughput data from the different omics layers and the generation of databases to store and report raw datasets. The great variability among the techniques and the heterogeneous methodologies used to produce this data have placed meta-analysis methods as one of the approaches of choice to correlate the resultant large-scale datasets from different research groups. Through multi-study meta-analyses, it is possible to generate results with greater statistical power compared to individual analyses. Gene signatures, biomarkers and pathways that provide new insights of a phenotype of interest have been identified by the analysis of large-scale datasets in several fields of science. However, despite all the efforts, a standardized regulation to report large-scale data and to identify the molecular targets and signaling networks is still lacking. Integrative analyses have also been introduced as complementation and augmentation for meta-analysis methodologies to generate novel hypotheses. Currently, there is no universal method established and the different methods available follow different purposes. Herein we describe a new unifying, scalable and straightforward methodology to meta-analyze different omics outputs, but also to integrate the significant outcomes into novel pathways describing biological processes of interest. The significance of using proper molecular identifiers is highlighted as well as the potential to further correlate molecules from different regulatory levels. To show the methodology's potential, a set of transcriptomic datasets are meta-analyzed as an example
NAViGaTing the Micronome – Using Multiple MicroRNA Prediction Databases to Identify Signalling Pathway-Associated MicroRNAs
MicroRNAs are a class of small RNAs known to regulate gene expression at the transcript level, the protein level, or both. Since microRNA binding is sequence-based but possibly structure-specific, work in this area has resulted in multiple databases storing predicted microRNA:target relationships computed using diverse algorithms. We integrate prediction databases, compare predictions to in vitro data, and use cross-database predictions to model the microRNA:transcript interactome--referred to as the micronome--to study microRNA involvement in well-known signalling pathways as well as associations with disease. We make this data freely available with a flexible user interface as our microRNA Data Integration Portal--mirDIP (http://ophid.utoronto.ca/mirDIP).mirDIP integrates prediction databases to elucidate accurate microRNA:target relationships. Using NAViGaTOR to produce interaction networks implicating microRNAs in literature-based, KEGG-based and Reactome-based pathways, we find these signalling pathway networks have significantly more microRNA involvement compared to chance (p<0.05), suggesting microRNAs co-target many genes in a given pathway. Further examination of the micronome shows two distinct classes of microRNAs; universe microRNAs, which are involved in many signalling pathways; and intra-pathway microRNAs, which target multiple genes within one signalling pathway. We find universe microRNAs to have more targets (p<0.0001), to be more studied (p<0.0002), and to have higher degree in the KEGG cancer pathway (p<0.0001), compared to intra-pathway microRNAs.Our pathway-based analysis of mirDIP data suggests microRNAs are involved in intra-pathway signalling. We identify two distinct classes of microRNAs, suggesting a hierarchical organization of microRNAs co-targeting genes both within and between pathways, and implying differential involvement of universe and intra-pathway microRNAs at the disease level
WholePathwayScope: a comprehensive pathway-based analysis tool for high-throughput data
BACKGROUND: Analysis of High Throughput (HTP) Data such as microarray and proteomics data has provided a powerful methodology to study patterns of gene regulation at genome scale. A major unresolved problem in the post-genomic era is to assemble the large amounts of data generated into a meaningful biological context. We have developed a comprehensive software tool, WholePathwayScope (WPS), for deriving biological insights from analysis of HTP data. RESULT: WPS extracts gene lists with shared biological themes through color cue templates. WPS statistically evaluates global functional category enrichment of gene lists and pathway-level pattern enrichment of data. WPS incorporates well-known biological pathways from KEGG (Kyoto Encyclopedia of Genes and Genomes) and Biocarta, GO (Gene Ontology) terms as well as user-defined pathways or relevant gene clusters or groups, and explores gene-term relationships within the derived gene-term association networks (GTANs). WPS simultaneously compares multiple datasets within biological contexts either as pathways or as association networks. WPS also integrates Genetic Association Database and Partial MedGene Database for disease-association information. We have used this program to analyze and compare microarray and proteomics datasets derived from a variety of biological systems. Application examples demonstrated the capacity of WPS to significantly facilitate the analysis of HTP data for integrative discovery. CONCLUSION: This tool represents a pathway-based platform for discovery integration to maximize analysis power. The tool is freely available at
Posterior Association Networks and Functional Modules Inferred from Rich Phenotypes of Gene Perturbations
Combinatorial gene perturbations provide rich information for a systematic exploration of genetic interactions. Despite successful applications to bacteria and yeast, the scalability of this approach remains a major challenge for higher organisms such as humans. Here, we report a novel experimental and computational framework to efficiently address this challenge by limiting the ‘search space’ for important genetic interactions. We propose to integrate rich phenotypes of multiple single gene perturbations to robustly predict functional modules, which can subsequently be subjected to further experimental investigations such as combinatorial gene silencing. We present posterior association networks (PANs) to predict functional interactions between genes estimated using a Bayesian mixture modelling approach. The major advantage of this approach over conventional hypothesis tests is that prior knowledge can be incorporated to enhance predictive power. We demonstrate in a simulation study and on biological data, that integrating complementary information greatly improves prediction accuracy. To search for significant modules, we perform hierarchical clustering with multiscale bootstrap resampling. We demonstrate the power of the proposed methodologies in applications to Ewing's sarcoma and human adult stem cells using publicly available and custom generated data, respectively. In the former application, we identify a gene module including many confirmed and highly promising therapeutic targets. Genes in the module are also significantly overrepresented in signalling pathways that are known to be critical for proliferation of Ewing's sarcoma cells. In the latter application, we predict a functional network of chromatin factors controlling epidermal stem cell fate. Further examinations using ChIP-seq, ChIP-qPCR and RT-qPCR reveal that the basis of their genetic interactions may arise from transcriptional cross regulation. A Bioconductor package implementing PAN is freely available online at http://bioconductor.org/packages/release/bioc/html/PANR.html
- …