Probabilistic network models for cardiovascular monitoring
Thesis (M. Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2007. Includes bibliographical references (p. 83-85). While treating patients during their hospital stay, physicians must frequently take into consideration massive amounts of clinical data. This data can come in many forms, such as continuous blood pressure tracings, intermittent laboratory results, or simple qualitative observations on the patient's appearance. Although access to such a rich collection of information is beneficial for making diagnoses and treatment decisions, it can sometimes be difficult for clinicians to mentally keep track of everything, especially in hectic environments such as hospital intensive care units (ICUs). In addition, there are certain physiological variables that cannot be measured noninvasively, but are critical indicators of a patient's state of health. One such example in cardiology is cardiac output - the mean flow rate of blood from the heart. In this thesis, we explore probabilistic networks as a method for integrating different types of clinical data into a single model, and as a vehicle for summarizing population statistics from medical databases. These networks can then be used to estimate unobservable variables of interest. We propose and test several networks of varying complexity on both a set of experimental porcine data, and a set of real ICU patient data. We find that continuous estimation of cardiac output is possible using probabilistic networks, and that the errors produced are comparable to those obtained from deterministic methods that employ the same information. Furthermore, since this technique is purely statistical in nature, it can be easily reformulated for applications where deterministic methods do not exist. By Shirley X. Li. M.Eng.
CEAS: cis-regulatory element annotation system
The recent availability of high-density human genome tiling arrays enables biologists to conduct ChIP–chip experiments to locate the in vivo-binding sites of transcription factors in the human genome and explore the regulatory mechanisms. Once genomic regions enriched by transcription factor ChIP–chip are located, genome-scale downstream analyses are crucial but difficult for biologists without strong bioinformatics support. We designed and implemented the first web server to streamline the ChIP–chip downstream analyses. Given genome-scale ChIP regions, the cis-regulatory element annotation system (CEAS) retrieves repeat-masked genomic sequences, calculates GC content, plots evolutionary conservation, maps nearby genes and identifies enriched transcription factor-binding motifs. Biologists can utilize CEAS to retrieve useful information for ChIP–chip validation, assemble important knowledge to include in their publication and generate novel hypotheses (e.g. transcription factor cooperative partner) for further study. CEAS helps the adoption of ChIP–chip in mammalian systems and provides insights towards a more comprehensive understanding of transcriptional regulatory mechanisms. The URL of the server is
Machine Learning on Syngeneic Mouse Tumor Profiles To Model Clinical Immunotherapy Response
Most patients with cancer are refractory to immune checkpoint blockade (ICB) therapy, and proper patient stratification remains an open question. Primary patient data suffer from high heterogeneity, low accessibility, and lack of proper controls. In contrast, syngeneic mouse tumor models enable controlled experiments with ICB treatments. Using transcriptomic and experimental variables from >700 ICB-treated/control syngeneic mouse tumors, we developed a machine learning framework to model tumor immunity and identify factors influencing ICB response. Projected on human immunotherapy trial data, we found that the model can predict clinical ICB response. We further applied the model to predicting ICB-responsive/resistant cancer types in The Cancer Genome Atlas, which agreed well with existing clinical reports. Last, feature analysis implicated factors associated with ICB response. In summary, our computational framework based on mouse tumor data reliably stratified patients regarding ICB response, informed resistance mechanisms, and has the potential for wide applications in disease treatment studies
xMAN: extreme MApping of OligoNucleotides.
BACKGROUND: The ability to rapidly map millions of oligonucleotide fragments to a reference genome is crucial to many high-throughput genomic technologies. RESULTS: We propose an intuitive and efficient algorithm, titled extreme MApping of OligoNucleotide (xMAN), to rapidly map millions of oligonucleotide fragments to a genome of any length. By converting oligonucleotides to integers hashed in RAM, xMAN can scan through genomes using bit-shifting operations and achieve at least one order of magnitude speed increase over existing tools. xMAN can map the 42 million 25-mer probes on the Affymetrix whole human genome tiling arrays to the entire genome in less than 6 CPU hours. CONCLUSIONS: In addition to the speed advantage, we found the probe mapping of xMAN to substantially improve the final analysis results in both a spike-in experiment on ENCODE tiling arrays and an estrogen receptor ChIP-chip experiment on whole human genome tiling arrays. Those improvements were confirmed by direct ChIP and real-time PCR assay. xMAN can be further extended for application to other high-throughput genomic technologies for oligonucleotide mapping
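The idea of converting oligonucleotides to hashed integers and scanning a genome with bit shifts can be illustrated with a minimal sketch. This is not the xMAN implementation: the 2-bit base code, the function names (`encode`, `build_index`, `scan`), and the choice to match the forward strand only with exact matches are all assumptions made for illustration.

```python
# Illustrative 2-bit code per base (an assumption, not taken from xMAN).
BASE_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}


def encode(oligo):
    """Pack an oligonucleotide into a single integer, 2 bits per base."""
    value = 0
    for base in oligo:
        value = (value << 2) | BASE_CODE[base]
    return value


def build_index(probes):
    """Build an in-memory hash table: encoded probe -> list of probe names."""
    index = {}
    for name, seq in probes.items():
        index.setdefault(encode(seq), []).append(name)
    return index


def scan(genome, index, k):
    """Slide a k-base window along the genome (forward strand, exact match).

    Each step shifts the previous window's integer left by 2 bits, adds the
    new base, and masks off the base that left the window, so every window
    costs O(1) work instead of re-encoding k bases.
    """
    mask = (1 << (2 * k)) - 1
    hits = []
    value = 0
    valid = 0  # consecutive unambiguous bases seen so far
    for pos, base in enumerate(genome):
        code = BASE_CODE.get(base)
        if code is None:  # ambiguous base such as N: restart the window
            value, valid = 0, 0
            continue
        value = ((value << 2) | code) & mask
        valid += 1
        if valid >= k and value in index:
            start = pos - k + 1  # 0-based start of the matching window
            for name in index[value]:
                hits.append((name, start))
    return hits


if __name__ == "__main__":
    probes = {"p1": "ACGT", "p2": "CGTA"}  # hypothetical 4-mer probes
    index = build_index(probes)
    print(scan("TACGTACGNACGT", index, k=4))
```

With 2 bits per base, a 25-mer such as the tiling-array probes mentioned above fits in 50 bits, so each probe can be held as one machine-word-sized key in the hash table.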
Model-based analysis of two-color arrays (MA2C)
A normalization method based on probe GC content for two-color tiling arrays and an algorithm for detecting peak regions are presented. They are available in a stand-alone Java program
MAGeCK enables robust identification of essential genes from genome-scale CRISPR/Cas9 knockout screens
We propose the Model-based Analysis of Genome-wide CRISPR/Cas9 Knockout (MAGeCK) method for prioritizing single-guide RNAs, genes and pathways in genome-scale CRISPR/Cas9 knockout screens. MAGeCK demonstrates better performance compared with existing methods, identifies both positively and negatively selected genes simultaneously, and reports robust results across different experimental conditions. Using public datasets, MAGeCK identified novel essential genes and pathways, including EGFR in vemurafenib-treated A375 cells harboring a BRAF mutation. MAGeCK also detected cell type-specific essential genes, including BCR and ABL1, in KBM7 cells bearing a BCR-ABL fusion, and IGF1R in HL-60 cells, which depends on the insulin signaling pathway for proliferation. Electronic supplementary material: the online version of this article (doi:10.1186/s13059-014-0554-4) contains supplementary material, which is available to authorized users
LegumeIP: an integrative database for comparative genomics and transcriptomics of model legumes
Legumes play a vital role in maintaining the nitrogen cycle of the biosphere. They conduct symbiotic nitrogen fixation through endosymbiotic relationships with bacteria in root nodules. However, this and other characteristics of legumes, including mycorrhization, compound leaf development and profuse secondary metabolism, are absent in the typical model plant Arabidopsis thaliana. We present LegumeIP (http://plantgrn.noble.org/LegumeIP/), an integrative database for comparative genomics and transcriptomics of model legumes, for studying gene function and genome evolution in legumes. LegumeIP compiles gene and gene family information, syntenic and phylogenetic context and tissue-specific transcriptomic profiles. The database holds the genomic sequences of three model legumes, Medicago truncatula, Glycine max and Lotus japonicus plus two reference plant species, A. thaliana and Populus trichocarpa, with annotations based on UniProt, InterProScan, Gene Ontology and the Kyoto Encyclopedia of Genes and Genomes databases. LegumeIP also contains large-scale microarray and RNA-Seq-based gene expression data. Our new database is capable of systematic synteny analysis across M. truncatula, G. max, L. japonicus and A. thaliana, as well as construction and phylogenetic analysis of gene families across the five hosted species. Finally, LegumeIP provides comprehensive search and visualization tools that enable flexible queries based on gene annotation, gene family, synteny and relative gene expression
Microarray blob-defect removal improves array analysis
Motivation: New generation Affymetrix oligonucleotide microarrays often have blob-like image defects that will require investigators to either repeat their hybridization assays or analyze their data with the defects left in place. We investigated the effect of analyzing a spike-in experiment on Affymetrix ENCODE tiling arrays in the presence of simulated blobs covering between 1 and 9% of the array area. Using two different ChIP-chip tiling array analysis programs (Affymetrix Tiling Array Software TAS and Model-based Analysis of Tiling arrays MAT), we found that even the smallest blob defects significantly decreased the sensitivity and increased the false discovery rate (FDR) of the spike-in target prediction. Results: We introduced a new software tool, the Microarray Blob Remover (MBR), which allows rapid visualization, detection, and removal of various blob defects from the .CEL files of different types of Affymetrix microarrays. It is shown that using MBR significantly improves the sensitivity and FDR of a tiling array analysis compared to leaving the affected probes in the analysis. Availability: The MBR software and the sample array .CEL files used in this paper are available at
VIPER: Visualization Pipeline for RNA-seq, a Snakemake workflow for efficient and complete RNA-seq analysis
BACKGROUND: RNA sequencing has become a ubiquitous technology used throughout life sciences as an effective method of measuring RNA abundance quantitatively in tissues and cells. The increase in use of RNA-seq technology has led to the continuous development of new tools for every step of analysis from alignment to downstream pathway analysis. However, effectively using these analysis tools in a scalable and reproducible way can be challenging, especially for non-experts.
RESULTS: Using the workflow management system Snakemake, we have developed a user-friendly, fast, efficient, and comprehensive pipeline for RNA-seq analysis. VIPER (Visualization Pipeline for RNA-seq analysis) is an analysis workflow that combines some of the most popular tools to take RNA-seq analysis from raw sequencing data, through alignment and quality control, into downstream differential expression and pathway analysis. VIPER has been created in a modular fashion to allow for the rapid incorporation of new tools to expand the capabilities. This capacity has already been exploited to include very recently developed tools that explore immune infiltrate and T-cell CDR (Complementarity-Determining Regions) reconstruction abilities. The pipeline has been conveniently packaged such that minimal computational skills are required to download and install the dozens of software packages that VIPER uses.
CONCLUSIONS: VIPER is a comprehensive solution that performs most standard RNA-seq analyses quickly and effectively with a built-in capacity for customization and expansion
Sequence determinants of improved CRISPR sgRNA design
The CRISPR/Cas9 system has revolutionized mammalian somatic cell genetics. Genome-wide functional screens using CRISPR/Cas9-mediated knockout or dCas9 fusion-mediated inhibition/activation (CRISPRi/a) are powerful techniques for discovering phenotype-associated gene function. We systematically assessed the DNA sequence features that contribute to single guide RNA (sgRNA) efficiency in CRISPR-based screens. Leveraging the information from multiple designs, we derived a new sequence model for predicting sgRNA efficiency in CRISPR/Cas9 knockout experiments. Our model confirmed known features and suggested new features including a preference for cytosine at the cleavage site. The model was experimentally validated for sgRNA-mediated mutation rate and protein knockout efficiency. Tested on independent data sets, the model achieved significant results in both positive and negative selection conditions and outperformed existing models. We also found that the sequence preference for CRISPRi/a is substantially different from that for CRISPR/Cas9 knockout and propose a new model for predicting sgRNA efficiency in CRISPRi/a experiments. These results facilitate the genome-wide design of improved sgRNA for both knockout and CRISPRi/a studies