    A novel framework for chimeric transcript detection based on accurate gene fusion model

    Next generation sequencing plays a key role in the detection of structural variations. Chimeric transcripts are relevant examples of such variations, as they are involved in several diseases. In this work, we propose an effective methodology for the detection of fused transcripts in RNA-Seq paired-end data. The proposed methodology is based on an accurate fusion model implemented by a set of filters that reduce the impact of artifacts. Moreover, the methodology accounts for transcripts that are consistently expressed in the sample under study even if they are not annotated. The effectiveness of the proposed solution has been experimentally validated on Chronic Myelogenous Leukemia (CML) samples, providing both the genes involved in the fusion and the exact chimeric sequence. © 2011 IEEE

    Combined aptamer and transcriptome sequencing of single cells.

    The transcriptome and proteome encode distinct information that is important for characterizing heterogeneous biological systems. We demonstrate a method to simultaneously characterize the transcriptomes and proteomes of single cells at high throughput using aptamer probes and droplet-based single cell sequencing. With our method, we differentiate distinct cell types based on aptamer surface binding and gene expression patterns. Aptamers provide advantages over antibodies for single cell protein characterization, including rapid, in vitro, and high-purity generation via SELEX, and the ability to amplify and detect them with PCR and sequencing.

    Cadherin-26 (CDH26) regulates airway epithelial cell cytoskeletal structure and polarity.

    Polarization of the airway epithelial cells (AECs) in the airway lumen is critical to the proper function of the mucociliary escalator and maintenance of lung health, but the cellular requirements for polarization of AECs are poorly understood. Using human AECs and cell lines, we demonstrate that cadherin-26 (CDH26) is abundantly expressed in differentiated AECs, localizes to the cell apices near ciliary membranes, and has functional cadherin domains with homotypic binding. We find a unique and non-redundant role for CDH26, previously uncharacterized in AECs, in regulation of cell-cell contact and cell integrity through maintaining cytoskeletal structures. Overexpression of CDH26 in cells with a fibroblastoid phenotype increases contact inhibition and promotes monolayer formation and cortical actin structures. CDH26 expression is also important for localization of planar cell polarity proteins. Knockdown of CDH26 in AECs results in loss of cortical actin and disruption of CRB3 and other proteins associated with apical polarity. Together, our findings uncover previously unrecognized functions for CDH26 in the maintenance of actin cytoskeleton and apicobasal polarity of AECs.

    A novel bioinformatics pipeline for identification and characterization of fusion transcripts in breast cancer and normal cell lines

    SnowShoes-FTD, developed for fusion transcript detection in paired-end mRNA-Seq data, employs multiple steps of false positive filtering to nominate fusion transcripts with near 100% confidence. Unique features include: (i) identification of multiple fusion isoforms from two gene partners; (ii) prediction of genomic rearrangements; (iii) identification of exon fusion boundaries; (iv) generation of a 5′–3′ fusion spanning sequence for PCR validation; and (v) prediction of the protein sequences, including frame shift and amino acid insertions. We applied SnowShoes-FTD to identify 50 fusion candidates in 22 breast cancer and 9 non-transformed cell lines. Five additional fusion candidates with two isoforms were confirmed. In all, 30 of 55 fusion candidates had in-frame protein products. No fusion transcripts were detected in non-transformed cells. Consideration of the possible functions of a subset of predicted fusion proteins suggests several potentially important functions in transformation, including a possible new mechanism for overexpression of ERBB2 in a HER-positive cell line. The source code of SnowShoes-FTD is provided in two formats: one configured to run on the Sun Grid Engine for parallelization, and the other formatted to run on a single LINUX node. Executables in PERL are available for download from our web site: http://mayoresearch.mayo.edu/mayo/research/biostat/stand-alone-packages.cfm

    Optimizing Splicing Junction Detection in Next Generation Sequencing Data on a Virtual-GRID Infrastructure

    The new protocol for sequencing the messenger RNA in a cell, named RNA-seq, produces millions of short sequence fragments. Next Generation Sequencing technology allows more accurate analysis but increases the need for computational resources. This paper describes the optimization of an RNA-seq analysis pipeline devoted to splicing variant detection, aimed at reducing computation time and providing a multi-user, multi-sample environment. This work makes two main contributions. First, we optimized a well-known algorithm called TopHat by parallelizing some sequential mapping steps. Second, we designed and implemented a hybrid virtual GRID infrastructure that efficiently executes multiple instances of TopHat running on different samples or on behalf of different users, thus optimizing the overall execution time and enabling a flexible multi-user environment.

    Identification of fusion genes in breast cancer by paired-end RNA-sequencing

    Background: Until recently, chromosomal translocations and fusion genes have been an underappreciated class of mutations in solid tumors. Next-generation sequencing technologies provide an opportunity for systematic characterization of cancer cell transcriptomes, including the discovery of expressed fusion genes resulting from underlying genomic rearrangements. Results: We applied paired-end RNA-seq to identify 24 novel and 3 previously known fusion genes in breast cancer cells. Supported by an improved bioinformatic approach, we had a 95% success rate of validating gene fusions initially detected by RNA-seq. Fusion partner genes were found to contribute promoters (5' UTR), coding sequences and 3' UTRs. Most fusion genes were associated with copy number transitions and were particularly common in high-level DNA amplifications. This suggests that fusion events may contribute to the selective advantage provided by DNA amplifications and deletions. Some of the fusion partner genes, such as GSDMB in the TATDN1-GSDMB fusion and IKZF3 in the VAPB-IKZF3 fusion, were only detected as a fusion transcript, indicating activation of a dormant gene by the fusion event. A number of fusion gene partners have either been previously observed in oncogenic gene fusions, mostly in leukemias, or otherwise reported to be oncogenic. RNA interference-mediated knock-down of the VAPB-IKZF3 fusion gene indicated that it may be necessary for cancer cell growth and survival. Conclusions: In summary, using RNA-sequencing and improved bioinformatic stratification, we have discovered a number of novel fusion genes in breast cancer, and identified VAPB-IKZF3 as a potential fusion gene with importance for the growth and survival of breast cancer cells.

    ChimeRScope: a novel alignment-free algorithm for fusion gene prediction using paired-end short reads

    Fusion genes result from the fusion of two or more genes and are typically generated by perturbations of genome structure in cancer cells. In turn, fusion genes can contribute to tumor formation and progression by promoting the expression of an oncogene, deregulating a tumor suppressor, or producing abnormal proteins with much higher activity. More importantly, oncogenic fusion genes are specifically expressed in tumor cells, which provides enormous diagnostic and therapeutic advantages for cancer treatment. With the development of next-generation sequencing (NGS) technology, RNA-Seq has become increasingly popular for transcriptomic studies because of its high sensitivity and its ability to detect novel transcripts, including fusion genes. To date, many fusion gene detection tools have been developed, most of which attempt to find reliable alignment evidence for chimeric transcripts in RNA-Seq data. It is well accepted that the alignment quality of sequencing reads against the reference genome is often limited when the genomes differ significantly, as is the case with cancer genomes, which contain many genomic perturbations and structural variations. Hence, regions where fusion genes occur in the cancer genome tend to differ substantially from those in the reference genome, which prevents alignment-based fusion gene detection methods from achieving good accuracy. We developed a tool called ChimeRScope. Being an alignment-free method, ChimeRScope bypasses the sequence alignment step by assessing gene fingerprint profiles (in the form of k-mers) from RNA-Seq paired-end reads for fusion gene prediction (Chapter Two). We also optimized the data structure and the ChimeRScope algorithms to overcome the limitations (memory utilization, low accuracy) commonly seen in alignment-free methods (Chapter Two).
    Results on simulated datasets, previously studied cancer RNA-Seq datasets, and experimental validations on in-house datasets show that ChimeRScope consistently performed better than popular alignment-based methods irrespective of the read length and depth of sequencing coverage (Chapter Three). ChimeRScope also generates graphical outputs that illustrate the fusion patterns. Lastly, we developed downloadable software for ChimeRScope and implemented an online data analysis server using the Galaxy platform (Chapter Four). ChimeRScope is available at https://github.com/ChimeRScope/ChimeRScope/
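The general idea of alignment-free fusion prediction from k-mer "fingerprints" can be illustrated with a toy sketch. This is not ChimeRScope's actual algorithm or data structure; the gene names, sequences, the value of `K`, the `min_hits` threshold, and every function name below are hypothetical, and the sketch ignores mate orientation (reverse complements) and sequencing errors. It indexes k-mers that occur in exactly one gene, assigns each mate of a read pair to the gene with the most fingerprint hits, and counts pairs whose mates land in different genes.

```python
from collections import Counter, defaultdict


def kmers(seq, k):
    """Yield every overlapping k-mer of seq."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def build_fingerprints(genes, k):
    """Map each k-mer that occurs in exactly one gene to that gene's name."""
    owners = defaultdict(set)
    for name, seq in genes.items():
        for kmer in kmers(seq, k):
            owners[kmer].add(name)
    return {kmer: next(iter(names))
            for kmer, names in owners.items() if len(names) == 1}


def assign_gene(read, fingerprints, k, min_hits=2):
    """Return the gene with the most fingerprint hits in read, or None."""
    votes = Counter(fingerprints[km] for km in kmers(read, k)
                    if km in fingerprints)
    if not votes:
        return None
    gene, hits = votes.most_common(1)[0]
    return gene if hits >= min_hits else None


def fusion_candidates(read_pairs, fingerprints, k):
    """Count read pairs whose two mates are assigned to different genes."""
    support = Counter()
    for left, right in read_pairs:
        a = assign_gene(left, fingerprints, k)
        b = assign_gene(right, fingerprints, k)
        if a and b and a != b:
            support[tuple(sorted((a, b)))] += 1
    return support


# Hypothetical toy data: two made-up genes and two read pairs.
K = 5
genes = {
    "GENE_A": "ACGTACGGTTCAGCTAAGGC",
    "GENE_B": "TTGACCATGGCAATCGTTAG",
}
read_pairs = [
    ("ACGTACGGTT", "GCAATCGTTAG"),  # mates from different genes
    ("ACGTACGGTT", "CAGCTAAGGC"),   # both mates from GENE_A
]

fingerprints = build_fingerprints(genes, K)
candidates = fusion_candidates(read_pairs, fingerprints, K)
print(dict(candidates))
```

Only the first read pair supports a candidate here, because its mates are assigned to GENE_A and GENE_B respectively. A real tool would additionally need to handle both strands, repeated or paralogous k-mers, and a memory-efficient index.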