179 research outputs found

    Indexing Strategies for Rapid Searches of Short Words in Genome Sequences

    Searching for matches between large collections of short (14–30 nucleotides) words and sequence databases comprising full genomes or transcriptomes is a common task in biological sequence analysis. We investigated the performance of simple indexing strategies for handling such tasks and developed two programs, fetchGWI and tagger, that index either the database or the query set. Either strategy outperforms megablast for searches with more than 10,000 probes. FetchGWI is shown to be a versatile tool for rapidly searching multiple genomes; in most cases its performance is limited by the speed of access to the filesystem. We have made publicly available a Web interface for searching the human, mouse, and several other genomes and transcriptomes with oligonucleotide queries.
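
    The two strategies named in this abstract (index the database, or index the query set) can be illustrated with a minimal sketch. This is not the fetchGWI or tagger implementation; the function names, the toy sequence, and the 4-nucleotide word length are assumptions made for illustration, and the sketch handles only exact, forward-strand matches of equal-length words.

```python
from collections import defaultdict


def build_word_index(sequence, word_len):
    """Map every word of length word_len to its 0-based start positions."""
    index = defaultdict(list)
    for start in range(len(sequence) - word_len + 1):
        index[sequence[start:start + word_len]].append(start)
    return index


def search_by_database_index(db_index, queries):
    """Strategy 1: index the database once, then look up each query word."""
    return {q: db_index.get(q, []) for q in queries}


def search_by_query_index(sequence, queries, word_len):
    """Strategy 2: index the query set, then scan the database once."""
    hits = {q: [] for q in queries}
    for start in range(len(sequence) - word_len + 1):
        word = sequence[start:start + word_len]
        if word in hits:
            hits[word].append(start)
    return hits


# Hypothetical toy data; real queries would be 14-30 nucleotides long.
genome = "ACGTACGTTAGCACGT"
probes = ["ACGT", "TTAG", "GGGG"]

db_index = build_word_index(genome, 4)
by_db = search_by_database_index(db_index, probes)
by_query = search_by_query_index(genome, probes, 4)
print(by_db)
```

    Both functions return the same hits. The database index pays a one-time build cost and then answers each query with a single lookup, while the query index keeps memory proportional to the probe set and reads the database in one pass.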

    Exploring the synergies between cross compliance and certification schemes

    This report presents some of the interim results of the project 'Facilitating the CAP reform: Compliance and competitiveness of European agriculture'. It examines the similarities and differences between mandatory cross compliance standards and those set by voluntary certification schemes. There is a potential synergy between cross compliance and certification schemes, not least because both approaches set minimum standards and enforce those standards through inspection systems. Although there are some strong limitations, there is sufficient overlap in the standards set and in approaches to control to warrant further investigation of the potential for the harmonisation of standards and collaborative approaches to control.

    EuroDia: a beta-cell gene expression resource

    Type 2 diabetes mellitus (T2DM) is a major disease affecting nearly 280 million people worldwide. Whilst the pathophysiological mechanisms leading to disease are poorly understood, dysfunction of the insulin-producing pancreatic beta-cells is a key event in disease development. Monitoring the gene expression profiles of pancreatic beta-cells under several genetic or chemical perturbations has shed light on genes and pathways involved in T2DM. The EuroDia database has been established to build a unique collection of gene expression measurements performed on beta-cells of three organisms, namely human, mouse and rat. The Gene Expression Data Analysis Interface (GEDAI) has been developed to support this database. The quality of each dataset is assessed by a series of quality control procedures to detect putative hybridization outliers. The system integrates a web interface to several standard analysis functions from R/Bioconductor to identify differentially expressed genes and pathways. It also allows the combination of multiple experiments performed on different array platforms of the same technology. The design of this system enables each user to rapidly design a custom analysis pipeline and thus produce their own list of genes and pathways. Raw and normalized data can be downloaded for each experiment. The flexible engine of this database (GEDAI) is currently used to handle gene expression data from several laboratory-run projects dealing with different organisms and platforms.

    Consistency Analysis of Redundant Probe Sets on Affymetrix Three-Prime Expression Arrays and Applications to Differential mRNA Processing

    Affymetrix three-prime expression microarrays contain thousands of redundant probe sets that interrogate different regions of the same gene. Differential expression analysis methods rarely consider probe redundancy, which can lead to inaccurate inference about overall gene expression or cause investigators to overlook potentially valuable information about differential regulation of variant mRNA products. We investigated the behaviour and consistency of redundant probe sets in a publicly available data set containing samples from mouse brain amygdala and hippocampus and asked how applying filtering methods to the data affected consistency of results obtained from redundant probe sets. A genome-based filter that screens and groups probe sets according to their overlapping genomic alignments significantly improved redundant probe set consistency. Screening based on qualitative Present-Absent calls from MAS5 also improved consistency. However, even after applying these filters, many redundant probe sets showed significant fold-change differences relative to each other, suggesting differential regulation of alternative transcript production. Visual inspection of these loci using an interactive genome visualization tool (igb.bioviz.org) exposed thirty putative examples of differential regulation of alternative splicing or polyadenylation across brain regions in mouse. This work demonstrates how P/A-call and genome-based filtering can improve consistency among redundant probe sets while at the same time exposing possible differential regulation of RNA processing pathways across sample types.

    Transcriptome profile analysis of flowering molecular processes of early flowering trifoliate orange mutant and the wild-type [Poncirus trifoliata (L.) Raf.] by massively parallel signature sequencing

    Background: After several years in the juvenile phase, trees undergo flowering transition to become mature (florally competent) trees. This transition depends on the balanced expression of a complex network of genes that is regulated by both endogenous and environmental factors. However, relatively little is known about the molecular processes regulating flowering transition in woody plants compared with herbaceous plants. Results: Comparative transcript profiling of spring shoots after self-pruning was performed on a spontaneously early flowering trifoliate orange mutant (precocious trifoliate orange, Poncirus trifoliata) with a short juvenile phase and the wild-type (WT) tree by using massively parallel signature sequencing (MPSS). A total of 16,564,500 and 16,235,952 high quality reads were obtained for the WT and the mutant (MT), respectively. Interpretation of the MPSS signatures revealed that the total number of transcribed genes in the MT (31,468) was larger than in the WT (29,864), suggesting that newly initiated transcription occurs in the MT. Further comparison of the transcripts revealed that 2735 genes had more than twofold expression difference in the MT compared with the WT. In addition, we identified 110 citrus flowering-time genes homologous with known elements of flowering-time pathways through sequencing and bioinformatics analysis. These genes are highly conserved in citrus and other species, suggesting that the functions of the related proteins in controlling reproductive development may be conserved as well. Conclusion: Our results provide a foundation for comparative gene expression studies between WT and precocious trifoliate orange. Additionally, a number of candidate genes required for the early flowering process of precocious trifoliate orange were identified. These results provide new insight into the molecular processes regulating flowering time in citrus.
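
    The "more than twofold expression difference" criterion in this abstract can be sketched as follows. This is only an illustration under stated assumptions, not the authors' pipeline: the per-million normalization, the pseudocount, the gene names, and the per-gene counts are all hypothetical; only the two library sizes come from the abstract.

```python
def per_million(count, library_size):
    """Scale a raw signature count to counts per million reads."""
    return count * 1_000_000 / library_size


def twofold_genes(wt_counts, mt_counts, wt_total, mt_total, pseudocount=1.0):
    """Return {gene: MT/WT ratio} for genes differing more than twofold.

    The pseudocount (an assumption of this sketch) avoids division by
    zero for genes detected in only one library.
    """
    flagged = {}
    for gene in wt_counts.keys() | mt_counts.keys():
        wt = per_million(wt_counts.get(gene, 0), wt_total) + pseudocount
        mt = per_million(mt_counts.get(gene, 0), mt_total) + pseudocount
        ratio = mt / wt
        if ratio > 2 or ratio < 0.5:
            flagged[gene] = ratio
    return flagged


# Library sizes reported in the abstract (high quality reads).
WT_TOTAL = 16_564_500
MT_TOTAL = 16_235_952

# Hypothetical per-gene counts, invented purely for illustration.
wt = {"geneA": 1000, "geneB": 500, "geneC": 200}
mt = {"geneA": 1050, "geneB": 1600, "geneC": 40}

result = twofold_genes(wt, mt, WT_TOTAL, MT_TOTAL)
print(sorted(result))
```

    Normalizing by library size before taking the ratio matters because the two libraries differ slightly in depth; without it, a gene near the twofold boundary could be misclassified.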

    In silico prediction of cancer immunogens: current state of the art

    Cancer kills 8 million people annually worldwide. Although survival rates in prevalent cancers continue to increase, many cancers have no effective treatment, prompting the search for new and improved protocols. Immunotherapy is a new and exciting addition to the anti-cancer arsenal. The successful and accurate identification of aberrant host proteins acting as antigens for vaccination and immunotherapy is a key aspiration for both experimental and computational research. Here we describe key elements of in silico prediction, including databases of cancer antigens and bleeding-edge methodology for their prediction. We also highlight the role dendritic cell vaccines can play and how they can act as delivery mechanisms for epitope ensemble vaccines. Immunoinformatics can help streamline the discovery and utility of cancer immunogens.

    Alternative splicing enriched cDNA libraries identify breast cancer-associated transcripts

    Background: Alternative splicing (AS) is a central mechanism in the generation of genomic complexity and is a major contributor to transcriptome and proteome diversity. Alterations of the splicing process can lead to deregulation of crucial cellular processes and have been associated with a large spectrum of human diseases. Cancer-associated transcripts are potential molecular markers and may contribute to the development of more accurate diagnostic and prognostic methods and also serve as therapeutic targets. Alternative splicing-enriched cDNA libraries have been used to explore the variability generated by alternative splicing. In this study, by combining the use of trapping heteroduplexes and RNA amplification, we developed a powerful approach that enables transcriptome-wide exploration of the AS repertoire for identifying AS variants associated with breast tumor cells modulated by ERBB2 (HER-2/neu) oncogene expression. Results: The human breast cell line (C5.2) and a pool of 5 ERBB2 over-expressing breast tumor samples were used independently for the construction of two AS-enriched libraries. In total, 2,048 partial cDNA sequences were obtained, revealing 214 alternative splicing sequence-enriched tags (ASSETs). A subset of 79 multiple-exon ASSETs was compared to public databases, revealing 138 different AS events. A high success rate of RT-PCR validation (94.5%) was obtained, and 2 novel AS events were identified. The influence of ERBB2-mediated expression on AS regulation was evaluated by capillary electrophoresis and probe-ligation approaches in two mammary cell lines (Hb4a and C5.2) expressing different levels of ERBB2. The relative expression balance between AS variants from 3 genes was differentially modulated by ERBB2 in this model system. Conclusions: In this study, we presented a method for exploring AS from any RNA source in a transcriptome-wide format, which can be easily adapted to next-generation sequencers. We identified AS transcripts that were differently modulated by ERBB2-mediated expression and that can be tested as molecular markers for breast cancer. Such a methodology will be useful for completely deciphering the cancer cell transcriptome diversity resulting from AS and for finding more precise molecular markers.

    Developmental Transcriptomic Features of the Carcinogenic Liver Fluke, Clonorchis sinensis

    Clonorchis sinensis is the causative agent of a life-threatening disease endemic to China, Korea, and Vietnam. It is estimated that about 15 million people are infected with this fluke. C. sinensis provokes inflammation, epithelial hyperplasia, and periductal fibrosis in bile ducts, and may cause cholangiocarcinoma in chronically infected individuals. Accumulation of a large amount of biological information about the adult stage of this liver fluke in recent years has advanced our understanding of the pathological interplay between this parasite and its hosts. However, no developmental gene expression profiles of C. sinensis have been published. In this study, we generated gene expression profiles of three developmental stages of C. sinensis by analyzing expressed sequence tags (ESTs). Complementary DNA libraries were constructed from the adult, metacercaria, and egg developmental stages of C. sinensis. A total of 52,745 ESTs were generated and assembled into 12,830 C. sinensis assembled EST sequences, and then these assemblies were further categorized into groups according to biological functions and developmental stages. Most of the genes that were differentially expressed in the different stages were consistent with the biological and physical features of the particular developmental stage: high energy metabolism, motility and reproduction genes were differentially expressed in adults, minimal metabolism and final host adaptation genes were differentially expressed in metacercariae, and embryonic genes were differentially expressed in eggs. The higher expression of glucose transporters, proteases, and antioxidant enzymes in the adults accounts for active uptake of nutrients and defense against host immune attacks. The types of ion channels present in C. sinensis are consistent with its parasitic nature and phylogenetic placement in the tree of life. We anticipate that the transcriptomic information on essential regulators of development, bile chemotaxis, and physico-metabolic pathways in C. sinensis presented in this study will guide further studies to identify novel drug targets and diagnostic antigens.