28 research outputs found

    An efficient annotation and gene-expression derivation tool for Illumina Solexa datasets

    Background. An Illumina flow cell with all eight lanes occupied produces well over a terabyte of images, with gigabytes of reads following sequence alignment. The ability to translate such reads into meaningful annotation is therefore of great concern and importance. One can very easily be flooded with such a great volume of textual, unannotated data, irrespective of read quality or size. CASAVA, an optional analysis tool for Illumina sequencing experiments, provides INDEL detection, SNP information, and allele calling. Extracting from such analysis a measure of gene expression in the form of tag-counts, and furthermore annotating the reads, is therefore of significant value. Findings. We developed TASE (Tag counting and Analysis of Solexa Experiments), a rapid tag-counting and annotation software tool specifically designed for Illumina CASAVA sequencing datasets. Developed in Java and deployed using the jTDS JDBC driver and a SQL Server backend, TASE provides an extremely fast means of calculating gene expression through tag-counts while annotating sequenced reads with the gene's presumed function, from any given CASAVA-build. Such a build is generated for both DNA and RNA sequencing. Analysis is broken into two distinct components: DNA sequence or read concatenation, followed by tag-counting and annotation. The output contains the homology-based functional annotation and the respective gene expression measure, signifying how many times sequenced reads were found within the genomic ranges of functional annotations. Conclusions. TASE is a powerful tool to facilitate the process of annotating a given Illumina Solexa sequencing dataset. Our results indicate that both homology-based annotation and tag-count analysis are achieved in very efficient times, allowing researchers to delve deep into a given CASAVA-build and maximize information extraction from a sequencing dataset. TASE is specially designed to translate sequence data in a CASAVA-build into functional annotations while producing corresponding gene expression measurements. Such analysis is executed in an ultrafast and highly efficient manner, whether the experiment is single-read or paired-end. TASE is a user-friendly and freely available application, allowing rapid analysis and annotation of any given Illumina Solexa sequencing dataset with ease.
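
    The tag-counting idea in the abstract above (counting how many reads fall within the genomic range of each functional annotation) can be illustrated with a minimal Python sketch. This is not TASE's implementation, which the abstract describes as Java with a SQL Server backend; the function name `count_tags`, the tuple layout, the example coordinates, and the assumption of non-overlapping ranges on a single chromosome are all hypothetical.

```python
from bisect import bisect_right
from collections import Counter


def count_tags(annotations, read_positions):
    """Count reads per annotation label.

    annotations: list of (start, end, label) tuples with inclusive
        coordinates; assumed non-overlapping and on one chromosome.
    read_positions: iterable of integer read start positions.
    """
    ordered = sorted(annotations)
    starts = [start for start, _end, _label in ordered]
    counts = Counter()
    for pos in read_positions:
        # Find the last annotation whose start is <= pos.
        i = bisect_right(starts, pos) - 1
        if i >= 0 and pos <= ordered[i][1]:
            counts[ordered[i][2]] += 1
    return counts


# Hypothetical example data: two annotated ranges and six read positions.
annotations = [(100, 200, "kinase"), (300, 450, "transporter")]
reads = [105, 150, 199, 250, 300, 451]
tag_counts = count_tags(annotations, reads)
```

    Sorting the ranges once and using binary search keeps each read lookup logarithmic in the number of annotations; reads that fall outside every range (250 and 451 here) are simply not counted.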

    MAPT and PAICE: Tools for time series and single time point transcriptomics visualization and knowledge discovery

    With the advent of next-generation sequencing, -omics fields such as transcriptomics have experienced increases in data throughput of orders of magnitude. In terms of analyzing and visually representing these huge datasets, an intuitive and computationally tractable approach is to map quantified transcript expression onto biochemical pathways while employing data-mining and visualization principles to accelerate knowledge discovery. We present two cross-platform tools, MAPT (Mapping and Analysis of Pathways through Time) and PAICE (Pathway Analysis and Integrated Coloring of Experiments), which together form an easy-to-use analysis suite to facilitate time series and single time point transcriptomics analysis. In unison, MAPT and PAICE serve as a visual workbench for transcriptomics knowledge discovery, data-mining and functional annotation. PAICE and MAPT are distinct yet inextricably linked tools. The former is specifically designed to map EC accessions onto KEGG pathways while handling multiple gene copies, detection-call analysis, and annotated or unannotated EC accessions lacking quantifiable expression. The latter tool integrates PAICE datasets to drive visualization, annotation, and data-mining.

    Population-specific gene expression in the plant pathogenic nematode Heterodera glycines exists prior to infection and during the onset of a resistant or susceptible reaction in the roots of the Glycine max genotype Peking

    Background. A single Glycine max (soybean) genotype (Peking) reacts differently to two different populations of Heterodera glycines (soybean cyst nematode) within the first twelve hours of infection during resistant (R) and susceptible (S) reactions. This suggested that H. glycines has population-specific gene expression signatures. A microarray analysis of 7539 probe sets representing 7431 transcripts on the Affymetrix® soybean GeneChip® was used to identify population-specific gene expression signatures in pre-infective second stage larvae (pi-L2) prior to their infection of Peking. Other analyses focused on the infective L2 at 12 hours post infection (i-L2_12h), and the infective sedentary stages at 3 days post infection (i-L2_3d) and 8 days post infection (i-L2/L3_8d). Results. Differential expression and false discovery rate (FDR) analyses comparing populations of pi-L2 (i.e., the incompatible population, NL1-RHg, to the compatible population, TN8) identified 71 genes that were induced in NL1-RHg as compared to TN8. These genes included putative gland protein G23G12, putative esophageal gland protein Hgg-20 and arginine kinase. The comparative analysis of pi-L2 identified 44 genes that were suppressed in NL1-RHg as compared to TN8. These genes included a different Hgg-20 gene, an EXPB1 protein and a cuticular collagen. By 12 h, there were 7 induced genes and 0 suppressed genes in NL1-RHg. By 3 d, there were 9 induced and 10 suppressed genes in NL1-RHg. Substantial changes in gene expression became evident subsequently. At 8 d there were 13 induced genes in NL1-RHg. These included putative gland protein G20E03, ubiquitin extension protein, putative gland protein G30C02 and β-1,4 endoglucanase. However, 1668 genes were found to be suppressed in NL1-RHg. These genes included steroid alpha reductase, serine proteinase and a collagen protein. Conclusion. These analyses identify a genetic expression signature for these two populations both prior to and subsequently as they undergo an R or S reaction. The identification of genes like steroid alpha reductase and serine proteinase, which are involved in feeding and nutritional uptake, as being highly suppressed during the R response at 8 d may indicate genes that the plant is targeting. The analyses also identified numerous putative parasitism genes that are differentially expressed. The 1668 genes that are suppressed in NL1-RHg, and hence induced in TN8, may represent genes that are important during the parasitic stages of H. glycines development. The potential for different arrays of putative parasitism genes to be expressed in different nematode populations may indicate how H. glycines evolves mechanisms to overcome resistance.

    Re-annotation of the woodland strawberry (Fragaria vesca) genome

    Fragaria vesca is a low-growing, small-fruited diploid strawberry species commonly called woodland strawberry. It is native to temperate regions of Eurasia and North America and, while it produces edible fruits, it is most useful as an experimental perennial plant system that can serve as a model for the agriculturally important Rosaceae family. A draft of the F. vesca genome sequence was published in 2011 [Nat Genet 43:223, 2011]. The first generation annotation (version 1.1) was developed using GeneMark-ES+ [Nuc Acids Res 33:6494, 2005], which is a self-training gene prediction tool that relies primarily on the combination of ab initio predictions with mapping high confidence ESTs in addition to mapping gene deserts from transposable elements. Based on over 25 different tissue transcriptomes, we have revised the F. vesca genome annotation, thereby providing several improvements over version 1.1. The new annotation, which was achieved using Maker, describes many more predicted protein coding genes compared to the GeneMark generated annotation that is currently hosted at the Genome Database for Rosaceae (http://www.rosaceae.org/). Our new annotation also results in an increase in the overall total coding length and in the number of coding regions found. The total number of gene predictions that do not overlap with the previous annotations is 2286, most of which were found to be homologous to other plant genes. We have experimentally verified one of the new gene model predictions to validate our results. Using the RNA-Seq transcriptome sequences from 25 diverse tissue types, the re-annotation pipeline improved existing annotations by increasing the annotation accuracy based on extensive transcriptome data. It uncovered new genes, added exons to current genes, and extended or merged exons. This complete genome re-annotation will significantly benefit functional genomic studies of the strawberry and other members of the Rosaceae. https://doi.org/10.1186/s12864-015-1221-

    Microarray Detection Call Methodology as a Means to Identify and Compare Transcripts Expressed within Syncytial Cells from Soybean (Glycine max) Roots Undergoing Resistant and Susceptible Reactions to the Soybean Cyst Nematode (Heterodera glycines)

    Background. A comparative microarray investigation was done using detection call methodology (DCM) and differential expression analyses. The goal was to identify genes found in specific cell populations that were eliminated by differential expression analysis due to the nature of differential expression methods. Laser capture microdissection (LCM) was used to isolate nearly homogeneous populations of plant root cells. Results. The analyses identified the presence of 13,291 transcripts between the 4 different sample types. The transcripts filtered down into a total of 6,267 that were detected as being present in one or more sample types. A comparative analysis of DCM and differential expression methods showed a group of genes that were not differentially expressed, but were expressed at detectable amounts within specific cell types. Conclusion. The DCM has identified patterns of gene expression not shown by differential expression analyses. DCM has identified genes that are possibly cell-type specific and/or involved in important aspects of plant nematode interactions during the resistance response, revealing the uniqueness of a particular cell population at a particular point during its differentiation process
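
    The comparison described in the abstract above, between transcripts detected as present and transcripts flagged by differential expression, can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' pipeline: the "P"/"A" call encoding, the function name `detected_only`, and the transcript and sample-type names are hypothetical.

```python
def detected_only(calls, differentially_expressed):
    """Return transcripts that are present in at least one sample type
    but are absent from the differentially expressed set.

    calls: dict mapping transcript -> {sample_type: "P" or "A"},
        where "P" means detected as present and "A" means absent.
    differentially_expressed: set of transcript identifiers.
    """
    result = {}
    for transcript, by_sample in calls.items():
        present = sorted(s for s, call in by_sample.items() if call == "P")
        if present and transcript not in differentially_expressed:
            result[transcript] = present
    return result


# Hypothetical detection calls for three transcripts in two sample types.
calls = {
    "t1": {"R_3d": "P", "S_3d": "P"},
    "t2": {"R_3d": "P", "S_3d": "A"},
    "t3": {"R_3d": "A", "S_3d": "A"},
}
differentially_expressed = {"t1"}
dcm_only = detected_only(calls, differentially_expressed)
```

    In this toy data only `t2` is returned: it is detected in one sample type yet would be missed by an analysis restricted to the differentially expressed set.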

    Ascaris suum: cDNA microarray analysis of 4th stage larvae (L4) during self-cure from the intestine

    There is spontaneous cure of a large portion of Ascaris suum 4th-stage larvae (L4) from the jejunum of infected pigs between 14 and 21 days after inoculation (DAI). Those L4 that remain in the jejunum continue to develop while those that have moved to the ileum are eventually expelled from the intestines. Although increases in intestinal mucosal mast cells and changes in local host immunity are coincidental with spontaneous cure, the population of L4 that continue to develop in the jejunum may counteract host protective mechanisms by the differential production of factors related to parasitism. To this end, a cDNA library was constructed from L4 isolated from pig jejunum at 21 DAI, and 93% of 1920 original clones containing a single amplicon in the range 400–1500 bp were verified by gel electrophoresis and printed onto glass slides for microarray analysis. Fluorescent probes were prepared from total RNA isolated from: (1) 3rd-stage larvae from lung at 7 DAI (L3); (2) L4 from jejunum at 14 DAI (L4-14-J); (3) L4 from jejunum at 21 DAI (L4-21-J); (4) L4 from ileum at 21 DAI (L4-21-I); and (5) adults (L5). Cy3-labeled L3, L4-14-J, L4-21-I and L5 cDNA, and Cy5-labeled L4-21-J cDNA were simultaneously used to screen the printed arrays containing the L4-21-J-derived cDNA library. Several clones showed consistent differential gene expression over two separate experiments and were grouped into 3 distinct transcription patterns. The data showed that sequences from muscle actin and myosin, ribosomal protein L11, glyceraldehyde-3-phosphate dehydrogenase and the flavoprotein subunit of succinate dehydrogenase were highly expressed in L4-21-J, but not in L4-21-I, as were a collection of unannotated genes derived from a worm body wall-hypodermis library and a testes germinal zone tissue library. These results suggest that only actively developing A. suum L4 are destined to parasitize the host and successfully neutralize host protective responses.

    BBGD: an online database for blueberry genomic data

    BACKGROUND: Blueberry is a member of the Ericaceae family, which also includes closely related cranberry and more distantly related rhododendron, azalea, and mountain laurel. Blueberry is a major berry crop in the United States, and one that has great nutritional and economic value. Extreme low temperatures, however, reduce crop yield and cause major losses to US farmers. A better understanding of the genes and biochemical pathways that are up- or down-regulated during cold acclimation is needed to produce blueberry cultivars with enhanced cold hardiness. To that end, the blueberry genomics database (BBGD) was developed. Along with the analysis tools and web-based query interfaces, the database serves both the broader Ericaceae research community and the blueberry research community specifically by making available ESTs and gene expression data in searchable formats and by elucidating the underlying mechanisms of cold acclimation and freeze tolerance in blueberry. DESCRIPTION: BBGD is the world's first database for blueberry genomics. BBGD is both a sequence and gene expression database. It stores both EST and microarray data and allows scientists to correlate expression profiles with gene function. BBGD is a public online database. Presently, the main focus of the database is the identification of genes in blueberry that are significantly induced or suppressed after low temperature exposure. CONCLUSION: By using the database, researchers have developed EST-based markers for mapping and have identified a number of "candidate" cold tolerance genes that are highly expressed in blueberry flower buds after exposure to low temperatures.

    SGR: an online genomic resource for the woodland strawberry

    Fragaria vesca, a diploid strawberry species commonly known as the alpine or woodland strawberry, is a versatile experimental plant system and an emerging model for the Rosaceae family. An ancestral F. vesca genome contributed to the genome of the octoploid dessert strawberry (F. ×ananassa), and the extant genome exhibits synteny with other commercially important members of the Rosaceae family such as apple and peach. To provide a molecular description of floral organ and fruit development at the resolution of specific tissues and cell types, RNAs from flowers and early developmental stage fruit tissues of the inbred F. vesca line YW5AF7 were extracted and the resulting cDNA libraries sequenced using an Illumina HiSeq2000. To enable easy access as well as mining of this two-dimensional (stage and tissue) transcriptome dataset, a web-based database, the Strawberry Genomic Resource (SGR), was developed. SGR is a web-accessible database that contains sample description, sample statistics, gene annotation, and gene expression analysis. This information can be accessed publicly from a web-based interface at http://bioinformatics.towson.edu/strawberry/Default.aspx . The SGR website provides user-friendly search and browse capabilities for all the data stored in the database. Users are able to search for genes using a gene ID or description, or obtain differentially expressed genes by entering different comparison parameters. Search results can be downloaded in a tabular format compatible with Microsoft Excel. Reads aligned to individual genes and exon/intron structures are displayed using the genome browser, facilitating gene re-annotation by individual users. The SGR database was developed to facilitate dissemination and data mining of extensive floral and fruit transcriptome data in the woodland strawberry. It enables users to mine the data in different ways to study different pathways or biological processes during reproductive development. https://doi.org/10.1186/1471-2229-13-22

    Analysis of Gene expression in soybean (Glycine max) roots in response to the root knot nematode Meloidogyne incognita using microarrays and KEGG pathways

    Background. Root-knot nematodes are sedentary endoparasites that can infect more than 3000 plant species. Root-knot nematodes cause an estimated $100 billion annual loss worldwide. To establish itself successfully in its host plant, the root-knot nematode causes dramatic morphological and physiological changes in plant cells. The expression of some plant genes is altered by the nematode as it establishes its feeding site. Results. We examined the expression of soybean (Glycine max) genes in galls formed in roots by the root-knot nematode, Meloidogyne incognita, 12 days and 10 weeks after infection to understand the effects of infection of roots by M. incognita. Gene expression was monitored using the Affymetrix Soybean GeneChip containing 37,500 G. max probe sets. Gene expression patterns were integrated with biochemical pathways from the Kyoto Encyclopedia of Genes and Genomes using PAICE software. Genes encoding enzymes involved in carbohydrate and cell wall metabolism, cell cycle control and plant defense were altered. Conclusions. A number of different soybean genes were identified as differentially expressed, which provided insights into the interaction between M. incognita and soybean and into the formation and maintenance of giant cells. Some of these genes may be candidates for broadening plant resistance to root-knot nematode through over-expression or silencing and require further examination.
