1,257 research outputs found

    DNA microarray experimental design and software based data normalization and analysis

    [no abstract]

    Genotyping on custom arrays using a parallel data pipeline

    In the past, genotyping (determining a set of alleles in an organism) has been an extremely challenging process. The time, monetary, and technology demands of the task have limited genotype data to a small variety of scientific model organisms with the capacity to conduct genetic crosses. New sequencing technology from companies such as NimbleGen, however, can generate custom organism-specific microarrays at relatively low cost. The combination of these arrays and the knowledge of species' genome-wide SNPs allows genotype experiments, such as generation maps, QTL studies, and natural population variation studies, to be conducted on virtually any organism. Although the NimbleGen technology can create appropriate DNA information, there has been no software that can use this data for custom array-based genotyping. This thesis describes a data pipeline that uses custom DNA microarrays to genotype organisms. The pipeline simplifies the genotyping process, and users can easily customize and run the tool. The pipeline's performance is improved by exploiting parallel aspects of the microarray data, which reduces the genotyping process from days and weeks to minutes and hours. We demonstrate that the pipeline is an effective tool for genotyping custom microarrays across a large number of loci, and describe the effects of user-controlled parameters

    Expression profiling of the schizont and trophozoite stages of Plasmodium falciparum with a long-oligonucleotide microarray

    BACKGROUND: The worldwide persistence of drug-resistant Plasmodium falciparum, the most lethal variety of human malaria, is a global health concern. The P. falciparum sequencing project has brought new opportunities for identifying molecular targets for antimalarial drug and vaccine development. RESULTS: We developed a software package, ArrayOligoSelector, to design an open reading frame (ORF)-specific DNA microarray using the publicly available P. falciparum genome sequence. Each gene was represented by one or more long 70 mer oligonucleotides selected on the basis of uniqueness within the genome, exclusion of low-complexity sequence, balanced base composition and proximity to the 3' end. A first-generation microarray representing approximately 6,000 ORFs of the P. falciparum genome was constructed. Array performance was evaluated through the use of control oligonucleotide sets with increasing levels of introduced mutations, as well as traditional northern blotting. Using this array, we extensively characterized the gene-expression profile of the intraerythrocytic trophozoite and schizont stages of P. falciparum. The results revealed extensive transcriptional regulation of genes specialized for processes specific to these two stages. CONCLUSIONS: DNA microarrays based on long oligonucleotides are powerful tools for the functional annotation and exploration of the P. falciparum genome. Expression profiling of trophozoites and schizonts revealed genes associated with stage-specific processes and may serve as the basis for future drug targets and vaccine development
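The selection criteria listed in this abstract (balanced base composition, exclusion of low-complexity sequence, proximity to the 3' end) can be illustrated with a minimal sketch. This is not ArrayOligoSelector's actual algorithm: the function names, the GC window, and the homopolymer-run cutoff are all hypothetical, and the genome-wide uniqueness check is omitted because it would need a sequence search against the whole genome.

```python
from typing import Optional, Tuple


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in the sequence."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def has_long_homopolymer(seq: str, max_run: int) -> bool:
    """Crude low-complexity test: True if any single base repeats more than max_run times in a row."""
    run = 1
    for prev, cur in zip(seq, seq[1:]):
        run = run + 1 if cur == prev else 1
        if run > max_run:
            return True
    return False


def pick_oligo(
    orf: str,
    length: int = 70,                           # oligo length (70-mers in the abstract)
    gc_range: Tuple[float, float] = (0.2, 0.5),  # hypothetical "balanced composition" window
    max_run: int = 6,                           # hypothetical homopolymer cutoff
) -> Optional[Tuple[int, str]]:
    """Return (start, oligo) for the passing candidate closest to the 3' end, or None."""
    low, high = gc_range
    # Scan windows starting at the 3' end and moving toward the 5' end,
    # so the first candidate that passes is the one nearest the 3' end.
    for start in range(len(orf) - length, -1, -1):
        candidate = orf[start:start + length]
        if low <= gc_fraction(candidate) <= high and not has_long_homopolymer(candidate, max_run):
            return start, candidate
    return None
```

Scanning from the 3' end keeps the sketch short; a real tool would more likely score every window on all criteria and rank them.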

    Highly sensitive and specific microRNA expression profiling using BeadArray technology

    We have developed a highly sensitive, specific and reproducible method for microRNA (miRNA) expression profiling, using the BeadArray™ technology. This method incorporates an enzyme-assisted specificity step, a solid-phase primer extension to distinguish between members of miRNA families. In addition, a universal PCR is used to amplify all targets prior to array hybridization. Currently, assay probes are designed to simultaneously analyse 735 well-annotated human miRNAs. Using this method, highly reproducible miRNA expression profiles were generated with 100–200 ng total RNA input. Furthermore, very similar expression profiles were obtained with total RNA and enriched small RNA species (R² ≥ 0.97). The method has a 3.5–4 log (10⁵–10⁹ molecules) dynamic range and is able to detect 1.2- to 1.3-fold differences between samples. Expression profiles generated by this method are highly comparable to those obtained with RT–PCR (R² = 0.85–0.90) and direct sequencing (R = 0.87–0.89). This method, in conjunction with the 96-sample array matrix, should prove useful for high-throughput expression profiling of miRNAs in large numbers of tissue samples

    Analysis of gene expression data using Expressionist 3.1 and GeneSpring 4.2

    The purpose of this study was to determine the differences in the gene expression analysis methods of two data mining tools, Expressionist™ 3.1 and GeneSpring™ 4.2, with a focus on basic statistical analysis and clustering algorithms. The data for this analysis was derived from the hybridization of Rattus norvegicus RNA to the Affymetrix RG34A GeneChip. This analysis was derived from experiments designed to identify changes in gene expression patterns that were induced in vivo by an experimental treatment. The tools were found to be comparable with respect to the list of statistically significant genes that were up-regulated by more than twofold. Approximately 78% of this gene list was present in both tools. Expressionist™ 3.1 was capable of representing the different linkage methods of hierarchical clustering as average, complete and single, whereas in GeneSpring™ 4.2, the user could manipulate the separation ratio and minimum distance of the hierarchical tree
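For readers unfamiliar with the three linkage methods named above, the sketch below shows how each one measures the distance between two clusters of expression profiles. It uses the standard textbook definitions and makes no claim about how either tool implements them; the function name and the sample points are illustrative assumptions.

```python
import math
from itertools import product
from typing import Sequence, Tuple

Point = Tuple[float, ...]


def cluster_distance(a: Sequence[Point], b: Sequence[Point], linkage: str = "average") -> float:
    """Distance between clusters a and b under single, complete, or average linkage."""
    # All pairwise Euclidean distances between members of the two clusters.
    pairwise = [math.dist(x, y) for x, y in product(a, b)]
    if linkage == "single":    # closest pair of members
        return min(pairwise)
    if linkage == "complete":  # farthest pair of members
        return max(pairwise)
    if linkage == "average":   # mean over all pairs
        return sum(pairwise) / len(pairwise)
    raise ValueError(f"unknown linkage: {linkage!r}")


# Two small clusters of hypothetical two-dimensional expression profiles.
cluster_a = [(0.0, 0.0), (0.0, 1.0)]
cluster_b = [(3.0, 0.0), (4.0, 0.0)]
for method in ("single", "average", "complete"):
    print(method, round(cluster_distance(cluster_a, cluster_b, method), 3))
```

For any pair of clusters, the single-linkage distance is never larger than the average, which is never larger than the complete-linkage distance. This is why the same data can produce differently shaped trees depending on the method.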

    A multivariate prediction model for microarray cross-hybridization

    BACKGROUND: Expression microarray analysis is one of the most popular molecular diagnostic techniques in the post-genomic era. However, this technique faces the fundamental problem of potential cross-hybridization. This is a pervasive problem for both oligonucleotide and cDNA microarrays; it is considered particularly problematic for the latter. No comprehensive multivariate predictive modeling has been performed to understand how multiple variables contribute to (cross-) hybridization. RESULTS: We propose a systematic search strategy using multiple multivariate models [multiple linear regressions, regression trees, and artificial neural network analyses (ANNs)] to select an effective set of predictors for hybridization. We validate this approach on a set of DNA microarrays with cytochrome p450 family genes. The performance of our multiple multivariate models is compared with that of a recently proposed third-order polynomial regression method that uses percent identity as the sole predictor. All multivariate models agree that the 'most contiguous base pairs between probe and target sequences,' rather than percent identity, is the best univariate predictor. The predictive power is improved by inclusion of additional nonlinear effects, in particular target GC content, when regression trees or ANNs are used. CONCLUSION: A systematic multivariate approach is provided to assess the importance of multiple sequence features for hybridization and of relationships among these features. This approach can easily be applied to larger datasets. This will allow future developments of generalized hybridization models that will be able to correct for false-positive cross-hybridization signals in expression experiments
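The two univariate predictors compared in this abstract, percent identity and the longest contiguous matching stretch, can be contrasted with a short sketch. It assumes the probe and target are already aligned without gaps, on the same strand and at equal length. That is a simplification made here, not the study's procedure, and the function names are hypothetical.

```python
def percent_identity(probe: str, target: str) -> float:
    """Percentage of aligned positions at which probe and target carry the same base."""
    if len(probe) != len(target) or not probe:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(p == t for p, t in zip(probe, target))
    return 100.0 * matches / len(probe)


def longest_contiguous_match(probe: str, target: str) -> int:
    """Length of the longest run of consecutive matching positions."""
    if len(probe) != len(target):
        raise ValueError("sequences must be of equal length")
    best = run = 0
    for p, t in zip(probe, target):
        run = run + 1 if p == t else 0  # a mismatch resets the current run
        best = max(best, run)
    return best


probe = "ACGTACGTAC"
clustered = "ACGTACGTTT"   # both mismatches at the end
scattered = "ACTTACCTAC"   # mismatches spread along the sequence
for target in (clustered, scattered):
    print(percent_identity(probe, target), longest_contiguous_match(probe, target))
```

Both targets share 80% identity with the probe, yet their longest matching runs are 8 and 3 bases. The example shows only that the two measures can diverge; it says nothing about which one predicts hybridization better.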

    CAD Tools for DNA Micro-Array Design, Manufacture and Application

    Motivation: As the human genome project progresses and some microbial and eukaryotic genomes are recognized, numerous biotechnological processes have recently attracted an increasing number of biologists, bioengineers and computer scientists. Biotechnological processes profoundly involve production and analysis of high-throughput experimental data. Numerous sequence libraries of DNA and protein structures of a large number of micro-organisms and a variety of other databases related to biology and chemistry are available. For example, microarray technology, a novel biotechnology, promises to monitor the whole genome at once, so that researchers can study the whole genome on the global level and have a better picture of the expressions among millions of genes simultaneously. Today, it is widely used in many fields: disease diagnosis, gene classification, gene regulatory networks, and drug discovery. For example, designing an organism-specific microarray and analyzing experimental data require combining heterogeneous computational tools that usually differ in data format, such as GeneMark for ORF extraction, Promide for DNA probe selection, Chip for probe placement on the microarray chip, BLAST to compare sequences, MEGA for phylogenetic analysis, and ClustalX for multiple alignments. Solution: Surprisingly enough, despite huge research efforts invested in DNA array applications, very few works are devoted to computer-aided optimization of DNA array design and manufacturing. Current design practices are dominated by ad-hoc heuristics incorporated in proprietary tools with unknown suboptimality. This will soon become a bottleneck for the new generation of high-density arrays, such as the ones currently being designed at Perlegen [109]. The goal of the already accomplished research was to develop highly scalable tools, with predictable runtime and quality, for cost-effective, computer-aided design and manufacturing of DNA probe arrays. We illustrate the utility of our approach by taking a concrete example of combining the design tools of microarray technology for Herpes B virus DNA data

    Transcript Specificity in Yeast Pre-mRNA Splicing Revealed by Mutations in Core Spliceosomal Components

    Appropriate expression of most eukaryotic genes requires the removal of introns from their pre–messenger RNAs (pre-mRNAs), a process catalyzed by the spliceosome. In higher eukaryotes a large family of auxiliary factors known as SR proteins can improve the splicing efficiency of transcripts containing suboptimal splice sites by interacting with distinct sequences present in those pre-mRNAs. The yeast Saccharomyces cerevisiae lacks functional equivalents of most of these factors; thus, it has been unclear whether the spliceosome could effectively distinguish among transcripts. To address this question, we have used a microarray-based approach to examine the effects of mutations in 18 highly conserved core components of the spliceosomal machinery. The kinetic profiles reveal clear differences in the splicing defects of particular pre-mRNA substrates. Most notably, the behaviors of ribosomal protein gene transcripts are generally distinct from other intron-containing transcripts in response to several spliceosomal mutations. However, dramatically different behaviors can be seen for some pairs of transcripts encoding ribosomal protein gene paralogs, suggesting that the spliceosome can readily distinguish between otherwise highly similar pre-mRNAs. The ability of the spliceosome to distinguish among its different substrates may therefore offer an important opportunity for yeast to regulate gene expression in a transcript-dependent fashion. Given the high level of conservation of core spliceosomal components across eukaryotes, we expect that these results will significantly impact our understanding of how regulated splicing is controlled in higher eukaryotes as well