
272 research outputs found

Model-based probe set optimization for high-performance microarrays

Author: Bernal
Bernhart
Blencowe
Bozdech
Brown
Carninci
Charbonnier
Chen
Chou
D. P. Kreil
Dudley
Fotin
G. G. Leparc
G. Striedner
Gao
Gordon
Griffith
Gunderson
Hofacker
Horak
Hu
I. L. Hofacker
K. Bayer
Kakuhata
Kane
Kreil
Lander
Lee
Li
Li
Li
Luebke
Marko
Mathews
Mrowka
Nadon
Nielsen
P. Sykacek
Pinkel
Rahmann
Ratushna
Relogio
Reymond
Rouillard
Saidi
SantaLucia
Santalucia
SantaLucia
T. Tuchler
Tolstrup
Wang
Wernersson
Xu
Yelin
Zuker
Publication venue: Oxford University Press
Publication date: 01/01/2009

A major challenge in microarray design is the selection of highly specific oligonucleotide probes for all targeted genes of interest, while maintaining thermodynamic uniformity at the hybridization temperature. We introduce a novel microarray design framework (Thermodynamic Model-based Oligo Design Optimizer, TherMODO) that for the first time incorporates a number of advanced modelling features: (i) A model of position-dependent labelling effects that is quantitatively derived from experiment. (ii) Multi-state thermodynamic hybridization models of probe binding behaviour, including potential cross-hybridization reactions. (iii) A fast calibrated sequence-similarity-based heuristic for cross-hybridization prediction supporting large-scale designs. (iv) A novel compound score formulation for the integrated assessment of multiple probe design objectives. In contrast to a greedy search for probes meeting parameter thresholds, this approach permits an optimization at the probe set level and facilitates the selection of highly specific probe candidates while maintaining probe set uniformity. (v) Lastly, a flexible target grouping structure allows easy adaptation of the pipeline to a variety of microarray application scenarios. The algorithm and features are discussed and demonstrated on actual design runs. Source code is available on request.

Crossref

PubMed Central

Permanent Hosting, Archiving and Indexing of Digital Resources and Assets

Warwick Research Archives Portal Repository

Non-Unique oligonucleotide probe selection heuristics

Author: Wang Lili
Publication venue: 'University of Windsor Leddy Library'
Publication date: 01/01/2008

The non-unique probe selection problem consists of selecting both unique and non-unique oligonucleotide probes for oligonucleotide microarrays, which are widely used tools to identify viruses or bacteria in biological samples. The non-unique probes, designed to hybridize to at least one target, are used as alternatives when the design of unique probes is particularly difficult for closely related target genes. The goal of the non-unique probe selection problem is to determine a smallest set of probes able to identify all targets present in a biological sample. This problem is known to be NP-hard. In this thesis, several novel heuristics are presented, based on a greedy strategy, genetic algorithms and an evolutionary strategy respectively, for the minimization problem arising from non-unique probe selection, using the best-known ILP formulation. Experimental results show that our methods are capable of reducing the number of probes required compared with state-of-the-art methods.
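
As a rough illustration of the greedy idea mentioned in this abstract, the following minimal sketch selects probes from a probe-to-target incidence table until every target is covered and every pair of targets is told apart by at least one probe. It is not the thesis's method: the function name `greedy_probe_selection`, the dictionary input format and the toy data are assumptions made for illustration, and the sketch handles only the simplified case of a single target being present in the sample.

```python
from itertools import combinations


def greedy_probe_selection(incidence):
    """Greedy sketch (hypothetical helper, not the thesis's algorithm).

    incidence maps a probe name to the set of targets it hybridizes to.
    Returns probes chosen so that, where possible, each target is hit by
    at least one probe and each pair of targets is separated by a probe
    that hits exactly one of the two.
    """
    targets = set().union(*incidence.values())
    # Requirements still open: cover every target, separate every pair.
    todo = {("cover", t) for t in targets}
    todo |= {("sep", a, b) for a, b in combinations(sorted(targets), 2)}

    def satisfied_by(probe):
        hits = incidence[probe]
        done = set()
        for req in todo:
            if req[0] == "cover":
                if req[1] in hits:
                    done.add(req)
            elif (req[1] in hits) != (req[2] in hits):
                done.add(req)
        return done

    chosen = []
    while todo:
        # Pick the probe meeting the most open requirements (ties: name order).
        best = max(sorted(incidence), key=lambda p: len(satisfied_by(p)))
        gain = satisfied_by(best)
        if not gain:
            break  # no probe can meet the remaining requirements
        chosen.append(best)
        todo -= gain
    return chosen


# Toy data (assumed): four candidate probes, three targets.
example = {
    "p1": {"t1", "t2"},
    "p2": {"t2", "t3"},
    "p3": {"t1"},
    "p4": {"t3"},
}
selected = greedy_probe_selection(example)
print(selected)  # ['p1', 'p2']
```

In the toy data the two non-unique probes suffice: t1 is hit only by p1, t3 only by p2, and t2 by both, so each target has a distinct hybridization signature. A greedy pass like this gives no optimality guarantee, which is consistent with the problem being NP-hard.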

Scholarship at UWindsor

Bayesian Optimization Algorithm for Non-unique Oligonucleotide Probe Selection

Author: Soltan Ghoraie Laleh
Publication venue: 'University of Windsor Leddy Library'
Publication date: 01/01/2009

One important application of DNA microarrays is measuring the expression levels of genes. The quality of the microarray design, which includes selecting short oligonucleotide sequences (probes) to be affixed to the surface of the microarray, becomes a major issue. A good design is one that contains the minimum possible number of probes while retaining an acceptable ability to identify the targets present in the sample. We focus on the problem of computing the minimal set of probes that is able to identify each target of a sample, referred to as Non-unique Oligonucleotide Probe Selection. We present the application of an Estimation of Distribution Algorithm named the Bayesian Optimization Algorithm (BOA) to this problem, and consider the integration of BOA with one simple heuristic. We also present the application of our method, integrated with a decoding approach in a multiobjective optimization framework, for solving the problem in the case of multiple targets in the sample.

Scholarship at UWindsor

Teolenn: an efficient and customizable workflow to design high-quality probes for microarray experiments

Author: Antoine Margeot
Aurélie Duclos
Bertone
Bozdech
Chao
Christian Brion
Graf
Hugues Mathis
Kane
Laurent Jourdren
Lemoine
Lemoine
Li
Li
Lipson
Manber
Markham
Martinez
Nordberg
Rouillard
SantaLucia
Schliep
Seidl
Shendure
Slater
Stein
Stéphane Le Crom
Thomas Portnoy
Tomiuk
Wernersson
Wold
Publication venue: Oxford University Press
Publication date: 01/06/2010

Despite the development of new high-throughput sequencing techniques, microarrays are still attractive tools to study small genome organisms, thanks to sample multiplexing and high-feature densities. However, the oligonucleotide design remains a delicate step for most users. A vast array of software is available to deal with this problem, but each program is developed with its own strategy, which makes the choice of the best solution difficult. Here we describe Teolenn, a universal probe design workflow developed with a flexible and customizable module organization allowing fixed or variable length oligonucleotide generation. In addition, our software is able to supply quality scores for each of the designed probes. In order to assess the relevance of these scores, we performed a real hybridization using a tiling array designed against the Trichoderma reesei fungus genome. We show that our scoring pipeline correlates with signal quality for 97.2% of all the designed probes, allowing for a posteriori comparisons between quality scores and signal intensities. This result is useful in discarding any bad scoring probes during the design step in order to get high-quality microarrays. Teolenn is available at http://transcriptome.ens.fr/teolenn/

CAD Tools for DNA Micro-Array Design, Manufacture and Application

Author: Hundewale Nisar
Publication venue: ScholarWorks @ Georgia State University
Publication date: 04/12/2006

Motivation: As the human genome project progresses and some microbial and eukaryotic genomes are recognized, numerous biotechnological processes have recently attracted an increasing number of biologists, bioengineers and computer scientists. Biotechnological processes profoundly involve the production and analysis of high-throughput experimental data. Numerous sequence libraries of DNA and protein structures of a large number of micro-organisms and a variety of other databases related to biology and chemistry are available. For example, microarray technology, a novel biotechnology, promises to monitor the whole genome at once, so that researchers can study the whole genome on the global level and have a better picture of the expressions among millions of genes simultaneously. Today, it is widely used in many fields: disease diagnosis, gene classification, gene regulatory networks, and drug discovery. For example, designing an organism-specific microarray and analyzing experimental data require combining heterogeneous computational tools that usually differ in data format, such as GeneMark for ORF extraction, Promide for DNA probe selection, Chip for probe placement on the microarray chip, BLAST to compare sequences, MEGA for phylogenetic analysis, and ClustalX for multiple alignments. Solution: Surprisingly enough, despite huge research efforts invested in DNA array applications, very few works are devoted to computer-aided optimization of DNA array design and manufacturing. Current design practices are dominated by ad-hoc heuristics incorporated in proprietary tools with unknown suboptimality. This will soon become a bottleneck for the new generation of high-density arrays, such as the ones currently being designed at Perlegen [109]. The goal of the already accomplished research was to develop highly scalable tools, with predictable runtime and quality, for cost-effective, computer-aided design and manufacturing of DNA probe arrays.
We illustrate the utility of our approach by taking a concrete example of combining the design tools of microarray technology for Herpes B virus DNA data.

ScholarWorks @ Georgia State University

Highly Scalable Algorithms for Robust String Barcoding

Author: DasGupta Bhaskar
Konwar Kishori M.
Mandoiu Ion I.
Shvartsman Alex A.
Publication date: 01/01/2005

String barcoding is a recently introduced technique for genomic-based identification of microorganisms. In this paper we describe the engineering of highly scalable algorithms for robust string barcoding. Our methods enable distinguisher selection based on whole genomic sequences of hundreds of microorganisms of up to bacterial size on a well-equipped workstation, and can be easily parallelized to further extend the applicability range to thousands of bacterial-size genomes. Experimental results on both randomly generated and NCBI genomic data show that whole-genome based selection results in a number of distinguishers nearly matching the information-theoretic lower bounds for the problem.

arXiv.org e-Print Archive

CiteSeerX

Crossref

OligoSpawn: a software tool for the design of overgo probes from large unigene datasets

Author: Close Timothy J
Jiang Tao
Lonardi Stefano
Madishetty Kavitha
Svensson Jan T
Zheng Jie
Publication venue: BioMed Central
Publication date: 01/01/2006

BACKGROUND: Expressed sequence tag (EST) datasets represent perhaps the largest collection of genetic information. ESTs can be exploited in a variety of biological experiments and analysis. Here we are interested in the design of overlapping oligonucleotide (overgo) probes from large unigene (EST-contigs) datasets. RESULTS: OLIGOSPAWN is a suite of software tools that offers two complementary services, namely (1) the selection of "unique" oligos each of which appears in one unigene but does not occur (exactly or approximately) in any other and (2) the selection of "popular" oligos each of which occurs (exactly or approximately) in as many unigenes as possible. In this paper, we describe the functionalities of OLIGOSPAWN and the computational methods it employs, and we report on experimental results for the overgo probes designed with it. CONCLUSION: The algorithms we designed are highly efficient and capable of processing unigene datasets of sizes on the order of several tens of Mb in a few hours on a regular PC. The software has been used to design overgo probes employed to screen a barley BAC library (Hordeum vulgare). OLIGOSPAWN is freely available at

CiteSeerX

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

eScholarship - University of California

An efficient algorithm for the stochastic simulation of the hybridization of DNA to microarrays

Author: A Halperin
A Pozhitkov
A Renyi
A Sharma
A Vainrub
A Vainrub
AC Eklund
AC Pease
AW Peterson
BJ Hong
C DeLisi
C Gadgil
CC Chou
D Colquhoun
D Yao
DA McQuarrie
DC Montgomery
DJ Lockhart
DR Dorris
DS Kong
DT Gillespie
DT Gillespie
DT Gillespie
DY Chiang
EF Nuwaysir
Erdem Arslan
F Fixe
F Li
G Bhanot
H Dai
Ian J Laurenzi
IG Darvey
IJ Laurenzi
IJ Laurenzi
IV Yang
J Peplies
J Quackenbush
J SantaLucia
J SantaLucia
JE Larkin
JM Cherry
JM Roulliard
JM Roulliard
K Nagino
L Onsager
L Poulsen
L Smith
L Zhang
LM Wick
M Andronescu
M Andronescu
M Chee
M Margulies
M Schena
M Smoluchowski
M Zuker
MA Gibson
MAQC Consortium
MS Shchepinov
N Cloonan
NO Hodas
P Hedge
P Wu
PK Tan
PMK Gordon
PT Spellman
R Shippy
R Versteeg
R Wernersson
RA Dimitrov
RC Tolman
RD Canales
RJ Cho
RJ Lipshutz
S Weckx
SP Fodor
SP Fodor
T Rayner
TJ Albert
W Etienne
X Cui
Y Gao
Y Zhang
Publication venue: BioMed Central
Publication date: 01/12/2009

Background: Although oligonucleotide microarray technology is ubiquitous in genomic research, reproducibility and standardization of expression measurements still concern many researchers. Cross-hybridization between microarray probes and non-target ssDNA has been implicated as a primary factor in sensitivity and selectivity loss. Since hybridization is a chemical process, it may be modeled at a population level using a combination of material balance equations and thermodynamics. However, the hybridization reaction network may be exceptionally large for commercial arrays, which often possess at least one reporter per transcript. Quantification of the kinetics and equilibrium of exceptionally large chemical systems of this type is numerically infeasible with customary approaches. Results: In this paper, we present a robust and computationally efficient algorithm for the simulation of hybridization processes underlying microarray assays. Our method may be utilized to identify the extent to which nucleic acid targets (e.g. cDNA) will cross-hybridize with probes, and by extension, characterize probe robustness using the information specified by MAGE-TAB. Using this algorithm, we characterize cross-hybridization in a modified commercial microarray assay. Conclusions: By integrating stochastic simulation with thermodynamic prediction tools for DNA hybridization, one may robustly and rapidly characterize the selectivity of a proposed microarray design at the probe and "system" levels. Our code is available at http://www.laurenzi.net.
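
To make the phrase "stochastic simulation" concrete, here is a minimal sketch of Gillespie's direct method applied to a single reversible binding reaction, probe + target <-> duplex. It is not the paper's algorithm, which targets very large reaction networks; the function name `gillespie_hybridization`, the molecule counts and the rate constants `k_on` and `k_off` are all assumptions chosen for illustration. In a thermodynamically informed model, the ratio of the two rate constants would be tied to a predicted binding free energy rather than set by hand.

```python
import random


def gillespie_hybridization(n_probe, n_target, k_on, k_off, t_end, seed=0):
    """Gillespie direct-method sketch for probe + target <-> duplex.

    All parameters are illustrative assumptions. Returns a list of
    (time, duplex_count) pairs, one per reaction event.
    """
    rng = random.Random(seed)
    t, duplex = 0.0, 0
    trajectory = [(t, duplex)]
    while True:
        # Propensities of the two reaction channels in the current state.
        a_bind = k_on * (n_probe - duplex) * (n_target - duplex)
        a_unbind = k_off * duplex
        a_total = a_bind + a_unbind
        if a_total == 0.0:
            break  # nothing can react
        # Waiting time to the next event is exponentially distributed.
        t += rng.expovariate(a_total)
        if t > t_end:
            break
        # Choose which reaction fires, in proportion to its propensity.
        if rng.random() * a_total < a_bind:
            duplex += 1
        else:
            duplex -= 1
        trajectory.append((t, duplex))
    return trajectory


# Hypothetical numbers: 50 probe molecules, 30 target molecules.
traj = gillespie_hybridization(50, 30, k_on=0.01, k_off=0.1, t_end=5.0, seed=42)
print(len(traj), traj[-1])
```

Because every event is simulated individually, the cost of this naive scheme grows with the number of reaction channels and events, which is the scaling problem the abstract refers to for arrays with very many probe-target pairs.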

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Efficient non-unique probes selection algorithms for DNA microarray

Author: Deng Ping
Ma Qingkai
Thai My T
Wu Weili
Publication venue: BioMed Central
Publication date: 01/01/2008

Crossref

Springer - Publisher Connector

PubMed Central

Improving the efficiency of Bayesian Network Based EDAs and their application in Bioinformatics

Author: Salehi Elham
Publication venue: 'University of Windsor Leddy Library'
Publication date: 01/01/2013

Estimation of distribution algorithms (EDAs) are a relatively new class of stochastic optimizers that have received a lot of attention during the last decade. In each generation, EDAs build probabilistic models of promising solutions of an optimization problem to guide the search process. New sets of solutions are obtained by sampling the corresponding probability distributions. Using this approach, EDAs are able to provide the user with a set of models that reveal the dependencies between variables of the optimization problems while solving them. In order to solve a complex problem, it is necessary to use a probabilistic model that is able to capture the dependencies. Bayesian networks are usually used for modeling multiple dependencies between variables. Learning Bayesian networks, especially for large problems with a high degree of dependency among their variables, is computationally very expensive, which makes it the bottleneck of EDAs. Therefore, introducing efficient Bayesian learning algorithms in EDAs seems necessary in order to use them for large problems. In this dissertation, after comparing several Bayesian network learning algorithms, we propose an algorithm, called CMSS-BOA, which uses a recently introduced heuristic called max-min parent children (MMPC) in order to constrain the model search space. This algorithm does not assume a fixed and small upper bound on the order of interaction between variables and is able to solve problems with large numbers of variables efficiently. We compare the efficiency of CMSS-BOA with the standard Bayesian network based EDA for solving several benchmark problems, and finally we use it to build a predictor for the glycation sites in mammalian proteins.
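
The generation loop described in this abstract (model the promising solutions, then sample new ones from the model) can be sketched with the simplest possible EDA. The example below uses independent per-bit probabilities, in the style of a univariate marginal distribution algorithm, on the OneMax toy problem (maximize the number of 1 bits). It is deliberately not CMSS-BOA and learns no Bayesian network, so it cannot capture dependencies between variables; the function name `umda_onemax` and all parameter values are assumptions for illustration.

```python
import random


def umda_onemax(n_bits=20, pop_size=60, n_select=20, generations=40, seed=1):
    """Univariate EDA sketch on OneMax (illustrative, not CMSS-BOA).

    Returns the best bit string seen and the final per-bit probabilities.
    """
    rng = random.Random(seed)
    probs = [0.5] * n_bits  # initial model: every bit is 1 with probability 0.5
    best = None
    for _ in range(generations):
        # Sample a new population from the current probabilistic model.
        population = [
            [1 if rng.random() < p else 0 for p in probs]
            for _ in range(pop_size)
        ]
        # Keep the most promising solutions (fitness = number of ones).
        population.sort(key=sum, reverse=True)
        selected = population[:n_select]
        if best is None or sum(selected[0]) > sum(best):
            best = selected[0]
        # Re-estimate the model from the selected solutions; clamping keeps
        # every probability away from 0 and 1 so diversity is not lost.
        probs = [
            min(0.95, max(0.05, sum(ind[i] for ind in selected) / n_select))
            for i in range(n_bits)
        ]
    return best, probs


best, probs = umda_onemax()
print(sum(best), len(best))
```

Replacing the independent per-bit model with a learned Bayesian network is what turns this scheme into a BOA-style algorithm, and that model-learning step is the expensive part the dissertation aims to speed up.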

Scholarship at UWindsor