190 research outputs found

    A comparative analysis of existing oligonucleotides selection algorithms for microarray technology

    In systems biology, DNA microarray technology is an indispensable tool for biological analysis at the level of the whole genome. Among the sophisticated analytical problems in microarray technology, at the front and back ends respectively, are the selection of optimal DNA oligonucleotides (henceforth oligos) and the computational analysis of gene expression data. A computational comparative analysis of the methods used to select oligos is important because the design and quality of the microarray probes are critical for the hybridization experiments and for the subsequent analysis of the data. To support efficient and effective design at the front end, a computational comparative analysis was performed on oligo selection tools using the barley ESTs as well as the Saccharomyces cerevisiae, Encephalitozoon cuniculi and human genomes. The analysis shows that a large number of the existing tools are difficult to install and configure. For the cross-hybridization test, most rely on BLAST and therefore design poorly specific oligonucleotides. Furthermore, most are non-intuitive to use and lack important oligo design and software features.

    Systems Biology and the Development of Vaccines and Drugs for Malaria Treatments

    The sequencing race has ended and the functional race has already begun. Microarray technology enables simultaneous expression analysis of thousands of genes, giving a snapshot of an organism's transcriptome at an unprecedented resolution. The close correlation between gene transcription and function allows biological processes to be inferred from the assessed transcriptome profile. Among the sophisticated analytical problems in microarray technology, at the front and back ends respectively, are the selection of optimal DNA oligos and the computational analysis of gene expression. In this review paper, we analyse important methods in use today for customized oligo design. In the course of this work, we discovered that the oligo designer algorithm hung on gene PFA0135w of chromosome 1 while designing oligos for the gene sequences of Plasmodium falciparum. We do not yet know the reason for this, as the algorithm runs on other sequences such as those of yeast (Saccharomyces cerevisiae) and Neurospora crassa. We conclude the paper by highlighting the procedures that make up the back-end phase and discussing their application to the development of vaccines and drugs for malaria treatment. Malaria causes significant global morbidity and mortality, with 300-500 million cases annually. Our aims are not ends in themselves but a means to reiterate the need for experimental biologists to (i) know how to design their customized oligos and (ii) have some idea about gene expression analysis, and the need for cooperation between experimental biologists and their counterparts, the computational biologists. This will help experimental biologists coordinate the front and back ends of the systems biology analysis of the whole genome effectively.

    Mathematical Methods and their Applications


    An Efficient Algorithm for Oligonucleotides Selection in a Large EST Databases

    Identifying unique oligonucleotide (oligo) probe sequences is an important step in PCR and microarray experiments. While there is a growing number of complete and annotated genomes, the largest collection of publicly available genetic sequences consists of expressed sequence tag (EST) sequences. Furthermore, for many organisms that are important to society, such as barley, ESTs are the major source of data on expressed genes. For EST sequences, the unique oligo problem is the selection of oligos each of which appears (exactly) in one EST sequence but does not appear (exactly or approximately, for a given Hamming distance d) in any other EST sequence. OligoSpawn, which works in two phases, has been implemented to select oligos from ESTs efficiently. The notion of a “seed” was used in the construction of OligoSpawn, and its run time depends exponentially on q (the length of the “seed”). For q = 11, it ran on a previous barley dataset of 28MB for 2 hours and 26 minutes using a 1.2GHz AMD machine, but it is very inefficient for large datasets such as the new 43MB barley dataset: for q = 11, we observed that OligoSpawn runs for about 6 days using a 3.0GHz Pentium IV machine. Furthermore, selection of some important unique oligos (e.g., those for which q = 13) is unwieldy for OligoSpawn. In this work, using the suffix tree, we give a careful theoretical characterization of the set of seeds required and prove that these seeds can be extracted in subquadratic time. Using this result, we present an efficient algorithm that takes advantage of recent results that simplify the solution of the least common ancestor (LCA) problem via the range minimum query (RMQ) problem. The run time of our resulting algorithm is O(n3qd/42q). For q = 11 and q = 13, our algorithm runs on the new 43MB barley dataset for 4 days, also using a 3.0GHz Pentium IV. As far as we know, our algorithm is the fastest oligonucleotide selection algorithm for large databases of tens of thousands of EST sequences, such as the barley ESTs.
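The unique oligo problem defined in this abstract can be illustrated with a small brute-force sketch. This is not OligoSpawn and not the suffix-tree algorithm of the paper; it only spells out the definition (an oligo taken from one EST that matches no window of any other EST within Hamming distance d). The function names and the parameter `k` for the oligo length are hypothetical, chosen for illustration, and the quadratic scan is only practical for toy inputs.

```python
from typing import Dict, List


def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def occurs_within(oligo: str, seq: str, d: int) -> bool:
    """True if some window of seq lies within Hamming distance d of oligo."""
    k = len(oligo)
    return any(
        hamming(oligo, seq[i:i + k]) <= d
        for i in range(len(seq) - k + 1)
    )


def unique_oligos(ests: List[str], k: int, d: int) -> Dict[int, List[str]]:
    """For each EST index, list the length-k substrings that do not occur,
    exactly or within d mismatches, in any other EST (brute force)."""
    result: Dict[int, List[str]] = {}
    for i, est in enumerate(ests):
        found: List[str] = []
        for start in range(len(est) - k + 1):
            oligo = est[start:start + k]
            if oligo in found:
                continue  # same substring already reported for this EST
            others = (s for j, s in enumerate(ests) if j != i)
            if not any(occurs_within(oligo, s, d) for s in others):
                found.append(oligo)
        result[i] = found
    return result


if __name__ == "__main__":
    toy_ests = ["ACGTAC", "ACGTTT", "GGGGGG"]  # made-up sequences
    print(unique_oligos(toy_ests, k=4, d=1))
```

Raising d shrinks the candidate set, because an oligo is rejected as soon as any other EST contains a near match. Seed-based and suffix-tree methods aim to avoid this exhaustive comparison on real datasets.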

    Aligning Multiple Sequences with Genetic Algorithm

    The alignment of biological sequences is a crucial tool in molecular biology and genome analysis. It helps to build a phylogenetic tree of related DNA sequences and to predict the function and structure of unknown protein sequences by aligning them with other sequences whose function and structure are already known. However, finding an optimal multiple sequence alignment requires time and space that grow exponentially as the length or number of sequences increases. Genetic Algorithms (GAs) are random search strategies that optimize an objective function, here a measure of alignment quality (distance), and they are able both to explore the solution space and to exploit current results.
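The abstract does not say which objective function the GA optimizes. As a minimal sketch, assuming the widely used sum-of-pairs score as the measure of alignment quality, a fitness function for one candidate alignment might look as follows. The function name and the scoring values (`match`, `mismatch`, `gap`) are illustrative assumptions, not taken from the paper.

```python
from itertools import combinations
from typing import Sequence


def sum_of_pairs(alignment: Sequence[str], match: int = 1,
                 mismatch: int = -1, gap: int = -2) -> int:
    """Sum-of-pairs score of a gapped alignment (rows of equal length).

    Every pair of rows is compared column by column. A column where both
    rows have a gap contributes nothing; scoring values are assumptions.
    """
    if len({len(row) for row in alignment}) > 1:
        raise ValueError("all rows of an alignment must have equal length")
    score = 0
    for a, b in combinations(alignment, 2):
        for x, y in zip(a, b):
            if x == "-" and y == "-":
                continue
            if x == "-" or y == "-":
                score += gap
            elif x == y:
                score += match
            else:
                score += mismatch
    return score


if __name__ == "__main__":
    candidate = ["AC-T", "ACGT", "A-GT"]  # made-up alignment
    print(sum_of_pairs(candidate))
```

A GA would evaluate this score for every alignment in its population and favour higher-scoring individuals when selecting parents for crossover and mutation.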

    Clustering Plasmodium falciparum Genes to their Functional Roles Using k-means

    We recently developed a novel Metric Matrices k-means (MMk-means) clustering algorithm to cluster genes to their functional roles, with a view to obtaining further knowledge about many P. falciparum genes. To pursue this aim further, in this study we compare the results of three different k-means algorithms (including MMk-means) on in vitro microarray data (Le Roch et al., Science, 2003) with the classification from in vivo microarray data (Daily et al., Nature, 2007), in order to perform a comparative functional classification of P. falciparum genes and to further validate the effectiveness of our MMk-means algorithm. The results indicate that the distributions obtained when the three algorithms' in vitro clusters are compared against the in vivo clusters are similar, which supports our MMk-means method and its effectiveness. However, the claim by Daily et al. that the physiological state (the environmental stress response) of P. falciparum in selected malaria-infected patients, observed in one of their clusters, cannot be found in any in vitro cluster is not true, as our analysis reveals that many in vitro clusters are represented in this cluster.