9 research outputs found
Systems Biology and the Development of Vaccines and Drugs for Malaria Treatments
The sequencing race has ended and the functional race has already begun. Microarray technology enables
simultaneous expression analysis of thousands of genes, providing a snapshot of an organism's
transcriptome at an unprecedented resolution. The close correlation between gene transcription and
function allows biological processes to be inferred from the assessed transcriptome profile. Among the
sophisticated analytical problems in microarray technology, at the front and back ends respectively, are the
selection of optimal DNA oligos and the computational analysis of gene expression. In this review paper,
we analyse important methods in use today for customized oligo design. In the course of this work,
we discovered that the oligo designer algorithm hung on gene PFA0135w of chromosome 1 while
designing oligos for the gene sequences of Plasmodium falciparum. We do not yet know the reason for
this, as the algorithm runs on other sequences such as those of yeast (Saccharomyces cerevisiae) and
Neurospora crassa. We conclude the paper by highlighting the procedures that make up the back-end phase
and discussing their application to the development of vaccines and drugs for malaria treatment. Malaria is
a cause of significant global morbidity and mortality, with 300-500 million cases annually. Our aims are not
ends in themselves but a means to reiterate the need for experimental biologists to (i) know how to
design their customized oligos and (ii) have some idea about gene expression analysis, and the need for
cooperation between experimental biologists and their counterparts, the computational biologists. This
will help experimental biologists to coordinate the front and back ends of the systems
biology analysis of the whole genome effectively.
Refined repetitive sequence searches utilizing a fast hash function and cross species information retrievals
BACKGROUND: Searching for small tandem or dispersed repetitive DNA sequences streamlines many biomedical research processes. For instance, whole genomic array analysis in yeast has revealed 22 PHO-regulated genes. The promoter regions of all but one of them contain at least one of the two core Pho4p binding sites, CACGTG and CACGTT. In humans, microsatellites play a role in a number of rare neurodegenerative diseases such as spinocerebellar ataxia type 1 (SCA1). SCA1 is a hereditary neurodegenerative disease caused by an expanded CAG repeat in the coding sequence of the gene. In bacterial pathogens, microsatellites are proposed to regulate expression of some virulence factors. For example, bacteria commonly generate intra-strain diversity through phase variation, which is strongly associated with virulence determinants. A recent analysis of the complete sequences of the Helicobacter pylori strains 26695 and J99 has identified 46 putative phase-variable genes among the two genomes through their association with homopolymeric tracts and dinucleotide repeats. Life scientists are increasingly interested in studying the function of small sequences of DNA. However, current search algorithms often generate thousands of matches, most of which are irrelevant to the researcher.
RESULTS: We present our hash function as well as our search algorithm to locate small sequences of DNA within multiple genomes. Our system applies information retrieval algorithms to discover knowledge of cross-species conservation of repeat sequences. We discuss our incorporation of the Gene Ontology (GO) database into these algorithms. We conduct an exhaustive time analysis of our system for various repetitive sequence lengths. For instance, a search for eight bases of sequence within 3.224 GBases on 49 different chromosomes takes 1.147 seconds on average. To illustrate the relevance of the search results, we conduct a search with and without added annotation terms for the yeast Pho4p binding sites, CACGTG and CACGTT. Also, a cross-species search is presented to illustrate how potential hidden correlations in genomic data can be quickly discerned. The findings in one species are used as a catalyst to discover something new in another species. These experiments also demonstrate that our system performs well while searching multiple genomes, without the main memory constraints present in other systems.
CONCLUSION: We present a time-efficient algorithm to locate small segments of DNA and concurrently to search the annotation data accompanying the sequence. Genome-wide searches for short sequences often return hundreds of hits. Our experiments show that subsequently searching the annotation data can refine and focus the results for the user. Our algorithms are also space-efficient in terms of main memory requirements. Source code is available upon request.
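The abstract does not give the hash function itself, so the following is only a minimal sketch of one common way to hash short DNA words: packing each base into two bits and indexing every k-mer position. The encoding table and the names `kmer_hash`, `build_index` and `find_sites` are assumptions made for illustration, not the authors' implementation.

```python
from collections import defaultdict

# Two bits per base; this particular encoding is an assumption of the sketch.
BASE_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}


def kmer_hash(kmer):
    """Pack a DNA k-mer into an integer, two bits per base."""
    value = 0
    for base in kmer:
        value = (value << 2) | BASE_CODE[base]
    return value


def build_index(sequence, k):
    """Map each k-mer hash to the 0-based positions where that k-mer starts."""
    index = defaultdict(list)
    mask = (1 << (2 * k)) - 1  # keeps only the last k bases of the rolling hash
    value = 0
    valid = 0  # number of consecutive unambiguous bases seen so far
    for pos, base in enumerate(sequence.upper()):
        code = BASE_CODE.get(base)
        if code is None:  # ambiguous base such as N: restart the window
            value = 0
            valid = 0
            continue
        value = ((value << 2) | code) & mask
        valid += 1
        if valid >= k:
            index[value].append(pos - k + 1)
    return index


def find_sites(index, motif):
    """Return all start positions of motif (its length must equal the index k)."""
    return index.get(kmer_hash(motif), [])


if __name__ == "__main__":
    # Toy sequence containing the two Pho4p core sites named in the abstract.
    idx = build_index("TTCACGTGAACACGTTNCACGTG", 6)
    print(find_sites(idx, "CACGTG"), find_sites(idx, "CACGTT"))
```

Because the hash is updated base by base, building the index takes time linear in the sequence length; a real system would also need to handle both strands and store the index outside main memory, which this sketch does not attempt.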
An efficient algorithm for finding short approximate non-tandem repeats
Abstract
We study the problem of approximate non-tandem repeat† extraction. Given a long subject string S of length N over a finite alphabet Σ and a threshold D, we would like to find all short substrings of S of length P that repeat with at most D differences, i.e., insertions, deletions, and mismatches. We give a careful theoretical characterization of the set of seeds (i.e., some maximal exact repeats) required by the algorithm, and prove a sublinear bound on their expected numbers. Using this result, we present a sub-quadratic algorithm for finding all short (i.e., of length O(log N)) approximate repeats. The running time of our algorithm is O(D N^(3pow(ε)−1) log N), where ε = D/P and pow(ε) is an increasing, concave function that is 0 when ε = 0 and about 0.9 for DNA and protein sequences.
†Throughout the paper we only consider non-tandem repeats.
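To make the problem statement concrete, the sketch below is a brute-force baseline: it compares every pair of non-overlapping length-P windows and reports those within D differences under edit distance. It is not the seed-based sub-quadratic algorithm of the abstract, it simplifies the problem by keeping both windows at length P, and the function names are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance: insertions, deletions and mismatches each cost 1."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # match or mismatch
            ))
        prev = curr
    return prev[-1]


def approximate_repeats(s, p, d):
    """Return (i, j, distance) for non-overlapping length-p windows within d edits.

    Simplification of this sketch: both windows have length exactly p.
    """
    hits = []
    last_start = len(s) - p
    for i in range(last_start + 1):
        for j in range(i + p, last_start + 1):
            dist = edit_distance(s[i:i + p], s[j:j + p])
            if dist <= d:
                hits.append((i, j, dist))
    return hits


if __name__ == "__main__":
    print(approximate_repeats("ACGTTTTACGA", 4, 1))
```

This baseline examines on the order of N² window pairs at O(P²) work each, which is the cost that a seed-based filter is meant to avoid.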
Structural studies of amyloidogenic proteins with periodicities in their sequence
Amyloids are extracellular/intracellular deposits of insoluble protein fibrils
that form from soluble proteins/peptides when these fold abnormally and
self-assemble, causing the destruction of cells and tissues. Amyloids are
associated with a number of conformational diseases, the so-called amyloidoses.
Several organisms often exploit the properties and architecture that
characterize amyloid fibrils in order to support complex biological functions.
These functional structures are called functional amyloids. Several
amyloidogenic proteins have been associated with the formation of β-solenoid
structures. β-Solenoids (β-helices) are elongated coils formed by polypeptide
chains that fold in a circular manner in space. A key feature of these protein
sequences is the presence of amino acid periodicities 5-40 residues in size. At
the same time, they show a high preference for amino acid residues that carry a
small side chain and have polar character. The aim of this work was to find
segments of the polypeptide chain of amyloidogenic proteins that consist of
divergent and continuous periodicities, 5-40 amino acid residues in size,
capable of forming β-solenoid structures. The results indicate the presence of
repeating segments in the sequences of most amyloidogenic proteins. In
addition, the structural studies carried out on 5 model proteins indicated their
ability to form β-solenoid structures, the polymerization of which may lead to
the formation of amyloid protofibrils.
Amyloids are extracellular/intracellular protein fibrous deposits formed by
otherwise soluble proteins or peptides that fail to adopt a proper fold,
leading to tissue damage and degeneration. Amyloids are related to a number of
conformational diseases, named amyloidoses. However, organisms (from bacteria
to humans) exhibit novel and important biological functions based on the
functional properties of amyloids. Such structures are known as functional
amyloids. Structures known as β-solenoids are elongated spirals which support
the “cross-β” structure of amyloids and are formed by stacked coils,
representing subsequent sequence repeats. They are formed by protein
sequences bearing successive amino acid repeats, 5-40 residues long. Such
sequences show a preference for residues with small side chains (such as
glycine or alanine) and exhibit a high percentage of residues with polar side
chains (such as serine, threonine, glutamine or asparagine). The purpose of
this study was an exhaustive search for successive and divergent repeats,
5-40 residues in length, in amyloidogenic sequences that could contribute to
the formation of β-solenoid structures. The results presented in the
current study indicate the presence of divergent repeats in most amyloidogenic
sequences. Moreover, the structural studies performed indicate the ability of
certain model cases of the above to form β-solenoid structures.