8 research outputs found

    Multiple sequence alignment with user-defined constraints at GOBICS

    Most multi-alignment methods are fully automated, i.e. they are based on a fixed set of mathematical rules. For various reasons, such methods may fail to produce biologically meaningful alignments. Herein, we describe a semi-automatic approach to multiple sequence alignment where biological expert knowledge can be used to influence the alignment procedure. The user can specify parts of the sequences that are biologically related to each other; our software program uses these sites as anchor points and creates a multiple alignment respecting these user-defined constraints. By using known functionally, structurally or evolutionarily related positions of the input sequences as anchor points, our method can produce alignments that reflect the true biological relationships among the input sequences more accurately than fully automated procedures can.
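The idea of anchor points can be illustrated with a minimal pairwise sketch. This is not the GOBICS program described above: it assumes a plain Needleman-Wunsch aligner with made-up scores, and the function names `global_align` and `anchored_align` are hypothetical. Each anchor `(i, j)` forces position `i` of the first sequence into the same column as position `j` of the second, and the stretches between anchors are aligned independently.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-1):
    """Plain Needleman-Wunsch; returns the two gapped strings (scores are assumptions)."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                out_a.append(a[i - 1]); out_b.append(b[j - 1])
                i -= 1; j -= 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-")
            i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def anchored_align(a, b, anchors):
    """Align a and b so that every anchor (i, j) pairs a[i] with b[j] (0-based)."""
    out_a, out_b = [], []
    prev_i = prev_j = 0
    for i, j in sorted(anchors):
        # Anchors must be strictly increasing in both sequences to be consistent.
        if i < prev_i or j < prev_j:
            raise ValueError("anchors cross or overlap")
        seg_a, seg_b = global_align(a[prev_i:i], b[prev_j:j])
        out_a += [seg_a, a[i]]
        out_b += [seg_b, b[j]]
        prev_i, prev_j = i + 1, j + 1
    # Align whatever remains after the last anchor.
    seg_a, seg_b = global_align(a[prev_i:], b[prev_j:])
    out_a.append(seg_a)
    out_b.append(seg_b)
    return "".join(out_a), "".join(out_b)
```

Calling `anchored_align("ACGTTGCA", "ACTTGGCA", [(3, 2)])` returns two gapped strings of equal length in which the fourth residue of the first sequence shares a column with the third residue of the second. Splitting at anchors is only one simple way to honour such constraints; a multiple-alignment tool would also have to check that anchors are mutually consistent across all sequences.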

    Surveying phylogenetic footprints in large gene clusters: applications to Hox cluster duplications

    Evolutionarily conserved non-coding genomic sequences represent a potentially rich source for the discovery of gene regulatory regions. Since these elements are subject to stabilizing selection, they evolve much more slowly than adjacent non-functional DNA. These so-called phylogenetic footprints can be detected by comparison of the sequences surrounding orthologous genes in different species. In this paper we present a new method and an efficient software tool for the identification of corresponding footprints in long sequences from multiple species. This allows the evolutionary study of the origin and loss of phylogenetic footprints if a sufficient number of appropriately placed species is included. We apply this method to the published sequences of HoxA clusters of shark, human, and the duplicated zebrafish and Takifugu clusters as well as the published HoxB cluster sequences. We find that there is a massive loss of sequence conservation in the intergenic region of the HoxA clusters, consistent with the finding in [Chiu et al., PNAS 99, 5492-5497 (2002)]. We further propose a simple model to estimate the loss of sequence conservation that can be attributed to gene loss and other structural reasons. We find that the loss of conservation after cluster duplication is more extensive than expected by this model. This suggests that binding site turnover and/or adaptive modification may also contribute to the loss of sequence conservation. We conclude that this method is suitable for the large scale study of the evolution of (putative) cis-regulatory elements.

    Studying Evolutionary Change: Transdisciplinary Advances in Understanding and Measuring Evolution

    Evolutionary processes can be found in almost any historical, i.e. evolving, system that erroneously copies from the past. Well-studied examples originate not only in evolutionary biology but also in historical linguistics. Yet an approach that would bind together studies of such evolving systems is still elusive. This thesis is an attempt to narrow this gap to some extent. An evolving system can be described using characters that identify its changing features. While the problem of a proper choice of characters is beyond the scope of this thesis and remains in the hands of experts, we concern ourselves with some theoretical as well as data-driven approaches. Having a well-chosen set of characters describing a system of different entities such as homologous genes, i.e. genes of the same origin in different species, we can build a phylogenetic tree. Consider the special case of gene clusters containing paralogous genes, i.e. genes of the same origin within a species, usually located close together, such as the well-known HOX cluster. These are formed by stepwise duplication of their members, often involving unequal crossing over that forms hybrid genes. Gene conversion and possibly other mechanisms of concerted evolution further obfuscate phylogenetic relationships. Hence, it is very difficult or even impossible to disentangle the detailed history of gene duplications in gene clusters. The expansion of gene clusters by unequal crossing over, as proposed by Walter Gehring, leads to distinctive patterns of genetic distances. We show that this special class of distances still allows phylogenetic information to be extracted from the data. Disregarding genome rearrangements, we find that the shortest Hamiltonian path then coincides with the ordering of paralogous genes in a cluster.
This observation can be used to detect ancient genomic rearrangements of gene clusters and to distinguish gene clusters whose evolution was dominated by unequal crossing over within genes from those that expanded through other mechanisms. While the evolution of DNA or protein sequences is well studied and can be formally described, we find that this does not hold for other systems such as language evolution. This is due to a lack of detectable mechanisms that drive the evolutionary processes in other fields. Hence, it is hard to quantify distances between entities, e.g. languages, and therefore the characters describing them. Starting out with distortions of distances, we first see that poor choices of the distance measure can lead to incorrect phylogenies. Given that phylogenetic inference requires additive metrics, we can infer the correct phylogeny from a distance matrix D if there is a monotonic, subadditive function ζ such that ζ^−1(D) is additive. We compute the metric-preserving transformation ζ as the solution of an optimization problem. This result shows that the problem of phylogeny reconstruction is well defined even if a detailed mechanistic model of the evolutionary process is missing. Yet, this does not hinder studies of language evolution using automated tools. As the number of large digital corpora available has increased, so have the possibilities to study them automatically. The obvious parallels between historical linguistics and phylogenetics have led to many studies adapting bioinformatics tools to linguistic purposes. Here, we use jAlign to calculate bigram alignments, i.e. alignments computed by an algorithm that takes the adjacency of letters into account. Its performance is tested in different cognate recognition tasks. One major obstacle in using pairwise alignments is their systematic errors, such as the underestimation and misplacement of gaps.
Applying multiple sequence alignments instead of a pairwise algorithm implicitly includes more evolutionary information and thus can overcome the problem of correct gap placement. They can be seen as a generalization of the string-to-string edit problem to more than two strings. With the steady increase in computational power, exact dynamic programming solutions have become feasible in practice also for 3- and 4-way alignments. For the pairwise (2-way) case, there is a clear distinction between local and global alignments. As more sequences are considered, this distinction, which can in fact be made independently for both ends of each sequence, gives rise to a rich set of partially local alignment problems. So far these have remained largely unexplored. Thus, a general formal framework that gives rise to a classification of partially local alignment problems is introduced. It leads to a generic scheme that guides the principled design of exact dynamic programming solutions for particular partially local alignment problems.
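The statement above about additive metrics can be made concrete with the four-point condition: a metric is additive (tree-like) when, for every four taxa, the two largest of the three pairwise sums are equal. The sketch below is only an illustration and makes several assumptions: the input is a symmetric metric with a zero diagonal, the 4-taxon distances are made up, and the distortion ζ is simply taken to be the square root (monotonic and subadditive) rather than computed by optimization as in the thesis. The helper names `is_additive` and `transform` are hypothetical.

```python
from itertools import combinations
import math


def is_additive(dist, tol=1e-9):
    """Check the four-point condition over all quadruples of distinct taxa."""
    n = len(dist)
    for i, j, k, l in combinations(range(n), 4):
        sums = sorted([dist[i][j] + dist[k][l],
                       dist[i][k] + dist[j][l],
                       dist[i][l] + dist[j][k]])
        # The two largest sums must coincide (up to floating-point tolerance).
        if abs(sums[2] - sums[1]) > tol:
            return False
    return True


def transform(dist, f):
    """Apply a function entrywise to a distance matrix."""
    return [[f(x) for x in row] for row in dist]


# Made-up path lengths on a tree ((A,B),(C,D)) with leaf edges 1, 2, 3, 4
# and an internal edge of length 5.
tree_dist = [
    [0, 3, 9, 10],
    [3, 0, 10, 11],
    [9, 10, 0, 7],
    [10, 11, 7, 0],
]

zeta = math.sqrt                 # assumed distortion, not the one from the thesis


def zeta_inv(y):
    return y * y                 # inverse of the assumed distortion


observed = transform(tree_dist, zeta)      # distorted distances
restored = transform(observed, zeta_inv)   # undo the distortion
```

Here `is_additive(tree_dist)` holds, `is_additive(observed)` fails because the square root breaks the equality of the two largest sums, and `is_additive(restored)` holds again once the inverse transformation is applied. In practice ζ is unknown, which is why the thesis treats finding it as an optimization problem.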

    DNA binding specificity and transcriptional regulation of Six4: a myotonic dystrophy associated transcription factor

    Attaining an understanding of the mechanisms underpinning development has been amongst the cardinal scientific challenges of our age. The transition from a single-celled organism to the level of complexity evidenced in higher eukaryotes has been facilitated by the advent of intricate developmental networks involving a plethora of factors that synergise to allow for precise spatio-temporal expression of the proteins present in higher organisms. Development is often portrayed as a domino-like cascade of events stemming from relatively uncomplicated origins that go on to branch out and form associations and interactions amongst multitudinous actors that will inexorably lead towards a higher state of order. Transcription factors occupy a central position within this tapestry of interactions. They regulate expression of the various required proteins and they provide the cues for the developmental events that will eventually shape an organism. These factors frequently remain unknown until some occurrence causes developmental processes to fail and inadvertently focuses attention on the factors that facilitate development. Myotonic dystrophy is a useful paradigm of such a developmental dysfunction that has led to the discovery of a transcription factor integral to both muscle development and gonadogenesis in both Drosophila and higher eukaryotes.

    Transcriptional control of macrophage function in the pig and its relationship to infectious disease susceptibility

    The biology of cells of the mononuclear phagocyte system has been studied extensively in the mouse. Studies of the pig as an experimental model have commonly been consigned to specialist animal science journals. This thesis considered some of the many ways that pigs may address the shortcomings of mice as models for the study of macrophage differentiation and activation in vitro, and the biology of sepsis and other pathologies in the living animal. Flow cytometry was used initially to phenotype cells from the porcine lung, peritoneal cavity, blood and bone marrow using the LPS receptor CD14 and the Fc receptor CD16, markers frequently employed to differentiate human monocytes into subsets. The expression of SIRP-alpha (SWC3a, CD172a), which is present on all cells of myeloid origin, and of the haemoglobin scavenger receptor CD163, which has previously been used to study monocyte differentiation in the pig, was also studied. The findings validated previous work where blood monocytes were divided into subsets on the expression of CD14 and CD163. Furthermore, like human and mouse monocytes, pig monocytes also exhibited variation in CD16 expression, having a subset which was CD14hiCD16lo and another which was CD14loCD16hi. A whole genome approach was then used to study the differences between the monocyte subsets in the pig, using monocytes sorted into two populations based on the expression of CD14 and CD163. The gene expression profiles obtained were then compared to publicly available data from monocyte subsets in human and mouse. This thesis also investigated the expression of genes that are known to be differentially expressed between human and mouse. To do this, gene expression in porcine bone marrow derived macrophages was analyzed across an LPS time course. Like human macrophages, pig macrophages did not induce nitric oxide nor any arginine metabolizing genes in response to LPS.
Instead they responded with robust induction of indoleamine 2,3-dioxygenase (IDO) and other enzymes of the tryptophan metabolism pathway such as kynurenine hydroxylase, kynureninase and tryptophan-tRNA synthetase. The tryptophan metabolism pathway has been implicated in sepsis in man, and the absence of this pathway in the mouse may be one of the reasons why an adequate rodent model of sepsis has not been developed. The IDO inhibitor 1-methyl-tryptophan (1-MT) has been used to treat mouse macrophages, where it had a protective effect after LPS administration. Similar experiments on pig macrophages did not show the same protective effect, and induction of key immune genes was increased after treatment with 1-MT, suggesting IDO is involved in feedback control of the immune system. With the completion of the genome sequence and the characterisation of many key regulators and markers, the pig has emerged as a tractable model of human innate immunity and disease that should address the limited predictive value of rodents in preclinical studies. This project aimed to address the gap in our knowledge of the control of innate immunity in the pig and provided further evidence that the pig can function as an ideal model to study innate immunity.