812 research outputs found

    Parking functions, labeled trees and DCJ sorting scenarios

    In genome rearrangement theory, one of the elusive questions raised in recent years is the enumeration of rearrangement scenarios between two genomes. This problem is related to the uniform generation of rearrangement scenarios, and the derivation of tests of statistical significance of the properties of these scenarios. Here we give an exact formula for the number of double-cut-and-join (DCJ) rearrangement scenarios of co-tailed genomes. We also construct effective bijections between the set of scenarios that sort a cycle and well-studied combinatorial objects such as parking functions and labeled trees. Comment: 12 pages, 3 figures.
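    The abstract does not state its formula, so the following is only a small sanity check of why parking functions and labeled trees are natural partners for a bijection: both families have the same size. It is a brute-force sketch, not the authors' construction; the function names are illustrative, and nothing here counts DCJ scenarios themselves.

```python
from itertools import product


def is_parking_function(seq):
    """A sequence of length n with entries in 1..n is a parking function
    when its sorted version b satisfies b[i] <= i + 1 at every index i."""
    return all(b <= i + 1 for i, b in enumerate(sorted(seq)))


def count_parking_functions(n):
    """Brute-force count over all n**n candidate sequences (small n only)."""
    candidates = product(range(1, n + 1), repeat=n)
    return sum(1 for s in candidates if is_parking_function(s))


def cayley_labeled_trees(m):
    """Cayley's formula: the number of labeled trees on m >= 2 vertices."""
    return m ** (m - 2)


# Parking functions of length n and labeled trees on n + 1 vertices
# are both counted by (n + 1) ** (n - 1).
for n in range(1, 6):
    assert count_parking_functions(n) == (n + 1) ** (n - 1)
    assert count_parking_functions(n) == cayley_labeled_trees(n + 1)
```

    The enumeration is exponential and is meant only to make the equinumerosity concrete for n up to about 5.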

    Listing all sorting reversals in quadratic time

    We describe an average-case O(n^2) algorithm to list all reversals on a signed permutation π that, when applied to π, produce a permutation that is closer to the identity. This algorithm is optimal in the sense that the time it takes to write the list is Ω(n^2) in the worst case.
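    To make the notion of a sorting reversal concrete, here is a brute-force sketch. It is not the quadratic algorithm of the paper: it computes reversal distances by breadth-first search over every signed permutation of a given size, which is feasible only for very small n. The tuple representation and all function names are assumptions made for illustration.

```python
from collections import deque


def reverse(perm, i, j):
    """Apply the signed reversal on positions i..j (inclusive): the segment
    is reversed and the sign of every element in it is flipped."""
    return perm[:i] + tuple(-x for x in reversed(perm[i:j + 1])) + perm[j + 1:]


def all_reversals(n):
    """All position pairs (i, j) with 0 <= i <= j < n."""
    return [(i, j) for i in range(n) for j in range(i, n)]


def distances_to_identity(n):
    """Reversal distance of every signed permutation of size n, by BFS.
    Reversals are their own inverses, so searching from the identity works."""
    identity = tuple(range(1, n + 1))
    dist = {identity: 0}
    queue = deque([identity])
    while queue:
        p = queue.popleft()
        for i, j in all_reversals(n):
            q = reverse(p, i, j)
            if q not in dist:
                dist[q] = dist[p] + 1
                queue.append(q)
    return dist


def sorting_reversals(perm, dist=None):
    """List the reversals that bring perm one step closer to the identity."""
    if dist is None:
        dist = distances_to_identity(len(perm))
    return [(i, j) for i, j in all_reversals(len(perm))
            if dist[reverse(perm, i, j)] == dist[perm] - 1]


print(sorting_reversals((1, -2, 3)))   # [(1, 1)]
print(sorting_reversals((-2, -1)))     # [(0, 1)]
```

    The output list can itself contain on the order of n^2 reversals, which is the source of the Ω(n^2) lower bound mentioned in the abstract.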

    Efficient Sampling of Parsimonious Inversion Histories with Application to Genome Rearrangement in Yersinia

    Inversions are among the most common mutations acting on the order and orientation of genes in a genome, and polynomial-time algorithms exist to obtain a minimal-length series of inversions that transforms one genome arrangement into another. However, the minimum-length series of inversions (the optimal sorting path) is often not unique, as many such optimal sorting paths exist. If we assume that all optimal sorting paths are equally likely, then statistical inference on genome arrangement history must account for all such sorting paths and not just a single estimate. No deterministic polynomial-time algorithm is known to count the number of optimal sorting paths or to sample from the uniform distribution of optimal sorting paths.

    Sampling and counting genome rearrangement scenarios

    Even for moderate-size inputs, there is a tremendous number of optimal rearrangement scenarios, regardless of what the model is and which specific question is to be answered. Therefore, giving one optimal solution might be misleading and cannot be used for statistical inference. Statistically well-founded methods are necessary to sample uniformly from the solution space; a small number of samples is then sufficient for statistical inference.

    Models and Algorithms for Sorting Permutations with Tandem Duplication and Random Loss

    A central topic of evolutionary biology is the inference of phylogeny, i. e., the evolutionary history of species. A powerful tool for the inference of such phylogenetic relationships is the arrangement of the genes in mitochondrial genomes. The rationale is that these gene arrangements are subject to different types of mutations in the course of evolution. Hence, a high similarity in the gene arrangement between two species indicates a close evolutionary relation. Metazoan mitochondrial gene arrangements are particularly well suited for such phylogenetic studies as they are available for a wide range of species, their gene content is almost invariant, and usually free of duplicates. With these properties gene arrangements of mitochondrial genomes are modeled by permutations in which each element represents a gene, i. e., a specific genetic sequence. The mutations that shape the gene arrangement of genomes are then represented by operations that rearrange elements in permutations, so-called genome rearrangements, and thereby bridge the gap between evolutionary biology and optimization. Many problems of phylogeny inference can be formulated as challenging combinatorial optimization problems which makes this research area especially interesting for computer scientists. The most prominent examples of such optimization problems are the sorting problem and the distance problem. While the sorting problem requires a minimum length sequence of rearrangements that transforms one given permutation into another given permutation, i. e., it aims for a hypothetical scenario of gene order evolution, the distance problem intends to determine only the length of such a sequence. This minimum length is called distance and used as a (dis)similarity measure quantifying the evolutionary relatedness. Most evolutionary changes occurring in gene arrangements of mitochondrial genomes can be explained by the tandem duplication random loss (TDRL) genome rearrangement model. 
A TDRL consists of a duplication of a consecutive set of genes in tandem followed by a random loss of one copy of each duplicated gene. In spite of the importance of the TDRL genome rearrangement in mitochondrial evolution, its combinatorial properties have rarely been studied. In addition, models of genome rearrangements which include all types of rearrangement that are relevant for mitochondrial genomes, i. e., inversions, transpositions, inverse transpositions, and TDRLs, while admitting computational tractability are rare. Nevertheless, especially for metazoan gene arrangements the TDRL rearrangement should be considered for the reconstruction of phylogeny. Realizing that a better understanding of the TDRL model is indispensable for the study of mitochondrial gene arrangements, the central theme of this thesis is to broaden the horizon of TDRL genome rearrangements with respect to mitochondrial genome evolution. For this purpose, this thesis provides combinatorial properties of the TDRL model and its variants as well as efficient methods for a plausible reconstruction of rearrangement scenarios between gene arrangements. The methods that are proposed consider all types of genome rearrangements that predominantly occur during mitochondrial evolution. More precisely, the main points contained in this thesis are as follows: The distance problem and the sorting problem for the TDRL model are further examined with respect to circular permutations, a formal concept that reflects the circular structure of mitochondrial genomes. As a result, a closed formula for the distance is provided. Recently, evidence has been found for a variant of the TDRL rearrangement model in which the duplicated set of genes is additionally inverted. Initiating the algorithmic study of this new rearrangement model on a certain type of permutations, a closed formula solving the distance problem is proposed as well as a quasilinear time algorithm that solves the corresponding sorting problem. 
The assumption that only one type of genome rearrangement has occurred during the evolution of certain gene arrangements is most likely unrealistic, e. g., at least three types of rearrangements on top of the TDRL rearrangement have to be considered for the evolution of metazoan mitochondrial genomes. Therefore, three different biologically motivated constraints are taken into account in this thesis in order to produce plausible evolutionary rearrangement scenarios. The first constraint is extending the considered set of genome rearrangements to the model that covers all four common types of mitochondrial genome rearrangements. For this 4-type model a sharp lower bound and several close additive upper bounds on the distance are developed. As a byproduct, a polynomial-time approximation algorithm for the corresponding sorting problem is provided that guarantees the computation of pairwise rearrangement scenarios that deviate from a minimum length scenario by at most two rearrangement operations. The second biologically motivated constraint is the relative frequency of the different types of rearrangements occurring during the evolution. The frequency is modeled by employing a weighting scheme on the 4-type model in which every rearrangement is weighted with respect to its type. The resulting NP-hard sorting problem is then solved by means of a polynomial size integer linear program. The third biologically motivated constraint that has been taken into account is that certain subsets of genes are often found in close proximity in the gene arrangements of many different species. This observation is reflected by demanding rearrangement scenarios to preserve certain groups of genes which are modeled by common intervals of permutations. In order to solve the sorting problem that considers all three types of biologically motivated constraints, the exact dynamic programming algorithm CREx2 is proposed. CREx2 has a linear runtime for a large class of problem instances. 
Otherwise, two versions of CREx2 are provided: the first version provides exact solutions but has an exponential runtime in the worst case, and the second version provides approximate solutions efficiently. CREx2 is evaluated by an empirical study for simulated artificial and real biological mitochondrial gene arrangements.
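    The TDRL operation described in this abstract (tandem duplication of a consecutive gene segment, followed by the loss of one copy of each duplicated gene) can be sketched as follows. This is a minimal illustration on unsigned linear permutations, not code from the thesis; the function name `tdrl` and its parameters `start`, `end` and `keep_first` are hypothetical.

```python
def tdrl(perm, start, end, keep_first):
    """Apply one tandem duplication random loss (TDRL) to `perm`.

    The segment perm[start:end] is duplicated in tandem. Genes listed in
    `keep_first` survive in the first copy; every other gene of the segment
    survives in the second copy. Each copy keeps the original relative order.
    """
    segment = list(perm[start:end])
    keep_first = set(keep_first)
    if not keep_first <= set(segment):
        raise ValueError("keep_first must be a subset of the duplicated segment")
    first_copy = [g for g in segment if g in keep_first]
    second_copy = [g for g in segment if g not in keep_first]
    return list(perm[:start]) + first_copy + second_copy + list(perm[end:])


# Duplicate genes 2, 3, 4; keep 2 and 4 in the first copy and 3 in the second.
print(tdrl([1, 2, 3, 4, 5], 1, 4, {2, 4}))  # [1, 2, 4, 3, 5]
```

    Keeping every gene in the first copy (or every gene in the second) leaves the arrangement unchanged, so only mixed loss patterns rearrange the genes.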

    Measurement of the Perceptual Rotation of Visual Stimuli

    The apparatus for studying the phenomenon of rotation consisted of two rotating turntables constructed to receive disks for presenting varied visual stimuli. Turntables were graduated into 360 degrees for measurement of angular discrepancy in the task of visually matching rotational positions. Subjects from ages four through eleven attempted to match six compass positions for each of three designs: a boxlike house, a straight line, and Bender-Gestalt Figure No. 3. Errors of rotation were classed as either transpositional or non-transpositional. Transpositional error, involving reversal or mirroring of the directional aspect of the designs, largely disappeared by age six. Non-transpositional error declined rapidly between ages four and six, leveled off, then showed another significant decline at age nine. The three designs were readily conceptualized as to direction, showing no differences for inducing rotation. The error scores were minimally related to IQ and achievement. No correlation was found with rotation as measured by the Minnesota Percepto-Diagnostic Test. Groups at ages four, six, and eight were retested after one week, disclosing low reliabilities for non-transpositional error, though mean rotation error and standard deviation for the groups remained stable. Sixty-seven percent of four-year-olds showed instances of transposition, and as this source of error was scored as limited to fifty degrees and included in the composite score, the reliability for age four was raised from .52 to .96. A second study of children at ages four and five was conducted to verify the possibility of obtaining high reliability by combining both types of error. Utilizing some variations in methodology and designs, test-retest correlations over a two-week interval yielded a reliability of .82 for age four and .93 for age five. 
It was concluded that the method was applicable in assessing rotational error occurring on a perceptual-intuitive level, and that personal characteristics associated with perceptual-intuitive operations could be reliably measured at ages four and five.

    Exploring model-based and model-free reinforcement learning in obsessive-compulsive disorder

    Obsessive-compulsive disorder (OCD) is a common, chronic and disabling neuropsychiatric condition for which current treatments are ineffective in a large proportion of cases. The gold-standard instrument to assess the severity of OCD symptoms is the Yale-Brown Obsessive-Compulsive Scale (Y-BOCS), which was recently revised (Y-BOCS-II). However, its construct validity has been reported as moderate and its criterion-related validity for the diagnosis of OCD has never been tested. In the first chapter of this dissertation, I tested, for the first time, criterion-related validity of the Y-BOCS-II and demonstrated that a cut-off of 13 (total score) attains the best balance between sensitivity and specificity for the diagnosis of OCD. However, I confirmed that its divergent validity is far from excellent. This last finding led me to search for other potential markers of OCD. Several abnormalities have been demonstrated in OCD patients in studies using neuropsychological and neuroimaging approaches, but we still lack a consistent marker for the disorder which is able to discriminate patients with OCD from healthy subjects or from patients with other mental disorders, which is sensitive to treatment-induced changes, and which can be mapped to brain circuits or function. An approach which has been followed over the last decade is considering OCD as a disorder of action learning systems of the brain. 
Sequential decision tasks have recently emerged as an influential and sophisticated tool to investigate action learning in humans through the reinforcement learning (RL) framework. According to the RL framework, actions can be learned in two different ways: model-based control works by learning a model of the dynamics of the environment and later using that model to plan future behavioral trajectories, while model-free control works by storing the estimated value of recently taken actions and updating these estimates by trial-and-error. Sequential decision tasks have been used to assess associations between dysfunction in RL control systems and certain behavioral disorders, such as OCD, where an imbalance between model-based and model-free RL has been hypothesized. In fact, using the most commonly applied sequential decision task, the two-step task, evidence has been produced suggesting that OCD patients have a deficit in model-based learning. However, in this specific paradigm, subjects typically receive detailed information about task structure prior to performing the task. Thus, it remains unclear how different RL systems contribute when subjects learn exclusively from experience, and how explicit information about task structure modifies RL strategy. To address these questions, I created a sequential decision task requiring minimal prior instruction, the reduced two-step task. I assessed performance both prior to and after delivering explicit information on task structure, in healthy volunteers, patients with OCD and patients with other mood and anxiety disorders. Initially model-free control dominated, with model-based control emerging only in a minority of subjects after significant task experience, and not at all in patients with OCD, who had instead a tendency to increase their use of model-free control. 
Once explicit information about task structure was provided, a dramatic increase in the use of model-based RL was observed, similarly across healthy volunteers and both patient groups, including OCD. The debriefing also significantly decreased the use of model-free RL in healthy volunteers and in patients with mood and anxiety disorders, but not in OCD patients. Additionally, after instructions, model-free action value updates were influenced more by state values and less by trial outcomes, in all groups, and subject choices became more perseverative in healthy subjects, consistent with changes in exploration strategy. These results help in clarifying the RL profile for patients with OCD, with unspecific findings of deficient model-based control, and more specific findings of enhanced model-free control, in both cases prior to information about task structure. Finally, as the literature is not yet consensual on how model-free and model-based RL systems interact in human brain circuits, I developed a functional magnetic resonance imaging (fMRI) protocol to assess uninstructed and instructed sequential action choice. Preliminary results in healthy subjects suggest that the fMRI version of the reduced two-step task makes it possible to separate predominantly model-free control (before instructions) from predominantly model-based control (after instructions), in the same subject, task structure and environment. Across all sessions, choice events were associated with increased blood-oxygen-level-dependent (BOLD) activity in the left precentral gyrus and reward events were associated with increased BOLD activity in the ventral striatum. I found that explicit knowledge about task structure modifies BOLD activity in the paracingulate cortex (medial prefrontal cortex) during the transition from the first to the second step of the task. 
Future directions include using multivariate pattern analysis techniques to explore how the brain represents state space in sequential decision tasks and applying the current fMRI protocol in clinical populations.
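    The distinction the abstract draws between the two learning systems can be illustrated with a toy sketch. It is not the computational model fitted in the dissertation: the learning rate `alpha`, the action and state names, and the transition probabilities are all hypothetical values chosen for illustration. Model-free control nudges a cached action value toward each observed outcome, while model-based control evaluates an action by planning through a learned transition model.

```python
def model_free_update(q, action, reward, alpha=0.1):
    """Trial-and-error update: move the cached value of `action` a fraction
    `alpha` of the way toward the reward that was just observed."""
    updated = dict(q)
    updated[action] = q[action] + alpha * (reward - q[action])
    return updated


def model_based_values(transition, state_value):
    """Planning: value each first-step action as the expected value of the
    second-step states it leads to under the learned transition model."""
    return {
        action: sum(prob * state_value[state] for state, prob in probs.items())
        for action, probs in transition.items()
    }


# Hypothetical two-action task with two second-step states A and B.
q = model_free_update({"left": 0.0, "right": 0.0}, "left", reward=1.0, alpha=0.5)
print(q["left"])  # 0.5

transition = {"left": {"A": 0.8, "B": 0.2}, "right": {"A": 0.2, "B": 0.8}}
planned = model_based_values(transition, {"A": 1.0, "B": 0.0})
print(max(planned, key=planned.get))  # left
```

    In a two-step task the two systems make different predictions after a rare transition, which is what allows their relative contributions to be estimated from choices.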

    Gene order rearrangement methods for the reconstruction of phylogeny

    The study of phylogeny, i.e. the evolutionary history of species, is a central problem in biology and a key for understanding characteristics of contemporary species. Many problems in this area can be formulated as combinatorial optimisation problems which makes it particularly interesting for computer scientists. The reconstruction of the phylogeny of species can be based on various kinds of data, e.g. morphological properties or characteristics of the genetic information of the species. Maximum parsimony is a popular and widely used method for phylogenetic reconstruction aiming for an explanation of the observed data requiring the least evolutionary changes. A certain property of the genetic information gained much interest for the reconstruction of phylogeny in recent time: the organisation of the genomes of species, i.e. the arrangement of the genes on the chromosomes. But the idea to reconstruct phylogenetic information from gene arrangements has a long history. In Dobzhansky and Sturtevant (1938) it was already pointed out that “a comparison of the different gene arrangements in the same chromosome may, in certain cases, throw light on the historical relationships of these structures, and consequently on the history of the species as a whole”. This kind of data is promising for the study of deep evolutionary relationships because gene arrangements are believed to evolve slowly (Rokas and Holland, 2000). This seems to be the case especially for mitochondrial genomes which are available for a wide range of species (Boore, 1999). The development of methods for the reconstruction of phylogeny from gene arrangement data has made considerable progress during the last years. Prominent examples are the computation of parsimonious evolutionary scenarios, i.e. 
a shortest sequence of rearrangements transforming one arrangement of genes into another or the length of such a minimal scenario (Hannenhalli and Pevzner, 1995b; Sankoff, 1992; Watterson et al., 1982); the reconstruction of parsimonious phylogenetic trees from gene arrangement data (Bader et al., 2008; Bernt et al., 2007b; Bourque and Pevzner, 2002; Moret et al., 2002a); or the computation of the similarities of gene arrangements (Bergeron et al., 2008a; Heber et al., 2009). The central theme of this work is to provide efficient algorithms for modified versions of fundamental genome rearrangement problems using more plausible rearrangement models. Two types of modified rearrangement models are explored. The first type is to restrict the set of allowed rearrangements as follows. It can be observed that certain groups of genes are preserved during evolution. This may be caused by functional constraints which prevented the destruction (Lathe et al., 2000; Sémon and Duret, 2006; Xie et al., 2003), certain properties of the rearrangements which shaped the gene orders (Eisen et al., 2000; Sankoff, 2002; Tillier and Collins, 2000), or just because no destructive rearrangement happened since the speciation of the gene orders. It can be assumed that gene groups, found in all studied gene orders, are not acquired independently. Accordingly, these gene groups should be preserved in plausible reconstructions of the course of evolution, in particular the gene groups should be present in the reconstructed putative ancestral gene orders. This can be achieved by restricting the set of rearrangements, which are allowed for the reconstruction, to those which preserve the gene groups of the given gene orders. Since it is difficult to determine functionally what a gene group is, it has been proposed to consider common combinatorial structures of the gene orders as gene groups (Marcotte et al., 1999; Overbeek et al., 1999). 
The second considered modification of the rearrangement model is extending the set of allowed rearrangement types. Different types of rearrangement operations have shuffled the gene orders during evolution. The same set of rearrangement operations should be used for the reconstruction; otherwise, distorted or even wrong phylogenetic conclusions may be obtained in the worst case. Both possibilities have been considered for certain rearrangement problems before. Restricted sets of allowed rearrangements have been used successfully for the computation of parsimonious rearrangement scenarios consisting of inversions only where the gene groups are identified as common intervals (Bérard et al., 2007; Figeac and Varré, 2004). Extending the set of allowed rearrangement operations is a delicate task. On the one hand it is unknown which rearrangements have to be regarded because this is part of the phylogeny to be discovered. On the other hand, efficient exact rearrangement methods including several operations are still rare, in particular when transpositions should be included. For example, the problem to compute shortest rearrangement scenarios including transpositions is still of unknown computational complexity. Currently, only efficient approximation algorithms are known (e.g. Bader and Ohlebusch, 2007; Elias and Hartman, 2006). Two problems have been studied with respect to one or even both of these possibilities in the scope of this work. The first one is the inversion median problem. Given the gene orders of some taxa, this problem asks for potential ancestral gene orders such that the corresponding inversion scenario is parsimonious, i.e. has a minimum length. Solving this problem is an essential component of algorithms for computing phylogenetic trees from gene arrangements (Bourque and Pevzner, 2002; Moret et al., 2002a, 2001). The unconstrained inversion median problem is NP-hard (Caprara, 2003). 
In Chapter 3 the inversion median problem is studied under the additional constraint to preserve gene groups of the input gene orders. Common intervals, i.e. sets of genes that appear consecutively in the gene orders, are used for modelling gene groups. The problem of finding such ancestral gene orders is called the preserving inversion median problem. Already the problem of finding a shortest inversion scenario for two gene orders is NP-hard (Figeac and Varré, 2004). Mitochondrial gene orders are a rich source for phylogenetic investigations because they are known for more than 1 000 species. Four rearrangement operations are reported at least in the literature to be relevant for the study of mitochondrial gene order evolution (Boore, 1999): That is inversions, transpositions, inverse transpositions, and tandem duplication random loss (TDRL). Efficient methods for a plausible reconstruction of genome rearrangements for mitochondrial gene orders using all four operations are presented in Chapter 4. An important rearrangement operation, in particular for the study of mitochondrial gene orders, is the tandem duplication random loss operation (e.g. Boore, 2000; Mauro et al., 2006). This rearrangement duplicates a part of a gene order followed by the random loss of one of the redundant copies of each gene. The gene order is rearranged depending on which copy is lost. This rearrangement should be regarded for reconstructing phylogeny from gene order data. But the properties of this rearrangement operation have rarely been studied (Bouvel and Rossin, 2009; Chaudhuri et al., 2006). The combinatorial properties of the TDRL operation are studied in Chapter 5. The enumeration and counting of sorting TDRLs, that is TDRL operations reducing the distance, is studied in particular. Closed formulas for computing the number of sorting TDRLs and methods for the enumeration are presented. Furthermore, TDRLs are one of the operations considered in Chapter 4. 
An interesting property of this rearrangement, distinguishing it from other rearrangements, is its asymmetry. That is, the effects of a single TDRL cannot (in most cases) be reversed with a single TDRL. The use of this property for phylogeny reconstruction is studied in Section 4.3. This thesis is structured as follows. The existing approaches obeying similar types of modified rearrangement models as well as important concepts and computational methods to related problems are reviewed in Chapter 2. The combinatorial structures of gene orders that have been proposed for identifying gene groups, in particular common intervals, as well as the computational approaches for their computation are reviewed in Section 2.2. Approaches for computing parsimonious pairwise rearrangement scenarios are outlined in Section 2.3. Methods for the computation of genome rearrangement scenarios obeying biologically motivated constraints, as introduced above, are detailed in Section 2.4. The approaches for the inversion median problem are covered in Section 2.5. Methods for the reconstruction of phylogenetic trees from gene arrangement data are briefly outlined in Section 2.6. Chapter 3 introduces the new algorithms CIP, ECIP, and TCIP for solving the preserving inversion median problem. The efficiency of the algorithms is empirically studied for simulated as well as mitochondrial data. The description of algorithms CIP and ECIP is based on Bernt et al. (2006b). TCIP has been described in Bernt et al. (2007a, 2008b). But the theoretical foundation of TCIP is extended significantly within this work in order to allow for more than three input permutations. Gene order rearrangement methods that have been developed for the reconstruction of the phylogeny of mitochondrial gene orders are presented in the fourth chapter. The presented algorithm CREx computes rearrangement scenarios for pairs of gene orders. 
CREx considers the four types of rearrangement operations that are important for mitochondrial gene orders. Based on CREx, the algorithm TreeREx for assigning rearrangement events to a given tree is developed. The quality of the CREx reconstructions is analysed in a large empirical study for simulated gene orders. The results of TreeREx are analysed for several mitochondrial data sets. Algorithms CREx and TreeREx have been published in Bernt et al. (2008a, 2007c). The analysis of the mitochondrial gene orders of Echinodermata was included in Perseke et al. (2008). Additionally, a new and simple method is presented to explore the potential of the CREx method. The new method is applied to the complete mitochondrial data set. The problem of enumerating and counting sorting TDRLs is studied in Chapter 5. The theoretical results are covered to a large extent by Bernt et al. (2009b). The missing combinatorial explanation for some of the presented formulas is given here for the first time. To this end, a new method for the enumeration and counting of sorting TDRLs has been developed (Bernt et al., 2009a).
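
The TDRL operation described in the abstract above can be illustrated with a small sketch. This is not code from the thesis: the function names `tdrl` and `reachable_by_one_tdrl`, the representation of a gene order as a plain Python list of unsigned gene labels, and the brute-force search are all assumptions made for illustration only. The sketch models a TDRL on a segment as keeping the first copy of some genes and the second copy of the remaining ones, and then checks by exhaustive search that one particular TDRL cannot be undone by a single TDRL.

```python
from itertools import combinations


def tdrl(order, start, end, keep_first):
    """Apply one tandem duplication random loss to order[start:end].

    The segment is duplicated in tandem; genes in `keep_first` retain
    their first copy, all other genes of the segment retain their second
    copy. The relative order within each group is unchanged.
    """
    segment = order[start:end]
    first = [g for g in segment if g in keep_first]
    second = [g for g in segment if g not in keep_first]
    return order[:start] + first + second + order[end:]


def reachable_by_one_tdrl(src, dst):
    """Brute-force check (illustrative, exponential in len(src)).

    A TDRL on a sub-segment has the same effect as a TDRL on the whole
    gene order that keeps the first copy of everything before the
    segment and the second copy of everything after it, so it suffices
    to try every subset of genes on the whole gene order.
    """
    n = len(src)
    for k in range(n + 1):
        for kept in combinations(src, k):
            if tdrl(src, 0, n, set(kept)) == dst:
                return True
    return False


identity = [1, 2, 3, 4, 5, 6]
shuffled = tdrl(identity, 0, 6, {2, 4, 6})
print(shuffled)
print(reachable_by_one_tdrl(identity, shuffled))
print(reachable_by_one_tdrl(shuffled, identity))
```

In this example the forward direction takes one TDRL while the reverse direction needs more than one, which is the asymmetry the abstract refers to. The exhaustive search is only practical for very short gene orders.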

    Establishment and maintenance of cell polarity in Myxococcus xanthus

    Cell polarity, the asymmetric distribution of proteins within cellular space, underlies key processes in all cells. Motile polarized cells have a front-rear polarity axis that can change dynamically in response to external signals. The rod-shaped M. xanthus cells move with well-defined front-rear polarity. In response to signaling by the Frz chemosensory system this polarity is inverted, and cells reverse their direction of movement. Front-rear polarity is established by a polarity module consisting of the small GTPase MglA, its cognate GEF RomR/RomX and GAP MglB. All four proteins localize asymmetrically to the cell poles with RomR/RomX and MglB mostly at the lagging pole and MglA mostly at the leading pole. In response to Frz signaling, the four proteins switch poles and front-rear polarity is inverted. We used a combination of quantitative experiments and data-driven theory to uncover the design principles underlying the emergence of polarity in M. xanthus. By studying each of the polarity proteins in isolation, using RomR as a proxy for the RomR/RomX complex, and their effects as we systematically reconstruct the system, using precise in vivo techniques to quantify subcellular protein localization, we deduced the network of effective interactions between the polarity proteins. At the core of this interaction network are two positive feedbacks whereby RomR stimulates its own polar recruitment and RomR and MglB mutually recruit one another to the poles. At the same time, a negative feedback is established through MglA, which is recruited by RomR but inhibits RomR/MglB mutual recruitment. Moreover, we identify the MglC protein as important for the RomR/MglB positive feedback, allowing the GEF/GAP pairing at the lagging pole and the establishment of the asymmetry. Our results further show that continuous cycling of MglA is crucial for the emergence of polarity and in the regulation of polarity switching during reversals. 
Through FRAP experiments and photoactivatable protein fusions, we reveal that MglB, MglC and RomR participate in a tripartite cluster in which turnover is regulated by MglA activity. We rationalize the localization pattern of the GEF and GAP as providing stable asymmetry while remaining responsive and capable of polarity inversions in response to Frz signaling during cellular reversals. Our results have implications not only for the understanding of polarity and motility in M. xanthus but also for dynamic cell polarity more broadly, in bacteria as well as in eukaryotic cells.

    Dynamics of Genome Rearrangement in Bacterial Populations

    Genome structure variation has profound impacts on phenotype in organisms ranging from microbes to humans, yet little is known about how natural selection acts on genome arrangement. Pathogenic bacteria such as Yersinia pestis, which causes bubonic and pneumonic plague, often exhibit a high degree of genomic rearrangement. The recent availability of several Yersinia genomes offers an unprecedented opportunity to study the evolution of genome structure and arrangement. We introduce a set of statistical methods to study patterns of rearrangement in circular chromosomes and apply them to the Yersinia. We constructed a multiple alignment of eight Yersinia genomes using Mauve software to identify 78 conserved segments that are internally free from genome rearrangement. Based on the alignment, we applied Bayesian statistical methods to infer the phylogenetic inversion history of Yersinia. The sampling of genome arrangement reconstructions contains seven parsimonious tree topologies, each having different histories of 79 inversions. Topologies with a greater number of inversions also exist, but were sampled less frequently. The inversion phylogenies agree with results suggested by SNP patterns. We then analyzed reconstructed inversion histories to identify patterns of rearrangement. We confirm an over-representation of “symmetric inversions”—inversions with endpoints that are equally distant from the origin of chromosomal replication. Ancestral genome arrangements demonstrate moderate preference for replichore balance in Yersinia. We found that all inversions are shorter than expected under a neutral model, whereas inversions acting within a single replichore are much shorter than expected. We also found evidence for a canonical configuration of the origin and terminus of replication. Finally, breakpoint reuse analysis reveals that inversions with endpoints proximal to the origin of DNA replication are nearly three times more frequent. 
Our findings represent the first characterization of genome arrangement evolution in a bacterial population evolving outside laboratory conditions. Insight into the process of genomic rearrangement may further the understanding of pathogen population dynamics and selection on the architecture of circular bacterial chromosomes