11 research outputs found

    Weighted Minimum-Length Rearrangement Scenarios

    We present the first known model of genome rearrangement with an arbitrary real-valued weight function on the rearrangements. It is based on the dominant model for the mathematical and algorithmic study of genome rearrangement, Double Cut and Join (DCJ). Our objective function is the sum or product of the weights of the DCJs in an evolutionary scenario, and the function can be minimized or maximized. If, for example, the likelihood of observing an independent DCJ were estimated from biological conditions, then this objective function could be the likelihood of observing the independent DCJs together in a scenario. We present an O(n^4)-time dynamic programming algorithm solving the Minimum Cost Parsimonious Scenario (MCPS) problem for co-tailed genomes with n genes (or syntenic blocks). Combining this with our previous work on MCPS yields a polynomial-time algorithm for general genomes. The key theoretical contribution is a novel link between the parsimonious DCJ (or 2-break) scenarios and quadrangulations of a regular polygon. To demonstrate that our algorithm is fast enough to treat biological data, we run it on syntenic blocks constructed for Human paired with Chimpanzee, Gibbon, Mouse, and Chicken. We argue that the Human and Gibbon pair is a particularly interesting model for the study of weighted genome rearrangements.
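
    The objective described in this abstract (sum or product of per-DCJ weights, minimized or maximized) can be illustrated with a minimal sketch. It is not the O(n^4) dynamic program from the paper: it simply scores candidate scenarios that are already given as lists of weights. The names `scenario_score` and `best_scenario`, and the example weights, are hypothetical.

```python
import math
from typing import List, Sequence


def scenario_score(weights: Sequence[float], combine: str = "sum") -> float:
    """Score one scenario, given the weight of each DCJ it contains.

    `combine` is an illustrative parameter: "sum" adds the weights,
    "product" multiplies them (e.g. independent likelihoods).
    """
    if combine == "sum":
        return math.fsum(weights)
    if combine == "product":
        return math.prod(weights)
    raise ValueError(f"unknown combine mode: {combine!r}")


def best_scenario(
    scenarios: Sequence[Sequence[float]],
    combine: str = "sum",
    maximize: bool = False,
) -> List[float]:
    """Pick the best scenario by brute force over an explicit candidate list."""
    pick = max if maximize else min
    return list(pick(scenarios, key=lambda ws: scenario_score(ws, combine)))


# Two hypothetical scenarios of equal length, weights read as likelihoods.
candidates = [[0.5, 0.2], [0.4, 0.4]]
most_likely = best_scenario(candidates, combine="product", maximize=True)
cheapest = best_scenario(candidates, combine="sum", maximize=False)
```

    Treating the weights as likelihoods calls for maximizing a product, while treating them as costs calls for minimizing a sum; taking negative logarithms converts the first setting into the second.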

    (Re)introducing regular graph languages


    Scénarios évolutifs pondérés de réarrangements génomiques

    Recent advances in sequencing technologies revealed the ubiquity of genome rearrangements between each and every one of us. These large-scale mutations rearrange segments of chromosomes and have a profound impact on genetic variation, disease, and evolution. The study of the consequences of rearrangements, along with their molecular mechanisms, is however still in its infancy. Given extant genomes, we are interested in tracing back the evolutionary rearrangement scenarios that transformed their least common ancestor into the genomes that we observe today. This not only helps to reveal evolutionary relationships between organisms, but also provides a window for the study of genome rearrangements themselves. The central computational problem in this subfield of comparative genomics is that of finding optimal rearrangement scenarios transforming one genome into another. Historically all rearrangements were treated as being equally possible, and optimal scenarios were those that contained the minimum number of rearrangements. Recent advances in biology, however, allow us to devise much more sophisticated models. We present a short survey of the existing work on using biological constraints for genome rearrangements, and argue that a much more flexible approach is necessary to accompany the influx of newly available biological data. In this work we propose an extremely general framework for genome rearrangements with biological constraints. Our main contribution is a polynomial-time algorithm that, for an arbitrary cost function, finds a minimum cost scenario among those of minimum length. Along the way we establish a number of novel links between sorting genomes with double cut and join rearrangements, sorting graphs with 2-breaks or edge swaps, sorting permutations with mathematical transpositions, sorting strings with interchanges, and token swapping on graphs.
    A genome rearrangement is a mutation that modifies the structure of the chromosomes, or even their number, in a genome. Besides chromosome fusions and fissions, these rearrangements include deletions, insertions, and inversions of chromosomal segments. The ends of two different chromosomes can also be exchanged in a translocation. Together, these mutations constitute an evolutionary rearrangement scenario between species. We were interested in reconstructing rearrangement scenarios between animal species. Our project combines mathematical and algorithmic tools with the current biological understanding of genome rearrangements. From a biological point of view, our goal is to link genetics and epigenetics to rearrangements in both directions: (1) we develop a methodology for studying genetic and epigenetic features associated with rearrangements, and (2) conversely, for finding rearrangement scenarios guided by such genetic and epigenetic features. The main contribution of this thesis is the following. We present a framework, built on the double cut and join rearrangement model, with arbitrary weights. In this framework a minimum-weight scenario can be found in polynomial time among the minimum-length scenarios for two genomes with identical gene content and no duplicates. In addition, we establish a number of new correspondences between various sorting problems. These problems include sorting genomes with Double Cut and Join rearrangements, sorting graphs with 2-breaks or edge swaps, sorting permutations with transpositions, sorting strings with interchanges, and token swapping on graphs.
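
    Of the sorting problems listed in this abstract, sorting a permutation with transpositions (exchanges of two elements) has a particularly compact solution: the minimum number of exchanges is the number of elements minus the number of cycles of the permutation. The sketch below illustrates only this classical fact, not the weighted framework of the thesis; the function name `transposition_distance` and the 0-based encoding are assumptions made for the example.

```python
from typing import Sequence


def transposition_distance(perm: Sequence[int]) -> int:
    """Minimum number of two-element exchanges needed to sort `perm`.

    Assumes `perm` is a permutation of 0..n-1 (illustrative encoding).
    Uses the classical identity: distance = n - (number of cycles).
    """
    n = len(perm)
    seen = [False] * n
    cycles = 0
    for start in range(n):
        if seen[start]:
            continue
        cycles += 1
        j = start
        # Walk the cycle containing `start`, marking each element once.
        while not seen[j]:
            seen[j] = True
            j = perm[j]
    return n - cycles


identity_cost = transposition_distance([0, 1, 2])
one_swap_cost = transposition_distance([1, 0, 2])
three_cycle_cost = transposition_distance([1, 2, 0])
```

    Each exchange either splits one cycle into two or merges two into one, so a sorted permutation (n cycles of length one) cannot be reached in fewer than n minus the initial cycle count.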

    Finding local genome rearrangements

    The double cut and join (DCJ) model of genome rearrangement is well studied due to its mathematical simplicity and power to account for the many events that transform gene order. These studies have mostly been devoted to the understanding of minimum length scenarios transforming one genome into another. In this paper we search instead for rearrangement scenarios that minimize the number of rearrangements whose breakpoints are unlikely due to some biological criteria. One such criterion has recently become accessible due to the advent of the Hi-C experiment, facilitating the study of 3D spatial distance between breakpoint regions.

    A General Framework for Genome Rearrangement with Biological Constraints

    This paper generalizes previous studies on genome rearrangement under biological constraints, using double cut and join (DCJ). We propose a model for weighted DCJ, along with a family of optimization problems called ϕ-MCPS (Minimum Cost Parsimonious Scenario), that are based on labeled graphs. We show how to compute solutions to general instances of ϕ-MCPS, given an algorithm to compute ϕ-MCPS on a circular genome with exactly one occurrence of each gene. These general instances can have an arbitrary number of circular and linear chromosomes, and arbitrary gene content. The practicality of the framework is displayed by presenting polynomial-time algorithms that generalize the results of Bulteau, Fertin, and Tannier on the Sorting by wDCJs and Indels in Intergenes problem, and that generalize previous results on the Minimum Local Parsimonious Scenario problem.

    Models and algorithms for genome rearrangement with positional constraints

    Background: Traditionally, the merit of a rearrangement scenario between two gene orders has been measured based on a parsimony criterion alone; two scenarios with the same number of rearrangements are considered equally good. In this paper, we acknowledge that each rearrangement has a certain likelihood of occurring based on biological constraints, e.g. physical proximity of the DNA segments implicated or repetitive sequences. Results: We propose optimization problems with the objective of maximizing overall likelihood, by weighting the rearrangements. We study a binary weight function suitable to the representation of sets of genome positions that are most likely to have swapped adjacencies. We give a polynomial-time algorithm for the problem of finding a minimum weight double cut and join scenario among all minimum length scenarios. In the process we solve an optimization problem on colored noncrossing partitions, which is a generalization of the Maximum Independent Set problem on circle graphs. Conclusions: We introduce a model for weighting genome rearrangements and show that under simple yet reasonable conditions, a fundamental distance can be computed in polynomial time. This is achieved by solving a generalization of the Maximum Independent Set problem on circle graphs. Several variants of the problem are also mentioned.
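
    For orientation, the classical problem that this abstract generalizes can be sketched directly. An independent set in a circle graph is a set of pairwise noncrossing chords, and a largest one can be found by dynamic programming over intervals of chord endpoints. The code below covers only this unweighted, uncolored base problem, not the colored noncrossing partition problem solved in the paper; the function name `max_noncrossing_chords` and the input encoding (endpoints numbered 0..2n-1 around the circle, each used by exactly one chord) are assumptions.

```python
from typing import List, Optional, Sequence, Tuple


def max_noncrossing_chords(chords: Sequence[Tuple[int, int]]) -> int:
    """Size of a maximum independent set in the circle graph of `chords`.

    Assumes endpoints are labeled 0..2n-1 in circular order and that every
    label is the endpoint of exactly one chord (illustrative encoding).
    """
    m = 2 * len(chords)
    partner: List[Optional[int]] = [None] * m
    for a, b in chords:
        partner[a], partner[b] = b, a
    if any(p is None for p in partner):
        raise ValueError("every endpoint 0..2n-1 must belong to one chord")

    # best[i][j]: most pairwise noncrossing chords with both endpoints in i..j
    best = [[0] * m for _ in range(m)]
    for length in range(1, m):
        for i in range(m - length):
            j = i + length
            value = best[i][j - 1]          # option 1: skip endpoint j
            k = partner[j]
            if i <= k < j:                  # option 2: take chord (k, j)
                left = best[i][k - 1] if k > i else 0
                inside = best[k + 1][j - 1] if k + 1 <= j - 1 else 0
                value = max(value, left + 1 + inside)
            best[i][j] = value
    return best[0][m - 1] if m else 0


crossing = max_noncrossing_chords([(0, 2), (1, 3)])
nested = max_noncrossing_chords([(0, 3), (1, 2)])
all_crossing = max_noncrossing_chords([(0, 3), (1, 4), (2, 5)])
```

    Taking the chord (k, j) splits the interval into the part before k and the part strictly between k and j, and no chord kept in one part can cross a chord kept in the other, which is why the two subproblems can be solved independently.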

    Rearrangement Scenarios Guided by Chromatin Structure

    Genome architecture can be drastically modified through a succession of large-scale rearrangements. In the quest to infer these rearrangement scenarios, it is often the case that the parsimony principle alone does not impose enough constraints. In this paper we make an initial effort towards computing scenarios that respect chromosome conformation, by using Hi-C data to guide our computations. We confirm the validity of a model, along with the optimization problems Minimum Local Scenario and Minimum Local Parsimonious Scenario, developed in previous work that is based on a partition into equivalence classes of the adjacencies between syntenic blocks. To accomplish this we show that the quality of a clustering of the adjacencies based on Hi-C data is directly correlated to the quality of a rearrangement scenario that we compute between Drosophila melanogaster and Drosophila yakuba. We evaluate a simple greedy strategy to choose the next rearrangement based on Hi-C, and motivate the study of the solution space of Minimum Local Parsimonious Scenario.

    Co-evolution of AR gene copy number and structural complexity in endocrine therapy resistant prostate cancer

    Androgen receptor (AR) inhibition is standard of care for advanced prostate cancer (PC). However, efficacy is limited by progression to castration-resistant PC (CRPC), usually due to AR re-activation via mechanisms that include AR amplification and structural rearrangement. These two classes of AR alterations often co-occur in CRPC tumors, but it is unclear whether this reflects intercellular or intracellular heterogeneity of AR. Resolving this is important for developing new therapies and predictive biomarkers. Here, we analyzed 41 CRPC tumors and 6 patient-derived xenografts (PDXs) using linked-read DNA-sequencing, and identified 7 tumors that developed complex, multiply-rearranged AR gene structures in conjunction with very high AR copy number. Analysis of PDX models by optical genome mapping and fluorescence in situ hybridization showed that AR residing on extrachromosomal DNA (ecDNA) was an underlying mechanism, and was associated with elevated levels and diversity of AR expression. This study identifies co-evolution of AR gene copy number and structural complexity via ecDNA as a mechanism associated with endocrine therapy resistance.