
22 research outputs found

On the distribution of the number of cycles in the breakpoint graph of a random signed permutation

Author: Grusea Simona
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/10/2011

We use the finite Markov chain embedding technique to obtain the distribution of the number of cycles in the breakpoint graph of a uniform random signed permutation. This in turn gives a very good approximation of the distribution of the reversal distance between two random genomes.
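As a rough illustration of the quantity studied in this abstract, the Python sketch below counts the alternating cycles of the breakpoint graph of a signed permutation and estimates their distribution by plain Monte Carlo sampling. This is not the finite Markov chain embedding method of the paper. It assumes the standard doubling construction (+i becomes 2i-1, 2i and -i becomes 2i, 2i-1, framed by 0 and 2n+1), and the function names, sample size, and seed are illustrative choices.

```python
import random
from collections import Counter


def breakpoint_graph_cycles(perm):
    """Count alternating cycles in the breakpoint graph of a signed permutation.

    `perm` is a list such as [3, -1, 2] containing each of 1..n once, with a sign.
    """
    n = len(perm)
    # Doubling construction: +i -> (2i-1, 2i), -i -> (2i, 2i-1), framed by 0 and 2n+1.
    seq = [0]
    for x in perm:
        a = abs(x)
        seq.extend([2 * a - 1, 2 * a] if x > 0 else [2 * a, 2 * a - 1])
    seq.append(2 * n + 1)

    # Black edges join the elements at positions 2i and 2i+1 of the doubled sequence.
    black = {}
    for i in range(0, 2 * n + 2, 2):
        u, v = seq[i], seq[i + 1]
        black[u], black[v] = v, u

    # Gray edges join the values 2i and 2i+1, i.e. v and v ^ 1.
    seen = set()
    cycles = 0
    for start in range(2 * n + 2):
        if start in seen:
            continue
        cycles += 1
        v = start
        while v not in seen:
            seen.add(v)
            w = black[v]   # follow the black edge
            seen.add(w)
            v = w ^ 1      # then follow the gray edge
    return cycles


def empirical_cycle_distribution(n, samples=10_000, seed=0):
    """Monte Carlo estimate of the cycle-count distribution for uniform signed permutations."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(samples):
        perm = list(range(1, n + 1))
        rng.shuffle(perm)
        signed = [x if rng.random() < 0.5 else -x for x in perm]
        counts[breakpoint_graph_cycles(signed)] += 1
    return {c: counts[c] / samples for c in sorted(counts)}
```

If c is the number of cycles, n + 1 - c is the classical lower bound on the reversal distance, which is why the cycle distribution is informative about the distance distribution.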

Scientific Publications of the University of Toulouse II Le Mirail

HAL AMU

HAL-INSA Toulouse

Prediction in high dimensional linear models and application to genomic selection under imperfect linkage disequilibrium

Author: Grusea Simona
Rabier Charles-Elie
Publication venue: HAL CCSD
Publication date: 24/11/2019

Genomic selection (GS) consists in predicting the breeding values of selection candidates using a large number of genetic markers. An important question in GS is how many markers are required for a good prediction. When the genetic map is too sparse, one is likely to observe imperfect linkage disequilibrium: the alleles at a gene location and at a nearby marker differ. We tackle here the problem of imperfect linkage disequilibrium and present theoretical results on the accuracy criterion, that is, the correlation between the predicted value and the true value. Illustrations on simulated data and on real rice data are provided.

Compound Poisson Approximation and Testing for Gene Clusters with Multigene Families

Author: Chabrol Olivier
Grusea Simona
Pardoux Etienne
Pontarotti Pierre
Publication venue: 'Mary Ann Liebert Inc'
Publication date: 01/01/2011

We present in this article a compound Poisson approximation for computing probabilities involved in significance tests for conserved genomic regions between different species. We consider the case where the conserved genomic regions are found by the reference region approach. An important aspect of our computations is that we take into account the existence of multigene families. We obtain convergence results for the error of our approximation by using the Stein-Chen method for compound Poisson approximation.

Scientific Publications of the University of Toulouse II Le Mirail

HAL AMU

HAL-INSA Toulouse

The distribution of cycles in breakpoint graphs of signed permutations

Author: Anthony Labarre
Simona Grusea
Publication venue: 'Elsevier BV'
Publication date: 01/01/2012

Breakpoint graphs are ubiquitous structures in the field of genome rearrangements. Their cycle decomposition has proved useful in computing and bounding many measures of (dis)similarity between genomes, and studying the distribution of those cycles is therefore critical to gaining insight on the distributions of the genomic distances that rely on it. We extend here the work initiated by Doignon and Labarre, who enumerated unsigned permutations whose breakpoint graph contains k cycles, to signed permutations, and prove explicit formulas for computing the expected value and the variance of the corresponding distributions, both in the unsigned case and in the signed case. We also compare these distributions to those of several well-studied distances, emphasising the cases where approximations obtained in this way stand out. Finally, we show how our results can be used to derive simpler proofs of other previously known results.

arXiv.org e-Print Archive

CiteSeerX

Crossref

Scientific Publications of the University of Toulouse II Le Mirail

HAL Descartes

HAL-INSA Toulouse

Hal-Diderot

HAL-Ecole des Ponts ParisTech

HAL - UPEC / UPEM

CASSIOPE: An expert system for conserved regions searches

Author: Chabrol Olivier
Danchin Etienne GJ
Gouret Philippe
Grusea Simona
Levasseur Anthony
Pontarotti Pierre
Rascol Virginie Lopez
Publication venue: BioMed Central
Publication date: 01/01/2009

Background: Understanding genome evolution provides insight into biological mechanisms. For many years comparative genomics and analysis of conserved chromosomal regions have helped to unravel the mechanisms involved in genome evolution and their implications for the study of biological systems. Detection of conserved regions (descending from a common ancestor) not only helps clarify genome evolution but also makes it possible to identify quantitative trait loci (QTLs) and investigate gene function. The identification and comparison of conserved regions on a genome scale is computationally intensive, making process automation essential. Three key requirements are necessary: consideration of phylogeny to identify orthologs between multiple species, frequent updating of the annotation and panel of compared genomes, and computation of statistical tests to assess the significance of identified conserved gene clusters.
Results: We developed a modular system superimposed on a multi-agent framework, called CASSIOPE (Clever Agent System for Synteny Inheritance and Other Phenomena in Evolution). CASSIOPE automatically identifies statistically significant conserved regions between multiple genomes based on automated phylogenies and statistical testing. Conserved regions were searched for in 19 species and 1,561 hits were found. To our knowledge, CASSIOPE is the first system to date that integrates evolutionary biology-based concepts and fulfills all three key requirements stated above. All results are available at http://194.57.197.245/cassiopeWeb/displayCluster?clusterId=1
Conclusion: CASSIOPE makes it possible to study conserved regions from a chosen query genetic region and to infer conserved gene clusters based on phylogenies and statistical tests assessing the significance of these conserved regions. Source code is freely available, please contact: [email protected]

Crossref

Springer - Publisher Connector

HAL AMU

Directory of Open Access Journals

PubMed Central

HAL Descartes

ProdInra

Measures for the exceptionality of gene order in conserved genomic regions

Author: Simona Grusea
Publication venue: 'Elsevier BV'
Publication date: 01/09/2010

We propose in this article three measures for quantifying the exceptionality of gene order in conserved genomic regions found by the reference region approach. The three measures are based on the transposition distance in the permutation group. We obtain analytic expressions for their distribution in the case of a uniform random permutation, i.e. under the null hypothesis of random gene order. Our results can be used to increase the power of significance tests for gene clusters that take into account only the proximity of the orthologous genes and not their order.
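To make the null distribution mentioned above concrete, here is a minimal Python sketch. It assumes that "transposition distance" means the minimum number of exchanges of two elements needed to sort a permutation, which equals n minus the number of cycles. Under that assumption its distribution for a uniform random permutation follows from the unsigned Stirling numbers of the first kind. It does not reproduce the article's three measures, and the function names are illustrative.

```python
from fractions import Fraction
from math import factorial


def cycle_count(perm):
    """Number of cycles of a permutation given as a 0-based list, e.g. [1, 0, 2]."""
    seen = [False] * len(perm)
    cycles = 0
    for start in range(len(perm)):
        if seen[start]:
            continue
        cycles += 1
        j = start
        while not seen[j]:
            seen[j] = True
            j = perm[j]
    return cycles


def transposition_distance(perm):
    """Minimum number of two-element exchanges needed to sort `perm` (assumed definition)."""
    return len(perm) - cycle_count(perm)


def transposition_distance_distribution(n):
    """Exact law of the distance for a uniform random permutation of size n.

    Uses the recurrence c(m, k) = c(m-1, k-1) + (m-1) * c(m-1, k) for the
    unsigned Stirling numbers of the first kind (permutations of m with k cycles).
    """
    row = [1]  # c(0, 0) = 1
    for m in range(1, n + 1):
        new = [0] * (m + 1)
        for k in range(m + 1):
            from_new_cycle = row[k - 1] if k >= 1 else 0
            from_insertion = row[k] if k < len(row) else 0
            new[k] = from_new_cycle + (m - 1) * from_insertion
        row = new
    total = factorial(n)
    # A permutation with k cycles is at distance n - k from the identity.
    return {n - k: Fraction(row[k], total) for k in range(1, n + 1)}
```

For n = 3 this gives probabilities 1/6, 1/2 and 1/3 for distances 0, 1 and 2.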

Crossref

Scientific Publications of the University of Toulouse II Le Mirail

HAL AMU

HAL-INSA Toulouse

Applications du calcul des probabilités à la recherche de régions génomiques conservées

Author: Grusea Simona
Publication venue: HAL CCSD
Publication date: 03/12/2008

This thesis focuses on some probability and statistics questions related to comparative genomics. In the first part we present a compound Poisson approximation for computing probabilities involved in significance tests for conserved genomic regions found by the reference-region approach. An important aspect of our computations is that we take into account the existence of multigene families. In the second part we propose three measures, based on the transposition distance in the symmetric group, for quantifying the exceptionality of the gene order in conserved genomic regions. We obtain analytic expressions for their distribution in the case of a random permutation. In the third part of the thesis we study the distribution of the number of cycles in the breakpoint graph of a random signed permutation. We use the Markov chain imbedding technique to obtain this distribution in terms of a product of transition matrices of a certain finite Markov chain. Knowledge of this distribution then provides a very good approximation for the distribution of the reversal distance.

Thèses en Ligne

HAL AMU

Applications du calcul des probabilités à la recherche de régions génomiques conservées

Author: GRUSEA Simona
PARDOUX Étienne
Publication venue
Publication date: 01/01/2008

This thesis focuses on some probability and statistics questions related to comparative genomics. In the first part we present a compound Poisson approximation for computing probabilities involved in significance tests for conserved genomic regions found by the reference-region approach. An important aspect of our computations is that we take into account the existence of multigene families. In the second part we propose three measures, based on the transposition distance in the symmetric group, for quantifying the exceptionality of the gene order in conserved genomic regions. We obtain analytic expressions for their distribution in the case of a random permutation. In the third part of the thesis we study the distribution of the number of cycles in the breakpoint graph of a random signed permutation. We use the Markov chain imbedding technique to obtain this distribution in terms of a product of transition matrices of a certain finite Markov chain. Knowledge of this distribution then provides a very good approximation for the distribution of the reversal distance.

OpenGrey Repository