175 research outputs found

L'affaire du Mediator au prisme de la textométrie

Author: Gambette Philippe
Martinez William
Publication venue: Institut Ferdinand de Saussure
Publication date: 01/01/2013

http://www.revue-texto.net/index.php?id=3318

National audience. Drawing on a corpus of more than 2,000 French press articles covering the Mediator affair, we apply the methods of textual statistics to study trends in vocabulary use, the favored themes, and the discursive strategies deployed over the course of the journalistic coverage of the affair. Objective and exhaustive, lexicometry studies word frequencies to determine how the discourse varies, distinguishing commentary articles from factual texts and contrasting scientific assessments with political opinions and journalistic interpretations. In particular, cooccurrence analysis identifies the conceptual anchors of the corpus, and representing the texts as a tree cloud makes it possible to visualize the networks of words that structure the narrative
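As a rough illustration of the cooccurrence analysis mentioned in this abstract, the sketch below counts how often two distinct words appear in the same text. The function name `cooccurrences`, the whitespace tokenization, and the toy corpus are assumptions made for this example; they are not taken from the article, which may use a different cooccurrence window and statistical weighting.

```python
from collections import Counter
from itertools import combinations


def cooccurrences(texts):
    """Count, for each pair of distinct words, the number of texts containing both.

    Assumption: a "text" is one string and words are split on whitespace.
    """
    pair_counts = Counter()
    for text in texts:
        # Deduplicate and sort so each pair is stored once, in alphabetical order.
        words = sorted(set(text.lower().split()))
        pair_counts.update(combinations(words, 2))
    return pair_counts


# Hypothetical toy corpus, for illustration only.
corpus = [
    "mediator drug withdrawn",
    "mediator drug trial",
    "trial verdict",
]
counts = cooccurrences(corpus)
print(counts.most_common(3))
```

Pairs with high counts are candidates for the "conceptual anchors" the abstract refers to, although a real study would normally normalize raw counts by word frequency.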

Hal-Diderot

HAL-Ecole des Ponts ParisTech

HAL - UPEC / UPEM

Linear-time Constant-ratio Approximation Algorithm and Tight Bounds for the Contiguity of Cographs

Author: Crespelle Christophe
Gambette Philippe
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2013

International audience. In this paper we consider a graph parameter called contiguity, which aims at encoding a graph by a linear ordering of its vertices. We prove that the contiguity of cographs is unbounded but is always dominated by O(log n), where n is the number of vertices of the graph. We also prove that this bound is tight, in the sense that there exists a family of cographs on n vertices whose contiguity is Omega(log n). In addition to these results on the worst-case contiguity of cographs, we design a linear-time constant-ratio approximation algorithm for computing the contiguity of an arbitrary cograph, which constitutes our main result. As a by-product of our proofs, we obtain a min-max theorem, which is of interest in its own right, stating equality between the rank of a tree and the minimum height of its path partitions

HAL-ENS-LYON

Crossref

Hal - Université Grenoble Alpes

INRIA a CCSD electronic archive server

HAL

HAL-Lyon 3

Hal-Diderot

HAL-Ecole des Ponts ParisTech

HAL - UPEC / UPEM

Do branch lengths help to locate a tree in a phylogenetic network?

Author: Gambette Philippe
Kelk Steven
Pardi Fabio
Scornavacca Celine
van Iersel Leo
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2016

Phylogenetic networks are increasingly used in evolutionary biology to represent the history of species that have undergone reticulate events such as horizontal gene transfer, hybrid speciation and recombination. One of the most fundamental questions that arise in this context is whether the evolution of a gene with one copy in all species can be explained by a given network. In mathematical terms, this is often translated in the following way: is a given phylogenetic tree contained in a given phylogenetic network? Recently this tree containment problem has been widely investigated from a computational perspective, but most studies have only focused on the topology of the phylogenies, ignoring a piece of information that, in the case of phylogenetic trees, is routinely inferred by evolutionary analyses: branch lengths. These measure the amount of change (e.g., nucleotide substitutions) that has occurred along each branch of the phylogeny. Here, we study a number of versions of the tree containment problem that explicitly account for branch lengths. We show that, although length information has the potential to locate more precisely a tree within a network, the problem is computationally hard in its most general form. On a positive note, for a number of special cases of biological relevance, we provide algorithms that solve this problem efficiently. This includes the case of networks of limited complexity, for which it is possible to recover, among the trees contained by the network with the same topology as the input tree, the closest one in terms of branch lengths

arXiv.org e-Print Archive

Maastricht University Research Portal

Crossref

INRIA a CCSD electronic archive server

HAL-IRD

HAL-CIRAD

Hal-Diderot

HAL-Ecole des Ponts ParisTech

HAL - UPEC / UPEM

(Nearly-)tight bounds on the contiguity and linearity of cographs

Author: Crespelle Christophe
Gambette Philippe
Publication venue: 'Elsevier BV'
Publication date: 01/01/2014

International audience. In this paper we show that the contiguity and linearity of cographs on n vertices are both O(log n). Moreover, we show that this bound is tight for contiguity, as there exists a family of cographs on n vertices whose contiguity is Ω(log n). We also provide an Ω(log n / log log n) lower bound on the maximum linearity of cographs on n vertices. As a by-product of our proofs, we obtain a min-max theorem, which is of interest in its own right, stating equality between the rank of a tree and the minimum height of one of its path partitions

HAL-ENS-LYON

Crossref

Hal - Université Grenoble Alpes

INRIA a CCSD electronic archive server

HAL

HAL-Lyon 3

Hal-Diderot

HAL-Ecole des Ponts ParisTech

HAL - UPEC / UPEM

ModClust: a Cytoscape plugin for modularity-based clustering of networks

Author: Gambette Philippe
Guénoche Alain
Tichit Laurent
Publication venue: HAL CCSD
Publication date: 19/10/2011

National audience. Large networks such as protein-protein interaction networks are usually extremely difficult to understand as a whole. We developed ModClust, a Cytoscape plugin for modularity-based clustering of large networks. The aim of this plugin is first to establish classes with a high density of edges. It also makes it possible to understand the relations between these classes and how they are assembled within the whole graph. It can be used to predict new protein functions. It implements two novel algorithms: FT and TFit. Their results are compared both on random graphs and on benchmarks where the optimal partition is known
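To make "modularity-based clustering" concrete, here is a minimal sketch that scores a given partition of an undirected, unweighted graph with the standard Newman-Girvan modularity. This is an assumption about the criterion: the abstract does not state which modularity variant ModClust optimizes, and the FT and TFit algorithms themselves are not reproduced here. The function name `modularity` and the example graph are illustrative only.

```python
def modularity(edges, partition):
    """Newman-Girvan modularity of a partition of an undirected simple graph.

    edges: list of (u, v) pairs, each undirected edge listed once.
    partition: list of disjoint sets of vertices covering all endpoints.
    """
    m = len(edges)
    class_of = {v: i for i, cls in enumerate(partition) for v in cls}
    inside = [0] * len(partition)   # edges with both endpoints in class i
    degree = [0] * len(partition)   # sum of vertex degrees in class i
    for u, v in edges:
        degree[class_of[u]] += 1
        degree[class_of[v]] += 1
        if class_of[u] == class_of[v]:
            inside[class_of[u]] += 1
    return sum(
        inside[i] / m - (degree[i] / (2 * m)) ** 2
        for i in range(len(partition))
    )


# Two triangles joined by a single edge (hypothetical example graph).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
q_split = modularity(edges, [{0, 1, 2}, {3, 4, 5}])
q_whole = modularity(edges, [{0, 1, 2, 3, 4, 5}])
print(q_split, q_whole)
```

Splitting the graph into its two triangles scores higher than keeping it as a single class, which is the behavior a modularity-optimizing method relies on.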

HAL AMU

HAL Descartes

Hal-Diderot

HAL-Ecole des Ponts ParisTech

HAL - UPEC / UPEM

iPhocomp : calcul automatique de l’indice de complexité phonétique de Jakielski

Author: Barkat-Defradas Melissa
Gambette Philippe
Lee Hyeran
Publication venue: HAL CCSD
Publication date: 23/06/2014

National audience. The index of phonetic complexity was introduced by Jakielski in order to estimate how difficult certain English words are to pronounce. Comparative studies of several language pathologies, in several languages, have observed distinct behaviors among the studied subjects when faced with words of varying phonetic complexity indices. We provide a web interface, iPhocomp, based on a phonetic dictionary, to automatically estimate the phonetic complexity index of a set of words in French. We compare human computation of this index with the automatic computation provided by iPhocomp in order to estimate the performance of the interface

Hal-Diderot

HAL-Ecole des Ponts ParisTech

HAL - UPEC / UPEM

On Distances Between Words with Parameters

Author: Bourhis Pierre
Boussidan Aaron
Gambette Philippe
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 34th Annual Symposium on Combinatorial Pattern Matching (CPM 2023)
Publication date: 01/01/2023

The edit distance between parameterized words is a generalization of the classical edit distance where it is allowed to map particular letters of the first word, called parameters, to parameters of the second word before computing the distance. This problem was introduced in particular for the detection of code duplication, and the notion of words with parameters has also been used with different semantics in other fields. The complexity of several variants of edit distances between parameterized words has been studied; however, the complexity of the most natural one, the Levenshtein distance, remained open. In this paper, we solve this open question and close the exhaustive analysis of all cases of parameterized word matching and function matching, showing that these problems are NP-complete. To this aim, we also provide a comparison of the different problems, exhibiting several equivalences between them. We also provide and implement a MaxSAT encoding of the problem, as well as a simple FPT algorithm in the alphabet size, and study their efficiency on real data in the context of theater play structure comparison
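The definition in this abstract can be illustrated with a brute-force sketch: try every renaming of the first word's parameters, compute the classical Levenshtein distance each time, and keep the minimum. The sketch assumes the renaming is an injection from the parameters of the first word into those of the second and that parameters are disjoint from ordinary letters; the paper studies several variants, and this is neither its MaxSAT encoding nor its FPT algorithm. The names `levenshtein` and `parameterized_distance` are chosen for this example.

```python
from itertools import permutations


def levenshtein(a, b):
    """Classical edit distance (insertions, deletions, substitutions cost 1)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # delete x
                           cur[j - 1] + 1,           # insert y
                           prev[j - 1] + (x != y)))  # substitute or match
        prev = cur
    return prev[-1]


def parameterized_distance(u, v, params_u, params_v):
    """Minimum Levenshtein distance over injective renamings of u's parameters.

    Assumption: each parameter of u is sent to a distinct parameter of v.
    The search is exponential in the number of parameters.
    """
    pu = sorted(set(u) & set(params_u))
    pv = sorted(set(params_v))
    if len(pu) > len(pv):
        raise ValueError("no injective renaming exists")
    best = None
    for image in permutations(pv, len(pu)):
        mapping = dict(zip(pu, image))
        renamed = [mapping.get(c, c) for c in u]
        d = levenshtein(renamed, list(v))
        best = d if best is None else min(best, d)
    return best


print(parameterized_distance("xayb", "paqb", {"x", "y"}, {"p", "q"}))
print(parameterized_distance("xax", "paq", {"x"}, {"p", "q"}))
```

The exhaustive loop over renamings is only practical for a handful of parameters, which is consistent with the hardness result stated in the abstract.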

DROPS Dagstuhl Research Online Publication Server

On the challenge of reconstructing level-1 phylogenetic networks from triplets and clusters

Author: Gambette Philippe
Huber K. T.
Kelk S.
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2016

Phylogenetic networks have gained prominence over the years due to their ability to represent complex non-treelike evolutionary events such as recombination or hybridization. Popular combinatorial objects used to construct them are triplet systems and cluster systems, the motivation being that any network $N$ induces a triplet system $\mathcal R(N)$ and a softwired cluster system $\mathcal S(N)$. Since in real-world studies it cannot be guaranteed that all triplets/softwired clusters induced by a network are available, it is of particular interest to understand whether subsets of $\mathcal R(N)$ or $\mathcal S(N)$ allow one to uniquely reconstruct the underlying network $N$. Here we show that even within the highly restricted yet biologically interesting space of level-1 phylogenetic networks it is not always possible to uniquely reconstruct a level-1 network $N$, even when all triplets in $\mathcal R(N)$ or all clusters in $\mathcal S(N)$ are available. On the positive side, we introduce a reasonably large subclass of level-1 networks the members of which are uniquely determined by their induced triplet/softwired cluster systems. Along the way, we also establish various enumerative results, both positive and negative, including results which show that certain special subclasses of level-1 networks $N$ can be uniquely reconstructed from proper subsets of $\mathcal R(N)$ and $\mathcal S(N)$. We anticipate these results to be of use in the design of algorithms for phylogenetic network inference

arXiv.org e-Print Archive

Maastricht University Research Portal

Crossref

Springer - Publisher Connector

University of East Anglia digital repository

Hal-Diderot

HAL-Ecole des Ponts ParisTech

HAL - UPEC / UPEM

Trinets encode tree-child and level-2 phylogenetic networks

Author: A Dress
AV Aho
C Scornavacca
D Morrison
DH Huson
G Cardona
G Jin
J Byrka
J Fischer
J Jansson
KT Huber
Leo van Iersel
LJJ Iersel van
P Gambette
SJ Willson
Vincent Moulton
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2014

Phylogenetic networks generalize evolutionary trees, and are commonly used to represent evolutionary histories of species that undergo reticulate evolutionary processes such as hybridization, recombination and lateral gene transfer. Recently, there has been great interest in trying to develop methods to construct rooted phylogenetic networks from triplets, that is, rooted trees on three species. However, although triplets determine or encode rooted phylogenetic trees, they do not in general encode rooted phylogenetic networks, which is a potential issue for any such method. Motivated by this fact, Huber and Moulton recently introduced trinets as a natural extension of rooted triplets to networks. In particular, they showed that level-1 phylogenetic networks are encoded by their trinets, and also conjectured that all “recoverable” rooted phylogenetic networks are encoded by their trinets. Here we prove that recoverable binary level-2 networks and binary tree-child networks are also encoded by their trinets. To do this we prove two decomposition theorems based on trinets which hold for all recoverable binary rooted phylogenetic networks. Our results provide some additional evidence in support of the conjecture that trinets encode all recoverable rooted phylogenetic networks, and could also lead to new approaches to construct phylogenetic networks from trinets
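For readers unfamiliar with rooted triplets, the sketch below enumerates the triplets displayed by a rooted binary tree, which is the tree case the abstract describes as "encoded" by triplets. The nested-tuple tree representation and the function names `leaves` and `triplets` are assumptions made for this example; trinets and networks are not handled.

```python
from itertools import combinations


def leaves(tree):
    """Leaf labels of a tree given as a string (leaf) or a pair of subtrees."""
    if isinstance(tree, str):
        return [tree]
    return [x for child in tree for x in leaves(child)]


def triplets(tree):
    """Rooted triplets ab|c of a binary tree, as (frozenset({a, b}), c) pairs.

    ab|c means a and b share a common ancestor that is not an ancestor of c.
    """
    if isinstance(tree, str):
        return set()
    found = set()
    left, right = tree
    for side, other in ((left, right), (right, left)):
        # Two leaves on one side of this node, one leaf on the other side.
        for a, b in combinations(leaves(side), 2):
            for c in leaves(other):
                found.add((frozenset((a, b)), c))
        found |= triplets(side)
    return found


# Hypothetical caterpillar tree on four species.
tree = ((("a", "b"), "c"), "d")
result = triplets(tree)
for pair, outgroup in sorted((sorted(p), c) for p, c in result):
    print("".join(pair) + "|" + outgroup)
```

A binary tree on n leaves displays exactly one triplet for each set of three leaves, so the example above yields four triplets.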

Crossref

CWI's Institutional Repository

University of East Anglia digital repository