Fully polynomial FPT algorithms for some classes of bounded clique-width graphs
Parameterized complexity theory has enabled a refined classification of the
difficulty of NP-hard optimization problems on graphs with respect to key
structural properties, and thus a better understanding of their true
difficulties. More recently, hardness results for problems in P were achieved
using reasonable complexity-theoretic assumptions such as the Strong Exponential
Time Hypothesis (SETH), the 3SUM conjecture and the All-Pairs Shortest-Paths (APSP) conjecture. According to
these assumptions, many graph theoretic problems do not admit truly
subquadratic algorithms, nor even truly subcubic algorithms (Williams and
Williams, FOCS 2010 and Abboud, Grandoni, Williams, SODA 2015). A central
technique used to tackle the difficulty of the above-mentioned problems is
the design of fixed-parameter algorithms for polynomial-time problems with
polynomial dependency on the fixed parameter (P-FPT). This technique was introduced by
Abboud, Williams and Wang in SODA 2016 and continued by Husfeldt (IPEC 2016)
and Fomin et al. (SODA 2017), using the treewidth as a parameter. Applying this
technique to clique-width, another important graph parameter, remained to be
done. In this paper we study several graph theoretic problems for which
hardness results exist, such as cycle problems (triangle detection, triangle
counting, girth, diameter), distance problems (diameter, eccentricities, Gromov
hyperbolicity, betweenness centrality) and maximum matching. We provide
hardness results and fully polynomial FPT algorithms, using clique-width and
some of its upper bounds as parameters (split-width, modular-width and
P_4-sparseness). We believe that our most important result is an
O(k^4 · n + m)-time algorithm for computing a maximum matching, where k
is either the modular-width or the P_4-sparseness. The latter generalizes
many algorithms that have been introduced so far for specific subclasses such
as cographs, P_4-lite graphs, P_4-extendible graphs and P_4-tidy
graphs. Our algorithms are based on preprocessing methods using modular
decomposition, split decomposition and primeval decomposition. Thus they can
also be generalized to some graph classes with unbounded clique-width.
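A quick back-of-the-envelope comparison may help put such "fully polynomial" parameterized bounds in perspective, taking the O(k^4 · n + m) running time above at face value and recalling the classical O(m√n) bound of Micali and Vazirani for maximum matching on general graphs:

\[
  O\bigl(k^{4}\cdot n + m\bigr) \;=\; O(n + m) \quad\text{whenever } k = O(1),
  \qquad\text{versus}\qquad O\bigl(m\sqrt{n}\bigr) \text{ in general.}
\]

So on graph classes where the modular-width or the P_4-sparseness is bounded by a constant, the bound above is linear in the size of the input.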
Structure vs métrique dans les graphes
The emergence of very large networks forces us to rethink many graph problems: problems that look simple, but for which the known algorithms no longer scale. One possible approach is to better understand the properties of these complex networks and to derive new, more efficient methods from them. To this end, we prove general relations between the structural properties of graphs and their metric properties. Our relations follow from new tight bounds on the diameter of minimal separators in a graph. More precisely, we prove that in any graph G the diameter of a minimal separator S in G is at most (l(G)/2) · (|S| − 1), where l(G) is the largest size of an isometric cycle in G. Our proofs rely on connectivity properties in the powers of a graph. One consequence of our results is that for any graph G, its treelength is at most l(G)/2 times its treewidth. As a complement to this relation, we bound the treewidth by a function of the treelength and the genus of the graph. This bound generalizes to the family of graphs excluding an apex graph H as a minor. As a consequence, we obtain a very simple algorithm that, given a graph excluding a fixed apex graph as a minor, computes its treewidth in O(n²) time with approximation factor O(l(G)).
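Restated in display form, with l(G) the largest size of an isometric cycle of G, and tl and tw denoting treelength and treewidth, the two bounds from the abstract read:

\[
  \mathrm{diam}_G(S) \;\le\; \frac{\ell(G)}{2}\,\bigl(|S|-1\bigr)
  \quad\text{for every minimal separator } S \text{ of } G,
  \qquad
  \mathrm{tl}(G) \;\le\; \frac{\ell(G)}{2}\,\mathrm{tw}(G).
\]

For instance, in a chordal graph with at least one cycle every isometric cycle is a triangle, so l(G) = 3 and the second bound reads tl(G) ≤ (3/2) · tw(G).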
XRay: Enhancing the Web's Transparency with Differential Correlation
Today's Web services - such as Google, Amazon, and Facebook - leverage user
data for varied purposes, including personalizing recommendations, targeting
advertisements, and adjusting prices. At present, users have little insight
into how their data is being used. Hence, they cannot make informed choices
about the services they choose. To increase transparency, we developed XRay,
the first fine-grained, robust, and scalable personal data tracking system for
the Web. XRay predicts which data in an arbitrary Web account (such as emails,
searches, or viewed products) is being used to target which outputs (such as
ads, recommended products, or prices). XRay's core functions are service
agnostic and easy to instantiate for new services, and they can track data
within and across services. To make predictions independent of the audited
service, XRay relies on the following insight: by comparing outputs from
different accounts with similar, but not identical, subsets of data, one can
pinpoint targeting through correlation. We show both theoretically, and through
experiments on Gmail, Amazon, and YouTube, that XRay achieves high precision
and recall by correlating data from a surprisingly small number of extra
accounts.
Comment: Extended version of a paper presented at the 23rd USENIX Security Symposium (USENIX Security 14).
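The "differential correlation" insight lends itself to a small illustration. The sketch below is a minimal, hypothetical scoring scheme, not XRay's actual API or algorithm: the accounts dictionary and the score_targeting helper are illustrative names. Each audited account holds a different subset of data items, and an output (e.g. an ad) is attributed to the item whose presence across accounts best agrees with the output being shown.

    # Hypothetical input: for each audited account, the subset of data items it
    # contains and the set of outputs (e.g. ad identifiers) observed in it.
    accounts = {
        "acct1": ({"emailA", "emailB"}, {"ad_x"}),
        "acct2": ({"emailA", "emailC"}, {"ad_x"}),
        "acct3": ({"emailB", "emailC"}, set()),
    }

    def score_targeting(accounts, output):
        """Score each data item by how often its presence agrees with `output`."""
        items = set().union(*(data for data, _ in accounts.values()))
        scores = {}
        for item in items:
            agree = 0
            for data, outputs in accounts.values():
                has_item = item in data
                shows_output = output in outputs
                agree += (has_item == shows_output)   # simple agreement count
            scores[item] = agree / len(accounts)
        return max(scores, key=scores.get), scores

    best, scores = score_targeting(accounts, "ad_x")
    print(best, scores)   # emailA correlates best with ad_x in this toy data

With more accounts and overlapping item subsets, the same agreement score separates targeted items from coincidental ones, which is the intuition behind needing only a small number of extra accounts.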
Machine Learning under the light of Phraseology expertise: use case of presidential speeches, De Gaulle - Hollande (1958-2016)
Author identification and text genesis have always been a hot topic for the statistical analysis of textual data community. Recent advances in machine learning have seen the emergence of machines competing with state-of-the-art computational linguistic methods on specific natural language processing tasks (part-of-speech tagging, chunking, parsing, etc.). In particular, Deep Linguistic Architectures are based on knowledge of language specificities such as grammar or semantic structure. These models are considered the most competitive thanks to their assumed ability to capture syntax. However, while these methods have proven their efficiency, their underlying mechanisms, both from a theoretical and an empirical point of view, remain hard to make explicit and to keep stable, which restricts their range of applications. Our work sheds light on the mechanisms involved in deep architectures when applied to Natural Language Processing (NLP) tasks. The Query-By-Dropout-Committee (QBDC) algorithm is an active learning technique we designed for deep architectures: it iteratively selects the most relevant samples to add to the training set, so that the model improves the most when retrained on the new training set. In this article, however, we do not go into the details of the QBDC algorithm (it has already been studied in the original QBDC article); instead, we confront the relevance of the sentences chosen by our active strategy with state-of-the-art phraseology techniques. We have thus conducted experiments on the presidential speeches of presidents C. De Gaulle, N. Sarkozy and F. Hollande, in order to exhibit the interest of our active deep learning method for discourse author identification, and to compare the linguistic patterns extracted by our artificial approach with standard phraseology techniques.
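For intuition, here is a minimal sketch of the general query-by-dropout-committee idea referenced above. It is not the authors' exact QBDC procedure: the toy weights, the mc_dropout_predict helper and the vote-entropy criterion are illustrative assumptions. A committee is simulated by running the same model several times with dropout active, and the unlabeled samples on which the committee members disagree most are queried next.

    import numpy as np

    rng = np.random.default_rng(0)

    def mc_dropout_predict(x, n_passes=10, p_drop=0.5):
        """Simulate a dropout committee: several stochastic forward passes
        through a tiny 2-class linear 'network' with dropout on its inputs."""
        W = np.array([[1.0, -1.0], [-0.5, 0.5], [0.2, 0.1]])  # fixed toy weights
        probs = []
        for _ in range(n_passes):
            mask = rng.random(x.shape) > p_drop          # dropout mask on inputs
            logits = (x * mask) @ W
            e = np.exp(logits - logits.max())
            probs.append(e / e.sum())
        return np.array(probs)                            # shape (n_passes, 2)

    def vote_entropy(probs):
        """Committee disagreement: entropy of the members' hard votes."""
        votes = probs.argmax(axis=1)
        freq = np.bincount(votes, minlength=probs.shape[1]) / len(votes)
        freq = freq[freq > 0]
        return float(-(freq * np.log(freq)).sum())

    # Active-learning step: query the unlabeled samples with highest disagreement.
    pool = [rng.normal(size=3) for _ in range(20)]
    ranked = sorted(pool, key=lambda x: vote_entropy(mc_dropout_predict(x)), reverse=True)
    to_label = ranked[:5]   # samples to send to the oracle and add to the training set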
On Computing the Average Distance for Some Chordal-Like Graphs
The Wiener index of a graph G is the sum of all its distances. Up to renormalization, it is also the average distance in G. The problem of computing this parameter has different applications in chemistry and networks. We here study when it can be done in truly subquadratic time (in the size n+m of the input) on n-vertex m-edge graphs. Our main result is a complete answer to this question, assuming the Strong Exponential-Time Hypothesis (SETH), for all the hereditary subclasses of chordal graphs. Interestingly, the exact same result also holds for the diameter problem. The case of non-hereditary chordal subclasses happens to be more challenging. For the chordal Helly graphs we propose an intricate Õ(m^{3/2})-time algorithm for computing the Wiener index, where m denotes the number of edges. We complete our results with the first known linear-time algorithm for this problem on the dually chordal graphs. The former algorithm also computes the median set.
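For reference, the quadratic baseline that truly subquadratic algorithms must beat is just one BFS per vertex. The sketch below is a minimal illustration (the adjacency-list input and the wiener_index name are ours): it computes the Wiener index and one common normalization of the average distance in O(n·m) time, assuming a connected graph.

    from collections import deque

    def wiener_index(adj):
        """Sum of distances over all unordered pairs, via one BFS per vertex: O(n*m).
        Assumes the graph (given as an adjacency list) is connected."""
        n = len(adj)
        total = 0
        for s in range(n):
            dist = [-1] * n
            dist[s] = 0
            queue = deque([s])
            while queue:
                u = queue.popleft()
                for v in adj[u]:
                    if dist[v] == -1:
                        dist[v] = dist[u] + 1
                        queue.append(v)
            total += sum(dist)
        return total // 2          # each unordered pair was counted twice

    # Example: path on 4 vertices 0-1-2-3; Wiener index = 1+2+3 + 1+2 + 1 = 10.
    adj = [[1], [0, 2], [1, 3], [2]]
    W = wiener_index(adj)
    avg_dist = W / (len(adj) * (len(adj) - 1) / 2)   # average over unordered pairs
    print(W, avg_dist)   # 10 and 10/6 ≈ 1.67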
Optimal Centrality Computations Within Bounded Clique-Width Graphs
Given an n-vertex m-edge graph G of clique-width at most k, and a corresponding k-expression, we present algorithms for computing some well-known centrality indices (eccentricity and closeness) that run in O(2^O(k) (n + m)^{1+ϵ}) time for any ϵ > 0. Doing so, we can solve various distance problems within the same amount of time, including: the diameter, the center, the Wiener index and the median set. Our run-times match conditional lower bounds of Coudert et al. (SODA'18) under the Strong Exponential-Time Hypothesis. On our way, we get a distance-labeling scheme for n-vertex m-edge graphs of clique-width at most k, using O(k log^2 n) bits per vertex and constructible in Õ(k(n + m)) time from a given k-expression. Doing so, we match the label size obtained by Courcelle and Vanicat (DAM 2016), while we considerably improve the dependency on k in their scheme. As a corollary, we get an Õ(kn^2)-time algorithm for computing All-Pairs Shortest-Paths on n-vertex graphs of clique-width at most k, being given a k-expression. This partially answers an open question of Kratsch and Nelles (STACS'20). Our algorithms work for graphs with non-negative vertex-weights, under two different types of distances studied in the literature. For that, we introduce a new type of orthogonal range query as a side contribution of this work, that might be of independent interest.
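For readers less familiar with the indices listed above, the standard definitions, under one common convention (the paper's closeness may differ by a normalization factor), are:

\begin{align*}
  \mathrm{ecc}(v) &= \max_{u \in V} d(v,u), &
  \mathrm{diam}(G) &= \max_{v \in V} \mathrm{ecc}(v), &
  \mathrm{center}(G) &= \{\, v : \mathrm{ecc}(v) \text{ is minimum} \,\},\\
  \mathrm{closeness}(v) &= \frac{1}{\sum_{u \in V} d(v,u)}, &
  W(G) &= \sum_{\{u,v\} \subseteq V} d(u,v), &
  \mathrm{median}(G) &= \Bigl\{\, v : \textstyle\sum_{u \in V} d(v,u) \text{ is minimum} \,\Bigr\}.
\end{align*}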
Balancing graph Voronoi diagrams with one more vertex
Let G = (V, E) be a graph with unit-length edges and nonnegative costs
assigned to its vertices. Being given a list of pairwise different vertices
S = (v_1, v_2, ..., v_p), the prioritized Voronoi diagram of G with
respect to S is the partition of V into subsets V_1, V_2, ..., V_p so
that, for every i with 1 ≤ i ≤ p, a vertex u is in V_i if and
only if v_i is a closest vertex to u in S and there is no closest vertex
to u in S within the subset {v_1, v_2, ..., v_{i-1}}. For every i
with 1 ≤ i ≤ p, the load of vertex v_i equals the sum of the
costs of all vertices in V_i. The load of S equals the maximum load of a
vertex in S. We study the problem of adding one more vertex at the end of
S in order to minimize the load. This problem occurs in the context of
optimally locating a new service facility (e.g., a school or a hospital)
while taking into account already existing facilities, and with the goal of
minimizing the maximum congestion at a site. There is a brute-force algorithm
for solving this problem in O(nm) time on n-vertex m-edge graphs.
We prove a matching time lower bound for a special case of the problem,
assuming the so-called Hitting Set Conjecture of Abboud et al. On
the positive side, we present simple linear-time algorithms for this problem on
cliques, paths and cycles, and almost linear-time algorithms for trees, proper
interval graphs and (assuming p to be a constant) bounded-treewidth graphs.
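The brute-force algorithm mentioned above is easy to make concrete. The sketch below is a minimal illustration under the definitions as reconstructed here (the function and variable names are ours, not the paper's): for every candidate vertex x it rebuilds the prioritized Voronoi diagram of S + [x] from single-source BFS distances and keeps the candidate whose maximum load is smallest, for roughly O(n(n+m)) time overall.

    from collections import deque

    def bfs_dist(adj, s):
        """Unit-length single-source distances from s (graph as adjacency list)."""
        dist = [None] * len(adj)
        dist[s] = 0
        q = deque([s])
        while q:
            u = q.popleft()
            for v in adj[u]:
                if dist[v] is None:
                    dist[v] = dist[u] + 1
                    q.append(v)
        return dist

    def max_load(adj, cost, S):
        """Maximum load of the prioritized Voronoi diagram of the list S:
        ties are broken in favour of the earliest vertex of S."""
        dists = [bfs_dist(adj, s) for s in S]
        load = [0] * len(S)
        for u in range(len(adj)):
            best = min(range(len(S)), key=lambda i: (dists[i][u], i))  # earliest closest
            load[best] += cost[u]
        return max(load)

    def best_extra_vertex(adj, cost, S):
        """Brute force: try every vertex as the new last element of S."""
        return min((v for v in range(len(adj)) if v not in S),
                   key=lambda v: max_load(adj, cost, S + [v]))

    # Toy example: a path 0-1-2-3-4 with unit costs and S = [0].
    adj = [[1], [0, 2], [1, 3], [2, 4], [3]]
    cost = [1] * 5
    x = best_extra_vertex(adj, cost, [0])
    print(x, max_load(adj, cost, [0, x]))   # x = 2; the maximum load drops from 5 to 3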
