72 research outputs found

    Wikipedia Page Classification

    The goal of this work is to design and implement a system that selects Wikipedia articles relevant to a given topic, in order to reduce the storage space needed for an offline copy. The solution uses methods from information retrieval and their implementation in the Elasticsearch search engine. Based on the given keywords, the system tries to determine which topic area the user is interested in and to include articles from that area in the selection. It relies mainly on mechanisms for finding similar documents and on including all articles from categories that recur frequently in the selection. The output files generated for queries over Simple English Wikipedia are usually smaller than 30 MB.
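The category step described in this abstract can be illustrated with a small sketch. This is not the thesis's implementation: the function name `expand_by_categories`, the `min_share` threshold, and the sample data are all assumptions made for illustration, and the keyword search and similar-document steps (done with Elasticsearch in the thesis) are taken as already done.

```python
from collections import Counter

def expand_by_categories(hits, categories, min_share):
    """Add every article from categories that recur often among the hits.

    hits: article titles already selected for the keywords (assumed given).
    categories: dict mapping each article title to its set of category names.
    min_share: hypothetical threshold; a category counts as frequent when at
               least this fraction of the hits belongs to it.
    """
    counts = Counter(c for title in hits for c in categories.get(title, ()))
    frequent = {c for c, n in counts.items() if n >= min_share * len(hits)}
    selected = set(hits)
    for title, cats in categories.items():
        if cats & frequent:
            selected.add(title)
    return selected

# Hypothetical toy data, not taken from the thesis.
categories = {
    "Python": {"Programming languages"},
    "Java": {"Programming languages", "Islands"},
    "Rust": {"Programming languages"},
    "Bali": {"Islands"},
}
selection = expand_by_categories(["Python", "Java"], categories, min_share=0.75)
print(sorted(selection))
```

With the threshold at 0.75, only "Programming languages" is frequent among the two hits, so "Rust" is pulled in and "Bali" is not.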

    Minimum Eccentricity Shortest Path Problem with Respect to Structural Parameters

    The Minimum Eccentricity Shortest Path Problem consists in finding a shortest path with minimum eccentricity in a given undirected graph. The problem is known to be NP-complete and W[2]-hard with respect to the desired eccentricity. We present fpt algorithms for the problem parameterized by the modular width, distance to cluster graph, the combination of distance to disjoint paths with the desired eccentricity, and maximum leaf number.
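To make the objective concrete, the eccentricity of a path is the largest distance from any vertex of the graph to its nearest vertex on the path. The sketch below computes it with a multi-source breadth-first search. It only evaluates a given path and does not check that the path is a shortest path; the function name and the example graph are assumptions for illustration, not part of the paper.

```python
from collections import deque

def path_eccentricity(adj, path):
    """Largest distance from any vertex to its nearest vertex on `path`.

    adj: dict mapping each vertex to a list of neighbours (undirected graph).
    path: list of vertices; all of them act as BFS sources at distance 0.
    """
    dist = {v: 0 for v in path}
    queue = deque(path)
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    if len(dist) < len(adj):
        raise ValueError("graph is not connected")
    return max(dist.values())

# Hypothetical example: the path 0-1-2-3 with a pendant path 1-4-5.
adj = {0: [1], 1: [0, 2, 4], 2: [1, 3], 3: [2], 4: [1, 5], 5: [4]}
print(path_eccentricity(adj, [0, 1, 2, 3]))     # vertex 5 is two steps away
print(path_eccentricity(adj, [5, 4, 1, 2, 3]))  # only vertex 0 is off the path
```

Both paths are shortest paths between their endpoints in this graph, and the second one has the smaller eccentricity.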

    Complexity of the Steiner Network Problem with Respect to the Number of Terminals

    In the Directed Steiner Network problem we are given an arc-weighted digraph $G$, a set of terminals $T \subseteq V(G)$, and an (unweighted) directed request graph $R$ with $V(R)=T$. Our task is to output a subgraph $G' \subseteq G$ of the minimum cost such that there is a directed path from $s$ to $t$ in $G'$ for all $st \in A(R)$. It is known that the problem can be solved in time $|V(G)|^{O(|A(R)|)}$ [Feldman & Ruhl, SIAM J. Comput. 2006] and cannot be solved in time $|V(G)|^{o(|A(R)|)}$ even if $G$ is planar, unless the Exponential-Time Hypothesis (ETH) fails [Chitnis et al., SODA 2014]. However, as this reduction (and other reductions showing hardness of the problem) only shows that the problem cannot be solved in time $|V(G)|^{o(|T|)}$ unless ETH fails, there is a significant gap in the complexity with respect to $|T|$ in the exponent. We show that Directed Steiner Network is solvable in time $f(R)\cdot |V(G)|^{O(c_g \cdot |T|)}$, where $c_g$ is a constant depending solely on the genus of $G$ and $f$ is a computable function. We complement this result by showing that there is no $f(R)\cdot |V(G)|^{o(|T|^2/ \log |T|)}$ algorithm for any function $f$ for the problem on general graphs, unless ETH fails.

    Generating faster algorithms for d-Path Vertex Cover

    Many algorithms which exactly solve hard problems require branching on more or less complex structures in order to do their job. Those who design such algorithms often find themselves doing a meticulous analysis of numerous different cases in order to identify these structures and design suitable branching rules, all done by hand. This process tends to be error prone and often the resulting algorithm may be difficult to implement in practice. In this work, we aim to automate a part of this process and focus on simplicity of the resulting implementation. We showcase our approach on the following problem. For a constant $d$, the $d$-Path Vertex Cover problem ($d$-PVC) is as follows: Given an undirected graph and an integer $k$, find a subset of at most $k$ vertices of the graph, such that their deletion results in a graph not containing a path on $d$ vertices as a subgraph. We develop a fully automated framework to generate parameterized branching algorithms for the problem and obtain algorithms outperforming those previously known for $3 \le d \le 8$. E.g., we show that $5$-PVC can be solved in $O(2.7^k\cdot n^{O(1)})$ time.
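For orientation, the problem definition above already suggests a naive branching rule: find any path on $d$ vertices and branch on which of its vertices to delete, giving a search tree with at most $d^k$ leaves. The sketch below implements only this simple baseline, not the generated algorithms from the paper; the function names and the example graph are assumptions for illustration.

```python
def find_d_path(adj, d, removed):
    """Return a list of d distinct vertices forming a path, or None."""
    def extend(path, used):
        if len(path) == d:
            return path
        for w in adj[path[-1]]:
            if w not in used and w not in removed:
                used.add(w)
                found = extend(path + [w], used)
                if found:
                    return found
                used.discard(w)
        return None

    for v in adj:
        if v not in removed:
            found = extend([v], {v})
            if found:
                return found
    return None

def d_pvc(adj, d, k, removed=frozenset()):
    """Return a set of at most k extra deletions hitting all d-vertex paths, or None."""
    path = find_d_path(adj, d, removed)
    if path is None:
        return set(removed)          # no path on d vertices is left
    if k == 0:
        return None                  # a path remains but the budget is spent
    for v in path:                   # some vertex of this path must be deleted
        solution = d_pvc(adj, d, k - 1, removed | {v})
        if solution is not None:
            return solution
    return None

# Hypothetical example: the path graph 0-1-2-3-4.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
print(d_pvc(adj, 3, 1))  # deleting the middle vertex destroys every 3-vertex path
print(d_pvc(adj, 3, 0))  # no solution without deletions
```

For $d = 5$ this baseline branches into five cases per step, which is the kind of bound that more careful branching rules improve on.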

    Beyond Max-Cut: $\lambda$-Extendible Properties Parameterized Above the Poljak-Turzík Bound

    Poljak and Turzík (Discrete Math. 1986) introduced the notion of $\lambda$-extendible properties of graphs as a generalization of the property of being bipartite. They showed that for any $0<\lambda<1$ and $\lambda$-extendible property $\Pi$, any connected graph $G$ on $n$ vertices and $m$ edges contains a subgraph $H \in \Pi$ with at least $\lambda m + \frac{1-\lambda}{2}(n-1)$ edges. The property of being bipartite is $1/2$-extendible, and thus this bound generalizes the Edwards-Erdős bound for Max-Cut. We define a variant, namely strong $\lambda$-extendibility, to which the bound applies. For a strongly $\lambda$-extendible graph property $\Pi$, we define the parameterized Above Poljak-Turzík (APT) ($\Pi$) problem as follows: Given a connected graph $G$ on $n$ vertices and $m$ edges and an integer parameter $k$, does there exist a spanning subgraph $H$ of $G$ such that $H \in \Pi$ and $H$ has at least $\lambda m + \frac{1-\lambda}{2}(n-1) + k$ edges? The parameter is $k$, the surplus over the number of edges guaranteed by the Poljak-Turzík bound. We consider properties $\Pi$ for which APT ($\Pi$) is fixed-parameter tractable (FPT) on graphs which are $O(k)$ vertices away from being a graph in which each block is a clique. We show that for all such properties, APT ($\Pi$) is FPT for all $0<\lambda<1$. Our results hold for properties of oriented graphs and graphs with edge labels. Our results generalize the result of Crowston et al. (ICALP 2012) on Max-Cut parameterized above the Edwards-Erdős bound, and yield FPT algorithms for several graph problems parameterized above lower bounds, e.g., Max $q$-Colorable Subgraph problem. Our results also imply that the parameterized above-guarantee Oriented Max Acyclic Digraph problem is FPT, thus solving an open question of Raman and Saurabh (Theor. Comput. Sci. 2006).
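As a quick check of how the bound specializes, substituting the bipartite case $\lambda = 1/2$ gives the Edwards-Erdős bound mentioned in the abstract. The triangle used below is an illustrative example chosen here, not one taken from the paper.

```latex
\[
  \lambda m + \frac{1-\lambda}{2}(n-1)\,\Big|_{\lambda = 1/2}
  \;=\; \frac{m}{2} + \frac{n-1}{4}.
\]
% Example: the triangle K_3 has n = 3 and m = 3, so the bound is
\[
  \frac{3}{2} + \frac{3-1}{4} \;=\; 2,
\]
% which equals the maximum cut of K_3 (any bipartition cuts at most 2 of its 3 edges).
```

In the APT problem the parameter $k$ then measures how many edges beyond this guaranteed value are requested; for the triangle, $k = 0$ is achievable and $k = 1$ is not.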

    A Parameterized Complexity View on Collapsing k-Cores

    We study the NP-hard graph problem COLLAPSED K-CORE where, given an undirected graph G and integers b, x, and k, we are asked to remove b vertices such that the k-core of the remaining graph, that is, the (uniquely determined) largest induced subgraph with minimum degree k, has size at most x. COLLAPSED K-CORE was introduced by Zhang et al. (2017) and it is motivated by the study of engagement behavior of users in a social network and measuring the resilience of a network against user dropouts. COLLAPSED K-CORE is a generalization of R-DEGENERATE VERTEX DELETION (which is known to be NP-hard for all r ≥ 0) where, given an undirected graph G and integers b and r, we are asked to remove b vertices such that the remaining graph is r-degenerate, that is, each of its subgraphs has minimum degree at most r. We investigate the parameterized complexity of COLLAPSED K-CORE with respect to the parameters b, x, and k, and several structural parameters of the input graph. We reveal a dichotomy in the computational complexity of COLLAPSED K-CORE for k ≤ 2 and k ≥ 3. For the latter case it is known that for all x ≥ 0 COLLAPSED K-CORE is W[P]-hard when parameterized by b. For k ≤ 2 we show that COLLAPSED K-CORE is W[1]-hard when parameterized by b and in FPT when parameterized by (b + x). Furthermore, we outline that COLLAPSED K-CORE is in FPT when parameterized by the treewidth of the input graph and presumably does not admit a polynomial kernel when parameterized by the vertex cover number of the input graph.
    Funding: DFG, 284041127, Algorithmen für Faire Allokationen (AFFA); DFG, 382063982, Multivariate Algorithmik temporaler Graphprobleme (MATE); TU Berlin, Open-Access-Mittel – 202
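The k-core in this definition can be computed by repeatedly peeling vertices of degree below k, and the decision problem can then be checked by brute force over all sets of b vertices. The sketch below does exactly that and is exponential in b; it only illustrates the problem statement, not any algorithm from the paper, and the function names and example graph are assumptions.

```python
from itertools import combinations

def k_core(adj, k, removed=frozenset()):
    """Vertices of the k-core of the graph after deleting `removed`."""
    alive = set(adj) - set(removed)
    degree = {v: sum(1 for w in adj[v] if w in alive) for v in alive}
    stack = [v for v in alive if degree[v] < k]
    while stack:
        v = stack.pop()
        if v not in alive:
            continue                 # already peeled via an earlier stack entry
        alive.discard(v)
        for w in adj[v]:
            if w in alive:
                degree[w] -= 1
                if degree[w] < k:
                    stack.append(w)
    return alive

def collapsed_k_core(adj, b, x, k):
    """Brute force: b vertices whose removal leaves a k-core of size <= x, or None."""
    for removed in combinations(adj, b):
        if len(k_core(adj, k, removed)) <= x:
            return set(removed)
    return None

# Hypothetical example: a clique on {0, 1, 2, 3} plus a pendant vertex 4 attached to 0.
adj = {0: [1, 2, 3, 4], 1: [0, 2, 3], 2: [0, 1, 3], 3: [0, 1, 2], 4: [0]}
print(k_core(adj, 3))                  # the clique is the 3-core
print(collapsed_k_core(adj, 1, 0, 3))  # one deletion collapses the 3-core entirely
```

Removing a single clique vertex drops the other three to degree 2, so the peeling cascade empties the 3-core, which is the collapse effect the problem is named after.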