18 research outputs found

    A family of tree-based generators for bubbles in directed graphs

    Get PDF
    6sìopenBubbles are pairs of internally vertex-disjoint (s, t)-paths in a directed graph. In de Bruijn graphs built from reads of RNA and DNA data, bubbles represent interesting biological events, such as alternative splicing (AS) and allelic differences (SNPs and indels). However, the set of all bubbles in a de Bruijn graph built from real data is usually too large to be efficiently enumerated and analysed in practice. In particular, despite significant research done in this area, listing bubbles still remains the main bottleneck for tools that detect AS events in a reference-free context. Recently, in the concept of a bubble generator was introduced as a way for obtaining a compact representation of the bubble space of a graph. Although this bubble generator was quite effective in finding AS events, preliminary experiments showed that it is about 5 times slower than state-of-art methods. In this paper we propose a new family of bubble generators which improve substantially on previous work: bubble generators in this new family are about two orders of magnitude faster and are still able to achieve similar precision in identifying AS events. To highlight the practical value of our new bubble generators, we also report some experimental results on real datasets.openAcuña, Vicente; Soares de Lima, Leandro Ishi; Italiano, Giuseppe F.; Pepè Sciarria, Luca; Sagot, Marie-France; Sinaimeri, BlerinaAcuña, Vicente; Soares de Lima, Leandro Ishi; Italiano, Giuseppe F.; Pepè Sciarria, Luca; Sagot, Marie-France; Sinaimeri, Blerin

    Parameterized Complexity of Domination Problems Using Restricted Modular Partitions

    Get PDF
    For a graph class ?, we define the ?-modular cardinality of a graph G as the minimum size of a vertex partition of G into modules that each induces a graph in ?. This generalizes other module-based graph parameters such as neighborhood diversity and iterated type partition. Moreover, if ? has bounded modular-width, the W[1]-hardness of a problem in ?-modular cardinality implies hardness on modular-width, clique-width, and other related parameters. Several FPT algorithms based on modular partitions compute a solution table in each module, then combine each table into a global solution. This works well when each table has a succinct representation, but as we argue, when no such representation exists, the problem is typically W[1]-hard. We illustrate these ideas on the generic (?, ?)-domination problem, which is a generalization of known domination problems such as Bounded Degree Deletion, k-Domination, and ?-Domination. We show that for graph classes ? that require arbitrarily large solution tables, these problems are W[1]-hard in the ?-modular cardinality, whereas they are fixed-parameter tractable when they admit succinct solution tables. This leads to several new positive and negative results for many domination problems parameterized by known and novel structural graph parameters such as clique-width, modular-width, and cluster-modular cardinality

    Distance-Based Solution of Patrolling Problems with Individual Waiting Times

    Get PDF
    In patrolling problems, robots (or other vehicles) must perpetually visit certain points without exceeding given individual waiting times. Some obvious applications are monitoring, maintenance, and periodic fetching of resources. We propose a new generic formulation of the problem. As its main advantage, it enables a reduction of the multi-robot case to the one-robot case in a certain graph/hypergraph pair, which also relates the problem to some classic path problems in graphs: NP-hardness is shown by a reduction from the Hamiltonian cycle problem, and on the positive side, the formulation allows solution heuristics using distances in the mentioned graph. We demonstrate this approach for the case of two robots patrolling on a line, a problem whose complexity status is open, apart from approximation results. Specifically, we solve all instances with up to 6 equidistant points, and we find some surprising effects, e.g., critical problem instances (which are feasible instances that become infeasible when any waiting time is diminished) may contain rather large individual waiting times

    Hierarchical Entity Resolution using an Oracle

    Get PDF
    In many applications, entity references (i.e., records) and entities need to be organized to capture diverse relationships like type-subtype, is-A (mapping entities to types), and duplicate (mapping records to entities) relationships. However, automatic identification of such relationships is often inaccurate due to noise and heterogeneous representation of records across sources. Similarly, manual maintenance of these relationships is infeasible and does not scale to large datasets. In this work, we circumvent these challenges by considering weak supervision in the form of an oracle to formulate a novel hierarchical ER task. In this setting, records are clustered in a tree-like structure containing records at leaf-level and capturing record-entity (duplicate), entity-type (is-A) and subtype-supertype relationships. For effective use of supervision, we leverage triplet comparison oracle queries that take three records as input and output the most similar pair(s). We develop HierER, a querying strategy that uses record pair similarities to minimize the number of oracle queries while maximizing the identified hierarchical structure. We show theoretically and empirically that HierER is effective under different similarity noise models and demonstrate empirically that HierER can scale up to million-size datasets

    LIPIcs, Volume 261, ICALP 2023, Complete Volume

    Get PDF
    LIPIcs, Volume 261, ICALP 2023, Complete Volum

    LIPIcs, Volume 244, ESA 2022, Complete Volume

    Get PDF
    LIPIcs, Volume 244, ESA 2022, Complete Volum
    corecore