22,173 research outputs found

    The matching relaxation for a class of generalized set partitioning problems

    This paper introduces a discrete relaxation for the class of combinatorial optimization problems that can be described by a set partitioning formulation under packing constraints. We present two combinatorial relaxations based on computing maximum weighted matchings in suitable graphs. Besides providing dual bounds, the relaxations are also used in a variable reduction technique and a matheuristic. We show how this general method can be tailored to sample applications, and we also perform a successful computational evaluation with benchmark instances of a problem in maritime logistics.
    Comment: 33 pages. A preliminary (4-page) version of this paper was presented at CTW 2016 (Cologne-Twente Workshop on Graphs and Combinatorial Optimization), with proceedings in Electronic Notes in Discrete Mathematics.

    Query-Driven Sampling for Collective Entity Resolution

    Probabilistic databases play a preeminent role in the processing and management of uncertain data. Recently, many database research efforts have integrated probabilistic models into databases to support tasks such as information extraction and labeling. Many of these efforts are based on batch-oriented inference, which inhibits a real-time workflow. One important task is entity resolution (ER), the process of determining which records (mentions) in a database correspond to the same real-world entity. Traditional pairwise ER methods can lead to inconsistencies and low accuracy due to localized decisions. Leading ER systems solve this problem by collectively resolving all records using a probabilistic graphical model and Markov chain Monte Carlo (MCMC) inference. However, for large datasets this is an extremely expensive process. One key observation is that such an exhaustive ER process incurs a huge up-front cost, which is wasteful in practice because most users are interested in only a small subset of entities. In this paper, we advocate pay-as-you-go entity resolution by developing a number of query-driven collective ER techniques. We introduce two classes of SQL queries that involve ER operators: selection-driven ER and join-driven ER. We implement novel variations of the MCMC Metropolis-Hastings algorithm to generate biased samples, and selectivity-based scheduling algorithms to support the two classes of ER queries. Finally, we show that query-driven ER algorithms can converge and return results within minutes over a database populated with the extraction from a newswire dataset containing 71 million mentions.

    GLB: Lifeline-based Global Load Balancing library in X10

    We present GLB, a programming model and an associated implementation that can handle a wide range of irregular parallel programming problems running over large-scale distributed systems. GLB is applicable both to problems that are easily load-balanced via static scheduling and to problems that are hard to load-balance statically. GLB hides the intricate synchronizations (e.g., inter-node communication, initialization and startup, load balancing, termination and result collection) from the users. GLB internally uses a version of the lifeline-graph-based work-stealing algorithm proposed by Saraswat et al. Users of GLB are simply required to write several pieces of sequential code that comply with the GLB interface. GLB then schedules and orchestrates the parallel execution of the code correctly and efficiently at scale. We have applied GLB to two representative benchmarks: Betweenness Centrality (BC) and Unbalanced Tree Search (UTS). Of these, BC can be statically load-balanced whereas UTS cannot. In either case, GLB scales well, achieving nearly linear speedup on different computer architectures (Power, Blue Gene/Q, and K) on up to 16K cores.

    Some results on triangle partitions

    We show that there exist efficient algorithms for the triangle packing problem in colored permutation graphs, complete multipartite graphs, distance-hereditary graphs, k-modular permutation graphs and complements of k-partite graphs (when k is fixed). We show that there is an efficient algorithm for C_4-packing on bipartite permutation graphs, and that C_4-packing on bipartite graphs is NP-complete. We characterize the cobipartite graphs that have a triangle partition.