87 research outputs found

    Efficient Pattern Matching in Python

    Pattern matching is a powerful tool for symbolic computations. Applications include term rewriting systems, as well as the manipulation of symbolic expressions, abstract syntax trees, and XML and JSON data. It also allows for an intuitive description of algorithms in the form of rewrite rules. We present the open-source Python module MatchPy, which offers functionality and expressiveness similar to the pattern matching in Mathematica. In particular, it includes syntactic pattern matching, as well as matching for commutative and/or associative functions, sequence variables, and matching with constraints. MatchPy uses new and improved algorithms to efficiently find matches for large pattern sets by exploiting similarities between patterns. The performance of MatchPy is investigated on several real-world problems.
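
    As a rough illustration of the syntactic pattern matching mentioned in this abstract, the sketch below matches a pattern term against a subject term and returns the variable bindings. It does not use MatchPy or reproduce its API: the tuple encoding of terms, the convention that names ending in "_" are variables, and the function names are all assumptions made for this example. Commutative and associative matching, sequence variables, and constraints are not covered.

```python
from typing import Optional


def is_variable(term) -> bool:
    """Assumed convention for this sketch: strings ending in '_' are variables."""
    return isinstance(term, str) and term.endswith("_")


def match(pattern, subject, subst: Optional[dict] = None) -> Optional[dict]:
    """Syntactic one-to-one matching.

    Terms are atoms (strings, numbers) or tuples ("head", arg1, arg2, ...).
    Returns a dict of variable bindings, or None if the pattern does not match.
    """
    subst = dict(subst or {})
    if is_variable(pattern):
        if pattern in subst:
            # A repeated variable must be bound to the same subterm every time.
            return subst if subst[pattern] == subject else None
        subst[pattern] = subject
        return subst
    if isinstance(pattern, tuple) and isinstance(subject, tuple):
        # Heads and arities must agree before the arguments are compared.
        if len(pattern) != len(subject) or pattern[:1] != subject[:1]:
            return None
        for p_arg, s_arg in zip(pattern[1:], subject[1:]):
            subst = match(p_arg, s_arg, subst)
            if subst is None:
                return None
        return subst
    return subst if pattern == subject else None


# Example: f(x_, g(y_)) against f(1, g(b))
bindings = match(("f", "x_", ("g", "y_")), ("f", 1, ("g", "b")))
```

    Each pattern is checked against the subject independently here; the abstract's point about exploiting similarities between many patterns is not modelled.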

    On-line construction of position heaps

    We propose a simple linear-time on-line algorithm for constructing a position heap for a string [Ehrenfeucht et al, 2011]. Our definition of position heap differs slightly from the one proposed in [Ehrenfeucht et al, 2011] in that it considers the suffixes ordered from left to right. Our construction is based on classic suffix pointers and resembles Ukkonen's algorithm for suffix trees [Ukkonen, 1995]. Using suffix pointers, the position heap can be extended into the augmented position heap that allows for a linear-time string matching algorithm [Ehrenfeucht et al, 2011]. Comment: to appear in Journal of Discrete Algorithm
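
    To make the data structure concrete, here is a naive quadratic-time sketch of a position heap built by inserting suffixes from left to right. It is not the linear-time on-line algorithm of the paper and uses no suffix pointers. It assumes the usual insertion rule (each suffix contributes the shortest prefix not yet present in the trie, as one new node labelled with the suffix's start position). Positions are 0-based, and a suffix that is already fully represented simply adds no node; how the paper treats that case is not reproduced here. The dict-based node layout and the function names are illustrative choices.

```python
def build_position_heap(text: str) -> dict:
    """Naive O(n^2) construction of a position heap, suffixes taken left to right."""
    root = {"pos": None, "children": {}}
    for i in range(len(text)):
        node, j = root, i
        # Walk down the trie as long as the suffix text[i:] is already represented.
        while j < len(text) and text[j] in node["children"]:
            node = node["children"][text[j]]
            j += 1
        # Add one node for the shortest prefix of text[i:] not yet in the trie.
        if j < len(text):
            node["children"][text[j]] = {"pos": i, "children": {}}
    return root


def node_paths(node: dict, prefix: str = "") -> dict:
    """Map the string spelled by each root-to-node path to that node's position."""
    out = {}
    for ch, child in node["children"].items():
        out[prefix + ch] = child["pos"]
        out.update(node_paths(child, prefix + ch))
    return out


heap = build_position_heap("abaab")
```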

    Approximately Stable Matchings with Budget Constraints

    This paper considers two-sided matching with budget constraints where one side (firm or hospital) can make monetary transfers (offer wages) to the other (worker or doctor). In a standard model, while multiple doctors can be matched to a single hospital, a hospital has a maximum quota: the number of doctors assigned to a hospital cannot exceed a certain limit. In our model, a hospital instead has a fixed budget: the total amount of wages allocated by each hospital to doctors is constrained. With budget constraints, stable matchings may fail to exist and checking for their existence is hard. To deal with the nonexistence of stable matchings, we extend the "matching with contracts" model of Hatfield and Milgrom, so that it handles approximately stable matchings where each of the hospitals' utilities after deviation can increase by a factor up to a certain amount. We then propose two novel mechanisms that efficiently return such a stable matching that exactly satisfies the budget constraints. In particular, by sacrificing strategy-proofness, our first mechanism achieves the best possible bound. Furthermore, we find a special case such that a simple mechanism is strategy-proof for doctors, keeping the best possible bound of the general case. Comment: Accepted for the 32nd AAAI Conference on Artificial Intelligence (AAAI2018). arXiv admin note: text overlap with arXiv:1705.0764

    Matchings with lower quotas: Algorithms and complexity

    We study a natural generalization of the maximum weight many-to-one matching problem. We are given an undirected bipartite graph $G=(A\,\dot\cup\,P,E)$ with weights on the edges in $E$, and with lower and upper quotas on the vertices in $P$. We seek a maximum weight many-to-one matching satisfying two sets of constraints: vertices in $A$ are incident to at most one matching edge, while vertices in $P$ are either unmatched or they are incident to a number of matching edges between their lower and upper quota. This problem, which we call maximum weight many-to-one matching with lower and upper quotas (WMLQ), has applications to the assignment of students to projects within university courses, where there are constraints on the minimum and maximum numbers of students that must be assigned to each project. In this paper, we provide a comprehensive analysis of the complexity of WMLQ from the viewpoints of classical polynomial time algorithms, fixed-parameter tractability, as well as approximability. We draw the line between NP-hard and polynomially tractable instances in terms of degree and quota constraints and provide efficient algorithms to solve the tractable ones. We further show that the problem can be solved in polynomial time for instances with bounded treewidth; however, the corresponding runtime is exponential in the treewidth with the maximum upper quota $u_{\max}$ as basis, and we prove that this dependence is necessary unless $\mathrm{FPT}=\mathrm{W}[1]$. The approximability of WMLQ is also discussed: we present an approximation algorithm for the general case with performance guarantee $u_{\max}+1$, which is asymptotically best possible unless $\mathrm{P}=\mathrm{NP}$. Finally, we elaborate on how most of our positive results carry over to matchings in arbitrary graphs with lower quotas.
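
    The WMLQ feasibility conditions can be made concrete with a brute-force solver for very small instances. This is only an exponential-time sketch of the problem definition, not one of the paper's algorithms; the function name, the dict-based encoding of edges and quotas, and the student/project instance are all made up for illustration.

```python
from itertools import product


def wmlq_brute_force(A, edges, lower, upper):
    """Exhaustively solve a tiny WMLQ instance.

    A      : list of vertices on the side matched at most once
    edges  : dict {(a, p): weight}
    lower  : dict {p: lower quota};  upper : dict {p: upper quota}
    Returns (best_weight, {a: p}) over all feasible many-to-one matchings.
    """
    # Each a in A either stays unmatched (None) or picks one incident p.
    options = [[None] + [p for (a2, p) in edges if a2 == a] for a in A]
    best_weight, best = 0, {}  # the empty matching is always feasible
    for choice in product(*options):
        load = {}
        for p in choice:
            if p is not None:
                load[p] = load.get(p, 0) + 1
        # A vertex p that is used at all must respect both of its quotas.
        if any(not (lower[p] <= k <= upper[p]) for p, k in load.items()):
            continue
        weight = sum(edges[(a, p)] for a, p in zip(A, choice) if p is not None)
        if weight > best_weight:
            best_weight = weight
            best = {a: p for a, p in zip(A, choice) if p is not None}
    return best_weight, best


# Hypothetical instance: project p1 needs exactly 2 students, p2 exactly 1.
students = ["s1", "s2", "s3"]
weights = {("s1", "p1"): 5, ("s2", "p1"): 4, ("s2", "p2"): 6,
           ("s3", "p2"): 3, ("s3", "p1"): 2}
result = wmlq_brute_force(students, weights,
                          lower={"p1": 2, "p2": 1}, upper={"p1": 2, "p2": 1})
```

    On this instance the search returns total weight 13 with s1 and s3 on p1 and s2 on p2.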

    Propensity Score Matching in Randomized Clinical Trials

    Cluster randomization trials with relatively few clusters have been widely used in recent years for evaluation of health-care strategies. On average, randomized treatment assignment achieves balance in both known and unknown confounding factors between treatment groups; in practice, however, investigators can only introduce a small amount of stratification and cannot balance on all the important variables simultaneously. The limitation arises especially when there are many confounding variables in small studies. Such is the case in the INSTINCT trial designed to investigate the effectiveness of an education program in enhancing tPA use in stroke patients. In this article, we introduce a new randomization design, the balance match weighted (BMW) design, which applies the optimal matching with constraints technique to a prospective randomized design and aims to minimize the mean squared error (MSE) of the treatment effect estimator. A simulation study shows that, under various confounding scenarios, the BMW design can yield substantial reductions in the MSE for the treatment effect estimator compared to a completely randomized or matched-pair design. The BMW design is also compared with a model-based approach adjusting for the estimated propensity score and Robins-Mark-Newey E-estimation procedure in terms of efficiency and robustness of the treatment effect estimator. These investigations suggest that the BMW design is more robust and usually, although not always, more efficient than either of the approaches. The design is also seen to be robust against heterogeneous error. We illustrate these methods in proposing a design for the INSTINCT trial. Peer Reviewed. http://deepblue.lib.umich.edu/bitstream/2027.42/78638/1/j.1541-0420.2009.01364.x.pd

    Stronger instruments via integer programming in an observational study of late preterm birth outcomes

    In an optimal nonbipartite match, a single population is divided into matched pairs to minimize a total distance within matched pairs. Nonbipartite matching has been used to strengthen instrumental variables in observational studies of treatment effects, essentially by forming pairs that are similar in terms of covariates but very different in the strength of encouragement to accept the treatment. Optimal nonbipartite matching is typically done using network optimization techniques that can be quick, running in polynomial time, but these techniques limit the tools available for matching. Instead, we use integer programming techniques, thereby obtaining a wealth of new tools not previously available for nonbipartite matching, including fine and near-fine balance for several nominal variables, forced near balance on means and optimal subsetting. We illustrate the methods in our on-going study of outcomes of late-preterm births in California, that is, births of 34 to 36 weeks of gestation. Would lengthening the time in the hospital for such births reduce the frequency of rapid readmissions? A straightforward comparison of babies who stay for a shorter or longer time would be severely biased, because the principal reason for a long stay is some serious health problem. We need an instrument, something inconsequential and haphazard that encourages a shorter or a longer stay in the hospital. It turns out that babies born at certain times of day tend to stay overnight once with a shorter length of stay, whereas babies born at other times of day tend to stay overnight twice with a longer length of stay, and there is nothing particularly special about a baby who is born at 11:00 pm. Comment: Published at http://dx.doi.org/10.1214/12-AOAS582 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org
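
    The opening definition, dividing a single population into pairs so that the total within-pair distance is minimal, can be illustrated by exhaustive search on a handful of units. This is not the integer programming approach of the paper, nor a polynomial-time network method; it is a brute-force sketch that assumes an even number of units, and the function name and the toy distance are invented for the example.

```python
def optimal_pairing(units, distance):
    """Exhaustively find the pairing of `units` with minimal total distance.

    Assumes len(units) is even; runtime grows factorially, so this is only
    suitable for a handful of units.
    Returns (total_distance, list_of_pairs).
    """
    if not units:
        return 0, []
    first, rest = units[0], units[1:]
    best_total, best_pairs = float("inf"), []
    # Try every possible partner for the first unit, then pair up the remainder.
    for k, partner in enumerate(rest):
        sub_total, sub_pairs = optimal_pairing(rest[:k] + rest[k + 1:], distance)
        total = distance(first, partner) + sub_total
        if total < best_total:
            best_total, best_pairs = total, [(first, partner)] + sub_pairs
    return best_total, best_pairs


# Toy example: units are single covariate values, distance is absolute difference.
total, pairs = optimal_pairing([1, 2, 10, 12], lambda x, y: abs(x - y))
```

    Strengthening an instrument, as described in the abstract, would additionally require the distance to reward pairs that differ in encouragement; that is not modelled in this sketch.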

    Strategyproof matching with regional minimum and maximum quotas

    This paper considers matching problems with individual/regional minimum/maximum quotas. Although such quotas are relevant in many real-world settings, there is a lack of strategyproof mechanisms that take such quotas into account. We first show that without any restrictions on the regional structure, checking the existence of a feasible matching that satisfies all quotas is NP-complete. Then, assuming that regions have a hierarchical structure (i.e., a tree), we show that checking the existence of a feasible matching can be done in time linear in the number of regions. We develop two strategyproof matching mechanisms based on the Deferred Acceptance mechanism (DA), which we call Priority List based Deferred Acceptance with Regional minimum and maximum Quotas (PLDA-RQ) and Round-robin Selection Deferred Acceptance with Regional minimum and maximum Quotas (RSDA-RQ). When regional quotas are imposed, a stable matching may no longer exist since fairness and nonwastefulness, which compose stability, are incompatible. We show that both mechanisms are fair. As a result, they are inevitably wasteful. We show that the two mechanisms satisfy different versions of nonwastefulness; each is weaker than the original nonwastefulness. Moreover, we compare our mechanisms with an artificial cap mechanism via simulation experiments, which illustrate that they have a clear advantage in terms of nonwastefulness and student welfare.
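
    Both mechanisms in this abstract are built on Deferred Acceptance. For orientation, the following is a minimal sketch of the standard student-proposing DA with individual maximum quotas only. It is not PLDA-RQ or RSDA-RQ: regional quotas and minimum quotas are not handled, and the function name, data layout, and example preferences are assumptions for illustration.

```python
def deferred_acceptance(student_prefs, school_prefs, capacity):
    """Student-proposing Deferred Acceptance with maximum quotas only.

    student_prefs : {student: [schools, most preferred first]}
    school_prefs  : {school: [students, highest priority first]}
    capacity      : {school: maximum quota}
    Returns {school: [tentatively held students, best first]}.
    """
    rank = {s: {st: i for i, st in enumerate(order)}
            for s, order in school_prefs.items()}
    next_choice = {st: 0 for st in student_prefs}
    held = {s: [] for s in school_prefs}
    free = list(student_prefs)
    while free:
        st = free.pop()
        prefs = student_prefs[st]
        if next_choice[st] >= len(prefs):
            continue  # list exhausted: the student stays unmatched
        school = prefs[next_choice[st]]
        next_choice[st] += 1
        if st not in rank[school]:
            free.append(st)  # unacceptable to this school: try the next one
            continue
        # The school tentatively holds its best applicants up to capacity.
        held[school].append(st)
        held[school].sort(key=rank[school].__getitem__)
        if len(held[school]) > capacity[school]:
            free.append(held[school].pop())  # reject the lowest-priority one
    return held


# Hypothetical example with two schools and three students.
matching = deferred_acceptance(
    student_prefs={"a": ["X", "Y"], "b": ["X", "Y"], "c": ["Y", "X"]},
    school_prefs={"X": ["b", "a", "c"], "Y": ["a", "b", "c"]},
    capacity={"X": 1, "Y": 2},
)
```

    In this example school X keeps b, and a is rejected from X and ends up at Y together with c.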