
    Ligo: An Open Source Application for the Management and Execution of Administrative Data Linkage

    Introduction: Ligo is an open source application that provides a framework for managing and executing administrative data linking projects. Ligo provides an easy-to-use web interface that lets analysts select among data linking methods, including deterministic, probabilistic and machine learning approaches, and use these in a documented, repeatable, tested, step-by-step process. Objectives and Approach: The linking application has two primary functions: identifying common entities within a dataset (de-duplication) and identifying common entities between datasets (linking). The application is being built from the ground up in a partnership between the Province of British Columbia’s Data Innovation (DI) Program and Population Data BC, with input from data scientists. The simple web interface allows analysts to streamline the processing of multiple datasets in a straightforward and reproducible manner. Results: Built in Python and implemented as a desktop-capable and cloud-deployable containerized application, Ligo includes many of the latest data-linking comparison algorithms, with a plugin architecture that supports the simple addition of new formulae. Currently, deterministic approaches to linking have been implemented and probabilistic methods are in alpha testing. A fully functional alpha, including deterministic and probabilistic methods, is expected to be ready in September, with a machine learning extension expected soon after. Conclusion/Implications: Ligo has been designed with enterprise users in mind. The application is intended to make the processes of data de-duplication and linking simple, fast and reproducible. By making the application open source, we encourage feedback and collaboration from across the population research and data science community.
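The two functions named in the abstract, de-duplication and deterministic linking, can be illustrated with a minimal sketch. This is not Ligo's code or API: the function names `normalize`, `deduplicate` and `link`, the record layout, and the choice of exact matching on normalized key fields are all assumptions made for illustration only.

```python
from collections import defaultdict

# Hypothetical helpers for illustration; none of these names come from Ligo.

def normalize(record, key_fields):
    """Build a comparison key: trimmed, lower-cased values of the key fields."""
    return tuple(str(record.get(f, "")).strip().lower() for f in key_fields)

def deduplicate(records, key_fields):
    """Group records within ONE dataset that share the same normalized key."""
    groups = defaultdict(list)
    for rec in records:
        groups[normalize(rec, key_fields)].append(rec)
    return list(groups.values())

def link(left, right, key_fields):
    """Pair records BETWEEN two datasets whose normalized keys match exactly."""
    index = defaultdict(list)
    for rec in right:
        index[normalize(rec, key_fields)].append(rec)
    return [(l, r) for l in left for r in index.get(normalize(l, key_fields), [])]

people = [
    {"name": "Ann Lee", "dob": "1980-01-02"},
    {"name": " ann lee ", "dob": "1980-01-02"},  # same person, untidy entry
    {"name": "Bob Ray", "dob": "1975-05-05"},
]
claims = [{"name": "ANN LEE", "dob": "1980-01-02", "claim_id": 7}]

entity_groups = deduplicate(people, ["name", "dob"])
matched_pairs = link(people, claims, ["name", "dob"])
```

Exact matching on a normalized key is the simplest deterministic rule. A probabilistic method would instead score partial agreement between fields, which this sketch does not attempt.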

    The First Simple Symmetric 11-Venn Diagram

    An n-Venn diagram is a collection of n simple closed curves in the plane with the following properties: (a) each of the 2^n different intersections of the open interiors or exteriors of the curves is a non-empty connected region; (b) there are only finitely many points where the curves intersect. If each of the intersections is of only two curves, then the diagram is said to be simple. The purpose of this poster is to highlight how we discovered the first simple symmetric 11-Venn diagram.
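To give a sense of the size of the object, the sketch below counts the faces, vertices and edges of a simple n-Venn diagram. It relies on two standard facts that are not stated in the abstract: every crossing in a simple diagram has degree 4, so E = 2V, and Euler's formula V - E + F = 2 with F = 2^n then gives V = 2^n - 2. The function names are hypothetical, and the per-sector count assumes that "symmetric" means an n-fold rotational symmetry that permutes the crossings in orbits of size n.

```python
def simple_venn_counts(n):
    """Face, vertex and edge counts of a simple n-Venn diagram (planar graph view)."""
    faces = 2 ** n            # one region per in/out combination, including the outer one
    vertices = faces - 2      # from V - E + F = 2 with E = 2V
    edges = 2 * vertices      # each crossing has degree 4
    assert vertices - edges + faces == 2  # sanity check against Euler's formula
    return {"faces": faces, "vertices": vertices, "edges": edges}

def vertices_per_sector(n):
    """Crossings in one of n congruent sectors, assuming n-fold rotational symmetry."""
    vertices = simple_venn_counts(n)["vertices"]
    if vertices % n != 0:
        raise ValueError(f"2^{n} - 2 is not divisible by {n}")
    return vertices // n

counts_11 = simple_venn_counts(11)
sector_11 = vertices_per_sector(11)
```

For n = 11 this gives 2048 regions, 2046 crossings and 4092 edges, or 186 crossings in each of the 11 sectors.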

    Generating Simple Convex Venn Diagrams

    In this paper we are concerned with producing exhaustive lists of simple monotone Venn diagrams that have some symmetry (non-trivial isometry) when drawn on the sphere. A diagram is simple if at most two curves intersect at any point, and it is monotone if it has some embedding on the plane in which all curves are convex. We show that there are 23 such 7-Venn diagrams with a 7-fold rotational symmetry about the polar axis, and that 6 of these have an additional 2-fold rotational symmetry about an equatorial axis. In the case of simple monotone 6-Venn diagrams, we show that there are 39020 non-isomorphic planar diagrams in total, and that 375 of them have a 2-fold symmetry by rotation about an equatorial axis, and amongst these we determine all those that have a richer isometry group on the sphere. Additionally, 270 of the 6-Venn diagrams also have the 2-fold symmetry induced by reflection about the center of the sphere. Since such exhaustive searches are prone to error, we have implemented the search in a couple of ways, and with independent programs. These distinct algorithms are described. We also prove that the Grünbaum encoding can be used to efficiently identify any monotone Venn diagram.