
    Discovery-based edit assistance for spreadsheets

    Spreadsheets can be viewed as a highly flexible end-user programming environment that enjoys widespread adoption. But spreadsheets lack many of the structured programming concepts of regular programming paradigms. In particular, the lack of data structures in spreadsheets may lead users to introduce redundancy, loss, or corruption of data during edit actions. In this paper, we demonstrate how implicit structural properties of spreadsheet data can be exploited to offer edit assistance to spreadsheet users. Our approach is based on the discovery of functional dependencies among data items, which allows automatic reconstruction of a relational database schema. From this schema, new formulas and visual objects are embedded into the spreadsheet to offer features for auto-completion, guarded deletion, and controlled insertion. Schema discovery and spreadsheet enhancement are carried out automatically in the background and do not disturb the normal user experience.
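To make the idea of discovering functional dependencies among data items concrete, here is a minimal sketch in Python. It is not the paper's algorithm: it only checks single-column dependencies by brute force, and the function names (`fd_holds`, `discover_unary_fds`) and the sample rows are hypothetical, chosen for illustration.

```python
from itertools import permutations


def fd_holds(rows, lhs, rhs):
    """Return True if each value in column `lhs` maps to exactly one value in column `rhs`."""
    seen = {}
    for row in rows:
        key, value = row[lhs], row[rhs]
        # setdefault stores the first value seen for this key and returns it afterwards
        if seen.setdefault(key, value) != value:
            return False
    return True


def discover_unary_fds(rows, columns):
    """Brute-force search over all ordered column pairs (lhs, rhs)."""
    return [(a, b) for a, b in permutations(columns, 2) if fd_holds(rows, a, b)]


# Hypothetical spreadsheet rows, one dict per row.
rows = [
    {"city": "Lisbon", "country": "Portugal", "visitor": "Ana"},
    {"city": "Porto", "country": "Portugal", "visitor": "Rui"},
    {"city": "Lisbon", "country": "Portugal", "visitor": "Eva"},
]

fds = discover_unary_fds(rows, ["city", "country", "visitor"])
print(fds)
```

On this sample, `city -> country` holds, so a tool of the kind described could auto-complete the country once a known city is typed. Dependencies found on so few rows can be accidental, so a real system would need more evidence before acting on them.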

    Inferring tumor-specific cancer dependencies through integrating ex vivo drug response assays and drug-protein profiling

    The development of cancer therapies may be improved by the discovery of tumor-specific molecular dependencies. The requisite tools include genetic and chemical perturbations, each with its strengths and limitations. Chemical perturbations can be readily applied to primary cancer samples at large scale, but mechanistic understanding of hits and further pharmaceutical development are often complicated by the fact that a chemical compound has affinities to multiple proteins. To computationally infer specific molecular dependencies of individual cancers from their ex vivo drug sensitivity profiles, we developed a mathematical model that deconvolutes these data using measurements of protein-drug affinity profiles. By integrating a drug-kinase profiling dataset and several drug response datasets, our method, DepInfeR, correctly identified known protein kinase dependencies, including the EGFR dependence of HER2+ breast cancer cell lines, the FLT3 dependence of acute myeloid leukemia (AML) with FLT3-ITD mutations, and the differential dependencies on the B-cell receptor pathway in the two major subtypes of chronic lymphocytic leukemia (CLL). Furthermore, our method uncovered new subgroup-specific dependencies, including a previously unreported dependence of high-risk CLL on Checkpoint kinase 1 (CHEK1). The method also produced a detailed map of the kinase dependencies in a heterogeneous set of 117 CLL samples. The ability to deconvolute polypharmacological phenotypes into underlying causal molecular dependencies should increase the utility of high-throughput drug response assays for functional precision oncology.
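One way to picture such a deconvolution is as a sparse linear regression. The formulation below is an illustrative assumption, not necessarily the exact model used by DepInfeR; the symbols $Y$, $X$, $B$ and $\lambda$ are introduced here only for illustration.

```latex
\hat{B} \;=\; \arg\min_{B} \; \lVert Y - X B \rVert_F^2 \;+\; \lambda \lVert B \rVert_1
```

Here $Y$ (drugs × samples) would hold the ex vivo drug responses, $X$ (drugs × proteins) the protein-drug affinities, and $B$ (proteins × samples) the inferred per-sample dependence on each protein. The $\ell_1$ penalty pushes most entries of $B$ to zero, so each sample is explained by a small set of proteins.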

    Efficient Discovery of Ontology Functional Dependencies

    Poor data quality has become a pervasive issue due to the increasing complexity and size of modern datasets. Constraint-based data cleaning techniques rely on integrity constraints as a benchmark to identify and correct errors. Data values that do not satisfy the given set of constraints are flagged as dirty, and data updates are made to re-align the data and the constraints. However, many errors require user input to resolve because domain expertise defines specific terminology and relationships. For example, in pharmaceuticals, 'Advil' is-a brand name for 'ibuprofen', a relationship that can be captured in a pharmaceutical ontology. While functional dependencies (FDs) have traditionally been used in existing data cleaning solutions to model syntactic equivalence, they are not able to model broader relationships (e.g., is-a) defined by an ontology. In this paper, we take a first step towards extending the set of data quality constraints used in data cleaning by defining and discovering Ontology Functional Dependencies (OFDs). We lay out theoretical and practical foundations for OFDs, including a set of sound and complete axioms, and a linear inference procedure. We then develop effective algorithms for discovering OFDs, and a set of optimizations that efficiently prune the search space. Our experimental evaluation using real data shows the scalability and accuracy of our algorithms.
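The difference between a plain FD and an ontology-aware one can be sketched in a few lines of Python. This is a simplified illustration, not the paper's definition or algorithm: it treats names listed under the same ontology concept as interchangeable, and the toy ontology, the `drug_id` column, and the function names are all hypothetical.

```python
from collections import defaultdict

# Hypothetical toy ontology: each concept lists the names that denote it.
ONTOLOGY = {
    "ibuprofen": {"ibuprofen", "Advil", "Motrin"},
    "acetaminophen": {"acetaminophen", "Tylenol"},
}


def senses(value, ontology):
    """Concepts under which `value` is a known name; unknown values match only themselves."""
    found = {("concept", c) for c, names in ontology.items() if value in names}
    return found or {("literal", value)}


def ontology_fd_holds(rows, lhs, rhs, ontology):
    """True if, for every `lhs` value, all `rhs` values share at least one common concept."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[lhs]].append(row[rhs])
    for values in groups.values():
        common = set.intersection(*(senses(v, ontology) for v in values))
        if not common:
            return False
    return True


rows = [
    {"drug_id": "D1", "drug": "Advil"},
    {"drug_id": "D1", "drug": "ibuprofen"},
    {"drug_id": "D2", "drug": "Tylenol"},
]

# A syntactic FD drug_id -> drug fails on these rows ("Advil" != "ibuprofen"),
# while the ontology-aware check accepts them.
clean = ontology_fd_holds(rows, "drug_id", "drug", ONTOLOGY)

# Adding a genuinely conflicting value makes the check fail.
dirty = ontology_fd_holds(rows + [{"drug_id": "D1", "drug": "Tylenol"}],
                          "drug_id", "drug", ONTOLOGY)
print(clean, dirty)
```

Under this reading, only the last row would be flagged as dirty, whereas a purely syntactic FD would also flag the harmless 'Advil' / 'ibuprofen' pair.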

    Algorithms and implementation of functional dependency discovery in XML : a thesis presented in partial fulfilment of the requirements for the degree of Master of Information Sciences in Information Systems at Massey University

    1.1 Background
    Following the advent of the web, there has been a great demand for data interchange between applications using internet infrastructure. XML (Extensible Markup Language) provides a structured representation of data, supported by broad adoption and easy deployment. As a subset of SGML (Standard Generalized Markup Language), XML has been standardized by the World Wide Web Consortium (W3C) [Bray et al., 2004]. XML is becoming the prevalent data exchange format on the World Wide Web and is increasingly significant in storing semi-structured data. After its initial release in 1996, it has evolved and been applied extensively in all fields where the exchange of structured documents in electronic form is required. With the growing popularity of XML, the issue of functional dependency in XML has recently received well-deserved attention. The driving force for the study of dependencies in XML is that they are as crucial to XML schema design as to relational database (RDB) design [Abiteboul et al., 1995].