
    Agreement graphs and data dependencies

    The problem of deciding whether a join dependency [R] and a set F of functional dependencies logically imply an embedded join dependency [S] is known to be NP-complete. It is shown that if the set F of functional dependencies is required to be embedded in R, the problem can be decided in polynomial time. The problem is approached by introducing agreement graphs, a type of graph structure which helps expose the combinatorial structure of dependency implication problems. Agreement graphs provide an alternative formalism to tableaus and extend the application of graph and hypergraph theory in relational database research. Agreement graphs are also given a more abstract definition and are used to define agreement graph dependencies (AGDs). It is shown that AGDs are equivalent to Fagin's (unirelational) embedded implicational dependencies. A decision method is given for the AGD implication problem. Although the implication problem for AGDs is undecidable, the decision method works in many cases and lends insight into dependency implication. A number of properties of agreement graph dependencies are given and directions for future research are suggested.
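    For readers unfamiliar with join dependencies, the following minimal sketch checks whether one concrete relation instance satisfies a (possibly embedded) join dependency: it projects the relation onto each component, joins the projections, and compares the result with the projection onto the union of the components. It illustrates only the definition, not the implication problem or the agreement-graph method of the abstract; the function names and the dict-per-row encoding are assumptions made for this example.

```python
def project(rel, attrs):
    """Project a set of rows (frozensets of (attr, value) pairs) onto attrs."""
    return {frozenset((a, dict(row)[a]) for a in attrs) for row in rel}


def natural_join(left, right):
    """Natural join of two sets of rows on their shared attributes."""
    out = set()
    for l in left:
        ld = dict(l)
        for r in right:
            rd = dict(r)
            # Rows combine only if they agree on every shared attribute.
            if all(ld[a] == rd[a] for a in ld.keys() & rd.keys()):
                out.add(frozenset({**ld, **rd}.items()))
    return out


def satisfies_jd(rows, components):
    """True if the instance satisfies the join dependency [components].

    rows: list of dicts (hypothetical encoding, one dict per tuple).
    components: list of attribute lists; if they do not cover every
    attribute, the check is for the embedded join dependency.
    """
    rel = {frozenset(r.items()) for r in rows}
    joined = project(rel, components[0])
    for comp in components[1:]:
        joined = natural_join(joined, project(rel, comp))
    all_attrs = sorted(set().union(*components))
    return joined == project(rel, all_attrs)


# Two tuples sharing B but differing on A and C violate [AB, BC].
bad = [{"A": 1, "B": 1, "C": 1}, {"A": 2, "B": 1, "C": 2}]
# Tuples that share no B value satisfy it trivially.
good = [{"A": 1, "B": 1, "C": 1}, {"A": 2, "B": 2, "C": 2}]
```

    Here `satisfies_jd(bad, [["A", "B"], ["B", "C"]])` is false because the join produces the extra tuples (1, 1, 2) and (2, 1, 1), while the same call on `good` is true.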

    Computational Complexity And Algorithms For Dirty Data Evaluation And Repairing

    In this dissertation, we study the problem of evaluating and repairing dirty data in relational databases. Dirty data is usually inconsistent, inaccurate, incomplete, or stale. Existing methods and theories describe consistency using integrity constraints, such as data dependencies. However, integrity constraints are good at detecting inconsistency but cannot measure its degree, and they cannot guide data repair. This dissertation first studies the computational complexity of, and algorithms for, database inconsistency evaluation. We define the minimum tuple deletion and use it to evaluate database inconsistency. For the minimum tuple deletion problem, we study the relationship between the size of the rule set and computational complexity. We show that it is NP-hard to approximate the minimum tuple deletion within 17/16 when three functional dependencies and four attributes are involved. We propose a near-optimal approximation algorithm for computing the minimum tuple deletion with a ratio of 2 − 1/2r, where r is the number of given functional dependencies. To guide data repair, this dissertation also investigates repairing data using query feedback, and formally studies two decision problems, the functional dependency restricted deletion and insertion propagation problems, which correspond to deletion and insertion feedback. A comprehensive analysis of both the combined and data complexity of these cases is provided by considering different relational operators and feedback types. We identify the intractable and tractable cases to picture the complexity hierarchy of these problems, and provide efficient algorithms for the tractable cases. Two improvements are proposed: one computes a minimum vertex cover in the conflict graph to improve the upper bound for the tuple deletion problem, and the other is a better dichotomy for the deletion and insertion propagation problems in the absence of functional dependencies, considering data, combined, and parameterized complexity in turn.
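    The link between tuple deletion and vertex cover can be sketched as follows. Two tuples conflict when they agree on the left-hand side of a functional dependency but differ on its right-hand side; deleting any vertex cover of the resulting conflict graph leaves a consistent relation. The sketch below uses the textbook maximal-matching heuristic, which returns a cover at most twice the minimum size. It is not the dissertation's 2 − 1/2r algorithm, and the function names and row encoding are assumptions made for this example.

```python
from itertools import combinations


def conflicts(rows, fds):
    """Edges of the conflict graph: index pairs violating some FD.

    rows: list of dicts; fds: list of (lhs_attrs, rhs_attrs) tuples.
    """
    edges = []
    for i, j in combinations(range(len(rows)), 2):
        for lhs, rhs in fds:
            same_lhs = all(rows[i][a] == rows[j][a] for a in lhs)
            same_rhs = all(rows[i][a] == rows[j][a] for a in rhs)
            if same_lhs and not same_rhs:
                edges.append((i, j))
                break  # one violated FD is enough to create the edge
    return edges


def matching_vertex_cover(edges):
    """Greedy maximal matching; both endpoints of each matched edge
    form a vertex cover of size at most twice the optimum."""
    cover = set()
    for u, v in edges:
        if u not in cover and v not in cover:
            cover.update((u, v))
    return cover


# Hypothetical data: A -> B is violated by rows 0 and 1.
rows = [{"A": 1, "B": 1}, {"A": 1, "B": 2}, {"A": 2, "B": 3}]
fds = [(("A",), ("B",))]
to_delete = matching_vertex_cover(conflicts(rows, fds))
repaired = [r for k, r in enumerate(rows) if k not in to_delete]
```

    In this example the heuristic deletes both conflicting rows, although deleting one would suffice, which shows why tighter approximation ratios matter.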

    Learning Interpretable Rules for Multi-label Classification

    Multi-label classification (MLC) is a supervised learning problem in which, contrary to standard multiclass classification, an instance can be associated with several class labels simultaneously. In this chapter, we advocate a rule-based approach to multi-label classification. Rule learning algorithms are often employed when one is not only interested in accurate predictions, but also requires an interpretable theory that can be understood, analyzed, and qualitatively evaluated by domain experts. Ideally, by revealing patterns and regularities contained in the data, a rule-based theory yields new insights in the application domain. Recently, several authors have started to investigate how rule-based models can be used for modeling multi-label data. Discussing this task in detail, we highlight some of the problems that make rule learning considerably more challenging for MLC than for conventional classification. While mainly focusing on our own previous work, we also provide a short overview of related work in this area. Comment: Preprint version. To appear in: Explainable and Interpretable Models in Computer Vision and Machine Learning. The Springer Series on Challenges in Machine Learning. Springer (2018). See http://www.ke.tu-darmstadt.de/bibtex/publications/show/3077 for further information.

    Analysis and operational challenges of dynamic ride sharing demand responsive transportation models

    A wide body of evidence suggests that sustainable mobility is not only a technological question: automotive technology will be part of the solution, as a necessary albeit insufficient condition. Sufficiency is emerging as a paradigm shift from car ownership to vehicle usage, which is a consequence of socio-economic changes. Information and Communication Technologies (ICT) now make it possible for a user to access a mobility service to go anywhere at any time. Among the many emerging mobility services, Multiple Passenger Ridesharing and its variants look the most promising. However, challenges arise in implementing these systems while accounting for the time dependencies and time windows that reflect users' needs, particularly in real-time fleet dispatching and dynamic route calculation. We must also consider the feasibility and impact of the many factors that influence the behavior of the system, for example service demand, the size of the service fleet, the capacity of the shared vehicles, and whether the time window requirements are soft or tight. This paper analyzes a Decision Support System that computes solutions with ad hoc heuristics applied to variants of Pick Up and Delivery Problems with Time Windows, as well as to Feasibility and Profitability criteria rooted in Dynamic Insertion Heuristics. To evaluate the applications, a Simulation Framework is proposed. It is based on a microscopic simulation model that emulates real-time traffic conditions and a real traffic information system. It also interacts with the Decision Support System by feeding it the data required for making decisions in the simulation that emulate the behavior of the shared fleet. The proposed simulation framework has been implemented in a model of Barcelona's Central Business District. The obtained results indicate the potential feasibility of the mobility concept. Postprint (published version)
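    As a rough illustration of what an insertion heuristic for pickup and delivery with time windows does, the sketch below tries every position for a new request's pickup and delivery in an existing route, discards candidates that violate a hard time window, and keeps the one that finishes earliest. This is a generic textbook-style sketch, not the paper's Decision Support System: vehicle capacity, soft windows, and profitability criteria are ignored, and the stop encoding `(point, earliest, latest)`, the `travel` function, and the finish-time objective are all assumptions made for this example.

```python
def schedule(route, travel):
    """Finish time of the route, or None if a time window is violated.

    route: list of (point, earliest, latest); the vehicle leaves the
    first stop at its earliest time and waits when it arrives early.
    """
    t = route[0][1]
    for (p, _, _), (q, earliest, latest) in zip(route, route[1:]):
        t = max(t + travel(p, q), earliest)
        if t > latest:
            return None
    return t


def best_insertion(route, pickup, delivery, travel):
    """Try all pickup/delivery positions; return (finish_time, route)
    for the feasible candidate that finishes earliest, or None."""
    best = None
    n = len(route)
    for i in range(1, n + 1):            # pickup goes after the start stop
        with_pickup = route[:i] + [pickup] + route[i:]
        for j in range(i + 1, n + 2):    # delivery must follow the pickup
            cand = with_pickup[:j] + [delivery] + with_pickup[j:]
            end = schedule(cand, travel)
            if end is not None and (best is None or end < best[0]):
                best = (end, cand)
    return best


# Hypothetical one-dimensional network: travel time is the distance.
travel = lambda p, q: abs(p - q)
route = [(0, 0, 100), (10, 0, 100)]
result = best_insertion(route, (2, 0, 100), (4, 0, 100), travel)
```

    In a dynamic setting, a dispatcher would run such a check for each vehicle when a request arrives and assign the request to the vehicle with the best feasible insertion, rejecting it if none exists.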