143 research outputs found
An Efficient Index for Reachability Queries in Public Transport Networks
Computing path queries such as the shortest path in public transport networks is challenging because the path costs between nodes change over time. A reachability query from a node at a given start time on such a network retrieves all points of interest (POIs) that are reachable within a given cost budget. Reachability queries are essential building blocks in many applications, for example, group recommendations, ranking spatial queries, or geomarketing. We propose an efficient solution for reachability queries in public transport networks. Currently, there are two options to solve reachability queries. (1) Execute a modified version of Dijkstra’s algorithm that supports time-dependent edge traversal costs; this solution is slow since it must expand edge by edge and does not use an index. (2) Issue a separate path query for each single POI, i.e., a single reachability query requires answering many path queries. Neither of these solutions scales to large networks with many POIs. We propose a novel and lightweight reachability index. The key idea is to partition the network into cells. Then, in contrast to other approaches, we expand the network cell by cell. Empirical evaluations on synthetic and real-world networks confirm the efficiency and the effectiveness of our index-based reachability query solution.
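To make the baseline in option (1) concrete, here is a minimal sketch of a time-dependent Dijkstra expansion that answers a reachability query. It is not the index proposed in the paper. The timetable layout (a dict mapping each stop to its neighbours and their `(departure, arrival)` pairs), the function name `reachable_pois`, and the use of travel time as the cost budget are all assumptions made for illustration.

```python
import heapq

def reachable_pois(timetable, pois, source, start_time, budget):
    """Return {poi: earliest_arrival} for POIs reachable within the budget.

    timetable: hypothetical layout {stop: {next_stop: [(departure, arrival), ...]}}
    budget:    maximum elapsed time after start_time (assumed cost model)
    """
    deadline = start_time + budget
    earliest = {source: start_time}
    heap = [(start_time, source)]
    while heap:
        t, u = heapq.heappop(heap)
        if t > earliest[u]:
            continue  # stale heap entry, a better arrival was already settled
        for v, connections in timetable.get(u, {}).items():
            # Edge cost depends on the time we stand at u: we may only
            # board connections that depart at or after t.
            arrivals = [arr for dep, arr in connections if dep >= t]
            if not arrivals:
                continue
            arrival = min(arrivals)
            if arrival <= deadline and arrival < earliest.get(v, float("inf")):
                earliest[v] = arrival
                heapq.heappush(heap, (arrival, v))
    return {p: earliest[p] for p in pois if p in earliest}
```

The sketch expands the network edge by edge, which is exactly the behaviour the abstract describes as slow on large networks.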
XML data exchange: Consistency and query answering
Data exchange is the problem of finding an instance of a target schema, given an instance of a source schema and a specification of the relationship between the source and the target. Theoretical foundations of data exchange have recently been investigated for relational data. In this article, we start looking into the basic properties of XML data exchange, that is, restructuring of XML documents that conform to a source DTD under a target DTD, and answering queries written over the target schema. We define XML data exchange settings in which source-to-target dependencies refer to the hierarchical structure of the data. Combining DTDs and dependencies makes some XML data exchange settings inconsistent. We investigate the consistency problem and determine its exact complexity. We then move to query answering, and prove a dichotomy theorem that classifies data exchange settings into those over which query answering is tractable, and those over which it is coNP-complete, depending on classes of regular expressions used in DTDs. Furthermore, for all tractable cases we give polynomial-time algorithms that compute target XML documents over which queries can be answered.
On the Complexity of Query Result Diversification
Query result diversification is a bi-criteria optimization problem for ranking query results. Given a database D, a query Q and a positive integer k, the problem is to find a set of k tuples from Q(D) such that the tuples are as relevant as possible to the query and, at the same time, as diverse as possible from each other. Subsets of Q(D) are ranked by an objective function defined in terms of relevance and diversity. Query result diversification has found a variety of applications in databases, information retrieval and operations research. This paper studies the complexity of result diversification for relational queries. We identify three problems in connection with query result diversification: to determine whether there exists a set of k tuples that is ranked above a bound with respect to relevance and diversity, to assess the rank of a given k-element set, and to count how many k-element sets are ranked above a given bound. We study these problems for a variety of query languages and for three objective functions. We establish the upper and lower bounds of these problems, all matching, for both combined complexity and data complexity. We also investigate several special settings of these problems, identifying tractable cases.
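The three problems named in the abstract can be illustrated with a brute-force sketch over all k-element subsets of Q(D). The abstract does not spell out the objective functions, so the weighted sum of relevance and pairwise distance used here (with trade-off parameter `lam`) is an assumption, as are all function names and the reading of "ranked above a bound" as "score at least the bound". The enumeration is exponential in k and only meant to pin down the problem statements.

```python
from itertools import combinations

def score(subset, relevance, distance, lam=0.5):
    """Assumed objective: weighted sum of total relevance and pairwise diversity."""
    rel = sum(relevance[t] for t in subset)
    div = sum(distance(a, b) for a, b in combinations(subset, 2))
    return (1 - lam) * rel + lam * div

def exists_above(results, k, bound, relevance, distance, lam=0.5):
    """Decision problem: is there a k-element set scoring at least `bound`?"""
    return any(score(s, relevance, distance, lam) >= bound
               for s in combinations(results, k))

def count_above(results, k, bound, relevance, distance, lam=0.5):
    """Counting problem: how many k-element sets score at least `bound`?"""
    return sum(1 for s in combinations(results, k)
               if score(s, relevance, distance, lam) >= bound)

def rank_of(subset, results, relevance, distance, lam=0.5):
    """Ranking problem: 1 + number of same-size sets with a strictly higher score."""
    target = score(subset, relevance, distance, lam)
    return 1 + sum(1 for s in combinations(results, len(subset))
                   if score(s, relevance, distance, lam) > target)
```

Here `results` stands for the already-evaluated answer Q(D); in the paper's setting the cost of evaluating Q is part of the complexity analysis, which this sketch ignores.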
NightSplitter: a scheduling tool to optimize (sub)group activities
Humans are social animals and usually organize activities in groups. However, they are often willing to temporarily split a larger group into subgroups to better satisfy their preferences. In this work we present NightSplitter, an on-line tool that is able to plan movie and dinner activities for a group of users, possibly splitting them into subgroups to optimally satisfy their preferences. We first model this problem and prove that it is NP-complete. We then use Constraint Programming (CP) or alternatively Simulated Annealing (SA) to solve it. Empirical results show the feasibility of the approach even for big cities where hundreds of users can select among hundreds of movies and thousands of restaurants.
Tractable XML data exchange via relations
We consider data exchange for XML documents: given source and target schemas, a mapping between them, and a document conforming to the source schema, construct a target document and answer target queries in a way that is consistent with source information. The problem has primarily been studied in the relational context, in which data-exchange systems have also been built. Since many XML documents are stored in relations, it is natural to consider using a relational system for XML data exchange. However, there is a complexity mismatch between query answering in relational and XML data exchange, which indicates that restrictions have to be imposed on XML schemas and mappings, and on XML shredding schemes, to make the use of relational systems possible. We isolate a set of five requirements that must be fulfilled in order to have a faithful representation of the XML data-exchange problem by a relational translation. We then demonstrate that these requirements naturally suggest the inlining technique for data-exchange tasks. Our key contribution is to provide shredding algorithms for schemas, documents, mappings and queries, and demonstrate that they enable us to correctly perform XML data-exchange tasks using a relational system.
Schemas for Unordered XML on a DIME
We investigate schema languages for unordered XML having no relative order
among siblings. First, we propose unordered regular expressions (UREs),
essentially regular expressions with unordered concatenation instead of
standard concatenation, that define languages of unordered words to model the
allowed content of a node (i.e., collections of the labels of children).
However, unrestricted UREs are computationally too expensive as we show the
intractability of two fundamental decision problems for UREs: membership of an
unordered word to the language of a URE and containment of two UREs.
Consequently, we propose a practical and tractable restriction of UREs,
disjunctive interval multiplicity expressions (DIMEs).
Next, we employ DIMEs to define languages of unordered trees and propose two
schema languages: disjunctive interval multiplicity schema (DIMS), and its
restriction, disjunction-free interval multiplicity schema (IMS). We study the
complexity of the following static analysis problems: schema satisfiability,
membership of a tree to the language of a schema, schema containment, as well
as twig query satisfiability, implication, and containment in the presence of
schema. Finally, we study the expressive power of the proposed schema languages
and compare them with yardstick languages of unordered trees (FO, MSO, and
Presburger constraints) and DTDs under commutative closure. Our results show
that the proposed schema languages are capable of expressing many practical
languages of unordered trees and enjoy desirable computational properties.
Comment: Theory of Computing Systems
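As a rough illustration of why interval multiplicities make membership easy to check, the sketch below tests whether an unordered word (the multiset of a node's child labels) respects a per-label occurrence interval. This is a deliberately simplified, disjunction-free setting and not the paper's definition of DIMEs; the dictionary representation and the name `satisfies` are assumptions.

```python
import math
from collections import Counter

def satisfies(children, intervals):
    """Check an unordered word against per-label occurrence intervals.

    children:  iterable of child labels; their order is ignored
    intervals: hypothetical layout {label: (min_count, max_count)}
    """
    counts = Counter(children)
    # Labels not mentioned by the expression are not allowed at all.
    if any(label not in intervals for label in counts):
        return False
    # Counter returns 0 for absent labels, so optional labels need min_count 0.
    return all(lo <= counts[label] <= hi
               for label, (lo, hi) in intervals.items())

# Example expression: exactly one title, one or more authors, an optional year.
book = {"title": (1, 1), "author": (1, math.inf), "year": (0, 1)}
```

Because only label counts matter, the check runs in time linear in the number of children plus the number of labels, in contrast to the intractability the abstract reports for unrestricted UREs.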