    Low-Sensitivity Functions from Unambiguous Certificates

    We provide new query complexity separations against sensitivity for total Boolean functions: a power $3$ separation between deterministic (and even randomized or quantum) query complexity and sensitivity, and a power $2.22$ separation between certificate complexity and sensitivity. We get these separations by using a new connection between sensitivity and a seemingly unrelated measure called one-sided unambiguous certificate complexity ($UC_{min}$). We also show that $UC_{min}$ is lower-bounded by fractional block sensitivity, which means we cannot use these techniques to get a super-quadratic separation between $bs(f)$ and $s(f)$. We also provide a quadratic separation between the tree-sensitivity and decision tree complexity of Boolean functions, disproving a conjecture of Gopalan, Servedio, Tal, and Wigderson (CCC 2016). Along the way, we give a power $1.22$ separation between certificate complexity and one-sided unambiguous certificate complexity, improving the power $1.128$ separation due to Göös (FOCS 2015). As a consequence, we obtain an improved $\Omega(\log^{1.22} n)$ lower bound on the co-nondeterministic communication complexity of the Clique vs. Independent Set problem. Comment: 25 pages. This version expands the results and adds Pooya Hatami and Avishay Tal as authors.
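The two measures compared in this abstract, $s(f)$ and $bs(f)$, can be computed by brute force for very small functions. The sketch below uses the standard textbook definitions, not anything taken from the paper; the function names `sensitivity` and `block_sensitivity` are illustrative, and the exhaustive search is only practical for a handful of input bits.

```python
from itertools import product


def sensitivity(f, n):
    """s(f): max over inputs x of the number of single-bit flips that change f(x)."""
    best = 0
    for x in product((0, 1), repeat=n):
        flips = sum(
            f(x) != f(x[:i] + (1 - x[i],) + x[i + 1:]) for i in range(n)
        )
        best = max(best, flips)
    return best


def block_sensitivity(f, n):
    """bs(f): max over x of the number of pairwise disjoint blocks whose flip changes f(x)."""

    def flip(x, block):
        # `block` is a bitmask; bit i set means coordinate i is flipped.
        return tuple(b ^ ((block >> i) & 1) for i, b in enumerate(x))

    def max_disjoint(blocks, used):
        # Exhaustive search for the largest family of pairwise disjoint blocks.
        best = 0
        for j, b in enumerate(blocks):
            if b & used == 0:
                best = max(best, 1 + max_disjoint(blocks[j + 1:], used | b))
        return best

    best = 0
    for x in product((0, 1), repeat=n):
        sensitive = [b for b in range(1, 1 << n) if f(flip(x, b)) != f(x)]
        best = max(best, max_disjoint(sensitive, 0))
    return best


def majority3(x):
    """Majority of three bits, used here only as a small example."""
    return int(sum(x) >= 2)


def or3(x):
    """OR of three bits, used here only as a small example."""
    return int(any(x))
```

Since every single bit is also a block of size one, $s(f) \le bs(f)$ always holds; the separations in the abstract concern how far apart such measures can be.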

    Querying the web of data with low latency: high performance distributed SPARQL processing and benchmarking

    The Web of Data extends the World Wide Web (WWW) in a way that applications can understand information and cooperate with humans on complex tasks. The basis of performing complex tasks is low latency queries over the Web of Data. The large scale and distributed nature of the Web of Data have negative impacts on several critical factors for efficient query processing, including fast data transmission between datasets, predictable data distribution, and statistics that summarise and describe certain patterns in the data. Moreover, it is common on the Web of Data that the same resource is identified by multiple URIs. This phenomenon, named co-reference, potentially increases the complexity of query processing and makes it even harder to obtain accurate statistics. Given these challenges, it is not clear whether efficient queries on the Web of Data are achievable on a large scale. In this thesis, we explore techniques to improve the efficiency of querying the Web of Data on a large scale. More specifically, we investigate two typical scenarios on the Web of Data: 1) the scenario in which all datasets provide detailed statistics that are possibly available on a large scale, and 2) the scenario in which co-reference is taken into account and datasets' statistics are not reliable. For each scenario we explore existing and novel optimisation techniques that are tailored for querying the Web of Data, as well as well-developed techniques with careful adjustments. For the scenario with detailed statistics we provide a scheme that implements a statistics-based query optimisation approach that requires detailed statistics and intensively exploits parallelism. We propose an efficient algorithm called Parallel Sub-query Identification to increase the degree of parallelism. It breaks a SPARQL query into sub-queries that can be processed in parallel while not increasing network traffic. We combine it with dynamic programming to produce query plans with both minimum costs and a fair degree of parallelism. Furthermore, we develop a mechanism that maximally exploits the bandwidth and computing power of datasets. For the scenario with co-reference and without reliable statistics we provide a scheme that implements a dynamic query optimisation approach that takes co-reference into account and utilises runtime statistics to raise query efficiency even further. We propose a model called Virtual Graph to transform a query and all its co-referent siblings into a single query with pre-defined bindings. Virtual Graph reduces the large number of outgoing and incoming requests that would be required to process co-referent queries individually. Moreover, Virtual Graph enables query optimisers to find the optimal plan with respect to all co-referent queries as a whole. Parallel Sub-query Identification is used in this scheme as well, but provides a higher degree of parallelism with the help of runtime statistics. Because runtime statistics are used, this scheme employs a Minimum-Spanning-Tree-based algorithm. The same parallel execution mechanism used in the previous scenario is adopted here as well. In order to examine the effectiveness of our schemes in practice, we deploy the above approaches in two distributed SPARQL engines, LHD-s and LHD-d respectively. Both engines are implemented using a popular Java-based platform for building Semantic Web applications. They can be used either as standalone applications or integrated into existing systems that require quick responses to Linked Data queries. We also propose a scalable and flexible benchmark, called Distributed SPARQL Evaluation Framework (DSEF), for evaluating optimisation approaches in the Web of Data. DSEF adopts an expandable virtual-machine-based structure and provides a set of efficient tools to help easily set up RDF networks of arbitrary sizes. We further investigate the proportion and distribution of co-reference in the real world, based on which DSEF is able to simulate co-reference for given RDF datasets. DSEF bases its soundness on the use of widely accepted assessment data and queries. By comparing both LHD-s and LHD-d with existing approaches using DSEF, we provide evidence that neither the existing statistics provided by datasets nor cost estimation methods are sufficiently accurate. On the other hand, dynamic optimisation using runtime statistics together with carefully tuned parallelism is promising for significantly reducing the latency of large-scale queries on the Web of Data. We also demonstrate that the Parallel Sub-query Identification and Virtual Graph algorithms significantly increase query efficiency for queries with or without co-reference. In summary, the contributions of this thesis include: 1) proposing two schemes for improving query efficiency in two typical scenarios in the Web of Data; 2) providing implementations, named LHD-s and LHD-d, for the two schemes respectively; 3) proposing a scalable and flexible evaluation framework for distributed SPARQL engines called DSEF; and 4) showing evidence that runtime-statistics-based dynamic optimisation with parallelism is promising for reducing the latency of Linked Data queries on a large scale.


    $\exists$-InvSat is the problem which takes as input a relation $R$ and a finite set $\mathcal{S}$ of relations on the same finite domain $D$, and asks whether $R$ is definable by a conjunctive query over $\mathcal{S}$, i.e., by a formula of the form $\exists \mathbf{y}\, \varphi(\mathbf{x},\mathbf{y})$ where $\varphi$ is a conjunction of atomic formulas built on the relations in $\mathcal{S} \cup \{=\}$. (These are also called \emph{primitive positive formulas}.) The problem is known to be in co-NExpTime, and has been shown to be tractable on the boolean domain. We show that there exists $k>2$ such that $\exists$-InvSat is co-NExpTime-complete on $k$-element domains, answering a question of Creignou, Kolaitis and Zanuttini.

    Evaluating geometric queries using few arithmetic operations

    Let $\mathcal{P} := (P_1,\dots,P_s)$ be a given family of $n$-variate polynomials with integer coefficients and suppose that the degrees and logarithmic heights of these polynomials are bounded by $d$ and $h$, respectively. Suppose furthermore that for each $1\leq i\leq s$ the polynomial $P_i$ can be evaluated using $L$ arithmetic operations (additions, subtractions, multiplications and the constants 0 and 1). Assume that the family $\mathcal{P}$ is in a suitable sense \emph{generic}. We construct a database $\mathcal{D}$, supported by an algebraic computation tree, such that for each $x\in [0,1]^n$ the query for the signs of $P_1(x),\dots,P_s(x)$ can be answered using $h\, d^{O(n^2)}$ comparisons and $nL$ arithmetic operations between real numbers. The arithmetic-geometric tools developed for the construction of $\mathcal{D}$ are then employed to exhibit example classes of systems of $n$ polynomial equations in $n$ unknowns whose consistency may be checked using only few arithmetic operations, admitting however an exponential number of comparisons.

    Do university students, alumni, educators and employers link assessment and graduate employability?

    Within higher education literature, assessment and graduate employability are linked and co-presented, in that quality student assessment is purported to enhance employability. This research was designed to query the extent to which these same conceptual links are perceived by those actively involved in higher education. Four stakeholder groups from multiple disciplines and eight Australian states and territories (students, alumni, educators and employers) were interviewed about graduate employability (n = 127). Interviewers intentionally omitted any mention of assessment to determine whether the various stakeholders would bring it up themselves when asked questions such as what is and is not effective for nurturing employability. The results indicated that among the educators, assessment emerged as a dominant theme. While the three other stakeholder groups infrequently used the term assessment, they did discuss related educational concepts and practices in the context of enhanced employability. All stakeholder groups identified a missing link between theory and practice, with educators specifying that link as assessment. Recommendations to improve employability through assessment are the key takeaways from this research. © 2017 HERDS

    Fuzzy term proximity with boolean queries at 2006 TREC Terabyte task

    http://trec.nist.gov/pubs/trec15/papers/ecole.tera.final.pdf International audience. We report here the results of the fuzzy term proximity method applied to the Terabyte Task. The fuzzy proximity method is based on the idea that the closer the query terms are in a document, the more relevant this document is. This principle gives a high-precision method, so we complete its results with those obtained with the default method (Dirichlet) of the Zettair search engine. Our model is able to deal with Boolean queries, but contrary to the traditional extensions of the basic Boolean IR model, it does not explicitly use a proximity operator because it cannot be generalized to nodes. The fuzzy term proximity is controlled with an influence function. Given a query term and a document, the influence function associates to each position in the text a value dependent on the distance to the nearest occurrence of this query term. To model proximity, this function is decreasing with distance. Different forms of function can be used: triangular, Gaussian, etc. For practical reasons only functions with finite support were used. The support of the function is limited by a constant called k. The fuzzy term proximity functions are associated with every leaf of the query tree. Then fuzzy proximities are computed for every node with a post-order tree traversal. Given the fuzzy proximities of the children of a node, its fuzzy proximity is computed, as in the fuzzy IR models, with a minimum (resp. maximum) combination for conjunctive (resp. disjunctive) nodes. Finally, a fuzzy query proximity value is obtained for each position in the document at the root of the query tree. The score of the document is the integral of the function obtained at the tree root. For the experiments, we modified Lucy (version 0.5.2) to implement our matching function. Two query sets are used for our runs. One set is built manually with the title words (and sometimes some description words). Each of these words is OR'ed with its derivatives, such as plurals. The OR nodes obtained are then AND'ed at the tree root. Another query set is built automatically with an AND of terms extracted from the title field. These two query sets are submitted to our system with two values of k: 50 and 200. The two corresponding query sets with flat queries are also submitted to the Zettair search engine.
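The scoring procedure described in this abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' Lucy implementation: it assumes a triangular influence function that falls linearly to zero at distance k, represents a query tree as nested `("AND" | "OR", [children])` tuples, and approximates the integration at the root by a sum over token positions. All names (`influence`, `term_proximity`, `proximity`, `score`) are hypothetical.

```python
from typing import List, Tuple, Union

# A query node is either a term (str) or an operator with children:
# ("AND", [child, ...]) or ("OR", [child, ...]).
Query = Union[str, Tuple[str, list]]


def influence(distance: int, k: int) -> float:
    """Assumed triangular influence: 1 at an occurrence, 0 at distance >= k."""
    return max(0.0, (k - distance) / k)


def term_proximity(positions: List[int], length: int, k: int) -> List[float]:
    """Value at each text position, from the nearest occurrence of the term."""
    if not positions:
        return [0.0] * length
    return [
        influence(min(abs(x - p) for p in positions), k) for x in range(length)
    ]


def proximity(node: Query, doc: List[str], k: int) -> List[float]:
    """Post-order evaluation: children first, then min (AND) or max (OR)."""
    if isinstance(node, str):
        positions = [i for i, word in enumerate(doc) if word == node]
        return term_proximity(positions, len(doc), k)
    op, children = node
    curves = [proximity(child, doc, k) for child in children]
    combine = min if op == "AND" else max
    return [combine(values) for values in zip(*curves)]


def score(query: Query, doc: List[str], k: int) -> float:
    """Discrete stand-in for integrating the root function over the document."""
    return sum(proximity(query, doc, k))
```

With k = 2 and the query `("AND", ["a", "b"])`, the document `["x", "a", "b", "x"]` has the two terms adjacent and receives a positive score, while `["a", "x", "x", "x", "b"]` scores zero because the two influence functions never overlap.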

    Query Modification in Object-oriented Database Federation

    We discuss the modification of queries against an integrated view in a federation of object-oriented databases. We present a generalisation of existing algorithms for simple global query processing that works for arbitrarily defined integration classes. We then extend this algorithm to deal with object-oriented features such as queries involving path expressions and nesting. We show how properties of the OO-style of modelling relationships through object references can be exploited to reduce the number of subqueries necessary to evaluate such queries.
