740 research outputs found

    Ranking Bias in Deep Web Size Estimation Using Capture Recapture Method

    Many deep web data sources are ranked data sources, i.e., they rank the matched documents and return at most the top k results even when more than k documents match the query. When estimating the size of such a ranked deep web data source, there is a well-known ranking bias: traditional methods tend to underestimate the size when queries overflow (match more documents than the return limit). Numerous estimation methods have been proposed to overcome the ranking bias, for example by avoiding overflowing queries during the sampling process, or by adjusting the initial estimate using a fixed function. We observe that the overflow rate has a direct impact on the accuracy of the estimate. Under certain conditions, the actual size is close to the estimate obtained by the unranked model multiplied by the overflow rate. Based on this result, this paper proposes a method that allows overflowing queries in the sampling process.
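
The relationship stated in the abstract above (actual size ≈ unranked estimate × overflow rate) can be illustrated with a minimal sketch. It uses the classic Lincoln-Petersen capture-recapture estimator as the unranked model, which is an assumption here, since the abstract does not name the estimator. The function names are hypothetical, and the overflow rate is taken as a given input because the abstract does not define how it is measured.

```python
def capture_recapture(sample_a, sample_b):
    """Lincoln-Petersen estimate of population size from two samples of document ids.

    Assumption: this stands in for the "unranked model"; the abstract does not
    say which estimator the paper actually uses.
    """
    a, b = set(sample_a), set(sample_b)
    overlap = len(a & b)
    if overlap == 0:
        raise ValueError("samples share no documents; estimate is undefined")
    return len(a) * len(b) / overlap


def adjust_for_overflow(unranked_estimate, overflow_rate):
    """Scale the unranked estimate by the overflow rate, as the abstract describes.

    Assumption: overflow_rate is supplied by the caller; its exact definition
    is not given in the abstract.
    """
    return unranked_estimate * overflow_rate


# Two hypothetical samples of 100 document ids each, sharing 50 ids.
first = range(0, 100)
second = range(50, 150)
estimate = capture_recapture(first, second)   # 100 * 100 / 50
adjusted = adjust_for_overflow(estimate, 1.5)  # illustrative overflow rate
```

The value 1.5 is purely illustrative; it is not taken from the paper.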

    Web service search: who, when, what, and how

    Web service search is an important problem in service-oriented architecture that has attracted widespread attention from both academia and industry. Web service search can be performed by various stakeholders, in different situations, using different forms of queries. These combinations lead to radically different implementations. Using a real-world web service composition example, this paper describes when, what, and how to search for web services from the service assemblers' point of view, where the semantics of web services are not explicitly described. The example outlines an approach to implementing a web service broker that can recommend useful services to service assemblers.

    Higher order generalization and its application in program verification

    Generalization is a fundamental operation of inductive inference. While first-order syntactic generalization (anti-unification) is well understood, its various extensions are often needed in applications. This paper discusses syntactic higher-order generalization in a higher-order language λ2 [1]. Based on the application ordering, we prove that a least general generalization exists for any two terms and is unique up to renaming. An algorithm to compute the least general generalization is also presented. To illustrate its usefulness, we propose a program verification system based on higher-order generalization that can reuse the proofs of similar programs.
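
For orientation, the well-understood first-order case mentioned in the abstract above can be sketched in a few lines. This is classic first-order anti-unification, not the paper's higher-order algorithm for λ2. The term encoding (tuples for compound terms, strings for constants) and the generated variable names `X0`, `X1`, ... are assumptions made for illustration.

```python
def anti_unify(s, t, table=None):
    """Least general generalization of two first-order terms.

    Assumed encoding: a compound term is a tuple (functor, arg1, ..., argN);
    a constant is a string. Generated variables are named X0, X1, ... and are
    assumed not to clash with constants in the input.
    """
    if table is None:
        table = {}  # maps each mismatched pair (s, t) to its variable
    if s == t:
        return s
    same_shape = (
        isinstance(s, tuple) and isinstance(t, tuple)
        and s[0] == t[0] and len(s) == len(t)
    )
    if same_shape:
        # Same functor and arity: generalize argument by argument.
        args = tuple(anti_unify(a, b, table) for a, b in zip(s[1:], t[1:]))
        return (s[0],) + args
    # Mismatch: reuse the variable if this pair was seen before.
    if (s, t) not in table:
        table[(s, t)] = f"X{len(table)}"
    return table[(s, t)]


# f(a, g(a)) and f(b, g(b)) generalize to f(X0, g(X0)).
lgg = anti_unify(("f", "a", ("g", "a")), ("f", "b", ("g", "b")))
```

Reusing one variable per mismatched pair is what makes the result least general: both occurrences of the pair (a, b) map to the same variable rather than to two unrelated ones.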

    Optimal algorithms for selecting top-k combinations of attributes : theory and applications

    Traditional top-k algorithms, e.g., TA and NRA, have been successfully applied in many areas such as information retrieval, data mining, and databases. They are designed to discover the k objects, e.g., the top-k restaurants, with the highest overall scores aggregated from different attributes, e.g., price and location. However, emerging applications such as query recommendation require the best combinations of attributes instead of objects. A straightforward extension of existing top-k algorithms is prohibitively expensive for finding top-k combinations because it must enumerate all possible combinations, whose number is exponential in the number of attributes. In this article, we formalize a novel type of top-k query, called top-k, m, which aims to find the top-k combinations of attributes based on the overall scores of the top-m objects within each combination, where m is the number of objects forming a combination. We propose a family of efficient top-k, m algorithms with different data access methods, i.e., sorted accesses and random accesses, and different query certainties, i.e., exact query processing and approximate query processing. Theoretically, we prove that our algorithms are instance optimal and analyze the bound on the depth of accesses. We further develop optimizations for efficient query evaluation that reduce the computational cost, the memory cost, and the number of accesses. We provide a case study on real applications of top-k, m queries for an online biomedical search engine. Finally, we perform comprehensive experiments to demonstrate the scalability and efficiency of top-k, m algorithms on multiple real-life datasets.
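
To make the cost of the straightforward extension concrete, here is a brute-force baseline sketch of a top-k, m query. It is not one of the paper's algorithms. The data model is an assumption, since the abstract does not specify it: attributes are partitioned into groups, a combination picks one attribute per group, an object's score in a combination is the sum of its per-attribute scores, and a combination's score is the sum of its m best object scores. All names are hypothetical.

```python
from itertools import product
import heapq


def top_k_m_bruteforce(groups, k, m):
    """Enumerate every attribute combination and keep the k best.

    Assumed input: groups is a list of dicts, each mapping an attribute name
    to {object_id: score}. One attribute is chosen from each group.
    """
    scored = []
    for combo in product(*(g.items() for g in groups)):
        names = tuple(name for name, _ in combo)
        lists = [scores for _, scores in combo]
        # Only objects present under every chosen attribute are scored.
        common = set(lists[0]).intersection(*lists[1:])
        object_scores = [sum(scores[o] for scores in lists) for o in common]
        # A combination is scored by its m best objects.
        scored.append((sum(heapq.nlargest(m, object_scores)), names))
    return heapq.nlargest(k, scored)


# Hypothetical data: two groups with two attributes each.
groups = [
    {"a1": {1: 9, 2: 5}, "a2": {1: 2, 2: 1}},
    {"b1": {1: 8, 2: 7}, "b2": {1: 1}},
]
best = top_k_m_bruteforce(groups, k=1, m=2)
```

The loop visits the full Cartesian product of the groups, which is the exponential enumeration the abstract says the proposed algorithms avoid.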

    High sensitivity face shear magneto-electric composite array for weak magnetic field sensing

    © 2020 Author(s). A magnetic field sensor is designed and fabricated using a piezoelectric face shear mode Pb(Mg₁/₃Nb₂/₃)O₃-PbTiO₃ (PMN-PT)/Metglas magneto-electric (ME) composite. An ME coupling coefficient of up to 1600 V/(cm Oe) was experimentally achieved, ~50% higher than the value for the extensional PMN-PT/Metglas ME composite with the same volume. The detection limit was found to be 2 × 10⁻⁶ Oe for the DC magnetic field and 2 × 10⁻⁸ Oe for the AC magnetic field. The sensitivity of the face shear mode PMN-PT/Metglas ME composite is about one order of magnitude higher than that of a 32 extensional mode PMN-PT/Metglas based ME composite in sensing a weak DC magnetic field. A sensing array based on the ME composite was also designed to image weak DC magnetic fields, demonstrating great potential for sensing weak magnetic fields.