Ranking Bias in Deep Web Size Estimation Using Capture Recapture Method
Many deep web data sources are ranked data sources, i.e., they rank the matched documents and return at most the top k results even when more than k documents match the query. When estimating the size of such a ranked deep web data source, it is well known that there is a ranking bias: traditional methods tend to underestimate the size when queries overflow (match more documents than the return limit). Numerous estimation methods have been proposed to overcome the ranking bias, for example by avoiding overflowing queries during the sampling process, or by adjusting the initial estimate using a fixed function. We observe that the overflow rate has a direct impact on the accuracy of the estimate. Under certain conditions, the actual size is close to the estimate obtained by the unranked model multiplied by the overflow rate. Based on this result, this paper proposes a method that allows overflowing queries in the sampling process.
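As a rough illustration of the relationship stated in the abstract above, the sketch below computes a classic two-sample capture-recapture (Lincoln-Petersen) estimate and multiplies it by an overflow rate. The choice of Lincoln-Petersen as the unranked model, the function names, and the way the overflow rate is supplied as a plain number are all assumptions made for illustration; the abstract does not define them, and this is not the paper's method.

```python
def lincoln_petersen(n1: int, n2: int, overlap: int) -> float:
    """Two-sample capture-recapture estimate: n1 * n2 / overlap."""
    if overlap <= 0:
        raise ValueError("the two samples must share at least one document")
    return n1 * n2 / overlap


def adjusted_size(sample_a: set, sample_b: set, overflow_rate: float) -> float:
    """Unranked estimate scaled by an overflow rate.

    overflow_rate is taken as given (hypothetical input); how it is
    measured is not specified in the abstract.
    """
    overlap = len(sample_a & sample_b)
    unranked = lincoln_petersen(len(sample_a), len(sample_b), overlap)
    return unranked * overflow_rate


# Two samples of 100 document ids each, sharing 20 ids.
sample_a = set(range(0, 100))
sample_b = set(range(80, 180))
print(adjusted_size(sample_a, sample_b, overflow_rate=1.5))  # 750.0
```

With an overflow rate of 1.0 the function reduces to the plain unranked estimate, which matches the case where no query overflows.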
Understanding the Evolution of Landscape Planning Strategy in China: From Fragmented Urban Green Space System to Regional Greenway Network across Cities
In China, the urban green space system (UGSS) is defined as a network of all sorts of green spaces in a city's built-up area that supports ecological and recreational functions (Wang, 2009). The implementation of UGSS reveals several common problems, such as overemphasizing green spaces in the built-up area of the city, losing stability and rationality in spatial patterns, and mismatching the progress of ecological restoration cycles (Liu & Wen, 2007; Wang, 2009). Greenways represent a distinctly strategic approach to landscape planning through combinations of spatially and functionally compatible land uses within a network (Ahern, 1995). Specifically, four principal strategies (protective, defensive, offensive, and opportunistic) are recognized as an overall planning strategy for greenways (Ahern, 1995). Inspired by the greenway concept, China has constructed 2,372 kilometers of greenway network in the Pearl River Delta (PRD) in order to maintain regional ecological safety, improve regional livability, stimulate economic growth, and protect cultural and historic resources (He et al., 2010). Meanwhile, various cities in China have initiated their own greenway network planning for implementation. This indicates a potential greenway movement in the country during the next few years, following the global interest in greenways as a sustainable landscape planning strategy. Through a historical review of the urban green space system in China and a case study of the PRD greenway network, this research attempts to answer the following questions: (1) How is the contemporary greenway network planned and implemented in China? (2) How have Ahern's four principal strategies (protective, defensive, offensive, and opportunistic) been applied within the PRD regional greenway network as a landscape planning strategy?
The purpose of this research is to provide a holistic perspective on greenway planning and development in China. Specifically, this paper will (1) present the evolution of UGSS planning and recent greenway development in China; (2) discuss the practice of implementing a greenway network as a landscape planning strategy; and (3) discuss future greenway development in China.
Web service search: who, when, what, and how
Web service search is an important problem in service-oriented architecture that has attracted widespread attention from academia as well as industry. Web service search can be performed by various stakeholders, in different situations, using different forms of queries. These combinations lead to radically different implementations. Using a real-world web service composition example, this paper describes when, what, and how to search for web services from the service assemblers’ point of view, where the semantics of web services are not explicitly described. This example outlines an approach to implementing a web service broker that can recommend useful services to service assemblers.
Higher order generalization and its application in program verification
Generalization is a fundamental operation of inductive inference. While first-order syntactic generalization (anti-unification) is well understood, its various extensions are often needed in applications. This paper discusses syntactic higher-order generalization in a higher-order language λ2 [1]. Based on the application ordering, we prove that a least general generalization exists for any two terms and is unique up to renaming. An algorithm to compute the least general generalization is also presented. To illustrate its usefulness, we propose a program verification system based on higher-order generalization that can reuse the proofs of similar programs.
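For readers unfamiliar with the first-order case that the abstract takes as its starting point, the following is a minimal sketch of first-order anti-unification (least general generalization). It does not implement the higher-order algorithm of the paper. The term encoding is an assumption: a compound term is a tuple whose first element is the function symbol, constants are lowercase strings, and generated variables are named `X0`, `X1`, and so on.

```python
def lgg(s, t, table=None):
    """First-order least general generalization of two terms.

    Terms are lowercase strings (constants) or tuples of the form
    (function_symbol, arg1, ..., argn). The table maps each pair of
    disagreeing subterms to one variable, so the same disagreement is
    always generalized by the same variable.
    """
    if table is None:
        table = {}
    if s == t:
        return s
    same_shape = (
        isinstance(s, tuple)
        and isinstance(t, tuple)
        and s[0] == t[0]
        and len(s) == len(t)
    )
    if same_shape:
        # Same function symbol and arity: generalize argument by argument.
        args = tuple(lgg(a, b, table) for a, b in zip(s[1:], t[1:]))
        return (s[0],) + args
    # Disagreement: introduce (or reuse) a variable for this pair.
    if (s, t) not in table:
        table[(s, t)] = f"X{len(table)}"
    return table[(s, t)]


# f(a, g(a)) and f(b, g(b)) generalize to f(X0, g(X0)).
print(lgg(("f", "a", ("g", "a")), ("f", "b", ("g", "b"))))
```

Reusing one variable per disagreement pair is what makes the result least general: f(X0, g(X1)) would also generalize both inputs, but it loses the information that the two positions always agree.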
Optimal algorithms for selecting top-k combinations of attributes : theory and applications
Traditional top-k algorithms, e.g., TA and NRA, have been successfully applied in many areas such as information retrieval, data mining, and databases. They are designed to discover k objects, e.g., top-k restaurants, with the highest overall scores aggregated from different attributes, e.g., price and location. However, emerging applications like query recommendation require providing the best combinations of attributes instead of objects. The straightforward extension based on the existing top-k algorithms is prohibitively expensive for answering top-k combinations because it needs to enumerate all possible combinations, the number of which is exponential in the number of attributes. In this article, we formalize a novel type of top-k query, called top-k, m, which aims to find top-k combinations of attributes based on the overall scores of the top-m objects within each combination, where m is the number of objects forming a combination. We propose a family of efficient top-k, m algorithms with different data access methods, i.e., sorted accesses and random accesses, and different query certainties, i.e., exact query processing and approximate query processing. Theoretically, we prove that our algorithms are instance optimal and analyze the bound of the depth of accesses. We further develop optimizations for efficient query evaluation to reduce the computational and memory costs and the number of accesses. We provide a case study on the real applications of top-k, m queries for an online biomedical search engine. Finally, we perform comprehensive experiments to demonstrate the scalability and efficiency of top-k, m algorithms on multiple real-life datasets.
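The enumeration cost mentioned in the abstract can be made concrete with a naive baseline. The sketch below is one possible reading of a top-k, m query, not the paper's algorithms: it assumes a combination is a fixed-size subset of attributes, that an object's score within a combination is the sum of its scores on those attributes, and that a combination's score is the sum of its m best object scores. The function name, the data layout, and these scoring rules are all hypothetical.

```python
from heapq import nlargest
from itertools import combinations


def naive_top_k_m(scores, size, k, m):
    """Brute-force baseline: score every attribute combination.

    scores maps object id -> {attribute: score}. Every subset of
    `size` attributes is enumerated, which is what makes this
    approach expensive as the number of attributes grows.
    """
    attributes = sorted({a for obj in scores.values() for a in obj})
    ranked = []
    for combo in combinations(attributes, size):
        # Assumed aggregation: sum of an object's scores on the combo.
        totals = [sum(obj.get(a, 0.0) for a in combo) for obj in scores.values()]
        # Assumed combination score: sum of the m best object totals.
        ranked.append((sum(nlargest(m, totals)), combo))
    return nlargest(k, ranked)


restaurants = {
    "r1": {"price": 0.9, "location": 0.2, "rating": 0.5},
    "r2": {"price": 0.4, "location": 0.8, "rating": 0.6},
    "r3": {"price": 0.1, "location": 0.3, "rating": 0.9},
}
best = naive_top_k_m(restaurants, size=2, k=1, m=2)
print(best[0][1])  # ('location', 'rating')
```

The loop visits every subset of the chosen size, so its cost grows combinatorially with the number of attributes; avoiding this full enumeration is the motivation the abstract gives for dedicated top-k, m algorithms.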
High sensitivity face shear magneto-electric composite array for weak magnetic field sensing
© 2020 Author(s). A magnetic field sensor is designed and fabricated using a piezoelectric face shear mode Pb(Mg1/3Nb2/3)O3-PbTiO3 (PMN-PT)/Metglas magneto-electric (ME) composite. An outstanding ME coupling coefficient of up to 1600 V/(cm Oe) was experimentally achieved, ∼50% higher than the value from the extensional PMN-PT/Metglas ME composite with the same volume. The detection limit was found to be 2 × 10⁻⁶ Oe for the DC magnetic field and 2 × 10⁻⁸ Oe for the AC magnetic field. The sensitivity of the face shear mode PMN-PT/Metglas ME composite is about one order of magnitude higher than that of a 32 extensional mode PMN-PT/Metglas based ME composite in sensing a weak DC magnetic field. A sensing array was also designed based on the ME composite to image weak DC magnetic fields, demonstrating great potential for sensing weak magnetic fields.