92 research outputs found

    A Review: Data Mining Technique Used for Searching the Keywords

    Conventional spatial queries, such as range search and nearest neighbor retrieval, involve only conditions on objects' geometric properties. Today, many modern applications call for new kinds of queries that aim to find objects satisfying both a spatial predicate and a predicate on their associated texts. For example, instead of considering all restaurants, a nearest neighbor query would ask for the restaurant that is closest among those whose menus contain “Dosa, Idli, Wadapav” all at the same time. Currently, the best solution to such queries is based on the IR2-tree, which handles them efficiently. In the proposed work, we are developing a system in which nearest neighbor search with keywords is performed using the IR2-tree and a spatial inverted index. We could first fetch all the restaurants whose menus contain the set of keywords {Dosa, Idli, Wadapav} and then, from the retrieved restaurants, find the nearest one. The IR2-tree combines the R-tree with signature files. Inverted indexes (I-index) have proved to be an effective access method for keyword-based document retrieval
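
    The two-step strategy in this abstract (fetch the objects containing every keyword, then pick the nearest) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' system: the toy data, the function names, and the use of plain Euclidean distance are all hypothetical, and no R-tree or IR2-tree is involved.

```python
from collections import defaultdict
from math import hypot

# Hypothetical toy data: (name, (x, y), menu keywords).
RESTAURANTS = [
    ("A", (1.0, 1.0), {"dosa", "idli"}),
    ("B", (5.0, 5.0), {"dosa", "idli", "wadapav"}),
    ("C", (2.0, 3.0), {"dosa", "idli", "wadapav", "tea"}),
    ("D", (0.5, 0.5), {"wadapav"}),
]


def build_inverted_index(objects):
    """Map each keyword to the set of object ids whose text contains it."""
    index = defaultdict(set)
    for obj_id, (_, _, words) in enumerate(objects):
        for word in words:
            index[word].add(obj_id)
    return index


def nearest_with_keywords(objects, index, query_point, keywords):
    """Return the name of the closest object containing every keyword, or None."""
    posting_lists = [index.get(word, set()) for word in keywords]
    if not posting_lists:
        return None
    # Step 1: intersect the posting lists to keep objects with all keywords.
    candidates = set.intersection(*posting_lists)
    if not candidates:
        return None
    # Step 2: among the survivors, pick the one nearest to the query point.
    qx, qy = query_point
    best = min(
        candidates,
        key=lambda i: hypot(objects[i][1][0] - qx, objects[i][1][1] - qy),
    )
    return objects[best][0]


index = build_inverted_index(RESTAURANTS)
print(nearest_with_keywords(RESTAURANTS, index, (0.0, 0.0), ["dosa", "idli", "wadapav"]))
```

    Note that restaurant D is the closest object overall but is filtered out in the first step because its menu lacks two of the keywords.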

    A Density-Based Approach to the Retrieval of Top-K Spatial Textual Clusters

    Keyword-based web queries with local intent retrieve web content that is relevant to supplied keywords and that represent points of interest that are near the query location. Two broad categories of such queries exist. The first encompasses queries that retrieve single spatial web objects that each satisfy the query arguments. Most proposals belong to this category. The second category, to which this paper's proposal belongs, encompasses queries that support exploratory user behavior and retrieve sets of objects that represent regions of space that may be of interest to the user. Specifically, the paper proposes a new type of query, namely the top-k spatial textual clusters (k-STC) query that returns the top-k clusters that (i) are located the closest to a given query location, (ii) contain the most relevant objects with regard to given query keywords, and (iii) have an object density that exceeds a given threshold. To compute this query, we propose a basic algorithm that relies on on-line density-based clustering and exploits an early stop condition. To improve the response time, we design an advanced approach that includes three techniques: (i) an object skipping rule, (ii) spatially gridded posting lists, and (iii) a fast range query algorithm. An empirical study on real data demonstrates that the paper's proposals offer scalability and are capable of excellent performance

    Reverse spatial visual top-k query

    With the wide application of mobile Internet techniques and location-based services (LBS), massive multimedia data with geo-tags has been generated and collected. In this paper, we investigate a novel type of spatial query problem, named reverse spatial visual top-k query (RSVQk), that aims to retrieve a set of geo-images that have the query as one of the most relevant geo-images in both geographical proximity and visual similarity. Existing approaches for reverse top-k queries are not suitable to address this problem because they cannot effectively process unstructured data, such as images. To this end, we first propose the definition of the RSVQk problem and introduce the similarity measurement. A novel hybrid index, named VR2-Tree, is designed, which is a combination of the visual representation of geo-images and the R-Tree. Besides, an extension of the VR2-Tree, called CVR2-Tree, is introduced; we then discuss the calculation of lower and upper bounds and propose an optimization technique via the CVR2-Tree for further pruning. In addition, a search algorithm named the RSVQk algorithm is developed to support efficient RSVQk queries. Comprehensive experiments are conducted on four geo-image datasets, and the results illustrate that our approach can address the RSVQk problem effectively and efficiently

    Nearest Neighbor Search with Keywords in Spatial Databases

    In the real world, a spatial database may contain billions of rows. If someone wants to search for a location or place, the system searches all the rows and returns the result. In practice, only a few rows in the database may be of importance to the user. As with many pioneering solutions, the IR2-tree has a few drawbacks that affect its efficiency. The most serious issue is that the number of false hits can be very large when the object in the final result is far away from the query point, or when the result is empty. In such cases, the query algorithm would need to load the documents of many objects, causing expensive overhead because each load requires a random access. If the search is performed only in the relevant data subspace, execution time is saved. We propose a system that implements this efficiently with the help of the R-tree and a nearest neighbor algorithm that uses an inverted-index spatial R-tree to solve this problem
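
    The false hits mentioned in this abstract come from the signature-file side of the IR2-tree. The sketch below illustrates only that component (superimposed bit signatures), not the tree itself. The signature width, the toy byte-sum hash, and the sample objects are assumptions chosen so that a collision is easy to see; they are not taken from the paper.

```python
SIGNATURE_BITS = 16  # assumed width; a real signature file is tuned to the data


def word_bit(word):
    """Toy hash (an assumption): byte sum of the word modulo the signature width."""
    return 1 << (sum(word.encode("utf-8")) % SIGNATURE_BITS)


def signature(words):
    """Superimpose (bitwise OR) the bits of all words into one signature."""
    sig = 0
    for word in words:
        sig |= word_bit(word)
    return sig


def may_contain(object_signature, query_words):
    """Signature test: can only say 'definitely not' or 'maybe'."""
    query_sig = signature(query_words)
    return (object_signature & query_sig) == query_sig


def classify(objects, query_words):
    """Split the objects that pass the signature test into true and false hits."""
    query = set(query_words)
    true_hits, false_hits = [], []
    for name, words in objects:
        if not may_contain(signature(words), query):
            continue  # pruned without reading the object's document
        # Verification against the actual words stands in for the costly
        # document load described in the abstract.
        if query <= words:
            true_hits.append(name)
        else:
            false_hits.append(name)
    return true_hits, false_hits


# Hypothetical objects: (name, set of words in the associated document).
OBJECTS = [
    ("P", {"tea", "dosa"}),
    ("Q", {"eat", "idli"}),  # "eat" is an anagram of "tea", so the toy hash collides
    ("R", {"coffee"}),
]

print(classify(OBJECTS, ["tea"]))
```

    Object Q passes the signature test even though it does not contain the query word, so its document would be loaded only to be rejected. Every such false hit costs one random access, which is the overhead the abstract describes.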

    Classical and Probabilistic Information Retrieval Techniques: An Audit

    Information retrieval is the process of acquiring particular information from large resources and presenting it according to the user’s need. The rapid growth of information resources on the Internet makes information retrieval a monotonous and complicated task for users. Because of this information overload, better methodology is required to retrieve the most appropriate information from different sources. The most important information retrieval methods include the probabilistic, fuzzy set, vector space, and boolean models. Each of these models is typically used to evaluate the connection between the query and the retrievable documents. These methods are keyword based and use lists of keywords to evaluate the information material. In this paper, we present a survey of these models in which their working methodology and limitations are discussed. This understanding is important because it makes it possible to select an information retrieval technique based on the basic requirements. The survey results showed that the existing models for information retrieval fall somewhat short of what was intended. We have also discussed different areas of IR application where these models could be used
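
    To make the contrast between two of the surveyed models concrete, here is a small sketch of boolean (AND) retrieval next to vector space ranking. It assumes whitespace tokenization, raw term-frequency vectors, and cosine similarity; the corpus and function names are hypothetical, and practical systems usually add term weighting such as TF-IDF.

```python
from collections import Counter
from math import sqrt

# Hypothetical three-document corpus.
DOCS = {
    "d1": "spatial keyword search in spatial databases",
    "d2": "probabilistic models for information retrieval",
    "d3": "keyword search and information retrieval",
}


def tokenize(text):
    return text.lower().split()


def boolean_and(docs, query):
    """Boolean model: a document matches only if it contains every query term."""
    terms = set(tokenize(query))
    return sorted(d for d, text in docs.items() if terms <= set(tokenize(text)))


def cosine(a, b):
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def vector_space_rank(docs, query):
    """Vector space model: rank documents by similarity, best first."""
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(text))), d) for d, text in docs.items()]
    ranked = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
    return [d for score, d in ranked if score > 0]


print(boolean_and(DOCS, "keyword retrieval"))
print(vector_space_rank(DOCS, "keyword retrieval"))
```

    The boolean model returns only the document containing both terms, with no ordering, while the vector space model also returns partial matches and orders them by similarity.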

    Best Keyword Query Search Using Minimum Spatial Cover

    Motivated by the growing popularity of mobile computing and location-based services, and by the availability of digital maps, spatial keyword search has attracted wide attention. In spatial databases, objects are associated with keywords. The purpose is to find a number of individual objects, each of which is close to the query location and whose associated keywords relate to the set of query keywords. Keyword similarity is used to measure the relation between two sets of keywords. A keyword cover is a group of objects that are close to each other and together cover all the query keywords. This approach is known as the m Closest Keywords (mCK) query. The goal is to investigate a generic framework, known as Best Keyword Cover (BKC) queries, which considers keyword ratings in addition to inter-object distance and thereby improves the decision-making process. In BKC query processing, two algorithms are used: Baseline and Keyword Nearest Neighbor Expansion (KNNE). The baseline algorithm is derived from mCK query processing. The performance of the baseline algorithm drops drastically because of the very large number of keyword covers generated. To overcome this drawback, a more scalable algorithm, KNNE, is used. This algorithm reduces the number of keyword covers generated
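
    The mCK idea can be sketched by brute force: take one object per query keyword and keep the combination whose objects lie closest together. This is a hedged illustration, not the Baseline or KNNE algorithm from the paper. It assumes each object carries a single keyword, measures closeness as the largest pairwise Euclidean distance (the diameter), and ignores keyword ratings; the data and function names are hypothetical.

```python
from itertools import combinations, product
from math import dist

# Hypothetical objects: (name, (x, y), keyword).
OBJECTS = [
    ("hotel_1", (0.0, 0.0), "hotel"),
    ("hotel_2", (10.0, 10.0), "hotel"),
    ("cafe_1", (1.0, 0.0), "cafe"),
    ("cafe_2", (10.0, 11.0), "cafe"),
    ("bar_1", (0.0, 1.0), "bar"),
    ("bar_2", (20.0, 20.0), "bar"),
]


def diameter(points):
    """Largest pairwise distance within a group of points (0.0 for one point)."""
    return max((dist(p, q) for p, q in combinations(points, 2)), default=0.0)


def closest_keyword_cover(objects, query_keywords):
    """Exhaustively try one object per keyword; return (names, diameter) or None."""
    groups = [[o for o in objects if o[2] == kw] for kw in query_keywords]
    best, best_diameter = None, float("inf")
    # The number of candidate covers is the product of the group sizes.
    for cover in product(*groups):
        d = diameter([o[1] for o in cover])
        if d < best_diameter:
            best, best_diameter = cover, d
    if best is None:
        return None  # some query keyword has no object at all
    return [o[0] for o in best], best_diameter


print(closest_keyword_cover(OBJECTS, ["hotel", "cafe", "bar"]))
```

    Because the candidate count is the product of the per-keyword group sizes, it grows quickly as query keywords are added, which is consistent with the abstract's remark that exhaustive combination performs poorly.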

    Ranked Spatial-keyword Search over Web-accessible Geotagged Data: State of the Art

    Search engines, such as Google and Yahoo!, provide efficient retrieval and ranking of web pages based on queries consisting of a set of given keywords. Recent studies show that 20% of all Web queries also have location constraints, i.e., also refer to the location of a geotagged web page. An increasing number of applications support location-based keyword search, including Google Maps, Bing Maps, Yahoo! Local, and Yelp. Such applications depict points of interest on the map and combine their location with the keywords provided by the associated document(s). The posed queries consist of two conditions: a set of keywords and a spatial location. The goal is to find points of interest with these keywords close to the location. We refer to such a query as a spatial-keyword query. Moreover, mobile devices nowadays are enhanced with built-in GPS receivers, which permit applications (such as search engines or yellow page services) to acquire the location of the user implicitly and provide location-based services. For instance, Google Mobile App provides a simple search service for smartphones where the location of the user is automatically captured and employed to retrieve results relevant to her current location. As an example, a search for “pizza” results in a list of pizza restaurants near the user. Given the popularity of spatial-keyword queries and their wide applicability in practical scenarios, it is critical to (i) establish mechanisms for efficient processing of spatial-keyword queries, and (ii) support more expressive query formulation by means of novel query types. Although studies on both keyword search and spatial queries do exist, the problem of combining the search capabilities of both simultaneously has received little attention

    GEIR: a Full-Fledged Geographically Enhanced Information Retrieval Solution

    With the development of search engines (e.g. Google, Bing, Yahoo, etc.), people are expecting higher quality and improvements from current technologies. Bringing features of human intelligence to these tools, such as the ability to find implicit information through semantics, is one of the most prominent research lines in Computer Science. Information semantics is a very wide concept, as wide as the human capability to interpret. In particular, the analysis of geographical semantics makes it possible to associate information with a place. It is estimated that more than 70% of all information in the world has some kind of geographic feature [Jones04]. In 2012, Ed Parsons, a GeoSpatial Technologist from Google, reported that between 30% and 40% of the user queries at the Google search engine contain geographic references [Parsons12]. This thesis addresses the field of geographic information extraction and retrieval in unstructured texts. This process includes the identification of spatial features in textual documents, data indexing, the manipulation of the relevance of the identified geographic entities, and multi-criteria retrieval according to the thematic and geographic information. The main contributions of this work include a custom geographic knowledge base, built from the combination of GeoNames and WordNet; Natural Language Processing and knowledge-based heuristics for Toponym Recognition and Toponym Disambiguation; and a geographic relevance weighting model that supports non-spatial indexing and simple ranking combination approaches. The validity of each of these components is supported by practical experiments that show their effectiveness in different scenarios and their alignment with state-of-the-art solutions.
    In addition, a main contribution of this work is GEIR, a general-purpose GIR framework that includes implementations of the components described above and makes it possible to implement new ones and test their performance within an end-to-end GIR system