872 research outputs found

    Towards Why-Not Spatial Keyword Top-k Queries: A Direction-Aware Approach


    From Keyword Search to Exploration: How Result Visualization Aids Discovery on the Web

    A key to the Web's success is the power of search. The elegant way in which search results are returned is usually remarkably effective. However, for exploratory search, in which users need to learn, discover, and understand novel or complex topics, there is substantial room for improvement. Human-computer interaction researchers and web browser designers have developed novel strategies to improve Web search by enabling users to conveniently visualize, manipulate, and organize their Web search results. This monograph offers fresh ways to think about search-related cognitive processes and describes innovative design approaches to browsers and related tools. For instance, while keyword search presents users with results for specific information (e.g., what is the capital of Peru), other methods may let users see and explore the contexts of their requests for information (related or previous work, conflicting information) or the properties that associate groups of information assets (group legal decisions by lead attorney). We also consider both the traditional and novel ways in which these strategies have been evaluated. From our review of cognitive processes, browser design, and evaluations, we reflect on future opportunities and new paradigms for exploring and interacting with Web search results.
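    As a rough illustration of the grouping idea mentioned above (e.g., grouping legal decisions by lead attorney), the following Python sketch organizes flat search results by a shared property. The record fields and data are invented for illustration and do not come from any real search API.

```python
from collections import defaultdict

# Hypothetical result records; the fields are illustrative only.
results = [
    {"title": "Smith v. Jones", "lead_attorney": "A. Rivera", "year": 2019},
    {"title": "Doe v. Acme", "lead_attorney": "A. Rivera", "year": 2021},
    {"title": "State v. Kim", "lead_attorney": "B. Chen", "year": 2020},
]

def group_by_facet(records, facet):
    """Group flat search results by a shared property (a facet)."""
    groups = defaultdict(list)
    for record in records:
        groups[record[facet]].append(record)
    return dict(groups)

for attorney, cases in group_by_facet(results, "lead_attorney").items():
    print(attorney, "->", [c["title"] for c in cases])
```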

    Efficient spatial keyword query processing on geo-textual data


    A Study on a Top-Down Approach to the m-Closest Keywords Search Problem over Spatial Web Data

    This thesis addresses the problem of m-closest keywords queries (mCK queries) over spatial web objects that contain descriptive texts and spatial information. The mCK query asks for the optimal set of records, in the sense that they are the spatially closest records that together cover m user-given keywords in their texts. The mCK query can be widely used in various applications to find places of interest to the user. Generally, top-down search techniques using tree-style data structures are appropriate for finding optimal results of queries over spatial datasets. To solve the mCK query problem, a previous study by an NUS group assumed a specialized R*-tree (called bR*-tree) storing all records and proposed a top-down approach that uses Apriori-based node-set enumeration in the top-down process. However, this assumption of a prepared bR*-tree is not applicable to practical spatial web datasets, and the pruning ability of Apriori-based enumeration is highly dependent on the data distribution. In this thesis, we do not expect any prepared data partitioning; instead, we assume that a grid partitioning is created from the necessary data only when an mCK query is given. Under this assumption, we propose a new search strategy termed Diameter Candidate Check (DCC), which can find a smaller node-set at an earlier stage of the search and thereby reduce the search space more efficiently. Following the DCC strategy, we first implement DCC in a nested-loop search algorithm (called DCC-NL). Next, we improve DCC-NL in a recursive way (called RDCC), which affords a more reasonable priority order of node-set enumeration. We also use a tight lower bound to improve the pruning ability of RDCC. RDCC performs well over a wide variety of data distributions, but it still has a deficiency when one data point matches many query keywords and numerous node-sets are generated. Hence, to avoid the generation of node-sets, which is an unstable factor in search efficiency, we propose a different top-down search approach called Pairwise Expansion. Finally, we discuss some optimization techniques to enhance the Pairwise Expansion approach. We first discuss the index structure used in Pairwise Expansion and employ an on-the-fly kd-tree to reduce the index-building cost during query processing. New lower and upper bounds are also employed for more powerful pruning in Pairwise Expansion. We evaluate these approaches using both real and synthetic datasets with different data distributions, including 1.6 million Flickr photo records. The results show that the DCC strategy provides more stable search performance than the Apriori-based approach, and that the Pairwise Expansion approach, enhanced with the lower and upper bounds, has advantages over the algorithms based on node-set generation and is applicable to real spatial web data.
    University of Electro-Communications, 201
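    To make the query semantics concrete, here is a minimal, illustrative Python sketch of the mCK problem using exhaustive enumeration, with the set diameter as the closeness measure. The toy records are invented, and the exponential search below is exactly what strategies such as DCC and Pairwise Expansion are designed to prune.

```python
from itertools import combinations
from math import dist

# Toy geo-textual records: (x, y, set of keywords). Illustrative only.
records = [
    (0.0, 0.0, {"cafe"}),
    (0.1, 0.2, {"hotel"}),
    (5.0, 5.0, {"cafe", "park"}),
    (0.2, 0.1, {"park"}),
]

def diameter(points):
    """Largest pairwise distance within a set of records."""
    return max((dist(p[:2], q[:2]) for p, q in combinations(points, 2)),
               default=0.0)

def mck_brute_force(records, query_keywords):
    """Exhaustively find the spatially tightest record set covering all
    query keywords. Exponential in len(records); practical mCK solvers
    exist precisely to avoid this enumeration."""
    best, best_d = None, float("inf")
    for r in range(1, len(records) + 1):
        for subset in combinations(records, r):
            covered = set().union(*(rec[2] for rec in subset))
            if query_keywords <= covered:
                d = diameter(subset)
                if d < best_d:
                    best, best_d = subset, d
    return best, best_d

print(mck_brute_force(records, {"cafe", "hotel", "park"}))
```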

    Closing Information Gaps with Need-driven Knowledge Sharing

    Systems for asynchronous knowledge sharing, such as intranets, wikis, or file servers, often suffer from a lack of user contributions. A key reason is that information providers are decoupled from information seekers and are therefore hardly aware of their information needs. Central questions of knowledge management are therefore which knowledge is particularly valuable and by what means knowledge holders can be motivated to share it. This thesis proposes the approach of need-driven knowledge sharing (NKS), which consists of three elements. First, indicators of information need are collected, in particular search queries, and aggregated into a continuous forecast of the organizational information need (OIN). By matching this forecast against the information available in personal and shared information spaces, organizational information gaps (OIG) are identified, which point to missing information. These gaps are made transparent through so-called mediation services and mediation spaces, which help create awareness of organizational information needs and steer knowledge sharing. The concrete realization of NKS is illustrated by three applications, all of which build on established knowledge management systems. Inverse Search is a tool that suggests that knowledge holders share documents from their personal information space in order to close organizational information gaps. Woogle extends conventional wiki systems with steering instruments for detecting and prioritizing missing information, so that the wiki content can evolve in a demand-driven way. Similarly, Semantic Need, an extension for Semantic MediaWiki, steers the acquisition of structured, semantic data based on information needs available in the form of structured queries. The implementation and evaluation of the three tools show that need-driven knowledge sharing is technically feasible and can be an important complement to knowledge management. Moreover, the concept of mediation services and mediation spaces provides a framework for analyzing and designing tools according to the NKS principles. Finally, the approach presented here also offers impulses for the further development of Internet services and infrastructures such as Wikipedia or the Semantic Web.
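    As a loose illustration of the NKS idea, the sketch below aggregates a hypothetical query log into a crude demand forecast (standing in for the OIN) and reports frequently requested topics that no shared document covers (OIG candidates). All names and data are invented and greatly simplify the approach described above.

```python
from collections import Counter

# Hypothetical inputs: a log of search queries and the topics already
# covered by documents in the shared information space.
query_log = ["vpn setup", "vpn setup", "expense policy",
             "vpn setup", "onboarding"]
covered_topics = {"expense policy"}

def information_gaps(queries, covered, min_demand=2):
    """Aggregate query demand (a crude OIN forecast) and report topics
    that are frequently asked for but not covered (OIG candidates)."""
    demand = Counter(queries)
    return {topic: count for topic, count in demand.items()
            if count >= min_demand and topic not in covered}

print(information_gaps(query_log, covered_topics))  # {'vpn setup': 3}
```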

    Why-Query Support in Graph Databases

    In the last few decades, database management systems have become powerful tools for storing large amounts of data and executing complex queries over them. In addition to extended functionality, novel types of databases have appeared, such as triple stores and distributed databases. Graph databases implementing the property-graph model belong to this development branch and provide a new way of storing and processing data in the form of a graph, with nodes representing entities and edges describing the connections between them. This makes them suitable for keeping data without a rigid schema, for use cases like social-network processing or data integration. In addition to flexible storage, graph databases provide new querying possibilities in the form of path queries, detection of connected components, pattern matching, etc. However, the schema flexibility and graph queries come with additional costs. With limited knowledge about the data and little experience in constructing complex queries, users can create queries that deliver unexpected results. Forced to debug queries manually and overwhelmed by the number of query constraints, users can become frustrated with graph databases. What is really needed is to improve the usability of graph databases by providing debugging and explanation functionality for such situations. We have to assist users in discovering the reasons for unexpected results and what can be done to fix them. The unexpectedness of result sets can be expressed in terms of their size or content. In the first case, users face the empty-answer, too-many-answers, or too-few-answers problems. In the second case, users care about the result content and miss some expected answers or wonder about the presence of unexpected ones. Considering the typical problems of receiving no results or too many results when querying graph databases, in this thesis we focus on the problems of the first group, whose solutions are usually represented by why-empty, why-so-few, and why-so-many queries. Our objective is to extend graph databases with debugging functionality in the form of why-queries for unexpected query results, using pattern matching queries, one of the general graph-query types, as an example. We present a comprehensive analysis of existing debugging tools in state-of-the-art research and identify their common properties. From these, we formulate the following features of why-queries, which we discuss in this thesis: holistic support of different cardinality-based problems, explanation of unexpected results and query reformulation, comprehensive analysis of explanations, and non-intrusive user integration. To support different cardinality-based problems, we develop methods for explaining no, too few, and too many results. To cover different kinds of explanations, we present two types: subgraph-based and modification-based explanations. The first type identifies the reasons for unexpectedness in terms of query subgraphs and delivers differential graphs as answers. The second reformulates queries in such a way that they produce better results. Considering graph queries to be complex structures with multiple constraints, we investigate different ways of generating explanations, starting from the most general one that considers only the query topology, through coarse-grained rewriting, up to fine-grained modification that allows fine changes of predicates and topology. To provide a comprehensive analysis of explanations, we propose comparing them on three levels: the syntactic description, the content, and the size of the result set. To deliver user-aware explanations, we discuss two models for non-intrusive user integration in the generation process. With the techniques proposed in this thesis, we provide the fundamentals for debugging pattern-matching queries that deliver no, too few, or too many results in graph databases implementing the property-graph model.
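    To give a flavor of modification-based explanations, the following simplified Python sketch treats a query as a conjunction of node predicates and searches for maximal predicate subsets that still return results; the dropped predicates then "explain" the empty answer. This is an invented toy stand-in, not the thesis's actual algorithm over property-graph pattern queries.

```python
from itertools import combinations

# Toy property-graph nodes and a conjunctive node-predicate "query".
nodes = [
    {"label": "Person", "age": 34, "city": "Dresden"},
    {"label": "Person", "age": 51, "city": "Berlin"},
]
predicates = {
    "label": lambda n: n["label"] == "Person",
    "age":   lambda n: n["age"] < 30,
    "city":  lambda n: n["city"] == "Dresden",
}

def matches(node, preds):
    return all(p(node) for p in preds.values())

def why_empty(nodes, preds):
    """Modification-based explanation sketch: find maximal predicate
    subsets that still return results; the dropped predicates are the
    culprits behind the empty answer."""
    names = list(preds)
    for k in range(len(names) - 1, 0, -1):
        kept_sets = [kept for kept in combinations(names, k)
                     if any(matches(n, {m: preds[m] for m in kept})
                            for n in nodes)]
        if kept_sets:
            return [set(names) - set(kept) for kept in kept_sets]
    return [set(names)]

print(why_empty(nodes, predicates))  # [{'age'}]: relaxing age yields answers
```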

    On the Foundations of Data Interoperability and Semantic Search on the Web

    This dissertation studies the problem of facilitating semantic search across disparate ontologies that are developed by different organizations. There is tremendous potential in enabling users to search independent ontologies and discover knowledge in a serendipitous fashion, i.e., often completely unintended by the developers of the ontologies. The main difficulty with such search is that users generally do not have any control over the naming conventions and content of the ontologies. Thus, terms must be appropriately mapped across ontologies based on their meaning. The meaning-based search of data is referred to as semantic search, and its facilitation (a.k.a. semantic interoperability) then requires mapping between ontologies. In relational databases, searching across organizational boundaries currently involves the difficult task of setting up a rigid information integration system. Linked Data representations more flexibly tackle the problem of searching across organizational boundaries on the Web. However, there exists no consensus on how ontology mapping should be performed for this scenario, and the problem is open. We lay out the foundations of semantic search on the Web of Data by comparing it to keyword search in the relational model and by providing effective mechanisms to facilitate data interoperability across organizational boundaries. We identify two sharply distinct goals for ontology mapping based on real-world use cases. These goals are: (i) ontology development, and (ii) facilitating interoperability. We systematically analyze these goals, side by side, and contrast them. Our analysis demonstrates the implications of the goals on how to perform ontology mapping and how to represent the mappings. We rigorously compare facilitating interoperability between ontologies to information integration in databases. Based on the comparison, class matching is emphasized as a critical part of facilitating interoperability. For class matching, various class similarity metrics are formalized and an algorithm that utilizes these metrics is designed. We also experimentally evaluate the effectiveness of the class similarity metrics on real-world ontologies. In order to encode the correspondences between ontologies for interoperability, we develop a novel W3C-compliant representation, named skeleton.
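    As one toy instance of a class similarity metric, the sketch below compares class labels by the Jaccard overlap of their tokens. It is an invented simplification for illustration and not necessarily among the metrics the dissertation formalizes.

```python
import re

def token_set(label):
    """Split a class label like 'GraduateStudent' or 'graduate_student'
    into a set of lowercase tokens."""
    parts = re.findall(r"[A-Za-z][a-z]*", label.replace("_", " "))
    return {p.lower() for p in parts}

def class_similarity(label_a, label_b):
    """Jaccard overlap of label tokens: a simple label-based stand-in
    for a class similarity metric."""
    a, b = token_set(label_a), token_set(label_b)
    return len(a & b) / len(a | b) if a | b else 0.0

print(class_similarity("GraduateStudent", "graduate_student"))      # 1.0
print(class_similarity("GraduateStudent", "UndergraduateStudent"))  # ~0.33
```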

    Entity-Oriented Search

    This open access book covers all facets of entity-oriented search—where “search” can be interpreted in the broadest sense of information access—from a unified point of view, and provides a coherent and comprehensive overview of the state of the art. It represents the first synthesis of research in this broad and rapidly developing area. Selected topics are discussed in depth, the goal being to establish fundamental techniques and methods as a basis for future research and development. Additional topics are treated at a survey level only, containing numerous pointers to the relevant literature. A roadmap for future research, based on open issues and challenges identified along the way, rounds out the book. The book is divided into three main parts, sandwiched between introductory and concluding chapters. The first two chapters introduce readers to the basic concepts, provide an overview of entity-oriented search tasks, and present the various types and sources of data that will be used throughout the book. Part I deals with the core task of entity ranking: given a textual query, possibly enriched with additional elements or structural hints, return a ranked list of entities. This core task is examined in a number of different variants, using both structured and unstructured data collections, and numerous query formulations. In turn, Part II is devoted to the role of entities in bridging unstructured and structured data. Part III explores how entities can enable search engines to understand the concepts, meaning, and intent behind the query that the user enters into the search box, and how they can provide rich and focused responses (as opposed to merely a list of documents)—a process known as semantic search. The final chapter concludes the book by discussing the limitations of current approaches, and suggesting directions for future research. Researchers and graduate students are the primary target audience of this book. A general background in information retrieval is sufficient to follow the material, including an understanding of basic probability and statistics concepts as well as a basic knowledge of machine learning concepts and supervised learning algorithms.
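    To illustrate the core entity-ranking task of Part I, here is a deliberately naive Python sketch that scores entities by bag-of-words overlap between the query and each entity's textual description. The catalog is invented, and real systems use proper retrieval models (e.g., BM25 or language models) rather than this raw term count.

```python
from collections import Counter

# Tiny entity "catalog": name -> textual description. Illustrative data only.
entities = {
    "Ada Lovelace": "english mathematician first computer programmer "
                    "analytical engine",
    "Alan Turing":  "english mathematician computer scientist turing "
                    "machine enigma",
}

def score(query, description):
    """Sum of description term frequencies over the query terms; a crude
    stand-in for a real entity-retrieval scoring function."""
    tf = Counter(description.split())
    return sum(tf[t] for t in query.lower().split())

def rank_entities(query, catalog):
    """Return entity names sorted by descending score for the query."""
    return sorted(catalog, key=lambda e: score(query, catalog[e]),
                  reverse=True)

print(rank_entities("computer programmer", entities))
```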

    A Study on Ranking Query Processing over Multidimensional Data

    University of Tsukuba, 201