28 research outputs found

    Temporal Diversification of Search Results

    Search-Optimized Suffix-Tree Storage for Biological Applications

    Suffix-trees are popular indexing structures for various sequence processing problems in biological data management. We investigate here the possibility of enhancing the search efficiency of disk-resident suffix-trees through customized layouts of tree-nodes to disk-pages. Specifically, we propose a new layout strategy, called Stellar, that provides significantly improved search performance on a representative set of real genomic sequences. Further, Stellar supports both the standard root-to-leaf lookup queries and sophisticated sequence search algorithms that exploit the suffix-links of suffix-trees. Our results are encouraging with regard to the ultimate objective of seamlessly integrating sequence processing in database engines.

    Interview with GraceAnne DeCandido

    Computing shortest paths between two given nodes is a fundamental operation over graphs, but known to be nontrivial over large disk-resident instances of graph data. While a number of techniques exist for answering reachability queries and approximating node distances efficiently, determining actual shortest paths (i.e., the sequence of nodes involved) is often neglected. However, in applications arising in massive online social networks, biological networks, and knowledge graphs, it is often essential to find out many, if not all, shortest paths between two given nodes. In this paper, we address this problem and present a scalable sketch-based index structure that not only supports estimation of node distances, but also computes the corresponding shortest paths themselves. Generating the actual path information allows for further improvements to the estimation accuracy of distances (and paths), leading to near-exact shortest-path approximations in real-world graphs. We evaluate our techniques – implemented within a fully functional RDF graph database system – over large real-world social and biological networks of sizes ranging from tens of thousands to millions of nodes and edges. Experiments on several datasets show that we can achieve query response times providing several orders of magnitude speedup over traditional path computations while keeping the estimation errors between 0% and 1% on average.

    Sparqling Kleene: Fast Property Paths in RDF-3X

    Flood Little, Cache More: Effective Result-reuse in P2P IR Systems

    Antourage: Mining Distance-constrained Trips from Flickr

    We study how to automatically extract tourist trips from large volumes of geo-tagged photographs. Working with more than 8 million of these photographs that are publicly available via photo-sharing communities such as Flickr and Panoramio, our goal is to satisfy the needs of a tourist who specifies a starting location (typically a hotel) together with a bounded travel distance and demands a tour that visits the popular sites along the way. Our system, named ANTOURAGE, solves this intractable problem using a novel adaptation of the max-min ant system (MMAS) meta-heuristic. Experiments using GPS metadata crawled from Flickr show that ANTOURAGE can generate high-quality tours.

    Rank Synopses for Efficient Time Travel on the Web Graph

    Temporal Knowledge for Timely Intelligence

    Knowledge bases about entities and their relationships are a great asset for business intelligence. Major advances in information extraction and the proliferation of knowledge-sharing communities like Wikipedia have enabled the largely automated construction of rich knowledge bases. Such knowledge about entity-oriented facts can greatly improve the output quality, and possibly also the efficiency, of processing business-relevant documents and event logs. This holds for information within the enterprise as well as in Web communities such as blogs. However, no knowledge base will ever be fully complete, and real-world knowledge is continuously changing: new facts supersede old facts, knowledge grows in various dimensions, and completely new classes, relation types, or knowledge structures will arise. This leads to a number of difficult research questions regarding temporal knowledge and the life-cycle of knowledge bases. This short paper outlines challenging issues and research opportunities, and provides references to technical literature.

    Rank synopses for efficient time travel on the web graph

    NEAT: News Exploration Along Time
