2,323 research outputs found

    CHORUS Deliverable 2.1: State of the Art on Multimedia Search Engines

    Get PDF
    Based on the information provided by European projects and national initiatives related to multimedia search as well as domains experts that participated in the CHORUS Think-thanks and workshops, this document reports on the state of the art related to multimedia content search from, a technical, and socio-economic perspective. The technical perspective includes an up to date view on content based indexing and retrieval technologies, multimedia search in the context of mobile devices and peer-to-peer networks, and an overview of current evaluation and benchmark inititiatives to measure the performance of multimedia search engines. From a socio-economic perspective we inventorize the impact and legal consequences of these technical advances and point out future directions of research

    Peer to Peer Information Retrieval: An Overview

    Get PDF
    Peer-to-peer technology is widely used for file sharing. In the past decade a number of prototype peer-to-peer information retrieval systems have been developed. Unfortunately, none of these have seen widespread real- world adoption and thus, in contrast with file sharing, information retrieval is still dominated by centralised solutions. In this paper we provide an overview of the key challenges for peer-to-peer information retrieval and the work done so far. We want to stimulate and inspire further research to overcome these challenges. This will open the door to the development and large-scale deployment of real-world peer-to-peer information retrieval systems that rival existing centralised client-server solutions in terms of scalability, performance, user satisfaction and freedom

    Report on the Information Retrieval Festival (IRFest2017)

    Get PDF
    The Information Retrieval Festival took place in April 2017 in Glasgow. The focus of the workshop was to bring together IR researchers from the various Scottish universities and beyond in order to facilitate more awareness, increased interaction and reflection on the status of the field and its future. The program included an industry session, research talks, demos and posters as well as two keynotes. The first keynote was delivered by Prof. Jaana Kekalenien, who provided a historical, critical reflection of realism in Interactive Information Retrieval Experimentation, while the second keynote was delivered by Prof. Maarten de Rijke, who argued for more Artificial Intelligence usage in IR solutions and deployments. The workshop was followed by a "Tour de Scotland" where delegates were taken from Glasgow to Aberdeen for the European Conference in Information Retrieval (ECIR 2017

    A Mobile Query Service for Integrated Access to Large Numbers of Online Semantic Web Data Sources

    Get PDF
    From the Semantic Web’s inception, a number of concurrent initiatives have given rise to multiple segments: large semantic datasets, exposed by query endpoints; online Semantic Web documents, in the form of RDF files; and semantically annotated web content (e.g., using RDFa), semantic sources in their own right. In various mobile application scenarios, online semantic data has proven to be useful. While query endpoints are most commonly exploited, they are mainly useful to expose large semantic datasets. Alternatively, mobile RDF stores are utilized to query local semantic data, but this requires the design-time identification and replication of relevant data. Instead, we present a mobile query service that supports on-the-fly and integrated querying of semantic data, originating from a largely unused portion of the Semantic Web, comprising online RDF files and semantics embedded in annotated webpages. To that end, our solution performs dynamic identification, retrieval and caching of query-relevant semantic data. We explore several data identification and caching alternatives, and investigate the utility of source metadata in optimizing these tasks. Further, we introduce a novel cache replacement strategy, fine- tuned to the described query dataset, and include explicit support for the Open World Assumption. An extensive experimental validation evaluates the query service and its alternative components

    Toward Entity-Aware Search

    Get PDF
    As the Web has evolved into a data-rich repository, with the standard "page view," current search engines are becoming increasingly inadequate for a wide range of query tasks. While we often search for various data "entities" (e.g., phone number, paper PDF, date), today's engines only take us indirectly to pages. In my Ph.D. study, we focus on a novel type of Web search that is aware of data entities inside pages, a significant departure from traditional document retrieval. We study the various essential aspects of supporting entity-aware Web search. To begin with, we tackle the core challenge of ranking entities, by distilling its underlying conceptual model Impression Model and developing a probabilistic ranking framework, EntityRank, that is able to seamlessly integrate both local and global information in ranking. We also report a prototype system built to show the initial promise of the proposal. Then, we aim at distilling and abstracting the essential computation requirements of entity search. From the dual views of reasoning--entity as input and entity as output, we propose a dual-inversion framework, with two indexing and partition schemes, towards efficient and scalable query processing. Further, to recognize more entity instances, we study the problem of entity synonym discovery through mining query log data. The results we obtained so far have shown clear promise of entity-aware search, in its usefulness, effectiveness, efficiency and scalability

    Lightweight Federation of Non-Cooperating Digital Libraries

    Get PDF
    This dissertation studies the challenges and issues faced in federating heterogeneous digital libraries (DLs). The objective of this research is to demonstrate the feasibility of interoperability among non-cooperating DLs by presenting a lightweight, data driven approach, or Data Centered Interoperability (DCI). We build a Lightweight Federated Digital Library (LFDL) system to provide federated search service for existing digital libraries with no prior coordination. We describe the motivation, architecture, design and implementation of the LFDL. We develop, deploy, and evaluate key services of the federation. The major difference to existing DL interoperability approaches is one where we do not insist on cooperation among DLs, that is, they do not have to change anything in their system or processes. The underlying approach is to have a dynamic federation where digital libraries can be added (removed) to the federation in real-time. This is made possible by describing the behavior of participating DLs in an XML-based language that the federation engine understands. The major contributions of this work are: (1) This dissertation addresses the interoperability issues among non-cooperating DLs and presents a practical and efficient approach toward providing federated search service for those DLs. The DL itself remains autonomous and does not need to change its structure, data format, protocol and other internal features when it is added to the federation. (2) The implementation of the LFDL is based on a lightweight, dynamic, data-centered and rule-driven architecture. To add a DL to the federation, all that is needed is observing a DL\u27s interaction with the user and storing the interaction specification in a human-readable and highly maintainable format. The federation engine provides the federated service based on the specification of a DL. A registration service allows dynamic DL registration, removal, or modification. No code needs to be rewritten or recompiled to add or change a DL. These notions are achieved by designing a new specification language in XML format and a powerful processing engine that enforces and implements the rules specified using the language. (3) In this thesis we explore an alternate approach where searches are distributed to participating DLs in real time. We have addressed the performance and reliability problems associated with other distributed search approaches. This is achieved by a locally maintained metadata repository extracted from DLs, as well as an efficient caching system based on the repository

    AMCIS 2008 Panel Report: Aging Content on the Web: Issues, Implications, and Potential Research Opportunities

    Get PDF
    Since its inception in the early 1990s, the World Wide Web (Web) has grown enormously. According to the “official Google blog” (Google 2008), the Web had 1 trillion (as in 1,000,000,000,000) unique coexisting URL’s as of July 25, 2008. Given the exponential growth of the Web over time, an issue that is likely to gain prominence is that of outdated information. This is especially important to study since many of us rely on the Web to find facts in order to take decisions. For example, for students and researchers, the “date” of a document is important for scholarship and student work. However, getting an accurate date on content is challenging, and furthermore, outdated pages that are not deleted from Web servers will continue to be returned in response to Web searches. The panel, held at the 2008 Americas Conference on Information Systems in Toronto, Canada, identified a number of research issues and opportunities that arise as a result of this phenomenon
    corecore