18 research outputs found

    Overview of the TREC 2014 Federated Web Search Track

    The TREC Federated Web Search track facilitates research on topics related to federated web search by providing a large, realistic data collection sampled from a multitude of online search engines. The FedWeb 2013 challenges of Resource Selection and Results Merging are again included in FedWeb 2014, and we additionally introduce the task of Vertical Selection. Other new aspects are the required link between Resource Selection and Results Merging, and the importance of diversity in the merged results. After an overview of the new data collection and relevance judgments, the individual participants' results for the tasks are introduced, analyzed, and compared.

    Using Semantic-Based User Profile Modeling for Context-Aware Personalised Place Recommendations

    Place Recommendation Systems (PRSs) recommend places to visit to World Wide Web users. Existing PRSs are still limited by several problems, including recommending a similar set of places to different users (lack of personalization) and a lack of diversity in the set of recommended items (content overspecialization). One of the main objectives of PRSs, or contextual suggestion systems, is to bridge the semantic gap between queries and suggestions by going beyond keyword matching. To address these issues, in this study we build a personalized context-aware place recommender system using semantic-based user profile modeling, with the aim of overcoming the limitations of current user profile building techniques and improving the retrieval performance of personalized place recommender systems. The approach consists of building a place ontology based on the Open Directory Project (ODP), a hierarchical ontology scheme for organizing websites. We model a semantic user profile from the place concepts extracted from the place ontology, weighted according to their semantic relatedness to user interests. The semantic user profile is then exploited to produce personalized recommendations by re-ranking the initial search results to improve retrieval performance. We evaluate this approach on a dataset obtained using the Google Places API. Results show that our proposed approach significantly improves retrieval performance compared to a classic keyword-based place recommendation model.

    The Role of Document Structure and Citation Analysis in Literature Information Retrieval

    Literature Information Retrieval (IR) is the task of searching for relevant publications given a particular information need expressed as a set of queries. With the staggering growth of scientific literature, it is critical to design effective retrieval solutions that facilitate efficient access to it. We hypothesize that genre-specific characteristics of scientific literature, such as metadata and citations, are potentially helpful for enhancing scientific literature search. We conducted systematic and extensive IR experiments on open information retrieval test collections to investigate their roles in enhancing literature retrieval effectiveness. This thesis consists of three major parts. First, we examined the role of document structure in literature search through comprehensive studies of the retrieval effectiveness of a set of structure-aware retrieval models on ad hoc scientific literature search tasks. Second, under the language modeling retrieval framework, we studied the use of citation and co-citation analysis results as sources of evidence for enhancing literature search. Specifically, we examined the distribution patterns of relevant documents over partitioned clusters of document citation and co-citation graphs; we examined seven ways of modeling document prior probabilities of relevance based on citation and co-citation analysis; and we studied the effectiveness of boosting retrieved documents with the scores of their neighboring documents in terms of co-citation counts, co-citation similarities, and Howard White's pennant scores. Third, we combined structured retrieval features and citation-related features to develop machine-learned retrieval models for literature search, and assessed the effectiveness of learning-to-rank algorithms and various literature-specific features. Our major findings are as follows. State-of-the-art structure-aware retrieval models, though reported to perform well in known-item finding tasks, do not significantly outperform non-fielded baseline retrieval models in ad hoc literature information retrieval. Though the distributions of relevant documents over citation and co-citation graph partitions reveal a favorable pattern, citation and co-citation analysis results on the current iSearch test collection only modestly improve retrieval effectiveness. However, priors derived from co-citation analysis outperform those derived from citation analysis, and the pennant score for document expansion outperforms the raw co-citation count or the cosine similarity of co-citation counts. Our learning-to-rank experiments show that, in a heterogeneous collection setting, citation-related features can significantly outperform baselines. Ph.D., Information Studies -- Drexel University, 201

    Multimodal Legal Information Retrieval

    The goal of this thesis is to present a multifaceted way of inducing semantic representations from legal documents and of accessing information in a precise and timely manner. The thesis explores approaches to semantic information retrieval (IR) in the legal context with a technique that maps specific parts of a text to the relevant concept. This technique segments the text using Latent Dirichlet Allocation (LDA), a topic modeling algorithm, expands each concept using Natural Language Processing techniques, and then associates the text segments with the concepts using a semi-supervised text similarity technique. This addresses two problems: user specificity in formulating a query, and information overload, since querying a large document collection with a set of concepts is more fine-grained when specific information, rather than full documents, is retrieved. The second part of the thesis describes our Neural Network Relevance Model for E-Discovery Information Retrieval. Our algorithm is essentially a feature-rich ensemble system in which different component neural networks extract different relevance signals. This model has been trained and evaluated on the TREC Legal track 2010 data. The performance of our models across the board indicates that they capture the semantics and relatedness between query and document, which is important in the legal information retrieval domain.

    Information journeys in digital archives

    Archival collections have particular properties that make physical and intellectual access difficult for researchers. This generates feelings of uncertainty in researchers, leading to a large burden of enquiries to the archive, many of them routine. In this thesis I investigate the information seeking behaviours of archival researchers and the distinct properties of the archive, first through the respective literatures and then through a series of five studies. Using systems, data and researchers from the National Archives, these studies examine the nature of the enquiries archives receive across many channels, the in-person interactions between archivists and researchers in the reading rooms, and the unmediated search behaviours of archival researchers. I proceed to outline the barriers inhibiting research progress and the techniques or 'regulators' used by researchers to surmount or mitigate these barriers. In the final two studies I develop and attempt to validate an instrument for measuring uncertainty in information seeking in large digital collections. This three-factor (disorientation, prospect and preparedness) scale of archival uncertainty allows improvements to online archival systems to be effectively tested before implementation. I also propose system properties which seem likely to help researchers make progress given these factors and which could be tested using this instrument.

    Proceedings of the 12th International Conference on Digital Preservation

    The 12th International Conference on Digital Preservation (iPRES) was held on November 2-6, 2015 in Chapel Hill, North Carolina, USA. There were 327 delegates from 22 countries. The program included 12 long papers, 15 short papers, 33 posters, 3 demos, 6 workshops, 3 tutorials and 5 panels, as well as several interactive sessions and a Digital Preservation Showcase

    CURATION AND MANAGEMENT OF CULTURAL HERITAGE THROUGH LIBRARIES

    Libraries, museums and archives hold valuable collections in a variety of media, presenting a vast body of knowledge rooted in the history of human civilisation. These form the repository of the wisdom of great works by thinkers of the past and the present. The holdings of these institutions are the priceless heritage of mankind, as they preserve documents, ideas, and oral and written records. To value the cultural heritage and to care for it as a treasure bequeathed to us by our ancestors is the major responsibility of libraries. Past records constitute a natural resource and are indispensable to the present generation as well as to the generations to come. Libraries preserve the documentary heritage resources for which they are primarily responsible. Any loss of such materials is irreplaceable. Therefore, preserving this intellectual and cultural heritage becomes not only an academic commitment but also a moral responsibility of the librarians and information scientists who are in charge of these repositories. The high quality of the papers and the discussion represents the thinking and experience of experts in their particular fields. The contributed papers also relate to the methodology used in libraries in Asia to provide access to manuscripts and cultural heritage. The volume discusses best practices in knowledge preservation and how to collaborate to preserve culture. The book also deals with manuscript and archive issues in the digital era. The approach of the book is concise yet comprehensive, covering all major aspects of preservation and conservation through libraries. The readership of the book is not limited to library and information science professionals; it also includes those involved in conservation, preservation, restoration or other related disciplines. The book will be useful for librarians, archivists and conservators.
    We thank Sunan Kalijaga University and the Special Libraries Association-Asian Chapter for their trust and constant support, all the contributors for their submissions, and the members of the Local and International Committee for their reviewing effort, which made this publication possible.