
    Estimating Position Bias without Intrusive Interventions

    Presentation bias is one of the key challenges when learning from implicit feedback in search engines, as it confounds the relevance signal. While it was recently shown how counterfactual learning-to-rank (LTR) approaches (Joachims et al., 2017) can provably overcome presentation bias when observation propensities are known, it remains to show how to effectively estimate these propensities. In this paper, we propose the first method for producing consistent propensity estimates without manual relevance judgments, disruptive interventions, or restrictive relevance modeling assumptions. First, we show how to harvest a specific type of intervention data from historic feedback logs of multiple different ranking functions, and show that this data is sufficient for consistent propensity estimation in the position-based model. Second, we propose a new extremum estimator that makes effective use of this data. In an empirical evaluation, we find that the new estimator provides superior propensity estimates in two real-world systems: arXiv Full-text Search and Google Drive Search. Beyond these two points, we find that the method is robust to a wide range of settings in simulation studies.
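
As a rough illustration of the intervention-harvesting idea in this abstract, the sketch below assumes the position-based model (click probability = examination propensity at the rank × relevance of the item) and a log in which queries were assigned to different rankers at random. It finds items that were shown at both of two ranks and takes the ratio of their click-through rates as an estimate of the relative propensity. This is a simplified ratio estimate, not the extremum estimator the paper proposes; the function name and the log layout are hypothetical.

```python
from collections import defaultdict

def estimate_relative_propensity(log, k, k_prime):
    """Estimate p_k / p_k' from a click log.

    log: iterable of (item, rank, clicked) tuples (hypothetical layout),
         where `item` identifies a query-document pair.
    Assumes the position-based model and random assignment of rankers,
    so an item's relevance is the same at either rank.
    """
    log = list(log)

    # Record every rank at which each item was displayed.
    ranks_seen = defaultdict(set)
    for item, rank, _ in log:
        ranks_seen[item].add(rank)

    # Harvested interventions: items displayed at both k and k_prime.
    swapped = {item for item, ranks in ranks_seen.items()
               if k in ranks and k_prime in ranks}

    clicks = {k: 0, k_prime: 0}
    shows = {k: 0, k_prime: 0}
    for item, rank, clicked in log:
        if item in swapped and rank in clicks:
            shows[rank] += 1
            clicks[rank] += int(clicked)

    if 0 in shows.values() or clicks[k_prime] == 0:
        raise ValueError("not enough intervention data for these ranks")

    ctr_k = clicks[k] / shows[k]
    ctr_k_prime = clicks[k_prime] / shows[k_prime]
    return ctr_k / ctr_k_prime
```

Items that only ever appear at one rank are ignored, since their clicks cannot separate position effects from relevance.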

    Automated user modeling for personalized digital libraries

    Digital libraries (DL) have become one of the most typical ways of accessing any kind of digitized information. Due to this key role, users welcome any improvements to the services they receive from digital libraries. One trend used to improve digital services is personalization. Up to now, the most common approach to personalization in digital libraries has been user-driven. Nevertheless, the design of efficient personalized services has to be done, at least in part, automatically. In this context, machine learning techniques automate the process of constructing user models. This paper proposes a new approach to constructing digital libraries that satisfy users' information needs: Adaptive Digital Libraries, libraries that automatically learn user preferences and goals and personalize their interaction using this information.

    The complexities of managing historic buildings with BIM

    Purpose: The adoption of building information modelling (BIM) in managing built heritage is an exciting prospect, but one that presents complexities additional to those of modern buildings. If challenges can be identified and overcome, the adoption of historic BIM (HBIM) could offer efficiencies in how heritage buildings are managed. Design/methodology/approach: Using Durham Cathedral as a case study, we present the workflows applied to create an asset information model to improve the way this unique UNESCO World Heritage Site is managed, and in doing so, set out the challenges and complexities in achieving an HBIM solution. Findings: This study identifies the need for a better understanding of the distinct needs and context for managing historic assets, and the need for heritage information requirements (HIR) that reflect this. Originality/value: This study presents first-hand findings based on a unique application of BIM at Durham Cathedral, a UNESCO World Heritage Site. The study provides a better understanding of the challenges and drivers of HBIM adoption across the heritage sector and underlines the need for information requirements that are unique to historical buildings/assets to deliver a coherent and relevant HBIM approach.

    Recording, Documentation, and Information Management for the Conservation of Heritage Places: Guiding Principles

    Provides guidance on integrating recording, documentation, and information management of territories, sites, groups of buildings, or monuments into the conservation process; evaluating proposals; consulting specialists; and controlling implementation

    Hybrid Approach Combining Statistical and Rule-Based Models for the Automated Indexing of Bibliographic Metadata in the Area of Planning and Building Construction

    ICONDA® Bibliographic (International Construction Database) is a bibliographic database that contains English-language documents in the area of planning and building construction. The documents are indexed with descriptors from controlled vocabularies (FINDEX thesauri, an authority list). The manual assignment of the descriptors is time-consuming and expensive. To solve this problem, an automated indexing system was developed. The indexing system combines a statistical classifier that is based on the vector space model with a rule-based classifier. In the statistical classifier, descriptor profiles are automatically trained from already indexed documents. The results provided by the statistical classifier are then improved by the rule-based classifier, which filters out incorrect descriptors and adds missing ones. The rules can be created manually or automatically from already indexed documents. The hybrid approach is particularly useful when a descriptor cannot be successfully trained by the statistical classifier. In this case, the system can be easily fine-tuned by adding specific rules for the descriptor.
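
The two-stage pipeline described in this abstract can be sketched roughly as follows. This is a minimal illustration, assuming raw term-count vectors, cosine similarity against per-descriptor profiles, and simple keyword-triggered rules; the function names, the threshold, and the rule format are all hypothetical and not taken from the ICONDA system.

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def train_profiles(indexed_docs):
    """Build one term-count profile per descriptor from indexed documents.

    indexed_docs: iterable of (text, set_of_descriptors) pairs.
    """
    profiles = {}
    for text, descriptors in indexed_docs:
        terms = Counter(text.lower().split())
        for descriptor in descriptors:
            profiles.setdefault(descriptor, Counter()).update(terms)
    return profiles

def statistical_descriptors(text, profiles, threshold=0.3):
    """Stage 1: assign descriptors whose profile is similar enough."""
    vector = Counter(text.lower().split())
    return {d for d, profile in profiles.items()
            if cosine(vector, profile) >= threshold}

def apply_rules(text, descriptors, rules):
    """Stage 2: add or remove descriptors when a trigger word occurs.

    rules: iterable of (trigger_word, "add" | "remove", descriptor).
    """
    tokens = set(text.lower().split())
    result = set(descriptors)
    for trigger, action, descriptor in rules:
        if trigger in tokens:
            if action == "add":
                result.add(descriptor)
            elif action == "remove":
                result.discard(descriptor)
    return result
```

A descriptor with too few training documents would have a weak profile in stage 1; an "add" rule for it in stage 2 is the kind of fine-tuning the abstract mentions.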

    Regional Data Archiving and Management for Northeast Illinois

    This project studies the feasibility and implementation options for establishing a regional data archiving system to help monitor and manage traffic operations and planning for the northeastern Illinois region. It aims to provide clear guidance to the regional transportation agencies, from both technical and business perspectives, about building such a comprehensive transportation information system. Several implementation alternatives are identified and analyzed. This research is carried out in three phases. In the first phase, existing documents related to ITS deployments in the broader Chicago area are summarized, and a thorough review is conducted of similar systems across the country. Various stakeholders are interviewed to collect information on all data elements that they store, including the format, system, and granularity. Their perception of a data archive system, such as potential benefits and costs, is also surveyed. In the second phase, a conceptual design of the database is developed. This conceptual design includes system architecture, functional modules, user interfaces, and examples of usage. In the last phase, the possible business models for the archive system to sustain itself are reviewed. We estimate initial capital and recurring operational/maintenance costs for the system based on realistic information on the hardware, software, labor, and resource requirements. We also identify possible revenue opportunities. A few implementation options for the archive system are summarized in this report, namely: (1) system hosted by a partnering agency; (2) system contracted to a university; (3) system contracted to a national laboratory; (4) system outsourced to a service provider. The costs, advantages, and disadvantages for each of these recommended options are also provided. ICT-R27-22

    Stochastic Query Covering for Fast Approximate Document Retrieval

    We design algorithms that, given a collection of documents and a distribution over user queries, return a small subset of the document collection in such a way that we can efficiently provide high-quality answers to user queries using only the selected subset. This approach has applications when space is a constraint or when the query-processing time increases significantly with the size of the collection. We study our algorithms through the lens of stochastic analysis and prove that even though they use only a small fraction of the entire collection, they can provide answers to most user queries, achieving a performance close to the optimal. To complement our theoretical findings, we experimentally show the versatility of our approach by considering two important cases in the context of Web search. In the first case, we favor the retrieval of documents that are relevant to the query, whereas in the second case we aim for document diversification. Both the theoretical and the experimental analysis provide strong evidence of the potential value of query covering in diverse application scenarios.
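
One way to picture the selection problem in this abstract is as weighted coverage: each query has a probability and a set of documents that would answer it well, and the goal is to keep a small set of documents that covers as much query probability as possible. The sketch below uses the standard greedy max-coverage heuristic as a stand-in; it is not claimed to be the algorithm from the paper, and the function name and input layout are hypothetical.

```python
def greedy_query_cover(query_probs, answers, budget):
    """Pick up to `budget` documents that cover the most query probability.

    query_probs: dict mapping query -> probability of that query.
    answers:     dict mapping query -> set of documents that answer it well.
    Returns (selected_documents, covered_probability_mass).
    """
    candidates = set().union(*answers.values())
    selected = []
    covered = set()

    for _ in range(budget):
        best_doc, best_gain = None, 0.0
        # Sorted iteration keeps tie-breaking deterministic.
        for doc in sorted(candidates - set(selected)):
            # Marginal gain: probability of still-uncovered queries
            # that this document would answer.
            gain = sum(p for q, p in query_probs.items()
                       if q not in covered and doc in answers.get(q, ()))
            if gain > best_gain:
                best_doc, best_gain = doc, gain
        if best_doc is None:  # no remaining document adds coverage
            break
        selected.append(best_doc)
        covered |= {q for q in query_probs
                    if best_doc in answers.get(q, ())}

    coverage = sum(query_probs[q] for q in covered)
    return selected, coverage
```

The returned coverage is the probability that a query drawn from the distribution can be answered from the retained subset alone, under the assumption that one good document is enough to answer a query.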

    An Approach to Conserve Gaza Architectural Heritage Through Digital Technology

    Heritage is considered one of the constituents that preserve the culture and national identity of any community, because it bears witness to that community's accumulated experiences. During recent years, architectural styles have changed dramatically in our country. Modern and western styles of housing have prevailed so much that touches of Palestinian heritage have nearly disappeared. In addition, many of the historic buildings in Gaza City have been destroyed due to the lack of public awareness about the importance of this heritage. This situation created the need for restoring the deep-rooted Palestinian culture and heritage through 3D visualization of architectural heritage. Raising awareness of cultural heritage by publicizing digitized simulations of it, using virtual reality and 3D modelling and animation techniques, would help in preserving it and in achieving a balanced environment that reflects both the originality of the past and the modernity of the present.