12,866 research outputs found

    Adapting to the Shifting Intent of Search Queries

    Full text link
    Search engines today present results that are often oblivious to abrupt shifts in intent. For example, the query 'independence day' usually refers to a US holiday, but the intent of this query abruptly changed during the release of a major film by that name. While no studies exactly quantify the magnitude of intent-shifting traffic, studies suggest that news events, seasonal topics, pop culture, etc. account for 50% of all search queries. This paper shows that the signals a search engine receives can be used both to determine that a shift in intent has happened and to find a result that is now more relevant. We present a meta-algorithm that marries a classifier with a bandit algorithm to achieve regret that depends logarithmically on the number of query impressions, under certain assumptions. We provide strong evidence that this regret is close to the best achievable. Finally, via a series of experiments, we demonstrate that our algorithm outperforms prior approaches, particularly as the amount of intent-shifting traffic increases. Comment: This is the full version of the paper in NIPS'0
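    The paper's meta-algorithm is not reproduced here, but the core idea, pairing a change detector with a bandit over candidate results, can be illustrated with a toy epsilon-greedy sketch. The `ShiftAwareBandit` class, its window-based drop detector, and the simulated click probabilities are all hypothetical illustrations, not the authors' method:

```python
import random

class ShiftAwareBandit:
    """Epsilon-greedy bandit over candidate results that resets its
    estimates when a sustained drop in reward is detected, a crude
    stand-in for the paper's classifier-based intent-shift signal."""

    def __init__(self, n_arms, epsilon=0.1, window=100, drop=0.25):
        self.n_arms, self.epsilon = n_arms, epsilon
        self.window, self.drop = window, drop
        self.baseline = None          # mean reward of the previous window
        self.reset()

    def reset(self):
        self.counts = [0] * self.n_arms
        self.values = [0.0] * self.n_arms
        self.recent = []              # rewards in the current window

    def select(self):
        if random.random() < self.epsilon or not any(self.counts):
            return random.randrange(self.n_arms)
        return max(range(self.n_arms), key=lambda a: self.values[a])

    def update(self, arm, reward):
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]
        self.recent.append(reward)
        if len(self.recent) >= self.window:
            mean = sum(self.recent) / len(self.recent)
            if self.baseline is not None and mean < self.baseline - self.drop:
                self.reset()          # detected shift: relearn from scratch
                self.baseline = None
            else:
                self.baseline = mean
                self.recent = []

# Simulate an intent shift: result 0 is relevant for the first 3000
# impressions, then result 1 becomes the relevant one.
random.seed(0)
bandit = ShiftAwareBandit(2)
for t in range(6000):
    arm = bandit.select()
    ctr = [0.7, 0.2] if t < 3000 else [0.2, 0.7]
    bandit.update(arm, 1.0 if random.random() < ctr[arm] else 0.0)
print(bandit.values)
```

    After the simulated shift, the detector discards the stale estimates, so the bandit ends up preferring result 1 instead of being anchored to its pre-shift statistics.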

    Linking with Meaning: Ontological Hypertext for Scholars

    No full text
    The links in ontological hypermedia are defined according to the relationships between real-world objects. An ontology that models the significant objects in a scholar's world can be used toward producing a consistently interlinked research literature. Currently, the papers that are available online are mainly divided between subject- and publisher-specific archives, with little or no interoperability. This paper addresses the issue of ontological interlinking, presenting two experimental systems whose hypertext links embody ontologies based on the activities of researchers and scholars.

    Annotated Bibliography: The Reference Desk: Grand Idea or Gone Down the River?

    Full text link
    This bibliography is from a panel presentation at the 2017 ACL Conference. The goal of this panel was to explore different rationales, or sets of values, that illustrated the continuation of the reference desk and reference service as essential to the success of the academic community. We discovered that "what to do with reference" is far from a settled question. We discovered passionate arguments, diverse models, and an array of data. At this stage of figuring out the value of academic libraries to the campus as a whole and to students in particular, there seemed to be limited hard data connecting reference services to how they meet students' needs. How do we make ourselves valuable, important, essential, and useful? Maybe we need to change our model? If so, how do we examine ourselves and our environment appropriately to make this happen? What factors should we examine? Which ones must we keep? What things can we discard or change? When students come to seek assistance, they generally need short, instant, and personal help, without having to attend a whole training session or class. Individual and personalized guidance for their immediate need is the most important factor for them. How do libraries provide that?

    CHORUS Deliverable 2.2: Second report - identification of multi-disciplinary key issues for gap analysis toward EU multimedia search engines roadmap

    Get PDF
    After addressing the state of the art during the first year of CHORUS and establishing the existing landscape in multimedia search engines, we identified and analyzed gaps in the European research effort during our second year. In this period we focused on three directions, notably technological issues, user-centred issues and use-cases, and socio-economic and legal aspects. These were assessed by two central studies: firstly, a concerted vision of the functional breakdown of a generic multimedia search engine, and secondly, representative use-case descriptions with a related discussion of the requirements they pose for technological challenges. Both studies were carried out in cooperation and consultation with the community at large through EC concertation meetings (multimedia search engines cluster), several meetings with our Think-Tank, presentations at international conferences, and surveys addressed to coordinators of EU projects as well as of national initiatives. Based on the feedback obtained, we identified two types of gaps, namely core technological gaps that involve research challenges, and "enablers", which are not necessarily technical research challenges but have an impact on innovation progress. New socio-economic trends are presented, as well as emerging legal challenges.

    The role of boundary objects in the co-evolution of design and use: the KMP project experimentation

    Get PDF
    Nowadays, it is widely recognized that an ICT tool cannot be built without knowing who will use it and what they will do with it. In this perspective, the Human-Computer Interaction community (Carroll, 1990; Jarke, Tung Bui and Carroll, 1998; Young and Barnard, 1987; Young et al., 1989) developed a scenario-based approach contrasting with traditional information system design. A scenario describes an existing or envisioned system from the perspective of one or more users and includes a narration of their goals, plans and reactions (Rosson and Carroll, 2002). As a result, design is founded on the use of scenarios as a central representation for the analysis and design of use. Scenario-based design appears to be a first step in the integration of users into the design of ICT tools. However, in this paper we would like to underline a more active role for users in the design process. According to Orlikowski (2000), while a technology can be seen to have been constructed with particular materials and inscribed with developers' assumptions and knowledge about the world at a point in time, it is only when this technology is used in recurrent social practices that it can be said to structure users' action. The use of technology in recurrent social practices must be considered because how technological properties will be used or appropriated is not inherent or predetermined. Finally, this approach leads us to dissociate the designers' world from the users' world. In this perspective, the design project is the result of the co-evolution and convergence of both worlds: on the one hand, the world of design, with a first integration of users through scenarios; on the other hand, the world of users, where innovation is the art of interesting an increasing number of allies who will make the world of design stronger and stronger.
    The objective of this paper is to understand the mechanisms of interaction between the world of design and that of users, i.e. between loops of co-design and loops of use. Indeed, following Akrich, Callon and Latour (1988), we adopt a whirlwind model of innovation. In this perspective, "innovation continuously transforms itself according to the trials to which it is submitted, i.e. of the 'interessements' tried out" (Akrich et al., 2002: 7). We will demonstrate that the success of an innovation depends on the co-evolution and convergence of design and use around boundary objects developed during this process (see Figure 1). More specifically, we will show the role of boundary objects in the integration and involvement of users in the design process. To do so, we carried out empirical research, the Knowledge Management Platform (KMP) project, located in the science park of Sophia Antipolis (Alpes-Maritimes, France), focusing on the Telecom Valley® (TV) association, which gathers the main actors of the Sophia Antipolis telecom cluster. The KMP project aims to build a semantic web service of competencies in order to enhance the dynamics of knowledge exchange and combination within the telecom cluster through an interactive mapping of competencies.
    This paper comprises three parts. Based on the research of Akrich, Callon and Latour (1988), Hatchuel and Mollet (1986), Orlikowski (2000) and Romme and Endenburg (2006), the first part identifies and analyses the design process. The combination of these approaches leads us to distinguish the designers' world from the users' world; in this perspective, the success of an innovation may be explained by the co-evolution and convergence of these two worlds, and we suggest that boundary objects play a key role in that convergence. The second part presents the empirical study of the KMP project within the TV network. The KMP project involved researchers from the socio-economic sciences (GREDEG Laboratory, UNSA-CNRS, Rodige and Latapses teams), cognitive sciences and artificial intelligence (INRIA, Acacia team), telecommunications (GET) and users (TV), for a total effort of 187 person-months over a two-year period (2003-2005). At present the project is entering a pre-industrialization phase, supported by TV and the PACA region. Here we analyse the specific design process experimented with in KMP. Finally, the third part discusses the role of boundary objects in the KMP experimentation: we show the evolution of boundary objects during the loops of design, focusing on the emergence of compromises between designers and users, their materialisation in boundary objects, and their evolution during the design process.
    Keywords: boundary objects, IS development, actor-network theory

    Global disease monitoring and forecasting with Wikipedia

    Full text link
    Infectious disease is a leading threat to public health, economic stability, and other key social structures. Efforts to mitigate these impacts depend on accurate and timely monitoring to measure the risk and progress of disease. Traditional, biologically focused monitoring techniques are accurate but costly and slow; in response, new techniques based on social internet data, such as social media and search queries, are emerging. These efforts are promising, but important challenges in the areas of scientific peer review, breadth of diseases and countries, and forecasting hamper their operational usefulness. We examine a freely available, open data source for this use: access logs from the online encyclopedia Wikipedia. Using linear models, language as a proxy for location, and a systematic yet simple article selection procedure, we tested 14 location-disease combinations and demonstrate that these data feasibly support an approach that overcomes these challenges. Specifically, our proof of concept yields models with r² up to 0.92, forecasting value up to the 28 days tested, and several pairs of models similar enough to suggest that transferring models from one location to another without re-training is feasible. Based on these preliminary results, we close with a research agenda designed to overcome these challenges and produce a disease monitoring and forecasting system that is significantly more effective, robust, and globally comprehensive than the current state of the art. Comment: 27 pages; 4 figures; 4 tables. Version 2: cite McIver & Brownstein and adjust novelty claims accordingly; revise title; various revisions for clarity.
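    As a rough illustration of the modelling idea, not the paper's actual models or data, a one-variable least-squares fit relating weekly page-view counts to case counts can be written in a few lines. The `views` and `cases` series below are synthetic numbers invented for the example:

```python
def fit_ols(x, y):
    """Return slope a and intercept b of the least-squares line y ≈ a*x + b."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    a = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) \
        / sum((xi - mx) ** 2 for xi in x)
    return a, my - a * mx

def r_squared(x, y, a, b):
    """Coefficient of determination of the fitted line on (x, y)."""
    my = sum(y) / len(y)
    ss_res = sum((yi - (a * xi + b)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return 1 - ss_res / ss_tot

views = [120, 150, 400, 900, 1300, 800, 300, 140]   # weekly article views
cases = [10, 14, 38, 85, 122, 77, 30, 13]           # reported cases (synthetic)

a, b = fit_ols(views, cases)
print(round(r_squared(views, cases, a, b), 3))
```

    The paper's models additionally handle multiple articles, lags for forecasting, and language-based location proxies; the sketch only shows the linear-regression core.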

    Modeling search behaviors based on user interactions

    Get PDF
    Users of information systems normally divide tasks into a sequence of multiple steps to solve them. In particular, users divide search tasks into sequences of queries, interacting with search systems to carry out the information-seeking process. User interactions are recorded in search query logs, enabling the development of models that automatically learn search patterns from users' interactions with search systems. These models underpin multiple user-assisting applications that help search systems be more interactive, user-friendly, and coherent. User-assisting applications include query suggestion, task-based ranking of search results, query reformulation analysis, e-commerce applications, advertisement retrieval, query-term prediction, mapping of queries to search tasks, and so on. Consequently, we propose the following contributions: a neural model for learning to detect search task boundaries in query logs; a recurrent deep clustering architecture that simultaneously learns query representations through self-training and clusters queries into groups of search tasks; Multilingual Graph-Based Clustering, an unsupervised, user-agnostic model for search task identification supporting queries in sixteen languages; and the Language-agnostic Search Task Model, an unsupervised approach that simultaneously models user search intent and search tasks. The proposed models improve on existing methods for modelling user interactions, taking into account user privacy, real-time response, and language accessibility. User privacy is a major ethical concern for intelligent systems, while fast responses are critical for search systems interacting with users in real time, particularly in conversational search. At the same time, language accessibility is essential to assist users worldwide, who interact with search systems in many languages. The proposed contributions can benefit many user-assisting applications, helping users better solve their search tasks when accessing search systems to fulfil their information needs.
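    The thesis's neural and graph-based models are not sketched here, but the underlying problem, segmenting a query log into search tasks, can be illustrated with a classic term-overlap baseline. The `segment_tasks` function, the Jaccard threshold of 0.2, and the example log are illustrative assumptions, not the proposed models:

```python
def jaccard(a, b):
    """Term-overlap similarity between two query strings."""
    a, b = set(a.split()), set(b.split())
    return len(a & b) / len(a | b)

def segment_tasks(queries, threshold=0.2):
    """Split a query stream into task segments: start a new task whenever
    a query shares too few terms with the previous one."""
    tasks, current = [], [queries[0]]
    for q in queries[1:]:
        if jaccard(current[-1], q) >= threshold:
            current.append(q)
        else:
            tasks.append(current)
            current = [q]
    tasks.append(current)
    return tasks

log = ["cheap flights paris", "paris flights weekend",
       "python list comprehension", "python list sort"]
print(segment_tasks(log))
```

    A lexical baseline like this fails exactly where the thesis's contributions aim: queries from the same task that share no terms, and multilingual logs where term overlap is meaningless across languages.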

    Large language models can accurately predict searcher preferences

    Full text link
    Relevance labels, which indicate whether a search result is valuable to a searcher, are key to evaluating and optimising search systems. The best way to capture the true preferences of users is to ask them for careful feedback on which results would be useful, but this approach does not scale to produce a large number of labels. Getting relevance labels at scale is usually done with third-party labellers, who judge on behalf of the user, but there is a risk of low-quality data if the labeller doesn't understand user needs. To improve quality, one standard approach is to study real users through interviews, user studies and direct feedback, find areas where labels systematically disagree with users, then educate labellers about user needs through judging guidelines, training and monitoring. This paper introduces an alternative approach to improving label quality: it takes careful feedback from real users, which by definition is the highest-quality first-party gold data that can be derived, and develops a large language model prompt that agrees with that data. We present ideas and observations from deploying language models for large-scale relevance labelling at Bing, and illustrate with data from TREC. We have found that large language models can be effective, with accuracy as good as human labellers and similar capability to pick the hardest queries, best runs, and best groups. Systematic changes to the prompts make a difference in accuracy, but so do simple paraphrases. Measuring agreement with real searchers requires high-quality "gold" labels, but with these we find that models produce better labels than third-party workers, for a fraction of the cost, and that these labels let us train notably better rankers.
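    A minimal sketch of the evaluation step described above: scoring a labeller's output against first-party gold labels with raw accuracy and chance-corrected agreement. The label sequences below are invented for illustration; the paper's actual metrics and data are not reproduced here:

```python
def accuracy(pred, gold):
    """Fraction of items where the labeller matches the gold label."""
    return sum(p == g for p, g in zip(pred, gold)) / len(gold)

def cohens_kappa(pred, gold):
    """Agreement corrected for the chance of matching by accident."""
    labels = set(pred) | set(gold)
    n = len(gold)
    po = accuracy(pred, gold)                      # observed agreement
    pe = sum((pred.count(l) / n) * (gold.count(l) / n)
             for l in labels)                      # expected by chance
    return (po - pe) / (1 - pe)

gold  = [1, 1, 0, 1, 0, 0, 1, 0]   # first-party "gold" relevance judgements
model = [1, 1, 0, 1, 0, 1, 1, 0]   # hypothetical LLM labels for the same items

print(accuracy(model, gold), round(cohens_kappa(model, gold), 2))
```

    Running the same two scores for third-party workers and for the model against the same gold set is enough to support the kind of comparison the abstract describes.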