
    Analyzing Query Success and User Context

    This paper describes the participation of DAEDALUS at the LogCLEF lab in CLEF 2011. This year, the objectives of our participation are twofold. The first topic is to analyze whether there is any measurable effect on the success of the search queries if the native language and the interface language chosen by the user are different. The idea is to determine if this difference may condition the way in which the user interacts with the search application. The second topic is to analyze the user context and his/her interaction with the system in the case of successful queries, to discover any relation among the user's native language, the language of the resource involved and the interaction strategy adopted by the user to find such a resource. Only 6.89% of queries are successful out of the 628,607 queries in the 320,001 sessions with at least one search query in the log. The main conclusion that can be drawn is that, in general for all languages, whether the native language matches the interface language or not does not seem to affect the success rate of the search queries. On the other hand, the analysis of the strategy adopted by users when looking for a particular resource shows that people tend to use the simple search tool, frequently first running short queries built up of just one specific term and then browsing through the results to locate the expected resource.

    Exploiting user context and preferences for intelligent web search

    Seeking information relevant to a topic of interest has become a common task in our daily activities. However, searching the Web using current technologies still presents many limitations. One of the main limitations is that existing tools for searching the Web restrict user queries to a small number of terms. As a result, a single query may not reflect the user's information needs at a sufficient level of detail. In addition, even if longer queries were allowed, the user may not find the right terms to supply appropriate queries, or may not be willing to put in the effort required to explicitly describe his or her information needs. Another limitation of today’s search tools is that they are not capable of performing qualitative inference on the suggestions they offer. For certain domains, such as news or scientific articles, a good amount of structural information can be usefully exploited to extract meaningful content. This can help sort out the material returned by a search engine and perform a qualitative analysis to warrant some of the search results. This paper shows how to enhance current search engines' capabilities by (1) taking advantage of the user context, and (2) ranking search results based on preferential criteria provided by the user. We describe ongoing research on the use of context-specific terms to refine Web search and on the use of a defeasible argumentation framework to prioritize search results. Track: Agents and Intelligent Systems. Red de Universidades con Carreras en Informática (RedUNCI)

    What on-line searches tell us about public interest and potential impact on behaviour in response to minimum unit pricing of alcohol in Scotland.

    AIMS: To investigate whether the introduction of minimum unit pricing (MUP) in Scotland on 1 May 2018 was reflected in changes in the likelihood of alcohol-related queries submitted to an internet search engine, and in particular whether there was any evidence of increased interest in purchasing of alcohol from outside Scotland. DESIGN: Observational study in which individual queries to the Bing internet search engine for 2018 in Scotland and England were captured and analysed. Fluctuations over time in the likelihood of specific topic searches were examined. The patterns seen in Scotland were contrasted with those in England. SETTING: Scotland and England. PARTICIPANTS: People who used the Bing search engine during 2018. MEASUREMENTS: Numbers of daily queries submitted to Bing in 2018 on eight alcohol-related topics expressed as a proportion of queries on that day on any topic. These daily likelihoods were smoothed using a 14-day moving average for Scotland and England separately. FINDINGS: There were substantial peaks in queries about MUP itself, cheap sources of alcohol and online alcohol outlets at the time of introduction of MUP in May 2018 in Scotland, but not England. These were relatively short-lived. Queries related to intoxication and alcohol problems did not show a MUP peak, but were appreciably higher in Scotland than in England throughout 2018. CONCLUSIONS: Analysis of internet search engine queries appears to show that a fraction of people in Scotland may have considered circumventing minimum unit pricing in 2018 by looking for on-line alcohol retailers. The overall higher levels of queries related to alcohol problems in Scotland compared with England mirror the corresponding differences in alcohol consumption and harms between the countries.
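    The measurement step in this abstract (daily topic queries as a share of all queries, then a 14-day moving average) can be sketched roughly as follows. This is only an illustration: the function names are hypothetical, and the abstract does not say whether the window is trailing or centred, so a trailing window that shortens at the start of the series is assumed here.

```python
def daily_likelihoods(topic_counts, total_counts):
    """Share of each day's queries that fall on the topic of interest."""
    # Days with no queries at all are given a likelihood of 0.0 (an assumption).
    return [t / n if n else 0.0 for t, n in zip(topic_counts, total_counts)]


def moving_average(values, window=14):
    """Trailing moving average over up to `window` days.

    Assumption: the first days average over however many days are
    available so far, rather than being dropped.
    """
    smoothed = []
    for i in range(len(values)):
        start = max(0, i - window + 1)
        chunk = values[start:i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed
```

    Applying the same two functions to each country's counts separately would give the two smoothed series that the study compares.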

    Using Search Queries to Understand Health Information Needs in Africa

    The lack of comprehensive, high-quality health data in developing nations creates a roadblock for combating the impacts of disease. One key challenge is understanding the health information needs of people in these nations. Without understanding people's everyday needs, concerns, and misconceptions, health organizations and policymakers lack the ability to effectively target education and programming efforts. In this paper, we propose a bottom-up approach that uses search data from individuals to uncover and gain insight into health information needs in Africa. We analyze Bing searches related to HIV/AIDS, malaria, and tuberculosis from all 54 African nations. For each disease, we automatically derive a set of common search themes or topics, revealing a widespread interest in various types of information, including disease symptoms, drugs, concerns about breastfeeding, as well as stigma, beliefs in natural cures, and other topics that may be hard to uncover through traditional surveys. We expose the different patterns that emerge in health information needs by demographic groups (age and sex) and country. We also uncover discrepancies in the quality of content returned by search engines to users by topic. Combined, our results suggest that search data can help illuminate health information needs in Africa and inform discussions on health policy and targeted education efforts both on- and offline. Comment: Extended version of an ICWSM 2019 paper

    What Users Ask a Search Engine: Analyzing One Billion Russian Question Queries

    We analyze the question queries submitted to a large commercial web search engine to get insights about what people ask, and to better tailor the search results to the users’ needs. Based on a dataset of about one billion question queries submitted during the year 2012, we investigate askers’ querying behavior with the support of automatic query categorization. While the importance of question queries is likely to increase, at present they only make up 3–4% of the total search traffic. Since questions are such a small part of the query stream and are more likely to be unique than shorter queries, clickthrough information is typically rather sparse. Thus, query categorization methods based on the categories of clicked web documents do not work well for questions. As an alternative, we propose a robust question query classification method that uses the labeled questions from a large community question answering platform (CQA) as a training set. The resulting classifier is then transferred to the web search questions. Even though questions on CQA platforms tend to be different to web search questions, our categorization method proves competitive with strong baselines with respect to classification accuracy. To show the scalability of our proposed method we apply the classifiers to about one billion question queries and discuss the trade-offs between performance and accuracy that different classification models offer. Our findings reveal what people ask a search engine and also how this contrasts with behavior on a CQA platform.

    Auditing Search Engines for Differential Satisfaction Across Demographics

    Many online services, such as search engines, social media platforms, and digital marketplaces, are advertised as being available to any user, regardless of their age, gender, or other demographic factors. However, there are growing concerns that these services may systematically underserve some groups of users. In this paper, we present a framework for internally auditing such services for differences in user satisfaction across demographic groups, using search engines as a case study. We first explain the pitfalls of naïvely comparing the behavioral metrics that are commonly used to evaluate search engines. We then propose three methods for measuring latent differences in user satisfaction from observed differences in evaluation metrics. To develop these methods, we drew on ideas from the causal inference literature and the multilevel modeling literature. Our framework is broadly applicable to other online services, and provides general insight into interpreting their evaluation metrics. Comment: 8 pages. Accepted at WWW 201

    United we fall, divided we stand: A study of query segmentation and PRF for patent prior art search

    Previous research in patent search has shown that reducing queries by extracting a few key terms is ineffective primarily because of the vocabulary mismatch between patent applications used as queries and existing patent documents. This finding has led to the use of full patent applications as queries in patent prior art search. In addition, standard information retrieval (IR) techniques such as query expansion (QE) do not work effectively with patent queries, principally because of the presence of noise terms in the massive queries. In this study, we take a new approach to QE for patent search. Text segmentation is used to decompose a patent query into self-coherent sub-topic blocks. Each of these much shorter sub-topic blocks, which is representative of a specific aspect or facet of the invention, is then used as a query to retrieve documents. Documents retrieved using the different resulting sub-queries or query streams are interleaved to construct a final ranked list. This technique can exploit the potential benefit of QE since the segmented queries are generally more focused and less ambiguous than the full patent query. Experiments on the CLEF-2010 IP prior-art search task show that the proposed method outperforms the retrieval effectiveness achieved when using a single full patent application text as the query, and also demonstrates the potential benefits of QE to alleviate the vocabulary mismatch problem in patent search.
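    The final merging step in this abstract (interleaving the results of several sub-queries into one ranked list) might look something like the sketch below. The abstract does not specify the interleaving scheme, so plain round-robin with duplicate removal is assumed, and the function name `interleave` is hypothetical.

```python
from itertools import zip_longest


def interleave(ranked_lists):
    """Round-robin merge of several ranked lists of document ids.

    Assumption: take rank 1 from every sub-query, then rank 2, and so on,
    keeping only the first occurrence of each document.
    """
    seen = set()
    merged = []
    # zip_longest pads shorter lists with None so longer lists are not cut off.
    for tier in zip_longest(*ranked_lists):
        for doc in tier:
            if doc is not None and doc not in seen:
                seen.add(doc)
                merged.append(doc)
    return merged
```

    For instance, merging `["a", "b", "c"]`, `["b", "d"]` and `["e"]` this way yields `a, b, e, d, c`: every sub-query contributes its top hit before any contributes a second one.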

    Distributed resource discovery using a context sensitive infrastructure

    Distributed Resource Discovery in a World Wide Web environment using full-text indices will never scale. The distinct properties of WWW information (volume, rate of change, topical diversity) limit the scaleability of traditional approaches to distributed Resource Discovery. An approach combining metadata clustering and query routing can, on the other hand, be proven to scale much better. This paper presents the Content-Sensitive Infrastructure, which is a design building on these results. We also present an analytical framework for comparing scaleability of different distribution strategies.