11 research outputs found

    Large-scale Generative Query Autocompletion

    Query Autocompletion (QAC) systems are interactive tools that assist a searcher in entering a query given a partial query prefix. Existing QAC research, with a number of notable exceptions, relies upon large existing query logs from which to extract historical queries. These queries are then ordered by some ranking algorithm as candidate completions, given the query prefix. Given the numerous search environments (e.g. enterprises, personal or secured data repositories) in which large query logs are unavailable, the need for synthetic, or generative, QAC systems will become increasingly important. Generative QAC systems may be used to augment traditional query-based approaches, and/or entirely replace them in certain privacy-sensitive applications. Even in commercial Web search engines, a significant proportion (up to 15%) of queries issued daily have never been seen previously, meaning there will always be opportunity to assist users in formulating queries which have not occurred historically. In this paper, we describe a system that can construct generative QAC suggestions within a user-acceptable timeframe (~58ms), and report on a series of experiments over three publicly available, large-scale question sets that investigate different aspects of the system's performance.

    MWAND: A New Early Termination Algorithm for Fast and Efficient Query Evaluation

    Current information systems are very large and maintain huge amounts of data, processing millions of documents and millions of queries at any time. To select the most important responses from this volume of data, it is useful to apply so-called early termination algorithms. These attempt to extract the top-K documents according to a specified monotone increasing function. The principal idea is to reach and score only the smallest possible number of the most significant documents, and so avoid fully processing the whole collection. The WAND algorithm is the state of the art in this area. Although it is efficient, it lacks effectiveness and precision. In this paper, we make two contributions. The principal one is a new early termination algorithm based on the WAND approach, which we call MWAND (Modified WAND). It is faster and more precise than WAND, and it is able to avoid unnecessary WAND steps. In this work, we integrate a tree structure as an index into WAND and add new levels in query processing. As the second contribution, we define new fine-grained metrics to improve the evaluation of the retrieved information. The experimental results on real datasets show that MWAND is more efficient than the WAND approach.
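
To make the early-termination idea concrete, here is a minimal Python sketch of classic WAND-style pivoting. It is not MWAND: the abstract does not specify the tree index or the extra processing levels, so they are not reproduced. The function names, the toy postings and the additive scoring with non-negative per-term scores are all assumptions made for illustration.

```python
import heapq
from bisect import bisect_left


def exhaustive_top_k(postings, k):
    """Baseline: fully score every document, then keep the k best."""
    totals = {}
    for plist in postings.values():
        for doc_id, score in plist:
            totals[doc_id] = totals.get(doc_id, 0.0) + score
    ranked = sorted(((s, d) for d, s in totals.items()), key=lambda t: (-t[0], t[1]))
    return ranked[:k]


def wand_top_k(postings, k):
    """WAND-style top-k over postings {term: [(doc_id, score), ...]} sorted by doc_id.

    Returns (results, scored): results is a list of (score, doc_id) in
    descending score order, scored counts the fully scored documents.
    Assumes non-negative scores that simply add up across terms.
    """
    lists = [p for p in postings.values() if p]
    # Upper bound per term: the largest score it contributes to any document.
    upper = [max(s for _, s in p) for p in lists]
    pos = [0] * len(lists)
    heap = []  # min-heap holding the current top-k as (score, doc_id)
    scored = 0
    while True:
        # Cursors that still have postings, ordered by their current doc_id.
        active = sorted(
            (lists[i][pos[i]][0], i) for i in range(len(lists)) if pos[i] < len(lists[i])
        )
        if not active:
            break
        theta = heap[0][0] if len(heap) == k else float("-inf")
        # Pivot: first doc_id at which the accumulated upper bounds exceed theta.
        acc, pivot = 0.0, None
        for doc_id, i in active:
            acc += upper[i]
            if acc > theta:
                pivot = doc_id
                break
        if pivot is None:
            break  # no remaining document can enter the top-k
        if active[0][0] == pivot:
            # Every cursor sitting on the pivot contributes to its full score.
            score = 0.0
            for doc_id, i in active:
                if doc_id != pivot:
                    break
                score += lists[i][pos[i]][1]
                pos[i] += 1
            scored += 1
            if len(heap) < k:
                heapq.heappush(heap, (score, pivot))
            elif score > heap[0][0]:
                heapq.heapreplace(heap, (score, pivot))
        else:
            # Documents before the pivot cannot beat theta: skip them unscored.
            for doc_id, i in active:
                if doc_id >= pivot:
                    break
                pos[i] = bisect_left(lists[i], (pivot,), pos[i])
    return sorted(heap, key=lambda t: (-t[0], t[1])), scored


# Hypothetical toy index: three query terms over five documents.
example_postings = {
    "fast": [(1, 1.0), (2, 0.5), (5, 2.0)],
    "query": [(2, 1.5), (3, 0.2), (5, 1.0)],
    "eval": [(4, 0.1), (5, 0.5)],
}
results, scored = wand_top_k(example_postings, 2)
```

On this toy index the sketch returns the same top-2 as the exhaustive baseline while skipping one document without scoring it, which is the saving that early termination aims for.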

    Automatic query expansion: A structural linguistic perspective

    A user’s query is considered to be an imprecise description of their information need. Automatic query expansion is the process of reformulating the original query with the goal of improving retrieval effectiveness. Many successful query expansion techniques ignore information about the dependencies that exist between words in natural language. However, more recent approaches have demonstrated that, by explicitly modeling associations between terms, significant improvements in retrieval effectiveness can be achieved over those that ignore these dependencies. State-of-the-art dependency-based approaches have been shown to primarily model syntagmatic associations. Syntagmatic associations infer a likelihood that two terms co-occur more often than by chance. However, structural linguistics relies on both syntagmatic and paradigmatic associations to deduce the meaning of a word. Given the success of dependency-based approaches and the reliance on word meanings in the query formulation process, we argue that modeling both syntagmatic and paradigmatic information in the query expansion process will improve retrieval effectiveness. This article develops and evaluates a new query expansion technique that is based on a formal, corpus-based model of word meaning that models syntagmatic and paradigmatic associations. We demonstrate that when sufficient statistical information exists, as in the case of longer queries, including paradigmatic information alone provides significant improvements in retrieval effectiveness across a wide variety of data sets. More generally, when our new query expansion approach is applied to large-scale web retrieval it demonstrates significant improvements in retrieval effectiveness over a strong baseline system, based on a commercial search engine.

    An enterprise search paradigm based on extended query auto-completion: do we still need search and navigation?

    Enterprise query auto-completion (QAC) can allow website or intranet visitors to satisfy a need more efficiently than traditional searching and browsing. The limited scope of an enterprise makes it possible to satisfy a high proportion of information needs through completion. Further, the availability of structured sources of completions such as product catalogues compensates for sparsity of log data. Extended forms (X-QAC) can give access to information that is inaccessible via a conventional crawled index. We show that it can be guaranteed that for every suggestion there is a prefix which causes it to appear in the top k suggestions. Using university query logs and structured lists, we quantify the significant keystroke savings attributable to this guarantee (worst case). Such savings may be of particular value for mobile devices. A user experiment showed that a staff lookup task took an average of 61% longer with a conventional search interface than with an X-QAC system. Using wine catalogue data we demonstrate a further extension which allows a user to home in on desired items in faceted-navigation style. We also note that advertisements can be triggered from QAC. Given the advantages and power of X-QAC systems, we envisage that websites and intranets of the [near] future will provide less navigation and rely less on conventional search
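
The top-k guarantee and the keystroke savings mentioned in this abstract can be illustrated with a small Python sketch. Everything here is an assumption for illustration: the candidate list, the scores, the function names, and in particular the ranking rule that places an exact match first, which is one simple way to make the guarantee hold and not necessarily the rule used in the paper.

```python
def top_k(prefix, candidates, k):
    """Return up to k completions of `prefix` from {text: score}.

    Assumed ranking: an exact match first, then higher score, then
    alphabetical order. Putting the exact match first means every
    suggestion appears for at least one prefix (its own full text).
    """
    matches = [(text, score) for text, score in candidates.items() if text.startswith(prefix)]
    matches.sort(key=lambda m: (m[0] != prefix, -m[1], m[0]))
    return [text for text, _ in matches[:k]]


def shortest_prefix_length(target, candidates, k):
    """Smallest number of typed characters after which `target` is in the top k."""
    for n in range(1, len(target) + 1):
        if target in top_k(target[:n], candidates, k):
            return n
    return None  # not reachable for a target in `candidates` under the ranking above


# Hypothetical mix of popular queries and low-traffic staff-directory entries.
example_candidates = {
    "library": 90,
    "lecture timetable": 80,
    "library hours": 70,
    "liam smith": 5,
    "lisa jones": 4,
}
needed = shortest_prefix_length("liam smith", example_candidates, 2)
saved = len("liam smith") - needed
```

With two suggestions shown, the rarely requested entry "liam smith" surfaces after three keystrokes in this toy data, saving seven of its ten characters.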

    Impact of location on social media credibility

    Social media platforms such as Twitter and Facebook allow users from all over the world to contribute content. However, these users publish content without peer review, and contributions of low quality can create credibility concerns. This reduces the potential social benefits of social media. Social media credibility models rely on popularity, temporal patterns and other collective behaviour of users to study the credibility of user generated content (UGC). However, such approaches do not take into account end user credibility perceptions and factors that may influence a contributor (author), which in turn affects credibility models in social media. Therefore, I studied the factors that influence readers' credibility perception and content credibility. I identified a number of limitations in existing models: most research considers only users' perceptions from one country or culture and then generalises the results to others. I also found these models do not consider author location when assessing credibility. Therefore, I proposed a study on the influence of author, reader and event location on user credibility perception and content credibility in social media. I propose a model that has been validated using a crowdsourced labelling approach. I ran three controlled experiments mainly varying source-based features (author) and content (text). Further, I applied a linguistic analysis approach to validate the influence of location on content credibility. I also applied a number of statistical analyses to measure the effect of all features. I validated the model using a common social media platform (Twitter) and showed the influence of non-textual features on credibility judgments of readers. Also, I found that reader location represented by culture can determine their credibility perception in social media. Moreover, I showed how distance between the event and author location can affect sources and credibility distribution in social media. 
Location of readers and authors, and the interaction with event locations can be used to improve assessment of credibility in social media. Reader characteristics are found to be important when studying credibility in social media as they can be used to improve user experience in social media. Moreover, an author's location can enhance credibility detection models to assess content accurately as it can differentiate between content with different credibility levels. While I do not claim that only user location can be used to build a standalone credibility system, I conclude that adding geographic location and culture of users can improve the performance of existing credibility models significantly

    B!SON: A Tool for Open Access Journal Recommendation

    Finding a suitable open access journal in which to publish scientific work is a complex task: researchers have to navigate a constantly growing number of journals, institutional agreements with publishers, funders’ conditions and the risk of predatory publishers. To help with these challenges, we introduce a web-based journal recommendation system called B!SON. It is developed based on a systematic requirements analysis, built on open data, gives publisher-independent recommendations and works across domains. It suggests open access journals based on title, abstract and references provided by the user. The recommendation quality has been evaluated using a large test set of 10,000 articles. Development by two German scientific libraries ensures the longevity of the project.

    Preface


    Pseudo National Security System of Health in Indonesia

    Adolescence is a crucial period in which individuals tend to establish who they are. However, while teenagers are struggling to find their place in the world, they are also prone to engaging in risk behaviors, which tend to have an extreme psychological impact. The objective was to explore the experiences of adolescents who engage in risk behaviors and to understand their level of personal fables. The study used a qualitative design, applying content analysis to semi-structured interviews with ten male adolescents aged 16-18 years. The major findings indicated that adolescents’ pattern of thinking revolves around the belief that they are invincible and invulnerable. Furthermore, adolescents are aware of the risks to which they expose themselves and of how they hurt others in the process. The study implies that more life skill programs should be conducted in schools and that greater awareness should be created of the impact and harmful effects of such behaviors.