53,970 research outputs found

    Firsthand Opiates Abuse on Social Media: Monitoring Geospatial Patterns of Interest Through a Digital Cohort

    Get PDF
    In the last decade drug overdose deaths reached staggering proportions in the US. Besides the raw yearly deaths count that is worrisome per se, an alarming picture comes from the steep acceleration of such rate that increased by 21% from 2015 to 2016. While traditional public health surveillance suffers from its own biases and limitations, digital epidemiology offers a new lens to extract signals from Web and Social Media that might be complementary to official statistics. In this paper we present a computational approach to identify a digital cohort that might provide an updated and complementary view on the opioid crisis. We introduce an information retrieval algorithm suitable to identify relevant subspaces of discussion on social media, for mining data from users showing explicit interest in discussions about opioid consumption in Reddit. Moreover, despite the pseudonymous nature of the user base, almost 1.5 million users were geolocated at the US state level, resembling the census population distribution with a good agreement. A measure of prevalence of interest in opiate consumption has been estimated at the state level, producing a novel indicator with information that is not entirely encoded in the standard surveillance. Finally, we further provide a domain specific vocabulary containing informal lexicon and street nomenclature extracted by user-generated content that can be used by researchers and practitioners to implement novel digital public health surveillance methodologies for supporting policy makers in fighting the opioid epidemic.Comment: Proceedings of the 2019 World Wide Web Conference (WWW '19

    Building a diversity featured search system by fusing existing tools

    Get PDF
    This paper describes our diversity featured retrieval system which are built for the task of ImageCLEFPhoto 2008. Two existing tools are used: Solr and Carrot. We have experimented with different settings of the system to see how the performance changes. The results suggest that the system can indeed increase diversity of the retrieved results and keep the precision about the same

    Concept-based Interactive Query Expansion Support Tool (CIQUEST)

    Get PDF
    This report describes a three-year project (2000-03) undertaken in the Information Studies Department at The University of Sheffield and funded by Resource, The Council for Museums, Archives and Libraries. The overall aim of the research was to provide user support for query formulation and reformulation in searching large-scale textual resources including those of the World Wide Web. More specifically the objectives were: to investigate and evaluate methods for the automatic generation and organisation of concepts derived from retrieved document sets, based on statistical methods for term weighting; and to conduct user-based evaluations on the understanding, presentation and retrieval effectiveness of concept structures in selecting candidate terms for interactive query expansion. The TREC test collection formed the basis for the seven evaluative experiments conducted in the course of the project. These formed four distinct phases in the project plan. In the first phase, a series of experiments was conducted to investigate further techniques for concept derivation and hierarchical organisation and structure. The second phase was concerned with user-based validation of the concept structures. Results of phases 1 and 2 informed on the design of the test system and the user interface was developed in phase 3. The final phase entailed a user-based summative evaluation of the CiQuest system. The main findings demonstrate that concept hierarchies can effectively be generated from sets of retrieved documents and displayed to searchers in a meaningful way. The approach provides the searcher with an overview of the contents of the retrieved documents, which in turn facilitates the viewing of documents and selection of the most relevant ones. Concept hierarchies are a good source of terms for query expansion and can improve precision. The extraction of descriptive phrases as an alternative source of terms was also effective. With respect to presentation, cascading menus were easy to browse for selecting terms and for viewing documents. In conclusion the project dissemination programme and future work are outlined

    Overview of the CLEF-2005 cross-language speech retrieval track

    Get PDF
    The task for the CLEF-2005 cross-language speech retrieval track was to identify topically coherent segments of English interviews in a known-boundary condition. Seven teams participated, performing both monolingual and cross-language searches of ASR transcripts, automatically generated metadata, and manually generated metadata. Results indicate that monolingual search technology is sufficiently accurate to be useful for some purposes (the best mean average precision was 0.18) and cross-language searching yielded results typical of those seen in other applications (with the best systems approximating monolingual mean average precision)

    TRECVID 2004 experiments in Dublin City University

    Get PDF
    In this paper, we describe our experiments for TRECVID 2004 for the Search task. In the interactive search task, we developed two versions of a video search/browse system based on the Físchlár Digital Video System: one with text- and image-based searching (System A); the other with only image (System B). These two systems produced eight interactive runs. In addition we submitted ten fully automatic supplemental runs and two manual runs. A.1, Submitted Runs: • DCUTREC13a_{1,3,5,7} for System A, four interactive runs based on text and image evidence. • DCUTREC13b_{2,4,6,8} for System B, also four interactive runs based on image evidence alone. • DCUTV2004_9, a manual run based on filtering faces from an underlying text search engine for certain queries. • DCUTV2004_10, a manual run based on manually generated queries processed automatically. • DCU_AUTOLM{1,2,3,4,5,6,7}, seven fully automatic runs based on language models operating over ASR text transcripts and visual features. • DCUauto_{01,02,03}, three fully automatic runs based on exploring the benefits of multiple sources of text evidence and automatic query expansion. A.2, In the interactive experiment it was confirmed that text and image based retrieval outperforms an image-only system. In the fully automatic runs, DCUauto_{01,02,03}, it was found that integrating ASR, CC and OCR text into the text ranking outperforms using ASR text alone. Furthermore, applying automatic query expansion to the initial results of ASR, CC, OCR text further increases performance (MAP), though not at high rank positions. For the language model-based fully automatic runs, DCU_AUTOLM{1,2,3,4,5,6,7}, we found that interpolated language models perform marginally better than other tested language models and that combining image and textual (ASR) evidence was found to marginally increase performance (MAP) over textual models alone. For our two manual runs we found that employing a face filter disimproved MAP when compared to employing textual evidence alone and that manually generated textual queries improved MAP over fully automatic runs, though the improvement was marginal. A.3, Our conclusions from our fully automatic text based runs suggest that integrating ASR, CC and OCR text into the retrieval mechanism boost retrieval performance over ASR alone. In addition, a text-only Language Modelling approach such as DCU_AUTOLM1 will outperform our best conventional text search system. From our interactive runs we conclude that textual evidence is an important lever for locating relevant content quickly, but that image evidence, if used by experienced users can aid retrieval performance. A.4, We learned that incorporating multiple text sources improves over ASR alone and that an LM approach which integrates shot text, neighbouring shots and entire video contents provides even better retrieval performance. These findings will influence how we integrate textual evidence into future Video IR systems. It was also found that a system based on image evidence alone can perform reasonably and given good query images can aid retrieval performance

    Exploiting Query Structure and Document Structure to Improve Document Retrieval Effectiveness

    Get PDF
    In this paper we present a systematic analysis of document retrieval using unstructured and structured queries within the score region algebra (SRA) structured retrieval framework. The behavior of diÂŽerent retrieval models, namely Boolean, tf.idf, GPX, language models, and Okapi, is tested using the transparent SRA framework in our three-level structured retrieval system called TIJAH. The retrieval models are implemented along four elementary retrieval aspects: element and term selection, element score computation, score combination, and score propagation. The analysis is performed on a numerous experiments evaluated on TREC and CLEF collections, using manually generated unstructured and structured queries. Unstructured queries range from the short title queries to long title + description + narrative queries. For generating structured queries we exploit the knowledge of the document structure and the content used to semantically describe or classify documents. We show that such structured information can be utilized in retrieval engines to give more precise answers to user queries then when using unstructured queries

    Overview of the 2005 cross-language image retrieval track (ImageCLEF)

    Get PDF
    The purpose of this paper is to outline efforts from the 2005 CLEF crosslanguage image retrieval campaign (ImageCLEF). The aim of this CLEF track is to explore the use of both text and content-based retrieval methods for cross-language image retrieval. Four tasks were offered in the ImageCLEF track: a ad-hoc retrieval from an historic photographic collection, ad-hoc retrieval from a medical collection, an automatic image annotation task, and a user-centered (interactive) evaluation task that is explained in the iCLEF summary. 24 research groups from a variety of backgrounds and nationalities (14 countries) participated in ImageCLEF. In this paper we describe the ImageCLEF tasks, submissions from participating groups and summarise the main fndings

    Access to recorded interviews: A research agenda

    Get PDF
    Recorded interviews form a rich basis for scholarly inquiry. Examples include oral histories, community memory projects, and interviews conducted for broadcast media. Emerging technologies offer the potential to radically transform the way in which recorded interviews are made accessible, but this vision will demand substantial investments from a broad range of research communities. This article reviews the present state of practice for making recorded interviews available and the state-of-the-art for key component technologies. A large number of important research issues are identified, and from that set of issues, a coherent research agenda is proposed
    • …
    corecore