3,067 research outputs found

    Investigating the document structure as a source of evidence for multimedia fragment retrieval

    Get PDF
    International audienceMultimedia objects can be retrieved using their context that can be for instance the text surrounding them in documents. This text may be either near or far from the searched objects. Our goal in this paper is to study the impact, in term of effectiveness, of text position relatively to searched objects. The multimedia objects we consider are described in structured documents such as XML ones. The document structure is therefore exploited to provide this text position in documents. Although structural information has been shown to be an effective source of evidence in textual information retrieval, only a few works investigated its interest in multimedia retrieval. More precisely, the task we are interested in this paper is to retrieve multimedia fragments (i.e. XML elements having at least one multimedia object). Our general approach is built on two steps: we first retrieve XML elements containing multimedia objects, and we then explore the surrounding information to retrieve relevant multimedia fragments. In both cases, we study the impact of the surrounding information using the documents structure.Our work is carried out on images, but it can be extended to any other media, since the physical content of multimedia objects is not used. We conducted several experiments in the context of the Multimedia track of the INEX evaluation campaign. Results showed that structural evidences are of high interest to tune the importance of textual context for multimedia retrieval. Moreover, the proposed approach outperforms state of the art approaches

    Spoken content retrieval: A survey of techniques and technologies

    Get PDF
    Speech media, that is, digital audio and video containing spoken content, has blossomed in recent years. Large collections are accruing on the Internet as well as in private and enterprise settings. This growth has motivated extensive research on techniques and technologies that facilitate reliable indexing and retrieval. Spoken content retrieval (SCR) requires the combination of audio and speech processing technologies with methods from information retrieval (IR). SCR research initially investigated planned speech structured in document-like units, but has subsequently shifted focus to more informal spoken content produced spontaneously, outside of the studio and in conversational settings. This survey provides an overview of the field of SCR encompassing component technologies, the relationship of SCR to text IR and automatic speech recognition and user interaction issues. It is aimed at researchers with backgrounds in speech technology or IR who are seeking deeper insight on how these fields are integrated to support research and development, thus addressing the core challenges of SCR

    #mytweet via Instagram: Exploring User Behaviour across Multiple Social Networks

    Full text link
    We study how users of multiple online social networks (OSNs) employ and share information by studying a common user pool that use six OSNs - Flickr, Google+, Instagram, Tumblr, Twitter, and YouTube. We analyze the temporal and topical signature of users' sharing behaviour, showing how they exhibit distinct behaviorial patterns on different networks. We also examine cross-sharing (i.e., the act of user broadcasting their activity to multiple OSNs near-simultaneously), a previously-unstudied behaviour and demonstrate how certain OSNs play the roles of originating source and destination sinks.Comment: IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, 2015. This is the pre-peer reviewed version and the final version is available at http://wing.comp.nus.edu.sg/publications/2015/lim-et-al-15.pd

    Sensemaking on the Pragmatic Web: A Hypermedia Discourse Perspective

    Get PDF
    The complexity of the dilemmas we face on an organizational, societal and global scale forces us into sensemaking activity. We need tools for expressing and contesting perspectives flexible enough for real time use in meetings, structured enough to help manage longer term memory, and powerful enough to filter the complexity of extended deliberation and debate on an organizational or global scale. This has been the motivation for a programme of basic and applied action research into Hypermedia Discourse, which draws on research in hypertext, information visualization, argumentation, modelling, and meeting facilitation. This paper proposes that this strand of work shares a key principle behind the Pragmatic Web concept, namely, the need to take seriously diverse perspectives and the processes of meaning negotiation. Moreover, it is argued that the hypermedia discourse tools described instantiate this principle in practical tools which permit end-user control over modelling approaches in the absence of consensus

    Doctor of Philosophy

    Get PDF
    dissertationMedical knowledge learned in medical school can become quickly outdated given the tremendous growth of the biomedical literature. It is the responsibility of medical practitioners to continuously update their knowledge with recent, best available clinical evidence to make informed decisions about patient care. However, clinicians often have little time to spend on reading the primary literature even within their narrow specialty. As a result, they often rely on systematic evidence reviews developed by medical experts to fulfill their information needs. At the present, systematic reviews of clinical research are manually created and updated, which is expensive, slow, and unable to keep up with the rapidly growing pace of medical literature. This dissertation research aims to enhance the traditional systematic review development process using computer-aided solutions. The first study investigates query expansion and scientific quality ranking approaches to enhance literature search on clinical guideline topics. The study showed that unsupervised methods can improve retrieval performance of a popular biomedical search engine (PubMed). The proposed methods improve the comprehensiveness of literature search and increase the ratio of finding relevant studies with reduced screening effort. The second and third studies aim to enhance the traditional manual data extraction process. The second study developed a framework to extract and classify texts from PDF reports. This study demonstrated that a rule-based multipass sieve approach is more effective than a machine-learning approach in categorizing document-level structures and iv that classifying and filtering publication metadata and semistructured texts enhances the performance of an information extraction system. The proposed method could serve as a document processing step in any text mining research on PDF documents. The third study proposed a solution for the computer-aided data extraction by recommending relevant sentences and key phrases extracted from publication reports. This study demonstrated that using a machine-learning classifier to prioritize sentences for specific data elements performs equally or better than an abstract screening approach, and might save time and reduce errors in the full-text screening process. In summary, this dissertation showed that there are promising opportunities for technology enhancement to assist in the development of systematic reviews. In this modern age when computing resources are getting cheaper and more powerful, the failure to apply computer technologies to assist and optimize the manual processes is a lost opportunity to improve the timeliness of systematic reviews. This research provides methodologies and tests hypotheses, which can serve as the basis for further large-scale software engineering projects aimed at fully realizing the prospect of computer-aided systematic reviews

    Content And Multimedia Database Management Systems

    Get PDF
    A database management system is a general-purpose software system that facilitates the processes of defining, constructing, and manipulating databases for various applications. The main characteristic of the ‘database approach’ is that it increases the value of data by its emphasis on data independence. DBMSs, and in particular those based on the relational data model, have been very successful at the management of administrative data in the business domain. This thesis has investigated data management in multimedia digital libraries, and its implications on the design of database management systems. The main problem of multimedia data management is providing access to the stored objects. The content structure of administrative data is easily represented in alphanumeric values. Thus, database technology has primarily focused on handling the objects’ logical structure. In the case of multimedia data, representation of content is far from trivial though, and not supported by current database management systems

    Towards memory supporting personal information management tools

    Get PDF
    In this article we discuss re-retrieving personal information objects and relate the task to recovering from lapse(s) in memory. We propose that fundamentally it is lapses in memory that impede users from successfully re-finding the information they need. Our hypothesis is that by learning more about memory lapses in non-computing contexts and how people cope and recover from these lapses, we can better inform the design of PIM tools and improve the user's ability to re-access and re-use objects. We describe a diary study that investigates the everyday memory problems of 25 people from a wide range of backgrounds. Based on the findings, we present a series of principles that we hypothesize will improve the design of personal information management tools. This hypothesis is validated by an evaluation of a tool for managing personal photographs, which was designed with respect to our findings. The evaluation suggests that users' performance when re-finding objects can be improved by building personal information management tools to support characteristics of human memory
    corecore