2,701 research outputs found

    Indexing and retrieval of multimodal lecture recordings from open repositories for personalized access in modern learning settings

    An increasing number of lecture recordings are available to complement face-to-face teaching and the more conventional content-based e-learning approaches. These recordings provide additional channels for remote students and time-independent access to lectures. Many universities even offer complete series of recordings for hundreds of courses that are available for public access, a service that provides added value for users outside the university. The lecture recordings use a great variety of media or modalities (such as video, audio, lecture media, and presentation behaviour) and formats. So far, however, none of the existing systems and services has sufficient retrieval functionality or supports appropriate interfaces for searching lecture recordings across several repositories. This situation motivated us to initiate research on a lecture recording indexing and retrieval system for knowledge transfer and learning activities in various settings. The system builds on our earlier experience and prototypes developed within the MISTRAL research project. In this paper we outline requirements for an enhanced lecture recording retrieval system, introduce our solution and prototype, and discuss initial results and findings.
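    The abstract does not detail the retrieval mechanism, but a minimal sketch of what cross-repository, multimodal search involves might look like the following (all class and field names are hypothetical illustrations, not the MISTRAL prototype's API):

```python
# A minimal sketch (hypothetical, not the MISTRAL prototype) of indexing
# lecture recordings from several repositories and searching across
# modality-specific text such as speech transcripts and slide text.
from dataclasses import dataclass, field

@dataclass
class LectureRecording:
    repository: str                                   # source repository
    title: str
    modalities: dict = field(default_factory=dict)    # modality -> extracted text

index: list[LectureRecording] = []

def add_recording(rec: LectureRecording) -> None:
    index.append(rec)

def search(query: str) -> list[LectureRecording]:
    """Naive keyword search over every modality of every repository."""
    q = query.lower()
    return [r for r in index
            if q in r.title.lower()
            or any(q in text.lower() for text in r.modalities.values())]

add_recording(LectureRecording(
    repository="uni-a", title="Information Retrieval, Lecture 3",
    modalities={"audio_transcript": "today we cover inverted indexes",
                "slide_text": "Inverted index: term to posting list"}))

for hit in search("inverted index"):
    print(hit.repository, "-", hit.title)
```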

    Discovering real-world usage scenarios for a multimodal math search interface

    To search with math expressions, current search engines require knowing expression names or using a structure editor or string encoding (e.g., LaTeX) to enter expressions. This is unfortunate for people who are not math experts, as it can create an intention gap between the math query they wish to express and what the interface allows. min is a search interface that supports drawing expressions on a canvas using mouse/touch, keyboard, and images. We designed a user study to examine how the multimodal interface of min changes search behavior for mathematical non-experts, and to discover real-world usage scenarios. Participants demonstrated increased use of math expressions in queries when using min. There was little difference in the task success reported by participants using min versus text-based search, but the majority of participants appreciated the multimodal input and identified real-world scenarios in which they would like to use systems like min.
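    As a purely illustrative aside (not part of min), the snippet below shows why typed string encodings raise the entry barrier: even a short expression must be written in LaTeX and then normalised into symbol tokens before a search engine can match it, whereas a drawn query bypasses the encoding step entirely. The tokeniser here is a hypothetical simplification.

```python
# Purely illustrative (not min's implementation): normalise a LaTeX string
# into plain symbol tokens that a math search engine could match against.
import re

def latex_tokens(expr: str) -> list[str]:
    """Split a LaTeX string into command and symbol tokens; braces are dropped."""
    return re.findall(r"\\[A-Za-z]+|[A-Za-z0-9]|[+\-*/=^_()]", expr)

print(latex_tokens(r"\frac{x^2}{2} = \int x dx"))
# ['\\frac', 'x', '^', '2', '2', '=', '\\int', 'x', 'd', 'x']
```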

    Voice-assisted Image Labelling for Endoscopic Ultrasound Classification using Neural Networks

    Ultrasound imaging is a commonly used technology for visualising patient anatomy in real time during diagnostic and therapeutic procedures. High operator dependency and low reproducibility make ultrasound imaging and interpretation challenging, with a steep learning curve. Automatic image classification using deep learning has the potential to overcome some of these challenges by supporting ultrasound training for novices, as well as by aiding ultrasound image interpretation in patients with complex pathology for more experienced practitioners. However, deep learning methods require a large amount of data to provide accurate results. Labelling large ultrasound datasets is challenging because labels are assigned retrospectively to 2D images, without the 3D spatial context that is available in vivo or that an operator infers while visually tracking structures between frames during the procedure. In this work, we propose a multi-modal convolutional neural network (CNN) architecture that labels endoscopic ultrasound (EUS) images from raw verbal comments provided by a clinician during the procedure. We use a CNN composed of two branches, one for voice data and one for image data, which are joined to predict image labels from the spoken names of anatomical landmarks. The network was trained using recorded verbal comments from expert operators. Our results show a prediction accuracy of 76% at the image level on a dataset with 5 different labels. We conclude that the addition of spoken commentary can increase the performance of ultrasound image classification and eliminate the burden of manually labelling the large EUS datasets necessary for deep learning applications.
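    As a rough illustration of the two-branch architecture described above (not the authors' code; every layer size and input shape below is an assumption), the following sketch fuses an image branch over an ultrasound frame with a voice branch over a spectrogram of the spoken comment, then classifies the pair into one of five landmark labels:

```python
# A minimal two-branch multi-modal CNN sketch (hypothetical sizes, not the
# paper's architecture): an image branch and a voice branch are joined by
# concatenation, and a small classifier head predicts one of 5 labels.
import torch
import torch.nn as nn

def conv_branch() -> nn.Sequential:
    """Small 2D conv stack that maps a 1-channel input to a 32-d embedding."""
    return nn.Sequential(
        nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
        nn.MaxPool2d(2),
        nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
        nn.AdaptiveAvgPool2d(1), nn.Flatten(),       # -> (batch, 32)
    )

class TwoBranchEUSClassifier(nn.Module):
    def __init__(self, num_labels: int = 5):
        super().__init__()
        self.image_branch = conv_branch()   # ultrasound frame
        self.voice_branch = conv_branch()   # spectrogram of spoken comment
        self.classifier = nn.Sequential(    # fusion: concatenate, then classify
            nn.Linear(64, 64), nn.ReLU(),
            nn.Linear(64, num_labels),
        )

    def forward(self, image: torch.Tensor, spectrogram: torch.Tensor) -> torch.Tensor:
        fused = torch.cat([self.image_branch(image),
                           self.voice_branch(spectrogram)], dim=1)
        return self.classifier(fused)       # logits; train with CrossEntropyLoss

# Dummy forward pass (input shapes are assumptions):
model = TwoBranchEUSClassifier()
logits = model(torch.randn(2, 1, 128, 128), torch.randn(2, 1, 64, 100))
print(logits.shape)  # torch.Size([2, 5])
```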
    • …