    DocMIR: An automatic document-based indexing system for meeting retrieval

    This paper describes the DocMIR system, which automatically captures, analyzes and indexes meetings, conferences, lectures, etc. by taking advantage of the documents projected during the events (e.g. slideshows, budget tables, figures). For instance, the system can automatically apply these procedures to a lecture and index the event according to the presented slides and their contents. For indexing, the system requires neither specific software installed on the presenter's computer nor any conscious intervention by the speaker throughout the presentation. The only material required is the speaker's electronic presentation file; even if it is not provided, the system still temporally segments the presentation and offers a simple storyboard-like browsing interface. The system runs on several capture boxes connected to cameras and microphones that record events synchronously. Once the recording is over, indexing is performed automatically by analyzing the content of the captured video of the projected documents to detect scene changes, identify the documents, compute their durations and extract their textual content. Each captured image is identified against a repository containing all original electronic documents, captured audio-visual data and metadata created during post-production. The identification is based on document signatures, which hierarchically structure features from both the layout structure and the color distribution of the document images. Video segments are finally enriched with the textual content of the identified original documents, which facilitates query and retrieval without using OCR. The signature-based indexing method proposed in this article is robust, works with low-resolution images and can be applied to several other applications, including real-time document recognition, multimedia information retrieval and augmented reality systems
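    To make the signature idea concrete, the following is a minimal sketch, assuming a signature that concatenates grid-based layout (ink-density) features with coarse color histograms and matches captured frames to originals by L1 distance; all function names and parameters here are illustrative assumptions, not the paper's actual algorithm.
    ```python
    # Sketch of signature-based document image matching in the spirit of the
    # DocMIR description (layout structure + color distribution). Illustrative
    # assumptions only, not the paper's algorithm.
    import numpy as np

    def color_signature(img, bins=8):
        """Coarse per-channel color histogram, normalized to sum to 1."""
        hist = [np.histogram(img[..., c], bins=bins, range=(0, 255))[0]
                for c in range(3)]
        hist = np.concatenate(hist).astype(float)
        return hist / hist.sum()

    def layout_signature(img, grid=4):
        """Mean darkness ('ink density') over a grid x grid tiling of the page."""
        gray = img.mean(axis=2)
        h, w = gray.shape
        tiles = [255.0 - gray[i * h // grid:(i + 1) * h // grid,
                              j * w // grid:(j + 1) * w // grid].mean()
                 for i in range(grid) for j in range(grid)]
        return np.array(tiles) / 255.0

    def signature(img):
        # Hierarchical signature: layout features first, color features second.
        return np.concatenate([layout_signature(img), color_signature(img)])

    def identify(captured, repository):
        """Return the repository key whose signature is closest in L1 distance."""
        cap_sig = signature(captured)
        return min(repository,
                   key=lambda k: np.abs(signature(repository[k]) - cap_sig).sum())

    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        # Two synthetic 'original slides' and a noisy low-quality capture of slide_a.
        slide_a = rng.integers(0, 256, (120, 160, 3)).astype(np.uint8)
        slide_b = rng.integers(0, 256, (120, 160, 3)).astype(np.uint8)
        captured = np.clip(slide_a.astype(int) + rng.integers(-20, 21, slide_a.shape),
                           0, 255).astype(np.uint8)
        print(identify(captured, {"slide_a": slide_a, "slide_b": slide_b}))  # expected: slide_a
    ```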

    CONTENT BASED RETRIEVAL OF LECTURE VIDEO REPOSITORY: LITERATURE REVIEW

    Multimedia plays a significant role in communicating information, and large multimedia repositories now support the browsing, retrieval and delivery of video content. For higher education, using video as a tool for learning and teaching through multimedia applications holds considerable promise. Many universities have adopted educational systems in which the teacher's lecture is video-recorded and made available to students with minimal post-processing effort. Since each video may cover many subjects, it is critical for an e-Learning environment to have content-based video search capabilities to meet diverse individual learning needs. The present paper reviews more than 120 core research articles on the content-based retrieval of lecture video repositories hosted on the cloud by government, academic and research organizations of India

    DOCUMENT GENERATOR BASED ON RANDOMIZED TEMPLATES

    This disclosure relates to the field of synthetic document generation for training Deep Learning algorithms to understand document contents. Document understanding is important for applications such as document quality enhancement and information extraction pipelines. In document quality enhancement, different computer vision techniques are applied to specific regions of the document depending on the element type (text, image), for tasks like printing and/or scanning. Information extraction pipelines aim to retrieve valuable knowledge from documents in an automated fashion; again, depending on the element type, different extractors are used. Machine Learning techniques may be applied to decompose a document into element types: text, images, equations, charts, and diagrams. Regardless of the training regime (supervised or unsupervised), data is necessary. One option would be to obtain documents from the Internet, but this raises several problems: the lack of a permissive license, unbalanced data (e.g. slides with only text elements), and the difficulty of extracting precise annotations for training ML models from raw documents. This disclosure presents a synthetic data generator able to create a diverse set of documents based on randomized template formats; here, we focus on slide presentations
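    The following is a minimal sketch of the randomized-template idea: place a random number of typed elements on a slide canvas and emit pixel-level bounding-box annotations alongside. The element types, template shape and annotation format are assumptions for illustration, not the disclosed design.
    ```python
    # Sketch of a randomized-template slide generator with bounding-box labels.
    # Element types and layout rules are illustrative placeholders.
    import json
    import random

    ELEMENT_TYPES = ["text", "image", "equation", "chart", "diagram"]

    def random_slide(width=1280, height=720, max_elements=4, rng=random):
        """Stack up to max_elements typed boxes down a slide and return the
        slide spec together with pixel-level bounding-box annotations."""
        elements = []
        y = 60  # leave room for a title band
        for _ in range(rng.randint(1, max_elements)):
            box_h = rng.randint(80, 200)
            if y + box_h > height:
                break
            elements.append({
                "type": rng.choice(ELEMENT_TYPES),
                "bbox": [rng.randint(0, width // 3), y,
                         rng.randint(2 * width // 3, width - 1), y + box_h],
            })
            y += box_h + rng.randint(10, 40)  # vertical gap between elements
        return {"width": width, "height": height, "elements": elements}

    if __name__ == "__main__":
        random.seed(42)
        # Emit a tiny synthetic 'dataset' of three annotated slides as JSON lines.
        for _ in range(3):
            print(json.dumps(random_slide()))
    ```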

    Associating Facial Expressions and Upper-Body Gestures with Learning Tasks for Enhancing Intelligent Tutoring Systems

    Learning involves a substantial amount of cognitive, social and emotional states. Therefore, recognizing and understanding these states in the context of learning is key to designing informed interventions and addressing the needs of the individual student to provide personalized education. In this paper, we explore the automatic detection of a learner's nonverbal behaviors, involving hand-over-face gestures, head and eye movements, and emotions expressed via facial expressions, during learning. The proposed computer-vision-based behavior monitoring method uses a low-cost webcam and can easily be integrated with modern tutoring technologies. We investigate these behaviors in depth over time in a 40-minute classroom session involving reading and problem-solving exercises. The exercises in the session are divided into three categories: an easy, a medium and a difficult topic within the context of undergraduate computer science. We found a significant increase in head and eye movements as time progresses, as well as with increasing difficulty level. We show that hand-over-face gestures occur considerably often (21.35% of the time on average) during the 40-minute session, a behavior that remains unexplored in the education domain. We propose a novel deep learning approach for the automatic detection of hand-over-face gestures in images, with a classification accuracy of 86.87%. Hand-over-face gestures increase prominently when the difficulty level of the given exercise increases, and they occur more frequently during problem-solving exercises (easy 23.79%, medium 19.84%, difficult 30.46%) than during reading (easy 16.20%, medium 20.06%, difficult 20.18%)
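    A minimal sketch of what a binary hand-over-face image classifier could look like in PyTorch; the architecture and the 64x64 RGB input size are illustrative assumptions, not the network proposed in the paper.
    ```python
    # Sketch of a small CNN for hand-over-face gesture detection on face crops.
    # Architecture and input size are assumptions for illustration.
    import torch
    import torch.nn as nn

    class HandOverFaceNet(nn.Module):
        def __init__(self):
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),   # 64 -> 32
                nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),  # 32 -> 16
                nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),  # 16 -> 8
            )
            self.classifier = nn.Sequential(
                nn.Flatten(),
                nn.Linear(64 * 8 * 8, 128), nn.ReLU(),
                nn.Linear(128, 2),  # logits: [no gesture, hand-over-face]
            )

        def forward(self, x):
            return self.classifier(self.features(x))

    if __name__ == "__main__":
        model = HandOverFaceNet()
        # One dummy batch of four 64x64 face crops, just to check shapes.
        logits = model(torch.randn(4, 3, 64, 64))
        print(logits.shape)  # torch.Size([4, 2])
    ```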

    A Survey of Smart Classroom Literature

    Recently, there has been a substantial amount of research on smart classrooms, encompassing a number of areas, including Information and Communication Technology, Machine Learning, Sensor Networks, Cloud Computing, and Hardware. Smart classroom research has been implemented quickly to enhance education systems, resulting in higher engagement and empowerment of students, educators and administrators. Despite decades of using emerging technology to improve teaching practices, critics often point out that these methods lack adequate theoretical and technical foundations. As a result, a number of conflicting reviews cover different perspectives on smart classrooms. For a realistic smart classroom approach, a piecemeal implementation is insufficient. This survey contributes to the current literature by presenting a comprehensive analysis of the various disciplines using a standard terminology and taxonomy. This multi-field study reveals new research possibilities and problems that must be tackled in order to integrate interdisciplinary works in a synergistic manner. Our analysis shows that the smart classroom is a rapidly developing research area that complements a number of emerging technologies. Moreover, this paper describes the co-occurrence network of technological keywords, built with VOSviewer, for an in-depth analysis
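    A small sketch of how a keyword co-occurrence network (the kind the survey visualizes with VOSviewer) can be computed from author keywords; the keyword sets below are invented placeholders, not the survey's data.
    ```python
    # Sketch of keyword co-occurrence counting for a co-occurrence network.
    # The per-paper keyword sets are hypothetical examples.
    from collections import Counter
    from itertools import combinations

    papers = [
        {"smart classroom", "machine learning", "sensor networks"},
        {"smart classroom", "cloud computing"},
        {"machine learning", "cloud computing", "smart classroom"},
    ]

    # An undirected edge (k1, k2) is weighted by the number of papers in which
    # both keywords appear together.
    edges = Counter()
    for keywords in papers:
        for k1, k2 in combinations(sorted(keywords), 2):
            edges[(k1, k2)] += 1

    for (k1, k2), weight in edges.most_common():
        print(f"{k1} -- {k2}: {weight}")
    ```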

    Segmenting Lecture Videos by Topic: From Manual to Automated Methods

    More and more universities and corporations are starting to provide videotaped lectures online for knowledge sharing and learning. Segmenting lecture videos into short clips by topic can extract the hidden information structure of the videos and facilitate information searching and learning. Manual segmentation has high accuracy but is very labor-intensive. In order to develop a high-performance automated segmentation method for lecture videos, we conducted a case study to learn the segmentation process of humans and the effective segmentation features used in that process. Based on the findings from the case study, we designed an automated segmentation approach with two phases: initial segmentation and segmentation refinement. The approach combines segmentation features from three information sources of video (speech text transcript, audio and video) and makes use of various knowledge sources such as world knowledge and domain knowledge. Our preliminary results show that the proposed two-phase approach is promising
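    As an illustration of the transcript-based side of such segmentation, here is a minimal TextTiling-style sketch that scores each gap between adjacent transcript windows by lexical cosine similarity and places topic boundaries at the deepest valleys; this heuristic stands in for the paper's two-phase method, which also fuses audio and video features and external knowledge sources.
    ```python
    # Sketch of transcript-based initial segmentation: boundaries fall where
    # adjacent windows of the transcript are lexically least similar.
    from collections import Counter
    from math import sqrt

    def cosine(a, b):
        num = sum(a[w] * b[w] for w in a.keys() & b.keys())
        den = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
        return num / den if den else 0.0

    def segment(sentences, window=2, n_boundaries=1):
        """Return indices i such that a topic boundary falls after sentences[i]."""
        bags = [Counter(s.lower().split()) for s in sentences]
        scores = []
        for i in range(window, len(bags) - window + 1):
            left = sum(bags[i - window:i], Counter())
            right = sum(bags[i:i + window], Counter())
            scores.append((cosine(left, right), i - 1))
        scores.sort()  # lowest similarity = most likely topic shift
        return sorted(i for _, i in scores[:n_boundaries])

    if __name__ == "__main__":
        transcript = [
            "today we cover sorting algorithms and quicksort",
            "quicksort partitions the array around a pivot",
            "next let us switch to graph theory",
            "a graph has vertices and edges",
        ]
        print(segment(transcript))  # boundary after sentence index 1
    ```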

    VLEngagement: A Dataset of Scientific Video Lectures for Evaluating Population-based Engagement

    With the emergence of e-learning and personalised education, the production and distribution of digital educational resources have boomed. Video lectures have become one of the primary modalities for imparting knowledge to the masses in the current digital age. The rapid creation of video lecture content challenges the currently established human-centred moderation and quality assurance pipeline, demanding more efficient, scalable and automatic solutions for managing learning resources. Although a few datasets related to engagement with educational videos exist, there is still an important need for data and research aimed at understanding learner engagement with scientific video lectures. This paper introduces VLEngagement, a novel dataset consisting of content-based and video-specific features extracted from publicly available scientific video lectures, together with several metrics related to user engagement. We introduce several novel tasks related to predicting and understanding context-agnostic engagement in video lectures, providing preliminary baselines. To our knowledge, this is the largest and most diverse publicly available dataset dealing with such tasks. The extraction of Wikipedia topic-based features also allows more sophisticated Wikipedia-based features to be associated with the dataset to improve performance on these tasks. The dataset, helper tools and example code snippets are publicly available at https://github.com/sahanbull/context-agnostic-engagemen
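    As a sketch of what a population-based engagement baseline for such tasks might look like, the snippet below regresses a synthetic engagement score from hypothetical per-lecture features with gradient boosting; the feature names and data are placeholders, and the linked repository defines the real schema and baselines.
    ```python
    # Sketch of an engagement-prediction baseline on placeholder features.
    # The feature names and synthetic target are illustrative assumptions.
    import numpy as np
    from sklearn.ensemble import GradientBoostingRegressor
    from sklearn.metrics import mean_squared_error
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(0)
    n = 500
    # Hypothetical per-lecture features: duration, word count, topic coverage.
    X = np.column_stack([
        rng.uniform(5, 90, n),        # duration_minutes
        rng.integers(500, 9000, n),   # transcript_word_count
        rng.uniform(0, 1, n),         # wikipedia_topic_coverage
    ])
    # Synthetic target: shorter, topic-focused lectures get higher engagement.
    y = 1.0 / (1.0 + X[:, 0] / 30.0) * (0.5 + 0.5 * X[:, 2]) + rng.normal(0, 0.05, n)

    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
    model = GradientBoostingRegressor(random_state=0).fit(X_tr, y_tr)
    print("test RMSE:", mean_squared_error(y_te, model.predict(X_te)) ** 0.5)
    ```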