
    Semantic topic discovery for lecture video

    © Springer Nature Switzerland AG 2020. With more and more lecture videos available on the Internet, online learning and e-learning are attracting increasing attention thanks to advantages such as a high degree of interactivity. Discovering the semantic content of lecture videos is a key problem. In this paper, we propose a Multi-modal LDA model that discovers the semantic topics of lecture videos by jointly considering audio and visual information. Specifically, the speaking content and the information on the presentation slides are extracted from the lecture videos. With the proposed inference and learning algorithm, the semantic topics of a video can be discovered. The experimental results show that the proposed method can effectively discover meaningful semantic characteristics of lecture videos.
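    The abstract does not specify the authors' Multi-modal LDA in detail, but the general idea of topic discovery over fused audio and slide text can be sketched with plain LDA. The lecture texts, vocabulary and topic count below are all illustrative placeholders, not the paper's data or model.

```python
# Hypothetical sketch: approximating multi-modal topic discovery by
# running plain LDA over each lecture's speech transcript concatenated
# with its slide keywords (NOT the authors' Multi-modal LDA model).
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

# Toy "lecture videos": transcript text fused with slide text.
lectures = [
    "gradient descent learning rate loss slides optimization convergence",
    "neural network layers activation slides backpropagation weights",
    "sorting quicksort partition slides recursion complexity",
]

vectorizer = CountVectorizer()
X = vectorizer.fit_transform(lectures)

lda = LatentDirichletAllocation(n_components=2, random_state=0)
doc_topics = lda.fit_transform(X)  # one topic mixture per lecture

print(doc_topics.shape)
```

    A proper multi-modal model would instead keep separate word distributions per modality; the early fusion above is only the simplest baseline.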

    CONTENT BASED RETRIEVAL OF LECTURE VIDEO REPOSITORY: LITERATURE REVIEW

    Multimedia plays a significant role in communicating information, and large multimedia repositories enable the browsing, retrieval and delivery of video content. For higher education, using video as a tool for learning and teaching through multimedia applications holds considerable promise. Many universities adopt educational systems in which the teacher's lecture is video recorded and made available to students with minimal post-processing effort. Since each video may cover many subjects, it is critical for an e-Learning environment to have content-based video searching capabilities to meet diverse individual learning needs. The present paper reviews 120+ core research articles on content-based retrieval of lecture video repositories hosted on the cloud by government academic and research organizations of India.

    Video Categorization Using Data Mining

    Video categorization using data mining is an area of research that aims to propose a method based on an Artificial Neural Network (ANN) that can classify video files into different categories according to their content. To test this method, the classification of video files is discussed. The applied system proposes that videos can be categorized into two classes: the first is educational, while the second is non-educational. The classification is conducted based on motion, using optical flow. Several experiments were conducted using the Artificial Neural Network (ANN) model. The research facilitates learners' access to the required educational videos, especially for novice students. The objective of this research is to investigate how motion features can be useful in such classification. We believe that other features, such as audio features, text features and other factors, can enhance accuracy, but this requires wider studies and more time. Using 3-fold cross validation, the accuracy of the ANN model in classifying videos as educational or non-educational is 54%. This result may be improved by introducing the other factors mentioned above.
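    The evaluation protocol the abstract describes (an ANN over motion features, scored with 3-fold cross validation) can be sketched as follows. The feature vectors here are synthetic stand-ins for optical-flow statistics; the network size and labels are illustrative assumptions, not the paper's setup.

```python
# Illustrative sketch: educational vs. non-educational classification
# from motion features with an ANN, evaluated by 3-fold cross validation.
# Features below are random stand-ins for optical-flow statistics.
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
# 60 clips x 4 motion features (e.g. mean/variance of flow magnitude).
X = rng.normal(size=(60, 4))
y = (X[:, 0] + 0.5 * rng.normal(size=60) > 0).astype(int)  # toy labels

ann = MLPClassifier(hidden_layer_sizes=(8,), max_iter=500, random_state=0)
scores = cross_val_score(ann, X, y, cv=3)  # 3-fold cross validation
print(scores.mean())
```

    Each fold trains on two thirds of the clips and tests on the remaining third; the mean of the three fold accuracies is the figure comparable to the 54% reported above.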

    Annotation of multimedia learning materials for semantic search

    Multimedia is the main source of online learning materials, such as videos, slides and textbooks, and its volume is growing with the popularity of online programs offered by universities and Massive Open Online Courses (MOOCs). The increasing amount of multimedia learning resources available online makes it very challenging to browse through the materials or find where a specific concept of interest is covered. To enable semantic search on lecture materials, their content must be annotated and indexed. Manual annotation of learning materials such as videos is tedious and cannot be envisioned for the growing quantity of online materials. One of the most commonly used methods for learning video annotation is to index the video based on the transcript obtained by translating its audio track into text. Existing speech-to-text translators require extensive training, especially for non-native English speakers, and are known to have low accuracy. This dissertation proposes instead to index the slides based on keywords. The keywords extracted from the textbook index and the presentation slides are the basis of the indexing scheme. Two types of lecture videos are generally used (i.e., classroom recordings using a regular camera, or slide presentation screen captures using specific software), and their quality varies widely. Screen capture videos generally have good quality and sometimes come with metadata. But the metadata is often unreliable, and hence image processing techniques are used to segment the videos. Since learning videos have a static slide background, it is challenging to detect shot boundaries. A comparative analysis of state-of-the-art techniques to determine the feature descriptors best suited to detecting transitions in a learning video is presented in this dissertation.
The videos are indexed with keywords obtained from the slides, and a correspondence is established by segmenting the video temporally, using feature descriptors to match and align the video segments with the presentation slides converted into images. Classroom recordings made with regular video cameras often have poor illumination, with objects partially or totally occluded. For such videos, slide localization techniques based on segmentation and heuristics are presented to improve the accuracy of transition detection. A region-prioritized ranking mechanism is proposed that integrates the location of a keyword within the presentation into the ranking of the slides when searching for a slide that covers a given keyword. This helps in returning the most relevant results first. With the increasing amount of course material gathered online, a user looking to understand a given concept can get overwhelmed. The standard way of learning and the concept of “one size fits all” are no longer the best way for millennials to learn. Personalized concept recommendation according to the user’s background knowledge is presented. Finally, the contributions of this dissertation have been integrated into the Ultimate Course Search (UCS), a tool for effective search of course materials. UCS integrates presentations, lecture videos and textbook content into a single platform with topic-based search capabilities and easy navigation of lecture materials.
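    The simplest family of feature descriptors for the transition detection discussed above compares per-frame intensity histograms: adjacent frames of the same slide are nearly identical, so a large histogram distance signals a slide change. The function, bin count and threshold below are illustrative assumptions, not the dissertation's chosen descriptor.

```python
# Hedged sketch of histogram-based shot boundary detection for
# slide videos: a large L1 distance between adjacent frames' gray
# histograms suggests a slide transition. Parameters are illustrative.
import numpy as np

def shot_boundaries(frames, threshold=0.5):
    """frames: list of 2-D grayscale arrays; returns indices of cuts."""
    boundaries = []
    prev_hist = None
    for i, frame in enumerate(frames):
        hist, _ = np.histogram(frame, bins=16, range=(0, 256))
        hist = hist / hist.sum()  # normalise so distances are comparable
        if prev_hist is not None:
            dist = 0.5 * np.abs(hist - prev_hist).sum()  # in [0, 1]
            if dist > threshold:
                boundaries.append(i)
        prev_hist = hist
    return boundaries

# Two synthetic "slides": a dark segment followed by a bright one.
dark = [np.full((8, 8), 30) for _ in range(3)]
bright = [np.full((8, 8), 220) for _ in range(3)]
print(shot_boundaries(dark + bright))  # boundary at the slide change
```

    Because slide backgrounds are static, real systems add heuristics (e.g. ignoring small regions of change such as a presenter's cursor) on top of a global descriptor like this one.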

    Towards Arabic multi-modal sentiment analysis

    In everyday life, people use the internet to express and share opinions, facts and sentiments about products and services. In addition, social media applications such as Facebook, Twitter, WhatsApp and Snapchat have become important information-sharing platforms. Beyond these, collecting product reviews, facts, poll information and the like is a need for every company or organization, from start-ups to big firms and governments. Clearly, it is very challenging to analyse such big data to improve products and services and to satisfy customer requirements. Therefore, it is necessary to automate the evaluation process using advanced sentiment analysis techniques. Most previous work has focused on uni-modal sentiment analysis, mainly the textual modality. In this paper, a novel Arabic multi-modal dataset is presented and validated using a state-of-the-art support vector machine (SVM) based classification method.
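    The validation step the abstract mentions, an SVM over multi-modal features, commonly takes the form of early fusion: concatenating per-modality feature vectors before classification. The features, labels and dimensions below are synthetic placeholders, not the paper's Arabic dataset.

```python
# Minimal sketch of SVM-based multi-modal sentiment classification
# with early fusion (text features concatenated with audio features).
# All data here is synthetic and purely illustrative.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(1)
text_feats = rng.normal(size=(40, 5))     # stand-in textual features
audio_feats = rng.normal(size=(40, 3))    # stand-in acoustic features
X = np.hstack([text_feats, audio_feats])  # simple early fusion
y = (X[:, 0] > 0).astype(int)             # toy sentiment labels

clf = SVC(kernel="linear").fit(X, y)
print(clf.score(X, y))  # training accuracy of the fused model
```

    Alternatives include late fusion (one classifier per modality, combined by voting), which the sketch above does not cover.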

    CHORUS Deliverable 2.2: Second report - identification of multi-disciplinary key issues for gap analysis toward EU multimedia search engines roadmap

    After addressing the state of the art during the first year of Chorus and establishing the existing landscape of multimedia search engines, we identified and analyzed gaps in the European research effort during our second year. In this period we focused on three directions, notably technological issues, user-centred issues and use-cases, and socio-economic and legal aspects. These were assessed through two central studies: firstly, a concerted vision of the functional breakdown of a generic multimedia search engine, and secondly, representative use-case descriptions with a related discussion of the requirements they place on technological challenges. Both studies were carried out in cooperation and consultation with the community at large through EC concertation meetings (multimedia search engines cluster), several meetings with our Think-Tank, presentations at international conferences, and surveys addressed to coordinators of EU projects as well as of national initiatives. Based on the feedback obtained, we identified two types of gaps, namely core technological gaps that involve research challenges, and “enablers”, which are not necessarily technical research challenges but have an impact on innovation progress. New socio-economic trends are presented, as well as emerging legal challenges.

    Spoken content retrieval: A survey of techniques and technologies

    Speech media, that is, digital audio and video containing spoken content, has blossomed in recent years. Large collections are accruing on the Internet as well as in private and enterprise settings. This growth has motivated extensive research on techniques and technologies that facilitate reliable indexing and retrieval. Spoken content retrieval (SCR) requires the combination of audio and speech processing technologies with methods from information retrieval (IR). SCR research initially investigated planned speech structured in document-like units, but has subsequently shifted focus to more informal spoken content produced spontaneously, outside of the studio and in conversational settings. This survey provides an overview of the field of SCR, encompassing component technologies, the relationship of SCR to text IR and automatic speech recognition, and user interaction issues. It is aimed at researchers with backgrounds in speech technology or IR who are seeking deeper insight into how these fields are integrated to support research and development, thus addressing the core challenges of SCR.
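    The combination the survey describes, speech processing feeding standard text IR, can be compressed into a small sketch: transcripts (here hand-written stand-ins for ASR output) are indexed with TF-IDF and ranked by cosine similarity against a text query. The transcripts and query are invented for illustration.

```python
# Hedged sketch of a minimal SCR pipeline: (pretend) ASR transcripts
# indexed with TF-IDF, then queried exactly like a text IR collection.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Stand-ins for automatic speech recognition output, one per recording.
transcripts = [
    "welcome to the lecture on neural networks and deep learning",
    "today we discuss the history of jazz and improvisation",
    "this talk covers information retrieval and ranking functions",
]

vectorizer = TfidfVectorizer()
index = vectorizer.fit_transform(transcripts)  # TF-IDF index

query = vectorizer.transform(["information retrieval ranking"])
scores = cosine_similarity(query, index).ravel()
best = int(scores.argmax())
print(best)  # index of the best-matching recording
```

    Real SCR systems must additionally cope with ASR errors, e.g. by indexing word lattices or phonetic representations rather than a single 1-best transcript.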