38 research outputs found

    Méthode rapide de segmentation et d'indexation du flux MPEG1-2 par bloc DCT

    The accessibility of multimedia data depends on precise indexing, which is very time-consuming. This article proposes a new method for automatically filling in several MPEG7 fields (for example, camera motion and object motion). We make maximum use of the information contained in the MPEG1-2 stream [1]. To speed up computation, the images are not decompressed; we limit ourselves to entropy decoding and inverse quantization. The motion information present in the MPEG1-2 stream makes it possible to estimate the apparent motion of the camera. Colour regions are segmented with a split-and-merge algorithm. The values of the DCT coefficients are also used

    Audio-Visual VQ Shot Clustering for Video Programs

    Many post-production video documents such as movies, sitcoms and cartoons present well-structured story-lines organized in separate audio-visual scenes. Accurate grouping of shots into these logical video segments could lead to semantic indexing of scenes and events for interactive multimedia retrieval. In this paper we introduce a novel shot-based analysis approach which aims to cluster together shots with similar audio-visual content. We demonstrate that codebooks of audio and visual codewords (generated by a vector quantization process) are an effective way to represent clusters containing shots with similar long-term consistency of chromatic composition and audio. The output clusters, obtained by a simple single-link clustering algorithm, allow the further application of the well-known scene transition graph framework for scene change detection and shot-pattern investigation. Finally, merging the audio and visual results leads to a hierarchical description of the whole video document, useful for multimedia retrieval and summarization purposes.
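
The single-link clustering step mentioned in this abstract can be illustrated with a minimal sketch. It assumes the shots have already been compared and that a symmetric matrix of pairwise distances is available; the matrix, the threshold and the function name `single_link_clusters` are hypothetical and are not taken from the paper. The sketch relies on the fact that cutting a single-link hierarchy at a threshold gives the connected components of the graph linking every pair at or below that threshold.

```python
from itertools import combinations


def single_link_clusters(distances, threshold):
    """Return clusters (lists of item indices) for a single-link cut.

    `distances` is a symmetric square matrix (list of lists). Two items end
    up in the same cluster when a chain of pairs, each with distance
    <= threshold, connects them. A union-find structure tracks the merges.
    """
    n = len(distances)
    parent = list(range(n))

    def find(i):
        # Follow parent links to the root, shortening the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(n), 2):
        if distances[i][j] <= threshold:
            parent[find(i)] = find(j)

    clusters = {}
    for i in range(n):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


if __name__ == "__main__":
    # Hypothetical distances between four shots.
    shot_distances = [
        [0, 1, 5, 6],
        [1, 0, 4, 5],
        [5, 4, 0, 1],
        [6, 5, 1, 0],
    ]
    print(single_link_clusters(shot_distances, threshold=2))
```

With a low threshold the shots split into tight groups; raising it lets a single close pair bridge two groups, which is the chaining behaviour typical of single-link clustering.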

    Robotic Goal-Based Semi-Autonomous Algorithms Improve Remote Operator Performance

    The focus of this research was to determine whether reliable goal-based semi-autonomous algorithms can improve remote operator performance. Two semi-autonomous algorithms were examined: visual servoing and visual dead reckoning. Visual servoing uses computer vision techniques to generate movement commands while using internal properties of the camera combined with sensor data that tell the robot its current position based on its previous position. This research shows that the semi-autonomous algorithms developed increased performance in a measurable way. An analysis of tracking algorithms for visual servoing was conducted, and the tracking algorithms were enhanced to make them as robust as possible. The developed algorithms were implemented on a currently fielded military robot, and a human-in-the-loop experiment was conducted to measure performance.
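
The idea of deriving the current position from the previous one can be shown with a generic dead-reckoning update. This is only a textbook planar sketch, not the visual dead reckoning developed in the thesis: the pose layout `(x, y, heading)`, the function name `dead_reckon` and the inputs (distance travelled and change of heading per step) are all assumptions made for illustration.

```python
import math


def dead_reckon(pose, distance, turn):
    """Advance a planar pose by one step of dead reckoning.

    `pose` is (x, y, heading) with heading in radians. The robot is assumed
    to turn by `turn` radians first and then travel `distance` in a straight
    line along the new heading.
    """
    x, y, heading = pose
    heading += turn
    return (
        x + distance * math.cos(heading),
        y + distance * math.sin(heading),
        heading,
    )


if __name__ == "__main__":
    pose = (0.0, 0.0, 0.0)
    # Hypothetical sensor readings: (distance travelled, change of heading).
    for distance, turn in [(1.0, 0.0), (2.0, math.pi / 2)]:
        pose = dead_reckon(pose, distance, turn)
    print(pose)
```

Because each pose is computed from the previous one, measurement errors accumulate over time, which is one reason such estimates are usually combined with other sensing.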

    Temporal Feature Integration for Music Organisation


    Cloud media video encoding: review and challenges

    In recent years, Internet traffic patterns have been changing. Most of the traffic demanded by end users is multimedia; in particular, video streaming accounts for over 53%. This demand has led to improved network infrastructures and computing architectures to meet the challenges of delivering these multimedia services while maintaining an adequate quality of experience. With regard to the preparation and adaptation of multimedia content for broadcasting, Cloud and Edge Computing infrastructures have been and will be crucial to offering high and ultra-high definition multimedia content in live, real-time, or video-on-demand scenarios. For these reasons, this review paper presents a detailed study of research papers related to encoding and transcoding techniques in cloud computing environments. It begins by discussing the evolution of streaming and the importance of the encoding process, with a focus on the latest streaming methods and codecs. Then, it examines the role of cloud systems in multimedia environments and provides details on the cloud infrastructure for media scenarios. A systematic literature review identified 49 valid papers that meet the requirements specified in the research questions. Each paper has been analyzed and classified according to several criteria and assessed for relevance. To conclude this review, we have identified and elaborated on several challenges and open research issues associated with the development of video codecs optimized for diverse factors within both cloud and edge architectures. Additionally, we have discussed emerging challenges in designing new cloud/edge architectures aimed at more efficient delivery of media traffic. This involves investigating ways to improve the overall performance, reliability, and resource utilization of architectures that support the transmission of multimedia content over both cloud and edge computing environments, ensuring a good quality of experience for the end user.

    Max-Planck-Institute for Psycholinguistics: Annual Report 2001

    No full text

    Segmentation sémantique des contenus audio-visuels

    In this work we developed a method for semantic segmentation of audiovisual content applicable to consumer electronics storage devices. For this solution we first researched a service-oriented distributed multimedia content analysis framework composed of individual content analysis modules, i.e. Service Units. One of these was dedicated to identifying non-content-related inserts, i.e. commercial blocks, and achieved high performance. In a subsequent step we researched and benchmarked various Shot Boundary Detectors and implemented the best-performing one as a Service Unit. Thereafter, our study of production rules, i.e. film grammar, provided insight into Parallel Shot sequences, i.e. Cross-Cuttings and Shot-Reverse-Shots. We researched and benchmarked four similarity-based clustering methods, two colour-based and two feature-point-based, in order to retain the best one for our final solution. Finally, we researched several audiovisual Scene Boundary Detector methods and achieved the best results by combining a colour-based method with a shot-length-based criterion. This Scene Boundary Detector identified semantic scene boundaries with a robustness of 66% for movies and 80% for series, which proved to be sufficient for our envisioned application, Advanced Content Navigation.

    Feedback-Based Gameplay Metrics and Gameplay Performance Segmentation: An audio-visual approach for assessing player experience.

    Gameplay metrics is a method and approach that is growing in popularity amongst the game studies research community for its capacity to assess players' engagement with game systems. Yet little has been done, to date, to quantify players' responses to the feedback that games use to convey information to players, i.e., their audio-visual streams. The present thesis introduces a novel approach to player experience assessment, termed feedback-based gameplay metrics, which seeks to gather gameplay metrics from the audio-visual feedback streams presented to the player during play. So far, gameplay metrics (quantitative data about a game state and the player's interaction with the game system) have been logged directly via the game's source code. The need to utilise source code restricts the range of games that researchers can analyse. By using computer science algorithms for audio-visual processing, which have yet to be employed for processing gameplay footage, the present thesis seeks to extract similar metrics from the audio-visual streams, thus circumventing the need for access to the source code, whilst also proposing a method that focuses on describing the way gameplay information is broadcast to the player during play. In order to operationalise feedback-based gameplay metrics, the present thesis introduces the concept of gameplay performance segmentation, which describes how coherent segments of play can be identified and extracted from lengthy gameplay sessions. Moreover, in order both to contextualise the method for processing metrics and to provide a conceptual framework for analysing the results of a feedback-based gameplay metric segmentation, a multi-layered architecture based on five gameplay concepts (system, game world instance, spatial-temporal, degree of freedom and interaction) is also introduced. Finally, based on data gathered from gameplay sessions with participants, the present thesis discusses the validity of feedback-based gameplay metrics, gameplay performance segmentation and the multi-layered architecture. A software system has also been specifically developed to produce gameplay summaries based on feedback-based gameplay metrics, and examples of summaries (based on several games) are presented and analysed. The present thesis also demonstrates that feedback-based gameplay metrics can be analysed jointly with other forms of data (such as biometry) in order to build a more complete picture of the gameplay experience. Feedback-based gameplay metrics constitute a post-processing approach that allows the researcher or analyst to explore the data however they wish and as many times as they wish. The method is also able to process any audio-visual file, and can therefore process material from a range of audio-visual sources. This novel methodology brings together game studies and computer science by extending the range of games that can now be researched and by providing a viable solution that accounts for the exact way players experience games.

    Distributed multimedia systems

    A distributed multimedia system (DMS) is an integrated communication, computing, and information system that enables the processing, management, delivery, and presentation of synchronized multimedia information with quality-of-service guarantees. Multimedia information may include discrete media data, such as text, data, and images, and continuous media data, such as video and audio. Such a system enhances human communications by exploiting both visual and aural senses and provides the ultimate flexibility in work and entertainment, allowing one to collaborate with remote participants, view movies on demand, access on-line digital libraries from the desktop, and so forth. In this paper, we present a technical survey of a DMS. We give an overview of distributed multimedia systems, examine the fundamental concept of digital media, identify the applications, and survey the important enabling technologies.