
    Region-based representations of image and video: segmentation tools for multimedia services

    This paper discusses region-based representations of image and video that are useful for multimedia services such as those supported by the MPEG-4 and MPEG-7 standards. Classical tools for generating region-based representations are discussed. After a description of the main processing steps and the corresponding choices in terms of feature spaces, decision spaces, and decision algorithms, the state of the art in segmentation is reviewed, focusing mainly on tools useful in the context of the MPEG-4 and MPEG-7 standards. The review is structured around the strategies used by the algorithms (transition-based or homogeneity-based) and the decision spaces (spatial, spatio-temporal, and temporal). The second part of the paper proposes a partition-tree representation of images and introduces a processing strategy that involves a similarity-estimation step followed by a partition-creation step. This strategy seeks a compromise between what can be done in a systematic and universal way and what has to be application dependent. In particular, it is shown how a single partition tree created with an extremely simple similarity feature can support a large number of segmentation applications: spatial segmentation, motion estimation, region-based coding, semantic object extraction, and region-based retrieval. Peer reviewed. Postprint (published version).
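The partition-tree strategy above builds a hierarchy by repeatedly merging the most similar neighbouring regions and recording each merge as a tree node. The following sketch shows that bottom-up construction on a 1-D "image", using mean intensity as a deliberately simple similarity feature; both the data layout and the similarity choice are illustrative assumptions, not the paper's exact design.

```python
# A minimal sketch of bottom-up partition-tree construction by greedy region
# merging. The 1-D pixel row and the mean-intensity similarity are assumptions.

def build_partition_tree(pixels):
    """pixels: 1-D list of intensities; each pixel starts as its own region.
    Returns the merge sequence as (left_id, right_id, new_id) triples."""
    regions = {i: [v] for i, v in enumerate(pixels)}      # id -> member intensities
    adjacency = {(i, i + 1) for i in range(len(pixels) - 1)}
    next_id = len(pixels)
    merges = []
    while len(regions) > 1:
        mean = {r: sum(v) / len(v) for r, v in regions.items()}
        # merge the most similar adjacent pair (smallest mean difference)
        a, b = min(adjacency, key=lambda e: abs(mean[e[0]] - mean[e[1]]))
        regions[next_id] = regions.pop(a) + regions.pop(b)
        # rewire surviving edges to the merged region, dropping the merged edge
        adjacency = {(next_id if x in (a, b) else x, next_id if y in (a, b) else y)
                     for (x, y) in adjacency if {x, y} != {a, b}}
        merges.append((a, b, next_id))
        next_id += 1
    return merges

# two dark pixels and two bright ones merge pairwise before joining at the root
merges = build_partition_tree([10, 12, 200, 210])
```

The recorded merge sequence is exactly the tree: leaves are pixels, internal nodes are merged regions, and cutting the tree at different levels yields partitions at different granularities, which is what lets one tree serve many applications.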

    Gradual Transition Detection for Video Partitioning Using Morphological Operators


    Comparison of segmentation algorithms for digital video indexing and implementation of a new algorithm

    Key words: Video indexing, Video segmentation, Rapid access to video databases, Semantic retrieval of video
    Because digital video is cheap and easy to record and store, it has replaced analog video, and archives of thousands of hours of digital video have emerged. Fast access to these video databases makes semantic search inside video a pressing need. Since standard image- and text-search techniques cannot be applied inside digital video, interest in video indexing has grown and many studies have appeared in the literature. Digital video indexing is performed by segmenting videos and extracting summary information for each segment; accurate segmentation is therefore the foundation and the most important part of video indexing. In this study, segmentation algorithms that operate on uncompressed video were examined and their performance compared. An interface was designed to make practical application of the algorithms possible, and a second interface runs the performance comparisons automatically and presents the results to the user as numbers and graphs. In addition, a Filtered Video Histogram Comparison algorithm was designed which achieves better segmentation results than the existing video segmentation algorithms.
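The thesis names its algorithm Filtered Video Histogram Comparison but describes it only at a high level here. One plausible reading, sketched below, is to low-pass filter each frame before histogramming so that pixel noise does not inflate the frame-to-frame histogram distance, then declare a cut wherever the normalised distance exceeds a threshold. The 4-bin histograms, 3-tap smoothing, and 0.5 threshold are assumed values, not the thesis's parameters.

```python
# Sketch of cut detection by filtered histogram comparison: smooth each frame,
# compare consecutive frame histograms with a normalised L1 distance, and
# report a cut where the distance exceeds a threshold.

def smooth(frame):
    """3-tap moving average over a 1-D pixel sequence (noise suppression)."""
    return [sum(frame[max(0, i - 1): i + 2]) // len(frame[max(0, i - 1): i + 2])
            for i in range(len(frame))]

def histogram(frame, bins=4, max_val=256):
    h = [0] * bins
    for v in frame:
        h[v * bins // max_val] += 1
    return h

def detect_cuts(frames, threshold=0.5):
    """frames: list of equal-length 1-D pixel lists; returns cut frame indices."""
    hists = [histogram(smooth(f)) for f in frames]
    cuts = []
    for i in range(1, len(hists)):
        # normalised L1 histogram distance, in [0, 1]
        d = sum(abs(x - y) for x, y in zip(hists[i - 1], hists[i])) / (2 * len(frames[0]))
        if d > threshold:
            cuts.append(i)
    return cuts
```

On a toy sequence of three dark frames followed by three bright frames, `detect_cuts` reports a single cut at the transition; histogram comparison of this kind is insensitive to motion within a shot, which is why it is a common baseline for hard-cut detection.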

    Audio-coupled video content understanding of unconstrained video sequences

    Unconstrained video understanding is a difficult task. The main aim of this thesis is to recognise the nature of the objects, activities, and environment in a given video clip using both audio and video information. Traditionally, audio and video information has not been applied together to solve such complex tasks, and for the first time we propose, develop, implement, and test a new framework of multi-modal (audio and video) data analysis for context understanding and labelling of unconstrained videos. The framework relies on feature-selection techniques and introduces a novel algorithm (PCFS) that is faster than the well-established SFFS algorithm. We use the framework to study the benefits of combining audio and video information in a number of different problems. We begin by developing two independent content-recognition modules. The first is based on image-sequence analysis alone and uses a range of colour, shape, texture, and statistical features from image regions with a trained classifier to recognise the objects, activities, and environment present. The second module uses audio information only, and recognises activities and environment. Both approaches are preceded by detailed pre-processing to ensure that correct video segments containing both audio and video content are present, and that the developed system is robust to changes in camera movement, illumination, random object behaviour, etc. For both audio and video analysis, we use a hierarchical approach of multi-stage classification, so that difficult classification tasks can be decomposed into simpler and smaller ones. When combining the two modalities, we compare fusion techniques at different levels of integration and propose a novel algorithm that combines the advantages of both feature- and decision-level fusion. The analysis is evaluated on a large amount of test data comprising unconstrained videos collected for this work. Finally, we propose a decision-correction algorithm which shows that further steps towards combining multi-modal classification information with semantic knowledge generate the best possible results.
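PCFS and SFFS are wrapper-style feature selectors. As a minimal illustration of the family they belong to, here is plain sequential forward selection (SFS), which greedily adds whichever feature most improves a scoring function and stops when no candidate helps. The toy class-separation score and the data in the test are assumptions for illustration, not the thesis's algorithm.

```python
# Plain sequential forward selection (SFS): greedily grow the feature subset
# while the score improves. Score function and data are illustrative only.

def class_separation(subset, features, labels):
    """Toy criterion: summed between-class separation of the feature means."""
    sep = 0.0
    for f in subset:
        pos = [v for v, y in zip(features[f], labels) if y == 1]
        neg = [v for v, y in zip(features[f], labels) if y == 0]
        sep += abs(sum(pos) / len(pos) - sum(neg) / len(neg))
    return sep

def forward_select(features, labels, score, k):
    """features: dict name -> value list; returns up to k chosen feature names."""
    chosen, remaining = [], sorted(features)
    while len(chosen) < k and remaining:
        best = max(remaining, key=lambda f: score(chosen + [f], features, labels))
        if chosen and score(chosen + [best], features, labels) <= score(chosen, features, labels):
            break                      # no candidate improves the subset: stop
        chosen.append(best)
        remaining.remove(best)
    return chosen
```

SFFS extends this scheme with backward "floating" steps that can remove a previously chosen feature, which is what makes it slower, and what a faster variant like PCFS would aim to avoid.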

    Driver Behaviour and Road Safety Analysis Using Computer Vision and Applications in Roundabout Safety

    One of the main challenges of traditional road safety analysis based on historical accident records is its dependence on the occurrence and subsequent observation of real traffic collisions. Traffic collisions are not only undesirable, they are also difficult to observe. Surrogate safety analysis is gaining traction in the research community as a proactive alternative: collisions are instead predicted indirectly via precursors found in everyday traffic scenarios, such as collision courses, near misses, and traffic conflicts. Surrogate safety analysis also gives researchers insight into collision mechanisms, allowing a better understanding of the factors that lead to accidents. However, the wide range of collision-precursor definitions, together with inconsistent and sometimes subjective data collection methods, has hampered the adoption of surrogate safety analysis.

    Semantic segmentation of audiovisual content

    In this work we developed a method for semantic segmentation of audiovisual content applicable to consumer-electronics storage devices. We first built a service-oriented distributed multimedia content analysis framework composed of individual content-analysis modules, the Service Units. One of these was dedicated to identifying non-content inserts, i.e. commercial blocks, and reached high performance. We then benchmarked several Shot Boundary Detectors and implemented the best-performing one as a Service Unit. A study of film production rules, i.e. film grammar, provided insight into Parallel Shot sequences such as Cross-Cuttings and Shot-Reverse-Shots. We benchmarked four similarity-based clustering methods, two colour-based and two feature-point-based, to retain the best one for the final solution. Finally, we investigated several audiovisual Scene Boundary Detection methods and obtained the best results by combining a colour-based method with a shot-length criterion. This Scene Boundary Detector identified semantic scene boundaries with a robustness of 66% for movies and 80% for series, which proved sufficient for the envisioned application, Advanced Content Navigation.
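The combination of a colour-based method with a shot-length criterion can be sketched as follows: a scene boundary is declared between consecutive shots when their colour dissimilarity exceeds a threshold, with a long preceding shot lowering the colour evidence required. The threshold values and the halving rule are assumptions, not the thesis's exact criterion.

```python
# Sketch of scene-boundary detection combining colour distance between
# consecutive shots with a shot-length prior. Parameters are assumed values.

def scene_boundaries(shots, colour_thresh=100.0, long_shot=50):
    """shots: list of (mean_rgb, length_in_frames); returns boundary indices."""
    boundaries = []
    for i in range(1, len(shots)):
        (c_prev, n_prev), (c_cur, _) = shots[i - 1], shots[i]
        colour_dist = sum(abs(a - b) for a, b in zip(c_prev, c_cur))
        # a long preceding shot makes a scene break more plausible,
        # so it halves the colour evidence required
        thresh = colour_thresh * (0.5 if n_prev >= long_shot else 1.0)
        if colour_dist > thresh:
            boundaries.append(i)
    return boundaries
```

The design intuition is that within a scene, shots share a colour palette and alternate quickly (e.g. shot-reverse-shot dialogue), whereas a scene change typically follows a longer establishing or closing shot and lands in a different palette.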

    Large-scale video sequence indexing: Impacts, ideas and trends

    With advances in hardware (e.g., the wide availability of webcams) and software (e.g., video-editing and instant-messaging software), the amount of video data has grown rapidly in many fields, such as broadcasting, advertising, filming, personal video archives, and medical/scientific video repositories. In addition, the Web has had an enormous impact by popularizing video publishing and sharing (e.g., on social-networking websites), and online delivery of video content has surged to an unprecedented level. The wide availability of video data fuels many novel applications, such as near-duplicate video detection, in-video advertising, video recommendation, and web video search. With these demanding applications, how to manage large-scale video databases and search for similar video content is of utmost importance. Although content-based video search has recently attracted plenty of attention, the high complexity of video data, coupled with its large volume, poses huge challenges for large-scale video sequence search. As the volume of video data continues to grow rapidly, the demand from the database community for efficient indexing of large-scale video databases is increasingly pressing. In this talk, we look at the problem of effective indexing support for large-scale video sequence search in its various forms: clip matching, subsequence matching, and continuous stream matching. Its impacts and challenges are discussed, followed by our recent ideas and developments for tackling the problem, as well as future trends.
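Of the three search forms mentioned, subsequence matching is the easiest to sketch: slide a query of per-frame signatures over a database sequence and return the best-aligned offset. One scalar signature per frame and plain L1 distance are simplifying assumptions; the linear scan below is exactly the cost that the indexing structures discussed in the talk exist to avoid.

```python
# Brute-force subsequence matching over per-frame signatures: try every
# alignment of the query against the database and keep the closest one.

def best_match(database, query):
    """Returns (offset, distance) of the database window closest to query."""
    best = (None, float("inf"))
    for off in range(len(database) - len(query) + 1):
        # L1 distance between the query and the aligned database window
        d = sum(abs(a - b) for a, b in zip(database[off:off + len(query)], query))
        if d < best[1]:
            best = (off, d)
    return best
```

This runs in O(n·m) for a database of n frames and a query of m frames; at web scale, signature indexing trades this exhaustive scan for approximate candidate retrieval followed by verification.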

    Combining Audio And Video For Video Sequence Indexing Applications

    In this paper we address the problem of detecting shots of subjects being interviewed in news sequences. This is useful because such scenes usually contain important, reusable information that can be exploited in other news programmes. In a previous paper, we presented a technique based on a priori knowledge of the editing conventions used in news sequences, which allowed a fast search for news stories. In this paper we present a new shot-descriptor technique which improves on the previous search results using a simple yet efficient algorithm based on the information contained in consecutive frames. Results are provided that demonstrate the validity of the approach.

    FOVEA: A Video Frame Organizer Via Identity Extraction and Analysis

    In this work we present FOVEA, a system for video sequence indexing based on the identity of the appearing subjects. We detail the features of the individual component modules, devoting special attention to classification and clustering. The system has a number of peculiarities that distinguish it from existing ones, among them the account taken of the concomitant appearance of two subjects in the same clip, and the use of this information in the identity-mapping process. FOVEA was tested on 16 video clips, with results that confirm both its efficacy and its efficiency. © 2011 IEEE
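The co-appearance cue mentioned above is naturally encoded as a cannot-link constraint in agglomerative clustering: tracks of two subjects seen together in one clip may never be merged into a single identity. The scalar track features, the greedy merging rule, and the threshold below are illustrative assumptions, not FOVEA's actual pipeline.

```python
# Greedy agglomerative identity clustering with cannot-link constraints:
# co-appearing face tracks can never end up in the same identity cluster.

def cluster_identities(tracks, cannot_link, threshold):
    """tracks: dict track_id -> feature; cannot_link: set of frozenset pairs."""
    clusters = {i: {i} for i in tracks}

    def dist(a, b):
        fa = [tracks[t] for t in clusters[a]]
        fb = [tracks[t] for t in clusters[b]]
        return abs(sum(fa) / len(fa) - sum(fb) / len(fb))

    def allowed(a, b):   # no forbidden track pair may share a cluster
        return all(frozenset((x, y)) not in cannot_link
                   for x in clusters[a] for y in clusters[b])

    while True:
        pairs = [(dist(a, b), a, b) for a in clusters for b in clusters
                 if a < b and allowed(a, b)]
        if not pairs or min(pairs)[0] > threshold:
            break
        _, a, b = min(pairs)
        clusters[a] |= clusters.pop(b)
    return sorted(sorted(c) for c in clusters.values())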