6 research outputs found

    A Genetic Algorithm and Fuzzy Logic Approach for Video Shot Boundary Detection

    This paper proposes a shot boundary detection approach based on a Genetic Algorithm (GA) and Fuzzy Logic. The membership functions of the fuzzy system are computed by the GA from pre-observed actual values for shot boundaries, and the fuzzy system then classifies the types of shot transitions. Experimental results show that the accuracy of shot boundary detection increases with the number of iterations (generations) of the GA optimization process. The proposed system is compared with recent techniques and yields better results in terms of F1 score.
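
The F1 score mentioned in this abstract can be sketched as follows. This is a minimal illustration, not the paper's evaluation code: the function name `f1_score`, the frame-index representation of boundaries, and the matching `tolerance` of 2 frames are all assumptions made for the example.

```python
def f1_score(detected, ground_truth, tolerance=2):
    """Return (precision, recall, F1) for shot boundary frame indices.

    Assumption: a detection counts as a hit if it lies within `tolerance`
    frames of a ground-truth boundary that has not been matched yet.
    """
    unmatched = sorted(ground_truth)
    true_positives = 0
    for frame in sorted(detected):
        # Take the first not-yet-matched ground-truth boundary within tolerance
        match = next((g for g in unmatched if abs(g - frame) <= tolerance), None)
        if match is not None:
            unmatched.remove(match)
            true_positives += 1
    precision = true_positives / len(detected) if detected else 0.0
    recall = true_positives / len(ground_truth) if ground_truth else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    # Hypothetical detections and ground truth, for illustration only
    print(f1_score([10, 52, 200], [11, 50, 120, 300]))
```

F1 is the harmonic mean of precision and recall, so it penalizes a detector that reports many false boundaries as well as one that misses true ones.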

    Efficient compression of synthetic video

    Streaming on-line gaming video is challenging because of the enormous amount of video data that must be sent during game play, especially within the limits of uplink capacity. Encoding complexity is also a challenge because of the delay it introduces while on-line gamers are communicating. The main goal of this research is to propose an enhanced on-line game video streaming system. First, the most common video coding techniques are evaluated using objective and subjective metrics. Three widespread video coding techniques are selected: H.264, MPEG-4 Visual and VP8. Diverse types of video sequences were used, with different frame rates and resolutions. The effects of changing frame rate and resolution on compression efficiency and viewers' satisfaction are also presented. Results showed that the compression process and perceptual satisfaction are strongly affected by the nature of the compressed sequence. H.264 showed higher compression efficiency for synthetic sequences and outperformed the other codecs in the subjective evaluation tests. Second, a fast inter prediction technique is devised to speed up H.264 encoding. On-line game streaming is a real-time application, so compression complexity significantly affects the whole streaming process. Although the comparative codec study recommends H.264 for synthetic video coding, it still suffers from high encoding complexity; a low-complexity coding algorithm is therefore presented as a fast inter coding model with a reference management technique. Compared with a state-of-the-art method, the proposed algorithm achieved better reductions in encoding time and bit rate with negligible loss of fidelity.
Third, recommendations on the tradeoff between frame rate and resolution within given uplink capabilities are provided for H.264 video coding. The recommended tradeoffs result from extensive experiments using the Double Stimulus Impairment Scale (DSIS) subjective evaluation metric. Experiments showed that viewers' satisfaction is profoundly affected by varying frame rates and resolutions, and that increasing frame rate or resolution does not always improve perceptual quality. Tradeoffs are therefore recommended that balance frame rate and resolution within a given bit rate to obtain the highest user satisfaction. For system completeness and to facilitate implementation of the proposed techniques, an efficient game video streaming management system is proposed. Compared to existing on-line live video service systems for games, the proposed system provides improved coding efficiency, reduced complexity and better user satisfaction.
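
The frame rate versus resolution tradeoff under a fixed bit rate can be illustrated with simple arithmetic. The sketch below is not the thesis's method, which relies on DSIS subjective tests; it only shows why the tradeoff exists, since the same bit budget is spread over more pixels as either frame rate or resolution grows. The uplink budget, the candidate settings and the function names are hypothetical.

```python
def bits_per_pixel(bitrate_bps, width, height, fps):
    """Average number of bits available per pixel per frame at a fixed bit rate."""
    return bitrate_bps / (width * height * fps)


def rank_by_bits_per_pixel(bitrate_bps, candidates):
    """Sort (width, height, fps) settings from most to fewest bits per pixel."""
    return sorted(
        candidates,
        key=lambda c: bits_per_pixel(bitrate_bps, *c),
        reverse=True,
    )


# Hypothetical uplink budget and candidate settings (illustrative values only)
UPLINK_BPS = 2_000_000
CANDIDATES = [(1920, 1080, 30), (1280, 720, 60), (1280, 720, 30)]

if __name__ == "__main__":
    for width, height, fps in rank_by_bits_per_pixel(UPLINK_BPS, CANDIDATES):
        bpp = bits_per_pixel(UPLINK_BPS, width, height, fps)
        print(f"{width}x{height} @ {fps} fps: {bpp:.4f} bits/pixel")
```

Bits per pixel is only a rough proxy for coding pressure. As the abstract notes, perceived quality does not necessarily follow it, which is why the recommendations rest on subjective evaluation.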

    Le référencement en langue des signes : analyse et reconnaissance du pointé

    This thesis focuses on the role and analysis of gaze in sign language, where it plays an important role. In any language, gaze helps maintain the communicative relationship. In sign language it also structures the discourse or the interaction between signers by taking on complex linguistic functions. We focus on the referencing role, which consists of putting the focus on an element of the discourse. In sign language, the elements of the discourse are spatialized in the signing space; putting the focus on an element of the discourse therefore amounts to identifying and activating its spatial location (locus), which mobilizes one or more body components: the hands, shoulders, head and gaze. We therefore analyzed the concept of referencing in its manual and/or non-manual forms and built a recognition system for referencing structures that takes a sign language video as input.
The recognition system consists of three steps: 1) 3D modeling of the concept of referencing, 2) transformation of the 3D model into a 2D appearance model usable by a 2D processing program, and 3) detection, which uses this appearance model. Modeling consists of extracting gestural characteristics of referencing from corpora composed of 3D motion and gaze capture, manually annotated from videos. It covers the description of the body components that play a role in referencing and the quantification of some of their gestural properties. The resulting models describe: 1) the dynamics of the dominant hand's movement, 2) the spatial proximity between body components and the spatialized discourse element (locus), and 3) the time lags between the onsets of motions. The recognition method integrates these 3D models. Since the models are three-dimensional and the input to the recognition system is a 2D video, we propose a transformation of the 3D models into 2D so that they can be used to analyze the 2D video and recognize referencing structures. A recognition algorithm can then be applied to the 2D video corpora. The recognition results take the form of time intervals, and two main variants of referencing are observed. This pioneering work on the characterization and detection of referencing structures would need to be extended to much larger, more consistent and richer corpora, with more sophisticated classification methods. It nevertheless produced a reusable analysis methodology.
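
The abstract does not say how the 3D models are transformed into 2D. As a minimal sketch of one common option, the example below uses a pinhole (perspective) projection; this choice, the camera-centered coordinates, the `focal_length` parameter and the function names are assumptions for illustration, not the thesis's actual transformation.

```python
import math


def project_point(point_3d, focal_length=1.0):
    """Project a camera-space 3D point (x, y, z) onto the image plane.

    Assumption: a simple pinhole camera looking along +z, with no lens
    distortion and the principal point at the origin.
    """
    x, y, z = point_3d
    if z <= 0:
        raise ValueError("point must be in front of the camera (z > 0)")
    return (focal_length * x / z, focal_length * y / z)


def image_distance(a_3d, b_3d, focal_length=1.0):
    """2D distance between the projections of two 3D points,
    e.g. a hand position and a locus in the signing space."""
    ax, ay = project_point(a_3d, focal_length)
    bx, by = project_point(b_3d, focal_length)
    return math.hypot(ax - bx, ay - by)


if __name__ == "__main__":
    hand = (0.2, 0.1, 1.5)    # hypothetical hand position, in metres
    locus = (0.4, 0.0, 1.2)   # hypothetical locus position, in metres
    print(project_point(hand), image_distance(hand, locus))
```

Under such a projection, a 3D proximity measure between a body component and the locus becomes a 2D distance that also depends on depth, so any threshold learned in 3D would need to be re-expressed for the 2D video.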