11 research outputs found

    A generalized gamma correction algorithm based on the SLIP model


    An In-Vehicle Vision-Based Driver's Drowsiness Detection System

    Many traffic accidents have been reported due to driver drowsiness and fatigue. Drowsiness degrades driving performance through declining visibility, situational awareness and decision-making capability. In this study, a vision-based drowsiness detection and warning system is presented, which attempts to bring a driver's attention to his/her own potential drowsiness. The information provided by the system can also be utilized by adaptive systems to manage noncritical operations, such as starting a ventilator, spreading fragrance, turning on a radio, and providing entertainment options. In high-drowsiness situations, the system may initiate navigation aids and alert others to the drowsiness of the driver. The system estimates the fatigue level of a driver from facial images acquired by a video camera mounted in the front of the vehicle. There are five major steps in the system: preprocessing, facial feature extraction, face tracking, parameter estimation, and reasoning. In the preprocessing step, the input image is sub-sampled to reduce the image size and in turn the processing time. A lighting compensation process is next applied to the reduced image to remove the influence of ambient illumination variations. Afterwards, a number of chrominance values are calculated for each image pixel, which are used in the next step to detect facial features. Four sub-steps constitute the feature extraction step: skin detection, face localization, eye and mouth detection, and feature confirmation. To begin, skin areas are located in the image based on the chrominance values of pixels calculated in the previous step and a predefined skin model. We next search for the face region within the largest skin area. However, the detected face is typically imperfect, and facial feature detection within the imperfect face region is unreliable.
We therefore look for facial features throughout the entire image; the face region is later used to confirm the detected facial features. Once facial features are located, they are tracked over the video sequence until they fail to be detected in a video image, at which point the facial feature detection process is invoked again. Although facial feature detection is time-consuming, facial feature tracking is fast and reliable. During facial feature tracking, parameters of facial expression are estimated, including the percentage of eye closure over time, eye blinking frequency, durations of eye closure, gaze and mouth opening, as well as head orientation. The estimated parameters are then utilized in the reasoning step to determine the driver's drowsiness level. A fuzzy integral technique is employed, which integrates the various parameter values to arrive at a decision about the drowsiness level of the driver. A number of video sequences of different drivers and illumination conditions have been tested. The results reveal that the system works reasonably well in daytime. In future work, the system may be extended to nighttime use, for which infrared sensors would be required.
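The eye-related parameters named above (percentage of eye closure over time, blink frequency, closure duration) can all be derived from a per-frame closed-eye flag sequence. The sketch below is an illustration of that parameter-estimation step under this assumption, not the authors' implementation; the function and key names are hypothetical.

```python
def drowsiness_parameters(eye_closed, fps):
    """Estimate eye-based drowsiness parameters from a per-frame
    closed-eye flag sequence (True = eyes closed in that frame)."""
    n = len(eye_closed)
    perclos = sum(eye_closed) / n  # percentage of eye closure over time
    # Count blinks as runs of closed frames; track the longest closure.
    blinks, run, longest = 0, 0, 0
    for closed in eye_closed:
        if closed:
            run += 1
            longest = max(longest, run)
        else:
            if run:
                blinks += 1
            run = 0
    if run:  # sequence ended mid-closure
        blinks += 1
    duration_s = n / fps
    return {
        "perclos": perclos,
        "blink_rate_hz": blinks / duration_s,
        "longest_closure_s": longest / fps,
    }
```

A fuzzy-integral reasoner, as in the paper, would then fuse these values with gaze, mouth-opening and head-orientation cues to grade the drowsiness level.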

    Théorie de l’évidence pour suivi de visage (Evidence theory for face tracking)

    This paper deals with real-time face detection and tracking by a video camera. The method is based on a simple and fast initializing stage for learning. The transferable belief model is used to deal with the incompleteness of the prior face model due to the lack of exhaustiveness of the learning stage. The algorithm works in two steps. The detection phase synthesizes an evidential face model by merging basic beliefs elaborated from the Viola and Jones face detector and from colour mass functions. These functions are computed from information sources in a logarithmic colour space. To deal with the dependence of the colour information in the fusion process, we propose a compromise operator close to the Denœux cautious rule. As regards the tracking phase, the pignistic probabilities from the face model guarantee the compatibility between the belief and probability formalisms. They are the inputs of a particle filter which ensures face tracking at video rate. The optimal parameter tuning of the evidential model is discussed.
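The pignistic transform used to bridge the credal and probabilistic levels has a standard closed form: each focal set shares its mass equally among its elements, after discounting any mass on the empty set. A minimal sketch (the frame of discernment and mass values below are illustrative, not taken from the paper):

```python
def pignistic(mass):
    """Pignistic transform: convert a mass function over subsets of the
    frame of discernment (frozenset keys) into a probability over
    singletons: BetP(x) = sum_{A : x in A} m(A) / (|A| * (1 - m(empty)))."""
    conflict = mass.get(frozenset(), 0.0)
    betp = {}
    for focal, m in mass.items():
        if not focal:  # skip the empty set (conflict mass)
            continue
        share = m / (len(focal) * (1.0 - conflict))
        for x in focal:
            betp[x] = betp.get(x, 0.0) + share
    return betp
```

The resulting distribution is what a classical particle filter can consume directly as an observation likelihood.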

    Nonlinear Color Space and Spatiotemporal MRF for Hierarchical Segmentation of Face Features in Video

    This paper addresses the design of the image processing stage for face analysis. In human-computer interfaces, user-friendliness requires robustness and good quality in image processing. To cope with unsupervised lighting conditions and an unknown speaker, two original preprocessing tools are introduced here: a logarithmic color transform and an entropy-based motion threshold. As regards the main processing stage, a hierarchical segmentation scheme based on Markov random fields is proposed that combines color and motion observations within a spatiotemporal neighborhood. Relevant face regions are thereafter automatically segmented. The good quality of the label fields enables localization and tracking of the face. Geometrical measurements on facial feature edges, such as lips or eyes, are provided by an active contour postprocessing stage. Results are shown both on well-known test sequences and on typical sequences acquired from micro and motorized cameras. The robustness and accuracy of the extracted contours are promising for any real-time application aiming at facial communication under unsupervised viewing conditions.
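The abstract does not spell out its logarithmic color transform; one commonly used form with the same intent (robustness to multiplicative lighting changes) is the log-ratio chromaticity below. This is a hypothetical sketch for illustration, not the transform defined in the paper.

```python
import math

def log_chromaticity(r, g, b, eps=1e-6):
    """Log-ratio chromaticity coordinates. Because illumination scales
    R, G and B multiplicatively, the ratios (and hence the logs) are
    approximately invariant to brightness changes."""
    return (math.log((r + eps) / (g + eps)),
            math.log((b + eps) / (g + eps)))
```

Scaling all three channels by a common factor leaves the coordinates (nearly) unchanged, which is exactly the property a segmentation stage facing unsupervised lighting needs.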

    Video object segmentation.

    Wei Wei. Thesis (M.Phil.), The Chinese University of Hong Kong, 2006 (submitted December 2005). Includes bibliographical references (leaves 112-122). Abstracts in English and Chinese.
    Contents: Chapter 1, Introduction (content-based video standards, video object planes, object segmentation, problems of video object segmentation); Chapter 2, Literature Review (manual, automatic and semi-automatic segmentation; segmentation strategy; motion and motion field representation; segmentation of moving objects); Chapter 3, Automatic Video Object Segmentation Algorithm (spatial segmentation with k-medians clustering, cluster number estimation and region merging; foreground detection with global motion estimation; binary model and region descriptor tracking; objective and subjective evaluation); Chapter 4, Disparity Estimation and its Application in Video Object Segmentation (seed selection, edge-based matching by propagation, remedying matching sparseness by interpolation, applications in video-conference segmentation); Chapter 5, Conclusion and Future Work.

    Adaptive threshold optimisation for colour-based lip segmentation in automatic lip-reading systems

    A thesis submitted to the Faculty of Engineering and the Built Environment, University of the Witwatersrand, Johannesburg, in fulfilment of the requirements for the degree of Doctor of Philosophy, September 2016.
    Having survived the ordeal of a laryngectomy, the patient must come to terms with the resulting loss of speech. With recent advances in portable computing power, automatic lip-reading (ALR) may become a viable approach to voice restoration. This thesis addresses the image processing aspect of ALR, and focuses on three contributions to colour-based lip segmentation. The first contribution concerns the colour transform used to enhance the contrast between the lips and skin. This thesis presents the most comprehensive study to date, measuring the overlap between lip and skin histograms for 33 different colour transforms. The hue component of HSV obtains the lowest overlap of 6.15%, and results show that selecting the correct transform can increase the segmentation accuracy by up to three times. The second contribution is the development of a new lip segmentation algorithm that utilises the best colour transforms from the comparative study. The algorithm is tested on 895 images and achieves a percentage overlap (OL) of 92.23% and a segmentation error (SE) of 7.39%. The third contribution focuses on the impact of the histogram threshold on the segmentation accuracy, and introduces a novel technique called Adaptive Threshold Optimisation (ATO) to select a better threshold value. The first stage of ATO incorporates SVR (support vector regression) to train the lip shape model; ATO then uses feedback of shape information to validate and optimise the threshold. After applying ATO, the SE decreases from 7.65% to 6.50%, an absolute improvement of 1.15 pp or a relative improvement of 15.1%. While this thesis concerns lip segmentation in particular, ATO is a threshold selection technique that can be used in various segmentation applications.
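The OL and SE figures quoted above are standard lip-segmentation metrics. The definitions below follow common usage in that literature (Dice-style overlap; error normalized by twice the ground-truth lip area) and may differ in detail from the thesis's exact formulation:

```python
def lip_metrics(pred, truth):
    """Percentage overlap (OL) and segmentation error (SE) between a
    predicted and a ground-truth lip mask, given as sets of pixel
    coordinates. OL = 2|A∩B| / (|A|+|B|); SE = (OLE + ILE) / (2|B|),
    where OLE = background labelled lip, ILE = lip labelled background."""
    inter = len(pred & truth)
    ol = 100.0 * 2 * inter / (len(pred) + len(truth))
    outer = len(pred - truth)   # outer labelling error (OLE)
    inner = len(truth - pred)   # inner labelling error (ILE)
    se = 100.0 * (outer + inner) / (2 * len(truth))
    return ol, se
```

Under these definitions a perfect segmentation gives OL = 100% and SE = 0%, matching the direction of improvement reported in the abstract.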

    Low and Variable Frame Rate Face Tracking Using an IP PTZ Camera

    Object tracking with PTZ cameras has applications in various computer vision areas, such as video surveillance, traffic monitoring, people monitoring and face recognition; accurate, efficient, and reliable tracking is required for these tasks. Here, object tracking is applied to human upper-body and face tracking. Face tracking determines the location of the human face in each input image of a video; it can be used to obtain images of the face of a human target under different poses. We propose to track the human face by means of an Internet Protocol (IP) Pan-Tilt-Zoom (PTZ) camera, i.e. a network-based camera that pans, tilts and zooms. An IP PTZ camera responds to commands via its integrated web server and allows distributed access from the Internet (access from anywhere, but with undefined delay). Tracking with such a camera includes many challenges, such as irregular response times to camera control commands, a low and irregular frame rate, large motion of the target between two frames, target occlusion, a changing field of view (FOV), and various scale changes. In our work, we want to cope with the problems of large inter-frame target motion, low usable frame rate, background changes, and tracking under various scale changes. In addition, the tracking algorithm should handle the camera response time and zooming. Our solution consists of a system initialization phase (the processing performed before camera motion), a tracker based on an Adaptive Particle Filter using Optical-Flow-based Sampling (APF-OFS), and camera control (the processing performed after camera motion).
Each part requires different strategies. For initialization, when the camera is stationary, motion detection for a static camera is used to detect the initial location of a person's face entering the area: a background subtraction method is applied within the camera's FOV. Then, to remove false positives, a Bayesian skin classifier is applied to the detected motion region to discriminate skin from non-skin regions. Finally, the Viola and Jones face detector is applied to the detected skin regions, independently of face size and position within the image.
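The Bayesian skin classification step described above amounts to a per-pixel likelihood-ratio test on a (binned) chrominance value. A minimal sketch with hypothetical histogram tables; a real system would estimate `skin_hist` and `nonskin_hist` from labelled training pixels:

```python
def bayes_skin(chroma_bin, skin_hist, nonskin_hist, prior_skin=0.4):
    """Label a chrominance bin as skin when
    P(c | skin) * P(skin) > P(c | not-skin) * P(not-skin).
    Unseen bins get a tiny floor probability instead of zero."""
    p_skin = skin_hist.get(chroma_bin, 1e-9) * prior_skin
    p_not = nonskin_hist.get(chroma_bin, 1e-9) * (1.0 - prior_skin)
    return p_skin > p_not
```

Applying this test only inside the motion mask, as the abstract describes, keeps static skin-coloured background (wood, walls) from triggering the face detector.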

    Traitement logarithmique d'images couleur (Logarithmic processing of colour images)

    This doctoral thesis introduces the extension of the LIP (Logarithmic Image Processing) model to color images. The CoLIP (Color Logarithmic Image Processing) model is defined, studied and applied to image processing in this manuscript. The LIP approach is a mathematical framework developed for the representation and processing of images valued in a bounded intensity range. The LIP theory is physically and psychophysically well justified, since it is consistent with several laws of human brightness perception and with the multiplicative image formation model. Following a study of color vision and color science, the CoLIP model is constructed according to the stages of human color perception, while integrating the mathematical framework of the LIP. Initially, the CoLIP is constructed by following the photoreception, non-linear cone compression, and opponent processing steps of human color perception. It is developed as a color space representing a color image by a set of three antagonist tone functions, which can be combined by means of specific CoLIP operations: addition, scalar multiplication, and subtraction, which give the CoLIP framework a vector space structure. Then, as the CoLIP color space is a uniform luminance-chrominance color space, relative and absolute perceptual attributes (hue, chroma, colorfulness, brightness, lightness, and saturation) can be defined. Thus, the CoLIP framework combines the advantages of a mathematically well-structured vector space with those of a color appearance model. In a second step, physical, mathematical, physiological and psychophysical justifications of the CoLIP model are proposed, including a comparison of the shapes of the MacAdam ellipses in the CoLIP uniform model and in other uniform models, based on ellipse area and eccentricity criteria. Finally, various applications using the CoLIP vector space structure are proposed, such as contrast enhancement, image enhancement and edge detection. Applications using the CoLIP color appearance model structure, which allows working with hue, brightness and saturation, are also developed. A specific application for measuring cell viability in images of slides obtained by cytocentrifugation and colour staining is also presented.
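The LIP vector operations that CoLIP extends to three opponent channels have simple closed forms on gray tones f in [0, M). A sketch of the two basic laws, taking M = 256 as the usual bound for 8-bit images (the classical LIP definitions, shown here on scalars rather than on the CoLIP tone functions):

```python
M = 256.0  # upper bound of the gray-tone range in the LIP model

def lip_add(f, g):
    """LIP addition: f (+) g = f + g - f*g/M. The result always stays
    below M, mirroring the saturation of transmitted light."""
    return f + g - f * g / M

def lip_scalar(lam, f):
    """LIP scalar multiplication: lam (x) f = M - M*(1 - f/M)**lam,
    i.e. repeated LIP addition extended to real lam."""
    return M - M * (1.0 - f / M) ** lam
```

These laws are what give the space its vector structure: adding a tone to itself and multiplying it by the scalar 2 agree exactly.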