
    Person annotation in video sequences

    In recent years, the demand for tools that automatically annotate and classify large audiovisual datasets has increased considerably. One specific task in this field applies to TV broadcast videos: determining who appears in a video sequence, and when. This work builds on the ALBAYZIN evaluation series presented at IberSPEECH-RTVE 2018 in Barcelona; the purpose of this thesis is to improve on the results obtained there and to compare different face detection and tracking methods. We evaluate the performance of classic face detection techniques and of machine-learning-based techniques on a closed dataset of 34 known people; all other characters in the audiovisual document are labelled as "unknown". We work with short videos and images of each known character to build his/her model and, finally, evaluate the performance of the ALBAYZIN algorithm on a 2-hour video called "La noche en 24H", whose format is similar to a news program. We analyze the results, the types of errors and scenarios encountered, and the solutions we propose for each of them, where available. This work focuses only on monomodal face recognition and tracking.
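    As a rough illustration of the closed-set labelling described above, the sketch below assigns a face embedding to one of the 34 known identities, or to "unknown" when no gallery model is close enough. The embedding dimension, distance threshold, and gallery construction are assumptions for illustration, not the thesis's actual pipeline.

```python
import numpy as np

# Hypothetical gallery: one reference embedding per known identity, built
# offline from the short enrolment videos/images of each person.
KNOWN_PEOPLE = [f"person_{i:02d}" for i in range(1, 35)]  # 34 known identities
rng = np.random.default_rng(0)
gallery = {name: rng.standard_normal(128) for name in KNOWN_PEOPLE}  # stand-in vectors

def label_face(embedding: np.ndarray, threshold: float = 0.6) -> str:
    """Match a detected face to the nearest known identity, with rejection.

    Closed-set matching plus open-set rejection: if even the best gallery
    match is farther than `threshold` (cosine distance), the face is
    labelled "unknown". The threshold is illustrative and would need
    tuning on held-out data.
    """
    best_name, best_dist = "unknown", np.inf
    for name, ref in gallery.items():
        # Cosine distance between the query embedding and the gallery model.
        dist = 1.0 - np.dot(embedding, ref) / (
            np.linalg.norm(embedding) * np.linalg.norm(ref))
        if dist < best_dist:
            best_name, best_dist = name, dist
    return best_name if best_dist <= threshold else "unknown"
```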

    Enhanced Emotion Recognition in Videos: A Convolutional Neural Network Strategy for Human Facial Expression Detection and Classification

    The human face is essential in conveying emotions, as facial expressions serve as effective, natural, and universal indicators of emotional states. Automated emotion recognition has garnered increasing interest due to its potential applications in various fields, such as human-computer interaction, machine learning, robotic control, and driver emotional state monitoring. With advances in artificial intelligence and computational power, visual emotion recognition has become a prominent research area. Despite extensive research employing machine learning algorithms like convolutional neural networks (CNNs), challenges remain concerning input data processing, emotion classification scope, data size, optimal CNN configurations, and performance evaluation. To address these issues, we propose a comprehensive CNN-based model for real-time detection and classification of five primary emotions: anger, happiness, neutrality, sadness, and surprise. We employ the Amsterdam Dynamic Facial Expression Set – Bath Intensity Variations (ADFES-BIV) video dataset, extracting image frames from the video samples. Image processing techniques such as histogram equalization, color conversion, cropping, and resizing are applied to the frames before labeling. The Viola-Jones algorithm is then used for face detection on the processed grayscale images. We develop and train a CNN on the processed image data, implementing dropout, batch normalization, and L2 regularization to reduce overfitting. The ideal hyperparameters are determined through trial and error, and the model's performance is evaluated. The proposed model achieves a recognition accuracy of 99.38%, with the confusion matrix, recall, precision, F1 score, and processing time further quantifying its performance characteristics. The model's generalization performance is assessed using images from the Warsaw Set of Emotional Facial Expression Pictures (WSEFEP) and Extended Cohn-Kanade (CK+) datasets. The results demonstrate the efficiency and usability of our proposed approach, contributing valuable insights into real-time visual emotion recognition.
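    As a minimal sketch of the pipeline the abstract outlines (grayscale conversion, histogram equalization, Viola-Jones face cropping, then a CNN trained with dropout, batch normalization, and L2 regularization), the following Keras code is illustrative only: the 48x48 input size, layer widths, and hyperparameters are assumptions, not the paper's tuned configuration.

```python
import cv2
import tensorflow as tf
from tensorflow.keras import layers, regularizers

# Viola-Jones face detector shipped with OpenCV (stock frontal-face cascade).
face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def preprocess_frame(frame, size=(48, 48)):
    """Grayscale conversion, histogram equalization, face crop, resize."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    gray = cv2.equalizeHist(gray)                  # histogram equalization
    faces = face_cascade.detectMultiScale(gray, 1.3, 5)
    if len(faces) == 0:
        return None
    x, y, w, h = faces[0]                          # keep the first detection
    return cv2.resize(gray[y:y + h, x:x + w], size) / 255.0

# Small CNN using the regularization tricks named in the abstract; the exact
# depth and width here are illustrative, not the paper's architecture.
model = tf.keras.Sequential([
    layers.Input(shape=(48, 48, 1)),
    layers.Conv2D(32, 3, activation="relu",
                  kernel_regularizer=regularizers.l2(1e-4)),
    layers.BatchNormalization(),
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, activation="relu",
                  kernel_regularizer=regularizers.l2(1e-4)),
    layers.BatchNormalization(),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(128, activation="relu"),
    layers.Dropout(0.5),
    layers.Dense(5, activation="softmax"),  # anger, happy, neutral, sad, surprise
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```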

    Precise eye localization using HOG descriptors

    In this paper, we present a novel algorithm for precise eye detection. First, a pair of AdaBoost classifiers trained with Haar-like features is used to preselect possible eye locations. Then, a Support Vector Machine using Histogram of Oriented Gradients descriptors is used to select the best pair of eyes among all combinations of preselected candidates. Finally, we compare our eye detection results with three state-of-the-art works and a commercial software package. The results show that our algorithm achieves the highest accuracy on the FERET and FRGCv1 databases, in the most complete comparison presented so far. © Springer-Verlag 2010. This work has been partially supported by grant TEC2009-09146 of the Spanish Government.
    Monzó Ferrer, D.; Albiol Colomer, A.; Sastre, J.; Albiol Colomer, A.J. (2011). Precise eye localization using HOG descriptors. Machine Vision and Applications 22(3), 471–480. https://doi.org/10.1007/s00138-010-0273-0
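    The two-stage scheme described in the abstract can be sketched as follows: an OpenCV Haar/AdaBoost cascade preselects candidate eyes, then an SVM scores HOG descriptors of every candidate pair. The patch size, HOG parameters, and the dummy-fitted SVM below are placeholders; the published system trains the SVM on labelled eye-pair data.

```python
import cv2
import numpy as np
from itertools import combinations
from skimage.feature import hog
from sklearn.svm import SVC

# Stage 1: AdaBoost cascade over Haar-like features preselects candidates.
eye_cascade = cv2.CascadeClassifier(cv2.data.haarcascades + "haarcascade_eye.xml")

def hog_descriptor(patch):
    """HOG descriptor of a candidate eye patch at a fixed size (32x32)."""
    patch = cv2.resize(patch, (32, 32))
    return hog(patch, orientations=9, pixels_per_cell=(8, 8),
               cells_per_block=(2, 2))  # 324 values per patch

# Stage 2: an SVM scores pairs of HOG descriptors. Dummy-fitted here only so
# the sketch runs; a real system trains on labelled correct/incorrect pairs.
rng = np.random.default_rng(0)
svm = SVC().fit(rng.standard_normal((20, 648)), np.tile([0, 1], 10))

def best_eye_pair(gray_face):
    """Return the candidate pair the SVM considers most eye-like."""
    boxes = eye_cascade.detectMultiScale(gray_face, 1.1, 3)
    best, best_score = None, -np.inf
    for a, b in combinations(boxes, 2):
        feats = np.concatenate([hog_descriptor(gray_face[y:y + h, x:x + w])
                                for (x, y, w, h) in (a, b)])
        score = svm.decision_function([feats])[0]  # higher = more eye-like
        if score > best_score:
            best, best_score = (a, b), score
    return best
```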

    Enhancing Online Security with Image-based Captchas

    Given the data loss, productivity, and financial risks posed by security breaches, there is a great need to protect online systems from automated attacks. Completely Automated Public Turing Tests to Tell Computers and Humans Apart, known as CAPTCHAs, are commonly used as one layer in providing online security. These tests are intended to be easily solvable by legitimate human users while being challenging for automated attackers to successfully complete. Traditionally, CAPTCHAs have asked users to perform tasks based on text recognition or categorization of discrete images to prove whether or not they are legitimate human users. Over time, the efficacy of these CAPTCHAs has been eroded by improved optical character recognition, image classification, and machine learning techniques that can accurately solve many CAPTCHAs at rates approaching those of humans. These CAPTCHAs can also be difficult to complete using the touch-based input methods found on widely used tablets and smartphones.

    This research proposes the design of CAPTCHAs that address the shortcomings of existing implementations. These CAPTCHAs require users to perform different image-based tasks including face detection, face recognition, multimodal biometrics recognition, and object recognition to prove they are human. These are tasks that humans excel at but which remain difficult for computers to complete successfully. They can also be readily performed using click- or touch-based input methods, facilitating their use on both traditional computers and mobile devices.

    Several strategies are utilized by the CAPTCHAs developed in this research to enable high human success rates while ensuring negligible automated attack success rates. One such technique, used by fgCAPTCHA, employs image quality metrics and face detection algorithms to calculate fitness values representing the simulated performance of human users and automated attackers, respectively, at solving each generated CAPTCHA image. A genetic learning algorithm uses these fitness values to determine customized generation parameters for each CAPTCHA image. Other approaches, including gradient descent learning, artificial immune systems, and multi-stage performance-based filtering processes, are also proposed in this research to optimize the generated CAPTCHA images.

    An extensive RESTful web service-based evaluation platform was developed to facilitate the testing and analysis of the CAPTCHAs developed in this research. Users recorded over 180,000 attempts at solving these CAPTCHAs using a variety of devices. The results show the designs created in this research offer high human success rates, up to 94.6% in the case of aiCAPTCHA, while ensuring resilience against automated attacks.
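    To make the genetic-learning idea concrete, here is a toy sketch of evolving CAPTCHA generation parameters against a fitness function. In fgCAPTCHA the fitness contrasts simulated human performance (image quality metrics) with simulated attacker performance (face detection success); the surrogate fitness, parameter names, and GA settings below are invented for illustration.

```python
import random

# Hypothetical per-image generation parameters; the abstract does not list
# fgCAPTCHA's real parameter set, so these names are illustrative.
PARAMS = ["noise", "distortion", "overlay_alpha"]

def fitness(ind):
    # Stand-in fitness: in fgCAPTCHA this would be simulated human success
    # (image quality metric on the rendered image) minus simulated attacker
    # success (face detector hit rate). This toy surrogate merely rewards
    # moderate distortion and penalizes heavy noise so the sketch runs.
    return ind["distortion"] * (1 - ind["distortion"]) - 0.5 * ind["noise"] ** 2

def evolve(pop_size=30, generations=50, mutation=0.1):
    """Plain genetic algorithm over the generation parameters."""
    pop = [{k: random.random() for k in PARAMS} for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]                 # truncation selection
        children = []
        while len(children) < pop_size - len(parents):
            a, b = random.sample(parents, 2)
            child = {k: random.choice((a[k], b[k])) for k in PARAMS}  # crossover
            for k in PARAMS:                           # Gaussian mutation, clipped
                child[k] = min(1.0, max(0.0, child[k] + random.gauss(0, mutation)))
            children.append(child)
        pop = parents + children
    return max(pop, key=fitness)  # best parameter set found

print(evolve())
```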

    Fair comparison of skin detection approaches on publicly available datasets

    Skin detection is the process of discriminating skin and non-skin regions in a digital image, and it is widely used in applications ranging from hand gesture analysis to body-part tracking and face detection. Skin detection is a challenging problem that has drawn extensive attention from the research community; nevertheless, a fair comparison among approaches is very difficult due to the lack of a common benchmark and a unified testing protocol. In this work, we survey the most recent research in this field and propose a fair comparison among approaches using several different datasets. The major contributions of this work are an exhaustive literature review of skin color detection approaches; a framework to evaluate and combine different skin detectors, whose source code is made freely available for future research; and an extensive experimental comparison among several recent methods, which have also been used to define an ensemble that works well in many different problems. Experiments are carried out on 10 different datasets including more than 10,000 labelled images: the experimental results confirm that the best method proposed here obtains very good performance with respect to other stand-alone approaches, without requiring ad hoc parameter tuning. A MATLAB version of the framework for testing, and of the methods proposed in this paper, will be freely available from https://github.com/LorisNann
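    For context on what a stand-alone approach in such a comparison looks like, below is a classic per-pixel baseline using a fixed YCrCb box rule (a widely cited rule of thumb, not one of the paper's methods), together with a pixel-level F1 measure of the kind a unified testing protocol would report.

```python
import cv2
import numpy as np

def skin_mask(bgr):
    """Per-pixel skin detector using a fixed YCrCb box rule.

    A classic rule-of-thumb baseline (Cr in [133, 173], Cb in [77, 127]);
    the methods compared in the paper are far more sophisticated.
    """
    ycrcb = cv2.cvtColor(bgr, cv2.COLOR_BGR2YCrCb)
    return cv2.inRange(ycrcb, (0, 133, 77), (255, 173, 127))

def f1_score(pred, gt):
    """Pixel-level F1 of a predicted mask against a binary ground truth."""
    tp = np.logical_and(pred > 0, gt > 0).sum()
    fp = np.logical_and(pred > 0, gt == 0).sum()
    fn = np.logical_and(pred == 0, gt > 0).sum()
    return 2 * tp / max(2 * tp + fp + fn, 1)
```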

    A Comprehensive Performance Evaluation of Deformable Face Tracking "In-the-Wild"

    Recently, technologies such as face detection, facial landmark localisation, and face recognition and verification have matured enough to provide effective and efficient solutions for imagery captured under arbitrary conditions (referred to as "in-the-wild"). This is partially attributed to the fact that comprehensive "in-the-wild" benchmarks have been developed for face detection, landmark localisation, and recognition/verification. A very important technology that has not yet been thoroughly evaluated is deformable face tracking "in-the-wild". Until now, performance has mainly been assessed qualitatively, by visually inspecting the result of a deformable face tracking technology on short videos. In this paper, we perform the first, to the best of our knowledge, thorough evaluation of state-of-the-art deformable face tracking pipelines, using the recently introduced 300VW benchmark. We evaluate many different architectures, focusing mainly on the task of online deformable face tracking. In particular, we compare the following general strategies: (a) generic face detection plus generic facial landmark localisation; (b) generic model-free tracking plus generic facial landmark localisation; and (c) hybrid approaches using state-of-the-art face detection, model-free tracking, and facial landmark localisation technologies. Our evaluation reveals future avenues for further research on the topic.
    Comment: E. Antonakos and P. Snape contributed equally and have joint second authorship.
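    A minimal sketch of strategies (a) and (b) follows: per-frame detection versus detect-once-then-track, each feeding a landmark localiser. The template-matching tracker stands in for a generic model-free tracker, and `localise_landmarks` is a placeholder for a deformable-model fit; neither is the paper's actual pipeline.

```python
import cv2

face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def localise_landmarks(gray, box):
    # Placeholder: a real pipeline fits a deformable model (e.g. a cascaded
    # regression landmark localiser) inside `box`; here the box is returned.
    return box

def track_video(frames, strategy="detect"):
    """Strategy (a) ("detect"): re-detect the face in every frame, then fit
    landmarks. Strategy (b) ("track"): detect once, then follow the face
    patch with template matching as a stand-in for a model-free tracker."""
    template, box, results = None, None, []
    for frame in frames:
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        if strategy == "detect" or template is None:
            dets = face_cascade.detectMultiScale(gray, 1.3, 5)
            if len(dets):
                x, y, w, h = dets[0]
                box, template = (x, y, w, h), gray[y:y + h, x:x + w]
        else:
            # Model-free step: locate where the previous face patch moved.
            res = cv2.matchTemplate(gray, template, cv2.TM_CCOEFF_NORMED)
            _, _, _, (x, y) = cv2.minMaxLoc(res)
            h, w = template.shape
            box = (x, y, w, h)
            template = gray[y:y + h, x:x + w]  # update the appearance model
        results.append(localise_landmarks(gray, box) if box else None)
    return results
```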