5 research outputs found

Advanced concepts for intelligent vision systems : 16th International Conference, ACIVS 2015, Catania, Italy, October 26-29, 2015. Proceedings

Author: Battiato Sebastiano
Blanc-Talon Jacques
Gallo Giovanni
Philips Wilfried
Popescu Dan
Scheunders Paul
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2015

A Novel Method for Lip Movement Detection using Deep Neural Network

Author: Karthik R
Srilakshmi Kanagala
Publication venue: CSIR-National Institute of Science Communication and Policy Research (NIScPR)
Publication date: 25/07/2022

Recognition of lip movements has become one of the most challenging tasks and has crucial applications in the contemporary scenario. It is the recognition of the speech uttered by an individual using visual cues. Visual interpretation of lip movement is especially useful in scenarios like video surveillance, where auditory signals are either unavailable or too noisy to interpret. It is also useful for hearing-impaired individuals, for whom the audio signal is of no use. Many developments have taken place in this nascent field using various deep learning-based techniques. This research analyzes various state-of-the-art deep learning models on the MIRACL-VC1 dataset. The study also aims to identify the optimal baseline architecture for building a new, high-accuracy model for lip movement detection. The models are trained from scratch on the pre-processed MIRACL-VC1 dataset, which consists of small-size images. Experimental observations with state-of-the-art deep learning models indicate that the EfficientNet B0 architecture yielded an accuracy of 80.13%. EfficientNet B0 is therefore used as the baseline deep architecture for designing a customized model for effective detection. This research proposes an attention-based deep learning model combined with a Long Short-Term Memory (LSTM) layer, with EfficientNet B0 as the backbone architecture. The proposed model yielded an accuracy of 91.13%.

Online Publishing @ NISCAIR

Software system for recognizing spoken text from video based on human facial expressions

Author: Онбиш Олександр Олегович
Publication venue: Kyiv
Publication date: 01/06/2019

This bachelor's thesis addresses the recognition of spoken text from video of human facial expressions using deep learning methods. Various approaches to recognizing spoken text are investigated. A recognition method based on a convolutional neural network combined with a recurrent network is developed. A program for recognizing spoken text from video of a person's face is developed, and the model's performance is examined on different datasets. Total volume of work: 62 pages, 23 figures, 8 tables, 23 sources.

Electronic Archive of Kyiv Polytechnic Institute