20 research outputs found

    Biometric Systems

    Get PDF
    Biometric authentication has been widely used for access control and security systems over the past few years. The purpose of this book is to provide the readers with life cycle of different biometric authentication systems from their design and development to qualification and final application. The major systems discussed in this book include fingerprint identification, face recognition, iris segmentation and classification, signature verification and other miscellaneous systems which describe management policies of biometrics, reliability measures, pressure based typing and signature verification, bio-chemical systems and behavioral characteristics. In summary, this book provides the students and the researchers with different approaches to develop biometric authentication systems and at the same time includes state-of-the-art approaches in their design and development. The approaches have been thoroughly tested on standard databases and in real world applications

    Recognition of facial action units from video streams with recurrent neural networks : a new paradigm for facial expression recognition

    Get PDF
    Philosophiae Doctor - PhDThis research investigated the application of recurrent neural networks (RNNs) for recognition of facial expressions based on facial action coding system (FACS). Support vector machines (SVMs) were used to validate the results obtained by RNNs. In this approach, instead of recognizing whole facial expressions, the focus was on the recognition of action units (AUs) that are defined in FACS. Recurrent neural networks are capable of gaining knowledge from temporal data while SVMs, which are time invariant, are known to be very good classifiers. Thus, the research consists of four important components: comparison of the use of image sequences against single static images, benchmarking feature selection and network optimization approaches, study of inter-AU correlations by implementing multiple output RNNs, and study of difference images as an approach for performance improvement. In the comparative studies, image sequences were classified using a combination of Gabor filters and RNNs, while single static images were classified using Gabor filters and SVMs. Sets of 11 FACS AUs were classified by both approaches, where a single RNN/SVM classifier was used for classifying each AU. Results indicated that classifying FACS AUs using image sequences yielded better results than using static images. The average recognition rate (RR) and false alarm rate (FAR) using image sequences was 82.75% and 7.61%, respectively, while the classification using single static images yielded a RR and FAR of 79.47% and 9.22%, respectively. The better performance by the use of image sequences can be at- tributed to RNNs ability, as stated above, to extract knowledge from time-series data. Subsequent research then investigated benchmarking dimensionality reduction, feature selection and network optimization techniques, in order to improve the performance provided by the use of image sequences. Results showed that an optimized network, using weight decay, gave best RR and FAR of 85.38% and 6.24%, respectively. The next study was of the inter-AU correlations existing in the Cohn-Kanade database and their effect on classification models. To accomplish this, a model was developed for the classification of a set of AUs by a single multiple output RNN. Results indicated that high inter-AU correlations do in fact aid classification models to gain more knowledge and, thus, perform better. However, this was limited to AUs that start and reach apex at almost the same time. This suggests the need for availability of a larger database of AUs, which could provide both individual and AU combinations for further investigation. The final part of this research investigated use of difference images to track the motion of image pixels. Difference images provide both noise and feature reduction, an aspect that was studied. Results showed that the use of difference image sequences provided the best results, with RR and FAR of 87.95% and 3.45%, respectively, which is shown to be significant when compared to use of normal image sequences classified using RNNs. In conclusion, the research demonstrates that use of RNNs for classification of image sequences is a new and improved paradigm for facial expression recognition

    Layer-based coding, smoothing, and scheduling of low-bit-rate video for teleconferencing over tactical ATM networks

    Get PDF
    This work investigates issues related to distribution of low bit rate video within the context of a teleconferencing application deployed over a tactical ATM network. The main objective is to develop mechanisms that support transmission of low bit rate video streams as a series of scalable layers that progressively improve quality. The hierarchical nature of the layered video stream is actively exploited along the transmission path from the sender to the recipients to facilitate transmission. A new layered coder design tailored to video teleconferencing in the tactical environment is proposed. Macroblocks selected due to scene motion are layered via subband decomposition using the fast Haar transform. A generalized layering scheme groups the subbands to form an arbitrary number of layers. As a layering scheme suitable for low motion video is unsuitable for static slides, the coder adapts the layering scheme to the video content. A suboptimal rate control mechanism that reduces the kappa dimensional rate distortion problem resulting from the use of multiple quantizers tailored to each layer to a 1 dimensional problem by creating a single rate distortion curve for the coder in terms of a suboptimal set of kappa dimensional quantizer vectors is investigated. Rate control is thus simplified into a table lookup of a codebook containing the suboptimal quantizer vectors. The rate controller is ideal for real time video and limits fluctuations in the bit stream with no corresponding visible fluctuations in perceptual quality. A traffic smoother prior to network entry is developed to increase queuing and scheduler efficiency. Three levels of smoothing are studied: frame, layer, and cell interarrival. Frame level smoothing occurs via rate control at the application. Interleaving and cell interarrival smoothing are accomplished using a leaky bucket mechanism inserted prior to the adaptation layer or within the adaptation layerhttp://www.archive.org/details/layerbasedcoding00parkLieutenant Commander, United States NavyApproved for public release; distribution is unlimited

    Image and Video Forensics

    Get PDF
    Nowadays, images and videos have become the main modalities of information being exchanged in everyday life, and their pervasiveness has led the image forensics community to question their reliability, integrity, confidentiality, and security. Multimedia contents are generated in many different ways through the use of consumer electronics and high-quality digital imaging devices, such as smartphones, digital cameras, tablets, and wearable and IoT devices. The ever-increasing convenience of image acquisition has facilitated instant distribution and sharing of digital images on digital social platforms, determining a great amount of exchange data. Moreover, the pervasiveness of powerful image editing tools has allowed the manipulation of digital images for malicious or criminal ends, up to the creation of synthesized images and videos with the use of deep learning techniques. In response to these threats, the multimedia forensics community has produced major research efforts regarding the identification of the source and the detection of manipulation. In all cases (e.g., forensic investigations, fake news debunking, information warfare, and cyberattacks) where images and videos serve as critical evidence, forensic technologies that help to determine the origin, authenticity, and integrity of multimedia content can become essential tools. This book aims to collect a diverse and complementary set of articles that demonstrate new developments and applications in image and video forensics to tackle new and serious challenges to ensure media authenticity

    Artificial Intelligence for Multimedia Signal Processing

    Get PDF
    Artificial intelligence technologies are also actively applied to broadcasting and multimedia processing technologies. A lot of research has been conducted in a wide variety of fields, such as content creation, transmission, and security, and these attempts have been made in the past two to three years to improve image, video, speech, and other data compression efficiency in areas related to MPEG media processing technology. Additionally, technologies such as media creation, processing, editing, and creating scenarios are very important areas of research in multimedia processing and engineering. This book contains a collection of some topics broadly across advanced computational intelligence algorithms and technologies for emerging multimedia signal processing as: Computer vision field, speech/sound/text processing, and content analysis/information mining

    Prediction-based techniques for the optimization of mobile networks

    Get PDF
    Mención Internacional en el título de doctorMobile cellular networks are complex system whose behavior is characterized by the superposition of several random phenomena, most of which, related to human activities, such as mobility, communications and network usage. However, when observed in their totality, the many individual components merge into more deterministic patterns and trends start to be identifiable and predictable. In this thesis we analyze a recent branch of network optimization that is commonly referred to as anticipatory networking and that entails the combination of prediction solutions and network optimization schemes. The main intuition behind anticipatory networking is that knowing in advance what is going on in the network can help understanding potentially severe problems and mitigate their impact by applying solution when they are still in their initial states. Conversely, network forecast might also indicate a future improvement in the overall network condition (i.e. load reduction or better signal quality reported from users). In such a case, resources can be assigned more sparingly requiring users to rely on buffered information while waiting for the better condition when it will be more convenient to grant more resources. In the beginning of this thesis we will survey the current anticipatory networking panorama and the many prediction and optimization solutions proposed so far. In the main body of the work, we will propose our novel solutions to the problem, the tools and methodologies we designed to evaluate them and to perform a real world evaluation of our schemes. By the end of this work it will be clear that not only is anticipatory networking a very promising theoretical framework, but also that it is feasible and it can deliver substantial benefit to current and next generation mobile networks. In fact, with both our theoretical and practical results we show evidences that more than one third of the resources can be saved and even larger gain can be achieved for data rate enhancements.Programa Oficial de Doctorado en Ingeniería TelemáticaPresidente: Albert Banchs Roca.- Presidente: Pablo Serrano Yañez-Mingot.- Secretario: Jorge Ortín Gracia.- Vocal: Guevara Noubi

    Proceedings of the 7th Sound and Music Computing Conference

    Get PDF
    Proceedings of the SMC2010 - 7th Sound and Music Computing Conference, July 21st - July 24th 2010

    Towards Better Image Embeddings Using Neural Networks

    Get PDF
    The primary focus of this dissertation is to study image embeddings extracted by neural networks. Deep Learning (DL) is preferred over traditional Machine Learning (ML) for the reason that feature representations can be automatically constructed from data without human involvement. On account of the effectiveness of deep features, the last decade has witnessed unprecedented advances in Computer Vision (CV), and more real-world applications are expected to be introduced in the coming years. A diverse collection of studies has been included, covering areas such as person re-identification, vehicle attribute recognition, neural image compression, clustering and unsupervised anomaly detection. More specifically, three aspects of feature representations have been thoroughly analyzed. Firstly, features should be distinctive, i.e., features of samples from distinct categories ought to differ significantly. Extracting distinctive features is essential for image retrieval systems, in which an algorithm finds the gallery sample that is closest to a query sample. Secondly, features should be privacy-preserving, i.e., inferring sensitive information from features must be infeasible. With the widespread adoption of Machine Learning as a Service (MLaaS), utilizing privacy-preserving features prevents privacy violations even if the server has been compromised. Thirdly, features should be compressible, i.e., compact features are preferable as they require less storage space. Obtaining compressible features plays a vital role in data compression. Towards the goal of deriving distinctive, privacy-preserving and compressible feature representations, research articles included in this dissertation reveal different approaches to improving image embeddings learned by neural networks. This topic remains a fundamental challenge in Machine Learning, and further research is needed to gain a deeper understanding
    corecore