Motion learning using spatio-temporal neural network
Motion trajectory prediction is one of the key areas in behaviour and surveillance studies. Many related successful applications have been reported in the literature. However, most of the studies
are based on sigmoidal neural networks, in which some dynamic properties of the data are overlooked due to the absence of spatio-temporal encoding functionality. Even though some sequential (motion) learning studies using spatio-temporal neural networks have been proposed, the approach used is, as in the sigmoidal neural networks, mainly supervised learning. Such learning requires a target signal, which is not always available. In this study, motion learning using a spatio-temporal neural network is proposed. The learning is based on reward-modulated spike-timing-dependent plasticity (STDP), whereby the weight adjustment provided by standard STDP is modulated by a reinforcement signal. The implementation of a reinforcement approach for motion trajectory learning can be regarded as the major contribution of this study: learning proceeds on a reward basis without the need for learning targets. The algorithm has shown good potential in learning motion trajectories, particularly in noisy and dynamic settings. Furthermore, the learning uses a generic neural network architecture, which makes it adaptable to many applications.
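The reward-modulated rule described above can be sketched as follows; the constants, the eligibility-trace formulation and the function names are illustrative assumptions, not the study's exact parameters.

```python
import math

# Illustrative sketch of reward-modulated STDP (not the study's exact rule):
# each pre/post spike pairing updates an eligibility trace via the standard
# STDP window, and the actual weight change is the trace gated by a reward.

A_PLUS, A_MINUS = 0.01, 0.012   # STDP amplitudes (assumed values)
TAU = 20.0                      # STDP time constant in ms (assumed)
TRACE_DECAY = 0.9               # eligibility-trace decay per step (assumed)
LEARNING_RATE = 0.5

def stdp_window(dt):
    """Standard STDP kernel: potentiate when pre precedes post (dt > 0)."""
    if dt > 0:
        return A_PLUS * math.exp(-dt / TAU)
    return -A_MINUS * math.exp(dt / TAU)

def reward_modulated_step(weight, trace, pre_post_dt, reward):
    """One step: fold new spike pairings into the eligibility trace, then
    let the reward signal decide how much of it becomes a weight change."""
    trace = TRACE_DECAY * trace + sum(stdp_window(dt) for dt in pre_post_dt)
    weight = weight + LEARNING_RATE * reward * trace
    return weight, trace

w, e = 0.5, 0.0
# pre fires 5 ms before post -> positive trace; a reward of +1 potentiates
w, e = reward_modulated_step(w, e, [5.0], reward=1.0)
```

With `reward = 0` the same spike pairing leaves the weight untouched, which is what removes the need for an explicit learning target.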
Improving Functionalities in a Multi-agent Architecture for Ocean Monitoring
This paper presents an improved version of a multi-agent architecture aimed at providing solutions for monitoring the interaction between the atmosphere and the ocean. The ocean surface and the atmosphere exchange carbon dioxide, a process that can be modeled by a multi-agent system with advanced learning and adaptation capabilities. The proposed architecture incorporates CBR (case-based reasoning) agents, which integrate novel strategies that both monitor the parameters affecting the interaction and facilitate the creation of models. The system was tested, and this paper presents the results obtained.
Adaptive noise reduction and code matching for IRIS pattern recognition system
Among all biometric modalities, the iris is becoming more popular due to its high performance in recognizing and verifying individuals. Iris recognition has been used in numerous settings, such as authentication at prisons, airports, banks and healthcare facilities. Although iris recognition systems achieve high accuracy with a very low false acceptance rate, their performance can still be affected by noise: very low intensity values of eyelash pixels, or high intensity values of eyelid and light-reflection pixels, cause inappropriate threshold values and therefore degrade the accuracy of the system. To reduce the effects of noise and improve the accuracy of an iris recognition system, a robust algorithm consisting of two main components is proposed. First, an Adaptive Fuzzy Switching Noise Reduction (AFSNR) filter is proposed, which reduces the effects of noise of different densities by fuzzy switching between an adaptive median filter and a filling method. Next, an Adaptive Weighted Shifting Hamming Distance (AWSHD) is proposed, which improves the iris code matching stage and the decidability of the system. The proposed AFSNR filter, with its adaptive window size, successfully reduces the effects of different types of noise with different densities. By applying the proposed AWSHD, the distance corresponding to a genuine user is reduced while the distance for impostors is increased, so the genuine user is more likely to be authenticated and an impostor more likely to be rejected. Experimental results show that the proposed algorithm, with a genuine acceptance rate (GAR) of 99.98%, effectively enhances the performance of the iris recognition system.
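A plain shifting Hamming distance, the standard baseline that AWSHD extends with adaptive weights, can be sketched as follows; the weighting scheme itself is not reproduced, and the code length and shift range are illustrative assumptions.

```python
import numpy as np

# Baseline shifting Hamming distance for binary iris codes: the codes are
# compared at several circular shifts to compensate for eye rotation, masked
# (noisy) bits are excluded, and the minimum distance is the match score.
# AWSHD additionally weights bit disagreements adaptively (not shown here).

def shifting_hamming_distance(code_a, code_b, mask_a, mask_b, max_shift=8):
    best = 1.0
    for s in range(-max_shift, max_shift + 1):
        b = np.roll(code_b, s)
        m = mask_a & np.roll(mask_b, s)          # compare only noise-free bits
        valid = m.sum()
        if valid == 0:
            continue
        hd = np.count_nonzero((code_a ^ b) & m) / valid
        best = min(best, hd)
    return best

rng = np.random.default_rng(0)
code = rng.integers(0, 2, 512).astype(np.uint8)
mask = np.ones(512, dtype=np.uint8)
rotated = np.roll(code, 3)                       # same iris, slightly rotated
score = shifting_hamming_distance(code, rotated, mask, mask)
```

A genuine comparison (here the rotated copy) yields a distance near 0, while unrelated codes land near 0.5, which is what makes thresholding decidable.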
Computational models of object motion detectors accelerated using FPGA technology
The detection of moving objects is a trivial task for vertebrate retinas, yet a complex computer vision problem. This PhD research programme has made three key contributions, namely: 1) a multi-hierarchical spiking neural network (MHSNN) architecture for detecting horizontal and vertical movements; 2) a Hybrid Sensitive Motion Detector (HSMD) algorithm for detecting object motion; and 3) the Neuromorphic Hybrid Sensitive Motion Detector (NeuroHSMD), a real-time neuromorphic implementation of the HSMD algorithm.
The MHSNN is a customised four-layer Spiking Neural Network (SNN) architecture designed to reflect the basic connectivity and canonical behaviours found in the majority of vertebrate retinas (including human retinas). The architecture was trained using images from a custom dataset generated in laboratory settings. Simulation results revealed that each cell model is sensitive to vertical and horizontal movements, with a detection error of 6.75% against the teaching signals (expected output signals) used to train the MHSNN. The experimental evaluation showed that the MHSNN was not scalable because of the overall number of neurons and synapses, which led to the development of the HSMD.
The HSMD algorithm enhanced an existing Dynamic Background Subtraction (DBS) algorithm with a customised three-layer SNN, which stabilises the foreground information of moving objects in the scene and thereby improves object motion detection. The algorithm was compared against existing background subtraction approaches available in the Open Computer Vision (OpenCV) library, specifically on the 2012 Change Detection (CDnet2012) and 2014 Change Detection (CDnet2014) benchmark datasets. The accuracy results show that the HSMD ranked first overall and performed better than all the other benchmarked algorithms in four of the categories, across all eight test metrics. Furthermore, the HSMD is the first algorithm to use an SNN to enhance an existing dynamic background subtraction algorithm without substantial degradation of the frame rate, processing 720 × 480 images at 13.82 frames per second (fps) on CDnet2014 and 13.92 fps on CDnet2012 on a high-performance computer (96 cores and 756 GB of RAM). Although the HSMD achieves a good Percentage of Correct Classifications (PCC) on CDnet2012 and CDnet2014, the three-layer customised SNN was identified as the speed bottleneck, which could be removed using dedicated hardware.
The NeuroHSMD is thus an adaptation of the HSMD algorithm in which the SNN component is fully implemented on dedicated hardware [a Terasic DE10-Pro Field-Programmable Gate Array (FPGA) board]. Open Computing Language (OpenCL) was used to simplify the FPGA design flow and to allow code portability to other devices such as FPGAs and Graphics Processing Units (GPUs). The NeuroHSMD was also tested against the CDnet2012 and CDnet2014 datasets, achieving an acceleration of 82% over the HSMD algorithm and processing 720 × 480 images at 28.06 fps (CDnet2012) and 28.71 fps (CDnet2014).
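The dynamic background subtraction stage that the HSMD builds on can be illustrated with a minimal running-average model; this is a generic sketch under assumed parameters, not the thesis's DBS algorithm, and the SNN stage that stabilises the foreground is omitted.

```python
import numpy as np

# Minimal dynamic-background-subtraction sketch: keep a running average of
# the scene as the background model, flag pixels that differ from it by more
# than a threshold as foreground, and slowly absorb scene changes back into
# the model so the background stays "dynamic".

class RunningAverageBackground:
    def __init__(self, alpha=0.05, threshold=25.0):
        self.alpha = alpha          # adaptation rate of the background model
        self.threshold = threshold  # intensity difference marking foreground
        self.background = None

    def apply(self, frame):
        frame = frame.astype(np.float32)
        if self.background is None:
            self.background = frame.copy()
        mask = np.abs(frame - self.background) > self.threshold
        # update the model so slow scene changes become background
        self.background = (1 - self.alpha) * self.background + self.alpha * frame
        return mask

bg = RunningAverageBackground()
static = np.full((480, 720), 100, dtype=np.uint8)   # empty 720x480 scene
bg.apply(static)                                    # first frame initialises
moving = static.copy()
moving[200:240, 300:360] = 200                      # a bright object enters
mask = bg.apply(moving)
```

Only the pixels covered by the object are flagged; everything else stays background, which is the behaviour the SNN stage then stabilises over time.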
Advancing iris biometric technology
The iris biometric is a well-established technology which is already in use in
several nation-scale applications and it is still an active research area with several
unsolved problems. This work focuses on three key problems in iris biometrics,
namely segmentation, protection and cross-matching. A novel
method for each of these problems is proposed and analyzed thoroughly.
In terms of iris segmentation, a novel iris segmentation method is designed
based on a fusion of an expanding and a shrinking active contour by integrating
a new pressure force within the Gradient Vector Flow (GVF) active
contour model. In addition, a new method for closed eye detection is proposed.
The experimental results on the CASIA V4, MMU2, UBIRIS V1 and
UBIRIS V2 databases show that the proposed method achieves state-of-the-art
results in terms of segmentation accuracy and recognition performance
while being computationally more efficient. In this context, improvements
by 60.5%, 42% and 48.7% are achieved in segmentation accuracy for the
CASIA V4, MMU2 and UBIRIS V1 databases, respectively. For the UBIRIS
V2 database, a superior time reduction is reported (85.7%) while maintaining
a similar accuracy. Similarly, considerable time improvements by 63.8%,
56.6% and 29.3% are achieved for the CASIA V4, MMU2 and UBIRIS V1
databases, respectively.
With respect to iris biometric protection, a novel security architecture is designed
to protect the integrity of iris images and templates using watermarking
and Visual Cryptography (VC). Firstly, for protecting the iris image, text
which carries personal information is embedded in the middle band frequency
region of the iris image using a novel watermarking algorithm that randomly
interchanges multiple middle band pairs of the Discrete Cosine Transform
(DCT). Secondly, for iris template protection, VC is utilized to protect the
iris template. In addition, the integrity of the stored template in the biometric
smart card is guaranteed by using hash signatures. The proposed method
reduces iris recognition performance by only 3.6% and
4.9% for the CASIA V4 and UBIRIS V1 databases, respectively. In addition,
the VC scheme is designed to be readily applied to protect any biometric binary
template without any degradation to the recognition performance with a
complexity of only O(N).
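The middle-band DCT interchange idea described above can be sketched on a single coefficient pair; the pair positions, the margin and the single fixed pair are illustrative assumptions, whereas the thesis's algorithm randomly interchanges multiple mid-band pairs.

```python
import numpy as np

# Hedged sketch: embed one bit per 8x8 block by enforcing an order between
# two middle-band DCT coefficients, then invert the transform. A reader of
# the block recovers the bit by comparing the same two coefficients.

N = 8
k = np.arange(N)
D = np.sqrt(2.0 / N) * np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * N))
D[0, :] = np.sqrt(1.0 / N)            # orthonormal DCT-II basis matrix

P1, P2 = (4, 1), (3, 2)               # one mid-band coefficient pair (assumed)
MARGIN = 10.0                         # separation so the order is robust

def embed_bit(block, bit):
    coeffs = D @ block.astype(np.float64) @ D.T
    a, b = coeffs[P1], coeffs[P2]
    # bit 1 -> coefficient at P1 larger; bit 0 -> coefficient at P2 larger
    hi, lo = (P1, P2) if bit else (P2, P1)
    if coeffs[hi] < coeffs[lo] + MARGIN:
        coeffs[hi], coeffs[lo] = max(a, b) + MARGIN, min(a, b)
    return D.T @ coeffs @ D           # back to the pixel domain

def extract_bit(block):
    coeffs = D @ block.astype(np.float64) @ D.T
    return int(coeffs[P1] > coeffs[P2])

block = np.random.default_rng(1).integers(0, 256, (8, 8))
watermarked = embed_bit(block, 1)
```

Swapping coefficient magnitudes rather than adding a signal is what keeps the perceptual impact on the iris image small.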
As for cross-spectral matching, a framework is designed which is capable of
matching iris images in different lighting conditions. The first method is designed
to work with registered iris images where the key idea is to synthesize
the corresponding Near Infra-Red (NIR) images from the Visible Light (VL)
images using an Artificial Neural Network (ANN) while the second method
is capable of working with unregistered iris images based on integrating the
Gabor filter with different photometric normalization models and descriptors
along with decision level fusion to achieve the cross-spectral matching. A
significant improvement of 79.3% in cross-spectral matching performance is
attained for the UTIRIS database. For the PolyU database, the proposed
verification method achieved an improvement of 83.9% in NIR versus
Red channel matching, which confirms the efficiency of the proposed method.
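The decision-level fusion step mentioned above can be sketched as a simple vote across matchers; the scores, thresholds and majority rule are illustrative stand-ins for the actual Gabor-based descriptors and fusion rule.

```python
# Minimal decision-level fusion sketch: several matchers (e.g. Gabor features
# under different photometric normalisations) each cast an accept/reject vote
# on a probe, and the majority decides the final verification outcome.

def fuse_decisions(scores, thresholds):
    # each matcher accepts when its distance score falls below its threshold
    votes = [s < t for s, t in zip(scores, thresholds)]
    return sum(votes) > len(votes) / 2

# three matchers with illustrative distance scores and per-matcher thresholds
accept = fuse_decisions([0.28, 0.41, 0.33], [0.35, 0.38, 0.35])
```

Two of the three matchers accept here, so the fused decision is an accept even though one descriptor disagrees; that tolerance is the point of fusing at the decision level.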
In summary, the most important open issues in exploiting the iris biometric
are presented and novel methods to address these problems are proposed.
Hence, this work will help to establish a more robust iris recognition system
due to the development of an accurate segmentation method working for iris
images taken under both VL and NIR illumination. In addition, the proposed protection
scheme paves the way for secure storage of iris images and templates.
Moreover, the proposed framework for cross-spectral matching will help to
employ the iris biometric in several security applications such as surveillance
at-a-distance and automated watch-list identification.
Ministry of Higher Education and Scientific Research in Iraq
Improved Human Face Recognition by Introducing a New CNN Arrangement and Hierarchical Method
Human face recognition has become one of the most attractive topics in the field of biometrics due to its wide range of applications. The face is the part of the body that carries the most identifying information in human interactions. Features such as the composition of facial components, skin tone, the face's central axis, the distance between the eyes, and many more are used unconsciously by the brain, alongside other biometrics, to distinguish a person. Indeed, analyzing facial features may be the first method humans use to identify a person.
As one of the main biometric measures, human face recognition has been utilized in various commercial applications over the past two decades, from banking to smart advertisement and from border security to mobile applications. These examples show how far the methods have come. We can confidently say that face recognition techniques have reached an acceptable level of accuracy for some real-life applications, while other applications could still benefit from improvement. The increasing demand for the topic, together with the fact that almost all the necessary infrastructure is now available, makes face recognition an appealing research area.
When we are evaluating the quality of a face recognition method, there are some benchmarks that we should consider: accuracy, speed, and complexity are the main parameters. Of course, we can measure other aspects of the algorithm, such as size, precision, cost, etc. But eventually, every one of those parameters will contribute to improving one or some of these three concepts of the method. Then again, although we can see a significant level of accuracy in existing algorithms, there is still much room for improvement in speed and complexity. In addition, the accuracy of the mentioned methods highly depends on the properties of the face images. In other words, uncontrolled situations and variables like head pose, occlusion, lighting, image noise, etc., can affect the results dramatically.
Human face recognition systems are used for either identification or verification. In verification, the system's main goal is to check whether an input belongs to a pre-determined tag or a person's ID.
Almost every face recognition system consists of four major steps. These steps are pre-processing, face detection, feature extraction, and classification. Improvement in each of these steps will lead to the overall enhancement of the system. In this work, the main objective is to propose new, improved and enhanced methods in each of those mentioned steps, evaluate the results by comparing them with other existing techniques and investigate the outcome of the proposed system.
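The four-stage pipeline described above can be sketched end to end; every component below (the centre-crop "detector", histogram features, nearest-neighbour classifier and the synthetic identities) is a toy placeholder, not any of the proposed methods.

```python
import numpy as np

# Toy end-to-end sketch of the four stages: pre-processing, face detection,
# feature extraction and classification. Improving any single stage improves
# the pipeline as a whole, which is the structure the work exploits.

def preprocess(img):
    img = img.astype(np.float64)
    return (img - img.mean()) / (img.std() + 1e-8)   # illumination normalisation

def detect_face(img, size=32):
    h, w = img.shape                                 # stub detector: assume the
    top, left = (h - size) // 2, (w - size) // 2     # face is centred in frame
    return img[top:top + size, left:left + size]

def extract_features(face, bins=16):
    hist, _ = np.histogram(face, bins=bins, range=(-3, 3))
    return hist / hist.sum()

def classify(feat, gallery):
    # nearest neighbour over stored (label, feature) pairs
    return min(gallery, key=lambda lf: np.linalg.norm(feat - lf[1]))[0]

rng = np.random.default_rng(2)
imgs = {"alice": rng.normal(0.0, 1.0, (64, 64)),     # two synthetic "faces"
        "bob": rng.uniform(-1.0, 1.0, (64, 64))}     # with distinct statistics
gallery = [(n, extract_features(detect_face(preprocess(im))))
           for n, im in imgs.items()]
probe = imgs["alice"] + rng.normal(0.0, 0.1, (64, 64))   # noisy re-capture
label = classify(extract_features(detect_face(preprocess(probe))), gallery)
```

Despite the added noise, the probe is assigned to the correct identity, illustrating how the stages compose into a recognition decision.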
Applications of the perceptual sparse spikegram representation to copyright protection of audio signals
Every year, global music piracy causes billions of dollars in economic, job and
workers' earnings losses, as well as millions of dollars in lost tax revenues. Most
music piracy is due to the rapid growth and ease of current technologies for copying,
sharing, manipulating and distributing musical data [Domingo, 2015], [Siwek, 2007].
Audio watermarking has been
proposed as one approach for copyright protection and tamper localization of audio signals
to prevent music piracy. In this thesis, we use the spikegram, a bio-inspired sparse
representation, to design an audio tamper localization method, an audio copyright
protection method, and a new perceptual attack against audio watermarking systems.
First, we propose a tamper localization method for audio signals, based on a Modified
Spread Spectrum (MSS) approach. Perceptual Matching Pursuit (PMP) is used to compute
the spikegram (which is a sparse and time-shift invariant representation of audio signals) as
well as 2-D masking thresholds. Then, an authentication code (which includes an Identity
Number, ID) is inserted inside the sparse coefficients. For high quality watermarking, the
watermark data are multiplied with masking thresholds. The time domain watermarked
signal is re-synthesized from the modified coefficients and the signal is sent to the decoder.
To localize a tampered segment of the audio signal, at the decoder, the IDs associated
with intact segments are detected correctly, while the ID associated with a tampered
segment is mis-detected or not detected at all. To achieve high capacity, we propose a modified version of
the improved spread spectrum watermarking called MSS (Modified Spread Spectrum). We
performed a mean opinion test to measure the quality of the proposed watermarking system.
Also, the bit error rates for the presented tamper localization method are computed under
several attacks. In comparison to conventional methods, the proposed tamper localization
method has the smallest number of mis-detected tampered frames, when only one frame
is tampered. In addition, the mean opinion test experiments confirm that the proposed
method preserves the high quality of the input audio signals.
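The underlying spread-spectrum embedding and correlation detection can be sketched as follows; the host vector, strength `ALPHA`, key and bit count are illustrative assumptions, and the spikegram domain and masking thresholds are omitted, so this is the textbook scheme the MSS variant modifies, not the thesis method itself.

```python
import numpy as np

# Bare-bones spread spectrum: each bit is spread over the whole host with a
# keyed pseudo-random +/-1 chip sequence; the decoder regenerates the chips
# from the same key and recovers each bit from the sign of the correlation.

ALPHA = 0.2                                 # embedding strength (assumed)

def embed(host, bits, key=42):
    rng = np.random.default_rng(key)
    chips = rng.choice([-1.0, 1.0], size=(len(bits), host.size))
    watermark = sum((2 * b - 1) * c for b, c in zip(bits, chips))
    return host + ALPHA * watermark

def detect(signal, n_bits, key=42):
    rng = np.random.default_rng(key)
    chips = rng.choice([-1.0, 1.0], size=(n_bits, signal.size))
    # correlate with each chip sequence; the sign recovers the bit
    return [int(signal @ c > 0) for c in chips]

host = np.random.default_rng(0).normal(0.0, 1.0, 8192)   # stand-in coefficients
bits = [1, 0, 1, 1, 0, 0, 1, 0]                          # authentication code
watermarked = embed(host, bits)
recovered = detect(watermarked, len(bits))
```

In the tamper-localization setting, a segment whose correlations no longer yield the expected authentication code is flagged as falsified.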
Moreover, we introduce a new audio watermarking technique based on a kernel-based
representation of audio signals. A perceptive sparse representation (spikegram) is combined
with a dictionary of gammatone kernels to construct a robust representation of sounds.
Compared to traditional phase-embedding methods, where the phase of the signal's Fourier
coefficients is modified, in this method the watermark bit stream is inserted by modifying
the phase of gammatone kernels. Moreover, the watermark is automatically embedded only
into kernels with high amplitudes where all masked (non-meaningful) gammatones have
been already removed. Two embedding methods are proposed, one based on the watermark
embedding into the sign of gammatones (one dictionary method) and another one based
on watermark embedding into both sign and phase of gammatone kernels (two-dictionary
method). The robustness of the proposed method is shown against 32 kbps MP3 with
an embedding rate of 56.5 bps while the state of the art payload for 32 kbps MP3 robust
watermarking is lower than 50.3 bps. We also show that the proposed method is robust
against the unified speech and audio codec (24 kbps USAC, linear-predictive and Fourier
domain modes) with an average payload of 5–15 bps. Moreover, it is shown that the
proposed method is robust against a variety of signal processing transforms while preserving
quality.
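The one-dictionary sign-embedding idea can be sketched on generic sparse coefficients; selecting the highest-amplitude coefficients stands in for keeping only unmasked gammatone kernels, and the two-dictionary phase variant and the real gammatone dictionary are not shown.

```python
import numpy as np

# Sketch of the one-dictionary method: hide bits in the signs of the
# highest-amplitude sparse coefficients (the perceptually meaningful,
# unmasked kernels), leaving all magnitudes untouched.

def embed_in_signs(coeffs, bits):
    out = coeffs.copy()
    idx = np.argsort(np.abs(coeffs))[::-1][:len(bits)]   # strongest kernels
    for i, b in zip(idx, bits):
        out[i] = abs(out[i]) if b else -abs(out[i])      # sign carries the bit
    return out

def extract_from_signs(coeffs, n_bits):
    idx = np.argsort(np.abs(coeffs))[::-1][:n_bits]
    return [int(coeffs[i] > 0) for i in idx]

coeffs = np.random.default_rng(3).normal(0.0, 1.0, 256)  # stand-in spikegram
bits = [1, 1, 0, 1, 0]
marked = embed_in_signs(coeffs, bits)
```

Because only signs change, the coefficient magnitudes, and hence the ranking used to locate the watermark, are preserved, which is what makes blind extraction possible.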
Finally, three perceptual attacks are proposed in the perceptual sparse domain using the
spikegram: the PMP, inaudible-noise-adding and sparse-replacement attacks. In the PMP
attack, the host signals are represented and re-synthesized with the spikegram. In the
inaudible noise attack, inaudible noise is generated and added to the spikegram
coefficients. In the sparse replacement attack, each frame of the spikegram
representation is, when possible, replaced with a combination of similar frames located
in other parts of the spikegram. It is shown that the PMP and inaudible noise attacks
have roughly the same efficiency as the 32 kbps MP3 attack, while the replacement attack
reduces the normalized correlation of the spread spectrum decoder by a greater factor
than attacking with 32 kbps MP3 or 24 kbps Unified Speech and Audio Coding (USAC).