Face Analysis and Recognition in Mobile Devices
Face recognition is currently a very active research topic due to the great variety of applications it can offer. Moreover, nowadays it is very common for people to have mobile devices such as PDAs or mobile phones with an integrated digital camera, which creates the opportunity to develop face recognition applications for this type of device. This thesis treats the problem of face analysis and recognition on mobile devices. The problem is discussed and analyzed, observing which difficulties are encountered. Several face recognition algorithms are analyzed and optimized so that they are better suited to the constraints of mobile devices. An implementation of the algorithms in J2ME is presented and used to create a demonstration application that runs on commercial phones.

Villegas Santamaría, M. (2008). Face Analysis and Recognition in Mobile Devices. http://hdl.handle.net/10251/13013
Design and performance assessment of correlation filters for the detection of objects in high clutter thermal imagery
The research reported in this thesis has examined means of enhancing the performance of the Optimal Trade-off Maximum Average Correlation Height (OT-MACH) filter for target detection in Forward Looking Infra-Red (FLIR) imagery acquired from a helicopter and from a border security FLIR camera in northern Kuwait. The data acquired with these FLIR sensors allow real-world evaluation of the comparative performance of the various filters developed in the thesis. The results obtained have been quantified using well-known performance measures such as Peak to Side-lobe Ratio (PSR) and Total Detection Error (TDE). The initial focus was to study the effect of modifying the OT-MACH parameters on the correlation metrics. A new optimisation technique has been presented, which statistically computes the filter's alpha parameter, the parameter controlling the response of the filter to clutter noise. A further modification of the OT-MACH filter, using a Difference-of-Gaussian bandpass filter as a pre-processing stage (named the D-MACH filter), has then been described. The D-MACH has been applied to several test images containing single and multiple targets in the scene. Enhanced performance of the modified filter is demonstrated, with improved metrics being obtained and fewer false side peaks in the correlation plane, especially when multiple targets are present in the test images.
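The PSR metric mentioned above has a standard form: the correlation peak value minus the mean of a surrounding side-lobe region, divided by that region's standard deviation. A minimal sketch follows; the window and mask sizes are illustrative defaults, not the values used in the thesis.

```python
import numpy as np

def peak_to_sidelobe_ratio(plane, mask_radius=5, annulus=20):
    """Peak-to-Sidelobe Ratio (PSR) of a correlation plane.

    PSR = (peak - mean(sidelobe)) / std(sidelobe), where the sidelobe
    region is a window around the peak with a small central area
    (the peak itself) masked out.  Window sizes here are illustrative.
    """
    py, px = np.unravel_index(np.argmax(plane), plane.shape)
    peak = plane[py, px]

    # Side-lobe region: a square window around the peak, clipped to the plane.
    y0, y1 = max(0, py - annulus), min(plane.shape[0], py + annulus + 1)
    x0, x1 = max(0, px - annulus), min(plane.shape[1], px + annulus + 1)
    window = plane[y0:y1, x0:x1]

    # Mask out the central peak area, keep only side-lobe samples.
    yy, xx = np.mgrid[y0:y1, x0:x1]
    sidelobe = window[(np.abs(yy - py) > mask_radius) |
                      (np.abs(xx - px) > mask_radius)]
    return (peak - sidelobe.mean()) / sidelobe.std()
```

A sharp, isolated correlation peak over low background noise yields a large PSR, whereas a cluttered plane with strong side peaks drives the ratio down, which is why fewer false side peaks translate directly into improved PSR values.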
A further pre-processing technique was investigated, using the Rayleigh distribution as a pre-processing filter (named the R-MACH filter). The R-MACH filter has been applied to multiple target types, with tests conducted across various image data sets. The filter demonstrated an improvement over the Difference-of-Gaussian filter in terms of reducing the number of parameters needing to be tuned, whilst producing further enhanced correlation-plane metrics.
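One way to see the parameter reduction is that a radial filter shaped like the Rayleigh probability density, r/σ² · exp(−r²/2σ²), forms a bandpass with a single parameter σ setting both the passband centre and its width, whereas a Difference-of-Gaussian filter needs two sigmas. The sketch below illustrates this single-parameter idea only; the thesis's actual R-MACH construction may differ in detail.

```python
import numpy as np

def rayleigh_bandpass(shape, sigma):
    """Radial bandpass whose profile follows the Rayleigh PDF,
    (r / sigma^2) * exp(-r^2 / (2 sigma^2)), over normalised spatial
    frequency r.  One parameter (sigma) fixes both the passband centre
    and width.  Illustrative sketch, not the thesis's exact filter.
    """
    h, w = shape
    fy = np.fft.fftfreq(h)[:, None]
    fx = np.fft.fftfreq(w)[None, :]
    r = np.hypot(fy, fx)                      # radial frequency
    filt = (r / sigma**2) * np.exp(-r**2 / (2 * sigma**2))
    return filt / filt.max()                  # normalise peak response to 1

def prefilter(image, sigma=0.1):
    """Apply the Rayleigh bandpass in the frequency domain."""
    spectrum = np.fft.fft2(image)
    return np.real(np.fft.ifft2(spectrum * rayleigh_bandpass(image.shape, sigma)))
```

The profile is zero at DC, so uniform background (a dominant component of thermal clutter) is suppressed before correlation, while mid-frequency target structure near r ≈ σ is passed.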
Finally, recommendations for future work have been made to improve the use of the OT-MACH filter in target detection and identification. A novel training-image representation is proposed for further investigation, which will minimise the computational cost of using the MACH filter for unconstrained object recognition.
Robust and Efficient Activity Recognition from Videos
With technological advancement in embedded system design, powerful cameras have been embedded within smart phones, and wireless cameras can be easily deployed at street corners, traffic lights, big stadiums, train stations, etc. Moreover, the growth of online media, surveillance, and mobile cameras has resulted in an explosion of videos being uploaded to social media sites such as Facebook and YouTube. The availability of such a vast volume of videos has attracted the computer vision community to conduct much research on human activity recognition, since people are arguably the most interesting subjects of such videos. Automatic human activity recognition allows engineers and computer scientists to design smarter surveillance systems, semantically aware video indexes, and more natural human-computer interfaces. Despite the explosion of video data, the ability to automatically recognize and understand human activities is still rather limited. This is primarily due to multiple challenges inherent to the recognition task, namely large variability in human execution styles; the complexity of the visual stimuli in terms of camera motion, background clutter, viewpoint changes, etc.; and the number of activities that can be recognized. In addition, the ability to predict future actions of objects based on past observed video frames is very useful. Therefore, in this thesis, we explore four designs to solve the problems discussed above:
(1) A semantics-based deep learning model, namely SBGAR, is proposed to do group activity recognition. This model achieves higher accuracy and efficiency than existing group activity recognition methods.
(2) Despite its high accuracy, SBGAR has some limitations, namely (i) it requires a large dataset with caption information, and (ii) the activity recognition model is independent of the caption generation model, and hence SBGAR may not perform well in some cases. To remove these limitations, we design ReHAR, a robust and efficient human activity recognition scheme. ReHAR can be used to recognize both single-person activities and group activities.
(3) In many application scenarios, merely knowing what the moving agents are doing is not sufficient; predictions of the future trajectories of moving agents are also required. Thus, we propose GRIP, a graph-based interaction-aware motion intent prediction scheme. The scheme uses a graph to represent the relationships between objects, e.g., human joints or traffic agents, and predicts the motion intents of all observed objects simultaneously.
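To make the graph representation concrete: interacting agents in one frame can be encoded as an adjacency matrix, for instance by linking agents closer than some distance threshold. The sketch below shows this generic construction only; GRIP's actual graph definition and edge weighting may differ.

```python
import numpy as np

def proximity_graph(positions, radius):
    """Adjacency matrix linking agents closer than `radius`.

    `positions` is an (N, 2) array of agent coordinates in one frame.
    The distance-threshold rule is an illustrative stand-in for
    however the thesis defines edges between interacting agents.
    """
    diff = positions[:, None, :] - positions[None, :, :]   # pairwise offsets
    dist = np.linalg.norm(diff, axis=-1)                   # pairwise distances
    adj = (dist < radius).astype(float)
    np.fill_diagonal(adj, 0.0)                             # no self-loops
    return adj
```

A graph model can then propagate each agent's observed motion features along these edges, so that the predicted intent of one agent is conditioned on its neighbours, which is what makes joint prediction over all observed objects possible in a single pass.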
(4) Action recognition and trajectory prediction schemes are typically deployed on resource-constrained devices, so any technique that can accelerate their computation is important. Hence, we propose a novel deep learning model decomposition method called DAC that is capable of factorizing an ordinary convolutional layer into two layers with far fewer parameters. DAC computes the corresponding weights for the newly generated layers directly from the weights of the original convolutional layer. Thus, no training (or fine-tuning) and no data are needed.
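The data-free factorization idea can be illustrated with a generic low-rank decomposition: a truncated SVD splits one weight matrix into two smaller ones computed purely from the original weights, with no training data involved. This is a stand-in for the general technique; DAC's exact decomposition of convolutional kernels differs.

```python
import numpy as np

def factorize_weights(W, rank):
    """Split a weight matrix W (out x in) into two layers W2 @ W1 with
    far fewer parameters, computed directly from W via truncated SVD.
    No training data is needed.  A generic low-rank sketch of the
    idea, not DAC's actual kernel decomposition.
    """
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    W1 = np.diag(s[:rank]) @ Vt[:rank]   # (rank x in):  first new layer
    W2 = U[:, :rank]                     # (out x rank): second new layer
    return W1, W2

# e.g. a 256x256 layer at rank 32 costs 2 * 32 * 256 = 16384
# parameters instead of 256 * 256 = 65536.
```

Because both factors come straight from the SVD of the original weights, the decomposition is computed in one shot; if the original layer is close to low-rank, the two-layer replacement reproduces its outputs almost exactly while cutting parameters and multiply-accumulates.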