On-line Learning of Mutually Orthogonal Subspaces for Face Recognition by Image Sets
We address the problem of face recognition by matching image sets. Each set of face images is represented by a subspace (or linear manifold), and recognition is carried out by subspace-to-subspace matching. This paper makes three contributions. 1) We propose a new discriminative method that maximises orthogonality between subspaces; it improves the discrimination power of the subspace-angle-based face recognition method by maximising the angles between different classes. 2) We propose a method for updating the discriminative subspaces on-line as a mechanism for continuously improving recognition accuracy. 3) We present a further enhancement, the locally orthogonal subspace method, which maximises the orthogonality between competing classes. Experiments using 700 face image sets show that the proposed method outperforms relevant prior art and effectively boosts its accuracy by on-line learning. The on-line learning method delivers the same solution as the batch computation at far lower computational cost, and the locally orthogonal method exhibits improved accuracy. We also demonstrate the merit of the proposed face recognition method on portal scenarios of the Multiple Biometric Grand Challenge.
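The subspace-to-subspace matching mentioned in this abstract is commonly based on principal (canonical) angles. The following is a minimal sketch of that general idea, assuming NumPy is available; the function name `principal_angles` and the toy planes are illustrative and are not taken from the paper, and the sketch does not implement the authors' orthogonalisation or on-line update.

```python
import numpy as np

def principal_angles(A, B):
    """Principal angles (radians, ascending) between span(A) and span(B).

    A and B are (d, k) arrays whose columns span each subspace.
    """
    # Orthonormalise each basis with a reduced QR factorisation.
    Qa, _ = np.linalg.qr(A)
    Qb, _ = np.linalg.qr(B)
    # The singular values of Qa^T Qb are the cosines of the principal angles.
    cosines = np.linalg.svd(Qa.T @ Qb, compute_uv=False)
    # Clip to guard against round-off pushing a cosine slightly above 1.
    return np.arccos(np.clip(cosines, -1.0, 1.0))

# Hypothetical toy data: two planes in R^3 that share the x-axis
# and are perpendicular in the remaining direction.
xy_plane = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]])
xz_plane = np.array([[1.0, 0.0], [0.0, 0.0], [0.0, 1.0]])
angles = principal_angles(xy_plane, xz_plane)
```

Small angles indicate similar subspaces, so a method that pushes the angles between different classes towards 90 degrees makes the classes easier to separate.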
Wearable face recognition aid
The feasibility of realising a low-cost wearable face recognition aid based on a robust correlation algorithm is investigated. The aim of the study is to determine the limiting spatial and grey-level resolution of the probe and gallery images that would support successful prompting of the identity of input face images. Images with low spatial and grey-level resolution are obtained algorithmically from good-quality image data. The tests carried out on the XM2VTS database demonstrate that robust correlation is very resilient to degradation of spatial and grey-level image resolution. Correct prompts were generated in 98% of cases, even for severely degraded images.
Automatic annotation of tennis games: an integration of audio, vision, and learning
Fully automatic annotation of tennis games using broadcast video is a task with great potential but enormous challenges. In this paper we describe our approach to this task, which integrates computer vision, machine listening, and machine learning. At the low-level processing stage, we improve upon our previously proposed state-of-the-art tennis ball tracking algorithm and employ audio signal processing techniques to detect key events and construct features for classifying the events. At the high-level analysis stage, we model event classification as a sequence labelling problem and investigate four machine learning techniques using simulated event sequences. Finally, we evaluate our proposed approach on three real-world tennis games and discuss the interplay between audio, vision, and learning. To the best of our knowledge, our system is the only one that can annotate tennis games at such a detailed level.
Revealing the molecular signatures of host-pathogen interactions.
Advances in sequencing technology and genome-wide association studies are now revealing the complex interactions between hosts and pathogens through genomic variation signatures, which arise from evolutionary co-existence.
On the electronic properties of a single dislocation
A detailed knowledge of the electronic properties of individual dislocations is necessary for next-generation nanodevices. Dislocations are fundamental crystal defects that control the growth of different nanostructures (nanowires) or appear during device processing. We present a method to record the electric properties of single dislocations in thin silicon layers. Results of measurements on single screw dislocations are shown for the first time. Assuming a cross-sectional area of the dislocation core of about 1 nm², the current density through a single dislocation is J = 3.8 × 10¹² A/cm², corresponding to a resistivity of ρ ≅ 1 × 10⁻⁸ Ω cm. This is about eight orders of magnitude lower than that of the surrounding silicon matrix. The reason for the supermetallic behavior is the high strain in the cores of the dissociated dislocations, which modifies the local band structure and results in highly conductive carrier channels along the defect cores.
Kernel Discriminant Analysis Using Triangular Kernel for Semantic Scene Classification
Semantic scene classification is a challenging research problem that aims to categorise images into semantic classes such as beaches, sunsets or mountains. This problem can be formulated as a multi-label classification problem in which an image can belong to more than one conceptual class, such as sunsets and beaches, at the same time. Recently, Kernel Discriminant Analysis combined with spectral regression (SR-KDA) has been successfully used for face, text and spoken letter recognition. However, the SR-KDA method works only with positive definite symmetric matrices. In this paper, we modify this method to support both definite and indefinite symmetric matrices. The main idea is to use the LDL^T decomposition instead of the Cholesky decomposition. The modified SR-KDA is applied to a scene database involving 6 concepts. We validate the advocated approach and demonstrate that it yields significant performance gains when a conditionally positive definite triangular kernel is used instead of positive definite symmetric kernels such as linear, polynomial or RBF. The results also indicate performance gains when compared with state-of-the-art multi-label methods for semantic scene classification.
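To illustrate why replacing Cholesky with LDL^T admits indefinite symmetric matrices, here is a minimal pure-Python sketch of a generic LDL^T factorisation and solve. It is not the authors' SR-KDA code: the functions `ldlt` and `solve` and the 2×2 matrix `K` are hypothetical, and the sketch assumes no pivoting is needed (every leading principal minor is non-zero), whereas a robust implementation would pivot.

```python
def ldlt(A):
    """Factor a symmetric matrix A as L * diag(d) * L^T without pivoting.

    Assumes every leading principal minor of A is non-zero.
    """
    n = len(A)
    L = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    d = [0.0] * n
    for j in range(n):
        # Pivot d[j]; it may be negative, which Cholesky cannot handle.
        d[j] = A[j][j] - sum(L[j][k] ** 2 * d[k] for k in range(j))
        for i in range(j + 1, n):
            s = sum(L[i][k] * L[j][k] * d[k] for k in range(j))
            L[i][j] = (A[i][j] - s) / d[j]
    return L, d

def solve(A, b):
    """Solve A x = b for symmetric A using the LDL^T factors."""
    L, d = ldlt(A)
    n = len(b)
    y = [0.0] * n
    for i in range(n):  # forward substitution: L y = b
        y[i] = b[i] - sum(L[i][k] * y[k] for k in range(i))
    z = [y[i] / d[i] for i in range(n)]  # diagonal scaling: D z = y
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution: L^T x = z
        x[i] = z[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))
    return x

# Hypothetical symmetric indefinite matrix (one positive, one negative eigenvalue).
K = [[2.0, 1.0], [1.0, -3.0]]
L, d = ldlt(K)
x = solve(K, [3.0, -2.0])
```

The second pivot in `d` is negative for this matrix, so a Cholesky factorisation would require the square root of a negative number, while the LDL^T solve still goes through.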
Skin Lesion Analyser: An Efficient Seven-Way Multi-Class Skin Cancer Classification Using MobileNet
Skin cancer, a major form of cancer, is a critical public health problem with
123,000 newly diagnosed melanoma cases and between 2 and 3 million non-melanoma
cases worldwide each year. The leading cause of skin cancer is high exposure of
skin cells to UV radiation, which can damage the DNA inside skin cells leading
to uncontrolled growth of skin cells. Skin cancer is primarily diagnosed
visually employing clinical screening, a biopsy, dermoscopic analysis, and
histopathological examination. It has been demonstrated that the dermoscopic
analysis in the hands of inexperienced dermatologists may cause a reduction in
diagnostic accuracy. Early detection and screening of skin cancer have the
potential to reduce mortality and morbidity. Previous studies have shown that
deep learning can outperform human experts in several visual
recognition tasks. In this paper, we propose an efficient seven-way automated
multi-class skin cancer classification system having performance comparable
with expert dermatologists. We fine-tuned a pretrained MobileNet model on the
HAM10000 dataset using transfer learning. The model classifies skin lesion
images with a categorical accuracy of 83.1 percent, top-2 accuracy of 91.36
percent and top-3 accuracy of 95.34 percent. The weighted averages of precision,
recall, and F1-score were 0.89, 0.83, and 0.83, respectively. The
model has been deployed as a web application for public use at
(https://saketchaturvedi.github.io). This fast, expansible method holds the
potential for substantial clinical impact, including broadening the scope of
primary care practice and augmenting clinical decision-making for dermatology
specialists.
Comment: This is a pre-copyedited version of a contribution published in
Advances in Intelligent Systems and Computing, Hassanien A., Bhatnagar R.,
Darwish A. (eds) published by Chaturvedi S.S., Gupta K., Prasad P.S. The
definitive authenticated version is available online via
https://doi.org/10.1007/978-981-15-3383-9_1
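The metrics quoted in this abstract (top-k accuracy and weighted averages of precision, recall and F1) can be computed as in the sketch below. It assumes the usual support-weighted definition of the averages; the helper names, class labels and scores are hypothetical toy data, not results or code from the paper.

```python
from collections import Counter

def weighted_scores(y_true, y_pred):
    """Support-weighted precision, recall and F1 over all true classes."""
    support = Counter(y_true)
    predicted = Counter(y_pred)
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    total = len(y_true)
    precision = recall = f1 = 0.0
    for label, n in support.items():
        p = correct[label] / predicted[label] if predicted[label] else 0.0
        r = correct[label] / n
        f = 2 * p * r / (p + r) if p + r else 0.0
        weight = n / total  # each class counts in proportion to its support
        precision += weight * p
        recall += weight * r
        f1 += weight * f
    return precision, recall, f1

def top_k_accuracy(y_true, scores, k):
    """Fraction of samples whose true label is among the k highest scores."""
    hits = 0
    for truth, row in zip(y_true, scores):
        ranked = sorted(row, key=row.get, reverse=True)[:k]
        hits += truth in ranked
    return hits / len(y_true)

# Hypothetical toy data with three lesion classes.
y_true = ["nv", "nv", "nv", "mel"]
scores = [
    {"nv": 0.7, "mel": 0.2, "bkl": 0.1},
    {"nv": 0.6, "mel": 0.3, "bkl": 0.1},
    {"nv": 0.3, "mel": 0.5, "bkl": 0.2},
    {"nv": 0.2, "mel": 0.7, "bkl": 0.1},
]
y_pred = [max(row, key=row.get) for row in scores]
precision, recall, f1 = weighted_scores(y_true, y_pred)
top1 = top_k_accuracy(y_true, scores, 1)
top2 = top_k_accuracy(y_true, scores, 2)
```

Under this definition the support-weighted recall always equals the overall (top-1) accuracy, which is consistent with the abstract reporting 0.83 for recall and 83.1 percent for categorical accuracy.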
Audio-Visual Person Verification
In this paper we investigate the benefits of classifier combination for a multimodal system for personal identity verification. The system uses frontal face images and speech. We show that a sophisticated fusion strategy enables the system to outperform its facial and vocal modules when taken separately. We show that both trained linear weighted schemes and fusion by a Support Vector Machine classifier lead to a significant reduction of total error rates. The complete system is tested on data from a publicly available audio-visual database according to a published protocol.
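A trained linear weighted fusion scheme of the kind mentioned in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the scores, the fixed threshold of 0.5, the grid search over the weight, and all function names are hypothetical and do not reproduce the authors' training procedure or data.

```python
def total_error_rate(scores, labels, threshold):
    """False acceptance rate plus false rejection rate at a threshold."""
    impostors = [s for s, is_client in zip(scores, labels) if not is_client]
    clients = [s for s, is_client in zip(scores, labels) if is_client]
    far = sum(s >= threshold for s in impostors) / len(impostors)
    frr = sum(s < threshold for s in clients) / len(clients)
    return far + frr

def fuse(face, speech, w):
    """Linear weighted fusion: w * face score + (1 - w) * speech score."""
    return [w * f + (1.0 - w) * s for f, s in zip(face, speech)]

def train_weight(face, speech, labels, threshold=0.5, steps=10):
    """Pick the weight on a grid that minimises the total error rate."""
    candidates = [i / steps for i in range(steps + 1)]
    return min(
        candidates,
        key=lambda w: total_error_rate(fuse(face, speech, w), labels, threshold),
    )

# Hypothetical training scores: True marks a genuine client, False an impostor.
face = [0.9, 0.4, 0.6, 0.2]
speech = [0.3, 0.8, 0.2, 0.6]
labels = [True, True, False, False]

best_w = train_weight(face, speech, labels)
face_only = total_error_rate(face, labels, 0.5)
speech_only = total_error_rate(speech, labels, 0.5)
fused = total_error_rate(fuse(face, speech, best_w), labels, 0.5)
```

On this toy data each modality makes one false acceptance and one false rejection on its own, while the fused score separates clients from impostors, which mirrors the qualitative claim that fusion can outperform the individual modules.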