
    Pixel classification methods for identifying and quantifying leaf surface injury from digital images

    Plants exposed to stress due to pollution, disease or nutrient deficiency often develop visible symptoms on leaves, such as spots, colour changes and necrotic regions. Early symptom detection is important for precision agriculture, environmental monitoring using bio-indicators and quality assessment of leafy vegetables. Leaf injury is usually assessed by visual inspection, which is labour-intensive and to a considerable extent subjective. In this study, methods for classifying individual pixels as healthy or injured were tested and compared on images of clover leaves exposed to the air pollutant ozone. RGB images of the leaves were acquired under controlled laboratory conditions using a standard digital SLR camera. Different feature vectors were extracted from the images by including different colour and texture (spatial) information. Four classification approaches were evaluated: (1) Fit to a Pattern Multivariate Image Analysis (FPM) combined with T2 statistics (FPM-T2), (2) FPM combined with Residual Sum of Squares statistics (FPM-RSS), (3) linear discriminant analysis (LDA) and (4) K-means clustering. The classifiers were trained on and compared against manually segmented images to evaluate classification performance. The LDA classifier outperformed the three other approaches in pixel identification, with significantly higher accuracy, precision, true positive rate and F-score, and significantly lower false positive rate and computation time. A feature vector of single-pixel colour channel intensities was sufficient for capturing the information relevant for pixel identification. Including neighbourhood pixel information in the feature vector did not improve performance, but significantly increased the computation time. The LDA classifier was robust, with 95% mean accuracy, 83% mean true positive rate and 2% mean false positive rate, indicating that it has potential for real-time applications.
    Opstad Kruse, OM.; Prats Montalbán, JM.; Indahl, UG.; Kvaal, K.; Ferrer Riquelme, AJ.; Futsaether, CM. (2014). Pixel classification methods for identifying and quantifying leaf surface injury from digital images. Computers and Electronics in Agriculture, 108:155-165. doi:10.1016/j.compag.2014.07.010
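The per-pixel LDA pipeline described above maps naturally onto standard tooling. Below is a minimal sketch, assuming scikit-learn, single-pixel RGB feature vectors (which the paper found sufficient), and manually segmented masks as ground truth; it illustrates the general technique, not the authors' implementation.

```python
# Sketch: per-pixel healthy/injured leaf classification with LDA.
# Assumptions: train_images are HxWx3 RGB arrays, train_masks are HxW
# arrays with 1 = injured, 0 = healthy (manual segmentation ground truth).
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

def train_pixel_lda(train_images, train_masks):
    # Stack every pixel of every training image into an N x 3 feature matrix
    X = np.concatenate([img.reshape(-1, 3) for img in train_images])
    y = np.concatenate([m.reshape(-1) for m in train_masks])
    lda = LinearDiscriminantAnalysis()
    lda.fit(X, y)
    return lda

def classify_leaf(lda, image):
    # Classify each pixel independently and restore the image layout
    h, w, _ = image.shape
    return lda.predict(image.reshape(-1, 3)).reshape(h, w)
```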

    Illumination tolerance in facial recognition

    In this research work, five different preprocessing techniques were combined with two different classifiers to find the preprocessor + classifier combination best suited to building an illumination-tolerant face recognition system. A face recognition system is proposed based on illumination normalization techniques and a linear subspace model, using two distance metrics on three challenging yet interesting databases: the CAS-PEAL database, the Extended Yale B database, and the AT&T database. The research takes the form of experimentation and analysis in which five illumination normalization techniques were compared and analyzed using two different distance metrics. The performance and execution times of the various techniques were recorded and measured for accuracy and efficiency. The illumination normalization techniques were Gamma Intensity Correction (GIC), Discrete Cosine Transform (DCT), Histogram Remapping using the Normal distribution (HRN), Histogram Remapping using the Log-normal distribution (HRL), and Anisotropic Smoothing (AS). The linear subspace models utilized were Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA). The two distance metrics were the Euclidean and cosine distances. The results showed that for databases with both illumination (shadow) and lighting (over-exposure) variations, such as the CAS-PEAL database, histogram remapping with the normal distribution produced excellent results when the cosine distance was used as the classifier, with a 65% recognition rate at 15.8 ms/img. Conversely, for databases with pure illumination variation, such as the Extended Yale B database, Gamma Intensity Correction (GIC) combined with the Euclidean distance metric gave the most accurate result, with 95.4% recognition accuracy at 1 ms/img. The experiments further showed that the cosine distance produces more accurate results than the Euclidean distance metric; however, the Euclidean distance was faster in all the experiments conducted.
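Of the normalization techniques listed, GIC is the simplest to sketch: it searches for the gamma that brings an input face closest to a canonically illuminated reference, then applies that gamma. The sketch below assumes images as float arrays in [0, 1]; the candidate-gamma grid and the sum-of-squared-differences criterion are illustrative choices, not taken from the thesis.

```python
# Sketch of Gamma Intensity Correction (GIC): pick the gamma whose
# transform best matches a canonically illuminated reference face.
import numpy as np

def gamma_intensity_correction(img, canonical,
                               gammas=np.linspace(0.2, 5.0, 100)):
    # img, canonical: float arrays in [0, 1] with identical shapes
    best_gamma, best_err = 1.0, np.inf
    for g in gammas:
        err = np.sum((img ** g - canonical) ** 2)  # SSD to the reference
        if err < best_err:
            best_gamma, best_err = g, err
    return img ** best_gamma, best_gamma
```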

    Data driven analysis of faces from images

    This thesis proposes three new data-driven approaches to detect, analyze, or modify faces in images. All presented contributions are inspired by the use of prior knowledge, and they derive information about facial appearance from pre-collected databases of images or 3D face models. First, we contribute an approach that extends a widely used monocular face detector with an additional classifier that evaluates the disparity maps of a passive stereo camera. The algorithm runs in real time and significantly reduces the number of false positives compared to the monocular approach. Next, with a many-core implementation of the detector, we train view-dependent face detectors based on tailored views which guarantee that the statistical variability is fully covered. These detectors are superior to the state of the art on a challenging dataset and can be trained in an automated procedure. Finally, we contribute a model describing the relation between facial appearance and makeup. The approach extracts makeup from before/after images of faces and allows faces in images to be modified. Applications such as machine-suggested makeup can improve perceived attractiveness, as shown in a perceptual study. In summary, the presented methods help improve the outcome of face detection algorithms, ease and automate their training procedures, and support the modification of faces in images. Moreover, their data-driven nature enables new and powerful applications arising from the use of prior knowledge and statistical analyses.

    Biometric fusion methods for adaptive face recognition in computer vision

    PhD thesis. Face recognition is a biometric method that uses different techniques to identify individuals based on facial information extracted from digital image data. Face recognition systems are widely used for security purposes but involve challenging problems, and this study proposes solutions to some of the most important of them. The aim of this thesis is to investigate face recognition across pose based on the image parameters of camera calibration. Three novel methods are derived to address the challenges of face recognition and to infer the camera parameters from images using a geometric approach based on perspective projection. The Camera Measurement Technique (CMT) and Face Quadtree Decomposition (FQD) are combined to develop the Face Camera Measurement Technique (FCMT) for human facial recognition, and a feature extraction and identity-matching algorithm is created. The success and efficacy of the proposed algorithms are analysed in terms of robustness to noise, accuracy of distance measurement, and face recognition. To recover the intrinsic and extrinsic camera calibration parameters, a novel technique is developed based on perspective projection, which uses different geometrical shapes to calibrate the camera. The resulting measurement technique, CMT, enables the system to infer real distances for regular and irregular objects from 2-D images. CMT feeds into FQD to measure the distances between facial points. Quadtree decomposition enhances the representation of edges and other singularities along curves of the face, and thus improves directional features for face detection across pose. The proposed FCMT system is a new combination of CMT and FQD that recognises faces across various poses. The theoretical foundation of the proposed solutions is thoroughly developed and discussed in detail. The results show that the proposed algorithms outperform existing algorithms in face recognition, with a 2.5% improvement in mean recognition error rate compared with recent studies.
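For context, recovering intrinsic and extrinsic parameters from perspective projection is most commonly done with a planar calibration target. The sketch below uses OpenCV's standard chessboard routine purely to illustrate that step; the thesis's CMT uses its own geometrical shapes and is not reproduced here.

```python
# Sketch: standard perspective-projection camera calibration with a
# chessboard target (a generic illustration, not the thesis's CMT).
import cv2
import numpy as np

def calibrate(images, board=(9, 6), square_mm=25.0):
    # images: list of grayscale chessboard views from the same camera
    objp = np.zeros((board[0] * board[1], 3), np.float32)
    objp[:, :2] = np.mgrid[0:board[0], 0:board[1]].T.reshape(-1, 2) * square_mm
    obj_pts, img_pts = [], []
    for gray in images:
        found, corners = cv2.findChessboardCorners(gray, board)
        if found:
            obj_pts.append(objp)
            img_pts.append(corners)
    # Returns the intrinsic matrix K, distortion coefficients, and the
    # per-view extrinsics (rotation and translation vectors)
    rms, K, dist, rvecs, tvecs = cv2.calibrateCamera(
        obj_pts, img_pts, images[0].shape[::-1], None, None)
    return K, dist, rvecs, tvecs
```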

    Robust surface modelling of visual hull from multiple silhouettes

    Reconstructing depth information from images is one of the most actively researched themes in computer vision, with applications spanning most vision research areas, from object recognition to realistic visualisation. Among other useful vision-based reconstruction techniques, this thesis extensively investigates the visual hull (VH) concept for volume approximation, and its robust surface modelling, when various views of an object are available. Assuming that multiple images are captured from a circular motion, projection matrices are generally parameterised in terms of a rotation angle from a reference position in order to facilitate multi-camera calibration. However, this assumption is often violated in practice: a pure rotation in a planar motion with an accurate rotation angle is hardly realisable. To address this problem, this thesis first proposes a calibration method for approximate circular motion. With the resulting modified projection matrices, the VH is represented by a hierarchical tree structure of voxels, from which surfaces are extracted by the Marching Cubes (MC) algorithm. These surfaces may, however, exhibit unexpected artefacts caused by coarse volume reconstruction, the topological ambiguity of the MC algorithm, and imperfect image processing or calibration results. To avoid this sensitivity, the thesis proposes a robust surface construction algorithm which first classifies local convex regions from imperfect MC vertices and then aggregates local surfaces constructed by a 3D convex hull algorithm. Furthermore, the thesis explores the use of wide-baseline images to refine a coarse VH using an affine-invariant region descriptor, which improves the quality of the VH when only a small number of initial views is available. In conclusion, the proposed methods yield 3D models with enhanced accuracy, and robust surface modelling is retained even when the silhouette images are degraded by practical noise.
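The volume step underlying any VH construction is silhouette-based carving: a voxel survives only if it projects inside every silhouette. A minimal sketch follows, assuming the projection matrices and silhouette masks are given; the thesis's hierarchical voxel tree and robust surface extraction are not reproduced.

```python
# Sketch of visual-hull voxel carving from silhouettes.
import numpy as np

def carve(voxels, silhouettes, proj_mats):
    # voxels: Nx3 world points; silhouettes: list of HxW boolean masks;
    # proj_mats: list of 3x4 projection matrices (one per view)
    keep = np.ones(len(voxels), dtype=bool)
    homog = np.hstack([voxels, np.ones((len(voxels), 1))])  # Nx4 homogeneous
    for P, sil in zip(proj_mats, silhouettes):
        uvw = homog @ P.T                          # project into the image
        u = (uvw[:, 0] / uvw[:, 2]).round().astype(int)
        v = (uvw[:, 1] / uvw[:, 2]).round().astype(int)
        h, w = sil.shape
        inside = (u >= 0) & (u < w) & (v >= 0) & (v < h)
        hit = np.zeros(len(voxels), dtype=bool)
        hit[inside] = sil[v[inside], u[inside]]    # falls on the silhouette?
        keep &= hit                                # must hold in every view
    return voxels[keep]
```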

    Shadow segmentation and tracking in real-world conditions

    Visual information, in the form of images and video, comes from the interaction of light with objects; illumination is thus a fundamental element of visual information. Detecting and interpreting illumination effects is part of our everyday visual experience. Shading, for instance, allows us to perceive the three-dimensional nature of objects, and shadows are particularly salient cues for inferring depth information; yet we make no conscious or unconscious effort to avoid them when we walk around, as if they were obstacles. Moreover, when humans are asked to describe a picture, they generally omit illumination effects such as shadows, shading and highlights, giving instead a list of objects and their relative positions in the scene. Processing visual information in a way that is close to what the human visual system does, and is thus aware of illumination effects, is a challenging task for computer vision systems. Illumination phenomena interfere with fundamental tasks in image analysis and interpretation, such as object extraction and description. On the other hand, illumination conditions are an important element to consider when creating new and richer visual content that combines objects from different sources, both natural and synthetic; when taken into account, illumination effects can play an important role in achieving realism. Among illumination effects, shadows are often an integral part of natural scenes and one of the elements contributing to the naturalness of synthetic scenes.
    In this thesis, the problem of extracting shadows from digital images is discussed, and a new analysis method for the segmentation of cast shadows in still and moving images without human supervision is proposed. The problem of separating moving cast shadows from moving objects in image sequences is relevant for an ever wider range of applications, from video analysis to video coding, and from video manipulation to interactive environments; particular attention has therefore been dedicated to the segmentation of shadows in video. The validity of the proposed approach is, however, also demonstrated through its application to the detection of cast shadows in still color images.
    Shadows are a difficult phenomenon to model: their appearance changes with the appearance of the surface they are cast upon. It is therefore important to exploit multiple constraints derived from the spectral, geometric and temporal properties of shadows to develop effective extraction techniques. The proposed method combines an analysis of color information and of photometric invariant features with a spatio-temporal verification process. With regard to the use of color information for shadow analysis, a complete picture of the existing solutions is provided, pointing out the fundamental assumptions, the adopted color models, and the links with research problems such as computational color constancy and color invariance. The proposed spatial verification makes no assumption about scene geometry or object shape. The temporal analysis is based on a novel shadow tracking technique; on the basis of the tracking results, a temporal reliability estimation of shadows is proposed which allows shadows that lack time coherence to be discarded. The proposed approach is general and can be applied to a wide class of applications and input data.
    The proposed cast shadow segmentation method has been evaluated on a variety of video data representing indoor and outdoor real-world environments. The results confirm the validity of the approach, in particular its ability to deal with different types of content and its robustness to different physically important independent variables, and demonstrate an improvement with respect to the state of the art. Applications of the proposed shadow segmentation tool to the enhancement of video object segmentation, tracking and description, and to video composition, demonstrate the advantages of shadow-aware video processing.
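The spectral constraint used by many cast-shadow detectors, in the spirit of the color analysis above, is that a shadow darkens the background while roughly preserving its chromaticity. A minimal sketch follows; the thresholds are illustrative, and the thesis's photometric invariants and spatio-temporal verification are not reproduced.

```python
# Sketch: spectral shadow-candidate test against a background model.
import numpy as np

def shadow_candidates(frame, background,
                      lum_lo=0.3, lum_hi=0.95, chroma_tol=0.03):
    # frame, background: HxWx3 float RGB in [0, 1]; returns a boolean mask
    eps = 1e-6
    lum_f = frame.sum(axis=2) + eps
    lum_b = background.sum(axis=2) + eps
    ratio = lum_f / lum_b                    # brightness attenuation
    darker = (ratio > lum_lo) & (ratio < lum_hi)
    chroma_f = frame / lum_f[..., None]      # normalised rgb (chromaticity)
    chroma_b = background / lum_b[..., None]
    same_color = np.abs(chroma_f - chroma_b).max(axis=2) < chroma_tol
    return darker & same_color               # darker but same color: shadow
```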

    Autonomous Target Recognition and Localization for Manipulator Sampling Tasks

    Future exploration missions will require autonomous robotic operations to minimize overhead on human operators. Autonomous manipulation in unknown environments requires target identification and tracking from initial discovery through grasp and stow sequences. Even with a supervisor in the loop, automating the target identification and localization processes significantly lowers operator workload and data throughput requirements. This thesis introduces the Autonomous Vision Application for Target Acquisition and Ranging (AVATAR), a software system capable of recognizing appropriate targets and determining their locations for manipulator retrieval tasks. AVATAR utilizes an RGB color filter to segment possible sampling or tracking targets, applies geometric matching constraints, and performs stereo triangulation to determine absolute 3-D target position. Neutral-buoyancy and 1-G tests verify AVATAR's capabilities over a diverse matrix of targets and visual environments as well as camera and manipulator configurations. AVATAR repeatably and reliably recognizes targets and provides real-time position data sufficiently accurate for autonomous sampling.
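The ranging step described above, triangulating a matched target centroid from two calibrated views, can be sketched with OpenCV's linear triangulation. The projection matrices and matched pixel coordinates are assumed to come from prior calibration and segmentation; this is an illustration of the general technique, not AVATAR's code.

```python
# Sketch: stereo triangulation of a segmented target centroid.
import cv2
import numpy as np

def target_position(P_left, P_right, pt_left, pt_right):
    # P_left, P_right: 3x4 camera projection matrices
    # pt_left, pt_right: (u, v) centroid pixel coordinates in each view
    pl = np.asarray(pt_left, dtype=float).reshape(2, 1)
    pr = np.asarray(pt_right, dtype=float).reshape(2, 1)
    X_h = cv2.triangulatePoints(P_left, P_right, pl, pr)  # 4x1 homogeneous
    return (X_h[:3] / X_h[3]).ravel()        # Euclidean 3-D position
```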

    Image Mosaicing and Super-resolution


    Automatic face recognition using stereo images

    Face recognition is an important pattern recognition problem in the study of both natural and artificial learning. Compared to other biometrics, it is non-intrusive, non-invasive and requires no participation from the subjects. As a result, it has many applications, from human-computer interaction to access control, and from law enforcement to crowd surveillance. In typical optical-image-based face recognition systems, the systematic variability arising from representing the three-dimensional (3D) shape of a face by a two-dimensional (2D) illumination intensity matrix is treated as random variability. Multiple examples of the face displaying varying pose and expressions are captured under different imaging conditions. The imaging environment, pose and expressions are strictly controlled, and the images undergo rigorous normalisation and pre-processing, implemented in a partially or fully automated system. Although these systems report high classification accuracies (>90%), they lack versatility and tend to fail when deployed outside laboratory conditions. Recently, more sophisticated 3D face recognition systems harnessing depth information have emerged. These systems usually employ specialist equipment such as laser scanners and structured light projectors. Although more accurate than 2D optical-image-based recognition, they are equally difficult to deploy in a non-co-operative environment. Existing face recognition systems, both 2D and 3D, detract from the main advantages of face recognition and fail to fully exploit its non-intrusive capacity, either because they rely too heavily on subject co-operation, which is not always available, or because they cannot cope with noisy data. The main objective of this work was to investigate the role of depth information in face recognition in a noisy environment. A stereo-based system, inspired by human binocular vision, was devised using a pair of manually calibrated off-the-shelf digital cameras in a stereo setup to compute depth information. Depth values extracted from 2D intensity images using stereoscopy are extremely noisy, which is why this approach to face recognition is rare; this was confirmed by the results of our experimental work. Noise in the set of correspondences, camera calibration and triangulation led to inaccurate depth reconstruction, which in turn led to poor classifier accuracy for both 3D surface matching and 2½D depth maps. Recognition experiments were performed on the Sheffield Dataset, consisting of 692 images of 22 individuals with varying pose, illumination and expressions.
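The noise issue the abstract highlights is easy to see in a standard depth-from-disparity pipeline: every bad correspondence propagates directly into the depth estimate Z = f·B/d. The sketch below uses OpenCV block matching on a rectified stereo pair; the matcher parameters are illustrative, not taken from the thesis.

```python
# Sketch: depth recovery from a calibrated, rectified stereo pair.
import cv2
import numpy as np

def depth_map(left_gray, right_gray, focal_px, baseline_m):
    # left_gray, right_gray: rectified 8-bit grayscale images
    stereo = cv2.StereoBM_create(numDisparities=64, blockSize=15)
    # StereoBM returns fixed-point disparities scaled by 16
    disp = stereo.compute(left_gray, right_gray).astype(np.float32) / 16.0
    disp[disp <= 0] = np.nan                 # mark unmatched pixels
    return focal_px * baseline_m / disp      # Z = f * B / d, in metres
```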