
    Human Face Recognition

    Face recognition, the main biometric used by human beings, has become increasingly popular over the last two decades. Automatic recognition of human faces has many commercial and security applications in identity validation and recognition, and has been one of the most active topics in image processing and pattern recognition since 1990. The availability of feasible technologies, as well as the increasing demand for reliable security systems in today’s world, has motivated many researchers to develop new methods for face recognition. In automatic face recognition we aim to either identify or verify one or more persons in still or video images of a scene by means of a stored database of faces. One important feature of face recognition is its non-intrusive, non-contact nature, which distinguishes it from biometrics such as iris or fingerprint recognition that require the subject’s participation. During the last two decades several face recognition algorithms and systems have been proposed and major advances have been achieved; as a result, the performance of face recognition systems under controlled conditions has reached a satisfactory level. These systems, however, face challenges in environments with variations in illumination, pose, expression, etc. The objective of this research is to design a reliable automated face recognition system that is robust to varying noise levels, illumination and occlusion. A new method for illumination-invariant feature extraction based on the illumination-reflectance model is proposed; it is computationally efficient and does not require any prior information about the face model or illumination. A weighted voting scheme is also proposed to enhance performance under illumination variations and to cancel occlusions.
The proposed method uses the mutual information and entropy of the images to generate different weights for a group of ensemble classifiers based on input image quality. The method yields outstanding results by reducing the effect of both illumination and occlusion variations in the input face images.
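The entropy-based weighting idea can be sketched in a few lines. The functions below and the toy three-classifier setup are illustrative assumptions, not the authors' implementation: higher-entropy (more informative) inputs simply get a larger say in the ensemble vote.

```python
import numpy as np

def image_entropy(img, bins=256):
    """Shannon entropy (bits) of a grayscale image's intensity histogram."""
    hist, _ = np.histogram(img, bins=bins, range=(0, 256))
    p = hist / hist.sum()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def weighted_vote(predictions, weights):
    """Combine class votes from several classifiers using per-classifier weights."""
    scores = {}
    for pred, w in zip(predictions, weights):
        scores[pred] = scores.get(pred, 0.0) + w
    return max(scores, key=scores.get)

# Toy example: three classifiers, each weighted by the entropy of the
# (hypothetical) sub-image it was given.
rng = np.random.default_rng(0)
sub_images = [rng.integers(0, 256, (8, 8)) for _ in range(3)]
weights = [image_entropy(im) for im in sub_images]
label = weighted_vote(["A", "B", "A"], weights)
```

A flat (zero-information) image gets entropy 0 and therefore no influence on the vote.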

    NIRFaceNet: A Convolutional Neural Network for Near-Infrared Face Identification

    Near-infrared (NIR) face recognition has attracted increasing attention because of its advantage of illumination invariance. However, traditional NIR-based face recognition methods are designed for, and tested in, cooperative-user applications. In this paper, we present a convolutional neural network (CNN) for NIR face recognition (specifically face identification) in non-cooperative-user applications. The proposed NIRFaceNet is modified from GoogLeNet but has a more compact structure designed specifically for the Chinese Academy of Sciences Institute of Automation (CASIA) NIR database, and can achieve higher identification rates with less training time and less processing time. The experimental results demonstrate that NIRFaceNet has an overall advantage over other methods in the NIR face recognition domain when image blur and noise are present, suggesting that NIRFaceNet may be more suitable for non-cooperative-user applications.
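NIRFaceNet itself is a compact GoogLeNet variant and is not reproduced here. As a generic illustration of the building block such networks stack, the sketch below implements the 2D "valid" cross-correlation that CNN convolution layers actually compute, followed by a ReLU activation; all names and values are illustrative.

```python
import numpy as np

def conv2d(image, kernel):
    """Naive 'valid' 2D cross-correlation, the core op of a CNN conv layer."""
    kh, kw = kernel.shape
    H, W = image.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(H - kh + 1):
        for j in range(W - kw + 1):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def relu(x):
    """Elementwise rectified linear activation."""
    return np.maximum(x, 0)

# A 3x3 averaging filter applied to a toy 5x5 intensity patch.
patch = np.arange(25, dtype=float).reshape(5, 5)
feature_map = relu(conv2d(patch, np.full((3, 3), 1.0 / 9.0)))
```

Real networks learn the kernel weights; here a fixed averaging kernel just shows the shape arithmetic (5x5 input, 3x3 kernel, 3x3 output).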

    Near-infrared Image Based Face Recognition

    Tez (Yüksek Lisans) -- İstanbul Teknik Üniversitesi, Fen Bilimleri Enstitüsü, 2012. Thesis (M.Sc.) -- İstanbul Technical University, Institute of Science and Technology, 2012. Humans have an innate ability to remember, recognize and distinguish faces, and scientists have long worked on systems with the same capability. Advances in face recognition research and commercial face recognition applications have progressed in parallel; yet the need for more accurate systems remains. Examples of applications in which face recognition is used include face-based video indexing and browsing engines, multimedia management, human-computer interaction, biometric identity authentication, and surveillance systems. There are two kinds of scenarios in face recognition, namely cooperative and uncooperative. Surveillance systems are a good example of uncooperative-user applications. Cooperative-user applications include access control, machine-readable travel documents, ATMs, computer login, e-commerce and e-government systems.
In cooperative-user scenarios, a user is required to provide his/her face in a proper position for the camera so that the face image is captured properly, in order to be granted access. In fact, many face recognition systems have been developed for such applications. Both intrinsic and extrinsic factors of the face affect recognition performance. Face recognition should be based on intrinsic factors only, such as the 3D shape and reflectance of the facial surface. Extrinsic factors include eyeglasses, hairstyle, expression, posture and environmental lighting; their influence should be minimized for reliable face recognition. A biometric system should adapt to the environment, not vice versa. Among the extrinsic factors, uncontrolled environmental lighting is the foremost problem. Lighting conditions, especially the light angle, change the appearance of a face so much that the differences computed between images of one person under different illumination are larger than those between images of two different people under the same illumination. The local filters studied so far are insufficient by themselves to overcome variations caused by changes in illumination direction; near-infrared imaging has therefore been proposed. Studies on imaging beyond the visible spectrum have been carried out recently. Thermal imaging, however, has many disadvantages alongside its advantages: environmental temperature, physical and emotional state, and alcohol intake can drastically affect the system’s success, and studies have shown that thermal systems do not perform better than visible-light systems. 3D imaging has also been tried, but its processing load, and conditions such as wearing glasses or having an open mouth during capture, can make it fail.
There are two principles for active lighting in near-infrared imaging: • The lights should be strong enough to produce a clear, frontally lit face image, but must not disturb human eyes. • The resulting face image should be affected by environmental lighting as little as possible. In this work, traditional face recognition methods such as PCA, LDA and LBP are first applied to NIR images for comparison with other methods. In the Eigenfaces approach, “eigenfaces” are constructed from the face images by means of PCA. The purpose of PCA is to reduce the large dimensionality of the data space to the smaller intrinsic dimensionality of the feature space. In the Fisherfaces approach, where LDA is applied after PCA, a projection direction is found so that images belonging to different classes (here, different identities) are maximally separated. In other words, the projection matrix that maximizes the ratio of the between-class scatter matrix to the within-class scatter matrix is found. Local image representations such as Gabor and LBP have also attracted great interest. For robust face recognition, dealing with the extrinsic properties of the face is an important issue. The LBP texture operator can handle variations caused by such properties, illumination among them, so it has become a popular approach in various applications. The LBP representation compensates for the degree of freedom of a monotonic transform of the gray tones, yielding an illumination-invariant representation of faces for indoor applications. The neighbors of each pixel are thresholded against the center pixel’s value and labeled 0 or 1, and the resulting bits are read as a binary number. The LBP operator was later extended to neighborhoods of different sizes and radii by bilinearly interpolating values at non-integer pixel coordinates. Another extension is uniform patterns.
A local binary pattern is called uniform if the circular binary pattern contains at most two bitwise transitions from 0 to 1 or vice versa. Uniform LBPs with neighborhood/radius parameters (8,1), (8,2) and (16,2) are computed. LBP+LDA is also used in this work: after the uniform LBP(8,1) representations of the images are obtained, they are downsampled because of memory limitations; LDA is then performed on the downsampled feature sets after PCA is applied to make the within-class scatter matrix nonsingular. Zernike moments are used to further improve recognition performance. Global Zernike moments are modified to obtain a local representation, analogous to LBP, called Local Zernike Moments (LZM). The moments are computed at each pixel over its neighborhood, and the resulting moment components capture the micro-structure around each pixel. A complex moment image, the same size as the original face image, is obtained for each moment component. Each moment image is then divided into non-overlapping subregions and phase-magnitude histograms are extracted from each subregion. Finally, the phase-magnitude histograms are concatenated to build the face representation. Since applying LDA to LBP improves recognition, LZM+LDA is also implemented for this study. The process of applying LDA to LZM is the same as in LBP+LDA: the phase-magnitude histograms are downsampled and PCA is applied before LDA; the LDA projections are then calculated and cosine distance is used for matching. It is found that the advantage of LZM+LDA over LZM is significant. The tests in this study are performed with the following methods: 1. PCA with Mahalanobis distance 2. LDA with cosine distance 3. LBP with chi-square distance (original uniform (8,1), (8,2) and (16,2)) 4. LBP+LDA with cosine distance 5. LZM with Manhattan distance 6.
LZM+LDA with cosine distance. Both identification and verification have been tested for these methods. In face identification, the system tries to determine who the person is; in face verification, the system checks whether a claimed identity is genuine. The CBSR NIR Face Dataset of the OTCBVS Benchmark Dataset Collection is used. The database contains 3,940 NIR face images of 197 people. The images were taken by an NIR camera with active NIR lighting; 18 NIR LEDs are mounted on the camera. It is found that LZM performs better than both the original and extended uniform LBP methods in verification and identification tests. Combining a method with LDA raises recognition accuracy further. In the identification step, however, the extended LBP operators are more successful than LDA itself, while in the verification step LDA outperforms all the LBP operators. PCA’s accuracy falls short of the other methods and is insufficient for reliable face recognition. Using NIR face images saves the system from heavy preprocessing before recognition. With the help of LZM on NIR images, fast and highly accurate face recognition systems can be built. Yet NIR imaging is not mature enough for uncooperative-user applications, and outdoor use may also fail where visible light, as in sunny weather, is dominant. Future work on NIR imaging systems may overcome these limitations.
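The basic LBP(8,1) operator and the uniformity test described above can be sketched in NumPy. At radius 1 no interpolation is needed; function and variable names are mine, not the thesis code.

```python
import numpy as np

def lbp_8_1(img):
    """Basic LBP with 8 neighbors at radius 1 (borders are dropped)."""
    img = np.asarray(img, dtype=float)
    c = img[1:-1, 1:-1]                       # center pixels
    # Neighbor offsets, clockwise from the top-left pixel.
    offs = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
            (1, 1), (1, 0), (1, -1), (0, -1)]
    code = np.zeros_like(c, dtype=np.uint8)
    for bit, (dy, dx) in enumerate(offs):
        nb = img[1 + dy:img.shape[0] - 1 + dy, 1 + dx:img.shape[1] - 1 + dx]
        # Threshold each neighbor against the center and set the bit.
        code |= (nb >= c).astype(np.uint8) << bit
    return code

def is_uniform(code):
    """Uniform = at most 2 transitions in the circular 8-bit pattern."""
    bits = [(code >> i) & 1 for i in range(8)]
    transitions = sum(bits[i] != bits[(i + 1) % 8] for i in range(8))
    return transitions <= 2
```

On a constant image every neighbor passes the threshold, so every code is 11111111 (a uniform pattern), while an alternating pattern like 01010101 has 8 transitions and is non-uniform.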

    Advanced facial recognition for digital forensics


    Integration of 2D Textural and 3D Geometric Features for Robust Facial Expression Recognition

    Recognition of facial expressions is critical for successful social interactions and relationships. Facial expressions transmit emotional information, which is critical for human-machine interaction; therefore, significant research in computer vision has been conducted, with promising findings on facial expression detection in both academia and industry. 3D images have gained enormous popularity owing to their ability to overcome some of the constraints inherent in 2D imagery, such as lighting and pose variation. In this article, we present a method for recognizing facial expressions by combining features extracted from 2D textured images and 3D geometric data using the Local Binary Pattern (LBP) and the 3D Voxel Histogram of Oriented Gradients (3DVHOG), respectively. We performed various pre-processing operations on the MDPA-FACE3D and Bosphorus datasets, then classified images into seven universal emotions, namely anger, disgust, fear, happiness, sadness, neutral, and surprise. Using a Support Vector Machine classifier, we achieved accuracies of 88.5% and 92.9% on the MDPA-FACE3D and Bosphorus datasets, respectively.
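The 2D/3D fusion step can be sketched generically: normalize each modality's descriptor so neither dominates, concatenate, and classify. The nearest-centroid classifier below is a stand-in for the paper's SVM, and all names are illustrative assumptions.

```python
import numpy as np

def fuse_features(f2d, f3d):
    """L2-normalize each modality's descriptor, then concatenate them."""
    f2d = np.asarray(f2d, dtype=float)
    f3d = np.asarray(f3d, dtype=float)
    f2d = f2d / (np.linalg.norm(f2d) + 1e-12)   # e.g. an LBP histogram
    f3d = f3d / (np.linalg.norm(f3d) + 1e-12)   # e.g. a 3DVHOG descriptor
    return np.concatenate([f2d, f3d])

def nearest_centroid(x, centroids, labels):
    """Assign x to the label of the closest class centroid."""
    dists = [np.linalg.norm(x - c) for c in centroids]
    return labels[int(np.argmin(dists))]
```

Normalizing per modality before concatenation keeps a high-dimensional descriptor from swamping a low-dimensional one, which is why this simple scheme is a common fusion baseline.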

    An Extensive Review on Spectral Imaging in Biometric Systems: Challenges and Advancements

    Spectral imaging has recently gained traction for face recognition in biometric systems. We investigate the merits of spectral imaging for face recognition and the current challenges that hamper the widespread deployment of spectral sensors for this purpose. The reliability of conventional face recognition systems operating in the visible range is compromised by illumination changes, pose variations and spoof attacks. Recent works have reaped the benefits of spectral imaging to counter these limitations in surveillance activities (defence, airport security checks, etc.). However, the implementation of this technology for biometrics is still in its infancy for multiple reasons. We present an overview of existing work in the domain of spectral imaging for face recognition, the different types of modalities and their assessment, the availability of public databases for the sake of reproducible research and algorithm evaluation, and recent advancements in the field, such as the use of deep learning-based methods for recognizing faces from spectral images.

    New human action recognition scheme with geometrical feature representation and invariant discretization for video surveillance

    Human action recognition is an active research area in computer vision because of its immense application in video surveillance, video retrieval, security systems, video indexing and human-computer interaction. Action recognition operates on the time-varying feature data generated by humans under different viewpoints, and aims to build a mapping between dynamic image information and semantic understanding. Although a great deal of progress has been made in recognizing human actions during the last two decades, relatively few approaches have been reported in the literature, and much research is still needed to address the ongoing challenges and develop more efficient approaches. Feature extraction is the main task in action recognition and represents the core of any action recognition procedure. It involves transforming the input data describing the shape of a segmented silhouette of a moving person into a set of features representing action poses. In video surveillance, global moment invariants based on Geometric Moment Invariants (GMI) are widely used in human action recognition. However, GMI has drawbacks, such as its lack of granular interpretation of the invariants relative to the shape; consequently, the representation of features has not been standardized. Hence, this study proposes a new human action recognition (HAR) scheme with geometric moment invariants for feature extraction and supervised invariant discretization for identifying the uniqueness of actions in video sequences. The proposed scheme is tested on the IXMAS video dataset, which contains non-rigid human poses resulting from drastic illumination changes, pose changes and erratic motion patterns. The invariance of the proposed scheme is validated by intra-class and inter-class analysis.
The proposed scheme yields better performance in action recognition than the conventional scheme, with an average accuracy of more than 99%, while preserving the shape of the human actions in video images.
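Geometric moment invariants of the kind such schemes build on can be illustrated with the first Hu invariant, phi1 = eta20 + eta02, which is unchanged under translation and rotation of a silhouette. This is a standalone sketch of the classical invariant, not the thesis' exact formulation.

```python
import numpy as np

def hu_first_invariant(img):
    """First Hu moment invariant phi1 = eta20 + eta02 of a grayscale/binary image."""
    img = np.asarray(img, dtype=float)
    ys, xs = np.mgrid[0:img.shape[0], 0:img.shape[1]]
    m00 = img.sum()                                  # zeroth raw moment (mass)
    xb = (xs * img).sum() / m00                      # centroid x
    yb = (ys * img).sum() / m00                      # centroid y
    mu20 = (((xs - xb) ** 2) * img).sum()            # central moments
    mu02 = (((ys - yb) ** 2) * img).sum()
    # Normalized central moments: eta_pq = mu_pq / m00^(1 + (p+q)/2).
    return (mu20 + mu02) / m00 ** 2
```

Rotating the silhouette by 90 degrees swaps mu20 and mu02, so their sum (and hence phi1) is exactly preserved, which is the invariance property the feature relies on.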

    Face Recognition: A Comparative Approach from Traditional to Recent Trends

    Face recognition, an important biometric method used extensively by researchers, has become more popular recently due to the development of mobile applications and the frequent use of facial images in social media. Major progress in facial recognition methods has come from the emergence of deep learning, and as a result the performance of face recognition systems has reached a mature state. The objective of this research is to improve the accuracy of both traditional and modern face recognition methods under illumination variation by applying various preprocessing techniques. In the proposed approach, preprocessing methods such as SQI, HE, LTISN, GIC and DoG are applied before Local Binary Pattern (LBP) feature extraction, and a weighted-entropy-based method is used to fuse the outputs of classifiers on the FERET database; we show that a recognition accuracy as high as 88.2% can be obtained after applying DoG. A deep CNN model is also evaluated as a more recent approach, with experiments conducted on the Extended Yale B and FERET databases; the model provides good accuracy rates. To improve accuracy further, the same preprocessing methods are applied to both models. As a result, higher accuracy is achieved with the deep CNN model on both databases: Extended Yale B yields the highest accuracy of 99.8% after applying SQI, and 99.7% after applying HE.
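Two of the listed preprocessing transforms, gamma intensity correction (GIC) and difference of Gaussians (DoG), can be written directly in NumPy. The parameter values below are illustrative, not the paper's settings.

```python
import numpy as np

def gamma_intensity_correction(img, gamma=0.5):
    """GIC: I' = 255 * (I/255)^gamma; gamma < 1 brightens shadowed regions."""
    img = np.asarray(img, dtype=float) / 255.0
    return 255.0 * np.power(img, gamma)

def difference_of_gaussians(img, s1=1.0, s2=2.0, radius=4):
    """DoG band-pass filter: blur with two Gaussians and subtract."""
    def gauss_kernel(sigma):
        x = np.arange(-radius, radius + 1, dtype=float)
        k = np.exp(-x ** 2 / (2.0 * sigma ** 2))
        return k / k.sum()
    def blur(im, sigma):
        k = gauss_kernel(sigma)
        # Separable convolution: filter rows, then columns ('same' size).
        im = np.apply_along_axis(lambda r: np.convolve(r, k, mode="same"), 1, im)
        return np.apply_along_axis(lambda c: np.convolve(c, k, mode="same"), 0, im)
    img = np.asarray(img, dtype=float)
    return blur(img, s1) - blur(img, s2)
```

DoG removes low-frequency illumination gradients: on a uniformly lit region the two blurs agree and the response is zero, while edges and texture survive.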

    Phenomenological modeling of image irradiance for non-Lambertian surfaces under natural illumination

    Various vision tasks are usually confronted by appearance variations due to changes of illumination. For instance, in a recognition system, it has been shown that the variability in human face appearance owes more to changes in lighting conditions than to the person's identity. Theoretically, due to the arbitrariness of the lighting function, the space of all possible images of a fixed-pose object under all possible illumination conditions is infinite dimensional. Nonetheless, it has been proven that the set of images of a convex Lambertian surface under distant illumination lies near a low-dimensional linear subspace, a result later extended to non-Lambertian objects with non-convex geometry. As such, vision applications concerned with recovering illumination, reflectance or surface geometry from images would benefit from a low-dimensional generative model which captures appearance variations w.r.t. illumination conditions and surface reflectance properties; this enables the formulation of such inverse problems as parameter estimation. Typically, subspace construction boils down to performing a dimensionality reduction scheme, e.g. Principal Component Analysis (PCA), on a large set of (real or synthesized) images of the object(s) of interest with fixed pose but different illumination conditions. However, this approach has two major problems. First, the acquired/rendered image ensemble should be statistically significant vis-a-vis capturing the full behavior of the sources of variation of interest, in particular illumination and reflectance. Second, the curse of dimensionality hinders numerical methods such as Singular Value Decomposition (SVD), which becomes intractable especially with a large number of large-sized realizations in the image ensemble.
One way to bypass the need of large image ensemble is to construct appearance subspaces using phenomenological models which capture appearance variations through mathematical abstraction of the reflection process. In particular, the harmonic expansion of the image irradiance equation can be used to derive an analytic subspace to represent images under fixed pose but different illumination conditions where the image irradiance equation has been formulated in a convolution framework. Due to their low-frequency nature, irradiance signals can be represented using low-order basis functions, where Spherical Harmonics (SH) has been extensively adopted. Typically, an ideal solution to the image irradiance (appearance) modeling problem should be able to incorporate complex illumination, cast shadows as well as realistic surface reflectance properties, while moving away from the simplifying assumptions of Lambertian reflectance and single-source distant illumination. By handling arbitrary complex illumination and non-Lambertian reflectance, the appearance model proposed in this dissertation moves the state of the art closer to the ideal solution. This work primarily addresses the geometrical compliance of the hemispherical basis for representing surface reflectance while presenting a compact, yet accurate representation for arbitrary materials. To maintain the plausibility of the resulting appearance, the proposed basis is constructed in a manner that satisfies the Helmholtz reciprocity property while avoiding high computational complexity. It is believed that having the illumination and surface reflectance represented in the spherical and hemispherical domains respectively, while complying with the physical properties of the surface reflectance would provide better approximation accuracy of image irradiance when compared to the representation in the spherical domain. 
Discounting subsurface scattering and surface emittance, this work proposes a surface reflectance basis, based on hemispherical harmonics (HSH), defined on the Cartesian product of the incoming and outgoing local hemispheres (i.e. w.r.t. surface points). This basis obeys physical properties of surface reflectance involving reciprocity and energy conservation. The basis functions are validated using analytical reflectance models as well as scattered reflectance measurements which might violate the Helmholtz reciprocity property (this can be filtered out through the process of projecting them on the subspace spanned by the proposed basis, where the reciprocity property is preserved in the least-squares sense). The image formation process of isotropic surfaces under arbitrary distant illumination is also formulated in the frequency space where the orthogonality relation between illumination and reflectance bases is encoded in what is termed as irradiance harmonics. Such harmonics decouple the effect of illumination and reflectance from the underlying pose and geometry. Further, a bilinear approach to analytically construct irradiance subspace is proposed in order to tackle the inherent problem of small-sample-size and curse of dimensionality. The process of finding the analytic subspace is posed as establishing a relation between its principal components and that of the irradiance harmonics basis functions. It is also shown how to incorporate prior information about natural illumination and real-world surface reflectance characteristics in order to capture the full behavior of complex illumination and non-Lambertian reflectance. The use of the presented theoretical framework to develop practical algorithms for shape recovery is further presented where the hitherto assumed Lambertian assumption is relaxed. 
With a single image under unknown general illumination, the underlying geometric structure can be recovered while accounting explicitly for object reflectance characteristics (e.g. human skin types for facial images and teeth reflectance for human jaw reconstruction) as well as complex illumination conditions. Experiments on synthetic and real images illustrate the robustness of the proposed appearance model vis-a-vis illumination variation. Keywords: computer vision, computer graphics, shading, illumination modeling, reflectance representation, image irradiance, frequency space representations, (hemi)spherical harmonics, analytic bilinear PCA, model-based bilinear PCA, 3D shape reconstruction, statistical shape from shading.
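The low-order harmonic machinery behind such irradiance models can be made concrete: the nine real spherical harmonics through order 2, evaluated at a unit direction, span the subspace known to capture most of the irradiance of a convex Lambertian surface. The constants follow the standard real-SH normalization; this is a sketch of the spherical basis only, not the dissertation's hemispherical reflectance basis.

```python
import numpy as np

def real_sh_order2(n):
    """First nine real spherical harmonics at unit direction n = (x, y, z)."""
    x, y, z = n
    c0 = 0.282095   # Y_0^0  = 1 / (2 sqrt(pi))
    c1 = 0.488603   # Y_1^m  = sqrt(3 / (4 pi))
    return np.array([
        c0,
        c1 * y, c1 * z, c1 * x,                 # order 1
        1.092548 * x * y,                       # order 2, m = -2
        1.092548 * y * z,                       # order 2, m = -1
        0.315392 * (3.0 * z * z - 1.0),         # order 2, m =  0
        1.092548 * x * z,                       # order 2, m =  1
        0.546274 * (x * x - y * y),             # order 2, m =  2
    ])
```

Projecting a lighting environment onto these nine functions gives the coefficients of the analytic irradiance subspace; at the north pole (0, 0, 1) only the zonal terms (m = 0) are nonzero, as expected by symmetry.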

    Design and Development of Robotic Part Assembly System under Vision Guidance

    Robots are widely used for part assembly across manufacturing industries to attain high productivity through automation, and automated mechanical part assembly contributes a major share of the production process. An appropriate vision-guided robotic assembly system further minimizes lead time and improves the quality of the end product through suitable object detection methods and robot control strategies. This work develops a robotic part assembly system with the aid of an industrial vision system, in three phases. The first phase focuses on feature extraction and object detection. A hybrid edge detection method is developed by combining a fuzzy inference rule with the wavelet transform. The performance of this edge detector is quantitatively analysed and compared with widely used edge detectors such as Canny, Sobel, Prewitt, Roberts, Laplacian of Gaussian, and mathematical-morphology- and wavelet-transform-based detectors. A comparative study is performed to choose a suitable corner detection method; the techniques considered are curvature scale space, Wang-Brady and the Harris method. The successful implementation of a vision-guided robotic system depends on the system configuration, such as eye-in-hand or eye-to-hand. In these configurations, the captured images of the parts may be corrupted by geometric transformations such as scaling, rotation and translation, and by blurring due to camera or robot motion. To address this, an image reconstruction method is proposed using orthogonal Zernike moment invariants; it selects the moment order used to reconstruct the affected image, which makes object detection more robust. In the second phase, the proposed system is developed by integrating the vision system and the robot system.
The proposed feature extraction and object detection methods are tested and found efficient for the purpose. In the third phase, robot navigation based on visual feedback is proposed. The control scheme uses general moment invariants, Legendre moments and Zernike moment invariants. The best combination of visual features is selected by measuring the Hamming distance between all possible combinations of visual features, which makes the image-based visual servoing control efficient. An indirect method is employed to determine the Legendre and Zernike moment invariants, as these moments are robust to noise. The control laws based on these three global image features navigate the robot efficiently in the desired environment.
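As a baseline for the edge-detector comparison mentioned above, the standard 3x3 Sobel operator can be written directly in NumPy; the hybrid fuzzy-wavelet detector itself is not reproduced here, and the naive loop is for clarity rather than speed.

```python
import numpy as np

SOBEL_X = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=float)
SOBEL_Y = SOBEL_X.T

def sobel_magnitude(img):
    """Gradient magnitude via the 3x3 Sobel kernels ('valid' region only)."""
    img = np.asarray(img, dtype=float)
    H, W = img.shape
    gx = np.zeros((H - 2, W - 2))
    gy = np.zeros_like(gx)
    for i in range(H - 2):
        for j in range(W - 2):
            patch = img[i:i + 3, j:j + 3]
            gx[i, j] = (patch * SOBEL_X).sum()   # horizontal gradient
            gy[i, j] = (patch * SOBEL_Y).sum()   # vertical gradient
    return np.hypot(gx, gy)
```

On a vertical step edge the response concentrates in the columns straddling the step and vanishes in flat regions, which is the behavior the quantitative comparison measures.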