
30 research outputs found

Sparse shape registration for occluded facial feature localization

Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'

From images to augmented 3D models: improved visual SLAM and augmented point cloud modeling

Author: Zhang Guangcong
Publication venue: Georgia Institute of Technology
Publication date: 11/01/2017

This thesis investigates the problem of using monocular image sequences to generate augmented models. The problem is decomposed into two subproblems: monocular visual simultaneous localization and mapping (VSLAM), and point cloud data (PCD) modeling. Accordingly, the thesis comprises two major parts. The first part, comprising Chapters 2, 3 and 4, aims to leverage system observability theory to improve VSLAM accuracy. In Chapter 2, a piece-wise linear system is developed to model VSLAM, and two necessary conditions for complete observability of the VSLAM system are proved. Based on the first condition, an instantaneous condition for complete observability, the "Optimally Observable and Minimal Cardinality (OOMC) VSLAM" is presented in Chapter 3. The OOMC algorithm selects the feature subset of minimal required cardinality to form the most strongly observable VSLAM subsystem. The selected feature subset is further used to improve data association in VSLAM. Based on the second condition, a temporal condition for complete observability, the "Good Features (GF) to Track for VSLAM" is presented in Chapter 4. The GF algorithm ranks individual features according to their contributions to system observability. Benchmarking experiments on both the OOMC and GF algorithms demonstrate improvements in VSLAM performance. The second part, comprising Chapters 5 and 6, aims to solve the PCD modeling problem in a geometry-driven manner. Chapter 5 presents an algorithm to model PCDs with planar patches via a sparsity-inducing optimization. Chapter 6 extends PCD modeling to models based on quadratic surface primitives. A method is further developed to retrieve the high-level semantic information of the model components. Evaluation on PCDs generated from VSLAM demonstrates the effectiveness of these geometry-driven PCD modeling approaches. Ph.D.

Scholarly Materials And Research @ Georgia Tech

Nearest Neighbor Discriminant Analysis Based Face Recognition Using Ensembled Gabor Features

Author: Dolu Onur
Publication venue: 'National Institute of Informatics (NII)'
Publication date: 25/10/2016

Thesis (M.Sc.) -- İstanbul Technical University, Institute of Informatics, 2009. In recent years, Gabor-feature-based face representation has produced very promising results in face recognition because it is robust to variations in illumination and facial expression. Gabor features are effective because they capture local structure corresponding to spatial frequency (scale), spatial localization, and orientation selectivity, and because they require no manual annotation. This thesis proposes the Ensemble-based Gabor Nearest Neighbor Classifier (EGNNC), which extends the Gabor Nearest Neighbor Classifier (GNNC); GNNC extracts important discriminant features by combining the power of Gabor filters and Nearest Neighbor Discriminant Analysis (NNDA). EGNNC is an ensemble classifier that combines multiple NNDA-based component classifiers, each designed separately using a different segment of the reduced Gabor feature. Since the reduced dimension of the entire Gabor feature is extracted by a single NNDA component classifier, EGNNC makes better use of the discriminability contained in the reduced Gabor features, minimizing the loss of discriminative information and avoiding the 3S (small sample size) problem. The recognition performance of EGNNC is demonstrated through a comparative performance study. On a 200-class subset of the FERET database covering illumination and expression variations, EGNNC achieved a 100% recognition rate with 65 features, outperforming its predecessor GNNC (98%) as well as standard methods such as GFC (Gabor Fisher Classifier) and GPC. On the YALE database, EGNNC also outperformed GNNC for all (k, alpha) pairs and reached 96% accuracy with 14 features, using the parameters step size = 5, k = 5, alpha = 3. M.Sc.

Ulusal Üniversitelerarası Açık Erişim Sistemi - İstanbul Teknik Üniversitesi

The production and recognition of emotions in speech: features and algorithms

Author: Banse
Breazeal
Cahn
Druin
Ekman
Fujita
Goldberg
Halliday
Kirby
Murray
Murray
Oudeyer Pierre-Yves
Picard
Samal
Steels
Williams
Witten
Publication venue: 'Elsevier BV'

Crossref


DefED-Net: Deformable Encoder-Decoder Network for Liver and Liver Tumor Segmentation

Author: Lei T
Liu C
Nandi A
Wang R
Wang Y
Zhang Y
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 16/02/2021

Deep convolutional neural networks have been widely used for medical image segmentation due to their superiority in feature learning. Although these networks are successful for simple object segmentation tasks, they suffer from two problems for liver and liver tumor segmentation in CT images. One is that convolutional kernels of fixed geometrical structure are poorly matched to livers and liver tumors of irregular shapes. The other is that pooling and strided convolutional operations easily lead to the loss of spatial contextual information of images. To address these issues, we propose a deformable encoder-decoder network (DefED-Net) for liver and liver tumor segmentation. The proposed network makes two contributions. The first is that deformable convolution is used to enhance the feature representation capability of DefED-Net, which helps the network learn convolution kernels with adaptive spatial structuring information. The second is that we design a Ladder-atrous-spatial-pyramid-pooling module using multi-scale dilation rates (Ladder-ASPP) and apply the module to learn better context information than atrous spatial pyramid pooling (ASPP) for CT image segmentation. The proposed DefED-Net is evaluated on two public benchmark datasets, LiTS and 3DIRCADb. Experiments demonstrate that DefED-Net has a better feature representation capability and provides higher accuracy on liver and liver tumor segmentation than state-of-the-art networks. The code for the proposed DefED-Net is available at https://github.com/SUST-reynole/DefED-Net. Science and Technology Program of Shaanxi Province of China; National Natural Science Foundation of Chin
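
The "multi-scale dilation rate" idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the DefED-Net or Ladder-ASPP implementation; it is a plain 1D dilated (atrous) convolution, and the function name `dilated_conv1d`, the signal, and the kernel are hypothetical choices made only for illustration. It shows how a larger dilation rate widens the span of input samples each output sees without adding kernel weights.

```python
def dilated_conv1d(signal, kernel, dilation):
    """Valid 1D correlation whose kernel taps sit `dilation` samples apart.

    Illustrative only: a 1D stand-in for the 2D atrous convolutions
    used in ASPP-style modules.
    """
    # Number of input samples covered by one output sample.
    span = (len(kernel) - 1) * dilation + 1
    return [
        sum(w * signal[start + k * dilation] for k, w in enumerate(kernel))
        for start in range(len(signal) - span + 1)
    ]


# Hypothetical input: a linear ramp, filtered with a 3-tap difference kernel.
signal = [1, 2, 3, 4, 5, 6, 7, 8, 9]
kernel = [1, 0, -1]

# The same 3 weights applied at three dilation rates (three "scales").
outputs = {d: dilated_conv1d(signal, kernel, d) for d in (1, 2, 3)}
```

With dilation rates 1, 2 and 3, the same three weights cover 3, 5 and 7 input samples respectively, which is the sense in which parallel branches with different rates gather context at several scales.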

Brunel University Research Archive

Statistical Medial Model for Cardiac Segmentation and Morphometry

Author: Sun Hui
Publication venue: ScholarlyCommons
Publication date: 01/01/2010

In biomedical image analysis, shape information can be utilized for many purposes. For example, irregular shape features can help identify diseases; shape features can help match different instances of anatomical structures for statistical comparison; and prior knowledge of the mean and possible variation of an anatomical structure's shape can help segment a new example of this structure in noisy, low-contrast images. A good shape representation helps to improve the performance of the above techniques. The overall goal of the proposed research is to develop and evaluate methods for representing shapes of anatomical structures. The medial model is a shape representation method that models a 3D object by explicitly defining its skeleton (medial axis) and deriving the object's boundary via inverse-skeletonization. This model represents shape compactly, and naturally expresses descriptive global shape features like thickening, bending, and elongation. However, its application in biomedical image analysis has been limited, and it has not yet been applied to the heart, which has a complex shape. In this thesis, I focus on developing efficient methods to construct the medial model, and apply it to solve biomedical image analysis problems. I propose a new 3D medial model which can be efficiently applied to complex shapes. The proposed medial model closely approximates the medial geometry along medial edge curves and medial branching curves by soft-penalty optimization and local correction. I further develop a scheme to perform model-based segmentation using a statistical medial model which incorporates prior shape and appearance information. The proposed medial models are applied to a series of image analysis tasks. The 2D medial model is applied to the corpus callosum, which results in an improved alignment of the patterns of commissural connectivity compared to a volumetric registration method. The 3D medial model is used to describe the myocardium of the left and right ventricles, which provides detailed thickness maps characterizing different disease states. The model-based myocardium segmentation scheme is tested in a heterogeneous adult MRI dataset. Our segmentation experiments demonstrate that the statistical medial model can accurately segment the ventricular myocardium and provide useful parameters to characterize heart function.

ScholarlyCommons@Penn

Use of Coherent Point Drift in computer vision applications

Author: Sara Saravi
Publication date: 01/01/2013

This thesis presents the novel use of Coherent Point Drift (CPD) in improving the robustness of a number of computer vision applications. The CPD approach includes two methods for registering two images, rigid and non-rigid point set approaches, which differ in the transformation model used. The key characteristic of a rigid transformation is that the distance between points is preserved, which means it can be used in the presence of translation, rotation, and scaling. Non-rigid transformations - or affine transforms - provide the opportunity of registering under non-uniform scaling and skew. The idea is to move one point set coherently to align with the second point set. The CPD method finds both the non-rigid transformation and the correspondence distance between two point sets at the same time, without requiring an a priori declaration of the transformation model used. The first part of this thesis is focused on speaker identification in video conferencing. A real-time, audio-coupled, video-based approach is presented, which focuses more on the video analysis side than on the audio analysis, which is known to be prone to errors. CPD is effectively utilised for lip movement detection, and a temporal face detection approach is used to minimise false positives if the face detection algorithm fails to perform. The second part of the thesis is focused on multi-exposure and multi-focus image fusion with compensation for camera shake. Scale Invariant Feature Transforms (SIFT) are first used to detect keypoints in the images being fused. Subsequently this point set is reduced to remove outliers using RANSAC (RANdom Sample Consensus), and finally the point sets are registered using CPD with non-rigid transformations. The registered images are then fused with a Contourlet-based image fusion algorithm that makes use of a novel alpha blending and filtering technique to minimise artefacts. The thesis evaluates the performance of the algorithm in comparison to a number of state-of-the-art approaches, including the key commercial products available in the market at present, showing significantly improved subjective quality in the fused images. The final part of the thesis presents a novel approach to Vehicle Make & Model Recognition (VMMR) in CCTV video footage. CPD is used to effectively remove skew of detected vehicles, as CCTV cameras are not specifically configured for the VMMR task and may capture vehicles at different approaching angles. A LESH (Local Energy Shape Histogram) feature-based approach is used for vehicle make and model recognition, with the novelty that temporal processing is used to improve reliability. A number of further algorithms are used to maximise the reliability of the final outcome. Experimental results are provided to show that the proposed system achieves an accuracy in excess of 95% when tested on real CCTV footage with no prior camera calibration.
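
The distance-preserving property that the abstract attributes to rigid transformations can be checked numerically. The sketch below is not the thesis's CPD implementation and performs no registration; it only applies a rotation plus a translation to a hypothetical 2D point set and compares pairwise distances before and after. The names `rigid_transform` and `pairwise_distances`, the point coordinates, and the transform parameters are all assumptions made for illustration.

```python
import math


def rigid_transform(points, angle, tx, ty):
    """Rotate each 2D point by `angle` radians about the origin, then translate by (tx, ty)."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y + tx, s * x + c * y + ty) for x, y in points]


def pairwise_distances(points):
    """Euclidean distance for every unordered pair of points, in a fixed order."""
    return [math.dist(p, q) for i, p in enumerate(points) for q in points[i + 1:]]


# Hypothetical point set and transform parameters.
points = [(0.0, 0.0), (1.0, 0.0), (1.0, 2.0), (-0.5, 1.5)]
moved = rigid_transform(points, angle=math.radians(30), tx=2.0, ty=-1.0)

before = pairwise_distances(points)
after = pairwise_distances(moved)

# True when every pairwise distance is unchanged up to floating-point error.
distances_preserved = all(
    math.isclose(a, b, abs_tol=1e-9) for a, b in zip(before, after)
)
```

The sketch covers rotation and translation only; a transform that also rescales or skews the points would change the pairwise distances and fail this check.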

Loughborough University Institutional Repository

A Comprehensive Survey on Enterprise Financial Risk Analysis: Problems, Methods, Spotlights and Applications

Author: Du Huaming
Zhao Yu
Publication date: 27/11/2022

Enterprise financial risk analysis aims at predicting enterprises' future financial risk. Due to its wide application, enterprise financial risk analysis has always been a core research issue in finance. Although there are already some valuable and impressive surveys on risk management, these surveys introduce approaches in a relatively isolated way and lack the recent advances in enterprise financial risk analysis. Due to the rapid expansion of enterprise financial risk analysis, especially from the computer science and big data perspective, it is both necessary and challenging to comprehensively review the relevant studies. This survey attempts to connect and systematize the existing enterprise financial risk research, as well as to summarize and interpret the mechanisms and strategies of enterprise financial risk analysis in a comprehensive way, which may help readers better understand the current research status and ideas. This paper provides a systematic literature review of over 300 articles published on enterprise risk analysis modelling over a 50-year period, 1968 to 2022. We first introduce the formal definition of enterprise risk as well as the related concepts. Then, we categorize the representative works in terms of risk type and summarize the three aspects of risk analysis. Finally, we compare the analysis methods used to model enterprise financial risk. Our goal is to clarify current cutting-edge research and its possible future directions for modelling enterprise risk, aiming to fully understand the mechanisms of enterprise risk communication and influence and its application to corporate governance, financial institutions and government regulation.

arXiv.org e-Print Archive