    Tracking Skin-Colored Objects in Real-Time

    We present a methodology for tracking multiple skin-colored objects in a monocular image sequence. The proposed approach comprises a collection of techniques for the modeling, detection and temporal association of skin-colored objects across image sequences. A non-parametric model of skin color is employed. Skin-colored objects are detected with a Bayesian classifier that is bootstrapped with a small set of training data and refined through an off-line iterative training procedure. On-line adaptation of the skin-color probabilities allows the classifier to cope with considerable illumination changes. Tracking over time is achieved by a novel technique that can handle multiple objects simultaneously. Tracked objects may move along complex trajectories, occlude each other in the field of view of a possibly moving camera, and vary in number over time. A prototype implementation of the developed system operates on 320x240 live video in real time (28 Hz), running on a conventional Pentium IV processor. Representative experimental results from the application of this prototype to image sequences are also presented.
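
    As a rough illustration of the non-parametric Bayesian classifier described above, the following Python sketch estimates P(skin | color) from two color histograms. The RGB quantization, the bin count, the names SkinColorModel, quantize and blend, and the linear blending rule used for on-line adaptation are assumptions made for this example; the abstract does not specify them.

```python
from collections import Counter


def quantize(pixel, bins=8):
    """Map an (r, g, b) pixel with 0-255 channels to a coarse histogram bin."""
    step = 256 // bins
    return tuple(channel // step for channel in pixel)


class SkinColorModel:
    """Histogram-based (non-parametric) estimate of P(skin | color)."""

    def __init__(self, bins=8):
        self.bins = bins
        self.skin_counts = Counter()  # labeled skin pixels per color bin
        self.all_counts = Counter()   # all training pixels per color bin

    def train(self, pixels, labels):
        """Accumulate counts from pixels and their True/False skin labels."""
        for pixel, is_skin in zip(pixels, labels):
            key = quantize(pixel, self.bins)
            self.all_counts[key] += 1
            if is_skin:
                self.skin_counts[key] += 1

    def posterior(self, pixel):
        """Bayes rule: P(s|c) = P(c|s) P(s) / P(c).

        With histogram estimates this reduces to skin_count / total_count
        for the bin of the pixel. Unseen colors are given probability 0.
        """
        key = quantize(pixel, self.bins)
        total = self.all_counts[key]
        if total == 0:
            return 0.0
        return self.skin_counts[key] / total


def blend(offline_p, recent_p, gamma=0.8):
    """Assumed adaptation rule: mix the off-line probability with one
    estimated from recent frames; gamma weights the off-line prior."""
    return gamma * offline_p + (1.0 - gamma) * recent_p


model = SkinColorModel(bins=8)
model.train(
    [(200, 150, 130), (205, 155, 135), (200, 150, 130), (10, 10, 200)],
    [True, True, False, False],
)
print(model.posterior((201, 151, 131)))  # two of three pixels in this bin are skin
```

    A pixel would then be labeled as skin when its (possibly blended) posterior exceeds a chosen threshold; the threshold value is left open here.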

    Review on Image Guided Surgery Systems

    Modern imaging techniques provide high-quality 3D images that clearly show the anatomy, vascularity, pathology and active functions of tissues. The ability to register these preoperative images to each other, so as to provide comprehensive information, and then to register the image space to the patient space intraoperatively is the core of image-guided surgery (IGS) systems. Another main element of such a system is tracking the surgical tools intraoperatively and displaying their positions within the 3D image model. On some occasions an intraoperative image is acquired and registered to the preoperative images to ensure that the 3D model used to guide the operation describes the actual situation at the time of surgery. This survey reviews the history of IGS, discusses the components a modern system needs for reliable application, and describes applications in the medical specialties that have benefited from the use of IGS.

    KVM Switch Through Computer Vision

    Software known as a KVM switch allows two or more computers to be controlled with the same mouse and keyboard. The purpose of this study is to explore face detection and use it to determine which computer the user is attending to, switching keyboard and mouse input to that computer and thereby proposing an alternative form of Human-Computer Interaction. Face detection was performed with the Viola-Jones method using the OpenCV library, and the feature was incorporated into a KVM switch called Synergy.
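
    The switching decision described above can be sketched in Python as follows. This is a hypothetical illustration, not the authors' implementation: it assumes one camera per computer, an external frontal-face detector (for example a Viola-Jones cascade) that reports which cameras currently see a frontal face, and a debounce interval (hold_frames) to avoid switching on a single spurious detection. The class and parameter names are invented for the example, and no OpenCV or Synergy API is called.

```python
class AttentionSwitch:
    """Decide which computer should receive keyboard and mouse input."""

    def __init__(self, computers, hold_frames=5):
        self.active = computers[0]      # computer currently receiving input
        self.hold_frames = hold_frames  # frames a new target must persist
        self._candidate = None
        self._streak = 0

    def update(self, facing):
        """Process one frame.

        facing is the set of computer names whose camera sees a frontal
        face in this frame. Returns the computer that should have input.
        """
        if self.active in facing or len(facing) != 1:
            # Still attending to the active computer, or evidence is
            # missing or ambiguous: keep the current target.
            self._candidate, self._streak = None, 0
            return self.active

        (candidate,) = facing
        if candidate == self._candidate:
            self._streak += 1
        else:
            self._candidate, self._streak = candidate, 1

        if self._streak >= self.hold_frames:
            self.active = candidate
            self._candidate, self._streak = None, 0
        return self.active


switch = AttentionSwitch(["left", "right"], hold_frames=3)
for frame in [{"right"}, {"right"}, {"right"}]:
    target = switch.update(frame)
print(target)  # right
```

    Requiring several consecutive frames before switching trades a short delay for stability; the value of hold_frames would need tuning against the camera frame rate.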

    Development of a Portuguese Sign Language translator

    The project presented here won the “Passaporte para o Empreendedorismo” competition. Sign language is a means of communication among deaf people, and between deaf people and specialist practitioners, that does not require oral expression or communication. However, interaction between deaf people and hearing people or non-specialists is hampered by the lack of a common language. In particular, in everyday life, using services that involve more complex interaction may be impossible without a sign language interpreter. To overcome this communication barrier, an opportunity was identified to develop an application that, together with the Leap Motion, detects, records and recognizes the flow of movement of the gestures that underlie sign language. In this work, four different methodologies were studied with the aim of starting the development of the first personal translator for Portuguese Sign Language (Língua Gestual Portuguesa, LGP) using the recent and innovative Leap Motion technology. The work was developed on the Unity3D framework to allow for future multi-platform development, the development of a real-time graphical visualization interface, and the use of sound for translation. Good results were obtained for three of the four methodologies, namely LeapTrainerUI and the two applications for recognizing cardinal numbers up to 5 in LGP, which classified the numbers correctly. The last methodology, the one-gesture classifier (Classificador de 1 Gesto), which has a higher degree of complexity, gave the least satisfactory results. Although the results obtained are preliminary, this work opens up prospects for the development of a personal LGP interpreter.
Although this language is more complex than the simple detection of hand and finger movement, in the near future it should be possible for deaf people to carry this device with them and convey their intentions in spoken form to people who do not understand LGP. Supported by Fundação para a Ciência e a Tecnologia (FCT) and Ministério da Ciência e Educação (MCE), Portugal (PIDDAC), as the work was part of project PEst-OE/SAU/UI0645/201

    Towards an efficient, unsupervised and automatic face detection system for unconstrained environments

    There is growing interest in face detection applications for unconstrained environments. The increasing need for public and national security motivated our research on automatic face detection. For public security surveillance applications, a face detection system must be able to cope with unconstrained environments, which include cluttered backgrounds and complicated illumination. Supervised approaches give very good results in constrained environments, but in unconstrained environments even obtaining all the training samples needed is sometimes impractical. This limitation of supervised approaches leads us to turn to unsupervised approaches. In this thesis, we present an efficient, unsupervised face detection system that is feature and configuration based. It combines geometric feature detection and local appearance feature extraction to increase the stability and performance of the detection process. It also contains a novel adaptive lighting compensation approach to normalize the complicated illumination found in real-life environments. We aim to develop a system that makes as few assumptions as possible from the very beginning, is robust, and exploits accuracy/complexity trade-offs as much as possible. Although our attempt is ambitious for such an ill-posed problem, we manage to tackle it in the end with very few assumptions.

    Face recognition by means of advanced contributions in machine learning

    Face recognition (FR) has been extensively studied, owing both to its fundamental scientific challenges and to current and potential applications where human identification is needed. Among the most important benefits of FR systems are their non-intrusiveness, low equipment cost, and the fact that acquisition requires no user agreement. Nevertheless, despite the progress made in recent years and the different solutions proposed, FR performance is not yet satisfactory under more demanding conditions (different viewpoints, blocking effects, illumination changes, extreme lighting conditions, etc.). In particular, the effect of such uncontrolled lighting conditions on face images leads to one of the strongest distortions in facial appearance. This dissertation addresses the problem of FR under less constrained illumination. To approach the problem, a new multi-session and multi-spectral face database has been acquired in the visible, near-infrared (NIR) and thermal infrared (TIR) spectra, under different lighting conditions. A theoretical analysis using information theory has first been carried out to demonstrate the complementarity between the different spectral bands. The optimal exploitation of the information provided by the set of multispectral images has subsequently been addressed by using multimodal matching score fusion techniques that efficiently synthesize complementary meaningful information among the different spectra. Owing to peculiarities of thermal images, a specific face segmentation algorithm has been required and developed. In the final proposed system, the Discrete Cosine Transform was used as a dimensionality reduction tool and a fractional distance was used for matching, so that the cost in processing time and memory was significantly reduced. 
Prior to this classification task, a selection of the relevant frequency bands is proposed in order to optimize the overall system, based on identifying and maximizing independence relations by means of discriminability criteria. The system has been extensively evaluated on the multispectral face database acquired specifically for this purpose. In this regard, a new visualization procedure has been suggested in order to combine different bands, establish valid comparisons and give statistical information about the significance of the results. This experimental framework has made it easier to improve robustness against illumination mismatch between training and testing. Additionally, the focusing problem in the thermal spectrum has also been addressed, first for the more general case of thermal images (or thermograms), and then for the case of facial thermograms, from both a theoretical and a practical point of view. In order to analyze the quality of such facial thermograms degraded by blurring, an appropriate algorithm has been successfully developed. Experimental results strongly support the proposed multispectral facial image fusion, which achieves very high performance in several conditions. These results represent a new advance in providing robust matching across changes in illumination, and should further inspire highly accurate FR approaches in practical scenarios.
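
    The combination of a Discrete Cosine Transform for dimensionality reduction with a fractional distance for matching can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the system of the dissertation: it uses a 1-D, unnormalized DCT-II on short feature vectors instead of 2-D face images, keeps the first few coefficients rather than selecting bands by discriminability criteria, and takes "fractional distance" to mean a Minkowski-type distance with exponent 0 < p < 1. All function names and the gallery data are hypothetical.

```python
import math


def dct2(signal):
    """Unnormalized 1-D DCT-II: X[k] = sum_i x[i] * cos(pi * (i + 0.5) * k / N)."""
    n = len(signal)
    return [
        sum(x * math.cos(math.pi * (i + 0.5) * k / n) for i, x in enumerate(signal))
        for k in range(n)
    ]


def dct_features(signal, keep):
    """Reduce dimensionality by keeping only the first `keep` DCT coefficients."""
    return dct2(signal)[:keep]


def fractional_distance(a, b, p=0.5):
    """Minkowski-type distance with a fractional exponent (0 < p < 1)."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def nearest(probe, gallery, p=0.5):
    """Return the gallery identity whose feature vector is closest to the probe."""
    return min(gallery, key=lambda name: fractional_distance(probe, gallery[name], p))


# Hypothetical gallery of two identities, each described by 3 DCT coefficients.
gallery = {
    "subject_a": dct_features([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0], keep=3),
    "subject_b": dct_features([8.0, 7.0, 6.0, 5.0, 4.0, 3.0, 2.0, 1.0], keep=3),
}
probe = dct_features([1.1, 2.0, 2.9, 4.2, 5.0, 6.1, 6.9, 8.0], keep=3)
print(nearest(probe, gallery))  # subject_a
```

    Keeping only a few coefficients shrinks both the stored templates and the per-comparison cost, which is consistent with the reduction in processing time and memory reported above.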