43 research outputs found

    Intelligent system for interaction with virtual characters based on volumetric sensors

    Master's dissertation, Electrical and Electronic Engineering, Instituto Superior de Engenharia, Universidade do Algarve, 2015. Technology has been developed to help us complete our daily tasks or increase our productivity in them. Many of the machines we build have been progressively refined to operate more like a human being, using a wide variety of sensors to do so. One of the most challenging problems technology has faced is how to give a machine the ability an "animal" has to perceive the world through its visual system. One solution is to equip the machine with intelligent systems that use computer vision. Machine depth perception can be a great help here, making it less complex for the machine to detect and understand the objects in an image. With the arrival of volumetric (three-dimensional, 3D) sensors on the consumer market, development in this scientific area has accelerated, allowing such sensors to be integrated into most devices, such as computers or mobile devices, at a very competitive price. Volumetric sensors can be used in the most varied areas: although they first appeared in video games, they extend to video, 3D modelling, interfaces, games, and virtual and augmented reality. This dissertation focuses essentially on the development of (intelligent) systems based on volumetric sensors (in this case the Microsoft Kinect) for interaction with avatars or films. For video applications, a solution was developed in which a 3D sensor helps a user follow a narrative that starts as soon as the user is detected, with the events of the video changing according to predetermined user actions. The user can thus change the course of the story by changing position or performing a gesture. This solution is demonstrated using rear projection, and it can also be presented in hologram mode in a scaled approach. What is described in the previous paragraph can also be applied in a more commercial solution. To this end, a highly configurable application was developed that can be adjusted (visually) to the needs of different companies. The graphical environment is accompanied by an avatar or by a pre-recorded video that interacts with the user through gestures, with holography giving the experience a more realistic feel. While the user interacts with the installation, all of their movements and interactions are logged so that statistics can be compiled, in order to identify the most engaging content as well as the physical areas with the most interaction. Additionally, the user's full-body or ID-style photograph can be captured and offered back to them on the company's promotional products. Because of the short interaction range offered by a sensor of this type (Kinect), support for combining several sensors was also developed: 4 to cover the 180 degrees in front of the installation, or 8 to cover the 360 degrees around it, so that users can be detected by any of the sensors, are not lost when they cross into another sensor's zone, and are re-identified when they leave the sensors' field of view and return later. Although these sensors are best known for interaction with virtual games, real, physical games can also benefit from this type of sensor.
On this last point, an augmented reality tool for snooker or billiards is presented. In this application, a 3D sensor placed above the table captures the playing area, which is then processed to detect the balls, the cue, and the cushions. Whenever possible, this detection uses the third dimension (depth) offered by these sensors, which makes it more robust to changing lighting conditions, for example. From these data, the ball's trajectory is predicted using vector algebra, and the result is projected onto the table
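    The trajectory-prediction step lends itself to a compact illustration. Below is a minimal sketch, not the dissertation's code, of cushion-reflection prediction with basic vector algebra; the table dimensions, ball radius, and function names are illustrative assumptions, and ball-ball collisions and spin are ignored.

```python
import numpy as np

def _dist_to_cushion(p, d, lo, hi):
    """Distance along one axis to the nearest cushion (inf if not moving)."""
    if d > 0:
        return (hi - p) / d
    if d < 0:
        return (lo - p) / d
    return np.inf

def predict_trajectory(pos, direction, width, height, radius, max_bounces=3):
    """Predict a ball's path on an axis-aligned table with specular bounces.

    Returns the list of points where the path changes direction.
    """
    pos = np.asarray(pos, dtype=float)
    d = np.asarray(direction, dtype=float)
    d /= np.linalg.norm(d)
    points = [pos.copy()]
    for _ in range(max_bounces):
        # Travel distance until the ball centre reaches the nearest cushion.
        tx = _dist_to_cushion(pos[0], d[0], radius, width - radius)
        ty = _dist_to_cushion(pos[1], d[1], radius, height - radius)
        t = min(tx, ty)
        pos = pos + t * d
        points.append(pos.copy())
        if tx <= ty:            # hit a left/right cushion: flip x component
            d[0] = -d[0]
        else:                   # hit a top/bottom cushion: flip y component
            d[1] = -d[1]
    return points

# Ball at the centre of a 2 m x 1 m table, moving up and to the right.
for p in predict_trajectory([1.0, 0.5], [1.0, 0.4], 2.0, 1.0, 0.026):
    print(np.round(p, 3))
```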

    Strong converses for group testing in the finite blocklength regime


    State of the Art in Face Recognition

    Notwithstanding the tremendous effort devoted to the face recognition problem, it is not yet possible to design a face recognition system whose performance comes close to that of humans. New computer vision and pattern recognition approaches need to be investigated, and knowledge and perspectives from other fields, such as psychology and neuroscience, must be incorporated into face recognition research to produce a robust face recognition system. Indeed, much more work is required to arrive at a human-like face recognition system. This book is an effort to narrow the gap between the current state of face recognition research and that future state

    Single-pixel, single-photon three-dimensional imaging

    The 3D recovery of a scene is a crucial task with many real-life applications such as self-driving vehicles, X-ray tomography and virtual reality. The recent development of time-resolving detectors sensitive to single photons has enabled the recovery of 3D information at high frame rates with unprecedented capabilities. Combined with a timing system, single-photon-sensitive detectors allow 3D images to be recovered by measuring the Time-of-Flight (ToF) of the photons scattered back by the scene, with millimetre depth resolution. Current ToF 3D imaging techniques rely on scanning detection systems or multi-pixel sensors. Here, we discuss an approach that simplifies the hardware of current ToF 3D imaging techniques by using a single-pixel, single-photon-sensitive detector together with computational imaging algorithms. The 3D imaging approaches discussed in this thesis do not require mechanical moving parts, as standard Lidar systems do. The single-pixel detector reduces the pixel complexity to a single unit and offers several advantages in terms of size, flexibility, wavelength range and cost. The experimental results demonstrate 3D image recovery of hidden scenes with sub-second acquisition times, including real-time 3D recovery of non-line-of-sight scenes. We also introduce the concept of intelligent Lidar, a 3D imaging paradigm based solely on the temporal trace of the returning photons and a data-driven 3D retrieval algorithm
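    The core ToF relation is simple enough to state in a few lines: depth is d = c·t/2, where t is the photon round-trip time. Below is a minimal sketch of extracting depth from a single-pixel photon arrival-time histogram; the 10 ps bin width and the function name are assumptions for illustration, not details from the thesis.

```python
import numpy as np

C = 299_792_458.0      # speed of light, m/s
BIN_WIDTH = 10e-12     # timing-bin width of the histogram, 10 ps (assumed)

def depth_from_histogram(hist):
    """Estimate target depth from a photon arrival-time histogram.

    The round-trip time is taken from the histogram peak; depth = c * t / 2.
    A 10 ps bin then corresponds to ~1.5 mm of depth.
    """
    peak_bin = int(np.argmax(hist))       # bin with the most photon counts
    round_trip = peak_bin * BIN_WIDTH     # seconds
    return C * round_trip / 2.0           # metres

# Example: synthetic histogram with a return peak at bin 400 (~0.6 m away).
rng = np.random.default_rng(0)
hist = rng.poisson(0.2, size=1000)        # background/dark counts
hist[400] += 50                           # back-scattered signal photons
print(f"estimated depth: {depth_from_histogram(hist):.3f} m")
```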

    Robust and real-time hand detection and tracking in monocular video

    In recent years, personal computing devices such as laptops, tablets and smartphones have become ubiquitous. Moreover, intelligent sensors are being integrated into many consumer devices such as eyeglasses, wristwatches and smart televisions. With the advent of touchscreen technology, a new human-computer interaction (HCI) paradigm arose that allows users to interface with their device in an intuitive manner. Using simple gestures, such as swipe or pinch movements, a touchscreen can be used to directly interact with a virtual environment. Nevertheless, touchscreens still form a physical barrier between the virtual interface and the real world. An increasingly popular field of research that tries to overcome this limitation is video-based gesture recognition, hand detection and hand tracking. Gesture-based interaction allows the user to interact directly with the computer in a natural manner, exploring a virtual reality using nothing but his own body language.
    In this dissertation, we investigate how robust hand detection and tracking can be accomplished under real-time constraints. In the context of human-computer interaction, real-time is defined as both low latency and low complexity, such that a complete video frame can be processed before the next one becomes available. Furthermore, for practical applications, the algorithms should be robust to illumination changes, camera motion, and cluttered backgrounds in the scene. Finally, the system should be able to initialize automatically, and to detect and recover from tracking failure. We study a wide variety of existing algorithms, and propose significant improvements and novel methods to build a complete detection and tracking system that meets these requirements.
    Hand detection, hand tracking and hand segmentation are related yet technically different challenges. Whereas detection deals with finding an object in a static image, tracking considers temporal information and is used to track the position of an object over time, throughout a video sequence. Hand segmentation is the task of estimating the hand contour, thereby separating the object from its background. Detection of hands in individual video frames allows us to automatically initialize our tracking algorithm, and to detect and recover from tracking failure.
    Human hands are highly articulated objects, consisting of finger parts that are connected by joints. As a result, the appearance of a hand can vary greatly depending on the assumed hand pose. Traditional detection algorithms often assume that the appearance of the object of interest can be described using a rigid model and therefore cannot be used to robustly detect human hands. We therefore developed an algorithm that detects hands by exploiting their articulated nature. Instead of resorting to a template-based approach, we probabilistically model the spatial relations between the different hand parts and the centroid of the hand. Detecting hand parts, such as fingertips, is much easier than detecting a complete hand. Based on our model of the spatial configuration of hand parts, the detected parts can be used to obtain an estimate of the complete hand's position. To comply with the real-time constraints, we developed techniques to speed up the process by efficiently discarding unimportant information in the image. Experimental results show that our method is competitive with the state-of-the-art in object detection while reducing computational complexity by a factor of 1,000. Furthermore, we showed that our algorithm can also be used to detect other articulated objects, such as persons or animals, and is therefore not restricted to the task of hand detection.
    Once a hand has been detected, a tracking algorithm can be used to continuously track its position over time. We developed a probabilistic tracking method that can cope with uncertainty caused by image noise, incorrect detections, changing illumination, and camera motion. Furthermore, our tracking system automatically determines the number of hands in the scene, and can cope with hands entering or leaving the video canvas. We introduced several novel techniques that greatly increase tracking robustness and that can also be applied in domains other than hand tracking. To achieve real-time processing, we investigated several techniques to reduce the search space of the problem, and deliberately employ methods that are easily parallelized on modern hardware. Experimental results indicate that our methods outperform the state-of-the-art in hand tracking, while having a much lower computational complexity.
    One of the methods used by our probabilistic tracking algorithm is optical flow estimation. Optical flow is defined as a 2D vector field describing the apparent velocities of objects in a 3D scene, projected onto the image plane. Optical flow is known to be used by many insects and birds to visually track objects and to estimate their ego-motion. However, most optical flow estimation methods described in the literature are either too slow to be used in real-time applications, or are not robust to illumination changes and fast motion. We therefore developed an optical flow algorithm that can cope with large displacements and that is illumination independent. Furthermore, we introduce a regularization technique that ensures a smooth flow field. This regularization scheme effectively reduces the number of noisy and incorrect flow-vector estimates, while maintaining the ability to handle motion discontinuities caused by object boundaries in the scene.
    The above methods are combined into a hand tracking framework which can be used for interactive applications in unconstrained environments. To demonstrate the possibilities of gesture-based human-computer interaction, we developed a new type of computer display. This display is completely transparent, allowing multiple users to perform collaborative tasks while maintaining eye contact. Furthermore, our display produces an image that seems to float in thin air, such that users can touch the virtual image with their hands. This floating-image display has been showcased at several national and international events and tradeshows.
    The research described in this dissertation has been evaluated thoroughly by comparing detection and tracking results with those obtained by state-of-the-art algorithms. These comparisons show that the proposed methods outperform most algorithms in terms of accuracy, while achieving a much lower computational complexity, resulting in a real-time implementation. Results are discussed in depth at the end of each chapter. This research further resulted in an international journal publication; a second journal paper that has been submitted and is under review at the time of writing this dissertation; nine international conference publications; a national conference publication; a commercial license agreement concerning the research results; two hardware prototypes of a new type of computer display; and a software demonstrator
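    For readers unfamiliar with sparse optical flow, the sketch below uses OpenCV's off-the-shelf pyramidal Lucas-Kanade tracker as a stand-in; it is not the illumination-independent, regularised flow algorithm developed in the dissertation, and the webcam-based setup is only an assumption for a runnable demo.

```python
import cv2

# Illustration only: standard pyramidal Lucas-Kanade sparse flow from OpenCV.
cap = cv2.VideoCapture(0)
ok, prev = cap.read()
prev_gray = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)
# Seed points to track; a real system would seed these from the hand detector.
pts = cv2.goodFeaturesToTrack(prev_gray, maxCorners=100,
                              qualityLevel=0.3, minDistance=7)

while True:
    ok, frame = cap.read()
    if not ok or pts is None or len(pts) == 0:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    # Sparse flow: where did each seed point move between consecutive frames?
    new_pts, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, gray, pts, None)
    good = new_pts[status.ravel() == 1]
    for x, y in good.reshape(-1, 2):
        cv2.circle(frame, (int(x), int(y)), 3, (0, 255, 0), -1)
    cv2.imshow("tracked points", frame)
    if cv2.waitKey(1) == 27:       # Esc to quit
        break
    prev_gray, pts = gray, good.reshape(-1, 1, 2)

cap.release()
cv2.destroyAllWindows()
```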

    Optimized Biosignals Processing Algorithms for New Designs of Human Machine Interfaces on Parallel Ultra-Low Power Architectures

    The aim of this dissertation is to explore Human Machine Interfaces (HMIs) in a variety of biomedical scenarios. The research addresses typical challenges in wearable and implantable devices for diagnostic, monitoring, and prosthetic purposes, suggesting a methodology for tailoring such applications to cutting-edge embedded architectures. The main challenge is the enhancement of high-level applications, also introducing Machine Learning (ML) algorithms, using parallel programming and specialized hardware to improve performance. The majority of these algorithms are computationally intensive, posing significant challenges for deployment on embedded devices, which have several limitations in terms of memory size, maximum operating frequency, and battery life. The proposed solutions take advantage of a Parallel Ultra-Low Power (PULP) architecture, enhancing processing on specific target architectures and heavily optimizing execution by exploiting software and hardware resources. The thesis starts by describing a methodology that can be considered a guideline for efficiently implementing algorithms on embedded architectures. This is followed by several case studies in the biomedical field, starting with the analysis of a hand gesture recognition application based on the Hyperdimensional Computing algorithm, which allows fast on-chip re-training, and a comparison with the state-of-the-art Support Vector Machine (SVM); a Brain-Computer Interface (BCI) to detect the brain's response to a visual stimulus follows in the manuscript. Furthermore, a seizure detection application is also presented, exploring different solutions for the dimensionality reduction of the input signals. The last part is dedicated to an exploration of typical modules for the development of optimized ECG-based applications
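    To give a flavour of why Hyperdimensional Computing suits fast on-chip re-training, here is a minimal classification sketch under stated assumptions: the feature encoding that a real gesture pipeline needs is elided, and the training vectors below are random stand-ins rather than encoded EMG data.

```python
import numpy as np

D = 10_000                      # hypervector dimensionality
rng = np.random.default_rng(0)

def random_hv():
    """Random dense binary hypervector (stand-in for an encoded sample)."""
    return rng.integers(0, 2, D, dtype=np.uint8)

def bundle(hvs):
    """Bundling = element-wise majority vote over a set of hypervectors."""
    return (np.sum(hvs, axis=0) > len(hvs) / 2).astype(np.uint8)

def hamming_sim(a, b):
    """Normalised Hamming similarity in [0, 1]."""
    return 1.0 - np.count_nonzero(a ^ b) / D

# Train: each class prototype is a bundle of its encoded training samples.
# Re-training is just re-bundling, which is why it is cheap on-chip.
classes = {g: bundle([random_hv() for _ in range(20)])
           for g in ("fist", "open", "pinch")}

# Classify a query by nearest prototype (random here, so the result is
# arbitrary; with real encoded samples the correct class dominates).
query = random_hv()
print(max(classes, key=lambda g: hamming_sim(classes[g], query)))
```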

    A vector symbolic approach for cognitive services and decentralized workflows

    The proliferation of smart devices and sensors known as the Internet of Things (IoT), along with the transformation of mobile phones into powerful handheld computers and the continuing advancement of high-speed communication technologies, introduces new possibilities for collaborative distributed computing and collaborative workflows, along with a new set of problems to be solved. However, traditional service-based applications, in fixed networks, are typically constructed and managed centrally and assume stable service endpoints and adequate network connectivity. Constructing and maintaining such applications in dynamic, heterogeneous, wireless networked environments, where limited bandwidth and transient connectivity are commonplace, presents significant challenges and makes centralized application construction and management impossible. The key objective of this thesis can be summarised as follows: a means is required to discover and orchestrate sequences of micro-services, i.e., workflows, on demand, using currently available distributed resources (compute devices, functional services, data and sensors) in spite of a poor-quality (fragmented, low-bandwidth) network infrastructure and without central control. It is desirable to be able to compose such workflows on-the-fly in order to fulfil an ‘intent’. The research undertaken investigates how service definition, service matching and decentralised service composition and orchestration can be achieved without centralised control, using an approach based on a Binary Spatter Code Vector Symbolic Architecture (VSA), and shows that the approach offers significant advantages in environments where communication networks are unreliable. The outcomes demonstrate a new cognitive workflow model that uses one-to-many communications to enable intelligent cooperation between self-describing service entities that can self-organise to complete a workflow task. Workflow orchestration overhead was minimised using two innovations: a local arbitration mechanism that uses delayed responses to suppress responses that are not an ideal match, and the holographic nature of VSA descriptions, which enables messages to be truncated without loss of meaning. A new hierarchical VSA encoding scheme was created that is scalable to any number of vector embeddings, including workflow steps. The encoding can also facilitate learning, since it provides a unique context for each step in a workflow. The encoding also enables service pre-provisioning, because individual workflow steps can be decoded easily by any service receiving a multicast workflow vector. This thesis brings the state-of-the-art closer to the ability to discover distributed services on-the-fly to fulfil an intent, without the need for centralised management or the imperative definition of all service steps, including locations. The use of a mathematically deterministic distributed vector representation in the form of BSC vectors for both service objects and workflows provides a common language for all the elements required to discover and execute workflows in decentralised, transient environments, and opens up the possibility of employing learning algorithms that can advance the state-of-the-art in distributed workflows towards a true cognitive distributed network architecture
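    The Binary Spatter Code operations the thesis builds on are easy to sketch (after Kanerva's formulation): binding is element-wise XOR, bundling is a bit-wise majority vote, and matching uses Hamming similarity. The service roles and fillers below are illustrative assumptions, not the thesis's actual encoding scheme.

```python
import numpy as np

D = 8_192
rng = np.random.default_rng(1)
hv = lambda: rng.integers(0, 2, D, dtype=np.uint8)

def bind(a, b):
    """Role-filler binding = XOR; it is its own inverse."""
    return a ^ b

def bundle(*hvs):
    """Superposition = bit-wise majority vote over the bound pairs."""
    return (np.sum(hvs, axis=0) * 2 > len(hvs)).astype(np.uint8)

def sim(a, b):
    """Normalised Hamming similarity in [0, 1]."""
    return 1.0 - np.count_nonzero(a ^ b) / D

# Roles and fillers for a self-describing service record (illustrative).
NAME, INPUT, OUTPUT = hv(), hv(), hv()
resize, jpeg, png = hv(), hv(), hv()

# A service description is the superposition of its role-filler pairs.
service = bundle(bind(NAME, resize), bind(INPUT, jpeg), bind(OUTPUT, png))

# Unbinding with a role recovers a noisy filler; clean it up against a
# codebook of known vectors. This is also why truncated vectors still match:
# every bit carries a little of every bound pair (the holographic property).
noisy = bind(service, INPUT)
codebook = {"resize": resize, "jpeg": jpeg, "png": png}
print(max(codebook, key=lambda k: sim(codebook[k], noisy)))   # -> "jpeg"
```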

    Fast fluorescence lifetime imaging and sensing via deep learning

    Error on title page – year of award is 2023. Fluorescence lifetime imaging microscopy (FLIM) has become a valuable tool in diverse disciplines. This thesis presents deep learning (DL) approaches to addressing two major challenges in FLIM: slow and complex data analysis and the high photon budget for precisely quantifying fluorescence lifetimes. DL's ability to extract high-dimensional features from data has revolutionized optical and biomedical imaging analysis. This thesis contributes several novel DL FLIM algorithms that significantly expand FLIM's scope. Firstly, a hardware-friendly pixel-wise DL algorithm is proposed for fast FLIM data analysis. The algorithm has a simple architecture yet can effectively resolve multi-exponential decay models, and its calculation speed and accuracy significantly outperform conventional methods. Secondly, a DL algorithm is proposed to improve FLIM image spatial resolution, obtaining high-resolution (HR) fluorescence lifetime images from low-resolution (LR) images. A computational framework is developed to generate large-scale semi-synthetic FLIM datasets, addressing the lack of sufficient high-quality FLIM datasets. This algorithm offers a practical approach to obtaining HR FLIM images quickly for FLIM systems. Thirdly, a DL algorithm is developed to analyze FLIM images with only a few photons per pixel, named the Few-Photon Fluorescence Lifetime Imaging (FPFLI) algorithm. FPFLI uses spatial correlation and intensity information to robustly estimate fluorescence lifetime images, pushing the photon budget to a record-low level of only a few photons per pixel. Finally, a time-resolved flow cytometry (TRFC) system is developed by integrating an advanced CMOS single-photon avalanche diode (SPAD) array and a DL processor. The SPAD array, using a parallel light detection scheme, shows excellent photon-counting throughput. A quantized convolutional neural network (QCNN) algorithm is designed and implemented on a field-programmable gate array as an embedded processor. The processor resolves fluorescence lifetimes against disturbing noise, showing unparalleled accuracy, fast analysis speed, and low power consumption
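    To ground the lifetime-estimation problem, here is a classical baseline: a centre-of-mass (rapid lifetime determination) estimator on a time-correlated single-photon counting decay histogram. It is a sketch of the conventional approach the thesis's DL estimators are compared against, not the DL method itself; the bin width, window, and names are assumptions for illustration.

```python
import numpy as np

BIN = 50e-12    # TCSPC bin width: 50 ps (assumed)

def lifetime_center_of_mass(hist):
    """Centre-of-mass lifetime estimate from a decay histogram.

    For a mono-exponential decay sampled from t = 0 with negligible
    background and a window much longer than tau, the mean photon
    arrival time equals the lifetime tau.
    """
    t = (np.arange(len(hist)) + 0.5) * BIN    # bin centres, seconds
    return np.sum(t * hist) / np.sum(hist)

# Synthetic mono-exponential decay, tau = 2 ns, few photons per pixel.
rng = np.random.default_rng(0)
arrivals = rng.exponential(2e-9, size=30)     # 30 photon arrival times
hist, _ = np.histogram(arrivals, bins=256, range=(0, 256 * BIN))
print(f"estimated tau: {lifetime_center_of_mass(hist):.2e} s")
```

    With only 30 photons the estimate is noisy, which is exactly the regime where the FPFLI-style pooling of spatial and intensity information pays off.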

    NASA Tech Briefs, July/August 1988

    Topics: New Product Ideas; NASA TU Services; Electronic Components and Circuits; Electronic Systems; Physical Sciences; Materials; Computer Programs; Mechanics; Machinery; Fabrication Technology; Mathematics and Information Sciences; Life Sciences