92 research outputs found

    High Speed Human Action Recognition using a Photonic Reservoir Computer

    The recognition of human actions in videos is one of the most active research fields in computer vision. The canonical approach consists of more or less complex preprocessing stages applied to the raw video data, followed by a relatively simple classification algorithm. Here we address recognition of human actions using the reservoir computing algorithm, which allows us to focus on the classifier stage. We introduce a new training method for the reservoir computer, based on "Timesteps Of Interest", which combines short and long time scales in a simple way. We study the performance of this algorithm on the well-known KTH dataset, using both numerical simulations and a photonic implementation based on a single non-linear node and a delay line. We solve the task with high accuracy and speed, to the point of allowing multiple video streams to be processed in real time. The present work is thus an important step towards developing efficient dedicated hardware for video processing.
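As a rough illustration of the reservoir computing idea in the abstract above, the sketch below drives a fixed random recurrent network and trains only a linear readout by ridge regression. It is a generic software echo state network under assumed settings, not the photonic delay-line implementation and not the "Timesteps Of Interest" training method; the sizes, the leak rate, the toy recall task, and all function names are hypothetical.

```python
import numpy as np

def run_reservoir(inputs, w_in, w_res, leak=0.3):
    """Drive a fixed random reservoir with an input sequence; return all states."""
    state = np.zeros(w_res.shape[0])
    states = []
    for u in inputs:
        pre_activation = w_in @ u + w_res @ state
        # Leaky update: blend the previous state with the new non-linear response.
        state = (1 - leak) * state + leak * np.tanh(pre_activation)
        states.append(state.copy())
    return np.array(states)

def train_readout(states, targets, ridge=1e-3):
    """Ridge regression: the linear readout is the only trained part."""
    n = states.shape[1]
    return np.linalg.solve(states.T @ states + ridge * np.eye(n), states.T @ targets)

rng = np.random.default_rng(0)
n_in, n_res, n_steps = 3, 50, 200  # assumed toy sizes

# Fixed random weights; the reservoir itself is never trained.
w_in = rng.uniform(-0.5, 0.5, (n_res, n_in))
w_res = rng.normal(size=(n_res, n_res))
w_res *= 0.9 / np.max(np.abs(np.linalg.eigvals(w_res)))  # scale spectral radius to 0.9

# Toy task (an assumption): recall the first input channel from one step earlier.
inputs = rng.normal(size=(n_steps, n_in))
targets = np.roll(inputs[:, :1], 1, axis=0)
targets[0] = 0.0

states = run_reservoir(inputs, w_in, w_res)
w_out = train_readout(states, targets)
predictions = states @ w_out
```

Because only `w_out` is fitted, training reduces to a single linear solve, which is one reason the classifier stage can be kept simple and fast.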

    Analysis of 3D human gait reconstructed with a depth camera and mirrors

    Gait assessment is a key component of healthcare and has received considerable attention in the literature. Marker-based and multi-camera systems are widely employed for this analysis. However, such systems usually require specific, expensive equipment and/or intensive computation. To reduce the cost of these devices, we focus on a gait analysis system that employs only one depth sensor. The principle of our work is similar to that of multi-camera systems, but the set of cameras is replaced by one depth sensor and mirrors. Each mirror in our setup plays the role of a camera that captures the scene from a different viewpoint. Since we use only one camera, the synchronization step can be avoided and the cost of the equipment is reduced. Our thesis is divided into two parts: 3D reconstruction and gait analysis. The result of the first part is used as the input of the second. Our system for 3D reconstruction is built with a depth camera and two mirrors. Two types of depth sensor, distinguished by their depth estimation mechanism, have been employed in our work. With the structured light (SL) technique integrated into the Kinect 1, we perform the 3D reconstruction based on geometrical optics. To increase the level of detail of the reconstructed 3D model, the Kinect 2, which estimates depth by time of flight (ToF), is then used for image acquisition. However, due to multiple reflections on the mirrors, depth distortion occurs in our setup. We thus propose a simple approach for reducing this distortion before applying geometrical optics to reconstruct a point cloud of the 3D object.
For gait analysis, we propose several alternative approaches focusing on measuring gait normality and symmetry. They are expected to be useful in clinical treatment, for example to monitor a patient's recovery after surgery. These methods include model-free and model-based approaches, which have different advantages and drawbacks. In this dissertation, we present three methods that directly process the point clouds reconstructed in the first part. The first uses cross-correlation of the left and right half-bodies to assess gait symmetry, while the other two employ deep auto-encoders to measure gait normality.
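The cross-correlation idea used for gait symmetry can be sketched on one-dimensional signals. The example below is an illustration only: the thesis works on reconstructed point clouds, whereas here each half-body is reduced to a hypothetical per-frame scalar signal, and the sinusoidal data, the 50-frame gait period, and the function name are assumptions.

```python
import numpy as np

def normalized_cross_correlation(left, right):
    """Return (lags, correlation) for two 1D signals, scaled so a perfect match at lag 0 gives 1."""
    left = np.asarray(left, dtype=float)
    right = np.asarray(right, dtype=float)
    left_n = (left - left.mean()) / (left.std() * len(left))
    right_n = (right - right.mean()) / right.std()
    corr = np.correlate(left_n, right_n, mode="full")
    # In "full" mode the first output sample corresponds to lag -(len(right) - 1).
    lags = np.arange(-(len(right) - 1), len(left))
    return lags, corr

# Hypothetical signals: one value per frame for each half-body.
frames = np.arange(200)
period = 50  # assumed gait cycle length in frames
left_signal = np.sin(2 * np.pi * frames / period)
# Toy assumption: in a symmetric gait the right side repeats the left half a cycle later.
right_signal = np.sin(2 * np.pi * (frames - period // 2) / period)

lags, corr = normalized_cross_correlation(left_signal, right_signal)
best = np.argmax(corr)
symmetry_score = corr[best]      # closer to 1 means the two sides are more similar
step_offset = abs(lags[best])    # offset (in frames) at which the sides match best
```

In this toy setting a high peak at roughly half the gait period suggests symmetry, while a low peak or a shifted offset would hint at asymmetry. The peak stays below 1 here because fewer frames overlap at non-zero lags.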

    Indoor positioning with deep learning for mobile IoT systems

    2022 Summer. Includes bibliographical references. The development of human-centric services with mobile devices in the era of the Internet of Things (IoT) has opened the possibility of merging indoor positioning technologies with various mobile applications to deliver stable and responsive indoor navigation and localization functionalities that can enhance user experience within increasingly complex indoor environments. But as GPS signals cannot easily penetrate modern building structures, it is challenging to build reliable indoor positioning systems (IPS). Currently, Wi-Fi sensing based indoor localization techniques are gaining in popularity as a means to build accurate IPS, benefiting from the prevalence of the 802.11 family. Wi-Fi fingerprinting based indoor localization has shown remarkable performance over geometric mapping in complex indoor environments by taking advantage of pattern matching techniques. Today, the two main types of information extracted from Wi-Fi signals to form fingerprints are Received Signal Strength Index (RSSI) and Channel State Information (CSI) with Orthogonal Frequency-Division Multiplexing (OFDM) modulation. The former yields an average localization error of around or under 10 meters but has low hardware and software requirements, while the latter is more likely to estimate locations with ultra-low distance errors but demands more resources from chipsets, firmware/software environments, etc. This thesis makes two novel contributions towards realizing viable IPS on mobile devices using RSSI and CSI information, and deep machine learning based fingerprinting. Because complex indoor environments produce larger quantities of data and more sophisticated signal patterns for fingerprints, conventional machine learning algorithms that need carefully engineered features struggle to identify features in very high-dimensional data.
Hence, the ability of the approximation functions generated by conventional machine learning models to estimate locations is limited. Deep machine learning based approaches can overcome these challenges to realize scalable feature pattern matching approaches such as fingerprinting. However, deep machine learning models generally require a considerable memory footprint, and this creates a significant issue on resource-constrained devices such as mobile IoT devices, wearables, smartphones, etc. Developing efficient deep learning models is a critical factor in lowering energy consumption and accelerating inference on resource-intensive mobile IoT devices. To address this issue, our first contribution proposes the CHISEL framework, which is a Wi-Fi RSSI-based IPS that incorporates data augmentation and compression-aware two-dimensional convolutional neural networks (2D CAECNNs) with different pruning and quantization options. The proposed model compression techniques help reduce model deployment overheads in the IPS. Unlike RSSI, CSI takes advantage of multipath signals to potentially help indoor localization algorithms achieve a higher level of localization accuracy. The compensations for magnitude attenuation and phase shifting during wireless propagation generate different patterns that can be utilized to define the uniqueness of different locations of signal reception. However, all prior work in this domain constrains the experimental space to relatively small, rectangular rooms where the complexity of building interiors and dynamic noise from human activities, etc., are seldom considered. As part of our second contribution, we propose an end-to-end deep learning based framework called CSILoc for Wi-Fi CSI-based IPS on mobile IoT devices.
The framework includes CSI data collection, clustering, denoising, calibration and classification, and is the first study to verify the feasibility of using CSI for floor-level indoor localization with minimal knowledge of Wi-Fi access points (APs), thus avoiding security concerns during the offline data collection process.
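The fingerprinting principle referred to in the abstract above (matching an online signal measurement against patterns recorded at known locations) can be sketched with a classical nearest-neighbour lookup. This is not the CHISEL or CSILoc deep learning model; it is a baseline illustration, and the RSSI values, positions, and the `locate` helper are all hypothetical.

```python
import numpy as np

def locate(fingerprints, positions, query, k=3):
    """Estimate a position as the mean of the k stored positions with the closest fingerprints."""
    distances = np.linalg.norm(fingerprints - query, axis=1)
    nearest = np.argsort(distances)[:k]
    return positions[nearest].mean(axis=0)

# Hypothetical offline survey: RSSI (dBm) from 3 access points at 4 known spots.
fingerprints = np.array([
    [-40.0, -70.0, -80.0],
    [-70.0, -40.0, -80.0],
    [-80.0, -70.0, -40.0],
    [-60.0, -60.0, -60.0],
])
# Known (x, y) coordinates in meters for each surveyed spot.
positions = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0], [5.0, 5.0]])

# Hypothetical online measurement taken close to the first surveyed spot.
query = np.array([-42.0, -68.0, -79.0])
estimate_k1 = locate(fingerprints, positions, query, k=1)
estimate_k2 = locate(fingerprints, positions, query, k=2)
```

A deep model would replace the raw distance comparison with learned features, but the offline survey and online lookup phases keep the same shape.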

    Doctor of Philosophy

    Dissertation. While boundary representations, such as nonuniform rational B-spline (NURBS) surfaces, have traditionally served the needs of the modeling community well, they have not seen widespread adoption in the wider engineering disciplines. There is a common perception that NURBS are slow to evaluate and complex to implement. Whereas computer-aided design commonly deals with surfaces, the engineering community must deal with materials that have thickness. Traditional visualization techniques have avoided NURBS, and there has been little cross-talk between the rich spline approximation community and the larger engineering field. Recently there has been a strong desire to marry the modeling and analysis phases of the iterative design cycle, be it in car design, turbulent flow simulation around an airfoil, or lighting design. Research has demonstrated that employing a single representation throughout the cycle has key advantages. Furthermore, novel manufacturing techniques employing heterogeneous materials require the introduction of volumetric modeling representations. There is little question that fields such as scientific visualization and mechanical engineering could benefit from the powerful approximation properties of splines. In this dissertation, we remove several hurdles to the application of NURBS to problems in engineering and demonstrate how their unique properties can be leveraged to solve problems of interest.

    Head tracking two-image 3D television displays

    The research covered in this thesis encompasses the design of novel 3D displays; a consideration of 3D television requirements and a survey of autostereoscopic methods are also presented. The principle of operation of simple 3D display prototypes is described, and the design of the components of the optical systems is considered. A description of a non-contact infrared head tracking method suitable for use with 3D television displays is also included. The thesis describes how the operating principle of the displays is based upon a two-image system comprising a pair of images presented to the appropriate viewers' eyes. This is achieved by means of novel steering optics positioned behind a direct-view liquid crystal display (LCD) and controlled by a head position tracker. Within the work, two separate prototypes are described, both of which provide 3D to a single viewer who has limited movement. The thesis goes on to describe how these prototypes can be developed into a multiple-viewer display that is suitable for television use. A consideration of 3D television requirements is documented, showing that glasses-free (autostereoscopic) viewing, freedom of viewer movement and practical designs are important factors for 3D television displays. The displays are novel in several important aspects that comply with the requirements for 3D television. Firstly, they do not require viewers to wear special glasses; secondly, the displays allow viewers to move freely when viewing; and finally, the design of the displays is practical, with a housing size similar to modern television sets and a cost that is not excessive. Surveys of other autostereoscopic methods included within the work suggest that no contemporary 3D display offers all of these important factors.

    Revealing the Invisible: On the Extraction of Latent Information from Generalized Image Data

    The desire to reveal the invisible in order to explain the world around us has been a source of impetus for technological and scientific progress throughout human history. Many of the phenomena that directly affect us cannot be sufficiently explained based on observations made with our primary senses alone. Often this is because their originating cause is either too small, too far away, or in other ways obstructed. In other words, it is invisible to us. Without careful observation and experimentation, our models of the world remain inaccurate, and research has to be conducted in order to improve our understanding of even the most basic effects. In this thesis, we present our solutions to three challenging problems in visual computing, where a surprising amount of information is hidden in generalized image data and cannot easily be extracted by human observation or existing methods. We are able to extract the latent information using non-linear and discrete optimization methods based on physically motivated models and computer graphics methodology, such as ray tracing, real-time transient rendering, and image-based rendering.

    Low power CMOS vision sensor for foreground segmentation

    This thesis focuses on the design of a top-ranked background subtraction algorithm, the Pixel-Based Adaptive Segmenter (PBAS), for mapping onto a CMOS vision sensor with focal-plane processing. The redesign of PBAS into its hardware-oriented version, HO-PBAS, has led to fewer memories per pixel and a simpler overall model, at the cost of an acceptable loss of accuracy with respect to its CPU counterpart. This thesis features two CMOS vision sensors. The first, HOPBAS1K, lays out a 24 x 56 pixel array on a mini-ASIC chip in standard 180 nm CMOS technology. The second, HOPBAS10K, features an array of 98 x 98 pixels, also in standard 180 nm CMOS technology. The second chip fixes some issues found in the first and provides good hardware and background subtraction performance metrics.

    THERMOPLASTIC MICROFLUIDIC PCR TECHNOLOGIES FOR NEAR-PATIENT DIAGNOSTICS

    Microfluidic technologies have great potential to help create portable, scalable, and cost-effective devices for rapid polymerase chain reaction (PCR) diagnostics in near-patient settings. Unfortunately, current PCR diagnostics have not reached ubiquitous use in such settings because of instrumentation requirements, operational complexity, and high cost. This dissertation demonstrates a novel platform that can provide reduced assay time, simple workflow, scalability, and integration in order to better meet these challenges. First, a disposable microfluidic chip with integrated Au thin film heating and sensing elements is described. The system employs capillary pumping for automated loading of sample into the reaction chamber, combined with an integrated hydrophilic valve for precise self-metering of sample volumes into the device. With extensive multiphysics modeling and empirical testing, we were able to optimize the system and achieve cycle times of 14 seconds, completing 35 PCR cycles plus high-resolution melt analysis (HRMA) in a total of 15 minutes, for successful identification of a mutation in the G6PC gene indicative of von Gierke’s disease. Next, a scalable sample digitization method that exploits the controlled pinning of fluid at geometric discontinuities within an array of staggered microfluidic traps is described. A simple geometric model is developed to predict the impact of device geometry on sample filling and discretization, and validated experimentally using fabricated cyclic olefin polymer devices. A 768-element staggered trap array is demonstrated, with highly reliable passive loading and discretization achieved within 5 min. Finally, a technique for reagent integration by pin spotting affords a simplified workflow and the ability to perform multiplexed PCR. Reagent printing formulations were optimized for stability and volume consistency during spotting.
Paraffin wax was demonstrated as a protective layer to prevent rehydration and reagent cross-contamination during sample loading. Deposition was accomplished with a custom pin spotting tool. A staggered trap array device with integrated reagents successfully amplified and validated a 2-plex assay, showing the potential of the platform for a multiplexed antibiotic resistance screening panel.
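The timing figures quoted in this abstract can be checked with a short calculation. The sketch assumes that all 35 cycles run at the reported 14-second cycle time, which the abstract does not state explicitly; the variable names are illustrative.

```python
# Figures taken from the abstract.
cycle_time_s = 14      # reported PCR cycle time in seconds
n_cycles = 35          # reported number of PCR cycles
total_assay_min = 15   # reported total time for cycling plus HRMA, in minutes

# Assumption: every cycle takes the reported 14 s.
cycling_min = cycle_time_s * n_cycles / 60
# Time left in the 15-minute budget for HRMA and any other steps.
remaining_min = total_assay_min - cycling_min
```

Under that assumption, thermal cycling accounts for 490 seconds (a little over 8 minutes), leaving just under 7 minutes of the 15-minute total for HRMA and any remaining steps.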