High Speed Human Action Recognition using a Photonic Reservoir Computer
The recognition of human actions in videos is one of the most active research fields in computer vision. The canonical approach consists of more or less complex preprocessing stages applied to the raw video data, followed by a relatively simple classification algorithm. Here we address the recognition of human actions using the reservoir computing algorithm, which allows us to focus on the classifier stage. We introduce a new training method for the reservoir computer, based on "Timesteps Of Interest", which combines short and long time scales in a simple way. We study the performance of this algorithm on the well-known KTH dataset, using both numerical simulations and a photonic implementation based on a single non-linear node and a delay line. We solve the task with high accuracy and speed, to the point of allowing multiple video streams to be processed in real time. The present work is thus an important step towards developing efficient dedicated hardware for video processing.
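The abstract does not spell out the reservoir computing pipeline, so the following is only a minimal sketch of a generic echo state network with a ridge-regression readout that classifies each clip from its final reservoir state. All sizes and names are illustrative assumptions, and the final-state readout stands in for the paper's "Timesteps Of Interest" weighting, whose details are not given here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, not taken from the paper.
N_IN, N_RES = 8, 100

# Random input and recurrent weights; the recurrent matrix is rescaled to
# spectral radius 0.9 so the reservoir has the echo state property.
W_in = rng.uniform(-0.5, 0.5, (N_RES, N_IN))
W = rng.uniform(-0.5, 0.5, (N_RES, N_RES))
W *= 0.9 / np.abs(np.linalg.eigvals(W)).max()

def reservoir_states(u):
    """Run an input sequence u (T x N_IN) through the reservoir."""
    x = np.zeros(N_RES)
    states = []
    for u_t in u:
        x = np.tanh(W_in @ u_t + W @ x)
        states.append(x.copy())
    return np.array(states)

def train_readout(sequences, labels, n_classes, ridge=1e-6):
    """Ridge-regression readout fitted on the final reservoir state of each clip."""
    X = np.array([reservoir_states(u)[-1] for u in sequences])
    Y = np.eye(n_classes)[labels]  # one-hot targets
    return np.linalg.solve(X.T @ X + ridge * np.eye(N_RES), X.T @ Y)

def classify(u, W_out):
    """Predicted class index for one input sequence."""
    return int(np.argmax(reservoir_states(u)[-1] @ W_out))
```

Only the linear readout is trained; the random reservoir weights stay fixed, which is what makes reservoir computing attractive for photonic hardware.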
Analysis of 3D human gait reconstructed with a depth camera and mirrors
The problem of assessing human gaits has received great attention in the literature, since gait analysis is one of the key components of healthcare. Marker-based, multi-camera systems are widely employed to deal with this problem. However, such systems usually require specific, expensive equipment and/or high computational cost. In order to reduce the cost of such devices, we focus on a gait analysis system which employs only one depth sensor. The principle of our work is similar to multi-camera systems, but the collection of cameras is replaced by one depth sensor and mirrors. Each mirror in our setup plays the role of a camera capturing the scene from a different viewpoint. Since we use only one camera, the synchronization step can be avoided and the cost of the device is also reduced.
Our studies can be separated into two parts: 3D reconstruction and gait analysis. The result of the former is used as the input of the latter. Our system for 3D reconstruction is built with a depth camera and two mirrors. Two types of depth sensor, distinguished by their depth-estimation scheme, have been employed in our work. With the structured light (SL) technique integrated into the Kinect 1, we perform 3D reconstruction based on geometrical optics. In order to increase the level of detail of the reconstructed 3D model, the Kinect 2, which measures depth by time-of-flight (ToF), is then used for image acquisition instead of the previous generation. However, due to multiple reflections on the mirrors, depth distortion occurs in our setup. We thus propose a simple approach for reducing this distortion before applying geometrical optics to reconstruct a point cloud of the 3D object.
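The virtual camera created by each planar mirror corresponds to reflecting geometry across the mirror plane, so points observed "in" a mirror can be mapped back to world coordinates with a Householder reflection. A minimal sketch of that mapping (the plane parameters are hypothetical, not calibration values from the thesis):

```python
import numpy as np

def reflect_points(points, n, d):
    """Reflect 3D points across the plane n.x = d (n is normalized here).

    A planar mirror turns the real camera into a virtual camera; mapping the
    points seen in the mirror back to world coordinates is exactly this
    reflection: a Householder transform plus an offset along the normal.
    """
    n = np.asarray(n, dtype=float)
    n = n / np.linalg.norm(n)
    points = np.asarray(points, dtype=float)
    dist = points @ n - d  # signed distance of each point to the plane
    return points - 2.0 * np.outer(dist, n)
```

Reflecting twice across the same plane is the identity, which gives a quick sanity check for a calibrated mirror plane.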
For the task of gait analysis, we propose various alternative approaches focusing on the problem of gait normality/symmetry measurement. They are expected to be useful in clinical treatment, for example for monitoring a patient's recovery after surgery. These methods comprise model-free and model-based approaches with different pros and cons. In this dissertation, we present three methods that directly process the point clouds reconstructed in the previous part. The first one uses cross-correlation of the left and right half-bodies to assess gait symmetry, while the other two employ deep auto-encoders to measure gait normality.
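The first method's idea, comparing left- and right-side signals by cross-correlation, can be illustrated on 1-D toy signals. The function name and the choice of per-side signal are assumptions for illustration, not the thesis implementation:

```python
import numpy as np

def gait_symmetry_score(left, right):
    """Peak normalized cross-correlation between a left-side and a right-side
    gait signal (e.g. a joint trajectory or per-frame point count).

    For a symmetric gait the two sides repeat the same pattern half a cycle
    apart, so the peak is close to 1 at a non-zero lag. Returns (score, lag).
    """
    left = (left - left.mean()) / (left.std() + 1e-12)
    right = (right - right.mean()) / (right.std() + 1e-12)
    corr = np.correlate(left, right, mode="full") / len(left)
    lag = int(np.argmax(corr)) - (len(left) - 1)
    return float(corr.max()), lag
```

A perfectly symmetric toy gait (two sinusoids half a cycle apart) scores near 1, while mismatched left/right dynamics lower the peak.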
Indoor positioning with deep learning for mobile IoT systems
Summer 2022. Includes bibliographical references.
The development of human-centric services on mobile devices in the era of the Internet of Things (IoT) has opened the possibility of merging indoor positioning technologies with various mobile applications to deliver stable and responsive indoor navigation and localization functionality that can enhance user experience within increasingly complex indoor environments. But as GPS signals cannot easily penetrate modern building structures, it is challenging to build reliable indoor positioning systems (IPS). Currently, Wi-Fi sensing based indoor localization techniques are gaining popularity as a means to build accurate IPS, benefiting from the prevalence of the 802.11 family. Wi-Fi fingerprinting based indoor localization has shown remarkable performance over geometric mapping in complex indoor environments by taking advantage of pattern matching techniques. Today, the two main kinds of information extracted from Wi-Fi signals to form fingerprints are the Received Signal Strength Indicator (RSSI) and Channel State Information (CSI) with Orthogonal Frequency-Division Multiplexing (OFDM) modulation. The former yields average localization errors of around 10 meters or less but has low hardware and software requirements, while the latter has a higher chance of estimating locations with ultra-low distance errors but demands more resources from chipsets, firmware/software environments, etc. This thesis makes two novel contributions towards realizing viable IPS on mobile devices using RSSI and CSI information and deep machine learning based fingerprinting. Due to the larger quantity of data and the more sophisticated signal patterns used to create fingerprints in complex indoor environments, conventional machine learning algorithms that need carefully engineered features struggle to identify features in very high dimensional data.
Hence, the ability of the approximation functions generated by conventional machine learning models to estimate locations is limited. Deep machine learning based approaches can overcome these challenges to realize scalable feature pattern matching approaches such as fingerprinting. However, deep machine learning models generally require a considerable memory footprint, which creates a significant issue on resource-constrained devices such as mobile IoT devices, wearables, smartphones, etc. Developing efficient deep learning models is critical to lowering energy consumption on resource intensive mobile IoT devices and to accelerating inference time. To address this issue, our first contribution proposes the CHISEL framework, a Wi-Fi RSSI-based IPS that incorporates data augmentation and compression-aware two-dimensional convolutional neural networks (2D CAECNNs) with different pruning and quantization options. The proposed model compression techniques help reduce model deployment overheads in the IPS. Unlike RSSI, CSI takes advantage of multipath signals to potentially help indoor localization algorithms achieve a higher level of localization accuracy. The compensations for magnitude attenuation and phase shifting during wireless propagation generate distinct patterns that can be used to characterize the uniqueness of different signal reception locations. However, all prior work in this domain constrains the experimental space to relatively small, rectangular rooms where the complexity of building interiors, dynamic noise from human activities, etc., are seldom considered. As part of our second contribution, we propose an end-to-end deep learning based framework called CSILoc for Wi-Fi CSI-based IPS on mobile IoT devices.
The framework includes CSI data collection, clustering, denoising, calibration and classification, and is the first study to verify the feasibility of using CSI for floor-level indoor localization with minimal knowledge of Wi-Fi access points (APs), thus avoiding security concerns during the offline data collection process.
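CHISEL itself is a convolutional network, but the fingerprinting idea underlying it, matching a measured RSSI vector against a database of labeled fingerprints, can be shown with a much simpler k-nearest-neighbors sketch. The data and names below are toy assumptions, not the thesis code:

```python
import numpy as np
from collections import Counter

def knn_locate(fingerprint_db, locations, rssi, k=3):
    """Predict a location label for an RSSI vector (one entry per access
    point, in dBm) by majority vote among the k nearest stored fingerprints,
    using Euclidean distance in signal-strength space."""
    d = np.linalg.norm(fingerprint_db - rssi, axis=1)
    nearest = np.argsort(d)[:k]
    return Counter(locations[i] for i in nearest).most_common(1)[0][0]
```

Deep models replace this explicit nearest-neighbor search with a learned mapping, which scales better when fingerprints are high-dimensional and noisy, at the cost of the memory footprint discussed above.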
Doctor of Philosophy dissertation
While boundary representations, such as nonuniform rational B-spline (NURBS) surfaces, have traditionally served the needs of the modeling community well, they have not seen widespread adoption in the wider engineering discipline. There is a common perception that NURBS are slow to evaluate and complex to implement. Whereas computer-aided design commonly deals with surfaces, the engineering community must deal with materials that have thickness. Traditional visualization techniques have avoided NURBS, and there has been little cross-talk between the rich spline approximation community and the larger engineering field. Recently there has been a strong desire to marry the modeling and analysis phases of the iterative design cycle, be it in car design, turbulent flow simulation around an airfoil, or lighting design. Research has demonstrated that employing a single representation throughout the cycle has key advantages. Furthermore, novel manufacturing techniques employing heterogeneous materials require the introduction of volumetric modeling representations. There is little question that fields such as scientific visualization and mechanical engineering could benefit from the powerful approximation properties of splines. In this dissertation, we remove several hurdles to the application of NURBS to problems in engineering and demonstrate how their unique properties can be leveraged to solve problems of interest.
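Despite the "slow and complex" perception, the core of NURBS evaluation is a short recursion. A naive, unoptimized sketch of the Cox-de Boor recursion and rational curve evaluation, for illustration only (production evaluators avoid the recursion and compute only the nonzero basis functions):

```python
import numpy as np

def bspline_basis(i, p, t, knots):
    """Cox-de Boor recursion: the i-th degree-p B-spline basis function at t."""
    if p == 0:
        return 1.0 if knots[i] <= t < knots[i + 1] else 0.0
    left = right = 0.0
    if knots[i + p] > knots[i]:
        left = ((t - knots[i]) / (knots[i + p] - knots[i])
                * bspline_basis(i, p - 1, t, knots))
    if knots[i + p + 1] > knots[i + 1]:
        right = ((knots[i + p + 1] - t) / (knots[i + p + 1] - knots[i + 1])
                 * bspline_basis(i + 1, p - 1, t, knots))
    return left + right

def nurbs_point(ctrl, weights, p, knots, t):
    """Evaluate a NURBS curve: a weighted rational combination of control points."""
    N = np.array([bspline_basis(i, p, t, knots) for i in range(len(ctrl))])
    wN = N * weights
    return (wN @ ctrl) / wN.sum()
```

With unit weights this reduces to an ordinary B-spline, while a weight of 1/sqrt(2) on the middle control point of a quadratic lets the curve trace an exact circular arc, something polynomial splines cannot do.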
Head tracking two-image 3D television displays
The research covered in this thesis encompasses the design of novel 3D displays and a consideration of 3D television requirements; a survey of autostereoscopic methods is also presented. The principle of operation of simple 3D display prototypes is described, and the design of the components of the optical systems is considered. A description of an appropriate non-contact infrared head-tracking method suitable for use with 3D television displays is also included.
The thesis describes how the operating principle of the displays is based upon a two-image system comprising a pair of images presented to the appropriate eyes of the viewer. This is achieved by means of novel steering optics positioned behind a direct-view liquid crystal display (LCD) that is controlled by a head position tracker. Within the work, two separate prototypes are described, both of which provide 3D to a single viewer who has limited movement. The thesis goes on to describe how these prototypes can be developed into a multiple-viewer display that is suitable for television use.
A consideration of 3D television requirements is documented, showing that glasses-free (autostereoscopic) viewing, freedom of viewer movement and practical designs are important factors for 3D television displays.
The displays are novel in several important aspects that comply with the requirements for 3D television. Firstly, they do not require viewers to wear special glasses; secondly, the displays allow viewers to move freely when viewing; and finally, the design of the displays is practical, with a housing size similar to modern television sets and a cost that is not excessive. The surveys of other autostereoscopic methods included within the work suggest that no contemporary 3D display offers all of these important factors.
Revealing the Invisible: On the Extraction of Latent Information from Generalized Image Data
The desire to reveal the invisible in order to explain the world around us has been a source of impetus for technological and scientific progress throughout human history. Many of the phenomena that directly affect us cannot be sufficiently explained based on observations using our primary senses alone. Often this is because their originating cause is either too small, too far away, or otherwise obstructed. To put it in other words: it is invisible to us. Without careful observation and experimentation, our models of the world remain inaccurate, and research has to be conducted in order to improve our understanding of even the most basic effects. In this thesis, we present our solutions to three challenging problems in visual computing, where a surprising amount of information is hidden in generalized image data and cannot easily be extracted by human observation or existing methods. We are able to extract the latent information using non-linear and discrete optimization methods based on physically motivated models and computer graphics methodology, such as ray tracing, real-time transient rendering, and image-based rendering.
Low power CMOS vision sensor for foreground segmentation
This thesis focuses on the redesign of a top-ranked background subtraction algorithm, the Pixel-Based Adaptive Segmenter (PBAS), for mapping onto a CMOS vision sensor with focal-plane processing. The redesign of PBAS into its hardware-oriented version, HO-PBAS, has led to fewer memories per pixel and a simpler overall model, at the cost of an acceptable loss of accuracy with respect to its CPU counterpart. This thesis features two CMOS vision sensors. The first one, HOPBAS1K, lays out a 24 x 56 pixel array on a miniASIC chip in standard 180 nm CMOS technology. The second one, HOPBAS10K, features an array of 98 x 98 pixels, also in standard 180 nm CMOS technology. The second chip fixes some issues found in the first chip and provides good hardware and background subtraction performance metrics.
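PBAS keeps a set of background samples and adaptive decision parameters per pixel, which is why reducing the number of per-pixel memories matters so much for a focal-plane implementation. The following is a much-simplified software sketch of the sample-consensus decision in the spirit of PBAS (fixed threshold, hypothetical parameters; not HO-PBAS itself):

```python
import numpy as np

def segment_foreground(frame, samples, R=20.0, k_min=2):
    """Sample-consensus foreground mask: a pixel is background if at least
    k_min of its N stored background samples lie within radius R of the
    current intensity. samples has shape (H, W, N); returns True = foreground."""
    close = np.abs(samples - frame[..., None]) < R
    return close.sum(axis=-1) < k_min

def update_samples(frame, samples, fg_mask, rng, p=1 / 16):
    """Conservative update: at background pixels, randomly overwrite one
    stored sample with the current intensity (probability p per pixel)."""
    h, w, n = samples.shape
    refresh = (~fg_mask) & (rng.random((h, w)) < p)
    idx = rng.integers(0, n, (h, w))
    ys, xs = np.nonzero(refresh)
    samples[ys, xs, idx[ys, xs]] = frame[ys, xs]
    return samples
```

Full PBAS additionally adapts R and the update rate per pixel from the observed distances; dropping that adaptivity is the kind of simplification a hardware-oriented redesign makes to cut per-pixel state.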
THERMOPLASTIC MICROFLUIDIC PCR TECHNOLOGIES FOR NEAR-PATIENT DIAGNOSTICS
Microfluidic technologies have great potential to help create portable, scalable,
and cost-effective devices for rapid polymerase chain reaction (PCR) diagnostics in near-patient settings. Unfortunately, current PCR diagnostics have not reached ubiquitous use in such settings because of instrumentation requirements, operational
complexity, and high cost. This dissertation demonstrates a novel platform that can
provide reduced assay time, simple workflow, scalability, and integration in order to
better meet these challenges.
First, a disposable microfluidic chip with integrated Au thin film heating and
sensing elements is described herein. The system employs capillary pumping for
automated loading of sample into the reaction chamber, combined with an integrated
hydrophilic valve for precise self-metering of sample volumes into the device. With
extensive multiphysics modeling and empirical testing, we were able to optimize the system, achieving cycle times of 14 seconds and completing 35 PCR cycles plus HRMA in a total of 15 minutes, for the successful identification of a mutation in the G6PC gene indicative of von Gierke's disease.
Next, a scalable sample digitization method that exploits the controlled pinning
of fluid at geometric discontinuities within an array of staggered microfluidic traps is
described. A simple geometric model is developed to predict the impact of device
geometry on sample filling and discretization, and validated experimentally using
fabricated cyclic olefin polymer devices. Finally, a 768-element staggered trap array is
demonstrated, with highly reliable passive loading and discretization achieved within
5 min.
Finally, a technique for reagent integration by pin spotting affords simplified
workflow, and the ability to perform multiplexed PCR. Reagent printing formulations
were optimized for stability and volume consistency during spotting. Paraffin wax was
demonstrated as a protective layer to prevent rehydration and reagent cross-contamination during sample loading. Deposition was accomplished with a custom pin
spotting tool. A staggered trap array device with integrated reagents successfully
amplified and validated a 2-plex assay, showing the potential of the platform for a
multiplexed antibiotic resistance screening panel.