    High-speed in vitro intensity diffraction tomography

    We demonstrate a label-free, scan-free intensity diffraction tomography technique utilizing annular illumination (aIDT) to rapidly characterize large-volume three-dimensional (3-D) refractive index distributions in vitro. By optimally matching the illumination geometry to the microscope pupil, our technique reduces the data requirement by 60 times to achieve high-speed 10-Hz volume rates. Using eight intensity images, we recover volumes of ∼350 μm × 100 μm × 20 μm, with near diffraction-limited lateral resolution of ∼487 nm and axial resolution of ∼3.4 μm. The attained high volume rate and high resolution enable 3-D quantitative phase imaging of complex living biological samples across multiple length scales. We demonstrate aIDT's capabilities on unicellular diatom microalgae, epithelial buccal cell clusters with native bacteria, and live Caenorhabditis elegans specimens. Within these samples, we recover macroscale cellular structures, subcellular organelles, and dynamic micro-organism tissues with minimal motion artifacts. Quantifying such features has significant utility in oncology, immunology, and cellular pathophysiology, where these morphological features are evaluated for changes in the presence of disease, parasites, and new drug treatments. Finally, we simulate the aIDT system to highlight the accuracy and sensitivity of the proposed technique. aIDT shows promise as a powerful high-speed, label-free computational microscopy approach for applications where natural imaging is required to evaluate environmental effects on a sample in real time. https://arxiv.org/abs/1904.06004

    Computational Multimedia for Video Self Modeling

    Video self modeling (VSM) is a behavioral intervention technique in which a learner models a target behavior by watching a video of himself or herself. This is the idea behind the psychological theory of self-efficacy: you can learn or model to perform certain tasks because you see yourself doing them, which provides the most ideal form of behavior modeling. The effectiveness of VSM has been demonstrated for many different types of disabilities and behavioral problems, ranging from stuttering, inappropriate social behaviors, autism, and selective mutism to sports training. However, there is an inherent difficulty associated with the production of VSM material. Prolonged and persistent video recording is required to capture the rare snippets, if they exist at all, that can be strung together to form novel video sequences of the target skill. To solve this problem, in this dissertation, we use computational multimedia techniques to facilitate the creation of synthetic visual content for self-modeling that can be used by a learner and his/her therapist with a minimum amount of training data. There are three major technical contributions in my research. First, I developed an Adaptive Video Re-sampling algorithm to synthesize realistic lip-synchronized video with minimal motion jitter. Second, to denoise and complete the depth map captured by structured-light sensing systems, I introduced a layer-based probabilistic model to account for various types of uncertainties in the depth measurement. Third, I developed a simple and robust bundle-adjustment-based framework for calibrating a network of multiple wide-baseline RGB and depth cameras.

    The effectiveness of PROMPT therapy for children with cerebral palsy

    The purpose of this study is to evaluate the effectiveness of a motor speech treatment approach (PROMPT) in the management of motor-speech impairment in children with cerebral palsy. Two main objectives were addressed: (1) to evaluate changes in speech intelligibility and (2) to evaluate changes in kinematic movements of the jaw and lips using three-dimensional (3D) motion analysis. A single-subject multiple-baseline-across-participants research design with four phases was implemented: baseline (A1), two intervention phases (B and C), and maintenance (A2). Six participants, aged 3 to 11 years (3 boys, 3 girls), with moderate to severe speech impairment were recruited through The Centre for Cerebral Palsy, Western Australia (TCCP). Inclusion criteria were: diagnosis of cerebral palsy, age 3 to 14 years, stable head control (supported or independent), spontaneous use of at least 15 words, speech impairment ≥1.5 standard deviations, hearing loss no greater than 25 dB, developmental quotient ≥70 (Leiter-Brief International Performance Scale R), and no previous exposure to PROMPT. Thirteen typically developing peers were recruited to compare the trend of kinematic changes in jaw and lip movements to those of the children with cerebral palsy. Upon achievement of a stable baseline, participants completed two intervention phases, each of 10 weeks' duration. Therapist fidelity to the PROMPT approach was determined by a blinded, independent PROMPT Instructor. Perceptual outcome measures included the administration of weekly speech probes, containing trained and untrained vocabulary at the two targeted levels of intervention plus an additional level. These were analysed for both perceptual accuracy (PA) and the motor speech movement parameter. End-of-phase measures included: (1) changes in phonetic accuracy, measured as percentage phonemes correct; (2) speech intelligibility, measured using a standardised assessment tool; and (3) changes to activity/participation, measured using the Canadian Occupational Performance Measure (COPM). Kinematic data were collected at the end of each study phase using 3D motion analysis (Vicon Motus 9.1). This involved the collection of jaw and lip measurements of distance, duration, and velocity during the production of 11 untrained stimulus words. The words contained vowels that spanned the articulatory space and represented motor-speech movement patterns at the level of mandibular and labial-facial control, as classified according to the PROMPT motor speech hierarchy. Analysis of the speech probe data showed that all participants recorded a statistically significant improvement. Between phases A1-B and B-C, 6/6 and 4/6 participants, respectively, recorded a statistically significant increase in performance level on the motor speech movement patterns (MSMPs) targeted during the training of that intervention priority (IP). The data further show that five participants (one participant was lost to follow-up) achieved a statistically significant increase at 12 weeks post-intervention as compared to baseline (phase A1). Four participants achieved a statistically significant increase in performance level in the PA of the speech probes of both IP1 and IP2 between phases A1-B. Whilst only one participant recorded a statistically significant increase in PA between phases B-C, five participants achieved a statistically significant increase in IP2 between phases A1-C. The data further show that all participants achieved a statistically significant increase in PA on both intervention priorities at 12 weeks post-intervention. All participants recorded data that indicated improved perceptual accuracy across the study phases. This was indicated by a statistically significant increase in the percentage phonemes correct scores, F(3,18) = 5.55, p<.05. All participants achieved improved speech intelligibility. Five participants recorded an increase in speech intelligibility greater than 14% at the end of the first intervention (phase B). Continued improvement was observed for 5 participants at the end of the second intervention (phase C).

    Histopathological image analysis: a review

    Over the past decade, dramatic increases in computational power and improvement in image analysis algorithms have allowed the development of powerful computer-assisted analytical approaches to radiological data. With the recent advent of whole slide digital scanners, tissue histopathology slides can now be digitized and stored in digital image form. Consequently, digitized tissue histopathology has now become amenable to the application of computerized image analysis and machine learning techniques. Analogous to the role of computer-assisted diagnosis (CAD) algorithms in medical imaging to complement the opinion of a radiologist, CAD algorithms have begun to be developed for disease detection, diagnosis, and prognosis prediction to complement the opinion of the pathologist. In this paper, we review the recent state-of-the-art CAD technology for digitized histopathology. This paper also briefly describes the development and application of novel image analysis technology for a few specific histopathology-related problems being pursued in the United States and Europe.

    Low-cost techniques for patient positioning in percutaneous radiotherapy using an optical imaging system

    Patient positioning is an important part of radiation therapy, which is one of the main treatments for malignant tissue in the human body. Currently, the most common patient positioning methods expose healthy tissue of the patient's body to additional dangerous radiation. Other non-invasive positioning methods are either not very accurate or too costly for an average hospital. In this thesis, we explore the possibility of developing a system composed of affordable hardware and advanced computer vision algorithms that facilitates patient positioning. Our algorithms are based on affordable RGB-D sensors, image features, ArUco planar markers, and other geometry registration methods. Furthermore, we take advantage of consumer-level computing hardware to make our systems widely accessible. More specifically, we avoid approaches that rely on dedicated GPU hardware for general-purpose computing, since they are more costly. In different publications, we explore the use of these tools to increase the accuracy of reconstruction and localization of the patient in their pose. We also take into account the visualization of the patient's target position with respect to their current position in order to assist the person who performs patient positioning. Furthermore, we use augmented reality in conjunction with a real-time 3D tracking algorithm for better interaction between the program and the operator. We also solve more fundamental problems concerning ArUco markers, and these solutions could be used in the future to improve our systems. They include high-quality multi-camera calibration and mapping using ArUco markers, plus detection of these markers in event cameras, which are very useful in the presence of fast camera movement.
In the end, we conclude that it is possible to increase the accuracy of 3D reconstruction and localization by combining current computer vision algorithms based on fiducial planar markers with RGB-D sensors. This is reflected in the low error we have achieved in our experiments for patient positioning, pushing forward the state of the art for this application.
    In the treatment of malignant tumors in the body, patient positioning during radiotherapy sessions is a crucial issue. Currently, the most common patient positioning methods expose the patient's healthy tissue to dangerous radiation, because it is not possible to ensure that the patient's position is always the same as it was when the area to be irradiated was planned. The methods currently in use are either imprecise or have costs that make them unaffordable for hospitals with limited funding. In this thesis we have analyzed the possibility of developing a system composed of low-cost hardware and advanced computer vision methods that helps keep the patient's positioning the same across the different radiotherapy sessions, with respect to their pose when the area to be irradiated was planned. The solution proposed as a result of the thesis is based on the use of RGB-D sensors, features extracted from the image, square markers called ArUco, and methods for registering geometry in the image. In addition, the proposed solution takes advantage of conventional low-cost hardware to make our system widely accessible. More specifically, we avoid approaches that need to exploit higher-cost GPUs for general-purpose computing. Several publications were produced in pursuit of the final objective. They describe methods for increasing the accuracy of the reconstruction and localization of the patient in their pose, taking into account the visualization of the patient's ideal position with respect to their current position, to assist the professional who positions the patient. Augmented reality methods have also been proposed, together with real-time 3D tracking algorithms, to achieve better interaction between the system and the professional who must perform that task. Additionally, solutions have been proposed for fundamental problems related to the use of the square markers that were used to achieve the objective of the thesis. The proposed solutions can be used in the future to improve other systems. These problems include high-quality multi-camera calibration and mapping using the markers, and the detection of these markers in event cameras, which are very useful in the presence of fast camera movement. Finally, we conclude that it is possible to increase the accuracy of 3D reconstruction and localization by combining current computer vision algorithms, which use square fiducial markers, with RGB-D sensors. The results obtained regarding the error the system incurs when reproducing patient positioning represent an important advance in the state of the art for this topic.

    Deep Learning-Based Human Pose Estimation: A Survey

    Human pose estimation aims to locate the human body parts and build human body representation (e.g., body skeleton) from input data such as images and videos. It has drawn increasing attention during the past decade and has been utilized in a wide range of applications including human-computer interaction, motion analysis, augmented reality, and virtual reality. Although the recently developed deep learning-based solutions have achieved high performance in human pose estimation, there still remain challenges due to insufficient training data, depth ambiguities, and occlusion. The goal of this survey paper is to provide a comprehensive review of recent deep learning-based solutions for both 2D and 3D pose estimation via a systematic analysis and comparison of these solutions based on their input data and inference procedures. More than 240 research papers since 2014 are covered in this survey. Furthermore, 2D and 3D human pose estimation datasets and evaluation metrics are included. Quantitative performance comparisons of the reviewed methods on popular datasets are summarized and discussed. Finally, the challenges involved, applications, and future research directions are discussed. We also provide a regularly updated project page: https://github.com/zczcwh/DL-HPE


    Quasi-articulated objects, such as human beings, are among the most commonly seen objects in our daily lives. Extensive research has been dedicated to 3D shape reconstruction and motion analysis for this type of object for decades. A major motivation is their wide range of applications, such as in entertainment, surveillance, and health care. Most existing studies have relied on one or more regular video cameras. In recent years, commodity depth sensors have become more and more widely available. The geometric measurements delivered by depth sensors provide valuable information for these tasks. In this dissertation, we propose three algorithms for monocular pose estimation and shape reconstruction of quasi-articulated objects using a single commodity depth sensor. These three algorithms achieve shape reconstruction with increasing levels of granularity and personalization. We then further develop a method for highly detailed shape reconstruction based on our pose estimation techniques. Our first algorithm takes advantage of a motion database acquired with an active marker-based motion capture system. This method combines pose detection through nearest neighbor search with pose refinement via non-rigid point cloud registration. It is capable of accommodating different body sizes and achieves more than twice the accuracy of a previous state-of-the-art method on a publicly available dataset. The above algorithm performs frame-by-frame estimation and therefore is less prone to tracking failure. Nonetheless, it does not guarantee temporal consistency of either the skeletal structure or the shape, which could be problematic for some applications. To address this problem, we develop a real-time model-based approach for quasi-articulated pose and 3D shape estimation based on the Iterative Closest Point (ICP) principle, with several novel constraints that are critical for the monocular scenario. In this algorithm, we further propose a novel method for automatic body size estimation that enables it to accommodate different subjects. Due to its local search nature, the ICP-based method can become trapped in local minima in the case of some complex and fast motions. To address this issue, we explore the potential of using a statistical model for soft point correspondence association. To this end, we propose a unified framework based on the Gaussian Mixture Model for joint pose and shape estimation of quasi-articulated objects. This method achieves state-of-the-art performance on various publicly available datasets. Based on our pose estimation techniques, we then develop a novel framework that achieves highly detailed shape reconstruction by only requiring the user to move naturally in front of a single depth sensor. Our experiments demonstrate reconstructed shapes with rich geometric details for various subjects wearing different apparel. Last but not least, we explore the applicability of our method to two real-world applications. First, we combine our ICP-based method with cloth simulation techniques for virtual try-on. Our system delivers the first promising 3D-based virtual clothing system. Second, we explore the possibility of extending our pose estimation algorithms to assist physical therapists in identifying their patients' movement dysfunctions that are related to injuries. Our preliminary experiments have demonstrated promising results in comparison with the gold-standard active marker-based commercial system. Throughout the dissertation, we develop various state-of-the-art algorithms for pose estimation and shape reconstruction of quasi-articulated objects by leveraging the geometric information from depth sensors. We also demonstrate their great potential for different real-world applications.
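
    The Iterative Closest Point idea named in the abstract above can be illustrated with a minimal sketch. This is not the dissertation's method: it is the textbook rigid variant only, with none of the articulated model, body-size estimation, or monocular constraints the abstract describes. The function names `best_rigid_transform` and `icp`, the brute-force nearest-neighbour search, and the fixed iteration count are all assumptions made for illustration.

```python
import numpy as np


def best_rigid_transform(src, dst):
    """Least-squares rotation R and translation t with dst ~ R @ src + t (Kabsch)."""
    src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
    H = (src - src_c).T @ (dst - dst_c)        # 3x3 cross-covariance matrix
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))     # guard against a reflection
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst_c - R @ src_c
    return R, t


def icp(src, dst, iters=30):
    """Toy rigid ICP: alternate closest-point matching and rigid alignment.

    Illustrative only (hypothetical helper, not from the source).
    Returns the accumulated rotation, translation, and the moved source points.
    """
    R_total, t_total = np.eye(3), np.zeros(3)
    cur = src.copy()
    for _ in range(iters):
        # Brute-force nearest neighbour in dst for every current source point.
        d2 = ((cur[:, None, :] - dst[None, :, :]) ** 2).sum(axis=2)
        matched = dst[d2.argmin(axis=1)]
        R, t = best_rigid_transform(cur, matched)
        cur = cur @ R.T + t
        # Compose the new step with the transform accumulated so far.
        R_total, t_total = R @ R_total, R @ t_total + t
    return R_total, t_total, cur
```

    Because each step only matches every point to its current nearest neighbour, the search is local, which is consistent with the abstract's remark that an ICP-based method can become trapped in local minima under large or fast motions.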

    Using Visual Feedback to Enhance Intonation Control within Electrolaryngeal Speech

    This study evaluated the effectiveness of visual feedback in facilitating pitch control using a pressure-sensitive electrolarynx (EL). This proof-of-concept pilot study used a single-subject design that included two healthy adults (1 female aged 23;6 years old, and 1 male aged 67;0 years old). Both participants were provided with visual feedback over two consecutive weeks. Changes in pitch and force control accuracy over 4 hours were analyzed. The results demonstrated that both participants showed an improvement in force control accuracy from the first to the last training session. The results of this proof-of-concept study are a preliminary step towards the development of a clinical training protocol for the use of a pressure-sensitive EL. Further, these results highlight the importance of developing a clinically relevant tool for the improvement of a laryngectomee's quality of life postlaryngectomy.
