1,847 research outputs found

    Computational Multimedia for Video Self Modeling

    Get PDF
    Video self modeling (VSM) is a behavioral intervention technique in which a learner models a target behavior by watching a video of oneself. This is the idea behind the psychological theory of self-efficacy - you can learn or model to perform certain tasks because you see yourself doing it, which provides the most ideal form of behavior modeling. The effectiveness of VSM has been demonstrated for many different types of disabilities and behavioral problems ranging from stuttering, inappropriate social behaviors, autism, selective mutism to sports training. However, there is an inherent difficulty associated with the production of VSM material. Prolonged and persistent video recording is required to capture the rare, if not existed at all, snippets that can be used to string together in forming novel video sequences of the target skill. To solve this problem, in this dissertation, we use computational multimedia techniques to facilitate the creation of synthetic visual content for self-modeling that can be used by a learner and his/her therapist with a minimum amount of training data. There are three major technical contributions in my research. First, I developed an Adaptive Video Re-sampling algorithm to synthesize realistic lip-synchronized video with minimal motion jitter. Second, to denoise and complete the depth map captured by structure-light sensing systems, I introduced a layer based probabilistic model to account for various types of uncertainties in the depth measurement. Third, I developed a simple and robust bundle-adjustment based framework for calibrating a network of multiple wide baseline RGB and depth cameras

    Challenges in 3D scanning: Focusing on Ears and Multiple View Stereopsis

    Get PDF

    Optical techniques for 3D surface reconstruction in computer-assisted laparoscopic surgery

    Get PDF
    One of the main challenges for computer-assisted surgery (CAS) is to determine the intra-opera- tive morphology and motion of soft-tissues. This information is prerequisite to the registration of multi-modal patient-specific data for enhancing the surgeon’s navigation capabilites by observ- ing beyond exposed tissue surfaces and for providing intelligent control of robotic-assisted in- struments. In minimally invasive surgery (MIS), optical techniques are an increasingly attractive approach for in vivo 3D reconstruction of the soft-tissue surface geometry. This paper reviews the state-of-the-art methods for optical intra-operative 3D reconstruction in laparoscopic surgery and discusses the technical challenges and future perspectives towards clinical translation. With the recent paradigm shift of surgical practice towards MIS and new developments in 3D opti- cal imaging, this is a timely discussion about technologies that could facilitate complex CAS procedures in dynamic and deformable anatomical regions

    Automatic visual recognition using parallel machines

    Get PDF
    Invariant features and quick matching algorithms are two major concerns in the area of automatic visual recognition. The former reduces the size of an established model database, and the latter shortens the computation time. This dissertation, will discussed both line invariants under perspective projection and parallel implementation of a dynamic programming technique for shape recognition. The feasibility of using parallel machines can be demonstrated through the dramatically reduced time complexity. In this dissertation, our algorithms are implemented on the AP1000 MIMD parallel machines. For processing an object with a features, the time complexity of the proposed parallel algorithm is O(n), while that of a uniprocessor is O(n2). The two applications, one for shape matching and the other for chain-code extraction, are used in order to demonstrate the usefulness of our methods. Invariants from four general lines under perspective projection are also discussed in here. In contrast to the approach which uses the epipolar geometry, we investigate the invariants under isotropy subgroups. Theoretically speaking, two independent invariants can be found for four general lines in 3D space. In practice, we show how to obtain these two invariants from the projective images of four general lines without the need of camera calibration. A projective invariant recognition system based on a hypothesis-generation-testing scheme is run on the hypercube parallel architecture. Object recognition is achieved by matching the scene projective invariants to the model projective invariants, called transfer. Then a hypothesis-generation-testing scheme is implemented on the hypercube parallel architecture

    Automatic Registration of Optical Aerial Imagery to a LiDAR Point Cloud for Generation of City Models

    Get PDF
    This paper presents a framework for automatic registration of both the optical and 3D structural information extracted from oblique aerial imagery to a Light Detection and Ranging (LiDAR) point cloud without prior knowledge of an initial alignment. The framework employs a coarse to fine strategy in the estimation of the registration parameters. First, a dense 3D point cloud and the associated relative camera parameters are extracted from the optical aerial imagery using a state-of-the-art 3D reconstruction algorithm. Next, a digital surface model (DSM) is generated from both the LiDAR and the optical imagery-derived point clouds. Coarse registration parameters are then computed from salient features extracted from the LiDAR and optical imagery-derived DSMs. The registration parameters are further refined using the iterative closest point (ICP) algorithm to minimize global error between the registered point clouds. The novelty of the proposed approach is in the computation of salient features from the DSMs, and the selection of matching salient features using geometric invariants coupled with Normalized Cross Correlation (NCC) match validation. The feature extraction and matching process enables the automatic estimation of the coarse registration parameters required for initializing the fine registration process. The registration framework is tested on a simulated scene and aerial datasets acquired in real urban environments. Results demonstrates the robustness of the framework for registering optical and 3D structural information extracted from aerial imagery to a LiDAR point cloud, when co-existing initial registration parameters are unavailable

    A Practical Reflectance Transformation Imaging Pipeline for Surface Characterization in Cultural Heritage

    Get PDF
    We present a practical acquisition and processing pipeline to characterize the surface structure of cultural heritage objects. Using a free-form Reflectance Transformation Imaging (RTI) approach, we acquire multiple digital photographs of the studied object shot from a stationary camera. In each photograph, a light is freely positioned around the object in order to cover a wide variety of illumination directions. Multiple reflective spheres and white Lambertian surfaces are added to the scene to automatically recover light positions and to compensate for non-uniform illumination. An estimation of geometry and reflectance parameters (e.g., albedo, normals, polynomial texture maps coefficients) is then performed to locally characterize surface properties. The resulting object description is stable and representative enough of surface features to reliably provide a characterization of measured surfaces. We validate our approach by comparing RTI-acquired data with data acquired with a high-resolution microprofilometer.Terms: "European Union (EU)" & "Horizon 2020" / Action: H2020-EU.3.6.3. - Reflective societies - cultural heritage and European identity / Acronym: Scan4Reco / Grant number: 66509
    corecore