120 research outputs found
Recommended from our members
High-quality dense stereo vision for whole body imaging and obesity assessment
The prevalence of obesity has made it necessary to develop safe and convenient tools for timely assessment and monitoring of this condition across a broad population. Three-dimensional (3D) body imaging has become a new means of obesity assessment. Moreover, it generates body shape information that is meaningful for fitness, ergonomics, and personalized clothing. In previous work, our lab developed a prototype active stereo vision system that demonstrated the potential to fulfill this goal. However, the prototype required four computer projectors to cast artificial textures on the body, which facilitate stereo matching on texture-deficient images (e.g., skin). This reduces the mobility of the system when it is used to collect data from a large population. In addition, the resolution of the generated 3D images is limited by the cameras and projectors available during the project. The study reported in this dissertation highlights our continued effort to improve the capability of 3D body imaging through simplified hardware for passive stereo and advanced computation techniques.
The system uses high-resolution single-lens reflex (SLR) cameras, which have recently become widely available, and is configured in a two-stance design to image the front and back surfaces of a person. A total of eight cameras form four stereo pairs, and each stereo unit covers a quarter of the body surface. The stereo units are individually calibrated with a specific pattern to determine the cameras' intrinsic and extrinsic parameters for stereo matching. The global orientation and position of each stereo unit within a common world coordinate system are calculated through a 3D registration step. The stereo calibration and 3D registration procedures do not need to be repeated for a deployed system if the cameras' relative positions have not changed. This property contributes to the portability of the system and greatly eases maintenance. The image acquisition time is around two seconds for a whole-body capture. The system works in an indoor environment with moderate ambient light.
Advanced stereo computation algorithms are developed by taking advantage of high-resolution images and by tackling the ambiguity problem in stereo matching. A multi-scale, coarse-to-fine matching framework is proposed to match large-scale textures at a low resolution and refine the matched results over higher resolutions. This matching strategy reduces the complexity of the computation and avoids ambiguous matching at the native resolution. The pixel-to-pixel stereo matching algorithm follows a classic, four-step strategy which consists of matching cost computation, cost aggregation, disparity computation and disparity refinement.
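The coarse-to-fine idea described above can be illustrated with a minimal sketch on a single scanline. This is not the dissertation's algorithm: the pyramid depth, the sum-of-absolute-differences cost, the window radius, the search range of plus or minus one pixel at finer levels, and all function names are assumptions chosen for illustration, and the refinement step is omitted.

```python
import random

def downsample(row):
    """Halve a scanline by averaging neighbouring pixel pairs."""
    return [(row[i] + row[i + 1]) / 2.0 for i in range(0, len(row) - 1, 2)]

def sad_cost(left, right, x, d, radius):
    """Matching cost aggregated over a window (sum of absolute differences)."""
    total = 0.0
    for k in range(-radius, radius + 1):
        xl = min(max(x + k, 0), len(left) - 1)
        xr = min(max(x + k - d, 0), len(right) - 1)
        total += abs(left[xl] - right[xr])
    return total

def match_row(left, right, candidates_for, radius):
    """Winner-takes-all disparity per pixel over the allowed candidates."""
    return [min(candidates_for(x),
                key=lambda d: sad_cost(left, right, x, d, radius))
            for x in range(len(left))]

def coarse_to_fine(left, right, max_disp, levels=3, radius=2):
    """Full search at the coarsest level, then a +/-1 search at finer levels."""
    pyramid = [(left, right)]
    for _ in range(levels - 1):
        l, r = pyramid[-1]
        pyramid.append((downsample(l), downsample(r)))

    l, r = pyramid[-1]
    coarse_range = range(0, max_disp // 2 ** (levels - 1) + 1)
    disp = match_row(l, r, lambda x: coarse_range, radius)

    for l, r in reversed(pyramid[:-1]):
        prev = disp

        def candidates(x, prev=prev):
            # A disparity doubles when the resolution doubles.
            centre = 2 * prev[min(x // 2, len(prev) - 1)]
            return range(max(centre - 1, 0), centre + 2)

        disp = match_row(l, r, candidates, radius)
    return disp

# Synthetic scanline pair: the right view is the left view shifted by 4 pixels.
rng = random.Random(0)
left = [rng.random() for _ in range(64)]
right = left[4:] + [0.0] * 4
disp = coarse_to_fine(left, right, max_disp=8)
```

Only three candidates are tested per pixel at each finer level, which is where the reduction in computation and ambiguity comes from in this sketch.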
The system performance has been evaluated on mannequins and human subjects in comparison with other measurement methods. It was found that the geometrical measurements from reconstructed 3D body models, including body circumferences and whole volume, are highly repeatable and consistent with manual and other instrumental measurements (CV 0.99). The agreement of percent body fat (%BF) estimation on human subjects between stereo and dual-energy X-ray absorptiometry (DEXA) was found to be improved over the previous active stereo system, and the limits of agreement with 95% confidence were reduced by half. Our achieved %BF estimation agreement is among the lowest of comparable studies using commercial air displacement plethysmography (ADP) and DEXA. In practice, %BF estimation through a two-component model is sensitive to body volume measurement, and the estimation of lung volume could be a source of variation. Protocols for this type of measurement should still be created with an awareness of this factor. Biomedical Engineerin
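The sensitivity of %BF to body volume can be made concrete with a short sketch. The abstract does not say which two-component model was used; the Siri equation, %BF = (4.95 / D - 4.50) x 100 with body density D in kg/L, is assumed here as one common choice. The function name, the lung-volume correction, and the example numbers are likewise illustrative assumptions.

```python
def percent_body_fat_siri(mass_kg, surface_volume_l, lung_volume_l=0.0):
    """Percent body fat from mass and volume via the Siri equation (assumed model).

    surface_volume_l is the volume enclosed by the body surface; the air held
    in the lungs is subtracted to obtain tissue volume (an assumed correction).
    """
    tissue_volume_l = surface_volume_l - lung_volume_l
    density = mass_kg / tissue_volume_l          # kg/L, equivalent to g/cm^3
    return (4.95 / density - 4.50) * 100.0

# Hypothetical subject: 70 kg with a tissue volume of 66.5 L.
baseline = percent_body_fat_siri(70.0, 66.5)
# The same subject with the volume overestimated by 0.5 L.
shifted = percent_body_fat_siri(70.0, 67.0)
delta = shifted - baseline
```

Under these assumptions a 0.5 L volume error moves the estimate by roughly 3.5 percentage points, which is why an uncertain lung volume matters.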
Unfalsified visual servoing for simultaneous object recognition and pose tracking
Simultaneous object recognition and tracking in complex environments remains a challenging topic in computer vision and robotics. Current approaches are usually fragile because of spurious feature matching and local convergence in pose determination. Once a failure happens, these approaches lack a mechanism to recover automatically. In this paper, data-driven unfalsified control is proposed for solving this problem in visual servoing. It recognizes a target by matching image features with a 3-D model and then tracks them through dynamic visual servoing. The features can be falsified or unfalsified by a supervisory mechanism according to their tracking performance. Supervisory visual servoing is repeated until a consensus between the model and the selected features is reached, so that model recognition and object tracking are accomplished. Experiments show the effectiveness and robustness of the proposed algorithm in dealing with matching and tracking failures caused by various disturbances, such as fast motion, occlusions, and illumination variation.
Multi-camera object segmentation in dynamically textured scenes using disparity contours
This thesis presents a stereo-based object segmentation system that combines the simplicity and efficiency of the background subtraction approach with the capacity to deal with dynamic lighting, dynamic background texture, and large textureless regions. The proposed method does not rely on full stereo reconstruction or empirical parameter tuning, but employs disparity-based hypothesis verification to separate multiple objects at different depths. The system uses a pair of calibrated cameras with a small baseline and factors the segmentation problem into two stages: a well-understood offline stage and a novel online one. Based on the calibrated parameters, the offline stage models the 3D geometry of the background by constructing a complete disparity map. The online stage compares corresponding new frames, captured synchronously by the two cameras, according to the background disparity map in order to falsify the hypothesis that the scene contains only background. The resulting object boundary contours possess a number of useful features that can be exploited for object segmentation. Three different approaches to contour extraction and object segmentation were tested, and their advantages and limitations analyzed. The system demonstrates its ability to extract multiple objects from a complex scene with near real-time performance. The algorithm also has the potential to provide precise object boundaries rather than just bounding boxes, and is extensible to 2D and 3D object tracking and online background update.
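The online verification step can be sketched as follows. This is a minimal illustration and not the thesis implementation: the function name, the fixed intensity threshold, and the synthetic one-row images are assumptions. Each left pixel is compared with the right pixel that the background disparity map predicts, and the background-only hypothesis is falsified wherever the two disagree.

```python
def falsify_background(left, right, bg_disparity, threshold=10):
    """Mask that is True where the background-only hypothesis fails."""
    mask = []
    for y, row in enumerate(left):
        mask_row = []
        for x, value in enumerate(row):
            xr = x - bg_disparity[y][x]      # where the background would appear
            if 0 <= xr < len(right[y]):
                mask_row.append(abs(value - right[y][xr]) > threshold)
            else:
                mask_row.append(False)       # outside the right view; untestable
        mask.append(mask_row)
    return mask

# Synthetic background with a constant disparity of 2 pixels.
width = 12
left_bg = [[10 * x for x in range(width)]]
right_bg = [[10 * (x + 2) for x in range(width)]]
bg_disparity = [[2] * width]
empty_mask = falsify_background(left_bg, right_bg, bg_disparity)

# A uniform object at disparity 4: columns 5..7 on the left, 1..3 on the right.
left = [list(left_bg[0])]
right = [list(right_bg[0])]
for x in (5, 6, 7):
    left[0][x] = 255
for x in (1, 2, 3):
    right[0][x] = 255
mask = falsify_background(left, right, bg_disparity)
flagged = [x for x, hit in enumerate(mask[0]) if hit]
```

In this toy case the interior of the uniform object matches itself and only the columns around its edges are flagged, which is consistent with the boundary contours the abstract describes.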
Camera positioning for 3D panoramic image rendering
This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University London. Virtual camera realisation and the proposition of a trapezoidal camera architecture are the two broad contributions of this thesis. Firstly, multiple cameras and their arrangement constitute a critical component that affects the integrity of visual content acquisition for multi-view video. Currently, linear, convergence, and divergence arrays are the prominent camera topologies adopted. However, the large number of cameras required and their synchronisation are two of the prominent challenges usually encountered. The use of virtual cameras can significantly reduce the number of physical cameras used with respect to any of the known camera structures, hence reducing some of the other implementation issues. This thesis explores the use of image-based rendering with and without geometry in the implementations leading to the realisation of virtual cameras. The virtual camera implementation was carried out from the perspective of a depth map (geometry) and the use of multiple image samples (no geometry). Prior to the virtual camera realisation, the generation of depth maps was investigated using region match measures widely known for solving the image point correspondence problem. The constructed depth maps have been compared with the ones generated using the dynamic programming approach. In both the geometry and no-geometry approaches, the virtual cameras lead to the rendering of views from a textured depth map, the construction of a 3D panoramic image of a scene by stitching multiple image samples and performing superposition on them, and the computation of a virtual scene from a stereo pair of panoramic images. The quality of these rendered images was assessed through either objective or subjective analysis in Imatest software. Furthermore, metric reconstruction of a scene was performed by re-projection of the pixel points from multiple image samples with a single centre of projection. This was done using a sparse bundle adjustment algorithm. The statistical summary obtained after the application of this algorithm provides a gauge for the efficiency of the optimisation step. The optimised data was then visualised in the Meshlab software environment, hence providing the reconstructed scene. Secondly, with any of the well-established camera arrangements, all cameras are usually constrained to the same horizontal plane. Therefore, occlusion becomes an extremely challenging problem, and a robust camera set-up is required in order to resolve the hidden parts of any scene objects. To adequately meet the visibility condition for scene objects, and given that occlusion of the same scene objects can occur, a multi-plane camera structure is highly desirable. Therefore, this thesis also explores a trapezoidal camera structure for image acquisition. The approach here is to assess the feasibility and potential of several physical cameras of the same model being sparsely arranged on the edge of an efficient trapezoid graph. This is implemented in both Matlab and Maya. The quality of the depth maps rendered in Matlab are better in Quality
Depth recovery and parameter analysis using single-lens prism based stereovision system
Ph.DDOCTOR OF PHILOSOPH
Model-based human upper body tracking using interest points in real-time video
Vision-based human motion analysis has received considerable attention from researchers because of its many applications, such as automated surveillance, video indexing, human-machine interaction, traffic monitoring, and vehicle navigation. However, it contains several open problems. To date, despite very promising proposed approaches, no explicit solution has been found to solve these open problems efficiently. In this regard, this thesis presents a model-based human upper body pose estimation and tracking system using interest points (IPs) in real-time video.
In the first stage, we propose a novel IP-based background-subtraction algorithm to segment the foreground IPs of each frame from the background ones. Afterwards, the foreground IPs of any two consecutive frames are matched to each other using a dynamic hybrid local-spatial IP matching algorithm proposed in this research.
The IP matching algorithm starts by using the local feature descriptors of the IPs to find an initial set of possible matches. Then two filtering steps are applied to the results to increase the precision by deleting the mismatched pairs. To improve the recall, a spatial matching process is applied to the remaining unmatched points.
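The matching pipeline described above can be sketched as below. The abstract does not name the two filters or the spatial rule, so everything here is an assumption made for illustration: a nearest-neighbour ratio test and a mutual-consistency check stand in for the two filtering steps, and the spatial step assigns each leftover point to the nearest free point after applying the median displacement of the accepted matches. The function names and thresholds are hypothetical.

```python
import math
import statistics

def nearest_two(desc, candidates):
    """Best and second-best candidates as (distance, index) pairs."""
    ranked = sorted((math.dist(desc, c), j) for j, c in enumerate(candidates))
    second = ranked[1] if len(ranked) > 1 else (math.inf, None)
    return ranked[0], second

def match_points(prev_pts, prev_desc, curr_pts, curr_desc, ratio=0.8, radius=3.0):
    """Return {previous index: current index} for matched interest points."""
    matches = {}
    for i, desc in enumerate(prev_desc):
        (best, j), (second, _) = nearest_two(desc, curr_desc)
        if best > ratio * second:
            continue                      # filter 1 (assumed): ambiguous match
        (_, back), _ = nearest_two(curr_desc[j], prev_desc)
        if back != i:
            continue                      # filter 2 (assumed): not mutual
        matches[i] = j

    if not matches:
        return matches

    # Spatial step (assumed): predict positions using the median displacement.
    dx = statistics.median(curr_pts[j][0] - prev_pts[i][0] for i, j in matches.items())
    dy = statistics.median(curr_pts[j][1] - prev_pts[i][1] for i, j in matches.items())
    used = set(matches.values())
    for i, (px, py) in enumerate(prev_pts):
        if i in matches:
            continue
        free = [(math.dist((px + dx, py + dy), q), j)
                for j, q in enumerate(curr_pts) if j not in used]
        if free:
            dist, j = min(free)
            if dist <= radius:
                matches[i] = j
                used.add(j)
    return matches

# Four points translated by (5, 0); the last descriptor changes between frames.
prev_pts = [(0, 0), (10, 0), (0, 10), (10, 10)]
curr_pts = [(5, 0), (15, 0), (5, 10), (15, 10)]
prev_desc = [(1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 1)]
curr_desc = [(1, 0, 0), (0, 1, 0), (0, 0, 1), (0.2, 0.2, 0.2)]
matches = match_points(prev_pts, prev_desc, curr_pts, curr_desc)
```

In this example the fourth point fails the ratio test on descriptors alone and is recovered by the spatial step, which shows how the filters favour precision and the spatial pass restores recall.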
Finally, a two-stage hierarchical-global model-based pose estimation and tracking algorithm based on Particle Swarm Optimization (PSO) is proposed to track the human upper body through consecutive frames. Given the pose and the foreground IPs in the previous frame and the matched points in the current frame, the first stage estimates the current pose hierarchically by minimizing the discrepancy between the hypothesized pose and the observed matched points. A global PSO is then applied to the pose estimated by the first stage to perform a consistency check and pose refinement.
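A toy version of the two-stage search is sketched below. It is not the thesis algorithm: a rigid 2D point set with pose (tx, ty, theta) stands in for the articulated upper-body model, the hierarchy is reduced to "translation first, then rotation", and the swarm size, iteration count, and PSO coefficients are assumed values. All names are hypothetical.

```python
import math
import random

def discrepancy(pose, model, observed):
    """Sum of squared distances between posed model points and observations."""
    tx, ty, theta = pose
    c, s = math.cos(theta), math.sin(theta)
    return sum((c * x - s * y + tx - u) ** 2 + (s * x + c * y + ty - v) ** 2
               for (x, y), (u, v) in zip(model, observed))

def pso(cost, start, free_dims, spread=1.0, particles=20, iters=40):
    """Minimise cost over free_dims; one particle is kept at the start pose."""
    rng = random.Random(0)
    swarm = [list(start)]
    for _ in range(particles - 1):
        p = list(start)
        for d in free_dims:
            p[d] += rng.uniform(-spread, spread)
        swarm.append(p)
    vel = [[0.0] * len(start) for _ in swarm]
    own_best = [list(p) for p in swarm]
    own_cost = [cost(p) for p in swarm]
    g = min(range(len(swarm)), key=own_cost.__getitem__)
    g_best, g_cost = list(own_best[g]), own_cost[g]
    for _ in range(iters):
        for i, p in enumerate(swarm):
            for d in free_dims:
                vel[i][d] = (0.7 * vel[i][d]
                             + 1.5 * rng.random() * (own_best[i][d] - p[d])
                             + 1.5 * rng.random() * (g_best[d] - p[d]))
                p[d] += vel[i][d]
            c = cost(p)
            if c < own_cost[i]:
                own_best[i], own_cost[i] = list(p), c
                if c < g_cost:
                    g_best, g_cost = list(p), c
    return g_best, g_cost

def track(prev_pose, model, observed):
    """Stage 1: hierarchical search; stage 2: global refinement of all parameters."""
    cost = lambda p: discrepancy(p, model, observed)
    pose, _ = pso(cost, prev_pose, free_dims=[0, 1])    # root translation first
    pose, _ = pso(cost, pose, free_dims=[2])            # then rotation
    return pso(cost, pose, free_dims=[0, 1, 2], spread=0.2)

# Synthetic frame: the model observed under a hypothetical true pose.
model = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]
true_pose = (0.5, -0.3, 0.2)
c, s = math.cos(true_pose[2]), math.sin(true_pose[2])
observed = [(c * x - s * y + true_pose[0], s * x + c * y + true_pose[1])
            for x, y in model]
prev_pose = [0.0, 0.0, 0.0]
pose, final_cost = track(prev_pose, model, observed)
```

Keeping one particle at the previous pose means each stage can never return a worse pose than the one it started from, which mirrors how tracking reuses the estimate from the preceding frame.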
- …