1,303 research outputs found

    Data Fusion of Objects Using Techniques Such as Laser Scanning, Structured Light and Photogrammetry for Cultural Heritage Applications

    In this paper we present a semi-automatic 2D-3D local registration pipeline capable of coloring 3D models obtained from 3D scanners by using uncalibrated images. The proposed pipeline exploits the Structure from Motion (SfM) technique in order to reconstruct a sparse representation of the 3D object and obtain the camera parameters from image feature matches. We then coarsely register the reconstructed 3D model to the scanned one through the Scale Iterative Closest Point (SICP) algorithm. SICP provides the global scale, rotation and translation parameters, using minimal manual user intervention. In the final processing stage, a local registration refinement algorithm optimizes the color projection of the aligned photos on the 3D object removing the blurring/ghosting artefacts introduced due to small inaccuracies during the registration. The proposed pipeline is capable of handling real world cases with a range of characteristics from objects with low level geometric features to complex ones
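
The abstract does not spell out the SICP steps, so the following is only a minimal sketch of the kind of estimate it refers to: a closed-form least-squares fit of global scale, rotation and translation between two point sets whose correspondences are already known (the standard Umeyama solution). A scale-aware ICP would typically alternate a step like this with nearest-neighbour matching. The function name `estimate_similarity` and the demo values are hypothetical and are not taken from the paper.

```python
import numpy as np


def estimate_similarity(src, dst):
    """Return (s, R, t) minimising sum ||dst_i - (s * R @ src_i + t)||^2.

    src, dst: (N, 3) arrays of corresponding 3D points (assumed already matched).
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    mu_src, mu_dst = src.mean(axis=0), dst.mean(axis=0)
    src_c, dst_c = src - mu_src, dst - mu_dst

    # Cross-covariance between the centred point sets.
    cov = dst_c.T @ src_c / len(src)
    U, D, Vt = np.linalg.svd(cov)

    # Flip the last axis if needed so that R is a proper rotation (det = +1).
    S = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        S[2, 2] = -1.0

    R = U @ S @ Vt
    var_src = (src_c ** 2).sum() / len(src)
    s = np.trace(np.diag(D) @ S) / var_src
    t = mu_dst - s * R @ mu_src
    return s, R, t


if __name__ == "__main__":
    # Hypothetical demo: recover a known transform from synthetic points.
    rng = np.random.default_rng(0)
    pts = rng.normal(size=(50, 3))
    a = 0.4
    R_true = np.array([[np.cos(a), -np.sin(a), 0.0],
                       [np.sin(a), np.cos(a), 0.0],
                       [0.0, 0.0, 1.0]])
    moved = 1.7 * pts @ R_true.T + np.array([0.5, -1.0, 2.0])
    print(estimate_similarity(pts, moved))
```

With noisy or partially overlapping scans the correspondences are not known in advance, which is why the pipeline described above still needs an iterative scheme and some manual initialisation.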

    Perceptual integration for qualitatively different 3-D cues in the human brain.

    The visual system's flexibility in estimating depth is remarkable: We readily perceive 3-D structure under diverse conditions from the seemingly random dots of a "magic eye" stereogram to the aesthetically beautiful, but obviously flat, canvasses of the Old Masters. Yet, 3-D perception is often enhanced when different cues specify the same depth. This perceptual process is understood as Bayesian inference that improves sensory estimates. Despite considerable behavioral support for this theory, insights into the cortical circuits involved are limited. Moreover, extant work tested quantitatively similar cues, reducing some of the challenges associated with integrating computationally and qualitatively different signals. Here we address this challenge by measuring fMRI responses to depth structures defined by shading, binocular disparity, and their combination. We quantified information about depth configurations (convex "bumps" vs. concave "dimples") in different visual cortical areas using pattern classification analysis. We found that fMRI responses in dorsal visual area V3B/KO were more discriminable when disparity and shading concurrently signaled depth, in line with the predictions of cue integration. Importantly, by relating fMRI and psychophysical tests of integration, we observed a close association between depth judgments and activity in this area. Finally, using a cross-cue transfer test, we found that fMRI responses evoked by one cue afford classification of responses evoked by the other. This reveals a generalized depth representation in dorsal visual cortex that combines qualitatively different information in line with 3-D perception
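
As an illustration of the Bayesian cue-integration idea mentioned in this abstract, here is a minimal sketch of the textbook model in which two independent, Gaussian-distributed depth estimates are combined by weighting each with its reliability (inverse variance). This is not the paper's fMRI analysis; the function name `combine_cues` and the example numbers are hypothetical.

```python
def combine_cues(mu_a, var_a, mu_b, var_b):
    """Fuse two independent Gaussian estimates of the same quantity.

    Each cue is weighted by its reliability (1 / variance). The fused
    variance is never larger than the smaller of the two input variances.
    """
    if var_a <= 0 or var_b <= 0:
        raise ValueError("variances must be positive")
    rel_a, rel_b = 1.0 / var_a, 1.0 / var_b
    w_a = rel_a / (rel_a + rel_b)
    mu = w_a * mu_a + (1.0 - w_a) * mu_b
    var = 1.0 / (rel_a + rel_b)
    return mu, var


if __name__ == "__main__":
    # Hypothetical values: a disparity cue and a shading cue for one depth.
    print(combine_cues(mu_a=10.0, var_a=1.0, mu_b=14.0, var_b=4.0))
```

Under this model the combined estimate is more precise than either cue alone, which is the sense in which responses are predicted to be more discriminable when both cues signal depth.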

    Stereoscopic motion analysis in densely packed clusters: 3D analysis of the shimmering behaviour in Giant honey bees

    Background: The detailed interpretation of mass phenomena such as human escape panic or swarm behaviour in birds, fish and insects requires detailed analysis of the 3D movements of individual participants. Here, we describe the adaptation of a 3D stereoscopic imaging method to measure the positional coordinates of individual agents in densely packed clusters. The method was applied to study behavioural aspects of shimmering in Giant honeybees, a collective defence behaviour that deters predatory wasps by visual cues, whereby individual bees flip their abdomen upwards in a split second, producing Mexican wave-like patterns. Results: Stereoscopic imaging provided non-invasive, automated, simultaneous, in-situ 3D measurements of hundreds of bees on the nest surface regarding their thoracic position and orientation of the body length axis. Segmentation was the basis for the stereo matching, which defined correspondences of individual bees in pairs of stereo images. Stereo-matched "agent bees" were re-identified in subsequent frames by the tracking procedure and triangulated into real-world coordinates. These algorithms were required to calculate the three spatial motion components (dx: horizontal, dy: vertical and dz: towards and from the comb) of individual bees over time. Conclusions: The method enables the assessment of the 3D positions of individual Giant honeybees, which is not possible with single-view cameras. The method can be applied to distinguish at the individual bee level active movements of the thoraces produced by abdominal flipping from passive motions generated by the moving bee curtain. The data provide evidence that the z-deflections of thoraces are potential cues for colony-intrinsic communication. The method helps to understand the phenomenon of collective decision-making through mechanoceptive synchronization and to associate shimmering with the principles of wave propagation. With further, minor modifications, the method could be used to study aspects of other mass phenomena that involve active and passive movements of individual agents in densely packed clusters.
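
The triangulation step named in this abstract can be illustrated with a minimal sketch for the simplest case: a rectified stereo pair with parallel optical axes, where depth follows from horizontal disparity. The abstract does not state the camera geometry that was used, so the rectified setup, the function name `triangulate_rectified` and all parameter names are assumptions made for illustration.

```python
def triangulate_rectified(u_left, v_left, u_right, focal_px, baseline_m, cx, cy):
    """Triangulate one matched point from a rectified stereo pair.

    u_left, v_left: pixel coordinates in the left image
    u_right:        column of the same point in the right image
    focal_px:       focal length in pixels (assumed equal for both cameras)
    baseline_m:     distance between the camera centres in metres
    cx, cy:         principal point in pixels
    Returns (x, y, z) in metres in the left-camera frame.
    """
    disparity = u_left - u_right
    if disparity <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    z = focal_px * baseline_m / disparity
    x = (u_left - cx) * z / focal_px
    y = (v_left - cy) * z / focal_px
    return x, y, z


if __name__ == "__main__":
    # Hypothetical values for a single matched point.
    print(triangulate_rectified(660.0, 380.0, 610.0, 1000.0, 0.1, 640.0, 360.0))
```

Differencing the triangulated coordinates of one tracked point across frames gives motion components of the kind the abstract labels dx, dy and dz.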

    3D scanning of cultural heritage with consumer depth cameras

    Three dimensional reconstruction of cultural heritage objects is an expensive and time-consuming process. Recent consumer real-time depth acquisition devices, like Microsoft Kinect, allow very fast and simple acquisition of 3D views. However 3D scanning with such devices is a challenging task due to the limited accuracy and reliability of the acquired data. This paper introduces a 3D reconstruction pipeline suited to use consumer depth cameras as hand-held scanners for cultural heritage objects. Several new contributions have been made to achieve this result. They include an ad-hoc filtering scheme that exploits the model of the error on the acquired data and a novel algorithm for the extraction of salient points exploiting both depth and color data. Then the salient points are used within a modified version of the ICP algorithm that exploits both geometry and color distances to precisely align the views even when geometry information is not sufficient to constrain the registration. The proposed method, although applicable to generic scenes, has been tuned to the acquisition of sculptures and in this connection its performance is rather interesting as the experimental results indicate

    Stereoscopic high dynamic range imaging

    Two modern technologies show promise to dramatically increase immersion in virtual environments. Stereoscopic imaging captures two images representing the views of both eyes and allows for better depth perception. High dynamic range (HDR) imaging accurately represents real world lighting as opposed to traditional low dynamic range (LDR) imaging. HDR provides better contrast and more natural looking scenes. The combination of the two technologies in order to gain the advantages of both has been, until now, mostly unexplored due to the current limitations in the imaging pipeline. This thesis reviews both fields, proposes a stereoscopic high dynamic range (SHDR) imaging pipeline, outlines the challenges that need to be resolved to enable SHDR, and focuses on the capture and compression aspects of that pipeline. The problems of capturing SHDR images, which would potentially require two HDR cameras and introduce ghosting, are mitigated by capturing an HDR and LDR pair and using it to generate SHDR images. A detailed user study compared four different methods of generating SHDR images. Results demonstrated that one of the methods may produce images perceptually indistinguishable from the ground truth. Insights obtained while developing static image operators guided the design of SHDR video techniques. Three methods for generating SHDR video from an HDR-LDR video pair are proposed and compared to the ground truth SHDR videos. Results showed little overall error and identified a method with the least error. Once captured, SHDR content needs to be efficiently compressed. Five SHDR compression methods that are backward compatible are presented. The proposed methods can encode SHDR content to a size little more than that of a traditional single LDR image (18% larger for one method), and the backward compatibility property encourages early adoption of the format. The work presented in this thesis has introduced and advanced capture and compression methods for the adoption of SHDR imaging. In general, this research paves the way for a novel field of SHDR imaging which should lead to improved and more realistic representation of captured scenes

    Combined Learned and Classical Methods for Real-Time Visual Perception in Autonomous Driving

    Autonomy, robotics, and Artificial Intelligence (AI) are among the main defining themes of next-generation societies. Among the most important applications of these technologies is driving automation, which spans from Advanced Driver Assistance Systems (ADAS) to fully self-driving vehicles. Driving automation promises to reduce accidents, increase safety, and increase access to mobility for more people such as the elderly and the handicapped. However, one of the main challenges facing autonomous vehicles is robust perception, which can enable safe interaction and decision making. Among the many sensors used to perceive the environment, each with its own capabilities and limitations, vision is one of the main sensing modalities. Cameras are cheap and can provide rich information about the observed scene. Therefore, this dissertation develops a set of visual perception algorithms with a focus on autonomous driving as the target application area. This dissertation starts by addressing the problem of real-time motion estimation of an agent using only the visual input from a camera attached to it, a problem known as visual odometry. The visual odometry algorithm can achieve low drift rates over long traveled distances. This is made possible through the innovative local mapping approach used. This visual odometry algorithm was then combined with my multi-object detection and tracking system. The tracking system operates in a tracking-by-detection paradigm where an object detector based on convolutional neural networks (CNNs) is used. Therefore, the combined system can detect and track other traffic participants both in the image domain and in the 3D world frame while simultaneously estimating vehicle motion. This is a requirement for obstacle avoidance and safe navigation. Finally, the operational range of traditional monocular cameras was expanded with the capability to infer depth and thus replace stereo and RGB-D cameras. This is accomplished through a single-stream convolutional neural network which can output both depth prediction and semantic segmentation. Semantic segmentation is the process of classifying each pixel in an image and is an important step toward scene understanding. A literature survey, algorithm descriptions, and comprehensive evaluations on real-world datasets are presented. Ph.D., College of Engineering & Computer Science, University of Michigan. https://deepblue.lib.umich.edu/bitstream/2027.42/153989/1/Mohamed Aladem Final Dissertation.pdf

    Change blindness: eradication of gestalt strategies

    Arrays of eight texture-defined rectangles were used as stimuli in a one-shot change blindness (CB) task where there was a 50% chance that one rectangle would change orientation between two successive presentations separated by an interval. CB was eliminated by cueing the target rectangle in the first stimulus, reduced by cueing in the interval and unaffected by cueing in the second presentation. This supports the idea that a representation was formed that persisted through the interval before being 'overwritten' by the second presentation [Landman et al, 2003 Vision Research 43 149–164]. Another possibility is that participants used some kind of grouping or Gestalt strategy. To test this, we changed the spatial position of the rectangles in the second presentation by shifting them along imaginary spokes (by ±1 degree) emanating from the central fixation point. There was no significant difference in performance between this and the standard task [F(1,4)=2.565, p=0.185]. This suggests that Gestalt grouping is not used as a strategy in these tasks, and it lends further weight to the argument that objects may be stored in and retrieved from a pre-attentional store during this task

    Intraframe Scene Capturing and Speed Measurement Based on Superimposed Image: New Sensor Concept for Vehicle Speed Measurement

    A vision based vehicle speed measurement method is presented in this paper. The proposed intraframe method calculates speed estimates based on a single frame of a single camera. With a special double exposure, a superimposed image can be obtained, where motion blur appears significantly only in the bright regions of the otherwise sharp image. This motion blur contains information of the movement of bright objects during the exposure. Most papers in the field of motion blur are aiming at the removal of this image degradation effect. In this work, we utilize it for a novel speed measurement approach. An applicable sensor structure and exposure-control system are also shown, as well as the applied image processing methods and experimental results. © 2016 Mate Nemeth and Akos Zarandy

    Novel haptic interface For viewing 3D images

    In recent years there has been an explosion of devices and systems capable of displaying stereoscopic 3D images. While these systems provide an improved experience over traditional two-dimensional displays, they often fall short on user immersion. Usually these systems only improve depth perception by relying on the stereopsis phenomenon. We propose a system that improves the user experience and immersion by providing position-dependent rendering of the scene and the ability to touch the scene. This system uses depth maps to represent the geometry of the scene. Depth maps can be easily obtained during the rendering process or can be derived from binocular stereo images by calculating their horizontal disparity. This geometry is then used as input for rendering on a 3D display, for the haptic rendering calculations, and for the position-dependent rendering of the scene. We present two main contributions. First, since haptic devices have a finite workspace and limited resolution, we used what we call detail mapping algorithms. These algorithms compress the geometry information contained in a depth map, by reducing the contrast among pixels, in such a way that it can be rendered into a limited-resolution display medium without losing any detail. Second, we introduce the unique combination of a depth camera used as a motion capture system, a 3D display and a haptic device to enhance the user experience. While developing this system we paid special attention to the cost and availability of the hardware. We decided to use only off-the-shelf, mass-consumer-oriented hardware so that our experiments can be easily implemented and replicated. As an additional benefit, the total cost of the hardware did not exceed one thousand dollars, making it affordable for many individuals and institutions