2,584 research outputs found

    Using encoder-decoder architecture for material segmentation based on beam profile analysis

    Recognition and segmentation of materials is a challenging problem because of the wide divergence in appearance within and between categories. Many recent material segmentation approaches treat materials as yet another set of labels, like objects. However, materials are fundamentally different from objects: they have no basic shape or defined spatial extent. Our approach largely sets this distinction aside and relies primarily on the limited implicit context (local appearance) available during training, because our training images contain almost no global image context: (I) the materials used have no inherent shape or defined spatial extent (for example, apples, oranges, and potatoes all have approximately the same spherical shape); (II) the images were taken against a black background, which largely removes the spatial features of the materials. We introduce a new material segmentation dataset acquired with a Beam Profile Analysis sensing device. The dataset contains 10 material categories, and each sample is an image pair consisting of grayscale images with and without the laser spots (grayscale and laser images), together with an annotated segmentation map. To the best of our knowledge, this is the first material segmentation dataset for Beam Profile Analysis images. As a second step, we propose a deep learning approach to perform material segmentation on our dataset; our proposed CNN is an encoder-decoder model based on the DeeplabV3+ architecture. Our main goal is to obtain segmented material maps and to discover how the laser spots contribute to the segmentation results; therefore, we perform a comparative analysis across different types of architectures to observe how the laser spots contribute to the overall segmentation. We built our experiments on three main types of models, each using a different type of input; for each model, we implemented various backbone architectures. Our experimental results show that the laser spots make an effective contribution to the segmentation results. The GrayLaser model achieves a significant accuracy improvement compared to the other models: its fine-tuned architecture reaches 94% MIoU, while the version trained from scratch reaches 62% MIoU
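    The abstract above describes the model family but no implementation; the following is a minimal sketch of the GrayLaser-style setup, assuming a recent PyTorch/torchvision environment and using torchvision's DeepLabV3 (ResNet-50 backbone) as a stand-in for the paper's DeepLabV3+ model. The 2-channel input, the image size, and NUM_CLASSES are illustrative assumptions rather than details taken from the paper.

        # Sketch: 2-channel input (grayscale image + laser-spot image) fed to an
        # encoder-decoder segmentation network; DeepLabV3 stands in for DeepLabV3+.
        import torch
        import torch.nn as nn
        from torchvision.models.segmentation import deeplabv3_resnet50

        NUM_CLASSES = 10  # material categories in the dataset (illustrative constant name)

        model = deeplabv3_resnet50(weights=None, num_classes=NUM_CLASSES)
        # The ResNet stem expects 3-channel RGB input; replace it so the network
        # accepts the 2-channel grayscale + laser pair instead.
        model.backbone.conv1 = nn.Conv2d(2, 64, kernel_size=7, stride=2,
                                         padding=3, bias=False)

        gray = torch.rand(1, 1, 256, 256)    # grayscale image
        laser = torch.rand(1, 1, 256, 256)   # same scene with laser spots
        x = torch.cat([gray, laser], dim=1)  # (N, 2, H, W)

        with torch.no_grad():
            logits = model(x)["out"]         # (N, NUM_CLASSES, H, W)
        pred = logits.argmax(dim=1)          # per-pixel material labels

    Fine-tuning a pretrained backbone versus training it from scratch corresponds to the two accuracy figures quoted above (94% vs. 62% MIoU).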

    Micro Fourier Transform Profilometry (μFTP): 3D shape measurement at 10,000 frames per second

    Recent advances in imaging sensors and digital light projection technology have facilitated rapid progress in 3D optical sensing, enabling 3D surfaces of complex-shaped objects to be captured with improved resolution and accuracy. However, due to the large number of projection patterns required for phase recovery and disambiguation, the maximum frame rates of current 3D shape measurement techniques are still limited to the range of hundreds of frames per second (fps). Here, we demonstrate a new 3D dynamic imaging technique, Micro Fourier Transform Profilometry (μFTP), which can capture 3D surfaces of transient events at up to 10,000 fps based on our newly developed high-speed fringe projection system. Compared with existing techniques, μFTP has the prominent advantage of recovering an accurate, unambiguous, and dense 3D point cloud with only two projected patterns. Furthermore, the phase information is encoded within a single high-frequency fringe image, thereby allowing motion-artifact-free reconstruction of transient events with a temporal resolution of 50 microseconds. To show μFTP's broad utility, we use it to reconstruct 3D videos of four transient scenes: vibrating cantilevers, rotating fan blades, a bullet fired from a toy gun, and a balloon explosion triggered by a flying dart, all of which were previously difficult or even impossible to capture with conventional approaches.
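    The core operation behind μFTP is classical Fourier Transform Profilometry: the phase carried by a single high-frequency fringe image is recovered by isolating one spectral sideband. The sketch below illustrates that step only, on synthetic data with NumPy; the carrier frequency, image size, and test phase are illustrative assumptions, and the paper's two-pattern disambiguation and high-speed projection hardware are not modelled.

        # Sketch of single-image Fourier Transform Profilometry phase extraction.
        import numpy as np

        H, W = 480, 640
        F0 = 32  # carrier frequency: fringe periods across the image width (assumed)

        # Synthesize a fringe image with an unknown phase modulation phi(x, y).
        y, x = np.mgrid[0:H, 0:W]
        phi_true = 2.0 * np.exp(-(((x - W/2)**2 + (y - H/2)**2) / (2 * 80.0**2)))
        fringe = 0.5 + 0.4 * np.cos(2 * np.pi * F0 * x / W + phi_true)

        # 1) FFT along the fringe direction.
        spectrum = np.fft.fft(fringe, axis=1)

        # 2) Band-pass: keep a window around the +F0 sideband, zero everything else.
        mask = np.zeros_like(spectrum)
        lo, hi = F0 - F0 // 2, F0 + F0 // 2
        mask[:, lo:hi] = spectrum[:, lo:hi]

        # 3) Inverse FFT of the sideband; its complex angle is the wrapped phase
        #    (carrier + modulation). Remove the carrier ramp to leave phi, wrapped.
        analytic = np.fft.ifft(mask, axis=1)
        wrapped = np.angle(analytic)
        carrier = 2 * np.pi * F0 * x / W
        phi_wrapped = np.angle(np.exp(1j * (wrapped - carrier)))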

    Event-based Vision: A Survey

    Event cameras are bio-inspired sensors that differ from conventional frame cameras: Instead of capturing images at a fixed rate, they asynchronously measure per-pixel brightness changes, and output a stream of events that encode the time, location and sign of the brightness changes. Event cameras offer attractive properties compared to traditional cameras: high temporal resolution (in the order of microseconds), very high dynamic range (140 dB vs. 60 dB), low power consumption, and high pixel bandwidth (on the order of kHz) resulting in reduced motion blur. Hence, event cameras have a large potential for robotics and computer vision in challenging scenarios for traditional cameras, such as low-latency, high speed, and high dynamic range. However, novel methods are required to process the unconventional output of these sensors in order to unlock their potential. This paper provides a comprehensive overview of the emerging field of event-based vision, with a focus on the applications and the algorithms developed to unlock the outstanding properties of event cameras. We present event cameras from their working principle, the actual sensors that are available and the tasks that they have been used for, from low-level vision (feature detection and tracking, optic flow, etc.) to high-level vision (reconstruction, segmentation, recognition). We also discuss the techniques developed to process events, including learning-based techniques, as well as specialized processors for these novel sensors, such as spiking neural networks. Additionally, we highlight the challenges that remain to be tackled and the opportunities that lie ahead in the search for a more efficient, bio-inspired way for machines to perceive and interact with the world
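    As a concrete illustration of the output format described above, the sketch below builds an event stream of (timestamp, x, y, polarity) tuples and accumulates a short time window into a signed "event frame", one of the simplest event representations. The sensor resolution and the randomly generated events are illustrative assumptions.

        # Sketch: event stream as (t, x, y, polarity) rows; accumulate a time slice.
        import numpy as np

        H, W = 180, 240                     # e.g. a DAVIS240-class resolution (assumed)

        rng = np.random.default_rng(0)
        n = 10_000
        events = np.column_stack([
            np.sort(rng.uniform(0.0, 0.05, n)),   # timestamps in seconds
            rng.integers(0, W, n),                # x coordinate
            rng.integers(0, H, n),                # y coordinate
            rng.choice([-1, 1], n),               # polarity (sign of brightness change)
        ])

        def event_frame(events, t0, t1, shape=(H, W)):
            """Sum signed polarities of events with t0 <= t < t1 into an image."""
            frame = np.zeros(shape, dtype=np.float32)
            sel = events[(events[:, 0] >= t0) & (events[:, 0] < t1)]
            np.add.at(frame, (sel[:, 2].astype(int), sel[:, 1].astype(int)), sel[:, 3])
            return frame

        frame = event_frame(events, 0.0, 0.01)   # 10 ms slice of the stream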

    Optical techniques for 3D surface reconstruction in computer-assisted laparoscopic surgery

    One of the main challenges for computer-assisted surgery (CAS) is to determine the intra-operative morphology and motion of soft tissues. This information is a prerequisite for the registration of multi-modal patient-specific data, for enhancing the surgeon’s navigation capabilities by observing beyond exposed tissue surfaces, and for providing intelligent control of robotic-assisted instruments. In minimally invasive surgery (MIS), optical techniques are an increasingly attractive approach for in vivo 3D reconstruction of the soft-tissue surface geometry. This paper reviews the state-of-the-art methods for optical intra-operative 3D reconstruction in laparoscopic surgery and discusses the technical challenges and future perspectives towards clinical translation. With the recent paradigm shift of surgical practice towards MIS and new developments in 3D optical imaging, this is a timely discussion about technologies that could facilitate complex CAS procedures in dynamic and deformable anatomical regions
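    As one concrete example of the passive techniques such a review covers, the sketch below estimates disparity on a rectified stereo pair with OpenCV's semi-global matcher and converts it to depth via Z = f * B / d. The synthetic image pair, focal length, and baseline are illustrative assumptions rather than values from any surveyed system.

        # Sketch: passive stereo depth estimation on a synthetic rectified pair.
        import cv2
        import numpy as np

        # Synthetic rectified pair: the right view is the left view shifted
        # horizontally so that the true disparity is 16 pixels everywhere.
        rng = np.random.default_rng(1)
        left = (rng.random((240, 320)) * 255).astype(np.uint8)
        left = cv2.GaussianBlur(left, (7, 7), 0)        # give the texture some structure
        right = np.roll(left, -16, axis=1)

        matcher = cv2.StereoSGBM_create(minDisparity=0, numDisparities=64, blockSize=7)
        disparity = matcher.compute(left, right).astype(np.float32) / 16.0  # SGBM is fixed-point (x16)

        # Depth from disparity for a rectified pair: Z = f * B / d.
        f_px = 700.0        # focal length in pixels (assumed)
        baseline_m = 0.004  # stereo baseline in metres (assumed, roughly a 4 mm scope)
        valid = disparity > 0
        depth = np.zeros_like(disparity)
        depth[valid] = f_px * baseline_m / disparity[valid]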

    Low-cost single-pixel 3D imaging by using an LED array

    We propose a method to perform color imaging with a single photodiode by using structured light illumination generated with a low-cost color LED array. The LED array is used to generate a sequence of color Hadamard patterns, which are projected onto the object by a simple optical system while the photodiode records the light intensity. A field-programmable gate array (FPGA) controls the LED panel, allowing us to obtain high refresh rates of up to 10 kHz. The system is extended to 3D imaging by simply adding a small number of photodiodes at different locations. The 3D shape of the object is obtained by using a non-calibrated photometric stereo technique. Experimental results are provided for an LED array with 32 × 32 elements
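    The single-pixel measurement model described above can be sketched in a few lines: each photodiode reading is the inner product of one Hadamard pattern with the scene, and the image is recovered by an inverse Hadamard transform. The simulation below assumes a 32 × 32 scene and ±1 patterns; the color sequencing, FPGA timing, and the extra photodiodes used for photometric-stereo 3D are not modelled.

        # Sketch: single-pixel imaging with Hadamard patterns and a simulated photodiode.
        import numpy as np
        from scipy.linalg import hadamard

        N = 32                                # image is N x N, so N*N patterns are projected
        Hmat = hadamard(N * N)                # rows are +/-1 Hadamard patterns

        scene = np.random.rand(N, N)          # stand-in for the unknown object
        x = scene.ravel()

        # Each photodiode reading is the inner product of one pattern with the scene.
        measurements = Hmat @ x

        # Hadamard matrices satisfy Hmat @ Hmat.T = (N*N) * I, so reconstruction
        # is a scaled multiplication by the transpose.
        recovered = (Hmat.T @ measurements) / (N * N)
        recovered = recovered.reshape(N, N)   # matches `scene` up to numerical error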