    Compact Bilinear Pooling

    Bilinear models have been shown to achieve impressive performance on a wide range of visual tasks, such as semantic segmentation, fine-grained recognition and face recognition. However, bilinear features are high dimensional, typically on the order of hundreds of thousands to a few million, which makes them impractical for subsequent analysis. We propose two compact bilinear representations with the same discriminative power as the full bilinear representation but with only a few thousand dimensions. Our compact representations allow back-propagation of classification errors, enabling end-to-end optimization of the visual recognition system. The compact bilinear representations are derived through a novel kernelized analysis of bilinear pooling, which provides insights into the discriminative power of bilinear pooling and a platform for further research in compact pooling methods. Experiments illustrate the utility of the proposed representations for image classification and few-shot learning across several datasets. Comment: Camera ready version for CVP
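
    The dimensionality claim and the kernel view in the abstract above can be checked numerically. The sketch below assumes the usual formulation of bilinear pooling as a sum of outer products of local descriptors, which the abstract itself does not spell out; the function name `bilinear_pool` and the descriptor size of 512 are illustrative choices, not taken from the paper.

```python
import numpy as np

def bilinear_pool(X):
    """Full bilinear pooling (assumed form): sum of outer products x_s x_s^T.

    X has shape (n_locations, d); the result is a flattened d*d vector.
    """
    return (X.T @ X).reshape(-1)

rng = np.random.default_rng(0)
X = rng.standard_normal((5, 8))  # 5 local descriptors of dimension 8
Y = rng.standard_normal((7, 8))  # 7 local descriptors of dimension 8

# Linear kernel between the two pooled features ...
lhs = bilinear_pool(X) @ bilinear_pool(Y)
# ... equals the sum of squared inner products over all descriptor pairs,
# i.e. a second-order polynomial kernel summed over locations.
rhs = np.sum((X @ Y.T) ** 2)
assert np.isclose(lhs, rhs)

# With 512-dimensional descriptors (an assumed, illustrative size),
# the pooled feature already has hundreds of thousands of entries.
print(bilinear_pool(np.zeros((1, 512))).size)  # 262144
```

    Seen this way, any low-dimensional feature map that approximates the second-order polynomial kernel could stand in for the full outer product, which is one plausible reading of "compact" here.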

    3D Dynamic Scene Reconstruction from Multi-View Image Sequences

    A confirmation report outlining my PhD research plan is presented. The PhD research topic is 3D dynamic scene reconstruction from multiple view image sequences. Chapter 1 describes the motivation and research aims. An overview of the progress in the past year is included. Chapter 2 is a review of volumetric scene reconstruction techniques and Chapter 3 is an in-depth description of my proposed reconstruction method. The theory behind the proposed volumetric scene reconstruction method is also presented, including topics in projective geometry, camera calibration and energy minimization. Chapter 4 presents the research plan and outlines the future work planned for the next two years

    HSM: A New Color Space used in the Processing of Color Images

    Inspired by the techniques painters use to overlap layers of various hues of paint to create oil paintings, and by observations of the arrangement of the short- (S), middle- (M), and long- (L) wavelength-sensitive cones of the human retina for interpreting colors, this paper proposes a new color space called HSM for processing color images. To demonstrate the applicability of the HSM color space, this paper applies it to the pixel-based segmentation of a digital image into “human skin” or “non-skin”, to the sketch of a face image, and to the pixel-based segmentation of the trumpet flower tree (ype). The performance of the HSM color space in pixel-based segmentation is compared with the HSV, YCbCr and TSL color spaces, while the sketch of the face image is compared with the HSV, YCbCr and TSL color spaces and with the Sobel, Prewitt, Roberts, Canny and Laplacian of Gaussian edge detectors. The results demonstrate the potential of the proposed color space

    3D Model Assisted Image Segmentation

    The problem of segmenting a given image into coherent regions is important in Computer Vision and many industrial applications require segmenting a known object into its components. Examples include identifying individual parts of a component for proces

    Calipso: Physics-based Image and Video Editing through CAD Model Proxies

    We present Calipso, an interactive method for editing images and videos in a physically-coherent manner. Our main idea is to realize physics-based manipulations by running a full physics simulation on proxy geometries given by non-rigidly aligned CAD models. Running these simulations allows us to apply new, unseen forces to move or deform selected objects, change physical parameters such as mass or elasticity, or even add entire new objects that interact with the rest of the underlying scene. In Calipso, the user makes edits directly in 3D; these edits are processed by the simulation and then transferred to the target 2D content using shape-to-image correspondences in a photo-realistic rendering process. To align the CAD models, we introduce an efficient CAD-to-image alignment procedure that jointly optimizes rigid and non-rigid alignment while preserving the high-level structure of the input shape. Moreover, the user can choose to exploit image flow to estimate scene motion, producing coherent physical behavior with ambient dynamics. We demonstrate Calipso's physics-based editing on a wide range of examples producing myriad physical behavior while preserving geometric and visual consistency. Comment: 11 page

    Accelerated High-Resolution Photoacoustic Tomography via Compressed Sensing

    Current 3D photoacoustic tomography (PAT) systems offer either high image quality or high frame rates but are not able to deliver high spatial and temporal resolution simultaneously, which limits their ability to image dynamic processes in living tissue. A particular example is the planar Fabry-Perot (FP) scanner, which yields high-resolution images but takes several minutes to sequentially map the photoacoustic field on the sensor plane, point-by-point. However, as the spatio-temporal complexity of many absorbing tissue structures is rather low, the data recorded in such a conventional, regularly sampled fashion is often highly redundant. We demonstrate that combining variational image reconstruction methods using spatial sparsity constraints with the development of novel PAT acquisition systems capable of sub-sampling the acoustic wave field can dramatically increase the acquisition speed while maintaining a good spatial resolution. First, we describe and model two general spatial sub-sampling schemes. Then, we discuss how to implement them using the FP scanner and demonstrate the potential of these novel compressed sensing PAT devices through simulated data from a realistic numerical phantom and through measured data from a dynamic experimental phantom as well as from in-vivo experiments. Our results show that images with good spatial resolution and contrast can be obtained from highly sub-sampled PAT data if variational image reconstruction methods that describe the tissue structures with suitable sparsity constraints are used. In particular, we examine the use of total variation regularization enhanced by Bregman iterations. These novel reconstruction strategies offer new opportunities to dramatically increase the acquisition speed of PAT scanners that employ point-by-point sequential scanning as well as to reduce the channel count of parallelized schemes that use detector arrays. Comment: submitted to "Physics in Medicine and Biology