
    Learning Generative ConvNets via Multi-grid Modeling and Sampling

    This paper proposes a multi-grid method for learning energy-based generative ConvNet models of images. For each grid, we learn an energy-based probabilistic model whose energy function is defined by a bottom-up convolutional neural network (ConvNet or CNN). Learning such a model requires generating synthesized examples from the model. Within each iteration of our learning algorithm, for each observed training image, we generate synthesized images at multiple grids, initializing the finite-step MCMC sampling from a minimal 1 x 1 version of the training image. The synthesized image at each subsequent grid is obtained by finite-step MCMC initialized from the synthesized image generated at the previous, coarser grid. After obtaining the synthesized examples, the parameters of the models at the multiple grids are updated separately and simultaneously based on the differences between the synthesized and observed examples. We show that this multi-grid method can learn realistic energy-based generative ConvNet models, and that it outperforms the original contrastive divergence (CD) and persistent CD.

    Comment: CVPR 2018
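    To make the coarse-to-fine sampling concrete, here is a minimal PyTorch sketch of the synthesis loop, assuming one energy network per grid. The function names (`langevin_sample`, `multigrid_synthesize`), the choice of Langevin dynamics for the finite-step MCMC, and the step sizes and grid sizes are illustrative assumptions, not the paper's exact configuration.

```python
import torch
import torch.nn.functional as F

def langevin_sample(energy_net, x, n_steps=20, step_size=0.01):
    """Finite-step Langevin MCMC under an energy-based model.

    energy_net maps images to scalar energies; lower energy = more likely.
    """
    x = x.clone().detach().requires_grad_(True)
    for _ in range(n_steps):
        energy = energy_net(x).sum()
        grad, = torch.autograd.grad(energy, x)
        noise = torch.randn_like(x)
        # Langevin update: gradient descent on the energy plus Gaussian noise.
        x = (x - 0.5 * step_size ** 2 * grad
             + step_size * noise).detach().requires_grad_(True)
    return x.detach()

def multigrid_synthesize(energy_nets, train_imgs, grid_sizes=(1, 4, 16, 64)):
    """Coarse-to-fine synthesis: start from the 1 x 1 version of each training
    image, then at each grid upsample the previous synthesis and refine it
    with finite-step MCMC under that grid's model."""
    x = F.adaptive_avg_pool2d(train_imgs, grid_sizes[0])  # minimal 1x1 start
    synthesized = []
    for net, size in zip(energy_nets, grid_sizes):
        x = F.interpolate(x, size=size, mode='bilinear', align_corners=False)
        x = langevin_sample(net, x)
        synthesized.append(x)
    return synthesized
```

    After synthesis, each grid's parameters would be updated from the difference between statistics of the observed and synthesized examples at that grid, as the abstract describes.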

    An Integrated Full-bridge Class-DE Ultrasound Transducer Driver for HIFU Applications

    This thesis presents a CMOS integrated transducer driver for high-intensity focused ultrasound (HIFU) applications. Because the driver will be used in a magnetic resonance imaging (MRI) environment, no magnetic components such as inductors or transformers are used in the design. The transducer is connected directly to the driver without a matching network. The output stage is a full-bridge Class-DE RF amplifier, which can deliver more power than the previous design based on a half-bridge Class-DE amplifier. The driver was also designed for use in a transducer array. A digital control unit is integrated with the power amplifier, allowing the driver's phase shift and duty ratio to be programmed. A strategy for driving an ultrasound transducer array with the designed driver is also presented. The design was implemented in the AMS H35B4 CMOS technology using the Cadence suite of design tools and occupies a die area of 2 mm by 1.5 mm with 20 input and output pads. Simulation and initial experimental results are presented. The proposed integrated CMOS driver achieves an efficiency of 89.4% at 3.60 W of output power; results differ slightly from one transducer to another.
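    The reported operating point lends itself to a quick power-budget check, and the programmable phase shift and duty ratio can be illustrated with simple square-wave control signals. The sketch below is a hypothetical Python illustration: `gate_signals` and its parameters are invented for clarity and do not reproduce the thesis's actual digital control unit.

```python
import numpy as np

# Reported operating point from the thesis.
P_OUT = 3.60        # W delivered to the transducer
EFFICIENCY = 0.894  # 89.4 %

# Efficiency = P_out / P_in, so input power and on-chip dissipation follow:
p_in = P_OUT / EFFICIENCY  # ~4.03 W drawn from the supply
p_diss = p_in - P_OUT      # ~0.43 W dissipated in the driver

def gate_signals(f0, duty, phase_deg, fs=200e6, n_cycles=4):
    """Square-wave drive signals for the two legs of a full-bridge stage,
    with programmable duty ratio and phase shift (hypothetical illustration
    of the quantities the digital control unit programs)."""
    t = np.arange(0, n_cycles / f0, 1 / fs)
    phase = (t * f0) % 1.0
    leg_a = (phase < duty).astype(float)
    leg_b = (((phase - phase_deg / 360.0) % 1.0) < duty).astype(float)
    return t, leg_a, leg_b

t, a, b = gate_signals(f0=1e6, duty=0.3, phase_deg=90)
print(f"P_in = {p_in:.2f} W, dissipated = {p_diss:.2f} W")
```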

    Deformable Generator Network: Unsupervised Disentanglement of Appearance and Geometry

    We present a deformable generator model that disentangles the appearance and geometric information of both image and video data in a purely unsupervised manner. The appearance generator network models appearance-related information, including color, illumination, identity, and category, while the geometric generator performs geometric warping, such as rotation and stretching, by generating a deformation field that warps the generated appearance into the final image or video sequence. The two generators take independent latent vectors as input, disentangling the appearance and geometric information of the image or video sequences. For video data, a nonlinear transition model is introduced into both the appearance and geometric generators to capture the dynamics over time. The proposed scheme is general and can easily be integrated into different generative models. An extensive set of qualitative and quantitative experiments shows that the appearance and geometric information are well disentangled, and that the learned geometric generator can be conveniently transferred to other image datasets to facilitate knowledge transfer tasks.
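    As a concrete illustration of the two-branch design, here is a minimal PyTorch sketch: independent latent vectors feed an appearance decoder and a geometric decoder, and the generated deformation field warps the appearance image via grid sampling. The toy fully connected decoders, layer sizes, and the 0.1 displacement scale are assumptions made for brevity, not the paper's architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DeformableGenerator(nn.Module):
    """Two generators on independent latent vectors: one emits an appearance
    image, the other a dense deformation field that displaces it."""

    def __init__(self, z_dim=64, img_size=64):
        super().__init__()
        self.img_size = img_size
        # Appearance branch: latent -> RGB image (toy fully connected decoder).
        self.appearance = nn.Sequential(
            nn.Linear(z_dim, 256), nn.ReLU(),
            nn.Linear(256, 3 * img_size * img_size), nn.Tanh())
        # Geometric branch: latent -> per-pixel (dx, dy) displacement field.
        self.geometry = nn.Sequential(
            nn.Linear(z_dim, 256), nn.ReLU(),
            nn.Linear(256, 2 * img_size * img_size), nn.Tanh())

    def forward(self, z_app, z_geo):
        n, s = z_app.size(0), self.img_size
        app = self.appearance(z_app).view(n, 3, s, s)
        # Small displacements added to the identity sampling grid.
        disp = 0.1 * self.geometry(z_geo).view(n, s, s, 2)
        ys, xs = torch.meshgrid(torch.linspace(-1, 1, s),
                                torch.linspace(-1, 1, s), indexing='ij')
        identity = torch.stack([xs, ys], dim=-1).expand(n, s, s, 2)
        # Warp the appearance image with the generated deformation field.
        return F.grid_sample(app, identity + disp, align_corners=False)

gen = DeformableGenerator()
img = gen(torch.randn(2, 64), torch.randn(2, 64))  # (2, 3, 64, 64)
```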

    Motion-Based Generator Model: Unsupervised Disentanglement of Appearance, Trackable and Intrackable Motions in Dynamic Patterns

    Dynamic patterns are characterized by complex spatial and motion structure. Understanding dynamic patterns requires a disentangled representational model that separates the factorial components. A commonly used model for dynamic patterns is the state-space model, in which the state evolves over time according to a transition model and generates the observed image frames according to an emission model. To model the motions explicitly, it is natural to base the model on the motions, or displacement fields, of the pixels. Thus, in the emission model, we let the hidden state generate a displacement field, which warps the trackable component of the previous image frame to produce the next frame, while a simultaneously emitted residual image accounts for the change that cannot be explained by the deformation. The warping of the previous image captures the trackable part of the frame-to-frame change, while the residual image captures the intrackable part. We use a maximum likelihood algorithm to learn the model, iterating between inferring the latent noise vectors that drive the transition model and updating the parameters given the inferred latent vectors. Meanwhile, a regularization term penalizes the norms of the residual images, encouraging the model to explain the change between frames by trackable motion. Unlike existing methods for dynamic patterns, we learn our model in an unsupervised setting, without ground-truth displacement fields. In addition, our model defines a notion of intrackability through the separation of the warped and residual components in each image frame. We show that our method can synthesize realistic dynamic patterns while disentangling appearance, trackable motion, and intrackable motion. The learned models are useful for motion transfer, and it is natural to use them to define and measure the intrackability of a dynamic pattern.
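    The emission step and the residual-norm regularizer can be sketched directly. Below is a minimal PyTorch illustration, assuming displacement fields expressed in normalized grid coordinates; `emit_frame`, `frame_loss`, and the weight `lam` are names chosen for this sketch rather than the paper's implementation.

```python
import torch
import torch.nn.functional as F

def emit_frame(prev_frame, displacement, residual):
    """Emission model: warp the previous frame by the generated displacement
    field (trackable motion) and add a residual image (intrackable part).

    prev_frame:   (N, C, H, W)
    displacement: (N, H, W, 2), offsets in normalized [-1, 1] coordinates
    residual:     (N, C, H, W)
    """
    n, _, h, w = prev_frame.shape
    ys, xs = torch.meshgrid(torch.linspace(-1, 1, h),
                            torch.linspace(-1, 1, w), indexing='ij')
    grid = torch.stack([xs, ys], dim=-1).expand(n, h, w, 2)
    warped = F.grid_sample(prev_frame, grid + displacement,
                           align_corners=False)
    return warped + residual

def frame_loss(pred, target, residual, lam=0.1):
    """Reconstruction error plus the regularizer that penalizes the residual
    norm, pushing the model to explain changes by trackable motion."""
    return F.mse_loss(pred, target) + lam * residual.pow(2).mean()
```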