
    Linear Object Classes and Image Synthesis from a Single Example Image

    The need to generate new views of a 3D object from a single real image arises in several fields, including graphics and object recognition. While the traditional approach relies on 3D models, we have recently introduced simpler techniques that are applicable under restricted conditions. The approach exploits image transformations that are specific to the relevant object class and learnable from example views of other "prototypical" objects of the same class. In this paper, we introduce such a new technique by extending the notion of linear class first proposed by Poggio and Vetter. For linear object classes, we show that linear transformations can be learned exactly from a basis set of 2D prototypical views. We demonstrate the approach on artificial objects and then show preliminary evidence that the technique can effectively "rotate" high-resolution face images from a single 2D view.
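
    The linear-class idea reduces to plain linear algebra: express the novel view as a linear combination of prototype views at one pose, then reuse the same coefficients on the prototypes rendered at the target pose. Below is a minimal sketch under that assumption; the function name and toy data are illustrative, not from the paper.

```python
# A minimal sketch of the linear-class construction (after Poggio & Vetter),
# assuming vectorized views: prototypes_a[i] is prototype i at the reference
# pose, prototypes_b[i] the same prototype at the target pose.
import numpy as np

def synthesize_new_view(prototypes_a, prototypes_b, novel_a):
    """Express a novel view as a linear combination of prototype views
    at pose A, then reuse the same coefficients at pose B."""
    # Solve novel_a ~= prototypes_a.T @ c in the least-squares sense.
    c, *_ = np.linalg.lstsq(prototypes_a.T, novel_a, rcond=None)
    # For a linear object class, the identical coefficients generate
    # the novel object's view at pose B.
    return prototypes_b.T @ c

# Toy check with random "views" of dimension 100 from a 5-prototype class.
rng = np.random.default_rng(0)
A = rng.normal(size=(5, 100))          # prototypes at pose A
R = rng.normal(size=(100, 100))        # a fixed linear "pose change"
B = A @ R.T                            # the same prototypes at pose B
coeffs = rng.normal(size=5)
novel_a = A.T @ coeffs                 # a novel object inside the linear class
novel_b = synthesize_new_view(A, B, novel_a)
assert np.allclose(novel_b, B.T @ coeffs)  # exact for a true linear class
```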

    Exemplar Learning for Medical Image Segmentation

    Medical image annotation typically requires expert knowledge, making data annotation time-consuming and expensive. To reduce this burden, we propose a novel learning scenario, Exemplar Learning (EL), to explore automated learning processes for medical image segmentation from a single annotated image example. This learning task is particularly suitable for medical image segmentation, where all categories of organs can be presented in one single image for annotation all at once. To address this challenging EL task, we propose an Exemplar Learning-based Synthesis Net (ELSNet) framework for medical image segmentation that enables exemplar-based data synthesis, pixel-prototype based contrastive embedding learning, and pseudo-label based exploitation of the unlabeled data. Specifically, ELSNet introduces two new modules for image segmentation: an exemplar-guided synthesis module, which enriches and diversifies the training set by synthesizing annotated samples from the given exemplar, and a pixel-prototype based contrastive embedding module, which enhances the discriminative capacity of the base segmentation model via contrastive self-supervised learning. Moreover, we deploy a two-stage process for segmentation model training, which exploits the unlabeled data with predicted pseudo segmentation labels. To evaluate this new learning framework, we conduct extensive experiments on several organ segmentation datasets and present an in-depth analysis. The empirical results show that the proposed exemplar learning framework produces effective segmentation results.
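
    As a rough sketch of the pixel-prototype contrastive module, the code below builds one prototype per class as the mean pixel embedding and scores every pixel against all prototypes with an InfoNCE-style loss. This is one plausible reading of the abstract, not the exact ELSNet formulation; all names and the temperature value are assumptions.

```python
# A minimal sketch of a pixel-prototype contrastive loss, assuming per-pixel
# embeddings and integer segmentation labels; loosely follows the idea above.
import torch
import torch.nn.functional as F

def pixel_prototype_contrastive_loss(embeddings, labels, temperature=0.1):
    """embeddings: (N, D) pixel features; labels: (N,) integer classes."""
    embeddings = F.normalize(embeddings, dim=1)
    classes = labels.unique()
    # Class prototypes: the mean embedding of each class's pixels.
    protos = torch.stack([embeddings[labels == c].mean(0) for c in classes])
    protos = F.normalize(protos, dim=1)
    logits = embeddings @ protos.T / temperature       # (N, C) similarities
    # Map each pixel's label to its row index in `protos` (classes are sorted).
    targets = torch.bucketize(labels, classes)
    return F.cross_entropy(logits, targets)

# Toy usage: 1000 "pixels" with 16-dim embeddings and 4 organ classes.
emb = torch.randn(1000, 16, requires_grad=True)
lab = torch.randint(0, 4, (1000,))
loss = pixel_prototype_contrastive_loss(emb, lab)
loss.backward()
```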

    Learning to restore multiple image degradations simultaneously

    Image corruptions are common in the real world; for example, images in the wild may come with unknown blur, bias field, noise, or other kinds of non-linear distributional shifts, hampering encoding methods and rendering downstream tasks unreliable. Image restoration requires a complicated balance between high-level contextualised information and spatially specific details. Existing approaches are designed to focus on a single corruption, which unavoidably results in poor performance when the acquisitions suffer from multiple degradations. In this study, we investigate the possibility of handling multiple degradations and enhancing the quality of images via deblurring, bias field correction, and denoising. To tackle the propagating errors caused by independent learning, we propose a unified and scalable framework, which consists of three special decoders. Two decoders learn artifact attention from the provided images, thereby generating realistic individual artifacts and multiple artifacts on a single image; the third decoder is trained to remove artifacts from the synthetic image with multiple corruptions, thereby generating a high-quality image. We additionally improve on previous image degradation synthesis approaches by modelling multiple image degradations directly from data observations. We first create a toy MNIST dataset and investigate the properties of the proposed algorithm. We then use brain MRI datasets to demonstrate our method's robustness on both simulated (where necessary) and real-world artifacts. In addition, our method can synthesize single or multiple degradations by applying the learned degradation operators to a new domain from a given dataset. The code will be released upon acceptance of the paper.
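
    The degradation-synthesis side of such a framework can be pictured as composable corruption operators. The sketch below hand-writes blur, bias field, and noise operators and chains them on one image; the paper learns these operators from data, so the functions here are illustrative stand-ins.

```python
# A minimal sketch of composing multiple degradations on one image.
import numpy as np
from scipy.ndimage import gaussian_filter

def add_blur(img, sigma=1.5):
    return gaussian_filter(img, sigma)

def add_bias_field(img, strength=0.4):
    # Smooth multiplicative intensity inhomogeneity, as in MRI bias fields.
    h, w = img.shape
    yy, xx = np.mgrid[0:h, 0:w]
    field = 1.0 + strength * np.sin(np.pi * xx / w) * np.sin(np.pi * yy / h)
    return img * field

def add_noise(img, std=0.05, rng=None):
    rng = rng or np.random.default_rng()
    return img + rng.normal(0.0, std, img.shape)

# Single or multiple degradations on a toy 64x64 image.
clean = np.zeros((64, 64)); clean[16:48, 16:48] = 1.0
corrupted = add_noise(add_bias_field(add_blur(clean)))
```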

    Cross domain Image Transformation and Generation by Deep Learning

    Compared with single-domain learning, cross-domain learning is more challenging due to the large domain variation. In addition, cross-domain image synthesis is more difficult than other cross-domain learning problems, including, for example, correlation analysis, indexing, and retrieval, because it needs to learn a complex function that captures image details for photo-realism. This work investigates cross-domain image synthesis in two common and challenging tasks, i.e., image-to-image and non-image-to-image transfer/synthesis. The image-to-image transfer is investigated in Chapter 2, where we develop a method for transformation between face images and sketch images while preserving the identity. Different from existing works that conduct domain transfer in a one-pass manner, we design a recurrent bidirectional transformation network (r-BTN), which allows bidirectional domain transfer in an integrated framework. More importantly, it can perceptually compose partial inputs from two domains to simultaneously synthesize face and sketch images with consistent identity. Most existing works can only synthesize well from patches that cover at least 70% of the original image. The proposed r-BTN yields appealing results from patches that cover less than 10% because it recursively estimates the missing region in an incremental manner. Extensive experiments demonstrate the superior performance of r-BTN compared to existing solutions. Chapter 3 targets image transformation/synthesis from non-image sources, i.e., generating a talking face from audio input. Existing works either do not consider temporal dependency, thus yielding abrupt facial/lip movement, or are limited to generation for a specific person, thus lacking generalization capacity. We propose a novel conditional recurrent generation network that incorporates image and audio features in the recurrent unit for temporal dependency, such that smooth transitions can be achieved for lip and facial movements. To achieve image- and video-realism, we adopt a pair of spatial-temporal discriminators. Because accurate lip synchronization is essential to the success of talking face video generation, we construct a lip-reading discriminator to boost the accuracy of lip synchronization. Extensive experiments demonstrate the superiority of our framework over the state of the art in terms of visual quality, lip sync accuracy, and smooth transition of lip and facial movement.
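
    The recursive, incremental estimation that lets r-BTN work from very small patches can be sketched as a loop that grows the trusted region one band at a time, each pass conditioning on the previous output. The sketch below uses a stand-in `transform_net` and a simple mask-dilation rule; both are assumptions for illustration, not the paper's architecture.

```python
# A minimal sketch of recursive, incremental completion from a partial input.
import numpy as np

def recursive_completion(partial, mask, transform_net, steps=8):
    """partial: image with unknown pixels zeroed; mask: 1 where known."""
    estimate, known = partial.copy(), mask.copy()
    for _ in range(steps):
        predicted = transform_net(estimate)
        # Grow the trusted region by dilating the known mask one band outward,
        # then accept the network's prediction only on the newly added band.
        grown = np.clip(known + np.roll(known, 1, 0) + np.roll(known, -1, 0)
                        + np.roll(known, 1, 1) + np.roll(known, -1, 1), 0, 1)
        band = grown - known
        estimate = np.where(band > 0, predicted, estimate)
        known = grown
    return estimate

# Toy usage with an identity "network" and a small central patch as input.
img = np.random.rand(32, 32)
mask = np.zeros_like(img); mask[14:18, 14:18] = 1
out = recursive_completion(img * mask, mask, transform_net=lambda x: x)
```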

    End-to-End Optimization of Scene Layout

    We propose an end-to-end variational generative model for scene layout synthesis conditioned on scene graphs. Unlike unconditional scene layout generation, we use scene graphs as an abstract but general representation to guide the synthesis of diverse scene layouts that satisfy the relationships included in the scene graph. This gives rise to more flexible control over the synthesis process, allowing various forms of inputs such as scene layouts extracted from sentences or inferred from a single color image. Using our conditional layout synthesizer, we can generate various layouts that share the same structure as the input example. In addition to this conditional generation design, we also integrate a differentiable rendering module that enables layout refinement using only 2D projections of the scene. Given a depth map and a semantics map, the differentiable rendering module enables optimizing the synthesized layout to fit the given input in an analysis-by-synthesis fashion. Experiments suggest that our model achieves higher accuracy and diversity in conditional scene synthesis and allows exemplar-based scene generation from various input forms. Comment: CVPR 2020 (Oral). Project page: http://3dsln.csail.mit.edu
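
    The analysis-by-synthesis refinement amounts to gradient descent on layout parameters through a differentiable renderer. The sketch below substitutes a toy renderer that draws a soft box from three layout parameters and fits it to an observed projection; everything here is illustrative, not the paper's rendering module.

```python
# A minimal sketch of analysis-by-synthesis layout refinement with a
# stand-in differentiable "renderer".
import torch

def render_depth(layout, size=32):
    """Toy renderer: a soft rectangle at (cx, cy) with half-extent s."""
    cx, cy, s = layout
    ys = torch.linspace(0, 1, size).unsqueeze(1)
    xs = torch.linspace(0, 1, size).unsqueeze(0)
    # Smooth indicator of the box; sigmoid keeps everything differentiable.
    inside = torch.sigmoid(50 * (s - (xs - cx).abs())) * \
             torch.sigmoid(50 * (s - (ys - cy).abs()))
    return inside  # interpreted as a (toy) depth/occupancy map

target = render_depth(torch.tensor([0.6, 0.4, 0.2]))         # observed projection
layout = torch.tensor([0.5, 0.5, 0.15], requires_grad=True)  # initial synthesis
opt = torch.optim.Adam([layout], lr=0.01)
for _ in range(200):
    opt.zero_grad()
    loss = ((render_depth(layout) - target) ** 2).mean()  # fit the projection
    loss.backward()
    opt.step()
```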

    Two-stage filtration algorithm with interframe causal processing for multichannel image with presence of uncorrelated noise

    Introduction. When solving a number of practical problems, the use of multichannel images is common practice. The multichannel nature of these data either increases the efficiency of solving the problem or yields useful information that in principle cannot be extracted from single-channel images. One of the main types of noise occurring in a multichannel image is uncorrelated noise. Optimal image filtering algorithms require enormous computational cost; therefore, the synthesis of multichannel image filtering algorithms that provide the required performance at moderate computational cost is of important practical value. Theoretical results. Using conditional independence properties, we obtain the expression for the a posteriori probability density of pixels under two-stage multichannel image filtration with causal frame processing in the presence of uncorrelated noise. For the case of Gaussian multichannel images, we derive expressions for its first and second moments with causal intra- and inter-frame processing, giving estimates of the image pixels and of the estimation error variance. Experimental results. For the considered example, the developed algorithm increases the filtration accuracy on a sequence of homogeneous Gaussian images by 20-45% compared to an inter-frame averaging algorithm; the analysis was carried out on a model example by statistical computer simulation. Conclusion. Optimal and quasi-optimal two-stage multichannel image filtration algorithms were synthesized. In these algorithms, the first stage is one-dimensional causal filtration along each of the coordinates, and the second is the fusion of the results. The algorithms reduce the computational cost in comparison with the optimal algorithm while ensuring acceptable accuracy.
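
    The two-stage structure named in the conclusion can be sketched directly: a one-dimensional causal recursive filter is run along each image coordinate, and the two estimates are then fused. The constant gain and the equal-weight fusion below stand in for the paper's Gaussian moment derivation; they are simplifying assumptions.

```python
# A minimal sketch of the two-stage scheme: stage one runs a 1-D causal
# filter independently along each image coordinate; stage two fuses them.
import numpy as np

def causal_filter_1d(x, gain=0.5):
    """Recursive causal estimate along the last axis (predict = previous estimate)."""
    est = np.empty_like(x)
    est[..., 0] = x[..., 0]
    for k in range(1, x.shape[-1]):
        est[..., k] = est[..., k - 1] + gain * (x[..., k] - est[..., k - 1])
    return est

def two_stage_filter(noisy, gain=0.5):
    along_rows = causal_filter_1d(noisy, gain)          # stage 1, coordinate x
    along_cols = causal_filter_1d(noisy.T, gain).T      # stage 1, coordinate y
    return 0.5 * (along_rows + along_cols)              # stage 2: fuse results

# Toy usage on one frame of a multichannel sequence.
rng = np.random.default_rng(1)
clean = np.outer(np.hanning(64), np.hanning(64))
denoised = two_stage_filter(clean + rng.normal(0, 0.1, clean.shape))
```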

    Turn Fake into Real: Adversarial Head Turn Attacks Against Deepfake Detection

    Malicious use of deepfakes leads to serious public concerns and reduces people's trust in digital media. Although effective deepfake detectors have been proposed, they are substantially vulnerable to adversarial attacks. To evaluate detector robustness, recent studies have explored various attacks. However, all existing attacks are limited to 2D image perturbations, which are hard to translate into real-world facial changes. In this paper, we propose adversarial head turn (AdvHeat), the first attempt at 3D adversarial face views against deepfake detectors, based on face view synthesis from a single-view fake image. Extensive experiments validate the vulnerability of various detectors to AdvHeat in realistic, black-box scenarios. For example, AdvHeat based on a simple random search yields a high attack success rate of 96.8% within 360 search steps. When additional query access is allowed, we can further reduce the step budget to 50. Additional analyses demonstrate that AdvHeat is better than conventional attacks in both cross-detector transferability and robustness to defenses. The adversarial images generated by AdvHeat are also shown to look natural. Our code, including that for generating a multi-view dataset consisting of 360 synthetic views for each of 1000 IDs from FaceForensics++, is available at https://github.com/twowwj/AdvHeaT
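
    The random-search variant of the attack is easy to picture: repeatedly sample a candidate head pose, synthesize that view, query the detector, and keep the view with the lowest fake score. The sketch below assumes stand-in `render_view` and `detector` callables and hypothetical angle ranges; none of this is the AdvHeat code.

```python
# A minimal sketch of a black-box random-search attack over head poses.
import numpy as np

def random_search_attack(image, detector, render_view, steps=360, rng=None):
    rng = rng or np.random.default_rng()
    best_view, best_score = None, 1.0
    for _ in range(steps):
        # Sample a candidate head pose (yaw, pitch in degrees; ranges assumed).
        angles = rng.uniform([-45.0, -15.0], [45.0, 15.0])
        view = render_view(image, angles)
        score = detector(view)                 # black-box query: P(fake)
        if score < best_score:
            best_view, best_score = view, score
        if best_score < 0.5:                   # detector now says "real"
            break
    return best_view, best_score

# Toy stubs: a "renderer" that perturbs pixels and a score derived from them.
img = np.zeros((8, 8))
view, score = random_search_attack(
    img, detector=lambda v: float(abs(v).mean()),
    render_view=lambda im, a: im + a.sum() / 1000.0)
```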

    Learning from one example in machine vision by sharing probability densities

    Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2002. Includes bibliographical references (p. 125-130). By Erik G. Miller.
    Human beings exhibit rapid learning when presented with a small number of images of a new object. A person can identify an object under a wide variety of visual conditions after having seen only a single example of that object. This ability can be partly explained by the application of previously learned statistical knowledge to a new setting. This thesis presents an approach to acquiring knowledge in one setting and using it in another. Specifically, we develop probability densities over common image changes. Given a single image of a new object and a model of change learned from a different object, we form a model of the new object that can be used for synthesis, classification, and other visual tasks. We start by modeling spatial changes. We develop a framework for learning statistical knowledge of spatial transformations in one task and using that knowledge in a new task. By sharing a probability density over spatial transformations learned from a sample of handwritten letters, we develop a handwritten digit classifier that achieves 88.6% accuracy using only a single hand-picked training example from each class. The classification scheme includes a new algorithm, congealing, for the joint alignment of a set of images using an entropy minimization criterion. We investigate properties of this algorithm and compare it to other methods of addressing spatial variability in images. We illustrate its application to binary images, gray-scale images, and a set of 3-D neonatal magnetic resonance brain volumes. Next, we extend the method of change modeling from spatial transformations to color transformations. By measuring statistically common joint color changes of a scene in an office environment, and then applying standard statistical techniques such as principal components analysis, we develop a probabilistic model of color change. We show that these color changes, which we call color flows, can be shared effectively between certain types of scenes. That is, a probability density over color change developed by observing one scene can provide useful information about the variability of another scene. We demonstrate a variety of applications including image synthesis, image matching, and shadow detection.
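
    Congealing itself is compact enough to sketch: jointly align a stack of images by greedily applying, per image, the transformation that most reduces the summed per-pixel entropy of the stack. The version below restricts the transformation family to unit shifts to stay short; real congealing uses a richer (e.g., affine) family.

```python
# A minimal sketch of congealing on binary images via entropy minimization.
import numpy as np

def stack_entropy(images):
    """Sum over pixels of the binary entropy of that pixel across the stack."""
    p = np.clip(images.mean(axis=0), 1e-6, 1 - 1e-6)
    return -(p * np.log(p) + (1 - p) * np.log(1 - p)).sum()

def congeal(images, iterations=10):
    images = images.copy()
    shifts = [(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)]
    for _ in range(iterations):
        for i in range(len(images)):
            # Try each small shift of image i; keep the entropy-minimizing one.
            candidates = [np.roll(images[i], s, axis=(0, 1)) for s in shifts]
            scores = []
            for cand in candidates:
                trial = images.copy(); trial[i] = cand
                scores.append(stack_entropy(trial))
            images[i] = candidates[int(np.argmin(scores))]
    return images

# Toy usage: jittered copies of one binary pattern.
rng = np.random.default_rng(2)
base = np.zeros((16, 16)); base[5:11, 5:11] = 1.0
stack = np.stack([np.roll(base, rng.integers(-2, 3, 2), axis=(0, 1))
                  for _ in range(20)])
aligned = congeal(stack)
```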

    Bayesian inference for radio observations

    New telescopes like the Square Kilometre Array (SKA) will push into a new sensitivity regime and expose systematics, such as direction-dependent effects, that could previously be ignored. Current methods for handling such systematics rely on alternating best estimates of instrumental calibration and models of the underlying sky, which can lead to inadequate uncertainty estimates and biased results because any correlations between parameters are ignored. These deconvolution algorithms produce a single image that is assumed to be a true representation of the sky, when in fact it is just one realization of an infinite ensemble of images compatible with the noise in the data. In contrast, here we report a Bayesian formalism that simultaneously infers both systematics and science. Our technique, Bayesian Inference for Radio Observations (BIRO), determines all parameters directly from the raw data, bypassing image-making entirely, by sampling from the joint posterior probability distribution. This enables it to derive both correlations and accurate uncertainties, making use of the flexible software meqtrees to model the sky and telescope simultaneously. We demonstrate BIRO with two simulated Westerbork Synthesis Radio Telescope data sets. In the first, we perform joint estimates of 103 scientific (flux densities of sources) and instrumental (pointing errors, beamwidth and noise) parameters. In the second example, we perform source separation with BIRO. Using the Bayesian evidence, we can accurately select between a single point source, two point sources and an extended Gaussian source, allowing for 'super-resolution' on scales much smaller than the synthesized beam.
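
    The core move, sampling science and systematics jointly from the raw data, can be sketched with a toy forward model and a Metropolis walker. Everything below (the one-parameter "beam", the noise level, the priors) is a hand-built stand-in, not the BIRO/meqtrees machinery.

```python
# A minimal sketch of joint inference of a science parameter (flux) and a
# systematic (pointing error) directly from raw data via Metropolis sampling.
import numpy as np

rng = np.random.default_rng(3)

def forward_model(flux, pointing, u):
    # Toy "telescope": a pointing error attenuates the source's response.
    beam = np.exp(-0.5 * (pointing / 0.1) ** 2)
    return flux * beam * np.cos(u)

u = np.linspace(0, 4 * np.pi, 200)                      # baseline coordinates
data = forward_model(2.0, 0.05, u) + rng.normal(0, 0.1, u.size)

def log_posterior(theta):
    flux, pointing = theta
    if flux <= 0:                                       # flat prior, flux > 0
        return -np.inf
    resid = data - forward_model(flux, pointing, u)
    return -0.5 * np.sum((resid / 0.1) ** 2)

# Metropolis sampling of the joint posterior: correlations between flux and
# the pointing systematic come out of the samples for free.
theta = np.array([1.0, 0.0])
samples = []
for _ in range(5000):
    prop = theta + rng.normal(0, [0.05, 0.01])
    if np.log(rng.uniform()) < log_posterior(prop) - log_posterior(theta):
        theta = prop
    samples.append(theta)
samples = np.array(samples)
```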