
    Weakly- and Self-Supervised Learning for Content-Aware Deep Image Retargeting

    This paper proposes a weakly- and self-supervised deep convolutional neural network (WSSDCNN) for content-aware image retargeting. Our network takes a source image and a target aspect ratio, and then directly outputs a retargeted image. Retargeting is performed through a shift map, which is a pixel-wise mapping from the source to the target grid. Our method implicitly learns an attention map, which leads to a content-aware shift map for image retargeting. As a result, discriminative parts in an image are preserved, while background regions are adjusted seamlessly. In the training phase, pairs of an image and its image-level annotation are used to compute content and structure losses. We demonstrate the effectiveness of our proposed method for a retargeting application with insightful analyses. Comment: 10 pages, 11 figures. To appear in ICCV 2017, Spotlight Presentation.
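    The shift-map idea in this abstract can be illustrated with a minimal sketch. It is not the paper's network: it assumes the shift map is already given (the paper learns it), that shifts are horizontal only, and that each target pixel is sampled from the source at its own position plus its shift, with linear interpolation. The names `warp_row` and `retarget` are hypothetical.

```python
def warp_row(source_row, shifts):
    """Resample one row: target[x] = source[x + shifts[x]], linearly interpolated."""
    last = len(source_row) - 1
    target = []
    for x, shift in enumerate(shifts):
        # Assumption: sampling positions outside the row are clamped to its ends.
        pos = min(max(x + shift, 0.0), float(last))
        left = int(pos)
        right = min(left + 1, last)
        frac = pos - left
        target.append((1.0 - frac) * source_row[left] + frac * source_row[right])
    return target


def retarget(image, shift_map):
    """Apply a per-pixel horizontal shift map; the output width is the shift map's width."""
    return [warp_row(row, shifts) for row, shifts in zip(image, shift_map)]


# A 1x6 source row squeezed to 4 pixels. Shifts grow toward the right,
# so the right part of the row is compressed more than the left.
source = [[0.0, 10.0, 20.0, 30.0, 40.0, 50.0]]
shift_map = [[0.0, 0.5, 1.0, 2.0]]
narrow = retarget(source, shift_map)
```

    In a content-aware shift map, the shifts would change slowly across important regions (keeping them intact) and quickly across background, which is where the compression is absorbed.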

    Structure-aware content creation : detection, retargeting and deformation

    Access to digital information has become ubiquitous, and three-dimensional visual representation is becoming indispensable to knowledge understanding and information retrieval. Three-dimensional digitization plays a natural role in bridging the real and virtual worlds, which prompts a huge demand for three-dimensional digital content. Reducing the effort required for three-dimensional modeling, however, remains a practical problem and a long-standing challenge in computer graphics and related fields. In this thesis, we propose several techniques for lightening the content creation process, which share the common theme of being structure-aware, i.e., maintaining global relations among the parts of a shape. We are especially interested in formulating our algorithms so that they make use of symmetry structures, because their concise yet highly abstract principles are universally applicable to most regular patterns. We present our work from three different aspects. First, we characterized spaces of symmetry-preserving deformations and developed a method to explore this space in real time, which significantly simplified the generation of symmetry-preserving shape variants. Second, we empirically studied three-dimensional offset statistics and developed a fully automatic retargeting application based on their verified sparsity. Finally, we made a step forward in solving the approximate three-dimensional partial symmetry detection problem, using a novel co-occurrence analysis method, which could serve as the foundation for high-level applications.

    Perceptually Guided Photo Retargeting

    We propose perceptually guided photo retargeting, which shrinks a photo by simulating a human's process of sequentially perceiving visually/semantically important regions in a photo. In particular, we first project the local features (graphlets in this paper) onto a semantic space, wherein visual cues such as global spatial layout and rough geometric context are exploited. Thereafter, a sparsity-constrained learning algorithm is derived to select semantically representative graphlets of a photo, and the selection process can be interpreted as a path that simulates how a human actively perceives semantics in a photo. Furthermore, we learn the prior distribution of such active graphlet paths (AGPs) from training photos that are marked as esthetically pleasing by multiple users. The learned priors constrain the AGP of a retargeted photo to be maximally similar to those from the training photos. On top of the retargeting model, we further design an online learning scheme to incrementally update the model with new photos that are esthetically pleasing. The online update module makes the algorithm less dependent on the number and contents of the initial training data. Experimental results show that: 1) the proposed AGP is over 90% consistent with the human gaze shifting path, as verified by eye-tracking data, and 2) the retargeting algorithm outperforms its competitors significantly, as the AGP is more indicative of photo esthetics than conventional saliency maps.

    Deformation analysis and its application in image editing.

    Jiang, Lei. Thesis (M.Phil.), Chinese University of Hong Kong, 2011. Includes bibliographical references (p. 68-75). Abstracts in English and Chinese.
    Chapter 1. Introduction
    Chapter 2. Background and Motivation
      2.1 Foreshortening
        2.1.1 Vanishing Point
        2.1.2 Metric Rectification
      2.2 Content Aware Image Resizing
      2.3 Texture Deformation
        2.3.1 Shape from texture
        2.3.2 Shape from lattice
    Chapter 3. Resizing on Facade
      3.1 Introduction
      3.2 Related Work
      3.3 Algorithm
        3.3.1 Facade Detection
        3.3.2 Facade Resizing
      3.4 Results
    Chapter 4. Cell Texture Editing
      4.1 Introduction
      4.2 Related Work
      4.3 Our Approach
        4.3.1 Cell Detection
        4.3.2 Local Affine Estimation
        4.3.3 Affine Transformation Field
      4.4 Photo Editing Applications
      4.5 Discussion
    Chapter 5. Conclusion
    Bibliography

    FUZZY KERNEL REGRESSION FOR REGISTRATION AND OTHER IMAGE WARPING APPLICATIONS

    In this dissertation, a new approach to non-rigid medical image registration is presented. It relies on a probabilistic framework based on the novel concept of Fuzzy Kernel Regression. After a formal introduction, the theoretical framework is applied to develop several complete registration systems: two are interactive and one is fully automatic. They all use the composition of local deformations to achieve the final alignment. The automatic one is based on the maximization of mutual information to produce local affine alignments, which are merged into the global transformation. The mutual information maximization procedure uses the gradient descent method. Due to the huge amount of data associated with medical images, a multi-resolution topology is adopted, reducing processing time. The distance-based interpolation scheme facilitates the optimization of the similarity measure by attenuating the presence of local maxima in the functional. System blocks are implemented on GPGPUs, allowing efficient parallel computation of large 3D datasets using SIMT execution. Due to the flexibility of mutual information, it can be applied to multi-modality image scans (MRI, CT, PET, etc.). Both quantitative and qualitative experiments show promising results and great potential for future extension. Finally, the flexibility of the framework is shown by its successful application to the image retargeting problem; methods and results are presented.
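    The similarity measure named in this abstract, mutual information, can be sketched for two already-quantized intensity sequences. This is a generic histogram-based estimate, not the dissertation's implementation: the function name `mutual_information` is hypothetical, the inputs are assumed to be equal-length lists of discrete intensity bins, and the optimization over affine parameters is left out.

```python
import math
from collections import Counter


def mutual_information(a, b):
    """Estimate I(A;B) in nats from paired samples of two quantized images."""
    n = len(a)
    joint = Counter(zip(a, b))  # joint histogram of intensity pairs
    hist_a = Counter(a)         # marginal histogram of the first image
    hist_b = Counter(b)         # marginal histogram of the second image
    mi = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        p_x = hist_a[x] / n
        p_y = hist_b[y] / n
        mi += p_xy * math.log(p_xy / (p_x * p_y))
    return mi


fixed = [0, 0, 1, 1]
aligned = [5, 5, 9, 9]      # different intensities, same structure
unrelated = [0, 1, 0, 1]    # statistically independent of `fixed`
mi_aligned = mutual_information(fixed, aligned)
mi_unrelated = mutual_information(fixed, unrelated)
```

    The `aligned` case scores high even though its intensity values differ from `fixed`, which is why the measure suits multi-modality scans: it rewards a consistent statistical relationship between the two images, not equal values.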

    Adaptation of Images and Videos for Different Screen Sizes

    With the increasing popularity of smartphones and similar mobile devices, the demand for media to consume on the go is rising. As most images and videos today are captured with HD or even higher resolutions, there is a need to adapt them in a content-aware fashion before they can be watched comfortably on screens with small sizes and varying aspect ratios. This process is called retargeting. Most distortions during this process are caused by a change of the aspect ratio. Thus, retargeting mainly focuses on adapting the aspect ratio of a video while the rest can be scaled uniformly. The main objective of this dissertation is to contribute to modern image and video retargeting, especially regarding the potential of the seam carving operator. There are still unsolved problems in this research field that should be addressed in order to improve the quality of the results or speed up the retargeting process. This dissertation presents novel algorithms that are able to retarget images, videos and stereoscopic videos while dealing with problems like the preservation of straight lines or the reduction of the required memory space and computation time. Additionally, a GPU implementation is used to achieve the retargeting of videos in real time. Furthermore, an enhancement of face detection is presented which is able to distinguish between faces that are important for the retargeting and faces that are not. Results show that the developed techniques are suitable for the desired scenarios.
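    The seam carving operator mentioned in this abstract can be sketched in its basic textbook form: dynamic programming finds the connected top-to-bottom path of lowest total energy, and removing it narrows the image by one column. This is not the dissertation's algorithm. The energy map is assumed to be given (it is commonly a gradient magnitude), and the names `find_vertical_seam` and `remove_seam` are hypothetical.

```python
def find_vertical_seam(energy):
    """Return one column index per row for the minimum-energy 8-connected seam."""
    rows, cols = len(energy), len(energy[0])
    # cost[r][c] = cheapest total energy of any seam ending at pixel (r, c)
    cost = [list(energy[0])]
    for r in range(1, rows):
        prev = cost[-1]
        cost.append([
            energy[r][c] + min(prev[max(c - 1, 0):min(c + 2, cols)])
            for c in range(cols)
        ])
    # Backtrack from the cheapest bottom pixel to the top row.
    seam = [min(range(cols), key=cost[-1].__getitem__)]
    for r in range(rows - 1, 0, -1):
        c = seam[-1]
        lo, hi = max(c - 1, 0), min(c + 2, cols)
        seam.append(min(range(lo, hi), key=cost[r - 1].__getitem__))
    seam.reverse()
    return seam


def remove_seam(image, seam):
    """Delete the seam pixel from every row, reducing the width by one."""
    return [row[:c] + row[c + 1:] for row, c in zip(image, seam)]


energy = [[5, 1, 5],
          [5, 5, 1],
          [5, 1, 5]]
image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
seam = find_vertical_seam(energy)
carved = remove_seam(image, seam)
```

    Repeating the two steps, with the energy recomputed after each removal, shrinks the width further. Straight lines that a seam crosses at an angle get broken by this basic operator, which is one of the problems the abstract says the dissertation addresses.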

    Transformation-aware Perceptual Image Metric

    Predicting human visual perception has several applications such as compression, rendering, editing, and retargeting. Current approaches, however, ignore the fact that the human visual system compensates for geometric transformations, e.g., we see that an image and a rotated copy are identical. Instead, they will report a large, false-positive difference. At the same time, if the transformations become too strong or too spatially incoherent, comparing two images gets increasingly difficult. Between these two extrema, we propose a system to quantify the effect of transformations, not only on the perception of image differences but also on saliency and motion parallax. To this end, we first fit local homographies to a given optical flow field, and then convert this field into a field of elementary transformations, such as translation, rotation, scaling, and perspective. We conduct a perceptual experiment quantifying the increase of difficulty when compensating for elementary transformations. Transformation entropy is proposed as a measure of complexity in a flow field. This representation is then used for applications, such as comparison of nonaligned images, where transformations cause threshold elevation, detection of salient transformations, and a model of perceived motion parallax. Applications of our approach are a perceptual level-of-detail for real-time rendering and viewpoint selection based on perceived motion parallax