12,120 research outputs found

    Aesthetic-Driven Image Enhancement by Adversarial Learning

    We introduce EnhanceGAN, an adversarial learning based model that performs automatic image enhancement. Traditional image enhancement frameworks typically train models in a fully-supervised manner, which requires expensive annotations in the form of aligned image pairs. In contrast to these approaches, our proposed EnhanceGAN only requires weak supervision (binary labels on image aesthetic quality) and is able to learn enhancement operators for the task of aesthetic-based image enhancement. In particular, we show the effectiveness of a piecewise color enhancement module trained with weak supervision, and extend the proposed EnhanceGAN framework to learning a deep filtering-based aesthetic enhancer. The full differentiability of our image enhancement operators enables the training of EnhanceGAN in an end-to-end manner. We further demonstrate the capability of EnhanceGAN in learning aesthetic-based image cropping without any ground-truth cropping pairs. Our weakly-supervised EnhanceGAN reports competitive quantitative results on aesthetic-based color enhancement as well as automatic image cropping, and a user study confirms that our image enhancement results are on par with or even preferred over professional enhancement.

    6 Seconds of Sound and Vision: Creativity in Micro-Videos

    The notion of creativity, as opposed to related concepts such as beauty or interestingness, has not been studied from the perspective of automatic analysis of multimedia content. Meanwhile, short online videos shared on social media platforms, or micro-videos, have arisen as a new medium for creative expression. In this paper we study creative micro-videos in an effort to understand the features that make a video creative, and to address the problem of automatic detection of creative content. Defining creative videos as those that are novel and have aesthetic value, we conduct a crowdsourcing experiment to create a dataset of over 3,800 micro-videos labelled as creative and non-creative. We propose a set of computational features that we map to the components of our definition of creativity, and conduct an analysis to determine which of these features correlate most with creative video. Finally, we evaluate a supervised approach to automatically detect creative video, with promising results, showing that it is necessary to model both aesthetic value and novelty to achieve optimal classification accuracy.
    Comment: 8 pages, 1 figure, conference IEEE CVPR 201

    Landscape metrics and cultural ecosystem services: a resource-based integrative approach to mapping landscape harmony

    A Thesis for applying for the degree of Doctor of Philosophy in Environmental Protection. The overall idea of this PhD thesis was to explain, with objective evidence and using mapping techniques, why and how people value particular visual landscapes. Mainstream mapping research usually refers to the uniqueness, diversity and naturalness of landscapes as the main factors for landscape values and preferences. These variables can be easily measured using satellite imagery and cartographic materials: for example, the diversity of landscape elements can be assessed with a function of Shannon information entropy, and naturalness as the share of relatively natural land cover within the region of interest. However, the psychological literature suggests other important attributes of landscape experience: the harmony, unity or coherence of the scene. These aspects are usually measured subjectively with questionnaires and surveys. Measuring landscape preferences is also quite a challenging task, requiring many people to assess photographs or even to take a nature trip (with obvious drawbacks in spatial coverage and in replicability with other evaluators). Therefore, the PhD research was designed to make all assessments as objective as possible. Overall landscape coherence, for the first time, was measured as the extent to which the total diversity of the digital landscape model (composed of landforms and land cover) exceeds the added diversity of landforms and land cover alone. In this way, coherence was directly related to the system properties of the landscape, making it legible and understandable. Also, for the first time, the colour harmony of land cover was evaluated with remotely sensed data (satellite imagery). The retrieved map-based indices were examined against geo-located photographs of landscapes and outdoor recreation uploaded to social media, such as Flickr, VK.com and the former Panoramio.
The study contributes to the operationalisation of landscape beauty and, therefore, to more advanced landscape management, nature protection and sustainability of land use practices.
The aim of the doctoral thesis is to explain, using mapping technologies and on the basis of evidence, why and how people value particular landscapes from a visual standpoint. Mainstream mapping studies usually focus on uniqueness, diversity and naturalness when assessing landscape values and preferences. These variables are easy to measure from satellite images and cartographic material: for example, the diversity of landscape elements can be assessed with the Shannon entropy formula, and naturalness by the share of land cover of a correspondingly natural character in the study area. From a psychological point of view, landscape experience has other important properties, such as the harmony, unity or coherence of the view. In studies these variables are usually measured subjectively. The scientific assessment of landscape preferences is a serious methodological challenge that requires many evaluators, for example to rate landscape photographs or to assess landscapes directly in nature, where limits on spatial coverage and on the replicability of the ratings must be taken into account. Given these circumstances, the dissertation set out to find ways, as objective as possible, of assessing landscape variables that are usually treated as subjective. A novelty is the measurement of overall landscape coherence with a digital landscape model comprising landforms and land cover, compared with measuring these components separately. Handled in this way, coherence can be linked directly to the structural parameters of the landscape, which makes the assessments more legible and understandable. For the first time, the colour harmony of land cover has also been assessed from remote sensing data (satellite images). The resulting map-based indices were checked against geo-located photographs of landscape views and outdoor recreation activities on social media (e.g. Flickr, VK.com and the former Panoramio).
The study helps to better understand and apply the process of assessing landscape beauty and thereby to use aesthetic quality in landscape planning and management, nature conservation and other practical fields of sustainable land use.
Publication of this dissertation has been supported by the Estonian University of Life Science
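The two map-based variables named in the thesis abstract above (diversity as Shannon entropy, naturalness as a share of land cover) can be illustrated with a minimal sketch. The land-cover labels, the `NATURAL` class set and both function names are hypothetical, and the natural logarithm is an assumed choice of base; the thesis's own coherence and colour-harmony indices are not reproduced here.

```python
import math
from collections import Counter


def shannon_diversity(cells):
    """Shannon entropy H = -sum(p_i * ln(p_i)) over land-cover class shares.

    `cells` is a flat sequence of class labels, one per map cell
    (a hypothetical input format chosen for this sketch).
    """
    if not cells:
        raise ValueError("cells must not be empty")
    counts = Counter(cells)
    total = sum(counts.values())
    return -sum((n / total) * math.log(n / total) for n in counts.values())


def naturalness_share(cells, natural_classes):
    """Fraction of cells whose class is considered relatively natural."""
    if not cells:
        raise ValueError("cells must not be empty")
    return sum(1 for c in cells if c in natural_classes) / len(cells)


# Hypothetical region of interest: eight cells with four land-cover classes.
cells = ["forest", "forest", "field", "water",
         "urban", "forest", "field", "water"]
NATURAL = {"forest", "water"}  # assumed set of "relatively natural" classes

diversity = shannon_diversity(cells)
naturalness = naturalness_share(cells, NATURAL)
print(f"diversity={diversity:.3f}, naturalness={naturalness:.3f}")
```

Entropy is zero when a region holds a single class and reaches ln(k) when k classes are equally common, so the value is only comparable between regions classified with the same legend.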

    Media aesthetics based multimedia storytelling.

    Since the earliest of times, humans have been interested in recording their life experiences, for future reference and for storytelling purposes. This task of recording experiences (i.e., both image and video capture) has never before in history been as easy as it is today. This is creating a digital information overload that is becoming a great concern for people who are trying to preserve their life experiences. As high-resolution digital still and video cameras become increasingly pervasive, unprecedented amounts of multimedia are being downloaded to personal hard drives, and also uploaded to online social networks, on a daily basis. The work presented in this dissertation is a contribution in the area of multimedia organization, as well as automatic selection of media for storytelling purposes, which eases the human task of summarizing a collection of images or videos in order to share it with other people. As opposed to some prior art in this area, we have taken an approach in which neither user-generated tags nor comments (that describe the photographs, either in their local or on-line repositories) are taken into account, and no user interaction with the algorithms is expected. We take an image analysis approach where both the context images (e.g., images from the online social networks to which the image stories are going to be uploaded) and the collection images (i.e., the collection of images or videos that needs to be summarized into a story) are analyzed using image processing algorithms. This allows us to extract relevant metadata that can be used in the summarization process. Multimedia storytellers usually follow three main steps when preparing their stories: first they choose the main story characters, then the main events to describe, and finally, from these media sub-groups, they choose the media based on their relevance to the story as well as on their aesthetic value.
Therefore, one of the main contributions of our work has been the design of computational models (both regression based and classification based) that correlate well with human perception of the aesthetic value of images and videos. These computational aesthetics models have been integrated into automatic selection algorithms for multimedia storytelling, which are another important contribution of our work. A human-centric approach has been used in all experiments where it was feasible, and also to assess the final summarization results, i.e., humans are always the final judges of our algorithms, either by inspecting the aesthetic quality of the media or by inspecting the final story generated by our algorithms. We are aware that a perfect automatically generated story summary is very hard to obtain, given the many subjective factors that play a role in such a creative process; rather, the presented approach should be seen as a first step in the creative storytelling process, one that removes some of the groundwork that would be tedious and time-consuming for the user. Overall, the main contributions of this work can be summarized in three points: (1) new media aesthetics models for both images and videos that correlate with human perception, (2) new scalable multimedia collection structures that ease the process of media summarization, and finally, (3) new media selection algorithms that are optimized for multimedia storytelling purposes.
Postprint (published version

    Representations and representation learning for image aesthetics prediction and image enhancement

    With the continual improvement in cell phone cameras and in the connectivity of mobile devices, we have seen an exponential increase in the images that are captured, stored and shared on social media. For example, as of July 1st 2017 Instagram had over 715 million registered users who had posted just shy of 35 billion images. This represented approximately seven-fold and nine-fold increases in the number of users and photos, respectively, present on Instagram since 2012. Whether the images are stored on personal computers or reside on social networks (e.g. Instagram, Flickr), the sheer number of images calls for methods to determine various image properties, such as object presence or appeal, for the purpose of automatic image management and curation. One of the central problems in consumer photography is determining the aesthetic appeal of an image, which motivates us to explore questions related to understanding aesthetic preferences, image enhancement and the possibility of using such models on devices with constrained resources. In this dissertation, we present our work on exploring representations and representation learning approaches for aesthetic inference, composition ranking and its application to image enhancement. Firstly, we discuss early representations that mainly consisted of expert features, and their potential to enhance Convolutional Neural Networks (CNN). Secondly, we discuss the ability of resource-constrained CNNs, and the different architecture choices (input size and layer depth), in solving various aesthetic inference tasks: binary classification, regression, and image cropping. We show that, if trained for solving fine-grained aesthetics inference, such models can rival the cropping performance of other aesthetics-based croppers; however, they fall short in comparison to models trained for composition ranking.
Lastly, we discuss our work on exploring and identifying the design choices in training composition ranking functions, with the goal of using them for image composition enhancement.