Aesthetic-Driven Image Enhancement by Adversarial Learning
We introduce EnhanceGAN, an adversarial learning based model that performs
automatic image enhancement. Traditional image enhancement frameworks typically
involve training models in a fully-supervised manner, which requires expensive
annotations in the form of aligned image pairs. In contrast to these
approaches, our proposed EnhanceGAN only requires weak supervision (binary
labels on image aesthetic quality) and is able to learn enhancement operators
for the task of aesthetic-based image enhancement. In particular, we show the
effectiveness of a piecewise color enhancement module trained with weak
supervision, and extend the proposed EnhanceGAN framework to learning a deep
filtering-based aesthetic enhancer. The full differentiability of our image
enhancement operators enables the training of EnhanceGAN in an end-to-end
manner. We further demonstrate the capability of EnhanceGAN in learning
aesthetic-based image cropping without any groundtruth cropping pairs. Our
weakly-supervised EnhanceGAN reports competitive quantitative results on
aesthetic-based color enhancement as well as automatic image cropping, and a
user study confirms that our image enhancement results are on par with or even
preferred over professional enhancement
6 Seconds of Sound and Vision: Creativity in Micro-Videos
The notion of creativity, as opposed to related concepts such as beauty or
interestingness, has not been studied from the perspective of automatic
analysis of multimedia content. Meanwhile, short online videos shared on social
media platforms, or micro-videos, have arisen as a new medium for creative
expression. In this paper we study creative micro-videos in an effort to
understand the features that make a video creative, and to address the problem
of automatic detection of creative content. Defining creative videos as those
that are novel and have aesthetic value, we conduct a crowdsourcing experiment
to create a dataset of over 3,800 micro-videos labelled as creative and
non-creative. We propose a set of computational features that we map to the
components of our definition of creativity, and conduct an analysis to
determine which of these features correlate most with creative video. Finally,
we evaluate a supervised approach to automatically detect creative video, with
promising results, showing that it is necessary to model both aesthetic value
and novelty to achieve optimal classification accuracy.
Comment: 8 pages, 1 figure, conference IEEE CVPR 201
Landscape metrics and cultural ecosystem services: a resource-based integrative approach to mapping landscape harmony
A thesis submitted for the degree of Doctor of Philosophy in Environmental Protection.
The overall aim of this PhD thesis was to explain, with objective evidence and mapping techniques, why and how people value particular visual landscapes. Mainstream mapping research usually treats the uniqueness, diversity and naturalness of landscapes as the main factors behind landscape values and preferences. These variables can be measured easily from satellite imagery and cartographic materials: for example, the diversity of landscape elements can be assessed with Shannon information entropy, and naturalness as the share of relatively natural land cover within the region of interest. However, the psychological literature points to other important attributes of landscape experience: the harmony, unity or coherence of the scene. These aspects are usually measured subjectively with questionnaires and surveys. Measuring landscape preferences is also a challenging task, because it requires many people to assess photographs or even to take a nature trip (with obvious drawbacks in spatial coverage and in replicability with other evaluators).
Therefore, the PhD research was designed to make all assessments as objective as possible. Overall landscape coherence was measured, for the first time, as the extent to which the total diversity of a digital landscape model (composed of landforms and land cover) exceeds the added diversity of landforms and land cover alone. In this way, coherence was related directly to the system properties of the landscape, making it legible and understandable. Also for the first time, the colour harmony of land cover was evaluated with remotely sensed data (satellite imagery). The retrieved map-based indices were examined against geo-located photographs of landscapes and outdoor recreation uploaded to social media such as Flickr, VK.com and the former Panoramio. The study contributes to the operationalisation of landscape beauty and, therefore, to more advanced landscape management, nature protection and sustainable land use practices.
The aim of the doctoral thesis is to explain, on the basis of evidence and using mapping technologies, why and how people value certain landscapes from a visual standpoint. Mainstream mapping studies usually focus on uniqueness, diversity and naturalness when assessing landscape values and preferences. These variables are easy to measure from satellite images and cartographic material: for example, the diversity of landscape elements can be assessed with the Shannon entropy formula, and naturalness by the share of land cover of the corresponding character in the study area.
From a psychological point of view, landscape experience has further important attributes, such as the harmony, unity or coherence of the view. Studies usually measure these variables subjectively. The scientific assessment of landscape preferences is a serious methodological challenge that requires many evaluators, for example to rate landscape photographs or to make assessments directly in nature, where limits on spatial coverage and on the replicability of assessments must be taken into account.
Given these circumstances, the dissertation set out to find ways, as objective as possible, of assessing landscape variables that are usually treated as subjective. A novel element is the measurement of overall landscape coherence with a digital landscape model comprising landforms and land cover, compared with measuring these components separately. With this procedure, coherence can be linked directly to the structural parameters of the landscape, which makes the assessments more legible and understandable.
For the first time, the colour harmony of land cover has also been assessed from remote sensing data (satellite images). The resulting map-based indices were checked against geo-located photographs of landscape views and outdoor recreation activities on social media (e.g. Flickr, VK.com and the former Panoramio). The study helps to better understand and apply the process of assessing landscape beauty, and thereby to use aesthetic quality in landscape planning and management, nature conservation and other practical fields of sustainable land use.
Publication of this dissertation has been supported by the Estonian University of Life Science
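The abstract above names two map-based measures: Shannon information entropy for the diversity of landscape elements, and naturalness as the share of relatively natural land cover. A minimal sketch of both follows. It assumes the region of interest is given as a flat list of land-cover class labels, one per raster cell. The function names, the class labels, the set of "natural" classes and the base-2 logarithm are all illustrative assumptions, not taken from the thesis.

```python
import math
from collections import Counter


def shannon_diversity(cells):
    """Shannon entropy (in bits) of the class shares in a region.

    `cells` is a list of land-cover class labels, one per raster cell
    (an assumed input format, not the thesis's data model).
    """
    if not cells:
        raise ValueError("region must contain at least one cell")
    counts = Counter(cells)
    total = len(cells)
    # H = -sum(p_i * log2(p_i)) over the classes present in the region
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def naturalness(cells, natural_classes):
    """Share of cells whose class counts as relatively natural."""
    if not cells:
        raise ValueError("region must contain at least one cell")
    return sum(1 for c in cells if c in natural_classes) / len(cells)


# Hypothetical region of eight raster cells.
region = ["forest", "forest", "forest", "forest",
          "field", "field", "water", "urban"]
NATURAL = {"forest", "water"}  # assumed definition of "natural" cover

print(shannon_diversity(region))     # 1.75 bits
print(naturalness(region, NATURAL))  # 0.625
```

A region covered by a single class has zero diversity under this measure, while more evenly mixed classes give higher values.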
Media aesthetics based multimedia storytelling.
Since the earliest times, humans have been interested in recording their life experiences, for future reference and for storytelling purposes. This task of recording experiences --i.e., both image and video capture-- has never before in history been as easy as it is today. The result is a digital information overload that is becoming a great concern for people who are trying to preserve their life experiences.
As high-resolution digital still and video cameras become increasingly pervasive, unprecedented amounts of multimedia are being downloaded to personal hard drives and uploaded to online social networks on a daily basis. The work presented in this dissertation is a contribution to the area of multimedia organization and to the automatic selection of media for storytelling purposes, which eases the human task of summarizing a collection of images or videos so that it can be shared with other people.
As opposed to some prior art in this area, we have taken an approach in which neither user generated tags nor comments --that describe the photographs, either in their local or on-line repositories-- are taken into account, and also no user interaction with the algorithms is expected. We take an image analysis approach where both the context images --e.g. images from online social networks to which the image stories are going to be uploaded--, and the collection images --i.e., the collection of images or videos that needs to be summarized into a story--, are analyzed using image processing algorithms. This allows us to extract relevant metadata that can be used in the summarization process.
Multimedia storytellers usually follow three main steps when preparing their stories: first they choose the main story characters, then the main events to describe, and finally, from these media sub-groups, they choose the media based on their relevance to the story and on their aesthetic value. Therefore, one of the main contributions of our work has been the design of computational models --both regression based and classification based-- that correlate well with human perception of the aesthetic value of images and videos. These computational aesthetics models have been integrated into automatic selection algorithms for multimedia storytelling, which are another important contribution of our work. A human-centric approach has been used in all experiments where it was feasible, and also to assess the final summarization results, i.e., humans are always the final judges of our algorithms, either by inspecting the aesthetic quality of the media or by inspecting the final story generated by our algorithms.
We are aware that a perfect automatically generated story summary is very hard to obtain, given the many subjective factors that play a role in such a creative process; rather, the presented approach should be seen as a first step in the creative storytelling process, one that removes some of the groundwork that would be tedious and time-consuming for the user. Overall, the main contributions of this work can be summarized in three points:
(1) new media aesthetics models for both images and videos that correlate with human perception,
(2) new scalable multimedia collection structures that ease the process of media summarization, and finally,
(3) new media selection algorithms that are optimized for multimedia storytelling purposes.
Postprint (published version
Representations and representation learning for image aesthetics prediction and image enhancement
With the continual improvement of cell phone cameras and of the connectivity of mobile devices, we have seen an exponential increase in the number of images that are captured, stored and shared on social media. For example, as of July 1st 2017 Instagram had over 715 million registered users, who had posted just shy of 35 billion images. This represented approximately seven-fold and nine-fold increases in the number of users and photos, respectively, on Instagram since 2012. Whether the images are stored on personal computers or reside on social networks (e.g. Instagram, Flickr), the sheer number of images calls for methods to determine various image properties, such as object presence or appeal, for the purpose of automatic image management and curation. One of the central problems in consumer photography is determining the aesthetic appeal of an image, which motivates us to explore questions related to understanding aesthetic preferences, image enhancement and the possibility of using such models on devices with constrained resources.
In this dissertation, we present our work on exploring representations and representation learning approaches for aesthetic inference, composition ranking and its application to image enhancement. Firstly, we discuss early representations that mainly consisted of expert features, and their potential to enhance Convolutional Neural Networks (CNNs). Secondly, we discuss the ability of resource-constrained CNNs, and of different architecture choices (input size and layer depth), to solve various aesthetic inference tasks: binary classification, regression, and image cropping. We show that, if trained to solve fine-grained aesthetics inference, such models can rival the cropping performance of other aesthetics-based croppers; however, they fall short in comparison to models trained for composition ranking. Lastly, we discuss our work on exploring and identifying the design choices in training composition ranking functions, with the goal of using them for image composition enhancement