743 research outputs found

    A digitális fényképezés társadalmi gyakorlata Magyarországon [The social practice of digital photography in Hungary]

    Media aesthetics based multimedia storytelling.

    Since the earliest times, humans have been interested in recording their life experiences, for future reference and for storytelling purposes. This task of recording experiences --i.e., both image and video capture-- has never before in history been as easy as it is today. This is creating a digital information overload that is becoming a great concern for people who are trying to preserve their life experiences. As high-resolution digital still and video cameras become increasingly pervasive, unprecedented amounts of multimedia are being downloaded to personal hard drives, and also uploaded to online social networks on a daily basis. The work presented in this dissertation is a contribution in the area of multimedia organization, as well as automatic selection of media for storytelling purposes, which eases the human task of summarizing a collection of images or videos in order to be shared with other people. As opposed to some prior art in this area, we have taken an approach in which neither user-generated tags nor comments --that describe the photographs, either in their local or on-line repositories-- are taken into account, and also no user interaction with the algorithms is expected. We take an image analysis approach where both the context images --e.g., images from online social networks to which the image stories are going to be uploaded-- and the collection images --i.e., the collection of images or videos that needs to be summarized into a story-- are analyzed using image processing algorithms. This allows us to extract relevant metadata that can be used in the summarization process. Multimedia storytellers usually follow three main steps when preparing their stories: first they choose the main story characters, then the main events to describe, and finally, from these media sub-groups, they choose the media based on their relevance to the story as well as their aesthetic value.
Therefore, one of the main contributions of our work has been the design of computational models --both regression-based and classification-based-- that correlate well with human perception of the aesthetic value of images and videos. These computational aesthetics models have been integrated into automatic selection algorithms for multimedia storytelling, which are another important contribution of our work. A human-centric approach has been used in all experiments where it was feasible, and also in order to assess the final summarization results, i.e., humans are always the final judges of our algorithms, either by inspecting the aesthetic quality of the media, or by inspecting the final story generated by our algorithms. We are aware that a perfect automatically generated story summary is very hard to obtain, given the many subjective factors that play a role in such a creative process; rather, the presented approach should be seen as a first step in the storytelling creative process which removes some of the groundwork that would be tedious and time-consuming for the user. Overall, the main contributions of this work can be summarized in three points: (1) new media aesthetics models for both images and videos that correlate with human perception, (2) new scalable multimedia collection structures that ease the process of media summarization, and finally, (3) new media selection algorithms that are optimized for multimedia storytelling purposes. Postprint (published version)

    Fast and Accurate Home Photo Categorization for Handheld Devices using MPEG-7 Descriptors

    Home photo categorization has become an issue for practical use of photos taken with various devices. However, it is a difficult task because of the semantic gap between physical images and human perception. Moreover, the object-based learning for overcoming this gap is hard to apply to handheld devices due to its computational overhead. We present an efficient image feature extraction method based on MPEG-7 descriptors and a learning structure constructed with multiple layers of Support Vector Machines for fast and accurate categorization of home photos. Experiments on diverse home photos demonstrate outstanding performance of our approach in terms of the categorization accuracy and the computational overhead.

    How to predict the global instantaneous feeling induced by a facial picture?

    Picture selection is a time-consuming task for humans and a real challenge for machines, which have to retrieve complex and subjective information from image pixels. An automated system that infers human feelings from digital portraits would be of great help for profile picture selection, photo album creation or photo editing. In this work, two models of facial picture evaluation are defined. The first one predicts the overall aesthetic quality of a facial image, and the second one answers the question "Among a set of facial pictures of a given person, on which picture does the person look the most friendly?". Aesthetic quality is evaluated by the computation of 15 features that encode low-level statistics in different image regions (face, eyes, mouth). Relevant features are automatically selected by a feature ranking technique, and the outputs of 4 learning algorithms are fused in order to make a robust and accurate prediction of the image quality. Results are compared with recent works and the proposed algorithm obtains the best performance. The same pipeline is considered to evaluate the likability of a facial picture, with the difference that the estimation is based on high-level attributes such as gender, age, and smile. Performance of these attributes is compared with previous techniques that mostly rely on facial keypoint positions, and it is shown that it is possible to obtain likability predictions that are close to human perception. Finally, a combination of both models that selects a likable facial image of good quality for a given person is described.

    Multimedia Annotation Interoperability Framework

    Multimedia systems typically contain digital documents of mixed media types, which are indexed on the basis of strongly divergent metadata standards. This severely hampers the inter-operation of such systems. Therefore, machine understanding of metadata coming from different applications is a basic requirement for the inter-operation of distributed multimedia systems. In this document, we present how interoperability among metadata, vocabularies/ontologies and services is enhanced using Semantic Web technologies. In addition, it provides guidelines for semantic interoperability, illustrated by use cases. Finally, it presents an overview of the most commonly used metadata standards and tools, and provides the general research direction for semantic interoperability using Semantic Web technologies.

    Quantifying aesthetics of visual design applied to automatic design

    In today's Instagram world, with advances in ubiquitous computing and access to social networks, digital media is adopted by art and culture. In this dissertation, we study what makes a good design by investigating mechanisms to bring the aesthetics of design from the realm of subjectivity to objectivity. These mechanisms are a combination of three main approaches: learning theories and principles of design by collaborating with professional designers, mathematically and statistically modeling good designs from large-scale datasets, and crowdsourcing to model perceived aesthetics of designs from general public responses. We then apply the knowledge gained in automatic design creation tools to help non-designers in self-publishing, and designers in inspiration and creativity. Arguably, unlike visual arts where the main goals may be abstract, visual design is conceptualized and created to convey a message and communicate with audiences. Therefore, we develop a semantic design mining framework to automatically link the design elements, layout, color, typography, and photos to linguistic concepts. The inferred semantics are applied to a design expert system to leverage user interactions in order to create personalized designs via recommendation algorithms based on the user's preferences.

    Building Energy Model Generation Using a Digital Photogrammetry-Based 3D Model

    Buildings consume a large amount of energy and environmental resources. At the same time, current practices for whole-building energy simulation are costly and require skilled labor. As Building Energy Modeling (BEM) and simulations are becoming increasingly important, there is a growing need to make environmental assessments of buildings more efficient and accessible. A building energy model is based on collecting input data from the real, physical world and representing them as a digital energy model. Real-world data is also collected in the field of 3D reconstruction and image analysis, where major developments have been happening in recent years. Current digital photogrammetry software can automatically match photographs taken with a simple smartphone camera and generate a 3D model. This thesis presents methods and techniques that can be used to generate a building energy model from a digital photogrammetry-based 3D model. To accomplish this, a prototype program was developed that uses 3D reconstructed data as geometric modeling inputs for BEM. To validate the prototype, an experiment was conducted where a case-study building was selected. Photographs of the building were taken using a small remotely controlled Unmanned Aerial Vehicle (UAV). Then, using photogrammetry software, the photographs were used to automatically generate a textured 3D model. The texture map, which is an image that represents the color information in the 3D model, was semantically annotated to extract building elements. The window annotations were used as inputs for the BEM process. In addition, a number of algorithms were applied to automatically convert both the 3D model and the annotated texture map into geometry that is compatible with a building energy model. Through the prototype, pre-defined templates were used with the geometric inputs to generate an EnergyPlus model (as an example building energy model).
The feasibility of this experiment was verified by running a successful energy simulation. The results of this thesis contribute towards creating an automated and user-friendly photo-to-BEM method.

    A Fuzzy-Based Multimedia Content Retrieval Method Using Mood Tags and Their Synonyms in Social Networks

    The preferences of Web information purchasers are rapidly evolving. Cost-effectiveness is now becoming less regarded than cost-satisfaction, which emphasizes the purchaser’s psychological satisfaction. One method to improve a user’s cost-satisfaction in multimedia content retrieval is to utilize the mood inherent in multimedia items. One example of an application using this method is SNS (Social Network Services), which is based on folksonomy, but such applications encounter problems due to synonyms. In order to solve the problem of synonyms, our previous study represented the mood of multimedia content with arousal and valence (AV) in Thayer’s two-dimensional model as its internal tag. Although some problems of synonyms could then be solved, the retrieval performance of the previous study was lower than that of a keyword-based method. In this paper, a new method that can solve the synonym problem is proposed, while simultaneously maintaining the same performance as the keyword-based approach. In the proposed method, the mood of multimedia content is represented with a fuzzy set of 12 moods of the Thayer model. For the analysis, the proposed method is compared with two methods, one based on AV values and the other based on keywords. The analysis results demonstrate that the proposed method is superior to the two methods.
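
As a rough illustration of the fuzzy-set idea in the abstract above, the sketch below ranks items by how closely their mood tags match a query. The abstract specifies neither the 12 mood labels nor the matching rule, so the labels ("happy", "pleased", and so on), the membership values, the function names, and the use of fuzzy Jaccard similarity are all assumptions made for illustration, not the authors' method.

```python
from typing import Dict, List, Tuple

# A fuzzy mood tag: mood label -> membership degree in [0, 1].
# Labels and degrees here are hypothetical, not taken from the paper.
FuzzySet = Dict[str, float]


def fuzzy_jaccard(a: FuzzySet, b: FuzzySet) -> float:
    """Fuzzy Jaccard similarity: sum of minima over sum of maxima."""
    labels = set(a) | set(b)
    intersection = sum(min(a.get(m, 0.0), b.get(m, 0.0)) for m in labels)
    union = sum(max(a.get(m, 0.0), b.get(m, 0.0)) for m in labels)
    return intersection / union if union > 0 else 0.0


def rank_by_mood(query: FuzzySet, items: Dict[str, FuzzySet]) -> List[Tuple[str, float]]:
    """Return (item name, similarity) pairs, most similar first."""
    scored = [(name, fuzzy_jaccard(query, tags)) for name, tags in items.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical catalogue: each item carries graded memberships
    # instead of a single keyword.
    catalogue = {
        "song_a": {"happy": 0.9, "excited": 0.6},
        "song_b": {"sad": 0.8, "calm": 0.4},
        "song_c": {"pleased": 0.7, "happy": 0.5},
    }
    query = {"happy": 1.0, "pleased": 0.5}
    for name, score in rank_by_mood(query, catalogue):
        print(name, round(score, 3))
```

Because near-synonymous moods can share membership mass, an item tagged mainly "pleased" can still match a query for "happy", which exact keyword matching would miss.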

    Data Mining and Machine Learning in Astronomy

    We review the current state of data mining and machine learning in astronomy. 'Data Mining' can have a somewhat mixed connotation from the point of view of a researcher in this field. If used correctly, it can be a powerful approach, holding the potential to fully exploit the exponentially increasing amount of available data, promising great scientific advance. However, if misused, it can be little more than the black-box application of complex computing algorithms that may give little physical insight, and provide questionable results. Here, we give an overview of the entire data mining process, from data collection through to the interpretation of results. We cover common machine learning algorithms, such as artificial neural networks and support vector machines, applications from a broad range of astronomy, emphasizing those where data mining techniques directly resulted in improved science, and important current and future directions, including probability density functions, parallel algorithms, petascale computing, and the time domain. We conclude that, so long as one carefully selects an appropriate algorithm, and is guided by the astronomical problem at hand, data mining can be very much the powerful tool, and not the questionable black box. Comment: Published in IJMPD. 61 pages, uses ws-ijmpd.cls. Several extra figures, some minor additions to the tex

    Applying psychology to forensic facial identification: perception and identification of facial composite images and facial image comparison

    Eyewitness recognition is acknowledged to be prone to error but there is less understanding of difficulty in discriminating unfamiliar faces. This thesis examined the effects of face perception on identification of facial composites, and on unfamiliar face image comparison. Facial composites depict face memories by reconstructing features and configurations to form a likeness. They are generally reconstructed from an unfamiliar face memory, and will be unavoidably flawed. Identification will require perception of any accurate features by someone who is familiar with the suspect, and performance is typically poor. In typical face perception, face images are processed efficiently as complete units of information. Chapter 2 explored the possibility that holistic processing of inaccurate composite configurations will impair identification of individual features. Composites were split below the eyes and misaligned to impair holistic analysis (cf. Young, Hellawell, & Jay, 1987); identification was significantly enhanced, indicating that perceptual expertise with inaccurate configurations exerts powerful effects that can be reduced by enabling featural analysis. Facial composite recognition is difficult, which means that perception and judgement will be influenced by an affective recognition bias: smiles enhance perceived familiarity, while negative expressions produce the opposite effect. In applied use, facial composites are generally produced from unpleasant memories and will convey negative expression; affective bias will, therefore, be important for facial composite recognition. Chapter 3 explored the effect of positive expression on composite identification: composite expressions were enhanced, and positive affect significantly increased identification. Affective quality rather than expression strength mediated the effect, with subtle manipulations being very effective. Facial image comparison (FIC) involves discrimination of two or more face images.
Accuracy in unfamiliar face matching is typically in the region of 70%, and as discrimination is difficult, may be influenced by affective bias. Chapter 4 explored the smiling face effect in unfamiliar face matching. When multiple items were compared, positive affect did not enhance performance and false positive identification increased. With a delayed matching procedure, identification was not enhanced but in contrast to face recognition and simultaneous matching, positive affect improved rejection of foil images. Distinctive faces are easier to discriminate. Chapter 5 evaluated a systematic caricature transformation as a means to increase distinctiveness and enhance discrimination of unfamiliar faces. Identification of matching face images did not improve, but successful rejection of non-matching items was significantly enhanced. Chapter 6 used face matching to explore the basis of own race bias in face perception. Other race faces were manipulated to show own race facial variation, and own race faces to show African American facial variation. When multiple face images were matched simultaneously, the transformation impaired performance for all of the images; but when images were individually matched, the transformation improved perception of other race faces and discrimination of own race faces declined. Transformation of Japanese faces to show own race dimensions produced the same pattern of effects but failed to reach significance. The results provide support for both perceptual expertise and featural processing theories of own race bias. Results are interpreted with reference to face perception theories; implications for application and future study are discussed.