18 research outputs found
RetouchingFFHQ: A Large-scale Dataset for Fine-grained Face Retouching Detection
The widespread use of face retouching filters on short-video platforms has
raised concerns about the authenticity of digital appearances and the impact of
deceptive advertising. To address these issues, there is a pressing need to
develop advanced face retouching detection techniques. However, the lack of large-scale
and fine-grained face retouching datasets has been a major obstacle to progress
in this field. In this paper, we introduce RetouchingFFHQ, a large-scale and
fine-grained face retouching dataset that contains over half a million
conditionally-retouched images. RetouchingFFHQ stands out from previous
datasets due to its large scale, high quality, fine-grainedness, and
customization. By including four typical types of face retouching operations
and different retouching levels, we extend the binary face retouching detection
into a fine-grained, multi-retouching type, and multi-retouching level
estimation problem. Additionally, we propose a Multi-granularity Attention
Module (MAM) as a plugin for CNN backbones for enhanced cross-scale
representation learning. Extensive experiments using different baselines as
well as our proposed method on RetouchingFFHQ show decent performance on face
retouching detection. With the proposed new dataset, we believe there is great
potential for future work to tackle the challenging problem of real-world
fine-grained face retouching detection. Comment: Under review.
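The Multi-granularity Attention Module itself is the paper's contribution and its internals are not given here; purely as an illustrative, hypothetical sketch of the general idea of cross-scale attention fusion over a CNN feature map (not the authors' MAM), one could pool a feature map at several granularities and softmax-weight the pooled descriptors:

```python
import numpy as np

def cross_scale_fuse(feat, scales=(1, 2, 4)):
    # feat: (C, H, W) feature map; H and W must be divisible by each scale.
    C, H, W = feat.shape
    descs = []
    for s in scales:
        # average-pool into an s x s grid, then summarise as one (C,) vector
        pooled = feat.reshape(C, s, H // s, s, W // s).mean(axis=(2, 4))
        descs.append(pooled.reshape(C, -1).mean(axis=1))
    D = np.stack(descs)                        # (num_scales, C)
    logits = D.mean(axis=1)                    # one score per scale
    logits -= logits.max()                     # numerical stability
    w = np.exp(logits) / np.exp(logits).sum()  # softmax over scales
    return (w[:, None] * D).sum(axis=0), w     # fused (C,), scale weights

feat = np.random.default_rng(0).normal(size=(64, 8, 8))
fused, weights = cross_scale_fuse(feat)
```

All names here (`cross_scale_fuse`, the scale set, the scoring rule) are assumptions for illustration only; the actual module is a learned plugin for CNN backbones.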
WiFi-Based Human Activity Recognition Using Attention-Based BiLSTM
Recently, significant efforts have been made to explore human activity recognition (HAR) techniques that use information gathered by existing indoor wireless infrastructures through WiFi signals without demanding the monitored subject to carry a dedicated device. The key intuition is that different activities introduce different multi-paths in WiFi signals and generate different patterns in the time series of channel state information (CSI). In this paper, we propose and evaluate a full pipeline for a CSI-based human activity recognition framework for 12 activities in three different spatial environments using two deep learning models: ABiLSTM and CNN-ABiLSTM. Evaluation experiments have demonstrated that the proposed models outperform state-of-the-art models. Also, the experiments show that the proposed models can be applied to other environments with different configurations, albeit with some caveats. The proposed ABiLSTM model achieves an overall accuracy of 94.03%, 91.96%, and 92.59% across the three target environments, while the proposed CNN-ABiLSTM model reaches accuracies of 98.54%, 94.25%, and 95.09% across those same environments.
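The core of an attention-based BiLSTM is soft attention pooling over the recurrent hidden states of the CSI time series. As a minimal NumPy sketch of that pooling step alone (the encoder, dimensions, and scoring vector are illustrative assumptions, not the paper's exact model):

```python
import numpy as np

def attention_pool(H, w):
    # H: (T, d) hidden states from a (bidirectional) recurrent encoder
    # w: (d,) learned scoring vector
    scores = H @ w                                 # one score per time step, (T,)
    scores -= scores.max()                         # numerical stability
    alpha = np.exp(scores) / np.exp(scores).sum()  # attention weights sum to 1
    return alpha @ H, alpha                        # context (d,), weights (T,)

rng = np.random.default_rng(0)
states = rng.normal(size=(50, 128))                # e.g. 50 CSI time steps
context, alpha = attention_pool(states, rng.normal(size=128))
```

The resulting context vector would then feed a classification head over the 12 activities; in practice both `H` and `w` are learned end-to-end.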
Handbook of Digital Face Manipulation and Detection
This open access book provides the first comprehensive collection of studies dealing with the hot topic of digital face manipulation such as DeepFakes, Face Morphing, or Reenactment. It combines the research fields of biometrics and media forensics including contributions from academia and industry. Appealing to a broad readership, introductory chapters provide a comprehensive overview of the topic, addressing readers wishing to gain a brief overview of the state-of-the-art. Subsequent chapters, which delve deeper into various research challenges, are oriented towards advanced readers. Moreover, the book provides a good starting point for young researchers as well as a reference guide pointing at further literature. Hence, the primary readership is academic institutions and industry currently involved in digital face manipulation and detection. The book could easily be used as a recommended text for courses in image processing, machine learning, media forensics, biometrics, and the general security area.
KEER2022
Half-title: KEER2022. Diversities. Resource description: 25 July 202
EVA London 2022: Electronic Visualisation and the Arts
The Electronic Visualisation and the Arts London 2022 Conference (EVA London 2022) is co-sponsored by the Computer Arts Society (CAS) and BCS, the Chartered Institute for IT, of which the CAS is a Specialist Group. Of course, this has been a difficult time for all conferences, with the Covid-19 pandemic. For the first time since 2019, the EVA London 2022 Conference is a physical conference. It is also an online conference, as it was in the previous two years. We continue with publishing the proceedings, both online, with open access via ScienceOpen, and also in our traditional printed form, for the second year in full colour. Over recent decades, the EVA London Conference on Electronic Visualisation and the Arts has established itself as one of the United Kingdom’s most innovative and interdisciplinary conferences. It brings together a wide range of research domains to celebrate a diverse set of interests, with a specialised focus on visualisation. The long and short papers in this volume cover varied topics concerning the arts, visualisations, and IT, including 3D graphics, animation, artificial intelligence, creativity, culture, design, digital art, ethics, heritage, literature, museums, music, philosophy, politics, publishing, social media, and virtual reality, as well as other related interdisciplinary areas.
The EVA London 2022 proceedings present a wide spectrum of papers, demonstrations, Research Workshop contributions, other workshops, and for the seventh year, the EVA London Symposium, in the form of an opening morning session, with three invited contributors. The conference includes a number of other associated evening events, including ones organised by the Computer Arts Society, Art in Flux, and EVA International. As in previous years, there are Research Workshop contributions in this volume, aimed at encouraging participation by postgraduate students and early-career artists, accepted either through the peer-review process or directly by the Research Workshop chair. The Research Workshop contributors are offered bursaries to aid participation. In particular, EVA London liaises with Art in Flux, a London-based group of digital artists. The EVA London 2022 proceedings include long papers and short “poster” papers from international researchers inside and outside academia, from graduate artists, PhD students, industry professionals, established scholars, and senior researchers, who value EVA London for its interdisciplinary community. The conference also features keynote talks. A special feature this year is support for Ukrainian culture after its invasion earlier in the year. This publication has resulted from a selective peer review process, fitting as many excellent submissions as possible into the proceedings.
This year, submission numbers were lower than previous years, most likely due to the pandemic and a new requirement to submit drafts of long papers for review as well as abstracts. It is still pleasing to have so many good proposals from which to select the papers that have been included. EVA London is part of a larger network of EVA international conferences. EVA events have been held in Athens, Beijing, Berlin, Brussels, California, Cambridge (both UK and USA), Canberra, Copenhagen, Dallas, Delhi, Edinburgh, Florence, Gifu (Japan), Glasgow, Harvard, Jerusalem, Kiev, Laval, London, Madrid, Montreal, Moscow, New York, Paris, Prague, St Petersburg, Thessaloniki, and Warsaw. Further venues for EVA conferences are very much encouraged by the EVA community. As noted earlier, this volume is a record of accepted submissions to EVA London 2022. Associated online presentations are in general recorded and made available online after the conference.
The Problem of Expression in Generative and Human Drawing Agents
Drawing is considered one of the most accessible forms of expression. Children as young as four years of age as well as artists with more advanced technical skills can use simple drawing lines to effectively communicate abstract concepts and express emotional states. According to recent theories of embodiment rooted in the field of empirical aesthetics, an audience’s aesthetic experience is conditioned by the artists’ emotional states and intentions expressed through their works. This is evident especially when the artwork involves brush strokes and drawing lines. Within this paradigm, I argue that investigating the relation between the artist and the audience becomes particularly relevant when the drawing agent is artificial. In that case, what emotional state and what agency is the audience relating to? This thesis examines drawings as a means to express emotions by evaluating how a naive audience experienced drawings made by human and artificial drawing agents. In particular, it asks if there are substantial differences between the way we “perceive” and “feel” about human-made strokes as opposed to artificially generated ones. For this purpose, four studies were conducted combining methods from empirical aesthetics and machine learning. The studies conducted with human participants showed that naive sketchers unintentionally conveyed information about their individual differences in emotional states (i.e., anxiety level) to a naive crowd by using figurative drawings. Similar effects were observed using abstract drawing lines made by naive sketchers whose mood changed while listening to music. Further analysis conducted at the stroke level provided insight into the interdependencies between the sketchers’ emotional states, the music, and the drawing process. Finally, the thesis investigated people’s ability to distinguish between generative and human-made strokes within theories of dual-processing reasoning.
The thesis contributes to the debate on art made by machines by evaluating the extent to which a generative model could learn from the human drawing process and generate expressive drawing strokes. The datasets and the models produced during this work further contribute to the fields of computational arts and Human-Computer Interaction, particularly focusing on non-verbal expression.
Facial Beauty Prediction and Analysis based on Deep Convolutional Neural Network: A Review
Abstract: Facial attractiveness or facial beauty prediction (FBP) is an active research area with several potential applications. It remains a difficult problem in the computer vision domain because of the few public databases related to FBP and because experimental trials have been conducted on small-scale databases. Moreover, the evaluation of facial beauty is subjective in nature, with people holding individual preferences about beauty. Deep learning techniques have displayed a significant ability in terms of analysis and feature representation. Previous studies focused on scattered portions of facial beauty with fewer comparisons between diverse techniques. Thus, this article reviews recent research on computer prediction and analysis of facial beauty based on deep convolutional neural networks (DCNNs). Furthermore, the possible lines of research and challenges outlined in this article can help researchers advance the state-of-the-art in future work.
Modeling Visual Rhetoric and Semantics in Multimedia
Recent advances in machine learning have enabled computer vision algorithms to model complicated visual phenomena with accuracies unthinkable a mere decade ago. Their high performance on a plethora of vision-related tasks has enabled computer vision researchers to begin to move beyond traditional visual recognition problems to tasks requiring higher-level image understanding. However, most computer vision research still focuses on describing what images, text, or other media literally portray. In contrast, in this dissertation we focus on learning how and why such content is portrayed. Rather than viewing media for its content, we recast the problem as understanding visual communication and visual rhetoric. For example, the same content may be portrayed in different ways in order to present the story the author wishes to convey. We thus seek to model not only the content of the media, but its authorial intent and latent messaging. Understanding how and why visual content is portrayed a certain way requires understanding higher-level abstract semantic concepts which are themselves latent within visual media. By latent, we mean the concept is not readily visually accessible within a single image (e.g. right vs left political bias), in contrast to explicit visual semantic concepts such as objects.
Specifically, we study the problems of modeling photographic style (how professional photographers portray their subjects), understanding visual persuasion in image advertisements, modeling political bias in multimedia (image and text) news articles, and learning cross-modal semantic representations. While most past research in vision and natural language processing studies the case where visual content and paired text are highly aligned (as in the case of image captions), we target the case where each modality conveys complementary information to tell a larger story. We particularly focus on the problem of learning cross-modal representations from multimedia exhibiting weak alignment between the image and text modalities. A variety of techniques are presented which improve modeling of multimedia rhetoric in real-world data and enable more robust artificially intelligent systems.
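Many cross-modal representation objectives reduce at evaluation time to comparing image and text embeddings in a shared space. As a generic NumPy sketch of that comparison (a common pattern, not the dissertation's specific model; the embedding dimensions and inputs are illustrative assumptions):

```python
import numpy as np

def cosine_alignment(img_emb, txt_emb):
    # L2-normalise each row, then take all pairwise dot products
    I = img_emb / np.linalg.norm(img_emb, axis=1, keepdims=True)
    T = txt_emb / np.linalg.norm(txt_emb, axis=1, keepdims=True)
    return I @ T.T  # (n_img, n_txt); diagonal holds matched pairs

rng = np.random.default_rng(1)
sim = cosine_alignment(rng.normal(size=(4, 32)), rng.normal(size=(4, 32)))
```

Under weak alignment, the training signal would push matched image-text pairs (the diagonal) above mismatched ones even though the two modalities only partially overlap in content.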