Design and Evaluation of Product Aesthetics: A Human-Machine Hybrid Approach
Aesthetics are critically important to market acceptance in many product
categories. In the automotive industry in particular, an improved aesthetic
design can boost sales by 30% or more. Firms invest heavily in designing and
testing new product aesthetics. A single automotive "theme clinic" costs
between $100,000 and $1,000,000, and hundreds are conducted annually. We use
machine learning to augment human judgment when designing and testing new
product aesthetics. The model combines a probabilistic variational autoencoder
(VAE) and adversarial components from generative adversarial networks (GAN),
along with modeling assumptions that address managerial requirements for firm
adoption. We train our model with data from an automotive partner: 7,000 images
evaluated by targeted consumers and 180,000 high-quality unrated images. Our
model predicts the appeal of new aesthetic designs well, with a 38% improvement
relative to a baseline and substantial improvement over both conventional
machine learning models and pretrained deep learning models. New automotive
designs are generated in a controllable manner for the design team to consider,
which we also empirically verify are appealing to consumers. These results,
combining human and machine inputs for practical managerial usage, suggest that
machine learning offers significant opportunity to augment aesthetic design
Deep Learning for Audio Signal Processing
Given the recent surge in developments of deep learning, this article
provides a review of the state-of-the-art deep learning techniques for audio
signal processing. Speech, music, and environmental sound processing are
considered side-by-side, in order to point out similarities and differences
between the domains, highlighting general methods, problems, key references,
and potential for cross-fertilization between areas. The dominant feature
representations (in particular, log-mel spectra and raw waveform) and deep
learning models are reviewed, including convolutional neural networks, variants
of the long short-term memory architecture, as well as more audio-specific
neural network models. Subsequently, prominent deep learning application areas
are covered, i.e. audio recognition (automatic speech recognition, music
information retrieval, environmental sound detection, localization and
tracking) and synthesis and transformation (source separation, audio
enhancement, generative models for speech, sound, and music synthesis).
Finally, key issues and future questions regarding deep learning applied to
audio signal processing are identified.
16th Sound and Music Computing Conference SMC 2019 (28–31 May 2019, Malaga, Spain)
The 16th Sound and Music Computing Conference (SMC 2019) took place in Malaga, Spain, on 28-31 May 2019 and was organized by the Application of Information and Communication Technologies Research group (ATIC) of the University of Malaga (UMA). The associated SMC 2019 Summer School took place on 25-28 May 2019. The First International Day of Women in Inclusive Engineering, Sound and Music Computing Research (WiSMC 2019) took place on 28 May 2019. The SMC 2019 topics of interest included a wide selection of topics related to acoustics, psychoacoustics, music, technology for music, audio analysis, musicology, sonification, music games, machine learning, serious games, immersive audio, sound synthesis, etc.
Handbook of Digital Face Manipulation and Detection
This open access book provides the first comprehensive collection of studies dealing with the hot topic of digital face manipulation such as DeepFakes, Face Morphing, or Reenactment. It combines the research fields of biometrics and media forensics, including contributions from academia and industry. Appealing to a broad readership, introductory chapters provide a comprehensive overview of the topic for readers wishing to gain a brief overview of the state-of-the-art. Subsequent chapters, which delve deeper into various research challenges, are oriented towards advanced readers. Moreover, the book provides a good starting point for young researchers as well as a reference guide pointing at further literature. Hence, the primary readership is academic institutions and industry currently involved in digital face manipulation and detection. The book could easily be used as a recommended text for courses in image processing, machine learning, media forensics, biometrics, and the general security area.
Bias in Deep Learning and Applications to Face Analysis
Deep learning has fostered progress in the field of face analysis, resulting in the integration of these models into multiple aspects of society. Even though the majority of research has focused on optimizing standard evaluation metrics, recent work has exposed the bias of such algorithms as well as the dangers of their unaccountable utilization. In this thesis, we explore the bias of deep learning models in the discriminative and the generative setting. We begin by investigating the bias of face analysis models with regard to different demographics. To this end, we collect KANFace, a large-scale video and image dataset of faces captured "in-the-wild". The rich set of annotations allows us to expose the demographic bias of deep learning models, which we mitigate by utilizing adversarial learning to debias the deep representations. Furthermore, we explore neural augmentation as a strategy towards training fair classifiers. We propose a style-based multi-attribute transfer framework that is able to synthesize photo-realistic faces of the underrepresented demographics. This is achieved by introducing a multi-attribute extension to Adaptive Instance Normalisation that captures the multiplicative interactions between the representations of different attributes. Focusing on bias in gender recognition, we showcase the efficacy of the framework in training classifiers that are fairer compared to generative and fairness-aware methods. In the second part, we focus on bias in deep generative models. In particular, we start by studying the generalization of generative models on images of unseen attribute combinations. To this end, we extend the conditional Variational Autoencoder by introducing a multilinear conditioning framework. The proposed method is able to synthesize unseen attribute combinations by modeling the multiplicative interactions between the attributes.
Lastly, in order to control protected attributes, we investigate controlled image generation without training on a labelled dataset. We leverage pre-trained Generative Adversarial Networks that are trained in an unsupervised fashion and exploit the clustering that occurs in the representation space of intermediate layers of the generator. We show that these clusters capture semantic attribute information and condition image synthesis on the cluster assignment using Implicit Maximum Likelihood Estimation.
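For orientation, the abstract above builds on Adaptive Instance Normalisation (AdaIN). The sketch below shows only the standard single-style AdaIN operation, in which each content channel is normalised and then re-scaled with the statistics of the matching style channel. It is not the multi-attribute extension proposed in the thesis, whose details the abstract does not give. The function names, the flat-list channel layout, and the toy values are assumptions made for illustration.

```python
import math


def channel_stats(channel, eps=1e-5):
    """Return the mean and (eps-stabilised) standard deviation of one channel."""
    n = len(channel)
    mean = sum(channel) / n
    var = sum((v - mean) ** 2 for v in channel) / n
    return mean, math.sqrt(var + eps)


def adain(content, style, eps=1e-5):
    """Standard AdaIN: normalise each content channel, then apply the
    mean and standard deviation of the corresponding style channel."""
    out = []
    for c_chan, s_chan in zip(content, style):
        c_mean, c_std = channel_stats(c_chan, eps)
        s_mean, s_std = channel_stats(s_chan, eps)
        out.append([s_std * (v - c_mean) / c_std + s_mean for v in c_chan])
    return out


# Toy feature maps: two channels, four activations each (hypothetical values).
content = [[1.0, 2.0, 3.0, 4.0], [0.0, 0.0, 1.0, 1.0]]
style = [[10.0, 20.0, 30.0, 40.0], [5.0, 5.0, 7.0, 7.0]]
stylised = adain(content, style)
```

After the call, each channel of `stylised` keeps the relative ordering of the content activations while taking on, approximately, the mean and spread of the style channel.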
Conditional Image Synthesis by Generative Adversarial Modeling
In recent years, image synthesis has attracted increasing interest. This work explores the recovery of details (low-level information) from high-level features. Generative adversarial nets (GAN) have led to an explosion of image synthesis. Moving away from application-oriented alternatives, this work investigates their intrinsic drawbacks and derives corresponding improvements in a theoretical manner. Based on GAN, this work further investigates conditional image synthesis by incorporating an autoencoder (AE) into GAN. The GAN+AE structure has been demonstrated to be an effective framework for image manipulation. This work emphasizes the effectiveness of the GAN+AE structure by proposing the conditional adversarial autoencoder (CAAE) for human facial age progression and regression. Instead of editing at the image level, i.e., explicitly changing the shape of the face, adding wrinkles, etc., this work edits the high-level features, which implicitly guide the recovery of images towards the expected appearance. While GAN+AE is prevalent in image manipulation, its drawbacks remain underexplored. For example, GAN+AE requires a weight to balance the effects of GAN and AE. An inappropriate weight would generate unstable results. This work provides insight into such instability, which is due to the interaction between GAN and AE. Therefore, this work proposes decoupled learning (GAN//AE) to avoid the interaction between them and achieve a robust and effective framework for image synthesis. Most existing works that use the GAN+AE structure could be easily adapted to the proposed GAN//AE structure to boost their robustness. Experimental results demonstrate the correctness and effectiveness of the provided derivation and proposed methods, respectively. In addition, this work extends conditional image synthesis to the traditional area of image super-resolution, which recovers the high-resolution image according to its low-resolution counterpart.
Diverging from this traditional routine, this work explores a new research direction: reference-conditioned super-resolution, in which a reference image containing the desired high-resolution texture details is used in addition to the low-resolution image. We focus on transferring the high-resolution texture from reference images to the super-resolution process without the constraint of content similarity between reference and target images, which is a key difference from previous example-based methods.
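The balancing weight mentioned in the abstract above can be illustrated with a generic weighted objective: a reconstruction term from the autoencoder plus a weighted adversarial term from the GAN. This is a minimal sketch of the common GAN+AE formulation, assuming a mean-squared reconstruction loss and a non-saturating generator loss. It is not the objective used in the cited work, and it does not reproduce the decoupled GAN//AE scheme. All names and numbers are hypothetical.

```python
import math


def reconstruction_loss(x, x_hat):
    """Mean squared error between an input and its reconstruction (AE term)."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


def adversarial_loss(d_fake, eps=1e-12):
    """Non-saturating generator loss: -mean(log D(G(z))) over discriminator
    scores in (0, 1) assigned to generated samples (GAN term)."""
    return -sum(math.log(p + eps) for p in d_fake) / len(d_fake)


def combined_loss(x, x_hat, d_fake, weight):
    """Generic GAN+AE objective: AE term plus `weight` times the GAN term."""
    return reconstruction_loss(x, x_hat) + weight * adversarial_loss(d_fake)


# Hypothetical values: a two-pixel "image", its reconstruction, and
# discriminator scores for two generated samples.
x, x_hat = [0.0, 1.0], [0.1, 0.8]
d_fake = [0.5, 0.5]
low = combined_loss(x, x_hat, d_fake, weight=0.01)
high = combined_loss(x, x_hat, d_fake, weight=1.0)
```

With a small weight the reconstruction term dominates the total, and with a large weight the adversarial term does, which is the trade-off the abstract identifies as a source of instability.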