6 research outputs found

    Design and Evaluation of Product Aesthetics: A Human-Machine Hybrid Approach

    Full text link
    Aesthetics are critically important to market acceptance in many product categories. In the automotive industry in particular, an improved aesthetic design can boost sales by 30% or more. Firms invest heavily in designing and testing new product aesthetics. A single automotive "theme clinic" costs between \$100,000 and \$1,000,000, and hundreds are conducted annually. We use machine learning to augment human judgment when designing and testing new product aesthetics. The model combines a probabilistic variational autoencoder (VAE) and adversarial components from generative adversarial networks (GAN), along with modeling assumptions that address managerial requirements for firm adoption. We train our model with data from an automotive partner-7,000 images evaluated by targeted consumers and 180,000 high-quality unrated images. Our model predicts well the appeal of new aesthetic designs-38% improvement relative to a baseline and substantial improvement over both conventional machine learning models and pretrained deep learning models. New automotive designs are generated in a controllable manner for the design team to consider, which we also empirically verify are appealing to consumers. These results, combining human and machine inputs for practical managerial usage, suggest that machine learning offers significant opportunity to augment aesthetic design

    Active vision for robot manipulators using the free energy principle

    Get PDF
    Occlusions, restricted field of view and limited resolution all constrain a robot's ability to sense its environment from a single observation. In these cases, the robot first needs to actively query multiple observations and accumulate information before it can complete a task. In this paper, we cast this problem of active vision as active inference, which states that an intelligent agent maintains a generative model of its environment and acts in order to minimize its surprise, or expected free energy according to this model. We apply this to an object-reaching task for a 7-DOF robotic manipulator with an in-hand camera to scan the workspace. A novel generative model using deep neural networks is proposed that is able to fuse multiple views into an abstract representation and is trained from data by minimizing variational free energy. We validate our approach experimentally for a reaching task in simulation in which a robotic agent starts without any knowledge about its workspace. Each step, the next view pose is chosen by evaluating the expected free energy. We find that by minimizing the expected free energy, exploratory behavior emerges when the target object to reach is not in view, and the end effector is moved to the correct reach position once the target is located. Similar to an owl scavenging for prey, the robot naturally prefers higher ground for exploring, approaching its target once located

    Representation Learning with Adversarial Latent Autoencoders

    Get PDF
    A large number of deep learning methods applied to computer vision problems require encoder-decoder maps. These methods include, but are not limited to, self-representation learning, generalization, few-shot learning, and novelty detection. Encoder-decoder maps are also useful for photo manipulation, photo editing, superresolution, etc. Encoder-decoder maps are typically learned using autoencoder networks.Traditionally, autoencoder reciprocity is achieved in the image-space using pixel-wisesimilarity loss, which has a widely known flaw of producing non-realistic reconstructions. This flaw is typical for the Variational Autoencoder (VAE) family and is not only limited to pixel-wise similarity losses, but is common to all methods relying upon the explicit maximum likelihood training paradigm, as opposed to an implicit one. Likelihood maximization, coupled with poor decoder distribution leads to poor or blurry reconstructions at best. Generative Adversarial Networks (GANs) on the other hand, perform an implicit maximization of the likelihood by solving a minimax game, thus bypassing the issues derived from the explicit maximization. This provides GAN architectures with remarkable generative power, enabling the generation of high-resolution images of humans, which are indistinguishable from real photos to the naked eye. However, GAN architectures lack inference capabilities, which makes them unsuitable for training encoder-decoder maps, effectively limiting their application space.We introduce an autoencoder architecture that (a) is free from the consequences ofmaximizing the likelihood directly, (b) produces reconstructions competitive in quality with state-of-the-art GAN architectures, and (c) allows learning disentangled representations, which makes it useful in a variety of problems. We show that the proposed architecture and training paradigm significantly improves the state-of-the-art in novelty and anomaly detection methods, it enables novel kinds of image manipulations, and has significant potential for other applications
    corecore