    Foreword ACII 2013

    Edge Guided GANs with Semantic Preserving for Semantic Image Synthesis

    We propose a novel Edge guided Generative Adversarial Network (EdgeGAN) for photo-realistic image synthesis from semantic layouts. Although considerable improvement has been achieved, the quality of synthesized images is far from satisfactory due to two largely unresolved challenges. First, the semantic labels do not provide detailed structural information, making it difficult to synthesize local details and structures. Second, the widely adopted CNN operations such as convolution, down-sampling and normalization usually cause spatial resolution loss and thus are unable to fully preserve the original semantic information, leading to semantically inconsistent results (e.g., missing small objects). To tackle the first challenge, we propose to use the edge as an intermediate representation which is further adopted to guide image generation via a proposed attention guided edge transfer module. Edge information is produced by a convolutional generator and introduces detailed structure information. Further, to preserve the semantic information, we design an effective module to selectively highlight class-dependent feature maps according to the original semantic layout. Extensive experiments on two challenging datasets show that the proposed EdgeGAN can generate significantly better results than state-of-the-art methods. The source code and trained models are available at https://github.com/Ha0Tang/EdgeGAN.
    Comment: 40 pages, 29 figures

    Local Class-Specific and Global Image-Level Generative Adversarial Networks for Semantic-Guided Scene Generation

    In this paper, we address the task of semantic-guided scene generation. One open challenge in scene generation is the difficulty of generating small objects and detailed local texture, which has been widely observed in global image-level generation methods. To tackle this issue, we consider learning the scene generation in a local context, and correspondingly design a local class-specific generative network with semantic maps as guidance, which separately constructs and learns sub-generators concentrating on the generation of different classes, and is able to provide more scene details. To learn more discriminative class-specific feature representations for the local generation, a novel classification module is also proposed. To combine the advantages of both the global image-level and the local class-specific generation, a joint generation network is designed with an attention fusion module and a dual-discriminator structure embedded. Extensive experiments on two scene image generation tasks show superior generation performance of the proposed model. State-of-the-art results are established by large margins on both tasks and on challenging public benchmarks. The source code and trained models are available at https://github.com/Ha0Tang/LGGAN.
    Comment: Accepted to CVPR 2020, camera ready (10 pages) + supplementary (18 pages)

    ELVIS: Entertainment-led video summaries

    © ACM, 2010. This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in ACM Transactions on Multimedia Computing, Communications, and Applications, 6(3): Article no. 17 (2010) http://doi.acm.org/10.1145/1823746.1823751
    Video summaries present the user with a condensed and succinct representation of the content of a video stream. Usually this is achieved by attaching degrees of importance to low-level image, audio and text features. However, video content elicits strong and measurable physiological responses in the user, which are potentially rich indicators of what video content is memorable to or emotionally engaging for an individual user. This article proposes a technique that exploits such physiological responses to a given video stream by a given user to produce Entertainment-Led VIdeo Summaries (ELVIS). ELVIS is made up of five analysis phases which correspond to the analyses of five physiological response measures: electro-dermal response (EDR), heart rate (HR), blood volume pulse (BVP), respiration rate (RR), and respiration amplitude (RA). Through these analyses, the temporal locations of the most entertaining video subsegments, as they occur within the video stream as a whole, are automatically identified. The effectiveness of the ELVIS technique is verified through a statistical analysis of data collected during a set of user trials. Our results show that ELVIS is more consistent than RANDOM, EDR, HR, BVP, RR and RA selections in identifying the most entertaining video subsegments for content in the comedy, horror/comedy, and horror genres. Subjective user reports also reveal that ELVIS video summaries are comparatively easy to understand, enjoyable, and informative.
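
The idea of locating the most engaging subsegment from several physiological signals can be illustrated with a toy sketch. This is not the ELVIS algorithm: the abstract does not describe the five analysis phases, so everything below is an assumption made for illustration, including the function names `z_scores` and `most_engaging_window`, the z-score normalisation, the equal-weight average across measures, the fixed-length sliding window, and the premise that a higher reading means more engagement.

```python
from statistics import mean, pstdev
from typing import Dict, List


def z_scores(signal: List[float]) -> List[float]:
    """Standardise one signal so that measures with different units are comparable."""
    mu = mean(signal)
    sigma = pstdev(signal)
    if sigma == 0:
        # A flat signal carries no information about where engagement peaks.
        return [0.0] * len(signal)
    return [(x - mu) / sigma for x in signal]


def most_engaging_window(signals: Dict[str, List[float]], window: int) -> int:
    """Return the start index of the window with the highest mean combined score.

    Hypothetical scoring rule (an assumption, not the published method):
    average the z-scored measures at each time step, then average over the window.
    """
    lengths = {len(s) for s in signals.values()}
    if len(lengths) != 1:
        raise ValueError("all signals must be non-empty and share one length")
    n = lengths.pop()
    if not 1 <= window <= n:
        raise ValueError("window must be between 1 and the signal length")

    normalised = [z_scores(s) for s in signals.values()]
    combined = [mean(step) for step in zip(*normalised)]
    window_scores = [mean(combined[i:i + window]) for i in range(n - window + 1)]
    return max(range(len(window_scores)), key=window_scores.__getitem__)


# Made-up readings for the five measures named in the abstract, one per time step.
signals = {
    "EDR": [0, 0, 0, 5, 5, 0],
    "HR": [60, 60, 61, 80, 82, 60],
    "BVP": [1, 1, 1, 3, 3, 1],
    "RR": [12, 12, 12, 18, 18, 12],
    "RA": [0.5, 0.5, 0.5, 0.9, 0.9, 0.5],
}
start = most_engaging_window(signals, window=2)
```

With these made-up readings every measure peaks at time steps 3 and 4, so the selected window starts at index 3. A real system would need per-measure analysis rather than a single averaged score.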

    Microscopic shell-model description of the exotic nucleus ^{16}C

    The structure of the neutron-rich carbon nucleus ^{16}C is described by introducing a new microscopic shell model of no-core type. The model space is composed of the 0s, 0p, 1s0d, and 1p0f shells. The effective interaction is microscopically derived from the CD-Bonn potential and the Coulomb force through a unitary transformation theory. Calculated low-lying energy levels of ^{16}C agree well with the experiment. The B(E2;2_{1}^{+} \to 0_{1}^{+}) value is calculated with the bare charges. The anomalously hindered B(E2) value for ^{16}C, measured recently, is well reproduced.
    Comment: 14 pages, 4 figures, considerable results and discussion are added, but the main conclusion is unchanged, accepted for publication in Phys. Lett.

    Human-centered Computing: Toward a Human Revolution

    Human-centered computing studies the design, development, and deployment of mixed-initiative human-computer systems. HCC is emerging from the convergence of multiple disciplines that are concerned both with understanding human beings and with the design of computational artifacts.

    What does touch tell us about emotions in touchscreen-based gameplay?

    This is the post-print version of the Article. The official published version can be accessed from the link below - Copyright @ 2012 ACM. It is posted here by permission of ACM for your personal use. Not for redistribution.
    Nowadays, more and more people play games on touch-screen mobile phones. This phenomenon raises a very interesting question: does touch behaviour reflect the player’s emotional state? If so, this would not only be a valuable evaluation indicator for game designers, but would also enable real-time personalization of the game experience. Psychology studies on acted touch behaviour show the existence of discriminative affective profiles. In this paper, finger-stroke features during gameplay on an iPod were extracted and their discriminative power analysed. Based on touch behaviour, machine learning algorithms were used to build systems for automatically discriminating between four emotional states (Excited, Relaxed, Frustrated, Bored), two levels of arousal and two levels of valence. The systems correctly discriminated between the four emotional states in 69% to 77% of cases. Higher results (~89%) were obtained for discriminating between two levels of arousal and two levels of valence.

    AttentionGAN: Unpaired Image-to-Image Translation using Attention-Guided Generative Adversarial Networks

    State-of-the-art methods in unpaired image-to-image translation are capable of learning a mapping from a source domain to a target domain with unpaired image data. Though the existing methods have achieved promising results, they still produce unsatisfactory artifacts: they can convert low-level information but are limited in transforming the high-level semantics of input images. One possible reason is that generators do not have the ability to perceive the most discriminative semantic parts between the source and target domains, which lowers the quality of the generated images. In this paper, we propose new Attention-Guided Generative Adversarial Networks (AttentionGAN) for the unpaired image-to-image translation task. AttentionGAN can identify the most discriminative semantic objects and minimize changes to unwanted parts for semantic manipulation problems without using extra data and models. The attention-guided generators in AttentionGAN are able to produce attention masks via a built-in attention mechanism, and then fuse the generation output with the attention masks to obtain high-quality target images. Accordingly, we also design a novel attention-guided discriminator which only considers attended regions. Extensive experiments are conducted on several generative tasks, demonstrating that the proposed model generates sharper and more realistic images than existing competitive models. The source code for the proposed AttentionGAN is available at https://github.com/Ha0Tang/AttentionGAN.
    Comment: An extended version of a paper published in IJCNN2019. arXiv admin note: substantial text overlap with arXiv:1903.1229
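
The fusion of a generator output with an attention mask, as mentioned in the AttentionGAN abstract, might look roughly like the sketch below. The abstract does not state the exact rule, so the per-pixel blend `mask * content + (1 - mask) * source` is an assumption, as are the function name `fuse_with_attention` and the use of plain nested lists for single-channel images. It only illustrates how a mask can keep unattended regions of the input unchanged.

```python
from typing import List

Image = List[List[float]]  # single-channel image stored as rows of pixel values


def fuse_with_attention(source: Image, content: Image, mask: Image) -> Image:
    """Blend generated content into the source image with a per-pixel mask.

    Assumed rule (not taken from the abstract):
        output = mask * content + (1 - mask) * source
    A mask value of 1 takes the generated pixel; 0 keeps the source pixel.
    """
    if not (len(source) == len(content) == len(mask)):
        raise ValueError("source, content and mask must have the same height")

    fused: Image = []
    for src_row, gen_row, att_row in zip(source, content, mask):
        if not (len(src_row) == len(gen_row) == len(att_row)):
            raise ValueError("source, content and mask must have the same width")
        fused.append([
            a * g + (1.0 - a) * s
            for s, g, a in zip(src_row, gen_row, att_row)
        ])
    return fused


# Toy 2x2 example: only the left column is attended, so the right column
# is copied from the source unchanged.
source = [[0.2, 0.2], [0.2, 0.2]]
content = [[1.0, 1.0], [1.0, 1.0]]
mask = [[1.0, 0.0], [0.5, 0.0]]
fused = fuse_with_attention(source, content, mask)
```

In this toy case the pixels with mask value 0 stay at the source value 0.2, which is the sense in which a mask can minimize changes to unwanted parts of the image.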