
    Self-Corrective Perturbations for Semantic Segmentation and Classification

    Convolutional Neural Networks have been a subject of great importance over the past decade, and great strides have been made in their utility for producing state-of-the-art performance in many computer vision problems. However, the behavior of deep networks is yet to be fully understood and is still an active area of research. In this work, we present an intriguing behavior: pre-trained CNNs can be made to improve their predictions by structurally perturbing the input. We observe that these perturbations, referred to as Guided Perturbations, enable a trained network to improve its prediction performance without any learning or change in network weights. We perform various ablative experiments to understand how these perturbations affect the local context and feature representations. Furthermore, we demonstrate that this idea can improve the performance of several existing approaches on semantic segmentation and scene labeling tasks on the PASCAL VOC dataset and on supervised classification tasks on the MNIST and CIFAR10 datasets. Comment: Accepted to ICCV 201
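
    The guided-perturbation idea can be sketched in a few lines: use the network's own prediction as a pseudo-label and nudge the input so that prediction becomes more confident. The sketch below is a hedged PyTorch illustration, not the paper's exact procedure; `model`, the step size, and the single-step default are assumptions.

        import torch
        import torch.nn.functional as F

        def guided_perturbation(model, x, step_size=1e-2, steps=1):
            """Perturb x so the network grows more confident in its own
            current prediction; the weights are never changed."""
            model.eval()
            x = x.detach().clone()
            for _ in range(steps):
                x.requires_grad_(True)
                logits = model(x)
                pseudo = logits.argmax(dim=1)            # the network's own guess
                loss = F.cross_entropy(logits, pseudo)
                grad, = torch.autograd.grad(loss, x)
                # step against the loss gradient, reinforcing the prediction
                x = (x - step_size * grad.sign()).detach()
            return x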

    Regularizing deep networks using efficient layerwise adversarial training

    Adversarial training has been shown to regularize deep neural networks in addition to increasing their robustness to adversarial examples. However, its impact on very deep state-of-the-art networks has not been fully investigated. In this paper, we present an efficient approach to adversarial training that perturbs intermediate layer activations, and we study the use of such perturbations as a regularizer during training. We use these perturbations to train very deep models such as ResNets and show improved performance on both adversarial and original test data. Our experiments highlight the benefits of perturbing intermediate layer activations compared to perturbing only the inputs. Results on the CIFAR-10 and CIFAR-100 datasets show the merits of the proposed adversarial training approach. Additional results on WideResNets show that our approach provides a significant improvement in classification accuracy for a given base model, outperforming dropout and base models of larger size. Comment: Published at the Thirty-Second AAAI Conference on Artificial Intelligence (AAAI-18). Official link: https://www.aaai.org/ocs/index.php/AAAI/AAAI18/paper/view/1663
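
    A minimal sketch of the layerwise mechanism, assuming the network is split into `features` and `head` modules: compute the loss gradient at an intermediate activation, add an FGSM-style perturbation there, and train on the perturbed forward pass. This illustrates the idea, not the paper's exact training recipe.

        import torch
        import torch.nn.functional as F

        def layerwise_adv_step(features, head, x, y, optimizer, eps=0.01):
            """One training step with the hidden activation perturbed."""
            h = features(x)                              # intermediate activations
            clean_loss = F.cross_entropy(head(h), y)
            g, = torch.autograd.grad(clean_loss, h, retain_graph=True)
            h_adv = h + eps * g.sign()                   # FGSM-style layer perturbation
            optimizer.zero_grad()
            adv_loss = F.cross_entropy(head(h_adv), y)
            adv_loss.backward()                          # gradients reach both parts
            optimizer.step()
            return adv_loss.item()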

    Detecting Deep-Fake Videos from Appearance and Behavior

    Synthetically generated audio and video -- so-called deep fakes -- continue to capture the imagination of the computer-graphics and computer-vision communities. At the same time, the democratization of access to technology that can create sophisticated manipulated video of anybody saying anything continues to be of concern because of its power to disrupt democratic elections, commit small- to large-scale fraud, fuel disinformation campaigns, and create non-consensual pornography. We describe a biometric-based forensic technique for detecting face-swap deep fakes. The technique combines a static biometric based on facial recognition with a temporal, behavioral biometric based on facial expressions and head movements, where the behavioral embedding is learned using a CNN with a metric-learning objective function. We show the efficacy of this approach across several large-scale video datasets, as well as on in-the-wild deep fakes.
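
    The behavioral-embedding component can be illustrated with a standard metric-learning setup: clips of the same person serve as positives, clips of different people as negatives, trained with a triplet margin loss. Everything below (`BehaviorCNN`, the tensor shapes, the margin) is a placeholder sketch, not the authors' model.

        import torch
        import torch.nn as nn

        class BehaviorCNN(nn.Module):            # toy stand-in for the real CNN
            def __init__(self, dim=128):
                super().__init__()
                self.net = nn.Sequential(
                    nn.Conv1d(64, 128, 3), nn.ReLU(), nn.AdaptiveAvgPool1d(1))
                self.fc = nn.Linear(128, dim)
            def forward(self, clip):             # clip: (B, 64 features, T frames)
                z = self.net(clip).squeeze(-1)
                return nn.functional.normalize(self.fc(z), dim=1)

        model = BehaviorCNN()
        triplet = nn.TripletMarginLoss(margin=0.2)
        anchor, positive, negative = (torch.randn(8, 64, 32) for _ in range(3))
        loss = triplet(model(anchor), model(positive), model(negative))
        loss.backward()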

    Unconstrained Facial Expression Transfer using Style-based Generator

    Facial expression transfer and reenactment is an important research problem given its applications in face editing, image manipulation, and fabricated video generation. We present a novel method for image-based facial expression transfer, leveraging the recent style-based GAN shown to be very effective at creating realistic-looking images. Given two face images, our method can create plausible results that combine the appearance of one image and the expression of the other. To achieve this, we first propose an optimization procedure based on StyleGAN to infer a hierarchical style vector from an image that disentangles different attributes of the face. We further introduce a linear combination scheme that fuses the style vectors of the two given images and generates a new face combining the expression and appearance of the inputs. Our method can create high-quality synthesis with accurate facial reenactment. Unlike many existing methods, we do not rely on geometry annotations and can be applied to unconstrained facial images of any identity without retraining, making it feasible to generate large-scale expression-transferred results.
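
    The linear combination scheme can be sketched as layer-wise mixing of two hierarchical style vectors, assuming both have already been inferred by the optimization procedure. The layer split, mixing weight, and `G.synthesize` call below are illustrative assumptions.

        import torch

        def mix_styles(w_a, w_b, expr_layers=range(0, 6), alpha=1.0):
            """w_a, w_b: (num_layers, 512) hierarchical style vectors;
            take appearance from w_a and expression layers from w_b."""
            w = w_a.clone()
            for l in expr_layers:    # coarse layers tend to carry pose/expression
                w[l] = alpha * w_b[l] + (1 - alpha) * w_a[l]
            return w

        # usage, assuming a generator G that accepts per-layer styles:
        # img = G.synthesize(mix_styles(w_a, w_b))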

    Learning from Synthetic Data: Addressing Domain Shift for Semantic Segmentation

    Visual domain adaptation is a problem of immense importance in computer vision. Previous approaches showcase the inability of even deep neural networks to learn informative representations across domain shift. This problem is more severe for tasks where acquiring hand-labeled data is extremely hard and tedious. In this work, we focus on adapting the representations learned by segmentation networks across synthetic and real domains. Contrary to previous approaches that use a simple adversarial objective or superpixel information to aid the process, we propose an approach based on Generative Adversarial Networks (GANs) that brings the embeddings closer in the learned feature space. To showcase the generality and scalability of our approach, we show that we can achieve state-of-the-art results on two challenging scenarios of synthetic-to-real domain adaptation. Additional exploratory experiments show that our approach: (1) generalizes to unseen domains and (2) results in improved alignment of source and target distributions. Comment: Accepted as a spotlight talk at CVPR 2018. Code available here: https://github.com/swamiviv/LSD-se
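
    The feature-space alignment can be illustrated with a simplified adversarial objective: a domain discriminator `D` learns to separate source (synthetic) embeddings from target (real) ones, while the feature extractor `F_net` learns to fool it. This is a generic sketch of the idea, not the paper's full GAN pipeline.

        import torch
        import torch.nn.functional as F

        def alignment_step(F_net, D, x_src, x_tgt, opt_f, opt_d):
            f_src, f_tgt = F_net(x_src), F_net(x_tgt)

            # discriminator step: source embeddings -> 1, target -> 0
            d_src, d_tgt = D(f_src.detach()), D(f_tgt.detach())
            d_loss = F.binary_cross_entropy_with_logits(d_src, torch.ones_like(d_src)) \
                   + F.binary_cross_entropy_with_logits(d_tgt, torch.zeros_like(d_tgt))
            opt_d.zero_grad(); d_loss.backward(); opt_d.step()

            # feature step: make target embeddings indistinguishable from source
            g_tgt = D(f_tgt)
            g_loss = F.binary_cross_entropy_with_logits(g_tgt, torch.ones_like(g_tgt))
            opt_f.zero_grad(); g_loss.backward(); opt_f.step()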

    Cultural Management Education in Southeast Asia

    Cultural Management could be defined as the process of planning, organizing, leading, and controlling material and nonmaterial culture to meet predetermined goals and objectives. Areas under the umbrella of Cultural Management include Arts Management, Museum Management, Cultural Heritage and Tourism, Cultural and Creative Industries, and Design Management. Over the last two decades, there has been a growing number of academic programmes befitting this umbrella, offered at undergraduate and postgraduate levels by institutions of higher education in Southeast Asia. This paper explores the current state of Cultural Management education in Southeast Asia and thereafter highlights possible synergies to align with ASEAN's agenda. Several qualitative research methods were adopted, including content analysis followed by thematic analysis, participant observations, and semi-structured interviews. The paper formally documents and discusses the Cultural Management curricula of 10 Southeast Asian nations using three key themes: top-down, bottom-up, and a combination of both. Thereafter, it proposes two ways in which institutions of higher education in Southeast Asia could better synergize to meet the six strategies listed in the ASEAN Strategic Plan for Culture and Arts 2016-2025.

    One-Shot Domain Adaptation For Face Generation

    In this paper, we propose a framework capable of generating face images that fall into the same distribution as a given one-shot example. We leverage a pre-trained StyleGAN model that has already learned the generic face distribution. Given the one-shot target, we develop an iterative optimization scheme that rapidly adapts the weights of the model to shift the output's high-level distribution to the target's. To generate images of the same distribution, we introduce a style-mixing technique that transfers the low-level statistics from the target to faces randomly generated with the model. With that, we are able to generate an unlimited number of faces that inherit the distribution of both generic human faces and the one-shot example. The newly generated faces can serve as augmented training data for other downstream tasks. Such a setting is appealing because it requires labeling very few examples, or even just one, in the target domain, which is often the case for real-world face manipulations that arise from a variety of unknown and unique distributions, each with extremely low prevalence. We show the effectiveness of our one-shot approach for detecting face manipulations and compare it with other few-shot domain adaptation methods qualitatively and quantitatively. Comment: Accepted to CVPR 202
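
    The iterative adaptation scheme can be sketched as a short fine-tuning loop that shifts the generator's output toward the one-shot target. `G`, the pivot latent `w_pivot`, and the plain L1 objective are simplifying assumptions; the paper's optimization is more elaborate.

        import torch

        def adapt_one_shot(G, w_pivot, target, steps=200, lr=1e-4):
            """Fine-tune the generator weights toward one target image."""
            opt = torch.optim.Adam(G.parameters(), lr=lr)
            for _ in range(steps):
                img = G(w_pivot)                    # re-synthesize with current weights
                loss = (img - target).abs().mean()  # pull output toward the one-shot
                opt.zero_grad(); loss.backward(); opt.step()
            return G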

    Intelligent Conversational Bot for Massive Online Open Courses (MOOCs)

    Massive Online Open Courses (MOOCs), introduced in 2008, have since drawn attention around the world both for their advantages and for criticism of their drawbacks. One issue in MOOCs, the lack of interactivity with the instructor, has brought conversational bots into the picture to fill this gap. In this study, a prototype MOOC conversational bot, MOOC-bot, is developed and integrated into a MOOC website to respond to learner inquiries using text or speech input. MOOC-bot uses the popular Artificial Intelligence Markup Language (AIML) to develop its knowledge base, leveraging AIML's capability to deliver appropriate responses, and can be quickly adapted to new knowledge domains. The system architecture of MOOC-bot consists of a knowledge base along with an AIML interpreter, a chat interface, the MOOC website, and the Web Speech API to provide speech recognition and speech synthesis. The initial MOOC-bot prototype has general knowledge from the past Loebner Prize winner ALICE, frequently asked questions, and content offered by Universiti Teknikal Malaysia Melaka (UTeM). An evaluation of MOOC-bot based on past competition questions from the Chatterbox Challenge (CBC) and the Loebner Prize showed that it was able to provide correct answers most of the time and demonstrated the capability to prolong a conversation. The advantages of MOOC-bot, such as providing 24-hour service across different time zones, holding knowledge in multiple domains, and being shareable by multiple sites simultaneously, outweigh its existing limitations.
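
    The knowledge-base-plus-interpreter pipeline described above can be exercised with the python-aiml package in a few lines; the startup file name and bootstrap command below are conventional placeholders, not MOOC-bot's actual files.

        import aiml

        kernel = aiml.Kernel()
        kernel.learn("std-startup.xml")     # placeholder startup file of AIML categories
        kernel.respond("LOAD AIML B")       # conventional bootstrap command
        print(kernel.respond("What is a MOOC?"))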

    New Benchmarks for Learning on Non-Homophilous Graphs

    Much data with graph structure satisfies the principle of homophily, meaning that connected nodes tend to be similar with respect to a specific attribute. As such, ubiquitous datasets for graph machine learning tasks have generally been highly homophilous, rewarding methods that leverage homophily as an inductive bias. Recent work has pointed out this particular focus, as new non-homophilous datasets have been introduced and graph representation learning models better suited for low-homophily settings have been developed. However, these datasets are small and poorly suited to truly testing the effectiveness of new methods in non-homophilous settings. We present a series of improved graph datasets with node label relationships that do not satisfy the homophily principle. Along with this, we introduce a new measure of the presence or absence of homophily that is better suited than existing measures across different regimes. We benchmark a range of simple methods and graph neural networks on our proposed datasets, drawing new insights for further research. Data and code can be found at https://github.com/CUAI/Non-Homophily-Benchmarks. Comment: In Workshop on Graph Learning Benchmarks (GLB 2021) at WWW 2021. 10 pages
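
    For reference, the standard edge-homophily ratio (the fraction of edges whose endpoints share a label) can be computed as below; the measure proposed in the paper corrects statistics of this kind for class imbalance, which this simple version does not.

        import torch

        def edge_homophily(edge_index, labels):
            """edge_index: (2, E) long tensor of edges; labels: (N,) node labels."""
            src, dst = edge_index
            return (labels[src] == labels[dst]).float().mean().item()

        # usage: h = edge_homophily(edge_index, y)   # near 1.0 homophilous, near 0 not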

    Adversarial Example Decomposition

    Research has shown that widely used deep neural networks are vulnerable to carefully crafted adversarial perturbations. Moreover, these adversarial perturbations often transfer across models. We hypothesize that adversarial weakness is composed of three sources of bias: architecture, dataset, and random initialization. We show that one can decompose adversarial examples into an architecture-dependent component, a data-dependent component, and a noise-dependent component, and that these components behave intuitively. For example, noise-dependent components transfer poorly to all other models, while architecture-dependent components transfer better to retrained models with the same architecture. In addition, we demonstrate that these components can be recombined to improve transferability without sacrificing efficacy on the original model. Comment: ICML 2019 Workshop on Security and Privacy of Machine Learning, camera-ready version
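
    The transfer measurements underlying the decomposition can be illustrated with a basic cross-model check: craft an FGSM perturbation on a source model and measure how often it fools a destination model. This is a generic sketch of that kind of measurement, not the paper's decomposition procedure.

        import torch
        import torch.nn.functional as F

        def fgsm(model, x, y, eps=8/255):
            """Single-step gradient-sign perturbation on the source model."""
            x = x.detach().clone().requires_grad_(True)
            loss = F.cross_entropy(model(x), y)
            grad, = torch.autograd.grad(loss, x)
            return (x + eps * grad.sign()).clamp(0, 1).detach()

        def transfer_rate(src_model, dst_model, x, y, eps=8/255):
            """Fraction of source-crafted examples that fool the destination model."""
            x_adv = fgsm(src_model, x, y, eps)
            return (dst_model(x_adv).argmax(1) != y).float().mean().item()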