108 research outputs found
Multi-Content GAN for Few-Shot Font Style Transfer
In this work, we focus on the challenge of taking partial observations of
highly-stylized text and generalizing the observations to generate unobserved
glyphs in the ornamented typeface. To generate a set of multi-content images
following a consistent style from very few examples, we propose an end-to-end
stacked conditional GAN model considering content along channels and style
along network layers. Our proposed network transfers the style of given glyphs
to the contents of unseen ones, capturing highly stylized fonts found in the
real world, such as those on movie posters or infographics. We seek to transfer
both the typographic stylization (e.g., serifs and ears) and the textual
stylization (e.g., color gradients and effects). We base our experiments on our
collected dataset of 10,000 fonts with different styles and demonstrate
effective generalization from a very small number of observed glyphs.
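As an illustrative aside (a toy sketch, not the authors' implementation; the resolution and pixel values are assumptions), the "content along channels" idea amounts to stacking all 26 glyphs of a font as channels of a single input tensor, with unobserved glyphs zeroed so the network must generate them:

```python
# Toy sketch of "content along channels": the 26 glyphs of a font become 26
# image channels of one tensor, so the generator sees every observed glyph at
# once when predicting the missing ones.

GLYPHS = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
H = W = 4  # toy resolution; real glyph images would be larger

def make_input(observed):
    """Build a 26-channel input: observed glyph channels carry (stand-in)
    pixel data, unobserved channels are zeroed out."""
    tensor = []
    for g in GLYPHS:
        if g in observed:
            channel = [[1.0] * W for _ in range(H)]  # stand-in for real pixels
        else:
            channel = [[0.0] * W for _ in range(H)]
        tensor.append(channel)
    return tensor

x = make_input({"A", "B", "E"})
observed_channels = sum(1 for ch in x if ch[0][0] == 1.0)
```

The payoff of this layout is that style consistency can be enforced across channels in a single forward pass rather than per glyph.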
Neural Font Rendering
Recent advances in deep learning techniques and applications have
revolutionized artistic creation and manipulation in many domains (text,
images, music); however, fonts have not yet been integrated with deep learning
architectures in a manner that supports their multi-scale nature. In this work
we aim to bridge this gap, proposing a network architecture capable of
rasterizing glyphs in multiple sizes, potentially paving the way for easy and
accessible creation and manipulation of fonts.
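The multi-scale nature mentioned above can be illustrated with a toy sketch (an assumption for illustration, not the paper's architecture): a glyph defined resolution-independently, here by a signed distance function, rasterized at several pixel sizes, which is the size-dependent behaviour a neural renderer would have to learn:

```python
import math

def sdf_disc(u, v, radius=0.4):
    """Signed distance from point (u, v) in [0,1]^2 to a disc 'glyph'."""
    return math.hypot(u - 0.5, v - 0.5) - radius

def rasterize(size):
    """Sample the SDF on a size x size grid; inside (< 0) becomes ink (1)."""
    img = []
    for y in range(size):
        row = []
        for x in range(size):
            u, v = (x + 0.5) / size, (y + 0.5) / size
            row.append(1 if sdf_disc(u, v) < 0 else 0)
        img.append(row)
    return img

# the same glyph rendered at three sizes, as a multi-size renderer must do
renders = {s: rasterize(s) for s in (8, 16, 32)}
```

Real fonts add size-specific details (hinting, optical sizes) that a plain SDF cannot express, which is exactly the gap such a network would target.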
Parameter-Saving Adversarial Training: Reinforcing Multi-Perturbation Robustness via Hypernetworks
Adversarial training serves as one of the most popular and effective methods
to defend against adversarial perturbations. However, most defense mechanisms
only consider a single type of perturbation, while various attack methods may
be adopted to mount stronger adversarial attacks against the deployed model
in real-world scenarios. Defending against
multiple attack types is challenging because multi-perturbation
adversarial training and its variants achieve only suboptimal robustness
trade-offs, owing to the theoretical limit on multi-perturbation robustness for a
single model. Moreover, it is impractical to deploy large models in
storage-constrained scenarios. To address these drawbacks, in this paper we
propose a novel multi-perturbation adversarial training framework,
parameter-saving adversarial training (PSAT), to reinforce multi-perturbation
robustness with an advantageous side effect of saving parameters, which
leverages hypernetworks to train specialized models against a single
perturbation and aggregate these specialized models to defend against multiple
perturbations. Eventually, we extensively evaluate and compare our proposed
method with state-of-the-art single/multi-perturbation robust methods against
various latest attack methods on different datasets, showing the robustness
superiority and parameter efficiency of our proposed method: for example, on the
CIFAR-10 dataset with a ResNet-50 backbone, PSAT saves approximately 80%
of parameters while achieving the state-of-the-art robustness trade-off
accuracy.
Comment: 9 pages, 2 figures
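The parameter saving can be made concrete with a back-of-envelope sketch (all sizes below are assumed toy figures, not the paper's): a chunked hypernetwork maps small learned embeddings to the weights of a target layer, so the k perturbation-specialised models share one weight generator instead of k full weight copies:

```python
# Toy parameter accounting for a chunked hypernetwork (illustrative numbers).

target_params = 1_000_000   # weights of the layer being generated
embed_dim = 8               # per-chunk, per-perturbation embedding size
hidden = 16                 # hypernetwork hidden width
k = 3                       # number of perturbation types defended against
chunk = 1_000               # weights emitted per hypernetwork call
n_chunks = target_params // chunk

separate = k * target_params                      # k independent full models
generator = embed_dim * hidden + hidden * chunk   # shared weight generator
embeddings = k * n_chunks * embed_dim             # cheap per-chunk codes
hypernet_total = generator + embeddings

saving = 1 - hypernet_total / separate            # fraction of params avoided
```

Under these toy numbers the shared generator plus embeddings is a small fraction of the cost of three independent models, which is the effect the abstract's 80% figure describes.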
Experimental Thinking / Design Practices
The Interaction Research Studio was invited to show the Energy Babble as part of the Experimental Thinking/Design Practices exhibition at Griffith University Art Gallery, Queensland College of Art, Australia. The Energy Babble was developed during the Energy and Co-Designing Communities project, a joint project between Goldsmiths’ departments of Design and Sociology.
Improved Deep Neural Networks for Generative Robotic Grasping
This thesis provides a thorough evaluation of current state-of-the-art robotic grasping methods and contributes to a subset of data-driven grasp estimation approaches, termed generative models. These models aim to directly generate grasp region proposals from a given image without the need for a separate analysis and ranking step, which can be computationally expensive. This approach allows for fully end-to-end training of a model and quick closed-loop operation of a robot arm.
A number of limitations within these generative models are identified and addressed. Contributions are proposed that directly target each stage of the training pipeline, helping to form accurate grasp proposals and generalise better to unseen objects. Firstly, inspired by theories of object manipulation within the mammalian visual system, the use of multi-task learning in existing generative architectures is evaluated. This aims to improve the performance of grasping algorithms when presented with impoverished colour (RGB) data by training models to perform simultaneous tasks such as object categorisation, saliency detection, and depth reconstruction. Secondly, a novel loss function is introduced which improves overall performance by rewarding the network for focusing only on learning grasps at suitable positions. This reduces overall training times and results in better performance on fewer training examples. The last contribution analyses the problems with the most common metric used for evaluating and comparing offline performance between different grasping models and algorithms. To this end, a Gaussian method of representing ground-truth labelled grasps is put forward, with optimal grasp locations tested in a simulated grasping environment.
The combination of these novel additions to generative models results in improved grasp success, accuracy, and performance on common benchmark datasets compared to previous approaches. Furthermore, the efficacy of these contributions is also tested when transferred to a physical robotic arm, demonstrating the ability to effectively grasp previously unseen 3D printed objects of varying complexity and difficulty without the need for domain adaptation. Finally, future directions are discussed for generative convolutional models within the overall field of robotic grasping.
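Two of the ideas above can be sketched together (a hedged illustration; names, shapes, and the exact weighting are assumptions, not the thesis code): ground-truth grasps become a Gaussian quality map rather than a hard label, and the loss is weighted by that map so the network is rewarded for getting suitable grasp positions right rather than penalised for irrelevant background:

```python
import math

def gaussian_map(h, w, cy, cx, sigma=1.5):
    """2D Gaussian quality map centred on the labelled grasp point."""
    return [[math.exp(-((y - cy) ** 2 + (x - cx) ** 2) / (2 * sigma ** 2))
             for x in range(w)] for y in range(h)]

def weighted_l2(pred, target):
    """Squared error weighted by grasp quality, so pixels far from any
    labelled grasp contribute almost nothing to the loss."""
    total = 0.0
    for prow, trow in zip(pred, target):
        for p, t in zip(prow, trow):
            total += t * (p - t) ** 2   # weight is the target quality itself
    return total

gt = gaussian_map(7, 7, 3, 3)                       # grasp labelled at (3, 3)
perfect = weighted_l2(gt, gt)                        # exact match: zero loss
blank = weighted_l2([[0.0] * 7 for _ in range(7)], gt)  # missed grasp: penalised
```

The same Gaussian representation also gives a smoother basis for offline evaluation than an all-or-nothing rectangle overlap.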
Navigating the Latent: Exploring the Potentials of Islamic Calligraphy with Generative Adversarial Networks
Islamic calligraphy is a substantial cultural element in the countries that share Arabic script. Despite its prevalence and importance, Islamic calligraphy has not significantly benefited from modern technologies, and there are considerable gaps between the affordances of digital tools and the subtle requirements of this domain. This project explores the use of Generative Adversarial Networks (GANs) as an option that can fill the gap between digital tools and fundamental aspects of Islamic calligraphy, and as a new tool that can expand the creative space of artists and designers who use Islamic calligraphy in their works. This study also promotes an informed approach toward using GANs with a focus on the domain-specific requirements of Islamic calligraphy. Some of the potentials of using GANs in Islamic calligraphy are depicted through analyzing the results of a GAN trained on a custom-made and regularized dataset of nasta’līq script. The results are also used to make calligraphy pieces with a new mode of expression.
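"Navigating the latent" typically means producing new samples by walking between latent codes of a trained generator. A minimal sketch of that interpolation logic (the generator itself is omitted; vector sizes and values here are arbitrary assumptions):

```python
import random

def lerp(z0, z1, t):
    """Linear interpolation between two latent vectors, t in [0, 1]."""
    return [a + t * (b - a) for a, b in zip(z0, z1)]

random.seed(0)
z_start = [random.gauss(0, 1) for _ in range(8)]  # latent code of one sample
z_end = [random.gauss(0, 1) for _ in range(8)]    # latent code of another

# five evenly spaced codes; feeding each to the generator would yield a
# gradual morph between the two calligraphy-like outputs
path = [lerp(z_start, z_end, i / 4) for i in range(5)]
```

Each intermediate code, decoded by the GAN, yields a plausible in-between piece, which is the expanded creative space the abstract refers to.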
Towards Robust Deep Neural Networks
Deep neural networks (DNNs) enable state-of-the-art performance for most machine
learning tasks. Unfortunately, they are vulnerable to attacks, such as Trojans during
training and Adversarial Examples at test time. Adversarial Examples are inputs
with carefully crafted perturbations added to benign samples. In the Computer
Vision domain, although the perturbations are imperceptible to humans, Adversarial
Examples can successfully mislead or fool DNNs. Meanwhile, Trojan or backdoor
attacks involve attackers tampering with the training process, for example, to inject
poisoned training data to embed a backdoor into the network that can be activated
during model deployment when the Trojan triggers (known only to the attackers)
appear in the model’s inputs. This dissertation investigates methods of building robust
DNNs against these training-time and test-time threats.
Recognising the threat of Adversarial Examples in the malware domain, this research
considers the problem of realising a robust DNN-based malware detector against Adversarial
Example attacks by developing a Bayesian adversarial learning algorithm. In contrast
to vision tasks, adversarial learning is hard in a domain without a differentiable
or invertible mapping function from the problem space (such as software code
inputs) to the feature space. The study proposes an alternative: performing
adversarial learning in the feature space, and proving that the projection into
the feature space of perturbed yet valid malware in the problem space is a
subset of feature-space adversarial attacks. The Bayesian approach improves
benign performance, provably bounds the difference between adversarial risk
and empirical risk, and improves robustness
against increasingly large attack budgets not employed during training.
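The feature-space attack used inside such adversarial learning can be sketched in miniature (an illustrative simplification, not the thesis method: the model, features, and budget below are all toy assumptions): since mapping a perturbed feature vector back to valid software is hard, the perturbation is applied directly in feature space, here as an FGSM-style signed-gradient step on a linear malware scorer:

```python
def sign(v):
    return 1.0 if v > 0 else (-1.0 if v < 0 else 0.0)

w = [0.5, -1.0, 2.0]   # toy linear malware scorer: score = w . x
x = [1.0, 0.0, 1.0]    # feature vector of a malware sample
eps = 0.1              # attack budget in feature space (L-infinity)

# for a linear model the gradient of the score w.r.t. the features is w;
# step against the score so the malware looks more benign
x_adv = [xi - eps * sign(wi) for xi, wi in zip(x, w)]

score = sum(wi * xi for wi, xi in zip(w, x))
score_adv = sum(wi * xi for wi, xi in zip(w, x_adv))
```

Training on such feature-space perturbations covers, by the subset argument above, the attacks realisable from actual modified programs.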
To improve the robustness of DNNs against Adversarial Examples in the Computer
Vision domain, the research considers the problem of developing a Bayesian
learning algorithm that realises a robust DNN against such attacks. Accordingly, a novel
Bayesian learning method is designed that conceptualises an information gain objective
to measure and force the information learned from both benign and Adversarial
Examples to be similar. This method proves that minimising this information gain
objective further tightens the bound on the difference between adversarial risk
and empirical risk, moving towards a basis for a principled method of
adversarially training BNNs.
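The information-gain idea can be illustrated with a toy objective (an illustration only, not the thesis' derivation; the distributions are made-up softmax outputs): force the model's predictive distribution on an adversarial input to stay close to the one on the benign input, measured here with a KL divergence that a training loop would minimise:

```python
import math

def kl(p, q):
    """KL(p || q) for two discrete probability distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

p_benign = [0.7, 0.2, 0.1]       # softmax output on the clean input
p_adv_far = [0.1, 0.2, 0.7]      # attack succeeded: distributions disagree
p_adv_near = [0.65, 0.22, 0.13]  # robust model: distributions stay similar

gap_far = kl(p_benign, p_adv_far)    # large divergence, large "gain"
gap_near = kl(p_benign, p_adv_near)  # small divergence, little learned from attack
```

Driving the adversarial distribution towards the benign one during training is what keeps the divergence, and hence the adversarial risk gap, small.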
Recognising the threat from backdoor or Trojan attacks against DNNs, the research
considers the problem of finding a robust defence method that is effective against Trojan
attacks. The research explores a new idea in the domain: sanitisation of inputs, and
proposes Februus to neutralise highly potent and insidious Trojan attacks on DNN
systems at run-time. In Trojan attacks, an adversary activates a backdoor crafted in
a deep neural network model using a secret trigger, a Trojan, applied to any input
to alter the model’s decision to a target prediction, a target determined by and
known only to the attacker. Februus sanitises the incoming input by surgically removing the
potential trigger artifacts and restoring the input for the classification task. Februus
enables effective Trojan mitigation by sanitising inputs with no loss of performance
for sanitised inputs, trojaned or benign. This method is highly effective at defending
against advanced Trojan attack variants as well as challenging, adaptive attacks where
attackers have full knowledge of the defence method.
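A crude sketch of the sanitisation pipeline (the real system locates triggers with a visual-explanation method and restores the region with a GAN inpainter; both are replaced here by stand-ins, and all coordinates and values are assumptions): excise the suspected trigger patch, then fill it from the surrounding image before classification:

```python
def sanitise(img, y0, x0, size):
    """Remove a size x size suspected-trigger patch and inpaint it with the
    mean of the untouched pixels (a stand-in for a learned inpainter)."""
    h, w = len(img), len(img[0])
    keep = [img[y][x] for y in range(h) for x in range(w)
            if not (y0 <= y < y0 + size and x0 <= x < x0 + size)]
    fill = sum(keep) / len(keep)
    out = [row[:] for row in img]
    for y in range(y0, y0 + size):
        for x in range(x0, x0 + size):
            out[y][x] = fill
    return out

img = [[0.5] * 6 for _ in range(6)]                   # toy grayscale input
img[0][0] = img[0][1] = img[1][0] = img[1][1] = 1.0   # bright 2x2 "trigger"
clean = sanitise(img, 0, 0, 2)                        # trigger excised, restored
```

Because the restoration preserves the rest of the input, benign inputs pass through essentially unchanged, which is the "no loss of performance" property claimed above.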
Investigating the connections between Trojan attacks and spatially constrained
Adversarial Examples or so-called Adversarial Patches in the input space, the research
exposes an emerging threat: an attack exploiting the vulnerability of a DNN to generate
naturalistic adversarial patches as universal triggers. For the first time, a method based
on Generative Adversarial Networks is developed to exploit a GAN’s latent space to
search for universal naturalistic adversarial patches. The proposed attack’s advantage
is its ability to exert a high level of control, enabling attackers to craft naturalistic
adversarial patches that are highly effective, robust against state-of-the-art DNNs, and
deployable in the physical world without needing to interfere with the model building
process or risking discovery. Until now, this has only been demonstrably possible
using Trojan attack methods.
Thesis (Ph.D.) -- University of Adelaide, School of Computer Science, 202
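The latent-space search for a universal patch can be sketched as a simple loop (everything below is a stand-in: a toy "generator" and a toy surrogate classifier; the real work uses a trained GAN and target DNNs): sample latent codes, render the patch each code implies, and keep the code whose patch fools the surrogate on the most inputs:

```python
import random

random.seed(1)

def generate_patch(z):
    """Stand-in generator: latent code -> scalar patch 'intensity'."""
    return sum(z) / len(z)

def fooled(patch, x):
    """Toy surrogate: an input is misclassified when the patch outweighs it."""
    return patch > x

inputs = [random.random() for _ in range(50)]  # stand-in dataset

best_z, best_rate = None, -1.0
for _ in range(200):                            # random search over the latent
    z = [random.gauss(0, 1) for _ in range(4)]
    patch = generate_patch(z)
    rate = sum(fooled(patch, x) for x in inputs) / len(inputs)
    if rate > best_rate:
        best_z, best_rate = z, rate
```

Because the search stays inside the generator's latent space, every candidate patch remains naturalistic by construction, which is the control the attack exploits.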
Investigating human-perceptual properties of "shapes" using 3D shapes and 2D fonts
Shapes are generally used to convey meaning. They are used in video games, films and other multimedia, in diverse ways. 3D shapes may be destined for virtual scenes or represent objects to be constructed in the real world. Fonts add character to an otherwise plain block of text, allowing the writer to make important points more visually prominent or distinct from other text. They can indicate the structure of a document at a glance. Rather than studying shapes through traditional geometric shape descriptors, we provide alternative methods to describe and analyse shapes through the lens of human perception. This is done via the concepts of Schelling Points and Image Specificity. Schelling Points are choices people make when they aim to match with what they expect others to choose but cannot communicate with others to determine an answer. We study whole-mesh selections in this setting, where Schelling Meshes are the most frequently selected shapes. The key idea behind Image Specificity is that different images evoke different descriptions, but ‘Specific’ images yield more consistent descriptions than others. We apply Specificity to 2D fonts. We show that each concept can be learned and predicted for fonts and 3D shapes, respectively, using a depth image-based convolutional neural network. Results are shown for a range of fonts and 3D shapes, and we demonstrate that font Specificity and the Schelling Meshes concept are useful for visualisation, clustering, and search applications. Overall, we find that each concept represents similarities between their respective type of shape, even when there are discontinuities between the shape geometries themselves. The ‘context’ of these similarities is some kind of abstract or subjective meaning that is consistent among different people.
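The Specificity idea can be made concrete with a toy scorer (an illustrative simplification, not the thesis' measure: real Specificity uses sentence-level similarity over crowdsourced descriptions, replaced here by word overlap): a font is "Specific" when different people's descriptions of it agree:

```python
def jaccard(a, b):
    """Word-set overlap between two free-text descriptions."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb)

def specificity(descriptions):
    """Mean pairwise similarity of the descriptions of one font."""
    pairs = [(i, j) for i in range(len(descriptions))
             for j in range(i + 1, len(descriptions))]
    return sum(jaccard(descriptions[i], descriptions[j])
               for i, j in pairs) / len(pairs)

specific_font = specificity(["bold gothic serif", "bold gothic serif",
                             "heavy gothic serif"])   # people agree
vague_font = specificity(["thin script", "plain sans",
                          "rounded display"])          # people diverge
```

A high score marks fonts whose character is consistently perceived, which is what makes Specificity useful for the visualisation and search applications mentioned above.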