Vectorization and Rasterization: Self-Supervised Learning for Sketch and Handwriting
Self-supervised learning has gained prominence due to its efficacy at
learning powerful representations from unlabelled data that achieve excellent
performance on many challenging downstream tasks. However, supervision-free
pre-text tasks are challenging to design and are usually modality-specific.
Although there is a rich literature of self-supervised methods for either
spatial (such as images) or temporal (such as sound or text) data modalities, a common
pre-text task that benefits both modalities is largely missing. In this paper,
we are interested in defining a self-supervised pre-text task for sketches and
handwriting data. This data is uniquely characterised by its existence in dual
modalities of rasterized images and vector coordinate sequences. We address and
exploit this dual representation by proposing two novel cross-modal translation
pre-text tasks for self-supervised feature learning: Vectorization and
Rasterization. Vectorization learns to map image space to vector coordinates
and rasterization maps vector coordinates to image space. We show that our
learned encoder modules benefit both raster-based and vector-based downstream
approaches to analysing hand-drawn data. Empirical evidence shows that our
novel pre-text tasks surpass existing single- and multi-modal self-supervision
methods.
Comment: IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 2021.
Code: https://github.com/AyanKumarBhunia/Self-Supervised-Learning-for-Sketc
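To make the two translation directions concrete, here is a minimal PyTorch sketch of the idea: a Vectorizer maps a raster sketch to a coordinate sequence, a Rasterizer maps a coordinate sequence back to an image, and each modality serves as free supervision for the other. All module names, architectures, shapes, and losses here are illustrative assumptions, not the authors' implementation (the linked repository has the real one).

```python
# Hypothetical sketch of the two cross-modal pre-text tasks; architectures
# and losses are assumptions for illustration, not the paper's exact setup.
import torch
import torch.nn as nn

class Vectorizer(nn.Module):
    """Raster image -> sequence of (x, y, pen_state) points."""
    def __init__(self, feat_dim=256, max_points=100):
        super().__init__()
        self.cnn = nn.Sequential(            # raster encoder, reusable downstream
            nn.Conv2d(1, 32, 3, 2, 1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, 2, 1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, feat_dim),
        )
        self.decoder = nn.GRU(feat_dim, feat_dim, batch_first=True)
        self.head = nn.Linear(feat_dim, 3)   # (x, y, pen) per step
        self.max_points = max_points

    def forward(self, img):
        z = self.cnn(img)                                   # (B, feat_dim)
        steps = z.unsqueeze(1).repeat(1, self.max_points, 1)
        h, _ = self.decoder(steps)
        return self.head(h)                                 # (B, T, 3)

class Rasterizer(nn.Module):
    """Sequence of points -> raster image."""
    def __init__(self, feat_dim=256, out_hw=64):
        super().__init__()
        self.rnn = nn.GRU(3, feat_dim, batch_first=True)    # vector encoder
        self.fc = nn.Linear(feat_dim, out_hw * out_hw)
        self.out_hw = out_hw

    def forward(self, seq):
        _, h = self.rnn(seq)                                # (1, B, feat_dim)
        img = torch.sigmoid(self.fc(h[-1]))
        return img.view(-1, 1, self.out_hw, self.out_hw)

# One self-supervised step: each modality supervises the other, no labels.
img = torch.rand(8, 1, 64, 64)     # rasterized sketches
seq = torch.rand(8, 100, 3)        # matching vector coordinate sequences
vec_loss = nn.functional.mse_loss(Vectorizer()(img), seq)   # vectorization
ras_loss = nn.functional.mse_loss(Rasterizer()(seq), img)   # rasterization
(vec_loss + ras_loss).backward()
```

After pre-training, the CNN encoder would serve raster-based downstream tasks and the GRU encoder vector-based ones, which is the dual benefit the abstract claims.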
Towards Practicality of Sketch-Based Visual Understanding
Sketches have been used to conceptualise and depict visual objects since
pre-historic times. Sketch research has flourished in the past decade,
particularly with the proliferation of touchscreen devices. Much of the
utilisation of sketch has been anchored in the fact that it can delineate
visual concepts universally, irrespective of age, race, language, or
demography. The fine-grained, interactive nature of sketches facilitates their
application to various visual understanding tasks, such as image retrieval,
image generation and editing, segmentation, and 3D shape modelling. However,
sketches are highly abstract and subjective, varying with the perception of
each individual. Although most agree that sketches give the user fine-grained
control in depicting a visual object, many consider sketching a tedious
process owing to limited sketching skill, compared with other query/support
modalities such as text or tags. Furthermore, collecting fine-grained
sketch-photo associations is a significant bottleneck to commercialising
sketch applications.
Therefore, this thesis aims to progress sketch-based visual understanding
towards more practicality.
Comment: PhD thesis successfully defended by Ayan Kumar Bhunia. Supervisor:
Prof. Yi-Zhe Song; Thesis Examiners: Prof. Stella Yu and Prof. Adrian Hilton.
Deep Learning for Free-Hand Sketch: A Survey
Free-hand sketches are highly illustrative, and have been widely used by
humans to depict objects or stories from ancient times to the present. The
recent prevalence of touchscreen devices has made sketch creation a much easier
task than ever and consequently made sketch-oriented applications increasingly
popular. The progress of deep learning has immensely benefited free-hand sketch
research and applications. This paper presents a comprehensive survey of the
deep learning techniques oriented at free-hand sketch data, and the
applications that they enable. The main contents of this survey include: (i) A
discussion of the intrinsic traits and unique challenges of free-hand sketch,
to highlight the essential differences between sketch data and other data
modalities, e.g., natural photos. (ii) A review of the developments of
free-hand sketch research in the deep learning era, by surveying existing
datasets, research topics, and the state-of-the-art methods through a detailed
taxonomy and experimental evaluation. (iii) Promotion of future work via a
discussion of bottlenecks, open problems, and potential research directions for
the community.
Comment: This paper is accepted by IEEE TPAMI.
Playing Atari with Deep Reinforcement Learning
We present the first deep learning model to successfully learn control
policies directly from high-dimensional sensory input using reinforcement
learning. The model is a convolutional neural network, trained with a variant
of Q-learning, whose input is raw pixels and whose output is a value function
estimating future rewards. We apply our method to seven Atari 2600 games from
the Arcade Learning Environment, with no adjustment of the architecture or
learning algorithm. We find that it outperforms all previous approaches on six
of the games and surpasses a human expert on three of them.
Comment: NIPS Deep Learning Workshop 2013.
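The abstract describes a CNN that maps raw pixels to a value estimate per action, trained with a variant of Q-learning. Below is a minimal PyTorch sketch of that core loop on a single batch of transitions; the network layout follows the commonly reported 16/32-filter design, but all shapes and hyperparameters here are illustrative assumptions rather than the paper's exact configuration.

```python
# Hypothetical one-step Q-learning update on raw pixels; details are
# assumptions for illustration, not the paper's exact setup.
import torch
import torch.nn as nn

class QNetwork(nn.Module):
    def __init__(self, n_actions, in_frames=4):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_frames, 16, 8, stride=4), nn.ReLU(),
            nn.Conv2d(16, 32, 4, stride=2), nn.ReLU(),
            nn.Flatten(),
            nn.Linear(32 * 9 * 9, 256), nn.ReLU(),
            nn.Linear(256, n_actions),   # one Q-value estimate per action
        )

    def forward(self, x):
        return self.net(x)

q = QNetwork(n_actions=6)
opt = torch.optim.RMSprop(q.parameters(), lr=2.5e-4)
gamma = 0.99

# A sampled batch of transitions (s, a, r, s', done); random stand-ins here.
s  = torch.rand(32, 4, 84, 84)
s2 = torch.rand(32, 4, 84, 84)
a = torch.randint(0, 6, (32,))
r = torch.rand(32)
done = torch.zeros(32)

# One-step Q-learning target: r + gamma * max_a' Q(s', a') for non-terminal s'.
with torch.no_grad():
    target = r + gamma * (1 - done) * q(s2).max(dim=1).values
pred = q(s).gather(1, a.unsqueeze(1)).squeeze(1)
loss = nn.functional.smooth_l1_loss(pred, target)
opt.zero_grad(); loss.backward(); opt.step()
```

The same network and update are applied unchanged across games; only the number of valid actions differs, which is what "no adjustment of the architecture or learning algorithm" refers to.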