108 research outputs found

    Semantically tied paired cycle consistency for any-shot sketch-based image retrieval

    This is the final version. Available from the publisher via the DOI in this record.
    Low-shot sketch-based image retrieval is an emerging task in computer vision that aims to retrieve natural images relevant to hand-drawn sketch queries that are rarely seen during the training phase. Related prior works either require aligned sketch-image pairs, which are costly to obtain, or rely on an inefficient memory fusion layer for mapping the visual information to a semantic space. In this paper, we address any-shot, i.e. zero-shot and few-shot, sketch-based image retrieval (SBIR) tasks, and introduce the few-shot setting for SBIR. To solve these tasks, we propose a semantically aligned paired cycle-consistent generative adversarial network (SEM-PCYC) for any-shot SBIR, where each branch of the generative adversarial network maps the visual information from sketch and image to a common semantic space via adversarial training. Each of these branches maintains cycle consistency that requires supervision only at the category level and avoids the need for aligned sketch-image pairs. A classification criterion on the generators' outputs ensures that the visual-to-semantic mapping is class-specific. Furthermore, we propose to combine textual and hierarchical side information via an auto-encoder that selects discriminating side information within the same end-to-end model. Our results demonstrate a significant boost in any-shot SBIR performance over the state-of-the-art on the extended versions of the challenging Sketchy, TU-Berlin and QuickDraw datasets.
    Funding: European Union: Marie Skłodowska-Curie Grant; European Research Council (ERC)
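    As a rough illustration of the training objective this abstract describes, the hypothetical PyTorch sketch below combines the three ingredients it names: adversarial mapping of sketch and image features into a semantic space, cycle consistency back to the visual space, and a classification term on the generators' outputs. Module names, dimensions, and loss weights are illustrative assumptions, not the authors' released implementation.

```python
# Hypothetical sketch of the SEM-PCYC training objective described in the abstract.
# Network sizes, layer choices, and loss weights are placeholders; only the overall
# structure (visual -> semantic generators, cycle back to the visual space,
# adversarial and classification terms) follows the text.
import torch
import torch.nn as nn

VIS_DIM, SEM_DIM, NUM_CLASSES = 512, 300, 100  # placeholder dimensions


def mlp(in_dim, out_dim):
    return nn.Sequential(nn.Linear(in_dim, 512), nn.ReLU(), nn.Linear(512, out_dim))


# One branch per modality: a forward generator (visual -> semantic) and a backward
# generator (semantic -> visual) for the cycle, plus a discriminator that is trained
# (its own update is omitted here) to tell generated codes from real side-information
# embeddings, and a classifier that keeps the codes class-specific.
gen_sketch, gen_image = mlp(VIS_DIM, SEM_DIM), mlp(VIS_DIM, SEM_DIM)
back_sketch, back_image = mlp(SEM_DIM, VIS_DIM), mlp(SEM_DIM, VIS_DIM)
disc = nn.Sequential(nn.Linear(SEM_DIM, 256), nn.ReLU(), nn.Linear(256, 1))
classifier = nn.Linear(SEM_DIM, NUM_CLASSES)

bce, ce, l1 = nn.BCEWithLogitsLoss(), nn.CrossEntropyLoss(), nn.L1Loss()


def generator_loss(x_sketch, x_image, labels):
    """Adversarial + cycle-consistency + classification loss for one batch.

    x_sketch, x_image: visual features of sketches and images of the same categories
    labels:            category indices (category-level supervision only)
    """
    s_sketch, s_image = gen_sketch(x_sketch), gen_image(x_image)

    # Adversarial term: generated semantic codes should look "real" to the discriminator.
    adv = bce(disc(s_sketch), torch.ones(len(x_sketch), 1)) + \
          bce(disc(s_image), torch.ones(len(x_image), 1))

    # Cycle consistency: map back to the visual space and reconstruct the input features.
    cyc = l1(back_sketch(s_sketch), x_sketch) + l1(back_image(s_image), x_image)

    # Classification on the generators' outputs keeps the semantic codes discriminative.
    cls = ce(classifier(s_sketch), labels) + ce(classifier(s_image), labels)

    return adv + 10.0 * cyc + cls  # relative weights are arbitrary placeholders
```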

    Semantically Tied Paired Cycle Consistency for Zero-Shot Sketch-based Image Retrieval

    This is the author accepted manuscript. The final version is available from IEEE via the DOI in this record.
    Zero-shot sketch-based image retrieval (SBIR) is an emerging task in computer vision that aims to retrieve natural images relevant to sketch queries that might not have been seen in the training phase. Existing works either require aligned sketch-image pairs or rely on an inefficient memory fusion layer for mapping the visual information to a semantic space. In this work, we propose a semantically aligned paired cycle-consistent generative (SEM-PCYC) model for zero-shot SBIR, where each branch maps the visual information to a common semantic space via adversarial training. Each of these branches maintains a cycle consistency that requires supervision only at the category level and avoids the need for costly aligned sketch-image pairs. A classification criterion on the generators' outputs ensures that the visual-to-semantic mapping is discriminating. Furthermore, we propose to combine textual and hierarchical side information via a feature selection auto-encoder that selects discriminating side information within the same end-to-end model. Our results demonstrate a significant boost in zero-shot SBIR performance over the state-of-the-art on the challenging Sketchy and TU-Berlin datasets.
    Funding: European Union Horizon 2020
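    The feature selection auto-encoder mentioned in both abstracts can be pictured with the hypothetical sketch below: textual and hierarchical embeddings of a category are concatenated and compressed, and a sparsity penalty on the code encourages keeping only the discriminating components. Dimensions and the penalty weight are illustrative assumptions, not values from the paper.

```python
# Hypothetical sketch of the side-information auto-encoder: fuse textual (e.g.
# word-vector) and hierarchical (e.g. taxonomy-based) category embeddings into a
# compact code. All sizes and the sparsity weight are illustrative placeholders.
import torch
import torch.nn as nn

TXT_DIM, HIER_DIM, SEM_DIM = 300, 200, 300  # placeholder dimensions

encoder = nn.Sequential(nn.Linear(TXT_DIM + HIER_DIM, SEM_DIM), nn.ReLU())
decoder = nn.Linear(SEM_DIM, TXT_DIM + HIER_DIM)
mse = nn.MSELoss()


def side_info_loss(txt_emb, hier_emb, sparsity_weight=1e-3):
    """Reconstruction + sparsity loss; the encoder output is the fused side information."""
    x = torch.cat([txt_emb, hier_emb], dim=1)
    code = encoder(x)
    # An L1 penalty on the code pushes non-discriminating components towards zero.
    return mse(decoder(code), x) + sparsity_weight * code.abs().mean(), code
```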

    Zero-Shot Learning - The Good, the Bad and the Ugly

    Grounding Visual Explanations

    Large Loss Matters in Weakly Supervised Multi-Label Classification

    Audio-visual Generalised Zero-shot Learning with Cross-modal Attention and Language

    Gaze Embeddings for Zero-Shot Image Classification

    Semi-Supervised and Unsupervised Deep Visual Learning: A Survey

    State-of-the-art deep learning models are often trained with a large amount of costly labeled training data. However, requiring exhaustive manual annotations may degrade the model's generalizability in the limited-label regime. Semi-supervised learning and unsupervised learning offer promising paradigms to learn from an abundance of unlabeled visual data. Recent progress in these paradigms has indicated the strong benefits of leveraging unlabeled data to improve model generalization and provide better model initialization. In this survey, we review the recent advanced deep learning algorithms on semi-supervised learning (SSL) and unsupervised learning (UL) for visual recognition from a unified perspective. To offer a holistic understanding of the state-of-the-art in these areas, we propose a unified taxonomy. We categorize existing representative SSL and UL with comprehensive and insightful analysis to highlight their design rationales in different learning scenarios and applications in different computer vision tasks. Lastly, we discuss the emerging trends and open challenges in SSL and UL to shed light on future critical research directions.
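    As a concrete taste of the semi-supervised paradigm this survey covers, the sketch below shows a minimal pseudo-labeling training step: a supervised loss on a small labeled batch plus a loss on confidently self-labeled unlabeled images. It is a generic textbook recipe, not an algorithm taken from the survey; the model, batches, and confidence threshold are placeholders.

```python
# Minimal pseudo-labeling step, one classic semi-supervised learning recipe.
# `model`, the batches, and the confidence threshold are illustrative placeholders.
import torch
import torch.nn.functional as F


def ssl_step(model, optimizer, labeled_batch, unlabeled_batch, threshold=0.95):
    (x_l, y_l), x_u = labeled_batch, unlabeled_batch

    # Supervised term on the (small) labeled batch.
    loss = F.cross_entropy(model(x_l), y_l)

    # Pseudo-labels: keep only confident predictions on the unlabeled batch.
    with torch.no_grad():
        probs = F.softmax(model(x_u), dim=1)
        conf, pseudo = probs.max(dim=1)
        mask = conf >= threshold
    if mask.any():
        loss = loss + F.cross_entropy(model(x_u[mask]), pseudo[mask])

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```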