
    WarpedGANSpace: Finding non-linear RBF paths in GAN latent space

    This work addresses the problem of discovering, in an unsupervised manner, interpretable paths in the latent space of pretrained GANs, so as to provide an intuitive and easy way of controlling the underlying generative factors. In doing so, it addresses some of the limitations of the state-of-the-art works, namely, a) that they discover directions that are independent of the latent code, i.e., paths that are linear, and b) that their evaluation relies either on visual inspection or on laborious human labeling. More specifically, we propose to learn non-linear warpings on the latent space, each one parametrized by a set of RBF-based latent space warping functions, where each warping gives rise to a family of non-linear paths via the gradient of the function. Building on the work of Voynov and Babenko that discovers linear paths, we optimize the trainable parameters of the set of RBFs so that images generated by codes along different paths are easily distinguishable by a discriminator network. This leads to easily distinguishable image transformations, such as pose and facial expressions in facial images. We show that linear paths can be derived as a special case of our method, and show experimentally that non-linear paths in the latent space lead to steeper, more disentangled and interpretable changes in the image space than state-of-the-art methods, both qualitatively and quantitatively. We make the code and the pretrained models publicly available at: https://github.com/chi0tzp/WarpedGANSpace.
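
    A minimal sketch of the traversal idea described above, assuming one warping function of the form f(z) = Σ_i α_i exp(-γ_i ||z - c_i||²) and a fixed unit-norm step along its gradient. The parametrization and step rule are illustrative assumptions; in the actual method the RBF parameters are the quantities trained against the discriminator, and the code here only shows the geometry of a non-linear path.

```python
import numpy as np

def rbf_warp_grad(z, centers, alphas, gammas):
    """Gradient of f(z) = sum_i alpha_i * exp(-gamma_i * ||z - c_i||^2).

    The normalized gradient defines the non-linear traversal direction at z.
    """
    diff = z[None, :] - centers                                  # (K, d)
    w = alphas * np.exp(-gammas * np.sum(diff ** 2, axis=1))     # (K,)
    return -2.0 * np.sum((gammas * w)[:, None] * diff, axis=0)   # (d,)

def traverse(z0, centers, alphas, gammas, eps=0.2, n_steps=10):
    """Follow the warping's gradient field to obtain a non-linear latent path."""
    z, path = z0.copy(), [z0.copy()]
    for _ in range(n_steps):
        g = rbf_warp_grad(z, centers, alphas, gammas)
        z = z + eps * g / (np.linalg.norm(g) + 1e-8)             # unit-norm step
        path.append(z.copy())
    return np.stack(path)

# toy usage: one warping with K = 4 support points in a 512-d latent space
rng = np.random.default_rng(0)
d, K = 512, 4
path = traverse(rng.standard_normal(d),
                rng.standard_normal((K, d)),
                rng.standard_normal(K),
                np.full(K, 0.01))
print(path.shape)  # (11, 512): latent codes to decode with the pretrained generator
```

    Decoding each code along the returned path with the pretrained generator yields the corresponding image sequence; because the gradient field changes with z, the path is non-linear, unlike a single fixed direction.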

    Phenex: Ontological Annotation of Phenotypic Diversity

    Phenex is a platform-independent desktop application designed to facilitate efficient and consistent annotation of phenotypic variation using Entity-Quality syntax, drawing on terms from community ontologies for anatomical entities, phenotypic qualities, and taxonomic names. Despite the centrality of the phenotype to so much of biology, traditions for communicating information about phenotypes are idiosyncratic to different disciplines. Phenotypes seem to elude standardized descriptions due to the variety of traits that compose them and the difficulty of capturing the complex forms and subtle differences among organisms that we can readily observe. Consequently, phenotypes are refractory to attempts at data integration that would allow computational analyses across studies and study systems. Phenex addresses this problem by allowing scientists to employ standard ontologies and syntax to link computable phenotype annotations to evolutionary character matrices, as well as to link taxa and specimens to ontological identifiers. Ontologies have become a foundational technology for establishing shared semantics, and, more generally, for capturing and computing with biological knowledge.
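
    The Entity-Quality (EQ) pattern itself is easy to picture as a data structure: an anatomical entity term paired with a quality term and tied to a character state and taxon. The sketch below is purely illustrative; the field names and placeholder identifiers are assumptions and do not reflect Phenex's actual data model or file format.

```python
from dataclasses import dataclass

@dataclass
class EQAnnotation:
    """An Entity-Quality phenotype statement tied to a character state.

    In practice, entity terms would come from an anatomy ontology (e.g. UBERON),
    quality terms from PATO, and taxa from a taxonomy ontology.
    """
    entity_label: str   # anatomical entity, e.g. "pectoral fin"
    entity_id: str      # ontology identifier for the entity (placeholder here)
    quality_label: str  # phenotypic quality, e.g. "absent"
    quality_id: str     # ontology identifier for the quality (placeholder here)
    character: str      # the character in the evolutionary matrix
    state: str          # the character state this annotation formalizes
    taxon: str          # taxon or specimen the state is scored for

# purely illustrative instance; the identifiers are placeholders, not real term IDs
ann = EQAnnotation(
    entity_label="pectoral fin", entity_id="UBERON:<id>",
    quality_label="absent", quality_id="PATO:<id>",
    character="Pectoral fin", state="absent", taxon="<taxon name>",
)
print(f"{ann.taxon}: {ann.entity_label} ({ann.entity_id}) is {ann.quality_label} ({ann.quality_id})")
```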

    FS-DETR: Few-shot detection transformer with prompting and without re-training

    This paper is on Few-Shot Object Detection (FSOD), where given a few templates (examples) depicting a novel class (not seen during training), the goal is to detect all of its occurrences within a set of images. From a practical perspective, an FSOD system must fulfil the following desiderata: (a) it must be used as is, without requiring any fine-tuning at test time, (b) it must be able to process an arbitrary number of novel objects concurrently while supporting an arbitrary number of examples from each class, and (c) it must achieve accuracy comparable to a closed system. Towards satisfying (a)-(c), in this work, we make the following contributions: We introduce, for the first time, a simple, yet powerful, few-shot detection transformer (FS-DETR) based on visual prompting that can address both desiderata (a) and (b). Our system builds upon the DETR framework, extending it based on two key ideas: (1) feed the provided visual templates of the novel classes as visual prompts during test time, and (2) “stamp” these prompts with pseudo-class embeddings (akin to soft prompting), which are then predicted at the output of the decoder. Importantly, we show that our system is not only more flexible than existing methods, but also makes a step towards satisfying desideratum (c). Specifically, it is significantly more accurate than all methods that do not require fine-tuning and even matches or outperforms the current state-of-the-art fine-tuning-based methods on the most well-established benchmarks (PASCAL VOC & MSCOCO).
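
    The prompt "stamping" idea can be sketched as follows, assuming pooled template features and a plain DETR-style decoder input; the module and variable names are hypothetical and the shapes are illustrative, not the actual FS-DETR implementation.

```python
import torch
import torch.nn as nn

class PromptStamping(nn.Module):
    """Illustrative 'stamping' of visual prompts with pseudo-class embeddings.

    Template features are assumed to be pooled to one vector per example by an
    external backbone; one learnable pseudo-class embedding exists per class slot.
    """
    def __init__(self, d_model=256, max_classes=10):
        super().__init__()
        self.pseudo_class_emb = nn.Embedding(max_classes, d_model)

    def forward(self, template_feats, class_slots):
        # template_feats: (num_templates, d_model) pooled features of the visual examples
        # class_slots:    (num_templates,) pseudo-class slot assigned to each template
        return template_feats + self.pseudo_class_emb(class_slots)

# usage: 3 templates covering 2 novel classes, stamped and prepended to the decoder input
d_model = 256
stamper = PromptStamping(d_model)
templates = torch.randn(3, d_model)
slots = torch.tensor([0, 0, 1])
prompts = stamper(templates, slots)                   # (3, d_model) stamped visual prompts
object_queries = torch.randn(100, d_model)            # standard DETR object queries
decoder_input = torch.cat([prompts, object_queries])  # processed jointly by the decoder;
# per the abstract, the pseudo-class assignments are then predicted at the decoder output
print(decoder_input.shape)  # torch.Size([103, 256])
```

    Because the pseudo-class embeddings are tied to slots rather than to fixed class names, the same mechanism can accommodate arbitrary novel classes and arbitrary numbers of examples at test time without re-training, which is what desiderata (a) and (b) require.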

    ReGen: A good Generative zero-shot video classifier should be Rewarded

    This paper sets out to solve the following problem: How can we turn a generative video captioning model into an open-world video/action classification model? Video captioning models can naturally produce open-ended free-form descriptions of a given video which, however, might not be discriminative enough for video/action recognition. Unfortunately, when fine-tuned to auto-regress the class names directly, video captioning models overfit the base classes, losing their open-world zero-shot capabilities. To alleviate base class overfitting, in this work, we propose to use reinforcement learning to enforce the output of the video captioning model to be more class-level discriminative. Specifically, we propose ReGen, a novel reinforcement learning-based framework with a three-fold reward objective: (1) a class-level discrimination reward that enforces the generated caption to be correctly classified into the corresponding action class, (2) a CLIP reward that encourages the generated caption to continue to be descriptive of the input video (i.e. video-specific), and (3) a grammar reward that preserves the grammatical correctness of the caption. We show that ReGen can train a model to produce captions that are discriminative, video-specific and grammatically correct. Importantly, when evaluated on standard benchmarks for zero- and few-shot action classification, ReGen significantly outperforms the previous state-of-the-art.
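
    A rough sketch of how the three rewards could be combined into a single scalar; the weighting, the stand-in scorers, and the function names are assumptions for illustration, not the paper's actual formulation.

```python
import torch

def regen_style_reward(caption, video_feats, true_class,
                       classify, embed_text, grammar_score,
                       w_cls=1.0, w_clip=1.0, w_gram=0.5):
    """Weighted sum of three caption rewards (weights are assumptions)."""
    # (1) class-level discrimination: probability mass the action classifier
    #     assigns to the ground-truth class given only the generated caption
    r_cls = classify(caption)[true_class]
    # (2) CLIP-style reward: similarity between the caption embedding and the
    #     video features, keeping the caption specific to this video
    r_clip = torch.cosine_similarity(embed_text(caption), video_feats, dim=-1)
    # (3) grammar reward: score from a grammaticality model
    r_gram = grammar_score(caption)
    return w_cls * r_cls + w_clip * r_clip + w_gram * r_gram

# toy usage with stand-in scorers (the real counterparts would be a caption-based
# action classifier, a CLIP text encoder, and a grammar scorer)
video_feats = torch.randn(512)
classify = lambda c: torch.softmax(torch.randn(400), dim=0)  # over 400 action classes
embed_text = lambda c: torch.randn(512)
grammar_score = lambda c: torch.tensor(0.9)
r = regen_style_reward("a person slicing vegetables", video_feats, 17,
                       classify, embed_text, grammar_score)
print(float(r))
```

    A scalar of this kind would then drive the reinforcement-learning update (e.g., a policy-gradient step) on the captioning model, which is how, per the abstract, class-level discrimination is enforced without sacrificing descriptiveness or grammar.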

    HyperReenact: one-shot reenactment via jointly learning to refine and retarget faces

    In this paper, we present our method for neural face reenactment, called HyperReenact, that aims to generate realistic talking head images of a source identity, driven by a target facial pose. Existing state-of-the-art face reenactment methods train controllable generative models that learn to synthesize realistic facial images, yet produce reenacted faces that are prone to significant visual artifacts, especially under the challenging condition of extreme head pose changes, or require expensive few-shot fine-tuning to better preserve the source identity characteristics. We propose to address these limitations by leveraging the photorealistic generation ability and the disentangled properties of a pretrained StyleGAN2 generator: we first invert the real images into its latent space and then use a hypernetwork to perform (i) refinement of the source identity characteristics and (ii) facial pose re-targeting, eliminating in this way the dependence on external editing methods that typically produce artifacts. Our method operates under the one-shot setting (i.e., using a single source frame) and allows for cross-subject reenactment, without requiring any subject-specific fine-tuning. We compare our method both quantitatively and qualitatively against several state-of-the-art techniques on the standard benchmarks of VoxCeleb1 and VoxCeleb2, demonstrating the superiority of our approach in producing artifact-free images and exhibiting remarkable robustness even under extreme head pose changes. We make the code and the pretrained models publicly available at: https://github.com/StelaBou/HyperReenact.
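
    The overall data flow described above can be summarized in a few lines, with every interface treated as an assumption; the callables below are stand-ins, not the released implementation.

```python
import torch

def reenact_one_shot(source_img, target_img, invert, extract_feats, hypernet, generator):
    """Data flow of a HyperReenact-style pipeline; all interfaces are assumed."""
    w_src = invert(source_img)                           # GAN inversion of the single source frame
    h = hypernet(extract_feats(source_img), extract_feats(target_img))
    # h: predicted updates used to (i) refine the source identity characteristics
    # and (ii) retarget the facial pose to the driving frame inside the generator
    return generator(w_src, h)

# toy usage with stand-in callables; the real counterparts are a StyleGAN2 inversion
# encoder, a feature extractor, the hypernetwork, and the pretrained StyleGAN2 generator
img = torch.randn(1, 3, 256, 256)
invert = lambda x: torch.randn(1, 18, 512)               # assumed W+ latent layout
extract_feats = lambda x: torch.randn(1, 512)
hypernet = lambda s, t: torch.randn(1, 512)              # placeholder for predicted offsets
generator = lambda w, h: torch.randn(1, 3, 256, 256)     # reenacted frame
print(reenact_one_shot(img, img, invert, extract_feats, hypernet, generator).shape)
```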