88 research outputs found

MIMIC: Mask Image Pre-training with Mix Contrastive Fine-tuning for Facial Expression Recognition

Author: Guo Xiaobao
Kot Alex
Peng Xiaojiang
Zhang Fan
Publication date: 14/01/2024

Cutting-edge research in facial expression recognition (FER) currently favors convolutional neural network (CNN) backbones that are pre-trained with supervision on face recognition datasets for feature extraction. However, due to the vast scale of face recognition datasets and the high cost of collecting facial labels, this pre-training paradigm incurs significant expense. To this end, we propose to pre-train vision Transformers (ViTs) through a self-supervised approach on a mid-scale general image dataset. In addition, compared with the domain disparity between face datasets and FER datasets, the divergence between general datasets and FER datasets is more pronounced. Therefore, we propose a contrastive fine-tuning approach to effectively mitigate this domain disparity. Specifically, we introduce a novel FER training paradigm named Mask Image pre-training with MIx Contrastive fine-tuning (MIMIC). In the initial phase, we pre-train the ViT via masked image reconstruction on general images. Subsequently, in the fine-tuning stage, we introduce a mix-supervised contrastive learning process, which uses a mixing strategy to provide the model with a more extensive range of positive samples. Through extensive experiments conducted on three benchmark datasets, we demonstrate that our MIMIC outperforms the previous training paradigm, showing its capability to learn better representations. Remarkably, the results indicate that the vanilla ViT can achieve impressive performance without the need for intricate, auxiliary-designed modules. Moreover, when scaling up the model size, MIMIC exhibits no performance saturation and is superior to the current state-of-the-art methods.

arXiv.org e-Print Archive

Fusion hand gesture segmentation and extraction based on CMOS sensor and 3D sensor

Author: Chen Disi
Jiang Guozhang
Kong Jianyi
Li Gongfa
Li Jiahan
Liu Honghai
Sun Ying
Publication venue: 'Inderscience Publishers'
Publication date: 01/12/2017

Portsmouth University Research Portal (Pure)

Expressivity in Natural and Artificial Systems

Author: LaViers Amy
Publication venue: 'MDPI AG'
Publication date: 05/07/2018

Roboticists are trying to replicate animal behavior in artificial systems. Yet, quantitative bounds on the capacity of a moving platform (natural or artificial) to express information in the environment are not known. This paper presents a measure for the capacity of motion complexity -- the expressivity -- of articulated platforms (both natural and artificial) and shows that this measure is stagnant and unexpectedly limited in extant robotic systems. This analysis indicates trends in increasing capacity in both internal and external complexity for natural systems, while artificial, robotic systems have increased significantly in the capacity of computational (internal) states but remained more or less constant in mechanical (external) state capacity. This work presents a way to analyze trends in animal behavior and shows that robots are not capable of the same multi-faceted behavior in rich, dynamic environments as natural systems. Comment: Rejected from Nature, after review and appeal, July 4, 2018 (submitted May 11, 2018)

arXiv.org e-Print Archive

Multidisciplinary Digital Publishing Institute

Directory of Open Access Journals

Author: Rektor der TU Ilmenau
Publication date: 20/10/2010

Table of Contents with links to the conference papers

Digitale Bibliothek Thüringen

Action for perception : active object recognition and pose estimation in cluttered environments

Author: Wu Kanzhi
Publication date: 01/01/2017

University of Technology Sydney. Faculty of Engineering and Information Technology. Object recognition and localisation are indispensable competencies for service robots in everyday environments like offices and kitchens. The presence of similar objects that can only be differentiated by a small part of their surface, together with clutter that leads to occlusions, makes it impossible to detect target objects accurately and reliably from a single observation. When the sensor observing the environment is mounted on a mobile platform, object detection and pose estimation can be facilitated by observing the environment from a series of different viewpoints. Computing active perception strategies, with the aim of finding optimal actions to enhance object recognition and pose estimation performance, is the focus of this thesis. This thesis consists of two main parts. The first part focuses on object detection and pose estimation from a single frame of observation. Using an RGB-D sensor, we propose a modular 3D textured object detection and pose estimation framework which can recognise objects in cluttered environments by taking advantage of the geometric information provided by the sensor. To handle less-textured objects and objects under severe illumination conditions, we propose a novel RGB-D feature which is robust to illumination, scale, rotation and viewpoint variations, and provides reliable feature matching results under challenging conditions. The proposed feature is validated for multiple applications including object detection and point cloud alignment. Parts of the above approaches are integrated with existing work to produce a practical and effective perception module for a warehouse automation task. The designed perception system can detect objects of different types and estimate their poses robustly, thus guaranteeing reliable object grasping and manipulation performance.
In the second part of the thesis, we investigate the problem of active object detection and pose estimation from two perspectives: with and without considering the uncertainties in the motion model and the observation model. First, we propose a model-driven active object recognition and pose estimation system that exploits the feature association probability under scale and viewpoint variations. By explicitly modelling the feature association, the proposed system can predict future information more accurately, thus laying the foundation for a successful active Next-Best-View planning system even with a naive greedy search technique. We also present a probabilistic framework which handles motion and observation uncertainties in the active object detection and pose estimation problem. We present an optimisation framework which computes the optimal control at each step, using an objective function which incorporates uncertainties in state estimation, feature coverage for better recognition confidence, and control consumption. The proposed framework can handle various issues such as object initialisation, collision avoidance, occlusion and changing the object hypothesis. Validations based on a simulation environment are also presented.

OPUS - University of Technology Sydney

Modeling and Simulation in Engineering

Publication venue: 'IntechOpen'
Publication date: 20/04/2021

This book provides an open platform to establish and share knowledge developed by scholars, scientists, and engineers from all over the world about various applications of modeling and simulation in the design process of products, in various engineering fields. The book consists of 12 chapters arranged in two sections (3D Modeling and Virtual Prototyping), reflecting the multidimensionality of applications related to modeling and simulation. Some of the most recent modeling and simulation techniques, as well as some of the most accurate and sophisticated software for treating complex systems, are applied. All the original contributions in this book are joined by the basic principle of a successful modeling and simulation process: as complex as necessary, and as simple as possible. The idea is to manipulate the simplifying assumptions in a way that reduces the complexity of the model (in order to make a real-time simulation), but without altering the precision of the results.

Directory of Open Access Books (DOAB)

Systematic modelling of embedded systems: Managing non-formal aspects

Author: Marincic Jelena
Publication venue: University of Twente
Publication date: 01/09/2023

University of Twente Research Information

Robot Calibration: Modeling Measurement and Applications

Author: Jose Mauricio S. T. Motta
Publication venue: 'IntechOpen'
Publication date: 01/12/2006

IntechOpen

A Benchmark and Evaluation of Non-Rigid Structure from Motion

Author: Aanæs Henrik
Del Bue Alessio
Doest Mads Emil Brix
Jensen Sebastian Hoppe Nesgaard
Publication date: 26/04/2018

Non-rigid structure from motion (NRSfM) is a long-standing and central problem in computer vision, allowing us to obtain 3D information from multiple images when the scene is dynamic. A main issue regarding the further development of this important computer vision topic is the lack of high-quality data sets. We here address this issue by presenting a data set compiled for this purpose, which is made publicly available and is considerably larger than the previous state of the art. To validate the applicability of this data set, and to provide an investigation into the state of the art of NRSfM, including potential directions forward, we here present a benchmark and a scrupulous evaluation using this data set. This benchmark evaluates 16 different methods with available code, which we argue reasonably spans the state of the art in NRSfM. We also hope that the presented public data set and evaluation will provide benchmark tools for further development in this field.

arXiv.org e-Print Archive

Online Research Database In Technology