Image-based family verification in the wild

Author: Serradilla Casado Oscar
Publication date: 15/09/2017

Facial image analysis has been an important subject of study in the pattern recognition and computer vision communities. Facial images contain much information about the person they belong to: identity, age, gender, ethnicity, expression and more. For that reason, the analysis of facial images has many applications in real-world problems such as face recognition, age estimation, gender classification and facial expression recognition. Visual kinship recognition is a new research topic within facial image analysis and is essential for many real-world applications. However, only a few practical vision systems are currently capable of handling such tasks. Hence, vision technology for kinship-based problems has not matured enough to be applied to real-world problems, which raises concerns about unsatisfactory performance on real-world datasets. Kinship verification determines whether a kin relation holds between a pair of given images. It can be viewed as a typical binary classification problem, i.e., a face pair is either related by kinship or it is not. Prior research has addressed the kinship types for which pre-existing datasets provide images, annotations and a verification task protocol, namely father-son, father-daughter, mother-son and mother-daughter. The main objective of this Master's thesis is the study and development of feature selection and fusion for the problem of family verification from facial images. To achieve this objective, the main task is a comparative study of face descriptors that includes classic descriptors as well as deep descriptors. The main contributions of this thesis are: 1. Studying the state of the art of the problem of family verification in images. 2. Implementing and comparing several criteria that correspond to different face representations (Local Binary Patterns (LBP), Histogram of Oriented Gradients (HOG), deep descriptors)
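
The abstract frames kinship verification as binary classification over a face pair and names LBP among the descriptors compared. A minimal sketch of that idea follows. It is not the thesis's pipeline: the basic 3x3 LBP variant, the chi-square distance, the function names and the `threshold` value are all assumptions chosen for illustration.

```python
def lbp_histogram(image):
    """Return a normalized 256-bin histogram of basic 3x3 LBP codes.

    `image` is a grayscale image given as a list of rows of numbers.
    Border pixels are skipped because they lack a full neighbourhood.
    """
    # Neighbour offsets, clockwise from the top-left pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    hist = [0] * 256
    rows, cols = len(image), len(image[0])
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            center = image[r][c]
            code = 0
            for bit, (dr, dc) in enumerate(offsets):
                # Set the bit when the neighbour is at least as bright.
                if image[r + dr][c + dc] >= center:
                    code |= 1 << bit
            hist[code] += 1
    total = sum(hist)
    return [h / total for h in hist] if total else hist


def chi_square_distance(h1, h2):
    """Chi-square distance between two histograms of equal length."""
    return sum((a - b) ** 2 / (a + b) for a, b in zip(h1, h2) if a + b > 0)


def is_kin(face_a, face_b, threshold=0.5):
    """Binary decision for a face pair; `threshold` is a hypothetical value."""
    distance = chi_square_distance(lbp_histogram(face_a), lbp_histogram(face_b))
    return distance < threshold
```

In practice the threshold would be replaced by a classifier trained on labelled pairs, and the histogram would usually be computed per image block rather than over the whole face.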

Archivo Digital para la Docencia y la Investigación

Action Recognition Using 3D Histograms of Texture and A Multi-Class Boosting Classifier

Author: Chen Chen
Han Jungong
Shao Ling
Yang Linlin
Yang Yun
Zhang Baochang
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 21/06/2017

Human action recognition is an important yet challenging task. This paper presents a low-cost descriptor called 3D histograms of texture (3DHoTs) to extract discriminant features from a sequence of depth maps. 3DHoTs are derived by projecting depth frames onto three orthogonal Cartesian planes, i.e., the frontal, side, and top planes, and thus compactly characterize the salient information of a specific action; texture features are then calculated on these projections to represent the action. Besides this fast feature descriptor, a new multi-class boosting classifier (MBC) is also proposed to efficiently exploit different kinds of features in a unified framework for action classification. Compared with existing boosting frameworks, we add a new multi-class constraint to the objective function, which helps to maintain a better margin distribution by maximizing the mean of the margin while still minimizing its variance. Experiments on the MSRAction3D, MSRGesture3D, MSRActivity3D, and UTD-MHAD data sets demonstrate that the proposed system combining 3DHoTs and MBC is superior to the state of the art
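
The projection step described in the abstract can be illustrated with a short sketch. The abstract does not give the exact projection or the texture computation, so the following makes assumptions: depth values are already quantized to integers in `[1, depth_bins)`, zero marks missing data, and the side and top views are simple occupancy maps. The function name is hypothetical.

```python
def project_depth_frame(depth, depth_bins):
    """Project one depth frame onto the frontal, side and top planes.

    `depth` is a list of rows of integer depth values; 0 means "no data".
    Returns (front, side, top):
      front[y][x] holds the depth value itself (frontal view),
      side[y][z]  is 1 if any pixel in row y has depth z,
      top[z][x]   is 1 if any pixel in column x has depth z.
    """
    rows, cols = len(depth), len(depth[0])
    front = [[0] * cols for _ in range(rows)]
    side = [[0] * depth_bins for _ in range(rows)]
    top = [[0] * cols for _ in range(depth_bins)]
    for y in range(rows):
        for x in range(cols):
            z = depth[y][x]
            if z <= 0 or z >= depth_bins:
                continue  # skip missing or out-of-range measurements
            front[y][x] = z
            side[y][z] = 1
            top[z][x] = 1
    return front, side, top
```

A texture descriptor would then be computed on each of the three maps, typically after accumulating them over the frames of a sequence.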

Crossref

Lancaster E-Prints

University of East Anglia digital repository

University of Central Florida (UCF): STARS (Showcase of Text, Archives, Research & Scholarship)

Towards Visual Localization, Mapping and Moving Objects Tracking by a Mobile Robot: a Geometric and Probabilistic Approach

Author: Sola Joan
Publication venue: Institut National Polytechnique de Toulouse
Publication date: 02/02/2007

In this thesis we give new means for a machine to understand complex and dynamic visual scenes in real time. In particular, we solve the problem of simultaneously reconstructing a representation of the world's geometry, the observer's trajectory, and the moving objects' structures and trajectories, with the aid of exteroceptive vision sensors. We proceed by dividing the problem into three main steps. First, we give a solution to the Simultaneous Localization And Mapping (SLAM) problem for monocular vision that performs adequately in the most ill-conditioned situations: those where the observer approaches the scene in a straight line. Second, we incorporate full instantaneous 3D observability by duplicating the vision hardware while keeping monocular algorithms. This lets us avoid some of the inherent drawbacks of classic stereo systems, notably their limited range of 3D observability and the need for frequent mechanical calibration. Third, we add detection and tracking of moving objects by making use of this full 3D observability, which we judge to be almost indispensable. We choose a sparse, point-based representation of both the world and the moving objects in order to lighten the computational load of the image processing algorithms required to extract the necessary geometrical information from the images. This reduction is further supported by active feature detection and search mechanisms that focus attention on the image regions of highest interest. This focusing is achieved by extensive exploitation of the knowledge currently available in the system (all the mapped information), which we finally highlight as the ultimate key to success

Thèses en Ligne

Scientific Publications of the University of Toulouse II Le Mirail

Open Archive Toulouse Archive Ouverte

Institut National Polytechnique de Toulouse (Theses)

HAL-INSA Toulouse

Lidar-based scale recovery dense SLAM for UAV navigation

Imagine an autonomous agent (drone, robot, car, ...) that needs to navigate inside an unknown environment. The first questions it must answer to accomplish such a task are: Where am I? Where are the objects surrounding me? The SLAM algorithm can answer both questions simultaneously, in an on-line manner. This thesis focuses on the implementation of a monocular SLAM algorithm on a UAV framework, where the sparse map classically obtained is densified by means of a Convolutional Neural Network and properly scaled through 2D lidar measurements
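
Monocular depth is recovered only up to an unknown scale, which is why the abstract brings in lidar. The abstract does not say how the scale is computed, so the sketch below assumes one common and simple choice: the median ratio between lidar ranges and predicted depths at matching pixels. The function names and the correspondence format are hypothetical.

```python
from statistics import median


def estimate_scale(predicted, lidar):
    """Estimate the metric scale of an up-to-scale depth prediction.

    `predicted` and `lidar` are equal-length sequences holding, for the
    same pixels, the network's depth and the lidar range in metres.
    Non-positive values are treated as invalid and ignored.
    The median keeps the estimate robust to a few bad correspondences.
    """
    ratios = [m / p for p, m in zip(predicted, lidar) if p > 0 and m > 0]
    if not ratios:
        raise ValueError("no valid depth/lidar correspondences")
    return median(ratios)


def rescale(depth_map, scale):
    """Apply the estimated scale to every pixel of a dense depth map."""
    return [[d * scale for d in row] for row in depth_map]
```

For example, `rescale(depth_map, estimate_scale(predicted, lidar))` turns a relative depth map into a metric one under the assumption that a single global scale factor is sufficient.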

Padua Thesis and Dissertation Archive

Two and three dimensional segmentation of multimodal imagery

Author: Vantaram Sreenath Rao
Publication venue: RIT Scholar Works
Publication date: 01/10/2012

The role of segmentation in image understanding and analysis, computer vision, pattern recognition, remote sensing and medical imaging has grown significantly in recent years, owing to accelerated scientific advances in the acquisition of image data. This low-level analysis protocol is critical to numerous applications, with the primary goal of expediting and improving the effectiveness of subsequent high-level operations by providing a condensed and pertinent representation of image information. In this research, we propose a novel unsupervised segmentation framework for meaningful segregation of 2-D/3-D image data across multiple modalities (color, remote-sensing and biomedical imaging) into non-overlapping partitions using several spatial-spectral attributes. Initially, our framework exploits the information obtained from detecting edges inherent in the data. To this end, using a vector gradient detection technique, pixels without edges are grouped and individually labeled to partition an initial portion of the input image content. Pixels with higher gradient densities are included through the dynamic generation of segments as the algorithm progresses, producing an initial region map. Subsequently, texture modeling is performed, and the obtained gradient, texture and intensity information, along with the initial partition map, are used in a multivariate refinement procedure that fuses groups with similar characteristics, yielding the final output segmentation. Experimental results, compared against published state-of-the-art segmentation techniques for color as well as multi/hyperspectral imagery, demonstrate the advantages of the proposed method. Furthermore, to improve computational efficiency, we propose an extension of this methodology in a multi-resolution framework, demonstrated on color images. Finally, this research also encompasses a 3-D extension of the algorithm, demonstrated on medical (Magnetic Resonance Imaging / Computed Tomography) volumes
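
The first stage described above, grouping and labeling pixels that contain no edges, can be sketched as follows. This is a simplification under stated assumptions: it works on a single-channel image with a forward-difference gradient instead of the vector gradient the abstract mentions, uses a fixed `threshold`, and labels regions by 4-connected flood fill. The function name is hypothetical.

```python
from collections import deque


def label_low_gradient_regions(image, threshold):
    """Label 4-connected groups of pixels whose gradient is <= threshold.

    `image` is a list of rows of numbers. Returns a label map of the same
    shape: 0 marks edge pixels left unassigned, 1..n mark initial regions.
    """
    rows, cols = len(image), len(image[0])

    def gradient(r, c):
        # Forward differences, clamped at the image border.
        dx = image[r][min(c + 1, cols - 1)] - image[r][c]
        dy = image[min(r + 1, rows - 1)][c] - image[r][c]
        return abs(dx) + abs(dy)

    smooth = [[gradient(r, c) <= threshold for c in range(cols)]
              for r in range(rows)]
    labels = [[0] * cols for _ in range(rows)]
    next_label = 0
    for r in range(rows):
        for c in range(cols):
            if not smooth[r][c] or labels[r][c]:
                continue
            # Start a new region and grow it with a breadth-first flood fill.
            next_label += 1
            labels[r][c] = next_label
            queue = deque([(r, c)])
            while queue:
                cr, cc = queue.popleft()
                for nr, nc in ((cr - 1, cc), (cr + 1, cc),
                               (cr, cc - 1), (cr, cc + 1)):
                    if (0 <= nr < rows and 0 <= nc < cols
                            and smooth[nr][nc] and not labels[nr][nc]):
                        labels[nr][nc] = next_label
                        queue.append((nr, nc))
    return labels
```

The pixels left at 0 correspond to the higher-gradient pixels that, according to the abstract, are absorbed into segments in later passes.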

RIT Scholar Works

Texture and Colour in Image Analysis

Publication venue: 'MDPI AG'
Publication date: 11/01/2022

Research in colour and texture has experienced major changes in the last few years. This book presents some recent advances in the field, specifically in the theory and applications of colour texture analysis. This volume also features benchmarks, comparative evaluations and reviews

Directory of Open Access Books (DOAB)