9 research outputs found

    Quasi Spin Images

    The increasing adoption of 3D capturing equipment, now also found in mobile devices, means that 3D content is increasingly prevalent. Common operations on such data, including 3D object recognition and retrieval, are based on measuring the similarity between 3D objects. A common way to measure object similarity is through local shape descriptors, which aim to do part-to-part matching by describing portions of an object's shape. The Spin Image is one of the local descriptors most suitable for use in scenes with high degrees of clutter and occlusion, but its practical use has been hampered by high computational demands. The rise in processing power of the GPU represents an opportunity to significantly improve the generation and comparison performance of descriptors such as the Spin Image, thereby increasing the practical applicability of methods that make use of it. In this paper we introduce a GPU-based Quasi Spin Image (QSI) algorithm, a variation of the original Spin Image, and show that a speedup of an order of magnitude relative to a reference CPU implementation can be achieved in terms of the image generation rate. In addition, the QSI is noise free, can be computed consistently, and a preliminary evaluation shows that it correlates well with the original Spin Image.
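
    For readers unfamiliar with the descriptor named in the abstract above, the following is a minimal sketch of how a classic Spin Image can be accumulated for one oriented point. It is not the QSI algorithm and not the authors' GPU code. The function name `spin_image`, its parameters, and the nearest-bin accumulation are assumptions made for illustration; the original Spin Image formulation uses bilinear interpolation and a support-angle filter, both omitted here.

```python
import math

def spin_image(p, n, points, bin_size, image_width):
    """Accumulate a simple spin image for the oriented point (p, n).

    p, n   : 3-tuples; n is assumed to be a unit-length surface normal.
    points : iterable of 3-tuples (the surface samples).
    Returns an image_width x image_width grid of counts.
    Rows index beta (signed height along n), columns index alpha
    (radial distance from the line through p along n).
    """
    image = [[0] * image_width for _ in range(image_width)]
    beta_max = bin_size * image_width / 2.0  # image is centred on beta = 0
    for x in points:
        d = [x[k] - p[k] for k in range(3)]
        beta = sum(d[k] * n[k] for k in range(3))
        # Clamp at zero to guard against tiny negative rounding errors.
        alpha = math.sqrt(max(0.0, sum(c * c for c in d) - beta * beta))
        col = int(alpha // bin_size)
        row = int((beta_max - beta) // bin_size)
        if 0 <= row < image_width and 0 <= col < image_width:
            image[row][col] += 1  # nearest-bin vote (assumption, see above)
    return image

# Hypothetical usage: two points at the same radius and height fall in the
# same bin, which is the rotation invariance about the normal.
img = spin_image((0, 0, 0), (0, 0, 1),
                 [(1, 0, 0), (0, 1, 0), (0, 0, 1)],
                 bin_size=0.5, image_width=4)
```

    Every point contributes independently to the image, which is one reason descriptors of this kind are candidates for GPU parallelization.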

    3D Facial landmark detection under large yaw and expression variations

    A 3D landmark detection method for 3D facial scans is presented and thoroughly evaluated. The main contribution of the presented method is the automatic and pose-invariant detection of landmarks on 3D facial scans under large yaw variations (which often result in missing facial data), and its robustness against large facial expressions. Three-dimensional information is exploited by using 3D local shape descriptors to extract candidate landmark points. The shape descriptors include the shape index, a continuous map of principal curvature values of a 3D object’s surface, and spin images, local descriptors of the object’s 3D point distribution. The candidate landmarks are identified and labeled by matching them with a Facial Landmark Model (FLM) of facial anatomical landmarks. The presented method is extensively evaluated against a variety of 3D facial databases and achieves state-of-the-art accuracy (4.5-6.3 mm mean landmark localization error), considerably outperforming previous methods, even when tested with the most challenging data.
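
    The shape index mentioned in the abstract above maps the two principal curvatures at a surface point to a single value. The sketch below uses one common normalization to the range [0, 1]. Whether the paper uses exactly this sign convention is an assumption, and the function name `shape_index` is chosen here for illustration.

```python
import math

def shape_index(k1, k2):
    """Shape index in [0, 1] from two principal curvatures.

    Uses SI = 1/2 - (1/pi) * atan((k_max + k_min) / (k_max - k_min)).
    atan2 is used so that umbilic points (k_max == k_min != 0) do not
    divide by zero. For a perfectly flat point (both curvatures zero)
    the index is formally undefined; atan2(0, 0) makes this return 0.5.
    """
    k_max, k_min = max(k1, k2), min(k1, k2)
    return 0.5 - math.atan2(k_max + k_min, k_max - k_min) / math.pi

# Hypothetical usage: a symmetric saddle sits in the middle of the range,
# and the two umbilic extremes sit at the ends.
saddle = shape_index(1.0, -1.0)
extreme_a = shape_index(1.0, 1.0)
extreme_b = shape_index(-1.0, -1.0)
```

    Which end of the range corresponds to a cup and which to a cap depends on the orientation chosen for the surface normal, so the sketch deliberately does not name them.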

    A proposal to improve the authentication process in m-health environments

    Special Section: Mission Critical Public-Safety Communications: Architectures, Enabling Technologies, and Future Applications.
    One of the challenges of mobile health is to maintain privacy in access to the data, especially when ICT is used to provide access to health services and information. In these scenarios, it is essential to determine and verify the identity of users to ensure the security of the network. One way of authenticating the identity of each patient, doctor, or other stakeholder involved in the process is to use a software application that analyzes their faces through the cameras integrated in their devices. Selecting an appropriate facial authentication application requires a fair comparison between alternatives on a common database of face images. Users usually authenticate with variations in their appearance while accessing health services. This paper presents both 1) a database of facial images that combines the most common variations that can occur in the participants and 2) an algorithm that establishes different levels of access to the data based on data sensitivity levels and the accuracy of the authentication.

    Multi-scale techniques for multi-dimensional data analysis

    Large datasets of geometric data of various kinds are becoming more and more available as sensors become cheaper and more widely used. Due to both their size and their noisy nature, special techniques must be employed to deal with them correctly. In order to handle this amount of data efficiently and to tackle the technical challenges it poses, we propose techniques that analyze a scalar signal by means of its critical points (i.e. maxima and minima), ranking them on a scale of importance, by which we can extract important information from the input signal and separate it from noise, thus dramatically reducing the complexity of the problem. In order to obtain a ranking of critical points we employ multi-scale techniques. The standard scale-space approach, however, is not sufficient when trying to track critical points across various scales. We start from an implementation of the scale-space which computes a linear interpolation between scales in order to make tracking of critical points easier. The linear interpolation of a process which is not itself linear, though, does not fulfill some theoretical properties of scale-space, thus making the tracking of critical points much harder. We propose an extension of this piecewise-linear scale-space implementation, which recovers the theoretical properties (e.g., avoiding the generation of new critical points as the scale increases) and keeps the tracking consistent. Next we combine the scale-space with another technique that comes from topology: the classification of critical points based on their persistence value. While the scale-space applies a filtering in the frequency domain, by progressively smoothing the input signal with low-pass filters of increasing size, the computation of the persistence can be seen as a filtering applied in the amplitude domain, which progressively removes pairs of critical points based on their difference in amplitude.
The two techniques, while both relevant to the concept of scale, express different qualities of the critical points of the input signal; depending on the application domain we can use either of them, or, since they both have non-zero values only at critical points, they can be used together through a linear combination. The thesis is structured as follows. In Chapter 1 we present an overview of the problem of analyzing huge geometric datasets, focusing on the problem of dealing with their size and noise, and of reducing the problem to a subset of relevant samples. Chapter 2 contains a study of the state of the art in scale-space algorithms, followed by a more in-depth analysis of the virtually continuous framework used as the base technique. In its last part, we propose methods to extend these techniques in order to satisfy the axioms present in the continuous version of the scale-space and to obtain a stronger and more reliable tracking of critical points across scales, as well as the extraction of the persistence of critical points of a signal as a variant of the standard scale-space approach; we show the differences between the two and discuss how to combine them. Chapter 3 introduces an ever-growing source of data, motion capture systems; we motivate its importance by discussing the many applications in which it has been used over the past two decades. We briefly summarize the different existing systems and then focus on a particular one, discussing its peculiarities and its output data. In Chapter 4, we discuss the problem of studying intra-personal synchronization computed on data coming from such motion-capture systems. We show how multi-scale approaches can be used to identify relevant instants in the motion and how these instants can be used to precisely study synchronization between the different parts of the body from which they are extracted.
We will apply these techniques to the problem of generating a classifier to discriminate between martial artists of different skill levels who have been recorded performing karate movements. In Chapter 5 we will present work on the automatic detection of relevant points of the human face from 3D data. We will show that the Gaussian curvature of the 3D surface is a good feature for distinguishing the so-called fiducial points, but also that multi-scale techniques must be used to extract only relevant points and get rid of the noise. In closing, Chapter 6 will discuss ongoing work on motion segmentation; after an introduction to the meaning and different possibilities of motion segmentation we will present the data we work with, the approach used to identify segments, and some preliminary tools and results.
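
    The amplitude-domain filtering by persistence described in the abstract above can be sketched for a 1D signal as follows. This is a standard union-find computation of the persistence of local minima, not the thesis implementation. The function name `minima_persistence` and the return format are assumptions made for illustration.

```python
def minima_persistence(signal):
    """Pair each local minimum of a 1D signal with the sample that merges it
    into a deeper basin, and report the amplitude difference.

    Returns a list of (minimum_index, merge_index, persistence) tuples.
    The global minimum never merges into anything, so it is not listed.
    """
    n = len(signal)
    parent = [None] * n  # None means "sample not yet added"
    pairs = []

    def find(i):
        # Union-find lookup with path halving; the root of every component
        # is kept equal to the index of that component's lowest sample.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Sweep the samples from lowest to highest value.
    for i in sorted(range(n), key=lambda k: signal[k]):
        roots = [find(j) for j in (i - 1, i + 1)
                 if 0 <= j < n and parent[j] is not None]
        if not roots:
            parent[i] = i  # a new basin is born at a local minimum
        elif len(roots) == 1:
            parent[i] = roots[0]  # sample extends an existing basin
        else:
            # Two basins meet at i: the shallower (younger) one dies here.
            older = min(roots, key=lambda r: (signal[r], r))
            younger = roots[0] if roots[1] == older else roots[1]
            pairs.append((younger, i, signal[i] - signal[younger]))
            parent[younger] = older
            parent[i] = older
    return pairs

# Hypothetical usage: keep only minima whose persistence exceeds a threshold.
example = [3, 1, 4, 0, 2.5, 2, 6]
all_pairs = minima_persistence(example)
significant = [p for p in all_pairs if p[2] >= 1.0]
```

    In this example the shallow dip at index 5 has persistence 0.5 and is discarded as noise, while the minimum at index 1 has persistence 3 and survives. Maxima can be handled the same way by negating the signal.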

    Models of Visual Attention in Deep Residual CNNs

    Feature reuse from earlier layers in neural network hierarchies has been shown to improve the quality of features at a later stage, a concept known as residual learning. In this thesis, we study effective residual learning methodologies infused with attention mechanisms to observe their effect on different tasks. To this end, we propose three architectures across medical image segmentation and 3D point cloud analysis. In FocusNet, we propose an attention-based dual-branch encoder-decoder structure that learns an extremely efficient attention mechanism which achieves state-of-the-art results on the ISIC 2017 skin cancer segmentation dataset. We propose a novel loss enhancement that improves the convergence of FocusNet, performing better than state-of-the-art loss functions such as Tversky and focal loss. Evaluation of the architecture reveals two drawbacks, which we fix in FocusNetAlpha. Our novel residual group attention block based network forms the backbone of this architecture, learning distinct features with sparse correlations, which is the key reason for its effectiveness. At the time of writing this thesis, FocusNetAlpha outperforms all state-of-the-art convolutional autoencoders with the fewest parameters and FLOPs among them, based on our experiments on the ISIC 2018, DRIVE retinal vessel segmentation, and cell nuclei segmentation datasets. We then shift our attention to 3D point cloud processing, where we propose SAWNet, which combines global and local point embeddings infused with attention to create a spatially aware embedding that outperforms both. We propose a novel method to learn a global feature aggregation for point clouds via a fully differential block that does not need a lot of trainable parameters and gives obvious performance boosts. SAWNet beats state-of-the-art results on the ModelNet40 and ShapeNet part segmentation datasets.
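
    The abstract above names Tversky and focal loss as the baselines its loss enhancement is compared against. As a reference point only, here is a minimal pure-Python sketch of those two baseline losses for binary segmentation over flattened pixel lists. It does not reproduce the thesis's own loss enhancement, which the abstract does not specify. The function names, default parameters, and the `eps` smoothing term are assumptions made for illustration.

```python
import math

def tversky_loss(probs, targets, alpha=0.5, beta=0.5, eps=1e-7):
    """1 - Tversky index. alpha weights false positives, beta weights
    false negatives; alpha = beta = 0.5 reduces to the Dice loss."""
    tp = sum(p * t for p, t in zip(probs, targets))
    fp = sum(p * (1 - t) for p, t in zip(probs, targets))
    fn = sum((1 - p) * t for p, t in zip(probs, targets))
    index = (tp + eps) / (tp + alpha * fp + beta * fn + eps)
    return 1.0 - index

def focal_loss(probs, targets, gamma=2.0, eps=1e-7):
    """Mean binary focal loss: -(1 - p_t)^gamma * log(p_t).
    gamma = 0 reduces to ordinary binary cross-entropy."""
    total = 0.0
    for p, t in zip(probs, targets):
        p_t = p if t == 1 else 1.0 - p  # probability assigned to the true class
        p_t = min(max(p_t, eps), 1.0)   # avoid log(0)
        total += -((1.0 - p_t) ** gamma) * math.log(p_t)
    return total / len(probs)

# Hypothetical usage on four pixels.
probs = [1.0, 1.0, 0.0, 0.0]
targets = [1, 0, 1, 0]
dice_like = tversky_loss(probs, targets)
ce = focal_loss([0.5], [1], gamma=0.0)
fl = focal_loss([0.5], [1], gamma=2.0)
```

    Raising `beta` above `alpha` penalizes missed foreground pixels more heavily, and raising `gamma` down-weights pixels that are already classified confidently.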

    3D Shape Descriptor-Based Facial Landmark Detection: A Machine Learning Approach

    Facial landmark detection on 3D human faces has numerous applications in the literature, such as establishing point-to-point correspondence between 3D face models, which is itself a key step for a wide range of applications including 3D face detection and authentication, matching, reconstruction, and retrieval. Two groups of approaches, namely knowledge-driven and data-driven approaches, have been employed for facial landmarking in the literature. Knowledge-driven techniques are the traditional approaches that have been widely used to locate landmarks on human faces. In these approaches, a user with sufficient knowledge and experience usually defines the features to be extracted as landmarks. Data-driven techniques, on the other hand, take advantage of machine learning algorithms to detect prominent features on 3D face models. Besides its key advantages, each category of techniques has limitations that prevent it from generating the most reliable results. In this work we propose to combine the strengths of the two approaches to detect facial landmarks in a more efficient and precise way. The suggested approach consists of two phases. First, some salient features of the faces are extracted using expert systems. Afterwards, these points are used as the initial control points in the well-known Thin Plate Spline (TPS) technique to deform the input face towards a reference face model. Second, by exploring and utilizing multiple machine learning algorithms, another group of landmarks is extracted. The data-driven landmark detection step is performed in a supervised manner, providing an information-rich set of training data in which a set of local descriptors is computed and used to train the algorithm. We then use the detected landmarks to establish point-to-point correspondence between the 3D human faces, mainly using an improved version of the Iterative Closest Point (ICP) algorithm.
Furthermore, we propose to use the detected landmarks for 3D face matching applications.