
    Abstract art by shape classification

    This paper shows that shape classification is a useful tool in nonphotorealistic rendering (NPR) from photographs. Our classifier inputs regions from an image segmentation hierarchy and outputs the "best"-fitting simple shape, such as a circle, square, or triangle. Other approaches to NPR have recognized the benefits of segmentation, but none have classified the shape of segments. By doing so, we can create artwork of a more abstract nature, emulating the style of modern artists such as Matisse and others who favored shape simplification in their work. The classifier chooses the shape that "best" represents the region. Since the classifier is trained by a user, the "best shape" has a subjective quality that can override measurements such as minimum error and, more importantly, captures user preferences. Once trained, the system is fully automatic, although simple user interaction is also possible to allow for differences in individual tastes. A gallery of results shows how this classifier contributes to NPR from images by producing abstract artwork.
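    The selection step can be illustrated with a minimal geometric sketch: given a binary region mask, fit an equal-area circle and an equal-area square at the region centroid and keep whichever has the smaller symmetric-difference error. This is only a stand-in for the paper's classifier, whose user training can override such minimum-error measures; the candidate shapes and the error measure here are assumptions.

    ```python
    import numpy as np

    def fit_error(region, shape_mask):
        """Symmetric-difference error between a region and a candidate shape,
        normalized by the region's area."""
        return np.logical_xor(region, shape_mask).sum() / max(region.sum(), 1)

    def classify_region(region):
        """Pick the simple shape (circle or square) that best covers the region.

        'Best' here is just minimum symmetric-difference error, whereas the
        paper lets user training override purely geometric measures.
        """
        h, w = region.shape
        ys, xs = np.nonzero(region)
        cy, cx = ys.mean(), xs.mean()
        area = region.sum()
        yy, xx = np.mgrid[0:h, 0:w]

        # Candidate 1: equal-area circle centred on the region centroid.
        r = np.sqrt(area / np.pi)
        circle = (yy - cy) ** 2 + (xx - cx) ** 2 <= r ** 2

        # Candidate 2: equal-area axis-aligned square.
        s = np.sqrt(area) / 2
        square = (np.abs(yy - cy) <= s) & (np.abs(xx - cx) <= s)

        candidates = {"circle": circle, "square": square}
        return min(candidates, key=lambda k: fit_error(region, candidates[k]))
    ```

    A real system would add more candidate shapes (triangles, ellipses) and let the learned, user-specific preference reorder these purely geometric scores.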


    Action Recognition in Video by Covariance Matching of Silhouette Tunnels

    Abstract—Action recognition is a challenging problem in video analytics due to event complexity, variations in imaging conditions, and intra- and inter-individual action variability. Central to these challenges is the way one models actions in video, i.e., action representation. In this paper, an action is viewed as a temporal sequence of local shape deformations of centroid-centered object silhouettes, i.e., the shape of the centroid-centered object silhouette tunnel. Each action is represented by the empirical covariance matrix of a set of 13-dimensional normalized geometric feature vectors that capture the shape of the silhouette tunnel. The similarity of two actions is measured in terms of a Riemannian metric between their covariance matrices. The silhouette tunnel of a test video is broken into short overlapping segments and each segment is classified using a dictionary of labeled action covariance matrices and the nearest-neighbor rule. On a database of 90 short video sequences this attains a correct classification rate of 97%, which is very close to the state-of-the-art, at almost 5-fold reduced computational cost. Majority-vote fusion of segment decisions achieves a 100% classification rate. Keywords: video analysis; action recognition; silhouette tunnel; covariance matching; generalized eigenvalues
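    The descriptor and metric can be sketched directly: the empirical covariance of per-frame feature vectors, compared with the generalized-eigenvalue (affine-invariant) Riemannian distance. The paper's 13 geometric silhouette features are not reproduced here; arbitrary feature rows stand in for them.

    ```python
    import numpy as np
    from scipy.linalg import eigh

    def covariance_descriptor(features):
        """Empirical covariance of per-frame feature vectors (rows = frames)."""
        return np.cov(features, rowvar=False)

    def riemannian_distance(A, B, eps=1e-6):
        """Affine-invariant distance between SPD matrices:
        d(A, B) = sqrt(sum_i log^2 lambda_i), where the lambda_i solve the
        generalized eigenproblem A v = lambda B v."""
        n = A.shape[0]
        A = A + eps * np.eye(n)   # regularize against rank deficiency
        B = B + eps * np.eye(n)
        lam = eigh(A, B, eigvals_only=True)
        return np.sqrt(np.sum(np.log(lam) ** 2))
    ```

    Classification then reduces to a nearest-neighbor search over a dictionary of labeled covariance matrices under this distance.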

    3D shape estimation in video sequences provides high precision evaluation of facial expressions

    Abstract: Person-independent and pose-invariant estimation of facial expressions and of action unit (AU) intensities is important for situation analysis and for automated video annotation. We evaluated raw 2D shape data of the CK+ database, using Procrustes transformation and the multi-class SVM leave-one-out method for classification. We found close to 100% performance, demonstrating the relevance and strength of the details of the shape. Precise 3D shape information was computed by means of Constrained Local Models (CLM) on video sequences. Such sequences offer the opportunity to compute a time-averaged '3D Personal Mean Shape' (PMS) from the estimated CLM shapes, which, upon subtraction, gives rise to person-independent emotion estimation. On CK+ data, PMS showed significant improvements over AU0 normalization; performance reached and sometimes surpassed state-of-the-art results on emotion classification and on AU intensity estimation. 3D PMS from 3D CLM offers pose-invariant emotion estimation, which we studied by rendering a 3D emotional database for different poses and different subjects from the BU-4DFE database. Frontal shapes derived from CLM fits of the 3D shape were evaluated. Results demonstrate that shape estimation alone can be used for robust, high-quality, pose-invariant emotion classification and AU intensity estimation.
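    The personal-mean-shape idea can be sketched in 2D (the paper works with 3D CLM shapes): Procrustes-align every frame of a sequence, average the aligned shapes into a personal mean shape, and subtract it from each aligned frame to obtain person-independent features. The use of the first frame as the alignment reference is an assumption of this sketch.

    ```python
    import numpy as np
    from scipy.spatial import procrustes

    def personal_mean_shape(frames):
        """Time-averaged mean shape after Procrustes-aligning every frame
        to the first one. `frames` is (T, P, 2): T frames of P landmarks."""
        ref = frames[0]
        aligned = []
        for f in frames:
            _, mtx2, _ = procrustes(ref, f)   # mtx2: f aligned to ref
            aligned.append(mtx2)
        return np.mean(aligned, axis=0)

    def expression_features(frame, pms):
        """Person-independent feature: the aligned frame minus the personal
        mean shape, flattened for a downstream SVM."""
        _, mtx2, _ = procrustes(pms, frame)
        return (mtx2 - pms).ravel()
    ```

    For a neutral face the residual is near zero; expression-induced deformations survive the subtraction and feed the classifier.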

    Image-to-image translation with Generative Adversarial Networks via retinal masks for realistic Optical Coherence Tomography imaging of Diabetic Macular Edema disorders

    [Abstract]: One of the main issues with deep learning is the need for a significant number of samples. We address this problem in the field of Optical Coherence Tomography (OCT), specifically in the context of Diabetic Macular Edema (DME). This pathology represents one of the main causes of blindness in developed countries and, due to capturing difficulties and the saturation of health services, creating computer-aided diagnosis (CAD) systems is an arduous task. For this reason, we propose a solution to generate samples. Our strategy employs image-to-image Generative Adversarial Networks (GANs) to translate a binary mask into a realistic OCT image. Moreover, thanks to the clinical relationship between the retinal shape and the presence of DME fluid, we can generate both pathological and non-pathological samples by altering the binary mask morphology. To demonstrate the capabilities of our proposal, we test it against two state-of-the-art classification strategies. In the first, we evaluate a system fully trained with generated images, obtaining 94.83% of the accuracy of the state-of-the-art. In the second, we test it against a state-of-the-art expert model based on deep features, where it also achieves successful results, reaching 98.23% of the accuracy of the original work. Our methodology thus proved useful in scenarios where data is scarce, and could be easily adapted to other imaging modalities and pathologies where key shape constraints in the image provide enough information to recreate realistic samples.
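    The mask-editing step can be sketched independently of the GAN: to turn a non-pathological mask into a pathological one, insert a fluid-like deformation into the binary retinal mask before translation. The disc-shaped deformation and the single-label mask are assumptions of this sketch; the GAN that renders the edited mask into an OCT image is not shown.

    ```python
    import numpy as np

    def add_fluid_region(mask, center, radius):
        """Insert a hypothetical disc-shaped 'fluid' deformation into a binary
        retinal mask, simulating the morphology change that drives the
        pathological/non-pathological distinction."""
        yy, xx = np.mgrid[0:mask.shape[0], 0:mask.shape[1]]
        fluid = (yy - center[0]) ** 2 + (xx - center[1]) ** 2 <= radius ** 2
        out = mask.copy()
        out[fluid] = True  # mark deformed pixels (label choice is an assumption)
        return out
    ```

    Feeding the original and the edited mask through the same mask-to-image generator would then yield paired non-pathological and pathological samples.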

    Underwater Live Fish Recognition Using a Balance-Guaranteed Optimized Tree

    Abstract. Live fish recognition in the open sea is a challenging multi-class classification task. We propose a novel method to recognize fish in an unrestricted natural environment recorded by underwater cameras. This method extracts 66 types of features, which are a combination of color, shape, and texture properties from different parts of the fish, and reduces the feature dimensionality with a forward sequential feature selection (FSFS) procedure. The features selected by FSFS are used by an SVM. We present a Balance-Guaranteed Optimized Tree (BGOT) to control the error accumulation in hierarchical classification and, therefore, achieve better performance. A BGOT of 10 fish species is automatically constructed using the inter-class similarities and a heuristic method. The proposed BGOT-based hierarchical classification method achieves about 4% better accuracy compared to state-of-the-art techniques on a live fish image dataset.
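    The FSFS step is a standard greedy wrapper and can be sketched generically: starting from an empty set, repeatedly add the single feature whose inclusion most improves a scoring function. The scoring function below is a placeholder for the paper's SVM-based evaluation.

    ```python
    import numpy as np

    def forward_select(X, y, score_fn, k):
        """Greedy forward sequential feature selection (FSFS).

        Repeatedly adds the feature index j maximizing
        score_fn(X[:, selected + [j]], y) until k features are chosen.
        `score_fn` stands in for the paper's SVM accuracy criterion.
        """
        selected = []
        remaining = list(range(X.shape[1]))
        for _ in range(k):
            best = max(remaining,
                       key=lambda j: score_fn(X[:, selected + [j]], y))
            selected.append(best)
            remaining.remove(best)
        return selected
    ```

    The selected subset then feeds the SVMs at each node of the hierarchical tree.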

    Learning a Hierarchical Latent-Variable Model of 3D Shapes

    We propose the Variational Shape Learner (VSL), a generative model that learns the underlying structure of voxelized 3D shapes in an unsupervised fashion. Through the use of skip-connections, our model can successfully learn and infer a latent, hierarchical representation of objects. Furthermore, realistic 3D objects can be easily generated by sampling the VSL's latent probabilistic manifold. We show that our generative model can be trained end-to-end from 2D images to perform single-image 3D model retrieval. Experiments show, both quantitatively and qualitatively, the improved generalization of our proposed model over a range of tasks, performing better than or comparably to various state-of-the-art alternatives.

    Comment: Accepted as oral presentation at International Conference on 3D Vision (3DV), 201
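    The hierarchical-latent idea can be caricatured in a few lines: a global latent code plus a chain of local latents, each conditioned on the global code and its predecessor. The layer sizes, the random (untrained) linear maps, and the tanh conditioning are purely illustrative assumptions; a trained VSL decoder would map such latents to voxel grids.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    def sample_hierarchical_latent(n_layers=3, global_dim=20, local_dim=10):
        """Sample a VSL-style hierarchical latent: one global code and a chain
        of local codes, each conditioned on the global code and the previous
        local code through random linear maps (untrained stand-ins)."""
        z_global = rng.normal(size=global_dim)
        local_codes = []
        prev = np.zeros(local_dim)
        for _ in range(n_layers):
            W_g = rng.normal(size=(local_dim, global_dim)) / np.sqrt(global_dim)
            W_p = rng.normal(size=(local_dim, local_dim)) / np.sqrt(local_dim)
            mean = np.tanh(W_g @ z_global + W_p @ prev)
            prev = mean + 0.1 * rng.normal(size=local_dim)  # stochastic layer
            local_codes.append(prev)
        return z_global, local_codes
    ```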