7 research outputs found

    Nonlinear spectral unmixing of hyperspectral images using Gaussian processes

    This paper presents an unsupervised algorithm for nonlinear unmixing of hyperspectral images. The proposed model assumes that the pixel reflectances result from a nonlinear function of the abundance vectors associated with the pure spectral components. We assume that the spectral signatures of the pure components and the nonlinear function are unknown. The first step of the proposed method consists of the Bayesian estimation of the abundance vectors for all the image pixels and the nonlinear function relating the abundance vectors to the observations. The endmembers are subsequently estimated using Gaussian process regression. The performance of the unmixing strategy is evaluated with simulations conducted on synthetic and real data

    Human Pose Estimation from Monocular Images : a Comprehensive Survey

    Human pose estimation refers to the estimation of the location of body parts and how they are connected in an image. Human pose estimation from monocular images has wide applications (e.g., image indexing). Several surveys on human pose estimation can be found in the literature, but each focuses on a certain category, for example model-based approaches or human motion analysis. As far as we know, an overall review of this problem domain has yet to be provided. Furthermore, recent advancements based on deep learning have brought novel algorithms for this problem. In this paper, a comprehensive survey of human pose estimation from monocular images is carried out, including milestone works and recent advancements. Based on one standard pipeline for the solution of computer vision problems, this survey splits the problem into several modules: feature extraction and description, human body models, and modeling methods. Problem modeling methods are approached based on two means of categorization in this survey. One way to categorize includes top-down and bottom-up methods, and another way includes generative and discriminative methods. Considering the fact that one direct application of human pose estimation is to provide initialization for automatic video surveillance, there are additional sections for motion-related methods in all modules: motion features, motion models, and motion-based methods. Finally, the paper also collects 26 publicly available data sets for validation and provides error measurement methods that are frequently used.

    Probabilistic Models of Motor Production

    N. Bernstein defined the ability of the central nervous system (CNS) to control many degrees of freedom of a physical body, with all its redundancy and flexibility, as the main problem in motor control. He pointed out that man-made mechanisms usually have one, sometimes two, degrees of freedom (DOF); when the number of DOF increases further, it becomes prohibitively hard to control them. The brain, however, seems to perform such control effortlessly. He suggested a way the brain might deal with it: when a motor skill is being acquired, the brain artificially limits the degrees of freedom, leaving only one or two. As the skill level increases, the brain gradually "frees" the previously fixed DOF, applying control when needed and in directions which have to be corrected, eventually arriving at a control scheme where all the DOF are "free". This approach of reducing the dimensionality of motor control remains relevant even today. One of the possible solutions to Bernstein's problem is the hypothesis of motor primitives (MPs): small building blocks that constitute complex movements and facilitate motor learning and task completion. Just as in the visual system, having a homogeneous hierarchical architecture built of similar computational elements may be beneficial. When studying an object as complicated as the brain, it is important to define the level of detail at which one works and which questions one aims to answer. David Marr suggested three levels of analysis: 1. computational, analysing which problem the system solves; 2. algorithmic, questioning which representation the system uses and which computations it performs; 3. implementational, finding how such computations are performed by neurons in the brain. In this thesis we stay at the first two levels, seeking the basic representation of motor output. In this work we present a new model of motor primitives that comprises multiple interacting latent dynamical systems, and give it a full Bayesian treatment. 
Modelling within the Bayesian framework, in my opinion, must become the new standard in hypothesis testing in neuroscience. Only the Bayesian framework gives us guarantees when dealing with the inevitable plethora of hidden variables and uncertainty. The special type of coupling of dynamical systems we proposed, based on the Product of Experts, has many natural interpretations in the Bayesian framework. If the dynamical systems run in parallel, it yields Bayesian cue integration. If they are organized hierarchically through serial coupling, we get hierarchical priors over the dynamics. If one of the dynamical systems represents sensory state, we arrive at sensory-motor primitives. The compact representation that follows from the variational treatment allows a library of motor primitives to be learned. Once the primitives are learned separately, combined motion can be represented as a matrix of coupling values. We performed a set of experiments to compare different models of motor primitives. In a series of 2-alternative forced choice (2AFC) experiments, participants discriminated between natural and synthesised movements, thus running a graphics Turing test. When available, the Bayesian model score predicted the perceived naturalness of the movements. For simple movements, like walking, Bayesian model comparison and psychophysics tests indicate that one dynamical system is sufficient to describe the data. For more complex movements, like walking and waving, motion can be better represented as a set of coupled dynamical systems. We also experimentally confirmed that Bayesian treatment of model learning on motion data is superior to a simple point estimate of latent parameters. Experiments with non-periodic movements show that they do not benefit from more complex latent dynamics, despite having high kinematic complexity. By having fully Bayesian models, we could quantitatively disentangle the influence of motion dynamics and pose on the perception of naturalness. 
We confirmed that rich and correct dynamics is more important than the kinematic representation. There are numerous further directions of research. In the models we devised for multiple parts, even though the latent dynamics was factorized into a set of interacting systems, the kinematic parts were completely independent. Thus, interaction between the kinematic parts could be mediated only by the interactions of the latent dynamics. A more flexible model would allow dense interaction on the kinematic level too. Another important problem relates to the representation of time in Markov chains. Discrete-time Markov chains form an approximation to continuous dynamics. As the time step is assumed to be fixed, we face the problem of time step selection. Time is also not an explicit parameter in Markov chains, which prohibits explicit optimization of time as a parameter and reasoning (inference) about it. For example, in optimal control, boundary conditions are usually set at exact time points, which is not an ecological scenario, since there time is usually a parameter of optimization. Making time an explicit parameter in dynamics may alleviate this.
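
The link between the Product of Experts coupling and Bayesian cue integration mentioned in this abstract can be illustrated in the simplest possible setting. The sketch below is not the thesis model: it assumes independent one-dimensional Gaussian experts, and the function name `product_of_gaussian_experts` is hypothetical. Multiplying Gaussian densities and renormalising gives another Gaussian whose precision is the sum of the expert precisions and whose mean is the precision-weighted average of the expert means, which is the classical cue-integration rule.

```python
import numpy as np

def product_of_gaussian_experts(means, variances):
    """Fuse independent 1-D Gaussian experts by multiplying their densities.

    Illustrative assumption: each expert is a univariate Gaussian.
    The renormalised product is Gaussian; precisions (1/variance) add,
    and the fused mean is the precision-weighted average of the means.
    """
    means = np.asarray(means, dtype=float)
    variances = np.asarray(variances, dtype=float)
    precisions = 1.0 / variances
    fused_variance = 1.0 / precisions.sum()
    fused_mean = fused_variance * (precisions * means).sum()
    return fused_mean, fused_variance

# Two equally reliable cues: the fused estimate lies halfway between them
# and is more certain than either cue alone.
mean_equal, var_equal = product_of_gaussian_experts([0.0, 2.0], [1.0, 1.0])

# A more reliable second cue (smaller variance) pulls the estimate towards it.
mean_skewed, var_skewed = product_of_gaussian_experts([0.0, 2.0], [1.0, 0.25])
```

In this toy case the fused variance is always smaller than the smallest expert variance, which is why combining cues never makes the estimate less certain under these assumptions.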

    Understanding human-centric images : from geometry to fashion

    Understanding humans from photographs has always been a fundamental goal of computer vision. Early works focused on simple tasks such as detecting the location of individuals by means of bounding boxes. As the field progressed, harder and higher-level tasks have been undertaken. For example, from human detection came 2D and 3D human pose estimation, in which the task consists of identifying the location in the image or in space of all the different body parts, e.g., head, torso, knees, arms, etc. Human attributes also became a great source of interest, as they allow recognizing individuals and other properties such as gender or age. Later, the attention turned to the recognition of the action being performed. This, in general, relies on the previous works on pose estimation and attribute classification. Currently, even higher level tasks are being conducted, such as predicting the motivations of human behavior or identifying the fashionability of an individual from a photograph. In this thesis we have developed a hierarchy of tools that covers this whole range of problems, from low level feature point descriptors to high level fashion-aware conditional random field models, all with the objective of understanding humans from monocular RGB images. In order to build these high level models it is paramount to have a battery of robust and reliable low and mid level cues. Along these lines, we have proposed two low-level keypoint descriptors: one based on the theory of heat diffusion on images, and the other that uses a convolutional neural network to learn discriminative image patch representations. We also introduce distinct low-level generative models for representing human pose: in particular we present a discrete model based on a directed acyclic graph and a continuous model that consists of poses clustered on a Riemannian manifold. 
As mid level cues we propose two 3D human pose estimation algorithms: one that estimates the 3D pose given a noisy 2D estimation, and an approach that simultaneously estimates both the 2D and 3D pose. Finally, we formulate higher level models built upon low and mid level cues for human understanding. Concretely, we focus on two different tasks in the context of fashion: semantic segmentation of clothing, and predicting the fashionability from images with metadata to ultimately provide fashion advice to the user. In summary, to robustly extract knowledge from images with the presence of humans it is necessary to build high level models that integrate low and mid level cues. In general, using and understanding strong features is critical for obtaining reliable performance. The main contribution of this thesis is in proposing a variety of low, mid and high level algorithms for human-centric images that can be integrated into higher level models for comprehending humans from photographs, as well as tackling novel fashion-oriented problems.

    Nonlinear unmixing of Hyperspectral images

    Spectral unmixing is one of the major issues arising when analyzing hyperspectral images. It consists of identifying the macroscopic materials present in a hyperspectral image and quantifying the proportions (or abundances) of these materials in the image pixels. Most unmixing techniques rely on a linear mixing model, which is often considered a first-order approximation of the actual mixtures. However, the linear model can be inaccurate for some specific images (for instance images of scenes involving multiple reflections, such as forests or urban areas), and more complex nonlinear models must then be considered to analyze such images. The aim of this thesis is to study new nonlinear mixing models and to propose associated algorithms to analyze hyperspectral images. First, a parametric post-nonlinear model is investigated and efficient unmixing algorithms based on this model are proposed. The prior knowledge about the components present in the observed image, their proportions and the nonlinearity parameters is considered using Bayesian inference. The second model considered in this work is based on the approximation of the nonlinear manifold which contains the observed pixels using Gaussian processes. The proposed algorithm estimates the relation between the observations and the unknown material proportions without explicit dependency on the material spectral signatures, which are estimated subsequently. Considering nonlinear effects in hyperspectral images usually requires more complex unmixing strategies than those assuming linear mixtures. Since the linear mixing model is often sufficient to approximate most actual mixtures accurately, it is of interest to detect pixels or regions where the linear model is accurate. This nonlinearity detection can be applied as a pre-processing step, and nonlinear unmixing strategies can then be applied only to pixels requiring the use of nonlinear models. The last part of this thesis focuses on new nonlinearity detectors based on linear and nonlinear models to identify pixels or regions where nonlinear effects occur in hyperspectral images. The proposed nonlinear unmixing algorithms improve the characterization of hyperspectral images compared to methods based on a linear model. These methods allow the reconstruction errors to be reduced. Moreover, these methods provide better spectral signature and abundance estimates when the observed pixels result from nonlinear mixtures. The results of simulations conducted on synthetic and real images illustrate the advantage of using nonlinearity detectors for hyperspectral image analysis. In particular, the proposed detectors can identify components which are present in few pixels (and hardly distinguishable) and locate areas where significant nonlinear effects occur (shadow, relief, ...). Moreover, it is shown that considering spatial correlation in hyperspectral images can improve the performance of nonlinear unmixing and nonlinearity detection algorithms.
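
To make the contrast between linear and nonlinear mixtures in this abstract concrete, here is a minimal sketch of the linear mixing model and of how a nonlinear pixel shows up as reconstruction error under a linear fit. Everything in it is an assumption for illustration: the endmember matrix `M` and abundances `a` are synthetic, the nonlinearity is a simple bilinear interaction term rather than the post-nonlinear or Gaussian-process models of the thesis, and the unconstrained least-squares residual is only a stand-in for the nonlinearity detectors studied there (it also ignores the usual non-negativity and sum-to-one constraints on abundances).

```python
import numpy as np

rng = np.random.default_rng(0)
n_bands, n_endmembers = 50, 3

# Hypothetical endmember spectra (one column per pure material)
# and abundances (non-negative, summing to one).
M = rng.uniform(0.1, 0.9, size=(n_bands, n_endmembers))
a = np.array([0.6, 0.3, 0.1])

# Linear mixing model: the pixel is a weighted sum of endmember spectra.
y_linear = M @ a

# Illustrative nonlinearity (an assumption, not the thesis model): a bilinear
# interaction between the first two endmembers, as multiple reflections
# between two materials might produce.
y_nonlinear = y_linear + a[0] * a[1] * M[:, 0] * M[:, 1]

def linear_residual(y, M):
    """Norm of the reconstruction error of the best unconstrained linear fit."""
    a_hat, *_ = np.linalg.lstsq(M, y, rcond=None)
    return np.linalg.norm(y - M @ a_hat)

res_linear = linear_residual(y_linear, M)        # essentially zero
res_nonlinear = linear_residual(y_nonlinear, M)  # clearly non-zero
```

A pixel whose residual stays well above the noise level after the best linear fit is a candidate for nonlinear unmixing, which is the idea behind applying detection as a pre-processing step.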