
    Tensor networks for quantum machine learning

    Once developed for quantum theory, tensor networks have been established as a successful machine learning paradigm. Now they have been ported back to the quantum realm in the emerging field of quantum machine learning to address problems that classical computers cannot solve efficiently. Their nature at the interface between physics and machine learning makes tensor networks easily deployable on quantum computers. In this review article, we shed light on one of the major architectures considered to be predestined for variational quantum machine learning. In particular, we discuss how layouts like MPS, PEPS, TTNs and MERA can be mapped to a quantum computer, how they can be used for machine learning and data encoding, and which implementation techniques improve their performance.
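As a rough illustration of the MPS layout mentioned above (a sketch with invented tensors and dimensions, not code from the review): an MPS stores a quantum state as one small tensor per site, and a single amplitude is obtained by a chain of matrix products.

```python
import numpy as np

# Minimal MPS sketch: evaluate one amplitude of a 4-qubit state stored as
# a matrix product state. Bond dimension and tensor values are arbitrary
# illustrative choices.
rng = np.random.default_rng(0)
n_sites, phys_dim, bond_dim = 4, 2, 3

# One rank-3 tensor per site: (left bond, physical index, right bond).
# Boundary bonds have dimension 1.
tensors = []
for i in range(n_sites):
    dl = 1 if i == 0 else bond_dim
    dr = 1 if i == n_sites - 1 else bond_dim
    tensors.append(rng.standard_normal((dl, phys_dim, dr)))

def amplitude(bits):
    """Contract the MPS for one computational-basis bit string."""
    m = tensors[0][:, bits[0], :]
    for i in range(1, n_sites):
        m = m @ tensors[i][:, bits[i], :]
    return m[0, 0]

print(amplitude([0, 1, 0, 1]))
```

The cost of one contraction grows only linearly in the number of sites, which is what makes such layouts attractive both classically and as templates for variational quantum circuits.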

    Probabilistic Models of Motor Production

    N. Bernstein defined the ability of the central nervous system (CNS) to control the many degrees of freedom of a physical body, with all its redundancy and flexibility, as the main problem in motor control. He pointed out that man-made mechanisms usually have one, sometimes two degrees of freedom (DOF); when the number of DOF increases further, it becomes prohibitively hard to control them. The brain, however, seems to perform such control effortlessly. He suggested how the brain might deal with this: when a motor skill is being acquired, the brain artificially limits the degrees of freedom, leaving only one or two. As the skill level increases, the brain gradually "frees" the previously fixed DOF, applying control when needed and in directions which have to be corrected, eventually arriving at a control scheme where all the DOF are "free". This approach of reducing the dimensionality of motor control remains relevant even today. One of the possible solutions to Bernstein's problem is the hypothesis of motor primitives (MPs): small building blocks that constitute complex movements and facilitate motor learning and task completion. Just as in the visual system, having a homogeneous hierarchical architecture built of similar computational elements may be beneficial. When studying an object as complicated as the brain, it is important to define at which level of detail one works and which questions one aims to answer. David Marr suggested three levels of analysis: 1. computational, analysing which problem the system solves; 2. algorithmic, questioning which representation the system uses and which computations it performs; 3. implementational, finding how such computations are performed by neurons in the brain. In this thesis we stay at the first two levels, seeking the basic representation of motor output. In this work we present a new model of motor primitives that comprises multiple interacting latent dynamical systems, and give it a full Bayesian treatment.
Modelling within the Bayesian framework, in my opinion, must become the new standard in hypothesis testing in neuroscience. Only the Bayesian framework gives us guarantees when dealing with the inevitable plethora of hidden variables and uncertainty. The special type of coupling of dynamical systems we propose, based on the Product of Experts, has many natural interpretations in the Bayesian framework. If the dynamical systems run in parallel, it yields Bayesian cue integration. If they are organized hierarchically through serial coupling, we get hierarchical priors over the dynamics. If one of the dynamical systems represents a sensory state, we arrive at sensory-motor primitives. The compact representation that follows from the variational treatment allows learning a library of motor primitives. Once primitives are learned separately, a combined motion can be represented as a matrix of coupling values. We performed a set of experiments to compare different models of motor primitives. In a series of two-alternative forced choice (2AFC) experiments, participants discriminated between natural and synthesised movements, thus running a graphics Turing test. When available, the Bayesian model score predicted the naturalness of the perceived movements. For simple movements, like walking, Bayesian model comparison and psychophysics tests indicate that one dynamical system is sufficient to describe the data. For more complex movements, like walking and waving, motion is better represented as a set of coupled dynamical systems. We also experimentally confirmed that a Bayesian treatment of model learning on motion data is superior to a simple point estimate of the latent parameters. Experiments with non-periodic movements show that they do not benefit from more complex latent dynamics, despite having high kinematic complexity. By using fully Bayesian models, we could quantitatively disentangle the influence of motion dynamics and pose on the perception of naturalness.
We confirmed that rich and correct dynamics are more important than the kinematic representation. There are numerous further directions of research. In the models we devised for multiple parts, even though the latent dynamics was factorized into a set of interacting systems, the kinematic parts were completely independent. Thus, interaction between the kinematic parts could be mediated only by the latent dynamics interactions. A more flexible model would allow dense interaction on the kinematic level too. Another important problem relates to the representation of time in Markov chains. Discrete-time Markov chains form an approximation to continuous dynamics. As the time step is assumed to be fixed, we face the problem of time-step selection. Time is also not an explicit parameter in Markov chains, which prohibits explicit optimization of time as a parameter and reasoning (inference) about it. For example, in optimal control, boundary conditions are usually set at exact time points, which is not an ecological scenario; in natural settings, time is itself a parameter of the optimization. Making time an explicit parameter of the dynamics may alleviate this issue.
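The Product of Experts coupling described in this abstract has a simple closed form in the Gaussian case, which can be sketched as follows (illustrative notation, not the thesis code): multiplying Gaussian densities yields another Gaussian whose precision is the sum of the experts' precisions and whose mean is precision-weighted, which is exactly the mechanism behind Bayesian cue integration.

```python
import numpy as np

# Product of two (or more) 1-D Gaussian "experts" over the same latent
# state: precisions add, and the mean is precision-weighted.
def product_of_gaussians(means, variances):
    precisions = 1.0 / np.asarray(variances, dtype=float)
    var = 1.0 / precisions.sum()
    mean = var * (precisions * np.asarray(means, dtype=float)).sum()
    return mean, var

# Two equally reliable experts predicting the next latent state:
mean, var = product_of_gaussians([0.0, 2.0], [1.0, 1.0])
print(mean, var)  # mean 1.0 (halfway), variance 0.5 (more certain)
```

Each expert sharpens the combined belief, so coupled dynamical systems constrain each other exactly as parallel sensory cues do in cue integration.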

    Chapter Machine Learning in Volcanology: A Review

    A volcano is a complex system, and the characterization of its state at any given time is not an easy task. Monitoring data can be used to estimate the probability of an unrest and/or an eruption episode. These can include seismic, magnetic, electromagnetic, deformation, infrasonic, thermal or geochemical data or, in an ideal situation, a combination of them. Merging data of different origins is a non-trivial task, and often even extracting a few relevant and information-rich parameters from a homogeneous time series is already challenging. The key to the characterization of volcanic regimes is in fact a process of data reduction that should produce a relatively small vector of features. The next step is the interpretation of the resulting features, through the recognition of similar vectors and, for example, their association with a given state of the volcano. This can in turn highlight possible precursors of unrest and eruptions. This final step can benefit from the application of machine learning techniques, which are able to process big data in an efficient way. Other applications of machine learning in volcanology include the analysis and classification of geological, geochemical and petrological “static” data to infer, for example, the possible source and mechanism of observed deposits; and the analysis of satellite imagery to quickly classify vast regions difficult to investigate on the ground or, again, to detect changes that could indicate unrest. Moreover, the use of machine learning is gaining importance in other areas of volcanology, not only for monitoring purposes but also for differentiating geochemical patterns, resolving stratigraphic issues, distinguishing morphological patterns of volcanic edifices, and assessing the spatial distribution of volcanoes.
Machine learning is helpful in the discrimination of magmatic complexes, in distinguishing tectonic settings of volcanic rocks, and in the evaluation of correlations of volcanic units, being particularly helpful in tephrochronology. In this chapter we review the relevant methods and results published in recent decades using machine learning in volcanology, both with respect to the choice of optimal feature vectors and to their subsequent classification, taking into account both unsupervised and supervised approaches.
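The data-reduction-then-recognition pipeline described in this abstract can be sketched in a few lines (a toy example with invented features and regime labels, not from the chapter): compress a raw monitoring time series into a small feature vector, then assign it to the nearest known regime.

```python
import numpy as np

# Toy feature extraction: mean level, variability, and mean step size of
# a monitoring time series. Real pipelines would use richer descriptors.
def features(series):
    s = np.asarray(series, dtype=float)
    return np.array([s.mean(), s.std(), np.abs(np.diff(s)).mean()])

# Nearest-centroid classification of a feature vector against known regimes.
def classify(series, centroids, labels):
    f = features(series)
    d = [np.linalg.norm(f - c) for c in centroids]
    return labels[int(np.argmin(d))]

rng = np.random.default_rng(1)
quiet = rng.normal(0.0, 0.1, 500)    # low-amplitude background signal
unrest = rng.normal(0.0, 1.5, 500)   # energetic unrest-like signal
centroids = [features(quiet), features(unrest)]

tremor = rng.normal(0.0, 1.4, 500)   # new, unlabeled recording
print(classify(tremor, centroids, ["quiet", "unrest"]))
```

The point of the sketch is the shape of the problem: a long heterogeneous record becomes a short feature vector, and regime recognition then operates entirely in that reduced space.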

    Vector-valued Gaussian Processes on Riemannian Manifolds via Gauge Equivariant Projected Kernels

    Gaussian processes are machine learning models capable of learning unknown functions in a way that represents uncertainty, thereby facilitating construction of optimal decision-making systems. Motivated by a desire to deploy Gaussian processes in novel areas of science, a rapidly-growing line of research has focused on constructively extending these models to handle non-Euclidean domains, including Riemannian manifolds, such as spheres and tori. We propose techniques that generalize this class to model vector fields on Riemannian manifolds, which are important in a number of application areas in the physical sciences. To do so, we present a general recipe for constructing gauge equivariant kernels, which induce Gaussian vector fields, i.e. vector-valued Gaussian processes coherent with geometry, from scalar-valued Riemannian kernels. We extend standard Gaussian process training methods, such as variational inference, to this setting. This enables vector-valued Gaussian processes on Riemannian manifolds to be trained using standard methods and makes them accessible to machine learning practitioners.
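The projection recipe in this abstract can be illustrated with a toy construction on the sphere (our own minimal sketch following the stated idea, not the paper's implementation): a scalar kernel becomes a matrix-valued kernel for tangent vector fields by sandwiching it between tangent-space projections.

```python
import numpy as np

# Tangent projection at a unit vector x on S^2: P_x = I - x x^T.
def tangent_projection(x):
    return np.eye(3) - np.outer(x, x)

# Illustrative scalar kernel on ambient points (a squared-exponential).
def scalar_kernel(x, y, lengthscale=1.0):
    return np.exp(-np.linalg.norm(x - y) ** 2 / (2 * lengthscale ** 2))

# Matrix-valued kernel: 3x3 cross-covariance between tangent vectors
# at x and y, built from the scalar kernel and the two projections.
def projected_kernel(x, y):
    return scalar_kernel(x, y) * tangent_projection(x) @ tangent_projection(y)

x = np.array([0.0, 0.0, 1.0])
y = np.array([1.0, 0.0, 0.0])
K = projected_kernel(x, y)
# Outputs lie in the tangent plane at x: the normal component vanishes.
print(x @ (K @ np.array([0.3, 0.7, 0.2])))
```

Because every column of `K` is projected onto the tangent spaces, samples from the induced process are genuine vector fields on the manifold rather than arbitrary ambient-valued functions.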

    Vector-valued Gaussian Processes on Riemannian Manifolds via Gauge Independent Projected Kernels

    Gaussian processes are machine learning models capable of learning unknown functions in a way that represents uncertainty, thereby facilitating construction of optimal decision-making systems. Motivated by a desire to deploy Gaussian processes in novel areas of science, a rapidly-growing line of research has focused on constructively extending these models to handle non-Euclidean domains, including Riemannian manifolds, such as spheres and tori. We propose techniques that generalize this class to model vector fields on Riemannian manifolds, which are important in a number of application areas in the physical sciences. To do so, we present a general recipe for constructing gauge independent kernels, which induce Gaussian vector fields, i.e. vector-valued Gaussian processes coherent with geometry, from scalar-valued Riemannian kernels. We extend standard Gaussian process training methods, such as variational inference, to this setting. This enables vector-valued Gaussian processes on Riemannian manifolds to be trained using standard methods and makes them accessible to machine learning practitioners.

    A Survey of Geometric Optimization for Deep Learning: From Euclidean Space to Riemannian Manifold

    Although Deep Learning (DL) has achieved success in complex Artificial Intelligence (AI) tasks, it suffers from various notorious problems (e.g., feature redundancy, and vanishing or exploding gradients), since updating parameters in Euclidean space cannot fully exploit the geometric structure of the solution space. As a promising alternative, Riemannian-based DL uses geometric optimization to update parameters on Riemannian manifolds and can leverage the underlying geometric information. Accordingly, this article presents a comprehensive survey of applying geometric optimization in DL. First, this article introduces the basic procedure of geometric optimization, including various geometric optimizers and some concepts of Riemannian manifolds. Subsequently, this article investigates the application of geometric optimization in different DL networks for various AI tasks, e.g., convolutional neural networks, recurrent neural networks, transfer learning, and optimal transport. Additionally, typical public toolboxes that implement optimization on manifolds are also discussed. Finally, this article makes a performance comparison between different deep geometric optimization methods under image recognition scenarios.
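The basic procedure the survey refers to, one step of geometric optimization, can be sketched on the simplest curved manifold, the unit sphere (an illustrative toy objective, not taken from the survey): project the Euclidean gradient onto the tangent space, take a step, and retract back onto the manifold.

```python
import numpy as np

# Toy objective on the unit sphere: x^T A x, minimized (over unit vectors)
# at the eigenvector of the smallest eigenvalue of A, here 0.5.
A = np.diag([3.0, 1.0, 0.5])

def loss(x):
    return x @ A @ x

def riemannian_step(x, lr=0.1):
    g = 2 * A @ x                        # Euclidean gradient
    g_tan = g - (x @ g) * x              # project onto the tangent space at x
    x_new = x - lr * g_tan               # gradient step in the tangent plane
    return x_new / np.linalg.norm(x_new) # retraction: renormalize onto sphere

x = np.array([1.0, 1.0, 1.0]) / np.sqrt(3)
for _ in range(200):
    x = riemannian_step(x)
print(loss(x))  # converges to the smallest eigenvalue, 0.5
```

Plain Euclidean updates would drift off the constraint surface; the projection and retraction are exactly what the manifold-aware optimizers surveyed here add on top of standard gradient descent.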

    Memory Structure and Cognitive Maps

    A common way to understand memory structures in the cognitive sciences is as a cognitive map. Cognitive maps are representational systems organized by dimensions shared with physical space. The appeal to these maps begins literally: as an account of how spatial information is represented and used to inform spatial navigation. Invocations of cognitive maps, however, are often more ambitious; cognitive maps are meant to scale up and provide the basis for our more sophisticated memory capacities. The extension is not meant to be metaphorical, but the way in which these richer mental structures are supposed to remain map-like is rarely made explicit. Here we investigate this missing link, asking: how do cognitive maps represent non-spatial information? We begin with a survey of foundational work on spatial cognitive maps and then provide a comparative review of alternative, non-spatial representational structures. We then turn to several cutting-edge projects that are engaged in the task of scaling up cognitive maps so as to accommodate non-spatial information: first, the spatial-isometric approach, encoding content that is non-spatial but in some sense isomorphic to spatial content; second, the abstraction approach, encoding content that is an abstraction over first-order spatial information; and third, the embedding approach, embedding non-spatial information within a spatial context, a prominent example being the Method of Loci. Putting these cases alongside one another reveals the variety of options available for building cognitive maps, and the distinctive limitations of each. We conclude by reflecting on where these results take us in terms of understanding the place of cognitive maps in memory.

    Neural Latent Geometry Search: Product Manifold Inference via Gromov-Hausdorff-Informed Bayesian Optimization

    Recent research indicates that the performance of machine learning models can be improved by aligning the geometry of the latent space with the underlying data structure. Rather than relying solely on Euclidean space, researchers have proposed using hyperbolic and spherical spaces with constant curvature, or combinations thereof, to better model the latent space and enhance model performance. However, little attention has been given to the problem of automatically identifying the optimal latent geometry for the downstream task. We mathematically define this novel formulation and coin it as neural latent geometry search (NLGS). More specifically, we introduce a principled method that searches for a latent geometry composed of a product of constant curvature model spaces with minimal query evaluations. To accomplish this, we propose a novel notion of distance between candidate latent geometries based on the Gromov-Hausdorff distance from metric geometry. In order to compute the Gromov-Hausdorff distance, we introduce a mapping function that enables the comparison of different manifolds by embedding them in a common high-dimensional ambient space. Finally, we design a graph search space based on the calculated distances between candidate manifolds and use Bayesian optimization to search for the optimal latent geometry in a query-efficient manner. This is a general method which can be applied to search for the optimal latent geometry for a variety of models and downstream tasks. Extensive experiments on synthetic and real-world datasets confirm the efficacy of our method in identifying the optimal latent geometry for multiple machine learning problems.
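The embed-then-compare idea in this abstract can be illustrated with a minimal sketch (our own toy example, not the paper's implementation): represent two candidate geometries as point clouds embedded in a common ambient space, then compare them with the Hausdorff distance, a computable quantity closely related to the Gromov-Hausdorff distance used here.

```python
import numpy as np

# Hausdorff distance between two point clouds X, Y in a shared ambient
# space: the largest distance from any point in one set to the other set.
def hausdorff(X, Y):
    d = np.linalg.norm(X[:, None, :] - Y[None, :, :], axis=-1)
    return max(d.min(axis=1).max(), d.min(axis=0).max())

# Two candidate 1-D geometries embedded in R^2: a unit circle and a
# scaled copy (sampled at the same angles).
theta = np.linspace(0, 2 * np.pi, 100, endpoint=False)
circle = np.stack([np.cos(theta), np.sin(theta)], axis=1)
bigger = 1.5 * circle

print(hausdorff(circle, bigger))  # ≈ 0.5, the radial gap
```

In the actual method, such distances between candidate product manifolds define a structured search space over which Bayesian optimization can propose geometries query-efficiently.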

    Neural networks expressing multiple strategies in the video game StarCraft 2.

    Using neural networks and supervised learning, we have created models capable of solving problems at a superhuman level. Nevertheless, this training process results in models that learn policies averaging the plethora of behaviors usually found in datasets. In this thesis we present and study the Behavioral Repertoires Imitation Learning (BRIL) technique, which allows training models that express multiple behaviors in an adjustable way. In BRIL, the user designs a behavior space, projects it into low-dimensional coordinates and uses these coordinates as input to the model. Upon deployment, the user can make the model express a given behavior by fixing these inputs to the coordinates of that behavior. The main research question concerns the relationship between the dimensionality reduction algorithm and how well the trained models are able to replicate behaviors. We study three different dimensionality reduction algorithms: Principal Component Analysis (PCA), Isometric Feature Mapping (Isomap) and Uniform Manifold Approximation and Projection (UMAP); we design and embed a behavior space in the video game StarCraft 2, train different models for each embedding, and test the ability of each model to express multiple strategies. Results show that with BRIL we are able to train models that express the multiple behaviors present in the dataset. The geometric structure each method preserves induces different separations of behaviors, and these separations are reflected in the models' conduct.
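The BRIL input construction can be sketched as follows (dimensions, feature names and the use of PCA-via-SVD are illustrative assumptions, not the thesis code): behavior descriptors are projected to two coordinates, which are then concatenated with the game state as extra model inputs.

```python
import numpy as np

rng = np.random.default_rng(0)
behaviors = rng.standard_normal((50, 10))  # 50 replays, 10 behavior stats

# PCA to 2-D via SVD: center, then project onto the top two right
# singular vectors. These 2-D points are the behavior coordinates.
def pca_2d(X):
    Xc = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ vt[:2].T

coords = pca_2d(behaviors)

# At training time the model sees (state, behavior coordinates); at
# deployment the user pins the coordinates to pick a behavior.
state = rng.standard_normal(20)                   # game-state features
model_input = np.concatenate([state, coords[3]])  # condition on replay 3
print(model_input.shape)  # (22,)
```

Swapping `pca_2d` for Isomap or UMAP changes how behaviors separate in the 2-D space, which is exactly the comparison the thesis investigates.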

    Neural latent geometry search: product manifold inference via Gromov-Hausdorff-informed Bayesian optimization

    Recent research indicates that the performance of machine learning models can be improved by aligning the geometry of the latent space with the underlying data structure. Rather than relying solely on Euclidean space, researchers have proposed using hyperbolic and spherical spaces with constant curvature, or combinations thereof, to better model the latent space and enhance model performance. However, little attention has been given to the problem of automatically identifying the optimal latent geometry for the downstream task. We mathematically define this novel formulation and coin it as neural latent geometry search (NLGS). More specifically, we introduce an initial attempt to search for a latent geometry composed of a product of constant curvature model spaces with a small number of query evaluations, under some simplifying assumptions. To accomplish this, we propose a novel notion of distance between candidate latent geometries based on the Gromov-Hausdorff distance from metric geometry. In order to compute the Gromov-Hausdorff distance, we introduce a mapping function that enables the comparison of different manifolds by embedding them in a common high-dimensional ambient space. We then design a graph search space based on the notion of smoothness between latent geometries and employ the calculated distances as an additional inductive bias. Finally, we use Bayesian optimization to search for the optimal latent geometry in a query-efficient manner. This is a general method which can be applied to search for the optimal latent geometry for a variety of models and downstream tasks. We perform experiments on synthetic and real-world datasets to identify the optimal latent geometry for multiple machine learning problems.