781 research outputs found

    An exploration of methodologies to improve semi-supervised hierarchical clustering with knowledge-based constraints

    Clustering algorithms with constraints (also known as semi-supervised clustering algorithms) have been introduced to the field of machine learning as a significant variant of conventional unsupervised clustering algorithms. They have been shown to achieve better performance by integrating prior knowledge into the clustering process, which helps uncover relevant information in the data being clustered. However, the development of semi-supervised hierarchical clustering techniques is still an open and active area of investigation. The majority of current semi-supervised clustering algorithms are developed as partitional clustering (PC) methods, and only a few research efforts have addressed semi-supervised hierarchical clustering. The aim of this research is to enhance hierarchical clustering (HC) algorithms based on prior knowledge by adopting novel methodologies. [Continues.

    Graduate Programs Course Catalog 2014-2015


    Graduate Programs Course Catalog 2015-2016


    State of the Art in Face Recognition

    Notwithstanding the tremendous effort to solve the face recognition problem, it is not yet possible to design a face recognition system whose performance approaches that of humans. New computer vision and pattern recognition approaches need to be investigated. New knowledge and perspectives from other fields, such as psychology and neuroscience, must also be incorporated into current face recognition research to design a robust system. Indeed, much more effort is required to arrive at a human-like face recognition system. This book aims to reduce the gap between the previous state of face recognition research and its future state.

    Distance-based analysis of dynamical systems and time series by optimal transport

    The concept of distance is a fundamental notion that forms a basis for orientation in space. It is related to the scientific measurement process: quantitative measurements result in numerical values, and these can be immediately translated into distances. Conversely, a set of mutual distances defines an abstract Euclidean space. Each system is thereby represented as a point, and the Euclidean distances between points approximate the original distances as closely as possible. If the original distance measures interesting properties, these can be recovered as interesting patterns in this space. This idea is applied to complex systems: the act of breathing, the structure and activity of the brain, and dynamical systems and time series in general. In all these situations, optimal transportation distances are used; these measure how much work is needed to transform one probability distribution into another. The reconstructed Euclidean space then permits the application of multivariate statistical methods. In particular, canonical discriminant analysis makes it possible to distinguish between distinct classes of systems, e.g., between healthy and diseased lungs. This offers new diagnostic perspectives in the assessment of lung and brain diseases, and also offers a new approach to numerical bifurcation analysis and to quantifying synchronization in dynamical systems. LEI Universiteit Leiden; NWO Computational Life Sciences, grant no. 635.100.006; Analyse en stochastie
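    As a rough illustration of the "work needed to transform one probability distribution into another", the sketch below computes the optimal transportation (Wasserstein-1) distance in the simplest possible case: two empirical samples of equal size on the real line, where sorting both samples gives the optimal matching. This is only a special case chosen for brevity, not the method of the thesis; the function names `wasserstein_1d` and `distance_matrix` are hypothetical. The resulting matrix of mutual distances is the kind of input that an embedding method such as multidimensional scaling would take.

```python
def wasserstein_1d(xs, ys):
    """Wasserstein-1 distance between two equal-size samples on the real line.

    Assumption of this sketch: both samples have the same number of points,
    each carrying equal mass, so the optimal plan pairs sorted values.
    """
    if not xs or len(xs) != len(ys):
        raise ValueError("this sketch assumes non-empty samples of equal size")
    pairs = zip(sorted(xs), sorted(ys))
    return sum(abs(a - b) for a, b in pairs) / len(xs)


def distance_matrix(samples):
    """Symmetric matrix of pairwise transport distances between samples."""
    n = len(samples)
    d = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d[i][j] = d[j][i] = wasserstein_1d(samples[i], samples[j])
    return d


if __name__ == "__main__":
    systems = [[0.0, 1.0, 2.0], [1.0, 2.0, 3.0], [0.0, 0.0, 6.0]]
    for row in distance_matrix(systems):
        print(row)
```

    Shifting every point of a sample by one unit costs exactly one unit of work per unit of mass, which is why the first two samples above are at distance 1.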

    Stochastic Optimization For Multi-Agent Statistical Learning And Control

    The goal of this thesis is to develop a mathematical framework for optimal, accurate, and affordable-complexity statistical learning among networks of autonomous agents. We begin by noting the connection between statistical inference and stochastic programming, and consider extensions of this setup to settings in which a network of agents each observes a local data stream and would like to make decisions that are good with respect to information aggregated across the entire network. There is an open-ended degree of freedom in this problem formulation, however: the selection of the estimator function class which defines the feasible set of the stochastic program. Our central contribution is the design of stochastic optimization tools in reproducing kernel Hilbert spaces that yield optimal, accurate, and affordable-complexity statistical learning for a multi-agent network. To obtain this result, we first explore the relative merits and drawbacks of different function class selections. In Part I, we consider this multi-agent expected risk minimization setting for the case in which each agent seeks to learn a common globally optimal generalized linear model (GLM), and we develop a stochastic variant of the Arrow-Hurwicz primal-dual method. We establish convergence to the primal-dual optimal pair when either consensus or "proximity" constraints encode the fact that we want all agents to agree, or nearby agents to make decisions that are close to one another. Empirically, we observe that these convergence results are substantiated but that convergence may not translate into statistical accuracy. More broadly, optimality within a given estimator function class is not the same as making minimal inference errors. The optimality-accuracy tradeoff of GLMs motivates subsequent efforts to learn more sophisticated estimators based upon learned feature encodings of the data that is fed into the statistical model. 
The specific tool we turn to in Part II is dictionary learning, where we optimize both over regression weights and an encoding of the data, which yields a non-convex problem. We investigate the use of stochastic methods for online task-driven dictionary learning, and obtain promising performance for the task of a ground robot learning to anticipate control uncertainty based on its past experience. Heartened by this implementation, we then consider extensions of this framework for a multi-agent network to each learn globally optimal task-driven dictionaries based on stochastic primal-dual methods. However, it is here that the non-convexity of the optimization problem causes difficulties: stringent conditions on stochastic errors and the duality gap limit the applicability of the convergence guarantees, and impractically small learning rates are required for convergence in practice. Thus, we seek to learn nonlinear statistical models while preserving convexity, which is possible through kernel methods (Part III). However, the increased descriptive power of nonparametric estimation comes at the cost of infinite complexity. Thus, we develop a stochastic approximation algorithm in reproducing kernel Hilbert spaces (RKHS) that ameliorates this complexity issue while preserving optimality: we combine the functional generalization of the stochastic gradient method (FSGD) with greedily constructed low-dimensional subspace projections based on matching pursuit. We establish that the proposed method yields a controllable trade-off between optimality and memory, and yields highly accurate parsimonious statistical models in practice. Then, we develop a multi-agent extension of this method by proposing a new node-separable penalty function and applying FSGD together with low-dimensional subspace projections. 
This extension allows a network of autonomous agents to learn a memory-efficient approximation to the globally optimal regression function based only on their local data stream and message passing with neighbors. In practice, we observe agents are able to stably learn highly accurate and memory-efficient nonlinear statistical models from streaming data. From here, we shift focus to a more challenging class of problems, motivated by the fact that true learning is not just revising predictions based upon data but augmenting behavior over time based on temporal incentives. This goal may be described by Markov Decision Processes (MDPs): at each point, an agent is in some state of the world, takes an action and then receives a reward while randomly transitioning to a new state. The goal of the agent is to select the action sequence to maximize its long-term sum of rewards, but determining how to select this action sequence when both the state and action spaces are infinite has eluded researchers for decades. As a precursor to this feat, we consider the problem of policy evaluation in infinite MDPs, in which we seek to determine the long-term sum of rewards when starting in a given state when actions are chosen according to a fixed distribution called a policy. We reformulate this problem as an RKHS-valued compositional stochastic program and we develop a functional extension of the stochastic quasi-gradient algorithm operating in tandem with the greedy subspace projections mentioned above. We prove convergence with probability 1 to the Bellman fixed point restricted to this function class, and we observe a state-of-the-art trade-off in memory versus Bellman error for the proposed method on the Mountain Car driving task, which bodes well for incorporating policy evaluation into more sophisticated, provably stable reinforcement learning techniques, and in time, developing optimal collaborative multi-agent learning-based control systems.
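To make the memory issue of functional stochastic gradient descent concrete, here is a minimal sketch for one-dimensional regression with a Gaussian kernel. Each update adds one kernel center, so the model grows with the data stream unless it is compressed. Note the substitution: the thesis compresses the model with matching-pursuit-based subspace projections, whereas this sketch simply drops the smallest-magnitude weight once a budget is exceeded. The class name `BudgetedKernelSGD`, the default parameters, and the pruning rule are all assumptions made for illustration.

```python
import math


def gaussian_kernel(x, y, bandwidth=0.5):
    """Gaussian (RBF) kernel on the real line."""
    return math.exp(-((x - y) ** 2) / (2.0 * bandwidth ** 2))


class BudgetedKernelSGD:
    """Functional SGD for squared loss, with naive budget pruning.

    The pruning step is a stand-in (an assumption of this sketch), not the
    matching pursuit projection described in the abstract.
    """

    def __init__(self, step=0.5, reg=0.0, budget=20):
        self.step, self.reg, self.budget = step, reg, budget
        self.centers = []  # kernel centers retained so far
        self.weights = []  # one coefficient per center

    def predict(self, x):
        return sum(w * gaussian_kernel(c, x)
                   for c, w in zip(self.centers, self.weights))

    def update(self, x, y):
        err = self.predict(x) - y
        # Gradient step on 0.5*(f(x)-y)^2 + 0.5*reg*||f||^2:
        # f <- (1 - step*reg) * f - step * err * k(x, .)
        shrink = 1.0 - self.step * self.reg
        self.weights = [shrink * w for w in self.weights]
        self.centers.append(x)
        self.weights.append(-self.step * err)
        # Keep memory bounded by discarding the least influential center.
        if len(self.centers) > self.budget:
            i = min(range(len(self.weights)),
                    key=lambda j: abs(self.weights[j]))
            del self.centers[i]
            del self.weights[i]


if __name__ == "__main__":
    model = BudgetedKernelSGD(budget=15)
    for t in range(300):
        x = (t * 0.37) % 6.0  # deterministic stand-in for a data stream
        model.update(x, math.sin(x))
    print(len(model.centers), model.predict(1.5), math.sin(1.5))
```

The budget makes the memory cost explicit: without the pruning step the number of stored centers equals the number of samples seen, which is the "infinite complexity" the abstract refers to.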

    Some statistical models for high-dimensional data


    Politische Maschinen: Maschinelles Lernen für das Verständnis von sozialen Maschinen (Political Machines: Machine Learning for Understanding Social Machines)

    This thesis investigates human-algorithm interactions in sociotechnological ecosystems. Specifically, it applies machine learning and statistical methods to uncover political dimensions of algorithmic influence in social media platforms and automated decision-making systems. Based on the results, the study discusses the legal, political and ethical consequences of algorithmic implementations.

    Quantum Entanglement in Time

    In this doctoral thesis we provide one of the first theoretical expositions on a quantum effect known as entanglement in time. It can be viewed as an interdependence of quantum systems across time, which is stronger than could ever exist between classical systems. We explore this temporal effect within the study of quantum information and its foundations as well as through relativistic quantum information. An original contribution of this thesis is the design of one of the first applications of entanglement in time. Comment: 271 pages, PhD Thesis (Victoria University of Wellington