
    Scalable kernel methods using randomized numerical linear algebra

    Master's thesis document, illustrations, tables. Kernel methods are a family of machine learning algorithms that use a kernel function to implicitly represent data in a high-dimensional space, where linear optimization systems lead to non-linear relationships in the original data space, thereby uncovering complex patterns in the data. The main disadvantage of these methods is their poor scalability: most kernel-based algorithms must compute a matrix whose size is quadratic in the number of data samples. This limitation has caused kernel methods to be avoided on large-scale datasets in favor of approaches such as deep learning. However, kernel methods remain relevant for better understanding deep learning methods and can improve them through hybrid settings that combine the best of both worlds. The main goal of this thesis is to explore efficient ways to use kernel methods without a large loss in accuracy. To this end, different approaches are presented and formulated, among which we propose the learning-on-a-budget strategy, presented in detail from a theoretical perspective and including a novel budget-selection procedure. In the experimental evaluation, this strategy shows competitive performance and improvements over the standard learning-on-a-budget method, especially when smaller approximations are selected, which are the most useful in large-scale environments. (Text taken from the source.) Maestría. Magíster en Ingeniería - Ingeniería de Sistemas y Computación. Ciencias de la computación.
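
    As a rough illustration of the learning-on-a-budget idea described in this abstract, the sketch below restricts kernel evaluations to a fixed budget set of m << n points, so that training cost drops from quadratic to linear in the number of samples. It is a minimal sketch, not the thesis's actual algorithm: the uniform budget selection and the ridge-style solve are stand-ins for the more refined procedures the work proposes.

        # Minimal budgeted kernel regression sketch (illustrative; the budget
        # selection here is plain uniform sampling, not the thesis's procedure).
        import numpy as np

        def rbf_kernel(A, B, gamma=1.0):
            # Pairwise squared distances via ||a-b||^2 = ||a||^2 + ||b||^2 - 2 a.b
            sq = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2 * A @ B.T
            return np.exp(-gamma * sq)

        def fit_on_budget(X, y, m=100, lam=1e-3, seed=0):
            rng = np.random.default_rng(seed)
            budget = X[rng.choice(len(X), size=m, replace=False)]
            K = rbf_kernel(X, budget)              # n x m instead of n x n
            alpha = np.linalg.solve(K.T @ K + lam * np.eye(m), K.T @ y)
            return budget, alpha

        def predict(budget, alpha, X_new):
            return rbf_kernel(X_new, budget) @ alpha

    With m fixed, memory and training time grow linearly with n, which is the trade-off the abstract points to when it notes that smaller approximations matter most at large scale.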

    Extending Structural Learning Paradigms for High-Dimensional Machine Learning and Analysis

    Structure-based machine-learning techniques are frequently used in extensions of supervised learning, such as active, semi-supervised, multi-modal, and multi-task learning. A common step in many successful methods is a structure-discovery process that is made possible through the addition of new information, which can be user feedback, unlabeled data, data from similar tasks, alternate views of the problem, etc. Learning paradigms developed in the above-mentioned fields have led to some extremely flexible, scalable, and successful multivariate analysis approaches. This success and flexibility offer opportunities to expand the use of machine-learning paradigms to more complex analyses. In particular, while information is often readily available concerning complex problems, the relationships among the pieces of information rarely follow the simple labeled-example-based setup that supervised learning is built upon. Even when it is possible to incorporate additional data in such forms, the result is often an explosion in the dimensionality of the input space, such that both sample complexity and computational complexity can limit real-world success. In this work, we review many of the latest structural-learning approaches for dealing with sample complexity. We expand their use to generate new paradigms for combining some of these learning strategies to address more complex problem spaces. We give an overview of extreme-scale data analysis problems where sample complexity is a much more limiting factor than computational complexity, and outline new structural-learning approaches for dealing jointly with both. We develop and demonstrate a method for dealing with sample complexity in complex systems that leads to a more scalable algorithm than other approaches to large-scale multivariate analysis. This new approach reflects the underlying problem structure more accurately by using interdependence to address sample complexity, rather than ignoring it for the sake of tractability.
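
    One concrete instance of the structure-discovery step described above is folding unlabeled data into a supervised learner via self-training. The sketch below is illustrative only and is not the paper's method; the classifier choice, confidence threshold, and round count are all assumptions.

        # Minimal self-training sketch: pseudo-label the most confident
        # unlabeled points and retrain (illustrative, not the paper's method).
        import numpy as np
        from sklearn.linear_model import LogisticRegression

        def self_train(X_lab, y_lab, X_unlab, threshold=0.95, rounds=5):
            model = LogisticRegression(max_iter=1000)
            X, y = X_lab.copy(), y_lab.copy()
            for _ in range(rounds):
                model.fit(X, y)
                if len(X_unlab) == 0:
                    break
                proba = model.predict_proba(X_unlab)
                confident = proba.max(axis=1) >= threshold
                if not confident.any():
                    break
                # Adopt high-confidence predictions as pseudo-labels.
                X = np.vstack([X, X_unlab[confident]])
                y = np.concatenate([y, model.classes_[proba[confident].argmax(axis=1)]])
                X_unlab = X_unlab[~confident]
            return model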

    Efficient Asymmetric Co-Tracking using Uncertainty Sampling

    Adaptive tracking-by-detection approaches are popular for tracking arbitrary objects. They treat the tracking problem as a classification task and use online learning techniques to update the object model. However, these approaches depend heavily on the efficiency and effectiveness of their detectors. Evaluating a massive number of samples for each frame (e.g., obtained by a sliding window) forces the detector to trade accuracy in favor of speed. Furthermore, misclassification of borderline samples in the detector introduces accumulating errors in tracking. In this study, we propose a co-tracking framework based on the efficient cooperation of two detectors: a rapid adaptive exemplar-based detector and another more sophisticated but slower detector with a long-term memory. The sample labeling and co-learning of the detectors are conducted by an uncertainty sampling unit, which improves the speed and accuracy of the system. We also introduce a budgeting mechanism that prevents unbounded growth in the number of examples in the first detector, maintaining its rapid response. Experiments demonstrate the efficiency and effectiveness of the proposed tracker against its baselines and its superior performance against state-of-the-art trackers on various benchmark videos.
    Comment: Submitted to IEEE ICSIPA'201
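
    The routing role of the uncertainty sampling unit can be pictured with a short sketch. This is a stand-in, not the paper's implementation: the detectors are opaque callables, and the uncertainty band (0.4, 0.6) is an assumed threshold.

        # Illustrative uncertainty-sampling router between a fast and a slow
        # detector (stand-in code; thresholds and detector APIs are assumed).
        import numpy as np

        def co_track_scores(samples, fast_detector, slow_detector, low=0.4, high=0.6):
            scores = np.asarray(fast_detector(samples))   # cheap score in [0, 1]
            uncertain = (scores > low) & (scores < high)  # borderline samples only
            # Route uncertain samples to the slower long-term detector; in the
            # paper these exchanges also drive the co-learning of both detectors.
            if uncertain.any():
                scores[uncertain] = slow_detector(samples[uncertain])
            return scores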

    Sharp analysis of low-rank kernel matrix approximations

    We consider supervised learning problems within the positive-definite kernel framework, such as kernel ridge regression, kernel logistic regression, or the support vector machine. With kernels leading to infinite-dimensional feature spaces, a common practical limitation is the necessity of computing the kernel matrix, which most frequently leads to algorithms with running time at least quadratic in the number of observations n, i.e., O(n^2). Low-rank approximations of the kernel matrix are often considered, as they reduce the running time complexity to O(p^2 n), where p is the rank of the approximation. The practicality of such methods thus depends on the required rank p. In this paper, we show that in the context of kernel ridge regression, for approximations based on a random subset of columns of the original kernel matrix, the rank p may be chosen to be linear in the degrees of freedom associated with the problem, a quantity which is classically used in the statistical analysis of such methods and is often seen as the implicit number of parameters of non-parametric estimators. This result enables simple algorithms that have sub-quadratic running time complexity but provably exhibit the same predictive performance as existing algorithms, for any given problem instance and not only in worst-case situations.
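
    A minimal sketch of the column-sampling approximation analyzed in this paper is given below, in the style of Nyström-regularized kernel ridge regression: p random columns of the kernel matrix yield a rank-p surrogate, and the regression solve involves only p x p systems. The uniform column sampling and the jitter term are implementation assumptions, not details from the paper.

        # Nystrom-style kernel ridge regression on p sampled columns
        # (sketch under assumptions; uniform sampling, fixed jitter).
        import numpy as np

        def nystrom_krr(X, y, kernel, p=200, lam=1e-3, seed=0):
            rng = np.random.default_rng(seed)
            idx = rng.choice(len(X), size=p, replace=False)
            C = kernel(X, X[idx])                  # n x p block of kernel columns
            W = C[idx]                             # p x p block K(idx, idx)
            # Solve the reduced p x p system instead of the full n x n one;
            # predictions are kernel(X_new, X[idx]) @ beta.
            A = C.T @ C + lam * len(X) * W + 1e-10 * np.eye(p)
            beta = np.linalg.solve(A, C.T @ y)
            return idx, beta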