
    Healing the Relevance Vector Machine through Augmentation

    The Relevance Vector Machine (RVM) is a sparse approximate Bayesian kernel method. It provides full predictive distributions for test cases. However, the predictive uncertainties have the unintuitive property that they get smaller the further you move away from the training cases. We give a thorough analysis. Inspired by the analogy to non-degenerate Gaussian Processes, we suggest augmentation to solve the problem. The purpose of the resulting model, RVM*, is primarily to corroborate the theoretical and experimental analysis. Although RVM* could be used in practical applications, it is no longer a truly sparse model. Experiments show that sparsity comes at the expense of worse predictive distributions.
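    To make the shrinking-uncertainty effect concrete, here is a minimal numerical sketch (not the paper's code; all values are illustrative). The RVM predictive variance has the form sigma^2 + phi(x)^T Sigma phi(x), where phi stacks localized basis functions centred on the relevance vectors; far from the data, phi(x) vanishes and the variance collapses to the bare noise level.

    ```python
    # Illustrative sketch of why an RVM's predictive variance shrinks
    # far from the training inputs (all quantities below are made up).
    import numpy as np

    def rbf(x, centers, length=1.0):
        # Design matrix of RBF basis functions centred on the relevance vectors.
        return np.exp(-0.5 * (x[:, None] - centers[None, :]) ** 2 / length**2)

    centers = np.array([-1.0, 0.0, 1.0])   # pretend relevance vectors
    Sigma = np.eye(len(centers)) * 0.5     # posterior weight covariance (illustrative)
    noise = 0.1                            # noise variance sigma^2

    x_test = np.array([0.0, 2.0, 5.0, 10.0])
    Phi = rbf(x_test, centers)
    # Predictive variance: sigma^2 + diag(Phi Sigma Phi^T)
    pred_var = noise + np.einsum('ij,jk,ik->i', Phi, Sigma, Phi)
    print(pred_var)  # decays toward the bare noise level 0.1 as x moves away
    ```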

    Understanding and Comparing Scalable Gaussian Process Regression for Big Data

    As a non-parametric Bayesian model that produces informative predictive distributions, the Gaussian process (GP) has been widely used in various fields such as regression, classification and optimization. The cubic complexity of the standard GP, however, leads to poor scalability, which poses challenges in the era of big data. Hence, various scalable GPs have been developed in the literature to improve scalability while retaining desirable prediction accuracy. This paper investigates the methodological characteristics and performance of representative global and local scalable GPs, including sparse approximations and local aggregations, from four main perspectives: scalability, capability, controllability and robustness. Numerical experiments on two toy examples and five real-world datasets with up to 250K points offer the following findings. In terms of scalability, most of the scalable GPs have a time complexity that is linear in the training size. In terms of capability, the sparse approximations capture long-term spatial correlations, while the local aggregations capture local patterns but suffer from over-fitting in some scenarios. In terms of controllability, we can improve the performance of sparse approximations by simply increasing the inducing size, but this is not the case for local aggregations. In terms of robustness, local aggregations are robust to various initializations of hyperparameters thanks to the local attention mechanism. Finally, we highlight that a proper hybrid of global and local scalable GPs may be a promising way to improve both model capability and scalability for big data.
    (25 pages, 15 figures; preprint submitted to KB)
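    As a hedged sketch of the sparse-approximation family the abstract refers to, the following subset-of-regressors style predictor uses m inducing inputs Z so the dominant training cost becomes O(n m^2), i.e. linear in the training size n; the data, kernel, and inducing-point placement below are assumptions for illustration, not the paper's setup.

    ```python
    # Illustrative subset-of-regressors style sparse GP (not the paper's code).
    import numpy as np

    def rbf_kernel(A, B, length=1.0):
        return np.exp(-0.5 * (A[:, None] - B[None, :]) ** 2 / length**2)

    rng = np.random.default_rng(0)
    X = rng.uniform(-3, 3, 500)            # n = 500 training inputs
    y = np.sin(X) + 0.1 * rng.normal(size=500)
    Z = np.linspace(-3, 3, 20)             # m = 20 inducing inputs
    noise = 0.01

    Kzz = rbf_kernel(Z, Z)
    Kzx = rbf_kernel(Z, X)
    # Solve the m x m system (Kzx Kxz + noise * Kzz) w = Kzx y -- cheap in n.
    A = Kzx @ Kzx.T + noise * Kzz
    w = np.linalg.solve(A, Kzx @ y)

    x_star = np.array([0.5])
    mean = rbf_kernel(x_star, Z) @ w       # sparse predictive mean
    print(mean)                            # roughly sin(0.5)
    ```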

    New Fuzzy Support Vector Machine for the Class Imbalance Problem in Medical Datasets Classification

    In medical dataset classification, the support vector machine (SVM) is considered one of the most successful methods. However, most real-world medical datasets contain outliers/noise, and the data often suffer from class imbalance. In this paper, a fuzzy support vector machine (FSVM) for the class imbalance problem (called FSVM-CIP) is presented, which can be seen as a modified class of FSVM obtained by extending manifold regularization and assigning two misclassification costs, one per class. The proposed FSVM-CIP can handle the class imbalance problem in the presence of outliers/noise and enhance the locality maximum margin. Five real-world medical datasets from the UCI medical database (breast, heart, hepatitis, BUPA liver, and Pima diabetes) are employed to illustrate the presented method. Experimental results on these datasets show that FSVM-CIP outperforms, or is comparable to, the competing methods.
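    The two ingredients the abstract names, class-dependent misclassification costs and fuzzy per-sample memberships that down-weight likely outliers, can be sketched with scikit-learn's SVC as a stand-in; this is not the authors' FSVM-CIP implementation, and the synthetic data, membership rule, and cost ratio below are assumptions.

    ```python
    # Stand-in sketch of a cost-sensitive fuzzy SVM (not FSVM-CIP itself).
    import numpy as np
    from sklearn.svm import SVC

    rng = np.random.default_rng(0)
    X_major = rng.normal([0, 0], 1.0, size=(200, 2))   # majority class
    X_minor = rng.normal([2, 2], 1.0, size=(20, 2))    # minority class
    X = np.vstack([X_major, X_minor])
    y = np.array([0] * 200 + [1] * 20)

    def memberships(X, y):
        # Fuzzy membership: points nearer their class centroid get weight
        # nearer 1; distant (possibly noisy) points are down-weighted.
        w = np.empty(len(y))
        for c in np.unique(y):
            d = np.linalg.norm(X[y == c] - X[y == c].mean(axis=0), axis=1)
            w[y == c] = 1.0 - d / (d.max() + 1e-9)
        return w

    # Two misclassification costs: the minority class is penalized harder.
    clf = SVC(kernel='rbf', class_weight={0: 1.0, 1: 10.0})
    clf.fit(X, y, sample_weight=memberships(X, y))
    print(clf.predict([[2.0, 2.0]]))  # typically [1]
    ```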

    Toward robust and efficient physically-based rendering

    Physically-based rendering is used for design, illustration and computer animation. It produces photorealistic images by solving the equations that describe how light travels in a scene. Although these equations have been known for a long time, and many light-simulation algorithms have been developed to solve them, none can handle every possible scene efficiently. Instead of trying to develop a new light-simulation algorithm, we propose to enhance the robustness of most methods used today and/or likely to be developed in the years to come. We do this by first identifying the sources of non-robustness in a physically-based rendering engine, and then addressing them with specific algorithms. The result is a set of methods drawing on different mathematical and algorithmic tools, each aiming at improving a different part of a rendering engine. We also investigate how current hardware architectures can be used to their maximum to obtain faster algorithms, without adding approximations. Although the contributions presented in this dissertation are meant to be combined, each of them can be used on its own: they have been designed to be technically independent of each other.
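    For reference, "the equations that describe how light travels" is presumably the rendering equation (Kajiya, 1986), sketched below; the notation (f_r the BRDF, n the surface normal at x) is assumed, as the abstract does not spell it out.

    ```latex
    % Rendering equation: outgoing radiance at point x in direction w_o
    % equals emitted radiance plus all scattered incoming light.
    L_o(x, \omega_o) = L_e(x, \omega_o)
      + \int_{\Omega} f_r(x, \omega_i, \omega_o)\,
        L_i(x, \omega_i)\, (\omega_i \cdot n)\, \mathrm{d}\omega_i
    ```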

    Improved kernel methods for classification

    Ph.D. thesis (Doctor of Philosophy).

    Automação de classificador SVM para aplicação em projetos de consultoria de gestão

    Master's dissertation, Universidade de Brasília, Instituto de Ciências Exatas, Departamento de Ciência da Computação, 2019.
    This work proposes a prototype tool to assist the consultants of a business management consultancy in better understanding their clients' problems, as well as in making decisions and proposing solutions. This is done by automating the data mining process, which can then be carried out with little need for user interaction. The tool's design was based on concepts and studies of available automated machine learning tools, supported by a wide bibliographic review. From these studies, it was possible to structure the logic of the tool and its functionalities. This logic follows some of the steps of CRISP-DM, including data understanding, data preparation, modeling and evaluation. The tool's applicability was validated using public databases. The results show that, using the tool, it is possible to build consistent models even with little knowledge of data mining.
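    A hypothetical sketch of the kind of CRISP-DM-style automation the abstract describes, using scikit-learn as a stand-in (not the dissertation's actual tool): preparation, SVM modeling with automated hyperparameter search, and evaluation on a public dataset. The dataset and search grid are assumptions.

    ```python
    # Sketch of an automated data-preparation -> SVM -> evaluation pipeline.
    from sklearn.datasets import load_breast_cancer
    from sklearn.model_selection import GridSearchCV, train_test_split
    from sklearn.pipeline import Pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import SVC

    X, y = load_breast_cancer(return_X_y=True)          # public database
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

    pipe = Pipeline([('scale', StandardScaler()),       # data preparation
                     ('svm', SVC())])                   # modeling
    search = GridSearchCV(pipe,                         # automated tuning
                          {'svm__C': [0.1, 1, 10],
                           'svm__gamma': ['scale', 0.01]},
                          cv=5)
    search.fit(X_tr, y_tr)
    print(search.score(X_te, y_te))                     # evaluation
    ```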

    Kernel Methods for Machine Learning with Life Science Applications


    A Practical and Conceptual Framework for Learning in Control

    We propose a fully Bayesian approach for efficient reinforcement learning (RL) in Markov decision processes with continuous-valued state and action spaces when no expert knowledge is available. Our framework is based on well-established ideas from statistics and machine learning and learns fast since it carefully models, quantifies, and incorporates available knowledge when making decisions. The key ingredient of our framework is a probabilistic model, which is implemented using a Gaussian process (GP), a distribution over functions. In the context of dynamic systems, the GP models the transition function. By considering all plausible transition functions simultaneously, we reduce model bias, a problem that frequently occurs when deterministic models are used. Due to its generality and efficiency, our RL framework can be considered a conceptual and practical approach to learning models and controllers when expert knowledge is difficult to obtain or not available.
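    A minimal sketch of the key ingredient, not the authors' implementation: a GP as a probabilistic model of the transition function x' = f(x, u). Predicting with uncertainty (return_std) is what lets such a framework average over plausible dynamics rather than commit to one deterministic model; the toy dynamics and kernel choice are assumptions.

    ```python
    # GP transition model sketch: x' = f(x, u) learned with uncertainty.
    import numpy as np
    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.gaussian_process.kernels import RBF, WhiteKernel

    rng = np.random.default_rng(0)
    states = rng.uniform(-1, 1, (50, 1))
    actions = rng.uniform(-1, 1, (50, 1))
    next_states = np.sin(states + actions).ravel()      # unknown dynamics

    XU = np.hstack([states, actions])                   # inputs (x, u)
    gp = GaussianProcessRegressor(RBF() + WhiteKernel(), normalize_y=True)
    gp.fit(XU, next_states)

    mean, std = gp.predict([[0.2, 0.1]], return_std=True)
    print(mean, std)   # predicted next state with model uncertainty
    ```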