28 research outputs found
Combining classification algorithms
Doctoral dissertation in Computer Science submitted to the Faculty of Sciences of the University of Porto. The ability of a learning algorithm to induce a good generalization for a given problem depends on the representation language used to generalize the examples. Because different algorithms use different representation languages and search strategies, they explore different spaces and obtain different results. Finding the most suitable representation for the problem at hand is a very active research area. In this dissertation, instead of looking for methods that fit the data using a single representation language, we present a family of algorithms, under the generic name of Cascade Generalization, in which the search space contains models that use different representation languages. The basic idea of the method is to use the learning algorithms in sequence. Each iteration is a two-step process. In the first step, a classifier builds a model. In the second step, the space defined by the attributes is extended by inserting new attributes generated using this model. This constructive process builds attributes in the representation language of the classifier used to build the model. If a classifier later in the sequence uses one of these new attributes to build its model, its representational capacity has been extended. In this way, the restrictions of the representation language of the classifiers used at the higher levels of the sequence are relaxed by incorporating terms from the representation language of the base classifiers. This is the basic methodology underlying the Ltree system and the Cascade Generalization architecture. The method is presented from two perspectives. In the first part, it is presented as a strategy for building multivariate decision trees.
The Ltree system is presented, which uses a linear discriminant as the operator for constructing attributes. ..
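The two-step iteration described in the abstract above (build a model, then extend the attribute space with attributes generated by that model) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: a nearest-centroid scorer stands in for the base classifier, a single-threshold decision stump stands in for the higher-level classifier, and the data are a toy set. It is not the Ltree system.

```python
# Illustrative two-level cascade; the learners below are stand-ins, not Ltree.

def fit_centroids(X, y):
    """Base learner (assumed stand-in): one mean vector per class."""
    sums, counts = {}, {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, value in enumerate(row):
            acc[j] += value
        counts[label] = counts.get(label, 0) + 1
    return {c: [s / counts[c] for s in sums[c]] for c in sums}

def class_scores(centroids, row):
    """Normalised closeness of `row` to each class centroid (sums to 1)."""
    inv = []
    for c in sorted(centroids):
        d2 = sum((a - b) ** 2 for a, b in zip(row, centroids[c]))
        inv.append(1.0 / (1.0 + d2))
    total = sum(inv)
    return [v / total for v in inv]

def extend(X, centroids):
    """Second step of an iteration: append the base model's scores as attributes."""
    return [list(row) + class_scores(centroids, row) for row in X]

def fit_stump(X, y):
    """Higher-level learner: best single-attribute threshold test.

    Returns (training errors, attribute index, threshold, left label, right label).
    """
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            left = [lab for row, lab in zip(X, y) if row[j] <= t]
            right = [lab for row, lab in zip(X, y) if row[j] > t]
            if not left or not right:
                continue
            l_lab = max(set(left), key=left.count)
            r_lab = max(set(right), key=right.count)
            errors = (sum(lab != l_lab for lab in left)
                      + sum(lab != r_lab for lab in right))
            if best is None or errors < best[0]:
                best = (errors, j, t, l_lab, r_lab)
    return best

# Toy data: the two classes lie on either side of an oblique boundary,
# which no single axis-parallel test on the raw attributes can separate.
X = [[0, 1], [1, 2], [2, 3], [3, 4], [1, 0], [2, 1], [3, 2], [4, 3]]
y = [0, 0, 0, 0, 1, 1, 1, 1]

raw_stump = fit_stump(X, y)                          # raw attributes only
centroids = fit_centroids(X, y)                      # step 1: build base model
cascade_stump = fit_stump(extend(X, centroids), y)   # step 2: extended space

print("training errors, raw attributes:     ", raw_stump[0])
print("training errors, extended attributes:", cascade_stump[0])
```

On this toy set the stump chooses one of the appended attributes, so a test that is axis-parallel in the extended space corresponds to an oblique test in the original space. That is the sense in which the higher-level learner's representation language is relaxed.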
Evaluation of pesticide toxicity: a hierarchical QSAR approach to model the acute aquatic toxicity and avian oral toxicity of pesticides
The thesis aimed to extract information relevant to the hazard and risk assessment of pesticides. In particular, quantitative structure-activity relationship (QSAR) approaches have been used to build up a mathematical model able to predict the aquatic acute toxicity, LC50, and the avian oral toxicity, LD50, for pesticides. Ecotoxicological values were collected from several databases, and screened according to quality criteria.
A hierarchical QSAR approach was applied for the prediction of acute aquatic toxicity. Chemical structures were encoded into molecular descriptors by an automated, seamless procedure available within the OpenMolGRID system. Different linear and non-linear regression techniques were used to obtain reliable and thoroughly validated QSARs. The final model was developed by a counter-propagation neural network coupled with genetic algorithms for variable selection. The proposed QSAR is consistent with McFarland's principle for biological activity and makes use of seven molecular descriptors. The model was assessed thoroughly in test (R² = 0.8) and validation sets (R² = 0.72), the y-scrambling test and a sensitivity/stability test.
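The y-scrambling test mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions: a one-descriptor least-squares line stands in for the thesis's neural network model, the data are synthetic, and the helper names (`r_squared`, `y_scramble`) are hypothetical. The idea is that a model fitted to randomly permuted responses should score far worse than the model fitted to the real responses; if it does not, the original fit is likely due to chance correlation.

```python
import random

def fit_line(x, y):
    """Ordinary least squares for a single descriptor: returns (slope, intercept)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def r_squared(x, y):
    """Coefficient of determination of the fitted line on its own training data."""
    slope, intercept = fit_line(x, y)
    mean_y = sum(y) / len(y)
    ss_res = sum((b - (slope * a + intercept)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - mean_y) ** 2 for b in y)
    return 1.0 - ss_res / ss_tot

def y_scramble(x, y, rounds=200, seed=0):
    """Refit the model on permuted responses and collect the resulting R^2 values."""
    rng = random.Random(seed)
    scores = []
    for _ in range(rounds):
        shuffled = list(y)
        rng.shuffle(shuffled)          # break the descriptor/response pairing
        scores.append(r_squared(x, shuffled))
    return scores

# Synthetic data (assumed for illustration): a linear trend plus Gaussian noise.
noise = random.Random(1)
x = [i / 10 for i in range(40)]
y = [2.0 * a + 1.0 + noise.gauss(0, 0.3) for a in x]

real = r_squared(x, y)
scrambled = y_scramble(x, y)
mean_scrambled = sum(scrambled) / len(scrambled)

print("R^2 on real responses:          %.3f" % real)
print("mean R^2 on scrambled responses: %.3f" % mean_scrambled)
```

A large gap between the two figures is the outcome the test looks for.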
The second endpoint considered in this thesis was avian oral toxicity. As before, the chemical description of the compounds was generated automatically by the OpenMolGRID system. The best classification model was chosen on the basis of its performance on a validation set of 19 data points, and was obtained from a support vector machine using 94 data points and nine variables selected by genetic algorithms (training error rate = 0.021, validation error rate = 0.158). The model allowed for a mechanistic estimation of the toxicological action. In fact, several descriptors selected for the final classification model encode the interaction of the pesticides with other molecules. The presence of hetero-atoms, e.g. sulphur atoms, is correlated with the toxicity, and the pool of selected descriptors is generally dependent on the 3D conformation of the structures. These findings suggest that, in the case of avian oral toxicity, pesticides probably exert their toxic action through interaction with some macromolecule and/or protein of the biological system.
Non-Parametric Learning for Monocular Visual Odometry
This thesis addresses the problem of incremental localization from visual information, a scenario commonly known as visual odometry. Current visual odometry algorithms are heavily dependent on camera calibration, using a pre-established geometric model to provide the transformation between input (optical flow estimates) and output (vehicle motion estimates) information. A novel approach to visual odometry is proposed in this thesis where the need for camera calibration, or even for a geometric model, is circumvented by the use of machine learning principles and techniques. A non-parametric Bayesian regression technique, the Gaussian Process (GP), is used to select the most probable transformation function hypothesis from input to output, based on training data collected prior to and during navigation. Besides eliminating the need for a geometric model and traditional camera calibration, this approach also allows for scale recovery even in a monocular configuration, and provides a natural treatment of uncertainties due to the probabilistic nature of GPs. Several extensions to the traditional GP framework are introduced and discussed in depth, and they constitute the core of the contributions of this thesis to the machine learning and robotics community. The proposed framework is tested in a wide variety of scenarios, ranging from urban and off-road ground vehicles to unconstrained 3D unmanned aircraft. The results show a significant improvement over traditional visual odometry algorithms, and also surpass results obtained using other sensors, such as laser scanners and IMUs. The incorporation of these results into a SLAM scenario, using an Exact Sparse Information Filter (ESIF), is shown to decrease global uncertainty by exploiting revisited areas of the environment. Finally, a technique for the automatic segmentation of dynamic objects is presented, as a way to increase the robustness of image information and further improve visual odometry results.
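The role the abstract assigns to the GP, learning an input-to-output mapping from training pairs and attaching an uncertainty to each prediction, can be shown with a minimal standard GP regression sketch. The assumptions are considerable: a scalar toy function stands in for the optical-flow-to-motion mapping, the kernel is a fixed squared-exponential with hand-picked hyperparameters, and none of the thesis's extensions to the GP framework are represented.

```python
import math

def rbf(a, b, length=1.0, variance=1.0):
    """Squared-exponential covariance between two scalar inputs."""
    return variance * math.exp(-0.5 * ((a - b) / length) ** 2)

def cholesky(A):
    """Lower-triangular L with L L^T = A, for a symmetric positive-definite A."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def solve_lower(L, b):
    """Forward substitution: solve L z = b."""
    z = []
    for i in range(len(b)):
        z.append((b[i] - sum(L[i][k] * z[k] for k in range(i))) / L[i][i])
    return z

def solve_lower_transposed(L, z):
    """Back substitution: solve L^T x = z."""
    n = len(z)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (z[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

def gp_predict(x_train, y_train, x_query, noise=1e-2):
    """Posterior mean and variance of the latent function at x_query.

    `noise` is the assumed observation-noise variance added to the diagonal.
    """
    K = [[rbf(a, b) + (noise if i == j else 0.0)
          for j, b in enumerate(x_train)]
         for i, a in enumerate(x_train)]
    L = cholesky(K)
    alpha = solve_lower_transposed(L, solve_lower(L, y_train))  # K^-1 y
    k_star = [rbf(a, x_query) for a in x_train]
    mean = sum(k * a for k, a in zip(k_star, alpha))
    v = solve_lower(L, k_star)
    variance = rbf(x_query, x_query) - sum(t * t for t in v)
    return mean, variance

# Toy training set (assumed): samples of sin(x) on [0, 3].
x_train = [0.5 * i for i in range(7)]
y_train = [math.sin(a) for a in x_train]

mean_near, var_near = gp_predict(x_train, y_train, 1.25)  # inside the data
mean_far, var_far = gp_predict(x_train, y_train, 10.0)    # far from the data

print("near the data: mean=%.3f variance=%.4f" % (mean_near, var_near))
print("far from data: mean=%.3f variance=%.4f" % (mean_far, var_far))
```

The predictive variance is small where training data are dense and reverts to the prior variance far from them, which is the kind of uncertainty estimate a downstream filter can consume.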
Bayesian methods in music modelling
This thesis presents several hierarchical generative Bayesian models of musical signals designed to improve the accuracy of existing multiple pitch detection systems and other musical signal processing applications whilst remaining feasible for real-time computation. At the lowest level the signal is modelled as a set of overlapping sinusoidal basis functions. The parameters of these basis functions are built into a prior framework based on principles known from musical theory and the physics of musical instruments. The model of a musical note optionally includes phenomena such as frequency and amplitude modulations, damping, volume, timbre and inharmonicity. The occurrence of note onsets in a performance of a piece of music is controlled by an underlying tempo process and the alignment of the timings to the underlying score of the music.
A variety of applications are presented for these models under differing inference constraints. Where full Bayesian inference is possible, reversible-jump Markov Chain Monte Carlo is employed to estimate the number of notes and partial frequency components in each frame of music. We also use approximate techniques such as model selection criteria and variational Bayes methods for inference in situations where computation time is limited or the amount of data to be processed is large. For the higher level score parameters, greedy search and conditional modes algorithms are found to be sufficiently accurate.
We emphasize the links between the models and inference algorithms developed in this thesis and those in existing and parallel work, and demonstrate the effects of making modifications to these models both theoretically and by means of experimental results.
Bayesian Gaussian Process Models: PAC-Bayesian Generalisation Error Bounds and Sparse Approximations
Non-parametric models and techniques enjoy a growing popularity in the field of machine learning, and among these Bayesian inference for Gaussian process (GP) models has recently received significant attention. We feel that GP priors should be part of the standard toolbox for constructing models relevant to machine learning in the same way as parametric linear models are, and the results in this thesis help to remove some obstacles on the way towards this goal. In the first main chapter, we provide a distribution-free finite sample bound on the difference between generalisation and empirical (training) error for GP classification methods. While the general theorem (the PAC-Bayesian bound) is not new, we give a much simplified and somewhat generalised derivation and point out the underlying core technique (convex duality) explicitly. Furthermore, the application to GP models is novel (to our knowledge). A central feature of this bound is that its quality depends crucially on task knowledge being encoded faithfully in the model and prior distributions, so there is a mutual benefit between a sharp theoretical guarantee and empirically well-established statistical practices. Extensive simulations on real-world classification tasks indicate an impressive tightness of the bound, in spite of the fact that many previous bounds for related kernel machines fail to give non-trivial guarantees in this practically relevant regime. In the second main chapter, sparse approximations are developed to address the problem of the unfavourable scaling of most GP techniques with large training sets. Due to its high importance in practice, this problem has received a lot of attention recently. We demonstrate the tractability and usefulness of simple greedy forward selection with information-theoretic criteria previously used in active learning (or sequential design) and develop generic schemes for automatic model selection with many (hyper)parameters. 
We suggest two new generic schemes and evaluate some of their variants on large real-world classification and regression tasks. These schemes and their underlying principles (which are clearly stated and analysed) can be applied to obtain sparse approximations for a wide range of GP models far beyond the special cases we studied here.
Pygmalion's Long Shadow - Determinants and Outcomes of Teachers' Evaluations
This volume comprises two papers analyzing the predictors of teachers' evaluations, and another two examining the outcomes of those evaluations. In the underlying data, the Cologne High School Panel (CHiSP), teachers had been asked which of their 10th-grade students they considered suitable for academic studies and which they did not.
The first paper models these evaluations as an outcome of students' cognitive ability in terms of intelligence scores, their average grades, their parents' social class, and their aspirations. Structural equation modeling is used to control for both measurement error and indirect effects of latent and observed variables.
The second paper adds another level of analysis by investigating to what extent teachers' evaluations depend on reference-group effects in the classroom. Contextual effects of both classroom achievement and social composition as well as their interaction with student achievement and teachers' frame of reference (in terms of grading concepts) are analyzed by three-level cross-classified multilevel models.
The third paper uses Esser's (1999) subjective expected utility theory to develop a formal theoretical model of self-fulfilling prophecy effects on students' educational transitions. Teachers' expectations are supposed to affect students' subjective expected probability of educational success, and thereby their educational transition propensities. Analyses control for both sample selection bias and unobserved heterogeneity.
Finally, the fourth paper models decreasing self-fulfilling effects over a sequence of educational transitions as a result of actors' belief updating. Hypotheses are tested by means of sequential logit modeling amended by a variety of sensitivity analyses.
The four papers are preceded by an elaborate introduction that aims to approximate the underlying causes and effects of all research questions by unveiling the respective social mechanisms.
Robot environment learning with a mixed-linear probabilistic state-space model
This thesis proposes the use of a probabilistic state-space model with mixed-linear dynamics for learning to predict a robot's experiences. It is motivated by a desire to bridge the gap between traditional models with predefined objective semantics on the one hand, and the biologically-inspired "black box" behavioural paradigm on the other. A novel EM-type algorithm for the model is presented, which is less computationally demanding than the Monte Carlo techniques developed for use in (for example) visual applications. The algorithm's E-step is slightly approximate, but an extension is described which would in principle make it asymptotically correct. Investigation using synthetically sampled data shows that the uncorrected E-step can in any case make correct inferences about quite complicated systems.
Results collected from two simulated mobile robot environments support the claim that mixed-linear models can capture both discontinuous and continuous structure in the world in an intuitively natural manner; while they proved to perform only slightly better than simpler autoregressive hidden Markov models on these simple tasks, it is possible to claim tentatively that they might scale more effectively to environments in which trends over time played a larger role. Bayesian confidence regions, easily obtained from the mixed-linear model, proved to be an effective guard for preventing it from making over-confident predictions outside its area of competence.
A section on future extensions discusses how the model's easy invertibility could be harnessed to the ultimate aim of choosing actions, from a continuous space of possibilities, which maximise the robot's expected payoff over several steps into the future.