12 research outputs found

    Rule Extraction on Numeric Datasets Using Hyper-rectangles

    Get PDF
    When there is a need to understand the data stored in a database, one of the main requirements is being able to extract knowledge in the form of rules. Classification strategies allow extracting rules almost naturally. In this paper, a new classification strategy is presented that uses hyper-rectangles as data descriptors to achieve a model that allows extracting knowledge in the form of classification rules. The participation of an expert for training the model is discussed. Finally, the results obtained using the databases from the UCI repository are presented and compared with other existing classification models, showing that the algorithm presented requires less computational resources and achieves the same accuracy level and number of extracted rules.Fil: Hasperué, Waldo. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - La Plata; Argentina. Universidad Nacional de La Plata. Facultad de Informática. Instituto de Investigación en Informática Lidi; ArgentinaFil: Lanzarini, Laura Cristina. Universidad Nacional de La Plata. Facultad de Informática. Instituto de Investigación en Informática Lidi; ArgentinaFil: de Giusti, Armando Eduardo. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - La Plata; Argentina. Universidad Nacional de La Plata. Facultad de Informática. Instituto de Investigación en Informática Lidi; Argentin

    GTM: the generative topographic mapping

    Get PDF
    This thesis describes the Generative Topographic Mapping (GTM) --- a non-linear latent variable model, intended for modelling continuous, intrinsically low-dimensional probability distributions, embedded in high-dimensional spaces. It can be seen as a non-linear form of principal component analysis or factor analysis. It also provides a principled alternative to the self-organizing map --- a widely established neural network model for unsupervised learning --- resolving many of its associated theoretical problems. An important, potential application of the GTM is visualization of high-dimensional data. Since the GTM is non-linear, the relationship between data and its visual representation may be far from trivial, but a better understanding of this relationship can be gained by computing the so-called magnification factor. In essence, the magnification factor relates the distances between data points, as they appear when visualized, to the actual distances between those data points. There are two principal limitations of the basic GTM model. The computational effort required will grow exponentially with the intrinsic dimensionality of the density model. However, if the intended application is visualization, this will typically not be a problem. The other limitation is the inherent structure of the GTM, which makes it most suitable for modelling moderately curved probability distributions of approximately rectangular shape. When the target distribution is very different to that, theaim of maintaining an `interpretable' structure, suitable for visualizing data, may come in conflict with the aim of providing a good density model. The fact that the GTM is a probabilistic model means that results from probability theory and statistics can be used to address problems such as model complexity. Furthermore, this framework provides solid ground for extending the GTM to wider contexts than that of this thesis

    Design of Physical System Experiments Using Bayes Linear Emulation and History Matching Methodology with Application to Arabidopsis Thaliana

    Get PDF
    There are many physical processes within our world which scientists aim to understand. Computer models representing these processes are fundamental to achieving such understanding. Bayes linear emulation is a powerful tool for comprehensively exploring the behaviour of computationally intensive models. History matching is a method for finding the set of inputs to a computer model for which the corresponding model outputs give acceptable matches to observed data, given our state of uncertainty regarding the model itself, the measurements, and, if used, the emulators representing the model. This thesis provides three major developments to the current methodology in this area. We develop sequential history matching methodology by splitting the available data into groups and gaining insight about the information obtained from each group. Such insight is then realised through a wide array of novel visualisations. We develop emulation techniques for the case when there are hypersurfaces of input space across which we have essentially perfect knowledge about the model’s behaviour. Finally, we have developed the use of history matching methodology as criteria for the design of physical system experiments. We outline the general framework for design in a history matching setting, before discussing many extensions, including the performance of a comprehensive robustness analysis on our design choice. We outline our novel methodology on a model of hormonal crosstalk in the roots of an Arabidopsis plant

    Multidimensional Clustering for Spatio-Temporal Data and its Application in Climate Research

    Get PDF

    Theoretische Untersuchungen Kovalenter Mechanochemie

    Get PDF
    This thesis is concerned with computational-chemistry investigations of mechanoresponsive molecules which feature predetermined breaking points (PBPs). The mechanophoric systems have been approached at different levels of theory. Reactive molecular dynamics (rMD), density functional theory (DFT), second order Møller-Plesset perturbation theory (MP2) and multireference methods were employed to obtain a complete picture of the mechanochemical reactions.Die vorliegende Arbeit befasst sich mit der Untersuchung von mechanoresponsiven Molekülen, die molekulare Sollbruchstellen (PBPs) enhalten, mit Methoden der Computerchemie. In Abhängigkeit von den jeweiligen Fragestellungen kamen dabei unterschiedliche Methoden zum Einsatz. Reaktive molekulare Dynamik (rMD), Dichtefunktionaltheorie (DFT), Møller-Plesset Störungstheorie zweiter Ordnung (MP2) und Multireferenzverfahren wurden verwendet, um ein vollständiges Bild der mechanochemischen Reaktionen zu erhalten

    Acta Cybernetica : Volume 19. Number 1.

    Get PDF
    corecore