12 research outputs found
Rule Extraction on Numeric Datasets Using Hyper-rectangles
When there is a need to understand the data stored in a database, one of the main requirements is being able to extract knowledge in the form of rules. Classification strategies allow extracting rules almost naturally. In this paper, a new classification strategy is presented that uses hyper-rectangles as data descriptors to achieve a model that allows extracting knowledge in the form of classification rules. The participation of an expert in training the model is discussed. Finally, the results obtained using databases from the UCI repository are presented and compared with other existing classification models, showing that the algorithm presented requires fewer computational resources and achieves the same accuracy level and number of extracted rules.
Affiliation: Hasperué, Waldo. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - La Plata; Argentina. Universidad Nacional de La Plata. Facultad de Informática. Instituto de Investigación en Informática Lidi; Argentina.
Affiliation: Lanzarini, Laura Cristina. Universidad Nacional de La Plata. Facultad de Informática. Instituto de Investigación en Informática Lidi; Argentina.
Affiliation: de Giusti, Armando Eduardo. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - La Plata; Argentina. Universidad Nacional de La Plata. Facultad de Informática. Instituto de Investigación en Informática Lidi; Argentina.
GTM: the generative topographic mapping
This thesis describes the Generative Topographic Mapping (GTM), a non-linear latent variable model intended for modelling continuous, intrinsically low-dimensional probability distributions embedded in high-dimensional spaces. It can be seen as a non-linear form of principal component analysis or factor analysis. It also provides a principled alternative to the self-organizing map, a widely established neural network model for unsupervised learning, and resolves many of its associated theoretical problems. An important potential application of the GTM is the visualization of high-dimensional data. Since the GTM is non-linear, the relationship between data and its visual representation may be far from trivial, but a better understanding of this relationship can be gained by computing the so-called magnification factor. In essence, the magnification factor relates the distances between data points, as they appear when visualized, to the actual distances between those data points. There are two principal limitations of the basic GTM model. The first is that the computational effort required grows exponentially with the intrinsic dimensionality of the density model. However, if the intended application is visualization, this will typically not be a problem. The other limitation is the inherent structure of the GTM, which makes it most suitable for modelling moderately curved probability distributions of approximately rectangular shape. When the target distribution is very different from that, the aim of maintaining an 'interpretable' structure, suitable for visualizing data, may come into conflict with the aim of providing a good density model. The fact that the GTM is a probabilistic model means that results from probability theory and statistics can be used to address problems such as model complexity. Furthermore, this framework provides solid ground for extending the GTM to wider contexts than that of this thesis.
Design of Physical System Experiments Using Bayes Linear Emulation and History Matching Methodology with Application to Arabidopsis Thaliana
There are many physical processes within our world which scientists aim to understand. Computer models representing these processes are fundamental to achieving such understanding. Bayes linear emulation is a powerful tool for comprehensively exploring the behaviour of computationally intensive models. History matching is a method for finding the set of inputs to a computer model for which the corresponding model outputs give acceptable matches to observed data, given our state of uncertainty regarding the model itself, the measurements, and, if used, the emulators representing the model. This thesis provides three major developments to the current methodology in this area. First, we develop sequential history matching methodology by splitting the available data into groups and gaining insight about the information obtained from each group. Such insight is then realised through a wide array of novel visualisations. Second, we develop emulation techniques for the case when there are hypersurfaces of input space across which we have essentially perfect knowledge about the model’s behaviour. Finally, we develop the use of history matching methodology as a criterion for the design of physical system experiments. We outline the general framework for design in a history matching setting, before discussing many extensions, including the performance of a comprehensive robustness analysis on our design choice. We outline our novel methodology on a model of hormonal crosstalk in the roots of an Arabidopsis plant.
Electricity Price Time Series Forecasting in Deregulated Markets Using Recurrent Neural Network Based Approaches
Ph.D. (Doctor of Philosophy)
Theoretische Untersuchungen Kovalenter Mechanochemie (Theoretical Investigations of Covalent Mechanochemistry)
This thesis is concerned with computational-chemistry investigations of mechanoresponsive molecules which feature predetermined breaking points (PBPs). Depending on the question at hand, the mechanophoric systems have been approached at different levels of theory. Reactive molecular dynamics (rMD), density functional theory (DFT), second-order Møller-Plesset perturbation theory (MP2) and multireference methods were employed to obtain a complete picture of the mechanochemical reactions.