19 research outputs found

    Modelos híbridos de aprendizaje basados en instancias y reglas para Clasificación Monotónica (Hybrid instance-based and rule-based learning models for monotonic classification)

    In supervised classification problems, the response attribute depends on certain explanatory input attributes. In many real problems the response attribute takes ordinal values that should increase when some of the explanatory attributes increase. These are called classification problems with monotonicity constraints. In this thesis, we review the monotonic classifiers proposed in the literature and formalize the theory of nested generalized exemplar learning to tackle monotonic classification. We propose two algorithms: a greedy one, which requires monotonic data, and one based on evolutionary algorithms, which can handle imperfect data containing monotonic violations among the instances. Both improve on the state of the art in accuracy, in the non-monotonicity index of the predictions, and in model simplicity. Tesis Univ. Jaén. Departamento INFORMÁTIC

    Similarity-based and Iterative Label Noise Filters for Monotonic Classification

    Monotonic ordinal classification has received increasing interest in recent years. Building monotone models for these problems usually requires datasets that satisfy monotonic relationships among the samples. When the monotonic relationships are not met, changing the labels may be a viable option, but the risk is high: wrong label changes would completely change the information contained in the data. In this work, we tackle the construction of monotone datasets by removing the wrong or noisy examples that violate monotonicity restrictions. We propose two monotonic noise filtering algorithms to preprocess the ordinal datasets and improve the monotonic relations between instances. The experiments are carried out over eleven ordinal datasets and show that applying the proposed filters improves prediction capabilities across different levels of noise.
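
    As a rough illustration of what "violating monotonicity restrictions" means, the sketch below lists pairs of instances where one instance is no larger than the other on every feature yet carries a strictly higher label. It is not either of the filters proposed in this work; the function names and the toy data are hypothetical, and all features are assumed to be monotonically increasing with the label.

```python
from itertools import combinations

def dominates(a, b):
    """True if every feature of a is <= the matching feature of b."""
    return all(x <= y for x, y in zip(a, b))

def monotonic_violations(X, y):
    """Return index pairs (i, j) where X[i] <= X[j] feature-wise but y[i] > y[j]."""
    pairs = []
    for i, j in combinations(range(len(X)), 2):
        if dominates(X[i], X[j]) and y[i] > y[j]:
            pairs.append((i, j))
        elif dominates(X[j], X[i]) and y[j] > y[i]:
            pairs.append((j, i))
    return pairs

# Toy data (hypothetical): instance 1 is dominated by instance 3
# but has a higher label, so the pair (1, 3) breaks monotonicity.
X = [(1, 1), (2, 2), (3, 1), (3, 3)]
y = [0, 2, 1, 1]
violations = monotonic_violations(X, y)
print(violations)
```

    A filter could use such a list to decide which instances to drop, for example those involved in the most violations; that selection rule is an assumption here, not something the abstract specifies.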

    Techniques for data pattern selection and abstraction

    This thesis concerns the problem of prototype reduction in instance-based learning. In order to deal with problems such as storage requirements, sensitivity to noise and computational complexity, various algorithms have been presented that condense the number of stored prototypes while maintaining competent classification accuracy. Instance selection, which recovers a smaller subset of the original training set, is the most widely used technique for instance reduction. However, prototype abstraction, which generates new prototypes to replace the initial ones, has also gained a lot of interest recently. The major contribution of this work is the proposal of four novel frameworks for performing prototype reduction: the Class Boundary Preserving algorithm (CBP), a hybrid method that uses both selection and generation of prototypes; Instance Seriation for Prototype Abstraction (ISPA), which is an abstraction algorithm; and two selective techniques, Spectral Instance Reduction (SIR) and Direct Weight Optimization (DWO). CBP is a multi-stage method based on a simple heuristic that is very effective in identifying samples close to class borders. A noise filter removes harmful instances, while the heuristic determines the geometrical distribution of patterns around every instance. Together with the concepts of nearest enemy pairs and mean shift clustering, this algorithm decides on the final set of retained prototypes. DWO is a selection model whose output set of prototypes is decided by a set of binary weights. These weights are computed according to an objective function composed of the ratio between the nearest friend and nearest enemy of every sample. In order to obtain good quality results, DWO is optimized using a genetic algorithm. ISPA is an abstraction technique that employs the concept of data seriation to organize instances in an arrangement that favours merging between them. As a result, a new set of prototypes is created.
    Results show that CBP, SIR and DWO, the three major algorithms presented in this thesis, are competent and efficient in terms of at least one of the two basic objectives, classification accuracy and condensation ratio. The comparison against other successful condensation algorithms illustrates the competitiveness of the proposed models. The SIR algorithm presents a set of border discriminating features (BDFs) that depicts the local distribution of friends and enemies of all samples. These are then used along with spectral graph theory to partition the training set into border and internal instances.
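
    The nearest friend and nearest enemy quantities mentioned for DWO can be sketched as follows. This is only an illustration of the ratio itself, assuming Euclidean distance; the thesis's actual objective function and its genetic optimization are not given in the abstract, and the function name and the use of infinity for undefined cases are hypothetical choices.

```python
import math

def friend_enemy_ratios(X, y):
    """For each sample, distance to nearest same-class sample (friend)
    divided by distance to nearest other-class sample (enemy)."""
    ratios = []
    for i in range(len(X)):
        friends = [math.dist(X[i], X[j]) for j in range(len(X))
                   if j != i and y[j] == y[i]]
        enemies = [math.dist(X[i], X[j]) for j in range(len(X))
                   if y[j] != y[i]]
        # Arbitrary convention for this sketch: report infinity when the
        # ratio is undefined (no friend, no enemy, or an enemy at distance 0).
        if not friends or not enemies or min(enemies) == 0:
            ratios.append(math.inf)
            continue
        ratios.append(min(friends) / min(enemies))
    return ratios

# Toy data (hypothetical): two classes on a line.
X = [(0, 0), (1, 0), (4, 0), (5, 0)]
y = [0, 0, 1, 1]
ratios = friend_enemy_ratios(X, y)
print(ratios)
```

    A small ratio means a sample sits well inside its own class; a ratio near or above 1 suggests it lies close to a class border.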

    Low-level interpretability and high-level interpretability: a unified view of data-driven interpretable fuzzy system modelling

    This paper aims at providing an in-depth overview of designing interpretable fuzzy inference models from data within a unified framework. The objective of complex system modelling is to develop reliable and understandable models that let humans gain insight into complex real-world systems whose first-principle models are unknown. Because system behaviour can be described naturally as a series of linguistic rules, data-driven fuzzy modelling has become an attractive and widely used paradigm for this purpose. However, fuzzy models constructed from data by adaptive learning algorithms usually suffer from a loss of model interpretability. Model accuracy and interpretability are two conflicting objectives, so preserving interpretability during adaptation in data-driven fuzzy system modelling is a challenging task, which has received much attention in the fuzzy system modelling community. In order to clearly discriminate the different roles of fuzzy sets, input variables, and other components in achieving an interpretable fuzzy model, this paper first proposes a taxonomy of fuzzy model interpretability in terms of low-level interpretability and high-level interpretability. The low-level interpretability of fuzzy models refers to interpretability achieved by optimizing the membership functions in terms of semantic criteria on the fuzzy set level, while the high-level interpretability refers to interpretability obtained by dealing with the coverage, completeness, and consistency of the rules in terms of criteria on the fuzzy rule level. Some criteria for low-level interpretability and high-level interpretability are identified, respectively. Different data-driven fuzzy modelling techniques in the literature focusing on the interpretability issues are reviewed and discussed from the perspective of low-level interpretability and high-level interpretability.
    Furthermore, some open problems about interpretable fuzzy models are identified and some potential new research directions on fuzzy model interpretability are also suggested. Crown Copyright © 2008

    Bandits on graphs and structures

    We investigate the structural properties of certain sequential decision-making problems with limited feedback (bandits) in order to bring the known algorithmic solutions closer to practical use. In the first part, we put a special emphasis on structures that can be represented as graphs on actions; in the second part, we study large action spaces that can be of exponential size in the number of base actions or even infinite. We show how to take advantage of structures over the actions and (provably) learn faster.

    LIPIcs, Volume 258, SoCG 2023, Complete Volume


    Mathematical modelling of GPCR-mediated calcium signalling

    Ca2+ is an important messenger which mediates several physiological functions, including muscle contraction, fertilisation, heart regulation and gene transcription. One major way its cytosolic level is raised is via G-protein coupled receptor (GPCR)-mediated release from intracellular stores. GPCRs are the target of approximately 50% of all drugs in clinical use. Hence, understanding the underlying mechanisms of signalling in this pathway could lead to improved therapy in disease conditions associated with abnormal Ca2+ signalling, and to the identification of new drug targets. To gain such insight, this thesis builds and analyses a detailed mathematical model of key processes leading to Ca2+ mobilisation. Ca2+ signalling is considered in the particular context of the M3 muscarinic receptor system. Guided by available data, the Ca2+ mobilisation model is assembled, first by analysing a base G-protein activation model, and subsequently by extending it with downstream details. Computationally efficient designs of a global parameter sensitivity analysis method are used to identify the key controlling parameters with respect to the main features of the Ca2+ data. The underlying mechanism behind the experimentally observed, rapid, amplified Ca2+ response is shown to be a rapid rate of inositol trisphosphate (IP3) formation from phosphatidylinositol 4,5-bisphosphate (PIP2) hydrolysis. Using the same results, potential drug targets (apart from the GPCR) are identified, including the sarco/endoplasmic reticulum Ca2+-ATPase (SERCA) and PIP2. Moreover, possible explanations for therapeutic failures were found when some parameters exerted a biphasic effect on the relative Ca2+ increase. The sensitivity analysis results are used to simplify the process of parameter estimation by a significant reduction of the parameter space of interest. An evolutionary algorithm is used to successfully fit the model to a significant portion of the Ca2+ data.
    Subsequent sensitivity analyses of the best-fitting parameter sets suggest that mechanistic modelling of kinase-mediated GPCR desensitisation and of SERCA dynamics may be required for a comprehensive representation of the data.

    Applying Machine Learning Methods to Suggest Network Involvement and Functionality of Genes in Saccharomyces cerevisiae

    Elucidating genetic networks provides the foundation for the development of new treatments or cures for diseased pathways, and determining novel gene functionality is critical for building a better understanding of how an organism functions as a whole. In this dissertation, I developed a methodology that correctly locates genes that may be involved in genetic networks with a given gene, based on its location over 50% of the time or based on its description over 43% of the time. I also developed a methodology that makes it easier to predict how a gene product behaves in a cellular context by suggesting the correct Gene Ontology term over 80% of the time. The designed software gives researchers a way to focus their search for coregulated genes, which will lead to better microarray chip design, and it limits the list of possible functions of a gene product. This ultimately saves the researcher time and money.

    Hyperrectangles Selection for Monotonic Classification by Using Evolutionary Algorithms

    In supervised learning, some real problems require the response attribute to represent ordinal values that should increase with some of the explanatory attributes. They are called classification problems with monotonicity constraints. Hyperrectangles can be viewed as storing objects in R^n, which can be used to learn concepts by combining instance-based classification with the axis-parallel rectangles mainly used in rule induction systems. This hybrid paradigm is known as nested generalized exemplar learning. In this paper, we propose the selection of the most effective hyperrectangles by means of evolutionary algorithms to tackle monotonic classification. The proposed model is compared through an exhaustive experimental analysis involving a large number of data sets coming from real classification and regression problems. The results reported show that our evolutionary proposal outperforms other instance-based and rule learning models, such as OLM, OSDL, k-NN and MID, in accuracy and mean absolute error, while requiring fewer hyperrectangles. TIN2014-57251-
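
    To make the hyperrectangle idea concrete, the sketch below classifies a point by its distance to the nearest axis-parallel hyperrectangle, with distance zero for points inside a rectangle. This is a common way to describe nested generalized exemplar classification, not the paper's evolutionary selection method; the function names, the plain Euclidean distance without feature weights, and the toy rectangles are assumptions.

```python
import math

def rect_distance(point, lower, upper):
    """Euclidean distance from a point to an axis-parallel hyperrectangle
    given by its lower and upper corners; 0 if the point lies inside."""
    return math.sqrt(sum(max(lo - p, 0.0, p - hi) ** 2
                         for p, lo, hi in zip(point, lower, upper)))

def classify(point, rectangles):
    """Return the label of the nearest rectangle.
    rectangles: list of (lower_corner, upper_corner, label) tuples."""
    nearest = min(rectangles, key=lambda r: rect_distance(point, r[0], r[1]))
    return nearest[2]

# Toy rectangles (hypothetical) in R^2.
rects = [((0, 0), (1, 1), "low"), ((3, 3), (5, 5), "high")]
inside = rect_distance((0.5, 0.5), (0, 0), (1, 1))
outside = rect_distance((4, 7), (3, 3), (5, 5))
label = classify((2.5, 2.5), rects)
print(inside, outside, label)
```

    Selecting which rectangles to keep then amounts to choosing a subset of the `rects` list, which is the part the paper addresses with evolutionary algorithms.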