
    Monotonicity in Ant Colony Classification Algorithms

    Classification algorithms generally do not use existing domain knowledge during model construction. Models that conflict with existing knowledge can see reduced acceptance, since users must be able to trust the models they use. Domain knowledge can be integrated into algorithms as semantic constraints that guide model construction. This paper proposes an extension to an existing ACO-based classification rule learner that creates lists of monotonic classification rules. The proposed algorithm was compared against a majority classifier and the Ordinal Learning Model (OLM) monotonic learner. Our results show that the proposed algorithm outperformed OLM in predictive accuracy while still producing monotonic models.
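As a rough illustration of the property these rule lists must satisfy, the sketch below checks whether a classifier's predictions ever decrease along the attribute dominance order. All names and the toy rule are hypothetical, not from the paper:

```python
from itertools import combinations

def dominates(a, b):
    """True if instance a is <= instance b on every attribute."""
    return all(x <= y for x, y in zip(a, b))

def is_monotone(predict, X):
    """Check that predictions never decrease along the dominance order."""
    for a, b in combinations(X, 2):
        if dominates(a, b) and predict(a) > predict(b):
            return False
        if dominates(b, a) and predict(b) > predict(a):
            return False
    return True

# A toy rule: predict class 1 once the first attribute reaches a threshold.
rule_list = lambda x: 1 if x[0] >= 5 else 0

X = [(1, 2), (4, 7), (5, 1), (8, 3)]
print(is_monotone(rule_list, X))  # a single threshold rule is monotone: True
```

A learner that favours monotonic rules would use a check of this kind (or a pairwise violation count) inside its rule-quality function.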

    Using an Ant Colony Optimization Algorithm for Monotonic Regression Rule Discovery

    Many data mining algorithms do not make use of existing domain knowledge when constructing their models. This can lead to model rejection, as users may not trust models that behave contrary to their expectations. Semantic constraints provide a way to encapsulate this knowledge, which can then be used to guide the construction of models. One of the most studied semantic constraints in the literature is monotonicity; however, current monotonicity-aware algorithms have focused on ordinal classification problems. This paper proposes an extension to an ACO-based regression algorithm in order to extract a list of monotonic regression rules. We compared the proposed algorithm against a greedy regression rule induction algorithm that preserves monotonic constraints and against the well-known M5' Rules algorithm. Our experiments on eight publicly available data sets show that the proposed algorithm successfully creates monotonic rules while maintaining predictive accuracy.

    Post-Processing Methods to Enforce Monotonic Constraints in Ant Colony Classification Algorithms

    Most classification algorithms ignore existing domain knowledge during model construction, which can decrease a model's comprehensibility and increase the likelihood of rejection as users lose trust in the models they use. One way to encapsulate such domain knowledge is through monotonic constraints. This paper proposes new monotonic pruners that enforce monotonic constraints on models created by an existing ACO algorithm in a post-processing stage. We compare the effectiveness of the new pruners against an existing post-processing approach that also enforces constraints. Additionally, we compare the effectiveness of both post-processing procedures in isolation and in conjunction with favouring constraints during the learning phase. Our results show that our proposed pruners outperform the existing post-processing approach, and that combining the favouring and enforcing of constraints at different stages of the model construction process is the most effective solution.
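A post-processing pruner of this general kind could, in spirit, work as sketched below: greedily drop rules whose removal reduces the number of monotonicity violations on a reference set. This is an illustrative simplification, not the paper's actual pruners; all names and the toy rule list are invented:

```python
def violates(predict, X):
    """Collect comparable pairs whose predictions decrease along dominance."""
    bad = []
    for i, a in enumerate(X):
        for b in X[i + 1:]:
            if all(p <= q for p, q in zip(a, b)) and predict(a) > predict(b):
                bad.append((a, b))
            elif all(q <= p for p, q in zip(a, b)) and predict(b) > predict(a):
                bad.append((b, a))
    return bad

def make_predictor(rules, default):
    def predict(x):
        for cond, cls in rules:          # first matching rule fires
            if cond(x):
                return cls
        return default
    return predict

def prune_monotone(rules, default, X):
    """Greedily drop rules until no monotonicity violation remains on X."""
    kept = list(rules)
    for rule in list(kept):
        if not violates(make_predictor(kept, default), X):
            break                        # already monotone
        trial = [r for r in kept if r is not rule]
        if len(violates(make_predictor(trial, default), X)) < len(
                violates(make_predictor(kept, default), X)):
            kept = trial
    return kept

# Hypothetical rule list: the first rule breaks monotonicity.
rules = [(lambda x: x[0] >= 5, 0), (lambda x: x[0] >= 2, 1)]
X = [(1,), (3,), (6,)]
pruned = prune_monotone(rules, 0, X)     # drops the offending first rule
```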

    Image Quality Estimation: Software for Objective Evaluation

    Digital images are widely used in our daily lives, and image quality is important to the viewing experience. Low-quality images may be blurry or contain noise or compression artifacts. Humans can easily estimate image quality, but it is not practical to use human subjects to measure it in real applications. Image Quality Estimators (QEs) are algorithms that evaluate image quality automatically, computing a score for any input image to represent its quality. This thesis focuses on evaluating the performance of QEs, using two approaches: objective software analysis and subjective database design. For the first, we created software consisting of functional modules to test QE performance. These modules can load images from subjective databases or generate distorted versions of any input images; the resulting QE scores are computed and analyzed by a statistical-methods module so that they can be easily interpreted and reported. Some of these modules were combined into a published software package: Stress Testing Image Quality Estimators (STIQE). In addition to the QE analysis software, a new subjective database was designed and implemented using both online and in-lab subjective tests. The database uses the pairwise comparison method, and the subjective quality scores are computed with the Bradley-Terry model via Maximum Likelihood Estimation (MLE). While four testing phases were designed for this database, only phase 1 is reported in this work.
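The Bradley-Terry scoring mentioned above can be estimated from pairwise win counts with Zermelo's classic MM iteration. This is a generic sketch with made-up comparison counts, not the STIQE code:

```python
def bradley_terry(wins, iters=500):
    """MLE of Bradley-Terry scores from a pairwise win-count matrix.

    wins[i][j] is how often item i was preferred over item j; scores
    are estimated with Zermelo's iterative MM update."""
    n = len(wins)
    p = [1.0] * n
    for _ in range(iters):
        new_p = []
        for i in range(n):
            w_i = sum(wins[i])                       # total wins of item i
            denom = sum((wins[i][j] + wins[j][i]) / (p[i] + p[j])
                        for j in range(n) if j != i)
            new_p.append(w_i / denom if denom else p[i])
        s = sum(new_p)
        p = [v / s for v in new_p]                   # fix the overall scale
    return p

# Hypothetical counts for three images compared pairwise by viewers.
wins = [[0, 8, 9],
        [2, 0, 6],
        [1, 4, 0]]
scores = bradley_terry(wins)   # image 0 is preferred most, image 2 least
```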

    Monotonicity Detection and Enforcement in Longitudinal Classification

    Longitudinal datasets contain repeated measurements of the same variables at different points in time, which researchers can use to discover useful knowledge based on how the data change over time. Monotonic relations often occur in real-world data and need to be preserved in data mining models for those models to be acceptable to users. We propose a new methodology for detecting monotonic relations in longitudinal datasets and applying them when constructing longitudinal classification models. Two different approaches were used to detect monotonic relations and include them in the classification task. The proposed approaches are evaluated using data from the English Longitudinal Study of Ageing (ELSA), with 10 different age-related diseases used as class variables to be predicted. A gradient boosting algorithm (XGBoost) is used to construct classification models in two scenarios: enforcing and not enforcing the constraints. The results show that enforcing monotonicity constraints can consistently improve the predictive accuracy of the constructed models. The produced models are fully monotonic with respect to the monotonicity constraints, which can have a positive impact on model acceptance in real-world applications.
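One simple way to detect a monotonic relation, in the spirit of the methodology described, is to count concordant versus discordant ordered pairs between a feature and the class. The threshold and all names below are assumptions for illustration, not taken from the paper:

```python
def monotonic_direction(feature, target, threshold=0.9):
    """Infer a monotonic constraint for one feature from data.

    Returns +1 (increasing), -1 (decreasing) or 0 (no clear relation),
    based on the fraction of concordant strictly-ordered pairs."""
    conc = disc = 0
    for i in range(len(feature)):
        for j in range(i + 1, len(feature)):
            if feature[i] == feature[j] or target[i] == target[j]:
                continue                      # only strictly-ordered pairs
            if (feature[i] < feature[j]) == (target[i] < target[j]):
                conc += 1
            else:
                disc += 1
    total = conc + disc
    if total and conc / total >= threshold:
        return 1
    if total and disc / total >= threshold:
        return -1
    return 0

# Class grows with the feature -> an increasing constraint is detected.
direction = monotonic_direction([1, 2, 3, 4, 5], [0, 0, 1, 1, 1])  # 1
```

The resulting signs could then be handed to a constraint-aware learner, for example via XGBoost's `monotone_constraints` parameter.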

    Learning nonlinear monotone classifiers using the Choquet Integral

    Learning predictive models that guarantee a monotone relationship between input and output variables has recently received growing attention in machine learning. Ensuring monotonicity is particularly challenging for flexible nonlinear models. This thesis uses the Choquet integral as the mathematical basis for developing new models for nonlinear classification tasks. Beyond its established role as a flexible aggregation function in multi-criteria decision making, the formalism thereby becomes an important tool for machine learning models. In addition to reconciling monotonicity and flexibility in a mathematically elegant way, the Choquet integral makes it possible to quantify interactions between groups of input attributes, yielding interpretable models. The thesis develops concrete methods for learning with the Choquet integral based on two different approaches: maximum likelihood estimation and structural risk minimization. While the first approach leads to a generalization of logistic regression, the second is realized using support vector machines. In both cases, the learning problem essentially reduces to identifying the parameters of fuzzy measures for the Choquet integral. The exponential number of degrees of freedom needed to model all attribute subsets poses particular challenges for runtime complexity and generalization performance. Against this background, both approaches are evaluated empirically and analyzed theoretically, and suitable methods for complexity reduction and model regularization are proposed and investigated. The experimental results compare very well with state-of-the-art methods, even on demanding benchmark problems, and highlight the usefulness of combining the monotonicity and flexibility of the Choquet integral in different machine learning approaches.
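The discrete Choquet integral at the core of these models is easy to state concretely. A minimal sketch, with a hypothetical two-criteria capacity chosen purely for illustration:

```python
def choquet_integral(x, mu):
    """Discrete Choquet integral of x w.r.t. a capacity (fuzzy measure) mu.

    mu maps frozensets of criterion indices to [0, 1] and must be
    monotone under set inclusion with mu(empty) = 0 and mu(all) = 1."""
    order = sorted(range(len(x)), key=lambda i: x[i])   # ascending values
    total, prev = 0.0, 0.0
    for k, i in enumerate(order):
        subset = frozenset(order[k:])      # criteria scoring at least x[i]
        total += (x[i] - prev) * mu[subset]
        prev = x[i]
    return total

# A hypothetical capacity over two criteria with a positive interaction:
mu = {frozenset(): 0.0,
      frozenset({0}): 0.3,
      frozenset({1}): 0.5,
      frozenset({0, 1}): 1.0}
value = choquet_integral((0.6, 0.2), mu)   # 0.2*1.0 + 0.4*0.3 = 0.32
```

Monotonicity of the capacity under set inclusion is exactly what guarantees the aggregation itself is monotone in every input, which is why parameter identification of the fuzzy measure is the central learning problem.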

    Hybrid Instance- and Rule-Based Learning Models for Monotonic Classification

    In supervised prediction problems, the response attribute depends on certain explanatory input attributes. In many real problems the response attribute is represented by ordinal values that should increase when some of the explanatory input attributes also increase. These are called classification problems with monotonicity constraints. In this thesis, we review the monotonic classifiers proposed in the literature and formalize the nested generalized exemplar learning theory to tackle monotonic classification. Two algorithms are proposed: a first greedy one, which requires monotonic data, and an evolutionary algorithm, which is able to handle imperfect data with monotonic violations among the instances. Both improve accuracy, the non-monotonicity index of predictions, and model simplicity over the state of the art. (Tesis Univ. Jaén. Departamento INFORMÁTIC)
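Nested generalized exemplar learning classifies by the nearest hyperrectangle rather than the nearest point. A minimal sketch of that idea, with exemplars invented for illustration (no monotonicity handling shown):

```python
def rect_distance(x, rect):
    """Euclidean distance from x to an axis-parallel hyperrectangle."""
    lo, hi = rect
    d = 0.0
    for v, l, h in zip(x, lo, hi):
        if v < l:
            d += (l - v) ** 2
        elif v > h:
            d += (v - h) ** 2
    return d ** 0.5                 # 0.0 when x lies inside the rectangle

def nge_classify(x, exemplars):
    """Predict the label of the nearest hyperrectangle exemplar."""
    return min(exemplars, key=lambda e: rect_distance(x, e[0]))[1]

# Two hypothetical generalized exemplars: ((lower corner, upper corner), label)
exemplars = [(((0, 0), (2, 2)), 0),
             (((3, 3), (5, 5)), 1)]
label = nge_classify((4, 4), exemplars)   # falls inside the second box: 1
```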

    Discovering Regression and Classification Rules with Monotonic Constraints Using Ant Colony Optimization

    Data mining is a broad area that encompasses many different tasks, from supervised classification and regression to unsupervised association rule mining and clustering. The first research thread in this thesis is the introduction of new Ant Colony Optimization (ACO)-based algorithms that tackle the regression task, exploring three different learning strategies: the Iterative Rule Learning, Michigan, and Pittsburgh strategies. The Iterative Rule Learning strategy constructs one rule at a time: at each iteration, the best rule created by the ant colony is added to the rule list, until a complete rule list has been created. In the Michigan strategy, each ant constructs a single rule, and a niching algorithm combines this collection of rules into the final rule list. Finally, in the Pittsburgh strategy each ant constructs an entire rule list at each iteration, with the best list constructed by any ant in any iteration forming the final model. Among the three variants, the Pittsburgh-based Ant-Miner-Reg_PB algorithm was the most successful and was shown to be competitive with a well-known regression rule induction algorithm from the literature. The second research thread involved incorporating existing domain knowledge to guide model construction, since it is rare to find new domains about which nothing is known. One type of domain knowledge that occurs frequently in real-world datasets is monotonic constraints, which capture increasing or decreasing trends within the data. In this thesis, monotonic constraints are introduced into ACO-based rule induction algorithms for both classification and regression tasks. The enforcement of monotonic constraints is implemented as a two-step process: first, a soft constraint preference during the model construction phase; then, a hard-constraint post-processing pruning suite that ensures the production of monotonic models. The new algorithms presented here have been shown to maintain and improve their predictive power when compared to non-monotonic rule induction algorithms.
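The Iterative Rule Learning strategy follows the classic sequential-covering pattern: build a rule, remove the instances it covers, repeat. The sketch below uses a stub in place of the ant-colony search; all names and data are invented for illustration:

```python
def learn_rule_list(instances, construct_rule, max_rules=10):
    """Skeleton of the Iterative Rule Learning strategy.

    construct_rule stands in for the ant colony: given the instances
    still uncovered, it returns the best (condition, label) rule found."""
    remaining = list(instances)
    rule_list = []
    while remaining and len(rule_list) < max_rules:
        cond, cls = construct_rule(remaining)
        covered = [(x, y) for x, y in remaining if cond(x)]
        if not covered:
            break                          # new rule covers nothing: stop
        rule_list.append((cond, cls))
        remaining = [(x, y) for x, y in remaining if not cond(x)]
    labels = [y for _, y in instances]
    default = max(set(labels), key=labels.count)   # majority-class fallback
    return rule_list, default

def predict(rule_list, default, x):
    for cond, cls in rule_list:            # first matching rule fires
        if cond(x):
            return cls
    return default

# A stub rule constructor in place of the ACO search:
stub = lambda remaining: (lambda x: x[0] >= 5, 1)
data = [((1,), 0), ((2,), 0), ((3,), 0), ((5,), 1), ((6,), 1)]
rules, default = learn_rule_list(data, stub)
```

The Michigan and Pittsburgh strategies differ only in what each ant builds (one rule versus a whole list), not in this outer predict-by-first-match structure.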