
    A neural-symbolic system for temporal reasoning with application to model verification and learning

    The effective integration of knowledge representation, reasoning and learning into a robust computational model is one of the key challenges in Computer Science and Artificial Intelligence. In particular, temporal models have been fundamental in describing the behaviour of computational and neural-symbolic systems. Furthermore, acquiring correct descriptions of a desired system's behaviour is a complex knowledge-acquisition task in several domains, and considerable effort has been directed towards tools capable of learning, describing and evolving software models. This thesis contributes to two major areas of Computer Science, namely Artificial Intelligence (AI) and Software Engineering (SE). From an AI perspective, we present a novel neural-symbolic computational model capable of representing and learning temporal knowledge in recurrent networks. The model works in an integrated fashion: it enables the effective representation of temporal knowledge, the adaptation of temporal models to a set of desirable system properties, and effective learning from examples, which in turn can lead to symbolic temporal knowledge extraction from the corresponding trained neural networks. The model is theoretically sound and is also tested in a number of case studies. An extension of the framework is shown to tackle aspects of verification and adaptation from the SE perspective. As regards verification, we make use of established model checking techniques, which allow the verification of properties described as temporal models and return counter-examples whenever the properties are not satisfied. Our neural-symbolic framework is then extended to deal with different sources of information. This includes the translation of model descriptions into the neural structure, the evolution of such descriptions by learning from counter-examples, and the learning of new models from simple observation of their behaviour. In summary, we believe the thesis describes a principled methodology for temporal knowledge representation, learning and extraction, shedding new light on predictive temporal models not only from a theoretical standpoint but also with respect to a potentially large number of applications in AI, Neural Computation and Software Engineering, where temporal knowledge plays a fundamental role.
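    To make the translation idea concrete, the sketch below encodes a propositional temporal rule as one step of a threshold recurrent network, with outputs fed back as the next time step's inputs. This is a minimal illustration of the general CILP-style encoding, not the thesis' exact translation algorithm; the rules, weights and thresholds are invented for the example.

```python
import numpy as np

# Illustrative sketch (not the thesis' exact algorithm): encode the
# temporal rules  "p(t) -> q(t+1)"  and  "q(t) -> r(t+1)"  as one step
# of a threshold recurrent network whose outputs at time t are fed back
# as inputs at time t+1.

ATOMS = ["p", "q", "r"]
W = np.array([              # W[i, j] = weight from atom j to atom i
    [0.0, 0.0, 0.0],        # nothing concludes p
    [1.0, 0.0, 0.0],        # p(t) -> q(t+1)
    [0.0, 1.0, 0.0],        # q(t) -> r(t+1)
])
b = np.array([-0.5, -0.5, -0.5])   # thresholds: fire iff an antecedent fired

def step(state: np.ndarray) -> np.ndarray:
    """One time step: activations above threshold become true atoms."""
    return (W @ state + b > 0).astype(float)

state = np.array([1.0, 0.0, 0.0])  # p holds at t = 0
for t in range(3):
    print(t, {a: bool(v) for a, v in zip(ATOMS, state)})
    state = step(state)            # q fires at t = 1, r at t = 2
```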

    Learning understandable classifier models.

    The topic of this dissertation is the automation of the process of extracting understandable patterns and rules from data. An unprecedented amount of data is available to anyone with a computer connected to the Internet. The disciplines of Data Mining and Machine Learning have emerged over the last two decades to face this challenge, leading to the development of many tools and methods. These tools often produce models that make very accurate predictions about previously unseen data. However, models built by the most accurate methods are usually hard for humans to understand or interpret. As a consequence, they deliver only decisions, without any explanations, and so do not directly lead to the acquisition of new knowledge. This dissertation contributes to bridging the gap between accurate opaque models and those that are less accurate but more transparent to humans. The dissertation first defines the problem of learning from data. It surveys the state-of-the-art methods for supervised learning of both understandable and opaque models from data, as well as unsupervised methods that detect features present in the data. It describes popular methods of rule extraction that rewrite unintelligible models into an understandable form, and discusses the limitations of rule extraction. A novel definition of understandability, which ties computational complexity to learning, is provided to show that rule extraction is an NP-hard problem. Next, it discusses whether one can expect that even an accurate classifier has learned new knowledge. The survey ends with a presentation of two approaches to building understandable classifiers. On the one hand, understandable models must be able to accurately describe relations in the data. On the other hand, describing the output of a system in terms of its input often requires the introduction of intermediate concepts, called features. It is therefore crucial to develop methods that describe the data with understandable features and are able to use those features to present the relation that describes the data. The novel contributions of this thesis follow the survey. Two families of rule extraction algorithms are considered. First, a method that can work with any opaque classifier is introduced: artificial training patterns are generated in a mathematically sound way and used to train more accurate understandable models. Subsequently, two novel algorithms that require the opaque model to be a neural network are presented; they rely on access to the network's weights and biases to induce rules encoded as Decision Diagrams. Finally, the topic of feature extraction is considered. The impact of imposing non-negativity constraints on the weights of a neural network is studied: it is proved that a three-layer network with non-negative weights can shatter any given set of points, and experiments are conducted to assess the accuracy and interpretability of such networks. Then, a novel path-following algorithm that finds robust sparse encodings of data is presented. In summary, this dissertation contributes to improved understandability of classifiers in several tangible and original ways. It introduces three distinct aspects of achieving this goal: infusion of additional patterns from the underlying pattern distribution into rule learners, the derivation of decision diagrams from neural networks, and sparse coding with neural networks with non-negative weights.
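    The first of these contributions lends itself to a short sketch: generate artificial patterns, label them with the opaque model, and fit a transparent learner on the enlarged data. The Gaussian perturbation sampler, the MLP as the opaque model, and the decision tree as the understandable one are stand-ins chosen for this example; the dissertation derives its pattern generation mathematically rather than by ad-hoc noise.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = load_iris(return_X_y=True)

# Opaque but accurate model, used as an oracle for artificial patterns.
opaque = MLPClassifier(hidden_layer_sizes=(32,), max_iter=2000,
                       random_state=0).fit(X, y)

# Artificial patterns: perturbed copies of training points (illustrative
# sampling scheme only), labeled by the opaque model.
rng = np.random.default_rng(0)
X_art = X[rng.integers(len(X), size=2000)] \
        + rng.normal(0, 0.1, (2000, X.shape[1]))
y_art = opaque.predict(X_art)

# Transparent model trained on the enlarged, oracle-labeled data.
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X_art, y_art)
print(export_text(tree))   # human-readable IF-THEN structure
```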

    Why and How to Extract Conditional Statements From Natural Language Requirements

    Functional requirements often describe system behavior by relating events to each other, e.g. "If the system detects an error (e_1), an error message shall be shown (e_2)". Such conditionals consist of two parts: the antecedent (see e_1) and the consequent (e_2), which convey strong semantic information about the intended behavior of a system. Automatically extracting conditionals from text enables several analytical disciplines and is already used for information retrieval and question answering. We found that automated conditional extraction can also provide added value to Requirements Engineering (RE) by facilitating the automatic derivation of acceptance tests from requirements. However, the potential of extracting conditionals has not yet been leveraged for RE. We are convinced that this has two principal reasons: 1) The extent, form, and complexity of conditional statements in RE artifacts is not well understood. We do not know how conditionals are formulated and logically interpreted by RE practitioners, which hinders the development of suitable approaches for extracting conditionals from RE artifacts. 2) Existing methods fail to extract conditionals from unrestricted Natural Language (NL) in fine-grained form. That is, they do not consider the combinatorics between antecedents and consequents, and they do not allow splitting them into finer-grained text fragments (e.g., variable and condition), rendering the extracted conditionals unsuitable for RE downstream tasks such as test case derivation. This thesis contributes to both areas. In Part I, we present empirical results on the prevalence and logical interpretation of conditionals in RE artifacts. Our case study corroborates that conditionals are widely used in both traditional and agile requirements such as acceptance criteria. We found that conditionals in requirements mainly occur in explicit, marked form and may include up to three antecedents and two consequents. Hence, an extraction approach needs to understand conjunctions, disjunctions, and negations to fully capture the relation between antecedents and consequents. We also found that conditionals are a source of ambiguity and that there is not just one way to interpret them formally. This affects any automated analysis that builds on formalized requirements (e.g., inconsistency checking) and may also influence guidelines for writing requirements. Part II presents our tool-supported approach CiRA, capable of detecting conditionals in NL requirements and extracting them in fine-grained form. For the detection, CiRA uses syntactically enriched BERT embeddings combined with a softmax classifier and outperforms existing methods (macro-F_1: 82%). Our experiments show that a sigmoid classifier built on RoBERTa embeddings is best suited to extract conditionals in fine-grained form (macro-F_1: 86%). We disclose our code, data sets, and trained models to facilitate replication. CiRA is available at http://www.cira.bth.se/demo/. In Part III, we highlight how the extraction of conditionals from requirements can help to create acceptance tests automatically. First, we motivate this use case in an empirical study and demonstrate that the lack of adequate acceptance tests is one of the major problems in agile testing. Second, we show how extracted conditionals can be mapped to a Cause-Effect-Graph, from which test cases can be derived automatically. We demonstrate the feasibility of our approach in a case study with three industry partners: of 578 manually created test cases, 71.8% can be generated automatically, and our approach discovered 80 relevant test cases that were missed in manual test case design. At the end of this thesis, the reader will have an understanding of (1) the notion of conditionals in RE artifacts, (2) how to extract them in fine-grained form, and (3) the added value that the extraction of conditionals can provide to RE.
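    As a much-simplified rendering of the detection step, the sketch below classifies requirement sentences as conditional or not using plain BERT [CLS] embeddings with a softmax-style (logistic regression) head. CiRA itself uses syntactically enriched embeddings and is trained on a proper annotated corpus; the sentences, labels and model choices here are toy stand-ins.

```python
import torch
from transformers import AutoTokenizer, AutoModel
from sklearn.linear_model import LogisticRegression

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
bert = AutoModel.from_pretrained("bert-base-uncased")

def embed(sentences):
    """Return the [CLS] vector of each sentence as a feature embedding."""
    batch = tok(sentences, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        out = bert(**batch)
    return out.last_hidden_state[:, 0, :].numpy()

train = [  # toy examples: 1 = conditional requirement, 0 = not
    ("If the system detects an error, an error message shall be shown.", 1),
    ("The system shall log every transaction.", 0),
    ("When the user logs out, the session shall be closed.", 1),
    ("All passwords shall be stored in hashed form.", 0),
]
X = embed([s for s, _ in train])
clf = LogisticRegression(max_iter=1000).fit(X, [y for _, y in train])

print(clf.predict(embed(["If the battery is low, the device shall beep."])))
```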

    Abductive hypotheses generation in neural-symbolic systems

    The goal of this work was to develop an abductive procedure implemented in a neural-symbolic system. The abductive procedure is understood as an extended algorithmic interpretation of abductive reasoning: given a knowledge base Γ and a phenomenon φ that cannot be derived from Γ, we create a new knowledge base Γ', a modified version of Γ, from which φ is obtainable. The abductive hypothesis is defined as the symmetric difference between Γ and Γ'. We further require the resulting abductive hypotheses to satisfy certain conditions, e.g. consistency with the knowledge base. The procedure is implemented in a neural-symbolic system in which Γ and the abductive goal φ are represented as a logic program that is translated into an artificial neural network; the network is trained by means of the backpropagation algorithm until φ becomes obtainable (the process that creates the abductive hypothesis), and the trained network is then translated back into a logic program representing Γ'. The symmetric difference between Γ and Γ' yields the abductive hypothesis.
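    Schematically, the procedure can be rendered as below, with the translate-train-extract pipeline abstracted into a single revision step. The rule syntax and the toy revision function are invented for the illustration; the actual system performs the revision by training a translated neural network.

```python
# Schematic sketch: the neural translation and backpropagation steps are
# abstracted into `revise`, which must return a program gamma_prime from
# which the goal phi is obtainable.

def abduce(gamma: set[str], phi: str, revise) -> set[str]:
    """Return the abductive hypothesis for goal phi given knowledge gamma."""
    gamma_prime = revise(gamma, phi)
    return gamma ^ gamma_prime      # symmetric difference = hypothesis

# Toy revision: make phi derivable by adding the missing base fact.
gamma = {"q :- p.", "r :- q."}
hypothesis = abduce(gamma, "r", lambda g, goal: g | {"p."})
print(hypothesis)                   # {'p.'} is the abduced assumption
```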

    Identification of chemical species using artificial intelligence to interpret optical emission spectra

    The nonlinear modeling capabilities of artificial neural networks (ANNs) are renowned in the field of artificial intelligence (AI) for capturing knowledge that can be very difficult to understand otherwise. Their ability to be trained on representative data within a particular problem domain and to generalise over a set of data makes them efficient predictive models. One problem domain with complex data that would benefit from the predictive capabilities of ANNs is that of optical emission spectra (OES). OES is an important diagnostic for monitoring plasma species within plasma processing. Normally, interpreting OES spectra requires significant prior expertise from a spectroscopist. One way of alleviating this intensive demand, so that OES spectra can be interpreted quickly, is to interpret the data using an intelligent pattern recognition technique such as ANNs. This thesis investigates and presents multilayer perceptron (MLP) ANN models that can successfully classify chemical species within OES spectral patterns. The primary contribution of the thesis is the creation of deployable ANN species models that predict OES spectral line sizes directly from six controllable input process parameters, together with the implementation of a novel rule extraction procedure that relates the real-valued, multi-output spectral line sizes to the individual input process parameters. Not only are the trained species models excellent in their predictive capability, but they also provide the foundation for extracting comprehensible rules. A secondary contribution of this thesis is an adapted fuzzy rule extraction system that attaches a quantitative measure of confidence to individual rules. The most significant contribution to the field of AI generated by the work presented in this thesis is that the rule extraction procedure utilises predictive ANN species models that employ real, continuously valued, multi-output data. This is an improvement on rule extraction from trained networks, which normally focuses on discrete binary outputs.
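    The species-model setup can be illustrated as multi-output regression from the six process parameters to spectral line sizes. Everything below (the random stand-in data, the synthetic target functions, the network size) is invented for the example; the thesis trains on real OES measurements.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
X = rng.uniform(0, 1, (500, 6))    # six controllable process parameters
Y = np.column_stack([              # stand-in "line sizes" for two emission lines
    X[:, 0] * 2 + X[:, 1],
    np.sin(X[:, 2] * np.pi) + X[:, 3],
])

# One MLP mapping process parameters to multiple real-valued line sizes.
model = MLPRegressor(hidden_layer_sizes=(20,), max_iter=5000,
                     random_state=0).fit(X, Y)
print(model.predict(X[:3]))        # predicted line sizes for three settings
```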

    Machine Learning Advances for Practical Problems in Computer Vision

    Convolutional neural networks (CNNs) have become the de facto standard for computer vision tasks, due to their unparalleled performance and versatility. Although deep learning removes the need for extensive hand-engineered features for every task, real-world applications of CNNs still often require considerable engineering effort to produce usable results. In this thesis, we explore solutions to problems that arise in practical applications of CNNs. We address a rarely acknowledged weakness of CNN object detectors: the tendency to emit many excess detection boxes per object, which must be pruned by non-maximum suppression (NMS). This practice relies on the assumption that highly overlapping boxes are excess, which is problematic when objects occlude one another and overlapping detections are actually required. We therefore propose a novel loss function that incentivises a CNN to emit exactly one detection per object, making NMS unnecessary. Another common problem when deploying a CNN in the real world is domain shift: CNNs can be surprisingly vulnerable to sometimes quite subtle differences between the images they encounter at deployment and those they are trained on. We investigate the role that texture plays in domain shift, and propose a novel data augmentation technique using style transfer to train CNNs that are more robust against shifts in texture. We demonstrate that this technique results in better domain transfer on several datasets, without requiring any domain-specific knowledge. In collaboration with AstraZeneca, we develop an embedding space for cellular images collected in a high-throughput imaging screen as part of a drug discovery project. This uses a combination of techniques to embed the images in 2D space such that similar images are nearby, for the purposes of visualization and data exploration. The images are also clustered automatically, splitting the large dataset into a smaller number of clusters that display a common phenotype. This allows biologists to quickly triage the high-throughput screen, selecting a small subset of promising phenotypes for further investigation. Finally, we investigate an unusual form of domain bias that manifested in a real-world visual binary classification project for counterfeit detection. We confirm that CNNs are able to "cheat" the task by exploiting a strong correlation between the class label and the specific camera that acquired the image, and show that this reliably occurs whenever the correlation is present. We also investigate how exactly the CNN is able to infer camera type from image pixels, given that this is impossible to the human eye. The contributions in this thesis are of practical value to deep learning practitioners working on a variety of problems in the field of computer vision.
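    The embed-and-cluster pipeline for the cellular images can be sketched generically as below. The thesis combines several techniques; the specific choices here (ImageNet ResNet-18 features, t-SNE for the 2D map, k-means for the clusters) and the random stand-in images are illustrative assumptions only.

```python
import numpy as np
import torch
from torchvision.models import resnet18, ResNet18_Weights
from sklearn.manifold import TSNE
from sklearn.cluster import KMeans

# Pretrained CNN with its classification head removed: a feature extractor.
weights = ResNet18_Weights.DEFAULT
backbone = torch.nn.Sequential(
    *list(resnet18(weights=weights).children())[:-1]).eval()

images = torch.rand(64, 3, 224, 224)   # stand-in for the cellular images
with torch.no_grad():
    feats = backbone(images).flatten(1).numpy()   # (64, 512) feature vectors

# 2D coordinates for visualization; cluster labels for phenotype triage.
coords = TSNE(n_components=2, perplexity=10, random_state=0).fit_transform(feats)
labels = KMeans(n_clusters=5, n_init=10, random_state=0).fit_predict(feats)
print(coords.shape, np.bincount(labels))
```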

    On-the-fly synthesizer programming with rule learning

    This manuscript explores automatic programming of sound synthesis algorithms within the context of the performative artistic practice known as live coding. Writing source code in an improvised way to create music or visuals became an instrument the moment affordable computers were able to perform real-time sound synthesis with languages that keep their interpreter running. Ever since, live coding has dealt with real-time programming of synthesis algorithms. For that purpose, one possibility is an algorithm that automatically creates variations out of a few presets selected by the user. However, the need for real-time feedback and the small size of the data sets (which can even be collected mid-performance) are constraints that make existing automatic sound synthesizer programmers and learning algorithms unfeasible. Moreover, the design of such algorithms is not oriented towards creating variations of a sound, but rather towards finding the synthesizer parameters that match a given target sound. Other approaches create representations of the space of possible sounds, allowing the user to explore it by means of interactive evolution; even though these systems are exploration-oriented, they require longer run-times. This thesis investigates inductive rule learning for on-the-fly synthesizer programming, an approach that is conceptually different from those found in both the synthesizer programming and the live coding literature. Rule models offer interpretability and allow working directly with the parameter values of the synthesis algorithms (even with symbolic data), making preprocessing unnecessary. RuLer, the proposed learning algorithm, receives a dataset containing user-labeled combinations of parameter values of a synthesis algorithm. Among the combinations sharing the same label, it analyses patterns based on dissimilarity, and describes these patterns as an IF-THEN rule model. The algorithm parameters provide control over what is considered a pattern; since patterns are the basis for inducing new parameter settings, these parameters also control the degree of consistency of the induced settings with respect to the original input data. An algorithm (named FuzzyRuLer) able to extend IF-THEN rules to hyperrectangles, which in turn are used as the cores of membership functions, is then presented; a sketch of this construction follows the abstract below. The resulting fuzzy rule model creates a map of the entire input feature space. To that end, the algorithm generalizes the logical rules, resolving contradictions by following a maximum-volume heuristic. Throughout the manuscript it is discussed how, when machine learning algorithms are used as creative tools, glitches, errors or inaccuracies produced by the resulting models are sometimes desirable, as they might offer novel, unpredictable results. The evaluation of the algorithms follows two paths. The first focuses on user tests. The second, reflecting the fact that this work was carried out within a computer science department, provides a broader, domain-nonspecific evaluation of the algorithms' performance using extrinsic benchmarks (i.e. not belonging to a synthesizer's domain) for cross-validation and minority oversampling. In oversampling tasks on imbalanced datasets, the algorithm yields state-of-the-art results; moreover, the synthetic points produced are significantly different from those created by the other algorithms and perform (controlled) exploration of more distant regions. Finally, accompanying the research, various performances, concerts and an album were produced with the algorithms and examples of this thesis. The reviews received and the collections in which the album has been featured show a positive reception within the community. Together, these evaluations suggest that rule learning is both an effective method and a promising path for further research.
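    As referenced above, here is a simplified sketch of the hyperrectangle construction behind FuzzyRuLer: same-label parameter settings are widened to axis-aligned hyperrectangles, which then serve as the cores of trapezoidal membership functions. The merging criterion used here (a plain per-label bounding box) ignores RuLer's dissimilarity-based pattern test, and the falloff width is invented; both are illustrative assumptions only.

```python
import numpy as np

settings = {                       # user-labeled synth parameter settings
    "warm":  np.array([[0.2, 0.5], [0.3, 0.6], [0.25, 0.55]]),
    "harsh": np.array([[0.8, 0.1], [0.9, 0.2]]),
}

# Core of each label's membership function: the bounding hyperrectangle.
cores = {label: (pts.min(axis=0), pts.max(axis=0))
         for label, pts in settings.items()}

def membership(x, lo, hi, slope=0.2):
    """Trapezoid: 1 inside the core, linear falloff of width `slope` outside."""
    d = np.maximum(np.maximum(lo - x, x - hi), 0.0)  # per-dimension distance
    return float(np.clip(1.0 - d / slope, 0.0, 1.0).min())

x = np.array([0.35, 0.6])          # a candidate new parameter setting
for label, (lo, hi) in cores.items():
    print(label, round(membership(x, lo, hi), 2))   # warm 0.75, harsh 0.0
```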