
    Influence of Machine Learning vs. Ranking Algorithm on the Critical Dimension

    Article originally published in the International Journal of Future Computer and Communication. The critical dimension is the minimum number of features required for a learning machine to perform with “high” accuracy; for a specific dataset it depends on both the learning machine and the ranking algorithm. Discovering the critical dimension, if one exists for a dataset, can help to reduce the feature size while maintaining the learning machine’s performance. Understanding the influence of learning machines and ranking algorithms on the critical dimension is therefore important for reducing the feature size effectively. In this paper we experiment with three ranking algorithms and three learning machines on several datasets to study their combined effect on the critical dimension. Results show that the ranking algorithm has greater influence on the critical dimension than the learning machine. Funding: ICASA (Institute for Complex Additive Systems Analysis) of New Mexico Tech and the National Institute of Justice, U.S. Department of Justice (Award No. 2010-DN-BX-K223).
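
    As a concrete illustration of the protocol the abstract outlines (not the paper's exact setup), the sketch below ranks features once with a ranking algorithm, then grows the number of top-ranked features fed to a learning machine until cross-validated accuracy comes within a tolerance of the full-feature accuracy. The dataset, the mutual-information ranker, the logistic-regression learner and the 2% tolerance are all illustrative assumptions.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import mutual_info_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = load_breast_cancer(return_X_y=True)

# Ranking algorithm: score every feature once, best first.
order = np.argsort(mutual_info_classif(X, y, random_state=0))[::-1]

# Learning machine and its full-feature baseline accuracy.
clf = LogisticRegression(max_iter=5000)
full_acc = cross_val_score(clf, X, y, cv=5).mean()

# Critical dimension: smallest k whose top-k accuracy is "high enough".
for k in range(1, X.shape[1] + 1):
    acc = cross_val_score(clf, X[:, order[:k]], y, cv=5).mean()
    if acc >= full_acc - 0.02:
        print(f"critical dimension ~ {k}: acc {acc:.3f} vs full {full_acc:.3f}")
        break
```

    Repeating the sweep with a different ranker or learner gives the kind of comparison the paper reports across its three ranking algorithms and three learning machines.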

    Studying polymer physics by machine learning.

    Machine learning has recently surged in popularity as a computational method. Many disciplines, such as condensed matter physics, quantum chemistry, chemical engineering and polymer physics, have incorporated machine learning into their studies. This thesis focuses on applying machine learning methods to the study of polymer physics. More specifically, two computational problems are studied: (1) how to classify polymer states by supervised or unsupervised learning methods, and (2) how to use feedforward neural networks (FNNs) to search for structures of diblock copolymers within the self-consistent field theory scheme.
    In the first topic, polymer samples comprising both vastly different structures, such as the gas-like random coil and the liquid-like globule, and subtly different structures, such as the crystalline anti-Mackay and Mackay states, are generated by the Monte Carlo method. We then systematically explore the capability of an FNN to classify these polymer configurations. Based on a series of numerical experiments, we find that an FNN, after appropriate training, is able not only to identify all these structures but also to accurately locate the transition points between states. The locations given by the FNN agree well with those provided by specific-heat calculations from the traditional method, showing that the FNN offers a new tool for further studies of polymeric phase transitions. We also study these states with principal component analysis (PCA). When polymer samples contain only coil and globular states, PCA can distinguish them and offers insight into the relation between the features and the order parameters of these states. However, PCA by itself is not powerful enough to distinguish the globular, anti-Mackay and Mackay states, so a hybrid scheme combining PCA and supervised learning is used to identify and precisely detect the critical points of the phase transitions between these polymer configurations. Compared with traditional methods, our studies demonstrate that machine-learning-based methods have some distinct advantages. First, they directly use only molecular coordinates, which makes them highly compatible with multiple sampling methods. In addition, the trained FNN has high transferability. Finally, in terms of identifying transition points, our approaches require far fewer samples, which makes them computationally faster than the traditional methods.
    In the second topic, we start from the universal approximation theorem for FNNs to build a machine-learning-based PDE solver, focusing mainly on diffusion equations. The algorithm uses the function generated by the FNN as a trial function and adjusts the weights and biases of the FNN to search for the solution of a given PDE; when the weights and biases are optimal, the trial function closely matches the solution. Our approach is important for high-dimensional diffusion equations: we discover that the computational time grows as a power law with respect to the dimensionality, which indicates that the machine-learning-based solver is a candidate algorithm that may not suffer from the "curse of dimensionality". We then demonstrate that this machine-learning PDE solver can be conveniently adopted to deal with the multi-variable, coupled integro-differential equations of self-consistent field theory for predicting polymer self-assembly structures. We observe all known three-dimensional classical structures, and our solutions agree excellently with traditional solutions.
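
    The trial-function idea in the second topic can be sketched compactly with automatic differentiation. The sketch below is an illustrative stand-in, not the thesis's solver: it assumes the 1D diffusion equation u_t = D u_xx on the unit square, a small PyTorch network, a sin(pi x) initial condition, and omits boundary-condition terms for brevity.

```python
import math
import torch

torch.manual_seed(0)
D = 0.1  # diffusion coefficient (illustrative)

net = torch.nn.Sequential(            # the FNN that generates the trial function
    torch.nn.Linear(2, 32), torch.nn.Tanh(),
    torch.nn.Linear(32, 32), torch.nn.Tanh(),
    torch.nn.Linear(32, 1),
)
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

def pde_residual(xt):
    """u_t - D * u_xx evaluated at collocation points xt = (x, t)."""
    xt = xt.clone().requires_grad_(True)
    u = net(xt)
    g = torch.autograd.grad(u.sum(), xt, create_graph=True)[0]
    u_x, u_t = g[:, 0:1], g[:, 1:2]
    u_xx = torch.autograd.grad(u_x.sum(), xt, create_graph=True)[0][:, 0:1]
    return u_t - D * u_xx

for step in range(2000):
    xt = torch.rand(256, 2)                          # random collocation points
    x0 = torch.cat([xt[:, :1], torch.zeros(256, 1)], dim=1)
    ic = net(x0) - torch.sin(math.pi * x0[:, :1])    # u(x, 0) = sin(pi x)
    loss = pde_residual(xt).pow(2).mean() + ic.pow(2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
```

    Driving this composite loss toward zero adjusts the weights and biases as the abstract describes; for higher-dimensional diffusion problems, essentially only the input width of the first layer and the collocation sampler change.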

    Facing online challenges using learning classifier systems

    Recent advances in machine learning have fostered the design of competent algorithms that are able to learn and extract novel and useful information from data. Some of these techniques have been successfully applied to solve real-world problems in distinct technological, scientific and industrial areas; problems that could not be handled by the traditional engineering methodology of analysis, either because of their inherent complexity or because of the huge volumes of data involved. Owing to this initial success, current machine learning systems face problems of ever higher difficulty, which has promoted the interest of practitioners in designing systems able to tackle real-world problems scalably and efficiently. One of the most appealing machine learning paradigms is Learning Classifier Systems (LCSs), and more specifically Michigan-style LCSs, an open framework that combines an apportionment-of-credit mechanism with a knowledge discovery technique inspired by biological processes to evolve its internal knowledge. In this regard, LCSs mimic human experts by making use of rule lists to choose the best action for a given problem situation, acquiring their knowledge through experience. LCSs have been applied with relative success to a wide set of real-world problems, such as cancer prediction or business support systems, among many others. Furthermore, in some of these areas LCSs have demonstrated learning capacities that exceed those of human experts for the particular task. The purpose of this thesis is to explore the online learning nature of Michigan-style LCSs for mining large amounts of data in the form of continuous, high-speed and time-changing streams of information. Most often, extracting knowledge from these data is key to gaining a better understanding of the processes the data are describing. Learning from these data poses new challenges to traditional machine learning techniques, which are not typically designed to deal with data in which concepts and noise levels may vary over time. The contribution of this thesis takes the eXtended Classifier System (XCS), the most studied Michigan-style LCS and one of the most competent machine learning algorithms, as its starting point. The challenges addressed are thus twofold: the first is to build a competent supervised system on the Michigan-style LCS framework that learns from data streams with a fast reaction capacity to concept changes and noisy inputs. As many scientific and industrial applications generate vast amounts of unlabelled data, the second challenge is to apply the lessons learned to continue with the design of unsupervised Michigan-style LCSs that handle online problems without assuming any a priori structure in the input data.
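
    The Michigan-style cycle the abstract describes (match the current input against a rule population, cover when nothing matches, predict by a fitness-weighted vote, and assign credit incrementally) can be sketched in a few lines. The sketch below is a deliberately simplified, supervised stand-in, not XCS itself: genetic rule discovery, deletion and niching are omitted, and every name and parameter is an illustrative assumption.

```python
import random
from dataclasses import dataclass

@dataclass
class Rule:
    condition: str      # ternary string over {'0', '1', '#'}; '#' matches anything
    action: int         # class this rule advocates
    correct: int = 0
    seen: int = 0

    def matches(self, x: str) -> bool:
        return all(c == '#' or c == b for c, b in zip(self.condition, x))

    @property
    def accuracy(self) -> float:
        return self.correct / self.seen if self.seen else 0.0

def online_step(population: list, x: str, label: int) -> int:
    match_set = [r for r in population if r.matches(x)]
    if not match_set:   # covering: invent a rule for an unseen situation
        cond = ''.join(b if random.random() < 0.67 else '#' for b in x)
        match_set = [Rule(cond, label)]
        population += match_set
    # Predict by accuracy-weighted vote over the match set.
    votes = {}
    for r in match_set:
        votes[r.action] = votes.get(r.action, 0.0) + r.accuracy + 1e-6
    prediction = max(votes, key=votes.get)
    # Incremental credit assignment from the supervised label.
    for r in match_set:
        r.seen += 1
        r.correct += r.action == label
    return prediction
```

    Feeding online_step one labelled example at a time is what makes the scheme stream-friendly: a concept change simply degrades the accuracies of outdated rules, which a full Michigan-style system would answer with genetic rule discovery and deletion.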

    Intégration de l’analyse prédictive dans des systèmes auto-adaptatifs

    In this thesis we propose proactive self-adaptation by integrating predictive analysis into two phases of the software process. At design time, we propose a predictive modeling process comprising the following activities: defining goals, collecting data, selecting the model structure, preparing the data, building candidate predictive models, training, testing and cross-validating the candidate models, and selecting the "best" models based on a measure of model goodness. At runtime, we consume the predictions of the selected predictive models using actual data from the running system. Depending on the input data and the time allowed for the learning algorithms, we argue that the software system can foresee possible future input variables and adapt proactively in order to meet middle- and long-term goals and requirements.
    In recent years there has been growing interest in software systems capable of coping with the dynamics of constantly changing environments. Self-adaptive systems are needed today to adapt dynamically to new situations while maximizing performance and availability. Ubiquitous and pervasive systems operate in complex, heterogeneous environments and use resource-constrained devices, where events can compromise the quality of the system. It is therefore desirable to rely on mechanisms that adapt the system according to the events occurring in its execution context. In particular, the Software Engineering for Self-Adaptive Systems (SEAMS) community strives to achieve a set of self-management properties in computing systems: the so-called self-configuring, self-healing, self-optimizing and self-protecting properties. To achieve self-management, a software system implements an autonomic control loop known as the MAPE-K loop [78], the reference paradigm for designing self-adaptive software in the context of autonomic computing. This model consists of sensors and effectors together with four key activities, Monitor, Analyze, Plan and Execute, complemented by a knowledge base called Knowledge that passes information between the other activities [78]. A study of the recent literature on the subject [109, 71] shows that dynamic adaptation is generally performed reactively, in which case software systems are unable to anticipate recurring problematic situations. In some situations this can lead to unnecessary overheads or temporary unavailability of system resources. By contrast, a proactive approach does not simply react to events in the environment, but behaves in a goal-directed way, taking anticipatory initiatives to improve the system's performance or quality of service.
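
    To make the contrast concrete, here is a minimal sketch of one MAPE-K cycle whose Analyze activity consumes a prediction instead of only the current reading. The ManagedSystem interface, the thresholds and the linear-trend forecaster are illustrative assumptions standing in for the thesis's selected predictive models.

```python
from collections import deque

class ManagedSystem:
    """Toy stand-in for the running system's sensors and effectors."""
    def __init__(self):
        self.load, self.replicas = 0.5, 1
    def read_load(self):                     # sensor
        return self.load
    def add_replica(self):                   # effector
        self.replicas += 1
    def remove_replica(self):                # effector
        self.replicas = max(1, self.replicas - 1)

def forecast(history, horizon=3):
    """Toy predictive model: linear extrapolation of the recent trend.
    At design time, the thesis would select the 'best' model instead."""
    if len(history) < 2:
        return history[-1] if history else 0.0
    slope = (history[-1] - history[0]) / (len(history) - 1)
    return history[-1] + horizon * slope

def mape_k_step(system, knowledge, high=0.8, low=0.3):
    load = system.read_load()                # Monitor
    knowledge.append(load)                   # Knowledge: shared history
    predicted = forecast(list(knowledge))    # Analyze: predict, not just react
    if predicted > high:                     # Plan: anticipate saturation
        system.add_replica()                 # Execute: scale up ahead of time
    elif predicted < low:
        system.remove_replica()              # Execute: release spare capacity

knowledge = deque(maxlen=50)                 # sliding observation window
```

    Because Plan acts on the forecast rather than the last reading, a rising trend triggers scaling before the high threshold is actually crossed, which is the anticipatory behaviour a purely reactive loop lacks.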

    Complete Issue 7, 1992


    The Stained Glass of Knowledge: On Understanding Novice Mental Models of Computing

    Learning to program can be a novel experience. The rigidity of programming can be at odds with beginning programmers' existing perceptions, and the concepts can feel entirely unfamiliar. These observations motivated this research, which explores two major questions: What factors influence how novices learn programming? And how can analogy be more appropriately leveraged in programming education? This dissertation investigates the factors influencing novice programming through multiple methods. The CS1 classroom is observed as a whole system, with consideration of the factors present in it that can influence the learning process. The cognitive processes of learning are elaborated to ground the exploration of learning programming specifically, including an extensive literature review spanning multiple disciplines that positions and guides the investigation. Through its disciplinary depth, the literature survey also contributes to a greater understanding of learning cognition within computing education research. The focus on analogy in the second question is motivated by the factors observed in the first. Analogy's role in cognition and in education is examined, and the analogical inclinations of technology as a field are showcased. Despite these indications, stigma surrounds the use of analogy in computer science education, which motivated investigating how analogy could be better employed in programming education. This research presents a tool for the design of well-formed analogies in programming to answer that question. It also investigates additional forms analogy can take in the classroom, proposing that relevant cultural forms such as memes can serve as analogical vehicles that promote learner engagement. This research presents a strong case for the value of analogy in the CS1 classroom and provides a tool to facilitate the design of well-formed analogies. By identifying ways to better leverage analogy in the programming classroom, it will hopefully contribute to dispelling analogy's bad reputation in computing education. By exploring factors that contribute to the learning process in CS1, this research frames education design as experience design, motivating methods and considerations from user experience design and investigating aspects of the whole system that can enhance or hinder a learner's experience. This dissertation presents findings on understanding the learner's experience in the programming classroom and on how analogy can be used to benefit the learning process.

    Topology Reconstruction of Dynamical Networks via Constrained Lyapunov Equations

    The network structure (or topology) of a dynamical network is often unavailable or uncertain. Hence, we consider the problem of network reconstruction. Network reconstruction aims at inferring the topology of a dynamical network using measurements obtained from the network. In this technical note we define the notion of solvability of the network reconstruction problem. Subsequently, we provide necessary and sufficient conditions under which the network reconstruction problem is solvable. Finally, using constrained Lyapunov equations, we establish novel network reconstruction algorithms applicable to general dynamical networks. We also provide specialized algorithms for specific network dynamics, such as the well-known consensus and adjacency dynamics.
    Comment: 8 pages
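
    The note's general algorithms are not reproduced here, but the core mechanism can be illustrated on a toy undirected network: the stationary covariance of noise-driven stable dynamics satisfies a Lyapunov equation, and a symmetry constraint on the unknown state matrix turns that equation into a linear (Sylvester) system from which the topology can be read off. The path graph, the stabilizing leak term and the SciPy calls below are all illustrative assumptions.

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov, solve_sylvester

# Ground truth: leaky consensus dynamics x' = A x with A = -(L + eps*I)
# on a 5-node path graph; the leak eps makes A stable so a stationary
# covariance exists. This toy setup is illustrative, not the note's.
n, eps = 5, 0.1
L = np.diag([1.0, 2.0, 2.0, 2.0, 1.0])
for i in range(n - 1):
    L[i, i + 1] = L[i + 1, i] = -1.0
A_true = -(L + eps * np.eye(n))

# "Measured" stationary covariance Sigma of dx = A x dt + dW:
# it solves the Lyapunov equation A Sigma + Sigma A^T = -I.
Sigma = solve_continuous_lyapunov(A_true, -np.eye(n))

# Reconstruction: under the symmetry constraint A = A^T (undirected
# network), the same equation, read as Sigma A + A Sigma = -I with A
# unknown, is a Sylvester equation that is linear in A.
A_hat = solve_sylvester(Sigma, Sigma, -np.eye(n))

# Read the topology off the off-diagonal sparsity pattern of A_hat.
W = A_hat - np.diag(np.diag(A_hat))
edges = [(i, j) for i in range(n) for j in range(i + 1, n)
         if abs(W[i, j]) > 1e-6]
print("recovered edges:", edges)   # expect the path edges (0,1)..(3,4)
```

    In practice Sigma would be estimated from measured trajectories rather than computed from the ground truth; the solvability conditions in the note address when such a reconstruction is possible at all.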