1,912 research outputs found

    How functional programming mattered

    Get PDF
    In 1989 when functional programming was still considered a niche topic, Hughes wrote a visionary paper arguing convincingly ‘why functional programming matters’. More than two decades have passed. Has functional programming really mattered? Our answer is a resounding ‘Yes!’. Functional programming is now at the forefront of a new generation of programming technologies, and enjoying increasing popularity and influence. In this paper, we review the impact of functional programming, focusing on how it has changed the way we may construct programs, the way we may verify programs, and fundamentally the way we may think about programs

    Genetic Programming + Proof Search = Automatic Improvement

    Get PDF
    Search Based Software Engineering techniques are emerging as important tools for software maintenance. Foremost among these is Genetic Improvement, which has historically applied the stochastic techniques of Genetic Programming to optimize pre-existing program code. Previous work in this area has not generally preserved program semantics and this article describes an alternative to the traditional mutation operators used, employing deterministic proof search in the sequent calculus to yield semantics-preserving transformations on algebraic data types. Two case studies are described, both of which are applicable to the recently-introduced `grow and graft' technique of Genetic Improvement: the first extends the expressiveness of the `grafting' phase and the second transforms the representation of a list data type to yield an asymptotic efficiency improvement

    Towards Data Wrangling Automation through Dynamically-Selected Background Knowledge

    Full text link
    [ES] El proceso de ciencia de datos es esencial para extraer valor de los datos. Sin embargo, la parte más tediosa del proceso, la preparación de los datos, implica una serie de formateos, limpieza e identificación de problemas que principalmente son tareas manuales. La preparación de datos todavía se resiste a la automatización en parte porque el problema depende en gran medida de la información del dominio, que se convierte en un cuello de botella para los sistemas de última generación a medida que aumenta la diversidad de dominios, formatos y estructuras de los datos. En esta tesis nos enfocamos en generar algoritmos que aprovechen el conocimiento del dominio para la automatización de partes del proceso de preparación de datos. Mostramos la forma en que las técnicas generales de inducción de programas, en lugar de los lenguajes específicos del dominio, se pueden aplicar de manera flexible a problemas donde el conocimiento es importante, mediante el uso dinámico de conocimiento específico del dominio. De manera más general, sostenemos que una combinación de enfoques de aprendizaje dinámicos y basados en conocimiento puede conducir a buenas soluciones. Proponemos varias estrategias para seleccionar o construir automáticamente el conocimiento previo apropiado en varios escenarios de preparación de datos. La idea principal se basa en elegir las mejores primitivas especializadas de acuerdo con el contexto del problema particular a resolver. Abordamos dos escenarios. En el primero, manejamos datos personales (nombres, fechas, teléfonos, etc.) que se presentan en formatos de cadena de texto muy diferentes y deben ser transformados a un formato unificado. El problema es cómo construir una transformación compositiva a partir de un gran conjunto de primitivas en el dominio (por ejemplo, manejar meses, años, días de la semana, etc.). Desarrollamos un sistema (BK-ADAPT) que guía la búsqueda a través del conocimiento previo extrayendo varias meta-características de los ejemplos que caracterizan el dominio de la columna. En el segundo escenario, nos enfrentamos a la transformación de matrices de datos en lenguajes de programación genéricos como R, utilizando como ejemplos una matriz de entrada y algunas celdas de la matriz de salida. También desarrollamos un sistema guiado por una búsqueda basada en árboles (AUTOMAT[R]IX) que usa varias restricciones, probabilidades previas para las primitivas y sugerencias textuales, para aprender eficientemente las transformaciones. Con estos sistemas, mostramos que la combinación de programación inductiva, con la selección dinámica de las primitivas apropiadas a partir del conocimiento previo, es capaz de mejorar los resultados de otras herramientas actuales específicas para la preparación de datos.[CA] El procés de ciència de dades és essencial per extraure valor de les dades. No obstant això, la part més tediosa del procés, la preparació de les dades, implica una sèrie de transformacions, neteja i identificació de problemes que principalment són tasques manuals. La preparació de dades encara es resisteix a l'automatització en part perquè el problema depén en gran manera de la informació del domini, que es converteix en un coll de botella per als sistemes d'última generació a mesura que augmenta la diversitat de dominis, formats i estructures de les dades. En aquesta tesi ens enfoquem a generar algorismes que aprofiten el coneixement del domini per a l'automatització de parts del procés de preparació de dades. Mostrem la forma en què les tècniques generals d'inducció de programes, en lloc dels llenguatges específics del domini, es poden aplicar de manera flexible a problemes on el coneixement és important, mitjançant l'ús dinàmic de coneixement específic del domini. De manera més general, sostenim que una combinació d'enfocaments d'aprenentatge dinàmics i basats en coneixement pot conduir a les bones solucions. Proposem diverses estratègies per seleccionar o construir automàticament el coneixement previ apropiat en diversos escenaris de preparació de dades. La idea principal es basa a triar les millors primitives especialitzades d'acord amb el context del problema particular a resoldre. Abordem dos escenaris. En el primer, manegem dades personals (noms, dates, telèfons, etc.) que es presenten en formats de cadena de text molt diferents i han de ser transformats a un format unificat. El problema és com construir una transformació compositiva a partir d'un gran conjunt de primitives en el domini (per exemple, manejar mesos, anys, dies de la setmana, etc.). Desenvolupem un sistema (BK-ADAPT) que guia la cerca a través del coneixement previ extraient diverses meta-característiques dels exemples que caracteritzen el domini de la columna. En el segon escenari, ens enfrontem a la transformació de matrius de dades en llenguatges de programació genèrics com a R, utilitzant com a exemples una matriu d'entrada i algunes dades de la matriu d'eixida. També desenvolupem un sistema guiat per una cerca basada en arbres (AUTOMAT[R]IX) que usa diverses restriccions, probabilitats prèvies per a les primitives i suggeriments textuals, per aprendre eficientment les transformacions. Amb aquests sistemes, mostrem que la combinació de programació inductiva amb la selecció dinàmica de les primitives apropiades a partir del coneixement previ, és capaç de millorar els resultats d'altres enfocaments de preparació de dades d'última generació i més específics.[EN] Data science is essential for the extraction of value from data. However, the most tedious part of the process, data wrangling, implies a range of mostly manual formatting, identification and cleansing manipulations. Data wrangling still resists automation partly because the problem strongly depends on domain information, which becomes a bottleneck for state-of-the-art systems as the diversity of domains, formats and structures of the data increases. In this thesis we focus on generating algorithms that take advantage of the domain knowledge for the automation of parts of the data wrangling process. We illustrate the way in which general program induction techniques, instead of domain-specific languages, can be applied flexibly to problems where knowledge is important, through the dynamic use of domain-specific knowledge. More generally, we argue that a combination of knowledge-based and dynamic learning approaches leads to successful solutions. We propose several strategies to automatically select or construct the appropriate background knowledge for several data wrangling scenarios. The key idea is based on choosing the best specialised background primitives according to the context of the particular problem to solve. We address two scenarios. In the first one, we handle personal data (names, dates, telephone numbers, etc.) that are presented in very different string formats and have to be transformed into a unified format. The problem is how to build a compositional transformation from a large set of primitives in the domain (e.g., handling months, years, days of the week, etc.). We develop a system (BK-ADAPT) that guides the search through the background knowledge by extracting several meta-features from the examples characterising the column domain. In the second scenario, we face the transformation of data matrices in generic programming languages such as R, using an input matrix and some cells of the output matrix as examples. We also develop a system guided by a tree-based search (AUTOMAT[R]IX) that uses several constraints, prior primitive probabilities and textual hints to efficiently learn the transformations. With these systems, we show that the combination of inductive programming with the dynamic selection of the appropriate primitives from the background knowledge is able to improve the results of other state-of-the-art and more specific data wrangling approaches.This research was supported by the Spanish MECD Grant FPU15/03219;and partially by the Spanish MINECO TIN2015-69175-C4-1-R (Lobass) and RTI2018-094403-B-C32-AR (FreeTech) in Spain; and by the ERC Advanced Grant Synthesising Inductive Data Models (Synth) in Belgium.Contreras Ochando, L. (2020). Towards Data Wrangling Automation through Dynamically-Selected Background Knowledge [Tesis doctoral]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/160724TESI

    Design Ltd.: Renovated Myths for the Development of Socially Embedded Technologies

    Full text link
    This paper argues that traditional and mainstream mythologies, which have been continually told within the Information Technology domain among designers and advocators of conceptual modelling since the 1960s in different fields of computing sciences, could now be renovated or substituted in the mould of more recent discourses about performativity, complexity and end-user creativity that have been constructed across different fields in the meanwhile. In the paper, it is submitted that these discourses could motivate IT professionals in undertaking alternative approaches toward the co-construction of socio-technical systems, i.e., social settings where humans cooperate to reach common goals by means of mediating computational tools. The authors advocate further discussion about and consolidation of some concepts in design research, design practice and more generally Information Technology (IT) development, like those of: task-artifact entanglement, universatility (sic) of End-User Development (EUD) environments, bricolant/bricoleur end-user, logic of bricolage, maieuta-designers (sic), and laissez-faire method to socio-technical construction. Points backing these and similar concepts are made to promote further discussion on the need to rethink the main assumptions underlying IT design and development some fifty years later the coming of age of software and modern IT in the organizational domain.Comment: This is the peer-unreviewed of a manuscript that is to appear in D. Randall, K. Schmidt, & V. Wulf (Eds.), Designing Socially Embedded Technologies: A European Challenge (2013, forthcoming) with the title "Building Socially Embedded Technologies: Implications on Design" within an EUSSET editorial initiative (www.eusset.eu/

    Specific-to-General Learning for Temporal Events with Application to Learning Event Definitions from Video

    Full text link
    We develop, analyze, and evaluate a novel, supervised, specific-to-general learner for a simple temporal logic and use the resulting algorithm to learn visual event definitions from video sequences. First, we introduce a simple, propositional, temporal, event-description language called AMA that is sufficiently expressive to represent many events yet sufficiently restrictive to support learning. We then give algorithms, along with lower and upper complexity bounds, for the subsumption and generalization problems for AMA formulas. We present a positive-examples--only specific-to-general learning method based on these algorithms. We also present a polynomial-time--computable ``syntactic'' subsumption test that implies semantic subsumption without being equivalent to it. A generalization algorithm based on syntactic subsumption can be used in place of semantic generalization to improve the asymptotic complexity of the resulting learning algorithm. Finally, we apply this algorithm to the task of learning relational event definitions from video and show that it yields definitions that are competitive with hand-coded ones

    TOWARDS A MULTI-PURPOSE FRAMEWORK FOR TAX-BENEFIT MICROSIMULATION.

    Get PDF
    This paper introduces a generalised model building platform (MMEANS) for implementing and using tax-benefit microsimulation models. It is designed to aid in the construction of single- and multi-country tax- benefit models by providing all essential components and a system by which these can be parameterised and combined into a full model. We explain the conceptual and computational issues arising in the design and development of MMEANS. One application of the software has been to construct EUROMOD, a 15 country European tax-benefit model (Immervoll et al., 1999; Sutherland, 2001). However, we argue that, apart from its direct usefulness for this model, MMEANS can be used as a general software tool for microsimulation model building.Microsimulation; Tax-Benefit Model; European Union

    Towards Safe Artificial General Intelligence

    Get PDF
    The field of artificial intelligence has recently experienced a number of breakthroughs thanks to progress in deep learning and reinforcement learning. Computer algorithms now outperform humans at Go, Jeopardy, image classification, and lip reading, and are becoming very competent at driving cars and interpreting natural language. The rapid development has led many to conjecture that artificial intelligence with greater-than-human ability on a wide range of tasks may not be far. This in turn raises concerns whether we know how to control such systems, in case we were to successfully build them. Indeed, if humanity would find itself in conflict with a system of much greater intelligence than itself, then human society would likely lose. One way to make sure we avoid such a conflict is to ensure that any future AI system with potentially greater-than-human-intelligence has goals that are aligned with the goals of the rest of humanity. For example, it should not wish to kill humans or steal their resources. The main focus of this thesis will therefore be goal alignment, i.e. how to design artificially intelligent agents with goals coinciding with the goals of their designers. Focus will mainly be directed towards variants of reinforcement learning, as reinforcement learning currently seems to be the most promising path towards powerful artificial intelligence. We identify and categorize goal misalignment problems in reinforcement learning agents as designed today, and give examples of how these agents may cause catastrophes in the future. We also suggest a number of reasonably modest modifications that can be used to avoid or mitigate each identified misalignment problem. Finally, we also study various choices of decision algorithms, and conditions for when a powerful reinforcement learning system will permit us to shut it down. The central conclusion is that while reinforcement learning systems as designed today are inherently unsafe to scale to human levels of intelligence, there are ways to potentially address many of these issues without straying too far from the currently so successful reinforcement learning paradigm. Much work remains in turning the high-level proposals suggested in this thesis into practical algorithms, however
    corecore