    An Introduction to Programming for Bioscientists: A Python-based Primer

    Computing has revolutionized the biological sciences over the past several decades, such that virtually all contemporary research in the biosciences utilizes computer programs. The computational advances have come on many fronts, spurred by fundamental developments in hardware, software, and algorithms. These advances have influenced, and even engendered, a phenomenal array of bioscience fields, including molecular evolution and bioinformatics; genome-, proteome-, transcriptome- and metabolome-wide experimental studies; structural genomics; and atomistic simulations of cellular-scale molecular assemblies as large as ribosomes and intact viruses. In short, much of post-genomic biology is increasingly becoming a form of computational biology. The ability to design and write computer programs is among the most indispensable skills that a modern researcher can cultivate. Python has become a popular programming language in the biosciences, largely because (i) its straightforward semantics and clean syntax make it a readily accessible first language; (ii) it is expressive and well-suited to object-oriented programming, as well as other modern paradigms; and (iii) the many available libraries and third-party toolkits extend the functionality of the core language into virtually every biological domain (sequence and structure analyses, phylogenomics, workflow management systems, etc.). This primer offers a basic introduction to coding, via Python, and it includes concrete examples and exercises to illustrate the language's usage and capabilities; the main text culminates with a final project in structural bioinformatics. A suite of Supplemental Chapters is also provided. 
Starting with basic concepts, such as that of a 'variable', the Chapters methodically advance the reader to the point of writing a graphical user interface to compute the Hamming distance between two DNA sequences. Comment: 65 pages total, including 45 pages text, 3 figures, 4 tables, numerous exercises, and 19 pages of Supporting Information; currently in press at PLOS Computational Biolog
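
As a rough illustration of the kind of computation the primer builds toward, the Hamming distance between two DNA sequences can be sketched in a few lines of Python. The function name `hamming_distance` and the example sequences are assumptions made here for illustration; they are not taken from the primer, and the graphical user interface it describes is omitted.

```python
def hamming_distance(seq_a: str, seq_b: str) -> int:
    """Count the positions at which two equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        # The Hamming distance is only defined for sequences of equal length.
        raise ValueError("sequences must have the same length")
    # Each mismatching pair contributes True (counted as 1) to the sum.
    return sum(base_a != base_b for base_a, base_b in zip(seq_a, seq_b))


if __name__ == "__main__":
    # The two strings differ at positions 3 and 6 (1-based), so this prints 2.
    print(hamming_distance("GATTACA", "GACTATA"))
```

Raising an error on unequal lengths is a design choice of this sketch; an alignment step would be needed before comparing sequences of different lengths.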

    Preemptive type checking in dynamically typed programs

    With the rise of languages such as JavaScript, dynamically typed languages have gained a strong foothold in the programming language landscape. These languages are very well suited for rapid prototyping and for use with agile programming methodologies. However, programmers would benefit from the ability to detect type errors in their code early, without imposing unnecessary restrictions on their programs. Here we describe a new type inference system that identifies potential type errors through a flow-sensitive static analysis. This analysis is invoked at a very late stage, after the compilation to bytecode and initialisation of the program. It computes for every expression the variable’s present (from the values that it has last been assigned) and future (with which it is used in the further program execution) types, respectively. Using this information, our mechanism inserts type checks at strategic points in the original program. We prove that these checks, inserted as early as possible, preempt type errors earlier than existing type systems. We further show that these checks do not change the semantics of programs that do not raise type errors. Preemptive type checking can be added to existing languages without the need to modify the existing runtime environment. We show this with an implementation for the Python language and demonstrate its effectiveness on a number of benchmarks.
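
The effect of raising a type error earlier can be sketched by hand in Python. This is only an illustration of the idea: the system described above infers types from bytecode and inserts its checks automatically, whereas the check below is written manually, and the function names `report_late` and `report_preemptive` are hypothetical.

```python
def report_late(values, label, log):
    """Fails only at the last line, after all the work has been done."""
    total = 0
    for v in values:
        log.append(v)      # observable work performed before the failure
        total += v
    return label + total   # raises TypeError here when label is a str


def report_preemptive(values, label, log):
    """Same computation, with a hand-written early check on 'label'."""
    # Stand-in for an inserted check: 'label' is later added to a number,
    # so a non-numeric value is rejected before any work is done.
    if not isinstance(label, (int, float)):
        raise TypeError(f"label must be numeric, got {type(label).__name__}")
    total = 0
    for v in values:
        log.append(v)
        total += v
    return label + total
```

Both functions raise `TypeError` for `label="sum: "`, but `report_late` has already appended every value to `log` by then, while `report_preemptive` leaves `log` empty. For well-typed calls the two behave identically.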

    Programming tools for intelligent systems

    Programming tools are computer programs which help humans program computers. Tools come in all shapes and forms, from editors and compilers to debuggers and profilers. Each of these tools facilitates a core task in the programming workflow which consumes cognitive resources when performed manually. 
In this thesis, we explore several tools that facilitate the process of building intelligent systems, and which reduce the cognitive effort required to design, develop, test and deploy intelligent software systems. First, we introduce an integrated development environment (IDE) for programming Robot Operating System (ROS) applications, called Hatchery (Chapter 2). Second, we describe Kotlin∇, a language and type system for differentiable programming, an emerging paradigm in machine learning (Chapter 3). Third, we propose a new algorithm for automatically testing differentiable programs, drawing inspiration from techniques in adversarial and metamorphic testing (Chapter 4), and demonstrate its empirical efficiency in the regression setting. Fourth, we explore a container infrastructure based on Docker, which enables reproducible deployment of ROS applications on the Duckietown platform (Chapter 5). Finally, we reflect on the current state of programming tools for these applications and speculate what intelligent systems programming might look like in the future (Chapter 6)
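
For readers unfamiliar with differentiable programming, the core idea (ordinary code whose derivative is computed automatically) can be sketched with forward-mode dual numbers in Python. This is not Kotlin∇'s API, type system, or algorithm; the names `Dual`, `derivative`, and `poly` are assumptions introduced only for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """A value paired with its derivative with respect to one input."""
    value: float
    deriv: float = 0.0

    def __add__(self, other):
        other = _lift(other)
        # Sum rule: (u + v)' = u' + v'
        return Dual(self.value + other.value, self.deriv + other.deriv)

    __radd__ = __add__

    def __mul__(self, other):
        other = _lift(other)
        # Product rule: (u * v)' = u' * v + u * v'
        return Dual(self.value * other.value,
                    self.deriv * other.value + self.value * other.deriv)

    __rmul__ = __mul__


def _lift(x):
    """Treat plain numbers as constants (derivative zero)."""
    return x if isinstance(x, Dual) else Dual(float(x))


def derivative(f, x):
    """Evaluate f at x with the input's derivative seeded to 1."""
    return f(Dual(float(x), 1.0)).deriv


def poly(x):
    # An ordinary function; it works on floats and on Dual values alike.
    return 3 * x * x + 2 * x + 1
```

Here `derivative(poly, 2.0)` evaluates to `14.0`, matching the analytic derivative 6x + 2 at x = 2. Only addition and multiplication are supported in this sketch.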

    Fully reflective execution environments

    Many programming languages run on top of a Virtual Machine (VM). VMs are complex pieces of software because they realize the language semantics and provide efficiency, portability, and security. Unfortunately, mainstream VMs are engineered as “black boxes” and provide only minimal means to expose their state and behavior at run time. In this thesis we argue that the lack of interaction between applications and VMs puts a limit on the adaptation capabilities of (running) applications. To overcome this situation we introduce the notion of a fully reflective VM: a new kind of VM providing reflection not only at the application but also at the VM level. In other words, a fully reflective VM provides means to support its own observability and modifiability at run time and enables programming languages to interact with and adapt the underlying VM to changing requirements. We propose a reference architecture for such VMs and discuss some challenges in terms of performance degradation that these systems may induce. We then introduce a series of optimizations targeted specifically at this kind of platform. 
They are based on a key assumption: that the variability of the VM behavior tends to be low at run time. Accordingly, we apply standard dynamic compilation techniques such as specialization, speculation, and deoptimization on the VM code itself as a means to mitigate the overheads. To validate our claims we built two reflective VMs, one featuring a method-based just-in-time (JIT) compiler and the other running on top of a trace-based optimizer. We start our evaluation by analyzing a series of case studies to understand how a reflective VM could deal with unanticipated adaptation scenarios on the fly. Furthermore, we compare our approach with the existing language-level alternatives and provide an elaborate discussion of why a reflective VM would subsume all of them. Then, we empirically show that our implementations can achieve peak performance similar to that of standard VMs when their reflective mechanisms are not activated. Moreover, they present low overheads (in comparison to existing alternatives) when the VM’s reflective capabilities are used. We also analyze how the different compilation strategies (per-method vs. tracing) affect the overall results. Finally, we conduct a series of experiments in order to study the effects of opening up the compilation module of a reflective VM to the applications. We conclude that it is a plausible approach that brings new opportunities for optimizing algorithms in which the compiler heuristics fail to give optimal results. Affiliation: Chari, Guido Martín. Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales; Argentina.

    Programming Language Evolution and Source Code Rejuvenation

    Programmers rely on programming idioms, design patterns, and workaround techniques to express fundamental design not directly supported by the language. Evolving languages often address frequently encountered problems by adding language and library support to subsequent releases. By using new features, programmers can express their intent more directly. As new concerns, such as parallelism or security, arise, early idioms and language facilities can become serious liabilities. Modern code sometimes benefits from optimization techniques not feasible for code that uses less expressive constructs. Manual source code migration is expensive, time-consuming, and prone to errors. This dissertation discusses the introduction of new language features and libraries, exemplified by open-methods and a non-blocking growable array library. We describe the relationship of open-methods to various alternative implementation techniques. The benefits of open-methods materialize in simpler code, better performance, and similar memory footprint when compared to using alternative implementation techniques. Based on these findings, we develop the notion of source code rejuvenation, the automated migration of legacy code. Source code rejuvenation leverages enhanced programming language and library facilities by finding and replacing coding patterns that can be expressed through higher-level software abstractions. Raising the level of abstraction improves code quality by lowering software entropy. In conjunction with extensions to programming languages, source code rejuvenation offers an evolutionary trajectory towards more reliable, more secure, and better performing code. We describe the tools that allow us to implement code rejuvenations efficiently. The Pivot source-to-source translation infrastructure and its traversal mechanism form the core of our machinery. 
In order to free programmers from representation details, we use a lightweight pattern matching generator that turns a C++-like input language into pattern matching code. The generated code integrates seamlessly with the rest of the analysis framework. We utilize the framework to build analysis systems that find common workaround techniques for designated language extensions of C++0x (e.g., initializer lists). Moreover, we describe a novel system (TACE: template analysis and concept extraction) for the analysis of uninstantiated template code. Our tool automatically extracts requirements from the body of template functions. TACE helps programmers understand the requirements that their code de facto imposes on arguments and compare those de facto requirements to formal and informal specifications.

    Simplifying the Analysis of C++ Programs

    Based on our experience of working with different C++ front ends, this thesis identifies numerous problems that complicate the analysis of C++ programs along the entire spectrum of analysis applications. We utilize library, language, and tool extensions to address these problems and offer solutions to many of them. In particular, we present efficient, expressive and non-intrusive means of dealing with abstract syntax trees of a program, which together render the visitor design pattern obsolete. We further extend C++ with open multi-methods to deal with the broader expression problem. Finally, we offer two techniques, one based on refining the type system of a language and the other on abstract interpretation, both of which allow developers to statically ensure or verify various run-time properties of their programs without having to deal with the full language semantics or even the abstract syntax tree of a program. Together, the solutions presented in this thesis make ensuring properties of interest about C++ programs available to average language users
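
The idea behind open multi-methods (selecting an implementation from the run-time types of more than one argument, with cases defined outside the classes) can be sketched in Python. The thesis extends C++ itself; the registry below is only an analogy. All names (`Shape`, `Circle`, `Square`, `overlap`, `overlap_case`) are hypothetical, and the lookup order is a simplification that does not detect ambiguous cases.

```python
class Shape:
    pass


class Circle(Shape):
    pass


class Square(Shape):
    pass


# Maps a pair of argument types to the function handling that pair.
_overlap_impls = {}


def overlap_case(type_a, type_b):
    """Register an implementation of 'overlap' for one pair of types."""
    def register(func):
        _overlap_impls[(type_a, type_b)] = func
        return func
    return register


def overlap(a, b):
    """Dispatch on the run-time types of both arguments.

    Walks the method resolution order of each argument and returns the
    result of the first registered pair found.
    """
    for type_a in type(a).__mro__:
        for type_b in type(b).__mro__:
            impl = _overlap_impls.get((type_a, type_b))
            if impl is not None:
                return impl(a, b)
    raise TypeError("no overlap implementation for these argument types")


@overlap_case(Circle, Circle)
def _(a, b):
    return "circle/circle"


@overlap_case(Circle, Square)
def _(a, b):
    return "circle/square"


@overlap_case(Shape, Shape)
def _(a, b):
    return "generic shape/shape"
```

With these registrations, `overlap(Circle(), Square())` selects the specific case, while `overlap(Square(), Circle())` falls back to the generic `Shape`/`Shape` case. New cases can be registered without editing the classes, which is what makes the methods open.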

    Programmiersprachen und Rechenkonzepte

    Since 1984, the GI special interest group "Programmiersprachen und Rechenkonzepte" (Programming Languages and Computing Concepts) has held a workshop every spring at the Physikzentrum Bad Honnef. The meeting serves primarily for participants to get to know one another, exchange experience, hold discussions, and deepen mutual contacts. The forum hosts talks and demonstrations of both completed and ongoing work on topics including (but not limited to): languages and language paradigms; correctness of design and implementation; tools; software/hardware architectures; specification and design; validation and verification; implementation and integration; safety and security; embedded systems; and low-level, hardware-oriented programming. This technical report collects some of the work presented.