A Historical Context for Data Streams
Machine learning from data streams is an active and growing research area.
Research on learning from streaming data typically makes strict assumptions
linked to computational resource constraints, including requirements for stream
mining algorithms to inspect each instance not more than once and be ready to
give a prediction at any time. Here we review the history of data streams research, placing the common assumptions used in machine learning over data streams in their historical context.
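The two constraints named above, inspecting each instance at most once and being ready to predict at any time, can be illustrated with a minimal single-pass learner. This sketch is not from the paper; the class and its trivial majority-vote strategy are invented for illustration.

```python
# Minimal sketch (illustrative only): a streaming learner that honours the
# two classic constraints of stream mining: each instance is inspected at
# most once, and a prediction is available at any time.
class StreamingMajorityClassifier:
    """Predicts the majority class seen so far; O(1) memory per class."""
    def __init__(self):
        self.counts = {}

    def learn_one(self, label):
        # Each instance is processed exactly once and then discarded.
        self.counts[label] = self.counts.get(label, 0) + 1

    def predict(self):
        # Ready to answer at any time, even before any data has arrived.
        if not self.counts:
            return None
        return max(self.counts, key=self.counts.get)

model = StreamingMajorityClassifier()
for label in ["spam", "ham", "spam", "spam", "ham"]:
    model.learn_one(label)
print(model.predict())  # -> spam
```

Real stream-mining algorithms (e.g. Hoeffding trees) follow the same contract while learning far richer models.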
Runtime Verification of Kotlin Coroutines
Kotlin was introduced to Android as the recommended language for development. One of the unique features of Kotlin is coroutines: lightweight tasks that can run concurrently inside threads. Programming with coroutines is difficult, among other reasons because they can move between threads and behave unexpectedly. We introduce runtime verification for Kotlin. We provide a language for writing properties and produce runtime monitors tailored to verifying Kotlin coroutines. We identify, formalise, and runtime verify seven properties about common runtime errors that are not easily identifiable by static analysis. To demonstrate the applicability of the technique in real applications, we apply our framework to an in-house Android app and to microbenchmarks, and measure the execution time and memory overheads.
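The shape of such a runtime monitor can be sketched as a state machine scanning a trace of coroutine events. This is an illustrative sketch only: the paper's framework targets Kotlin, and the property checked here ("a coroutine is never resumed after it completes") is a hypothetical example, not necessarily one of the paper's seven properties.

```python
# Illustrative sketch of a runtime monitor over a coroutine event trace.
# The property and event vocabulary are invented for the example.
def monitor(trace):
    """Scan a trace of (event, coroutine_id) pairs and return the ids of
    coroutines that violate: "never resume a completed coroutine"."""
    completed = set()
    violations = []
    for event, cid in trace:
        if event == "complete":
            completed.add(cid)
        elif event == "resume" and cid in completed:
            violations.append(cid)
    return violations

trace = [("resume", 1), ("complete", 1), ("resume", 2), ("resume", 1)]
print(monitor(trace))  # -> [1]
```

A real monitor would be generated from a property specification and attached to the coroutine machinery, rather than run over a recorded trace.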
Programming language trends: an empirical study
Predicting the evolution of software engineering technology trends is a dubious proposition. The recent evolution of software technology is a prime example; it is fast paced and affected by many factors, which are themselves driven by a wide range of sources. This dissertation is part of a long-term project intended to analyze software engineering technology trends and how they evolve, addressing the following question: how can we watch, predict, adapt to, and affect software engineering trends?
In this dissertation, one field of software engineering, programming languages, is discussed. A review of the history of a group of programming languages shows that two kinds of factors, intrinsic and extrinsic, can affect the evolution of a programming language. Intrinsic factors are those that describe the general design criteria of programming languages. Extrinsic factors are those that are not directly related to the general attributes of programming languages, but can still affect their evolution. In order to describe the relationship of these factors and how they affect programming language trends, the factors need to be quantified. A score has been assigned to each factor for every programming language. By collecting historical data, a data warehouse has been established which stores the value of each factor for every programming language. The programming language trends are described and evaluated using these data.
Empirical research attempts to capture observed behaviors in empirical laws. In this dissertation, statistical methods are used to describe historical programming language trends and predict their future evolution. Several statistical models are constructed to describe the relationships among these factors. Canonical correlation is used for the factor analysis, and multivariate multiple regression is used to construct the statistical models of programming language trends. After the models are constructed to describe the historical trends, they are extended to make tentative predictions of future trends. The models are validated by comparing the predicted data with the actual data.
Development of an Experimental Template Driven Editor for Trees
The development of a template editor for trees is examined by defining the mechanisms of the template and by discussing the techniques used to coordinate the traversal of two trees. This paper shows the feasibility of editing network and tree structures using an overlay (template) to keep the data in a defined pattern.
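The coordinated traversal of two trees, one holding data and one acting as an overlay template, can be sketched as a lockstep walk. The structures and the conformance check below are hypothetical, not the paper's system.

```python
# Minimal sketch (hypothetical, not the paper's editor): a template tree
# and a data tree traversed in lockstep, so the template keeps the data
# in a defined pattern.
def conforms(template, data):
    """A template node is either a dict of required children or a type
    acting as a leaf constraint; walk both trees together."""
    if isinstance(template, dict):
        if not isinstance(data, dict):
            return False
        return all(key in data and conforms(child, data[key])
                   for key, child in template.items())
    return isinstance(data, template)  # leaf: type constraint

template = {"name": str, "age": int}
print(conforms(template, {"name": "Ada", "age": 36}))   # -> True
print(conforms(template, {"name": "Ada", "age": "36"})) # -> False
```

An editor built on this idea would reject edits that make the data tree fall out of the template's pattern.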
Liberating Coroutines: Combining Sequential and Parallel Execution
Concurrent programming using threads is considered a hard and error-prone task. Coroutines are conceptually simpler and easier to program with due to their sequential nature. Flexible coroutines, as presented by Belsnes and Østvold, liberate classical coroutines from their quasi-parallel world and combine them with threads. This allows the programmer to factor programs into sequential and parallel tasks, leading to simpler programs.
This thesis presents an extension to the formal semantics of flexible coroutines. A detailed breakdown of the scheduling strategies and parameter passing is presented in the same formal framework. Patterns that emerge when programming with flexible coroutines are also discussed and defined within the formal framework.
We present a clean implementation of flexible coroutines in Java, based on standard threads and semaphores. Challenges encountered, such as representing coroutines in Java and invoking methods across threads, are discussed. This framework is used in examples that employ flexible coroutines in different ways: the classical synchronization problem of readers and writers, the Santa Claus problem, and binary and general semaphores.
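The core trick of building coroutine-style sequential control out of threads and semaphores can be shown in miniature. This sketch is in Python rather than the thesis's Java, and the two "coroutines" (ping and pong) are invented for the example: two semaphores hand control back and forth, so each task runs strictly sequentially even though each lives on its own thread.

```python
import threading

# Two semaphores implement an explicit control transfer (like a yield):
# only one "coroutine" holds control at any moment.
out = []
ping_turn = threading.Semaphore(1)   # ping may run first
pong_turn = threading.Semaphore(0)

def ping():
    for _ in range(3):
        ping_turn.acquire()          # wait until control is handed to us
        out.append("ping")
        pong_turn.release()          # transfer control to pong

def pong():
    for _ in range(3):
        pong_turn.acquire()
        out.append("pong")
        ping_turn.release()

t1 = threading.Thread(target=ping)
t2 = threading.Thread(target=pong)
t1.start(); t2.start()
t1.join(); t2.join()
print(out)  # -> ['ping', 'pong', 'ping', 'pong', 'ping', 'pong']
```

Because control is transferred explicitly, the interleaving is deterministic despite the two OS threads, which is what makes coroutine-style code simpler to reason about than free-running threads.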
Data dependent program generation
PhD Thesis. When information is stored in a computer it can usually be organised in many different ways. If the information is used for a number of different purposes the ideal organisation is not always obvious. It will depend on how often various parts of the data are used, how often they are changed, and the amount of data taking part in each transaction. It may be difficult to predict these parameters in advance, especially in data-base applications where the pattern of use may change as time goes by. Ultimately, one can visualise systems which can automatically choose the optimum representation, or which can substantially assist in the choice. A step in this direction, which could itself find immediate application, is to find a practical way to tailor programs to a particular data organisation. The thesis describes an experimental system which does this for a limited range of programs, and the work which led up to it. Both data retrieval and simple updates are considered.
One prerequisite is a method of writing the program so that it does not depend on the way that the data is stored. A number of data-base systems achieve this independence by describing the data as a collection of relations. These systems and the background to them are reviewed. The experimental system is loosely based on the use of relations, but some modifications have been made to make the processing simpler and so that the characteristics of the data organisation can be described. The system incorporates the representation into the program and produces a tailored version which is expressed in abstract, Algol-like code. The result is intended to be similar to code which a human programmer might write in similar circumstances, but as far as possible ignoring the details of any particular implementation.
IBM, Advanced Education Scheme
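The idea of tailoring a program to a particular data organisation can be sketched as choosing a specialised routine from a declared representation. This is a hypothetical sketch, not the thesis's system; the representation names and records are invented.

```python
import bisect

# Hypothetical sketch: the program is written against an abstract lookup,
# and a specialised version is produced for the declared representation.
def make_lookup(representation, records):
    """Return a lookup function tailored to how the data is stored."""
    if representation == "hashed":
        index = {key: value for key, value in records}
        return lambda key: index.get(key)
    if representation == "sorted":
        keys = sorted(k for k, _ in records)
        table = dict(records)
        def lookup(key):
            i = bisect.bisect_left(keys, key)
            return table[key] if i < len(keys) and keys[i] == key else None
        return lookup
    raise ValueError(f"unknown representation: {representation}")

records = [("b", 2), ("a", 1), ("c", 3)]
find = make_lookup("sorted", records)
print(find("b"), find("z"))  # -> 2 None
```

The thesis's system performs an analogous specialisation at the source level, emitting tailored abstract Algol-like code rather than selecting a closure at run time.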
High-performance software packet processing
In today’s Internet, it is highly desirable to have fast and scalable software packet processing solutions for network applications that run on commodity hardware. The advent of cloud computing drives the continued rapid growth of Internet traffic. Moreover, the development of emerging networking techniques, such as Network Function Virtualization, significantly shapes the need for implementing network functions in software. Finally, with the advancement of modern platforms as well as software frameworks for packet processing, network applications have the potential to process 100+ Gbps of network traffic on a single commodity server. Representative frameworks include the Click modular router, the RouteBricks scalable routing architecture, and BUFFALO, the software-based Ethernet switch. Beneath this general-purpose routing and switching functionality lies a broad set of network applications, many of which are handled with custom methods to provide cost-effectiveness and flexibility. This thesis considers two long-standing networking applications, IP lookup and distributed denial-of-service (DDoS) mitigation, and proposes efficient software-based methods drawing on this perspective.
In this thesis, we first introduce several optimization techniques to accelerate network applications by taking advantage of modern CPU features. Then, we explore the IP lookup problem: finding the longest matching prefix of an IP address in a set of prefixes. An ideal IP lookup algorithm should achieve both small, constant IP lookup time and small on-chip memory usage; however, no prior IP lookup algorithm achieves both requirements at the same time. We propose SAIL, a splitting approach to IP lookup, and a suite of IP lookup algorithms based on the SAIL framework. We conducted extensive experiments to evaluate our algorithms, and the experimental results show that our SAIL algorithms are much faster than well-known IP lookup algorithms. Next, we switch our focus to DDoS, an attempt to disrupt the legitimate traffic of a victim by sending a flood of Internet traffic from different sources. Our solution is Gatekeeper, the first open-source and deployable DDoS mitigation system. We present a series of optimization techniques, including the use of modern platforms, group prefetching, coroutines, and hashing, to accelerate Gatekeeper. Experimental results show that these optimization techniques significantly improve its performance over alternative baseline solutions.
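The splitting idea behind lookup structures of this kind can be sketched with a toy two-level table: short prefixes are resolved at one level and longer ones at a second, so a lookup touches at most two direct-indexed tables. This is a simplified illustration of prefix-length splitting in general, not the SAIL algorithms themselves, and it only handles IPv4 prefixes up to /24.

```python
import ipaddress

# Toy sketch of splitting by prefix length: a /16-indexed level and a
# /24-indexed level. Real SAIL differs substantially in data layout.
def build(prefixes):
    """prefixes: list of (cidr_string, next_hop) pairs, lengths <= 24."""
    level16, level24 = {}, {}
    for cidr, hop in prefixes:
        net = ipaddress.ip_network(cidr)
        base = int(net.network_address)
        if net.prefixlen <= 16:
            # expand the prefix into covered /16 slots
            for i in range(1 << (16 - net.prefixlen)):
                level16[(base >> 16) + i] = hop
        else:
            # expand into covered /24 slots
            for i in range(1 << (24 - net.prefixlen)):
                level24[(base >> 8) + i] = hop
    return level16, level24

def lookup(tables, addr):
    level16, level24 = tables
    a = int(ipaddress.ip_address(addr))
    # longest match wins: consult the /24 level first, then fall back
    return level24.get(a >> 8, level16.get(a >> 16))

tables = build([("10.0.0.0/8", "A"), ("10.1.2.0/24", "B")])
print(lookup(tables, "10.1.2.7"), lookup(tables, "10.9.9.9"))  # -> B A
```

The pay-off of splitting is that the common case resolves with a bounded number of table probes, which is what makes constant-time software lookup plausible.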