4,827 research outputs found
WaveScript: A Case-Study in Applying a Distributed Stream-Processing Language
Applications that combine live data streams with embedded, parallel,and distributed processing are becoming more commonplace. WaveScriptis a domain-specific language that brings high-level, type-safe,garbage-collected programming to these domains. This is made possibleby three primary implementation techniques. First, we employ a novelevaluation strategy that uses a combination of interpretation andreification to partially evaluate programs into stream dataflowgraphs. Second, we use profile-driven compilation to enable manyoptimizations that are normally only available in the synchronous(rather than asynchronous) dataflow domain. Finally, we incorporatean extensible system for rewrite rules to capture algebraic propertiesin specific domains (such as signal processing).We have used our language to build and deploy a sensor-network for theacoustic localization of wild animals, in particular, theYellow-Bellied marmot. We evaluate WaveScript's performance on thisapplication, showing that it yields good performance on both embeddedand desktop-class machines, including distributed execution andsubstantial parallel speedups. Our language allowed us to implementthe application rapidly, while outperforming a previous Cimplementation by over 35%, using fewer than half the lines of code.We evaluate the contribution of our optimizations to this success
Efficient openMP over sequentially consistent distributed shared memory systems
Nowadays clusters are one of the most used platforms in High Performance Computing and most programmers use the Message Passing Interface (MPI) library to program their applications in these distributed platforms getting their maximum performance, although it is a complex task. On the other side, OpenMP has been established as the de facto standard to program applications on shared memory platforms because it is easy to use and obtains good performance without too much effort.
So, could it be possible to join both worlds? Could programmers use the easiness of OpenMP in distributed platforms? A lot of researchers think so. And one of the developed ideas is the distributed shared memory (DSM), a software layer on top of a distributed platform giving an abstract shared memory view to the applications. Even though it seems a good solution it also has some inconveniences. The memory coherence between the nodes in the platform is difficult to maintain (complex management, scalability issues, high overhead and others) and the latency of the remote-memory accesses which can be orders of magnitude greater than on a shared bus due to the interconnection network.
Therefore this research improves the performance of OpenMP applications being executed on distributed memory platforms using a DSM with sequential consistency evaluating thoroughly the results from the NAS parallel benchmarks.
The vast majority of designed DSMs use a relaxed consistency model because it avoids some major problems in the area. In contrast, we use a sequential consistency model because we think that showing these potential problems that otherwise are hidden may allow the finding of some solutions and, therefore, apply them to both models.
The main idea behind this work is that both runtimes, the OpenMP and the DSM layer, should cooperate to achieve good performance, otherwise they interfere one each other trashing the final performance of applications.
We develop three different contributions to improve the performance of these applications: (a) a technique to avoid false sharing at runtime, (b) a technique to mimic the MPI behaviour, where produced data is forwarded to their consumers and, finally, (c) a mechanism to avoid the network congestion due to the DSM coherence messages. The NAS Parallel Benchmarks are used to test the contributions.
The results of this work shows that the false-sharing problem is a relative problem depending on each application. Another result is the importance to move the data flow outside of the critical path and to use techniques that forwards data as early as possible, similar to MPI, benefits the final application performance.
Additionally, this data movement is usually concentrated at single points and affects the application performance due to the limited bandwidth of the network. Therefore it is necessary to provide mechanisms that allows the distribution of this data through the computation time using an otherwise idle network.
Finally, results shows that the proposed contributions improve the performance of OpenMP applications on this kind of environments
Software Plagiarism Detection Using N-grams
Plagiarism is an act of copying where one doesn’t rightfully credit the original source. The
motivations behind plagiarism can vary from completing academic courses to even gaining
economical advantage. Plagiarism exists in various domains, where people want to take credit
from something they have worked on. These areas can include e.g. literature, art or software,
which all have a meaning for an authorship.
In this thesis we conduct a systematic literature review from the topic of source code
plagiarism detection methods, then based on the literature propose a new approach to detect
plagiarism which combines both similarity detection and authorship identification, introduce
our tokenization method for the source code, and lastly evaluate the model by using real life
data sets. The goal for our model is to point out possible plagiarism from a collection of
documents, which in this thesis is specified as a collection of source code files written by various
authors. Our data, which we will use to our statistical methods, consists of three datasets:
(1) collection of documents belonging to University of Helsinki’s first programming course, (2)
collection of documents belonging to University of Helsinki’s advanced programming course
and (3) submissions for source code re-use competition. Statistical methods in this thesis are
inspired by the theory of search engines, which are related to data mining when detecting
similarity between documents and machine learning when classifying document with the most
likely author in authorship identification.
Results show that our similarity detection model can be used successfully to retrieve
documents for further plagiarism inspection, but false positives are quickly introduced even
when using a high threshold that controls the minimum allowed level of similarity between
documents. We were unable to use the results of authorship identification in our study, as
the results with our machine learning model were not high enough to be used sensibly. This
was possibly caused by the high similarity between documents, which is due to the restricted
tasks and the course setting that teaches a specific programming style during the timespan of
the course
Multigrain shared memory
Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1998.Includes bibliographical references (p. 197-203).by Donald Yeung.Ph.D
The impact of asynchrony on computer architecture
The performance characteristics of asynchronous circuits are quite different from those of their synchronous counterparts. As a result, the best asynchronous design
of a particular system does not necessarily correspond to the best synchronous design, even at the algorithmic level. The goal of this thesis is to examine certain aspects of computer architecture and design in the context of an asynchronous VLSI implementation.
We present necessary and sufficient conditions under which the degree of pipelining of a component can be modified without affecting the correctness of an asynchronous
computation.
As an instance of the improvements possible using an asynchronous architecture, we present circuits to solve the prefix problem with average-case behavior better than
that possible by any synchronous solution in the case when the prefix operator has a right zero. We show that our circuit implementations are area-optimal given their
performance characteristics, and have the best possible average-case latency.
At the level of processor design, we present a mechanism for the implementation of precise exceptions in asynchronous processors. The novel feature of this mechanism is that it permits the presence of a data-dependent number of instructions in the execution pipeline of the processor.
Finally, at the level of processor architecture, we present the architecture of a processor with an independent instruction stream for branches. The instruction set
permits loops and function calls to be executed with minimal control-flow overhead
Extending and Implementing the Self-adaptive Virtual Processor for Distributed Memory Architectures
Many-core architectures of the future are likely to have distributed memory
organizations and need fine grained concurrency management to be used
effectively. The Self-adaptive Virtual Processor (SVP) is an abstract
concurrent programming model which can provide this, but the model and its
current implementations assume a single address space shared memory. We
investigate and extend SVP to handle distributed environments, and discuss a
prototype SVP implementation which transparently supports execution on
heterogeneous distributed memory clusters over TCP/IP connections, while
retaining the original SVP programming model
Ernst Denert Award for Software Engineering 2019
This open access book provides an overview of the dissertations of the five nominees for the Ernst Denert Award for Software Engineering in 2019. The prize, kindly sponsored by the Gerlind & Ernst Denert Stiftung, is awarded for excellent work within the discipline of Software Engineering, which includes methods, tools and procedures for better and efficient development of high quality software. An essential requirement for the nominated work is its applicability and usability in industrial practice. The book contains five papers describing the works by Sebastian Baltes (U Trier) on Software Developers’Work Habits and Expertise, Timo Greifenberg’s thesis on Artefaktbasierte Analyse modellgetriebener Softwareentwicklungsprojekte, Marco Konersmann’s (U Duisburg-Essen) work on Explicitly Integrated Architecture, Marija Selakovic’s (TU Darmstadt) research about Actionable Program Analyses for Improving Software Performance, and Johannes Späth’s (Paderborn U) thesis on Synchronized Pushdown Systems for Pointer and Data-Flow Analysis – which actually won the award. The chapters describe key findings of the respective works, show their relevance and applicability to practice and industrial software engineering projects, and provide additional information and findings that have only been discovered afterwards, e.g. when applying the results in industry. This way, the book is not only interesting to other researchers, but also to industrial software professionals who would like to learn about the application of state-of-the-art methods in their daily work
- …