843 research outputs found
A Fast Causal Profiler for Task Parallel Programs
This paper proposes TASKPROF, a profiler that identifies parallelism
bottlenecks in task parallel programs. It leverages the structure of a task
parallel execution to perform fine-grained attribution of work to various parts
of the program. TASKPROF's use of hardware performance counters to perform
fine-grained measurements minimizes perturbation. TASKPROF's profile execution
runs in parallel using multi-cores. TASKPROF's causal profile enables users to
estimate improvements in parallelism when a region of code is optimized even
when concrete optimizations are not yet known. We have used TASKPROF to isolate
parallelism bottlenecks in twenty three applications that use the Intel
Threading Building Blocks library. We have designed parallelization techniques
in five applications to in- crease parallelism by an order of magnitude using
TASKPROF. Our user study indicates that developers are able to isolate
performance bottlenecks with ease using TASKPROF.Comment: 11 page
Coz: Finding Code that Counts with Causal Profiling
Improving performance is a central concern for software developers. To locate
optimization opportunities, developers rely on software profilers. However,
these profilers only report where programs spent their time: optimizing that
code may have no impact on performance. Past profilers thus both waste
developer time and make it difficult for them to uncover significant
optimization opportunities.
This paper introduces causal profiling. Unlike past profiling approaches,
causal profiling indicates exactly where programmers should focus their
optimization efforts, and quantifies their potential impact. Causal profiling
works by running performance experiments during program execution. Each
experiment calculates the impact of any potential optimization by virtually
speeding up code: inserting pauses that slow down all other code running
concurrently. The key insight is that this slowdown has the same relative
effect as running that line faster, thus "virtually" speeding it up.
We present Coz, a causal profiler, which we evaluate on a range of
highly-tuned applications: Memcached, SQLite, and the PARSEC benchmark suite.
Coz identifies previously unknown optimization opportunities that are both
significant and targeted. Guided by Coz, we improve the performance of
Memcached by 9%, SQLite by 25%, and accelerate six PARSEC applications by as
much as 68%; in most cases, these optimizations involve modifying under 10
lines of code.Comment: Published at SOSP 2015 (Best Paper Award
Teleoperation of passivity-based model reference robust control over the internet
This dissertation offers a survey of a known theoretical approach and novel experimental results in establishing a live communication medium through the internet to host a virtual communication environment for use in Passivity-Based Model Reference Robust Control systems with delays. The controller which is used as a carrier to support a robust communication between input-to-state stability is designed as a control strategy that passively compensates for position errors that arise during contact tasks and strives to achieve delay-independent stability for controlling of aircrafts or other mobile objects. Furthermore the controller is used for nonlinear systems, coordination of multiple agents, bilateral teleoperation, and collision avoidance thus maintaining a communication link with an upper bound of constant delay is crucial for robustness and stability of the overall system. For utilizing such framework an elucidation can be formulated by preparing site survey for analyzing not only the geographical distances separating the nodes in which the teleoperation will occur but also the communication parameters that define the virtual topography that the data will travel through. This survey will first define the feasibility of the overall operation since the teleoperation will be used to sustain a delay based controller over the internet thus obtaining a hypothetical upper bound for the delay via site survey is crucial not only for the communication system but also the delay is required for the design of the passivity-based model reference robust control. Following delay calculation and measurement via site survey, bandwidth tests for unidirectional and bidirectional communication is inspected to ensure that the speed is viable to maintain a real-time connection. Furthermore from obtaining the results it becomes crucial to measure the consistency of the delay throughout a sampled period to guarantee that the upper bound is not breached at any point within the communication to jeopardize the robustness of the controller. Following delay analysis a geographical and topological overview of the communication is also briefly examined via a trace-route to understand the underlying nodes and their contribution to the delay and round-trip consistency. To accommodate the communication channel for the controller the input and output data from both nodes need to be encapsulated within a transmission control protocol via a multithreaded design of a robust program within the C language. The program will construct a multithreaded client-server relationship in which the control data is transmitted. For added stability and higher level of security the channel is then encapsulated via an internet protocol security by utilizing a protocol suite for protecting the communication by authentication and encrypting each packet of the session using negotiation of cryptographic keys during each session
Performance analysis of Intel Core 2 Duo processor
With the emergence of thread level parallelism as a more efficient method of improving processor performance, Chip Multiprocessor (CMP) technology is being more widely used in developing processor architectures. Also, the widening gap between CPU and memory speed has evoked the interest of researchers to understand performance of memory hierarchical architectures. As part of this research, performance characteristic studies were carried out on the Intel Core 2 Duo, a dual core power efficient processor, using a variety of new generation benchmarks. This study provides a detailed analysis of the memory hierarchy performance and the performance scalability between single and dual core processors. The behavior of SPEC CPU2006 benchmarks running on Intel Core 2 Duo processor is also explained. Lastly, the overall execution time and throughput measurement using both multi-programmed and multi-threaded workloads for the Intel Core 2 Duo processor is reported and compared to that of the Intel Pentium D and AMD Athlon 64X2 processors. Results showed that the Intel Core 2 Duo had the best performance for a variety of workloads due to its advanced micro-architectural features such as the shared L2 cache, fast cache to cache communication and smart memory access
Mining Fix Patterns for FindBugs Violations
In this paper, we first collect and track a large number of fixed and unfixed
violations across revisions of software.
The empirical analyses reveal that there are discrepancies in the
distributions of violations that are detected and those that are fixed, in
terms of occurrences, spread and categories, which can provide insights into
prioritizing violations.
To automatically identify patterns in violations and their fixes, we propose
an approach that utilizes convolutional neural networks to learn features and
clustering to regroup similar instances. We then evaluate the usefulness of the
identified fix patterns by applying them to unfixed violations.
The results show that developers will accept and merge a majority (69/116) of
fixes generated from the inferred fix patterns. It is also noteworthy that the
yielded patterns are applicable to four real bugs in the Defects4J major
benchmark for software testing and automated repair.Comment: Accepted for IEEE Transactions on Software Engineerin
Mastering DICOM with DVTk
The Digital Imaging and Communications in Medicine (DICOM) Validation Toolkit (DVTk) is an open-source framework with potential value for anyone working with the DICOM standard. DICOM’s flexibility requires hands-on experience in understanding ways in which the standard’s interpretation may vary among vendors. DVTk was developed as a clinical engineering tool to aid and accelerate DICOM integration at clinical sites. DVTk is used to provide an independent measurement of the accuracy of a product’s DICOM interface, according to both the DICOM standard and the product’s conformance statement. DVTk has stand-alone tools and a framework with which developers can create new tools. We provide an overview of the architecture of the toolkit, sample scenarios of its utility, and evidence of its relative ease of use. Our goal is to encourage involvement in this open-source project and attract developers to build off and further enrich this platform for DICOM integration testing
- …