
    A Fast Causal Profiler for Task Parallel Programs

    This paper proposes TASKPROF, a profiler that identifies parallelism bottlenecks in task parallel programs. It leverages the structure of a task parallel execution to perform fine-grained attribution of work to various parts of the program, and it uses hardware performance counters for these fine-grained measurements to minimize perturbation. TASKPROF's profile execution runs in parallel on multi-core machines. Its causal profile enables users to estimate the improvement in parallelism when a region of code is optimized, even before concrete optimizations are known. We have used TASKPROF to isolate parallelism bottlenecks in twenty-three applications that use the Intel Threading Building Blocks library, and we have designed parallelization techniques in five of these applications that increase parallelism by an order of magnitude. Our user study indicates that developers are able to isolate performance bottlenecks with ease using TASKPROF. Comment: 11 pages
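
    The causal "what-if" estimate in such a profile reduces to work/span arithmetic over the task graph: parallelism is total work divided by critical-path length, and a hypothetical speedup of one region shrinks that region's contribution to both. The toy C++ sketch below illustrates this arithmetic only; it is not TASKPROF's code, and the region names and costs are made-up numbers.

        // Toy what-if parallelism estimate in the spirit of a causal profile.
        // Not TASKPROF's implementation; regions and costs are hypothetical.
        #include <cstdio>

        int main() {
            // Measured cost (e.g., cycles from hardware counters) per region.
            double work_a = 8e9,  span_a = 8e9;  // serial region: all work on critical path
            double work_b = 32e9, span_b = 2e9;  // well-parallelized region

            double work = work_a + work_b;
            double span = span_a + span_b;       // regions assumed to run back to back
            std::printf("current parallelism: %.1f\n", work / span);

            // What-if: region A is optimized to run 4x faster, with no code change made.
            double f = 4.0;
            std::printf("parallelism if A were %gx faster: %.1f\n", f,
                        (work_a / f + work_b) / (span_a / f + span_b));
            return 0;
        }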

    Coz: Finding Code that Counts with Causal Profiling

    Improving performance is a central concern for software developers. To locate optimization opportunities, developers rely on software profilers. However, these profilers only report where programs spent their time: optimizing that code may have no impact on performance. Past profilers thus both waste developer time and make it difficult for them to uncover significant optimization opportunities. This paper introduces causal profiling. Unlike past profiling approaches, causal profiling indicates exactly where programmers should focus their optimization efforts, and quantifies their potential impact. Causal profiling works by running performance experiments during program execution. Each experiment calculates the impact of any potential optimization by virtually speeding up code: inserting pauses that slow down all other code running concurrently. The key insight is that this slowdown has the same relative effect as running that line faster, thus "virtually" speeding it up. We present Coz, a causal profiler, which we evaluate on a range of highly tuned applications: Memcached, SQLite, and the PARSEC benchmark suite. Coz identifies previously unknown optimization opportunities that are both significant and targeted. Guided by Coz, we improve the performance of Memcached by 9%, SQLite by 25%, and accelerate six PARSEC applications by as much as 68%; in most cases, these optimizations involve modifying under 10 lines of code. Comment: Published at SOSP 2015 (Best Paper Award)
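
    Coz only needs the target program annotated with a progress point that marks each completed unit of work; the profiler then virtually speeds up one source line at a time and reports how the progress-point rate responds. A minimal sketch of such an annotation is below, using the COZ_PROGRESS macro from the coz.h header in the Coz repository; the worker loop itself is a hypothetical stand-in for real application code.

        // Minimal throughput loop annotated for Coz. The work is a
        // hypothetical placeholder; COZ_PROGRESS comes from the Coz project.
        #include <coz.h>
        #include <numeric>
        #include <vector>

        static long process(const std::vector<int>& batch) {
            return std::accumulate(batch.begin(), batch.end(), 0L);
        }

        int main() {
            std::vector<int> batch(1 << 16, 1);
            long sink = 0;
            for (int i = 0; i < 100000; ++i) {
                sink += process(batch);
                COZ_PROGRESS; // one unit of throughput completed
            }
            return sink == 0; // keep the computation observable
        }

    Per the Coz documentation, the program is built with debug information (e.g., g++ -g -O2) and run under "coz run --- ./app"; the resulting profile shows, line by line, how overall throughput would change if that line were faster.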

    Teleoperation of passivity-based model reference robust control over the internet

    This dissertation presents a known theoretical approach and novel experimental results for establishing a live communication medium over the internet to host a virtual communication environment for Passivity-Based Model Reference Robust Control systems with delays. The controller, which carries the communication toward input-to-state stability, is designed as a control strategy that passively compensates for position errors arising during contact tasks and strives for delay-independent stability when controlling aircraft or other mobile objects. Because the controller is also used for nonlinear systems, coordination of multiple agents, bilateral teleoperation, and collision avoidance, maintaining a communication link with a constant upper bound on delay is crucial for the robustness and stability of the overall system. To utilize such a framework, a site survey is prepared that analyzes not only the geographical distances separating the nodes between which teleoperation will occur but also the communication parameters that define the virtual topography the data will travel through. This survey first establishes the feasibility of the overall operation: since the teleoperation must sustain a delay-based controller over the internet, obtaining a hypothetical upper bound for the delay via the site survey is crucial both for the communication system and for the design of the passivity-based model reference robust controller, which takes the delay as a design parameter. Following delay calculation and measurement, bandwidth tests for unidirectional and bidirectional communication are conducted to ensure that the link speed can sustain a real-time connection. The consistency of the delay over a sampled period is then measured to guarantee that the upper bound is never breached during communication, which would jeopardize the robustness of the controller. A geographical and topological overview of the route is also briefly examined via traceroute to understand the underlying nodes and their contribution to delay and round-trip consistency. To accommodate the communication channel for the controller, the input and output data from both nodes are encapsulated within the Transmission Control Protocol by a robust multithreaded program written in C, which constructs the client-server relationship over which the control data is transmitted. For added stability and a higher level of security, the channel is then encapsulated with IPsec, a protocol suite that protects the communication by authenticating and encrypting each packet of the session using cryptographic keys negotiated per session.
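
    As a sketch of the transport core described above, the snippet below sets up the kind of multithreaded TCP server that could carry the control data, with one thread per connected peer. The port and the echo behavior are illustrative assumptions; the dissertation's actual program and its IPsec encapsulation are not reproduced here.

        // Illustrative multithreaded TCP server for control traffic.
        // POSIX sockets; port and echo protocol are hypothetical.
        #include <arpa/inet.h>
        #include <netinet/in.h>
        #include <sys/socket.h>
        #include <unistd.h>
        #include <thread>

        static void serve(int client) {
            char buf[256];
            ssize_t n;
            // Echo control packets so the peer can observe round-trip delay.
            while ((n = read(client, buf, sizeof buf)) > 0)
                write(client, buf, static_cast<size_t>(n));
            close(client);
        }

        int main() {
            int srv = socket(AF_INET, SOCK_STREAM, 0);
            sockaddr_in addr{};
            addr.sin_family = AF_INET;
            addr.sin_addr.s_addr = htonl(INADDR_ANY);
            addr.sin_port = htons(5000);   // hypothetical control port
            bind(srv, reinterpret_cast<sockaddr*>(&addr), sizeof addr);
            listen(srv, 8);
            for (;;) {
                int client = accept(srv, nullptr, nullptr);
                if (client < 0) break;
                std::thread(serve, client).detach(); // one thread per peer
            }
            close(srv);
            return 0;
        }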

    Performance analysis of Intel Core 2 Duo processor

    With the emergence of thread-level parallelism as a more efficient method of improving processor performance, Chip Multiprocessor (CMP) technology is being more widely used in developing processor architectures. Also, the widening gap between CPU and memory speed has evoked the interest of researchers in understanding the performance of hierarchical memory architectures. As part of this research, performance characterization studies were carried out on the Intel Core 2 Duo, a dual-core power-efficient processor, using a variety of new-generation benchmarks. This study provides a detailed analysis of memory hierarchy performance and of performance scalability between single and dual core processors. The behavior of the SPEC CPU2006 benchmarks running on the Intel Core 2 Duo processor is also explained. Lastly, overall execution time and throughput measurements using both multi-programmed and multi-threaded workloads for the Intel Core 2 Duo processor are reported and compared to those of the Intel Pentium D and AMD Athlon 64 X2 processors. Results showed that the Intel Core 2 Duo had the best performance for a variety of workloads due to its advanced microarchitectural features, such as the shared L2 cache, fast cache-to-cache communication, and smart memory access.
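
    One standard technique behind such memory-hierarchy characterizations is a pointer-chasing probe, in which each load depends on the previous one, so the measured time per load steps up as the working set outgrows each cache level. The sketch below illustrates the technique only; it is not the paper's measurement harness, and the buffer sizes and iteration counts are arbitrary.

        // Pointer-chasing probe: average dependent-load latency versus
        // working-set size. Illustrative only; not the paper's harness.
        #include <algorithm>
        #include <chrono>
        #include <cstdio>
        #include <numeric>
        #include <random>
        #include <vector>

        static double ns_per_load(std::size_t bytes) {
            std::size_t n = bytes / sizeof(std::size_t);
            std::vector<std::size_t> order(n), next(n);
            std::iota(order.begin(), order.end(), 0);
            std::shuffle(order.begin(), order.end(), std::mt19937_64{42});
            for (std::size_t k = 0; k < n; ++k)
                next[order[k]] = order[(k + 1) % n]; // one cycle over all slots

            std::size_t i = 0;
            const long steps = 10000000;
            auto t0 = std::chrono::steady_clock::now();
            for (long s = 0; s < steps; ++s)
                i = next[i];                         // each load depends on the last
            auto t1 = std::chrono::steady_clock::now();
            volatile std::size_t sink = i; (void)sink;
            return std::chrono::duration<double, std::nano>(t1 - t0).count() / steps;
        }

        int main() {
            // Latency should jump as the buffer exceeds L1, then L2, then RAM.
            for (std::size_t kb : {16, 256, 4096, 65536})
                std::printf("%6zu KiB: %5.1f ns/load\n", kb, ns_per_load(kb * 1024));
            return 0;
        }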

    Mining Fix Patterns for FindBugs Violations

    In this paper, we first collect and track a large number of fixed and unfixed violations across revisions of software. The empirical analyses reveal discrepancies between the distributions of violations that are detected and those that are fixed, in terms of occurrences, spread, and categories, which can provide insights into prioritizing violations. To automatically identify patterns in violations and their fixes, we propose an approach that utilizes convolutional neural networks to learn features and clustering to regroup similar instances. We then evaluate the usefulness of the identified fix patterns by applying them to unfixed violations. The results show that developers will accept and merge a majority (69/116) of fixes generated from the inferred fix patterns. It is also noteworthy that the yielded patterns are applicable to four real bugs in Defects4J, a major benchmark for software testing and automated repair. Comment: Accepted for IEEE Transactions on Software Engineering

    The embedded Java benchmark suite JemBench

    Mastering DICOM with DVTk

    The Digital Imaging and Communications in Medicine (DICOM) Validation Toolkit (DVTk) is an open-source framework with potential value for anyone working with the DICOM standard. DICOM’s flexibility requires hands-on experience to understand the ways in which the standard’s interpretation may vary among vendors. DVTk was developed as a clinical engineering tool to aid and accelerate DICOM integration at clinical sites. It provides an independent measurement of the accuracy of a product’s DICOM interface, according to both the DICOM standard and the product’s conformance statement. DVTk includes stand-alone tools and a framework with which developers can create new tools. We provide an overview of the architecture of the toolkit, sample scenarios of its utility, and evidence of its relative ease of use. Our goal is to encourage involvement in this open-source project and to attract developers to build on and further enrich this platform for DICOM integration testing.