288 research outputs found

    Bio-inspired call-stack reconstruction for performance analysis

    Correlating performance bottlenecks with their associated source code has become a cornerstone of performance analysis: it helps explain why an application's efficiency falls short of the machine's peak performance and ultimately enables code optimizations. To this end, performance analysis tools collect the processor call-stack and combine this information with measurements to help the analyst comprehend the application's behavior. Some tools modify the call-stack at run-time to reduce the collection cost, but at the price of non-portable solutions. In this paper, we present a novel, portable approach to associating performance issues with their source-code counterparts. We capture a reduced segment of the call-stack (up to three levels) and then process the segments using an algorithm inspired by multi-sequence alignment techniques. The results of our approach are easily mapped to detailed performance views, enabling the analyst to unveil the application's behavior and the corresponding regions of code. To demonstrate the usefulness of our approach, we have applied the algorithm to several in-production applications analyzed here for the first time, characterizing them in detail and optimizing them through small modifications based on the analyses.

    We thankfully acknowledge Mathis Bode for giving us access to the Arts CF binaries, and Miguel Castrillo and Kim Serradell for their valuable insight regarding Nemo. We would like to thank Forschungszentrum Jülich for the computation time on their Blue Gene/Q system. This research has been partially funded by the CICYT under contracts No. TIN2012-34557 and TIN2015-65316-P. Peer Reviewed. Postprint (author's final draft).
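The abstract does not spell out its alignment algorithm, but the underlying idea of grouping short call-stack segments by similarity can be sketched with a hypothetical pairwise-similarity clustering. The frame names, the threshold, and the use of `difflib` scoring are all illustrative assumptions, not the authors' method:

```python
from difflib import SequenceMatcher

def segment_similarity(a, b):
    # Fraction of matched frames between two call-stack segments
    return SequenceMatcher(None, a, b).ratio()

def cluster_segments(segments, threshold=0.66):
    # Greedy clustering: join a segment to the first cluster whose
    # representative is similar enough, else start a new cluster.
    clusters = []
    for seg in segments:
        for cluster in clusters:
            if segment_similarity(seg, cluster[0]) >= threshold:
                cluster.append(seg)
                break
        else:
            clusters.append([seg])
    return clusters

# Three-level call-stack segments (hypothetical frame names)
segments = [
    ("main", "solver", "kernel_a"),
    ("main", "solver", "kernel_b"),
    ("main", "io", "write_output"),
]
clusters = cluster_segments(segments)
```

The two solver segments share two of three frames (similarity 2/3) and fall into one cluster, while the I/O segment forms its own; each cluster can then be mapped back to one region of code in a performance view.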

    Coz: Finding Code that Counts with Causal Profiling

    Improving performance is a central concern for software developers. To locate optimization opportunities, developers rely on software profilers. However, these profilers only report where programs spent their time: optimizing that code may have no impact on performance. Past profilers thus both waste developer time and make it difficult for them to uncover significant optimization opportunities. This paper introduces causal profiling. Unlike past profiling approaches, causal profiling indicates exactly where programmers should focus their optimization efforts, and quantifies their potential impact. Causal profiling works by running performance experiments during program execution. Each experiment calculates the impact of any potential optimization by virtually speeding up code: inserting pauses that slow down all other code running concurrently. The key insight is that this slowdown has the same relative effect as running that line faster, thus "virtually" speeding it up. We present Coz, a causal profiler, which we evaluate on a range of highly-tuned applications: Memcached, SQLite, and the PARSEC benchmark suite. Coz identifies previously unknown optimization opportunities that are both significant and targeted. Guided by Coz, we improve the performance of Memcached by 9%, SQLite by 25%, and accelerate six PARSEC applications by as much as 68%; in most cases, these optimizations involve modifying under 10 lines of code.

    Comment: Published at SOSP 2015 (Best Paper Award).
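The equivalence behind virtual speedup can be checked in a toy fork-join model where the runtime is the maximum of two concurrent threads. The two-thread model and the numbers below are illustrative only, not Coz's implementation:

```python
def actual_speedup_runtime(t1, t2, d):
    # Really make the target code (in thread 1) run d time units faster
    return max(t1 - d, t2)

def virtual_speedup_runtime(t1, t2, d):
    # Coz-style virtual speedup: while the target runs, pause the
    # *other* thread by d, then subtract the inserted delay from the
    # measured runtime to get the projected runtime.
    return max(t1, t2 + d) - d

# The projection matches the real optimization in every regime:
# target on the critical path, off it, or crossing over.
for t1, t2, d in [(10.0, 8.0, 3.0), (10.0, 12.0, 3.0), (10.0, 9.0, 3.0)]:
    assert actual_speedup_runtime(t1, t2, d) == virtual_speedup_runtime(t1, t2, d)
```

The middle case is the interesting one: the target starts on the critical path but stops being the bottleneck partway through the speedup, and the projection still comes out exact. This is why a causal profiler can report "optimizing this line by X% buys you Y%" without actually optimizing anything.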

    Profiling, extracting, and analyzing dynamic software metrics

    This thesis presents a methodology for the analysis of software executions aimed at profiling software, extracting dynamic software metrics, and then analyzing those metrics with the goal of assisting software quality researchers. The methodology is implemented in a toolkit, DynaMEAT, which consists of an event-based profiler that collects more accurate data than existing profilers, and a program called MetricView that derives and extracts dynamic metrics from the generated profiles. The toolkit was designed to be modular and flexible, allowing analysts and developers to easily extend its functionality to derive new or custom dynamic software metrics. We demonstrate the effectiveness and usefulness of DynaMEAT by applying it to several open-source projects of varying sizes.
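An event-based profiler of the kind described records discrete call and return events rather than sampling, and dynamic metrics are then derived from those events. A minimal sketch, assuming a Python setting and using `sys.setprofile` (the class name and the two metrics shown are illustrative, not DynaMEAT's design):

```python
import sys
import time
from collections import defaultdict

class EventProfiler:
    """Toy event-based profiler: records call/return events and derives
    two dynamic metrics, call count and cumulative (inclusive) time."""

    def __init__(self):
        self.calls = defaultdict(int)
        self.cumtime = defaultdict(float)
        self._stack = []  # start times of frames currently on the stack

    def _handle(self, frame, event, arg):
        if event == "call":
            self.calls[frame.f_code.co_name] += 1
            self._stack.append(time.perf_counter())
        elif event == "return" and self._stack:
            start = self._stack.pop()
            self.cumtime[frame.f_code.co_name] += time.perf_counter() - start

    def run(self, func):
        sys.setprofile(self._handle)
        try:
            func()
        finally:
            sys.setprofile(None)

def inner():
    return sum(range(100))

def outer():
    for _ in range(3):
        inner()

profiler = EventProfiler()
profiler.run(outer)
```

Because every call and return is observed, counts are exact rather than statistical, which matches the claim of collecting more accurate data than sampling profilers; the cost is higher runtime overhead.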

    Improving satellite measurements of clouds and precipitation using machine learning

    Observing and measuring clouds and precipitation is essential for climate science, meteorology, and an increasing range of societal and economic activities. This importance stems from the role of clouds and precipitation in the hydrological cycle and in the Earth's weather and climate. Furthermore, patterns of cloudiness and precipitation interact across continental scales and are highly variable in both space and time. Their study and monitoring therefore require observations with global coverage and high temporal resolution, which currently only satellite observations can provide.

    Inferring properties of clouds or precipitation from satellite observations is a non-trivial task. Because of the limited information content of the observations and the complex physics of the atmosphere, such retrievals carry significant uncertainties. Traditional retrieval methods trade off processing speed against accuracy and the ability to characterize the uncertainties in their predictions.

    This thesis develops and evaluates two neural-network-based methods for performing retrievals of hydrometeors, i.e., clouds and precipitation, that provide accurate estimates of the retrieval uncertainty. The practicality and benefits of the proposed methods are demonstrated in three real-world retrieval applications for cloud properties and precipitation. The demonstrated benefits of these methods over traditional retrieval methods led to the adoption of one of the algorithms for operational use at the European Organisation for the Exploitation of Meteorological Satellites. The two other algorithms are planned to be integrated into the operational processing at the Brazilian National Institute for Space Research, as well as into the processing of observations from the Global Precipitation Measurement mission, a joint satellite mission by NASA and the Japanese Aerospace Exploration Agency.

    The principal advantage of the proposed methods is their simplicity and computational efficiency: a minor modification of the architecture and training of conventional neural networks is sufficient to capture the dominant source of uncertainty in remote-sensing retrievals. As shown in this thesis, deep neural networks can significantly improve the accuracy of satellite retrievals of hydrometeors. With the proposed methods, the benefits of modern neural network architectures can be combined with reliable uncertainty estimates, which are required to improve the characterization of the observed hydrometeors.
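One well-established way to obtain uncertainty estimates with only a "minor modification" of a conventional network is quantile regression: keep the architecture, change the loss. A NumPy sketch of the pinball (quantile) loss, offered as an assumption based on the abstract rather than a statement of the thesis's exact algorithm:

```python
import numpy as np

def pinball_loss(y_true, y_pred, tau):
    # Asymmetric "pinball" loss: its minimiser is the tau-th
    # conditional quantile of the target, so a network trained with
    # several taus predicts a distribution rather than a point value.
    err = y_true - y_pred
    return np.mean(np.maximum(tau * err, (tau - 1.0) * err))

# For tau = 0.5 the loss reduces to half the mean absolute error.
y = np.array([1.0, 3.0])
assert np.isclose(pinball_loss(y, np.array([2.0, 2.0]), 0.5), 0.5)
```

Training one output head per quantile (say tau = 0.05, 0.5, 0.95) yields calibrated prediction intervals at essentially the cost of a point-estimate network, which is consistent with the abstract's emphasis on simplicity and computational efficiency.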

    Trends in regional atmospheric water cycles across ocean basins diagnosed using multiple products

    2021 Spring. Includes bibliographical references.

    The importance of water within the Earth system, especially its direct impacts on weather and climate through its presence and transport in the atmosphere, cannot be overstated. Accordingly, it is critical to obtain an accurate baseline understanding of the current state of the atmospheric branch of the water cycle if we are to infer future changes to the water cycle and the associated influences on weather and climate. Technological advances in both remote and in-situ observing systems have made it possible to characterize water and energy budgets on global scales. However, relatively little work has been done to study the degree of closure, and thus the accuracy of these methods, at regional scales, especially over the oceans. The task is complicated by the lack of long-term continuous data records of the variables of interest, including ocean-surface evaporation, atmospheric water vapor flux divergence, and precipitation. This work aims to fill these gaps and to contribute to a baseline understanding of the water cycle within the current TRMM and GPM era.

    The evolution of water cycle closure within five independent regions in the equatorial Pacific, Atlantic, and Indian Oceans has been established previously using atmospheric reanalysis and gridded observational and pseudo-observational data products. That research found that while the water budgets closed extremely well in most basins, the water cycle within the West Pacific trended out of closure within the first decade of the 21st century. The current study extends this analysis temporally and includes a wider variety of independent data sources to confirm the presence of this emerging lack of closure and to hypothesize the reason for its existence. Differences between independent products are used within the context of each region to infer whether the emerging lack of closure is a data artifact or the result of a more fundamental shift in the physical mechanisms and characteristics of the evaporation, precipitation, or water vapor flux divergence within a specific region. Results confirm the initial hypothesis that the emerging lack of water cycle closure in the West Pacific is not due to satellite or instrument drift. Rather, it appears to be related to changes in the prevalence of deep isolated versus deep organized convection in the West Pacific region and the associated impact on passive microwave precipitation retrieval algorithms.
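The closure diagnostic described here amounts to checking the atmospheric moisture budget, dW/dt = E − P − ∇·Q, and watching the residual over time. A minimal sketch with synthetic regional-mean series (the variable names, units, and synthetic data are illustrative, not the study's products):

```python
import numpy as np

def closure_residual(evap, precip, divergence, tpw, dt=1.0):
    # Moisture budget: dW/dt = E - P - div(Q).  A product suite that
    # "closes" the budget leaves this residual near zero.
    dwdt = np.gradient(tpw, dt)
    return evap - precip - divergence - dwdt

def residual_trend(residual, dt=1.0):
    # Least-squares slope of the residual: a non-zero trend signals
    # the budget drifting out of closure, as seen in the West Pacific.
    t = np.arange(residual.size) * dt
    return np.polyfit(t, residual, 1)[0]

# Synthetic monthly series for one region: a perfectly closed budget...
months = np.arange(120)
tpw = np.full(120, 50.0)                              # steady column water
precip = 5.0 + 0.5 * np.sin(2 * np.pi * months / 12)  # seasonal cycle
divergence = np.cos(2 * np.pi * months / 12)
evap = precip + divergence                            # closed by construction
closed = closure_residual(evap, precip, divergence, tpw)

# ...versus one where a single product drifts slowly over time
drifting = closure_residual(evap + 0.01 * months, precip, divergence, tpw)
```

Comparing the residual trend across several independent products for the same region is what lets one separate an instrument or algorithm artifact (the trend appears in only one product) from a real shift in the regional water cycle (the trend appears in all of them).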

    Interactive Trace-Based Analysis Toolset for Manual Parallelization of C Programs

    Massive amounts of legacy sequential code need to be parallelized to make better use of modern multiprocessor architectures. Nevertheless, writing parallel programs is still a difficult task. Automated parallelization methods can be effective at the statement and loop levels and, more recently, at the task level, but they remain restricted to specific source-code constructs or application domains. In this article we present an innovative toolset that supports developers in manual code analysis and parallelization decisions. It automatically collects the program profile and data dependencies and represents them in an interactive graphical format that facilitates the analysis and discovery of manual parallelization opportunities. The toolset can be used with arbitrary sequential C programs and parallelization patterns. Moreover, its program-scope runtime data-dependency tracing can complement tools based on static code analysis, and can benefit from them in turn. We also evaluated the effectiveness of the toolset in terms of the time needed to reach parallelization decisions and the quality of those decisions, measuring a significant improvement for several representative real-world applications.
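Runtime data-dependency tracing of the kind described can be sketched as a scan over a memory-access trace, linking each read to the most recent writer of the same address. The trace format and statement IDs below are hypothetical, not the toolset's actual representation:

```python
def find_raw_dependencies(trace):
    # trace: (statement_id, op, address) tuples in execution order,
    # with op "W" for a write and "R" for a read.
    # A read-after-write dependency links the most recent writer of an
    # address to any later statement that reads that address.
    last_writer = {}
    deps = set()
    for stmt, op, addr in trace:
        if op == "W":
            last_writer[addr] = stmt
        elif addr in last_writer and last_writer[addr] != stmt:
            deps.add((last_writer[addr], stmt))
    return deps

trace = [
    ("S1", "W", 0x10),  # S1 writes x
    ("S2", "R", 0x10),  # S2 reads x  -> depends on S1
    ("S3", "W", 0x20),  # S3 writes y
    ("S4", "R", 0x20),  # S4 reads y  -> depends on S3
]
deps = find_raw_dependencies(trace)
```

Here the chains S1→S2 and S3→S4 share no edge, so a developer inspecting the dependency graph can conclude the two chains are candidates for parallel execution, which is exactly the kind of decision the interactive graphical view is meant to support.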