Bio-inspired call-stack reconstruction for performance analysis
The correlation of performance bottlenecks with their associated source code has become a cornerstone of performance analysis. It allows analysts to understand why an application's efficiency falls behind the computer's peak performance and, ultimately, to optimize the code. To this end, performance analysis tools collect the processor call-stack and combine this information with measurements to help the analyst comprehend the application's behavior. Some tools modify the call-stack at run-time to reduce the collection cost, but at the price of non-portable solutions. In this paper, we present a novel, portable approach to associating performance issues with their source-code counterparts. We capture a reduced segment of the call-stack (up to three levels) and then process the segments using an algorithm inspired by multi-sequence alignment techniques. The results of our approach map easily to detailed performance views, enabling the analyst to unveil the application's behavior and its corresponding region of code. To demonstrate the usefulness of our approach, we have applied the algorithm to several in-production applications, analyzed here for the first time, to characterize them in detail and to optimize them with small modifications based on the analyses.

We thankfully acknowledge Mathis Bode for giving us access to the Arts CF binaries, and Miguel Castrillo and Kim Serradell for their valuable insight regarding Nemo. We would like to thank Forschungszentrum Jülich for the computation time on their Blue Gene/Q system. This research has been partially funded by the CICYT under contracts No. TIN2012-34557 and TIN2015-65316-P.

Peer reviewed. Postprint (author's final draft).
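The alignment idea the abstract names can be sketched with a classic pairwise scheme: score two short call-stack segments with Needleman-Wunsch-style dynamic programming, so that similar segments can be grouped. This is a minimal illustration of the general technique, not the paper's algorithm; the frame names and scoring constants below are made up.

```python
def align_score(a, b, match=2, mismatch=-1, gap=-1):
    # Needleman-Wunsch global alignment score over sequences of frame names.
    m, n = len(a), len(b)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        dp[i][0] = dp[i - 1][0] + gap
    for j in range(1, n + 1):
        dp[0][j] = dp[0][j - 1] + gap
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            diag = dp[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            dp[i][j] = max(diag, dp[i - 1][j] + gap, dp[i][j - 1] + gap)
    return dp[m][n]

# Three hypothetical three-level call-stack segments: the first two share a
# prefix and should align far better with each other than with the third.
s1 = ["main", "solve", "kernel"]
s2 = ["main", "solve", "io_wait"]
s3 = ["init", "parse", "alloc"]
assert align_score(s1, s2) > align_score(s1, s3)
```

Segments whose pairwise scores exceed a threshold would then fall into the same cluster and be mapped to one code region in the performance views.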
Coz: Finding Code that Counts with Causal Profiling
Improving performance is a central concern for software developers. To locate
optimization opportunities, developers rely on software profilers. However,
these profilers only report where programs spent their time: optimizing that
code may have no impact on performance. Past profilers thus both waste
developer time and make it difficult for them to uncover significant
optimization opportunities.
This paper introduces causal profiling. Unlike past profiling approaches,
causal profiling indicates exactly where programmers should focus their
optimization efforts, and quantifies their potential impact. Causal profiling
works by running performance experiments during program execution. Each
experiment calculates the impact of any potential optimization by virtually
speeding up code: inserting pauses that slow down all other code running
concurrently. The key insight is that this slowdown has the same relative
effect as running that line faster, thus "virtually" speeding it up.
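The equivalence behind virtual speedup can be checked with a toy two-thread makespan model. This is a sketch of the intuition only, with invented numbers, not Coz's implementation: pausing the other thread by the same total amount, then subtracting the inserted delay, predicts the same runtime as actually speeding the selected code up.

```python
# Toy model: two threads run in parallel; the program's runtime (makespan)
# is the max of their total work. Thread A calls the selected function f.
w_a = 10.0               # total work on thread A, seconds (includes f)
w_b = 8.0                # total work on thread B, seconds
n_calls, delta = 4, 0.5  # f runs 4 times; candidate per-call speedup

# Actually speeding up f: thread A finishes n_calls * delta sooner.
real = max(w_a - n_calls * delta, w_b)

# Virtual speedup: leave f alone, pause the *other* thread by delta each
# time f runs, then subtract the total inserted delay from the measurement.
virtual = max(w_a, w_b + n_calls * delta) - n_calls * delta

assert abs(real - virtual) < 1e-9  # both predict the same makespan
```

In this serialized model the identity holds exactly: max(A, B + d) - d equals max(A - d, B); Coz recovers the effect statistically from progress points during real concurrent execution.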
We present Coz, a causal profiler, which we evaluate on a range of
highly-tuned applications: Memcached, SQLite, and the PARSEC benchmark suite.
Coz identifies previously unknown optimization opportunities that are both
significant and targeted. Guided by Coz, we improve the performance of
Memcached by 9%, SQLite by 25%, and accelerate six PARSEC applications by as
much as 68%; in most cases, these optimizations involve modifying under 10
lines of code.

Comment: Published at SOSP 2015 (Best Paper Award).
Profiling, extracting, and analyzing dynamic software metrics
This thesis presents a methodology for the analysis of software executions aimed at profiling software, extracting dynamic software metrics, and then analyzing those metrics with the goal of assisting software quality researchers. The methodology is implemented in a toolkit that consists of an event-based profiler, which collects more accurate data than existing profilers, and a program called MetricView that derives and extracts dynamic metrics from the generated profiles. The toolkit was designed to be modular and flexible, allowing analysts and developers to easily extend its functionality to derive new or custom dynamic software metrics. We demonstrate the effectiveness and usefulness of the toolkit, DynaMEAT, by applying it to several open-source projects of varying sizes.
Improving satellite measurements of clouds and precipitation using machine learning
Observing and measuring clouds and precipitation is essential for climate science, meteorology, and a growing range of societal and economic activities. This importance stems from the role of clouds and precipitation in the hydrological cycle and in the Earth's weather and climate. Furthermore, patterns of cloudiness and precipitation interact across continental scales and are highly variable in both space and time. Their study and monitoring therefore require observations with global coverage and high temporal resolution, which currently only satellite observations can provide.

Inferring properties of clouds or precipitation from satellite observations is a non-trivial task. Because of the limited information content of the observations and the complex physics of the atmosphere, such retrievals carry significant uncertainties. Traditional retrieval methods trade off processing speed against accuracy and the ability to characterize the uncertainties in their predictions.

This thesis develops and evaluates two neural-network-based methods for retrieving hydrometeors, i.e., clouds and precipitation, that provide accurate estimates of the retrieval uncertainty. The practicality and benefits of the proposed methods are demonstrated in three real-world retrieval applications for cloud properties and precipitation. The demonstrated benefits of these methods over traditional retrieval methods led to the adoption of one of the algorithms for operational use at the European Organisation for the Exploitation of Meteorological Satellites.
The two other algorithms are planned to be integrated into the operational processing at the Brazilian National Institute for Space Research, as well as into the processing of observations from the Global Precipitation Measurement mission, a joint satellite mission by NASA and the Japanese Aerospace Exploration Agency.

The principal advantage of the proposed methods is their simplicity and computational efficiency: a minor modification of the architecture and training of conventional neural networks is sufficient to capture the dominant source of uncertainty in remote-sensing retrievals. As shown in this thesis, deep neural networks can significantly improve the accuracy of satellite retrievals of hydrometeors. With the proposed methods, the benefits of modern neural-network architectures can be combined with reliable uncertainty estimates, which are required to improve the characterization of the observed hydrometeors.
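The abstract does not name the modification, but a common way to obtain such uncertainty estimates from a conventional network is quantile regression: train the output layer with the pinball loss so each output estimates one quantile of the retrieved quantity. The sketch below shows that loss in plain NumPy on hypothetical, rain-rate-like data; it illustrates the general idea rather than the thesis's exact method.

```python
import numpy as np

def pinball_loss(y_true, y_pred, tau):
    # Quantile (pinball) loss: minimizing it makes y_pred estimate the
    # tau-quantile of y_true's conditional distribution.
    err = y_true - y_pred
    return np.mean(np.maximum(tau * err, (tau - 1) * err))

# Hypothetical skewed samples standing in for rain rates: the empirical
# median (tau = 0.5) scores better than a biased prediction does.
rng = np.random.default_rng(0)
y = rng.gamma(shape=2.0, scale=1.0, size=10_000)
median = np.quantile(y, 0.5)
assert pinball_loss(y, median, 0.5) < pinball_loss(y, median + 1.0, 0.5)
```

Predicting several quantiles at once yields a distribution for each retrieval, which is how such networks report per-pixel uncertainty with essentially the cost of a single forward pass.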
Trends in regional atmospheric water cycles across ocean basins diagnosed using multiple products
Spring 2021. Includes bibliographical references.

The importance of water within the Earth system, especially its direct impact on weather and climate through its presence and transport in the atmosphere, cannot be overstated. Accordingly, it is critical to obtain an accurate baseline understanding of the current state of the atmospheric branch of the water cycle if we are to infer future changes to the water cycle and their influence on weather and climate. Technological advances in both remote and in-situ observing systems have made it possible to characterize water and energy budgets on global scales. However, relatively little work has been done to study the degree of closure, and thus the accuracy of these methods, at regional scales, especially over the oceans. The task is complicated by the lack of long-term continuous data records of the variables of interest, including ocean surface evaporation, atmospheric water vapor flux divergence, and precipitation. This work aims to fill these gaps and contribute to a baseline understanding of the water cycle within the current TRMM and GPM era. The evolution of water cycle closure within five independent regions in the equatorial Pacific, Atlantic, and Indian Oceans was established previously using atmospheric reanalysis and gridded observational and pseudo-observational data products. That research found that while the water budgets closed extremely well in most basins, the water cycle within the West Pacific trended out of closure within the first decade of the 21st century. The current study extends this analysis temporally and includes a wider variety of independent data sources to confirm the presence of this emerging lack of closure and to hypothesize the reason for its existence.
Differences between independent products are used within the context of each region to infer whether the emerging lack of closure is a data artifact or the result of a more fundamental shift in the physical mechanisms and characteristics of the evaporation, precipitation, or water vapor flux divergence within a specific region. Results confirm an initial hypothesis that the emerging lack of water cycle closure in the West Pacific is not due to satellite or instrument drift. Rather, it appears to be related to changes in the prevalence of deep isolated versus deep organized convection in the West Pacific region and their associated impact on passive microwave precipitation retrieval algorithms.
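The closure diagnostic described above amounts to checking a simple budget identity: the change in atmospheric water storage should equal evaporation minus precipitation minus the moisture flux divergence. The sketch below evaluates the residual for one region; the variable names and values are invented for illustration, not the study's data.

```python
# Regional atmospheric water budget: dW/dt = E - P - div(vapor flux).
# If the data products are mutually consistent, the residual is near zero.
evap   = 5.2   # evaporation, mm/day (hypothetical regional mean)
precip = 4.9   # precipitation, mm/day
div_q  = 0.4   # moisture flux divergence, mm/day
dw_dt  = -0.1  # storage change (total column water vapor), mm/day

residual = dw_dt - (evap - precip - div_q)
print(f"budget residual: {residual:+.2f} mm/day")
```

A residual that drifts away from zero over the record, as reported for the West Pacific, signals either a physical change in the region or an artifact in one of the input products.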
Interactive Trace-Based Analysis Toolset for Manual Parallelization of C Programs
Massive amounts of legacy sequential code need to be parallelized to make better use of modern multiprocessor architectures. Nevertheless, writing parallel programs is still a difficult task. Automated parallelization methods can be effective both at the statement and loop levels and, recently, at the task level, but they are still restricted to specific source code constructs or application domains. We present in this article an innovative toolset that supports developers when performing manual code analysis and parallelization decisions. It automatically collects and represents the program profile and data dependencies in an interactive graphical format that facilitates the analysis and discovery of manual parallelization opportunities. The toolset can be used for arbitrary sequential C programs and parallelization patterns. Also, its program-scope data dependency tracing at runtime can complement tools based on static code analysis and can benefit from them at the same time. We also tested the effectiveness of the toolset in terms of the time to reach parallelization decisions and their quality, and measured a significant improvement for several real-world representative applications.
Utilizing Runtime Information for Accurate Root Cause Identification in Performance Diagnosis
This dissertation highlights that existing performance diagnostic tools often become less effective due to their inherent inaccuracies in modern software. To overcome these inaccuracies and effectively identify the root causes of performance issues, it is necessary to incorporate supplementary runtime information into these tools. Within this context, the dissertation integrates specific runtime information into two typical performance diagnostic tools: profilers and causal tracing tools.
The integration yields a substantial enhancement in the effectiveness of performance diagnosis. Among these tools, gprof stands out as a representative profiler for performance diagnosis. Nonetheless, its effectiveness diminishes as the time cost calculated based on CPU sampling fails to accurately and adequately pinpoint the root causes of performance issues in complex software. To tackle this challenge, the dissertation introduces an innovative methodology called value-assisted cost profiling (vProf). This approach incorporates variable values observed during runtime into the profiling process.
By continuously sampling variable values from both normal and problematic executions, vProf refines function cost estimates, identifies anomalies in value distributions, and highlights potentially problematic code areas that could be the actual sources of performance issues. The effectiveness of vProf is validated through the diagnosis of 18 real-world performance issues in four widely used applications. Remarkably, vProf outperforms other state-of-the-art tools, successfully diagnosing all issues, including three that had remained unresolved for over four years.
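The value-distribution comparison can be illustrated with a deliberately crude shift score that ranks variables by how far their sampled values move between a normal and a problematic run. vProf's actual statistics are surely more sophisticated; the variable names and samples below are invented.

```python
from statistics import mean

def distribution_shift(normal, buggy):
    # Crude shift score: difference of means, scaled by the normal-run spread
    # (falling back to 1 when the normal samples are constant).
    spread = max(normal) - min(normal) or 1
    return abs(mean(buggy) - mean(normal)) / spread

# Hypothetical samples from two runs: a queue length that explodes in the
# problematic execution points at the code that touches that queue.
samples = {
    "queue_len":  ([3, 4, 5, 4], [40, 55, 61, 48]),
    "batch_size": ([16, 16, 16, 16], [16, 16, 16, 16]),
}
ranked = sorted(samples, key=lambda k: distribution_shift(*samples[k]), reverse=True)
assert ranked[0] == "queue_len"
```

Ranking variables this way points the analyst at the code regions whose state actually changed between the runs, rather than merely at where time was spent.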
Causal tracing tools reveal the root causes of performance issues in complex software by generating tracing graphs. However, these graphs often suffer from inherent inaccuracies, characterized by superfluous (over-connected) and missed (under-connected) edges. These inaccuracies arise from the diversity of programming paradigms. To mitigate the inaccuracies, the dissertation proposes an approach to derive strong and weak edges in tracing graphs based on the vertices’ semantics collected during runtime. By leveraging these edge types, a beam-search-based diagnostic algorithm is employed to identify the most probable causal paths. Causal paths from normal and buggy executions are differentiated to provide key insights into the root causes of performance issues. To validate this approach, a causal tracing tool named Argus is developed and tested across multiple versions of macOS. It is evaluated on 12 well-known spinning pinwheel issues in popular macOS applications. Notably, Argus successfully diagnoses the root causes of all identified issues, including 10 issues that had remained unresolved for several years.
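A beam search over a tracing graph with edge strengths might look like the sketch below: keep only the best few partial paths at each step, scored by accumulated edge weight, so strong edges dominate the recovered causal path. The graph, weights, and vertex names are hypothetical, and Argus's actual algorithm and edge-semantics model are certainly richer.

```python
def beam_search_paths(graph, weights, start, beam_width=2, depth=3):
    # graph: vertex -> list of successor vertices.
    # weights: (u, v) -> edge strength (strong edges score higher than weak).
    # Keep the best `beam_width` partial paths per step; return the top path.
    beam = [([start], 0.0)]
    for _ in range(depth):
        candidates = []
        for path, score in beam:
            for nxt in graph.get(path[-1], []):
                candidates.append((path + [nxt], score + weights[(path[-1], nxt)]))
        if not candidates:
            break
        beam = sorted(candidates, key=lambda p: p[1], reverse=True)[:beam_width]
    return max(beam, key=lambda p: p[1])[0]

# Hypothetical tracing graph: strong edges (1.0) versus weak edges (0.2).
graph = {"click": ["dispatch", "timer"], "dispatch": ["render"], "timer": ["gc"]}
weights = {("click", "dispatch"): 1.0, ("click", "timer"): 0.2,
           ("dispatch", "render"): 1.0, ("timer", "gc"): 0.2}
assert beam_search_paths(graph, weights, "click") == ["click", "dispatch", "render"]
```

Differencing the top-scoring paths from a normal and a buggy execution then isolates the segment of the graph where the two runs diverge.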
The results from both tools exemplify a substantial enhancement of performance diagnostic tools achieved by harnessing runtime information. The integration can effectively mitigate inherent inaccuracies, lend support to inaccuracy-tolerant diagnostic algorithms, and provide key insights to pinpoint the root causes.