Search CORE

36 research outputs found

Recommended from our members

Final Report on Statistical Debugging for Petascale Environments

Author: Liblit B
Publication venue: 'Office of Scientific and Technical Information (OSTI)'
Publication date: 18/01/2013

Crossref

UNT Digital Library

Lessons learned at 208K: Towards Debugging Millions of Cores

Author: Ahn D H
Arnold D C
de Supinski B R
Lee G L
Legendre M
Liblit B
Miller B P
Schulz M J
Publication venue: Lawrence Livermore National Laboratory
Publication date: 14/04/2008

Petascale systems will present several new challenges to performance and correctness tools. Such machines may contain millions of cores, requiring that tools use scalable data structures and analysis algorithms to collect and to process application data. In addition, at such scales, each tool itself will become a large parallel application; already, debugging the full BlueGene/L (BG/L) installation at the Lawrence Livermore National Laboratory requires employing 1664 tool daemons. To reach such sizes and beyond, tools must use a scalable communication infrastructure and manage their own tool processes efficiently. Some system resources, such as the file system, may also become tool bottlenecks. In this paper, we present challenges to petascale tool development, using the Stack Trace Analysis Tool (STAT) as a case study. STAT is a lightweight tool that gathers and merges stack traces from a parallel application to identify process equivalence classes. We use results gathered at thousands of tasks on an InfiniBand cluster and results up to 208K processes on BG/L to identify current scalability issues as well as challenges that will be faced at the petascale. We then present implemented solutions to these challenges and show the resulting performance improvements. We also discuss future plans to meet the debugging demands of petascale machines.

UNT Digital Library
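
The abstract above describes STAT as gathering and merging stack traces to identify process equivalence classes. The following is a minimal Python sketch of that general idea, not STAT's actual implementation: the helper names (`equivalence_classes`, `merge_prefix_tree`), the in-memory dictionary of traces, and the sample function names are all assumptions made for illustration. A real tool would gather traces through a scalable communication tree rather than a single dictionary.

```python
from collections import defaultdict


def equivalence_classes(traces):
    """Group task ranks whose stack traces are identical.

    `traces` maps a task rank to its call stack, outermost frame first.
    """
    classes = defaultdict(list)
    for rank, frames in traces.items():
        classes[tuple(frames)].append(rank)
    return {stack: sorted(ranks) for stack, ranks in classes.items()}


def merge_prefix_tree(traces):
    """Merge traces into a call-prefix tree.

    Each node records the set of ranks whose stack passes through it.
    """
    root = {"ranks": set(), "children": {}}
    for rank, frames in traces.items():
        node = root
        node["ranks"].add(rank)
        for frame in frames:
            node = node["children"].setdefault(
                frame, {"ranks": set(), "children": {}}
            )
            node["ranks"].add(rank)
    return root


# Hypothetical traces from four tasks (all function names are invented).
traces = {
    0: ["main", "solve", "MPI_Waitall"],
    1: ["main", "solve", "MPI_Waitall"],
    2: ["main", "solve", "MPI_Waitall"],
    3: ["main", "io_write"],
}

classes = equivalence_classes(traces)
tree = merge_prefix_tree(traces)
```

In this sketch, rank 3 ends up alone in its own class, which is the kind of outlier a developer would inspect first with a traditional debugger.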

Software-Defect Localisation by Mining Dataflow-Enabled Call Graphs

Author: A. Zeller
B. Liblit
C. Liu
F. Eichinger
G. Kiczales
I.H. Witten
J.R. Quinlan
L.A. Kurgan
N. Ayewah
W. Masri
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2010

Defect localisation is essential in software engineering and is an important task in domain-specific data mining. Existing techniques building on call-graph mining can localise different kinds of defects. However, these techniques focus on defects that affect the control flow and are agnostic regarding the dataflow. In this paper, we introduce dataflow-enabled call graphs that incorporate abstractions of the dataflow. Building on these graphs, we present an approach for defect localisation. The creation of the graphs and the defect localisation are essentially data mining problems, making use of discretisation, frequent subgraph mining and feature selection. We demonstrate the defect-localisation qualities of our approach with a study on defects introduced into Weka. As a result, defect localisation now works much better, and a developer has to investigate on average only 1.5 out of 30 methods to fix a defect.

Crossref

KITopen

Suggesting Accurate Method and Class Names

Author: Arnaoudova V.
Bengio Y.
Botha J.
Gutmann M. U.
Kiros R.
Liblit B.
Maddison C. J.
Martin R. C.
Mikolov T.
Mnih A.
Russell S.
Sridhara G.
Srivastava N.
Takang A.
Takang A. A.
van der Maaten L.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2015

Descriptive names are a vital part of readable, and hence maintainable, code. Recent progress on automatically suggesting names for local variables tantalizes with the prospect of replicating that success with method and class names. However, suggesting names for methods and classes is much more difficult. This is because good method and class names need to be functionally descriptive, but suggesting such names requires that the model goes beyond local context. We introduce a neural probabilistic language model for source code that is specifically designed for the method naming problem. Our model learns which names are semantically similar by assigning them to locations, called embeddings, in a high-dimensional continuous space, in such a way that names with similar embeddings tend to be used in similar contexts. These embeddings seem to contain semantic information about tokens, even though they are learned only from statistical co-occurrences of tokens. Furthermore, we introduce a variant of our model that is, to our knowledge, the first that can propose neologisms, names that have not appeared in the training corpus. We obtain state-of-the-art results on the method, class, and even the simpler variable naming tasks. More broadly, the continuous embeddings that are learned by our model have the potential for wide application within software engineering.

CiteSeerX

Crossref

Edinburgh Research Explorer

Efficient static analysis with path pruning using coverage data

Author: Liblit B. R.
Publication venue: 'Association for Computing Machinery (ACM)'

Crossref

Lightweight and Statistical Techniques for Petascale Debugging: Correctness on Petascale Systems (CoPS) Preliminary Report

Author: de Supinski B R
Liblit B
Miller B P
Publication venue: 'Office of Scientific and Technical Information (OSTI)'
Publication date: 13/09/2011

Petascale platforms with O(10^5) and O(10^6) processing cores are driving advancements in a wide range of scientific disciplines. These large systems create unprecedented application development challenges. Scalable correctness tools are critical to shorten the time-to-solution on these systems. Currently, many DOE application developers use primitive manual debugging based on printf or traditional debuggers such as TotalView or DDT. This paradigm breaks down beyond a few thousand cores, yet bugs often arise above that scale. Programmers must reproduce problems in smaller runs to analyze them with traditional tools, or else perform repeated runs at scale using only primitive techniques. Even when traditional tools run at scale, the approach wastes substantial effort and computation cycles. Continued scientific progress demands new paradigms for debugging large-scale applications. The Correctness on Petascale Systems (CoPS) project is developing a revolutionary debugging scheme that will reduce the debugging problem to a scale that human developers can comprehend. The scheme can provide precise diagnoses of the root causes of failure, including suggestions of the location and the type of errors down to the level of code regions or even a single execution point. Our fundamentally new strategy combines and expands three relatively new complementary debugging approaches. The Stack Trace Analysis Tool (STAT), a 2011 R&D 100 Award Winner, identifies behavior equivalence classes in MPI jobs and highlights behavior when elements of the class demonstrate divergent behavior, often the first indicator of an error. The Cooperative Bug Isolation (CBI) project has developed statistical techniques for isolating programming errors in widely deployed code that we will adapt to large-scale parallel applications. Finally, we are developing a new approach to parallelizing expensive correctness analyses, such as analysis of memory usage in the Memgrind tool.
In the first two years of the project, we have successfully extended STAT to determine the relative progress of different MPI processes. We have shown that STAT, which is now included in the debugging tools distributed by Cray with their large-scale systems, substantially reduces the scale at which traditional debugging techniques are applied. We have extended CBI to large-scale systems and developed new compiler-based analyses that reduce its instrumentation overhead. Our results demonstrate that CBI can identify the source of errors in large-scale applications. Finally, we have developed MPIecho, a new technique that will reduce the time required to perform key correctness analyses, such as the detection of writes to unallocated memory. Overall, our research results are the foundations for new debugging paradigms that will improve application scientist productivity by reducing the time to determine which package or module contains the root cause of a problem that arises at all scales of our high-end systems.
While we have made substantial progress in the first two years of CoPS research, significant work remains. While STAT provides scalable debugging assistance for incorrect application runs, we could apply its techniques to assertions in order to observe deviations from expected behavior. Further, we must continue to refine STAT's techniques to represent behavioral equivalence classes efficiently as we expect systems with millions of threads in the next year. We are exploring new CBI techniques that can assess the likelihood that execution deviations from past behavior are the source of erroneous execution. Finally, we must develop usable correctness analyses that apply the MPIecho parallelization strategy in order to locate coding errors. We expect to make substantial progress on these directions in the next year but anticipate that significant work will remain to provide usable, scalable debugging paradigms.

UNT Digital Library

IEEE Transactions on Software Engineering : Vol. 36, No. 1, January - February 2010

Author: S. Horwitz, B. Liblit, M. Polishchuk, et al.
Publication venue: IEEE (Institute of Electrical and Electronics Engineers)

1. Better Debugging via Output Tracing and Callstack-Sensitive Slicing
2. DECOR: A Method for the Specification and Detection of Code and Design Smells
3. Directed Explicit State-Space Search in the Generation of Counterexamples for Stochastic Model Checking
4. Effects of Personality on Pair Programming
5. Generating Event Sequence-Based Test Cases Using GUI Runtime State Feedback
etc.

Open Library

Mining Edge-Weighted Call Graphs to Localise Software Bugs

Author: B. Korel
B. Liblit
I.F. Darwin
I.H. Witten
J. Han
J.R. Quinlan
M.J. Harrold
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2008

Crossref

Reusing debugging knowledge via trace-based bug search

Author: Drew Schleck
Earl T. Barr
Jula H.
Liblit B.
Zhendong Su
Zhongxian Gu
Publication venue: 'Association for Computing Machinery (ACM)'

Crossref