72 research outputs found

    Source Code Retrieval from Large Software Libraries for Automatic Bug Localization

    This dissertation advances the state of the art in information retrieval (IR) based approaches to automatic bug localization in software. In an IR-based approach, one first creates a search engine using a probabilistic or a deterministic model for the files in a software library. Subsequently, a bug report is treated as a query to the search engine for retrieving the files relevant to the bug. With regard to the new work presented, we first demonstrate the importance of taking version histories of the files into account for achieving significant improvements in the precision with which the files related to a bug are located. This is motivated by the realization that files that have not changed in a long time are likely to have "stabilized" and are therefore less likely to contain bugs. Subsequently, we look at the difficulties created by the fact that developers frequently use abbreviations and concatenations that are not likely to be familiar to someone trying to locate the files related to a bug. We show how an initial query can be automatically reformulated to include the relevant terms actually used in the files, by analyzing the files retrieved in response to the original query for terms that are proximal to the original query terms. The last part of this dissertation generalizes our term-proximity based work by using Markov Random Fields (MRF) to model the inter-term dependencies in a query vis-a-vis the files. Our MRF work redresses one of the major defects of the most commonly used modeling approaches in IR, which is the loss of all inter-term relationships in the documents.
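As a rough illustration of the "bug report as query" idea described in this abstract, the sketch below ranks files by TF-IDF cosine similarity to the report text. It is only a minimal baseline under stated assumptions: the function names `tokenize` and `rank_files` and the sample data are hypothetical, and the dissertation's own models (version histories, query reformulation, MRF) are not reproduced here.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it on non-alphanumeric characters."""
    return [t for t in re.split(r"[^a-z0-9]+", text.lower()) if t]


def rank_files(files, bug_report):
    """Return file names ordered by TF-IDF cosine similarity to the bug report.

    `files` maps a (hypothetical) file name to its source text.
    """
    docs = {name: Counter(tokenize(body)) for name, body in files.items()}
    n_docs = len(docs)

    # Document frequency: in how many files does each term occur?
    doc_freq = Counter()
    for counts in docs.values():
        doc_freq.update(counts.keys())

    # Smoothed inverse document frequency.
    idf = {t: math.log((1 + n_docs) / (1 + d)) + 1.0 for t, d in doc_freq.items()}

    def weigh(counts):
        # Terms unseen in the library carry no weight in the query vector.
        return {t: c * idf[t] for t, c in counts.items() if t in idf}

    def cosine(a, b):
        dot = sum(w * b.get(t, 0.0) for t, w in a.items())
        norm_a = math.sqrt(sum(w * w for w in a.values()))
        norm_b = math.sqrt(sum(w * w for w in b.values()))
        return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

    query = weigh(Counter(tokenize(bug_report)))
    scores = {name: cosine(query, weigh(counts)) for name, counts in docs.items()}
    return sorted(scores, key=scores.get, reverse=True)
```

Note that this bag-of-words baseline discards term order and proximity, which is the limitation the abstract says the MRF-based work addresses.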

    How Debian GNU/Linux is translated into Spanish

    Debian GNU/Linux is one of the largest Linux-based software distributions, including thousands of libre (free, open source) software packages coming from many different sources. One of the peculiarities of the Debian project is that it is run entirely by a self-selected group of volunteers. In particular, this means that all the localization tasks (including translation) into the many supported languages are carried out through volunteer effort. This paper describes how Debian is localized into Spanish, as a case example of translation efforts in the libre software world.

    Doctor of Philosophy

    Aggressive random testing tools, or fuzzers, are impressively effective at finding bugs in compilers and programming language runtimes. For example, a single test-case generator has resulted in more than 460 bugs reported for a number of production-quality C compilers. However, fuzzers can be hard to use. The first problem is that failures triggered by random test cases can be difficult to debug because these tests are often large. To report a compiler bug, one must often construct a small test case that triggers the bug. The existing automated test-case reduction technique, delta debugging, is not sufficient to produce small, reportable test cases. A second problem is that fuzzers are indiscriminate: they repeatedly find bugs that may not be severe enough to fix right away. Third, fuzzers tend to generate a large number of test cases that only trigger a few bugs. Some bugs are triggered much more frequently than others, creating needle-in-the-haystack problems. Currently, users rule out undesirable test cases using ad hoc methods such as disallowing problematic features in tests and filtering test results. This dissertation investigates approaches to improving the utility of compiler fuzzers. Two components, an aggressive test-case reducer and a tamer, are added to the fuzzing workflow to make the fuzzer more user friendly. We introduce C-Reduce, an aggressive test-case reducer for C/C++ programs, which exploits rich domain-specific knowledge to output test cases nearly as good as those produced by skilled humans. This reducer produces outputs that are, on average, more than 30 times smaller than those produced by the existing reducer that is most commonly used by compiler engineers. Second, this dissertation formulates and addresses the fuzzer taming problem: given a potentially large number of random test cases that trigger failures, order them such that diverse, interesting test cases are highly ranked. Bug triage can be effectively automated, relying on techniques from machine learning to suppress duplicate bug-triggering test cases and test cases triggering known bugs. An evaluation shows the ability of this tool to solve the fuzzer taming problem for 3,799 test cases triggering 46 bugs in a C compiler.
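To make the idea of automated test-case reduction concrete, here is a minimal sketch of a generic greedy, line-based reducer in the spirit of delta debugging. It is not C-Reduce and not the exact delta debugging algorithm; the function name `reduce_lines` and the `still_fails` predicate (standing in for "the compiler still exhibits the bug on this input") are assumptions made for illustration.

```python
def reduce_lines(lines, still_fails):
    """Greedily delete chunks of lines while `still_fails` remains true.

    `lines` is the failing test case as a list of lines; `still_fails`
    is a caller-supplied predicate (hypothetical here) that reports
    whether a candidate still triggers the failure.
    """
    if not still_fails(lines):
        raise ValueError("input does not trigger the failure")

    chunk = max(len(lines) // 2, 1)
    while True:
        i = 0
        progress = False
        while i < len(lines):
            # Try dropping the chunk that starts at position i.
            candidate = lines[:i] + lines[i + chunk:]
            if still_fails(candidate):
                lines = candidate  # keep the smaller test case
                progress = True
            else:
                i += chunk  # this chunk is needed; move on
        if not progress:
            if chunk == 1:
                return lines  # no single line can be removed
            chunk = max(chunk // 2, 1)  # retry with finer granularity
```

A reducer of this kind only removes whole lines, which hints at why the abstract calls plain delta debugging insufficient for C programs: shrinking further typically requires language-aware transformations.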

    AmaLgam+: Composing rich information sources for accurate bug localization

    Will fault localization work for these failures? An automated approach to predict effectiveness of fault localization tools
