4 research outputs found

    Understanding Eye Gaze Patterns in Code Comprehension

    Get PDF
    Program comprehension is a sub-field of software engineering that seeks to understand how developers understand programs. Comprehension acts as a starting point for many software engineering tasks such as bug fixing, refactoring, and feature creation. The dissertation presents a series of empirical studies to understand how developers comprehend software in realistic settings. The unique aspect of this work is the use of eye tracking equipment to gather fine-grained detailed information of what developers look at in software artifacts while they perform realistic tasks in an environment familiar to them, namely a context including both the Integrated Development Environment (Eclipse or Visual Studio) and a web browser (Google Chrome). The iTrace eye tracking infrastructure is used for certain eye tracking studies on large code files as it is able to handle page scrolling and context switching. The first study is a classroom-based study on how students actively trained in the classroom understand grouped units of C++ code. Results indicate students made many transitions between lines that were closer together, and were attracted the most to if statements and to a lesser extent assignment code. The second study seeks to understand how developers use Stack Overflow page elements to build summaries of open source project code. Results indicate participants focused more heavily on question and answer text, and the embedded code, more than they did the title, question tags, or votes. The third study presents a larger code summarization study using different information contexts: Stack Overflow, bug repositories and source code. Results show participants tended to visit up to two codebase files in either the combined or isolated codebase session, but visit more bug report pages, and spend longer time on new Stack Overflow pages they visited, when given either these two treatments in isolation. In the combined session, time spent on the one or two codebase files they viewed dominated the session time. Information learned from tracking developers\u27 gaze in these studies can form foundations for developer behavior models, which we hope can later inform recommendations for actions one might take to achieve workflow goals in these settings. Advisor: Bonita Shari

    Exploring regularities in software with statistical models and their applications

    Get PDF
    Software systems are becoming popular. They are used with different platforms for different applications. Software systems are developed with support from programming languages, which help developers work conveniently. Programming languages can have different paradigms with different form, syntactic structures, keywords, and representation ways. In many cases, however, programming languages are similar in different important aspects: 1. They are used to support description of specific tasks, 2. Source codes are written in languages and includes a limit set of distinctive tokens, many tokens are repeated like keywords, function calls, and 3. They follow specific syntactic rules to make machine understanding. Those points also respect the similarity between programming language and natural language. Due to its critical role in many applications, natural language processing (NLP) has been studied much and given many promising results like automatic cross-language translation, speech-to-text, information searching, etc. It is interesting to observe if there are similar characteristics between natural language and programming language and whether techniques in NLP can be reused for programming language processing? Recent works in software engineering (SE) shows that their similarities between NLP and programming language processing and techniques in NLP can be reused for PLP. This dissertation introduces my works with contributions in study of characteristics of programming languages, the models which employed them and the main applications that show the usefulness of the proposed models. Study in both three aspects has drawn interests from software engineering community and received awards due to their innovation and applicability. \u27 I hope that this dissertation will bring a systematic view of how advantage techniques in natural language processing and machine learning can be re-used and give huge benefit for programming language processing, and how those techniques are adapted with characteristics of programming language and software systems
    corecore