
    Eye movements in code reading: relaxing the linear order

    Code reading is an important skill in programming. Inspired by the linearity that people exhibit while reading natural language text, we designed local and global gaze-based measures to characterize linearity (left-to-right and top-to-bottom) in reading source code. Unlike natural language text, source code is executable and requires a specific reading approach. To validate these measures, we compared the eye movements of novice and expert programmers who were asked to read and comprehend short snippets of natural language text and Java programs. Our results show that novices read source code less linearly than natural language text. Moreover, experts read code less linearly than novices. These findings indicate that there are specific differences between reading natural language and source code, and suggest that non-linear reading skills increase with expertise. We discuss the implications for practitioners and educators.
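
    As an illustration of the kind of local gaze-based linearity measure this abstract describes, the sketch below (a Python approximation under assumed thresholds, not the authors' exact metric) classifies each saccade between consecutive fixations as forward when it moves rightward on the same line or down to a later line, and reports the forward fraction.

        # Sketch of a local linearity measure over an eye-tracking fixation
        # sequence. Each fixation is an (x, y) screen coordinate. The
        # line_height threshold and the metric itself are illustrative
        # assumptions, not the paper's definitions.

        def forward_fraction(fixations, line_height=20):
            """Fraction of saccades following left-to-right, top-to-bottom order."""
            forward, total = 0, 0
            for (x1, y1), (x2, y2) in zip(fixations, fixations[1:]):
                total += 1
                if y2 - y1 > line_height / 2:       # moved down to a later line
                    forward += 1
                elif abs(y2 - y1) <= line_height / 2 and x2 > x1:  # rightward, same line
                    forward += 1
            return forward / total if total else 0.0

        # A perfectly linear scan scores 1.0; regressions lower the score.
        scan = [(10, 10), (60, 10), (110, 10), (10, 30), (60, 30)]
        print(forward_fraction(scan))  # 1.0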

    Sequence-to-sequence learning-based conversion of pseudo-code to source code using neural translation approach

    Pseudo-code refers to an informal means of representing algorithms that does not require the exact syntax of a computer programming language. Pseudo-code helps developers and researchers represent their algorithms in human-readable language. Generally, researchers can convert pseudo-code into computer source code using different conversion techniques, whose efficiency is measured by the correctness of the converted algorithm. Researchers have already explored diverse technologies to devise conversion methods with higher accuracy. This paper proposes a novel pseudo-code conversion learning method that includes natural language processing-based text preprocessing and a sequence-to-sequence deep learning-based model trained on the SPoC dataset. We conducted an extensive experiment on our designed algorithm using bilingual evaluation understudy (BLEU) scoring and compared our results with state-of-the-art techniques. Result analysis shows that our approach is more accurate and efficient than other existing conversion methods in terms of several performance metrics. Furthermore, the proposed method outperforms the existing approaches because it utilizes two Long Short-Term Memory (LSTM) networks, which might increase the accuracy.
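
    Below is a minimal sketch of the kind of two-LSTM encoder-decoder the abstract mentions, written in PyTorch; the vocabulary sizes, dimensions, and absence of attention are illustrative assumptions rather than details of the paper's model.

        import torch
        import torch.nn as nn

        # Minimal sequence-to-sequence model with two LSTMs: one encodes the
        # pseudo-code token sequence, the other decodes source-code tokens
        # from the encoder's final state. All sizes are assumptions.
        class Seq2Seq(nn.Module):
            def __init__(self, src_vocab, tgt_vocab, emb=128, hidden=256):
                super().__init__()
                self.src_emb = nn.Embedding(src_vocab, emb)
                self.tgt_emb = nn.Embedding(tgt_vocab, emb)
                self.encoder = nn.LSTM(emb, hidden, batch_first=True)
                self.decoder = nn.LSTM(emb, hidden, batch_first=True)
                self.out = nn.Linear(hidden, tgt_vocab)

            def forward(self, src_ids, tgt_ids):
                _, state = self.encoder(self.src_emb(src_ids))    # final (h, c)
                dec_out, _ = self.decoder(self.tgt_emb(tgt_ids), state)
                return self.out(dec_out)                          # next-token logits

        model = Seq2Seq(src_vocab=5000, tgt_vocab=8000)
        src = torch.randint(0, 5000, (2, 12))   # batch of pseudo-code token ids
        tgt = torch.randint(0, 8000, (2, 15))   # shifted target code token ids
        logits = model(src, tgt)                # shape (2, 15, 8000)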

    Applying static code analysis for domain-specific languages

    The use of code quality control platforms for analysing source code is increasingly gaining attention in the developer community. These platforms are prepared to parse and check source code written in a variety of general-purpose programming languages. The emergence of domain-specific languages enables professionals from different areas to develop and describe problem solutions in their disciplines. Thus, source code quality analysis methods and tools can also be applied to software artefacts developed with a domain-specific language. To evaluate the quality of domain-specific language code, every software component required by the quality platform to parse and query the source code must be developed. This is a time-consuming and error-prone task, for which this paper describes a model-driven interoperability strategy that bridges the gap between the grammar formats of source code quality parsers and domain-specific text languages. This approach has been tested on the most widespread platforms for designing text-based languages and for source code analysis, and has been evaluated in a number of contexts across different domain areas.
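
    To make the bridging idea concrete, here is a toy sketch in Python: a DSL grammar held as a simple in-memory model is rendered into the textual notation a quality platform's parser generator consumes. The model layout and target syntax are invented for illustration and are not the paper's metamodels or tooling.

        # Toy model-to-text transformation: a DSL grammar model (rules as
        # name/elements pairs) is rendered as an ANTLR-like grammar file
        # from which a code quality platform's parser could be generated.
        dsl_grammar = {
            "name": "Greetings",
            "rules": [
                ("model", ["greeting*"]),
                ("greeting", ["'Hello'", "ID", "'!'"]),
            ],
        }

        def to_target_grammar(model):
            """Render the grammar model in the target parser's notation."""
            lines = [f"grammar {model['name']};"]
            for name, elements in model["rules"]:
                lines.append(f"{name} : {' '.join(elements)} ;")
            return "\n".join(lines)

        print(to_target_grammar(dsl_grammar))
        # grammar Greetings;
        # model : greeting* ;
        # greeting : 'Hello' ID '!' ;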

    Causative Insights into Open Source Software Security using Large Language Code Embeddings and Semantic Vulnerability Graph

    Open Source Software (OSS) security and resilience are worldwide phenomena hampering economic and technological innovation. OSS vulnerabilities can cause unauthorized access, data breaches, network disruptions, and privacy violations, rendering any benefits worthless. While recent deep-learning techniques have shown great promise in identifying and localizing vulnerabilities in source code, it is unclear how effective these techniques are from a usability perspective, due to a lack of proper methodological analysis. These methods typically offload a developer's task of classifying and localizing vulnerable code, yet no reasonable study has measured their actual effectiveness for the end user. To address the lack of proper developer training in prior methods, we propose a system that links vulnerabilities to their root causes, thereby intuitively educating developers to code more securely. Furthermore, we provide a comprehensive usability study to test the effectiveness of our system in fixing vulnerabilities and its capability to assist developers in writing more secure code. We demonstrate the effectiveness of our system by showing its efficacy in helping developers fix vulnerable source code. Our study shows a 24% improvement in code repair capabilities compared to previous methods. We also show that, when trained by our system, on average approximately 9% of developers naturally tend to write more secure code with fewer vulnerabilities.
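
    As a hedged sketch of the embedding-plus-classifier pattern such systems build on (the model choice, mean pooling, and logistic-regression classifier are assumptions, not the paper's pipeline):

        import torch
        from transformers import AutoTokenizer, AutoModel
        from sklearn.linear_model import LogisticRegression

        # Encode code snippets with a pretrained code model, mean-pool the
        # token embeddings, and fit a simple vulnerability classifier.
        tokenizer = AutoTokenizer.from_pretrained("microsoft/codebert-base")
        encoder = AutoModel.from_pretrained("microsoft/codebert-base")

        def embed(snippets):
            batch = tokenizer(snippets, padding=True, truncation=True,
                              return_tensors="pt")
            with torch.no_grad():
                hidden = encoder(**batch).last_hidden_state   # (n, seq, 768)
            return hidden.mean(dim=1).numpy()                 # mean-pool tokens

        train_code = ["strcpy(buf, user_input);",
                      "strncpy(buf, s, sizeof(buf) - 1);"]
        labels = [1, 0]                                       # 1 = vulnerable
        clf = LogisticRegression().fit(embed(train_code), labels)
        print(clf.predict(embed(["gets(line);"])))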

    Providing Metacognitive Support Using Learning by Teaching Paradigm

    Learning by teaching is a powerful technique that encourages students to think deeply, orally, and repeatedly. However, there are obstacles to using this technique in school settings, such as the time it consumes, the anxiety of failing in front of classmates, and the difficulty of finding matching peers. To give students the benefits of this method, several computer-based systems have been implemented in which students teach a virtual agent that plays the tutee role. These existing systems focus on various domains, but none of them considers programming problem solving. In addition, the majority of existing systems do not provide meta-cognitive support; they focus only on feedback about the content, such as providing correct answers, a type of feedback called Knowledge of Correct Response (KCR). In our work, we build a computer-based learning environment that enables novice programmers to teach problem solving to an animated agent, combining the learning-by-teaching technique with meta-cognitive support. This helps novice programmers acquire deep learning about how to solve problems and prepares them for future learning tasks. The project addresses a common difficulty: novice programmers tend to focus on writing code rather than understanding the problem properly, which leaves them frustrated when they face unfamiliar programming problems. We conducted an experiment to compare the effect of guided meta-cognitive feedback and KCR feedback on novice programmers' skills in the learning-by-teaching paradigm, implementing two versions of our system: one providing meta-cognitive feedback and the other providing KCR feedback. We analysed data from novice programmers, 18-25 years old, who had studied and passed at least one programming course, from the College of Computer at Al-lieth in Umm Al-Qura University; the experiment was conducted in the college's lab. We found that meta-cognitive feedback had a positive effect on novice programmers' skills across the pre-test, post-test, and delayed test. The performance of 82% of the participants in the experimental group (who received guided meta-cognitive feedback) improved after the post-test, whereas the performance of only 30% of participants in the control group (who received KCR feedback) improved. Despite the difficulty of the delayed test compared to the pre-test and the post-test, the performance of 70% of the participants in the experimental group improved, whereas the performance of only 50% of the participants in the control group improved. The improvement of the control group is not surprising, because learning by teaching can encourage (but not induce) the practice of meta-cognitive skills implicitly, whereas the experimental group used learning by teaching with explicit meta-cognitive support. Thesis (MCompSc) -- University of Adelaide, School of Computer Science, 201

    Neural Machine Translation for Code Generation

    Neural machine translation (NMT) methods developed for natural language processing have been shown to be highly successful in automating translation from one natural language to another. Recently, these NMT methods have been adapted to the generation of program code. In NMT for code generation, the task is to generate output source code that satisfies constraints expressed in the input. In the literature, a variety of different input scenarios have been explored, including generating code based on natural language description, lower-level representations such as binary or assembly (neural decompilation), partial representations of source code (code completion and repair), and source code in another language (code translation). In this paper we survey the NMT for code generation literature, cataloging the variety of methods that have been explored according to input and output representations, model architectures, optimization techniques used, data sets, and evaluation methods. We discuss the limitations of existing methods and future research directions.
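
    Whatever the input scenario, most surveyed systems generate code autoregressively: the decoder emits one output token at a time, conditioned on the input and the partial output. The greedy-decoding sketch below is generic; model.next_token_logits is a hypothetical interface standing in for any concrete architecture.

        # Generic greedy decoding loop for NMT-based code generation.
        # `model.next_token_logits(input_ids, output)` is a hypothetical
        # method returning a list of scores over the output vocabulary.
        BOS, EOS = 1, 2   # assumed special token ids

        def greedy_decode(model, input_ids, max_len=256):
            output = [BOS]
            for _ in range(max_len):
                logits = model.next_token_logits(input_ids, output)
                token = max(range(len(logits)), key=logits.__getitem__)  # argmax
                if token == EOS:
                    break
                output.append(token)
            return output[1:]   # generated code token ids, without BOS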

    Introductory programming: a systematic literature review

    As computing becomes a mainstream discipline embedded in the school curriculum and acts as an enabler for an increasing range of academic disciplines in higher education, the literature on introductory programming is growing. Although there have been several reviews that focus on specific aspects of introductory programming, there has been no broad overview of the literature exploring recent trends across the breadth of introductory programming. This paper is the report of an ITiCSE working group that conducted a systematic review in order to gain an overview of the introductory programming literature. Partitioning the literature into papers addressing the student, teaching, the curriculum, and assessment, we explore trends, highlight advances in knowledge over the past 15 years, and indicate possible directions for future research.

    Improving the Flexibility of CLARA's Automated Matching and Repair Processes

    With the world steadily moving towards automation, more computer science researchers are focusing on automated program repair. CLARA is an automated program repair tool that provides feedback to novice programmers solving introductory programming assignments in Java, C++, and Python; it performs test-based repair, requiring as input a correct program, an incorrect program, and a corresponding test case. Our work focuses only on Python. CLARA has two main limitations. The first is a lack of support for commonly used language constructs such as standard input, standard output, and import statements. We address this issue by extending CLARA's abstract syntax tree processor and interpreter to include these constructs. The second limitation is that CLARA requires both the correct and the incorrect program to have the same control flow. In a real-world setting, it is not easy to find such programs, reducing the true impact CLARA can have on the learning of novice programmers. We therefore implement a graph matching technique between the correct and incorrect programs that considers both semantic and topological information to help overcome this limitation. Using this matching, we modify the incorrect program so that its control flow matches the correct program. To verify that our technique overcomes the control-flow limitation, we conduct experiments that run CLARA and compare the number of programs repaired with and without the graph matching technique. We also analyze the percentage of the program modified by CLARA and the number of correct programs needed to repair all valid incorrect programs. Our experiments show that CLARA can parse, process, and repair many more programs after our extensions. Additionally, they indicate that our technique never causes CLARA to replace all the source code of an incorrect program with all the source code of a correct program.
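
    To illustrate the flavour of such a matching (not CLARA's actual algorithm; the block representation and cost weights are assumptions), the sketch below scores pairs of basic blocks by combining textual similarity with an out-degree difference and solves the resulting assignment problem.

        from difflib import SequenceMatcher
        from scipy.optimize import linear_sum_assignment
        import numpy as np

        # Each basic block is (source text, out-degree). The cost mixes a
        # semantic term (text dissimilarity) and a topological term.
        correct = [("x = int(input())", 1), ("print(x * 2)", 0)]
        incorrect = [("y = input()", 1), ("print(y + y)", 0)]

        def cost(a, b):
            semantic = 1 - SequenceMatcher(None, a[0], b[0]).ratio()
            topological = abs(a[1] - b[1])
            return semantic + 0.5 * topological

        matrix = np.array([[cost(a, b) for b in incorrect] for a in correct])
        rows, cols = linear_sum_assignment(matrix)   # minimum-cost matching
        for r, c in zip(rows, cols):
            print(correct[r][0], "<->", incorrect[c][0])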

    Investigating tools and techniques for improving software performance on multiprocessor computer systems

    The availability of modern commodity multicore processors and multiprocessor computer systems has resulted in the widespread adoption of parallel computers in a variety of environments, ranging from the home to workstation and server environments in particular. Unfortunately, parallel programming is harder and requires more expertise than the traditional sequential programming model. The variety of tools and parallel programming models available to the programmer further complicates the issue. The primary goal of this research was to identify and describe a selection of parallel programming tools and techniques to aid novice parallel programmers in the process of developing efficient parallel C/C++ programs for the Linux platform. This was achieved by highlighting and describing the key concepts and hardware factors that affect parallel programming, providing a brief survey of commonly available software development tools and parallel programming models and libraries, and presenting structured approaches to software performance tuning and parallel programming. Finally, the performance of several parallel programming models and libraries was investigated, along with the programming effort required to implement solutions using the respective models. A quantitative research methodology was applied to the investigation of the performance and programming effort associated with the selected parallel programming models and libraries, which included automatic parallelisation by the compiler, Boost Threads, Cilk Plus, OpenMP, POSIX threads (Pthreads), and Threading Building Blocks (TBB). Additionally, the performance of the GNU C/C++ and Intel C/C++ compilers was examined. The results revealed that the choice of parallel programming model or library is dependent on the type of problem being solved and that there is no overall best choice for all classes of problem. However, the results also indicate that parallel programming models with higher levels of abstraction require less programming effort and provide similar performance compared to explicit threading models. The principal conclusion was that problem analysis and parallel design are important factors in the selection of the parallel programming model and tools, but that models with higher levels of abstraction, such as OpenMP and Threading Building Blocks, are favoured.
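
    The study's subject is C/C++, but the abstraction trade-off it reports can be illustrated with a Python analogue (not the study's code): an explicit-threading version that partitions work and collects results by hand, versus a pool-based version that delegates those chores to the library.

        import threading
        from concurrent.futures import ProcessPoolExecutor

        def chunk_sum(chunk):
            return sum(x * x for x in chunk)

        # Explicit threading: manual partitioning, thread management, and
        # result collection (and, in CPython, no CPU-bound speed-up because
        # of the GIL -- the point here is programming effort, not speed).
        def explicit_sum_squares(data, workers=4):
            chunks = [data[i::workers] for i in range(workers)]
            results = [0] * workers
            def work(i):
                results[i] = chunk_sum(chunks[i])
            threads = [threading.Thread(target=work, args=(i,)) for i in range(workers)]
            for t in threads:
                t.start()
            for t in threads:
                t.join()
            return sum(results)

        # Higher-level abstraction: the pool handles partitioning,
        # scheduling, and result collection in far fewer lines.
        def pooled_sum_squares(data, workers=4):
            with ProcessPoolExecutor(max_workers=workers) as pool:
                return sum(pool.map(chunk_sum, [data[i::workers] for i in range(workers)]))

        if __name__ == "__main__":
            data = list(range(100_000))
            assert explicit_sum_squares(data) == pooled_sum_squares(data)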