89 research outputs found

    Blackbox: A Large Scale Repository of Novice Programmers’ Activity

    Automatically observing and recording the programming behaviour of novices is an established computing education research technique. However, prior studies have been conducted at a single institution on a small or medium scale, without the possibility of data re-use. Now, the widespread availability of always-on Internet access allows for data collection at a much larger, global scale. In this paper we report on the Blackbox project, begun in June 2013. Blackbox is a perpetual data collection project that collects data from worldwide users of the BlueJ IDE – a programming environment designed for novice programmers. Over one hundred thousand users have already opted in to Blackbox. The collected data is anonymous and is available to other researchers for use in their own studies, thus benefitting the larger research community. In this paper, we describe the data available via Blackbox, show some examples of analyses that can be performed using the collected data, and discuss some of the analysis challenges that lie ahead.

    Meaningful Categorisation of Novice Programmer Errors

    The frequency of different kinds of error made by students learning to write computer programs has long been of interest to researchers and educators. In the past, various studies investigated this topic, usually by recording and analysing compiler error messages, and producing tables of relative frequencies of specific error diagnostics produced by the compiler. In this paper, we improve on such prior studies by investigating actual logical errors in student code, as opposed to diagnostic messages produced by the compiler. The actual errors reported here are more precise, more detailed and more accurate than the diagnostics produced automatically.

    An exploration of novice compilation behaviour in BlueJ

    Our research explores the process by which beginning programmers go about writing programs. We have focused our explorations on what we call compilation behaviour: the programming behaviour a student engages in while repeatedly editing and compiling their programs in an attempt to make them syntactically, if not semantically, correct. The students whose behaviour we have observed were engaged in learning to program in an objects-first style using BlueJ, an environment designed for supporting novice programmers just starting out with the Java programming language. The significant results of our work are two-fold. First, we have developed tools for visualising the process by which students write their programs. Using these tools, we can quickly obtain valuable information about their process, and use that information to inform further research regarding their behaviour, or apply it immediately in a classroom context to better support the struggling learner. Second, we have proposed a quantification of novice compilation behaviour which we call the error quotient. Using this metric, we can determine how well (or poorly) a student fares with syntax errors while learning to program. This quantity, like our tools for visualisation, provides a powerful indicator of how much or little a student is struggling with the language while programming, and correlates significantly with traditional indicators of academic progress.

    An exploration of novice compilation behaviour in BlueJ

    EThOS - Electronic Theses Online Service, United Kingdom

    A gentle transition from Java programming to Web Services using XML-RPC

    Exposing students to relevant, leading-edge vocational areas such as Web Services can be difficult. We show a lightweight approach that embeds a key component of Web Services within a Level 3 BSc module in Distributed Computing. We present a ready-to-use collection of lecture slides and student activities based on XML-RPC. In addition, we show that this material addresses the central topics in the context of web services as identified by Draganova (2003).

    A Data-Driven Approach to Compare the Syntactic Difficulty of Programming Languages

    Educators who teach programming subjects often wonder “which programming language should I teach first?”. The debate behind this question has a long history, and coming up with a definite answer to it would be far-fetched. Nonetheless, several efforts can be identified in the literature wherein pros and cons of mainstream programming languages are examined, analysed, and discussed in view of their potential to facilitate the didactics of programming concepts, especially to novice programmers. In line with these efforts, we explore this question by comparing the syntactic difficulty of two modern, but fundamentally different, programming languages: Java and Python. To achieve this objective, we introduce a standalone and purely data-driven method which stores the code submissions and clusters the errors that occurred with the aid of a custom transition probability matrix. For the evaluation of this model, a total of 219,454 submissions, made by 715 first-year undergraduate students in 259 unique programming exercises, were gathered and analysed. The results indicate that Python is an easier-to-grasp programming language and is, therefore, highly recommended as the stepping stone in introductory courses. Besides, the adoption of the described method not only enables educators to identify those students who struggle with coding (syntax-wise) but also paves the way for the adoption of personalised and adaptive learning practices.

    An Exploration of Traditional and Data Driven Predictors of Programming Performance

    This thesis investigates factors that can be used to predict the success or failure of students taking an introductory programming course. Four studies were performed to explore how aspects of the teaching context, static factors based upon traditional learning theories, and data-driven metrics derived from aspects of programming behaviour were related to programming performance. In the first study, a systematic review into the worldwide outcomes of programming courses revealed an average pass rate of 67.7%. This was found to have not significantly changed over time, or to have differed based upon aspects of the teaching context, such as the programming language taught to students. The second study showed that many of the factors based upon traditional learning theories, such as learning styles, are context dependent, and fail to consistently predict programming performance when they are applied across different teaching contexts. The third study explored data-driven metrics derived from the programming behaviour of students. Analysing data logged from students using the BlueJ IDE, 10 new data-driven metrics were identified and validated on three independently gathered datasets. Weaker students were found to make a greater percentage of successive errors, and spend a greater percentage of their lab time resolving errors than stronger students. The Robust Relative algorithm was developed to hybridize four of the strongest data-driven metrics into a performance predictor. The novel relative scoring of students based upon how their resolve times for different types of errors compared to the resolve times of their peers resulted in a predictor which could explain a large proportion of the variance in the performance of three independent cohorts, R² = 42.19%, 43.65% and 44.17% - almost double the variance which could be explained by Jadud's Error Quotient metric.
    The fourth study situated the findings of this thesis within the wider literature, by applying meta-analysis techniques to statistically synthesise fifty years of conflicting research, such that the most important factors for learning programming could be identified. 482 results describing the effects of 116 factors on programming performance were synthesised and consolidated to form a six-class theoretical framework. The results showed that the strongest predictors identified over the past fifty years are data-driven metrics based upon programming behaviour. Several of the traditional predictors were also found to be influential, suggesting that both a certain level of scientific maturity and self-concept are necessary for programming. Two thirds of the weakest predictors were based upon demographic and psychological factors, suggesting that age, gender, self-perceived abilities, learning styles, and personality traits have no relevance for programming performance. This thesis argues that factors based upon traditional learning theories struggle to consistently predict programming performance across different teaching contexts because they were not intended to be applied for this purpose. In contrast, the main advantage of using data-driven approaches to derive metrics based upon students' programming processes is that these metrics are directly based upon the programming behaviours of students, and therefore can encapsulate changes in their programming knowledge over time. Researchers should continue to explore data-driven predictors in the future.

    Approaches to Support Student Learning in Introductory Programming Laboratory Classes

    Objectives: This thesis will explore some innovative solutions to communication difficulties that exist in higher education teaching of introductory programming. Communication between a teacher and student is important, as it is the main opportunity for a student to ask a teacher questions about a particular problem they have, and for a teacher to give feedback that directs them towards a solution. It is expected that through utilising technology in laboratory practical classes, communication between teachers and students can be improved. Methods: This thesis primarily explores the possibilities of using student compiler and method invocation data, collected during a practical class and sent directly to a teacher. This data may be beneficial as a method of allowing teachers to see if a student requires help. This thesis utilises a variety of research methods including questionnaires, observations of classroom interactions and collection of data recorded from student and teacher interactions with the technology. The approaches are used during an investigation into the current approaches of laboratory practical teaching, before progressing onto investigations using the technology developed that accompanies this thesis. Results: The results identified that the majority of the students and teachers who used the technology felt that it improved their ability to communicate within laboratory practical classes. The teachers felt that they could use the data collected by the technology to view activity from the students and see a student’s progress. The teachers could interpret the data collected from the technology, and students who needed help could be identified. Conclusions: This thesis has demonstrated that technology has the potential to improve communication in laboratory classes, and enable teachers to support students more effectively.
    However, the technology developed in this thesis does not eliminate the requirement for a teacher to interact with a student face-to-face; rather, its role is to act as an indicator of students who may need assistance.