2,811 research outputs found

    Expanding the scope of statistical computing: Training statisticians to be software engineers

    Traditionally, statistical computing courses have taught the syntax of a particular programming language or specific statistical computation methods. Since the publication of Nolan and Temple Lang (2010), we have seen a greater emphasis on data wrangling, reproducible research, and visualization. This shift better prepares students for careers working with complex datasets and producing analyses for multiple audiences. But, we argue, statisticians are now often called upon to develop not just analyses but statistical software, such as R packages implementing new analysis methods or machine learning systems integrated into commercial products. This demands different skills. We describe a graduate course that we developed to meet this need by focusing on four themes: programming practices; software design; important algorithms and data structures; and essential tools and methods. Through code review and revision, and a semester-long software project, students practice all the skills of software engineering. The course allows students to expand their understanding of computing as applied to statistical problems while building expertise in the kind of software development that is increasingly the province of the working statistician. We see this as a model for the future evolution of the computing curriculum in statistics and data science. Comment: 22 pages

    Complete Issue 24, 2001

    The importance of good coding practices for data scientists

    Many data science students and practitioners are reluctant to adopt good coding practices as long as the code "works". However, code standards are an important part of modern data science practice, and they play an essential role in the development of "data acumen". Good coding practices lead to more reliable code and often save more time than they cost, making them important even for beginners. We believe that principled coding practices are vital for statistics and data science. To instill these practices within academic programs, it is important for instructors and programs to begin establishing them early, to reinforce them often, and to hold themselves to a higher standard while guiding students. We describe key aspects of coding practices (both good and bad), focusing primarily on the R language, though similar standards are applicable to other software environments. The lessons are organized into a top ten list.

    Automatic feedback and assessment of team-coding assignments in a DevOps context

    We describe an automated assessment process for team-coding assignments based on DevOps best practices. The system and methodology include the definition of Team Performance Metrics, which measure properties of the software developed by each team and the team's correct use of DevOps techniques. The system tracks each group's progress on each metric. The methodology also defines Individual Performance Metrics to measure the impact of individual student contributions on increases in the Team Performance Metrics. Periodically scheduled reports using these metrics provide students with valuable feedback. The process also facilitates assessment of the assignments. Although this method is not intended to produce the final grade of each student, it provides very valuable information to the lecturers. We have used it as the main source of information for student and team assessment in one programming course. Additionally, we use other assessment methods to calculate the final grade: written conceptual tests to check students' understanding of the development processes, and cross-evaluations. Qualitative evaluations from students who filled in the relevant questionnaires are very positive and encouraging. Open Access funding provided thanks to the CRUE-CSIC agreement with Springer Nature.

    HydroShare – A Case Study of the Application of Modern Software Engineering to a Large Distributed Federally-Funded Scientific Software Development Project

    HydroShare is an online collaborative system under development to support the open sharing of hydrologic data, analytical tools, and computer models. With HydroShare, scientists can easily discover, access, and analyze hydrologic data and thereby enhance the production and reproducibility of hydrologic scientific results. HydroShare also takes advantage of emerging social media functionality to enable users to enhance information about and collaboration around hydrologic data and models. HydroShare is being developed by an interdisciplinary collaborative team of domain scientists, university software developers, and professional software engineers from ten institutions located across the United States. While the combination of non-co-located, diverse stakeholders presents communication and management challenges, the interdisciplinary nature of the team is integral to the project’s goal of improving scientific software development and capabilities in academia. This chapter describes the challenges faced and lessons learned with the development of HydroShare, as well as the approach to software development that the HydroShare team adopted on the basis of the lessons learned. The chapter closes with recommendations for applying modern software engineering techniques to large, collaborative, scientific software development projects similar to the National Science Foundation (NSF)-funded HydroShare, so that other teams can successfully apply the approach described herein to other projects.

    From software engineering to courseware engineering

    Proceedings of: 2016 IEEE Global Engineering Education Conference (EDUCON), 10-13 April 2016, Abu Dhabi, United Arab Emirates. The appearance of MOOCs has contributed to the use of educational technology in new contexts. As a consequence, many teachers face the challenge of creating educational content (courseware) to be offered in MOOCs. Although some best practices exist, most of the content is being developed without much thought about adequacy, reusability, maintainability, composability, etc. The main thesis of this paper is that we are facing a "courseware crisis" just as there was a "software crisis" 50 years ago, and that the way out is to identify good engineering discipline to aid in the development of courseware. We need Courseware Engineering in the same way that we needed Software Engineering at that time. Therefore, the challenge is now to define and develop fundamentals, tools, and methods of Courseware Engineering, as an analogy to the fundamentals, tools, and methods that were developed in Software Engineering. The eMadrid Excellence Network is being funded by the Madrid Regional Government (Comunidad de Madrid) with grant No. S2013/ICE-2715. This work also received partial support from the Spanish Ministry of Economy and Competitiveness Project RESET (TIN2014-53199-C3-1-R) and from the European Erasmus+ projects MOOC-Maker (561533-EPP-1-2015-1-ES-EPPKA2-CBHE-JP) and SHEILA (562080-EPP-1-2015-BE-EPPKA3-PI-FORWARD). The first author would like to acknowledge fruitful discussions with Martin Wirsing and his group from LMU München during his research stay at this university with a scholarship from the Spanish Ministry of Education, Culture, and Sport.

    N.O.V.I.: Note Organizer for the Visually Impaired

    Visually impaired students face extra challenges when it comes to the basic necessity of note-taking. Current assistive technology is fragmented in function. These students often need to combine solutions such as voice recording lectures, hiring someone to transcribe notes to braille, hiring a reader, etc. The amount of time and money they need for these solutions proves to be a great disadvantage, and we wish to provide an easier solution that will give these students a more independent and productive learning experience. Our solution is an application that can offer intuitive, convenient, and comprehensive access to notes for the visually impaired. We would implement our solution through an iOS mobile application that would allow users to upload handwritten or text notes so that they could be organized and accessed through voice prompts. Users with visual disabilities would then have a platform to store and organize their notes in order to increase their ability to learn independently.

    Education Research Using Data Mining and Machine Learning with Computer Science Undergraduates

    In recent decades, we have witnessed an explosion of technology use and its integration into everyday life. The engine of technology application in every aspect of life is Computer Science (CS). Providing appropriate CS education to meet workforce demand for graduates is a broad and challenging problem facing many universities. Research into this 'supply-chain' problem is a central focus of CS education research. Recently, Educational Data Mining (EDM) has emerged as an area connected to CS education research, with the goal of helping students stay in their program, improve their performance, and graduate with a degree. We contribute to this work with several research studies, and with proposed future work, focusing on CS undergraduate students' program success and course performance, analyzed through the lens of data mining. We perform research into student success predictors beyond diversity and gender. We examine student behaviors in course load and completion. We study workforce readiness by creating a new teaching strategy and deploying it in the classroom; the analysis shows us which Software Engineering (SE) topics are relevant for computing jobs. We look at cognitive learning in the beginning CS course and its relation to course performance. We use decision tree machine learning algorithms to predict student success or failure in CS core courses using performance and semester span of the core curriculum. These research areas refine pathways for CS course sequencing to improve retention, reduce time-to-graduation, and increase success in the workplace.