
    Tackling Version Management and Reproducibility in MLOps

    The growing adoption of machine learning solutions requires advancements in applying best practices to maintain artificial intelligence systems in production. Machine Learning Operations (MLOps) incorporates DevOps principles into machine learning development, promoting automation, continuous delivery, monitoring, and training capabilities. Due to multiple factors, such as the experimental nature of the machine learning process or the need for model optimizations derived from changes in business needs, data scientists are expected to create multiple experiments to develop a model or predictor that satisfactorily addresses the main challenges of a given problem. Since the re-evaluation of models is a constant need, metadata is constantly produced due to multiple experiment runs. This metadata is known as ML artifacts or assets. The proper lineage between these artifacts enables environment recreation, facilitating model reproducibility. Linking information from experiments, models, datasets, configurations, and code changes requires proper organization, tracking, maintenance, and version control of these artifacts. This work will investigate the best practices, current issues, and open challenges related to artifact versioning and management and apply this knowledge to develop an ML workflow that supports ML engineering and operationalization, applying MLOps principles that facilitate model reproducibility. Scenarios covering data preparation, model generation, comparison between model versions, deployment, monitoring, debugging, and retraining demonstrated how the selected frameworks and tools could be integrated to achieve that goal.

    Moving hands-on mechanical engineering experiences online: Course redesigns and student perspectives

    Hands-on lab experiences are essential for enabling students to be successful engineers, especially those who identify as kinesthetic learners. This case study describes how a Mechanical Engineering Practice course sequence was redesigned during the COVID-19 emergency transition to remote learning and examines how students responded to these changes. The remote course included videos of Graduate Teaching Assistants conducting data acquisition phases of the practice session to replace hands-on experiments. To understand student perspectives and performance, researchers reviewed approximately 400 reflective essays from Spring 2020 and compared assignment submissions between Fall 2019 and Spring 2020. Results suggest that some students perceived the loss of hands-on activities as detrimental to their learning and felt that the remote course was not comparable to its face-to-face counterpart. Furthermore, students felt forced to develop self-directed learning skills. However, in contrast to student comments in reflective essays, comparisons of assignment submissions suggested that students in Spring 2020 did not receive lower grades or demonstrate less of the conceptual knowledge obtained in the course.

    A Pattern Approach to Understand Group Collaboration in Hands-on and Remote Laboratories

    We identify patterns of group collaboration within hands-on and remote laboratories. The pattern of group collaboration includes three elements: the collaboration mode, the communication medium, and the collaboration structure. In addition, we examine how patterns of group collaboration evolved during different phases of the labs. Based upon our observation of 22 engineering students, we found two common patterns of the collaboration mode in both hands-on labs and remote labs: in one case, students seem to minimize cognitive effort, and in the other, they continue to do what they have been doing before. We also describe the different types of communication media and collaboration structure in the two labs. Face-to-face meetings were found to be the dominant method of group communication in both labs, but students adopted a wider variety of communication methods when working with remote labs, and they interacted more with each other when they ran remote labs.

    JISC Preservation of Web Resources (PoWR) Handbook

    Handbook of Web Preservation produced by the JISC-PoWR project, which ran from April to November 2008. The handbook specifically addresses digital preservation issues that are relevant to the UK HE/FE web management community. The project was undertaken jointly by UKOLN at the University of Bath and ULCC Digital Archives department.

    High Energy Physics Forum for Computational Excellence: Working Group Reports (I. Applications Software II. Software Libraries and Tools III. Systems)

    Computing plays an essential role in all aspects of high energy physics. As computational technology evolves rapidly in new directions, and data throughput and volume continue to follow a steep trend-line, it is important for the HEP community to develop an effective response to a series of expected challenges. In order to help shape the desired response, the HEP Forum for Computational Excellence (HEP-FCE) initiated a roadmap planning activity with two key overlapping drivers -- 1) software effectiveness, and 2) infrastructure and expertise advancement. The HEP-FCE formed three working groups, 1) Applications Software, 2) Software Libraries and Tools, and 3) Systems (including systems software), to provide an overview of the current status of HEP computing and to present findings and opportunities for the desired HEP computational roadmap. The final versions of the reports are combined in this document, and are presented along with introductory material. Comment: 72 pages.

    Curriculum Guidelines for Undergraduate Programs in Data Science

    The Park City Math Institute (PCMI) 2016 Summer Undergraduate Faculty Program met for the purpose of composing guidelines for undergraduate programs in Data Science. The group consisted of 25 undergraduate faculty from a variety of institutions in the U.S., primarily from the disciplines of mathematics, statistics, and computer science. These guidelines are meant to provide some structure for institutions planning for or revising a major in Data Science.

    Proceedings of the ECSCW'95 Workshop on the Role of Version Control in CSCW Applications

    The workshop entitled "The Role of Version Control in Computer Supported Cooperative Work Applications" was held on September 10, 1995 in Stockholm, Sweden in conjunction with the ECSCW'95 conference. Version control, the ability to manage relationships between successive instances of artifacts, organize those instances into meaningful structures, and support navigation and other operations on those structures, is an important problem in CSCW applications. It has long been recognized as a critical issue for inherently cooperative tasks such as software engineering, technical documentation, and authoring. The primary challenge for versioning in these areas is to support opportunistic, open-ended design processes requiring the preservation of historical perspectives in the design process, the reuse of previous designs, and the exploitation of alternative designs. The primary goal of this workshop was to bring together a diverse group of individuals interested in examining the role of versioning in Computer Supported Cooperative Work. Participation was encouraged from members of the research community currently investigating the versioning process in CSCW as well as application designers and developers who are familiar with the real-world requirements for versioning in CSCW. Both groups were represented at the workshop resulting in an exchange of ideas and information that helped to familiarize developers with the most recent research results in the area, and to provide researchers with an updated view of the needs and challenges faced by application developers. In preparing for this workshop, the organizers were able to build upon the results of their previous one entitled "The Workshop on Versioning in Hypertext" held in conjunction with the ECHT'94 conference. The following section of this report contains a summary in which the workshop organizers report the major results of the workshop. 
The summary is followed by a section that contains the position papers that were accepted to the workshop. The position papers provide more detailed information describing recent research efforts of the workshop participants as well as current challenges that are being encountered in the development of CSCW applications. A list of workshop participants is provided at the end of the report. The organizers would like to thank all of the participants for their contributions which were, of course, vital to the success of the workshop. We would also like to thank the ECSCW'95 conference organizers for providing a forum in which this workshop was possible.