16 research outputs found

    Guix-HPC Activity Report 2017–2018

    Guix-HPC is a collaborative effort to bring reproducible software deployment to scientific workflows and high-performance computing (HPC). Guix-HPC builds upon the GNU Guix software deployment tool and aims to make it a better tool for HPC practitioners and scientists concerned with reproducible research. Guix-HPC was launched in September 2017 as a joint software development project involving three research institutes: Inria, the Max Delbrück Center for Molecular Medicine (MDC), and the Utrecht Bioinformatics Center (UBC). GNU Guix for HPC and reproducible science has received contributions from additional individuals and organizations, including Cray, Inc. and Tourbillion Technology. This report highlights key achievements of Guix-HPC between its launch in September 2017 and today, February 2019.
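    A minimal sketch of what this reproducibility buys in practice, assuming SSH access and placeholder hostnames (neither appears in the report): because Guix pins software environments to exact channel commits, one can check that every node of a cluster deploys from the same provenance by comparing the output of `guix describe -f channels`.

        #!/usr/bin/env python3
        """Hedged sketch: verify that all cluster nodes run Guix from the
        same pinned channels, so deployments are comparable across nodes.
        Hostnames are placeholders, not taken from the report."""
        import subprocess

        NODES = ["node01", "node02"]  # hypothetical cluster nodes

        def describe(host: str) -> str:
            # `guix describe -f channels` prints the channel commits that
            # produced this Guix; identical output means identical provenance.
            result = subprocess.run(
                ["ssh", host, "guix", "describe", "-f", "channels"],
                capture_output=True, text=True, check=True)
            return result.stdout

        if __name__ == "__main__":
            reference = describe(NODES[0])
            for node in NODES[1:]:
                status = "ok" if describe(node) == reference else "DIVERGES"
                print(f"{node}: {status}")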

    Contribution to the convergence of infrastructure between high-performance computing and large-scale data processing

    The amount of produced data, either in the scientific community or the commercial world, is constantly growing. The field of Big Data has emerged to handle large amounts of data on distributed computing infrastructures. High-Performance Computing (HPC) infrastructures are traditionally used for the execution of compute-intensive workloads. However, the HPC community is also facing an increasing need to process large amounts of data derived from high-definition sensors and large physics apparatus. The convergence of the two fields, HPC and Big Data, is currently taking place. In fact, the HPC community already uses Big Data tools, which are not always integrated correctly, especially at the level of the file system and the Resource and Job Management System (RJMS). In order to understand how we can leverage HPC clusters for Big Data usage, and what the challenges are for HPC infrastructures, we have studied multiple aspects of the convergence. We initially provide a survey on software provisioning methods, with a focus on data-intensive applications. We contribute a new RJMS collaboration technique called BeBiDa, which is based on 50 lines of code whereas similar solutions use at least 1000 times more. We evaluate this mechanism under real conditions and in a simulated environment with our simulator Batsim. Furthermore, we provide extensions to Batsim to support I/O, and showcase the development of a generic file system model along with a Big Data application model. This allows us to complement the BeBiDa real-condition experiments with simulations while enabling us to study file system dimensioning and trade-offs. All the experiments and analyses in this work have been done with reproducibility in mind. Based on this experience, we propose to integrate the development workflow and data analysis into the reproducibility mindset, and give feedback on our experiences with a list of best practices.
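    The abstract does not spell out BeBiDa's mechanism, but its "50 lines of code" points at lightweight RJMS hooks. Below is a hedged Python sketch of one plausible shape of such a collaboration, assuming a YARN-based Big Data stack and an RJMS with prolog/epilog hooks; the exclude-file path is a placeholder, while `yarn rmadmin -refreshNodes` is the stock Hadoop command for re-reading node lists.

        #!/usr/bin/env python3
        """Hedged sketch of an RJMS prolog/epilog pair in the spirit of
        BeBiDa: the HPC scheduler keeps priority and a YARN-based Big Data
        framework uses only the idle nodes. Paths are illustrative."""
        import subprocess
        import sys

        EXCLUDE_FILE = "/etc/hadoop/yarn.exclude"  # placeholder path

        def refresh_yarn():
            # Ask the YARN ResourceManager to re-read its include/exclude
            # lists (stock Hadoop command).
            subprocess.run(["yarn", "rmadmin", "-refreshNodes"], check=True)

        def prolog(nodes):
            # Runs before an HPC job starts: evict YARN from the job's
            # nodes so the HPC job gets them exclusively.
            with open(EXCLUDE_FILE, "a") as f:
                f.writelines(n + "\n" for n in nodes)
            refresh_yarn()

        def epilog(nodes):
            # Runs after the HPC job ends: hand the nodes back to YARN.
            gone = set(nodes)
            with open(EXCLUDE_FILE) as f:
                keep = [line for line in f if line.strip() not in gone]
            with open(EXCLUDE_FILE, "w") as f:
                f.writelines(keep)
            refresh_yarn()

        if __name__ == "__main__":
            action, nodes = sys.argv[1], sys.argv[2:]
            prolog(nodes) if action == "prolog" else epilog(nodes)

    The design choice this illustrates: the HPC scheduler retains strict priority, and Big Data jobs run best-effort on whatever nodes the prolog has not claimed.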

    It’s like flossing your teeth: On the Importance and Challenges of Reproducible Builds for Software Supply Chain Security

    The 2020 SolarWinds attack was a tipping point that caused a heightened awareness about the security of the software supply chain and, in particular, the large amount of trust placed in build systems. Reproducible Builds (R-Bs) provide a strong foundation for defending against arbitrary attacks on build systems by ensuring that, given the same source code, build environment, and build instructions, bitwise-identical artifacts are created. Unfortunately, much of the software industry believes R-Bs are too far out of reach for most projects. The goal of this paper is to help identify a path for R-Bs to become a commonplace property. To this end, we conducted a series of 24 semi-structured expert interviews with participants from the Reproducible-Builds.org project, and iterated on our questions with the reproducible builds community. We identified a range of motivations that can encourage open source developers to strive for R-Bs, including indicators of quality, security benefits, and more efficient caching of artifacts. We identify experiences that help and hinder adoption, prominent among them communication with upstream projects. We conclude with recommendations on how to better integrate R-Bs with the efforts of the open source and free software community.
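    The property the paper builds on can be stated in a few lines of code. The following is a minimal sketch, with a placeholder build command and artifact path: build the same source twice and require bitwise-identical output.

        #!/usr/bin/env python3
        """Minimal sketch of the R-B check, assuming a placeholder build
        command and artifact path: build twice, compare SHA-256 digests."""
        import hashlib
        import shutil
        import subprocess
        from pathlib import Path

        BUILD_CMD = ["make", "dist"]         # placeholder build instructions
        ARTIFACT = Path("dist/app.tar.gz")   # placeholder artifact

        def build_and_hash(workdir: Path) -> str:
            # Run the build in `workdir` and hash the resulting artifact.
            subprocess.run(BUILD_CMD, cwd=workdir, check=True)
            data = (workdir / ARTIFACT).read_bytes()
            return hashlib.sha256(data).hexdigest()

        def is_reproducible(source: Path) -> bool:
            # Two independent builds of the same tree must agree bit for bit.
            digests = []
            for i in (1, 2):
                copy = Path(f"/tmp/build-{i}")
                shutil.copytree(source, copy, dirs_exist_ok=True)
                digests.append(build_and_hash(copy))
            return digests[0] == digests[1]

        if __name__ == "__main__":
            print("reproducible" if is_reproducible(Path(".")) else "differs")

    In practice the community's tooling goes further: reprotest deliberately varies build-environment details such as time, locale, and paths between the two builds, and diffoscope explains any difference found.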

    The low surface brightness Universe: the last frontier.

    The study of the low surface brightness Universe represents one of the greatest opportunities for discovery in present-day astronomy. However, traditional techniques for reducing and processing astronomical data cannot deliver images in which low surface brightness structures can be analysed and studied in sufficient detail. Going deeper than 30 mag/arcsec² means observing structures roughly 1500 times fainter than the darkest sky on Earth, and multiple systematic effects make this very difficult (for example, flat-fielding, over-subtraction of the sky background, internal reflections, scattered light, Galactic cirri, etc.). This thesis carries out an exploration of the low surface brightness Universe with special emphasis on the more technical aspects. To correct the systematic effects and study low surface brightness structures, many techniques have been developed, improved, and applied to a large amount of data from very different telescopes. Particular attention has been paid to the effect of the point spread function (PSF) in order to correct scattered light in astronomical images. A notable result is the detection of a very faint tidal stellar structure of the galaxy NGC 1052-DF4. This reveals that NGC 1052-DF4 is interacting with another nearby galaxy, thus explaining its surprising lack of dark matter. The techniques and tools developed are intended to be used with the next generation of telescopes and deep surveys in order to improve data quality. Alongside this thesis, a program for carrying out reproducible research has also been developed and matured: Maneage (from Managing data Lineage). The goal of Maneage is to provide a fully controlled environment for conducting reproducible scientific studies. In this respect, virtually all of the reduction and analysis of the astronomical data in this thesis has been carried out using Maneage.
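    As a back-of-the-envelope check of the "roughly 1500 times fainter" figure, the magnitude scale converts a surface brightness difference into a flux ratio. The sketch below assumes a dark-sky level of about 22 mag/arcsec², a typical value that the abstract does not state explicitly.

        """Back-of-the-envelope for the '~1500x fainter' figure; the assumed
        dark-sky level of 22 mag/arcsec^2 is not stated in the abstract."""

        def flux_ratio(mu_faint: float, mu_bright: float) -> float:
            # Magnitudes are logarithmic: a difference of Delta-mu magnitudes
            # corresponds to a flux ratio of 10**(0.4 * Delta-mu).
            return 10 ** (0.4 * (mu_faint - mu_bright))

        SKY = 22.0      # assumed dark night-sky brightness, mag/arcsec^2
        TARGET = 30.0   # depth targeted in the thesis, mag/arcsec^2
        print(f"~{flux_ratio(TARGET, SKY):.0f}x fainter than the sky")  # ~1585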

    Enabling the Future, or How to Survive FOREVER. A study of networks, processes and ambiguity in net art and the need for an expanded practice of conservation.

    Net art is one of the most viewed and experienced artforms, yet some net artworks stop functioning in less than five months. At the heart of this research lies the question of net art’s survival. While net art is hardly accounted for in museum collections – the traditional keepers of cultural heritage – this dissertation explores the material and behaviour of net art. Using a broad range of interdisciplinary resources, the chapters open up key theoretical issues that rethink museum practices. Among others, this includes notions of authenticity, authorship, documentation and documents, networks, open source, performativity and the processual. Arguing for the need to reconsider traditional attitudes in museums and notions of static conservation, as well as acknowledging decentralised and community-based approaches, this dissertation describes an expanded practice of conservation in the computational age. It shows how net art operates through often imperceptible or ambiguous performances of processes and is networked in various ways. It then examines the way these strategies are used and fold back into notions of authenticity, documentation and variability. It is in addressing and answering some of the challenges facing net art that this dissertation makes a distinctive contribution to the field of conservation and curatorial studies, as well as to cultural and museum analysis. At the same time, an exploration of net art’s intersections with conservation puts studies on net art into a new perspective. Consequently, the study enables more informed decisions when responding to, critically analysing or working with net art, in particular software-based processes. Surviving FOREVER means embracing rather than fearing ephemerality, loss and obsolescence.

    Fòrum de Recerca no. 18. Complete issue.

    XVIII Jornades de Foment de la Investigació (Research Promotion Conference) of the Facultat de Ciències Humanes i Socials (2013).

    Challenges for engineering students working with authentic complex problems

    Engineers are important participants in solving societal, environmental and technical problems. However, the increasing complexity of these problems calls for new interdisciplinary competences in engineering. Instead of students working on monodisciplinary problems, a setting in which students work on authentic complex problems in interdisciplinary teams together with a company may scaffold the development of such competences. The question is: what are the challenges for students structuring their work on authentic interdisciplinary problems? This study explores a three-day event in which seven students from Aalborg University (AAU), drawn from four different faculties, and one student from University College North Denmark (UCN), all in their 6th to 10th semester, worked in two groups at a large Danish company on authentic complex problems. The event was structured as a hackathon in which the students spent three days on problem identification and problem analysis, finishing with a pitch competition where they presented their findings. During the event the students attended workshops to support their work, and they had the opportunity to use employees from the company as facilitators. It was an extracurricular activity during the summer holiday season. Data collection was qualitative, based on observations and the participants’ reflection reports, and the students were observed throughout the event. Findings from this part of a larger study indicate that students struggle to transfer and transform project competences from their previous disciplinary experiences to an interdisciplinary setting.