239 research outputs found

    Scalable Declarative HEP Analysis Workflows for Containerised Compute Clouds

    We describe a novel approach to experimental High-Energy Physics (HEP) data analyses centred on the declarative rather than the imperative paradigm for describing analysis computational tasks. The analysis process can be structured as a Directed Acyclic Graph (DAG), where each vertex represents a unit of computation with its inputs and outputs, and the edges describe the interconnection of the computational steps. We have developed REANA, a platform for reproducible data analyses that supports several such DAG workflow specifications. The REANA platform parses the analysis workflow and dispatches its computational steps to the supported computing backends (Kubernetes, HTCondor, Slurm). The focus on declarative rather than imperative programming lets researchers concentrate on the problem domain at hand without having to think about implementation details such as scalable job orchestration. The declarative approach is further exemplified by a multi-level job cascading paradigm implemented in the Yadage workflow specification language. We present two recent LHC particle physics analyses, ATLAS searches for dark matter and CMS jet energy correction pipelines, in which the declarative approach was successfully applied. We argue that the declarative approach to data analyses, combined with recent advances in container technology, facilitates the portability of computational data analyses to various compute backends, enhancing the reproducibility and knowledge preservation of particle physics data analyses.
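    The DAG execution model described in the abstract can be sketched in a few lines of Python. The step names, fields, and commands below are illustrative assumptions, not REANA's or Yadage's actual specification syntax; the point is only that a declarative step-plus-dependencies description is enough for a platform to derive a valid dispatch order.

    ```python
    from graphlib import TopologicalSorter  # stdlib, Python 3.9+

    # Hypothetical declarative analysis description: each step names the
    # steps it depends on and the command it would run, nothing more.
    steps = {
        "skim":   {"depends_on": [],               "command": "skim.py raw.root"},
        "fit":    {"depends_on": ["skim"],         "command": "fit.py skimmed.root"},
        "plot":   {"depends_on": ["fit"],          "command": "plot.py fit.json"},
        "report": {"depends_on": ["fit", "plot"],  "command": "report.py"},
    }

    def dispatch_order(steps):
        """Return the step names in an order where every dependency runs first."""
        ts = TopologicalSorter({name: s["depends_on"] for name, s in steps.items()})
        return list(ts.static_order())

    print(dispatch_order(steps))  # ['skim', 'fit', 'plot', 'report']
    ```

    A real platform would hand each step, in this order (or in parallel within each DAG generation), to a backend scheduler such as Kubernetes; the researcher only writes the declarative description.
    
    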

    Theoretical and technological building blocks for an innovation accelerator

    The scientific system we use today was devised centuries ago and is inadequate for our current ICT-based society: the peer-review system encourages conservatism, journal publications are monolithic and slow, data are often not available to other scientists, and the independent validation of results is limited. Building on the Innovation Accelerator paper by Helbing and Balietti (2011), this paper takes that initial global vision and reviews the theoretical and technological building blocks that can be used to implement an innovation (first of all, science) accelerator platform driven by re-imagining the science system. The envisioned platform would rest on four pillars: (i) redesign the incentive scheme to reduce behaviour such as conservatism, herding, and hyping; (ii) advance scientific publications by breaking up the monolithic paper unit and introducing other building blocks such as data, tools, experiment workflows, and resources; (iii) use machine-readable semantics for publications, debate structures, provenance, etc., in order to include the computer as a partner in the scientific process; and (iv) build an online platform for collaboration, including a network of trust and reputation among the different types of stakeholders in the scientific system: scientists, educators, funding agencies, policy makers, students, and industrial innovators, among others. Any such improvements to the scientific system must support the entire scientific process (unlike current tools, which chop it up into disconnected pieces), must facilitate and encourage collaboration and interdisciplinarity (again unlike current tools), must facilitate the inclusion of intelligent computing in the scientific process, and must accommodate not only the core scientific process but also other stakeholders such as science policy makers, industrial innovators, and the general public.

    The Healthgrid White Paper


    At the crossroads of big science, open science, and technology transfer

    Big science infrastructures are confronting increasing demands for public accountability, not only for their contribution to scientific discovery but also for their capacity to generate secondary economic value. To build and operate their sophisticated infrastructures, big science centres often generate frontier technologies by designing and building technical solutions to complex and unprecedented engineering problems. In parallel, the previous decade has seen the disruption of rapid technological changes affecting the way science is done and shared, which has led to the coining of the concept of Open Science (OS). Governments are quickly moving towards the OS paradigm and asking big science centres to "open up" the scientific process. Yet these two forces run in opposition, as the commercialization of scientific outputs usually requires significant financial investment, and companies are willing to bear this cost only if they can protect the innovation from imitation or unfair competition. This PhD dissertation aims to understand how new applications of ICT are affecting primary research outcomes and the resultant technology transfer in the context of big science and OS. It attempts to uncover the tensions between these two normative forces and to identify the mechanisms employed to overcome them.
    The dissertation comprises four studies: 1) a mixed-method study combining two large-scale global online surveys of research scientists (2016, 2018) with two case studies of scientific communities in high-energy physics and molecular biology, assessing the explanatory factors behind scientific data-sharing practices; 2) a case study of Open Targets, an information infrastructure based upon data commons, where the European Molecular Biology Laboratory-EBI and pharmaceutical companies collaborate and share scientific data and technological tools to accelerate drug discovery; 3) a study of a unique dataset of 170 projects funded under ATTRACT, a novel policy instrument of the European Commission led by European big science infrastructures, which aims to understand the nature of the serendipitous process behind transitioning big science technologies to previously unanticipated commercial applications; and 4) a case study of White Rabbit, sophisticated open-source hardware developed at the European Council for Nuclear Research (CERN) in collaboration with an extensive ecosystem of companies.

    Discovering the signalling cues involved in early human brain development
