kiwiPy: Robust, high-volume, messaging for big-data and computational science workflows
In this work we present kiwiPy, a Python library designed to support robust message-based communication for high-throughput, big-data applications, while being general enough to be useful wherever high volumes of messages need to be communicated in a predictable manner. KiwiPy relies on RabbitMQ, an industry-standard message broker, while providing a simple and intuitive interface that can be used in both multithreaded and coroutine-based applications. To demonstrate some of kiwiPy's functionality we give examples from AiiDA, a high-throughput simulation platform, where kiwiPy is used as a key component of the workflow engine.
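To illustrate the kind of interface described above, here is a minimal RPC round trip sketched against kiwiPy's thread-based communicator; the call names and connection parameters follow the kiwiPy documentation as far as we recall them and should be treated as assumptions to verify against the installed version.

```python
# Hedged sketch: a simple RPC round trip through RabbitMQ using kiwiPy's
# thread communicator. Names (RmqThreadCommunicator.connect, add_rpc_subscriber,
# rpc_send) and the broker URL are assumptions to check against the kiwiPy docs.
import kiwipy.rmq

def fib(_comm, n):
    """RPC target; subscriber callbacks receive the communicator as first argument."""
    return n if n < 2 else fib(_comm, n - 1) + fib(_comm, n - 2)

comm = kiwipy.rmq.RmqThreadCommunicator.connect(
    connection_params={'url': 'amqp://guest:guest@127.0.0.1/'})
try:
    comm.add_rpc_subscriber(fib, 'fib')   # register the remote procedure
    future = comm.rpc_send('fib', 10)     # returns a future holding the reply
    print(future.result())                # -> 55 (may itself be a future in some versions)
finally:
    comm.close()                          # release the broker connection
```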
How to verify the precision of density-functional-theory implementations via reproducible and universal workflows
In the past decades many density-functional theory methods and codes adopting
periodic boundary conditions have been developed and are now extensively used
in condensed matter physics and materials science research. Only in 2016,
however, their precision (i.e., the extent to which properties computed with
different codes agree with each other) was systematically assessed on
elemental crystals: a first crucial step to evaluate the reliability of such
computations. We discuss here general recommendations for verification studies
aiming at further testing precision and transferability of
density-functional-theory computational approaches and codes. We illustrate
such recommendations using a greatly expanded protocol covering the whole
periodic table from Z=1 to 96 and characterizing 10 prototypical cubic
compounds for each element: 4 unaries and 6 oxides, spanning a wide range of
coordination numbers and oxidation states. The primary outcome is a reference
dataset of 960 equations of state cross-checked between two all-electron codes,
then used to verify and improve nine pseudopotential-based approaches. Such
effort is facilitated by deploying AiiDA common workflows that perform
automatic input parameter selection, provide identical input/output interfaces
across codes, and ensure full reproducibility. Finally, we discuss the extent
to which the current results for total energies can be reused for different
goals (e.g., obtaining formation energies).
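As an illustration of the equation-of-state comparison the abstract refers to, the sketch below fits energy-volume data to a third-order Birch-Murnaghan form; the abstract does not specify the functional form or tooling, so the choice of Birch-Murnaghan and of SciPy here is an assumption, and the data are invented.

```python
# Hedged sketch: fit E(V) points to a third-order Birch-Murnaghan equation of state.
# The EOS form and the use of scipy.optimize are illustrative assumptions, not
# necessarily the procedure used in the study.
import numpy as np
from scipy.optimize import curve_fit

def birch_murnaghan(V, E0, V0, B0, B0p):
    """Third-order Birch-Murnaghan energy-volume relation."""
    eta = (V0 / V) ** (2.0 / 3.0)
    return E0 + 9.0 * V0 * B0 / 16.0 * (
        (eta - 1.0) ** 3 * B0p + (eta - 1.0) ** 2 * (6.0 - 4.0 * eta)
    )

# Hypothetical energy-volume data (eV, Angstrom^3), for illustration only.
volumes = np.array([18.0, 19.0, 20.0, 21.0, 22.0])
energies = np.array([-10.10, -10.25, -10.30, -10.27, -10.18])

p0 = [energies.min(), volumes[np.argmin(energies)], 0.5, 4.0]  # initial guess
(E0, V0, B0, B0p), _ = curve_fit(birch_murnaghan, volumes, energies, p0=p0)
print(f"V0 = {V0:.2f} A^3, B0 = {B0:.3f} eV/A^3, B0' = {B0p:.2f}")
```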
Common workflows for computing material properties using different quantum engines
The prediction of material properties based on density-functional theory has become routinely common, thanks, in part, to the steady increase in the number and robustness of available simulation packages. This plurality of codes and methods is both a boon and a burden. While providing great opportunities for cross-verification, these packages adopt different methods, algorithms, and paradigms, making it challenging to choose, master, and efficiently use them. We demonstrate how developing common interfaces for workflows that automatically compute material properties greatly simplifies interoperability and cross-verification. We introduce design rules for reusable, code-agnostic workflow interfaces to compute well-defined material properties, which we implement for eleven quantum engines and use to compute various material properties. Each implementation encodes carefully selected simulation parameters and workflow logic, making the implementer's expertise with the quantum engine directly available to non-experts. All workflows are made available as open source, and full reproducibility of the workflows is guaranteed through the use of the AiiDA infrastructure.

This work is supported by the MARVEL National Centre of Competence in Research (NCCR) funded by the Swiss National Science Foundation (grant agreement ID 51NF40-182892) and by the European Union's Horizon 2020 research and innovation program under Grant Agreement No. 824143 (European MaX Centre of Excellence "Materials design at the Exascale") and Grant Agreement No. 814487 (INTERSECT project). We thank M. Giantomassi and J.-M. Beuken for their contributions in adding support for PseudoDojo tables to the aiida-pseudo (https://github.com/aiidateam/aiida-pseudo) plugin. We also thank X. Gonze, M. Giantomassi, M. Probert, C. Pickard, P. Hasnip, J. Hutter, M. Iannuzzi, D. Wortmann, S. Blügel, J. Hess, F. Neese, and P. Delugas for providing useful feedback on the various quantum engine implementations. S.P. acknowledges support from the European Union's Horizon 2020 Research and Innovation Programme, under the Marie Skłodowska-Curie Grant Agreement SELPH2D No. 839217 and computer time provided by the PRACE-21 resources MareNostrum at BSC-CNS. E.F.-L. acknowledges the support of the Norwegian Research Council (project number 262339) and computational resources provided by Sigma2. P.Z.-P. thanks the Faraday Institution CATMAT project (EP/S003053/1, FIRG016) for financial support. K.E. acknowledges the Swiss National Science Foundation (grant number 200020-182015). G.Pi. and K.E. acknowledge the swissuniversities "Materials Cloud" (project number 201-003). Work at ICMAB is supported by the Severo Ochoa Centers of Excellence Program (MICINN CEX2019-000917-S), by PGC2018-096955-B-C44 (MCIU/AEI/FEDER, UE), and by GenCat 2017SGR1506. B.Z. thanks the Faraday Institution FutureCat project (EP/S003053/1, FIRG017) for financial support. J.B. and V.T. acknowledge support by the Joint Lab Virtual Materials Design (JLVMD) of the Forschungszentrum Jülich.
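A schematic illustration of what a code-agnostic workflow interface of this kind could look like is given below; the class and method names are hypothetical and do not reproduce the actual aiida-common-workflows API, but they show the design rule of a single common input/output contract with engine-specific implementations behind it.

```python
# Hypothetical sketch of a code-agnostic interface for a structural-relaxation
# workflow; names (CommonRelaxInput, CommonRelaxWorkflow, etc.) are illustrative only.
from abc import ABC, abstractmethod
from dataclasses import dataclass

@dataclass
class CommonRelaxInput:
    structure: dict              # crystal structure in a code-independent form
    protocol: str = "moderate"   # named precision level: fast / moderate / precise

class CommonRelaxWorkflow(ABC):
    """Interface that every quantum-engine implementation must satisfy."""

    @abstractmethod
    def get_builder(self, inputs: CommonRelaxInput) -> dict:
        """Translate code-independent inputs into engine-specific inputs."""

    @abstractmethod
    def parse_results(self, raw_outputs: dict) -> dict:
        """Return outputs (total energy, forces, ...) in a common format."""

class QuantumEspressoRelax(CommonRelaxWorkflow):
    def get_builder(self, inputs: CommonRelaxInput) -> dict:
        # The engine expert encodes k-points, cutoffs, and pseudopotentials here,
        # chosen according to the requested protocol.
        return {"code": "pw.x", "protocol": inputs.protocol, "structure": inputs.structure}

    def parse_results(self, raw_outputs: dict) -> dict:
        return {"total_energy_eV": raw_outputs.get("energy")}
```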
Large expert-curated database for benchmarking document similarity detection in biomedical literature search
Document recommendation systems for locating relevant literature have mostly relied on methods developed a decade ago. This is largely due to the lack of a large offline gold-standard benchmark of relevant documents that cover a variety of research fields such that newly developed literature search techniques can be compared, improved and translated into practice. To overcome this bottleneck, we have established the RElevant LIterature SearcH consortium consisting of more than 1500 scientists from 84 countries, who have collectively annotated the relevance of over 180 000 PubMed-listed articles with regard to their respective seed (input) article/s. The majority of annotations were contributed by highly experienced, original authors of the seed articles. The collected data cover 76% of all unique PubMed Medical Subject Headings descriptors. No systematic biases were observed across different experience levels, research fields or time spent on annotations. More importantly, annotations of the same document pairs contributed by different scientists were highly concordant. We further show that the three representative baseline methods used to generate recommended articles for evaluation (Okapi Best Matching 25, Term Frequency-Inverse Document Frequency and PubMed Related Articles) had similar overall performances. Additionally, we found that these methods each tend to produce distinct collections of recommended articles, suggesting that a hybrid method may be required to completely capture all relevant articles. The established database server located at https://relishdb.ict.griffith.edu.au is freely available for the downloading of annotation data and the blind testing of new methods. We expect that this benchmark will be useful for stimulating the development of new powerful techniques for title and title/abstract-based search engines for relevant articles in biomedical research.
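To make the baseline comparison concrete, the sketch below ranks candidate documents against a seed text with TF-IDF and cosine similarity, in the spirit of the Term Frequency-Inverse Document Frequency baseline mentioned above; the use of scikit-learn and the example texts are illustrative assumptions, not taken from the study.

```python
# Hedged sketch: TF-IDF + cosine-similarity ranking of candidate documents
# against a seed document (tooling and data are illustrative only).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

seed = "Message passing workflows for high-throughput materials simulations."
candidates = [
    "A workflow engine for robust high-throughput computational science.",
    "Deep learning for protein structure prediction.",
    "Provenance tracking in materials simulation databases.",
]

vectorizer = TfidfVectorizer(stop_words="english")
matrix = vectorizer.fit_transform([seed] + candidates)   # row 0 is the seed
scores = cosine_similarity(matrix[0], matrix[1:]).ravel()

# Rank candidates by similarity to the seed document.
for score, doc in sorted(zip(scores, candidates), reverse=True):
    print(f"{score:.3f}  {doc}")
```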
Workflows in AiiDA: Engineering a high-throughput, event-based engine for robust and modular computational workflows
Over the last two decades, the field of computational science has seen a dramatic shift towards incorporating high-throughput computation and big-data analysis as fundamental pillars of the scientific discovery process. This has necessitated the development of tools and techniques to deal with the generation, storage and processing of large amounts of data. In this work we present an in-depth look at the workflow engine powering AiiDA, a widely adopted, highly flexible and database-backed informatics infrastructure with an emphasis on data reproducibility. We detail many of the design choices that were made, informed by several important goals: the ability to scale from running on individual laptops up to high-performance supercomputers, managing jobs with runtimes spanning from fractions of a second to weeks, and scaling up to thousands of jobs concurrently, all while maximising robustness. In short, AiiDA aims to be a Swiss army knife for high-throughput computational science. As well as the architecture, we outline important API design choices made to give workflow writers a great deal of liberty whilst guiding them towards writing robust and modular workflows, ultimately enabling them to encode their scientific knowledge to the benefit of the wider scientific community.
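A short sketch of the kind of modular workflow the engine supports is given below, modelled on the multiply-add example from the AiiDA documentation; treat the class and method names as approximations to be checked against the aiida-core documentation rather than as a verbatim excerpt.

```python
# Hedged sketch of an AiiDA WorkChain (modelled on the documentation's
# multiply-add example; verify names against the installed aiida-core version).
from aiida.engine import WorkChain, calcfunction
from aiida.orm import Int

@calcfunction
def multiply(x, y):
    return Int(x.value * y.value)   # stored with full provenance

@calcfunction
def add(x, y):
    return Int(x.value + y.value)

class MultiplyAddWorkChain(WorkChain):
    @classmethod
    def define(cls, spec):
        super().define(spec)
        spec.input('x', valid_type=Int)
        spec.input('y', valid_type=Int)
        spec.input('z', valid_type=Int)
        spec.outline(cls.run_multiply, cls.run_add, cls.finalize)  # modular steps
        spec.output('result', valid_type=Int)

    def run_multiply(self):
        self.ctx.product = multiply(self.inputs.x, self.inputs.y)

    def run_add(self):
        self.ctx.total = add(self.ctx.product, self.inputs.z)

    def finalize(self):
        self.out('result', self.ctx.total)
```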
A bridge between trust and control: Computational workflows meet automated battery cycling
Compliance with good research data management practices means trust in the integrity of the data, and it is achievable through full control of the data-gathering process. In this work, we demonstrate tooling that bridges these two aspects, and illustrate its use in a case study of automated battery cycling. We successfully interface off-the-shelf battery cycling hardware with the computational workflow management software AiiDA, allowing us to control experiments while ensuring trust in the data by tracking its provenance. We design user interfaces compatible with this tooling, which span the inventory, experiment design, and result analysis stages. Other features, including monitoring of workflows and import of externally generated and legacy data, are also implemented. Finally, the full software stack required for this work is made available in a set of open-source packages.
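To illustrate how provenance can be attached to an analysis step of this kind, a hypothetical calcfunction is sketched below; the function name, the array labels, and the capacity-extraction logic are invented for illustration and are not taken from the Aurora tooling described in the paper.

```python
# Hypothetical sketch: wrapping a cycling-data analysis step as an AiiDA
# calcfunction, so that inputs, outputs, and the code linking them are
# recorded in the provenance graph. Names and logic are illustrative only.
import numpy as np
from aiida.engine import calcfunction
from aiida.orm import ArrayData, Float

@calcfunction
def extract_discharge_capacity(cycling_data: ArrayData) -> Float:
    """Estimate discharge capacity (Ah) from current/time arrays (illustrative)."""
    current = cycling_data.get_array('current_A')
    time = cycling_data.get_array('time_s')
    discharging = current < 0.0   # assumed convention: negative current = discharge
    capacity_ah = np.trapz(np.abs(current[discharging]), time[discharging]) / 3600.0
    return Float(capacity_ah)
```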
A bridge between trust and control: Computational workflows meet automated battery cycling
Electronic Supplementary Information. This is the supporting information archive for the above manuscript. The archive contains: a PDF file with supplementary figures for the manuscript, including further screenshots, plotted data, and automatically generated provenance graphs; a set of CSV files with the outputs of the robotic battery assembly component of Aurora for the cell batches studied in this work, including electrode diameters, weights, and specific capacities; a set of videos showing the installation and user interaction with the AiiDAlab-Aurora user interface, available on the BIG-MAP App Store page of AiiDAlab-Aurora (https://big-map.github.io/big-map-registry/apps/aiidalab-aurora.html); and the cell cycling data exported from AiiDA, together with the raw data files from EC-Lab, available on the Materials Cloud Archive under DOI https://doi.org/10.24435/materialscloud:qh-gt.
Virtual computational chemistry teaching laboratories – hands-on at a distance
The COVID-19 pandemic disrupted chemistry teaching practices globally as many courses were forced online, necessitating adaptation to the digital platform. The biggest impact was to the practical component of the chemistry curriculum – the so-called wet lab. Naively, it would be thought that computer-based teaching labs would have little problem in making the move. However, this is not the case, as there are many unrecognised differences between delivering computer-based teaching in-person and virtually: software issues, technology and classroom management. Consequently, relatively few "hands-on" computational chemistry teaching laboratories are delivered online. In this paper we describe these issues in more detail and how they can be addressed, drawing on our experience in delivering a third-year computational chemistry course as well as remote hands-on workshops for the Virtual Winter School on Computational Chemistry and the European BIG-MAP project.
Virtual Computational Chemistry Teaching Laboratories-Hands-On at a Distance
The COVID-19 pandemic disrupted chemistry teaching practices globally as many courses were forced online, necessitating adaptation to the digital platform. The biggest impact was to the practical component of the chemistry curriculum – the so-called wet lab. Naively, it would be thought that computer-based teaching laboratories would have little problem in making the move. However, this is not the case, as there are many unrecognized differences between delivering computer-based teaching in-person and virtually: software issues, technology, and classroom management. Consequently, relatively few "hands-on" computational chemistry teaching laboratories are delivered online. In this paper, we describe these issues in more detail and how they can be addressed, drawing on our experience in delivering a third-year computational chemistry course as well as remote hands-on workshops for the Virtual Winter School on Computational Chemistry and the European BIG-MAP project.