Research Software Sustainability: Lessons Learned at NCSA
This paper discusses why research software is important and what sustainability means in this context. It then discusses how research software sustainability can be achieved, describes our experiences at NCSA through specific examples, and explains what we have learned from them and how we think these lessons can help others.
Research Software Engineers: Career Entry Points and Training Gaps
As software has become more essential to research across disciplines, and as the recognition of this fact has grown, the importance of professionalizing the development and maintenance of this software has also increased. The community of software professionals who work on this software has come together under the title Research Software Engineer (RSE) over the last decade. This has led to the formalization of RSE roles and organized RSE groups in universities, national labs, and industry. This, in turn, has created the need to understand how RSEs come into this profession and into these groups, how to further promote this career path to potential members, and what training gaps need to be filled for RSEs coming from different entry points. We have categorized three main entry paths into the RSE profession and identified key elements, both advantages and disadvantages, that should be acknowledged and addressed by the broader research community in order to attract and retain a talented and diverse pool of future RSEs.
Comment: Submitted to IEEE Computing in Science & Engineering (CiSE): Special Issue on the Future of Research Software Engineers in the U.S.
Research Software Development & Management in Universities: Case Studies from Manchester's RSDS Group, Illinois' NCSA, and Notre Dame's CRC
Modern research in the sciences, engineering, humanities, and other fields depends on software, and specifically, research software. Much of this research software is developed in universities by faculty, postdocs, students, and staff. In this paper, we focus on the role of university staff. We examine three different, independently developed models under which these staff are organized and perform their work, and we comparatively analyze these models and their consequences for the staff and for the software, considering how the different models support software engineering practices and processes. This information can be used by software engineering researchers to understand the practices of such organizations, and by universities that want to set up similar organizations and to better produce and maintain research software.
Comment: 2019 International Workshop on Software Engineering for Science (SE4Science), May 28, 2019, with ICSE'19.
Digitization and search: A non-traditional use of HPC
We describe our efforts in developing an open source cyberinfrastructure that provides a form of automated search of handwritten content within large digitized document archives. Such collections are a treasure trove of data ranging from decades ago to the present. The information contained in these collections is relevant both to researchers, who might extract numerical or statistical data from such sources, and to the general public. With the push to digitize our paper archives we are, however, faced with the fact that although these digital versions are easier to share, they are not trivially searchable, as the digitization process produces image data rather than text. This inability to find or identify contents within these collections makes the data largely unusable without a lengthy and costly manual transcription process carried out by human beings.
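To make the search idea above concrete, the sketch below shows one minimal way an archive of page images could be turned into something searchable: each page goes through a handwriting-recognition step, and the recognized words feed an inverted index. This is an illustrative sketch only, not the system described in the paper; recognize_text, the file layout, and the AND-style query semantics are all assumptions.

```python
"""Illustrative sketch: indexing recognized text from digitized page images.

The recognizer below is a stub standing in for whatever handwriting-recognition
component a real system would use; the index is a plain in-memory inverted index.
File layout and function names are assumptions, not the paper's implementation.
"""
import re
from collections import defaultdict
from pathlib import Path
from typing import Dict, Set


def recognize_text(image_path: Path) -> str:
    """Placeholder for a handwriting-recognition step (assumed interface)."""
    raise NotImplementedError("plug in an OCR / handwriting recognizer here")


def build_index(archive_dir: str) -> Dict[str, Set[str]]:
    """Map each recognized word to the set of page images containing it."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for image in sorted(Path(archive_dir).glob("*.png")):
        text = recognize_text(image)
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(image.name)
    return index


def search(index: Dict[str, Set[str]], query: str) -> Set[str]:
    """Return pages containing every query term (simple AND semantics)."""
    terms = [t.lower() for t in query.split()]
    if not terms:
        return set()
    pages = index.get(terms[0], set()).copy()
    for term in terms[1:]:
        pages &= index.get(term, set())
    return pages
```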
BRACELET: Hierarchical Edge-Cloud Microservice Infrastructure for Scientific Instruments’ Lifetime Connectivity
Recent advances in cyberinfrastructure have enabled digital data sharing and ubiquitous network connectivity between scientific instruments and cloud-based storage infrastructure for uploading, storing, curating, and correlating large amounts of materials and semiconductor fabrication data and metadata. However, a significant number of scientific instruments still run on old operating systems and are taken offline, unable to connect to the cloud infrastructure because of security and performance concerns. In this paper, we propose BRACELET, an edge-cloud infrastructure that augments the existing cloud-based infrastructure with edge devices and helps tackle the unique performance and security challenges that scientific instruments face when they are connected to the cloud through a public network. With BRACELET, we put a networked edge device, called a cloudlet, between the scientific instruments and the cloud as the middle tier of a three-tier hierarchy. The cloudlet shapes and protects the data traffic from scientific instruments to the cloud, plays a foundational role in keeping an instrument connected throughout its lifetime, and continuously provides the otherwise missing performance and security features for the instrument as its operating system ages.
NSF Award Number 1659293; NSF Award Number 1443013
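As a rough illustration of the middle tier described above, the sketch below shows a hypothetical cloudlet relay: instruments on the local network POST data to it, it applies a simple token-bucket shaper, and it forwards the data to the cloud over HTTPS so the aging instrument OS never talks to the public network directly. This is not the BRACELET implementation; the endpoint URL, credential handling, and shaping parameters are assumptions for illustration.

```python
"""Hypothetical cloudlet relay illustrating a three-tier middle layer.

Instruments POST raw data to this local service; the cloudlet shapes the
outgoing traffic and forwards it to the cloud over HTTPS. CLOUD_URL, the
API token, and the rate limits are illustrative assumptions only.
"""
import time
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

CLOUD_URL = "https://example-cloud.invalid/upload"   # assumed endpoint
API_TOKEN = "changeme"                               # assumed credential
RATE_BYTES_PER_SEC = 1_000_000                       # shaping budget
BUCKET_CAPACITY = 5_000_000                          # burst allowance


class TokenBucket:
    """Simple token-bucket shaper: earn tokens over time, spend per upload."""

    def __init__(self, rate, capacity):
        self.rate, self.capacity = rate, capacity
        self.tokens, self.last = capacity, time.monotonic()

    def wait_for(self, nbytes):
        need = min(nbytes, self.capacity)  # cap so huge uploads still progress
        while True:
            now = time.monotonic()
            self.tokens = min(self.capacity,
                              self.tokens + (now - self.last) * self.rate)
            self.last = now
            if self.tokens >= need:
                self.tokens -= need
                return
            time.sleep((need - self.tokens) / self.rate)


bucket = TokenBucket(RATE_BYTES_PER_SEC, BUCKET_CAPACITY)


class CloudletHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        body = self.rfile.read(int(self.headers.get("Content-Length", 0)))
        bucket.wait_for(len(body))                   # shape traffic to the cloud
        req = urllib.request.Request(
            CLOUD_URL, data=body, method="POST",
            headers={"Authorization": f"Bearer {API_TOKEN}",
                     "Content-Type": "application/octet-stream"},
        )
        try:
            with urllib.request.urlopen(req) as resp:  # TLS handled here, not by
                status = resp.status                   # the aging instrument OS
        except OSError:
            status = 502  # cloud unreachable; a real cloudlet would buffer and retry
        self.send_response(status)
        self.end_headers()


if __name__ == "__main__":
    HTTPServer(("0.0.0.0", 8080), CloudletHandler).serve_forever()
```

A production cloudlet would additionally buffer uploads while the cloud is unreachable and enforce the security controls the paper describes; the sketch only shows where the middle tier sits and how traffic shaping fits in.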
4CeeD: Real-Time Data Acquisition and Analysis Framework for Material-related Cyber-Physical Environments
In this paper, we propose a data acquisition and analysis framework for materials-to-devices processes, named 4CeeD, that focuses on the immense potential of capturing, accurately curating, correlating, and coordinating materials-to-devices digital data in a real-time and trusted manner before fully archiving and publishing them for wide access and sharing. In particular, 4CeeD consists of: (i) a curation service for collecting data from experimental instruments and curating and wrapping the data with extensive metadata in real time and in a trusted manner; (ii) a cloudlet for caching data collected from the curation service and coordinating data transfer with the back end; and (iii) a cloud-based coordination service for storing data, extracting metadata, and analyzing and finding correlations among the data. Our evaluation results show that the proposed approach helps researchers significantly reduce the time and cost spent on experiments, and deals efficiently with the high-volume and fast-changing workloads of heterogeneous types of experimental data.
National Science Foundation/NSF ACI 1443013
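The three components listed above can be pictured as a small pipeline. The sketch below is not the 4CeeD codebase; the class names, fields, and in-memory storage are assumptions used only to show how a curated record might flow from the curation service, through the cloudlet cache, to the cloud-side coordination service where metadata can be queried for correlations.

```python
"""Minimal sketch of the three tiers named in the abstract (assumed names)."""
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, List


@dataclass
class CuratedRecord:
    """A measurement wrapped with metadata at collection time."""
    instrument_id: str
    payload: bytes
    metadata: Dict[str, str]
    collected_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat())


class CurationService:
    """Tier 1: collect data from an instrument and attach metadata in real time."""
    def curate(self, instrument_id: str, payload: bytes, **metadata) -> CuratedRecord:
        return CuratedRecord(instrument_id, payload, dict(metadata))


class Cloudlet:
    """Tier 2: cache curated records and coordinate transfer to the back end."""
    def __init__(self) -> None:
        self.cache: List[CuratedRecord] = []

    def enqueue(self, record: CuratedRecord) -> None:
        self.cache.append(record)

    def flush(self, coordinator: "CoordinationService") -> None:
        while self.cache:
            coordinator.store(self.cache.pop(0))


class CoordinationService:
    """Tier 3: store records and index metadata so correlations can be queried."""
    def __init__(self) -> None:
        self.archive: List[CuratedRecord] = []

    def store(self, record: CuratedRecord) -> None:
        self.archive.append(record)

    def find(self, **criteria) -> List[CuratedRecord]:
        return [r for r in self.archive
                if all(r.metadata.get(k) == v for k, v in criteria.items())]


if __name__ == "__main__":
    curation, cloudlet, cloud = CurationService(), Cloudlet(), CoordinationService()
    rec = curation.curate("tem-01", b"...", sample="wafer-3", operator="jdoe")
    cloudlet.enqueue(rec)
    cloudlet.flush(cloud)
    print(len(cloud.find(sample="wafer-3")))  # -> 1
```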
Standing together for reproducibility in large-scale computing: report on reproducibility@XSEDE
This is the final report on reproducibility@xsede, a one-day workshop held in conjunction with XSEDE14, the annual conference of the Extreme Science and Engineering Discovery Environment (XSEDE). The workshop's discussion-oriented agenda focused on reproducibility in large-scale computational research. Two important themes capture the spirit of the workshop submissions and discussions: (1) organizational stakeholders, especially supercomputer centers, are in a unique position to promote, enable, and support reproducible research; and (2) individual researchers should conduct each experiment as though someone will replicate that experiment. Participants documented numerous issues, questions, technologies, practices, and potentially promising initiatives emerging from the discussion, but also highlighted four areas of particular interest to XSEDE: (1) documentation and training that promotes reproducible research; (2) system-level tools that provide build- and run-time information at the level of the individual job; (3) the need to model best practices in research collaborations involving XSEDE staff; and (4) continued work on gateways and related technologies. In addition, an intriguing question emerged from the day's interactions: would there be value in establishing an annual award for excellence in reproducible research?