Emerging good practice in managing research data and research information within UK Universities
Sound data-intensive science depends upon effective research data and information management. Efficient and interoperable research information systems will be crucial for enabling and exploiting data-intensive research; however, it is equally important that a research ecosystem is cultivated within research-intensive institutions that foster sustainable communication, cooperation and support of a diverse range of research-related staff. Researchers, librarians, administrators, ethics advisors, and IT professionals all have a vital contribution to make in ensuring that research data and related information are available, visible, understandable and usable over the mid to long term. This paper will provide a summary of several ongoing initiatives that the Jisc-funded Digital Curation Centre (DCC) are currently involved with in the UK and internationally to help staff within higher education institutions prepare to meet funding body mandates relating to research data management and sharing and to engage fully in the digital agenda.
The medical science DMZ: a network design pattern for data-intensive medical science
Abstract:
Objective
We describe a detailed solution for maintaining high-capacity, data-intensive network flows (eg, 10, 40, 100 Gbps+) in a scientific, medical context while still adhering to security and privacy laws and regulations.
Materials and Methods
High-end networking, packet-filter firewalls, network intrusion-detection systems.
Results
We describe a “Medical Science DMZ” concept as an option for secure, high-volume transport of large, sensitive datasets between research institutions over national research networks, and give 3 detailed descriptions of implemented Medical Science DMZs.
Discussion
The exponentially increasing amounts of “omics” data, high-quality imaging, and other rapidly growing clinical datasets have resulted in the rise of biomedical research “Big Data.” The storage, analysis, and network resources required to process these data and integrate them into patient diagnoses and treatments have grown to scales that strain the capabilities of academic health centers. Some data are not generated locally and cannot be sustained locally, and shared data repositories such as those provided by the National Library of Medicine, the National Cancer Institute, and international partners such as the European Bioinformatics Institute are rapidly growing. The ability to store and compute using these data must therefore be addressed by a combination of local, national, and industry resources that exchange large datasets. Maintaining data-intensive flows that comply with the Health Insurance Portability and Accountability Act (HIPAA) and other regulations presents a new challenge for biomedical research. We describe a strategy that marries performance and security by borrowing from and redefining the concept of a Science DMZ, a framework that is used in physical sciences and engineering research to manage high-capacity data flows.
Conclusion
By implementing a Medical Science DMZ architecture, biomedical researchers can leverage the scale provided by high-performance computer and cloud storage facilities and national high-speed research networks while preserving privacy and meeting regulatory requirements.
Introducing distributed dynamic data-intensive (D3) science: Understanding applications and infrastructure
A common feature across many science and engineering applications is the
amount and diversity of data and computation that must be integrated to yield
insights. Data sets are growing larger and becoming distributed; and their
location, availability and properties are often time-dependent. Collectively,
these characteristics give rise to dynamic distributed data-intensive
applications. While "static" data applications have received significant
attention, the characteristics, requirements, and software systems for the
analysis of large volumes of dynamic, distributed data, and data-intensive
applications have received relatively less attention. This paper surveys
several representative dynamic distributed data-intensive application
scenarios, provides a common conceptual framework to understand them, and
examines the infrastructure used in support of applications.
Comment: 38 pages, 2 figure
Theory and Practice of Data Citation
Citations are the cornerstone of knowledge propagation and the primary means
of assessing the quality of research, as well as directing investments in
science. Science is increasingly becoming "data-intensive", where large volumes
of data are collected and analyzed to discover complex patterns through
simulations and experiments, and most scientific reference works have been
replaced by online curated datasets. Yet, given a dataset, there is no
quantitative, consistent and established way of knowing how it has been used
over time, who contributed to its curation, what results have been yielded or
what value it has.
The development of a theory and practice of data citation is fundamental for
considering data as first-class research objects with the same relevance and
centrality of traditional scientific products. Many works in recent years have
discussed data citation from different viewpoints: illustrating why data
citation is needed, defining the principles and outlining recommendations for
data citation systems, and providing computational methods for addressing
specific issues of data citation.
The current panorama is many-faceted and an overall view that brings together
diverse aspects of this topic is still missing. Therefore, this paper aims to
describe the lay of the land for data citation, both from the theoretical (the
why and what) and the practical (the how) angle.
Comment: 24 pages, 2 tables, pre-print accepted in Journal of the Association
for Information Science and Technology (JASIST), 201
AgroDataCube and AGINFRA+: Operationalising Big Data for Agricultural Informatics
Big Data methods and tools are becoming widely adopted by the ICT industry and create new opportunities for data intensive science in the agro-environmental domain. However, Big Data adoption is still in its infancy for Agricultural Information Systems, and many barriers still exist for wider use of big data analysis ..