Data sharing: What about epidemiological cohorts?
General-purpose population-based cohorts can be powerful scientific platforms. The Gazel Cohort Study, a French cohort of 20,000 adults followed up for more than 20 years, was designed as an open epidemiologic laboratory, hosting more than 40 nested research projects from French and international teams.
Formal rules were elaborated to define the way that the Gazel database can be accessed.
One of the main problems faced by the Gazel team members is scientific recognition of their work, as the large majority of the publications come from external research groups.
Secret Sharing for Cloud Data Security
Cloud computing helps reduce costs, increase business agility and deploy
solutions with a high return on investment for many types of applications.
However, data security is of premium importance to many users and often
restrains their adoption of cloud technologies. Various approaches, e.g., data
encryption, anonymization, replication and verification, help enforce different
facets of data security. Secret sharing is a particularly interesting
cryptographic technique. Its most advanced variants indeed simultaneously
enforce data privacy, availability and integrity, while allowing computation on
encrypted data. The aim of this paper is thus to wholly survey secret sharing
schemes with respect to data security, data access and costs in the
pay-as-you-go paradigm.
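To make the idea of secret sharing concrete, here is a minimal sketch of one classic threshold scheme (Shamir's). The abstract does not say which schemes the survey covers, so this is an illustration under stated assumptions rather than the authors' method: the field modulus `PRIME` and the function names `split_secret` and `recover_secret` are chosen for this example only. Any `threshold` shares rebuild the secret; fewer reveal nothing about it.

```python
import secrets

# Assumed field modulus for illustration: the Mersenne prime 2**127 - 1.
PRIME = 2**127 - 1


def split_secret(secret, n_shares, threshold):
    """Split `secret` into n_shares points on a random polynomial of
    degree threshold - 1 whose constant term is the secret."""
    if not 0 <= secret < PRIME:
        raise ValueError("secret must lie in [0, PRIME)")
    if not 1 <= threshold <= n_shares:
        raise ValueError("need 1 <= threshold <= n_shares")
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]
    shares = []
    for x in range(1, n_shares + 1):
        y = 0
        for c in reversed(coeffs):  # Horner evaluation modulo PRIME
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares


def recover_secret(shares):
    """Lagrange interpolation at x = 0 over the prime field."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        # pow(den, -1, PRIME) is the modular inverse (Python 3.8+).
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret
```

In a cloud setting each share would typically be stored with a different provider, so that losing or compromising fewer than `threshold` providers neither destroys nor exposes the data.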
Identifying Data Sharing in Biomedical Literature
Many policies and projects now encourage investigators to share their raw research data with other scientists. Unfortunately, it is difficult to measure the effectiveness of these initiatives because data can be shared in such a variety of mechanisms and locations. We propose a novel approach to finding shared datasets: using NLP techniques to identify declarations of dataset sharing within the full text of primary research articles. Using regular expression patterns and machine learning algorithms on open access biomedical literature, our system was able to identify 61% of articles with shared datasets with 80% precision. A simpler version of our classifier achieved higher recall (86%), though lower precision (49%). We believe our results demonstrate the feasibility of this approach and hope to inspire further study of dataset retrieval techniques and policy evaluation.
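The pattern-matching part of such an approach, and the precision and recall figures used to evaluate it, can be sketched as follows. This is a hedged illustration only: the patterns in `SHARING_PATTERNS` and the helper names are hypothetical and are not the authors' actual rules, and the machine learning component is not reproduced.

```python
import re

# Hypothetical patterns, written for this example; not the authors' rules.
SHARING_PATTERNS = [
    # "data ... deposited/submitted/available ... <repository name>"
    re.compile(
        r"\b(?:data|datasets?)\b.{0,80}"
        r"\b(?:deposited|submitted|available)\b.{0,80}"
        r"\b(?:GEO|ArrayExpress|Gene Expression Omnibus)\b",
        re.IGNORECASE | re.DOTALL,
    ),
    # "accession number", "accession no.", "accession codes", ...
    re.compile(r"\baccession\s+(?:number|no\.?|code)s?\b", re.IGNORECASE),
]


def declares_sharing(text):
    """Return True if any pattern matches the article text."""
    return any(p.search(text) for p in SHARING_PATTERNS)


def precision_recall(predicted, actual):
    """Compute (precision, recall) from parallel lists of booleans."""
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)
    fp = sum(1 for p, a in pairs if p and not a)
    fn = sum(1 for p, a in pairs if not p and a)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall
```

Loosening the patterns generally raises recall at the cost of precision, which is the trade-off the two reported classifiers (61% recall at 80% precision versus 86% recall at 49% precision) illustrate.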

CamFlow: Managed Data-sharing for Cloud Services
A model of cloud services is emerging whereby a few trusted providers manage
the underlying hardware and communications whereas many companies build on this
infrastructure to offer higher level, cloud-hosted PaaS services and/or SaaS
applications. From the start, strong isolation between cloud tenants was seen
to be of paramount importance, provided first by virtual machines (VM) and
later by containers, which share the operating system (OS) kernel. Increasingly
it is the case that applications also require facilities to effect isolation
and protection of data managed by those applications. They also require
flexible data sharing with other applications, often across the traditional
cloud-isolation boundaries; for example, when government provides many related
services for its citizens on a common platform. Similar considerations apply to
the end-users of applications. But in particular, the incorporation of cloud
services within 'Internet of Things' architectures is driving the requirements
for both protection and cross-application data sharing.
These concerns relate to the management of data. Traditional access control
is application and principal/role specific, applied at policy enforcement
points, after which there is no subsequent control over where data flows; a
crucial issue once data has left its owner's control by cloud-hosted
applications and within cloud-services. Information Flow Control (IFC), in
addition, offers system-wide, end-to-end, flow control based on the properties
of the data. We discuss the potential of cloud-deployed IFC for enforcing
owners' dataflow policy with regard to protection and sharing, as well as
safeguarding against malicious or buggy software. In addition, the audit log
associated with IFC provides transparency, giving configurable system-wide
visibility over data flows. [...]
Comment: 14 pages, 8 figures
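The difference between a one-off access check and a flow check that travels with the data can be sketched in a few lines. The abstract does not spell out the flow rule, so this sketch assumes the common tag-based formulation of IFC: each entity carries a secrecy label and an integrity label, and a flow from A to B is allowed only if A's secrecy tags are a subset of B's and B's integrity tags are a subset of A's. The names `Entity`, `flow_allowed`, and `checked_flow` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    """A process, file, or message carrying IFC labels (illustrative)."""
    name: str
    secrecy: frozenset = frozenset()    # tags restricting where data may go
    integrity: frozenset = frozenset()  # tags vouching for data quality


def flow_allowed(src, dst):
    """Assumed rule: the destination must be at least as secret as the
    source and must not demand integrity the source lacks."""
    return src.secrecy <= dst.secrecy and dst.integrity <= src.integrity


def checked_flow(src, dst, audit_log):
    """Evaluate a flow and record the decision, allowed or denied."""
    allowed = flow_allowed(src, dst)
    audit_log.append((src.name, dst.name, allowed))
    return allowed
```

Because every attempted flow is logged whether or not it succeeds, the audit list gives the kind of system-wide visibility over data flows that the abstract describes.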
Publishing and sharing sensitive data
Sensitive data has often been excluded from discussions about data publication and sharing. It was believed that sharing sensitive data is not ethical or that it is too difficult to do safely. This opinion has changed with greater understanding and use of methods to ‘de-sensitise’ (i.e., confidentialise) data, that is, to modify the data so that participants or subjects are no longer identifiable, and with the capacity to grant ‘conditional access’ to data. Requirements of publishers and funding bodies for researchers to publish and share their data have also seen sensitive data sharing increase.
This Guide outlines best practice for the publication and sharing of sensitive research data in the Australian context. The Guide follows the sequence of steps that are necessary for publishing and sharing sensitive data, as outlined in the ‘Publishing and Sharing Sensitive Data Decision Tree’. It provides the detail and context to the steps in this Decision Tree. References for further reading are provided for those who are interested.
By following the sections below, and steps within, you will be able to make clear, lawful, and ethical decisions about sharing your data safely. It can be done in most cases!
How the Guide interacts with your institutional policies
This Guide is not intended to override institutional policies on data management or publication. Most researchers operate within the policies of their institution and/or funding arrangement and must, therefore, ensure their decisions about data publication align with these policies. This is particularly relevant for Intellectual Property and, sometimes, your classification of sensitive data (e.g., NSW Government Department of Environment & Heritage, Sensitive Data Species Policy) or selection of data repository. The Guide indicates the steps at which you should check your institutional policies.
A review of journal policies for sharing research data
*Background:* Sharing data is a tenet of science, yet commonplace in only a few subdisciplines. Recognizing that a data sharing culture is unlikely to be achieved without policy guidance, some funders and journals have begun to request and require that investigators share their primary datasets with other researchers. The purpose of this study is to understand the current state of data sharing policies within journals, the features of journals which are associated with the strength of their data sharing policies, and whether the strength of data sharing policies impacts the observed prevalence of data sharing.

*Methods:* We investigated these relationships with respect to gene expression microarray data in the journals that most often publish studies about this type of data. We measured data sharing prevalence as the proportion of papers with submission links from NCBI's Gene Expression Omnibus (GEO) database. We conducted univariate and linear multivariate regressions to understand the relationship between the strength of data sharing policy and journal impact factor, journal subdiscipline, journal publisher (academic societies vs. commercial), and publishing model (open vs. closed access).

*Results:* Of the 70 journal policies, 18 (26%) made no mention of sharing publication-related data within their Instruction to Author statements. Of the 42 (60%) policies with a data sharing policy applicable to microarrays, we classified 18 (26% of 70) as moderately strong and 24 (34% of 70) as strong.
Existence of a data sharing policy was associated with the type of journal publisher: half of all commercial publishers had a policy compared to 82% of journals published by academic societies. All four of the open-access journals had a data sharing policy. Policy strength was associated with impact factor: the journals with no data sharing policy, a weak policy, and a strong policy had respective median impact factors of 3.6, 4.5, and 6.0. Policy strength was positively associated with measured data sharing submission into the GEO database: the journals with no data sharing policy, a weak policy, and a strong policy had median data sharing prevalence of 11%, 19%, and 29% respectively.

*Conclusion:* This review and analysis begins to quantify the relationship between journal policies and data sharing outcomes and thereby contributes to assessing the incentives and initiatives designed to facilitate widespread, responsible, effective data sharing. 
