The RAppArmor Package: Enforcing Security Policies in R Using Dynamic Sandboxing on Linux
The increasing availability of cloud computing and scientific supercomputers
brings great potential for making R accessible through public or shared
resources. This allows us to efficiently run code requiring many cycles and
much memory, or to embed R functionality into, e.g., systems and web services.
However, some important security concerns need to be addressed before this can
be put into production. The prime use case in the design of R has always been a
single statistician running R on a local machine through the interactive
console. The execution environment of R is therefore entirely unrestricted,
which could result in malicious behavior or excessive use of hardware resources
in a shared environment. Properly securing an R process turns out to be a
complex problem. We describe various approaches and illustrate potential issues
using some of our personal experiences in hosting public web services. Finally,
we introduce the RAppArmor package: a Linux-based reference implementation for
dynamic sandboxing in R at the level of the operating system.
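The OS-level enforcement the abstract refers to rests partly on POSIX resource limits. As a rough illustration of that mechanism (not the RAppArmor API itself, which also applies AppArmor profiles and is written in R/C), the following Python sketch runs a function in a forked child process with hard caps on CPU time and address space, so a runaway computation fails inside the child rather than exhausting the shared machine:

```python
import os
import pickle
import resource

def run_limited(fn, max_mem_bytes, max_cpu_seconds):
    """Run fn() in a forked child with POSIX resource limits applied.

    Illustrative only: RAppArmor additionally confines the process with
    an AppArmor profile, which has no portable equivalent shown here.
    """
    r, w = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: apply limits, run the function, report the outcome.
        os.close(r)
        resource.setrlimit(resource.RLIMIT_CPU, (max_cpu_seconds, max_cpu_seconds))
        resource.setrlimit(resource.RLIMIT_AS, (max_mem_bytes, max_mem_bytes))
        try:
            out = pickle.dumps(("ok", fn()))
        except Exception as e:
            out = pickle.dumps(("error", repr(e)))
        os.write(w, out)
        os._exit(0)
    # Parent: collect the child's result from the pipe.
    os.close(w)
    data = b""
    while True:
        chunk = os.read(r, 65536)
        if not chunk:
            break
        data += chunk
    os.close(r)
    os.waitpid(pid, 0)
    status, value = pickle.loads(data)
    if status == "error":
        raise RuntimeError(value)
    return value
```

Because the limits are enforced by the kernel on the child process, the caller remains responsive even when the sandboxed code misbehaves, which is the same design choice RAppArmor makes for shared R services.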
RProtoBuf: Efficient Cross-Language Data Serialization in R
Modern data collection and analysis pipelines often involve a sophisticated mix of applications written in general-purpose and specialized programming languages. Many formats commonly used to import and export data between different programs or systems, such as CSV or JSON, are verbose, inefficient, not type-safe, or tied to a specific programming language. Protocol Buffers are a popular method of serializing structured data between applications while remaining independent of programming languages or operating systems. They offer a unique combination of features, performance, and maturity that seems particularly well suited for data-driven applications and numerical computing. The RProtoBuf package provides a complete interface to Protocol Buffers from the R environment for statistical computing. This paper outlines the general class of data serialization requirements for statistical computing, describes the implementation of the RProtoBuf package, and illustrates its use with example applications in large-scale data collection pipelines and web services.
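Part of why the Protocol Buffers wire format is compact is its base-128 "varint" encoding of integers, where small values take a single byte regardless of the declared field width. A minimal sketch of that encoding, written in Python from the published wire-format specification (not taken from RProtoBuf itself):

```python
def encode_varint(n: int) -> bytes:
    """Encode a non-negative integer as a protobuf base-128 varint.

    Each byte carries 7 payload bits; the high bit flags continuation.
    """
    if n < 0:
        raise ValueError("this sketch handles unsigned values only")
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)         # final byte: high bit clear
            return bytes(out)

def decode_varint(data: bytes) -> int:
    """Decode a base-128 varint back to an integer."""
    result = 0
    for shift, byte in enumerate(data):
        result |= (byte & 0x7F) << (7 * shift)
        if not byte & 0x80:
            return result
    raise ValueError("truncated varint")
```

For example, the value 300 encodes to the two bytes `0xAC 0x02`, whereas the same number in JSON costs three ASCII digits plus quoting and a key name; this per-field compactness, combined with schema-checked typing, is what makes the format attractive for the pipelines the paper describes.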
Surgical quality in organ procurement during day and night: an analysis of quality forms
OBJECTIVES: To analyse a potential association between surgical quality and time of day. DESIGN: A retrospective analysis of complete sets of quality forms filled out by the procuring and accepting surgeon on organs from deceased donors. SETTING: Procurement procedures in the Netherlands are organised per region. All procedures are performed by an independent, dedicated procurement team that is associated with an academic medical centre in the region. PARTICIPANTS: Over a period of 18 months, 771 organs were accepted and procured in the Netherlands. Of these, 17 organs were declined before transport and therefore excluded. For the remaining 754 organs, 591 (78%) sets of forms were completed (procurement and transplantation). Baseline characteristics were comparable in both daytime and evening/night-time with the exception of height (p=0.003). PRIMARY OUTCOME MEASURE: All complete sets of quality forms were retrospectively analysed for the primary outcome, procurement-related surgical injury. Organs were categorised based on the starting time of the procurement as either daytime (8:00-17:00) or evening/night-time (17:00-8:00). RESULTS: Of the 591 procured organs, 129 organs (22%) were procured during daytime and 462 organs (78%) during evening/night-time. The incidence of surgical injury was significantly lower during daytime: 22 organs (17%) compared with 126 organs (27%) procured during evening/night-time (p=0.016). This association persists when adjusted for confounders. CONCLUSIONS: This study shows an increased incidence of procurement-related surgical injury in evening/night-time procedures as compared with daytime. Time of day might (in)directly influence surgical performance and should be considered a potential risk factor for injury in organ procurement procedures.
Embedded Scientific Computing: A Scalable, Interoperable and Reproducible Approach to Statistical Software for Data-Driven Business and Open Science
Methods for scientific computing are traditionally implemented in specialized software packages assisting the statistician in all facets of the data analysis process. A single product typically includes a wealth of functionality to interactively manage, explore and analyze data, and often much more. However, increasingly many users and organizations wish to integrate statistical computing into third-party software. Rather than working in a specialized statistical environment, methods to analyze and visualize data get incorporated into pipelines, web applications and big data infrastructures. This way of doing data analysis requires a different approach to statistical software, one that emphasizes interoperability and programmable interfaces rather than user interaction. We refer to this branch of computing as embedded scientific computing.
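The contrast between an interactive console and a programmable interface can be made concrete with a small sketch. The handler below (a hypothetical example, not an interface defined in the abstract) accepts a JSON request body and returns JSON summary statistics, the shape a statistical routine takes when embedded behind a web service or pipeline instead of a REPL:

```python
import json
import statistics

def describe_endpoint(request_body: str) -> str:
    """A JSON-in/JSON-out handler: the kind of programmable interface an
    embedded scientific-computing service exposes instead of an
    interactive console. Input: {"data": [numbers]}.
    """
    payload = json.loads(request_body)
    x = payload["data"]
    result = {
        "n": len(x),
        "mean": statistics.mean(x),
        "sd": statistics.stdev(x) if len(x) > 1 else None,
    }
    return json.dumps(result)
```

Because both sides of the call are plain data rather than session state, the same routine can sit behind an HTTP endpoint, a message queue, or a batch pipeline without modification, which is the interoperability the abstract argues for.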