1,921 research outputs found
Storing and manipulating environmental big data with JASMIN
JASMIN is a super-data-cluster designed to provide a high-performance, high-volume data analysis environment for the UK environmental science community. Thus far JASMIN has been used primarily by the atmospheric science and earth observation communities, both to support their direct scientific workflow and the curation of data products in the STFC Centre for Environmental Data Archival (CEDA). Initial JASMIN configuration and first experiences are reported here. Useful improvements in scientific workflow are presented. It is clear from the explosive growth in stored data and use that there was a pent-up demand for a suitable big-data analysis environment. This demand is not yet satisfied, in part because JASMIN does not yet have enough compute, the storage is fully allocated, and not all software needs are met. Plans to address these constraints are introduced.
Technology to aid the analysis of large-volume multi-institute climate model output at a central analysis facility (PRIMAVERA Data Management Tool V2.10)
The PRIMAVERA project aimed to develop a new generation of advanced and well-evaluated high-resolution global climate models. As part of PRIMAVERA, seven different climate models were run in both standard and higher-resolution configurations, with common initial conditions and forcings to form a multi-model ensemble. The ensemble simulations were run on high-performance computers across Europe and generated approximately 1.6 PiB (pebibytes) of output. To allow the data from all models to be analysed at this scale, PRIMAVERA scientists were encouraged to bring their analysis to the data. All data were transferred to a central analysis facility (CAF), in this case the JASMIN super-data-cluster, where they were catalogued and details made available to users through the web interface of the PRIMAVERA Data Management Tool (DMT). Users from across the project were able to query the available data using the DMT and then access them at the CAF. Here we describe how the PRIMAVERA project used the CAF's facilities to enable users to analyse this multi-model dataset. We believe that PRIMAVERA's experience using a CAF demonstrates how similar multi-institute, big-data projects can efficiently share, organise and analyse large volumes of data.
High resolution global climate modelling; the UPSCALE project, a large simulation campaign
The UPSCALE (UK on PRACE: weather-resolving Simulations of Climate for globAL Environmental risk) project constructed and ran an ensemble of HadGEM3 (Hadley Centre Global Environment Model 3) atmosphere-only global climate simulations over the period 1985–2011, at resolutions of N512 (25 km), N216 (60 km) and N96 (130 km), as used in current global weather forecasting, seasonal prediction and climate modelling respectively. Alongside these present-climate simulations, a parallel ensemble looking at extremes of future climate was run, using a timeslice methodology to consider conditions at the end of this century. These simulations were primarily performed using a 144-million-core-hour, single-year grant of computing time from PRACE (the Partnership for Advanced Computing in Europe) in 2012, with additional resources supplied by the Natural Environment Research Council (NERC) and the Met Office. Almost 400 terabytes of simulation data were generated on the HERMIT supercomputer at the High Performance Computing Center Stuttgart (HLRS), and transferred to the JASMIN super-data-cluster provided by the Science and Technology Facilities Council Centre for Environmental Data Archival (STFC CEDA) for analysis and storage. In this paper we describe the implementation of the project, present the technical challenges in terms of optimisation, data output, transfer and storage that such a project involves, and include details of the model configuration and the composition of the UPSCALE data set. This data set is available for scientific analysis to allow assessment of the value of model resolution in both present and potential future climate conditions.
Assessment of variability parameters and diversity of panicle architectural traits associated with yield in rice (Oryza sativa L.)
The rice panicle, a pivotal reproductive structure, signifies the transition from vegetative to reproductive growth in plants. Comprising components such as the rachis, primary and secondary branches, seed quantities and branch lengths, panicle architecture profoundly influences grain production. This study delves into the diversity of panicle architecture traits and scrutinizes variability parameters across 69 distinct rice genotypes. Our findings underscore substantial variations in panicle architecture traits among genotypes. Particularly noteworthy are traits with the highest coefficient of variation (CV%), encompassing the count of secondary branches, single plant yield, productive tillers per plant, seeds per secondary branch and panicle weight. Correlation analysis reveals robust positive connections between panicle weight, the number of filled grains per panicle, 1000-grain weight and single plant yield. The number of secondary branches exhibits the most substantial phenotypic coefficient of variation (PCV%) at 47.14%, accompanied by a genotypic coefficient of variation (GCV%) of 43.57%. Traits such as days to 50% flowering, plant height and number of filled grains per panicle manifest high heritability (97.04%, 91.24% and 76.22% respectively) and notable genetic advancement (23.11%, 39.62% and 47.49%). The principal component analysis identifies the primary component (PC1) as the principal contributor to variance. Biplot analysis accentuates positive correlations between attributes like the number of filled grains per panicle, panicle length, plant height, primary branch count, panicle weight, seeds per primary branch and the number of secondary branches with single plant yield. By employing Mahalanobis D² statistics, the classification of genotypes into 6 distinct clusters reveals clusters III and IV as distinguished by their significant inter-cluster and intra-cluster distances. This comprehensive analysis unveils the potential for harnessing panicle architecture traits to enhance grain production and advances our comprehension of intricate relationships within diverse rice genotypes.
STFC Centre for Environmental Data Archival (CEDA) Annual Report 2013 (April 2012-March 2013)
The mission of the Centre for Environmental Data Archival (CEDA) is to deliver long-term curation of scientifically important environmental data while facilitating the use of data by the environmental science community. CEDA was established by the amalgamation of the activities of two of the Natural Environment Research Council (NERC) designated data centres: the British Atmospheric Data Centre and the NERC Earth Observation Data Centre.
We are pleased to present here our fourth annual report, covering activities for the 2013 year (April 2012 to March 2013). The report consists of two sections and appendices: the first section provides a summary of activities and statistics, with short descriptions of some significant activities, and the second section introduces some exemplar projects and activities. The report concludes with additional details of activities such as publications, software maintained, etc.