11 research outputs found

Hadoop for High-Performance Climate Analytics: Use Cases and Lessons Learned

Author: Tamkin Glenn

Scientific data services are a critical aspect of the NASA Center for Climate Simulation (NCCS) mission. Hadoop, via MapReduce, provides an approach to high-performance analytics that is proving useful for data-intensive problems in climate research. It offers an analysis paradigm that uses clusters of computers and combines distributed storage of large data sets with parallel computation. The NCCS is particularly interested in the potential of Hadoop to speed up basic operations common to a wide range of analyses. To evaluate this potential, we prototyped a series of canonical MapReduce operations over a test suite of observational and climate simulation datasets. The initial focus was on averaging operations over arbitrary spatial and temporal extents within Modern-Era Retrospective Analysis for Research and Applications (MERRA) data. After preliminary results suggested that this approach improves efficiencies within data-intensive analytic workflows, we invested in building a cyberinfrastructure resource for developing a new generation of climate data analysis capabilities using Hadoop. This resource is focused on reducing the time spent in the preparation of reanalysis data used in data-model intercomparison, a long-sought goal of the climate community. This paper summarizes the related use cases and lessons learned.
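
The averaging operation described in this abstract can be illustrated with a minimal, single-process sketch of the map, shuffle, and reduce steps. This is not the authors' Hadoop code: the record layout, the sample values, and the names `mapper`, `reducer`, and `average_by_month` are assumptions made for illustration, and no real MERRA data or Hadoop API is involved.

```python
from collections import defaultdict
from functools import reduce

# Hypothetical records: (year, month, lat, lon, value). Not real MERRA data.
records = [
    (2000, 1, 10.0, 20.0, 280.0),
    (2000, 1, 12.0, 22.0, 282.0),
    (2000, 2, 10.0, 20.0, 284.0),
    (2000, 1, 50.0, 20.0, 260.0),  # falls outside the latitude range used below
]


def mapper(record, lat_range, lon_range):
    """Emit ((year, month), (value, 1)) for records inside the bounding box."""
    year, month, lat, lon, value = record
    inside_lat = lat_range[0] <= lat <= lat_range[1]
    inside_lon = lon_range[0] <= lon <= lon_range[1]
    if inside_lat and inside_lon:
        yield (year, month), (value, 1)


def reducer(pairs):
    """Add up the (sum, count) pairs for one key and return the mean."""
    total, count = reduce(lambda a, b: (a[0] + b[0], a[1] + b[1]), pairs)
    return total / count


def average_by_month(data, lat_range, lon_range):
    """Run map, group by key (the shuffle step), then reduce each group."""
    grouped = defaultdict(list)
    for record in data:
        for key, pair in mapper(record, lat_range, lon_range):
            grouped[key].append(pair)
    return {key: reducer(pairs) for key, pairs in grouped.items()}


monthly_means = average_by_month(records, (0.0, 30.0), (0.0, 30.0))
print(monthly_means)  # {(2000, 1): 281.0, (2000, 2): 284.0}
```

Emitting (sum, count) pairs rather than bare values keeps the reduction associative, which is what would let a framework combine partial results on each node before the final average is taken.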

NASA Technical Reports Server

MERRA/AS: The MERRA Analytic Services Project Interim Report

Author: Duffy Dan
Grieg Cristina
Luczak Ed
McInerney Mark
Nadeau Denis
Schnase John
Tamkin Glenn
Thompson Hoot

MERRA/AS is a cyberinfrastructure resource that will combine iRODS-based Climate Data Server (CDS) capabilities with Cloudera MapReduce to serve MERRA analytic products, store the MERRA reanalysis data collection in an HDFS to enable parallel, high-performance, storage-side data reductions, manage storage-side driver, mapper, and reducer code sets and realized objects for users, and provide a library of commonly used spatiotemporal operations that can be composed to enable higher-order analyses.

NASA Technical Reports Server

The Virtual Climate Data Server (vCDS): An iRODS-Based Data Management Software Appliance Supporting Climate Data Services and Virtualization-as-a-Service in the NASA Center for Climate Simulation

Author: Duffy Daniel Q.
Gill Roger
Ripley W. David III
Schnase John L.
Strong Savannah
Tamkin Glenn S.

Scientific data services are becoming an important part of the NASA Center for Climate Simulation's mission. Our technological response to this expanding role is built around the concept of a Virtual Climate Data Server (vCDS), repetitive provisioning, image-based deployment and distribution, and virtualization-as-a-service. The vCDS is an iRODS-based data server specialized to the needs of a particular data-centric application. We use RPM scripts to build vCDS images in our local computing environment, our local Virtual Machine Environment, NASA's Nebula Cloud Services, and Amazon's Elastic Compute Cloud. Once provisioned into one or more of these virtualized resource classes, vCDSs can use iRODS's federation capabilities to create an integrated ecosystem of managed collections that is scalable and adaptable to changing resource requirements. This approach enables platform- or software-as-a-service deployment of vCDS and allows the NCCS to offer virtualization-as-a-service: a capacity to respond in an agile way to new customer requests for data services.

NASA Technical Reports Server

MERRA Analytic Services: Meeting the Big Data Challenges of Climate Science Through Cloud-enabled Climate Analytics-as-a-service

Author: Duffy Daniel Quinn
Grieg Christina M.
McInerney Mark A.
Nadeau Denis
Schnase John L.
Tamkin Glenn S.
Thompson John H.
Webster William P.

Climate science is a Big Data domain that is experiencing unprecedented growth. In our efforts to address the Big Data challenges of climate science, we are moving toward a notion of Climate Analytics-as-a-Service (CAaaS). We focus on analytics because it is the knowledge gained from our interactions with Big Data that ultimately produces societal benefits. We focus on CAaaS because we believe it provides a useful way of thinking about the problem: a specialization of the concept of business process-as-a-service, which is an evolving extension of IaaS, PaaS, and SaaS enabled by Cloud Computing. Within this framework, Cloud Computing plays an important role; however, we see it as only one element in a constellation of capabilities that are essential to delivering climate analytics as a service. These elements are essential because in the aggregate they lead to generativity, a capacity for self-assembly that we feel is the key to solving many of the Big Data challenges in this domain. MERRA Analytic Services (MERRA/AS) is an example of cloud-enabled CAaaS built on this principle. MERRA/AS enables MapReduce analytics over NASA's Modern-Era Retrospective Analysis for Research and Applications (MERRA) data collection. The MERRA reanalysis integrates observational data with numerical models to produce a global temporally and spatially consistent synthesis of 26 key climate variables. It represents a type of data product that is of growing importance to scientists doing climate change research and a wide range of decision support applications. MERRA/AS brings together the following generative elements in a full, end-to-end demonstration of CAaaS capabilities: (1) high-performance, data-proximal analytics, (2) scalable data management, (3) software appliance virtualization, (4) adaptive analytics, and (5) a domain-harmonized API. The effectiveness of MERRA/AS has been demonstrated in several applications.
In our experience, Cloud Computing lowers the barriers and risk to organizational change, fosters innovation and experimentation, facilitates technology transfer, and provides the agility required to meet our customers' increasing and changing needs. Cloud Computing is providing a new tier in the data services stack that helps connect earthbound, enterprise-level data and computational resources to new customers and new mobility-driven applications and modes of work. For climate science, Cloud Computing's capacity to engage communities in the construction of new capabilities is perhaps the most important link between Cloud Computing and Big Data.

NASA Technical Reports Server

System and Method for Providing a Climate Data Persistence Service

Author: Duffy Daniel Q.
McInerney Mark
Nadeau Denis
Ripley III, William David
Schnase John L.
Sinno Scott
Strong Savannah L.
Tamkin Glenn S.
Thompson John H.

A system, method, and computer-readable storage devices for providing a climate data persistence service. A system configured to provide the service can include a climate data server that performs data and metadata storage and management functions for climate data objects, a compute-storage platform that provides the resources needed to support a climate data server, provisioning software that allows climate data server instances to be deployed as virtual climate data servers in a cloud computing environment, and a service interface, wherein persistence service capabilities are invoked by software applications running on a client device. The climate data objects can be in various formats, such as International Organization for Standardization (ISO) Open Archival Information System (OAIS) Reference Model Submission Information Packages, Archive Information Packages, and Dissemination Information Packages. The climate data server can enable scalable, federated storage, management, discovery, and access, and can be tailored for particular use cases.

NASA Technical Reports Server

Big Data Challenges in Climate Science: Improving the Next-Generation Cyberinfrastructure

Author: Carriere Laura
Cinquini Luca
Duffy Daniel Q.
Hart Andre F.
Lee Tsengdar J.
Lynnes Christopher S.
Mattmann Chris A.
McInerney Mark A.
Potter Gerald L.
Ramirez Paul M.
Rinsland Pamela
Schnase John L.
Tamkin Glenn S.
Waliser Duane
Webster W. Philip
Williams Dean N.

The knowledge we gain from research in climate science depends on the generation, dissemination, and analysis of high-quality data. This work comprises technical practice as well as social practice, both of which are distinguished by their massive scale and global reach. As a result, the amount of data involved in climate research is growing at an unprecedented rate. Climate model intercomparison (CMIP) experiments, the integration of observational data and climate reanalysis data with climate model outputs, as seen in the Obs4MIPs, Ana4MIPs, and CREATE-IP activities, and the collaborative work of the Intergovernmental Panel on Climate Change (IPCC) provide examples of the types of activities that increasingly require an improved cyberinfrastructure for dealing with large amounts of critical scientific data. This paper provides an overview of some of climate science's big data problems and the technical solutions being developed to advance data publication, climate analytics as a service, and interoperability within the Earth System Grid Federation (ESGF), the primary cyberinfrastructure currently supporting global climate research activities.

NASA Technical Reports Server

The Minimum Modulation Curve as a tool for specifying optical performance: application to surfaces with mid-spatial frequency errors

Author: Alonso
Aryan
Church
Dube
Dunn
Filhaber
Forbes
Glenn D. Boreman
Hadar
Hamidreza Aryan
Marioge
Ross
Stover
Tamkin
Thomas J. Suleski
Tinker
Publication venue: 'The Optical Society'

Crossref

Toward a Monte Carlo approach to selecting climate variables in MaxEnt.

Author: Caleb S Spradlin
Glenn S Tamkin
Jian Li
John L Schnase
Mark L Carroll
Mary E Aronne
Roger L Gill
Savannah L Strong
Thomas P Maxwell
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2021

MaxEnt is an important aid in understanding the influence of climate change on species distributions. There is growing interest in using IPCC-class global climate model outputs as environmental predictors in this work. These models provide realistic, global representations of the climate system, projections for hundreds of variables (including Essential Climate Variables), and combine observations from an array of satellite, airborne, and in-situ sensors. Unfortunately, direct use of this important class of data in MaxEnt modeling has been limited by the large size of climate model output collections and the fact that MaxEnt can only operate on a relatively small set of predictors stored in a computer's main memory. In this study, we demonstrate the feasibility of a Monte Carlo method that overcomes this limitation by finding a useful subset of predictors in a larger, externally stored collection of environmental variables in a reasonable amount of time. Our proposed solution takes an ensemble approach wherein many MaxEnt runs, each drawing on a small random subset of variables, converge on a global estimate of the top contributing subset of variables in the larger collection. In preliminary tests, the Monte Carlo approach selected a consistent set of top six variables within 540 runs, with the four most contributory variables of the top six accounting for approximately 93% of overall permutation importance in a final model. These results suggest that a Monte Carlo approach could offer a viable means of screening environmental predictors prior to final model construction that is amenable to parallelization and scalable to very large data sets. This points to the possibility of near-real-time multiprocessor implementations that could enable broader and more exploratory use of global climate model outputs in environmental niche modeling and aid in the discovery of viable predictors.
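
The ensemble idea in this abstract (many runs, each on a small random subset of variables, pooled into one ranking) can be sketched as follows. This is a toy illustration, not the authors' implementation: MaxEnt itself is replaced by a hypothetical `stand_in_model` that returns noisy importance scores, and the variable names, importance values, subset size, and run count are all assumptions chosen for the example.

```python
import random
from collections import defaultdict


def stand_in_model(subset, true_importance, rng):
    """Hypothetical placeholder for one MaxEnt run.

    Returns a noisy importance score per variable, normalized to sum to 1
    within the subset (loosely mimicking a per-run importance table).
    """
    raw = {v: max(true_importance[v] + rng.gauss(0, 0.02), 0.0) for v in subset}
    total = sum(raw.values()) or 1.0
    return {v: score / total for v, score in raw.items()}


def monte_carlo_select(variables, true_importance, runs=500,
                       subset_size=6, top_n=3, seed=0):
    """Rank variables by their mean score across many random-subset runs."""
    rng = random.Random(seed)
    totals = defaultdict(float)
    counts = defaultdict(int)
    for _ in range(runs):
        subset = rng.sample(variables, subset_size)  # small random draw
        for v, score in stand_in_model(subset, true_importance, rng).items():
            totals[v] += score
            counts[v] += 1
    mean_score = {v: totals[v] / counts[v] for v in totals}
    return sorted(mean_score, key=mean_score.get, reverse=True)[:top_n]


# Hypothetical collection: 20 variables, three of which matter strongly.
variables = [f"v{i}" for i in range(20)]
importance = {v: 0.05 for v in variables}
importance.update({"v0": 0.9, "v1": 0.8, "v2": 0.7})

top = monte_carlo_select(variables, importance)
print(top)
```

Because each run is independent of the others, the loop body could be distributed across workers and the per-variable totals merged afterward, which is the property the abstract points to when it mentions parallelization.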

Directory of Open Access Journals

Simple methods for estimating the performance and specification of optical components with anisotropic mid-spatial frequency surface errors

Author: Aikens
Aryan
Church
Deck
Dunn
Duparré
Elson
Filhaber
Forbes
Glenn D. Boreman
Hamidreza Aryan
Lawson
Marioge
Sidick
Sohn
Tamkin
Thomas J. Suleski
Youngworth
Publication venue: 'The Optical Society'

Crossref

White matter deficits in psychopathic offenders and correlation with factor structure

Author: AL Glenn
Arash Nazeri
Aristotle N. Voineskos
AS Tamkin
B Schiffer
Daniel Felsky
Danilo R. de Jesus
Dennis J. L. G. Schutter
EC Finger
F Sundram
GE Munro
H Soderstrom
I Barkataki
J Acosta-Cabronero
J Intrator
JC Keel
JC Motzkin
JL Muller
JL Weiss
JM Bjork
JW Buckholtz
K Rubia
KA Kiehl
L Heimer
LM Gatzke-Kopp
M Boccardi
M Cima
M Koenigs
N Birbaumer
R Veit
RC Oldfield
RD Hare
RJ Blair
S Ducharme
SA De Brito
SM Smith
Sylco S. Hoppenbrouwers
Tania Stirpe
TE Behrens
TE Nichols
VA Coenen
Xi-Nian Zuo
Y Gao
Y Yang
Zafiris J. Daskalakis
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2013

Psychopathic offenders show a persistent pattern of emotional unresponsivity to the often horrendous crimes they perpetrate. Recent studies have related psychopathy to alterations in white matter. Therefore, diffusion tensor imaging followed by tract-based spatial statistics (TBSS) analysis in 11 psychopathic offenders matched to 11 healthy controls was completed. Fractional anisotropy was calculated within each voxel and comparisons were made between groups using a permutation test. Any clusters of white matter voxels different between groups were submitted to probabilistic tractography. Significant differences in fractional anisotropy were found between psychopathic offenders and healthy controls in three main white matter clusters. These three clusters represented two major networks: an amygdalo-prefrontal network, and a striato-thalamo-frontal network. The interpersonal/affective component of the PCL-R correlated with white matter deficits in the orbitofrontal cortex and frontal pole whereas the antisocial component correlated with deficits in the striato-thalamo-frontal network. In addition to replicating earlier work concerning disruption of an amygdala-prefrontal network, we show for the first time that white matter integrity in a striato-thalamo-frontal network is disrupted in psychopathic offenders. The novelty of our findings lies in the two dissociable white matter networks that map directly onto the two major factors of psychopathy.

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Radboud Repository