29 research outputs found

    Application of Data Cubes for Improving Detection of Water Cycle Extreme Events

    As part of an ongoing NASA-funded project to remove a longstanding barrier to accessing NASA data for the hydrology and other point-time-series-oriented communities (i.e., accessing archived time-step array data as point-time series), "data cubes" are created, from which time series files (aka "data rods") are generated on the fly and made available as Web services from the Goddard Earth Sciences Data and Information Services Center (GES DISC). Data cubes are the archived data rearranged into spatio-temporal matrices, which allow easy access to the data both spatially and temporally. A data cube is a specific case of the general optimal strategy of reorganizing data to match the desired means of access. The gain from such reorganization grows with the size of the data set. As a use case of our project, we are leveraging existing software to explore the application of the data cube concept to machine learning, for the purpose of detecting water cycle extreme events, a specific case of anomaly detection that requires time series data. We investigate the use of support vector machines (SVM) for anomaly classification. We show an example of detection of water cycle extreme events using data from the Tropical Rainfall Measuring Mission (TRMM).
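
    The reorganization described above can be illustrated with a minimal sketch. It assumes each archived time step is a small 2-D grid held in memory as nested lists; the function name `build_data_rods` and the sample values are hypothetical and are not taken from the GES DISC software.

```python
from typing import Dict, List, Tuple

Grid = List[List[float]]  # one archived time step: grid[row][col]


def build_data_rods(time_steps: List[Grid]) -> Dict[Tuple[int, int], List[float]]:
    """Rearrange a sequence of time-step grids into per-cell time series.

    Input is ordered time-first (one full grid per time step); the output
    is ordered location-first, so one lookup returns a whole time series.
    """
    rods: Dict[Tuple[int, int], List[float]] = {}
    for grid in time_steps:
        for r, row in enumerate(grid):
            for c, value in enumerate(row):
                rods.setdefault((r, c), []).append(value)
    return rods


# Three hypothetical time steps on a 2 x 2 grid (made-up values).
steps = [
    [[0.0, 1.0], [2.0, 3.0]],
    [[0.5, 1.5], [2.5, 3.5]],
    [[9.0, 1.2], [2.2, 3.2]],
]
rods = build_data_rods(steps)
series = rods[(0, 0)]  # time series for grid cell (row 0, col 0)
```

    Once the data are arranged this way, reading the time series for one location is a single lookup instead of a pass over every archived time step, which is why the benefit grows with the size of the data set.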

    MODIS Aerosol Optical Depth Bias Adjustment Using Machine Learning Algorithms

    To monitor the Earth's atmosphere and its surface changes, satellite-based instruments collect data continuously. While some of the data are used directly, other quantities, such as aerosol properties, are retrieved indirectly from the observations. While retrieved variables (RV) form very powerful products, they don't come without obstacles. Different satellite viewing geometries, calibration issues, and dynamically changing atmospheric and Earth surface conditions, together with complex interactions between observed entities and their environment, affect them greatly. This results in random and systematic errors in the final products.

    Estimation and Bias Correction of Aerosol Abundance using Data-driven Machine Learning and Remote Sensing

    Air quality is increasingly becoming a public health concern, since some aerosol particles are harmful to people's health. One widely available metric of aerosol abundance is the aerosol optical depth (AOD). The AOD is the light extinction coefficient integrated over a vertical atmospheric column of unit cross section, and it represents the extent to which the aerosols in that vertical profile prevent the transmission of light by absorption or scattering. A comparison between the AOD measured at 550 nm by the ground-based Aerosol Robotic Network (AERONET) system and by the satellite MODIS instruments shows that there is a bias between the two data products. We performed a comprehensive analysis exploring possible factors that may contribute to the inter-instrument bias between MODIS and AERONET. The analysis used several measured variables, including the MODIS AOD, as input to train a neural network in regression mode to predict the AERONET AOD values. This not only allowed us to obtain an estimate, but also allowed us to infer the optimal sets of variables that played an important role in the prediction. In addition, we applied machine learning to infer the global abundance of ground-level PM2.5 from the AOD data and other ancillary satellite and meteorology products. This research is part of our goal to provide air quality information, which can also be useful for global epidemiology studies.
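
    The definition of AOD as a column-integrated extinction coefficient can be made concrete with a short numerical sketch. It assumes a hypothetical extinction profile sampled at a few heights and uses the trapezoidal rule; the function names and profile values are illustrative only and do not come from the MODIS or AERONET retrievals. The transmittance helper applies the standard Beer-Lambert relation for a vertical path.

```python
import math
from typing import Sequence


def aerosol_optical_depth(heights_km: Sequence[float],
                          extinction_per_km: Sequence[float]) -> float:
    """Integrate an extinction profile over height with the trapezoidal rule."""
    if len(heights_km) != len(extinction_per_km) or len(heights_km) < 2:
        raise ValueError("need two or more matching height/extinction samples")
    tau = 0.0
    for i in range(1, len(heights_km)):
        dz = heights_km[i] - heights_km[i - 1]
        tau += 0.5 * (extinction_per_km[i] + extinction_per_km[i - 1]) * dz
    return tau


def direct_transmittance(tau: float) -> float:
    """Fraction of a vertical direct beam that is transmitted: exp(-tau)."""
    return math.exp(-tau)


# Hypothetical profile: extinction decreasing with height (made-up numbers).
heights = [0.0, 1.0, 2.0, 3.0]         # km
extinction = [0.20, 0.10, 0.05, 0.00]  # 1/km
tau = aerosol_optical_depth(heights, extinction)
transmitted = direct_transmittance(tau)
```

    A larger optical depth means a smaller transmitted fraction, which is the sense in which AOD measures how strongly the aerosols in the column block light.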

    Bringing Analysis Closer to Data: Developing a Visualization Tool for L2 Earth Science Satellite Data

    Earth Science satellite missions provide a unique opportunity for scientists to visualize complex and multifaceted observations projected geospatially across maps of the Earth. While visualization tools can help scientists comprehend, analyze, and share data, visualizing Level-2 Earth Science data poses its own specific set of challenges. Since the geospatial information in Level-2 data files is stored as independent variables, the plotting process involves matching dimensional information from latitude and longitude with a desired variable. Variables are stored in different ways across various Earth Science data file formats, which complicates the process of extracting data and plotting variables from a given file without requiring extensive user input and prior familiarity with the file type's variable structure. In coordination with NASA's Goddard Earth Sciences Data and Information Services Center (GES DISC), the team developed a Level-2 Earth Science data visualization tool that aims to address some of the complexities associated with plotting Level-2 data. This tool offers command-line and user interface support for file and variable selection to accommodate varying use cases and degrees of user familiarity with the structure of a given file. The visualization tool is written in Python 3 and uses a modular approach to facilitate continued expansion and reuse. In addressing some common complications involved in plotting Level-2 Earth Science data, the tool aims to help link the process of analysis more directly with data acquisition and visualization, bringing analysis closer to data across levels of processing.

    Complexities in Subsetting Level 2 Data

    Satellite Level 2 data presents unique challenges for tools and services. From nonlinear spatial geometry to inhomogeneous file data structure to inconsistent temporal variables to complex data variable dimensionality to multiple file formats, there are many difficulties in creating general tools for Level 2 data support. At the NASA Goddard Earth Sciences Data and Information Services Center (GES DISC), we are implementing a general subsetting service that subsets Level 2 data to a user-specified spatio-temporal region of interest (ROI). In this presentation, we will unravel some of the challenges faced in creating this service and the strategies we used to surmount them.
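
    As a rough illustration of why swath geometry complicates subsetting, the sketch below selects the pixels of a Level 2 swath that fall inside a latitude/longitude box. It is not the GES DISC service: the function name `subset_swath`, the nested-list layout, and the sample values are hypothetical, the temporal part of the ROI is omitted, and the box is assumed not to cross the antimeridian.

```python
from typing import List, Tuple

Swath = List[List[float]]  # swath[scan_line][pixel]


def subset_swath(lat: Swath, lon: Swath, values: Swath,
                 south: float, north: float,
                 west: float, east: float) -> List[Tuple[float, float, float]]:
    """Return (lat, lon, value) for every pixel inside the bounding box.

    Each pixel carries its own latitude and longitude, so the test is made
    per pixel rather than by slicing rows and columns of a regular grid.
    Assumes west <= east (no antimeridian crossing).
    """
    selected = []
    for lat_row, lon_row, val_row in zip(lat, lon, values):
        for la, lo, v in zip(lat_row, lon_row, val_row):
            if south <= la <= north and west <= lo <= east:
                selected.append((la, lo, v))
    return selected


# Hypothetical 2 x 2 swath with irregular geolocation (made-up values).
lat = [[10.0, 10.5], [11.0, 11.6]]
lon = [[20.0, 21.0], [20.2, 21.3]]
values = [[1.0, 2.0], [3.0, 4.0]]
roi_pixels = subset_swath(lat, lon, values,
                          south=10.4, north=11.2, west=20.0, east=21.1)
```

    The selected pixels generally do not form a rectangular block of the original array, which is one reason the output of a Level 2 subsetter is harder to define than that of a gridded-data subsetter.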

    Mining Twitter Data to Augment NASA GPM Validation

    The Twitter data stream is an important new source of real-time and historical global information for potentially augmenting the validation program of NASA's Global Precipitation Measurement (GPM) mission. There have been other similar uses of Twitter, though mostly related to natural hazards monitoring and management. The validation of satellite precipitation estimates is challenging, because many regions lack data or access to data, especially outside of the U.S. and in remote and developing areas. The time-varying set of "precipitation" tweets can be thought of as an organic network of rain gauges, potentially providing a widespread view of precipitation occurrence. Twitter provides a large pool of contributors for crowdsourcing. During a 24-hour period in the middle of the snow storm this past March in the U.S. Northeast, we collected more than 13,000 relevant precipitation tweets with exact geolocation. The overall objective of our project is to determine the extent to which processed tweets can provide additional information that improves the validation of GPM data. Though our current effort focuses on tweets and precipitation, our approach is general and applicable to other social media and other geophysical measurements. Specifically, we have developed an operational infrastructure for processing tweets into a format suitable for analysis with GPM data; engaged with potential participants, both passive and active, to "enrich" the Twitter stream; and inter-compared "precipitation" tweet data, ground station data, and GPM retrievals. In this presentation, we detail the technical capabilities of our tweet processing infrastructure, including data abstraction, feature extraction, search engine, context-awareness, real-time processing, and high volume (big) data processing; various means for "enriching" the Twitter stream; and results of inter-comparisons. Our project should bring a new kind of visibility to Twitter and engender a new kind of appreciation of the value of Twitter by the science research communities.

    CO2 Data Distribution and Support from the Goddard Earth Science Data and Information Services Center (GES-DISC)

    This talk will describe the support and distribution of CO2 data products from OCO-2, AIRS, and ACOS that are archived at and distributed from the Goddard Earth Sciences Data and Information Services Center. We will provide a brief summary of the current online archive and distribution metrics for the OCO-2 Level 1 products and plans for the Level 2 products. We will also describe collaborative data sets and services (e.g., matchups with other sensors) and solicit feedback for potential future services.

    NASA GES DISC support of CO2 Data from OCO-2, ACOS, and AIRS

    The NASA Goddard Earth Sciences Data and Information Services Center (GES DISC) is the data center assigned to archive and distribute current AIRS and ACOS data and data from the upcoming OCO-2 mission. The GES DISC archives and supports data containing information on CO2 as well as other atmospheric composition, atmospheric dynamics, modeling and precipitation. Along with data stewardship, an important mission of GES DISC is to facilitate access to and enhance the usability of data, as well as to broaden the user base. GES DISC strives to promote awareness of the science content and novelty of the data by working with Science Team members and releasing news articles as appropriate. Analyses of events that are of interest to the general public, and that help in understanding the goals of NASA Earth Observing missions, have been among the most popular practices. Users have unrestricted access to a user-friendly search interface, Mirador, that allows temporal, spatial, keyword and event searches, as well as an ontology-driven drill down. Variable subsetting, format conversion, quality screening, and quick browse are among the services available in Mirador. The majority of the GES DISC data are also accessible through OPeNDAP (Open-source Project for a Network Data Access Protocol) and WMS (Web Map Service). These services add more options for specialized subsetting, format conversion, and image viewing, and contribute to data interoperability.