390 research outputs found
A Study on Earth Science Data Generation through Numerical Modeling and Machine Learning in a Cloud Computing Environment
Doctoral dissertation -- Seoul National University Graduate School: College of Natural Sciences, School of Earth and Environmental Sciences, 2022. 8. Cho Yang-Ki.
To investigate changes and phenomena on Earth, many scientists use high-resolution results from numerical models, or develop and apply machine learning-based prediction models trained on observed data. As information technology advances, there is a need for a practical methodology for generating local and global earth science data with high-resolution numerical modeling and machine learning.
This study recommends data generation and processing using high-resolution numerical models of earth science and machine learning-based prediction models in a cloud environment.
To verify the reproducibility and portability of high-resolution numerical ocean model implementation on cloud computing, I simulated and analyzed the performance of a numerical ocean model at various resolutions in the model domain, including the Northwest Pacific Ocean, the East Sea, and the Yellow Sea. With the containerization method, it was possible to respond to changes in various infrastructure environments and achieve computational reproducibility effectively.
Subsurface temperature data were augmented with generative models to prepare large training datasets for a model that predicts the vertical temperature distribution in the ocean. The augmentation was applied to the observed data, which are relatively scarce compared with satellite datasets.
In addition to observation data, HYCOM datasets were used for performance comparison, and the data distribution of augmented data was similar to the input data distribution. The ensemble method, which combines stand-alone predictive models, improved the performance of the predictive model compared to that of the model based on the existing observed data. Large amounts of computational resources were required for data synthesis, and the synthesis was performed in a cloud-based graphics processing unit environment.
High-resolution numerical ocean model simulation, predictive model development, and the data generation method can improve predictive capabilities in the field of ocean science. The numerical modeling and generative models based on cloud computing used in this study can be broadly applied to various fields of earth science.
1. General Introduction
2. Performance of numerical ocean modeling on cloud computing
2.1. Introduction
2.2. Cloud Computing
2.2.1. Cloud computing overview
2.2.2. Commercial cloud computing services
2.3. Numerical model for performance analysis of commercial clouds
2.3.1. High Performance Linpack Benchmark
2.3.2. Benchmark Sustainable Memory Bandwidth and Memory Latency
2.3.3. Numerical Ocean Model
2.3.4. Deployment of Numerical Ocean Model and Benchmark Packages on Cloud Clusters
2.4. Simulation results
2.4.1. Benchmark simulation
2.4.2. Ocean model simulation
2.5. Analysis of ROMS performance on commercial clouds
2.5.1. Performance of ROMS according to H/W resources
2.5.2. Performance of ROMS according to grid size
2.6. Summary
3. Reproducibility of numerical ocean model on the cloud computing
3.1. Introduction
3.2. Containerization of numerical ocean model
3.2.1. Container virtualization
3.2.2. Container-based architecture for HPC
3.2.3. Container-based architecture for hybrid cloud
3.3. Materials and Methods
3.3.1. Comparison of traditional and container based HPC cluster workflows
3.3.2. Model domain and datasets for numerical simulation
3.3.3. Building the container image and registration in the repository
3.3.4. Configuring a numeric model execution cluster
3.4. Results and Discussion
3.4.1. Reproducibility
3.4.2. Portability and Performance
3.5. Conclusions
4. Generative models for the prediction of ocean temperature profile
4.1. Introduction
4.2. Materials and Methods
4.2.1. Model domain and datasets for predicting the subsurface temperature
4.2.2. Model architecture for predicting the subsurface temperature
4.2.3. Neural network generative models
4.2.4. Prediction Models
4.2.5. Accuracy
4.3. Results and Discussion
4.3.1. Data Generation
4.3.2. Ensemble Prediction
4.3.3. Limitations of this study and future works
4.4. Conclusion
5. Summary and conclusion
6. References
7. Abstract (in Korean)
Efficacy of Feedforward and LSTM Neural Networks at Predicting and Gap Filling Coastal Ocean Timeseries: Oxygen, Nutrients, and Temperature
Ocean data timeseries are vital for a diverse range of stakeholders (ranging from government, to industry, to academia) to underpin research, support decision making, and identify environmental change. However, continuous monitoring and observation of ocean variables is difficult and expensive. Moreover, since oceans are vast, observations are typically sparse in spatial and temporal resolution. In addition, the hostile ocean environment creates challenges for collecting and maintaining data sets, such as instrument malfunctions and servicing, often resulting in temporal gaps of varying lengths. Neural networks (NN) have proven effective in many diverse big data applications, but few oceanographic applications have been tested using modern frameworks and architectures. Therefore, here we demonstrate a "proof of concept" neural network application using a popular "off-the-shelf" framework called "TensorFlow" to predict subsurface ocean variables including dissolved oxygen and nutrient (nitrate, phosphate, and silicate) concentrations, and temperature timeseries, and show how these models can be used successfully for gap filling data products. We achieved a final prediction accuracy of over 96% for oxygen and temperature, and mean squared errors (MSE) of 2.63, 0.0099, and 0.78, for nitrates, phosphates, and silicates, respectively. The temperature gap-filling was done with an innovative contextual Long Short-Term Memory (LSTM) NN that uses data before and after the gap as separate feature variables. We also demonstrate the application of a novel dropout based approach to approximate the Bayesian uncertainty of these temperature predictions. This Bayesian uncertainty is represented in the form of 100 Monte Carlo dropout estimates of the two longest gaps in the temperature timeseries from a model with 25% dropout in the input and recurrent LSTM connections.
Throughout the study, we present the NN training process including the tuning of the large number of NN hyperparameters, which could pose a barrier to uptake among researchers and other oceanographic data users. Our models can be scaled up and applied operationally to provide consistent, gap-free data to all data users, thus encouraging data uptake for data-based decision making.
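The Monte Carlo dropout idea mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' TensorFlow LSTM: it assumes a tiny feedforward network with hypothetical fixed weights (`W_HIDDEN`, `W_OUT`) and applies dropout only to the inputs. Only the 25% dropout rate and the 100 stochastic estimates are taken from the abstract; everything else is an illustrative assumption.

```python
import math
import random
import statistics


def forward(x, w_hidden, w_out, drop_rate, rng):
    """One stochastic forward pass of a tiny one-hidden-layer network.

    Dropout stays active at prediction time, which is what makes this
    Monte Carlo dropout rather than ordinary deterministic inference.
    """
    keep = 1.0 - drop_rate
    # Drop each input with probability drop_rate and rescale the survivors
    # (inverted dropout) so the expected activation is unchanged.
    x_dropped = [xi / keep if rng.random() < keep else 0.0 for xi in x]
    hidden = [
        math.tanh(sum(w * xi for w, xi in zip(row, x_dropped)))
        for row in w_hidden
    ]
    return sum(w * h for w, h in zip(w_out, hidden))


def mc_dropout_predict(x, w_hidden, w_out, drop_rate=0.25, n_samples=100, seed=0):
    """Return the mean, standard deviation, and raw samples of repeated passes."""
    rng = random.Random(seed)
    samples = [forward(x, w_hidden, w_out, drop_rate, rng) for _ in range(n_samples)]
    return statistics.mean(samples), statistics.stdev(samples), samples


# Hypothetical fixed weights standing in for a trained model.
W_HIDDEN = [[0.5, -0.3, 0.8], [0.1, 0.9, -0.4]]
W_OUT = [1.2, -0.7]

mean, spread, samples = mc_dropout_predict([0.2, 0.5, -0.1], W_HIDDEN, W_OUT)
print(f"mean={mean:.3f} spread={spread:.3f} n={len(samples)}")
```

The spread of the sampled predictions serves as a rough uncertainty estimate; setting `drop_rate=0.0` collapses all samples to the single deterministic prediction.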
An Artificial Neural Network to Infer the Mediterranean 3D Chlorophyll-a and Temperature Fields from Remote Sensing Observations
Remote sensing data provide a huge number of sea surface observations, but cannot give direct information on deeper ocean layers, which can only be provided by sparse in situ data. The combination of measurements collected by satellite and in situ sensors represents one of the most effective strategies to improve our knowledge of the interior structure of the ocean ecosystems. In this work, we describe a Multi-Layer-Perceptron (MLP) network designed to reconstruct the 3D fields of ocean temperature and chlorophyll-a concentration, two variables of primary importance for many upper-ocean bio-physical processes. Artificial neural networks can efficiently model possible non-linear relationships among input variables, and the choice of the predictors is thus crucial to build an accurate model. Here, concurrent temperature and chlorophyll-a in situ profiles and several different combinations of satellite-derived surface predictors are used to identify the optimal model configuration, focusing on the Mediterranean Sea. The lowest errors are obtained when taking as input surface chlorophyll-a, temperature, and altimeter-derived absolute dynamic topography and surface geostrophic velocity components. Network training and test validations give comparable results, significantly improving with respect to Mediterranean climatological data (MEDATLAS). 3D fields are then also reconstructed from full basin 2D satellite monthly climatologies (1998–2015) and the resulting 3D seasonal patterns are analyzed. The method accurately infers the vertical shape of temperature and chlorophyll-a profiles and their spatial and temporal variability. It thus represents an effective tool to overcome the sparseness of in situ data and the limits of satellite observations, and is also potentially suitable for the initialization and validation of bio-geophysical models.
Subsurface temperature estimation of mesoscale eddies in the Northwest Pacific Ocean from satellite observations using a residual multi-channel attention convolution network
Mesoscale eddies are prevalent oceanic circulation phenomena that exert significant influence on various aspects of the marine environment, including energy transfer, material transport and ecosystem dynamics in the Northwest Pacific Ocean. However, due to sparse vertical observational data, the understanding of the three-dimensional temperature structure of individual mesoscale eddies remains limited. In recent years, utilizing surface remote sensing observations to estimate subsurface temperature anomaly has been crucial for comprehending the intricate multi-dimensional dynamic processes in the ocean. Consequently, this paper proposes an eddy residual multi-channel attention convolution network (ERCACN) with an adaptive threshold and designs combinations of various surface features to estimate the eddy subsurface temperature anomaly (ESTA). By integrating the results with climatic temperature, thermal structures containing 46 levels at depths up to 1000 m could be obtained, with daily temporal resolution and 0.25° spatial resolution. Validation using independent Argo profiles from 2016 to 2017 reveals that the combination of multiple surface variables outperforms univariate methods, and the ERCACN model demonstrates superior performance compared to other approaches. Overall, with an 8% error deemed acceptable, the ERCACN model achieves a precision of 88.08% in estimating ESTA. This method provides a novel perspective for other essential oceanic variables, contributing to a better perception of the global climate system.
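The abstract does not define how its tolerance-based precision is computed. As one plausible reading, the sketch below counts the fraction of estimates whose relative error against the reference value is at most 8%. The function name, the handling of zero reference values, and the sample numbers are all hypothetical assumptions, not taken from the paper.

```python
def precision_within_tolerance(estimated, observed, tolerance=0.08):
    """Fraction of estimates whose relative error is at most `tolerance`.

    This is an assumed definition for illustration only.
    """
    if len(estimated) != len(observed):
        raise ValueError("inputs must have the same length")
    hits = 0
    counted = 0
    for est, obs in zip(estimated, observed):
        if obs == 0:
            continue  # relative error is undefined for a zero reference value
        counted += 1
        if abs(est - obs) / abs(obs) <= tolerance:
            hits += 1
    if counted == 0:
        raise ValueError("no non-zero observations to score")
    return hits / counted


# Hypothetical reference and estimated values (arbitrary units).
observed = [10.0, 12.0, 8.0, 15.0]
estimated = [10.5, 13.5, 8.1, 14.0]
score = precision_within_tolerance(estimated, observed)
print(f"precision within 8%: {score:.2%}")
```

A relative-error criterion becomes unstable when the reference value is near zero, which is why zero references are skipped here; an absolute tolerance would be another reasonable choice.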
Spatial-Temporal Data Mining for Ocean Science: Data, Methodologies, and Opportunities
With the increasing amount of spatial-temporal (ST) ocean data, numerous
spatial-temporal data mining (STDM) studies have been conducted to address
various oceanic issues, e.g., climate forecasting and disaster warning.
Compared with typical ST data (e.g., traffic data), ST ocean data is more
complicated with some unique characteristics, e.g., diverse regionality and
high sparsity. These characteristics make it difficult to design and train STDM
models. Unfortunately, an overview of these studies is still missing, which
hinders computer scientists from identifying research issues in ocean science and
discourages ocean scientists from applying advanced STDM techniques. To remedy
this situation, we provide a comprehensive survey to summarize existing STDM
studies in ocean. Concretely, we first summarize the widely-used ST ocean
datasets and identify their unique characteristics. Then, typical ST ocean data
quality enhancement techniques are discussed. Next, we classify existing STDM
studies for ocean into four types of tasks, i.e., prediction, event detection,
pattern mining, and anomaly detection, and elaborate the techniques for these
tasks. Finally, promising research opportunities are highlighted. This survey
will help scientists from the fields of both computer science and ocean science
have a better understanding of the fundamental concepts, key techniques, and
open challenges of STDM in ocean.
Study on prediction of SST and SSS in Southern Ocean by multi-layers ConvLSTM model
Master's thesis, Tokyo University of Marine Science and Technology, academic year 2020 (March 2021), Marine Resources and Environment, Master's degree No. 3523. Advisor: Yujiro Kitade. Full text publication date: 2021-06-21.
Convolutional GRU Network for Seasonal Prediction of the El Niño-Southern Oscillation
Predicting sea surface temperature (SST) within the El Niño-Southern
Oscillation (ENSO) region has been extensively studied due to its significant
influence on global temperature and precipitation patterns. Statistical models
such as linear inverse model (LIM), analog forecasting (AF), and recurrent
neural network (RNN) have been widely used for ENSO prediction, offering
flexibility and relatively low computational expense compared to large dynamic
models. However, these models have limitations in capturing spatial patterns in
SST variability or relying on linear dynamics. Here we present a modified
Convolutional Gated Recurrent Unit (ConvGRU) network for the ENSO region
spatio-temporal sequence prediction problem, along with the Niño 3.4 index
prediction as a downstream task. The proposed ConvGRU network, with an
encoder-decoder sequence-to-sequence structure, takes historical SST maps of
the Pacific region as input and generates future SST maps for subsequent months
within the ENSO region. To evaluate the performance of the ConvGRU network, we
trained and tested it using data from multiple large climate models. The
results demonstrate that the ConvGRU network significantly improves the
predictability of the Niño 3.4 index compared to LIM, AF, and RNN. This
improvement is evidenced by extended useful prediction range, higher Pearson
correlation, and lower root-mean-square error. The proposed model holds promise
for improving our understanding and predicting capabilities of the ENSO
phenomenon and can be broadly applicable to other weather and climate
prediction scenarios with spatial patterns and teleconnections.
Comment: 13 pages, 7 figures
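The two skill measures named in this abstract, Pearson correlation and root-mean-square error, can be computed for an index timeseries as in the sketch below. The sample values are hypothetical and do not come from the paper; they only show the mechanics of scoring a predicted index against an observed one.

```python
import math


def pearson_correlation(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def rmse(x, y):
    """Root-mean-square error between two equal-length sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))


# Hypothetical monthly index values (degrees Celsius anomaly).
observed_index = [0.5, 1.0, 1.5, 1.0, 0.0, -0.5]
predicted_index = [0.4, 0.9, 1.2, 1.1, 0.2, -0.3]

r = pearson_correlation(observed_index, predicted_index)
err = rmse(observed_index, predicted_index)
print(f"correlation={r:.3f} rmse={err:.3f}")
```

In forecast evaluation these two scores are typically computed separately for each lead time, so that a useful prediction range can be read off as the longest lead at which the correlation stays above a chosen threshold.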
Machine Learning for Earth Systems Modeling, Analysis and Predictability
Artificial intelligence (AI) and machine learning (ML) methods and applications have been continuously explored in many areas of scientific research. While these methods have led to many advances in climate science, there remains room for growth, especially in Earth System Modeling, analysis and predictability. Due to their high computational expense and the large volumes of complex data they produce, earth system models (ESMs) offer abundant potential for using ML techniques both to enhance our understanding of the climate system and to improve the performance of ESMs themselves. Here I demonstrate 3 specific areas of development using ML: statistical downscaling, predictability using non-linear latent spaces and emulation of complex parametrization. These three areas of research illustrate the ability of innovative ML methods to advance our understanding of climate systems through ESMs.
In Aim 1, I present a first application of a fast super resolution convolutional neural network (FSRCNN) based approach for downscaling earth system model (ESM) simulations. We adapt the FSRCNN to improve reconstruction on ESM data and term the result FSRCNN-ESM. We find that FSRCNN-ESM outperforms FSRCNN and other super-resolution methods in reconstructing high resolution images, producing finer spatial scale features with better accuracy for surface temperature, surface radiative fluxes and precipitation.
In Aim 2, I construct a novel Multi-Input Multi-Output Autoencoder-decoder (MIMO-AE) in an application of multi-task learning (MTL) to capture the non-linear relationship of Southern California precipitation (SC-PRECIP) and tropical Pacific Ocean sea surface temperature (TP-SST) on monthly time-scales. I find that the MIMO-AE index provides enhanced predictability of SC-PRECIP for a lead-time of up to four months as compared to the Niño 3.4 index and the El Niño Southern Oscillation Longitudinal Index. I also use an MTL method to expand on a convolutional long short term memory (conv-LSTM) network to predict the Niño 3.4 index by including multiple input variables known to be associated with ENSO, namely sea level pressure (SLP), outgoing longwave radiation (OLR) and surface level zonal winds (U).
In Aim 3, I demonstrate the capability of DNNs for learning computationally expensive parameterizations in ESMs. This study develops a DNN to replace the full radiation model in the E3SM
- …