153 research outputs found

    Prediction of monthly Arctic sea ice concentrations using satellite and reanalysis data based on convolutional neural networks

    Changes in Arctic sea ice affect atmospheric circulation, ocean currents, and polar ecosystems. There have been unprecedented decreases in the amount of Arctic sea ice due to global warming. In this study, a novel 1-month sea ice concentration (SIC) prediction model is proposed, with eight predictors using a deep-learning approach, convolutional neural networks (CNNs). This monthly SIC prediction model based on CNNs is shown to yield better predictions (mean absolute error (MAE) of 2.28 %, anomaly correlation coefficient (ACC) of 0.98, root-mean-square error (RMSE) of 5.76 %, normalized RMSE (nRMSE) of 16.15 %, and Nash-Sutcliffe efficiency (NSE) of 0.97) than a random-forest-based (RF-based) model (MAE of 2.45 %, ACC of 0.98, RMSE of 6.61 %, nRMSE of 18.64 %, and NSE of 0.96) and the persistence model based on the monthly trend (MAE of 4.31 %, ACC of 0.95, RMSE of 10.54 %, nRMSE of 29.17 %, and NSE of 0.89) through hindcast validations. The spatio-temporal analysis also confirmed the superiority of the CNN model. The CNN model showed good SIC prediction results in extreme cases that recorded unforeseen sea ice plummets in 2007 and 2012, with RMSEs of less than 5.0 %. This study also examined the importance of the input variables through a sensitivity analysis. In both the CNN and RF models, the variables of past SICs were identified as the most sensitive factor in predicting SICs. For both models, the SIC-related variables generally contributed more to predicting SICs over ice-covered areas, while other meteorological and oceanographic variables were more sensitive to the prediction of SICs in marginal ice zones. The proposed 1-month SIC prediction model provides valuable information which can be used in various applications, such as Arctic shipping-route planning, management of the fishing industry, and long-term sea ice forecasting and dynamics.
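
Three of the error measures quoted in this abstract (MAE, RMSE, and NSE) have standard textbook definitions, sketched below in plain Python. This is an illustration only, not the authors' code: the function names are hypothetical, and nRMSE and ACC are left out because the abstract does not say how the RMSE is normalized or which climatology the anomalies refer to.

```python
import math


def mae(obs, pred):
    """Mean absolute error between observed and predicted values."""
    return sum(abs(p - o) for o, p in zip(obs, pred)) / len(obs)


def rmse(obs, pred):
    """Root-mean-square error between observed and predicted values."""
    return math.sqrt(sum((p - o) ** 2 for o, p in zip(obs, pred)) / len(obs))


def nse(obs, pred):
    """Nash-Sutcliffe efficiency: 1 minus the ratio of the squared error
    to the variance of the observations around their mean."""
    mean_obs = sum(obs) / len(obs)
    squared_error = sum((p - o) ** 2 for o, p in zip(obs, pred))
    obs_variance = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - squared_error / obs_variance


# Toy SIC values in percent (made up for illustration only).
observed = [0.0, 50.0, 100.0]
predicted = [10.0, 50.0, 90.0]
print(mae(observed, predicted), rmse(observed, predicted), nse(observed, predicted))
```

An NSE of 1 means a perfect match, while 0 means the prediction is no better than always predicting the observed mean.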

    A novel framework of detecting convective initiation combining automated sampling, machine learning, and repeated model tuning from geostationary satellite data

    This paper proposes a complete framework of a machine learning-based model that detects convective initiation (CI) from geostationary meteorological satellite data. The suggested framework consists of three main processes: (1) an automated sampling tool; (2) machine learning-based CI detection modelling; (3) repeated model tuning through validation. In this study, the automated sampling tool was able to track the CI objects iteratively, even without ancillary data such as an atmospheric motion vector (AMV). The collected samples were used to train the machine learning model for CI detection. Random forest (RF) was used to classify the CI and non-CI. To enhance the advantages of the machine learning approach, we adopted model tuning to iteratively update the training dataset from each validation result by adding hits and misses to the CI samples, and false alarms and correct negatives to the non-CI samples. Using 12 interest fields from the Himawari-8 Advanced Himawari Imager (AHI) over the Korean Peninsula, this simple and intuitive tuning process increased the overall probability of detection (POD) from 0.79 to 0.82 and decreased the overall false alarm rate (FAR) from 0.46 to 0.37 with around 40 min of the lead-time. Amongst the 12 interest fields, Tb(11.2) μm was identified as the most significant predictor in the RF model, followed by Tb(8.6-11.2) μm, and Tb(6.2-7.3) μm. The effect of model tuning on the CI detection performance was also analyzed using spatiotemporal validation maps. By automatically collecting and updating the machine learning training dataset, the suggested framework is expected to help the maintenance of the CI detection model from an operational perspective.
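
The tuning step described in this abstract (hits and misses go to the CI samples, false alarms and correct negatives go to the non-CI samples) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name, the outcome labels, and the representation of a sample as a list of interest-field values are all assumptions.

```python
def update_training_set(ci_samples, non_ci_samples, validation_results):
    """Return enlarged CI / non-CI sample lists after one validation round.

    validation_results is a list of (features, outcome) pairs, where outcome
    is one of "hit", "miss", "false_alarm", "correct_negative" (hypothetical
    labels chosen for this sketch).
    """
    new_ci = list(ci_samples)
    new_non_ci = list(non_ci_samples)
    for features, outcome in validation_results:
        if outcome in ("hit", "miss"):
            # Both outcomes mean convection really did initiate.
            new_ci.append(features)
        elif outcome in ("false_alarm", "correct_negative"):
            # Both outcomes mean convection did not initiate.
            new_non_ci.append(features)
        else:
            raise ValueError(f"unknown validation outcome: {outcome}")
    return new_ci, new_non_ci
```

In effect every validated object is added to the training data under its true class, so misclassified cases (misses and false alarms) are included in the next round of training.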

    Detection of deterministic and probabilistic convection initiation using Himawari-8 Advanced Himawari Imager data

    The detection of convective initiation (CI) is very important because convective clouds bring heavy rainfall and thunderstorms that typically cause severe socio-economic damage. In this study, deterministic and probabilistic CI detection models based on decision trees (DT), random forest (RF), and logistic regression (LR) were developed using Himawari-8 Advanced Himawari Imager (AHI) data obtained from June to August 2016 over the Korean Peninsula. A total of 12 interest fields that contain brightness temperature, spectral differences of the brightness temperatures, and their time trends were used to develop CI detection models. While, in our study, the interest field of 11.2 μm Tb was considered the most crucial for detecting CI in the deterministic models and the probabilistic RF model, the trispectral difference, i.e. (8.6-11.2 μm)-(11.2-12.4 μm), was determined to be the most important one in the LR model. The performance of the four models varied by CI case and validation data. Nonetheless, the DT model typically showed higher probability of detection (POD), while the RF model produced higher overall accuracy (OA) and critical success index (CSI) and lower false alarm rate (FAR) than the other models. The mean lead times of CI detection by the four models were in the range of 20-40 min, which implies that convective clouds can be detected 30 min in advance, before precipitation intensity exceeds 35 dBZ over the Korean Peninsula in summer using the Himawari-8 AHI data.
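
The four scores named in this abstract are usually derived from a 2×2 contingency table of hits, misses, false alarms, and correct negatives. The sketch below uses the common forecast-verification definitions, which the abstract itself does not spell out. In particular, FAR is computed here as the false alarm ratio, false alarms / (hits + false alarms), which is an assumption, and the function name `ci_scores` is hypothetical.

```python
def ci_scores(hits, misses, false_alarms, correct_negatives):
    """Compute POD, FAR, CSI, and OA from contingency-table counts.

    Assumes every denominator is non-zero (no guard for empty categories).
    """
    total = hits + misses + false_alarms + correct_negatives
    return {
        # Probability of detection: share of observed CI events that were detected.
        "POD": hits / (hits + misses),
        # False alarm ratio (assumed definition): share of detections that were wrong.
        "FAR": false_alarms / (hits + false_alarms),
        # Critical success index: hits over all events that were observed or detected.
        "CSI": hits / (hits + misses + false_alarms),
        # Overall accuracy: share of all cases classified correctly.
        "OA": (hits + correct_negatives) / total,
    }


# Made-up counts for illustration only.
print(ci_scores(hits=80, misses=20, false_alarms=20, correct_negatives=80))
```

POD alone rewards over-detection, which is why it is normally read together with FAR and CSI, as the abstract does when comparing the DT and RF models.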

    Retrieval of total precipitable water from Himawari-8 AHI data: A comparison of random forest, extreme gradient boosting, and deep neural network

    Total precipitable water (TPW), a column of water vapor content in the atmosphere, provides information on the spatial distribution of moisture. The high-resolution TPW, together with atmospheric stability indices such as convective available potential energy (CAPE), is an effective indicator of severe weather phenomena in the pre-convective atmospheric condition. With the advent of high-performing imaging instruments onboard geostationary satellites, such as the Advanced Himawari Imager (AHI) onboard Himawari-8 of Japan and the Advanced Meteorological Imager (AMI) onboard GeoKompsat-2A of Korea, it is expected that data with unprecedented spatiotemporal resolution (e.g., AMI plans to provide 2 km resolution data every 2 min over the northeast part of East Asia) will be provided. To derive TPW from such high-resolution data in a timely fashion, an efficient algorithm is required. Here, three machine learning approaches, random forest (RF), extreme gradient boosting (XGB), and deep neural network (DNN), are assessed for TPW retrieval from AHI under clear-sky conditions over Northeast Asia. For the training dataset, the nine infrared brightness temperatures (BT) of AHI (BT8 to 16 centered at 6.2, 6.9, 7.3, 8.6, 9.6, 10.4, 11.2, 12.4, and 13.3 μm, respectively), six dual channel differences and observation conditions such as time, latitude, longitude, and satellite zenith angle for two years (September 2016 to August 2018) are used. The corresponding TPW is prepared by integrating the water vapor profiles from Interim European Centre for Medium-Range Weather Forecasts Re-Analysis data (ERA-Interim). The algorithm performances are assessed using the ERA-Interim and radiosonde observations (RAOB) as the reference data. The results show that the DNN model performs better than RF and XGB with a correlation coefficient of 0.96, a mean bias of 0.90 mm, and a root mean square error (RMSE) of 4.65 mm when compared to the ERA-Interim. Similarly, DNN results in a correlation coefficient of 0.95, a mean bias of 1.25 mm, and an RMSE of 5.03 mm when compared to RAOB. Contributing variables to retrieve the TPW in each model and the spatial and temporal analysis of the retrieved TPW are carefully examined and discussed. © 2019 by the authors.
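
The abstract says only that the reference TPW is obtained "by integrating the water vapor profiles". One common way to do this, shown here as an assumption rather than the authors' procedure, is the pressure integral TPW = (1/g) ∫ q dp of specific humidity q, evaluated with the trapezoidal rule. With pressure in Pa and q in kg/kg the result is in kg/m², which is numerically equal to millimetres of liquid water. The function name and the sample profile are hypothetical.

```python
G = 9.80665  # standard gravitational acceleration, m/s^2


def total_precipitable_water(pressure_pa, specific_humidity):
    """Integrate specific humidity (kg/kg) over pressure levels (Pa).

    Uses the trapezoidal rule between adjacent levels; the levels may be
    ordered top-down or bottom-up. Returns TPW in kg/m^2 (equivalent to mm).
    """
    column = 0.0
    for i in range(len(pressure_pa) - 1):
        layer_thickness = abs(pressure_pa[i + 1] - pressure_pa[i])
        layer_mean_q = 0.5 * (specific_humidity[i] + specific_humidity[i + 1])
        column += layer_mean_q * layer_thickness
    return column / G


# Made-up three-level profile for illustration only.
levels = [100000.0, 85000.0, 50000.0]
humidity = [0.012, 0.008, 0.001]
print(total_precipitable_water(levels, humidity))
```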

    Evolutionary coupling analysis identifies the impact of disease-associated variants at less-conserved sites

    Genome-wide association studies have discovered a large number of genetic variants in human patients with the disease. Thus, predicting the impact of these variants is important for sorting disease-associated variants (DVs) from neutral variants. Current methods to predict the mutational impacts depend on evolutionary conservation at the mutation site, which is determined using homologous sequences and based on the assumption that variants at well-conserved sites have high impacts. However, many DVs at less-conserved but functionally important sites cannot be predicted by the current methods. Here, we present a method to find DVs at less-conserved sites by predicting the mutational impacts using evolutionary coupling analysis. Functionally important and evolutionarily coupled sites often have compensatory variants on cooperative sites to avoid loss of function. We found that our method identified known intolerant variants in a diverse group of proteins. Furthermore, at less-conserved sites, we identified DVs that were not identified using conservation-based methods. These newly identified DVs were frequently found at protein interaction interfaces, where species-specific mutations often alter interaction specificity. This work presents a means to identify less-conserved DVs and provides insight into the relationship between evolutionarily coupled sites and human DVs.

    Short-Term Forecasting of Satellite-Based Drought Indices Using Their Temporal Patterns and Numerical Model Output

    Drought forecasting is essential for effectively managing drought-related damage and providing relevant drought information to decision-makers so they can make appropriate decisions in response to drought. Although there have been great efforts in drought-forecasting research, drought forecasting on a short-term scale (up to two weeks) is still difficult. In this research, drought-forecasting models on a short-term scale (8 days) were developed considering the temporal patterns of satellite-based drought indices and numerical model outputs through the synergistic use of convolutional long short-term memory (ConvLSTM) and random forest (RF) approaches over a part of East Asia. Two widely used drought indices, the Scaled Drought Condition Index (SDCI) and the Standardized Precipitation Index (SPI), were used as target variables. Through the combination of temporal patterns and the upcoming weather conditions (numerical model outputs), the overall performances of drought-forecasting models (ConvLSTM and RF combined) produced competitive results in terms of r (0.90 and 0.93 for validation of SDCI and SPI, respectively) and nRMSE (0.11 and 0.08 for validation of SDCI and SPI, respectively). Furthermore, our short-term drought-forecasting model can be effective regardless of drought intensification or alleviation. The proposed drought-forecasting model can be operationally used, providing useful information on upcoming drought conditions with high resolution (0.05 degrees).

    Classification and mapping of paddy rice by combining Landsat and SAR time series data

    Rice is an important food resource, and the demand for rice has increased as the population has expanded. Therefore, accurate paddy rice classification and monitoring are necessary to identify and forecast rice production. Satellite data have often been used to produce paddy rice maps with a more frequent update cycle (e.g., every year) than field surveys. Many satellite datasets, including both optical and SAR sensor data (e.g., Landsat, MODIS, and ALOS PALSAR), have been employed to classify paddy rice. In the present study, time series data from Landsat, RADARSAT-1, and ALOS PALSAR satellite sensors were synergistically used to classify paddy rice through machine learning approaches over two different climate regions (sites A and B). Six schemes considering various combinations of input data by sensor and collection date were evaluated. Scheme 6, which fused optical and SAR sensor time series data at the decision level, yielded the highest accuracy (98.67% for site A and 93.87% for site B). Performance of paddy rice classification was better in site A than in site B, which consists of heterogeneous land cover and has low data availability due to a high cloud cover rate. This study also proposed a Paddy Rice Mapping Index (PMI) considering spectral and phenological characteristics of paddy rice. PMI represented the spatial distribution of paddy rice well in both regions. Google Earth Engine was adopted to produce paddy rice maps over larger areas using the proposed PMI-based approach.

    Improving Local Climate Zone Classification Using Incomplete Building Data and Sentinel 2 Images Based on Convolutional Neural Networks

    Recent studies have enhanced the mapping performance of the local climate zone (LCZ), a standard framework for evaluating urban form and function for urban heat island research, through remote sensing (RS) images and deep learning classifiers such as convolutional neural networks (CNNs). The accuracy in the urban-type LCZ (LCZ1-10), however, remains relatively low because RS data cannot provide vertical or horizontal building components in detail. Geographic information system (GIS)-based building datasets can be used as primary sources in LCZ classification, but there is a limit to using them as input data for CNN due to their incompleteness. This study proposes novel methods to classify LCZ using Sentinel 2 images and incomplete building data based on a CNN classifier. We designed three schemes (S1, S2, and a scheme fusion; SF) for mapping 50 m LCZs in two megacities: Berlin and Seoul. S1 used only RS images, and S2 used RS and building components such as area and height (or the number of stories). SF combined two schemes (S1 and S2) based on three conditions, mainly focusing on the confidence level of the CNN classifier. When compared to S1, the overall accuracies for all LCZ classes (OA) and the urban-type LCZ (OA(urb)) of SF increased by about 4% and 7-9%, respectively, for the two study areas. This study shows that SF can compensate for the imperfections in the building data, which cause misclassifications in S2. The suggested approach can serve as guidance for producing a high-accuracy LCZ map for cities where building databases can be obtained, even if they are incomplete.

    Retrieval of Melt Ponds on Arctic Multiyear Sea Ice in Summer from TerraSAR-X Dual-Polarization Data Using Machine Learning Approaches: A Case Study in the Chukchi Sea with Mid-Incidence Angle Data

    Melt ponds, a common feature on Arctic sea ice, absorb most of the incoming solar radiation and have a large effect on the melting rate of sea ice, which significantly influences climate change. Therefore, it is very important to monitor melt ponds in order to better understand the sea ice-climate interaction. In this study, melt pond retrieval models were developed using the TerraSAR-X dual-polarization synthetic aperture radar (SAR) data with mid-incidence angle obtained in a summer multiyear sea ice area in the Chukchi Sea, the Western Arctic, based on two rule-based machine learning approaches, decision trees (DT) and random forest (RF), in order to derive melt pond statistics at high spatial resolution and to identify key polarimetric parameters for melt pond detection. Melt ponds, sea ice and open water were delineated from the airborne SAR images (0.3-m resolution), which were used as a reference dataset. A total of eight polarimetric parameters (HH and VV backscattering coefficients, co-polarization ratio, co-polarization phase difference, co-polarization correlation coefficient, alpha angle, entropy and anisotropy) were derived from the TerraSAR-X dual-polarization data and then used as input variables for the machine learning models. The DT and RF models could not effectively discriminate melt ponds from open water when using only the polarimetric parameters. This is because melt ponds showed similar polarimetric signatures to open water. The average and standard deviation of the polarimetric parameters based on a 15 × 15 pixel window were supplemented to the input variables in order to consider the difference between the spatial texture of melt ponds and open water. Both the DT and RF models using the polarimetric parameters and their texture features produced improved performance for the retrieval of melt ponds, and RF was superior to DT. The HH backscattering coefficient was identified as the variable contributing the most, and its spatial standard deviation was the next most contributing one to the classification of open water, sea ice and melt ponds in the RF model. The average of the co-polarization phase difference and the alpha angle in a mid-incidence angle were also identified as the important variables in the RF model. The melt pond fraction and sea ice concentration retrieved from the RF-derived melt pond map showed root mean square deviations of 2.4% and 4.9%, respectively, compared to those from the reference melt pond maps. This indicates that there is potential to accurately monitor melt ponds on multiyear sea ice in the summer season at a local scale using high-resolution dual-polarization SAR data.
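
The texture features mentioned in this abstract, the average and standard deviation of a parameter within a 15 × 15 pixel window, can be sketched in plain Python as below. The abstract does not say how image borders are handled or which standard deviation estimator is used, so clipping the window at the image edge and using the population standard deviation are assumptions made here, and the function name is hypothetical.

```python
import math


def window_stats(image, row, col, size=15):
    """Mean and standard deviation of image values in a size x size window
    centred on (row, col). The image is a list of equal-length rows.

    Assumptions of this sketch: the window is clipped at the image border,
    and the population standard deviation (divide by N) is used.
    """
    half = size // 2
    values = [
        image[r][c]
        for r in range(max(0, row - half), min(len(image), row + half + 1))
        for c in range(max(0, col - half), min(len(image[0]), col + half + 1))
    ]
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, math.sqrt(variance)


# Tiny made-up image; a 3 x 3 window is used so the example stays readable.
patch = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
print(window_stats(patch, 1, 1, size=3))
```

Applied to every pixel of each polarimetric parameter, this yields two extra input layers per parameter that capture local texture in addition to the pixel value itself.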