Shape-based Feature Engineering for Solar Flare Prediction
Solar flares are caused by magnetic eruptions in active regions (ARs) on the
surface of the sun. These events can have significant impacts on human
activity, many of which can be mitigated with enough advance warning from good
forecasts. To date, machine learning-based flare-prediction methods have
employed physics-based attributes of the AR images as features; more recently,
there has been some work that uses features deduced automatically by deep
learning methods (such as convolutional neural networks). We describe a suite
of novel shape-based features extracted from magnetogram images of the Sun
using the tools of computational topology and computational geometry. We
evaluate these features in the context of a multi-layer perceptron (MLP) neural
network and compare their performance against the traditional physics-based
attributes. We show that these abstract shape-based features outperform the
features chosen by the human experts, and that a combination of the two feature
sets improves the forecasting capability even further.
Comment: To be published in Proceedings for Innovative Applications of
Artificial Intelligence Conference 202
Solar Flare Prediction and Feature Selection using Light Gradient Boosting Machine Algorithm
Solar flares are among the most severe space weather phenomena, and they have
the capacity to generate radiation storms and radio disruptions on Earth. The
accurate prediction of solar flare events remains a significant challenge,
requiring continuous monitoring and identification of specific features that
can aid in forecasting this phenomenon, particularly for different classes of
solar flares. In this study, we aim to forecast C and M class solar flares
utilising a machine-learning algorithm, namely the Light Gradient Boosting
Machine. We have utilised a dataset spanning 9 years, obtained from the
Space-weather Helioseismic and Magnetic Imager Active Region Patches (SHARP),
with a temporal resolution of 1 hour. A total of 37 flare features were
considered in our analysis, comprising 25 active region parameters and 12
flare history features. To address the issue of class imbalance in solar flare
data, we employed the Synthetic Minority Oversampling Technique (SMOTE). We
used two labeling approaches in our study: a fixed 24-hour window label and a
varying window that considers the changing nature of solar activity. Then, the
developed machine learning algorithm was trained and tested using forecast
verification metrics, with an emphasis on evaluating the true skill statistic
(TSS). Furthermore, we implemented a feature selection algorithm to determine
the most significant features from the pool of 37 features that could
distinguish between flaring and non-flaring active regions. We found that
utilising a limited set of useful features resulted in improved prediction
performance. For the 24-hour prediction window, we achieved a TSS of 0.63
(0.69) and accuracy of 0.90 (0.97) for C (M) class solar flares.
Comment: Accepted for publication in Solar Physics journal
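The abstract above reports both TSS and accuracy. A minimal sketch of how the two differ on imbalanced data follows; the confusion-matrix counts are hypothetical and are not taken from the paper.

```python
def true_skill_statistic(tp, fn, fp, tn):
    """TSS = recall - false alarm rate = TP/(TP+FN) - FP/(FP+TN)."""
    return tp / (tp + fn) - fp / (fp + tn)


def accuracy(tp, fn, fp, tn):
    """Fraction of all samples that were classified correctly."""
    return (tp + tn) / (tp + fn + fp + tn)


# Hypothetical test set: 1000 samples, of which only 50 are flaring.
# A forecaster that always predicts "no flare":
always_no = dict(tp=0, fn=50, fp=0, tn=950)
# A forecaster that catches most flares at the cost of some false alarms:
skilled = dict(tp=40, fn=10, fp=95, tn=855)

for name, counts in [("always_no", always_no), ("skilled", skilled)]:
    print(name, round(accuracy(**counts), 3), round(true_skill_statistic(**counts), 3))
```

With these made-up counts the trivial forecaster has the higher accuracy but a TSS of zero, which is one reason flare-prediction studies tend to emphasise TSS.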
Solar Flare Prediction From Extremely Imbalanced Multivariate Time Series Data Using Minimally Random Convolutional Kernel Transform
Solar flares are characterized by sudden bursts of electromagnetic radiation from the Sun's surface and are caused by changes in magnetic field states in solar active regions. Earth and its surrounding space environment can suffer from various negative impacts caused by solar flares, ranging from disruption of electronic communication to radiation exposure-based health risks to astronauts. In this paper, we address the solar flare prediction problem from magnetic field parameter-based multivariate time series (MVTS) data using multiple state-of-the-art machine learning classifiers that include MINImally RandOm Convolutional KErnel Transform (MINIROCKET), Support Vector Machine (SVM), Canonical Interval Forest (CIF), Multiple Representations SEQuence Learner (MR-SEQL), a Long Short-Term Memory (LSTM)-based deep learning model, and the Transformer model. We report our results on the Space Weather ANalytics for Solar Flares (SWAN-SF) benchmark data set, a partitioned collection of MVTS data of active region magnetic field parameters spanning over 9 years of operation of the Solar Dynamics Observatory (SDO). The MVTS instances of the SWAN-SF dataset are labeled by GOES X-ray flux-based flare class labels and exhibit extreme class imbalance because of the rarity of the major flaring events (e.g., X and M). To reduce the dimensionality of the data, we also included data preprocessing activities such as statistical summarization. We used the true skill statistic (TSS) and realizations of the Heidke Skill Score (HSS; HSS2) as performance validation metrics on this class-imbalanced dataset. Finally, we demonstrate the advantages of the MVTS learning algorithm MINIROCKET, which produces better results than other classifiers without the need for essential data preprocessing steps such as normalization, statistical summarization, and class imbalance handling heuristics.
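The statistical summarization mentioned in the abstract above can be illustrated with a small sketch. The choice of summary statistics (mean, standard deviation, minimum, maximum, last value) and the function name `summarize` are assumptions made for illustration; the abstract does not specify which statistics the authors used.

```python
from statistics import fmean, pstdev


def summarize(mvts):
    """Collapse one MVTS instance into a flat feature vector.

    mvts: list of time steps, each a list with one value per parameter.
    Returns 5 summary values per parameter (an assumed set of statistics).
    """
    columns = list(zip(*mvts))  # one tuple of values per parameter
    features = []
    for col in columns:
        features.extend([fmean(col), pstdev(col), min(col), max(col), col[-1]])
    return features


# Toy instance: 3 time steps of 2 hypothetical magnetic field parameters.
instance = [[0.0, 1.0], [2.0, 3.0], [4.0, 5.0]]
vector = summarize(instance)
print(len(vector))  # 2 parameters x 5 statistics
```

A summary of this kind turns a time series of arbitrary length into a fixed-length vector that classifiers such as an SVM can consume, at the cost of discarding temporal ordering.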
Comparative Study of Machine Learning Models on Solar Flare Prediction Problem
Solar flare events are explosions of energy and radiation from the Sun’s surface. These events occur due to the tangling and twisting of magnetic fields associated with sunspots. When coronal mass ejections accompany solar flares, solar storms could travel towards Earth at very high speeds, disrupting all earthly technologies and posing radiation hazards to astronauts. For this reason, the prediction of solar flares has become a crucial aspect of forecasting space weather. Our thesis utilized time-series data consisting of solar active region magnetic field parameters acquired from SDO that span more than eight years. The classification models take AR data from an observation period of 12 hours as input to predict the occurrence of a flare in the next 24 hours. We performed preprocessing and feature selection to find an optimal feature space consisting of 28 active region parameters that made up our multivariate time series (MVTS) dataset. For the first time, we modeled the flare prediction task as a 4-class problem and explored a comprehensive set of machine learning models to identify the most suitable model. This research achieved a state-of-the-art true skill statistic (TSS) of 0.92 with a 99.9% recall of X-/M-class flares using our time series forest model. This was accomplished with the augmented dataset in which the minority class is over-sampled using synthetic samples generated by SMOTE and the majority classes are randomly under-sampled. This work has established a robust dataset and baseline models for future studies in this task, including experiments on remedies to tackle the class imbalance problem such as weighted cost functions and data augmentation. Also, the time series classifiers implemented will enable shapelet mining, which can provide interpretability to domain experts.
Enhanced flare prediction by advanced feature extraction from solar images : developing automated imaging and machine learning techniques for processing solar images and extracting features from active regions to enable the efficient prediction of solar flares.
Space weather has become an international issue due to the catastrophic impact
it can have on modern societies. Solar flares are one of the major solar activities that
drive space weather and yet their occurrence is not fully understood. Research is
required to yield a better understanding of flare occurrence and enable the development
of an accurate flare prediction system, which can warn industries most at risk to take
preventative measures to mitigate or avoid the effects of space weather. This thesis
introduces novel technologies developed by combining advances in statistical physics,
image processing, machine learning, and feature selection algorithms, with advances in
solar physics in order to extract valuable knowledge from historical solar data, related to
active regions and flares. The aim of this thesis is to achieve the following: i) The
design of a new measurement, inspired by the physical Ising model, to estimate the
magnetic complexity in active regions using solar images and an investigation of this
measurement in relation to flare occurrence. The proposed name of the measurement is
the Ising Magnetic Complexity (IMC). ii) Determination of the flare prediction
capability of active region properties generated by the new active region detection
system SMART (Solar Monitor Active Region Tracking) to enable the design of a new
flare prediction system. iii) Determination of the active region properties that are most
related to flare occurrence in order to enhance understanding of the underlying physics
behind flare occurrence. The achieved results can be summarised as follows: i) The new
active region measurement (IMC) appears to be related to flare occurrence and it has a
potential use in predicting flare occurrence and location. ii) Combining machine
learning with SMART's active region properties has the potential to provide more
accurate flare predictions than the current flare prediction systems, i.e. ASAP
(Automated Solar Activity Prediction). iii) A reduced set of 6 active region properties
seems to contain the properties most significantly related to flare occurrence, and it can
achieve a similar degree of flare prediction accuracy as the full 21 SMART active region
properties. The developed technologies and the findings achieved in this thesis will
work as a cornerstone to enhance the accuracy of flare prediction; develop efficient
flare prediction systems; and enhance our understanding of flare occurrence. The
algorithms, implementation, results, and future work are explained in this thesis.
The Challenge of Machine Learning in Space Weather Nowcasting and Forecasting
The numerous recent breakthroughs in machine learning (ML) make it imperative to
carefully ponder how the scientific community can benefit from a technology
that, although not necessarily new, is today living its golden age. This Grand
Challenge review paper is focused on the present and future role of machine
learning in space weather. The purpose is twofold. On one hand, we will discuss
previous works that use ML for space weather forecasting, focusing in
particular on the few areas that have seen most activity: the forecasting of
geomagnetic indices, of relativistic electrons at geosynchronous orbits, of
solar flares occurrence, of coronal mass ejection propagation time, and of
solar wind speed. On the other hand, this paper serves as a gentle introduction
to the field of machine learning tailored to the space weather community and as
a pointer to a number of open challenges that we believe the community should
undertake in the next decade. The recurring themes throughout the review are
the need to shift our forecasting paradigm to a probabilistic approach focused
on the reliable assessment of uncertainties, and the combination of
physics-based and machine learning approaches, known as gray-box.
Comment: under review
Forecasting Solar Flares Using Magnetogram-based Predictors and Machine Learning
We propose a forecasting approach for solar flares based on data from Solar Cycle 24, taken by the Helioseismic and Magnetic Imager (HMI) on board the Solar Dynamics Observatory (SDO) mission. In particular, we use the Spaceweather HMI Active Region Patches (SHARP) product that facilitates cut-out magnetograms of solar active regions (AR) on the Sun in near-realtime (NRT), taken over a five-year interval (2012 – 2016). Our approach utilizes a set of thirteen predictors, which are not included in the SHARP metadata, extracted from line-of-sight and vector photospheric magnetograms. We exploit several machine learning (ML) and conventional statistics techniques to predict flares of peak magnitude >M1 and >C1 within a 24 h forecast window. The ML methods used are multi-layer perceptrons (MLP), support vector machines (SVM) and random forests (RF). We conclude that random forests could be the prediction technique of choice for our sample, with the second-best method being multi-layer perceptrons, subject to an entropy objective function. A Monte Carlo simulation showed that the best-performing method gives accuracy ACC=0.93(0.00), true skill statistic TSS=0.74(0.02) and Heidke skill score HSS=0.49(0.01) for >M1 flare prediction with probability threshold 15%, and ACC=0.84(0.00), TSS=0.60(0.01) and HSS=0.59(0.01) for >C1 flare prediction with probability threshold 35%.
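The abstract above quotes skill scores at specific probability thresholds. The sketch below shows how a threshold turns probabilistic forecasts into a confusion matrix and how a Heidke skill score follows from it. The probabilities and labels are made up, the `>=` comparison is an assumption, and the formula is the common chance-referenced form of the score (often written HSS2); the abstract does not say which variant the authors computed.

```python
def confusion_counts(probs, labels, threshold):
    """Count TP, FN, FP, TN after thresholding forecast probabilities."""
    tp = fn = fp = tn = 0
    for p, y in zip(probs, labels):
        predicted = p >= threshold  # assumed convention: flare if p >= threshold
        if predicted and y:
            tp += 1
        elif predicted and not y:
            fp += 1
        elif y:
            fn += 1
        else:
            tn += 1
    return tp, fn, fp, tn


def heidke_skill_score(tp, fn, fp, tn):
    """Chance-referenced Heidke skill score (the form often called HSS2)."""
    numerator = 2 * (tp * tn - fn * fp)
    denominator = (tp + fn) * (fn + tn) + (tp + fp) * (fp + tn)
    return numerator / denominator


# Hypothetical forecast probabilities and observed outcomes (1 = flare).
probs = [0.05, 0.10, 0.20, 0.40, 0.60, 0.90]
labels = [0, 0, 0, 1, 0, 1]

for threshold in (0.15, 0.50):
    counts = confusion_counts(probs, labels, threshold)
    print(threshold, counts, heidke_skill_score(*counts))
```

Lowering the threshold trades missed flares for false alarms, so the threshold at which a score is reported matters when comparing results across studies.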