20 research outputs found

    Shape-based Feature Engineering for Solar Flare Prediction

    Solar flares are caused by magnetic eruptions in active regions (ARs) on the surface of the Sun. These events can have significant impacts on human activity, many of which can be mitigated with enough advance warning from good forecasts. To date, machine learning-based flare-prediction methods have employed physics-based attributes of the AR images as features; more recently, there has been some work that uses features deduced automatically by deep learning methods (such as convolutional neural networks). We describe a suite of novel shape-based features extracted from magnetogram images of the Sun using the tools of computational topology and computational geometry. We evaluate these features in the context of a multi-layer perceptron (MLP) neural network and compare their performance against the traditional physics-based attributes. We show that these abstract shape-based features outperform the features chosen by the human experts, and that a combination of the two feature sets improves the forecasting capability even further. Comment: To be published in Proceedings for Innovative Applications of Artificial Intelligence Conference 202

    Solar Flare Prediction and Feature Selection using Light Gradient Boosting Machine Algorithm

    Solar flares are among the most severe space weather phenomena, and they have the capacity to generate radiation storms and radio disruptions on Earth. The accurate prediction of solar flare events remains a significant challenge, requiring continuous monitoring and identification of specific features that can aid in forecasting this phenomenon, particularly for different classes of solar flares. In this study, we aim to forecast C and M class solar flares utilising a machine-learning algorithm, namely the Light Gradient Boosting Machine. We have utilised a dataset spanning 9 years, obtained from the Space-weather Helioseismic and Magnetic Imager Active Region Patches (SHARP), with a temporal resolution of 1 hour. A total of 37 flare features were considered in our analysis, comprising 25 active region parameters and 12 flare history features. To address the issue of class imbalance in solar flare data, we employed the Synthetic Minority Oversampling Technique (SMOTE). We used two labeling approaches in our study: a fixed 24-hour window label and a varying window that considers the changing nature of solar activity. Then, the developed machine learning algorithm was trained and tested using forecast verification metrics, with an emphasis on evaluating the true skill statistic (TSS). Furthermore, we implemented a feature selection algorithm to determine the most significant features from the pool of 37 features that could distinguish between flaring and non-flaring active regions. We found that utilising a limited set of useful features resulted in improved prediction performance. For the 24-hour prediction window, we achieved a TSS of 0.63 (0.69) and accuracy of 0.90 (0.97) for ≥C (≥M) class solar flares. Comment: Accepted for publication in Solar Physics journal
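
The fixed 24-hour window label mentioned in the abstract above can be illustrated with a minimal sketch. The abstract does not give the exact labeling rule, so the function name `label_fixed_window`, the strict/inclusive window boundaries, and the sample timestamps are all assumptions made for illustration. A sample is marked positive if any flare starts within the following 24 hours.

```python
from datetime import datetime, timedelta


def label_fixed_window(sample_times, flare_times, window_hours=24):
    """Return 1 for each sample followed by a flare within the window, else 0.

    Hypothetical helper: the boundaries (after the sample time, up to and
    including the end of the window) are an assumption, not the paper's rule.
    """
    window = timedelta(hours=window_hours)
    labels = []
    for t in sample_times:
        # Positive if at least one flare falls in the interval (t, t + window].
        has_flare = any(t < f <= t + window for f in flare_times)
        labels.append(int(has_flare))
    return labels


# Made-up timestamps: one flare at 06:00 on 2 January 2015.
samples = [
    datetime(2015, 1, 1, 0, 0),   # flare is 30 h ahead, outside the window
    datetime(2015, 1, 1, 12, 0),  # flare is 18 h ahead, inside the window
    datetime(2015, 1, 2, 13, 0),  # flare already happened
]
flares = [datetime(2015, 1, 2, 6, 0)]
example_labels = label_fixed_window(samples, flares)
```

A varying window, as the abstract describes, would replace the constant `window_hours` with a value that depends on solar activity; how that value is chosen is not stated in the abstract.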

    Solar Flare Prediction From Extremely Imbalanced Multivariate Time Series Data Using Minimally Random Convolutional Kernel Transform

    Solar flares are characterized by sudden bursts of electromagnetic radiation from the Sun's surface, and are caused by changes in magnetic field states in solar active regions. Earth and its surrounding space environment can suffer from various negative impacts caused by solar flares, ranging from electronic communication disruption to radiation exposure-based health risks to astronauts. In this paper, we address the solar flare prediction problem from magnetic field parameter-based multivariate time series (MVTS) data using multiple state-of-the-art machine learning classifiers that include MINImally RandOm Convolutional KErnel Transform (MINIROCKET), Support Vector Machine (SVM), Canonical Interval Forest (CIF), Multiple Representations SEQuence Learner (MR-SEQL), a Long Short-Term Memory (LSTM)-based deep learning model, and the Transformer model. We showed our results on the Space Weather ANalytics for Solar Flares (SWAN-SF) benchmark data set, a partitioned collection of MVTS data of active region magnetic field parameters spanning over 9 years of operation of the Solar Dynamics Observatory (SDO). The MVTS instances of the SWAN-SF dataset are labeled by GOES X-ray flux-based flare class labels, and exhibit extreme class imbalance because of the rarity of the major flaring events (e.g., X and M). To minimize the dimensionality of the data, we also included data preprocessing activities such as statistical summarization. We used the true skill statistic (TSS) and realizations of the Heidke Skill Score (HSS; HSS2) as performance validation metrics in this class-imbalanced dataset. Finally, we demonstrate the advantages of the MVTS learning algorithm MINIROCKET, which produces better results than other classifiers without the need for essential data preprocessing steps such as normalization, statistical summarization, and class imbalance handling heuristics.

    Comparative Study of Machine Learning Models on Solar Flare Prediction Problem

    Solar flare events are explosions of energy and radiation from the Sun’s surface. These events occur due to the tangling and twisting of magnetic fields associated with sunspots. When coronal mass ejections accompany solar flares, solar storms could travel towards Earth at very high speeds, disrupting all earthly technologies and posing radiation hazards to astronauts. For this reason, the prediction of solar flares has become a crucial aspect of forecasting space weather. Our thesis utilized time-series data consisting of active solar region magnetic field parameters acquired from SDO that span more than eight years. The classification models take AR data from an observation period of 12 hours as input to predict the occurrence of a flare in the next 24 hours. We performed preprocessing and feature selection to find an optimal feature space consisting of 28 active region parameters that made up our multivariate time series (MVTS) dataset. For the first time, we modeled the flare prediction task as a 4-class problem and explored a comprehensive set of machine learning models to identify the most suitable model. This research achieved a state-of-the-art true skill statistic (TSS) of 0.92 with a 99.9% recall of X-/M-class flares on our time series forest model. This was accomplished with the augmented dataset in which the minority class is over-sampled using synthetic samples generated by SMOTE and the majority classes are randomly under-sampled. This work has established a robust dataset and baseline models for future studies in this task, including experiments on remedies to tackle the class imbalance problem such as weighted cost functions and data augmentation. Also, the time series classifiers implemented will enable shapelet mining, which can provide interpretability to domain experts.

    The Challenge of Machine Learning in Space Weather Nowcasting and Forecasting

    The numerous recent breakthroughs in machine learning (ML) make it imperative to carefully ponder how the scientific community can benefit from a technology that, although not necessarily new, is today living its golden age. This Grand Challenge review paper is focused on the present and future role of machine learning in space weather. The purpose is twofold. On one hand, we will discuss previous works that use ML for space weather forecasting, focusing in particular on the few areas that have seen most activity: the forecasting of geomagnetic indices, of relativistic electrons at geosynchronous orbits, of solar flare occurrence, of coronal mass ejection propagation time, and of solar wind speed. On the other hand, this paper serves as a gentle introduction to the field of machine learning tailored to the space weather community and as a pointer to a number of open challenges that we believe the community should undertake in the next decade. The recurring themes throughout the review are the need to shift our forecasting paradigm to a probabilistic approach focused on the reliable assessment of uncertainties, and the combination of physics-based and machine learning approaches, known as gray-box. Comment: under review

    Forecasting Solar Flares Using Magnetogram-based Predictors and Machine Learning

    We propose a forecasting approach for solar flares based on data from Solar Cycle 24, taken by the Helioseismic and Magnetic Imager (HMI) on board the Solar Dynamics Observatory (SDO) mission. In particular, we use the Spaceweather HMI Active Region Patches (SHARP) product that facilitates cut-out magnetograms of solar active regions (AR) on the Sun in near-realtime (NRT), taken over a five-year interval (2012 – 2016). Our approach utilizes a set of thirteen predictors, which are not included in the SHARP metadata, extracted from line-of-sight and vector photospheric magnetograms. We exploit several Machine Learning (ML) and Conventional Statistics techniques to predict flares of peak magnitude >M1 and >C1, within a 24 h forecast window. The ML methods used are multi-layer perceptrons (MLP), support vector machines (SVM) and random forests (RF). We conclude that random forests could be the prediction technique of choice for our sample, with the second best method being multi-layer perceptrons, subject to an entropy objective function. A Monte Carlo simulation showed that the best performing method gives accuracy ACC=0.93(0.00), true skill statistic TSS=0.74(0.02) and Heidke skill score HSS=0.49(0.01) for >M1 flare prediction with probability threshold 15% and ACC=0.84(0.00), TSS=0.60(0.01) and HSS=0.59(0.01) for >C1 flare prediction with probability threshold 35%.
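
The verification metrics quoted in several abstracts above (ACC, TSS, HSS) can be computed from a binary confusion matrix. The sketch below uses the standard textbook definitions, with the Heidke score in the form commonly labeled HSS2 in the flare-forecasting literature; it is an illustration only, and the function name `skill_scores` and the example counts are assumptions, not values from any of the listed papers.

```python
def skill_scores(tp, fn, fp, tn):
    """Return (accuracy, TSS, HSS) for a binary flare/no-flare forecast.

    tp: flares correctly forecast      fn: flares missed
    fp: false alarms                   tn: quiet periods correctly forecast
    """
    n = tp + fn + fp + tn
    # Accuracy: fraction of all forecasts that were correct.
    acc = (tp + tn) / n
    # True skill statistic: recall minus false-alarm rate.
    tss = tp / (tp + fn) - fp / (fp + tn)
    # Heidke skill score: improvement over a random forecast.
    hss = 2 * (tp * tn - fn * fp) / (
        (tp + fn) * (fn + tn) + (tp + fp) * (fp + tn)
    )
    return acc, tss, hss


# Made-up counts for an imbalanced sample: 50 flares among 1000 cases.
acc, tss, hss = skill_scores(tp=40, fn=10, fp=50, tn=900)
```

With these invented counts, accuracy is high (0.94) mostly because quiet periods dominate, while TSS (about 0.75) and HSS (about 0.54) are lower. This is one reason the abstracts emphasize TSS and HSS rather than accuracy on class-imbalanced flare data.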