Solar Image Modeling (SIM)
Keywords: Extraction of dark structures in solar images, time series prediction
The solar wind consists of plasma and radiation emitted by the Sun, and it can have adverse effects on power grids, telecommunication infrastructure and space assets. In late October and November 2003, geomagnetic storms (popularly known as the “Halloween storms”) caused numerous satellite anomalies, the rerouting of transpolar flights and the disruption of power grids. During a solar storm in 1989, the Hydro-Quebec power grid went down less than two hours after the onset of the storm: harmonics from the induced currents tripped protective systems, leaving more than 6 million people without electricity for 9 hours. Space weather can also severely affect GPS- and GNSS-based navigation. In this project, we used the growing number of high-resolution multi-spectral images of the Sun to improve the prediction of the solar wind close to the Earth. Coronal holes (CHs) are regions of the solar corona that appear dark in solar images at the wavelengths 171 Å, 193 Å, and 211 Å. Importantly, CHs have been shown to be the origin of high-speed solar wind streams. We therefore used classical computer vision tools and machine-learning algorithms to detect CHs in solar images, and then developed a machine-learning algorithm that predicts the solar wind speed from the detected CHs.
We started by collecting the solar images provided by the Solar Dynamics Observatory (SDO) mission and the coronal hole records contained in the Heliophysics Events Knowledgebase (HEK) database. The collected solar images span 2010 to 2019, encompassing the variability of one solar cycle. We collected images at a 1-hour cadence for the 171 Å, 193 Å, and 211 Å wavelengths, which are the ones relevant for identifying structures in the solar corona. In addition, we collected images at the other available wavelengths (94 Å, 131 Å, 304 Å, 335 Å, 1600 Å, 1700 Å, 4500 Å) at a 2-hour cadence. We employed computer vision and machine learning (ML) tools to develop an automated algorithm that extracts coronal holes from the images at the relevant 171 Å, 193 Å, and 211 Å wavelengths. First, we applied standard preprocessing to correct artifacts in the raw images. Then we applied an additional preprocessing step, developed by us, that increases the image contrast and makes the coronal hole regions more distinguishable. Finally, we used an unsupervised machine learning method, k-means clustering, to cluster the dark pixels in the image and generate the coronal hole detections. We compared our coronal hole detections with those of other state-of-the-art algorithms, including deep learning models, and found them to be compatible. We have published these results in the Astrophysical Journal: Inceoglu et al., 2022 ApJ 930 118.
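The clustering step described above can be sketched as follows. This is a minimal, dependency-free illustration and not the published pipeline: it assumes the image has already been preprocessed into a small 2D array of intensities, runs a plain k-means (Lloyd's algorithm) on the pixel intensities only, and flags the pixels that fall into the darkest cluster. The function names, the choice k=2 and the toy image are assumptions made for illustration.

```python
import random


def nearest_center(value, centers):
    """Index of the cluster center closest to value."""
    return min(range(len(centers)), key=lambda j: abs(value - centers[j]))


def kmeans_1d(values, k, n_iter=50, seed=0):
    """Lloyd's algorithm on scalar values; returns the centers in ascending order.

    Assumes values contains at least k distinct numbers.
    """
    rng = random.Random(seed)
    centers = sorted(rng.sample(sorted(set(values)), k))
    for _ in range(n_iter):
        # Assignment step: group every value with its nearest center.
        groups = [[] for _ in range(k)]
        for v in values:
            groups[nearest_center(v, centers)].append(v)
        # Update step: move each center to the mean of its group
        # (an empty group keeps its previous center).
        new_centers = [
            sum(g) / len(g) if g else centers[j] for j, g in enumerate(groups)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return sorted(centers)


def dark_pixel_mask(image, k=2):
    """Boolean mask that is True where a pixel belongs to the darkest cluster."""
    pixels = [v for row in image for v in row]
    centers = kmeans_1d(pixels, k)
    return [[nearest_center(v, centers) == 0 for v in row] for row in image]


# Toy 4x4 "image": a dark 2x2 patch surrounded by bright pixels (made-up values).
image = [
    [200, 210, 205, 198],
    [202, 20, 25, 207],
    [199, 22, 18, 203],
    [201, 206, 204, 200],
]
mask = dark_pixel_mask(image)
n_dark = sum(flag for row in mask for flag in row)
```

On full-resolution images one would more likely use an array library and a tested k-means implementation; the pure-Python version is only meant to make the idea concrete.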
Some studies have shown that coronal hole regions near the center of the Sun, as observed from the Earth, contribute more to the solar wind. We therefore developed an approach to investigate how coronal holes at different latitudes and longitudes on the Sun contribute to the solar wind. We started by running our coronal hole detection algorithm on our dataset of solar images and obtained a dataset of coronal hole detections spanning the period 2010-2019. Using these detections, we derived a machine-learning model that forecasts the solar wind from the coronal hole areas. We defined the training dataset to encompass 8 years of data and evaluated the performance of the model on 2 years of independent test data. We left a gap of 180 days between training and test data to avoid data leakage caused by long-lasting coronal holes. We also removed the effect of Coronal Mass Ejections from the solar wind data, since these are independent phenomena. To derive the features for our solar wind speed prediction model, we laid a grid over the Sun and, for each grid cell, computed the coronal hole area together with its corresponding location on the solar disk. These features are sampled at a 24-hour cadence over four days to capture possible changes in the coronal holes. We also included the solar wind observed during the previous solar rotation, which can carry information because a coronal hole may survive for a few solar rotations. Finally, we included the number of sunspots as a feature to encode information about the solar cycle.
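The grid features and the gapped train/test split described above can be sketched as follows. This is a simplified illustration under stated assumptions: the grid is laid over the image plane of a boolean coronal hole mask (ignoring the projection of the solar disk and its boundary), the training period is assumed to precede the test period, and the function names and toy data are hypothetical.

```python
from datetime import datetime, timedelta


def grid_area_features(mask, n_rows, n_cols):
    """Fraction of flagged pixels in each cell of an n_rows x n_cols grid.

    mask is a 2D list of booleans; cells are returned row by row.
    Assumes the mask has at least n_rows rows and n_cols columns.
    """
    height, width = len(mask), len(mask[0])
    features = []
    for r in range(n_rows):
        r0, r1 = r * height // n_rows, (r + 1) * height // n_rows
        for c in range(n_cols):
            c0, c1 = c * width // n_cols, (c + 1) * width // n_cols
            cell = [mask[i][j] for i in range(r0, r1) for j in range(c0, c1)]
            features.append(sum(cell) / len(cell))
    return features


def split_with_gap(timestamps, train_end, gap_days=180):
    """Indices of training samples (before train_end) and test samples
    (at least gap_days after train_end); samples inside the gap are dropped."""
    test_start = train_end + timedelta(days=gap_days)
    train_idx = [i for i, t in enumerate(timestamps) if t < train_end]
    test_idx = [i for i, t in enumerate(timestamps) if t >= test_start]
    return train_idx, test_idx


# Toy mask with a 2x2 coronal hole in the middle, summarised on a 2x2 grid.
mask = [
    [False, False, False, False],
    [False, True, True, False],
    [False, True, True, False],
    [False, False, False, False],
]
features = grid_area_features(mask, 2, 2)

# Daily timestamps over 1000 days, split around a hypothetical cut-off date.
timestamps = [datetime(2010, 1, 1) + timedelta(days=d) for d in range(1000)]
train_idx, test_idx = split_with_gap(timestamps, train_end=datetime(2011, 1, 1))
```

Dropping the samples inside the gap costs some data, but it keeps a coronal hole that persists over several solar rotations from appearing on both sides of the split.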
We tested several grid sizes and found that the best grid depends on the metric. The best Root Mean Squared Error (RMSE) on the full time series was obtained with a 4x3 grid. The best RMSE for the peak speeds of high-speed solar wind streams (HSS), which are one of the most evident effects of coronal holes, was obtained with a 10x10 grid (90.4 km/s). There is a trade-off between the performance of the model on the full time series and its performance on the HSS peaks. This is due to the Mean Squared Error (MSE) loss function, which implicitly assumes a Gaussian distribution. The distribution of the solar wind speed, however, is skewed and heavy-tailed, so the model does not predict very high and very low solar wind speeds correctly. We solved this problem by correcting the predictions with a distribution transformation, learned on the training data, that maps the predicted distribution onto the observed one using the Box-Cox transformation. After the distribution transformation, the underestimation of HSS peaks disappears, and with the 10x10 grid model we were able to improve both the RMSE and the detection of HSS.
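One possible reading of such a distribution correction is sketched below. The details are assumptions, not the method used in the project: both predicted and observed training speeds are Box-Cox transformed with a fixed, hypothetical exponent `lam`, the transformed predictions are rescaled to match the mean and standard deviation of the transformed observations, and the result is transformed back. All function names and numbers are illustrative.

```python
import math
from statistics import mean, pstdev


def boxcox(y, lam):
    """Box-Cox transform of a positive value y."""
    return math.log(y) if lam == 0 else (y ** lam - 1.0) / lam


def inv_boxcox(z, lam):
    """Inverse Box-Cox transform; only defined where lam * z + 1 > 0."""
    if lam == 0:
        return math.exp(z)
    base = lam * z + 1.0
    if base <= 0:
        raise ValueError("value outside the range of the Box-Cox transform")
    return base ** (1.0 / lam)


def fit_correction(train_pred, train_obs, lam):
    """Learn the moments of predictions and observations in Box-Cox space.

    Assumes the training predictions are not all identical (non-zero spread).
    """
    zp = [boxcox(v, lam) for v in train_pred]
    zo = [boxcox(v, lam) for v in train_obs]
    return {
        "lam": lam,
        "mu_p": mean(zp), "sd_p": pstdev(zp),
        "mu_o": mean(zo), "sd_o": pstdev(zo),
    }


def apply_correction(pred, params):
    """Map predictions onto the observed distribution (moment matching)."""
    lam = params["lam"]
    corrected = []
    for v in pred:
        z = (boxcox(v, lam) - params["mu_p"]) / params["sd_p"]
        z = z * params["sd_o"] + params["mu_o"]
        corrected.append(inv_boxcox(z, lam))
    return corrected


# Made-up speeds in km/s: the raw predictions are compressed towards the mean.
train_pred = [380.0, 400.0, 420.0, 440.0, 460.0]
train_obs = [300.0, 350.0, 400.0, 500.0, 650.0]
params = fit_correction(train_pred, train_obs, lam=0.5)  # lam is hypothetical
corrected = apply_correction(train_pred, params)
```

In this toy case the corrected predictions cover a wider range than the raw ones, which is the effect needed to stop underestimating high-speed peaks. The parameters are fitted on training data only, so the correction can be applied to test predictions without leaking information.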
Finally, we analyzed the feature importance and found that, as expected, the coronal hole area close to the equator and the solar wind speed measured one solar rotation earlier are the most important features. For future work, we conclude that a different loss function is needed to account for the particular distribution of the solar wind speed, and that evaluating models with the RMSE metric alone is misleading because it does not capture the performance for the most important events, the HSS peaks.
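The difference between the RMSE of the full time series and the RMSE at the HSS peaks can be illustrated with a small sketch. The peak detection here is a deliberately simple stand-in (local maxima above a hypothetical speed threshold), and the speeds are made-up values in km/s; neither reflects the evaluation procedure used in the project.

```python
import math


def rmse(pred, obs):
    """Root mean squared error between two equally long sequences."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))


def peak_indices(obs, threshold):
    """Indices of local maxima of obs at or above threshold
    (a simple stand-in for HSS peak detection)."""
    return [
        i for i in range(1, len(obs) - 1)
        if obs[i] >= threshold and obs[i] > obs[i - 1] and obs[i] >= obs[i + 1]
    ]


# Toy series: the model tracks the slow wind well but underestimates both peaks.
obs = [350, 360, 400, 600, 450, 380, 360, 350, 420, 650, 500, 400]
pred = [355, 365, 405, 500, 445, 385, 355, 355, 415, 540, 495, 405]

peaks = peak_indices(obs, threshold=500)  # threshold is a hypothetical choice
full_rmse = rmse(pred, obs)
peak_rmse = rmse([pred[i] for i in peaks], [obs[i] for i in peaks])
```

Because the peaks make up only a small share of the samples, a model can reach a low RMSE on the full series while still missing the peaks by a wide margin, which is why reporting both numbers is more informative.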
We included these results on the solar wind prediction model in a paper, which we have submitted for publication and which is currently under revision. We also plan to publish the datasets and the code together with the paper. Our solar wind model outperforms several other empirical models and has lower complexity. It serves as a prototype of a model that will be put into operation and may in the future be transferred to ESA, NASA, NOAA, and national agencies.
Publications
Identification of Coronal Holes on AIA/SDO Images Using Unsupervised Machine Learning
Inceoglu F, Shprits Y, Heinemann S, Bianco S - The Astrophysical Journal - 2022