About The Reference Catalog:

Supervised learning relies on a well-defined reference database whose labels are assigned with uniform criteria across all samples. In the context of flare detection, the principal identifying features include the start, peak, and end times of events, the morphology of the flux profile during flares, and the characterization of the background level that separates quiescent from flaring phases. In practice, however, the construction of such a database presents significant challenges. Solar SXR emission varies with time, and there is no clear-cut definition of the background flux that unambiguously distinguishes events from the quiescent level. Furthermore, flare lightcurves differ substantially in their temporal profiles. Several flare catalogs provide start, peak, and end times; however, significant discrepancies arise among them owing to the arbitrary criteria embedded in each flare detection algorithm.

The use of an incomplete catalog as a training reference risks introducing systematic bias into the CNN identifications. To mitigate this risk, a new, internally consistent reference catalog is constructed. A total of 145 days are randomly selected from the interval 01 January 2018 to 31 May 2025, spanning solar minimum to maximum. Each day's data are visually inspected, and flaring events are annotated according to the authors’ judgment and definition of a flare. Particular emphasis is placed on the identification of rise episodes rather than complete flare durations. By concentrating on the rise episodes, the non-overlap constraint (that one flare must terminate before the next begins) is partially relaxed. In other words, a new event may not begin within the rise episode of an ongoing flare, but an event that begins during the decay of a preceding flare is permitted.
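The day selection and the relaxed non-overlap rule can be illustrated with a minimal sketch. It is not the authors' procedure, which relies on visual inspection. The function names `sample_days` and `rises_overlap`, the fixed seed, and the representation of a rise episode as a `(start, peak)` pair are all assumptions made for illustration.

```python
import random
from datetime import date, timedelta

def sample_days(start, end, n, seed=0):
    """Draw n distinct calendar days uniformly from [start, end] (inclusive).

    The seed is a hypothetical choice, used only to make the draw repeatable.
    """
    span = (end - start).days + 1
    rng = random.Random(seed)
    offsets = rng.sample(range(span), n)  # sampling without replacement
    return sorted(start + timedelta(days=k) for k in offsets)

def rises_overlap(rises):
    """Return True if any rise episode starts before the previous peak.

    Each episode is an assumed (start, peak) pair in any comparable unit.
    A start that coincides with or follows the previous peak falls in the
    decay phase and is therefore allowed.
    """
    ordered = sorted(rises)
    return any(nxt[0] < cur[1] for cur, nxt in zip(ordered, ordered[1:]))

# 145 days from the interval stated in the text
days = sample_days(date(2018, 1, 1), date(2025, 5, 31), 145)

# Second event starts during the first rise: not permitted
print(rises_overlap([(0, 5), (3, 8)]))
# Second event starts at the first peak, i.e. during the decay: permitted
print(rises_overlap([(0, 5), (5, 7), (9, 12)]))
```

Checking only adjacent episodes after sorting by start time is sufficient here: if no adjacent pair overlaps, the peaks are non-decreasing, so no later episode can begin inside an earlier rise.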

The manual procedure yields a reference catalog of 7,700 solar flares, which provides the foundation for supervised training and evaluation of the CNN framework.