The production and demand of electricity are greatly affected by weather conditions, the implementation of tariff policies, policies to mitigate the effects of climate change and other unforeseeable events, such as pandemics and wars. These factors, together with the lack of technology for efficient electricity storage, render the electricity price in all electricity markets volatile and therefore difficult to forecast. A reliable market price forecasting tool is a valuable instrument that market participants may use to deal with market price volatility, as the availability of reliable forecasts enables better strategic planning. The unexpected occurrence of negative and extremely high prices renders any forecasting endeavour even more challenging.
2. Electricity Day-Ahead Market Conditions and Their Effect on the Different Supervised Algorithms for Market Price Forecasting
A significant problem related to deregulated energy markets is the prediction of extremely high Market Clearing Prices (MCP). In 
, extreme price values are attributed to factors affecting the normal operation of the grid, such as device failures, to the bidding strategies of the market participants and to sudden increases in demand. Consumers can better manage the risks associated with peak values if more choices for purchasing electricity are available, for example, from a centralized power pool or through bilateral contracts 
. Extremely high prices can be stabilized by elastic demand, whereas a decrease in the expected price may be obtained, at the expense of increased volatility, by using renewable energy sources 
. Demand elasticity is achieved through demand-side management, which at the same time reduces market price volatility.
The use of electricity demand management together with historical price data demonstrated that prediction methods must learn the actual relationships between prices and the factors affecting them 
. To this end, neural networks and extreme learning machines were proposed in 
, respectively, to predict marginal prices, with a forecasting Mean Absolute Error (MAE) ranging between 0 and 8 $/MWh. These articles concluded that more detailed data concerning network structure and system operations are required for simulation techniques and analytical approaches to yield lower forecasting errors. Data-driven predictive methodologies such as artificial neural networks (ANN) and extreme learning machines (ELM) are thus more suitable for cases where acquiring sufficient system operational data is not feasible 
Market power is defined as the ability of market actors to manipulate prices to their benefit for specific periods of time 
. As enormous profits can be achieved by increased market prices 
generators tend to exercise market power by changing their offer curves. Market power can be exercised through so-called economic withholding and/or physical withholding 
. Economic withholding occurs when a producer submits an offer curve with relatively high prices compared to its marginal cost 
. Physical withholding occurs when generators reduce the capacity they offer, thereby rendering a proportion of the capacity of their power plants unavailable 
. The market risks can be effectively minimized through the design and use of monitoring and control mechanisms. These mechanisms require that the behaviour of market participants be constantly monitored to prevent abuse of market power. Specifically, the prices in the supply curves offered by the various generators should always be representative of a reasonable expectation of their short-run marginal costs 
. Supply prices deemed unreasonable are replaced by default market values. Another option for market monitoring and for identifying market conditions is to examine the generation and demand curves and how both affect the clearing price. A less complicated and less time-consuming method could be based on identifying signals in the demand and supply time series that indicate the possible occurrence of extremely high or negative prices.
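The screening of offers against short-run marginal cost described above can be sketched as follows. This is only an illustration under stated assumptions, not any market operator's actual procedure: the names `Offer` and `screen_offers`, the 10% tolerance over marginal cost, and the use of the marginal cost itself as the default value are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Offer:
    generator: str
    price: float          # offered price, $/MWh
    marginal_cost: float  # estimated short-run marginal cost, $/MWh


def screen_offers(offers, tolerance=0.10):
    """Replace offers priced above (1 + tolerance) * marginal cost.

    The tolerance and the default value (here the marginal cost itself)
    are assumptions for illustration; real monitoring rules differ by market.
    """
    screened = []
    for o in offers:
        cap = o.marginal_cost * (1.0 + tolerance)
        if o.price > cap:
            # Offer deemed unreasonable: substitute the default value.
            screened.append(Offer(o.generator, o.marginal_cost, o.marginal_cost))
        else:
            screened.append(o)
    return screened


# Made-up offers: G1 is close to its cost, G2 is far above it.
offers = [Offer("G1", 42.0, 40.0), Offer("G2", 95.0, 50.0)]
for o in screen_offers(offers):
    print(o.generator, o.price)
```

In this toy case the G1 offer is left unchanged, while the G2 offer is replaced by its default value because it exceeds the assumed cap.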
Local market suppliers attempt to sell their excess energy at the highest possible price 
. On the other hand, buyers (consumers) in the same market are cost pruners who seek a market price that is lower than the utility rate 
. On some occasions market actors emphasize gains obtained from prosumer models/profiles which enable users to maximize their utility via price signaling 
. An ANN-based approach to estimate the system marginal price (SMP) during weekends and public holidays was proposed in 
. The conclusion drawn therein was that lower errors were obtained on Sundays because the SMP curve was less volatile than on Saturdays.
Over time, socio-economic factors, along with the global economy, have caused energy markets to undergo substantial transformations. A measure named predictive density, which signals the likelihood of upward or downward trends in oil prices, is given in 
. It was also concluded that during periods of extremely volatile economic climate, such variables can be considered for MCP forecasting. In 
it was demonstrated that forecasting methodologies that take into consideration such measures yield improved forecasts 
. In order to further improve the forecasting results, methodologies that classify days into those with normal, excessively high and negative prices have been proposed. These methodologies attempt to exploit the intrinsic characteristics of the prices in each category. Many authors have used machine learning and/or statistical methodologies to forecast normal and peak price values. To this end, the Extreme Learning Machine (ELM) was deployed to forecast normal and extremely high Day-Ahead MCP values 
Over the years, a variety of methodologies has been proposed for normal price forecasting. In 
the asymmetric Takagi-Sugeno-Kang neuro-fuzzy model was proposed in combination with the fuzzy c-means (FCM) data pre-processing method, which classifies the patterns that may exist in the data. A two-stage methodology based on a cascaded neural network (CNN) that relies on two-stage feature selection was developed in 
. In the first stage the modified Relief algorithm is used to capture the relevant features, whereas in the second stage the relevance values of the obtained features are further analysed to find the features to be used to train the network. Other types of neural networks that have been exploited are the recurrent neural networks (RNN) 
and probabilistic neural networks (PNN) 
. Forecasting algorithms such as the LASSO-Estimated Auto-Regressive model and Deep Learning models were proposed in 
. It was concluded that the Deep Learning model could, overall, perform better than the LASSO model, but that the LASSO model is suitable for short-term forecasts.
It was argued that a model incorporating about 400 explanatory variables, a variance-stabilizing transformation and re-calibrated LASSO models gives better forecasts. An improvement of the previously mentioned method was proposed in 
, where the Seasonal Component Auto-Regressive (SCAR) model was used to decompose the electricity market price time series into a trend-seasonal part and a stochastic part, and subsequently to model each one separately. It was observed that accuracy improved when the load forecasts were deseasonalized. The model was tested on Global Energy Forecasting Competition 2014 and Nord Pool data, demonstrating a lower weekly MAE than models proposed in other studies.
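The idea of splitting a price series into a slow-moving component and a stochastic remainder can be illustrated with a simple stand-in. The sketch below uses a centered moving average, which is an assumption made for illustration and not necessarily the filter used in the SCAR study; the function names and the sample prices are hypothetical.

```python
def centered_moving_average(series, window):
    """Smooth `series` with a centered window (window must be odd).

    Near the edges the window is shrunk so every point gets a value.
    """
    if window % 2 == 0:
        raise ValueError("window must be odd")
    half = window // 2
    smoothed = []
    for i in range(len(series)):
        lo = max(0, i - half)
        hi = min(len(series), i + half + 1)
        smoothed.append(sum(series[lo:hi]) / (hi - lo))
    return smoothed


def decompose(prices, window=7):
    """Split prices into a slow-moving part and a stochastic residual."""
    trend = centered_moving_average(prices, window)
    residual = [p - t for p, t in zip(prices, trend)]
    return trend, residual


# Made-up daily prices ($/MWh), not taken from any market.
daily_prices = [30.0, 32.0, 31.0, 45.0, 33.0, 29.0, 30.0, 34.0, 36.0]
trend, residual = decompose(daily_prices, window=3)
# By construction the two parts add back up to the original series.
reconstructed = [t + r for t, r in zip(trend, residual)]
print(max(abs(a - b) for a, b in zip(reconstructed, daily_prices)))
```

In a decomposition-based approach each part would then be forecast with its own model and the two forecasts summed; the sketch shows only the split.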
A hybrid model for accurate Day-Ahead forecasting was employed in 
. In this paper the empirical mode decomposition filter and the maximum dependency and minimum redundancy criteria are applied together to construct features. This methodology gave a lower average MAPE than other models for both 1 h and 24 h ahead forecasting. However, the RMSE of the forecasts of the New South Wales (NSW) market prices was higher than the RMSE given by other methods (average RMSE ($/MWh): 28.96 for the NSW market and 7.29 for the PJM market). Another hybrid model, which consists of a multiple linear regression model, an ARIMA model and a Holt-Winters model, was proposed in 
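A model can have a lower MAPE and yet a higher RMSE than a competitor, as noted above for the NSW market, because RMSE penalizes a few large errors (for example, missed spikes) much more heavily than MAPE does. The sketch below shows the standard definitions of MAE, MAPE and RMSE; the prices and the two "models" are made up purely for illustration and are not taken from any of the cited studies.

```python
import math


def mae(actual, forecast):
    """Mean Absolute Error."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def mape(actual, forecast):
    """Mean Absolute Percentage Error, in percent.

    Undefined when an actual price is zero; such points must be handled upstream.
    """
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)


def rmse(actual, forecast):
    """Root Mean Squared Error."""
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual))


# Made-up prices ($/MWh) containing one spike.
actual = [20.0, 25.0, 30.0, 300.0]
model_a = [20.0, 25.0, 30.0, 200.0]  # exact on normal hours, misses the spike
model_b = [25.0, 31.0, 37.0, 290.0]  # small errors on every hour

for name, fc in (("A", model_a), ("B", model_b)):
    print(name, round(mape(actual, fc), 2), round(rmse(actual, fc), 2))
```

Here the hypothetical model A has the lower MAPE but the higher RMSE, which is why the two metrics are usually reported together when prices contain spikes.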
There are cases in which extremely high or negative prices appear in the electricity markets. The need for accurate forecasts in these cases has led many researchers to develop models or methodologies for spike forecasting, whereas there are no reports of similar endeavours in the direction of negative price forecasting. In order to identify extremely high prices, a fixed threshold (defined in terms of 𝜇 and 𝜎, the estimated mean and standard deviation of observed prices for a given period) is commonly used in the literature. That is, if the price exceeds the threshold it is considered extremely high. The techniques that have been proposed over the years are based on: clustering analysis of the market clearing values 
, probabilistic neural networks (PNN) 
and a combination of Bayesian experts and support vector machines (SVM) 
. The techniques proposed vary in complexity; however, the key point is that the majority of them identify price time series that contain extreme values and treat them in clusters separate from the others. Each cluster has its own features, which are subsequently used for forecasting. A different approach was presented in 
where a two-stage feature selection methodology, based on information theory, was proposed for forecasting both the occurrence and the value of price spikes. The selected features were subsequently exploited by a methodology based on a combination of PNN with a Hybrid Neuro Evolutionary System (HNES) to forecast the price values.
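The fixed-threshold rule for labelling extremely high prices, described earlier in this section, can be sketched as follows. The exact threshold formula is not reproduced here, so the form "mean plus k standard deviations" and the default multiplier `k=2.0` are assumptions; the function name `label_spikes` and the sample prices are hypothetical.

```python
import statistics


def label_spikes(prices, k=2.0):
    """Flag prices above mean + k * standard deviation.

    The multiplier k is an assumption for illustration; studies differ
    in the threshold they adopt.
    """
    mu = statistics.mean(prices)
    sigma = statistics.stdev(prices)  # sample standard deviation
    threshold = mu + k * sigma
    return threshold, [p > threshold for p in prices]


# Made-up hourly prices ($/MWh) with a single extreme value.
hourly = [30.0, 32.0, 29.0, 31.0, 33.0, 30.0, 28.0, 250.0, 31.0, 30.0]
threshold, flags = label_spikes(hourly)
spikes = [p for p, is_spike in zip(hourly, flags) if is_spike]
print(spikes)
```

Note that the extreme values themselves inflate both the estimated mean and the standard deviation, so the resulting threshold depends on the period over which the two statistics are estimated.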
A combination of two ANNs was used in 
to forecast normal market prices. The first network gives forecasts for the next day and the second gives forecasts for the next week. The forecasting of extremely high prices was performed using the Generalized Pareto Distribution (GPD). The days were separated into those with normal and those with exceedingly high market prices because the networks could not capture the extremely high prices, even though the training set contained 16 years of data. The market price data covered the period 7 December 1998–1 January 2014 and related to the Australian market zones. An ELM-based market price classification methodology was proposed in 
. More specifically, the training data was classified based on thresholds, while three-dimensional vectors were used for testing. The methodology was tested on the Ontario and PJM markets and it was found that the classification was more accurate for the Ontario market.
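To illustrate how a Generalized Pareto Distribution can describe extremely high prices, the sketch below fits a GPD to the exceedances over a threshold and evaluates the tail probability of an even larger excess. The fitting procedure of the cited study is not described here, so the method-of-moments estimator is a stand-in chosen for simplicity; the threshold of 100 and the sample prices are hypothetical.

```python
import math
import statistics


def fit_gpd_moments(exceedances):
    """Method-of-moments estimates (shape xi, scale beta) of a GPD.

    Uses mean = beta / (1 - xi) and var = beta^2 / ((1 - xi)^2 (1 - 2 xi)),
    which requires a shape below 0.5 (finite variance).
    """
    m = statistics.mean(exceedances)
    v = statistics.variance(exceedances)
    ratio = m * m / v
    xi = 0.5 * (1.0 - ratio)
    beta = 0.5 * m * (1.0 + ratio)
    return xi, beta


def gpd_tail_probability(x, xi, beta):
    """P(excess > x) under the fitted GPD, for x >= 0."""
    if abs(xi) < 1e-12:
        return math.exp(-x / beta)  # exponential limit as xi -> 0
    base = 1.0 + xi * x / beta
    return base ** (-1.0 / xi) if base > 0 else 0.0


# Made-up prices ($/MWh) and a hypothetical threshold u.
prices = [35.0, 110.0, 40.0, 120.0, 130.0, 38.0, 140.0, 200.0, 42.0]
u = 100.0
exceedances = [p - u for p in prices if p > u]
xi, beta = fit_gpd_moments(exceedances)
# Probability that a price already above u exceeds u + 50.
print(gpd_tail_probability(50.0, xi, beta))
```

Maximum likelihood fitting is more common in practice; the moment estimator is used here only because it has a closed form and needs no external library.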
A support vector machine-based method of forecasting the occurrence of the extremely high prices was proposed in 
. Therein the extremely high market prices were defined as those that exceeded the 95th percentile, which was estimated by fitting a Generalized Pareto distribution to the innovations of an AR-EGARCH model. The data used were the log-transformed market prices, demand and wind production. The selection of the input features was conducted by finding the optimal number of lags of the log market prices. The proposed methodology was compared to NN and XGBoost-based methodologies, which were unable to accurately classify extremely high or negative prices. A hybrid methodology for forecasting both the appearance and the actual value of extremely high prices was employed in 
. The hybrid methodology was based on the wavelet transform and on certain time domain and calendar indicators. In addition, mutual information (MI) was used for the feature selection, whereas the forecasting of the appearance of the extremely high prices was carried out by a Probabilistic Neural Network (PNN). This methodology was applied to data obtained from the PJM and QLD (Australia) markets. For the PJM market with a threshold equal to 150, the extremely high price forecast accuracy was 97.3% with a forecast confidence interval of 87.7%, while for a threshold equal to 200 the accuracy was 92% with a confidence interval of 88.5%. The corresponding measures for the QLD market for the month of June 2004 were lower. Namely, the extremely high price forecast accuracy was 88.23% and the confidence interval was 83.33%, and for January 2003 92.10%