An Optimized Algorithm for Renewable Energy Forecasting Based on Machine Learning

The large-scale application of renewable power generation brings new challenges to the operation of traditional power grids and to energy management on the load side. Microgrids, with their regulation capability and flexibility, can effectively address this problem and are considered an ideal platform. Computing the total transfer capability with traditional methods is difficult when wind farms are centrally integrated. The differential evolution extreme learning machine is therefore proposed as a data mining approach for extracting operating rules for the total transfer capability of tie-lines in wind-integrated power systems. First, K-medoids clustering in the two-dimensional "wind power-load consumption" feature space is used to define representative operational scenarios. Then, a knowledge base for mining total transfer capability operating rules is created using stochastic sampling and repeated power flow. Next, to reduce the ultra-high dimensionality of the operational features, a novel method filters out redundant features and identifies those closely associated with the total transfer capability. Finally, the total transfer capability operating rules are derived from the knowledge base by feeding the training data into the proposed algorithm. Numerical results show that the proposed algorithm can optimize system performance with good accuracy and generality.


Introduction
In real-time power grid operation, dispatchers often rely on a set of safe and stable operation rules to assess and maintain the security of the power system. The limit transmission power, or total transmission capacity (TTC), of the key transmission sections of an interconnected power grid is one of the most important operating indicators. Traditionally, safe and stable operation rules, including the TTC of transmission sections, have been calculated and formulated offline under typical operating conditions. However, once large-scale intermittent clean energy is connected to the grid, the grid operating mode becomes random and uncertain [1,2]. To ensure stability, the grid can then only be operated under conservative rules, which easily leads to curtailment of solar and wind power and reduces the operating efficiency of the grid.
In recent years, the development and wide application of big data and artificial intelligence technologies have provided new technical means for the fine modeling of operating rules and for smart grid dispatching. Reference [3] was the first to propose the concept of fine operation rules for power systems, combining sensitivity analysis and data mining to establish fine rules for the TTC of transmission sections. Building on [3], the authors of [4] constructed a distributed security feature selection method, which provides technical support for online training and learning of fine rules. Reference [5] considered the influence of intermittent wind power on the TTC of the transmission section and used scene clustering to extract representative scenes. TTC rules were then mined for the representative scenes to form a knowledge base better suited to real-time security monitoring of wind power transmission channels. Reference [6] used a correlation classification method to extract stable operation rules of the power grid. Because that method introduces the time factor, the resulting rules not only reveal the factors strongly correlated with stable grid operation, but also capture the internal relationship between changes in those factors and changes in the system state, thus supporting dispatching decisions. In [7], a method for extracting fine operating rules based on an artificial neural network (ANN) was proposed; compared with linear models, the ANN-based operating rules improve prediction accuracy. Compared with traditional rules, fine rule learning based on big data and artificial intelligence can take more refined grid security feature states into account, adapts better to real-time operating conditions, and has a stronger ability to capture nonlinear relationships [8,9].
Following the idea of big data-driven rule extraction and operation decision-making for power systems, this paper proposes a method, based on the differential evolution extreme learning machine, for extracting operation rules for the limit transmission power of transmission sections in power systems with wind power. Considering the uncertainty of wind power output and the time-series fluctuation of load, grid operating conditions are represented by the two-dimensional "wind power-load" feature, and typical operating conditions are extracted with the K-medoids clustering method. For each typical operating condition, a set of random operating conditions is generated by random sampling, and the repeated power flow method with an embedded transient stability check is used to search for the limit transmission power of key transmission sections under those conditions. The random operating conditions and their corresponding limit transmission powers are recorded to form a big data knowledge base. To handle the high-dimensional set of operating features of complex interconnected power grids, feature dimensionality reduction is performed with the RELIEF-F algorithm, and the differential evolution extreme learning machine then learns and extracts association prediction rules for the limit transmission power in the reduced feature space. In the real-time operation stage, two-stage operating condition matching and rule prediction provide a fast and accurate estimate of the limit transmission power of the transmission section, which serves as a basis for monitoring and controlling grid stability. The effectiveness of the proposed method is verified in simulations of a New England 39-node system with wind power.

Clustering of Time Series Running Scenarios
The limit transmission power, or total transmission capacity (TTC), is the maximum transmission capacity of a transmission section subject to the various stability constraints of the power grid, and it varies over time as grid operating conditions change. The randomness and volatility of wind power output make the operating conditions change rapidly, so a TTC setting calculated from typical operating conditions risks becoming invalid, which may lead to misjudgment of stability. Conversely, using the full set of operating conditions that accounts for all uncertain factors greatly increases the difficulty of fitting the operating rules [10].
Scene clustering is an important means of reducing the scene dimension and extracting typical operating conditions. It has been successfully applied to spinning reserve demand assessment for grid-connected intermittent clean energy [11], reactive power optimization assessment [12], wind farm site selection and planning [13], and other problems. To extract representative typical operating conditions effectively, any operating condition is characterized by the two-dimensional "wind power-load" feature, historical "wind power-load demand" records form the complete set of scenarios, and scene clustering and representative scene extraction are carried out with the K-medoids clustering method [14]. Once the set of representative typical scenes is obtained, a big data knowledge base can be constructed and operation rules extracted for each representative scene, so that the TTC operation rules of the transmission section adapt to real-time operating scenes as "wind power-load" changes.
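The clustering step can be illustrated with a minimal sketch. It assumes each scenario is a (wind power, load) pair already scaled to a comparable range, uses Euclidean distance, and applies a simple alternating assign/update variant of K-medoids; the function name k_medoids, the sample points, and the choice of variant are illustrative assumptions and are not taken from [14].

```python
import random

def k_medoids(points, k, max_iter=100, seed=0):
    """Alternating K-medoids on 2-D (wind, load) points (illustrative variant)."""
    rng = random.Random(seed)

    def dist(a, b):
        return ((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2) ** 0.5

    def assign(medoids):
        # Label each point with the index of its nearest medoid.
        return [min(range(k), key=lambda c: dist(p, points[medoids[c]]))
                for p in points]

    medoids = rng.sample(range(len(points)), k)  # indices of initial medoids
    for _ in range(max_iter):
        labels = assign(medoids)
        new_medoids = []
        for c in range(k):
            members = [i for i, lab in enumerate(labels) if lab == c]
            if not members:              # keep the old medoid for an empty cluster
                new_medoids.append(medoids[c])
                continue
            # The new medoid is the member with the smallest total distance
            # to the other members of its cluster.
            new_medoids.append(min(
                members,
                key=lambda i: sum(dist(points[i], points[j]) for j in members)))
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids, assign(medoids)

# Hypothetical normalized (wind power, load) scenarios forming two groups.
points = [(0.10, 0.20), (0.15, 0.25), (0.12, 0.22),
          (0.90, 0.80), (0.85, 0.82), (0.88, 0.79)]
medoids, labels = k_medoids(points, k=2)
print([points[m] for m in medoids], labels)
```

Unlike K-means, each cluster center here is an actual recorded scenario, which is what allows it to serve directly as a representative operating condition.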

Repeated Power Flow Calculation
To extract the TTC operation rules of the transmission section, the limit transmission power of the specified section must be calculated for random operating conditions in the knowledge base construction stage. TTC calculation methods include the continuation power flow method [15,16] and the optimal power flow method [17,18]. In actual operation, important transmission sections are often constrained by transient stability. The continuation power flow method generally adopts a quasi-steady-state model, so it cannot account for the transient stability constraints of the section. For the optimal power flow method, introducing transient stability constraints makes the model difficult to solve, and improving the solution speed and robustness of the algorithm remains an open problem. Therefore, a repeated power flow method based on binary search over the growth of the section transmission power is proposed. The algorithm proceeds as follows.
(1) Given the initial operating conditions of the power grid, initialize the binary search interval [λ_s, λ_u] of the load growth factor.
(2) Take the midpoint of the interval, λ_L = (λ_s + λ_u)/2, substitute λ_L into Eq. (1) to update the load demand at the receiving end, and at the same time adjust the output of the generators at the sending end according to Eq. (2). In these formulas, k_Li is the growth rate factor of load i in the receiving-end grid; ΔP_L is the total load increment of the receiving-end grid; P^R_Gj is the active power reserve of generator j at the sending end; and P^R_G is the total active power reserve of the generators at the sending end, which together satisfy Eq. (3).
(3) Calculate the power flow of the grid after the synchronous increase of load and generation, consider the set of fault scenarios in which a three-phase short circuit occurs on any transmission line of the section, and conduct time-domain simulations one by one. In this paper, the generator model shown in Eq. (4) is used, and the transient stability check is carried out based on Eq. (5). In the formula, Δδ_max represents the maximum power angle difference between units at any simulation time step. If the transient stability index S is less than 0, the grid is judged to be unstable after the fault; otherwise, the grid remains stable after the fault.
(4) If the current operating condition of the power grid satisfies the transient stability constraint for the section fault set, update the binary search interval by letting λ_s = λ_L; otherwise, let λ_u = λ_L.
(5) If the interval gap satisfies the calculation accuracy (λ_u − λ_s < Δλ_th), the critical load growth factor λ_cr = (λ_s + λ_u)/2 is obtained, and the power flow under the critical operating condition is calculated. The total transmission power of the section at this point is the TTC for the initial operating condition.
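The binary search in steps (1) to (5) can be sketched as follows. This is a minimal illustration only: is_stable is a hypothetical callback standing in for the power flow update and the time-domain transient stability check of steps (2) and (3), which are not modeled here, and the sketch assumes the lower bound is stable and the upper bound is unstable.

```python
def critical_growth_factor(is_stable, lam_s, lam_u, tol=1e-3):
    """Bisection search for the critical load growth factor.

    is_stable(lam) -> bool is a hypothetical callback: it should update load
    and generation for growth factor lam, run the power flow, and return True
    if every fault in the section fault set passes the stability check.
    Assumes is_stable(lam_s) is True and is_stable(lam_u) is False.
    """
    while lam_u - lam_s >= tol:
        lam_mid = 0.5 * (lam_s + lam_u)
        if is_stable(lam_mid):
            lam_s = lam_mid   # stable: the limit lies above the midpoint
        else:
            lam_u = lam_mid   # unstable: the limit lies below the midpoint
    return 0.5 * (lam_s + lam_u)

# Toy stand-in for the simulation: pretend stability is lost at lam = 1.37.
lam_cr = critical_growth_factor(lambda lam: lam < 1.37, lam_s=1.0, lam_u=2.0)
print(round(lam_cr, 3))
```

Each iteration halves the interval, so the number of stability simulations grows only logarithmically with the required accuracy Δλ_th.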

Big Data Knowledge Base Construction
A set of typical operating conditions is obtained through the clustering of time series scenarios described above. To account for the uncertainty of wind power output, synchronous machine output, and load demand, random sampling is used to generate random operating conditions around each typical operating condition:
(1) For any typical operating condition, calculate the maximum offset radius (P^max_wind, P^max_load), so that the wind power output and load demand fluctuate randomly around their typical values within [−P^max_wind, +P^max_wind] and [−P^max_load, +P^max_load], respectively, while the output of each synchronous machine fluctuates randomly within [80%, 120%] of its initial output. Generate the random operating conditions and calculate the power flow. Then record all operating feature parameters under each operating condition, such as node voltage magnitude and phase angle, load demand, and generator output.
(2) Based on the repeated power flow method proposed in Section 3.1, calculate the limit transmission power of the transmission section under the typical and random operating conditions.
(3) Denote the k-th operating feature parameter of the i-th random operating condition by F^k_i and the limit transmission power of the section under this condition by P^TTC_i, as expressed in Eq. (6). Calculate the deviation of each feature parameter and of the section TTC relative to the central operating scene, and use the feature deviation F_i as the input feature and the corresponding TTC deviation T_i as the prediction target to form a big data knowledge table.
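A minimal sketch of the sampling in step (1) and the deviation calculation in step (3) is given below. The dictionary layout, the function names, the uniform distribution, and the clipping of wind power at zero are assumptions made for illustration; the power flow and TTC calculations are outside the scope of the sketch.

```python
import random

def sample_conditions(center, p_wind_max, p_load_max, n, seed=0):
    """Draw n random operating conditions around one typical condition.

    center is a hypothetical dict with keys "wind", "load" (MW) and "gen"
    (list of synchronous machine outputs in MW). Uniform sampling is assumed.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        samples.append({
            # Offsets within [-P_max, +P_max]; wind is clipped at zero (assumption).
            "wind": max(0.0, center["wind"] + rng.uniform(-p_wind_max, p_wind_max)),
            "load": center["load"] + rng.uniform(-p_load_max, p_load_max),
            # Each machine is scaled to 80%-120% of its initial output.
            "gen": [g * rng.uniform(0.8, 1.2) for g in center["gen"]],
        })
    return samples

def deviation(features, center_features):
    """Feature deviation relative to the central operating scene."""
    return [f - c for f, c in zip(features, center_features)]

center = {"wind": 300.0, "load": 5000.0, "gen": [500.0, 650.0]}
conditions = sample_conditions(center, p_wind_max=50.0, p_load_max=200.0, n=100)
first = conditions[0]
print(deviation([first["wind"], first["load"]], [center["wind"], center["load"]]))
```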

RELIEF-F Feature Selection
In order to find the features implicitly related to the TTC as completely as possible, this paper retains all data that can be collected by supervisory control and data acquisition (SCADA). The sample data set therefore inevitably contains a large number of redundant features and noisy data, which increases the computational burden of subsequent fine rule extraction and reduces the accuracy of the rules. It is thus necessary to perform feature screening on the original sample set.
RELIEF-F is a filter-type feature selection algorithm that does not depend on the subsequent learner, and it is suitable for preprocessing sample sets that contain redundant features and noisy data. Its core idea is to evaluate the ability of each feature to distinguish neighboring samples and to use the resulting evaluation value to quantify the correlation between the feature and the target. The larger the evaluation value, the greater the contribution of the feature to the predicted target, and features with large evaluation values are retained.
In a regression problem, the target takes continuous values, so the traditional RELIEF-F algorithm is not suitable [19]. Reference [20] proposed an improved RELIEF-F for regression problems. Because the traditional algorithm cannot obtain sample category information in a regression problem, the improved algorithm instead uses the distance between the predicted values of samples to construct a probability model. Since the probability model is difficult to solve directly, the following algorithm is used to estimate it and then obtain the evaluation value of each feature. The specific steps are shown in Algorithm 1. The quantities N_dP, N_dF(F), and N_dPdF(F) are the approximate values of P_diffP, P_diffF, and P_diffP|diffF, respectively. The function f(⋅) represents the target value of a sample, and the functions diff(⋅) and d(⋅) correspond to Eqs. (7) and (8). In the formula, rank(R_i, I_j) is the rank of the nearest neighbor I_j when the neighbors are sorted by their distance to the selected sample R_i, and σ is a user-defined parameter. In this paper, σ = 50.
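Since Algorithm 1 and Eqs. (7) and (8) are not reproduced here, the following sketch follows the commonly published regression form of RELIEF-F (often called RReliefF) and should be read as an assumption about the procedure rather than the exact algorithm of [20]. The function name rrelieff, the range-normalized diff, the Manhattan neighbor distance, and the default k = 10 are illustrative choices; σ = 50 is the value stated above.

```python
import numpy as np

def rrelieff(X, y, k=10, sigma=50.0, m=None, seed=0):
    """Regression RELIEF-F feature weights (commonly published form, assumed)."""
    rng = np.random.default_rng(seed)
    n, n_feat = X.shape
    m = n if m is None else m
    k = min(k, n - 1)
    x_range = X.max(axis=0) - X.min(axis=0)
    x_range[x_range == 0] = 1.0                # avoid division by zero
    y_range = (y.max() - y.min()) or 1.0
    n_dp = 0.0                                 # accumulates "different prediction"
    n_df = np.zeros(n_feat)                    # accumulates "different feature"
    n_dpdf = np.zeros(n_feat)                  # accumulates both together
    # Rank-based neighbor weights d(i, j), normalized to sum to one.
    d1 = np.exp(-((np.arange(1, k + 1) / sigma) ** 2))
    d = d1 / d1.sum()
    for _ in range(m):
        i = rng.integers(n)                    # randomly selected sample R_i
        dist = np.abs((X - X[i]) / x_range).sum(axis=1)
        dist[i] = np.inf                       # exclude the sample itself
        nn = np.argsort(dist)[:k]              # k nearest neighbors I_j, by rank
        diff_p = np.abs(y[nn] - y[i]) / y_range
        diff_f = np.abs(X[nn] - X[i]) / x_range
        n_dp += (diff_p * d).sum()
        n_df += (diff_f * d[:, None]).sum(axis=0)
        n_dpdf += (diff_p[:, None] * diff_f * d[:, None]).sum(axis=0)
    # Larger weight: the feature separates samples with different targets.
    return n_dpdf / n_dp - (n_df - n_dpdf) / (m - n_dp)

# Synthetic check: only the first of three features drives the target.
demo_rng = np.random.default_rng(1)
X_demo = demo_rng.uniform(0.0, 1.0, (200, 3))
y_demo = 3.0 * X_demo[:, 0]
weights = rrelieff(X_demo, y_demo)
print(np.argsort(weights)[::-1])               # feature indices, best first
```

Features would then be ranked by weight and the highest-ranked ones retained as inputs to the learner.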

Extreme Learning Machine
Extracting fine rules for transmission sections requires a learner that can build models quickly and accurately, and the extracted fine rules must generalize well. The extreme learning machine (ELM), a learning algorithm for single-hidden-layer feedforward neural networks (SLFNN), has the advantages of fast training, a simple network structure, and strong generalization ability, and it has been widely used in power systems.
The extreme learning machine randomly generates the weight matrix and threshold vector between the input layer and the hidden layer, and then directly solves for the weights between the hidden layer and the output layer by least squares, so that the training samples are approximated with as small an error as possible, ideally zero. Compared with the error back-propagation (BP) algorithm, the extreme learning machine obtains the feedforward network through an analytical solution, which makes it less prone to falling into local optima and gives it strong generalization ability. The specific execution steps of the extreme learning machine can be found in [21] and are summarized in Fig. 1.
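The training procedure described above can be sketched in a few lines. The sigmoid activation, the uniform initialization range [-1, 1], the number of hidden nodes, and the class name ELM are assumptions chosen for illustration; the least-squares solution uses the Moore-Penrose pseudoinverse.

```python
import numpy as np

class ELM:
    """Minimal extreme learning machine for regression (illustrative sketch)."""

    def __init__(self, n_hidden=40, seed=0):
        self.n_hidden = n_hidden
        self.rng = np.random.default_rng(seed)

    def _hidden(self, X):
        # Sigmoid hidden-layer output matrix H.
        return 1.0 / (1.0 + np.exp(-(X @ self.W + self.b)))

    def fit(self, X, y):
        # Input weights and thresholds are drawn at random and never trained.
        self.W = self.rng.uniform(-1.0, 1.0, (X.shape[1], self.n_hidden))
        self.b = self.rng.uniform(-1.0, 1.0, self.n_hidden)
        # Output weights: least-squares solution beta = pinv(H) y.
        self.beta = np.linalg.pinv(self._hidden(X)) @ y
        return self

    def predict(self, X):
        return self._hidden(X) @ self.beta

# Synthetic regression example with a mildly nonlinear target.
demo_rng = np.random.default_rng(1)
X_demo = demo_rng.uniform(-1.0, 1.0, (200, 2))
y_demo = X_demo[:, 0] ** 2 + 0.5 * X_demo[:, 1]
model = ELM(n_hidden=40).fit(X_demo, y_demo)
train_mse = float(np.mean((model.predict(X_demo) - y_demo) ** 2))
print(train_mse)
```

Because only the output weights are solved for, training reduces to a single linear least-squares problem, which is the source of the speed advantage over iterative back-propagation.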

Differential Evolution Extreme Learning Machine
When the extreme learning machine is used to extract fine rules for transmission sections, random disturbances in the power system may reduce its actual prediction accuracy. Therefore, to enhance the generalization ability of the extreme learning machine under uncertain factors, this paper combines an intelligent optimization algorithm with the extreme learning machine. Compared with other evolutionary algorithms, the differential evolution algorithm is simple to implement, converges quickly, and has good global search performance [22], which makes it especially suitable for neural network optimization. The specific implementation steps of differential evolution (DE) can be found in [23]. This paper focuses on the steps for applying differential evolution to the extreme learning machine:
1) Encode the weight matrix ω and the threshold vector b between the SLFNN input layer and hidden layer as real numbers, and randomly initialize the population.
2) Perform 5-fold cross-validation on the input samples: use each individual in the population to construct an SLFNN for ELM training, feed the validation set into the SLFNN to obtain the predicted output T^v_pop = E_pop(x_v), and calculate the population fitness with the fitness function shown in Eq. (9). In the formula, pop is the index of the current individual; E_pop(⋅) is the extreme learning machine constructed from the current individual; T^v_pop is the prediction output vector for the current validation set; y_v is the target vector of the current validation set; and sv is the number of samples in the validation set.
3) Perform the differential evolution selection, crossover, and mutation operations.
4) Obtain the offspring, calculate their individual fitness, and select the optimal individual.
5) Check whether the maximum number of generations has been reached. If so, output the current optimal extreme learning machine; otherwise, return to step 3).
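Steps 1) to 5) can be sketched as follows. Because Eq. (9) is not reproduced here, the fitness is assumed to be the validation root-mean-square error averaged over 5 folds; the DE/rand/1/bin scheme, the control parameters F and CR, the population size, and all function names are likewise illustrative assumptions rather than the settings used in the paper.

```python
import numpy as np

def elm_rmse(theta, X_tr, y_tr, X_va, y_va, n_hidden):
    """Train an ELM whose input weights/thresholds are decoded from theta."""
    n_in = X_tr.shape[1]
    W = theta[: n_in * n_hidden].reshape(n_in, n_hidden)
    b = theta[n_in * n_hidden:]
    hidden = lambda X: 1.0 / (1.0 + np.exp(-(X @ W + b)))
    beta = np.linalg.pinv(hidden(X_tr)) @ y_tr       # ELM output weights
    err = hidden(X_va) @ beta - y_va
    return float(np.sqrt(np.mean(err ** 2)))         # assumed fitness: RMSE

def cv_fitness(theta, X, y, n_hidden, folds=5):
    """Average validation RMSE over interleaved folds."""
    idx = np.arange(len(X))
    scores = []
    for f in range(folds):
        va = idx[f::folds]
        tr = np.setdiff1d(idx, va)
        scores.append(elm_rmse(theta, X[tr], y[tr], X[va], y[va], n_hidden))
    return float(np.mean(scores))

def de_elm(X, y, n_hidden=10, pop_size=12, generations=15, F=0.5, CR=0.9, seed=0):
    """DE/rand/1/bin search over ELM input weights and thresholds."""
    rng = np.random.default_rng(seed)
    dim = X.shape[1] * n_hidden + n_hidden
    pop = rng.uniform(-1.0, 1.0, (pop_size, dim))    # real-coded population
    fit = np.array([cv_fitness(p, X, y, n_hidden) for p in pop])
    history = [float(fit.min())]
    for _ in range(generations):
        for i in range(pop_size):
            others = [j for j in range(pop_size) if j != i]
            a, b, c = pop[rng.choice(others, 3, replace=False)]
            mutant = np.clip(a + F * (b - c), -1.0, 1.0)        # mutation
            cross = rng.random(dim) < CR                        # crossover mask
            cross[rng.integers(dim)] = True                     # keep >= 1 gene
            trial = np.where(cross, mutant, pop[i])
            f_trial = cv_fitness(trial, X, y, n_hidden)
            if f_trial <= fit[i]:                               # greedy selection
                pop[i], fit[i] = trial, f_trial
        history.append(float(fit.min()))
    best = int(np.argmin(fit))
    return pop[best].copy(), float(fit[best]), history

# Synthetic example; the data are placeholders, not TTC samples.
demo_rng = np.random.default_rng(1)
X_demo = demo_rng.uniform(-1.0, 1.0, (60, 3))
y_demo = np.sin(X_demo[:, 0]) + X_demo[:, 1] * X_demo[:, 2]
theta_best, best_fit, history = de_elm(X_demo, y_demo, generations=5)
print(history[0], history[-1])
```

The greedy selection guarantees that the best fitness never worsens from one generation to the next, at the cost of one full cross-validated ELM training per trial individual.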

Experimental Results and Analysis
The case study uses a modified three-area New England 39-node system in which a wind farm with an installed capacity of 600 MW is centrally connected at bus 17 (see Fig. 2). Time-domain simulation is performed in MATLAB with the PSAT toolbox, and the wind turbines use the classic doubly-fed induction generator (DFIG) model provided by PSAT.
Figure 2: System configuration
In addition, the squared correlation coefficient (SCC) and the mean squared error (MSE) are used to measure the accuracy of the section fine rules extracted by the proposed method.
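The two indicators can be computed as in the sketch below. The paper does not state the formulas, so the standard definitions are assumed: MSE is the mean of the squared prediction errors, and SCC is the square of the Pearson correlation coefficient between actual and predicted values. The function names and sample values are illustrative.

```python
def mse(y_true, y_pred):
    """Mean squared error between actual and predicted values."""
    n = len(y_true)
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n

def scc(y_true, y_pred):
    """Squared Pearson correlation coefficient (assumed definition of SCC)."""
    n = len(y_true)
    s_t, s_p = sum(y_true), sum(y_pred)
    s_tp = sum(t * p for t, p in zip(y_true, y_pred))
    s_tt = sum(t * t for t in y_true)
    s_pp = sum(p * p for p in y_pred)
    num = (n * s_tp - s_t * s_p) ** 2
    den = (n * s_tt - s_t ** 2) * (n * s_pp - s_p ** 2)
    return num / den

# Placeholder values: predictions perfectly correlated with, but offset from, the targets.
actual = [1.0, 2.0, 3.0, 4.0]
predicted = [2.0, 4.0, 6.0, 8.0]
print(mse(actual, predicted), scc(actual, predicted))
```

The example shows why both indicators are reported together: a prediction can be perfectly correlated with the target (SCC of 1) while still carrying a large error (MSE of 7.5).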

Sample Generation
For the standard example system, due to the lack of historical data, it is not possible to directly use the historical "wind power-load" time series data for scene clustering. In order to verify the method proposed in this paper, the time series model of reference [24] is used to generate the "wind power-load" time series data. After obtaining the "wind power-load" scene data through time series simulation, the scene clustering and representative typical scene extraction based on the K-medoids method are carried out, and five types of representative scene centers are obtained as shown in Tab. 1.

Feature Selection
The starting feature space is shown in Tab. 2. Applying the proposed algorithm to perform feature selection on the training samples in scene cluster 3 gives the evaluation value of the prediction correlation of each feature with respect to the target value, as shown in Fig. 3.

Tab. 2 (rows 5-9): Starting feature space
No.  Feature
5    Net active and reactive power difference in the sending-end area
6    Net active and reactive power difference in the receiving-end area
7    Voltage difference of each bus
8    Active load difference of each node
9    Active power generation difference of each generator

To verify the effectiveness of feature selection, this paper selects the best 5 and best 10 features and the worst 5 and worst 10 features as inputs to the proposed method; the results are given in Tab. 3.
Tab. 3 clearly shows that the fine rules obtained from the best features are significantly more accurate than those obtained from the worst features. Moreover, as the number of features in the input set increases, the accuracy of the extracted fine rules increases, but so does the training time. It is therefore necessary to select features that ensure the accuracy of the fine rules of the transmission section without incurring excessive training time.

Section Fine Rule Extraction
According to the feature selection results in Section 6.2, this paper selects the 40 features with the highest evaluation values as the training data for the final input to the learner. The final selected feature space is shown in Tab. 4.

Tab. 2 (continued, rows 10-13): Starting feature space
No.  Feature
10   Reactive power generation difference of each generator
11   Total power flow difference of the section
12   Active wind power
13   Total load difference

The section fine rule of scene 3 is established by the proposed method and then used to predict the TTC of the test set. The resulting MSE is 0.054 and the SCC is 0.9859. The error distribution between the predicted and actual values is shown in Fig. 4.
Similarly, for the other clustering scenarios, the proposed algorithm can also accurately extract the fine rules of the section. Taking the number of samples in scene 3 as the basis, the number of samples for each of the other scene clusters is obtained by scaling according to the coverage of the clustering scene, and 200 test samples are likewise selected to test the performance of the fine rules. The rule prediction performance for all scene clusters is shown in Tab. 5, and the error distribution is shown in Fig. 5.

Tab. 4: Final selected feature space
No.  Feature
1    Phase angle difference of the tie-line of each section
2    Active load at nodes 3, 4, 7, 8, 15, 16, 18, 20, 21, 23-29, 39
3    Active power output of generators at nodes 30-38
4    Reactive power output of generators at nodes 31-32, 34-39
5    Wind farm active power output
6    Total reactive power (Q) in the sending region
7    Total reactive power (Q) in the receiving region

It can be seen that the proposed method adapts to the extraction of fine rules for transmission sections in different scenarios. The fine rules constructed by the proposed algorithm have strong generalization ability and can quickly and accurately predict the section TTC.

Algorithm Comparison
This section reports the performance indicators obtained when the fine rules extracted by different algorithms are applied to the prediction of scene cluster 3, with input features identical to those in Section 6.3. The results are shown in Tab. 6. Compared with the traditional back-propagation neural network (BPNN), the extreme learning machine (ELM) trains faster and generalizes better. The proposed algorithm trains relatively slowly because, under differential evolution optimization, every individual in the population must undergo extreme learning machine training, which reduces the overall training speed. Compared with both BPNN and ELM, however, the proposed algorithm has higher accuracy and stronger generalization ability. Since fine rules can be extracted offline and online prediction by the learner takes only milliseconds, the higher-accuracy proposed algorithm is better suited to this scenario.

Conclusion
This paper proposes a novel algorithm to optimize the power flow in electrical distribution systems. The large-scale connection of intermittent clean energy such as wind power makes the real-time operating conditions of the power grid more random and uncertain. Traditional safe and stable operation rules, which are calculated from typical operating conditions, risk becoming invalid, making it difficult to ensure the efficiency and safety of the power grid. To solve these problems, and following the idea of big data-driven rule extraction and operation decision-making for power systems, this paper proposes an adaptive differential evolution extreme learning machine method for extracting the limit transmission power operation rules of transmission sections in power systems with wind power. First, grid operating conditions are characterized by the two-dimensional "wind power-load" feature, and typical operating conditions are extracted with the K-medoids clustering method. Then, for each typical operating condition, a set of random operating conditions is generated by random sampling, and the repeated power flow method with an embedded transient stability check is used to search for the limit transmission power of key transmission sections under those conditions. To handle the high-dimensional set of operating features of complex interconnected power grids, the RELIEF-F algorithm is used for feature dimensionality reduction, identifying the features that are strongly coupled with the transmission section. Finally, the differential evolution extreme learning machine learns the association prediction rules for the limit transmission power of the transmission section in the reduced feature space.
In the real-time operation stage, two-stage operating condition matching and rule prediction provide a fast and accurate estimate of the limit transmission power of the transmission section. The case study on the New England 39-node system with wind power verifies that the proposed method has strong correlation fitting and nonlinear generalization ability and can quickly and accurately estimate the limit transmission power of wind power transmission sections. Future work will consider the transient characteristics and evaluate the proposed method in different deployment scenarios.