Digital signal processing of electroencephalography (EEG) data is now widely used in applications including motor imagery classification, seizure detection and prediction, emotion classification, mental task classification, drug impact identification, and sleep state classification. As the number of recorded EEG channels grows, effective channel selection algorithms have become necessary for many of these applications. This work suggests a feature selection algorithm, the Guided Whale Optimization Method (Guided WOA) based on the Stochastic Fractal Search (SFS) technique, to evaluate the chosen subset of channels. It can be used to select the optimum EEG channels for Brain-Computer Interfaces (BCIs) by identifying the essential and irrelevant features in a dataset, thereby reducing complexity. The SFS-Guided WOA algorithm thus chooses the most appropriate EEG channels, which in turn assists the machine learning classifier in its training and classification tasks. The SFS-Guided WOA algorithm is superior in terms of performance metrics, and statistical tests such as ANOVA and Wilcoxon rank-sum are used to demonstrate this.

Digital signal processing is critical for several applications, including seizure detection/prediction, sleep state classification, and categorization of motor imagery. As shown in

As previously stated, the interface between the brain and the computer (or another device) may be invasive or non-invasive. While invasive technologies have recently demonstrated some promise in a variety of applications due to their high accuracy and low noise [

The brain regions are shown in

Frequency band | Frequency range (Hz) | Mental state |
---|---|---|
Delta | 1–4 (Slow) | Deep sleep |
Theta | 4–8 | Drowsy |
Alpha | 8–12 | Relaxed |
Beta | 12–30 (Fast) | Focused |

The collected EEG signals are often multi-channel in nature. When classifying these signals, there are two options: work on a subset of channels chosen according to specific criteria, or work on all channels [

Seizure prediction and detection is another area where channel reduction may be helpful. The scientific and industrial communities are particularly interested in developing portable medical support systems. By incorporating suitable algorithms, such systems can detect the onset of epileptic seizures early, or even hours in advance, thereby avoiding injury [

The contributions of the current work can be summarized as follows.

A continuous version of the Guided Whale Optimization based on Stochastic Fractal Search algorithm (Continuous SFS-Guided WOA) is presented.

A binary version of the SFS-Guided WOA algorithm (Binary SFS-Guided WOA) is also presented.

Two publicly accessible datasets for electroencephalogram (EEG) signal processing, named BCI Competition IV-dataset 2a and BCI Competition IV-data set III, are utilized to test the suggested method.

The SFS-Guided WOA algorithm is employed to evaluate the chosen subset of EEG channels of the two datasets.

This is used to select the optimum EEG channels for use in Brain-Computer Interfaces (BCIs).

Statistical tests such as ANOVA and Wilcoxon rank-sum are used to demonstrate the presented method's performance.

Feature selection methods may be categorized as filter-based, wrapper-based, or hybrid-based [

The optimization technique is widely used in various fields of study, including computer science, engineering [

Recently, numerous studies have used optimization algorithms such as the Whale Optimization Algorithm (WOA) to solve a range of problems. WOA has been used to find the optimal weights for training neural networks, and a multi-objective version of WOA was developed and applied to the problem of forecasting wind speed. Additionally, WOA has been widely employed to determine the location and size of capacitors used in radial systems [

EEG signals can be very complex, since they are non-linear and random in nature. EEG measurements are influenced by various factors, most notably the age, gender, psychological state, and mental state of the subject [

The Genetic Algorithm (GA) optimizer is inspired by biology (survival of the fittest). Initialization is a critical step of the GA process. Other genetic operators, such as elitism, may also be used [

Particle Swarm Optimization (PSO) is another optimization technique that simulates the motions and interactions of individuals in a flock of birds or a school of fish [

The Grey Wolf Optimizer (GWO) is an algorithm that mimics the leadership, social structure, and hunting behavior of grey wolves. Encircling and attacking the prey are the first two phases. In this optimizer, exploration and exploitation must be conducted in a balanced manner. While GWO offers high search accuracy and is simple to implement, it can converge prematurely because of the fluctuating positions of the three leaders, and its performance degrades as the number of variables grows. It is utilized in feature selection, parameter adjustment of PID controllers, clustering, robotics, and route finding [

The foraging behavior of humpback whales inspired the Whale Optimization Algorithm (WOA). The whales catch fish with bubbles as they swirl around a school of fish. WOA is a simple method that can explore a vast search space, but it is slow to converge, prone to stagnation in local optima, and computationally costly. WOA is applied in route planning, voltage offset reduction, and precision control of laser sensor systems [

The Guided WOA is a variant of the standard WOA. The original WOA compels whales to travel randomly around one another, which is comparable to a global search. To address the main disadvantage of this method, the Guided WOA replaces the search strategy based on a single random whale with a more advanced design capable of quickly moving the whales toward the optimal solution, or prey. In the modified WOA (Guided WOA), a whale may follow three random whales rather than one to improve exploration performance [

The Stochastic Fractal Search (SFS) technique's diffusion process may generate a sequence of random walks around the optimum solution. This enhances the Guided WOA's exploration capacity by using this diffusion process to find the optimal solution. Gaussian random walks are used as a component of the diffusion process that occurs around the updated optimum position. Algorithm 1 shows the continuous version of the SFS-Guided WOA algorithm. The binary conversion of the algorithm is shown in Algorithm 2, which explains step by step how to convert the continuous algorithm to a binary one to be applied for the tested EEG problem.
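The two ideas in this paragraph, Gaussian random walks around the current best position and conversion of a continuous position into a binary channel mask, can be illustrated with a minimal sketch. It is not a reproduction of Algorithm 1 or Algorithm 2: the fixed step size `sigma`, the number of walks `n_walks`, the sigmoid transfer function, and the 0.5 threshold are all assumptions chosen for illustration, and the function names are hypothetical.

```python
import math
import random


def gaussian_diffusion(best, fitness, n_walks=5, sigma=0.1, rng=None):
    """Sample Gaussian random walks around `best` and keep the fittest point.

    Assumption: a fixed standard deviation `sigma` is used for every walk.
    The fitness is minimized (lower is better).
    """
    rng = rng or random.Random()
    best_pos, best_fit = list(best), fitness(best)
    for _ in range(n_walks):
        # Each coordinate is perturbed around the current best position.
        candidate = [rng.gauss(x, sigma) for x in best_pos]
        cand_fit = fitness(candidate)
        if cand_fit < best_fit:
            best_pos, best_fit = candidate, cand_fit
    return best_pos, best_fit


def to_binary(position, threshold=0.5):
    """Map a continuous position to a 0/1 mask (1 = channel selected).

    Assumption: a sigmoid squashes each coordinate into (0, 1) and the
    result is compared with `threshold`.
    """
    mask = []
    for x in position:
        x = max(-60.0, min(60.0, x))  # clamp to avoid overflow in exp()
        mask.append(1 if 1.0 / (1.0 + math.exp(-x)) >= threshold else 0)
    return mask
```

For instance, `gaussian_diffusion([1.0, -2.0, 0.5], lambda p: sum(v * v for v in p))` returns a position whose fitness is never worse than that of the starting point, and `to_binary` turns any such position into a mask over channels.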

This section discusses the experimental results. The data preprocessing process is explained, including the description and the correlation matrix of tested EEG datasets. Configuration of the suggested algorithm is also discussed. Performance metrics and results discussion are described in detail in this part.

Two publicly accessible datasets for electroencephalogram (EEG) signal processing are utilized in this work to test our suggested method.

Dataset | Abbreviation | # Rows | # Features/Columns |
---|---|---|---|
BCI competition IV- dataset 2a | D1 | 45,000 | 22 |
BCI competition IV- data set III | D2 | 64,000 | 19 |

Feature | s1 | s2 | s3 | s4 | s5 | s6 | s7 |
---|---|---|---|---|---|---|---|
Number of values | 14304 | 14304 | 14304 | 14304 | 14304 | 14304 | 14304 |
Minimum | 4200 | 3910 | 4200 | 4070 | 4310 | 4570 | 4030 |
25% percentile | 4280 | 3990 | 4250 | 4110 | 4330 | 4610 | 4060 |
Median | 4290 | 4000 | 4260 | 4120 | 4340 | 4620 | 4070 |
75% percentile | 4310 | 4020 | 4270 | 4130 | 4350 | 4630 | 4080 |
Maximum | 4470 | 4150 | 4350 | 4190 | 4400 | 4670 | 4140 |
Range | 270.0 | 240.0 | 150.0 | 120.0 | 90.00 | 100.0 | 110.0 |
Mean | 4298 | 4007 | 4262 | 4120 | 4340 | 4618 | 4071 |
Std. deviation | 32.52 | 27.33 | 16.44 | 17.08 | 12.33 | 13.26 | 17.97 |
Std. error of mean | 0.2719 | 0.2285 | 0.1375 | 0.1428 | 0.1031 | 0.1109 | 0.1503 |
Sum | 61480820 | 57319530 | 60958540 | 58935910 | 62074800 | 66060020 | 58231630 |

Feature | s8 | s9 | s10 | s11 | s12 | s13 | s14 | cat |
---|---|---|---|---|---|---|---|---|
Number of values | 14304 | 14304 | 14304 | 14304 | 14304 | 14304 | 14304 | 14304 |
Minimum | 4570 | 4150 | 4170 | 4130 | 4220 | 4490 | 4240 | 0.000 |
25% percentile | 4600 | 4190 | 4220 | 4190 | 4270 | 4590 | 4340 | 0.000 |
Median | 4610 | 4200 | 4230 | 4200 | 4280 | 4600 | 4350 | 0.000 |
75% percentile | 4620 | 4210 | 4240 | 4210 | 4290 | 4620 | 4370 | 1.000 |
Maximum | 4670 | 4260 | 4290 | 4270 | 4330 | 4720 | 4490 | 1.000 |
Range | 100.0 | 110.0 | 120.0 | 140.0 | 110.0 | 230.0 | 250.0 | 1.000 |
Mean | 4614 | 4200 | 4229 | 4200 | 4277 | 4603 | 4358 | 0.4509 |
Std. deviation | 14.67 | 14.12 | 15.30 | 18.62 | 15.35 | 25.73 | 32.01 | 0.4976 |
Std. error of mean | 0.1226 | 0.1181 | 0.1279 | 0.1557 | 0.1284 | 0.2151 | 0.2676 | 0.004161 |
Sum | 65999470 | 60071740 | 60495390 | 60073290 | 61176090 | 65842840 | 62338180 | 6449 |

Each dataset is subdivided at random into three equal-sized segments: training, validation, and test. During the learning phase, the training set is used to fit the KNN classifier. The validation set is used for testing when determining the fitness function of a particular solution. The data are normalized to ensure that all features are contained within the same limits and are handled equally by the machine learning model. One of the simplest methods for scaling data is the min-max scaler, which scales and bounds data features between 0 and 1.
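The preprocessing steps described above can be sketched in plain Python as follows. This is an illustrative sketch rather than the authors' pipeline: the function names are hypothetical, the value of `k`, the use of squared Euclidean distance, and the choice to fit the scaler on the training segment only are assumptions, and the validation error of the KNN classifier is used here as a stand-in for the fitness of a channel mask.

```python
import random


def split_three(rows, rng):
    """Shuffle the rows and cut them into three equal-sized segments.

    Any remainder (at most two rows) is dropped so the segments stay equal.
    """
    idx = list(range(len(rows)))
    rng.shuffle(idx)
    third = len(idx) // 3
    parts = (idx[:third], idx[third:2 * third], idx[2 * third:3 * third])
    return [[rows[i] for i in part] for part in parts]


def fit_min_max(features):
    """Return per-column minima and maxima of a list of feature rows."""
    cols = list(zip(*features))
    return [min(c) for c in cols], [max(c) for c in cols]


def apply_min_max(features, lo, hi):
    """Scale each column to [0, 1] using the fitted minima and maxima."""
    return [
        [(v - l) / (h - l) if h > l else 0.0 for v, l, h in zip(row, lo, hi)]
        for row in features
    ]


def knn_error(train_x, train_y, val_x, val_y, mask, k=5):
    """Validation error of a KNN classifier restricted to selected columns."""
    selected = [j for j, bit in enumerate(mask) if bit]
    if not selected:
        return 1.0  # an empty channel subset is treated as the worst case
    errors = 0
    for x, y in zip(val_x, val_y):
        # Squared Euclidean distance over the selected channels only.
        dists = sorted(
            (sum((x[j] - t[j]) ** 2 for j in selected), label)
            for t, label in zip(train_x, train_y)
        )
        votes = [label for _, label in dists[:k]]
        predicted = max(set(votes), key=votes.count)
        errors += predicted != y
    return errors / len(val_x)
```

Fitting the scaler on the training segment alone keeps information from the validation and test segments out of the fitted minima and maxima. As a consequence, scaled validation values can fall slightly outside the interval from 0 to 1.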

The evaluation metrics of the suggested method and compared algorithms are shown in

Metric | Value |
---|---|
Average error | |
Average select size | |
Average fitness | |
Best fitness | |
Worst fitness | |
Standard deviation | |
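As a rough illustration of how the metrics listed above can be computed over repeated runs of an optimizer, consider the following sketch. The definitions are assumptions based on common usage, since the exact formulas are not reproduced here: the fitness is taken to be minimized, the select size is expressed as the fraction of selected features, and the sample standard deviation (dividing by the number of runs minus one) is used. The function name `summarize_runs` is hypothetical.

```python
import statistics


def summarize_runs(errors, masks, fitnesses):
    """Summarize repeated optimizer runs.

    errors    -- classification error of the final solution in each run
    masks     -- final 0/1 feature mask of each run
    fitnesses -- final fitness value of each run (lower is better)
    """
    n_features = len(masks[0])
    return {
        "average_error": statistics.mean(errors),
        # Fraction of features kept, averaged over the runs.
        "average_select_size": statistics.mean(
            sum(m) / n_features for m in masks
        ),
        "average_fitness": statistics.mean(fitnesses),
        "best_fitness": min(fitnesses),
        "worst_fitness": max(fitnesses),
        "standard_deviation": statistics.stdev(fitnesses),
    }
```

Each argument holds one entry per run, so ten runs of an algorithm yield three lists of length ten.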

The experimental results for the two tested datasets, D1 and D2, based on the suggested and compared methods are shown in

Metric | Dataset | bSFS-Guided WOA | bGWO | bGA | bWOA | bPSO |
---|---|---|---|---|---|---|
Average error | D1 | 0.161956522 | 0.165217391 | 0.163478261 | 0.165652174 | 0.162173913 |
Average error | D2 | 0.027467811 | 0.027682403 | 0.032618026 | 0.028540773 | 0.030257511 |
Average select size | D1 | 0.385714286 | 0.5 | 0.553571429 | 0.542857143 | 0.542857143 |
Average select size | D2 | 0.61875 | 0.76875 | 0.7625 | 0.775 | 0.71875 |
Average fitness | D1 | 0.161956522 | 0.165217391 | 0.163478261 | 0.165652174 | 0.162173913 |
Average fitness | D2 | 0.027467811 | 0.027682403 | 0.032618026 | 0.028540773 | 0.030257511 |
Best fitness | D1 | 0.006308489 | 0.009128255 | 0.006371262 | 0.012126419 | 0.010375499 |
Best fitness | D2 | 0.002920846 | 0.002945634 | 0.010175433 | 0.00544218 | 0.006746602 |
Worst fitness | D1 | 0.134782609 | 0.152173913 | 0.152173913 | 0.139130435 | 0.139130435 |
Worst fitness | D2 | 0.021459227 | 0.021459227 | 0.021459227 | 0.025751073 | 0.025751073 |
Standard deviation | D1 | 0.173913043 | 0.173913043 | 0.173913043 | 0.2 | 0.182608696 |
Standard deviation | D2 | 0.034334764 | 0.034334764 | 0.055793991 | 0.0472103 | 0.055793991 |

ANOVA and Wilcoxon signed-rank tests are performed to confirm the performance of the suggested method compared to the other algorithms.

Source of variation | SS | DF | MS | F (DFn, DFd) | P value |
---|---|---|---|---|---|
Treatment (between columns) | 0.00012 | 4 | 2.99E−05 | F (4, 45) = 33.25 | P < 0.0001 |
Residual (within columns) | 4.05E−05 | 45 | 9E−07 | - | - |
Total | 0.00016 | 49 | - | - | - |

Source of variation | SS | DF | MS | F (DFn, DFd) | P value |
---|---|---|---|---|---|
Treatment (between columns) | 0.0002 | 4 | 5.01E−05 | F (4, 45) = 69.76 | P < 0.0001 |
Residual (within columns) | 3.23E−05 | 45 | 7.18E−07 | - | - |
Total | 0.000233 | 49 | - | - | - |
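The quantities in the ANOVA tables follow the usual one-way layout: each mean square is a sum of squares divided by its degrees of freedom, and F is the ratio of the treatment mean square to the residual mean square. With the displayed values, 2.99E−05 / 9E−07 is approximately 33.2 and 5.01E−05 / 7.18E−07 is approximately 69.8, which agrees with the reported F statistics up to rounding. The sketch below computes these quantities from raw groups. It is a minimal illustration with a hypothetical function name, and it omits the P value, which requires the F distribution.

```python
def one_way_anova(groups):
    """Compute the one-way ANOVA table entries for a list of groups."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]

    # Between-group (treatment) and within-group (residual) sums of squares.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means)
    )
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)

    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {
        "ss_between": ss_between,
        "ss_within": ss_within,
        "df_between": df_between,
        "df_within": df_within,
        "ms_between": ms_between,
        "ms_within": ms_within,
        "F": ms_between / ms_within,
    }
```

Five groups of ten values each give 4 and 45 degrees of freedom, matching the F (4, 45) entries in the tables.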

Statistic | bSFS-Guided WOA | bGWO | bGA | bWOA | bPSO |
---|---|---|---|---|---|
Theoretical median | 0 | 0 | 0 | 0 | 0 |
Actual median | 0.162 | 0.1652 | 0.1635 | 0.1657 | 0.1622 |
Number of values | 10 | 10 | 10 | 10 | 10 |
Wilcoxon signed rank test | | | | | |
Sum of signed ranks (W) | 55 | 55 | 55 | 55 | 55 |
Sum of positive ranks | 55 | 55 | 55 | 55 | 55 |
Sum of negative ranks | 0 | 0 | 0 | 0 | 0 |
P value (two tailed) | 0.002 | 0.002 | 0.002 | 0.002 | 0.002 |
Exact or estimate? | Exact | Exact | Exact | Exact | Exact |
P value summary | ** | ** | ** | ** | ** |
Significant (alpha = 0.05)? | Yes | Yes | Yes | Yes | Yes |
How big is the discrepancy? | | | | | |
Discrepancy | 0.162 | 0.1652 | 0.1635 | 0.1657 | 0.1622 |

Statistic | bSFS-Guided WOA | bGWO | bGA | bWOA | bPSO |
---|---|---|---|---|---|
Theoretical median | 0 | 0 | 0 | 0 | 0 |
Actual median | 0.02747 | 0.02768 | 0.03262 | 0.02854 | 0.03026 |
Number of values | 10 | 10 | 10 | 10 | 10 |
Wilcoxon signed rank test | | | | | |
Sum of signed ranks (W) | 55 | 55 | 55 | 55 | 55 |
Sum of positive ranks | 55 | 55 | 55 | 55 | 55 |
Sum of negative ranks | 0 | 0 | 0 | 0 | 0 |
P value (two tailed) | 0.002 | 0.002 | 0.002 | 0.002 | 0.002 |
Exact or estimate? | Exact | Exact | Exact | Exact | Exact |
P value summary | ** | ** | ** | ** | ** |
Significant (alpha = 0.05)? | Yes | Yes | Yes | Yes | Yes |
How big is the discrepancy? | | | | | |
Discrepancy | 0.02747 | 0.02768 | 0.03262 | 0.02854 | 0.03026 |
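The entries in the Wilcoxon tables can be reproduced in form with a small exact one-sample signed-rank computation. With ten values that all lie above the theoretical median of 0, the ranks 1 to 10 are all positive, so the sum of positive ranks is 55 and the exact two-tailed P value is 2/2^10, about 0.002. The sketch below is a minimal illustration under stated assumptions: the function name is hypothetical, tied absolute differences receive average ranks, and the exact P value is obtained by enumerating all sign assignments, which is practical only for small samples. The per-run error values are not listed in the tables, so any input used with it is illustrative.

```python
from itertools import product


def wilcoxon_signed_rank_exact(values, theoretical_median=0.0):
    """Return (W+, W-, exact two-tailed p) for a one-sample signed-rank test."""
    # Values equal to the theoretical median carry no sign and are dropped.
    diffs = [v - theoretical_median for v in values if v != theoretical_median]
    n = len(diffs)

    # Rank the absolute differences, giving tied values their average rank.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1

    w_pos = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_neg = sum(r for r, d in zip(ranks, diffs) if d < 0)
    observed = min(w_pos, w_neg)
    total = sum(ranks)

    # Under the null hypothesis every sign pattern is equally likely.
    extreme = 0
    for signs in product((0, 1), repeat=n):
        pos = sum(r for r, s in zip(ranks, signs) if s)
        if min(pos, total - pos) <= observed:
            extreme += 1
    return w_pos, w_neg, extreme / 2 ** n
```

The sum of signed ranks reported in the tables corresponds to the difference between the first two returned values.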

The average error of the suggested (bSFS-Guided WOA) and compared algorithms (bPSO, bWOA, bGA and bGWO) over the two tested datasets (D1 and D2) is shown in

In this work, the Guided Whale Optimization Method (Guided WOA) based on the Stochastic Fractal Search (SFS) technique is used to evaluate the chosen subset of channels for EEG datasets and to select the optimum EEG channels for use in Brain-Computer Interfaces (BCIs). The SFS-Guided WOA algorithm is superior in terms of performance metrics, and statistical tests such as ANOVA and Wilcoxon rank-sum are used to demonstrate this. The results for the two tested datasets, based on the suggested and compared methods (GWO, GA, WOA, and PSO algorithms), show the quality of the recommended method. The average error and average select size of the presented bSFS-Guided WOA algorithm, compared with bPSO, bWOA, bGA, and bGWO, confirm its performance over the tested datasets. Other metrics, such as average, best, and worst fitness and standard deviation, also show the quality of the suggested method compared to other optimization techniques. Residual, homoscedasticity, and QQ plots and a heat map of the suggested and compared algorithms are also examined for the two datasets. In future work, the recommended method will be tested on other datasets.

The authors thank Taif University Accessibility Center for the study participants. We deeply acknowledge Taif University for supporting this study through Taif University Researchers Supporting Project Number (TURSP-2020/150), Taif University, Taif, Saudi Arabia.