Optimized Artificial Neural Network Techniques to Improve Cybersecurity of Higher Education Institution

Abstract: Education plays an important part in economic growth and the improvement of human welfare. The education sector has transformed considerably in recent years, and Information and Communication Technology (ICT) has become an integral part of it. Almost every activity in universities and colleges, from counselling to admissions and fee deposits, has been automated. Attendance records, quizzes, evaluations, and mark and grade submissions all rely on ICT. Therefore, cybersecurity is essential in higher education institutions (HEIs). In this view, this study develops an Automated Outlier Detection for CyberSecurity in Higher Education Institutions (AOD-CSHEI) technique. The AOD-CSHEI technique intends to determine the presence of intrusions or attacks in HEIs. It first performs data pre-processing in two stages, namely data conversion and class labelling. In addition, the Adaptive Synthetic (ADASYN) technique is exploited for the removal of outliers in the data. Besides, the sparrow search algorithm (SSA) with a deep neural network (DNN) model is used to classify the data according to the existence or absence of intrusions in the HEI network. Finally, the SSA is utilized to effectively adjust the hyperparameters of the DNN. To showcase the enhanced performance of the AOD-CSHEI technique, a set of simulations is carried out on three benchmark datasets, and the results show the improved efficiency of the AOD-CSHEI technique over the compared methods, with a highest accuracy of 0.9997.


Introduction
The network environment of education institutions is difficult to control, with many types of users such as residents, researchers, students, and faculty [1]. There have been several incidents in which information held by education institutions was the target of hacking attempts. Education institutions have taken several measures to control suspicious traffic. A novel attack takes advantage of a computer vulnerability for which no solution exists at present. Such attacks are hard to identify through both reactive and proactive security methods. There are two techniques for detecting attacks: anomaly detection and signature-based detection. Signature-based detection depends on matching attack patterns against signatures saved in a repository [2]. This technique is not effective against attacks for which no signature is available. In anomaly detection, a standard profile pattern is maintained and any deviation or abnormality from this pattern is reported. Anomaly detection can identify novel attacks; however, it leads to a higher number of false positives. Both approaches require heavy human effort to update the signature repository and the standard profiles. This is a time-consuming and expensive procedure [3]. The update speed is slow compared with the speed at which new intrusions appear. Discovering novel attacks requires defenders to be on guard; however, this is impossible for automatically interfaced systems. Some type of automated defense method is needed to prevent these attacks. Automated signature generation and attack detection schemes help intrusion detection systems (IDS) to capture and report such attacks. No single approach can resolve this issue. An integration of methods, such as signature generation algorithms, honeypots, IDS, analysis, and tracking, is required.
Network Intrusion Detection Systems (NIDS) have advanced rapidly in industry and academia in response to the growing number of cyberattacks against commercial enterprises and governments worldwide. The yearly cost of cybercrime keeps rising [4]. The most disruptive cybercrimes result from denial of service, web-based attacks, and malicious insiders. Organizations can lose intellectual property when malicious software creeps into the network, which might also disrupt a country's critical national infrastructure. Organizations deploy antivirus software, firewalls, and NIDS to secure computer systems against unauthorized access [5]. One focus area for countering cyberattacks quickly is identifying the attack method early using a NIDS. A NIDS is designed to detect malicious activity, including distributed denial of service (DDoS), virus, and worm attacks. The crucial factors for a NIDS are reliability, anomaly detection speed, and accuracy. To fulfill the requirements of an IDS, researchers have explored the possibility of utilizing machine learning (ML) and deep learning (DL) methods [6]. Both techniques fall under artificial intelligence (AI) and aim to learn useful information from big data. They have received much recognition in the field of network security [7] in the past few years because of the development of graphics processing units (GPU). Both are effective tools for learning important features from network traffic and predicting normal and abnormal events on the basis of the learned patterns [8]. ML-based IDS rely heavily on feature engineering to learn important information from network traffic. Meanwhile, DL-based IDS do not depend on feature engineering and are good at learning complicated features automatically from raw data because of their deep architecture.
Vinayakumar et al. [9] describe how sequential data modelling is relevant to cybersecurity, since sequences carry temporal features implicitly or explicitly. The recurrent neural network (RNN) is a class of artificial neural network (ANN) that has emerged as a principled, powerful method for learning dynamic temporal behavior in arbitrary-length, large-scale sequence data. Moreover, the stacked RNN (S-RNN) is able to quickly learn complicated temporal behavior, including sparse representations. Agarwal et al. [10] identified certain factors that make it difficult for an IDS to monitor and detect web-based attacks. Their study also presents a complete review of current detection systems developed exclusively for observing web traffic. Moreover, it recognizes different dimensions for comparing IDS from distinct perspectives based on functionality and design. The authors also presented a conceptual architecture of a web-based IDS with a prevention method, offering systematic guidelines for system implementation.
Zhou et al. [11] present an IDS method based on ensemble learning and feature selection. Initially, a heuristic method named correlation-based feature selection (CFS)-bat algorithm (BA) is presented for dimensionality reduction, which chooses the optimum feature subset on the basis of correlations among the features. Next, an ensemble model is presented that integrates the C4.5, random forest (RF), and Forest by Penalizing Attributes (Forest PA) algorithms. Akashdeep et al. [12] developed a smart technique that implements feature ranking based on information gain and correlation. Feature reduction is then performed by combining the ranks obtained from information gain and correlation with a method for identifying useful and useless features. The reduced feature set is then given to a feed forward neural network (FFNN) model for training and testing on the KDD99 dataset.
Jin et al. [13] designed an IDS called SwiftIDS, which is able to analyse huge volumes of traffic data in high-speed networks in a timely manner while keeping acceptable detection performance. SwiftIDS accomplishes this aim through two techniques. One is that the light gradient boosting machine (LightGBM) is adopted as the intrusion detection algorithm for handling the huge volume of traffic data. Li et al. [14] present an effective DL method, an autoencoder (AE)-IDS based on the random forest (RF) technique. This approach builds the training set with feature selection (FS) and feature grouping. Once the training process is complete, the method predicts the results with the AE, which significantly decreases the detection time and effectively enhances the predictive performance.
This study presents a novel automated outlier detection technique for cybersecurity in higher education institutions (HEIs), named the AOD-CSHEI technique. The AOD-CSHEI technique initially performs data pre-processing in two stages, namely data conversion and class labelling. Also, the Adaptive Synthetic (ADASYN) technique is exploited for the removal of outliers in the data. Further, the sparrow search algorithm (SSA) with a DNN model is used to classify the data according to the existence or absence of intrusions in the HEI network. Lastly, the SSA is utilized to effectively adjust the hyperparameters of the DNN. To demonstrate the improved outcomes of the AOD-CSHEI technique, a wide-ranging experimental analysis is carried out using three benchmark datasets.
The remainder of the paper is organized as follows. Section 2 elaborates the proposed model, Section 3 offers the performance validation, and Section 4 draws the conclusion.

The Proposed AOD-CSHEI Technique
This study presents a new AOD-CSHEI technique to identify the presence of intrusions or attacks in HEIs; the overall process is given in Fig. 1. The AOD-CSHEI technique performs several subprocesses, namely pre-processing, ADASYN based outlier detection, DNN based classification, and SSA based hyperparameter tuning. At the initial stage, the input data is pre-processed in two stages, namely data conversion and class labelling. The SSA with DNN model is then used to classify the data according to the existence or absence of intrusions in the HEI network, with the SSA adjusting the hyperparameters of the DNN model.
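The two pre-processing stages can be illustrated with a minimal Python sketch. The paper does not specify the encoding scheme, so the integer coding of categorical fields, the helper names `convert_records` and `label_classes`, and the toy records below are all assumptions made for illustration.

```python
def convert_records(records, categorical_cols):
    """Data conversion: map categorical fields to integer codes, cast the rest to float."""
    codebooks = {col: {} for col in categorical_cols}
    converted = []
    for rec in records:
        row = []
        for col, value in enumerate(rec):
            if col in codebooks:
                book = codebooks[col]
                # Assign the next unused integer code the first time a value is seen.
                row.append(float(book.setdefault(value, len(book))))
            else:
                row.append(float(value))
        converted.append(row)
    return converted, codebooks


def label_classes(raw_labels, normal_name="normal"):
    """Class labelling: 0 for normal traffic, 1 for any attack type."""
    return [0 if lab == normal_name else 1 for lab in raw_labels]


# Hypothetical connection records: duration, protocol, service, bytes sent.
records = [
    [0, "tcp", "http", 181],
    [2, "udp", "dns", 44],
    [0, "tcp", "ftp", 300],
]
raw_labels = ["normal", "neptune", "normal"]

X, codebooks = convert_records(records, categorical_cols={1, 2})
y = label_classes(raw_labels)
```

Integer coding is only one possible conversion; one-hot encoding would avoid imposing an artificial order on protocol or service names.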

ADASYN Based Outlier Detection
During the outlier removal process, the ADASYN technique receives the pre-processed data as input to eradicate the outliers that exist in it. The fundamental concept of the ADASYN technique is to define the weight distribution of the minority samples according to how difficult each minority sample is to learn [15]. For binary classification problems, the dataset $D_{tr}$ of $m$ samples is formulated as $\{x_i, y_i\}$, $i = 1, \ldots, m$, in which $x_i$ is a sample in the $n$-dimensional feature space $X$ and $y_i \in Y = \{-1, 1\}$ is the label of sample $x_i$. The number of majority samples is $m_l$, and the number of minority samples is $m_s$. $d_{th}$ is the threshold on the degree of class imbalance; when $d_{th} = 1$, the dataset is accepted only when the number of samples in the two classes is equal. $\beta \in [0, 1]$ is a variable that sets the desired degree of balance of the dataset after the synthetic samples are created. When $\beta = 1$, the dataset after sample generation is completely balanced [16], i.e., the number of samples in the two classes is equal. $K$ denotes the number of nearest neighbours used in the KNN search. The generated sample set $S$ returned by the approach is merged with the original dataset $D_{tr}$ into a new dataset that serves as the training set. This approach creates more new instances in the regions where the minority samples are difficult to learn, which reinforces the model's learning of the minority class and therefore enhances its recognition rate on minority samples.
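The sample-generation step described above can be sketched in plain Python. This is a minimal illustration of the standard ADASYN procedure under stated assumptions, not the authors' implementation: label 1 is taken to be the minority class, distances are Euclidean, and the function names and toy data are hypothetical.

```python
import math
import random


def k_nearest(point, candidates, k):
    """Indices of the k candidates closest to `point` (Euclidean distance)."""
    order = sorted(range(len(candidates)),
                   key=lambda j: math.dist(point, candidates[j]))
    return order[:k]


def adasyn(X, y, beta=1.0, k=3, d_th=1.0, seed=0):
    """Return the synthetic minority set S; labels are in {-1, 1}, 1 = minority (assumed)."""
    rng = random.Random(seed)
    minority = [x for x, lab in zip(X, y) if lab == 1]
    m_s, m_l = len(minority), len(X) - len(minority)
    if m_s == 0 or m_l == 0 or m_s / m_l >= d_th:
        return []  # nothing to do: imbalance is within the tolerated threshold
    G = int(round((m_l - m_s) * beta))  # total number of samples to create
    # r_i: share of majority samples among the k nearest neighbours of x_i.
    ratios = []
    for x in minority:
        others = [(p, lab) for p, lab in zip(X, y) if p is not x]
        idx = k_nearest(x, [p for p, _ in others], k)
        ratios.append(sum(1 for j in idx if others[j][1] == -1) / k)
    total = sum(ratios)
    if total == 0:
        return []
    synthetic = []
    for x, r in zip(minority, ratios):
        g_i = int(round(r / total * G))  # harder samples receive more synthetic points
        peers = [p for p in minority if p is not x]
        if not peers:
            continue
        neigh = k_nearest(x, peers, min(k, len(peers)))
        for _ in range(g_i):
            z = peers[rng.choice(neigh)]
            lam = rng.random()
            # Interpolate between x and a minority neighbour z.
            synthetic.append([a + lam * (b - a) for a, b in zip(x, z)])
    return synthetic


# Toy data: six majority points (-1) and two minority points (1).
X = [[0.0, 0.0], [0.0, 0.2], [0.2, 0.0], [0.1, 0.1],
     [0.8, 1.0], [1.4, 1.0], [1.0, 1.0], [1.2, 1.0]]
y = [-1, -1, -1, -1, -1, -1, 1, 1]
S = adasyn(X, y, beta=1.0, k=3)
```

With $\beta = 1$ the sketch requests $m_l - m_s = 4$ synthetic points here, and each one lies on the segment between the two minority samples.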

DNN Based Classification Model
At this stage, the DNN model is executed to determine the presence of intrusions or attacks in the HEIs. The DNN is a network model based on the DL approach. It is extensively applied in image classification, computational biology, and signal prediction because of its simple structure and ease of understanding. The internal architecture of the DNN comprises input, hidden, and output layers, and each layer is fully connected. The input layer has $m$ neurons, and $w$ and $b$ denote the weights and biases, respectively [17]. The gradient backpropagation method is employed to update the parameters of the DNN, namely the bias $b$ and weight $w$ of every connection layer. During network training there is an unavoidable error between the network output and the label of the input sample, and this error drives the updates. Before the DNN begins to train, several settings need to be fixed: the network model parameters (the number of neurons in the input layer, the hidden layers, and the output layer, and the activation function), the number of epochs, the momentum, the batch size, and the initial learning rate.
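The forward pass and the backpropagation update can be illustrated with a small fully connected network in plain Python. This is a sketch under stated assumptions rather than the architecture used in the study: a single tanh hidden layer, a sigmoid output with cross-entropy loss, per-sample gradient descent without momentum, and toy two-feature data.

```python
import math
import random


def init_layer(n_in, n_out, rng):
    """Weights w (n_out x n_in) drawn uniformly at random and biases b set to zero."""
    w = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return w, [0.0] * n_out


def forward(x, params):
    """Return hidden activations and the predicted probability of an intrusion."""
    (w1, b1), (w2, b2) = params
    h = [math.tanh(sum(wi * xi for wi, xi in zip(row, x)) + b)
         for row, b in zip(w1, b1)]
    z = sum(wi * hi for wi, hi in zip(w2[0], h)) + b2[0]
    return h, 1.0 / (1.0 + math.exp(-z))


def loss(data, params):
    """Mean binary cross-entropy between outputs and labels."""
    total = 0.0
    for x, y in data:
        p = min(max(forward(x, params)[1], 1e-12), 1 - 1e-12)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(data)


def train(data, params, lr=0.5, epochs=200):
    """Per-sample gradient descent with backpropagation; updates params in place."""
    (w1, b1), (w2, b2) = params
    for _ in range(epochs):
        for x, y in data:
            h, p = forward(x, params)
            delta_out = p - y  # dLoss/dz for sigmoid output with cross-entropy
            # Propagate the error to the hidden layer before w2 is modified.
            delta_h = [delta_out * w2[0][j] * (1 - h[j] ** 2) for j in range(len(h))]
            for j in range(len(h)):
                w2[0][j] -= lr * delta_out * h[j]
                for i in range(len(x)):
                    w1[j][i] -= lr * delta_h[j] * x[i]
                b1[j] -= lr * delta_h[j]
            b2[0] -= lr * delta_out
    return params


rng = random.Random(0)
params = (init_layer(2, 3, rng), init_layer(3, 1, rng))
# Toy samples: label 0 = normal, 1 = intrusion (illustrative values only).
data = [([0.1, 0.2], 0), ([0.2, 0.1], 0), ([0.9, 0.8], 1), ([0.8, 0.9], 1)]
loss_before = loss(data, params)
train(data, params)
loss_after = loss(data, params)
```

The learning rate, number of epochs, and hidden layer width are exactly the kind of settings that the hyperparameter tuning stage is meant to choose.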

SSA Based Hyperparameter Tuning Process
To boost the efficacy of the DNN, the SSA is applied to tune its hyperparameters. Sparrows are common, gregarious birds that live close to humans. In the algorithm, virtual sparrows are used to search for food sources. The positions of the sparrows are represented as follows:

$$X = \begin{bmatrix} x_{1,1} & x_{1,2} & \cdots & x_{1,d} \\ x_{2,1} & x_{2,2} & \cdots & x_{2,d} \\ \vdots & \vdots & \ddots & \vdots \\ x_{n,1} & x_{n,2} & \cdots & x_{n,d} \end{bmatrix}$$

in which $n$ indicates the number of sparrows and $d$ denotes the dimension of the parameters to be tuned. The fitness values of all sparrows are expressed as follows:

$$F_X = \begin{bmatrix} f([x_{1,1} \; x_{1,2} \; \cdots \; x_{1,d}]) \\ f([x_{2,1} \; x_{2,2} \; \cdots \; x_{2,d}]) \\ \vdots \\ f([x_{n,1} \; x_{n,2} \; \cdots \; x_{n,d}]) \end{bmatrix}$$

where the value in each row of $F_X$ is the fitness value of the corresponding individual. In SSA, the producers with better fitness values have priority in obtaining food during the search [18]. The producers are also responsible for searching for food and guiding the movement of the entire population.
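The producer position is updated as follows:

$$X_{i,j}^{t+1} = \begin{cases} X_{i,j}^{t} \cdot \exp\left(\dfrac{-i}{\alpha \cdot iter_{max}}\right), & R_2 < ST \\ X_{i,j}^{t} + Q \cdot L, & R_2 \geq ST \end{cases}$$

In the equation, $t$ denotes the current iteration and $j = 1, 2, \ldots, d$. $X_{i,j}^{t}$ is the value of the $j$th parameter of the $i$th sparrow at iteration $t$. $iter_{max}$ is a constant giving the maximum number of iterations, and $\alpha \in (0, 1]$ is a random number. $R_2$ ($R_2 \in [0, 1]$) and $ST$ ($ST \in [0.5, 1.0]$) denote the alarm value and the safety threshold, respectively. $Q$ is a random number following the normal distribution, and $L$ is a $1 \times d$ matrix in which every element is 1. Fig. 2 illustrates the flowchart of SSA.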
1. When $R_2 < ST$, no predator is present and the producer enters a wide search mode.
2. When $R_2 \geq ST$, some sparrows have detected a predator, and all sparrows should fly quickly to a safe place.

The scroungers must also follow rules (1) and (2). If a scrounger wins the competition for food against a producer, it obtains the producer's food immediately; otherwise, it continues to follow rules (1) and (2). The scrounger position is updated as follows:

$$X_{i,j}^{t+1} = \begin{cases} Q \cdot \exp\left(\dfrac{X_{worst}^{t} - X_{i,j}^{t}}{i^{2}}\right), & i > n/2 \\ X_{P}^{t+1} + \left| X_{i,j}^{t} - X_{P}^{t+1} \right| \cdot A^{+} \cdot L, & \text{otherwise} \end{cases}$$

In the equation, $X_P$ denotes the optimal position occupied by the producers, $X_{worst}$ is the current global worst position, $A$ is a $1 \times d$ matrix in which each element is randomly assigned 1 or $-1$, and $A^{+} = A^{T}(AA^{T})^{-1}$. When $i > n/2$, the $i$th scrounger, which has a worse fitness value, is most likely to be starving. In the simulation, the sparrows that are aware of danger are assumed to account for 10% to 20% of the total population. Their initial positions are generated randomly within the population. According to these rules, their positions are updated as follows:

$$X_{i,j}^{t+1} = \begin{cases} X_{best}^{t} + \beta \cdot \left| X_{i,j}^{t} - X_{best}^{t} \right|, & f_i > f_g \\ X_{i,j}^{t} + K \cdot \left( \dfrac{\left| X_{i,j}^{t} - X_{worst}^{t} \right|}{(f_i - f_w) + \varepsilon} \right), & f_i = f_g \end{cases}$$

in which $X_{best}$ indicates the current global optimal position, $\beta$ is the step size control variable, drawn from a normal distribution of random numbers with mean 0 and variance 1, and $K \in [-1, 1]$ is a random number. $f_i$ is the fitness value of the current sparrow, and $f_g$ and $f_w$ are the current global best and worst fitness values, respectively. $\varepsilon$ is a small constant used to avoid division by zero. When $f_i > f_g$, the sparrow is at the edge of the swarm; $X_{best}$ represents the center of the population, and the area around it is safe. When $f_i = f_g$, a sparrow in the middle of the population is aware of the danger and moves closer to the other sparrows; $K$ describes the direction in which the sparrow moves.
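The three update rules of the SSA can be sketched as a compact minimiser in plain Python. This is an illustrative sketch under stated assumptions, not the authors' tuning code: the population size, the 20% producer and danger-aware fractions, the box bounds with clipping, and the stand-in fitness function `validation_error` (a hypothetical smooth surrogate for the DNN validation error over learning rate and momentum) are all chosen for illustration.

```python
import math
import random


def ssa_minimize(fitness, dim, lower, upper, n=20, iter_max=100,
                 producer_frac=0.2, aware_frac=0.2, ST=0.8, seed=0):
    """Minimise `fitness` over [lower, upper]^dim with a basic sparrow search."""
    rng = random.Random(seed)
    eps = 1e-50  # small constant that avoids division by zero

    def clip(v):
        return min(max(v, lower), upper)

    X = [[rng.uniform(lower, upper) for _ in range(dim)] for _ in range(n)]
    n_prod = max(1, int(n * producer_frac))
    n_aware = max(1, int(n * aware_frac))
    best_x, best_f, history = None, float("inf"), []

    for _ in range(iter_max):
        X.sort(key=fitness)  # rank sparrows, best (lowest) fitness first
        f_g, f_w = fitness(X[0]), fitness(X[-1])
        x_best, x_worst = X[0][:], X[-1][:]
        if f_g < best_f:
            best_x, best_f = x_best[:], f_g
        history.append(best_f)

        # Producers: wide search when safe (R2 < ST), random move otherwise.
        R2 = rng.random()
        for i in range(n_prod):
            if R2 < ST:
                alpha = 1.0 - rng.random()  # uniform in (0, 1]
                shrink = math.exp(-(i + 1) / (alpha * iter_max))
                X[i] = [clip(v * shrink) for v in X[i]]
            else:
                Q = rng.gauss(0.0, 1.0)
                X[i] = [clip(v + Q) for v in X[i]]
        x_p = min(X[:n_prod], key=fitness)[:]  # best producer position X_P

        # Scroungers: the hungriest half fly elsewhere, the rest follow X_P.
        for i in range(n_prod, n):
            if i + 1 > n / 2:
                Q = rng.gauss(0.0, 1.0)
                X[i] = [clip(Q * math.exp((w - v) / (i + 1) ** 2))
                        for v, w in zip(X[i], x_worst)]
            else:
                # |X - X_P| * A^+ * L collapses to one scalar step per sparrow.
                step = sum(rng.choice((-1, 1)) * abs(v - p)
                           for v, p in zip(X[i], x_p)) / dim
                X[i] = [clip(p + step) for p in x_p]

        # Danger-aware sparrows: a random subset of the population.
        for i in rng.sample(range(n), n_aware):
            f_i = fitness(X[i])
            if f_i > f_g:
                beta = rng.gauss(0.0, 1.0)
                X[i] = [clip(b + beta * abs(v - b)) for v, b in zip(X[i], x_best)]
            elif f_i == f_g:
                K = rng.uniform(-1.0, 1.0)
                X[i] = [clip(v + K * abs(v - w) / ((f_i - f_w) + eps))
                        for v, w in zip(X[i], x_worst)]

    x_last = min(X, key=fitness)  # evaluate the final population as well
    if fitness(x_last) < best_f:
        best_x, best_f = x_last[:], fitness(x_last)
    return best_x, best_f, history


def validation_error(h):
    """Hypothetical stand-in for the DNN validation error at (learning rate, momentum)."""
    lr, momentum = h
    return (lr - 0.3) ** 2 + (momentum - 0.9) ** 2


best_h, best_err, history = ssa_minimize(validation_error, dim=2, lower=0.0, upper=1.0)
```

In an actual tuning run, each fitness evaluation would train and validate a DNN with the candidate hyperparameters, so the population size and iteration count directly determine the computational cost.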
Opposition-based learning (OBL) is a powerful mechanism used in optimization to increase the convergence speed of different metaheuristic approaches [19]. The OBL model evaluates the current population together with its opposite population in the same round in order to select the better candidates for the given problem. OBL has been applied effectively in prior work, and defining it requires the concept of an opposite value.
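The opposite value can be made concrete with a short Python sketch. It assumes the usual OBL definition, in which the opposite of a value $x \in [a, b]$ is $a + b - x$, applied component-wise; the selection helper `obl_select` and the toy population are hypothetical and only show how a population and its opposites might be compared in one round.

```python
def opposite(x, lower, upper):
    """Opposite point of x inside the box [lower, upper], component-wise a + b - x."""
    return [lo + hi - v for v, lo, hi in zip(x, lower, upper)]


def obl_select(population, fitness, lower, upper):
    """Keep the fittest candidates among the population and its opposites (minimisation)."""
    candidates = population + [opposite(x, lower, upper) for x in population]
    return sorted(candidates, key=fitness)[:len(population)]


lower, upper = [0.0, 0.0], [1.0, 10.0]
population = [[0.25, 7.0], [0.9, 1.0]]
sphere = lambda x: sum(v * v for v in x)  # illustrative fitness function
selected = obl_select(population, sphere, lower, upper)
```

Here the opposite of `[0.25, 7.0]` is `[0.75, 3.0]`, which has a lower fitness and therefore replaces its original in the selected population.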

Experimental Validation
In this section, the experimental results of the AOD-CSHEI methodology are analysed using three benchmark datasets [20]. A comparative analysis is made with the decision tree (DT), logistic regression (LR), Naïve Bayesian (NB), ANN, support vector machine (SVM), Adaboost, and LightGBM techniques.
Tab. 1 provides a detailed comparative study of the AOD-CSHEI technique with existing techniques on the test NSL-KDD dataset. Fig. 3 offers the accuracy analysis of the AOD-CSHEI technique and existing techniques on the training and testing sets of the NSL-KDD dataset.

Conclusion
This study has presented a new AOD-CSHEI technique to identify the presence of intrusions or attacks in HEIs. The AOD-CSHEI technique performs several subprocesses, namely pre-processing, ADASYN based outlier detection, DNN based classification, and SSA based hyperparameter tuning. In this work, the SSA with DNN model is used to classify the data according to the existence or absence of intrusions in the HEI network, and the SSA is utilized to effectively adjust the hyperparameters of the DNN model. To showcase the enhanced efficacy of the AOD-CSHEI technique, a set of simulations was carried out on three benchmark datasets, and the results showed the improved efficiency of the AOD-CSHEI technique over the compared methods. Therefore, the AOD-CSHEI technique can be utilized as an effective tool for cybersecurity in HEIs. In the future, the AOD-CSHEI technique can be deployed in the online learning process of HEIs.

Figure 4: Time analysis of AOD-CSHEI technique on NSL-KDD dataset

Fig. 5 demonstrates the ROC analysis of the AOD-CSHEI methodology on the NSL-KDD dataset. The figure shows that the AOD-CSHEI technique reached an enhanced outcome, with an ROC of 99.9714. Tab. 2 offers a detailed comparative study of the AOD-CSHEI technique with existing techniques on the test UNSW-NB15 dataset. Fig. 6 provides the accuracy analysis of the AOD-CSHEI approach and existing methods on the training and testing sets of the UNSW-NB15 dataset. The results demonstrate that the NB model exhibited the poorest outcome, with the lowest accuracy values. The ANN, DT, and LR approaches reached somewhat higher accuracy values, and the Adaboost model resulted in moderately increased accuracy values. Although the LightGBM and SVM techniques reached reasonable accuracy values, the proposed AOD-CSHEI technique accomplished the maximum training and testing accuracy of 0.8918 and 0.8852, respectively.

Fig. 11 exhibits the ROC analysis of the AOD-CSHEI approach on the CICIDS2017 dataset. The figure shows that the AOD-CSHEI methodology attained an improved outcome, with an ROC of 99.9904. The above-mentioned result analysis reported the supremacy of the AOD-CSHEI technique over the recent approaches.

Table 1: Result analysis of AOD-CSHEI technique on NSL-KDD dataset

Table 3: Result analysis of AOD-CSHEI technique on CICIDS2017 dataset