Optimized Convolutional Neural Network for Automatic Detection of COVID-19

The outbreak of COVID-19 has affected nations worldwide and poses serious challenges to healthcare systems across the globe. Radiologists use X-ray or Computed Tomography (CT) images to confirm the presence of COVID-19, so image processing techniques play an important role in diagnostic procedures and help healthcare professionals during critical times. The current research work introduces a Multi-objective Black Widow Optimization (MBWO)-based Convolutional Neural Network, i.e., the MBWO-CNN technique, for diagnosis and classification of COVID-19. The MBWO-CNN model involves four steps: preprocessing, feature extraction, parameter tuning, and classification. First, the input images undergo preprocessing followed by CNN-based feature extraction. Then, the MBWO technique is applied to fine-tune the hyperparameters of the CNN. Finally, an Extreme Learning Machine with autoencoder (ELM-AE) is applied as a classifier to confirm the presence of COVID-19 and classify the disease under different class labels. The proposed MBWO-CNN model was validated experimentally and the results were compared with those achieved by existing techniques. The experiments showed that the proposed model with the ELM-AE classifier attained the best classification performance, with an accuracy of 96.43%. These promising results indicate that the model can be applied in diagnosis and classification of COVID-19.


Introduction
The outbreak of a novel coronavirus (nCoV), officially named 'SARS-CoV2', was first identified by researchers in Wuhan, China, in December 2019. Later, this virus was found to be the causative agent of COVID-19, a contagious life-threatening disease. Though the outbreak was first reported in China, it spread quickly to other nations from January 2020 onwards [1]. The World Health Organization (WHO) officially named this disease 'coronavirus disease' in February 2020 and categorized it as a 'pandemic'. Healthcare systems in India and many other nations across the globe experienced enormous pressure in treating their patients. Further, the disease also damaged the economic growth of numerous nations. With 1.58 million positive cases and 34,968 confirmed deaths by the end of July 2020, India is one of the countries hit hardest by this disease. A few common symptoms of COVID-19 are high fever, dry cough, constipation, headache, loss of smell and taste, and breathlessness. To reduce the high mortality rate of this disease, it is important to predict or diagnose it at a very early stage. Further, coronavirus-positive patients have to be quarantined for a certain period of time. Such quarantine is difficult to enforce in developing countries like India due to the limited availability of medical facilities that can handle COVID-19 patients. In this scenario, the Chinese government has stated that Real-Time Polymerase Chain Reaction (RT-PCR) can be used to confirm the presence of COVID-19 [2]. However, RT-PCR is a time-consuming procedure and at times produces high false-negative rates, which complicates the prognosis.
In some instances, virus-affected individuals test negative, which often contributes to an increased fatality rate. Since COVID-19 is an extremely communicable disease, such individuals unknowingly spread the virus to healthy people. The medical reports of infected people, especially chest Computed Tomography (CT) scan images, confirm the occurrence of bilateral changes. Thus, chest CT, a highly sensitive technique, is applied as a secondary confirmation in the prediction and diagnosis of SARS-CoV2 infection. At present, a radiologist must examine the chest CT images obtained from infected individuals. In this scenario, the development of a Deep Learning (DL)-based detection technique becomes necessary so that chest CT scan images can be examined without the help of a radiologist.
Artificial Intelligence (AI) is one of the advanced technologies that has been extensively applied to accelerate biomedical applications. Through DL frameworks, AI is employed in several domains such as image analysis, data categorization, and image segregation. Individuals infected with COVID-19 suffer from pneumonia, since the virus first attacks the respiratory tract and then gradually enters the lungs. Many DL works have been conducted so far with chest X-ray image data. Previously, a number of studies applied three different DL strategies to pneumonia X-ray images: a fine-tuned model, a model without fine-tuning, and a model trained from scratch. In the ResNet approach, the dataset is first classified under several labels such as age and gender. Then, a Multi-Layer Perceptron (MLP) is applied as a classifier since it produces the highest accuracy.
A classification model that used pneumonia data to classify images was proposed in the literature [3]. The study made use of an SVM classifier with InceptionV3 and VGG-16 as DL frameworks. Backpropagation Neural Network (BPNN) and Competitive Neural Network approaches were employed in the literature [4] to classify pneumonia data. Using pneumonia and healthy chest X-ray images, most of the dataset was used as sample data and the presented model was compared with previous CNNs. Finally, a better classification result was achieved by the proposed method.
A DL model trained from scratch was introduced in an earlier study [5] to classify pneumonia data. The proposed model had a convolutional layer, a dense block, and a flatten layer. In this study, the input size was 200 × 200 pixels and the classification probabilities were determined using a sigmoid function. The study accomplished the maximum classification rate for pneumonia from X-ray images. In another earlier study [6], DL methods were developed for three categories of images, namely normal, viral pneumonia, and bacterial pneumonia. Initially, the images were pre-processed to remove noise. Next, an augmentation scheme was applied to all images and a transfer learning technique was used to train all the approaches. Finally, optimal accuracy was attained.
In recent times, researchers have presented imaging patterns of chest CT scans for predicting COVID-19 [7]. COVID-19 can be diagnosed on the basis of a patient's recent travel history, signs, and symptoms. The study [8] reported high sensitivity of chest CT scans in predicting COVID-19 compared to RT-PCR. Bernheim et al. [9] investigated chest CT images of 121 patients from four different regions of China and confirmed the presence of the disease. In general, DL methods are widely employed in the prediction of acute pneumonia. Li et al. [10] deployed a DL approach called 'COVNet' to extract visual features from chest CT for predicting COVID-19. This method also used visual features to distinguish community-acquired pneumonia and other non-pneumonia lung infections. Gozes et al. [11] employed an AI-based CT analysis model to predict and quantify the viral load in COVID-19. The network automatically extracted the opacity portions in the lungs. Consequently, the method achieved the highest sensitivity and specificity values. Further, the system remained robust to variations in pixel spacing and slice thickness.
Shan et al. [12] deployed a DL-based network (VB-net) for automated segmentation of infected lung regions from chest CT scans. Xu et al. [13] proposed a detection approach to differentiate COVID-19 pneumonia from influenza, a viral pneumonia, using DL models. In this research, a CNN model was applied for prediction. Wang et al. [14] examined the radiological changes present in the CT images of patients. A DL-based prediction approach was applied with a modified Inception transfer learning framework. A few features were extracted from the CT images for further analysis. Though the accuracy attained was better than that of Xu's approach, this is a time-consuming method for disease analysis.
Narin et al. [15] proposed an automated CNN transfer learning method to detect COVID-19. The study found that the highest accuracy was achieved by the ResNet50 model. Sethy et al. [16] proposed a DL approach to predict COVID-19 from X-ray images. Deep features were first extracted from the image and then sent to an SVM for classification. Maximum accuracy was accomplished in the implemented approach. From this extended review, it can be concluded that image processing techniques can be applied to chest CT scan images before categorizing COVID-19 affected patients.
The current research article proposes a new COVID-19 diagnosis and classification model using a multi-objective Black Widow Optimization-based Convolutional Neural Network (MBWO-CNN). The proposed MBWO-CNN model detects and classifies COVID-19 through image processing. The technique has four steps: preprocessing, feature extraction, parameter tuning, and classification. Initially, the input images undergo preprocessing followed by CNN-based feature extraction. Then, the multi-objective Black Widow Optimization (MBWO) algorithm is implemented to tune the hyperparameters of the CNN, which helps in improving its performance. Finally, an Extreme Learning Machine with Autoencoder (ELM-AE) is applied as a classifier to detect the presence of COVID-19 and categorize it under different class labels. The proposed MBWO-CNN model was experimentally validated in this study and the results obtained were compared with existing techniques. Fig. 1 shows the working procedure of the MBWO-CNN technique. First, the input image is preprocessed, during which it is resized to a fixed size and assigned a class label. Then, feature extraction is executed on the preprocessed image. The hyperparameter tuning process helps in the selection of the initial hyperparameters of the CNN model. Finally, the feature vectors are classified and the images are categorized into the corresponding class labels as either COVID or non-COVID.

Feature Extraction
Once the input image is preprocessed, feature extraction is carried out using MBWO-CNN model. The application of MBWO algorithm helps in the selection of initial hyperparameters of CNN.

Convolutional Neural Network
CNN is a class of DL model that has delivered major breakthroughs in image analysis. It is predominantly applied to the examination of visual images in image classification. Its hierarchical structure and effective feature extraction make CNN the most preferred and dynamic approach for image categorization. Initially, the layers are arranged in 3D format.
The neurons in a given layer are not connected to the complete set of neurons in the following layer. In other words, only a small subset of neurons in the following layer is connected. Consequently, the final output is reduced to a single vector of probability values, organized along the depth dimension. Fig. 2 shows both training and testing approaches of DCNN in the classification of COVID-19. The figure shows that the CNN classification model applies diverse layers to build a model and test its function. To categorize patients as either infected with COVID-19 or not, the properties of chest CT images are examined accurately. COVID-19 disease categorization, based on chest CT image, is performed to identify the duplicate classifications and process the images. To classify COVID-19 affected patients with the CNN approach, the following layers are used.
• Convolution layer: applies the convolution operation to its input and sends the result to the next layer.
• Pooling layer: combines the outputs of clusters of neurons into a single neuron in the subsequent layer.
• FC layers: link every neuron in one layer to every neuron in the subsequent layer, so that each neuron receives input from all the elements of the preceding layer.
CNN works by extracting features from images, so there is no need for manual feature extraction. The features are not predefined; they are learned while the network is trained on a collection of images. This training process makes the DL model highly effective in computer vision tasks. CNNs perform feature detection with the help of a large number of hidden layers, and each layer increases the complexity of the learned features [17,18].
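The three layer types listed above can be illustrated with a minimal pure-Python sketch. It is not the architecture used in this study: the 5 × 5 toy image, the 2 × 2 kernel, and the FC weights are hypothetical values chosen only to make the data flow visible.

```python
import math

def conv2d(image, kernel):
    """Valid (no padding), stride-1 sliding-window product, as in a CNN convolution layer."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for c in range(out_w)]
            for r in range(out_h)]

def relu(fmap):
    """Element-wise non-linearity applied to a feature map."""
    return [[max(0.0, v) for v in row] for row in fmap]

def max_pool(fmap, size=2):
    """Non-overlapping max pooling; each window collapses to one value."""
    return [[max(fmap[r + i][c + j] for i in range(size) for j in range(size))
             for c in range(0, len(fmap[0]) - size + 1, size)]
            for r in range(0, len(fmap) - size + 1, size)]

def fully_connected(features, weights, bias):
    """One FC output neuron: it sees every input feature; sigmoid gives a score in (0, 1)."""
    z = sum(f * w for f, w in zip(features, weights)) + bias
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical toy input: a 5x5 image with a vertical edge, and a 2x2 edge kernel.
image = [[0, 0, 1, 1, 1]] * 5
kernel = [[-1, 1], [-1, 1]]

fmap = relu(conv2d(image, kernel))            # 4x4 feature map
pooled = max_pool(fmap)                       # 2x2 map after pooling
flat = [v for row in pooled for v in row]     # flatten before the FC layer
score = fully_connected(flat, weights=[0.5] * len(flat), bias=-0.5)
```

In a trained CNN the kernel and FC weights are learned from the images rather than fixed by hand, which is the point made in the paragraph above.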

Multi-Objective Fitness Function
According to the literature, CNN suffers from hyperparameter tuning problems. The hyperparameters are kernel size, padding, kernel type, hidden layer, stride, activation functions, learning rate, momentum, epoch count, and batch size. These variables should be tuned. Here, a multi-objective Fitness Function (FF) is formulated over sensitivity and specificity, denoted by $S_n$ and $S_p$, respectively. Sensitivity, the true positive rate, measures the proportion of actual positives that are correctly classified. It is estimated from the confusion matrix and can be determined numerically according to the literature [19].

$$S_n = \frac{T_p}{T_p + F_n} \times 100$$

Here, $T_p$ and $F_n$ refer to the True Positive (TP) and False Negative (FN) counts, respectively. $S_n$ lies in [0, 100], and a value approaching 100 is desirable. Specificity ($S_p$) measures the proportion of actual negatives that are correctly classified as True Negatives (TN) and is evaluated as given herewith.

$$S_p = \frac{T_n}{T_n + F_p} \times 100$$

Here, $T_n$ and $F_p$ signify the True Negative (TN) and False Positive (FP) counts, respectively. $S_p$ also lies in [0, 100], and a value approaching 100 is desirable.

Fig. 3 shows the flowchart of the BWO algorithm. Like other Evolutionary Algorithms (EAs), the approach starts from an initial population of spiders, in which each spider represents a candidate solution. The initial spiders attempt to produce a new generation in pairs. The female black widow consumes the male once mating is over. When the eggs are laid, spiderlings come out of the egg sacs [20]. They cohabit on the maternal web for a few days, sometimes up to a week, and sibling cannibalism is observed during this period. To solve an optimization problem, the values of the problem variables must be arranged in a suitable structure. In GA and PSO, this structure is termed 'chromosome' and 'particle position', respectively, whereas in BWO it is termed 'widow'. Each candidate solution to a problem is thus regarded as a black widow spider, which holds the values of the problem variables. Further, to solve benchmark functions, this structure is represented as an array.
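The two objectives of the fitness function described in this section can be computed directly from confusion-matrix counts. The sketch below is illustrative only: the counts are made up, and combining the two objectives as an equal-weight sum is an assumption for demonstration, not a form confirmed by the text.

```python
def sensitivity(tp, fn):
    """True positive rate in percent: share of actual positives classified as positive."""
    return 100.0 * tp / (tp + fn)

def specificity(tn, fp):
    """True negative rate in percent: share of actual negatives classified as negative."""
    return 100.0 * tn / (tn + fp)

def fitness(tp, fn, tn, fp):
    """Hypothetical scalarization (assumption): equal-weight sum of both objectives."""
    return sensitivity(tp, fn) + specificity(tn, fp)

# Made-up confusion-matrix counts, for illustration only.
tp, fn, tn, fp = 90, 10, 40, 10
s_n = sensitivity(tp, fn)
s_p = specificity(tn, fp)
score = fitness(tp, fn, tn, fp)
```

Under this assumed scalarization the score ranges over [0, 200], and a hyperparameter setting is preferred when it raises both rates together.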

BWO Algorithm
For an $N_{var}$-dimensional optimization problem, a widow is an array of size $1 \times N_{var}$ that represents a solution of the problem and is expressed as follows.

$$\text{widow} = [x_1, x_2, \ldots, x_{N_{var}}]$$

Each of the variable values $(x_1, x_2, \ldots, x_{N_{var}})$ is a floating-point number. The fitness of a widow is obtained by evaluating the Fitness Function (FF) $f$ at the widow $(x_1, x_2, \ldots, x_{N_{var}})$. Hence,

$$\text{fitness} = f(\text{widow}) = f(x_1, x_2, \ldots, x_{N_{var}})$$

To start the optimization, a candidate widow matrix of size $N_{pop} \times N_{var}$ is created as the initial population of spiders. Next, pairs of parents are randomly selected to perform procreation by mating. In this process, the male black widow is consumed by the female once mating is over.
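A minimal sketch of the initialization step described above is given here. The bounds, the three example hyperparameters, and the placeholder objective `toy_fitness` are hypothetical; in the actual pipeline the fitness of a widow would come from training and validating the CNN with the encoded hyperparameters.

```python
import random

def init_population(n_pop, bounds, rng):
    """Build an N_pop x N_var widow matrix; each gene is a float drawn within its bounds."""
    return [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_pop)]

def evaluate(population, f):
    """Fitness of every widow, in population order."""
    return [f(widow) for widow in population]

# Hypothetical search ranges, e.g. learning rate, momentum, dropout rate.
bounds = [(1e-4, 1e-1), (0.5, 0.99), (0.0, 0.5)]

def toy_fitness(widow):
    """Placeholder objective (higher is better): closeness to the middle of each range."""
    return -sum((x - (lo + hi) / 2) ** 2 for x, (lo, hi) in zip(widow, bounds))

rng = random.Random(0)                      # fixed seed for repeatability
population = init_population(6, bounds, rng)
scores = evaluate(population, toy_fitness)
parents = rng.sample(population, 2)         # one randomly selected mating pair
```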

(ii) Procreate
Since the pairs are independent of each other, each pair mates and gives birth to the next generation. Mating takes place in the web. Approximately 1,000 eggs are laid in every mating. Some of the spiderlings die for various reasons, while the healthy ones stay alive. For reproduction, an array named 'alpha', of the same length as the widow array and filled with random values, is created. Offspring are then generated using $\alpha$ as follows, where $x_1$ and $x_2$ denote the parents and $y_1$ and $y_2$ indicate the offspring.

$$y_1 = \alpha \times x_1 + (1 - \alpha) \times x_2, \qquad y_2 = \alpha \times x_2 + (1 - \alpha) \times x_1$$

This is repeated for $N_{var}/2$ iterations, in which the randomly chosen values are not repeated. Eventually, both children and parents are placed in an array and sorted by fitness. According to the cannibalism rating, the best individuals are added to the newly produced population.
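The procreation step can be sketched as follows, assuming the usual BWO blend in which each child is a gene-wise convex combination of the two parents weighted by a random array alpha. The function name `procreate` and the example parents are illustrative.

```python
import random

def procreate(x1, x2, rng):
    """Produce 2 * (N_var // 2) children from parents x1 and x2.

    Each round draws a fresh random array alpha and forms
    y1 = alpha * x1 + (1 - alpha) * x2 and y2 = alpha * x2 + (1 - alpha) * x1.
    """
    n_var = len(x1)
    children = []
    for _ in range(n_var // 2):
        alpha = [rng.random() for _ in range(n_var)]
        y1 = [a * p + (1 - a) * q for a, p, q in zip(alpha, x1, x2)]
        y2 = [a * q + (1 - a) * p for a, p, q in zip(alpha, x1, x2)]
        children += [y1, y2]
    return children

rng = random.Random(42)
x1 = [0.0, 0.0, 0.0, 0.0]
x2 = [1.0, 1.0, 1.0, 1.0]
kids = procreate(x1, x2, rng)
```

Because the two children use complementary weights, their gene-wise sum equals that of the parents, so offspring always stay inside the range spanned by the pair.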

(iii) Cannibalism
There are three types of cannibalism. The first is sexual cannibalism, in which the female black widow consumes the male after mating; here, the female and the male are distinguished by their fitness values. The second is sibling cannibalism, in which healthy spiderlings eat the weaker ones; here, a Cannibalism Rating (CR) is set, which determines the number of survivors. In the third type, the baby spiders prey on their mother. Fitness values are used to identify weak and strong spiderlings.
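The first two cannibalism types can be sketched as simple fitness-based selection. Two assumptions are made here and are not taken from the text: the fitter parent is treated as the female, and CR is read as the fraction of siblings removed. The third type is omitted.

```python
def sexual_cannibalism(x1, x2, f):
    """Keep the fitter parent (treated as the female); the other one is removed."""
    return x1 if f(x1) >= f(x2) else x2

def sibling_cannibalism(children, f, cr):
    """Keep only the strongest children.

    cr is interpreted (assumption) as the fraction of siblings that are eaten,
    so about len(children) * (1 - cr) survive, and at least one always does.
    """
    n_survive = max(1, int(len(children) * (1 - cr)))
    return sorted(children, key=f, reverse=True)[:n_survive]

# Illustrative fitness: the sum of the genes (higher is better).
fitness = sum
children = [[1, 1], [5, 5], [3, 3], [0, 0]]
mother = sexual_cannibalism([1, 2], [3, 4], fitness)
survivors = sibling_cannibalism(children, fitness, cr=0.5)
```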

(iv) Mutation
In this process, a number of individuals, determined by the mutation rate, are randomly selected from the population. In each selected solution, two elements of the array are randomly interchanged.
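A minimal sketch of this mutation step follows, assuming that a mutation swaps two randomly chosen positions of a widow and that the mutation rate fixes how many individuals are mutated. The names `mutate` and `mutate_population` are illustrative.

```python
import random

def mutate(widow, rng):
    """Return a copy of the widow with two randomly chosen positions swapped."""
    i, j = rng.sample(range(len(widow)), 2)   # two distinct indices
    child = list(widow)
    child[i], child[j] = child[j], child[i]
    return child

def mutate_population(population, pm, rng):
    """Mutate int(len(population) * pm) randomly selected individuals."""
    n_mut = int(len(population) * pm)
    return [mutate(w, rng) for w in rng.sample(population, n_mut)]

rng = random.Random(1)
widow = [1, 2, 3, 4]
mutant = mutate(widow, rng)

population = [[k, k + 1, k + 2] for k in range(10)]
mutants = mutate_population(population, pm=0.4, rng=rng)
```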

(v) Convergence
As in other EAs, three termination criteria can be considered: a predefined number of iterations, no change in the fitness value of the best widow over several iterations, and attainment of a specified level of accuracy.
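The three termination criteria can be combined in one small check, sketched below. The parameter names (`max_iter`, `patience`, `tol`, `target`) are hypothetical and the fitness is assumed to be maximized.

```python
def should_stop(history, max_iter, patience, tol, target=None):
    """Decide whether to stop, given the best fitness recorded at each iteration.

    Stops when (1) max_iter iterations have run, (2) the target fitness is
    reached, or (3) the best fitness improved by less than tol over the last
    `patience` iterations.
    """
    if len(history) >= max_iter:
        return True
    if target is not None and history and history[-1] >= target:
        return True
    if len(history) > patience:
        recent_gain = history[-1] - history[-1 - patience]
        if recent_gain < tol:
            return True
    return False

stalled = should_stop([1.0, 1.0, 1.0, 1.0], max_iter=10, patience=3, tol=1e-6)
still_improving = should_stop([1.0, 2.0], max_iter=10, patience=5, tol=1e-6)
```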

(vi) Parameter Setting
The BWO algorithm has three controlling parameters that must be set to attain the best outcomes: Procreating Rate (PP), Cannibalism Rate (CR), and Mutation Rate (PM). These parameters have to be tuned properly to improve the ability of the algorithm to find good solutions. The better they are tuned, the higher the chance of escaping local optima and the greater the ability to explore the search space globally. Therefore, the right parameter values ensure a balance between the exploitation and exploration phases. PP is the procreating ratio that determines how many individuals take part in procreation [19]. By controlling the production of offspring, it provides additional diversification and offers more chances to explore the search space precisely. CR is the controlling parameter of the cannibalism operator, which removes unfit individuals from the population. An appropriate value of this parameter ensures high performance in the exploitation phase by moving the search agents from the local to the global level. PM is the percentage of individuals that take part in mutation. A suitable value of this parameter maintains the balance between exploitation and exploration. It controls the transfer of search agents from the global stage to the local one and drives them towards the optimal solution.
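One way to keep the three controlling parameters together and validated is a small configuration object, sketched below. The class name `BWOParams`, the example values, and the rounding rule are assumptions for illustration; the text does not state the values used in the experiments.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class BWOParams:
    pp: float  # procreating rate: fraction of the population selected as parents
    cr: float  # cannibalism rate: controls how many offspring are removed
    pm: float  # mutation rate: fraction of the population that is mutated

    def __post_init__(self):
        # All three parameters are rates, so they must lie in [0, 1].
        for name in ("pp", "cr", "pm"):
            value = getattr(self, name)
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{name} must lie in [0, 1], got {value}")

    def counts(self, n_pop):
        """Number of individuals entering procreation and mutation for a population size."""
        return {"parents": round(n_pop * self.pp), "mutants": round(n_pop * self.pm)}

# Illustrative values only.
params = BWOParams(pp=0.5, cr=0.5, pm=0.25)
sizes = params.counts(n_pop=40)
```

A larger `pp` or `pm` pushes the search towards exploration, whereas a larger `cr` prunes more offspring and favours exploitation, which mirrors the trade-off described above.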

ELM-AE Based Classification
The current research article considers ELM-AE for classification. It takes the extracted features and estimates the probability of the objects present in an image. An activation function and a dropout layer are employed to introduce non-linearity and to reduce overfitting, respectively [21]. ELM is a Single hidden Layer Feed-forward Neural network (SLFN) in which the hidden layer is non-linear, owing to its non-linear activation function, whereas the output layer is linear with no activation function. It is composed of three layers, namely the input layer, hidden layer, and output layer. Assume that $x$ is a training instance and $f(x)$ is the output of the NN. The SLFN with $k$ hidden nodes is represented by the function given below.

$$f(x) = G(w, b, x)\,B \tag{7}$$

where $G(w, b, x)$ is the hidden layer activation function, $w$ is the input weight matrix that connects the input and hidden layers, $b$ is the bias weight of the hidden layer, and $B = [\beta_1\ \beta_2\ \ldots\ \beta_m]$ is the weight between the hidden and output layers. With $n$ training instances, $d$ input neurons, $k$ hidden neurons, and $m$ output neurons (i.e., $m$ classes), Eq. (7) is expressed as follows.

$$\sum_{j=1}^{k} B_{j\cdot}\, g(\langle w_j, x_i \rangle + b_j) = t_i, \qquad i = 1, 2, \ldots, n \tag{8}$$

where $t_i$ is the $m$-dimensional desired output vector for the $i$-th training instance $x_i$, $B_{j\cdot}$ is the $j$-th row of $B$ (the output weights of the $j$-th hidden neuron), the $d$-dimensional $w_j$ is the weight vector from the input layer to the $j$-th hidden neuron, and $b_j$ is the bias of the $j$-th hidden neuron. Here, $\langle w_j, x_i \rangle$ denotes the inner product of $w_j$ and $x_i$. The sigmoid function $g$ is employed as the activation function. Hence, the output of the $j$-th hidden neuron is defined by the following equation.

$$g(\langle w_j, x_i \rangle + b_j) = \frac{1}{1 + \exp\left(-\dfrac{\langle w_j, x_i \rangle + b_j}{\epsilon^2}\right)}$$

where $\exp(\cdot)$ is the exponential function and $\epsilon^2$ is the steepness parameter. In matrix form, Eq. (8) is rewritten as follows.

$$HB = T \tag{10}$$

where $T \in \mathbb{R}^{n \times m}$ is the target output and $H$ is the hidden layer output matrix of ELM, which has size $(n, k)$ and is given herewith.

$$H = \begin{bmatrix} g(\langle w_1, x_1 \rangle + b_1) & \cdots & g(\langle w_k, x_1 \rangle + b_k) \\ \vdots & \ddots & \vdots \\ g(\langle w_1, x_n \rangle + b_1) & \cdots & g(\langle w_k, x_n \rangle + b_k) \end{bmatrix}_{n \times k}$$

Next, $B$ is determined by the minimum-norm least-squares solution:

$$B = H^{T}\left(\frac{I}{C} + HH^{T}\right)^{-1} T$$

where $C$ is a regularization parameter. With $h(x)$ denoting the hidden-layer output for an input $x$, ELM is defined as follows.

$$f_{ELM}(x) = h(x)\,B = h(x)\,H^{T}\left(\frac{I}{C} + HH^{T}\right)^{-1} T$$

ELM is upgraded to Kernel-based ELM (KELM) through the kernel trick. Let

$$\Omega = HH^{T}, \qquad \Omega_{i,j} = h(x_i) \cdot h(x_j) = K(x_i, x_j)$$

where $x_i$ and $x_j$ are the $i$-th and $j$-th training instances, respectively. Substituting $HH^{T}$ by $\Omega$, KELM is represented as given herewith.

$$f_{KELM}(x) = \left[K(x, x_1), \ldots, K(x, x_n)\right]\left(\frac{I}{C} + \Omega\right)^{-1} T$$

where $f_{KELM}(x)$ is the output of the KELM model and $h(x)H^{T} = [K(x, x_1), \ldots, K(x, x_n)]$. A significant feature of KELM is that the number of hidden nodes need not be specified, and no random feature mappings are involved. Moreover, the processing time is reduced compared to ELM because of the kernel trick.
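The basic ELM training step described above (random hidden weights, sigmoid hidden layer, regularized least-squares output weights) can be sketched in pure Python. This is a toy illustration under stated assumptions: it uses the form B = (I/C + HᵀH)⁻¹HᵀT, which is a standard regularized least-squares solution, a steepness of 1, and a made-up four-sample dataset. It does not reproduce the ELM-AE classifier of this study.

```python
import math
import random

def transpose(a):
    return [list(col) for col in zip(*a)]

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def solve(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [list(ra) + list(rb) for ra, rb in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * pv for v, pv in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def hidden_output(x, w, b):
    """H[i][j] = sigmoid(<w_j, x_i> + b_j); H has shape (n, k)."""
    return [[1.0 / (1.0 + math.exp(-(sum(wi * xi for wi, xi in zip(wj, row)) + bj)))
             for wj, bj in zip(w, b)]
            for row in x]

def train_elm(x, t, k, c, rng):
    """Random input weights and biases; output weights from regularized least squares."""
    d = len(x[0])
    w = [[rng.uniform(-1, 1) for _ in range(d)] for _ in range(k)]
    b = [rng.uniform(-1, 1) for _ in range(k)]
    h = hidden_output(x, w, b)
    ht = transpose(h)
    gram = matmul(ht, h)                      # k x k matrix H^T H
    for i in range(k):
        gram[i][i] += 1.0 / c                 # add I / C on the diagonal
    beta = solve(gram, matmul(ht, t))         # k x m output weights
    return w, b, beta

def predict(x, w, b, beta):
    return matmul(hidden_output(x, w, b), beta)

# Made-up two-class toy data with one-hot targets.
x = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
t = [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0], [1.0, 0.0]]
w, b, beta = train_elm(x, t, k=5, c=100.0, rng=random.Random(0))
outputs = predict(x, w, b, beta)
```

Only the output weights are fitted, in a single linear solve, which is why ELM training is fast compared with gradient-based training of the same network.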
In the multilayer ELM (MELM), representation learning is carried out layer by layer, where $\gamma^{(i)}_k$ is a transformation vector used for representation learning with respect to $x^{(i)}_k$. Based on Eq. (10), $B$ is replaced by $\Gamma^{(i)}$ and $T$ is replaced by $X^{(i)}$ in MELM:

$$H^{(i)}\,\Gamma^{(i)} = X^{(i)}$$

where $H^{(i)}$ is the output matrix of the $i$-th hidden layer with respect to $X^{(i)}$, and $\Gamma^{(i)}$ is obtained by solving this system. The learned representation is passed on from layer to layer, and $X^{*}$ denotes the final representation of $X^{(1)}$. $X^{*}$ is then used as the hidden-layer output from which the final output weight $\beta^{*}$ is estimated.

Experimental Validation
This section discusses the classification results achieved by the MBWO-CNN model on the chest X-ray image dataset [22]. The dataset comprises chest X-ray images from both COVID-19 patients and non-COVID patients. The number of images under the COVID-19 class is 220 and the number of images under the normal class is 27. Fig. 5 shows some of the test images used in the study.

... 61%. The CNNRNN model demonstrated a slightly better classification outcome with an accuracy of 85.66% and F-score of 91.20%. Also, the ANN model yielded a moderate result with an accuracy of 86% and F-score of 91.34%. In addition, the LSTM method attained a manageable result with an accuracy of 86.66% and F-score of 91.89%. Likewise, the DT model achieved an acceptable accuracy of 86.71% and F-score of 87%. In line with this, the CNN model accomplished a moderate accuracy of 87.36% and F-score of 89.65%. Furthermore, the ANFIS model yielded an accuracy of 88.11% and F-score of 89.04%. Besides, the KNN model offered a better result than the earlier models, with an accuracy of 88.91% and F-score of 89%. Next, the CoroNet model showed a moderate outcome with an accuracy of 90.21% and F-score of 91%. Though the DTL approach reached a moderate accuracy of 90.75% and F-score of 90.43%, the XGBoost method exceeded all the previous models in terms of accuracy (91.57%) and F-score (92%). Similarly, the LR model achieved an accuracy of 92.12% and F-score of 92%. The MLP model registered a competitive outcome with an accuracy of 93.13% and F-score of 93%. However, the proposed MBWO-CNN model outperformed all the existing models and achieved the highest accuracy of 96.43% and F-score of 96.68%. These results show that the presented MBWO-CNN model is a proficient tool for diagnosis and classification of COVID-19.

Conclusion
The current research work developed an effective MBWO-CNN model for diagnosis and classification of COVID-19. The input image was first preprocessed by resizing it to a fixed size. Then, feature extraction was performed on the preprocessed image. The hyperparameter tuning process helped in the selection of the initial hyperparameters of the CNN model. Finally, the feature vectors were classified and the images were categorized into the corresponding class labels, i.e., either COVID-19 or Non-COVID. A detailed comparative analysis was conducted to validate the detection performance of the proposed MBWO-CNN model, and the results were investigated under several aspects. The proposed MBWO-CNN model accomplished the highest sensitivity of 95.78%, specificity of 96.15%, and accuracy of 96.43% in the experimental investigations. Therefore, the study established that MBWO-CNN is an effective model for diagnosis and classification of COVID-19. In the future, the MBWO-CNN model can be deployed in IoT and cloud-enabled diagnostic tools to assist remote patients.

Funding Statement:
The authors received no specific funding for this study.

Conflicts of Interest:
The authors declare that they have no conflicts of interest to report regarding the present study.