An Efficient CNN-Based Automated Diagnosis Framework from COVID-19 CT Images

Abstract: Coronavirus Disease 2019 (COVID-19) continues to spread rapidly around the world. It has dramatically affected daily life, public health, and the world economy. This paper presents a deep-learning-based framework for the segmentation and classification of COVID-19 images. First, a Convolutional Neural Network (CNN) classifies CT images as COVID-19, non-COVID, or pneumonia. Then, the segmentation process is applied to the COVID-19 and pneumonia CT images. Finally, the resulting segmented images are used to identify the infected region, whether COVID-19 or pneumonia. The proposed CNN consists of four Convolutional (Conv) layers, four batch normalization layers, and four Rectified Linear Units (ReLUs). The Conv layers use 8, 16, 32, and 64 filters, respectively. Four max-pooling layers are employed with a stride of 2 and a 2 × 2 window. The classification layer comprises a Fully-Connected (FC) layer and a soft-max activation function that takes the classification decision. A novel saliency-based region detection algorithm and an active contour segmentation strategy are applied to segment the COVID-19 and pneumonia CT images. The obtained results substantiate the efficacy of the proposed framework in assisting specialists in automated diagnosis applications.

follow-up of a disease that has already been diagnosed and treated. Medical images can be acquired with several techniques such as MRI (Magnetic Resonance Imaging), CT (Computed Tomography), X-ray imaging, and PET (Positron Emission Tomography) [1,2].
CT is a diagnostic imaging technique that uses X-rays to build cross-sectional images of the human body. The cross-sections are reconstructed from measurements of the attenuation of X-ray beams passing through the object under study. The technique relies on the basic principle that the density of the tissue traversed by an X-ray beam can be estimated by computing the attenuation factor. Using this principle, CT reconstructs the body density over a two-dimensional section perpendicular to the axis of the acquisition system [3].
At the end of 2019, COVID-19 began to spread. The classification and segmentation of COVID-19 images are now critical tasks for researchers. The coronavirus spreads quickly between people. With the increasing infection rates of this virus, researchers have directed their work toward finding effective solutions for detecting and diagnosing COVID-19. The early detection of the virus aids in selecting the proper treatment [4,5].
COVID-19 is an infectious disease caused by a new virus that had not been identified in humans before. The virus causes a respiratory disease with symptoms such as fever and coughing, leading to pneumonia in more serious cases. The new coronavirus is spread by contact with an infected person. The rapid and correct diagnosis of this disease plays a significant role in effective treatment planning and patient care. Several imaging techniques have been applied for the diagnosis of the coronavirus, and imaging tests can help specialists detect the disease. The CT scan serves as a practical approach to the early screening of COVID-19 [6].
Deep learning networks are widely regarded as a leading choice for image classification because they extract image features automatically. In particular, a CNN is a type of deep learning network that works efficiently on medical images. It helps to extract and learn valuable features found in images. A CNN is composed of input and output layers and many hidden layers. The hidden layers comprise convolutional layers, pooling layers, and fully-connected layers [7][8][9][10][11][12].
The contribution of this paper is a framework for the classification and segmentation of COVID-19 CT images. The suggested framework deals successfully with the non-homogeneity and high variability of CT images in the segmentation process. A novel saliency-based region detection algorithm and an active contour segmentation strategy are applied to segment the COVID-19 and pneumonia CT images. The segmentation algorithm delineates the affected region accurately. A CNN trained from scratch on CT images classifies normal, pneumonia, and COVID-19 cases. The proposed framework achieves an accuracy of 99.59% in the classification process.
The remainder of this study is organized as follows. The related works are discussed in Section 2. The proposed hybrid classification-segmentation framework is presented in Section 3. The utilized dataset is described in Section 4. Extensive experimental analyses to validate the proposed framework are offered in Section 5. Lastly, the concluding remarks are summarized in Section 6.

Related Work
Sethy et al. [13] introduced a technique for COVID-19 detection based on deep features. The image features are extracted from pre-trained CNNs such as AlexNet, VGG16, and VGG19. The resulting features are classified by an SVM (Support Vector Machine). This technique is utilized to classify chest X-ray pneumonia images. Turkoglu et al. [14] presented a technique for the classification and detection of COVID-19 based on a CNN. The effective features are chosen from all layers of the AlexNet architecture using the relief feature selection algorithm. Then, the classification process is applied with the SVM. This technique classifies COVID-19, normal, and pneumonia chest X-ray images.
Ouchicha et al. [15] proposed a deep-learning model for detecting COVID-19 from chest X-ray images. This model depends on a residual neural network, and it is built with multiple levels of different kernel dimensions to capture local and global features. Residual connections link the levels so that they share information. This model achieved an accuracy of 96.69%. Jain et al. [16] introduced a system for pneumonia detection from chest X-ray images using CNNs and transfer learning. This system depends on six different networks for pneumonia detection. The first and second networks are composed of two and three convolutional layers, respectively. The other four networks are the pre-trained models ResNet50, Inception-v3, VGG16, and VGG19.
Oulefki et al. [17] presented a technique for the segmentation of COVID-19 chest CT images. The local mean filter is used to improve the image quality. Multi-level thresholding segmentation is applied to segment the images into pneumonia and non-pneumonia regions. This technique achieved a segmentation accuracy of 98%. Amyar et al. [18] presented a technique for segmentation and classification of COVID-19 based on deep learning. This technique depends on a multi-task algorithm to identify COVID-19 patients and segment COVID-19 lesions from chest CT images. A single encoder is used for feature extraction. Two decoders and a multi-layer perceptron are applied for reconstruction, segmentation, and classification, respectively.
Mahmud et al. [19] proposed an approach for detecting COVID-19 and pneumonia from X-ray images. Different forms of CNN are designed and trained on X-ray images of several resolutions for performance optimization. Gradient-based differential localization is incorporated to distinguish the abnormal areas of X-ray images that indicate different types of pneumonia. Wang et al. [20] introduced a technique for the classification of COVID-19 CT images. This technique is composed of three steps. First, the Region of Interest (RoI) is randomly chosen. Then, a pre-trained CNN model is trained to extract the image features. Finally, a classification network is used to discriminate the COVID-19 cases.

The Proposed Hybrid Classification-Segmentation Framework
The proposed framework is applied for the classification and segmentation of COVID-19 images. First, the classification process discriminates between COVID-19, non-COVID, and pneumonia cases. Then, the segmentation process is applied to the COVID-19 and pneumonia CT images. Finally, the resulting segmented images are used to identify the infected region, whether it is COVID-19 or pneumonia. A CNN is used for the classification process. It is composed of an input layer, convolutional layers, pooling layers, and a classification layer. The input layer takes CT images of size 512 × 512. Each convolutional layer comprises a convolution operation, a batch normalization layer, and a ReLU function. The classification layer consists of a fully-connected layer and a soft-max activation function that takes the classification decision. In total, the proposed CNN has four Conv layers, four batch normalization layers, and four ReLU functions. The Conv layers use 8, 16, 32, and 64 filters, respectively. Four max-pooling layers are implemented with a stride of two and a window size of 2 × 2. Fig. 1 shows the flow diagram of the suggested classification and segmentation framework of COVID-19 CT images.
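To make the layer description concrete, the following minimal sketch traces how the spatial size and channel count change through the four Conv/pooling blocks and estimates the number of learnable parameters. It is only an illustration: the 3 × 3 kernels, "same" padding, single-channel input, and three-class output are assumptions that the text does not state, and 8, 16, 32, and 64 are read here as filter counts. The function names are hypothetical.

```python
# Hypothetical shape walk-through for the four-block CNN described above.
# Assumptions (not stated in the text): 3x3 kernels, "same" padding,
# a single-channel (grayscale) input, and three output classes.

def conv_block_params(in_ch, out_ch, kernel=3):
    """Learnable parameters of one Conv + batch-normalization block."""
    conv = kernel * kernel * in_ch * out_ch + out_ch  # weights + biases
    batch_norm = 2 * out_ch                           # scale and shift
    return conv + batch_norm

def walk_network(size=512, in_ch=1, filters=(8, 16, 32, 64), classes=3):
    """Return per-block (spatial size, channels), FC input size, and parameter total."""
    shapes, total = [], 0
    for out_ch in filters:
        total += conv_block_params(in_ch, out_ch)
        size //= 2                      # 2x2 max pooling with stride 2 halves each side
        shapes.append((size, out_ch))
        in_ch = out_ch
    flat = size * size * in_ch          # features fed to the fully-connected layer
    total += flat * classes + classes   # fully-connected weights + biases
    return shapes, flat, total

shapes, flat, total = walk_network()
print(shapes)  # spatial size and channels after each block
print(flat)    # length of the flattened feature vector
print(total)   # parameter count under the stated assumptions
```

Under these assumptions, each pooling stage halves the 512 × 512 input, so the fully-connected layer receives a 32 × 32 × 64 feature volume, and most of the parameters sit in that final layer.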

Convolutional Layer
Convolution is the essential operation of the CNN; it is performed to extract specific characteristics from the input images. A window is moved over the whole image, and a weighted sum is computed at each position to generate the feature map as the output [21,22]. This process decreases the image size, which makes the image simpler to manipulate. Each point in the generated feature map can be evaluated as $y = \sum_{i \in S} w_i \, p_i$, where $p_i$ represents the pixel at position $i$ and $w_i$ represents the weight of that pixel in the RoI $S$.
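The sliding-window weighted sum described above can be sketched in a few lines of plain Python. This is an illustrative implementation without padding (so the output is smaller than the input, as the text notes), and, as in most CNN libraries, the kernel is not flipped. The function name and the example values are hypothetical.

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image and take the weighted sum at each position."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1          # no padding: the output shrinks
    out_w = len(image[0]) - kw + 1
    feature_map = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            acc = 0.0
            for i in range(kh):
                for j in range(kw):
                    acc += kernel[i][j] * image[r + i][c + j]  # w_i * p_i over the window S
            row.append(acc)
        feature_map.append(row)
    return feature_map

image = [[1, 2, 3, 0],
         [4, 5, 6, 0],
         [7, 8, 9, 0],
         [0, 0, 0, 0]]
kernel = [[1, 0],
          [0, -1]]                       # simple diagonal-difference filter
feature_map = conv2d_valid(image, kernel)
print(feature_map)
```

A 4 × 4 input with a 2 × 2 window yields a 3 × 3 feature map, which illustrates the size reduction mentioned in the text.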

Activation Function
Two types of activation functions are employed in the proposed framework: ReLU and soft-max. The ReLU is a widely used activation function because it performs well in learning and is computationally inexpensive. It is applied in the Conv layers to generate the feature maps. The soft-max function, on the other hand, is applied in the classification layer of the CNN, where it normalizes the inputs into a probability distribution [15,16].
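Both activation functions are simple to state in code. The sketch below uses the standard definitions; subtracting the maximum score inside the soft-max is a common numerical-stability step that is added here and is not mentioned in the text. The example scores are hypothetical.

```python
import math

def relu(x):
    """Rectified Linear Unit: pass positive values, clamp negatives to zero."""
    return max(0.0, x)

def softmax(scores):
    """Normalize raw class scores into a probability distribution."""
    top = max(scores)                               # shift for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

activations = [relu(v) for v in (-3.0, 0.0, 2.5)]
probs = softmax([2.0, 1.0, 0.1])                    # e.g. three class scores
print(activations)
print(probs, sum(probs))
```

The class with the largest score receives the largest probability, and the probabilities sum to one, which is what allows the soft-max output to be read as the classification decision.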

Batch Normalization Layer
Batch normalization is a layer used to improve convergence during the training process. It is implemented for performance optimization, to decrease over-fitting and achieve better test accuracy [15].
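As a minimal illustration of what this layer computes, the sketch below normalizes one feature over a mini-batch to zero mean and approximately unit variance, then applies a learnable scale and shift. The parameter names `gamma`, `beta`, and `eps` follow common usage and are assumptions, as are the example values; the running statistics used at inference time are omitted.

```python
import math

def batch_norm(values, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize a mini-batch of one feature, then scale by gamma and shift by beta."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return [gamma * (v - mean) / math.sqrt(var + eps) + beta for v in values]

normalized = batch_norm([2.0, 4.0, 6.0, 8.0])
print(normalized)
```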

Pooling Layer
The pooling layer is used to decrease the image size. It reduces the number of parameters and the computational complexity, which helps to control over-fitting. The pooling operations are characterized by a specific window size and a specific stride. Pooling is classified into average pooling and max pooling [23,24].
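The max-pooling configuration used in this framework (2 × 2 window, stride 2) can be sketched as follows. The function name and the example feature map are hypothetical; a trailing odd row or column is simply dropped in this sketch.

```python
def max_pool_2x2(fmap):
    """Max pooling with a 2x2 window and stride 2 (an odd trailing row/column is dropped)."""
    pooled = []
    for r in range(0, len(fmap) - 1, 2):
        row = []
        for c in range(0, len(fmap[0]) - 1, 2):
            row.append(max(fmap[r][c], fmap[r][c + 1],
                           fmap[r + 1][c], fmap[r + 1][c + 1]))
        pooled.append(row)
    return pooled

pooled = max_pool_2x2([[1, 3, 2, 0],
                       [4, 2, 1, 5],
                       [0, 1, 9, 8],
                       [2, 6, 7, 3]])
print(pooled)
```

Each side of the feature map is halved, so a 4 × 4 map becomes 2 × 2 while the strongest response in every window is kept.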

Classification Layer
This is the last layer of the CNN; it maps the output of the fully-connected layer to the class labels. It consists of a fully-connected layer and a soft-max activation function. The Adam optimizer is used to update the network weights based on the training data. The soft-max activation function provides the classification output [25][26][27][28].
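For readers unfamiliar with Adam, the sketch below shows one update step for a single scalar weight, using the standard formulation with bias-corrected moment estimates. The hyper-parameter values and the toy objective are assumptions chosen for illustration; the text does not report the settings used for training.

```python
import math

def adam_step(w, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a scalar weight w at step t (t starts at 1)."""
    m = beta1 * m + (1 - beta1) * grad            # first-moment (mean) estimate
    v = beta2 * v + (1 - beta2) * grad * grad     # second-moment estimate
    m_hat = m / (1 - beta1 ** t)                  # bias correction
    v_hat = v / (1 - beta2 ** t)
    w = w - lr * m_hat / (math.sqrt(v_hat) + eps)
    return w, m, v

# Toy objective (hypothetical): minimize f(w) = (w - 3)^2, whose gradient is 2 (w - 3).
w, m, v = 0.0, 0.0, 0.0
for t in range(1, 2001):
    grad = 2.0 * (w - 3.0)
    w, m, v = adam_step(w, grad, m, v, t, lr=0.05)
print(w)
```

In a real network the same update is applied element-wise to every weight, with the gradient obtained by back-propagating the loss.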

Segmentation of COVID-19 CT Images
Due to the intensity inhomogeneity and pixel variations of CT images, the segmentation of COVID-19 images is still challenging. A novel saliency-based region detection algorithm and an active contour segmentation strategy are applied to segment the COVID-19 and pneumonia CT images. In image segmentation, saliency refers to how clearly a pixel or object stands out from its neighbors, and it captures the distinctive characteristics of an image [28]. The saliency information can therefore be used to segment the image. In this paper, we develop an active contour model for the segmentation of COVID-19 and pneumonia CT images. This algorithm deals successfully with the significant variations in the size, texture, and position of the infection in COVID-19 CT images.
The saliency-based region detection and image segmentation algorithm is applied to overcome intensity inhomogeneities and significant image variations. A new level-set evolution protocol for the active contour is designed based on internal and external energy functions, and a new energy function is derived to extract the objects clearly [28][29][30][31][32][33]. Let $I: \Omega \rightarrow \mathbb{R}^{2}$, with $I$ as the input image and $\Omega$ as the image domain. $\varphi$ is the level-set function with initial contour $C_0$, and $\Omega_{in}$ and $\Omega_{ex}$ are the zero-level-set domains inside and outside $C_0$. The proposed energy function $E^{s}$ combines an external energy $E_{ex}$ and an internal energy $E_{in}$. The external energy $E_{ex}$ is defined by region, gradient, and saliency terms. The internal energy $E_{in}$, on the other hand, acts as a restriction on the level-set evolution. For non-homogeneous images, pixels with the same intensity and saliency values are grouped together in $\Omega_{in}$ and $\Omega_{ex}$. $E_{ex}$ includes the saliency information and the color-intensity variance of $\Omega_{in}$ and $\Omega_{ex}$ for $I$.
In the external energy, $H_{\varepsilon}$ is the smoothed approximation of the Heaviside function, and $\varepsilon$ controls its smoothness. $\alpha$ and $\lambda$ are fixed scaling constants for the saliency information and the variance of the color intensity, respectively. $h$ is the edge indicator that reflects the saliency information and the variance of the color intensity.
In the definition of $h$, $\nabla$, $k_{\rho}$, and $*$ denote the gradient operator, the Gaussian kernel with standard deviation $\rho$, and the convolution operator, respectively.
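As a point of reference for the smoothed Heaviside function, the sketch below uses the arctangent-based regularization that is common in level-set methods such as the Chan-Vese model, together with its derivative, the smoothed Dirac delta. Whether this exact regularization is the one used in the proposed energy is an assumption; the function names are hypothetical.

```python
import math

def heaviside_eps(z, eps=1.0):
    """Smoothed Heaviside: 0.5 * (1 + (2 / pi) * arctan(z / eps))."""
    return 0.5 * (1.0 + (2.0 / math.pi) * math.atan(z / eps))

def dirac_eps(z, eps=1.0):
    """Smoothed Dirac delta, the derivative of heaviside_eps with respect to z."""
    return eps / (math.pi * (eps * eps + z * z))

for z in (-5.0, 0.0, 5.0):
    print(z, heaviside_eps(z), dirac_eps(z))
```

A smaller `eps` makes the transition around the zero level set sharper, while a larger value spreads the influence of the contour over more pixels.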
Let $S$ be the saliency information used to determine the most distinct objects or areas of an image, such as edges, colors, and texture. $C_1$ and $C_2$ are the scalar approximations of the mean intensities of $\Omega_{in}$ and $\Omega_{ex}$, respectively [34,35].
$\bar{I}$ is the mean pixel value of $I$, and $I_k$ is the image blurred by the Gaussian filter. $S_1$ and $S_2$ are the saliency means of $\Omega_{in}$ and $\Omega_{ex}$. With the external energy alone, the segmentation may be inaccurate and irregular. Hence, the internal energy must be determined to perform accurate segmentation.
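The quantities named above can be illustrated with a small sketch. It assumes that the saliency of a pixel is the absolute difference between the image mean and the Gaussian-blurred image (a frequency-tuned style of saliency suggested by the definitions of the mean pixel value and the blurred image), that a 3 × 3 kernel stands in for the Gaussian filter, and that positive level-set values mark the inside region. None of these choices is confirmed by the text, and all function names are hypothetical.

```python
def blur3x3(img):
    """Blur with a 3x3 Gaussian-like kernel, replicating the border pixels."""
    kernel = [[1, 2, 1], [2, 4, 2], [1, 2, 1]]
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            acc = 0.0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr = min(max(r + dr, 0), h - 1)   # clamp to the image border
                    cc = min(max(c + dc, 0), w - 1)
                    acc += kernel[dr + 1][dc + 1] * img[rr][cc]
            out[r][c] = acc / 16.0
    return out

def saliency_map(img):
    """Assumed saliency: |mean(I) - blurred I| at every pixel."""
    h, w = len(img), len(img[0])
    mean = sum(map(sum, img)) / (h * w)
    blurred = blur3x3(img)
    return [[abs(mean - blurred[r][c]) for c in range(w)] for r in range(h)]

def region_means(values, phi):
    """Mean of `values` inside (phi > 0) and outside (phi <= 0) the contour."""
    inside = [v for vrow, prow in zip(values, phi) for v, p in zip(vrow, prow) if p > 0]
    outside = [v for vrow, prow in zip(values, phi) for v, p in zip(vrow, prow) if p <= 0]
    mean = lambda xs: sum(xs) / len(xs) if xs else 0.0
    return mean(inside), mean(outside)

# Toy 4x4 image: a bright 2x2 block in the centre, with phi positive on that block.
image = [[10.0 if r in (1, 2) and c in (1, 2) else 0.0 for c in range(4)] for r in range(4)]
phi = [[1.0 if r in (1, 2) and c in (1, 2) else -1.0 for c in range(4)] for r in range(4)]
c1, c2 = region_means(image, phi)                 # mean intensities C1, C2
s1, s2 = region_means(saliency_map(image), phi)   # saliency means S1, S2
print(c1, c2, s1, s2)
```

In this toy case the bright block has both a higher mean intensity and a higher mean saliency than the background, which is the kind of contrast the external energy relies on.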
The internal energy combines a length term and an area term, with constants $\mu$ and $\nu$. The weighted length term $l(\varphi)$ of the contour deals with the object boundary based on edge information.
The area of the contour is also included in order to evaluate the RoI. In these terms, $\delta_{\varepsilon}$ is a smoothed Dirac delta function, and $\eta(I)$ is used to modulate the length term.
To minimize the energy in (11) with respect to $\varphi$, its derivative is taken; in the resulting expression, div(·) denotes the divergence operator and $\nabla$ the gradient operator, so that div($\nabla\varphi$) is the Laplacian of $\varphi$. Applying the steepest gradient descent algorithm toward the condition $\partial E^{s} / \partial \varphi = 0$ gives the evolution of $\varphi$ with time $t$ [36]. The proposed level-set function is initialized with a constant initial level-set parameter $p \geq 0$. The evolution of $\varphi$ is stopped using a threshold $\gamma$, after which $\varphi$ no longer evolves; $\Delta t$ is the time-step parameter.
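The overall loop (initialize the level set with a constant p, take gradient-descent steps of size Δt, stop when the update falls below a threshold γ) can be illustrated with a deliberately simplified sketch. It substitutes the plain Chan-Vese region force for the saliency-driven energy of the proposed method and omits the length and area terms, so it is not the authors' evolution equation. The sign convention for the initial level set, the parameter values, and the toy image are all assumptions.

```python
import math

def dirac(z, eps=1.0):
    """Smoothed Dirac delta (derivative of the arctan-based Heaviside)."""
    return eps / (math.pi * (eps * eps + z * z))

def mean(values):
    return sum(values) / len(values) if values else 0.0

def evolve(img, p=1.0, dt=0.5, gamma=0.01, max_iter=2000):
    """Evolve phi with a Chan-Vese region force until the largest update is below gamma."""
    h, w = len(img), len(img[0])
    # Initial level set: +p inside a central box, -p outside (assumed convention).
    phi = [[p if h // 4 <= r < 3 * h // 4 and w // 4 <= c < 3 * w // 4 else -p
            for c in range(w)] for r in range(h)]
    iteration = 0
    for iteration in range(1, max_iter + 1):
        c1 = mean([img[r][c] for r in range(h) for c in range(w) if phi[r][c] > 0])
        c2 = mean([img[r][c] for r in range(h) for c in range(w) if phi[r][c] <= 0])
        largest_step = 0.0
        for r in range(h):
            for c in range(w):
                # Region force: positive when the pixel is closer to the inside mean.
                force = -(img[r][c] - c1) ** 2 + (img[r][c] - c2) ** 2
                step = dt * dirac(phi[r][c]) * force
                phi[r][c] += step
                largest_step = max(largest_step, abs(step))
        if largest_step < gamma:        # stopping rule with threshold gamma
            break
    return phi, iteration

# Toy image: a bright 4x4 square that only partly overlaps the initial box.
image = [[1.0 if 1 <= r <= 4 and 1 <= c <= 4 else 0.0 for c in range(8)] for r in range(8)]
phi, iterations = evolve(image)
mask = [[1 if value > 0 else 0 for value in row] for row in phi]
for row in mask:
    print(row)
```

Even though the initial box is misplaced, the contour moves until the positive region of the level set coincides with the bright square, after which the updates shrink and the threshold test ends the loop.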

Dataset Descriptions
The dataset contains CT images of normal, pneumonia, and COVID-19 cases. It was acquired by the China National Center for Bioinformation. The size of the input images is 512 × 512. The images were downloaded from the Kaggle repository. Fig. 2 shows sample images of normal, pneumonia, and COVID-19 cases.

Simulation Results and Discussions
The CNN is trained from scratch on CT images to classify normal, pneumonia, and COVID-19 cases. The proposed framework achieves an accuracy of 99.59% with 80% of the data for training and 20% for testing. It achieves 99.17% with 70% of the data for training and 30% for testing, and 98.34% with 60% of the data for training and 40% for testing. The training accuracy is the percentage of training images that are assigned the correct label. The loss function is the cross-entropy loss, which is plotted versus iterations. After the classification process, the COVID-19 and pneumonia images are segmented by the saliency-based region detection and image segmentation algorithm to detect the infected region accurately. The resulting segmented images help specialists to diagnose the disease and select the appropriate treatment.
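The cross-entropy loss mentioned above can be written compactly for a soft-max output. The sketch below uses the standard definition for a single labelled image and its average over a batch; the probability vectors and labels are hypothetical values, not results from the experiments.

```python
import math

def cross_entropy(probs, label):
    """Cross-entropy for one sample: negative log-probability of the true class."""
    return -math.log(max(probs[label], 1e-12))   # clamp to avoid log(0)

def mean_cross_entropy(batch_probs, labels):
    """Average cross-entropy over a batch of soft-max outputs."""
    losses = [cross_entropy(p, y) for p, y in zip(batch_probs, labels)]
    return sum(losses) / len(losses)

# Hypothetical soft-max outputs for three images (classes: normal, pneumonia, COVID-19).
batch = [[0.90, 0.05, 0.05],
         [0.10, 0.80, 0.10],
         [0.30, 0.30, 0.40]]
labels = [0, 1, 2]
loss = mean_cross_entropy(batch, labels)
print(loss)
```

Confident correct predictions contribute a loss close to zero, whereas uncertain or wrong predictions are penalized heavily, which is why the loss curve falls as training accuracy rises.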
The performance of the suggested framework is assessed in terms of accuracy, loss, sensitivity, specificity, F1 score, precision, MCC (Matthews Correlation Coefficient), and NPV (Negative Predictive Value). Accuracy and loss are the most important of these metrics, and the accuracy and loss curves are plotted for both the training and validation data. TP (True Positive), TN (True Negative), FP (False Positive), and FN (False Negative) counts are used to compute the metrics [37,38]. The prediction ratio per class is illustrated in the confusion matrix of pneumonia, normal, and COVID-19 cases. The prediction ratios for all classes are very encouraging. An important consideration is the balanced distribution of images in the dataset, which enhances the prediction results. Figs. 3 and 4 show the training and validation accuracy and the cross-entropy loss of the proposed framework for classification with 80% of the data for training and 20% for testing. Figs. 5 and 6 show the same curves with 70% for training and 30% for testing, and Figs. 7 and 8 show them with 60% for training and 40% for testing. Fig. 9 illustrates the definitions of TP, FP, TN, and FN for persons infected with COVID-19. Fig. 10 shows the confusion matrix of the proposed framework with the classification CNN. Fig. 11 shows the evaluation results of the CNN model. Tabs. 1, 2, and 3 give the evaluation metric values of the proposed framework with 80% for training and 20% for testing, 70% for training and 30% for testing, and 60% for training and 40% for testing, respectively.
The proposed CNN model was implemented for different training and testing sets. The proposed framework achieves high accuracy for all sets. Different evaluation metrics are used to assess its performance, and it achieves good performance on all of them.
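The evaluation metrics listed above all follow from the four confusion-matrix counts. The sketch below uses the standard definitions for one class treated as positive against the rest; the counts in the example are hypothetical and are not taken from the reported tables. All denominators are assumed to be nonzero.

```python
import math

def binary_metrics(tp, tn, fp, fn):
    """Standard metrics from confusion-matrix counts (denominators assumed nonzero)."""
    sensitivity = tp / (tp + fn)                  # recall / true positive rate
    specificity = tn / (tn + fp)                  # true negative rate
    precision = tp / (tp + fp)                    # positive predictive value
    npv = tn / (tn + fn)                          # negative predictive value
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {"accuracy": accuracy, "sensitivity": sensitivity,
            "specificity": specificity, "precision": precision,
            "npv": npv, "f1": f1, "mcc": mcc}

# Hypothetical counts for one class versus the rest.
metrics = binary_metrics(tp=90, tn=95, fp=5, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

For a three-class problem such as this one, the counts are typically obtained per class from the confusion matrix by treating that class as positive and the other two as negative.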
The results in Fig. 10 illustrate the confusion matrices of the proposed framework for the different training and testing sets. The confusion matrix is used to evaluate the performance through the false positive rate and the true negative rate. The results in Fig. 11 illustrate the different CNN evaluation metrics for all training and testing sets. The proposed framework achieves good results for the various training and testing sets. The classification process is performed with a CNN, and it achieves accuracies of 99.59%, 99.17%, and 98.34% with 80% for training and 20% for testing, 70% for training and 30% for testing, and 60% for training and 40% for testing, respectively.
Fig. 12 shows samples of the resulting segmented COVID-19 CT images, and Fig. 13 shows samples of the resulting segmented pneumonia CT images. The visual results reveal a difference between COVID-19 and pneumonia CT images. Tab. 4 gives the segmentation metric values for samples of the segmented COVID-19 CT images, and Tab. 5 gives them for samples of the segmented pneumonia CT images. Tab. 6 compares the proposed framework for COVID-19 and pneumonia image segmentation with other models, and Tab. 7 compares the proposed framework for classification with other models. In both comparisons, the proposed framework is more accurate than the other models.

Conclusions and Future Work
This paper presented an efficient framework for the classification and segmentation of COVID-19, normal, and pneumonia CT images. The classification process is based on a CNN composed of four Conv layers, four max-pooling layers, and a classification layer. A novel saliency-based region detection algorithm and an active contour segmentation strategy are applied for the segmentation of COVID-19 and pneumonia CT images. Simulation results showed that the classification accuracy achieved by the CNN on CT images reaches 99.59%. The outcomes of the suggested framework are better than those of the other conventional models. In the future, advanced deep learning and transfer learning algorithms can be incorporated into the classification and segmentation processes on large datasets of COVID-19 X-ray and CT images to achieve a more efficient automated diagnosis process.