Computer Vision and Deep Learning-enabled Weed Detection Model for Precision Agriculture

At present, precision agriculture tasks such as plant disease detection, crop yield prediction, species recognition, weed detection, and irrigation can be accomplished with computer vision (CV) approaches. Weeds strongly influence crop productivity, and full-coverage chemical herbicide spraying increases waste and pollutes the natural farmland environment. Since the proper identification of weeds among crops helps to reduce herbicide usage and improve productivity, this study presents a novel computer vision and deep learning based weed detection and classification (CVDL-WDC) model for precision agriculture. The proposed CVDL-WDC technique aims to properly discriminate between plants and weeds. It involves two processes, namely multiscale Faster RCNN based object detection and optimal extreme learning machine (ELM) based weed classification. The parameters of the ELM model are optimally adjusted using the farmland fertility optimization (FFO) algorithm. A comprehensive simulation analysis of the CVDL-WDC technique on a benchmark dataset reported enhanced outcomes over recent approaches in terms of several measures.

Significant research has presented accurate variable spraying methods to avoid the herbicide residue and waste created by conventional full-coverage spraying [3]. To accomplish accurate variable spraying, the main problem to be resolved is how to achieve real-time, accurate recognition and classification of weeds and crops [4]. In traditional agricultural settings, herbicide is applied at a uniform rate to the entire field even though the distribution of weeds is patchy. This causes environmental pollution and leads to higher input costs for farmers [5]. In contrast, precision agriculture (PA) proposes need-based, site-specific application [6].
Adapting PA practice to herbicide use needs precise weed mapping that categorizes host plants and weeds [7]. Classifying plants is a tedious process due to the spectral resemblance among distinct kinds of plants. The current mapping method assumes that the host plant is planted in rows. A line detection approach categorizes plants in a row as host plants and plants that fall outside the seeding line as weeds. This interline method fails to identify weeds positioned inside crop lines and host plants that fall outside the crop line [8]. Methods that realize weed recognition through CV techniques primarily consist of deep learning (DL) and conventional image processing. When weed recognition is implemented with conventional image processing, features of the image such as color, shape, and texture must be extracted and combined with conventional machine learning (ML) approaches like Support Vector Machine (SVM) or random forest (RF) [9]. This method requires hand-designed features and depends heavily on the image acquisition method, the pre-processing method, and the quality of feature extraction. With the development of computing power and the growth in data dimensions, DL approaches can extract multi-scale and multi-dimensional spatial semantic features of weeds using a Convolutional Neural Network (CNN), thanks to its improved data expression capability, and thereby avoid the drawbacks of conventional extraction methods. Hence, they have received growing interest from researchers [10].
Lottes et al. presented a novel crop-weed classification model based on fully convolutional networks (FCN) with an encoder-decoder structure that incorporates spatial information by considering image sequences [11]. Exploiting the crop arrangement information that is observable in the image sequences allows this technique to robustly estimate a pixel-wise labeling of images into crop and weed, that is, a semantic segmentation. RGB color images of seedling rice are taken from paddy fields, and ground truth (GT) images are obtained by manually labeling the pixels of the RGB images with 3 distinct classes: weed, rice seedling, and background [12]. A class weight coefficient is computed to address the imbalance among the classes. The GT as well as RGB [...] A software tool, Pynovisão, is established that uses the SLIC superpixel segmentation technique to build a robust image dataset and classifies images using a model trained with the Caffe software [13]. To compare against the outcomes of ConvNets, SVMs, AdaBoost, and RF are used in conjunction with a group of shape, color, and texture feature extraction approaches. A novel method is developed that combines different shape features to establish a pattern for each variety of plant. To enable the vision system to detect weeds based on their patterns, SVM and ANN are utilized [14]. Four species of common weeds from sugar beet fields are considered. The shape feature set contains Fourier descriptors and moment invariant features. Pereira et al. [15] presented the automatic identification of species using supervised pattern recognition techniques and shape descriptors, with the aim of composing a near-future expert system for the automatic application of the correct herbicide. Experiments with several recent approaches demonstrated the robustness of the pattern recognition approaches used.

Paper Contributions
The major contributions of the study are as follows. This study presents a novel computer vision and deep learning based weed detection and classification (CVDL-WDC) model for precision agriculture. The proposed CVDL-WDC technique aims to properly discriminate between plants and weeds. It involves two processes, namely multiscale Faster RCNN based object detection and optimal extreme learning machine (ELM) based weed classification. The parameters of the ELM model are optimally adjusted using the farmland fertility optimization (FFO) algorithm. A comprehensive simulation analysis of the CVDL-WDC technique is carried out on a benchmark dataset, and the results are examined under diverse dimensions.

Paper Organization
The rest of the study is organized as follows. Section 2 elaborates the working of the CVDL-WDC technique, and the experimental results are offered in Section 3. Lastly, Section 4 concludes the study.

The Proposed Model
This study has developed a new CVDL-WDC technique for the proper discrimination of plants and weeds in precision agriculture. The proposed CVDL-WDC technique encompasses a series of subprocesses, namely WF based pre-processing, multiscale Faster RCNN based object detection, ELM based classification, and FFO based parameter optimization. The proposed model properly identifies the weeds among crops, which reduces the usage of herbicide and improves productivity. Fig. 1 illustrates the overall process of the CVDL-WDC technique.

Pre-processing Using WF Technique
At the initial stage, the WF approach is utilized to remove the noise that exists in the image. Noise removal is an image pre-processing approach intended to restore the features of an image corrupted by noise [16]. A particular instance is adaptive filtering, whereby the denoising is performed according to the noise content existing locally in the image. Let the corrupted image be denoted $\hat{I}(x, y)$, the noise variance over the whole image $\sigma_y^2$, the local variance within a pixel window $\hat{\sigma}_y^2$, and the local mean $\hat{\mu}_L$. Three cases arise for the filter output $\hat{I}$. When the noise variance over the image is equal to zero, the input is returned unchanged: $\sigma_y^2 = 0 \Rightarrow \hat{I} = \hat{I}(x, y)$. When the global noise variance is small and the local variance is large compared to the global variance, the ratio that weights the filter is nearly equal to one: when $\hat{\sigma}_y^2 \gg \sigma_y^2$, then $\hat{I} = \hat{I}(x, y)$. A high local variance indicates the presence of an edge in the image window, which is therefore preserved. When the global and local variances are approximately equal, $\hat{\sigma}_y^2 \approx \sigma_y^2$, the output is $\hat{I} = \hat{\mu}_L$, which is the average intensity in a uniform region.
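The three cases above can be illustrated with a short sketch. It assumes that WF refers to an adaptive Wiener-type filter of the form output = local mean + gain × (pixel − local mean), with gain = (local variance − noise variance) / local variance. This closed form, the 3×3 window, the function name `adaptive_wiener`, and the fallback noise estimate are assumptions made for illustration and are not stated in the text.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def adaptive_wiener(image, window=3, noise_var=None):
    """Adaptive local-statistics filter (illustrative sketch, odd window size)."""
    img = np.asarray(image, dtype=float)
    pad = window // 2
    # Reflect-pad so that every pixel has a full window around it.
    padded = np.pad(img, pad, mode="reflect")
    patches = sliding_window_view(padded, (window, window))
    local_mean = patches.mean(axis=(-2, -1))
    local_var = patches.var(axis=(-2, -1))

    if noise_var is None:
        # Assumption: estimate the global noise variance as the mean local variance.
        noise_var = local_var.mean()

    # gain ~ 1 where local variance >> noise variance (edges are kept),
    # gain ~ 0 where local variance ~ noise variance (output is the local mean).
    safe_var = np.maximum(local_var, 1e-12)
    gain = np.clip((local_var - noise_var) / safe_var, 0.0, 1.0)
    return local_mean + gain * (img - local_mean)


# Hypothetical usage on a synthetic noisy step edge.
rng = np.random.default_rng(0)
clean = np.zeros((8, 8))
clean[:, 4:] = 10.0
noisy = clean + rng.normal(0.0, 0.5, clean.shape)
denoised = adaptive_wiener(noisy, window=3, noise_var=0.25)
print(denoised.shape)
```

With `noise_var=0` the function returns the input unchanged, and with a very large `noise_var` it returns the local mean, which mirrors the first and third cases described above.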

Multiscale Faster RCNN Based Object Detection
During the object detection process, the multiscale Faster RCNN model is applied to identify weeds as well as plants. The objects to be recognized are often small in size and low in resolution. Earlier models (e.g., Fast RCNN) have good recognition performance for larger objects but cannot efficiently identify smaller objects in an image [17]. The primary reason is that these models depend on a DNN in which the image is repeatedly convolved and downsampled to obtain high-level, more abstract features. Every downsampling halves the size of the image. When the object is comparable in size to the objects in PASCAL VOC, detailed object features are obtained through this downsampling and convolution. However, when the object to be recognized is on a smaller scale, the final feature map may be left with only 1-2 pixels for it after several downsamplings. Consequently, so few features cannot completely describe the object, and current recognition techniques cannot efficiently identify small target objects. The deeper the convolutional layer, the more abstract its features, which represent the higher-level characteristics of the object, whereas shallow convolutional layers extract lower-level features. For smaller objects, however, the lower-level features preserve effective object detail. To obtain higher-level, abstract object features while ensuring that there are sufficient pixels to describe smaller objects, we combine features of distinct scales to retain the local detail of the object while also attending to its global features, building on the Fast RCNN.
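The effect of repeated halving can be checked with a few lines of arithmetic. The sketch below is illustrative only: the object sizes (256 and 32 pixels) and the number of downsampling stages are hypothetical values, not figures from the study.

```python
def feature_map_extent(object_px, num_downsamples, factor=2):
    """Approximate side length (in feature-map cells) that an object still occupies."""
    extent = float(object_px)
    for _ in range(num_downsamples):
        extent /= factor  # each downsampling stage halves the spatial size
    return extent


# Hypothetical object sizes: one large object and one small object.
for size in (256, 32):
    extents = [feature_map_extent(size, k) for k in range(1, 5)]
    print(f"{size:3d} px object after 1..4 halvings: {extents}")
```

Under these assumed values, a 32-pixel object is reduced to a 2-cell extent after four halvings, while a 256-pixel object still spans 16 cells, which is why shallow, higher-resolution feature maps matter for small targets.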
The multiscale Faster RCNN method is separated into four parts. The first is the feature extraction part, which contains 2 pooling layers, 3 RoI pooling layers, 5 convolutional layers, and 5 ReLU layers. The outputs of the 3rd, 4th, and 5th convolutional layers are each normalized and then passed to the RPN layer and to the feature combination layer, for the generation of proposal regions (PR) and the extraction of multiscale features, respectively. The second part is the feature combination layer, which integrates the features of distinct scales from the 3rd, 4th, and 5th layers into a 1D feature through a concatenation process. The third part is the RPN layer, which is mainly responsible for the generation of PR. The final part, made up of BBox and softmax layers, performs bounding box regression and classification of the objects in the PR. To obtain the combined feature vector, the feature vectors of distinct scales need to be normalized. Generally, the deep convolutional layers output small-scale features, whereas the shallow convolutional layers output large-scale features. If features of these distinct scales were integrated directly, the large-scale features would carry far more weight than the small-scale features when the network weights are tuned, resulting in low recognition performance. To prevent the large-scale features from overwhelming the small-scale features, the feature tensors output by the distinct RoI pooling layers must be normalized before they are concatenated. In this study, we employ L2 normalization. The normalization, which is applied to each pooled feature vector, is positioned after RoI pooling. After normalization, the feature vectors of the 3rd, 4th, and 5th layers are brought to a unified scale.
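The L2 normalization is $\hat{X} = X / \lVert X \rVert_2$ with $\lVert X \rVert_2 = \left( \sum_{i=1}^{d} |x_i|^2 \right)^{1/2}$, where $X$ represents the original vector from the 3rd, 4th, and 5th layers, $\hat{X}$ denotes the normalized feature vector, and $D$ indicates the number of channels of each RoI pooling output. The vector $\hat{X}$ is then scaled uniformly by a scale factor $\gamma$, that is, $Y = \gamma \hat{X}$, where $Y = [y_1, y_2, \ldots, y_d]^T$. During error backpropagation (BP), the gradients with respect to both the scale factor $\gamma$ and the input vector $X$ need to be computed.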
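The normalization and concatenation step can be sketched as follows. This is a minimal illustration under stated assumptions: each RoI-pooled tensor is taken to have shape (channels, height, width), the L2 norm is computed across channels at every spatial position, a single scale factor (named `gamma` in the code) is shared by all layers, and the function names and tensor shapes are hypothetical.

```python
import numpy as np


def l2_normalize_channels(roi_feat, gamma=1.0, eps=1e-12):
    """Scale every channel vector of a (D, H, W) tensor to L2 norm `gamma`."""
    roi_feat = np.asarray(roi_feat, dtype=float)
    norm = np.sqrt(np.sum(roi_feat ** 2, axis=0, keepdims=True))
    return gamma * roi_feat / np.maximum(norm, eps)


def fuse_multiscale(roi_feats, gamma=1.0):
    """Normalize each pooled tensor, then concatenate them into one 1D vector."""
    normalized = [l2_normalize_channels(f, gamma) for f in roi_feats]
    return np.concatenate([f.ravel() for f in normalized])


# Hypothetical pooled features from three layers with very different magnitudes.
rng = np.random.default_rng(0)
pooled = [rng.normal(size=(4, 2, 2)) * s for s in (100.0, 10.0, 1.0)]

print("raw mean magnitudes:", [float(np.abs(f).mean()) for f in pooled])
fused = fuse_multiscale(pooled, gamma=1.0)
print("fused vector length:", fused.size)
```

Without the normalization, the first tensor in this toy example would dominate the concatenated vector simply because of its larger magnitude; after normalization all three contribute on the same scale.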

Optimal ELM Based Weed Classification
During weed classification, the extracted features are passed to the ELM, which assigns them to distinct classes. Huang et al. presented the ELM to improve network training speed and afterwards extended the concept of ELM from neuron hidden nodes to other kinds of hidden nodes [18]. The training instances are denoted $\{(x_i, t_i)\}_{i=1}^{n}$, where $n$ is the number of training instances, $x_i$ is the $m$-dimensional input of the $i$th sample, and $t_i$ is the target output of the $i$th sample. For an input vector $x$, the output of a single hidden layer feedforward network (SLFN) with $L$ hidden nodes is expressed as
$f(x) = \sum_{j=1}^{L} \beta_j h_j(x) = h(x)^T \beta$,
where $h(x) = [h_1(x) \cdots h_L(x)]^T$ is the hidden layer output and $\beta = [\beta_1 \cdots \beta_L]^T$ is the output weight. Assuming that the outputs of the $n$ training instances are approximated with zero error, the compact form is
$H\beta = t$,
where $H = [h(x_1) \cdots h(x_n)]^T$ is termed the hidden layer output matrix and $t = [t_1 \cdots t_n]^T$ collects the targets. Solving for the output weight $\beta$ involves only a simple linear system, and the solution corresponds to minimizing the training error, $\min_{\beta} \lVert H\beta - t \rVert$. The optimal estimate of the output weights is given by the Moore-Penrose generalized inverse $H^{\dagger}$ as
$\hat{\beta} = H^{\dagger} t$.
Usually, orthogonal projection is used to compute the generalized inverse $H^{\dagger}$: when $H^T H$ is nonsingular, $H^{\dagger} = (H^T H)^{-1} H^T$, and when $H H^T$ is nonsingular, $H^{\dagger} = H^T (H H^T)^{-1}$.
In order to boost the classification efficiency of the ELM model, the FFO algorithm is applied to it. A metaheuristic is a model-free method for solving different kinds of optimization problems and has recently been exploited in a wide range of applications. The FFO approach involves six major parts, which are described below [19]:
1. Initialization: here, the possible solutions and the number of sections ($n$) of the farmland are determined. The population size ($N$) is modeled as $N = k \times n$, where $k$ represents a positive integer within $[1, N]$ and $n$ is an integer value.
The value of $k$ is chosen as two, determined by trial and error. To generate the initial individuals within the feasible range, the following formula is adopted: $X_{ij} = L_j + \delta \times (U_j - L_j)$, where $L_j$ and $U_j$ represent the lower and upper bounds of variable $j$, and $\delta$ indicates a random value within $[0, 1]$. The farmland is separated into three local-memory subsections ($A$, $B$, and $C$) and a global memory, whereby the lowest-quality soil is located in section $A$.
2. Evaluate the quality of soil in each section of the farmland: this phase assigns the farmland decision variables to their sections and calculates the cost function value of each decision variable. The soil quality of every section is then obtained from these cost values.
3. Update the memory: here, the local and global memories are updated. The best solutions of each section are saved in the local memory, and the best solutions among all sections are kept in the global memory. The numbers of solutions stored in the local and global memories, $M_{local}$ and $M_{Global}$, are determined using a factor $t \in [0.1, 1]$.
4. Soil quality difference for all the sections: the quality of each section is determined and the best solutions are stored in the local memory, while the overall best solutions are saved in the global memory. To improve the worst section, its solutions are updated by combining them with solutions from the global memory. The variables of such a solution are updated using $X_{MGlobal}$, a solution picked at random from the global memory, $X_{ij}$, the worst-section solution chosen for updating, and $h$, a decimal value defined as follows:
$h = \alpha \times r_1$ (17)
in which $\alpha$ denotes a constant within $[0, 1]$ and $r_1$ indicates a random number within $[-1, 1]$. Likewise, $r_2$ indicates a random number within $[0, 1]$, and $\beta$ defines a constant within $[0, 1]$ that is set at the start of farmland fertility.
5. The composition of soil: after determining the optimal local solution ($L_{best}$), the farmers choose the best soil combination for the farmland. In addition, the optimal global solution ($G_{best}$) is used when combining the farmland soil to improve its quality. Here, $Q$ is a constant within $[0, 1]$ that governs the integration of a solution with the optimal global solution ($Best_{Global}$), $r_3$ is a random number within $[0, 1]$, and $\omega$ indicates a parameter of farmland fertility.
6. Final condition: the candidate solutions in the search space are evaluated. Once the termination condition is reached, the procedure stops; otherwise, it is repeated until the optimal solution is attained.
The FFO algorithm derives a fitness function to attain improved classification performance. It assigns a positive value that represents the quality of each candidate solution. In this study, the minimization of the classification error rate is considered as the fitness function, as given in Eq. (22). The optimal solution has the minimal error rate, and worse solutions have higher error rates.
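The closed-form ELM training and the error-rate fitness can be sketched together as follows. This is not a full FFO implementation: only the initialization of candidates between lower and upper bounds is shown, and steps 2 to 6 are omitted. Several details are assumptions made for illustration: the candidate vector is taken to encode the ELM input weights and hidden biases, the hidden activation is a sigmoid, the population size is k × n, the fitness is the training error rate in percent, and the toy data, bounds, and all function names are hypothetical.

```python
import numpy as np


def hidden_output(X, W, b):
    """Sigmoid hidden-layer output matrix H (n samples x L hidden nodes)."""
    return 1.0 / (1.0 + np.exp(-(X @ W + b)))


def elm_fit(X, T, W, b):
    """Output weights in closed form: beta = pinv(H) @ T."""
    return np.linalg.pinv(hidden_output(X, W, b)) @ T


def elm_predict(X, W, b, beta):
    """Predicted class index for every row of X."""
    return np.argmax(hidden_output(X, W, b) @ beta, axis=1)


def error_rate_fitness(candidate, X, y, n_hidden, n_classes):
    """Fitness of one candidate: classification error rate in percent."""
    m = X.shape[1]
    # Assumption: the candidate holds the input weights followed by the biases.
    W = candidate[: m * n_hidden].reshape(m, n_hidden)
    b = candidate[m * n_hidden:]
    T = np.eye(n_classes)[y]  # one-hot targets
    beta = elm_fit(X, T, W, b)
    wrong = np.sum(elm_predict(X, W, b, beta) != y)
    return 100.0 * wrong / len(y)


def init_population(n_solutions, lower, upper, rng):
    """Candidates drawn as L_j + delta * (U_j - L_j) with delta in [0, 1]."""
    delta = rng.random((n_solutions, lower.size))
    return lower + delta * (upper - lower)


rng = np.random.default_rng(0)
# Toy two-class data (illustrative only): two well-separated Gaussian blobs.
X = np.vstack([rng.normal(-2.0, 0.5, (30, 2)), rng.normal(2.0, 0.5, (30, 2))])
y = np.array([0] * 30 + [1] * 30)

n_hidden, n_classes = 8, 2
dim = X.shape[1] * n_hidden + n_hidden
lower, upper = -np.ones(dim), np.ones(dim)

k, n_sections = 2, 3  # assumed population size N = k * n
population = init_population(k * n_sections, lower, upper, rng)
fitness = np.array(
    [error_rate_fitness(c, X, y, n_hidden, n_classes) for c in population]
)
best = population[np.argmin(fitness)]
print("error rates (%):", fitness)
```

An optimizer such as FFO would repeatedly update `population` and re-evaluate `error_rate_fitness`, keeping the candidate with the lowest error rate. In practice the fitness would more sensibly be computed on held-out data rather than the training set used here.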

Results and Discussion
This section examines the weed detection and classification results of the CVDL-WDC technique on the benchmark dataset [20]. Fig. 2 shows sample images containing healthy plants as well as weeds. Fig. 3 illustrates the visualization results of the CVDL-WDC technique. The figure shows that the CVDL-WDC technique has effectively recognized and classified weeds among other plants.
Tab. 1 and Fig. 4 [...] Fig. 11 [21]. The results show that the HOG-SVM and GW-GFD techniques required the highest CT of 235 s and 205 s, respectively. The GLCM, FCN-PF, and RF techniques needed a somewhat lower CT of 185 s, 137 s, and 156 s, respectively. The FCN-RCWD technique accomplished a considerably lower CT of 78 s. However, the proposed CVDL-WDC technique attained the minimal CT of 43 s. From the results and discussion, it is evident that the CVDL-WDC technique achieves effective weed detection and classification performance.
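The CT values quoted above can be put on a common footing by expressing each one relative to the proposed technique. The sketch below only re-uses the seven figures stated in the text; the helper name `relative_reduction` is hypothetical.

```python
# CT values in seconds, as reported in the text above.
ct_seconds = {
    "HOG-SVM": 235,
    "GW-GFD": 205,
    "GLCM": 185,
    "RF": 156,
    "FCN-PF": 137,
    "FCN-RCWD": 78,
    "CVDL-WDC": 43,
}


def relative_reduction(baseline, proposed):
    """Percentage by which `proposed` is lower than `baseline`."""
    return 100.0 * (baseline - proposed) / baseline


proposed_ct = ct_seconds["CVDL-WDC"]
for name, ct in sorted(ct_seconds.items(), key=lambda kv: kv[1], reverse=True):
    if name != "CVDL-WDC":
        saving = relative_reduction(ct, proposed_ct)
        print(f"{name:9s} {ct:4d} s  CVDL-WDC is {saving:.1f}% lower")
```

For example, against the next-best FCN-RCWD technique (78 s), the 43 s of CVDL-WDC corresponds to a reduction of roughly 45%.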