Web Page Recommendation Using Distributional Recurrent Neural Network

In the retrieval process of a data recommendation system, matching prediction and similarity identification play a major role in the ontology. Several methods exist to improve retrieval accuracy and to reduce the searching time. However, searching for the best match for a given query in a data recommendation system becomes complex and often fails to deliver accurate query recommendations. To improve the performance of data validation, this paper proposes a novel data similarity estimation and clustering method that retrieves the relevant data with the best matching in big data processing. An advanced Logarithmic Directionality Texture Pattern (LDTP) method is combined with a Metaheuristic Pattern Searching (MPS) system to estimate the similarity between the query data and the entire database. The overall work is implemented for the data recommendation application. The data are indexed and grouped into clusters to form a paged database structure, which reduces the computation time during searching. In addition, a neural network predicts the relevancy of the feature attributes in the database, and the matching index is sorted to provide the recommended data for a given query. This is achieved by using the Distributional Recurrent Neural Network (DRNN), an enhanced neural network model that finds relevancy based on the correlation factor of the feature set. The DRNN classifier is trained by estimating the correlation factor of the dataset attributes, which are formed into clusters and paged with proper indexing based on the MPS similarity metric. The overall performance of the proposed work is evaluated by varying the size of the training database (60%, 70%, and 80%). The parameters considered for performance analysis are precision, recall, F1-score, the accuracy of data retrieval, the query recommendation output, and a comparison with other state-of-the-art methods.


Introduction
Cloud computing is a prominent technology in recent days, providing highly scalable services to web pages. It enables space on physical machines to be rented out to customers with increased profit. The cloud computing environment is classified into homogeneous and heterogeneous clouds. In a homogeneous cloud, the entire service is offered by a single vendor, whereas in a heterogeneous cloud, the service contains components integrated from various vendors. In the query recommendation process, the data recommendation concept mainly focuses on presenting a sorted list of subject and course information by referring to the database. Big data in data recommendation focuses on analyzing data clusters with labeled properties that can be characterized by ratings, probability of visit, and other parameters. A web service is a platform-independent component that is mainly used to support machine-to-machine communication in a network [1]. Accuracy is one of the major considerations for web pages when selecting their required services, and it describes the non-functional characteristics of the web services. Normally, the Prediction [2] is defined as the set of characteristics of data availability, reputation, and throughput. The processes of service selection and recommendation have been the main enablers of service composition in recent years [3]. Traditional works developed similarity-based query recommendation systems. However, they do not satisfy the web page requirements merely by providing the most similar services [4]. Moreover, they utilized some encryption and clustering mechanisms during service storage and retrieval. The existing clustering techniques such as K-means [5], K-Medoids [6], and Fuzzy C-Means (FCM) [7,8] are not highly efficient for query recommendation, because they are highly sensitive, require a predefined number of clusters, and have a large searching space.
The major reasons for using encryption [9] techniques are to ensure the confidentiality, privacy, integrity, access control, and authentication of the data. Existing encryption techniques such as Elliptic Curve Cryptography (ECC), Blowfish, Advanced Encryption Standard (AES), and Rivest Shamir Adleman (RSA) are used in the traditional works, but they have the drawbacks of increased complexity and time consumption. So, these techniques are also not highly suitable for accurate query recommendation [10]. The major objectives of this paper are as follows:
To preprocess the given dataset through stop-word removal and stemming, and to extract the features of the data arranged according to attributes.
To form the clusters of the data feature set by using the Logarithmic Directionality Texture Pattern (LDTP) method for big data.
To find the similarity indexing by using the Metaheuristic Pattern Searching (MPS) system and arrange the data in a proper paging architecture.
To arrange the data by sorting feature attributes based on their similarity and form the hierarchical structure.
To improve the accuracy of query matching and the speed of the process.
To recommend the most similar items to the requesting web pages by computing the correlation factor between the data attributes with the best matching.
The rest of the sections in the paper are organized as follows: Section 2 reviews the existing frameworks and techniques that are used for query recommendation. Section 3 provides a clear description of the proposed methodology with its detailed flow representation. The experimental results of the existing and proposed mechanisms are evaluated and compared in Section 4. Finally, the paper is concluded and the enhancements that can be implemented in the future are stated in Section 5.

Related Work
In this section, the existing techniques and algorithms related to query recommendation are surveyed with their advantages and disadvantages.
Reference [11] developed a location-aware personalized collaborative filtering mechanism for improving the Prediction of the recommendation system. In this mechanism, the locations of both the web pages and the web services were leveraged for selecting the target service or web page. The main Prediction factors considered in this work were availability, response time, web page dependency, and reliability. The stages involved in this system were: web page location information; identification of similar web pages; Prediction based on the web pages; Prediction based on the services; Prediction based on web pages and services; and recommendation. Here, the similarity computation was performed by selecting similar neighbors with the weighted PCC technique. Also, the impact of sparseness was examined in this work for evaluating the accuracy of prediction. Reference [12] analyzed shortlists for supporting the web page decision process in a recommendation system. That paper stated that shortlists improved both downstream performance and web page satisfaction. They also improved the quality of recommendations by providing additional feedback. Reference [13] suggested a context-aware Prediction scheme for the web page recommendation system. Here, the mapping relationship between geographical distance and similarity value was analyzed on the web page side. This mechanism selected the most effective similarity function for attaining an exact similarity between the web pages. Moreover, the Matrix Factorization (MF) method was utilized as the basic model, which integrated the context information. The disadvantage of this work was that it required a detailed resource configuration with a time factor. Reference [14] developed a new approach by integrating collaborative filtering with the content-based filtering technique for service recommendation. In this system, the semantic content and rating data were utilized to perform the recommendation with a probabilistic generative model. The key objective of that paper was to investigate the recent state of the art in web service recommendation systems. Moreover, three main requirements, namely recommendation serendipity, recommendation of newly deployed services, and recommendation accuracy, were examined for developing an efficient recommendation system. Then, a three-way aspect model was implemented to identify the similarities of the web pages based on the semantic contents of web services. Reference [15] suggested three different recommendation approaches, namely a collaborative filtering approach, a content-based approach, and a hybrid approach, for developing an efficient service recommendation system. The major components involved in this system were: functional and non-functional evaluation; diversity web service ranking; and diversity evaluation.
Considering query recommendation for the data recommendation database, [16] proposed an ontology-based context recommendation system and mobile data recommendation applications. In this work, context types such as profile, social interactions, learning activities, and device specifications were taken from the database in the learning object ontology. OWL rules were used to filter the context in order to recommend the query input. Reference [17] presented a survey of different methodologies for recommendation systems based on the ontology of data recommendation. It states that the hybridization of algorithms with other recommendation techniques achieved a better similarity identification model based on the knowledge-based recommendation system. Later, in [18], the author proposed a course recommendation system based on a query classification approach. It estimates the relevant data for the query input by classifying the query data from the database. The ontology estimates the N-list of relevant features from the database that match the query input and displays the recommended course information. Similarly, [19] presented a review of the ontology-based data recommendation process. The recommender systems covered in that analysis use ontology and artificial intelligence, among other techniques, to provide personalized recommendations. This helps to prepare the learning libraries and the feature retrieval model to enhance the recommendation system in the data recommendation process. In [20], the author proposed a novel learning path recommendation model based on a multidimensional knowledge graph framework for the data recommendation system. The multidimensional knowledge graph method was used to separate the overall database and organize it into several classes, which enhances the learning capacity and reduces the time complexity of the classification model.
Similarly, [21] proposed a novel recommendation system for the data recommendation process using the Moodle data recommendation platform. It identifies the similarity of course information from the database and retrieves the relevant data. MoodleRec sorts the supported standard-compliant Learning Object Repositories and suggests a ranked list of Learning Objects that are similar to the query input, operating at two different levels of classification. In [22], since 3D imaging helps doctors make clear judgments in telemedicine diagnosis, a 3D medical watermarking algorithm based on the wavelet transform was proposed. In [23], a two-stage reversible robust audio watermarking algorithm was proposed to avoid medical audio data leakage in the field of telemedicine.
From the survey, it is observed that the existing approaches have both advantages and disadvantages, but they mainly suffer from the following drawbacks: they fail to recognize the Prediction variations; they require more training sets to retrieve query data, which also increases the time complexity of memory-based recommendation systems; and they offer a list of ranked services with no transparency.
To solve these problems, this paper aims to develop a new query recommendation system. In this system, query searching and relevant data identification are performed by similarity indexing and Logarithmic Directionality Texture Pattern (LDTP) based cluster estimation. The Distributional Recurrent Neural Network (DRNN) based classification algorithm improves the accuracy of the data retrieval system compared with the traditional recommendation model. A detailed description of the proposed work is given in Section 3.

Proposed Work
In this section, a detailed description of the proposed methodology is presented with its flow illustration. This paper intends to perform query recommendation efficiently. In this system, progressive data such as student education course data is given as the input, which is preprocessed by performing stop-word removal and stemming.
After obtaining the filtered (preprocessed) data, a matrix is generated for selecting the Cluster Pattern (CP) based on the normalized Logarithmic Directionality Texture Pattern (LDTP) method. Based on the CP, the clusters are formed by the Metaheuristic Pattern Searching (MPS) system, and the documents in each cluster are then used to train the classification algorithm, the Distributional Recurrent Neural Network (DRNN), as shown in Fig. 1. After that, similarity measures such as the Kolmogorov and transformation distances are computed to identify similar items. Consequently, the n attributes of the documents are stored in the cloud server. When the server receives a request from a web page, it retrieves the relevant data by searching for the best match for the query data and recommends the result by using the DRNN classification model. Based on the highest similarity value, the services are recommended to the requesting web page. Finally, the matched data related to the query input are listed as the recommended result for the data recommendation application.
The stages involved in this system are as follows:
Preprocessing
Logarithmic Directionality Texture Pattern (LDTP) based pattern generation
Clustering using Metaheuristic Pattern Searching (MPS)
Similarity estimation
Service ranking and recommendation

Preprocessing
At first, the dataset is given as the input for preprocessing, where stop-word removal and stemming are performed to obtain the filtered data. The main intention of dataset preprocessing is to reduce the data size by selecting the attributes that are related to the recommendation process. Here, unwanted characters or letters are filtered out, and the special characters that the dynamic analysis model needs are retained. The special characters are represented by their Unicode values, which reduces the memory size of the data. This improves the prediction quality and the retrieval accuracy. Irrelevant data are identified by estimating the uniqueness of each attribute value, that is, whether the value can be segmented into its related components.
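The stop-word removal and stemming steps described above can be illustrated with a minimal sketch. The stop-word list, the suffix-stripping rule, and the function names `stem` and `preprocess` are assumptions made for illustration only; the paper does not state which stop-word list or stemmer was used, and a production system would more likely use an established stemmer.

```python
import re

# Assumption: a tiny illustrative stop-word list, not the one used in the paper.
STOP_WORDS = {"a", "an", "and", "are", "for", "in", "is", "of", "on", "the", "to", "with"}

# Assumption: naive suffix stripping stands in for a real stemming algorithm.
SUFFIXES = ("ing", "ed", "es", "s")


def stem(word):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(text):
    """Lowercase, drop non-alphanumeric characters, remove stop words, stem."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [stem(token) for token in tokens if token not in STOP_WORDS]


keywords = preprocess("The students are learning courses in the database")
```

The resulting keyword list is what the later stages would treat as the filtered attributes of a document.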

Logarithmic Directionality Texture Pattern (LDTP) Based Pattern Generation
After preprocessing, a matrix is generated for the preprocessed data by using the LDTP. It is also known as the universal distance measure, which finds the distance between each pair of objects. Moreover, it simultaneously uncovers all similarities for selecting the CP. In this stage, the set of documents and their domain list is given as the input. Here, each domain contains a set of N files, which are stored in a repository $R_N$ whose size is denoted as M. To construct the matrix, the varying numbers of keywords in each document, $K_i$ and $K_j$, are extracted. Then, the bytes of $K_i$ and $K_j$ are also extracted and stored in the separate variables $wd_x$, $wd_y$ and $wd_{xy}$. Consequently, the binary values are calculated for the extracted bytes of data, and based on these the values of $M_{xy}$, $N_{xy}$ and $N_{yx}$ are computed. Then, the distance $dist_{xy}$ is computed as the ratio of $\frac{1}{\log(Derv_{xy})}$ and $dif_v$. Consequently, $T_{dist}$ is updated with the sum of $T_{dist}$ and $dist_{xy}$.
Then, the value of $O_{dist}$ is updated with the values of $T_{dist}$, and finally $M_{NID}(i, j)$ is estimated based on the updated $O_{dist}/M$.
Step 1: Let N be the set of domains, in which each domain contains a sample set of 50 files;
Step 2: Let $R_N$ be the repository of files, which holds the set of N documents;
Step 3: Let M be the total size of $R_N$;
Step 4: Construct $M_{NID}$ of size (M, M);
Step 5:
End for i
End for j
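To make the idea of a pairwise (M, M) distance matrix concrete, the following sketch builds one with the normalized compression distance, a well-known practical approximation of a universal distance measure. This is a swapped-in technique, not the authors' LDTP computation, whose intermediate quantities are not fully specified here. The use of `zlib` and the names `compressed_size`, `normalized_distance`, and `distance_matrix` are assumptions.

```python
import zlib


def compressed_size(data):
    """Number of bytes after zlib compression (stand-in for description length)."""
    return len(zlib.compress(data))


def normalized_distance(x, y):
    """Normalized compression distance between two byte strings."""
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


def distance_matrix(documents):
    """Symmetric matrix of pairwise distances with a zero diagonal."""
    encoded = [doc.encode("utf-8") for doc in documents]
    size = len(encoded)
    matrix = [[0.0] * size for _ in range(size)]
    for i in range(size):
        for j in range(i + 1, size):
            # Assumption: the value for (i, j) is mirrored to (j, i).
            matrix[i][j] = matrix[j][i] = normalized_distance(encoded[i], encoded[j])
    return matrix


docs = [
    "machine learning course neural network " * 20,
    "machine learning course deep network " * 20,
    "football match stadium goal referee " * 20,
]
m_nid = distance_matrix(docs)
```

Documents that share most of their keywords compress well together and therefore receive a smaller distance than unrelated documents, which is the property the clustering stage relies on.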

Clustering Using Metaheuristic Pattern Searching (MPS)
After selecting the CP using LDTP, the clusters are formed by implementing the MPS technique. Compared with traditional clustering methods such as k-means and fuzzy c-means, it is an efficient clustering technique, because it provides high-quality clusters by iteratively exchanging messages between all pairs of data points. The major reasons for using this algorithm are reduced clustering error, determinism, increased efficiency, and simple computation. Also, it does not require the triangle inequality to be satisfied, because it works directly on similarities. The two main quantities of this technique are availability and responsibility. The algorithm computes a set of exemplars that represent the dataset, where the pair-wise similarity is estimated between each pair of data points. Here, the sum of similarities between all data points and their corresponding exemplars is maximized. In this algorithm, the availability matrix $A_{ij}$ and the responsibility matrix $R_{ij}$ are constructed based on the distance matrix $M_{NID}$ obtained from the previous stage. Then, these matrices are updated by checking whether the entries in the rows and columns of $M_{NID}$ are greater than the value of $A_{ij}$.
Consequently, the exponential matrix is constructed by checking whether the sum of $A_{ij}$ and $R_{ij}$ is greater than 0.
Then, the average of $\mathrm{Max}(Expm_i)$ and $R_{i(Idx_x)}$ is computed and used to update $avg_{list}$,
where i and j range over the size of the matrix $(S_i, S_j)$, and $Idx_{ls} = \mathrm{Max}(Expm_i)$ is the index list holding the maximum element of each column in the matrix. After that, the distance index list is estimated by finding the difference between $avg^2$ and $avg_x^2$. Then, the maximum index of $dis_{ls}$ is calculated, and based on this value the CP is selected for each cluster,
where x is the size of the matrix $S_j$ and $C_{id} = \mathrm{Max}(\mathrm{Index}(dis_{ls}))$.

Algorithm 2: Clustering Using MPS
Input: Distance matrix [LDTP matrix ($M_{NID}$)]
Output: Clusters, $A_{ij}$, $R_{ij}$
Step 1: Construct the availability matrix and the responsibility matrix;
Let $S_i$ and $S_j$ be the size of the matrix ($M_{NID}$) and set K = 2;
For i = 1 to $S_i$
For j = 1 to $S_j$
$A_{ij}$ is computed by using Eq. (5)
End for j;
End for i;
Step 2: Construct and update the responsibility matrix and the availability matrix;
For $X_k$ = 1 to K
End for $S_j$
End for $S_i$
End for $X_k$
Step 3: Compute the exponential matrix by using Eq. (7);
Update $avg_{list}$ ← avg;
For y = 1 to $S_j$
Compute $avg_R$ by using Eq. (9);
Compute the distance list by using Eq. (10);
Update $C_{id}$ → $C_{head}$
End for y;
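The exchange of availability and responsibility messages described in this section closely resembles the standard affinity propagation algorithm. The sketch below implements plain affinity propagation in pure Python as a stand-in; it is not the authors' MPS procedure, and the function name, the damping factor, the iteration count, and the preference value of -10 placed on the diagonal are all assumptions chosen for a small example.

```python
def affinity_propagation(similarity, damping=0.5, iterations=200):
    """Cluster items from a square similarity matrix; returns (exemplars, labels)."""
    n = len(similarity)
    resp = [[0.0] * n for _ in range(n)]   # responsibility messages r(i, k)
    avail = [[0.0] * n for _ in range(n)]  # availability messages a(i, k)
    for _ in range(iterations):
        # Responsibility: how much better k suits i than its strongest rival.
        for i in range(n):
            scores = [avail[i][k] + similarity[i][k] for k in range(n)]
            for k in range(n):
                rival = max(scores[j] for j in range(n) if j != k)
                new = similarity[i][k] - rival
                resp[i][k] = damping * resp[i][k] + (1 - damping) * new
        # Availability: how much support k has gathered from the other points.
        for k in range(n):
            positive = [max(0.0, resp[i][k]) for i in range(n)]
            support = sum(positive) - positive[k]
            for i in range(n):
                if i == k:
                    new = support
                else:
                    new = min(0.0, resp[k][k] + support - positive[i])
                avail[i][k] = damping * avail[i][k] + (1 - damping) * new
    # A point is an exemplar when its self-responsibility plus self-availability is positive.
    exemplars = [k for k in range(n) if resp[k][k] + avail[k][k] > 0]
    labels = []
    for i in range(n):
        if i in exemplars or not exemplars:
            labels.append(i)
        else:
            labels.append(max(exemplars, key=lambda k: similarity[i][k]))
    return exemplars, labels


# Hypothetical one-dimensional data forming two well-separated groups.
points = [0.0, 1.0, 2.0, 10.0, 11.0, 12.0]
preference = -10.0  # assumption: diagonal value controlling how many clusters emerge
sim = [[preference if i == k else -(p - q) ** 2 for k, q in enumerate(points)]
       for i, p in enumerate(points)]
exemplars, labels = affinity_propagation(sim)
```

Here each label is the index of the exemplar (cluster head) a point is assigned to. Similarities are taken as negative squared distances, so a distance matrix such as $M_{NID}$ would first have to be negated or otherwise converted before being passed in.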

Similarity Computation
In this stage, the similarity between the documents is computed by using the Kolmogorov complexity (KC) and transformation distance (TD) based similarity measures. Kolmogorov complexity is a widely used similarity mechanism that encodes a finite set of objects as strings over {0, 1}. For instance, it estimates the similarity between two representations A and B by taking A as input and B as output. The quantity of this similarity is denoted as K(B|A), which is semi-computable. The transformation distance based similarity measure is an asymmetric technique and is not an admissible distance. In this technique, the web page query is given as the input and the estimated similarity is produced as the output. Here, the size of the cluster head documents and the keywords in each cluster head document are computed, and then the size of each document is converted into bytes. Also, the binary folds are computed for the bytes of data, from which the minimum and maximum folds are estimated and used in the following equations:
$$p_1 = \frac{Bf_{xy} - \mathrm{Min}(B_{xy})}{\mathrm{Max}(B_{xy})} \tag{12}$$
$$p_2 = \frac{\mathrm{Size}(S1) - \mathrm{Size}(S2)}{\mathrm{Size}(S12)} \tag{13}$$
Based on these values, the transformation distance similarity is estimated as the product $p_1 \cdot p_2$. Then, the KC is computed by generating the mask value for the binary folds of the data.
At last, the similarity values of both TD and KC are summed for estimating the total similarity. It is used to identify the most similar items related to the web page query.

Algorithm 3: Similarity Computation
Input: Web page query
Output: Estimated similarity
Step 1: Let $U_Q$ be the web page query;
Step 2: Let $K_U$ be the keywords in the web page query;
Step 3: The Server ID $K_U$ with the cluster Keys.
Step 4: Let $c_{Head}$ be the cluster head documents and $kc_H$ be the keywords in the cluster head documents;
For M = 1 to Size of ($K_U$)
For N = 1 to Size of ($kc_H$)
S1 = $K_U(M)$; S2 = $kc_H(N)$ and S12 = S1 * S2;
Let B1, B2 and B12 be the bytes of S1, S2 and S12;
Let $Bf_x$, $Bf_y$ and $Bf_{xy}$ be the binary folds of B1, B2 and B12;
Compute Max($B_{xy}$) and Min($B_{xy}$) by using Eq. (11);
Compute p1 and p2 by using Eqs. (12) and (13)
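The arithmetic of the transformation distance (the product of p1 and p2) and the final summation of TD and KC can be checked with a short sketch. The function names and all numeric inputs below are hypothetical; how the binary folds and the KC mask value are actually derived is not reproduced here, so they are simply passed in as precomputed numbers.

```python
def transformation_distance(bf_xy, b_min, b_max, size_s1, size_s2, size_s12):
    """TD similarity as the product p1 * p2 of the two ratios in the text."""
    p1 = (bf_xy - b_min) / b_max        # fold value relative to the maximum fold
    p2 = (size_s1 - size_s2) / size_s12  # size difference relative to the joint size
    return p1 * p2


def total_similarity(td, kc):
    """Total similarity as the sum of the TD and KC similarity values."""
    return td + kc


# Hypothetical values: fold 12 within [4, 16]; sizes of S1, S2, S12 are 10, 6, 16 bytes.
td = transformation_distance(12, 4, 16, 10, 6, 16)
total = total_similarity(td, kc=0.5)  # kc=0.5 is an assumed, precomputed KC value
```

Because p2 depends on Size(S1) minus Size(S2), swapping the two inputs changes the sign of the result, which is consistent with the measure being described as asymmetric.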

Query Recommendation
Finally, the items are ranked based on their similarity values, and the most similar items are recommended to the requesting web pages. Here, the Distributional Recurrent Neural Network (DRNN) is used to rank the items based on their similarity. In this technique, the web page query and the selected CP are given as the input, and the matched document for the query is provided as the output. At first, the server retrieves the keywords in the web page query, $K_U$; then the keywords in each of the documents, $K_{Dn}$, are extracted. After that, the similarity between the web page query and the keywords in each document is computed, from which the list of similarity scores $Sim_{Ku}$ is estimated for the documents in the matched cluster. Finally, the matched document $E_{rank}$ is listed and displayed as the recommended information about the query data for the classification process.
Step 1: Let $U_Q$ be the web page query;
Step 2: Let $K_U$ be the keywords in the web page query;
Step 3: The server indexes $K_U$ with the cluster keys;
Step 4: Let N be the set of documents in the cluster;
Step 5: For I = 1 to N (number of documents)
$K_{Dn}$ = keywords in the document
$Sim_{Ku}$ ← Similarity($K_{Dn}$, $K_U$) and update
End for N
Step 6: $Sim_{Ku}$ is the list of similarity scores for the documents in the matched cluster. The matched document is denoted as N($E_{rank}$);
Step 7: Decrypt the matched document based on the above-mentioned non-abelian algorithm;
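Steps 5 and 6 above, scoring every document in the matched cluster against the query keywords and sorting the scores, can be sketched as follows. The Jaccard overlap used as the `Similarity` function is an assumption chosen for simplicity; it is not the DRNN-based ranking or the KC/TD similarity of the paper, and the document identifiers and keyword lists are hypothetical.

```python
def jaccard(keywords_a, keywords_b):
    """Assumed stand-in similarity: shared keywords divided by all keywords."""
    a, b = set(keywords_a), set(keywords_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_documents(query_keywords, cluster_documents, similarity=jaccard):
    """Return (document id, score) pairs sorted from most to least similar."""
    scores = [
        (doc_id, similarity(doc_keywords, query_keywords))
        for doc_id, doc_keywords in cluster_documents.items()
    ]
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Hypothetical matched cluster and query keywords.
cluster = {
    "d1": ["neural", "network", "course"],
    "d2": ["football", "match"],
    "d3": ["neural", "course"],
}
ranked = rank_documents(["neural", "course"], cluster)
best_document = ranked[0][0]
```

Passing a different callable as `similarity` would let the same loop use another scoring function without changing the ranking logic.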

Result Analysis
In this section, the experimental results of the existing and proposed techniques are evaluated by using various performance measures, including precision, recall, f-measure, and other classification rates. The overall implementation of the proposed work was carried out in Python (version 3.7). Moreover, the proposed service recommendation mechanism is compared with existing similarity and classification techniques to demonstrate the effectiveness of the proposed system. Precision, recall, and f-measure are the most commonly used measures for evaluating the performance of query recommendation methods. Precision is a function of relevancy that estimates the ratio of relevant services among the retrieved services. It is also the positive predictive value, which indicates how relevant the results are for an accurate service recommendation. Recall indicates how many of the relevant results are retrieved during the service recommendation. The f-measure integrates the values of both precision and recall, which reduces the impact of outliers. The precision, recall, and f-measure values are calculated from their standard definitions. Figs. 2 to 4 show the precision, recall, and f-measure values of the proposed service recommendation system with respect to varying α values. Fig. 3 shows the recall values of the proposed work. Here, α represents the coefficient that states the more specific information of the functionalities, and it ranges from 0.6 to 0.9. In this evaluation, various fields such as business, entertainment, politics, sports, and technology are considered. Also, a comparison is made with the existing work [24]. Tab. 1 lists the number of classes for the different categories and the number of training and testing documents for the dataset used in the existing paper [25]. Fig. 5 shows the comparison of the proposed query recommendation with existing systems. This represents the performance of the proposed recommendation system on the Data recommendation-Web KB dataset.
From the analysis, it is observed that the proposed service recommendation mechanism provides better results by providing high-ranking services to the web pages. Also, Fig. 6 compares the accuracy and error rate of the proposed ontology-based data recommendation process.
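The precision, recall, and f-measure used in this section follow the standard set-based definitions: precision is the number of relevant retrieved items divided by the number of retrieved items, recall is the same count divided by the number of relevant items, and the f-measure is their harmonic mean. The sketch below computes them for one query; the function name and the example item identifiers are hypothetical and are not taken from the reported experiments.

```python
def precision_recall_f1(retrieved, relevant):
    """Standard set-based precision, recall, and F1 for a single query."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)  # relevant items that were actually retrieved
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical example: four recommended items, three truly relevant ones.
precision, recall, f1 = precision_recall_f1(["a", "b", "c", "d"], ["a", "b", "e"])
```

In this example two of the four recommended items are relevant and two of the three relevant items are found, so the harmonic mean sits between the two ratios and is pulled toward the lower one.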

Conclusion and Future Enhancement
This paper aims to design a new query recommendation system for providing efficient services to web pages. The data recommendation based query recommendation system focuses on analyzing and predicting relevant data from the database. For this purpose, enhanced clustering, distance-based similarity, and retrieval mechanisms are utilized. Here, stop-word removal and stemming are performed to preprocess the dataset. Then, the LDTP measure is used to select the CP by extracting the keywords from the repository of files. Also, the MPS mechanism is used to form clusters of documents based on the CP. The documents are then used to train the model in the data learning system by using the DRNN classifier. In this environment, the server identifies the query request by using the same DRNN classification model. Moreover, the KC and TD based similarity measures are used to find the most similar items for recommendation. The highly similar items are then ranked according to this priority by using the combination of the LDTP and MPS similarity estimation techniques. In the performance evaluation, different measures are used to analyze the results of the existing and proposed techniques. From the evaluation, it is observed that the proposed query recommendation system provides better results by efficiently ranking the services.
Funding Statement: The authors received no specific funding for this study.
Conflicts of Interest: The authors declare that they have no conflicts of interest to report regarding the present study.