The effectiveness of the logic mining approach is strongly correlated with the quality of the induced logical representation that represents the behaviour of the data. Specifically, an optimal induced logical representation indicates the capability of the logic mining approach to generalize real datasets of different variants and dimensions. The main issues with the logic extracted by standard logic mining techniques are a lack of interpretability and weaknesses in the structure and arrangement of the 2 Satisfiability logic, which lead to lower accuracy. To address these issues, logical permutation serves as an alternative mechanism that can increase the probability of the 2 Satisfiability logical rule becoming true by utilizing the definitive finite arrangement of attributes. This work examines and analyzes the effect of logical permutation on the data extraction ability of a logic mining approach incorporated with the recurrent discrete Hopfield Neural Network. In theory, permutation and the associative memory of the recurrent Hopfield Neural Network can potentially improve the accuracy of the existing logic mining approach. To validate the impact of logical permutation on the retrieval phase of the logic mining model, the proposed work is experimentally tested on different classes of benchmark real datasets, covering multivariate and time-series datasets. The experimental results show a significant improvement for the proposed logical permutation-based logic mining in terms of compatibility, accuracy, and competitiveness compared with standard 2 Satisfiability Reverse Analysis methods.

Artificial Neural Network (ANN) is a subset of Artificial Intelligence that was inspired by biological neurons. The primary aim of the ANN is to create a black-box model that can offer an alternative explanation of the data. Using this explanation, one can use the output produced by the ANN to solve various optimization problems. The main problem with the conventional ANN is the lack of symbolic reasoning to govern the modelling of neurons. Reference [

The most popular application of the logical rule in DHNN is logic mining. Reference [

In this paper, the modified 2SATRA integrated with a permutation operator enhances the capability of selecting the optimal induced logic by considering other combinations of variables in the 2SAT logic. The proposed modified 2SATRA extracts the optimal logical rule for various real-life datasets. Thus, the correct synaptic weights obtained during the learning phase determine the capability of the logic mining model and the accuracy of the induced logic generated during the testing phase. This work focuses on the impact of the logical permutation mechanism in the Hopfield Neural Network (HNN) on the performance of 2SATRA in data mining and extraction tasks. The contributions of this paper are as follows:

To formulate 2 Satisfiability that incorporates permutation operators, which consider various combinations of variables in a clause.

To implement permutation 2 Satisfiability in the Discrete Hopfield Neural Network by minimizing the cost function during the learning phase, leading to an optimal final neuron state.

To embed the proposed hybrid Discrete Hopfield Neural Network into logic mining, where more diversified induced logic is proposed.

To evaluate the performance of the proposed permutation logic mining on real-life datasets against other state-of-the-art logic mining methods.

The organization of this paper is as follows. Section 2 gives a brief introduction to the 2 Satisfiability logical representation, including the conventional formulations and examples. Section 3 focuses on the formulation of logical permutation in 2 Satisfiability Based Reverse Analysis methods. Section 4 explains the experimental setup, including the benchmark datasets, performance metrics, baseline methods and experimental design. The results and discussion are covered in Section 5. Finally, concluding remarks are given in the last section of this paper.

Satisfiability (SAT) is a class of problems concerned with finding a feasible interpretation that satisfies a particular Boolean formula based on the logical rule. Based on the literature in [

Given a set of specified

A set of logical literals comprising either the positive variable or the negation of variable in terms of

Given a set of

Based on the feature in (a) until (c), the precise definition of

Specifically, the fundamental classification of DHNN with i-th activation is shown as follows

Logic mining is a paradigm that uses logical rules to simplify the information in a dataset. Based on the inspiration of a study by [

Given a set of data,

Synaptic weight | | | | |
---|---|---|---|---|
0.25 | −0.25 | 0.25 | −0.25 | |
0.25 | 0.25 | 0.25 | −0.25 | |
−0.25 | 0.25 | −0.25 | −0.25 |

For instance, if the given dataset reads

In this context, the

Based on

In this paper, the impact of logical permutation on attaining the optimum induced logic is examined. The first 10 publicly available datasets (B1-B10) are acquired from the open-source UCI repository databases via

Code | Dataset | Attributes | Instances | Missing Value | Type of dataset | Outcome |
---|---|---|---|---|---|---|
B1 | Chronic kidney disease | 26 | 400 | Yes | Multivariate | Chronic kidney disease |
B2 | Heart attack analysis | 14 | 303 | No | Multivariate | Chances of getting heart attack |
B3 | Hepatitis C virus | 14 | 615 | Yes | Multivariate | Category of diagnosis |
B4 | Obesity | 17 | 2111 | No | Multivariate | Obesity level |
B5 | Stroke | 12 | 5110 | No | Multivariate | Stroke prediction |
B6 | German credit data | 20 | 1000 | No | Multivariate | Status |
B7 | Zoo | 17 | 101 | No | Multivariate | Class |
B8 | Wine | 13 | 178 | No | Multivariate | Class |
B9 | Energy efficiency–Y1 | 8 | 768 | No | Multivariate | Heating load |
B10 | Computer hardware | 9 | 209 | No | Multivariate | Estimated relative performance |
B11 | Water level | 13 | 56 | Yes | Time series | Type of river |

As the primary aim of this work is to evaluate the quality of the induced logical representation generated by P2SATRA, we restrict the baseline comparison to standard methods that are capable of attaining induced logic from real datasets.

Parameter | Parameter value |
---|---|
Combination of neurons | 100 |
Attribute selection | Random |
Number of learning | 100 |
Logical rule | |
Tolerance value | 0.001 |
Number of neuron string | 100 |
Selection_rate | 0.1 |
Neuron combination | 100 |

Parameter | Parameter value |
---|---|
Combination of neurons | 100 |
Attribute selection | Random |
Number of learning | 100 |
Logical rule | |
Number of neuron string | 100 |
Selection_rate | 0.1 |

Parameter | Parameter value |
---|---|
Combination of neurons | 100 |
Attribute selection | Random |
Number of learning | 100 |
Logical rule | |
Number of neuron string | 100 |
Selection_rate | 0.1 |
Maximum permutation | 100 |

Parameter | Parameter value |
---|---|
Combination of neurons | 100 |
Number of learning | 100 |
Logical rule | |
Number of neuron string | 100 |
Selection_rate | 0.1 |

Various performance metrics, such as sensitivity, precision, F-Score and Matthews Correlation Coefficient (MCC), are employed to assess the overall capability of P2SATRA and the effect of logical permutation. The performance of P2SATRA is calculated based on the confusion matrix. Specifically,

Therefore, precision is employed to gauge the algorithm’s or model’s predictive capability. The computation and formulation for Precision

Accuracy

In addition, Matthews Correlation Coefficient
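As a minimal sketch of how the metrics named above follow from a confusion matrix, the snippet below uses the standard textbook definitions of sensitivity, precision, accuracy, F-Score and MCC. The function name `confusion_metrics`, the argument names and the example counts are illustrative assumptions rather than the authors' implementation, and returning `None` for a zero denominator is likewise a choice made for this sketch.

```python
import math


def confusion_metrics(tp, fp, tn, fn):
    """Compute common metrics from confusion-matrix counts.

    tp, fp, tn, fn are the counts of true positives, false positives,
    true negatives and false negatives. A metric whose denominator is
    zero is reported as None instead of raising an error.
    """
    def ratio(num, den):
        return num / den if den else None

    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "sensitivity": ratio(tp, tp + fn),
        "precision": ratio(tp, tp + fp),
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "f_score": ratio(2 * tp, 2 * tp + fp + fn),
        "mcc": ratio(tp * tn - fp * fn, mcc_den),
    }


# Hypothetical counts, used only to exercise the function.
metrics = confusion_metrics(tp=40, fp=10, tn=45, fn=5)
for name, value in metrics.items():
    print(name, value)
```

MCC uses all four counts, so it stays informative on imbalanced outcomes where accuracy alone can look deceptively high.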

All simulations were implemented and executed in Dev C++ Version 5.11, chosen for the versatility of the programming language and the user-friendly interface of the compiler. The simulations were written in C++ and run on a computer with an Intel Core i7 2.5 GHz processor, 8 GB RAM and Windows 8.1. The threshold CPU time for each execution was set to 24 h, and any outputs that exceeded the threshold were omitted entirely from the analysis. All experiments were executed on the same device to prevent possible bad sectors in the memory from affecting the simulations.

This study integrated 2SATRA with HNN-2SAT to simulate and analyze the effect of logical permutation, forming P2SATRA. The composition of attributes is randomly permuted, as opposed to the previous 2SATRA models [
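To make the idea of randomly permuting the composition of attributes concrete, the following is a rough sketch of sampling arrangements of six attributes into three two-literal clauses. The function name `random_arrangements`, the fixed seed and the rule that two arrangements are the same when they differ only in the order of literals or clauses are assumptions of this sketch, not details taken from P2SATRA. The default cap of 100 mirrors the "Maximum permutation" value in the parameter table.

```python
import random


def random_arrangements(attributes, max_permutation=100, seed=0):
    """Sample distinct pairings of attributes into 2-literal clauses.

    The number of attributes is assumed to be even. Each arrangement is a
    frozenset of clauses and each clause is a frozenset of two attributes,
    so reorderings of the same formula compare equal.
    """
    rng = random.Random(seed)
    seen = set()
    for _ in range(max_permutation):
        order = list(attributes)
        rng.shuffle(order)
        # Consecutive pairs in the shuffled order become the clauses.
        clauses = frozenset(
            frozenset(order[i:i + 2]) for i in range(0, len(order), 2)
        )
        seen.add(clauses)
    return seen


arrangements = random_arrangements("ABCDEF")
for arrangement in sorted(sorted("".join(sorted(c)) for c in a) for a in arrangements):
    print(" AND ".join("(%s OR %s)" % (c[0], c[1]) for c in arrangement))
```

Under this equivalence, six attributes admit only 15 distinct pairings, so a cap of 100 samples is likely to revisit arrangements. If literal order within a clause matters to the learning phase, the canonicalization step would need to be dropped.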

The results of

For

For result of

As well to

According to the average rank for all the data sets in terms of

The further analysis via Friedman test rank has been performed for all 11 datasets with

Dataset | P2SATRA | 2SATRA | E2SATRA | RA |
---|---|---|---|---|
B1 | | 0.569 | 0.171 | 0.575 |
B2 | | 0.182 | 0.000 | 0.479 |
B3 | | 0.419 | 0.360 | 0.407 |
B4 | | 0.500 | 0.667 | 0.566 |
B5 | | 0.400 | 0.000 | 0.486 |
B6 | | 0.673 | 0.804 | 0.393 |
B7 | | 0.630 | 0.750 | 0.889 |
B8 | | 0.389 | 0.634 | 0.653 |
B9 | | 0.000 | 1.000 | 0.839 |
B10 | | 0.536 | 0.728 | 0.655 |
B11 | | 0.000 | - | 0.000 |
Avg | 0.955 | 0.391 | 0.511 | 0.540 |
Std | 0.037 | 0.235 | 0.354 | 0.240 |
Min | 0.901 | 0.000 | 0.000 | 0.000 |
Max | 1.000 | 0.673 | 1.000 | 0.889 |
Avg Rank | | 3.227 | 2.864 | 2.864 |
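The Avg Rank row and the Friedman analysis both rest on ranking the methods within each dataset and then averaging those ranks. A minimal sketch of that computation is given below, assuming that a higher score is better, that rank 1 is the best method and that tied methods share the average of their positions. The function name `average_ranks` and the dataset labels, method labels and scores in the example are hypothetical and are not taken from the tables.

```python
def average_ranks(scores):
    """Average per-dataset ranks for each method.

    scores maps dataset -> {method: score}, where higher is better.
    Rank 1 is the best method on a dataset; ties share the mean of the
    positions they occupy.
    """
    totals, counts = {}, {}
    for row in scores.values():
        ordered = sorted(row.items(), key=lambda kv: kv[1], reverse=True)
        i = 0
        while i < len(ordered):
            j = i
            # Extend j over the run of methods tied with position i.
            while j + 1 < len(ordered) and ordered[j + 1][1] == ordered[i][1]:
                j += 1
            shared = ((i + 1) + (j + 1)) / 2  # mean of 1-based positions
            for method, _ in ordered[i:j + 1]:
                totals[method] = totals.get(method, 0.0) + shared
                counts[method] = counts.get(method, 0) + 1
            i = j + 1
    return {method: totals[method] / counts[method] for method in totals}


# Hypothetical scores for three methods on two datasets.
example = {
    "D1": {"M1": 0.9, "M2": 0.5, "M3": 0.5},
    "D2": {"M1": 0.7, "M2": 0.8, "M3": 0.6},
}
print(average_ranks(example))
```

How entries marked "-" are ranked is not specified here; this sketch simply ranks whatever methods are present for a dataset.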

Dataset | P2SATRA | 2SATRA | E2SATRA | RA |
---|---|---|---|---|
B1 | 0.600 | | | |
B2 | - | - | - | - |
B3 | 0.585 | 0.659 | 0.549 | |
B4 | 0.500 | 0.500 | 0.566 | |
B5 | - | - | - | - |
B6 | 0.696 | 0.693 | 0.388 | |
B7 | 0.600 | 0.600 | 0.960 | |
B8 | 0.793 | 0.542 | 0.417 | |
B9 | 0.000 | | | |
B10 | 0.500 | 0.897 | 0.948 | |
B11 | - | - | - | - |
Avg | 0.911 | 0.544 | 0.699 | 0.691 |
Std | 0.112 | 0.251 | 0.171 | 0.250 |
Min | 0.700 | 0.000 | 0.500 | 0.388 |
Max | 1.000 | 0.875 | 1.000 | 1.000 |
Avg Rank | 3.1878 | 2.688 | 2.750 |

Dataset | P2SATRA | 2SATRA | E2SATRA | RA |
---|---|---|---|---|
B1 | 0.085 | 0.097 | 0.097 | |
B2 | | | | |
B3 | 0.306 | 0.248 | 0.292 | |
B4 | | | | |
B5 | | | | |
B6 | 0.957 | 0.957 | 0.962 | |
B7 | 0.926 | 0.923 | | |
B8 | 0.339 | 0.765 | 0.476 | |
B9 | 0.000 | 0.762 | | |
B10 | 0.746 | 0.929 | 0.679 | |
B11 | - | 0.000 | 0.000 | 0.000 |
Avg | 0.774 | 0.403 | 0.545 | 0.472 |
Std | 0.410 | 0.436 | 0.465 | 0.411 |
Min | 0.000 | 0.000 | 0.000 | 0.000 |
Max | 1.000 | 1.000 | 1.000 | 1.000 |
Avg Rank | 2.682 | 2.500 | 2.909 |

Dataset | P2SATRA | 2SATRA | E2SATRA | RA |
---|---|---|---|---|
B1 | 0.148 | 0.171 | 0.171 | |
B2 | | | | |
B3 | 0.402 | 0.360 | 0.381 | |
B4 | 0.667 | 0.667 | 0.723 | |
B5 | | | | |
B6 | 0.803 | 0.803 | 0.553 | |
B7 | 0.750 | 0.750 | 0.941 | |
B8 | 0.488 | 0.634 | 0.444 | |
B9 | 0.000 | 1.000 | 0.866 | |
B10 | 0.598 | 0.728 | 0.791 | |
B11 | - | - | | |
Avg | 0.748 | 0.351 | 0.511 | 0.443 |
Std | 0.398 | 0.329 | 0.354 | 0.361 |
Min | 0.000 | 0.000 | 0.000 | 0.000 |
Max | 1.000 | 0.804 | 1.000 | 0.941 |
Avg Rank | 3.000 | 2.818 | 2.636 |

Dataset | P2SATRA | 2SATRA | E2SATRA | RA |
---|---|---|---|---|
B1 | 0.081 | 0.130 | 0.130 | |
B2 | - | - | - | - |
B3 | −0.078 | −0.507 | −0.113 | |
B4 | - | - | - | - |
B5 | - | - | - | - |
B6 | −0.089 | −0.027 | | |
B7 | - | 0.040 | | |
B8 | 0.028 | 0.509 | 0.195 | |
B9 | 0.107 | 0.713 | | |
B10 | −1.000 | 0.728 | 0.052 | |
B11 | - | - | - | - |
Avg | 0.727 | −0.150 | 0.295 | 0.158 |
Std | 0.382 | 0.422 | 0.556 | 0.293 |
Min | −0.040 | −1.000 | −0.507 | −0.113 |
Max | 1.000 | 0.107 | 1.000 | 0.713 |
Avg rank | 3.143 | 2.429 | 2.714 |

Dataset | Details of each attribute | Induced logic |
---|---|---|
B1 | A = Sugar level; B = Red blood cells; C = Serum creatinine; D = White blood cell count; E = Red blood cell count; F = Hypertension; P = Chronic kidney disease | P = (D ∨ F) ∧ (A ∨ B) ∧ (C ∨ E) |
B2 | A = Sex; B = Resting blood pressure; C = Fasting blood sugar; D = Exercise induced angina; E = Old peak; F = Number of major vessels; P = Chances of heart attack | P = (D ∨ ¬F) ∧ (A ∨ E) ∧ (¬C ∨ B) |
B3 | A = Alkaline phosphatase; B = Alanine aminotransferase; C = Aspartate aminotransferase; D = Bilirubin; E = Creatinine; F = Gamma-glutamyl transpeptidase; P = Category of diagnosis | P = (F ∨ E) ∧ (C ∨ ¬A) ∧ (B ∨ D) |
B4 | A = Weight; B = Smoking; C = Daily water intake; D = Daily consumed calories; E = Freq. of physical activity; F = Technology usage; P = Obesity level | P = (E ∨ C) ∧ (F ∨ ¬D) ∧ (A ∨ B) |
B5 | A = Hypertension; B = Heart disease; C = Ever married; D = Type of work; E = Average glucose level; F = Body mass index; P = Stroke prediction | P = (F ∨ E) ∧ (¬B ∨ D) ∧ (C ∨ A) |
B6 | A = Column 6; B = Column 10; C = Column 12; D = Column 14; E = Column 16; F = Column 19; P = Column 20 | P = (E ∨ C) ∧ (D ∨ F) ∧ (B ∨ A) |
B7 | A = Hair; B = Milk; C = Toothed; D = Backboned; E = Venomous; F = Tail; P = Class | P = (A ∨ C) ∧ (D ∨ E) ∧ (B ∨ F) |
B8 | A = Alcalinity of Ash; B = Total phenols; C = Flavanoids; D = Hue; E = OD280/OD315; F = Proline; P = Class | P = (E ∨ C) ∧ (F ∨ D) ∧ (B ∨ A) |
B9 | A = x1; B = x2; C = x3; D = x4; E = x5; F = x7; P = y1 | P = (C ∨ ¬E) ∧ (F ∨ B) ∧ (D ∨ A) |
B10 | A = mmin; B = mmax; C = cach; D = chmin; E = chmax; F = prp; P = erp | P = (F ∨ E) ∧ (D ∨ A) ∧ (B ∨ C) |
B11 | A = Jan; B = Mar; C = May; D = Jul; E = Sep; F = Nov; P = Kuantan | P = (¬E ∨ A) ∧ (C ∨ B) ∧ (D ∨ F) |
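As an illustration of how an induced logic from the table classifies a record, the sketch below evaluates the B1 and B2 rules on one binarized record. Two assumptions are made here: the underscore prefix in a rule is read as negation, and each attribute has already been converted to a Boolean value. The binarization scheme, the function name `evaluate_2sat` and the sample record are hypothetical and are not taken from the paper.

```python
def evaluate_2sat(clauses, record):
    """Evaluate a 2SAT formula in conjunctive normal form.

    clauses is a list of clauses; each clause is a pair of literals
    (attribute, negated). record maps attribute -> bool. The formula is
    true only if every clause has at least one true literal.
    """
    return all(
        any(record[name] != negated for name, negated in clause)
        for clause in clauses
    )


# B1: P = (D or F) and (A or B) and (C or E)
B1 = [
    (("D", False), ("F", False)),
    (("A", False), ("B", False)),
    (("C", False), ("E", False)),
]
# B2: P = (D or not F) and (A or E) and (not C or B)
B2 = [
    (("D", False), ("F", True)),
    (("A", False), ("E", False)),
    (("C", True), ("B", False)),
]

# Hypothetical binarized record; not drawn from any of the datasets.
record = {"A": True, "B": False, "C": False, "D": False, "E": True, "F": True}

print(evaluate_2sat(B1, record))  # every clause has a true literal
print(evaluate_2sat(B2, record))  # (D or not F) fails for this record
```

Comparing the predicted P with the recorded outcome over all test records yields the confusion-matrix counts on which the reported metrics are based.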

In this work, a new approach for attaining the optimal induced logic embedded in multivariate or time-series datasets has been developed by introducing logical permutation into 2SATRA. The enhancement is evident in the substantial accuracy improvement of the proposed model over the existing approaches, indicating successful generalization of the datasets. In this study, we have exploited the multi-connection between the attribute arrangements in generating the

The authors would like to thank all AIRDG members and those who gave generously of their time, ideas and hospitality in the preparation of this manuscript.