Alcoholism is an unhealthy lifestyle associated with alcohol dependence. Not only does long-term drinking lead to poor mental health and loss of self-control, but alcohol also seeps into the bloodstream and shortens the lifespan of the body’s internal organs. Alcoholics often regard alcohol as an everyday drink and as a way to reduce stress, because they cannot see the damage to their bodies and believe it does not affect their physical health. As their drinking increases, they become dependent on alcohol and it affects their daily lives. Therefore, it is important to recognize the dangers of alcohol abuse and to stop drinking as soon as possible. To assist physicians in diagnosing patients with alcoholism, we propose a novel alcohol detection system that extracts wavelet energy entropy features from magnetic resonance imaging (MRI) and combines them with a linear regression classifier. Compared with the latest method, the 10-fold cross-validation experiment showed excellent results, including sensitivity

Alcoholism is a mental health problem. It comprises two types: alcohol dependence and alcohol abuse. Alcoholism can cause severe mental illness. Females are generally more vulnerable than males to the harmful effects of alcohol because of their lower ability to metabolize alcohol, smaller body mass, and higher proportion of body adipose tissue [

According to the World Health Organization, about three million people die each year from excessive drinking. Alcohol accounts for 5.1% of the global burden of disease. In the 2016 statistics, the global

The main causes of alcohol abuse are dull daily habits and stress. Avoiding excessive alcohol exposure in the early stages of alcoholism is the best way to protect people from its effects. Therefore, it is necessary to replace drinking as daily entertainment with healthier activities. Taking up suitable hobbies, replacing drinking as a way to pass idle time with something meaningful, and reducing unnecessary drinking in daily life all help to improve drinking habits. Controlling alcohol consumption at an early stage can effectively curb alcohol dependence; however, the early symptoms of alcohol abuse are easy to ignore, and once they are ignored the best time for self-adjustment is missed.

Alcoholism affects all parts of the body, including the brain. The gray matter (GM) and white matter (WM) of patients with alcoholism are reportedly smaller than those of age-matched controls [

Recently, many alcohol detection systems (ADSs) have been reported. Hou [

The contribution of this study is to put forward a new alcohol detection method based on computer vision and image processing [

Our method first extracts image features from the magnetic resonance imaging (MRI) data in a higher-dimensional space through wavelet energy entropy. A linear classification hyperplane is then used to separate the positive and negative samples, and the relationship between variables is modeled by linear regression. Finally, 10-fold cross-validation is used to calculate the mean and standard deviation of the results and to verify the performance of the constructed model.
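The feature extraction step can be illustrated with a minimal sketch. Several choices here are assumptions rather than details confirmed by the text: the Haar wavelet, the use of one energy value per subband, and the Shannon entropy taken over the normalized subband energies. The function names and the 4 x 4 test image are hypothetical, and image sides are assumed to be divisible by 2 at every level.

```python
import math

def haar_step_1d(values):
    """One orthonormal Haar step: scaled pairwise sums (low) and differences (high)."""
    s = math.sqrt(2.0)
    low = [(values[i] + values[i + 1]) / s for i in range(0, len(values) - 1, 2)]
    high = [(values[i] - values[i + 1]) / s for i in range(0, len(values) - 1, 2)]
    return low, high

def filter_columns(block):
    """Apply the Haar step down every column and return (low, high) blocks."""
    columns = [list(col) for col in zip(*block)]
    lows, highs = zip(*(haar_step_1d(col) for col in columns))
    return [list(r) for r in zip(*lows)], [list(r) for r in zip(*highs)]

def haar_step_2d(image):
    """One 2D Haar level: filter rows, then columns; returns LL, LH, HL, HH."""
    row_low, row_high = [], []
    for row in image:
        low, high = haar_step_1d(row)
        row_low.append(low)
        row_high.append(high)
    ll, lh = filter_columns(row_low)
    hl, hh = filter_columns(row_high)
    return ll, lh, hl, hh

def subband_energies(image, levels):
    """Energy (sum of squared coefficients) of each of the 3 * levels + 1 subbands."""
    energies = []
    approx = image
    for _ in range(levels):
        approx, lh, hl, hh = haar_step_2d(approx)
        for band in (lh, hl, hh):
            energies.append(sum(c * c for row in band for c in row))
    energies.append(sum(c * c for row in approx for c in row))
    return energies

def wavelet_energy_entropy(image, levels):
    """Shannon entropy (in bits) of the normalized subband energy distribution."""
    energies = subband_energies(image, levels)
    total = sum(energies)
    if total == 0:
        return 0.0
    probs = [e / total for e in energies if e > 0]
    return -sum(p * math.log2(p) for p in probs)

# Hypothetical 4 x 4 "image", decomposed to level 2 (7 subbands).
image = [[1.0, 2.0, 3.0, 4.0],
         [2.0, 3.0, 4.0, 5.0],
         [3.0, 4.0, 5.0, 6.0],
         [4.0, 5.0, 6.0, 8.0]]
entropy = wavelet_energy_entropy(image, levels=2)
```

Because the Haar step used here is orthonormal, the subband energies add up to the energy of the original image, which gives a quick sanity check on the decomposition.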

We use a new image feature extraction method [

The discrete wavelet transform offers excellent detection performance and fast computation. Furthermore, it has the advantage of automatically adjustable time-frequency resolution for time-frequency domain analysis. The dimensionality of the discrete wavelet transform coefficient [

In the process of image extraction, we assume that the decomposition level of the wavelet coefficient of signal

Then the Shannon entropy of the energy coefficient

The linear regression classifier (LRC) [

The principle of LRC is to divide samples into positive and negative classes by a hyperplane, regardless of the dimension of the space, and the most classical basic expression is

The class decision of this model is made by summing the products of the features in each dimension and their respective weights. In the simplest binary classification, which we use, the output is made to belong to [0, 1]. Therefore, a function is required that takes the original feature values and maps them to [0, 1].
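As a small illustration of the weighted sum and the mapping to [0, 1], the sketch below uses the logistic function as the squashing function. The text does not name the mapping, so the logistic choice, the 0.5 threshold, and all function names and values are assumptions made for this example only.

```python
import math

def linear_score(features, weights, bias=0.0):
    """Weighted sum of the features: one weight per feature dimension."""
    return sum(w * x for w, x in zip(weights, features)) + bias

def squash(z):
    """Logistic function mapping any real score to [0, 1] (assumed mapping)."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)  # avoids overflow for large negative scores
    return e / (1.0 + e)

def predict(features, weights, bias=0.0, threshold=0.5):
    """Return 1 (positive) if the squashed score reaches the threshold, else 0."""
    return 1 if squash(linear_score(features, weights, bias)) >= threshold else 0

# Hypothetical two-dimensional feature vector and weights.
label = predict([1.0, 2.0], [0.5, -0.25])
```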

We assume that the extracted image eigenvalues are located in the linear subspace in a specific category and that a data set _{j}_{j}_{j}

It is not necessary to know the statistical information about the estimator and the measurement. The least square estimation method is used for parameter estimation because of its simplicity. Through least square estimation, the reconstruction coefficient _{j}

Then we reconstructed image _{j}

The similarity of alcoholism images to the j-th class was estimated by the reconstruction error _{j}

It is possible to assume an unknown picture of alcoholism

The LRC can intuitively express the relationship between independent and dependent variables through its coefficients and gives an interpretation of each variable. It also has the clear advantages of fast modeling and fast execution on large volumes of data. However, the LRC cannot fit nonlinear data well. Therefore, it is necessary to first determine whether the variables are linearly related; when they are, this avoids the use of overly complex nonlinear classifiers and improves the efficiency of modeling.
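The class-wise reconstruction idea behind the LRC (least-squares coefficients per class, a reconstructed vector, and a decision by the smallest reconstruction error) can be sketched as follows. This is an illustrative implementation under assumptions: the normal equations are solved directly, the training vectors of each class are assumed to be linearly independent, and the class labels and toy vectors are hypothetical.

```python
import math

def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: training vectors are dependent")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def reconstruction_error(train_vectors, probe):
    """Euclidean distance between the probe and its least-squares reconstruction."""
    gram = [[dot(a, b) for b in train_vectors] for a in train_vectors]
    rhs = [dot(a, probe) for a in train_vectors]
    beta = solve_linear_system(gram, rhs)  # least-squares coefficients
    recon = [sum(beta[k] * vec[i] for k, vec in enumerate(train_vectors))
             for i in range(len(probe))]
    return math.sqrt(sum((p - r) ** 2 for p, r in zip(probe, recon)))

def lrc_predict(class_to_vectors, probe):
    """Assign the probe to the class with the smallest reconstruction error."""
    errors = {label: reconstruction_error(vectors, probe)
              for label, vectors in class_to_vectors.items()}
    return min(errors, key=errors.get), errors

# Hypothetical 3-dimensional feature vectors for two classes.
training = {
    "alcoholic": [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
    "control": [[0.0, 0.0, 1.0]],
}
label, errors = lrc_predict(training, [1.0, 2.0, 0.1])
```

In this toy case the probe lies almost entirely in the subspace spanned by the "alcoholic" vectors, so that class gives the smaller error.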

A total of 379 slices were obtained, comprising 188 alcoholic brain images and 191 non-alcoholic brain images. The division is shown in

| | Alcoholic | Nonalcoholic | Total |
|---|---|---|---|
| Total | 188 | 191 | 379 |

Compared with cross-validation [

The method first randomly divides the data set into 10 subsets (folds) of the same size, as shown in

A single cross-validation result is less reliable because of possible variation. In contrast, the regular repetition in 10-fold cross-validation makes the method reliable for verifying model performance. Although in some cases it does not differ substantially from 5-fold or 20-fold cross-validation, there is still research and theoretical evidence that 10-fold cross-validation is an appropriate choice for preventing a neural network from overfitting and for obtaining the best error estimate.
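The fold construction can be sketched as below. This is a minimal illustration, not the exact procedure of the study: the shuffling seed and function names are assumptions, and because 379 is not divisible by 10 the folds here differ in size by at most one sample.

```python
import random

def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle the sample indices and deal them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]

def train_test_splits(n_samples, k=10, seed=0):
    """Yield (train, test) index lists; each fold serves as the test set once."""
    folds = k_fold_indices(n_samples, k, seed)
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test

# 379 slices, as in the data set described above.
splits = list(train_test_splits(379, k=10))
```

Repeating the whole procedure with different seeds gives several runs whose mean and standard deviation can then be reported.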

Our implementation was divided into three parts. First, the wavelet transform coefficients of the image were extracted. Then the wavelet energy was calculated and combined with Shannon entropy to obtain the entropy value. Finally, the resulting feature vector was used as the input of the linear regression classifier. The statistical results of this model are obtained by using

During the experiment, we compared six indicators: sensitivity (sen), specificity (spc), precision (pre), accuracy (acc), F1 score (F1), and Matthews correlation coefficient (MCC). To evaluate the classifier, its output is compared with the ideal reference classification. Suppose the question is whether someone has alcoholism. Some of the subjects suffer from alcoholism, and the classification test correctly indicates that they are positive. We use

Sensitivity is the ratio of patients correctly identified by our program to all actual patients with alcoholism. The formula is as follows:

The accuracy value can usually clearly show the performance of our model. The formula for accuracy during the experiment is as follows:

However, the accuracy of the model is likely to be affected by an imbalance between positive and negative samples. Therefore, accuracy alone cannot adequately reflect the overall performance of the model.

Specificity is the ratio of people correctly identified as not having the disease to all people who actually do not have it. Its formula is:

Precision is another evaluation index for the predicted results. It is the percentage of samples predicted as positive by the model that are actually positive. The formula for precision is as follows:

A high precision reflects a stronger ability of the model to distinguish negative samples.

The F1 score is the harmonic mean of precision and recall (sensitivity).

It addresses two shortcomings: a high decision threshold gives high precision but misses positive samples, whereas a high recall rate reduces precision.

MCC is a relatively balanced indicator that describes the correlation coefficient between the predicted results and the actual results. The expression formula of MCC is as follows:

One of its advantages is that it can also be used when the samples are imbalanced, because it takes true positives, false positives, false negatives, and true negatives into account. The value range of MCC is [−1, 1]. A value of 1 means that the prediction is completely consistent with the actual result; a value of 0 means that the prediction is no better than random prediction; and a value of −1 means that the prediction is completely inconsistent with the actual result.
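The six indicators can be computed from the four confusion-matrix counts using their standard definitions, as in the sketch below. The names `tp`, `fn`, `tn`, and `fp` are our own labels for true positives, false negatives, true negatives, and false positives. The counts in the example (163, 25, 171, 20) are not stated in the text; they are back-calculated, as an assumption, from the 188 alcoholic and 191 non-alcoholic images and the first row of the first results table.

```python
import math

def classification_metrics(tp, fn, tn, fp):
    """Return the six indicators as fractions (MCC lies in [-1, 1])."""
    def ratio(numerator, denominator):
        # Convention used here: an undefined ratio is reported as 0.
        return numerator / denominator if denominator else 0.0

    mcc_denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "sen": ratio(tp, tp + fn),                # sensitivity (recall)
        "spc": ratio(tn, tn + fp),                # specificity
        "pre": ratio(tp, tp + fp),                # precision
        "acc": ratio(tp + tn, tp + tn + fp + fn), # accuracy
        "f1": ratio(2 * tp, 2 * tp + fp + fn),    # harmonic mean of pre and sen
        "mcc": ratio(tp * tn - fp * fn, mcc_denominator),
    }

# Back-calculated (assumed) counts: 163 + 25 = 188 alcoholic, 171 + 20 = 191 controls.
row = classification_metrics(tp=163, fn=25, tn=171, fp=20)
percent = {name: round(100 * value, 2) for name, value in row.items()}
```

With these counts the percentages agree with the first table row (86.70, 89.53, 89.07, 88.13, 87.87, 76.27), which suggests the reported values follow these standard definitions.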

The wavelet transform is a local transform in the time and frequency domains. It extracts information from signals effectively through multi-scale refinement analysis of functions or signals by means of scaling and shifting. The Fourier transform is a tool for transforming between the time domain and the frequency domain, and it is one of the most widely used and effective analytical methods in signal processing. Wavelet analysis inherits and develops the localization idea of the Fourier transform, which makes it suitable for time-frequency analysis and processing of signals; it overcomes the shortcomings of the Fourier transform and has become a popular method in many application fields.

The Fourier transform is a global transformation: the value at every point of the function affects the result of the transformation. In contrast, wavelets use local bases, and each coefficient is affected only by the points on the support of a particular basis function. As a result, wavelets contain not only frequency information but also time information. The Fourier transform gives information in the one-dimensional frequency domain, in which the important frequency features are mixed together, and the time information cannot be read directly from the frequency domain. The result of the wavelet transform is two-dimensional: the horizontal axis is time and the vertical axis is frequency. Therefore, wavelets are very useful for analyzing instantaneous time-varying signals, and some features of the image can be fully highlighted through the transformation.
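The difference between global and local bases can be checked numerically. The sketch below is an illustration under assumptions: a single-level Haar transform stands in for the wavelet transform, the discrete Fourier transform is computed naively, and the 8-sample signal is hypothetical. One sample is perturbed, and the number of coefficients that change is counted for each transform.

```python
import cmath
import math

def haar_level1(signal):
    """Single-level orthonormal Haar transform: approximations, then details."""
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx + detail

def dft(signal):
    """Naive discrete Fourier transform (O(n^2)), enough for a small demo."""
    n = len(signal)
    return [sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                for t, x in enumerate(signal))
            for k in range(n)]

def count_changed(before, after, tol=1e-9):
    """Number of coefficients that differ by more than tol."""
    return sum(1 for a, b in zip(before, after) if abs(a - b) > tol)

signal = [4.0, 4.0, 5.0, 5.0, 6.0, 6.0, 7.0, 7.0]  # hypothetical signal
bumped = signal[:]
bumped[3] += 1.0  # perturb a single sample

haar_changed = count_changed(haar_level1(signal), haar_level1(bumped))
dft_changed = count_changed(dft(signal), dft(bumped))
```

Only the two Haar coefficients whose support contains the perturbed sample change, whereas every one of the eight Fourier coefficients changes.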

The wavelet decomposition of magnetic resonance imaging is shown in the

A Fourier coefficient usually represents a signal component that runs through the whole time domain. In contrast, the wavelet transform allows a more accurate local description and separation of signal features. Wavelet coefficients are well suited to special signals, in particular most instantaneous signals, and they make image information easy to interpret. The advantage of the wavelet transform is that it has not only the frequency-domain resolution of the Fourier transform but also time-domain (or spatial) resolution. However, because a one-dimensional signal is expressed by two-dimensional coefficients, the representation can be highly redundant.

| | Sen | Spc | Prc | Acc | F1 | MCC |
|---|---|---|---|---|---|---|
| 1 | 86.70 | 89.53 | 89.07 | 88.13 | 87.87 | 76.27 |
| 2 | 89.89 | 91.62 | 91.35 | 90.77 | 90.62 | 81.54 |
| 3 | 84.04 | 91.10 | 90.29 | 87.60 | 87.05 | 75.36 |
| 4 | 87.23 | 88.48 | 88.17 | 87.86 | 87.70 | 75.73 |
| 5 | 89.36 | 89.01 | 88.89 | 89.18 | 89.12 | 78.36 |
| 6 | 88.83 | 90.05 | 89.78 | 89.45 | 89.30 | 78.89 |
| 7 | 87.77 | 87.96 | 87.77 | 87.86 | 87.77 | 75.72 |
| 8 | 89.36 | 90.05 | 89.84 | 89.71 | 89.60 | 79.42 |
| 9 | 87.23 | 86.91 | 86.77 | 87.07 | 87.00 | 74.14 |
| 10 | 91.49 | 87.43 | 87.76 | 89.45 | 89.58 | 78.97 |
| Mean ± SD | | | | | | |

Compared with the existing methods support vector machine [

In the following experiments, the results of the experiments at the decomposition Level 2 are shown in

| | Sen | Spc | Prc | Acc | F1 | MCC |
|---|---|---|---|---|---|---|
| 1 | 93.09 | 93.19 | 93.09 | 93.14 | 93.09 | 86.28 |
| 2 | 85.11 | 93.72 | 93.02 | 89.45 | 88.89 | 79.16 |
| 3 | 92.55 | 90.58 | 90.63 | 91.56 | 91.58 | 83.13 |
| 4 | 92.55 | 94.76 | 94.57 | 93.67 | 93.55 | 87.35 |
| 5 | 93.09 | 93.19 | 93.09 | 93.14 | 93.09 | 86.28 |
| 6 | 87.23 | 93.19 | 92.66 | 90.24 | 89.86 | 80.60 |
| 7 | 90.96 | 91.62 | 91.44 | 91.29 | 91.20 | 82.59 |
| 8 | 89.36 | 93.72 | 93.33 | 91.56 | 91.30 | 83.18 |
| 9 | 88.83 | 93.19 | 92.78 | 91.03 | 90.76 | 82.12 |
| 10 | 92.55 | 92.15 | 92.06 | 92.35 | 92.31 | 84.70 |
| Mean ± SD | | | | | | |

The decomposition Level 2 experiment shows that increasing the decomposition level improves the test performance of the model. In the decomposition Level 3 experiment in

| | Sen | Spc | Prc | Acc | F1 | MCC |
|---|---|---|---|---|---|---|
| 1 | 92.02 | 93.72 | 93.51 | 92.88 | 92.76 | 85.76 |
| 2 | 89.89 | 95.81 | 95.48 | 92.88 | 92.60 | 85.89 |
| 3 | 91.49 | 92.67 | 92.47 | 92.08 | 91.98 | 84.17 |
| 4 | 92.55 | 94.76 | 94.57 | 93.67 | 93.55 | 87.35 |
| 5 | 89.89 | 94.76 | 94.41 | 92.35 | 92.10 | 84.79 |
| 6 | 94.68 | 92.67 | 92.71 | 93.67 | 93.68 | 87.36 |
| 7 | 90.43 | 94.24 | 93.92 | 92.35 | 92.14 | 84.75 |
| 8 | 92.55 | 93.72 | 93.55 | 93.14 | 93.05 | 86.28 |
| 9 | 90.96 | 93.19 | 92.93 | 92.08 | 91.94 | 84.18 |
| 10 | 90.96 | 91.10 | 90.96 | 91.03 | 90.96 | 82.06 |
| Mean ± SD | | | | | | |

In the experiment with the decomposition Level 4 shown in

| | Sen | Spc | Prc | Acc | F1 | MCC |
|---|---|---|---|---|---|---|
| 1 | 89.89 | 90.05 | 89.89 | 89.97 | 89.89 | 79.95 |
| 2 | 89.36 | 92.15 | 91.80 | 90.77 | 90.57 | 81.55 |
| 3 | 93.09 | 88.48 | 88.83 | 90.77 | 90.91 | 81.63 |
| 4 | 93.09 | 90.58 | 90.67 | 91.82 | 91.86 | 83.67 |
| 5 | 90.96 | 89.53 | 89.53 | 90.24 | 90.24 | 80.49 |
| 6 | 88.83 | 90.05 | 89.78 | 89.45 | 89.30 | 78.89 |
| 7 | 89.89 | 85.34 | 85.79 | 87.60 | 87.79 | 75.29 |
| 8 | 88.83 | 90.58 | 90.27 | 89.71 | 89.54 | 79.43 |
| 9 | 91.49 | 91.62 | 91.49 | 91.56 | 91.49 | 83.11 |
| 10 | 90.43 | 90.05 | 89.95 | 90.24 | 90.19 | 80.48 |
| Mean ± SD | | | | | | |

As can be seen from the

In this part, our method, which combines wavelet energy entropy with a linear regression classifier, is compared with the most advanced methods in the field of image detection.

The first method uses Hu moment invariants to extract image features, and combines predator-prey adaptive inertia chaotic particle swarm optimization algorithm to train the single hidden layer neural network classifier [

The data for the comparison results of our methods and these methods are listed in

| Method | Sensitivity | Specificity | Precision | Accuracy | F1 | MCC |
|---|---|---|---|---|---|---|
| HMI-IPSO [ | 90.67 | 91.33 | 91.28 | 91.00 | 90.97 | 82.00 |
| SVM [ | 84.56 | 85.70 | 84.78 | 85.15 | 84.67 | 70.27 |
| PZM [ | 86.23 | 87.02 | 86.23 | 86.64 | 86.23 | 73.25 |
| WFFT [ | 86.23 | 87.27 | 86.46 | 86.77 | 86.34 | 73.51 |
| CSO [ | 91.92 | 91.88 | 84.24 | | | |
| WEE+LRC (ours) | | | | | | |

It can be observed that our proposed WEE+LRC method achieved the best performance among the six methods in all six evaluation values. In the comparison of the specificity values shown in the table, the SVM method obtained a score of 85.70%, PZM 87.02%, and WFFT 87.27%. The HMI-IPSO method improved on these with a value of 91.33%. Compared with the HMI-IPSO method, the CSO method increased to

In the precision comparison, SVM, PZM, WFFT, HMI-IPSO, and CSO obtained 84.78%, 86.23%, 86.46%, 91.28%, and 91.92%, respectively. Our WEE+LRC method maintained the highest score of

The bar chart shows that our method maintained a high level of specificity, precision, accuracy, F1 score, and Matthews correlation coefficient. Compared with the previous method, our values improve by about 8%, and our scores are also better than those of the latest method, indicating that our algorithm performs better.

We compared our feature extraction method with the popular wavelet energy and wavelet entropy. In order to ensure the fair comparison of the same conditions, we carried out three levels of the wavelet transform, and the comparison results are shown in

| Feature extraction | Sensitivity | Specificity | Precision | Accuracy | F1 | MCC |
|---|---|---|---|---|---|---|
| Wavelet energy | | | | | | |
| Wavelet entropy | | | | | | |
| WEE (ours) | | | | | | |

In the comparison of specificity and precision, although the wavelet energy fractions of

In the comparison of the Matthews correlation coefficient, there is a marked gap between our method and the other two methods. The highest value of our method is

In general, our proposed feature extraction method scores significantly better than wavelet energy and wavelet entropy on all six evaluation indicators, indicating that it extracts features of higher quality and precision and thereby improves the overall performance of the model.

In this study, we compared our LRC classifier with three other classifiers: decision tree (DT), support vector machine (SVM), and naive Bayesian classifier (NBC). The results are shown in

| Classifier | Sensitivity | Specificity | Precision | Accuracy | F1 | MCC |
|---|---|---|---|---|---|---|
| DT | | | | | | |
| SVM | | | | | | |
| NBC | | | | | | |
| LRC (ours) | | | | | | |

As can be seen from the table, the average values of our method are always the highest. Across the six indicators, our method consistently exceeds the SVM score by about 1%, and the DT method is lower than the SVM score by about 1% to 4% in all values. NBC had the lowest results in the experiment, at least 3% lower than our approach; in particular, it was 12% lower than ours in the MCC comparison. Although the values of SVM are at least 1% and 4% higher than those of DT and NBC, respectively, our method obtains the highest scores, which represents the best experimental performance among all the compared methods.

We propose a novel alcohol detection system that is based on a new image feature extraction method, wavelet energy entropy, and uses a linear regression classifier for diagnosis. Wavelet energy entropy extracted from magnetic resonance imaging (MRI) replaces image features with signals, which simplifies complex image information and thus improves detection efficiency. The linear regression classifier can intuitively distinguish and express the relationship between variables according to the image information provided. The cross-validation results show that the fast modeling and running speed of the linear regression classifier also give it clear advantages over the latest detection methods. Doctors need high-quality and efficient auxiliary diagnostic tools. Our proposed alcohol detection system can meet the diagnostic needs of doctors and help relieve their diagnostic pressure.

In future studies, we will continue to improve the performance of our method, raising detection accuracy and reducing detection complexity through optimization experiments. Future work will not be limited to alcoholism diagnosis; the method could also be used for other types of classification tasks.