The Internet of Things (IoT) and related applications have contributed substantially to improving the quality of life on this planet. Advanced wireless sensor networks and their computational capabilities have enabled various IoT applications to become the next frontier, touching almost all domains of life. With this progress, energy optimization has also become a primary concern, given the need to adopt green technologies. The present study focuses on predicting the sustainability of battery life in IoT frameworks in the marine environment. The data used is a publicly available dataset collected from the Chicago Park District beach water. First, the missing values in the data are replaced with the attribute mean. Next, one-hot encoding is applied to achieve data homogeneity, followed by the standard scaler technique to normalize the data. Then, rough set theory is used for feature extraction, and the resulting data is fed into a Deep Neural Network (DNN) model for optimized prediction results. The proposed model is then compared with state-of-the-art machine learning models, and the results justify its superiority on the basis of performance metrics such as Mean Squared Error, Mean Absolute Error, Root Mean Squared Error, and Test Variance Score.

Internet-of-Things (IoT) and related applications have gained immense momentum over the past decade, with implementations touching almost all spheres of life on our planet. IoT is often termed the Internet of Everything (IoE), which creates a global network of machines and devices integrated with one another, establishing seamless communication systems across a wide spectrum of domains [

In healthcare, IoT applications have enabled reliable tracking of patients and transforming of data for providing essential and accelerated medical service [

The present study focuses on predicting battery life sustainability for sensors installed at beaches to collect marine data. These sensors and batteries are primary components of IoT applications deployed to collect and monitor data from the marine environment, namely Beach Name, Measurement Timestamp, Water Temperature, Turbidity, Transducer Depth, Wave Height, Wave Period and Battery Life. Various applications for monitoring the marine environment have been developed by integrating IoT architectures with sensing, control and communication technologies. These implementations are predominant in the areas of ocean sensing and monitoring, wave and current monitoring, water quality assessment, fish farming and coral reef monitoring. Although significant research has been conducted on IoT applications catering to the needs of the marine environment, most studies have been conducted at either the underwater or the surface level. Solar energy harvesting has also been a major point of discussion among various studies, along with other forms of energy: wind, wave and ocean currents. The world at this point is progressing towards green technologies where energy optimization plays a vital role [

The data used in the study is a publicly available dataset collected from the Chicago Park District beach water. The missing values in the raw data are replaced with attribute means, and then one-hot encoding is applied to achieve data homogeneity by transforming all categorical values into numerical ones. This data is further normalized using the standard scaler technique, and the rough set approach is used for feature extraction. The extracted data thus contains the features having the most significant impact on the target class, battery life. The extracted data is fed into a DNN with activation functions for optimized prediction of the battery life sustainability of the devices in the IoT framework. The prediction of battery life would help in tuning the connected sensor and network devices to achieve sustainable energy use. The results are compared with state-of-the-art techniques to establish the reliability of the prediction results.

The contributions of the proposed model are:

Using a DNN, a prediction model is proposed to forecast the battery life of the water sensors deployed along the Chicago Park District beaches.

The proposed architecture uses the concepts of rough sets to minimize dimensionality and select significant features to increase prediction accuracy and decrease complexity.

The results are compared with common state-of-the-art techniques such as Linear Regression and XGBoost, which highlights the enhanced performance of the proposed model.

A review of the current rough sets model and the state-of-the-art techniques is performed in depth.

The experiments show that the proposed prediction model outperforms other popular ML techniques, thereby validating the positive impact of integrating the rough set technique with the ML model.

In the recent decade, numerous artificial intelligence methods have been used to boost the learning level of systems in an extremely granular manner. In this section, the various contributions of deep learning algorithms and rough set concepts are discussed through an extensive survey of the literature.

In [

In [

In [

In [

In [

In [

The management of on-site maintenance visits is an important requirement for the successful operation of IoT networks. In particular, on-site visits to replace or recharge depleted batteries may be required in battery powered devices. For this purpose, practitioners often use prediction techniques to estimate the battery life of the deployed devices. Battery life is one of the biggest challenges for IoT at present [

In [

In [

Author | Contribution | Advantages | Disadvantages |
---|---|---|---|
Lopez et al. [ | A classifier for monitoring network traffic is designed by hybridizing RNN with CNN. | The method is robust and achieves excellent F1 detection scores. | The model is trained on high-level header-based data extracted from the data packets. |
Fan et al. [ | A PSO algorithm and a rough PSO algorithm are proposed based on the rough set concept. | Rough PSO is also used for classification, with faster convergence and better precision. | Can become trapped in locally optimal solutions; time complexity. |
Mahdavinejad et al. [ | ML techniques are studied. Applications of IoT, IoT data characteristics and relevant technical implementations are explored. | The surveyed methods are identified as suitable and easy to use for various types of problems. | Present issues in the field of smart data analytics have not been included. |
Manogaran et al. [ | An optimized Bayesian neural network consisting of densely connected layers is proposed. The model helps to detect temperature imbalance in the field of healthcare. | Problems relevant to multiple-access-based physical monitoring systems are resolved. | Methods to scale the devices are excluded from the study. |
Chakraborty et al. [ | Neighborhood Rough Sets (NRS) are used to exploit the uncertainty in tracking an object in a video sequence. | Multiple objects are tracked without any prior knowledge, despite variations in size and speed. | In the area of signal processing, the main concerns of the NRS filter and intuitionistic entropy are unsupervised prediction and handling ambiguity. |
Hassan et al. [ | A novel approach based on rough set concepts is proposed to develop ML and soft computing using a DL architecture. The method is intended to solve real-life problems. | The approach focuses on integrating local properties pertinent to individual decision tables and attempts to enable acceptable global decision making. | RS assumes the presence of a single decision table, whereas real-world problems involve numerous decisions from varied decision tables. |
Otero et al. [ | An ACO algorithm is proposed incorporating widely used techniques from standard decision tree and ACO algorithms. | Compared on 22 publicly available data sets against three decision tree algorithms: CACDT, CART, and C4.5. | Results are statistically supported, but only a single test was performed. |

In this section the algorithms and methods used in this study are explicitly discussed.

The conventional mathematical methods like crisp sets failed to resolve unclear and vague problems which acted as the prime motivation among researchers to work in a direction to solve such problems where approximations would play a major role. The concept of Rough sets was first proposed in [

Fuzzy and rough sets have been considered complementary generalizations of classical set theory. A rough set is loosely described by two definable subsets of a universal set (U): its lower and upper approximations.

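As per the theory of rough sets, there exist four classes of rough sets, namely:

A is roughly X-definable, iff $\underline{X}(A) \neq \psi \wedge \overline{X}(A) \neq U$

A is internally X-undefinable, iff $\underline{X}(A) = \psi \wedge \overline{X}(A) \neq U$

A is externally X-undefinable, iff $\underline{X}(A) \neq \psi \wedge \overline{X}(A) = U$

A is totally X-undefinable, iff $\underline{X}(A) = \psi \wedge \overline{X}(A) = U$

Here $\underline{X}(A)$ and $\overline{X}(A)$ are the lower and upper approximations of A with respect to the attribute set X, and ψ denotes the empty set.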

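The accuracy of a rough set A can be measured as the ratio of the cardinality of its lower approximation to that of its upper approximation:

$$\alpha_X(A) = \frac{|\underline{X}(A)|}{|\overline{X}(A)|}$$

where $|A|$ represents the cardinality of set A and $\alpha_X(A)$ lies between 0 and 1.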
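The approximations and the accuracy measure can be illustrated with a minimal sketch. The toy table, the attribute names and the function names below are hypothetical and chosen only for illustration; they are not taken from the study's dataset or code.

```python
from collections import defaultdict

def equivalence_classes(table, attributes):
    """Group object ids that share the same values on the given attributes."""
    classes = defaultdict(set)
    for obj_id, row in table.items():
        key = tuple(row[a] for a in attributes)
        classes[key].add(obj_id)
    return list(classes.values())

def approximations(table, attributes, target):
    """Return the (lower, upper) approximations of the target set."""
    lower, upper = set(), set()
    for block in equivalence_classes(table, attributes):
        if block <= target:   # block lies entirely inside the target
            lower |= block
        if block & target:    # block overlaps the target
            upper |= block
    return lower, upper

def accuracy(lower, upper):
    """Ratio |lower| / |upper|; defined as 1.0 for an empty upper approximation."""
    return len(lower) / len(upper) if upper else 1.0

# Hypothetical toy data: five objects described by two categorical attributes.
table = {
    1: {"turbidity": "low", "wave": "calm"},
    2: {"turbidity": "low", "wave": "calm"},
    3: {"turbidity": "high", "wave": "calm"},
    4: {"turbidity": "high", "wave": "rough"},
    5: {"turbidity": "low", "wave": "rough"},
}
target = {1, 3, 4}
lower, upper = approximations(table, ["turbidity", "wave"], target)
print(lower, upper, accuracy(lower, upper))
```

Objects 1 and 2 are indiscernible on the chosen attributes, so object 1 cannot be placed in the lower approximation even though it belongs to the target set; this is what pushes the accuracy below 1.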

Attribute dependency is the most significant element in selecting variables with rough sets. Considering two attribute sets (X, Y), their equivalence classes are denoted [A]X and [A]Y, where Y_{i} is an equivalence class induced by the attribute set Y. The dependency η of Y on the other set X can be calculated as

$$\eta = \frac{\sum_{i} |\underline{X}(Y_i)|}{|U|}$$
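As a hedged illustration of the dependency degree η, the sketch below computes the fraction of objects whose condition-attribute class falls entirely inside one decision class. The rows, attribute names and function names are hypothetical, and the battery labels are discretized only for the sake of the example.

```python
from collections import defaultdict

def partition(rows, attributes):
    """Partition row indices into equivalence classes on the given attributes."""
    blocks = defaultdict(set)
    for idx, row in enumerate(rows):
        blocks[tuple(row[a] for a in attributes)].add(idx)
    return list(blocks.values())

def dependency(rows, condition_attrs, decision_attrs):
    """Size of the positive region divided by the size of the universe."""
    decision_blocks = partition(rows, decision_attrs)
    positive = set()
    for block in partition(rows, condition_attrs):
        # A condition class counts only if it is consistent with one decision class.
        if any(block <= d for d in decision_blocks):
            positive |= block
    return len(positive) / len(rows)

# Hypothetical toy decision table.
rows = [
    {"temp": "warm", "wave": "calm", "battery": "ok"},
    {"temp": "warm", "wave": "calm", "battery": "low"},
    {"temp": "cold", "wave": "calm", "battery": "low"},
    {"temp": "cold", "wave": "rough", "battery": "low"},
]
eta_all = dependency(rows, ["temp", "wave"], ["battery"])
eta_temp = dependency(rows, ["temp"], ["battery"])
eta_wave = dependency(rows, ["wave"], ["battery"])
print(eta_all, eta_temp, eta_wave)
```

In this toy case, dropping `wave` leaves the dependency unchanged, which is the kind of evidence a rough-set reduction would use to discard an attribute.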

From

DNN is one of the standard methods for generating classification models, where learning can be supervised, unsupervised or semi-supervised [

The input to neuron in the first hidden layer is given in

where C_{1} _{(x,y)} and z_{1} _{(x)} are the weight and bias respectively. The output of the neuron x in the first hidden layer is given by h_{1(x)} = _{1(1)}) where

XGBoost is one of the most effective approaches in ML which implements the concept of decision trees in regular successions without any gaps [

Linear Regression is a supervised learning algorithm whose predicted output is continuous and has a constant slope. It predicts values within a continuous range. Simple regression and multi-variable regression are the two forms of Linear Regression, and they are usually expressed as

where x, y and z represent attributes, and m, a_{1}, a_{2} and a_{3} are the variables of the regression process. Statistical methods can be used to measure and reduce the size of the error term to improve the predictive power of the model.
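A minimal sketch of simple regression, assuming an ordinary least-squares fit of a slope and an intercept to one attribute. The function name, the intercept term and the sample values are assumptions made for illustration and are not taken from the study.

```python
def fit_simple_regression(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums (the 1/n factors cancel).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data lying exactly on the line y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_simple_regression(xs, ys)
print(slope, intercept)
```

The multi-variable case follows the same idea with one coefficient per attribute, which is usually solved with a linear algebra routine rather than by hand.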

One Hot Encoding is a process of representing categorical values as binary vectors. Many machine learning algorithms cannot work directly on categorical data, so the values have to be converted to numbers, and this is where one hot encoding plays a significant role. It may seem natural to use integer coding directly, but integer codes imply an ordering, which is a limitation when the categories have no natural ordinal relationship. In the one hot encoding technique, the categorical values in the data are first mapped to integer values and each integer value is then changed to a binary vector [
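The encoding can be sketched in a few lines. The helper name and the ordering of categories (alphabetical) are assumptions for illustration; the beach names are taken from the dataset description.

```python
def one_hot_encode(values):
    """Map each categorical value to a binary vector with a single 1."""
    categories = sorted(set(values))                 # fixed, reproducible column order
    index = {c: i for i, c in enumerate(categories)}
    encoded = []
    for v in values:
        vec = [0] * len(categories)
        vec[index[v]] = 1
        encoded.append(vec)
    return categories, encoded

beaches = ["Montrose Beach", "Calumet Beach", "Montrose Beach", "Rainbow Beach"]
categories, encoded = one_hot_encode(beaches)
print(categories)
print(encoded)
```

Each category becomes its own column, so no artificial ordering is introduced between beach names.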

The architecture of the proposed model is depicted in the

a. Preprocessing plays a crucial role in machine learning algorithms.

b. A dimensionality reduction technique based on rough set theory is applied to the data.

c. The reduced dataset is then fed into a deep neural network model for predicting the sustainability of battery life.

d. The DNN model is evaluated against the traditional state of the art models to validate the performance of the predictions.
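The normalization performed during preprocessing can be sketched as a z-score transformation. This is a minimal illustration that assumes the standard scaler subtracts the column mean and divides by the population standard deviation; the function name and sample values are hypothetical.

```python
from statistics import mean, pstdev

def standardize(column):
    """Scale a numeric column to zero mean and unit (population) variance."""
    mu = mean(column)
    sigma = pstdev(column)
    if sigma == 0:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in column]
    return [(v - mu) / sigma for v in column]

# Hypothetical water temperature readings in degrees Celsius.
temps = [10.0, 20.0, 30.0]
scaled = standardize(temps)
print(scaled)
```

After scaling, attributes with very different ranges (for example turbidity and wave period) contribute on a comparable scale to the downstream model.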

Column Name | Data Description | Unit of the Gathered Sensor Data | Range of Data |
---|---|---|---|
Beach Name | Name of the beach where the water sensor is deployed | Plain Text String | 6 locations |
Measurement Timestamp | Measurement is done on an hourly basis each day throughout the season | Date and Time | 30/8/2013 8.00 AM to 11/9/2019 11.00 AM |
Water Temperature | Temperature of water at the time of measurement | Degree Celsius | 9.1 to 31.5 |
Turbidity | Turbidity of the water waves | Nephelometric Turbidity Units (NTU) | 0.01 to 1683.48 |
Transducer Depth | Depth | Meters | −0.082 to 2.214 |
Wave Height | Height of the water waves | Meters | 0.013 to 1.467 |
Wave Period | Period at which the waves occur | Seconds | 1 to 10 |
Battery Life | Voltage of the battery remaining for the purpose of deciding the time for replacement | Numeric | 4.8 to 13.3 |

Name of the Location of the Water Sensor | Location of the Sensor (Latitude, Longitude) |
---|---|
63rd Street Beach | (41.784561°, −87.571453°) |
Calumet Beach | (41.714739°, −87.527356°) |
Montrose Beach | (41.969094°, −87.638003°) |
Ohio Street Beach | (41.894328°, −87.613083°) |
Osterman Beach | (41.987675°, −87.651008°) |
Rainbow Beach | (41.760147°, −87.550081°) |

where

The MAE calculates similarity of the projections to potential results, while the RMSE is the standard deviation of the sample from the discrepancies between the expected

z is every z[i] value that is observed minus the sum of the average of the ztest values that are observed. The variance is,

Real-time raw water data are collected from different places. Since the data is raw, a few instances have some features missing, because battery drain interrupts the sensor and reduces battery sustainability. Some timestamp instances may also be unusable until the battery is replaced. Filling in the missing values is therefore extremely important before initiating the process of forecasting with the ML model. In the present study, the missing values in the dataset are first filled with the respective attribute mean, which makes the dataset ready for further analysis.
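Mean imputation can be sketched as follows. The sketch assumes missing readings are represented as `None`; the function name and the sample wave-height values are hypothetical.

```python
def impute_with_mean(column):
    """Replace missing entries (None) with the mean of the observed entries."""
    observed = [v for v in column if v is not None]
    if not observed:
        raise ValueError("column has no observed values to average")
    col_mean = sum(observed) / len(observed)
    return [col_mean if v is None else v for v in column]

# Hypothetical wave height readings with two gaps.
wave_height = [0.2, None, 0.4, None, 0.3]
filled = impute_with_mean(wave_height)
print(filled)
```

Mean imputation keeps the column average unchanged but reduces its variance, which is worth keeping in mind when many values are missing.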

It is known from the data summary that, out of ten features, two are timestamp-driven and do not contribute to the battery life prediction of the sensor node. Therefore, both features are omitted from further analysis. The remaining eight features of the

The next step was to extract the features having a higher impact on battery life prediction, while those with a negative impact were eliminated using rough sets. The number of features was thus reduced to 182 as part of dimensionality reduction. These features have the greatest impact on the target class prediction.

Layer | Type | Shape | Parameters |
---|---|---|---|
Dense\_17 | Dense | 128 | 22123 |
Dense\_18 | Dense | 256 | 31451 |
Dense\_19 | Dense | 256 | 63652 |
Dense\_20 | Dense | 256 | 63652 |
Dense\_21 | Dense | 256 | 52244 |
Dense\_22 | Dense | 128 | 31475 |
Dense\_23 | Dense | 64 | 7856 |
Dense\_24 | Dense | 1 | 61 |
Total Parameters | 272514 | | |
Trainable Parameters | 272514 | | |

Layer | Type | Shape | Parameters |
---|---|---|---|
Dense\_14 | Dense | 128 | 25624 |
Dense\_15 | Dense | 256 | 32123 |
Dense\_16 | Dense | 256 | 64634 |
Dense\_17 | Dense | 256 | 64634 |
Dense\_18 | Dense | 256 | 64634 |
Dense\_19 | Dense | 128 | 31684 |
Dense\_20 | Dense | 64 | 7956 |
Dense\_21 | Dense | 1 | 54 |
Total Parameters | 291343 | | |
Trainable Parameters | 291343 | | |
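The layer stack listed in the tables (dense layers of width 128, 256, 256, 256, 256, 128, 64 and 1) can be illustrated with a small forward-pass sketch. Only the layer widths come from the tables; the input dimension of 8, the ReLU hidden activation, the linear output and the random weight initialization are all assumptions made for illustration, and the weights are untrained.

```python
import random

LAYER_WIDTHS = [128, 256, 256, 256, 256, 128, 64, 1]  # widths from the layer tables

def init_layers(input_dim, widths, seed=0):
    """Create (weights, biases) per dense layer with small random weights."""
    rng = random.Random(seed)
    layers, fan_in = [], input_dim
    for width in widths:
        weights = [[rng.uniform(-0.1, 0.1) for _ in range(fan_in)]
                   for _ in range(width)]
        biases = [0.0] * width
        layers.append((weights, biases))
        fan_in = width
    return layers

def forward(layers, features):
    """Propagate one sample: ReLU on hidden layers (assumed), linear output."""
    activations = list(features)
    for i, (weights, biases) in enumerate(layers):
        pre = [sum(w * a for w, a in zip(row, activations)) + b
               for row, b in zip(weights, biases)]
        is_output = i == len(layers) - 1
        activations = pre if is_output else [max(0.0, v) for v in pre]
    return activations

# Hypothetical input: 8 already-scaled feature values.
layers = init_layers(input_dim=8, widths=LAYER_WIDTHS)
prediction = forward(layers, [0.5] * 8)
print(prediction)
```

The single output unit reflects the regression setting, where the network emits one battery life estimate per sample.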

The proposed model is evaluated on the basis of various standard performance metrics: Mean Absolute Error, Mean Squared Error, Root Mean Squared Error, and Test Variance [

Model Used | MAE | MSE | RMSE | TVS |
---|---|---|---|---|
DNN + Rough sets (Proposed) | 5.16 | 35.68 | 5.85 | 0.11 |
DNN | 5.25 | 36.81 | 5.98 | 0.17 |
Linear Regression + Rough sets | 6.45 | 80.47 | 8.74 | −0.23 |
Linear Regression | 979.62 | 16920158.899 | 40786.5 | −256623.7 |
XGBoost + Rough sets | 5.34 | 45.58 | 6.12 | 0.26 |
XGBoost | 5.47 | 44.12 | 6.31 | 0.29 |
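The metrics in the table can be computed along the following lines. This is a sketch under the assumption that the Test Variance Score (TVS) corresponds to the explained variance score, which the source does not confirm; the function name and the sample values are hypothetical.

```python
from math import sqrt

def regression_metrics(y_true, y_pred):
    """Return MAE, MSE, RMSE and explained variance for paired values."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = sqrt(mse)
    mean_true = sum(y_true) / n
    mean_err = sum(errors) / n
    var_true = sum((t - mean_true) ** 2 for t in y_true) / n
    var_err = sum((e - mean_err) ** 2 for e in errors) / n
    # Assumed definition of TVS: 1 - Var(error) / Var(target); needs var_true > 0.
    explained_variance = 1.0 - var_err / var_true
    return {"MAE": mae, "MSE": mse, "RMSE": rmse, "TVS": explained_variance}

# Hypothetical battery readings and predictions.
y_true = [10.0, 12.0, 14.0, 16.0]
y_pred = [11.0, 11.0, 15.0, 15.0]
metrics = regression_metrics(y_true, y_pred)
print(metrics)
```

Under this definition the score can become negative when the residuals vary more than the target itself, which is how a poorly fitted model can end up with a value below zero.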

The

There is no doubt that there are significant studies relevant to the application of IoT in the marine environment. However, there is a gap in research focusing on the energy optimization aspects of IoT applications, especially on enhancing the sustainability of battery life. In the present study, a novel approach has been adopted that combines rough sets with a DNN to predict the battery life of the IoT network with optimum accuracy. The normal pre-processing steps used in this approach were further refined by incorporating the rough set approach for extracting significant features, which has contributed immensely towards more accurate predictions. The results of the model were compared with state-of-the-art techniques to establish its superiority. As part of future work, the same model can be deployed on several IoT applications providing home surveillance, healthcare and defence services. The scalability and robustness of the model can also be validated by testing it on large-scale IoT applications such as traffic prediction, air pollution monitoring and waste management in smart city setups. The predictions of battery life would also guide the design and development of energy-efficient products involving IoT and sustainable energy technologies.