<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.1 20151215//EN" "http://jats.nlm.nih.gov/publishing/1.1/JATS-journalpublishing1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" article-type="research-article" dtd-version="1.1">
<front>
<journal-meta>
<journal-id journal-id-type="pmc">CMC</journal-id>
<journal-id journal-id-type="nlm-ta">CMC</journal-id>
<journal-id journal-id-type="publisher-id">CMC</journal-id>
<journal-title-group>
<journal-title>Computers, Materials &#x0026; Continua</journal-title>
</journal-title-group>
<issn pub-type="epub">1546-2226</issn>
<issn pub-type="ppub">1546-2218</issn>
<publisher>
<publisher-name>Tech Science Press</publisher-name>
<publisher-loc>USA</publisher-loc>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="publisher-id">32739</article-id>
<article-id pub-id-type="doi">10.32604/cmc.2023.032739</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Article</subject>
</subj-group>
</article-categories>
<title-group>
<article-title>Feature Fusion-Based Deep Learning Network to Recognize Table Tennis Actions</article-title>
<alt-title alt-title-type="left-running-head">Feature Fusion-Based Deep Learning Network to Recognize Table Tennis Actions</alt-title>
<alt-title alt-title-type="right-running-head">Feature Fusion-Based Deep Learning Network to Recognize Table Tennis Actions</alt-title>
</title-group>
<contrib-group content-type="authors">
<contrib id="author-1" contrib-type="author" corresp="yes">
<name name-style="western"><surname>Yen</surname><given-names>Chih-Ta</given-names></name><xref ref-type="aff" rid="aff-1">1</xref><email>chihtayen@gmail.com</email></contrib>
<contrib id="author-2" contrib-type="author">
<name name-style="western"><surname>Chen</surname><given-names>Tz-Yun</given-names></name><xref ref-type="aff" rid="aff-2">2</xref></contrib>
<contrib id="author-3" contrib-type="author">
<name name-style="western"><surname>Chen</surname><given-names>Un-Hung</given-names></name><xref ref-type="aff" rid="aff-3">3</xref></contrib>
<contrib id="author-4" contrib-type="author">
<name name-style="western"><surname>Wang</surname><given-names>Guo-Chang</given-names></name><xref ref-type="aff" rid="aff-3">3</xref></contrib>
<contrib id="author-5" contrib-type="author">
<name name-style="western"><surname>Chen</surname><given-names>Zong-Xian</given-names></name><xref ref-type="aff" rid="aff-3">3</xref></contrib>
<aff id="aff-1"><label>1</label><institution>Department of Electrical Engineering, National Taiwan Ocean University</institution>, <addr-line>Keelung City, 202301</addr-line>, <country>Taiwan</country></aff>
<aff id="aff-2"><label>2</label><institution>Office of Physical Education, National Formosa University</institution>, Yunlin County, <addr-line>632</addr-line>, <country>Taiwan</country></aff>
<aff id="aff-3"><label>3</label><institution>Department of Electrical Engineering, National Formosa University</institution>, Yunlin County <addr-line>632</addr-line>, <country>Taiwan</country></aff>
</contrib-group>
<author-notes>
<corresp id="cor1"><label>&#x002A;</label>Corresponding Author: Chih-Ta Yen. Email: <email>chihtayen@gmail.com</email></corresp>
</author-notes>
<pub-date pub-type="epub" date-type="pub" iso-8601-date="2022-08-16"><day>16</day>
<month>08</month>
<year>2022</year></pub-date>
<volume>74</volume>
<issue>1</issue>
<fpage>83</fpage>
<lpage>99</lpage>
<history>
<date date-type="received"><day>28</day><month>5</month><year>2022</year></date>
<date date-type="accepted"><day>29</day><month>6</month><year>2022</year></date>
</history>
<permissions>
<copyright-statement>&#x00A9; 2023 Yen et al.</copyright-statement>
<copyright-year>2023</copyright-year>
<copyright-holder>Yen et al.</copyright-holder>
<license xlink:href="https://creativecommons.org/licenses/by/4.0/">
<license-p>This work is licensed under a <ext-link ext-link-type="uri" xlink:type="simple" xlink:href="https://creativecommons.org/licenses/by/4.0/">Creative Commons Attribution 4.0 International License</ext-link>, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
</license>
</permissions>
<self-uri content-type="pdf" xlink:href="TSP_CMC_32739.pdf"></self-uri>
<abstract>
<p>This study proposes a system that classifies four basic table tennis strokes using a wearable device and deep learning networks. The wearable device consisted of a six-axis sensor, a Raspberry Pi 3, and a power bank. Multiple kernel sizes were used in a convolutional neural network (CNN) to evaluate their feature extraction performance. Moreover, a multiscale CNN with two kernel sizes was used to fuse features of different scales by concatenation. The CNN recognized the four table tennis strokes. Experimental data were obtained from 20 research participants who wore the sensor on the back of the hand while performing the four table tennis strokes in a laboratory environment. The data were collected to verify the performance of the proposed models for wearable devices. The sensor and multiscale CNN designed in this study achieved an accuracy and F1 score of 99.58&#x0025; and 99.16&#x0025;, respectively, for the four strokes. The accuracy under five-fold cross validation was 99.87&#x0025;, which also indicates the robustness of the multiscale CNN.</p>
</abstract>
<kwd-group kwd-group-type="author">
<kwd>Wearable devices</kwd>
<kwd>deep learning</kwd>
<kwd>six-axis sensor</kwd>
<kwd>feature fusion</kwd>
<kwd>multi-scale convolutional neural networks</kwd>
<kwd>action recognition</kwd>
</kwd-group>
</article-meta>
</front>
<body>
<sec id="s1"><label>1</label><title>Introduction</title>
<p>Motion capture systems can effectively detect human movements. Motion capture systems based on inertial sensors have been used in various studies because their small size, low cost, and adaptability to different environments enable them to be integrated with wearable devices for the monitoring and recording of body movement [<xref ref-type="bibr" rid="ref-1">1</xref>]. Athletes have used wearable devices to monitor their training and movements to improve their fitness and performance [<xref ref-type="bibr" rid="ref-2">2</xref>&#x2013;<xref ref-type="bibr" rid="ref-4">4</xref>]. Scientific training methods and equipment are highly valued in athletics. Conventional training equipment and methods that lack theoretical basis have been gradually replaced or improved by new methods through continual testing and by accumulating training experience. Thus, wearable devices are widely used in studies of various human activities [<xref ref-type="bibr" rid="ref-5">5</xref>&#x2013;<xref ref-type="bibr" rid="ref-7">7</xref>].</p>
<p>In sports competitions, particularly individual sports, motion analysis is advantageous for athletes because the results can provide them with useful feedback on performance; this feedback can be used to improve performance with corrective exercises. Experiments on applications of motion analysis in sports are becoming more common. For example, Malawski studied motion in fencing by placing two nine-axis sensors on the elbow and chest of athletes to collect data. They then used a linear support vector machine to analyze fencing accuracy and fencing footwork [<xref ref-type="bibr" rid="ref-8">8</xref>]. In rowing, Worsey&#x00A0;et&#x00A0;al.&#x00A0;reviewed various related studies and noted that most researchers placed sensors on rowing equipment to monitor rowing performance [<xref ref-type="bibr" rid="ref-9">9</xref>]. In swimming, Guignard&#x00A0;et&#x00A0;al.&#x00A0;placed two nine-axis sensors on the upper and lower arms of swimmers to analyze swimming performance [<xref ref-type="bibr" rid="ref-10">10</xref>]. In running, Struber&#x00A0;et&#x00A0;al.&#x00A0;used an S-Move system consisting of five nine-axis sensors to achieve three-dimensional automatic and dynamic gait analysis [<xref ref-type="bibr" rid="ref-11">11</xref>]. In table tennis, Tabrizi&#x00A0;et&#x00A0;al.&#x00A0;developed a forehand stroke evaluation system with a single nine-axis sensor to provide players, particularly new players, actual practice with three forehand strokes [<xref ref-type="bibr" rid="ref-12">12</xref>].</p>
<p>Although wearable devices are widely used in data collection for various human actions, the positioning of the devices on the body must be related to the action being studied; otherwise, the accuracy of action recognition will be reduced. Placing sensors in optimal positions is key for action recognition [<xref ref-type="bibr" rid="ref-13">13</xref>,<xref ref-type="bibr" rid="ref-14">14</xref>]. Therefore, some studies on wearable devices have used multiple sensors to improve accuracy. However, placing multiple sensors on different parts of the body can be difficult, inhibiting, or uncomfortable for research participants [<xref ref-type="bibr" rid="ref-15">15</xref>]. The number of sensors must also be considered for motion capture in some sports. Studies have analyzed and predicted physical activities through appropriate experiments. Therefore, to obtain optimal table tennis data for subsequent activity analysis and estimation, this study referred to studies of racket sports, such as badminton and table tennis, in [<xref ref-type="bibr" rid="ref-16">16</xref>,<xref ref-type="bibr" rid="ref-17">17</xref>]. In these studies, different accelerometer, gyroscope, or magnetometer placements on the body may affect action recognition accuracy. According to the results of [<xref ref-type="bibr" rid="ref-16">16</xref>] and [<xref ref-type="bibr" rid="ref-17">17</xref>], a single sensor had the highest accuracy when worn on the wrist. Therefore, a single sensor was placed on the wrists of the participants for data collection in this study.</p>
<p>Artificial intelligence techniques have also matured in recent years. Some studies have combined wearable devices with artificial intelligence to achieve more accurate action recognition. Lawal&#x00A0;et&#x00A0;al.&#x00A0;used accelerometers and gyroscopes to measure motion at seven different parts of the human body and applied a dual-channel convolutional neural network (CNN) for data classification. The CNN obtained an F1 score of 90&#x0025; on a public database. The results revealed that using complementary data from both accelerometers and gyroscopes for different body parts enabled the system to effectively classify actions, and data from sensors on the neck and the waist had a greater effect on action recognition accuracy [<xref ref-type="bibr" rid="ref-18">18</xref>]. Gholamiangonabadi&#x00A0;et&#x00A0;al.&#x00A0;proposed a leave-one-subject-out cross-validation network architecture. The network comprised six feedforward neural networks and CNNs, and 10-fold cross validation was used for human action recognition. The accuracy reached 99.85&#x0025; [<xref ref-type="bibr" rid="ref-19">19</xref>]. Tufek&#x00A0;et&#x00A0;al.&#x00A0;used ZigBee modules to automatically collect human action data. Because the data set was imbalanced, they used data augmentation to improve performance. The last three layers of the long short-term memory (LSTM) network they used had the highest accuracy of 93&#x0025; [<xref ref-type="bibr" rid="ref-20">20</xref>]. B&#x00FC;the&#x00A0;et&#x00A0;al.&#x00A0;placed a single sensor on a tennis racket and attached two sensors to the player&#x2019;s shoes to capture data related to arm movements and footwork during a shot. A longest common subsequence algorithm was used to recognize five shots and three footwork patterns; the accuracies of shot and footwork recognition were 94&#x0025; and 95&#x0025;, respectively [<xref ref-type="bibr" rid="ref-21">21</xref>].
Brzostowski&#x00A0;et&#x00A0;al.&#x00A0;used a Pebble smartwatch to collect acceleration data and used mel-frequency cepstral coefficients for feature extraction. Subsequently, principal component analysis was used to reduce the dimensionality of the data, and finally k-nearest neighbors and logistic regression models were employed to perform 10-fold cross validation on tennis shots. The accuracies of the k-nearest neighbors and logistic regression models were 82.22&#x0025; and 87.99&#x0025;, respectively, whereas the accuracies under leave-one-out cross validation were 82.16&#x0025; and 87.16&#x0025;, respectively [<xref ref-type="bibr" rid="ref-22">22</xref>]. Pei&#x00A0;et&#x00A0;al.&#x00A0;used a six-axis sensor for data collection and trained a model with a 50&#x0025; overlapping register window and gravity, and the model achieved 98&#x0025; accuracy for shot detection. It achieved 96&#x0025; accuracy for three types of shot recognition and 80&#x0025; accuracy for two types of spin recognition [<xref ref-type="bibr" rid="ref-23">23</xref>]. Pardo&#x00A0;et&#x00A0;al.&#x00A0;placed six-axis sensors on the participants&#x2019; wrists and waists for data collection and used a CNN to classify four shots in tennis and seven non-tennis activities with a mean accuracy of 99&#x0025; [<xref ref-type="bibr" rid="ref-24">24</xref>]. Yen&#x00A0;et&#x00A0;al.&#x00A0;applied deep learning with a feature fusion method to wearable sensor devices for human activity recognition; the accuracies in tenfold cross validation were 99.56&#x0025; and 97.46&#x0025;, respectively [<xref ref-type="bibr" rid="ref-25">25</xref>].</p>
<p>Different deep learning networks are suitable for application in different fields. To determine which deep learning network is suitable for action recognition, certain studies have tested multiple networks. Sansan&#x00A0;et&#x00A0;al.&#x00A0;analyzed deep learning networks suitable for activity recognition and compared their performance in terms of accuracy, speed, and memory. Deep learning networks such as CNN, LSTM, bidirectional LSTM, gated recurrent unit, and deep belief networks were compared, and 10 groups of public databases were analyzed. Each dataset included acceleration and angular velocity data for different body parts. The analysis results for various networks revealed that CNN was effective for capturing activity signals and identifying correlations between sensors. In most cases, it had excellent performance with faster response speed and lower memory consumption than other networks. Sun&#x00A0;et&#x00A0;al.&#x00A0;used a multi-feature learning model with enhanced local attention and a lightweight feature-optimized CNN with a joint learning strategy for vehicle re-identification [<xref ref-type="bibr" rid="ref-26">26</xref>,<xref ref-type="bibr" rid="ref-27">27</xref>]. In summary, wearable sensors have been widely used to classify actions, and sensor positioning has a substantial effect on the accuracy of action recognition. To accurately recognize the four table tennis strokes (i.e., forehand, backhand, forehand cut, and backhand cut), the sensors were placed on the back of the hands of the participants. Several tests revealed that hand movements had a greater correlation with the data collected for strokes. Thus, acceleration and angular velocity data of the hand measured by an accelerometer and gyroscope, respectively, can be used in an artificial neural network for action recognition.</p>
<p>The rest of this paper is organized as follows. Section 2 describes the hardware architecture of the wearable device, data recording conditions, sensor calibration methods, and distribution of the recorded data in the training, validation, and test sets. Section 3 explains the motion signal acquisition, data measurement method, input format, and network architecture. Section 4 introduces the calculation methods for evaluation metrics and accuracy used in the experiment. Section 5 presents the experimental results for different convolutional networks and discusses the evaluation results. Finally, Section 6 presents the conclusion.</p>
</sec>
<sec id="s2"><label>2</label><title>Experimental Setup</title>
<sec id="s2_1"><label>2.1</label><title>Hardware Architecture</title>
<p>The wearable device used in this study comprised a Raspberry Pi 3, a six-axis sensor (MPU-6050), and a power bank with a capacity of 10050 mAh. Because the power bank must power the Raspberry Pi 3 for a long time during data measurement, its large volume and weight may affect participant performance. Therefore, the power bank was tied to the waist, the Raspberry Pi 3 was fixed on the arm, and the six-axis sensor was installed on the back of the hand to collect the acceleration and angular velocity values of different strokes as input data for the deep learning network proposed in this study. There are two types of table tennis rackets, and their grips differ. To prevent the different grips from affecting sensor calibration and to avoid inconsistencies in the sensed data, the sensor was placed on the back of the hand. The placement of the wearable device is presented in <xref ref-type="fig" rid="fig-1">Fig. 1</xref>.</p>
<fig id="fig-1"><label>Figure 1</label><caption><title>Placement of the wearable device</title></caption><graphic mimetype="image" mime-subtype="png" xlink:href="CMC_32739-fig-1.png"/></fig>
<p>In this experiment, the Raspberry Pi 3 was used as a microcontroller for data acquisition, and the inter-integrated circuit (I2C) communication protocol was used to read the movement signals measured by the six-axis sensor. The gyroscope supports full-scale ranges of <inline-formula id="ieqn-1"><mml:math id="mml-ieqn-1"><mml:mo>&#x00B1;</mml:mo></mml:math></inline-formula>250 &#x00B0;/s, <inline-formula id="ieqn-2"><mml:math id="mml-ieqn-2"><mml:mo>&#x00B1;</mml:mo></mml:math></inline-formula>500 &#x00B0;/s, <inline-formula id="ieqn-3"><mml:math id="mml-ieqn-3"><mml:mo>&#x00B1;</mml:mo></mml:math></inline-formula>1000 &#x00B0;/s, and <inline-formula id="ieqn-4"><mml:math id="mml-ieqn-4"><mml:mo>&#x00B1;</mml:mo></mml:math></inline-formula>2000 &#x00B0;/s. In this study, the accelerometer measurement range and sensitivity were <inline-formula id="ieqn-5"><mml:math id="mml-ieqn-5"><mml:mo>&#x00B1;</mml:mo></mml:math></inline-formula>16&#x2005;g and 2048 LSB/g, respectively, and the gyroscope was set to <inline-formula id="ieqn-6"><mml:math id="mml-ieqn-6"><mml:mo>&#x00B1;</mml:mo></mml:math></inline-formula>2000 &#x00B0;/s with a sensitivity of 16.4 LSB/(&#x00B0;/s). The output signals of the accelerometer and gyroscope were sampled at a frequency of 10&#x2005;Hz, and the power bank supplied a stable DC 5&#x2005;V at 2.1 A to the wearable device.</p>
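<p>The sensitivities quoted above determine how raw sensor counts map to physical units. The following minimal Python sketch illustrates that conversion; the function name, the tuple layout, and the sample counts are assumptions made for illustration and are not taken from the acquisition software used in the study.</p>
<preformat>
```python
# Sensitivities quoted in the text for the selected full-scale ranges.
ACCEL_LSB_PER_G = 2048.0    # accelerometer at +/-16 g
GYRO_LSB_PER_DPS = 16.4     # gyroscope at +/-2000 deg/s


def raw_to_physical(raw_accel, raw_gyro):
    """Convert raw signed counts to g and deg/s.

    raw_accel and raw_gyro are hypothetical (x, y, z) tuples of counts.
    """
    accel_g = tuple(count / ACCEL_LSB_PER_G for count in raw_accel)
    gyro_dps = tuple(count / GYRO_LSB_PER_DPS for count in raw_gyro)
    return accel_g, gyro_dps


# Invented counts: 2048 counts on the x-axis correspond to 1 g.
accel, gyro = raw_to_physical((2048, 0, -4096), (164, -328, 0))
print("acceleration in g:", accel)
print("angular velocity in deg/s:", gyro)
```
</preformat>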
</sec>
<sec id="s2_2"><label>2.2</label><title>Database of Recorded Experimental Data</title>
<p>To enhance the stability of the six-axis sensor, it was calibrated before the device was worn: 1000 samples were collected, and their mean was taken as the error to be corrected. The <italic>x</italic>-, <italic>y</italic>-, and <italic>z</italic>-axes of the gyroscope were calibrated simultaneously. Because the <italic>x</italic>-, <italic>y</italic>-, and <italic>z</italic>-axes of the accelerometer were calibrated using gravity, their calibration was performed separately. The direction and position of the six-axis sensor were strictly controlled. The <italic>x</italic>-axis was toward the fingertip, the <italic>y</italic>-axis was toward the left side of the back of the hand, and the <italic>z</italic>-axis was toward the palm, as displayed in <xref ref-type="fig" rid="fig-2">Fig. 2</xref>.</p>
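<p>A mean-based offset calibration of this kind can be sketched in Python as follows. This is only an illustration under stated assumptions: the sensor is taken to be stationary during calibration, the expected reading is 0 for a gyroscope axis and 1 g for the accelerometer axis aligned with gravity, and the function names and sample values are hypothetical.</p>
<preformat>
```python
from statistics import mean


def estimate_offset(samples, expected=0.0):
    """Mean error of stationary readings relative to the expected value."""
    return mean(samples) - expected


def apply_offset(value, offset):
    """Correct a later reading with the stored calibration offset."""
    return value - offset


# Invented stationary readings: 1000 samples per axis, as in the text.
gyro_x = [0.5, 0.7, 0.6, 0.6] * 250         # deg/s, expected 0 at rest
accel_z = [1.02, 1.04, 1.03, 1.03] * 250    # g, expected 1 g along gravity

gyro_x_offset = estimate_offset(gyro_x)
accel_z_offset = estimate_offset(accel_z, expected=1.0)

print("gyroscope x offset:", gyro_x_offset)
print("accelerometer z offset:", accel_z_offset)
print("corrected gyroscope reading:", apply_offset(12.6, gyro_x_offset))
```
</preformat>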
<p>During the experiment, the wearable device was used to collect data in accordance with this specification. During data collection, the sensor must be worn in a fixed position and orientation. A loose sensor may affect overall data recording during movement, resulting in reduced accuracy. Therefore, the hand was wrapped with a wrist guard, and the six-axis sensor was placed on top of it. The wrist guard was used to fix the wearable device on the back of the hand without shaking, and effectively reduced the error in the collected data.</p>
<fig id="fig-2"><label>Figure 2</label><caption><title>Six-axis sensor on the back of the hand</title></caption><graphic mimetype="image" mime-subtype="png" xlink:href="CMC_32739-fig-2.png"/></fig>
<p>In this study, 20 participants were recruited to use their right hands to perform four basic table tennis strokes: forehand, backhand, forehand cut, and backhand cut. To avoid overly monotonous data, recordings were collected from all participants, and each of the four table tennis strokes was recorded 600 times in total. The Raspberry Pi 3 collected data from the six-axis sensor at a rate of 10&#x2005;Hz and stored them in a text file. A total of 2400 stroke records were stored in the database; each measurement was saved as a 30&#x2009;&#x00D7;&#x2009;60 matrix in which each row contains the 60 feature values of one stroke. The collected data were divided into training (80&#x0025;), validation (10&#x0025;), and test sets (10&#x0025;). The number of each type of stroke recorded in each data set is listed in <xref ref-type="table" rid="table-1">Tab. 1</xref>.</p>
<table-wrap id="table-1"><label>Table 1</label><caption><title>Number of each type of stroke in each data set</title></caption>
<table frame="hsides">
<colgroup>
<col align="left"/>
<col align="left"/>
<col align="left"/>
<col align="left"/>
<col align="left"/>
</colgroup>
<thead>
<tr>
<th align="left"/>
<th align="left">Training set</th>
<th align="left">Test set</th>
<th align="left">Validation set</th>
<th align="left">Total</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left">Forehand</td>
<td align="left">480</td>
<td align="left">60</td>
<td align="left">60</td>
<td align="left">600</td>
</tr>
<tr>
<td align="left">Backhand</td>
<td align="left">480</td>
<td align="left">60</td>
<td align="left">60</td>
<td align="left">600</td>
</tr>
<tr>
<td align="left">Forehand cut</td>
<td align="left">480</td>
<td align="left">60</td>
<td align="left">60</td>
<td align="left">600</td>
</tr>
<tr>
<td align="left">Backhand cut</td>
<td align="left">480</td>
<td align="left">60</td>
<td align="left">60</td>
<td align="left">600</td>
</tr>
</tbody>
</table>
</table-wrap>
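<p>The per-class counts in Table 1 correspond to an 80/10/10 split applied to each stroke type. The Python sketch below reproduces those counts with a shuffled per-class split. Whether the original split was random or grouped by participant is not stated, so the shuffling, the seed, and all names here are assumptions, and placeholder integers stand in for the real recordings.</p>
<preformat>
```python
import random

STROKES = ("forehand", "backhand", "forehand cut", "backhand cut")


def split_per_class(records, train_frac=0.8, val_frac=0.1, seed=0):
    """Split each class into training, validation, and test subsets.

    records maps a stroke label to a list of recordings.
    """
    rng = random.Random(seed)
    splits = {"train": [], "val": [], "test": []}
    for label, items in records.items():
        shuffled = list(items)
        rng.shuffle(shuffled)
        n_train = round(len(shuffled) * train_frac)
        n_val = round(len(shuffled) * val_frac)
        splits["train"] += [(x, label) for x in shuffled[:n_train]]
        splits["val"] += [(x, label) for x in shuffled[n_train:n_train + n_val]]
        splits["test"] += [(x, label) for x in shuffled[n_train + n_val:]]
    return splits


# Placeholder data: 600 recordings per stroke, as in Table 1.
records = {label: list(range(600)) for label in STROKES}
splits = split_per_class(records)
print({name: len(items) for name, items in splits.items()})
# prints {'train': 1920, 'val': 240, 'test': 240}
```
</preformat>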
</sec>
</sec>
<sec id="s3"><label>3</label><title>Algorithm for Table Tennis Stroke Recognition</title>
<p>Stroke recognition consisted of data collection followed by the use of a deep learning algorithm to recognize different strokes. The details of the proposed action recognition algorithm are described in the following subsections.</p>
<sec id="s3_1"><label>3.1</label><title>Motion Signal Acquisition</title>
<p>A wearable device was used to measure the experimental data for acceleration and angular velocity of the participant&#x2019;s hand while performing a stroke. The participants performed four table tennis strokes: forehand, backhand, forehand cut, and backhand cut. <xref ref-type="fig" rid="fig-3">Fig. 3</xref> presents plots of acceleration and angular velocity values <italic>vs.</italic> time during the stroke.</p>
<fig id="fig-3"><label>Figure 3</label><caption><title>Line graphs for acceleration (left) and angular velocity (right) during a single swing: (a) forehand; (b) backhand; (c) forehand cut; (d) backhand cut</title></caption><graphic mimetype="image" mime-subtype="png" xlink:href="CMC_32739-fig-3a.png"/>
<graphic mimetype="image" mime-subtype="png" xlink:href="CMC_32739-fig-3b.png"/></fig>
</sec>
<sec id="s3_2"><label>3.2</label><title>Data Measurement Methods and Formats</title>
<p>During data collection, each participant performed one stroke per second, and the sensor recorded six values, namely acceleration (<italic>x</italic>-, <italic>y</italic>-, and <italic>z</italic>-axes) and angular velocity (<italic>x</italic>-, <italic>y</italic>-, and <italic>z</italic>-axes), at a sampling frequency of 10&#x2005;Hz; thus, a <inline-formula id="ieqn-7"><mml:math id="mml-ieqn-7"><mml:mn>1</mml:mn><mml:mo>&#x00D7;</mml:mo><mml:mn>60</mml:mn></mml:math></inline-formula> matrix was stored each second. The matrix was recorded in the following order: <italic>x</italic>-axis acceleration, <italic>y</italic>-axis acceleration, <italic>z</italic>-axis acceleration, <italic>x</italic>-axis angular velocity, <italic>y</italic>-axis angular velocity, and <italic>z</italic>-axis angular velocity. Data were collected for 30&#x2005;s; thus, a <inline-formula id="ieqn-8"><mml:math id="mml-ieqn-8"><mml:mn>30</mml:mn><mml:mo>&#x00D7;</mml:mo><mml:mn>60</mml:mn></mml:math></inline-formula> matrix was stored for each measurement.</p>
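<p>The relation between the 10&#x2005;Hz sampling rate, the six sensor channels, and the 60 values stored per stroke can be illustrated with a short Python sketch. The text does not state whether the 60 values are interleaved per sample or grouped per axis; the sketch assumes the interleaved layout (the six values of one sample are stored consecutively), and the channel names are hypothetical.</p>
<preformat>
```python
CHANNELS = ("ax", "ay", "az", "gx", "gy", "gz")


def split_channels(record):
    """Split one flat stroke record into per-channel sequences.

    Assumption: values are stored sample by sample, so every sixth
    value belongs to the same channel.
    """
    if len(record) % len(CHANNELS) != 0:
        raise ValueError("record length must be a multiple of six")
    step = len(CHANNELS)
    return {name: record[i::step] for i, name in enumerate(CHANNELS)}


# Placeholder record: 10 samples x 6 channels = 60 values for one stroke.
record = list(range(60))
channels = split_channels(record)
print(len(channels["ax"]), "samples per channel")
print("x-axis acceleration indices:", channels["ax"])
```
</preformat>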
</sec>
<sec id="s3_3"><label>3.3</label><title>Network Architecture</title>
<p>A CNN was used in this study because it can extract input signal features through its convolutional layer, eliminating the need for manual feature extraction in conventional machine learning. Furthermore, the CNN can learn and classify through its fully connected layers. In this study, multiscale CNN was used for feature fusion, and multiscale correlation features of different lengths were obtained through kernels of different sizes [<xref ref-type="bibr" rid="ref-28">28</xref>] to achieve feature extraction at different scales.</p>
<p>The six-axis sensor signals collected by the wearable device were first stored in a fixed format and then normalized as in <xref ref-type="disp-formula" rid="eqn-1">Eq. (1)</xref>. The minimum value min(<italic>x</italic><sub>i</sub>) of the sequence was subtracted from the initial value <italic>x</italic><sub>i</sub>, and the difference was divided by the difference between the maximum value max(<italic>x</italic><sub>i</sub>) and the minimum value min(<italic>x</italic><sub>i</sub>) of the sequence. The result <inline-formula id="ieqn-9"><mml:math id="mml-ieqn-9"><mml:msub><mml:mrow><mml:mover><mml:mi>x</mml:mi><mml:mo>&#x007E;</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> therefore lies in the range of 0 to 1.
<disp-formula id="eqn-1"><label>(1)</label><mml:math id="mml-eqn-1" display="block"><mml:msub><mml:mrow><mml:mover><mml:mi>x</mml:mi><mml:mo>&#x007E;</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:msub><mml:mi>x</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>&#x2212;</mml:mo><mml:mo movablelimits="true" form="prefix">min</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi>x</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mo movablelimits="true" form="prefix">max</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi>x</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow><mml:mo>&#x2212;</mml:mo><mml:mo movablelimits="true" form="prefix">min</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi>x</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:mfrac></mml:math></disp-formula>where, <italic>x</italic><sub>i</sub> is the input data.</p>
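<p>Eq. (1) can be written as a few lines of Python. The sketch below is a minimal illustration; the handling of a constant sequence (where the denominator would be zero) is an assumption added here and is not described in the text.</p>
<preformat>
```python
def min_max_normalize(values):
    """Scale a sequence to the range 0 to 1 as in Eq. (1)."""
    lowest, highest = min(values), max(values)
    span = highest - lowest
    if span == 0:
        # Assumption: a constant sequence is mapped to all zeros.
        return [0.0 for _ in values]
    return [(v - lowest) / span for v in values]


print(min_max_normalize([2.0, 4.0, 6.0, 10.0]))
# prints [0.0, 0.25, 0.5, 1.0]
```
</preformat>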
<p>To enable the CNN to predict the correct action with data from a single stroke, the <inline-formula id="ieqn-10"><mml:math id="mml-ieqn-10"><mml:mn>1</mml:mn><mml:mo>&#x00D7;</mml:mo><mml:mn>60</mml:mn></mml:math></inline-formula> data of one stroke were normalized and were used as input to the proposed network architecture. Among these, convolution blocks with three different kernel sizes, A, B, and C, were used for primary feature extraction. Features of different scales were input to the fully connected layers for learning and classification through subsequent concatenation, as presented in <xref ref-type="fig" rid="fig-4">Fig. 4</xref>.</p>
<fig id="fig-4"><label>Figure 4</label><caption><title>CNN architecture combining three different feature extraction methods</title></caption><graphic mimetype="image" mime-subtype="png" xlink:href="CMC_32739-fig-4.png"/></fig>
<p>The outputs of convolutional blocks A and B were concatenated as the input for the next layer of the network. The subsequent network comprised two fully connected layers, which were used for feature classification. Within each convolutional block, a maximum pooling layer, batch normalization, and random dropout were applied between the convolutional layers. The maximum pooling layer retained only the most influential features, whereas batch normalization standardized the activations of each mini-batch. Together, the maximum pooling layer and batch normalization reduced the computation required during network training and increased the speed of the network; random dropout was used to prevent overfitting during network training. A rectified linear unit was used as the activation function so that the network could model nonlinear relationships. Because the rectified linear unit sets negative outputs to 0, some neurons are inactive for a given input, which helps to reduce overfitting. In the final computation, the Softmax function was used to calculate the probability of each of the four strokes, and the stroke with the highest probability was selected as the classification result. The overall network architecture is presented in <xref ref-type="fig" rid="fig-4">Fig. 4</xref>. Groups of 60 values collected with the accelerometer and gyroscope were used as input data for training the CNN; each group of values was the data collected for one stroke. The data were processed by convolutional blocks A and B for feature extraction and then passed through two fully connected layers with 256 and 128 neurons, respectively, to obtain one of the four possible outputs: forehand, backhand, forehand cut, or backhand cut. The convolutional blocks A and B are presented in <xref ref-type="fig" rid="fig-5">Fig. 5</xref>.</p>
<fig id="fig-5"><label>Figure 5</label><caption><title>Network architecture of the convolutional blocks of different scales: (a) A; (b) B</title></caption><graphic mimetype="image" mime-subtype="png" xlink:href="CMC_32739-fig-5.png"/></fig>
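<p>The final classification step, in which the Softmax function yields one probability per stroke and the most probable stroke is selected, can be sketched in plain Python. The output scores in the example are invented, and the function names are hypothetical.</p>
<preformat>
```python
import math

STROKES = ("forehand", "backhand", "forehand cut", "backhand cut")


def softmax(scores):
    """Convert raw output scores to probabilities that sum to 1."""
    peak = max(scores)  # subtracting the peak keeps exp() stable
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(scores):
    """Return the stroke with the highest Softmax probability."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return STROKES[best], probs[best]


label, prob = classify([0.1, 2.0, -1.0, 0.3])
print(label, round(prob, 3))
```
</preformat>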
<p>The three convolutional layers in each block had 32, 64, and 128 filters, respectively, and the stride was set to 1. The kernel size was set to 3 in block A and 5 in block B; otherwise, the blocks were identical. The max pooling layers used a 1&#x2009;&#x00D7;&#x2009;3 window with a stride of 2, and the dropout rate was set to 0.25. The Adam optimizer was used in this study with an initial learning rate of 0.001; this learning rate decreased by 10&#x0025; after every 10 iterations. Categorical cross-entropy was used as the loss function to calculate errors between predicted and actual values and thereby adjust the training weights of the model, and the number of iterations of the CNN was set to 200.</p>
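<p>To make these hyperparameters concrete, the following Python sketch traces the feature length through one convolutional block and evaluates the stepwise learning rate decay. Several points are assumptions that the text does not confirm: the input is treated as a single channel of length 60, the convolutions use no padding, one pooling layer follows each convolutional layer, and the decay is read as multiplying the learning rate by 0.9 every 10 iterations. All function names are hypothetical.</p>
<preformat>
```python
def conv_out(length, kernel, stride=1):
    """Output length of a 1-D convolution without padding (assumption)."""
    return (length - kernel) // stride + 1


def pool_out(length, window=3, stride=2):
    """Output length of max pooling with a 1 x 3 window and stride 2."""
    return (length - window) // stride + 1


def block_feature_count(input_len, kernel, filters=(32, 64, 128)):
    """Flattened feature count after three conv + pool stages."""
    length = input_len
    for _ in filters:
        length = pool_out(conv_out(length, kernel))
    return length * filters[-1]


def step_decay(iteration, base_lr=0.001, drop=0.10, every=10):
    """Learning rate reduced by 10 percent after every 10 iterations."""
    return base_lr * (1.0 - drop) ** (iteration // every)


features_a = block_feature_count(60, kernel=3)   # block A
features_b = block_feature_count(60, kernel=5)   # block B
print(features_a, features_b, features_a + features_b)
# prints 512 384 896
print(step_decay(0), step_decay(10), step_decay(199))
```
</preformat>
<p>Under these assumptions, concatenating the outputs of blocks A and B gives the fused feature vector that is passed to the fully connected layers; a different padding or pooling arrangement would change the counts.</p>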
</sec>
</sec>
<sec id="s4"><label>4</label><title>Calculation Methods of Evaluation Metrics and Accuracy</title>
<p>Solutions to classification problems in deep learning or statistics can be evaluated using a confusion matrix. The rows of the confusion matrix are the actual classes, the columns are the predicted classes, and the results are presented as the number of predictions for each actual&#x2013;predicted class pairing. As presented in <xref ref-type="table" rid="table-2">Tab. 2</xref>, true positive (TP) indicates that the actual value is positive and the predicted value is also positive; false negative (FN) means that the actual value is positive and the predicted value is negative; false positive (FP) indicates that the actual value is negative and the predicted value is positive; and true negative (TN) means that the actual value is negative and the predicted value is also negative.</p>
<table-wrap id="table-2"><label>Table 2</label><caption><title>Two-class confusion matrix</title></caption>
<table frame="hsides">
<colgroup>
<col align="left"/>
<col align="left"/>
<col align="left"/>
<col align="left"/>
</colgroup>
<thead>
<tr>
<th align="center" colspan="2">Confusion matrix</th>
<th align="center" colspan="2">Actual value</th>
</tr>
<tr>
<th/>
<th/>
<th align="left">Positive</th>
<th align="left">Negative</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left" rowspan="2">Predicted value</td>
<td align="left">Positive</td>
<td align="left">True positive (TP)</td>
<td align="left">False positive (FP)</td>
</tr>
<tr>
<td align="left">Negative</td>
<td align="left">False negative (FN)</td>
<td align="left">True negative (TN)</td>
</tr>
</tbody>
</table>
</table-wrap>
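<p>The four outcomes in <xref ref-type="table" rid="table-2">Tab. 2</xref> can be counted directly from paired lists of actual and predicted labels. The following Python sketch illustrates the counting for one stroke treated as the positive class; the labels are hypothetical, and the function name binary_counts is introduced here for illustration only.</p>
<code language="python"><![CDATA[
```python
def binary_counts(actual, predicted, positive):
    """Count TP, FP, FN and TN, treating `positive` as the positive class."""
    tp = fp = fn = tn = 0
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            tp += 1  # predicted positive, actually positive
        elif p == positive:
            fp += 1  # predicted positive, actually negative
        elif a == positive:
            fn += 1  # predicted negative, actually positive
        else:
            tn += 1  # predicted negative, actually negative
    return tp, fp, fn, tn


# Hypothetical labels for eight strokes
actual = ["forehand", "forehand", "backhand", "forehand cut",
          "backhand cut", "forehand", "backhand", "forehand cut"]
predicted = ["forehand", "backhand", "backhand", "forehand",
             "backhand cut", "forehand", "backhand", "forehand cut"]

tp, fp, fn, tn = binary_counts(actual, predicted, positive="forehand")
print(tp, fp, fn, tn)
```
]]></code>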
<p>A network should achieve high prediction accuracy. However, the values in the confusion matrix are raw counts, which makes it difficult to judge the quality of a model directly when the amount of data is large. Therefore, four metrics (accuracy, precision, recall, and specificity) were calculated from TP, FN, FP, and TN.</p>
<p>Accuracy: The proportion of all predictions that are correct. In this study, accuracy is the proportion of strokes that were correctly predicted, and it can be calculated with <xref ref-type="disp-formula" rid="eqn-2">Eq. (2)</xref>.
<disp-formula id="eqn-2"><label>(2)</label><mml:math id="mml-eqn-2" display="block"><mml:mrow><mml:mtext mathvariant="italic">Accuracy</mml:mtext></mml:mrow><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mi>T</mml:mi><mml:mi>P</mml:mi><mml:mo>+</mml:mo><mml:mi>T</mml:mi><mml:mi>N</mml:mi></mml:mrow><mml:mrow><mml:mi>T</mml:mi><mml:mi>P</mml:mi><mml:mo>+</mml:mo><mml:mi>T</mml:mi><mml:mi>N</mml:mi><mml:mo>+</mml:mo><mml:mi>F</mml:mi><mml:mi>P</mml:mi><mml:mo>+</mml:mo><mml:mi>F</mml:mi><mml:mi>N</mml:mi></mml:mrow></mml:mfrac></mml:math></disp-formula></p>
<p>Precision: For a single class, precision is the proportion of positive predictions that are actually positive. For example, if forehand is considered positive and the other actions are considered negative, precision is the number of correctly classified forehands divided by the total number of strokes predicted as forehands (both correctly and incorrectly). High precision indicates that a stroke predicted as a given class is very likely to belong to that class. Precision can be calculated as in <xref ref-type="disp-formula" rid="eqn-3">Eq. (3)</xref>.
<disp-formula id="eqn-3"><label>(3)</label><mml:math id="mml-eqn-3" display="block"><mml:mrow><mml:mtext mathvariant="italic">Precision</mml:mtext></mml:mrow><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mi>T</mml:mi><mml:mi>P</mml:mi></mml:mrow><mml:mrow><mml:mi>T</mml:mi><mml:mi>P</mml:mi><mml:mo>+</mml:mo><mml:mi>F</mml:mi><mml:mi>P</mml:mi></mml:mrow></mml:mfrac></mml:math></disp-formula></p>
<p>Recall: As for precision, if forehand is positive and the other actions are negative, recall is the number of correctly classified forehands divided by the sum of the number of correctly classified forehands and the number of forehands incorrectly classified as other strokes. That is, recall is the proportion of actual forehands that were correctly recognized. High recall indicates that most strokes of a given class are detected, and recall (also called sensitivity) can be calculated with <xref ref-type="disp-formula" rid="eqn-4">Eq. (4)</xref>.
<disp-formula id="eqn-4"><label>(4)</label><mml:math id="mml-eqn-4" display="block"><mml:mrow><mml:mtext mathvariant="italic">Sensitivity</mml:mtext></mml:mrow><mml:mo>=</mml:mo><mml:mrow><mml:mtext mathvariant="italic">Recall</mml:mtext></mml:mrow><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mi>T</mml:mi><mml:mi>P</mml:mi></mml:mrow><mml:mrow><mml:mi>T</mml:mi><mml:mi>P</mml:mi><mml:mo>+</mml:mo><mml:mi>F</mml:mi><mml:mi>N</mml:mi></mml:mrow></mml:mfrac></mml:math></disp-formula></p>
<p>Specificity: For a single class, specificity is the proportion of actual negatives that are correctly predicted as negative. Again assuming that forehand is positive and the other actions are negative, specificity is the number of correctly classified other actions divided by the sum of the number of correctly classified other actions and the number of other actions incorrectly classified as forehands. That is, specificity is the proportion of all other actions that were correctly recognized as not being forehands. High specificity indicates a high probability of correctly recognizing the other actions. Specificity can be calculated with <xref ref-type="disp-formula" rid="eqn-5">Eq. (5)</xref>.
<disp-formula id="eqn-5"><label>(5)</label><mml:math id="mml-eqn-5" display="block"><mml:mrow><mml:mtext mathvariant="italic">Specificity</mml:mtext></mml:mrow><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mi>T</mml:mi><mml:mi>N</mml:mi></mml:mrow><mml:mrow><mml:mi>T</mml:mi><mml:mi>N</mml:mi><mml:mo>+</mml:mo><mml:mi>F</mml:mi><mml:mi>P</mml:mi></mml:mrow></mml:mfrac></mml:math></disp-formula></p>
<p>These four metrics express the counts in the confusion matrix as values between 0 and 1. A further indicator, the F1 score, can be derived from precision and recall.</p>
<p>The F1 score is the harmonic mean of precision and recall. Because the harmonic mean is dominated by the smaller of the two values, the score is high only when both precision and recall are high. Its value falls between 0 and 1; 1 indicates perfect precision and recall, and 0 indicates that no positive sample was correctly identified. The F1 score is calculated with <xref ref-type="disp-formula" rid="eqn-6">Eq. (6)</xref>.
<disp-formula id="eqn-6"><label>(6)</label><mml:math id="mml-eqn-6" display="block"><mml:mi>F</mml:mi><mml:mn>1</mml:mn><mml:mspace width="thinmathspace" /><mml:mrow><mml:mtext mathvariant="italic">Score</mml:mtext></mml:mrow><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mn>2</mml:mn><mml:mo>&#x2217;</mml:mo><mml:mrow><mml:mtext mathvariant="italic">Precision</mml:mtext></mml:mrow><mml:mo>&#x2217;</mml:mo><mml:mrow><mml:mtext mathvariant="italic">Recall</mml:mtext></mml:mrow></mml:mrow><mml:mrow><mml:mrow><mml:mtext mathvariant="italic">Precision</mml:mtext></mml:mrow><mml:mo>+</mml:mo><mml:mrow><mml:mtext mathvariant="italic">Recall</mml:mtext></mml:mrow></mml:mrow></mml:mfrac></mml:math></disp-formula></p>
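<p>As a worked illustration, <xref ref-type="disp-formula" rid="eqn-2">Eqs. (2)</xref> to <xref ref-type="disp-formula" rid="eqn-6">(6)</xref> can be evaluated from the four counts with a short Python function. The counts below are hypothetical, the function name binary_metrics is introduced here for illustration, and reporting an undefined ratio (zero denominator) as 0 is an assumption rather than a convention stated in this study.</p>
<code language="python"><![CDATA[
```python
def binary_metrics(tp, fp, fn, tn):
    """Evaluate Eqs. (2)-(6) from the four confusion-matrix counts."""

    def ratio(numerator, denominator):
        # Assumption: an undefined ratio (zero denominator) is reported as 0.0
        return numerator / denominator if denominator else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "precision": precision,
        "recall": recall,
        "specificity": ratio(tn, tn + fp),
        "f1": ratio(2 * precision * recall, precision + recall),
    }


# Hypothetical counts for one stroke class treated as positive
scores = binary_metrics(tp=45, fp=5, fn=15, tn=135)
for name, value in scores.items():
    print(f"{name}: {value:.4f}")
```
]]></code>
<p>In this example, the precision is 0.90 and the recall is 0.75; the F1 score of approximately 0.82 lies below their arithmetic mean, which shows how the harmonic mean is pulled toward the lower value.</p>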
<p>Each of the four strokes was treated in turn as the positive class, with the remaining three strokes as the negative class, to calculate the accuracy, precision, recall, specificity, and F1 score for each stroke. The mean values over the four strokes were used as the metrics to evaluate the model.</p>
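<p>The per-class evaluation and averaging can be sketched in Python as follows. The 4&#x2009;&#x00D7;&#x2009;4 confusion matrix is hypothetical and is not taken from the experiments; its rows are assumed to be the predicted classes and its columns the actual classes, following the layout of <xref ref-type="table" rid="table-2">Tab. 2</xref>. The function names one_vs_rest and macro_scores are introduced here for illustration only.</p>
<code language="python"><![CDATA[
```python
STROKES = ["forehand", "backhand", "forehand cut", "backhand cut"]


def one_vs_rest(matrix, k):
    """Return TP, FP, FN, TN for class k.

    matrix[i][j] is the number of strokes predicted as class i whose
    actual class is j (rows = predicted, columns = actual).
    """
    total = sum(sum(row) for row in matrix)
    tp = matrix[k][k]
    fp = sum(matrix[k]) - tp                 # predicted k, actually another class
    fn = sum(row[k] for row in matrix) - tp  # actually k, predicted as another class
    tn = total - tp - fp - fn
    return tp, fp, fn, tn


def macro_scores(matrix):
    """Average the five per-class metrics over all classes."""
    names = ("accuracy", "precision", "recall", "specificity", "f1")
    totals = dict.fromkeys(names, 0.0)
    for k in range(len(matrix)):
        tp, fp, fn, tn = one_vs_rest(matrix, k)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        both = precision + recall
        per_class = (
            (tp + tn) / (tp + fp + fn + tn),
            precision,
            recall,
            tn / (tn + fp) if tn + fp else 0.0,
            2 * precision * recall / both if both else 0.0,
        )
        for name, value in zip(names, per_class):
            totals[name] += value
    return {name: value / len(matrix) for name, value in totals.items()}


# Hypothetical confusion matrix with 60 actual strokes per class
matrix = [
    [58, 1, 2, 0],
    [1, 59, 0, 1],
    [1, 0, 57, 1],
    [0, 0, 1, 58],
]
scores = macro_scores(matrix)
for name, value in scores.items():
    print(f"{name}: {value:.4f}")
```
]]></code>
<p>In this sketch, the mean one-vs-rest accuracy is higher than the fraction of strokes on the diagonal of the matrix, because a stroke that is correctly rejected counts as a true negative for each class in which it is not involved.</p>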
<p>The CNN was also evaluated by <italic>k</italic>-fold cross validation, which reduces the dependence of the measured performance on one specific combination of training and testing data. In <italic>k</italic>-fold cross validation, the raw data are divided into <italic>k</italic> groups; in each run, a different group is selected as the test set, and the remaining <italic>k</italic>&#x2212;1 groups are used as the training set. Training is repeated <italic>k</italic> times to obtain <italic>k</italic> accuracy values. In this study, <italic>k</italic> was set to 5, as shown in <xref ref-type="fig" rid="fig-6">Fig. 6</xref>. The accuracies of the five runs were averaged to obtain the overall model accuracy.</p>
<fig id="fig-6"><label>Figure 6</label><caption><title>Five-fold cross validation</title></caption><graphic mimetype="image" mime-subtype="png" xlink:href="CMC_32739-fig-6.png"/></fig>
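<p>The partitioning used in five-fold cross validation can be sketched in Python as follows. The sketch assumes that the samples are shuffled once with a fixed seed and divided into folds without regard to stroke class or participant, which this study does not specify; the function name k_fold_splits and the sample count are hypothetical.</p>
<code language="python"><![CDATA[
```python
import random


def k_fold_splits(n_samples, k=5, seed=0):
    """Yield (train_indices, test_indices) so that each sample is tested once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)         # assumption: one fixed shuffle
    folds = [indices[i::k] for i in range(k)]    # k disjoint groups
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# Hypothetical data set of 20 strokes
splits = list(k_fold_splits(n_samples=20, k=5))
for run, (train, test) in enumerate(splits, start=1):
    print(f"run {run}: {len(train)} training samples, {len(test)} test samples")
```
]]></code>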
</sec>
<sec id="s5"><label>5</label><title>Experimental Results and Discussion</title>
<sec id="s5_1"><label>5.1</label><title>Evaluation Metrics of the Models</title>
<p>Each participant performed four types of table tennis strokes in the experimental environment, and the CNN models were trained with the collected data. Five models were trained: four conventional CNNs and one multiscale CNN. The evaluation results after model training are presented in <xref ref-type="table" rid="table-3">Tab. 3</xref>, in which Kernel size_1, Kernel size_3, Kernel size_5, and Kernel size_7 are conventional CNNs with kernel sizes of 1, 3, 5, and 7, respectively. Among the conventional CNNs, the models with kernel sizes of 3 and 5 achieved the highest F1 scores, indicating that convolutional layers with these kernel sizes extracted the features most useful for training and prediction. To further improve the model, a multiscale CNN model combining the features of Kernel size_3 and Kernel size_5 was used. The results in <xref ref-type="table" rid="table-3">Tab. 3</xref> revealed that the multiscale CNN model had an accuracy, precision, recall, specificity, and F1 score of 99.58&#x0025;, 99.16&#x0025;, 99.19&#x0025;, 99.72&#x0025;, and 99.16&#x0025;, respectively; the multiscale model outperformed the conventional CNN models in all metrics.</p>
<table-wrap id="table-3"><label>Table 3</label><caption><title>Various evaluation metrics of self-recorded data</title></caption>
<table frame="hsides">
<colgroup>
<col align="left"/>
<col align="left"/>
<col align="left"/>
<col align="left"/>
<col align="left"/>
<col align="left"/>
</colgroup>
<thead>
<tr>
<th align="left">Model</th>
<th align="left">Accuracy</th>
<th align="left">Precision</th>
<th align="left">Recall</th>
<th align="left">Specificity</th>
<th align="left">F1-Score</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left">Kernel size_1</td>
<td align="left">98.12&#x0025;</td>
<td align="left">86.25&#x0025;</td>
<td align="left">96.28&#x0025;</td>
<td align="left">98.75&#x0025;</td>
<td align="left">96.24&#x0025;</td>
</tr>
<tr>
<td align="left">Kernel size_3</td>
<td align="left">99.16&#x0025;</td>
<td align="left">98.33&#x0025;</td>
<td align="left">98.39&#x0025;</td>
<td align="left">99.45&#x0025;</td>
<td align="left">98.32&#x0025;</td>
</tr>
<tr>
<td align="left">Kernel size_5</td>
<td align="left">99.16&#x0025;</td>
<td align="left">98.33&#x0025;</td>
<td align="left">98.35&#x0025;</td>
<td align="left">99.44&#x0025;</td>
<td align="left">98.33&#x0025;</td>
</tr>
<tr>
<td align="left">Kernel size_7</td>
<td align="left">98.54&#x0025;</td>
<td align="left">97.08&#x0025;</td>
<td align="left">97.08&#x0025;</td>
<td align="left">99.02&#x0025;</td>
<td align="left">97.08&#x0025;</td>
</tr>
<tr>
<td align="left">Multi-scale</td>
<td align="left">99.58&#x0025;</td>
<td align="left">99.16&#x0025;</td>
<td align="left">99.19&#x0025;</td>
<td align="left">99.72&#x0025;</td>
<td align="left">99.16&#x0025;</td>
</tr>
</tbody>
</table>
</table-wrap>
</sec>
<sec id="s5_2"><label>5.2</label><title>Confusion Matrix of the Models</title>
<p><xref ref-type="fig" rid="fig-7">Fig. 7</xref> presents the confusion matrices for all models. The results revealed that the Kernel size_3 and Kernel size_5 models could accurately recognize the forehand and backhand. The multiscale CNN model also recognized these strokes and, in addition, had improved recognition accuracy for the forehand cut and backhand cut. Thus, the feature fusion technique can effectively improve predictions of table tennis strokes.</p>
<fig id="fig-7"><label>Figure 7</label><caption><title>Confusion matrix for different CNN models: (a) Kernel size_1 (b) Kernel size_3 (c) Kernel size_5 (d) Kernel size_7 (e) Multi-scale</title></caption><graphic mimetype="image" mime-subtype="png" xlink:href="CMC_32739-fig-7a.png"/><graphic mimetype="image" mime-subtype="png" xlink:href="CMC_32739-fig-7b.png"/></fig>
</sec>
<sec id="s5_3"><label>5.3</label><title>Accuracy and Loss Function of the Models</title>
<p><xref ref-type="fig" rid="fig-8">Figs. 8</xref> to <xref ref-type="fig" rid="fig-12">12</xref> present the accuracy and loss function curves for all trained models. These curves show how each model converged and whether it was stable during training. Blue and orange lines in the graphs represent the training and validation results, respectively. <xref ref-type="fig" rid="fig-8">Figs. 8</xref> to <xref ref-type="fig" rid="fig-11">11</xref> present the accuracy and loss function curves of the conventional CNN models. The figures reveal that these models could not achieve stable convergence even after 200 iterations and exhibited slight oscillations. <xref ref-type="fig" rid="fig-12">Fig. 12</xref> presents the curves of the multiscale CNN model; the model began to converge at 100 iterations and had fully converged after 160 iterations. The full 200 iterations were not executed because an early stopping mechanism was used to avoid overtraining and overfitting.</p>
<fig id="fig-8"><label>Figure 8</label><caption><title>Kernel size_1 model: (a) Accuracy curve; (b) Loss function curve</title></caption><graphic mimetype="image" mime-subtype="png" xlink:href="CMC_32739-fig-8.png"/></fig>
<fig id="fig-9"><label>Figure 9</label><caption><title>Kernel size_3 model: (a) Accuracy curve; (b) Loss function curve</title></caption><graphic mimetype="image" mime-subtype="png" xlink:href="CMC_32739-fig-9.png"/></fig>
<fig id="fig-10"><label>Figure 10</label><caption><title>Kernel size_5 model: (a) Accuracy curve; (b) Loss function curve</title></caption><graphic mimetype="image" mime-subtype="png" xlink:href="CMC_32739-fig-10.png"/></fig>
<fig id="fig-11"><label>Figure 11</label><caption><title>Kernel size_7 model: (a) Accuracy curve; (b) Loss function curve</title></caption><graphic mimetype="image" mime-subtype="png" xlink:href="CMC_32739-fig-11.png"/></fig>
<fig id="fig-12"><label>Figure 12</label><caption><title>Multiscale model: (a) Accuracy curve; (b) Loss function curve</title></caption><graphic mimetype="image" mime-subtype="png" xlink:href="CMC_32739-fig-12.png"/></fig>
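<p>The early stopping mechanism mentioned above can be sketched in Python as follows. The stopping criterion of this study is not specified; the sketch assumes that training stops when the validation loss has not improved for a given number of consecutive iterations. The function name early_stopping_epoch, the parameters patience and min_delta, and the loss values are hypothetical.</p>
<code language="python"><![CDATA[
```python
def early_stopping_epoch(val_losses, patience=10, min_delta=0.0):
    """Return the 1-based iteration at which training would stop.

    Training stops once the validation loss has failed to improve by more
    than `min_delta` for `patience` consecutive iterations. Returns None
    if the criterion is never met.
    """
    best = float("inf")
    waited = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best - min_delta:
            best = loss   # new best validation loss: reset the counter
            waited = 0
        else:
            waited += 1
            if waited >= patience:
                return epoch
    return None


# Hypothetical validation losses: improvement stalls after the fourth iteration
losses = [0.90, 0.60, 0.45, 0.40, 0.41, 0.40, 0.42, 0.43]
stop_at = early_stopping_epoch(losses, patience=3)
print(stop_at)
```
]]></code>
<p>A larger patience tolerates the slight oscillations visible in validation curves at the cost of additional training iterations.</p>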
</sec>
<sec id="s5_4"><label>5.4</label><title>Cross Validation Results of the Models</title>
<p>Network training is stochastic, and results obtained from a single split of the data into training and testing sets cannot be assumed to be representative of the network. To prevent the prediction results from being affected by a fixed choice of training and testing data, five-fold cross validation was adopted to evaluate the results. The five-fold cross validation results in <xref ref-type="table" rid="table-4">Tab. 4</xref> were consistent with the previous results. The multiscale CNN model had an accuracy of 99.87&#x0025; and an error of &#x00B1;&#x2009;0.17&#x0025; in action recognition; thus, it recognized table tennis strokes more accurately than the conventional CNN models did. It also had the smallest error among the five models, which indicates that the multiscale CNN was the most robust.</p>
<table-wrap id="table-4"><label>Table 4</label><caption><title>CNN model comparison</title></caption>
<table frame="hsides">
<colgroup>
<col align="left"/>
<col align="left"/>
</colgroup>
<thead>
<tr>
<th align="left">Model</th>
<th align="left">Five-fold cross validation accuracy</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left">Kernel size_1</td>
<td align="left">98.6&#x0025; (&#x00B1;1.6&#x0025;)</td>
</tr>
<tr>
<td align="left">Kernel size_3</td>
<td align="left">99.62&#x0025; (&#x00B1;0.31&#x0025;)</td>
</tr>
<tr>
<td align="left">Kernel size_5</td>
<td align="left">99.7&#x0025; (&#x00B1;0.25&#x0025;)</td>
</tr>
<tr>
<td align="left">Kernel size_7</td>
<td align="left">99.58&#x0025; (&#x00B1;0.63&#x0025;)</td>
</tr>
<tr>
<td align="left">Multi-scale</td>
<td align="left">99.87&#x0025; (&#x00B1;0.17&#x0025;)</td>
</tr>
</tbody>
</table>
</table-wrap>
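<p>A summary of the form reported in <xref ref-type="table" rid="table-4">Tab. 4</xref> can be computed from the per-fold accuracies as sketched below. The fold accuracies are hypothetical and are not the values behind <xref ref-type="table" rid="table-4">Tab. 4</xref>, and the use of the population standard deviation as the reported error is an assumption, because the definition of the error is not stated in this study.</p>
<code language="python"><![CDATA[
```python
import statistics


def summarize(fold_accuracies):
    """Return the mean accuracy and its spread across folds.

    Assumption: the spread is the population standard deviation.
    """
    mean = statistics.mean(fold_accuracies)
    spread = statistics.pstdev(fold_accuracies)
    return mean, spread


# Hypothetical accuracies (%) from five folds
folds = [99.8, 100.0, 99.6, 100.0, 99.9]
mean, spread = summarize(folds)
print(f"{mean:.2f}% (+/-{spread:.2f}%)")
```
]]></code>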
</sec>
</sec>
<sec id="s6"><label>6</label><title>Conclusions</title>
<p>Professional sports teams and amateur sports enthusiasts have recognized the potential of technology and data analysis and have begun to use wearable devices to collect data on physical activities. These wearables generate large amounts of data that can be analyzed to determine whether an athlete has correct posture and strokes and to identify more effective training methods that improve athletic performance.</p>
<p>This study proposed a system for classifying table tennis strokes. In the system, a wearable device containing a six-axis sensor was attached to the back of the hand to collect data during racket movements. Features were extracted from the six-axis sensor data by a CNN with two different kernel sizes, and the strokes were classified with the CNN models. In addition, an early stopping mechanism was used to avoid overtraining and overfitting. The accuracy under five-fold cross validation reached 99.87&#x0025;, demonstrating that the multiscale CNN proposed in this study was more effective than conventional CNNs in action recognition.</p>
</sec>
</body>
<back>
<ack>
<p>We thank the Ministry of Science and Technology (MOST) (Grant Nos. MOST 108-2221-E-150-022-MY3 and MOST 110-2634-F-019-002) and National Taiwan Ocean University for their support. We also thank the editor for kind coordination and the reviewers for their constructive suggestions.</p>
</ack>
<fn-group>
<fn fn-type="other"><p><bold>Funding Statement:</bold> This work was supported by the Ministry of Science and Technology (MOST) (Grant Nos. MOST 108-2221-E-150-022-MY3 and MOST 110-2634-F-019-002) and National Taiwan Ocean University.</p></fn>
<fn fn-type="conflict"><p><bold>Conflicts of Interest:</bold> The authors declare that they have no conflicts of interest to report regarding the present study.</p></fn>
</fn-group>
<ref-list content-type="authoryear">
<title>References</title>
<ref id="ref-1"><label>[1]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>Z.</given-names> <surname>Wang</surname></string-name>, <string-name><given-names>J.</given-names> <surname>Wang</surname></string-name>, <string-name><given-names>H.</given-names> <surname>Zhao</surname></string-name>, <string-name><given-names>S.</given-names> <surname>Qiu</surname></string-name>, <string-name><given-names>J.</given-names> <surname>Li</surname></string-name> <etal>et al.,</etal></person-group> &#x201C;<article-title>Using wearable sensors to capture posture of the human lumbar spine in competitive swimming</article-title>,&#x201D; <source>IEEE Transactions on Human-Machine Systems</source>, vol. <volume>49</volume>, no. <issue>2</issue>, pp. <fpage>194</fpage>&#x2013;<lpage>205</lpage>, <year>2019</year>.</mixed-citation></ref>
<ref id="ref-2"><label>[2]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>D. R.</given-names> <surname>Seshadri</surname></string-name>, <string-name><given-names>R. T.</given-names> <surname>Li</surname></string-name>, <string-name><given-names>J. E.</given-names> <surname>Voos</surname></string-name>, <string-name><given-names>J. R.</given-names> <surname>Rowbottom</surname></string-name>, <string-name><given-names>C. M.</given-names> <surname>Alfes</surname></string-name> <etal>et al.,</etal></person-group> &#x201C;<article-title>Wearable sensors for monitoring the physiological and biochemical profile of the athlete</article-title>,&#x201D; <source>Npj Digital Medicine</source>, vol. <volume>2</volume>, no. <issue>72</issue>, pp. <fpage>PMC6646404</fpage>, <year>2019</year>.</mixed-citation></ref>
<ref id="ref-3"><label>[3]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>B.</given-names> <surname>Muniz-Pardos</surname></string-name>, <string-name><given-names>S.</given-names> <surname>Sutehall</surname></string-name>, <string-name><given-names>J.</given-names> <surname>Gellaerts</surname></string-name>, <string-name><given-names>M.</given-names> <surname>Falbriard</surname></string-name>, <string-name><given-names>B.</given-names> <surname>Mariani</surname></string-name> <etal>et al.,</etal></person-group> &#x201C;<article-title>Integration of wearable sensors into the evaluation of running economy and foot mechanics in elite runners</article-title>,&#x201D; <source>Current Sports Medicine Reports</source>, vol. <volume>17</volume>, no. <issue>12</issue>, pp. <fpage>480</fpage>&#x2013;<lpage>488</lpage>, <year>2018</year>.</mixed-citation></ref>
<ref id="ref-4"><label>[4]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>V. D. K.</given-names> <surname>Eline</surname></string-name> and <string-name><given-names>M. R.</given-names> <surname>Marco</surname></string-name></person-group>, &#x201C;<article-title>Accuracy of human motion capture systems for sport applications; state-of-the-art review</article-title>,&#x201D; <source>European Journal of Sport Science</source>, vol. <volume>18</volume>, no. <issue>6</issue>, pp. <fpage>806</fpage>&#x2013;<lpage>819</lpage>, <year>2018</year>.</mixed-citation></ref>
<ref id="ref-5"><label>[5]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>O. D.</given-names> <surname>Lara</surname></string-name> and <string-name><given-names>M. A.</given-names> <surname>Labrador</surname></string-name></person-group>, &#x201C;<article-title>A survey on human activity recognition using wearable sensors</article-title>,&#x201D; <source>IEEE Communications Surveys &#x0026; Tutorials</source>, vol. <volume>15</volume>, no. <issue>3</issue>, pp. <fpage>1192</fpage>&#x2013;<lpage>1209</lpage>, <year>2013</year>.</mixed-citation></ref>
<ref id="ref-6"><label>[6]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>I. K.</given-names> <surname>Ihianle</surname></string-name>, <string-name><given-names>A. O.</given-names> <surname>Nwajana</surname></string-name>, <string-name><given-names>S. H.</given-names> <surname>Ebenuwa</surname></string-name>, <string-name><given-names>R. I.</given-names> <surname>Otuka</surname></string-name>, <string-name><given-names>K.</given-names> <surname>Owa</surname></string-name> <etal>et al.,</etal></person-group> &#x201C;<article-title>A deep learning approach for human activities recognition from multimodal sensing devices</article-title>,&#x201D; <source>IEEE Access</source>, vol. <volume>8</volume>, pp. <fpage>179028</fpage>&#x2013;<lpage>179038</lpage>, <year>2020</year>.</mixed-citation></ref>
<ref id="ref-7"><label>[7]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>A.</given-names> <surname>Ferrari</surname></string-name>, <string-name><given-names>D.</given-names> <surname>Micucci</surname></string-name>, <string-name><given-names>M.</given-names> <surname>Mobilio</surname></string-name> and <string-name><given-names>P.</given-names> <surname>Napoletano</surname></string-name></person-group>, &#x201C;<article-title>On the personalization of classification models for human activity recognition</article-title>,&#x201D; <source>IEEE Access</source>, vol. <volume>8</volume>, pp. <fpage>32066</fpage>&#x2013;<lpage>32079</lpage>, <year>2020</year>.</mixed-citation></ref>
<ref id="ref-8"><label>[8]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>F.</given-names> <surname>Malawski</surname></string-name></person-group>, &#x201C;<article-title>Depth versus inertial sensors in real-time sports analysis: A case study on fencing</article-title>,&#x201D; <source>IEEE Sensors Journal</source>, vol. <volume>21</volume>, no. <issue>4</issue>, pp. <fpage>5133</fpage>&#x2013;<lpage>5142</lpage>, <year>2021</year>.</mixed-citation></ref>
<ref id="ref-9"><label>[9]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>M. T.</given-names> <surname>Worsey</surname></string-name>, <string-name><given-names>H. G.</given-names> <surname>Espinosa</surname></string-name>, <string-name><given-names>J. B.</given-names> <surname>Shepherd</surname></string-name> and <string-name><given-names>D. V.</given-names> <surname>Thiel</surname></string-name></person-group>, &#x201C;<article-title>A systematic review of performance analysis in rowing using inertial sensors</article-title>,&#x201D; <source>Electronics</source>, vol. <volume>8</volume>, no. <issue>11</issue>, pp. <fpage>1304</fpage>, <year>2019</year>.</mixed-citation></ref>
<ref id="ref-10"><label>[10]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>B.</given-names> <surname>Guignard</surname></string-name>, <string-name><given-names>O.</given-names> <surname>Ayad</surname></string-name>, <string-name><given-names>H.</given-names> <surname>Baillet</surname></string-name>, <string-name><given-names>F.</given-names> <surname>Mell</surname></string-name>, <string-name><given-names>E. D.</given-names> <surname>Simba&#x00F1;a</surname></string-name>, <string-name><given-names>J.</given-names> <surname>Boulanger</surname></string-name> <etal>et al.,</etal></person-group> &#x201C;<article-title>Validity, reliability and accuracy of inertial measurement units (IMUs) to measure angles: Application in swimming</article-title>,&#x201D; <source>Sports Biomech, Advance Online Publication</source>, pp. 1&#x2013;33, <year>2021</year>. <uri xlink:href="https://doi.org/10.1080/14763141.2021.1945136">https://doi.org/10.1080/14763141.2021.1945136</uri>.</mixed-citation></ref>
<ref id="ref-11"><label>[11]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>L.</given-names> <surname>Struber</surname></string-name>, <string-name><given-names>S.</given-names> <surname>Ledouit</surname></string-name>, <string-name><given-names>O.</given-names> <surname>Daniel</surname></string-name>, <string-name><given-names>P.-A.</given-names> <surname>Barraud</surname></string-name> and <string-name><given-names>V.</given-names> <surname>Nougier</surname></string-name></person-group>, &#x201C;<article-title>Reliability of human running analysis with low-cost inertial and magnetic sensor arrays</article-title>,&#x201D; <source>IEEE Sensors Journal</source>, vol. <volume>21</volume>, no. <issue>13</issue>, pp. <fpage>15299</fpage>&#x2013;<lpage>15307</lpage>, <year>2021</year>.</mixed-citation></ref>
<ref id="ref-12"><label>[12]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>S. S.</given-names> <surname>Tabrizi</surname></string-name>, <string-name><given-names>S.</given-names> <surname>Pashazadeh</surname></string-name> and <string-name><given-names>V.</given-names> <surname>Javani</surname></string-name></person-group>, &#x201C;<article-title>A deep learning approach for table tennis forehand stroke evaluation system using an IMU sensor</article-title>,&#x201D; <source>Computational Intelligence and Neuroscience</source>, vol. <volume>9</volume>, pp. <fpage>1</fpage>&#x2013;<lpage>15</lpage>, <year>2021</year>.</mixed-citation></ref>
<ref id="ref-13"><label>[13]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>L.</given-names> <surname>Atallah</surname></string-name>, <string-name><given-names>B.</given-names> <surname>Lo</surname></string-name>, <string-name><given-names>R.</given-names> <surname>King</surname></string-name> and <string-name><given-names>G.</given-names> <surname>Yang</surname></string-name></person-group>, &#x201C;<article-title>Sensor positioning for activity recognition using wearable accelerometers</article-title>,&#x201D; <source>IEEE Transactions on Biomedical Circuits and Systems</source>, vol. <volume>5</volume>, no. <issue>4</issue>, pp. <fpage>320</fpage>&#x2013;<lpage>329</lpage>, <year>2011</year>.</mixed-citation></ref>
<ref id="ref-14"><label>[14]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><given-names>T.</given-names> <surname>Sztyler</surname></string-name> and <string-name><given-names>H.</given-names> <surname>Stuckenschmidt</surname></string-name></person-group>, &#x201C;<article-title>On-body localization of wearable devices: An investigation of position-aware activity recognition</article-title>,&#x201D; in <conf-name>Proc. PerCom</conf-name>, <conf-loc>Sydney, NSW, Australia</conf-loc>, pp. <fpage>1</fpage>&#x2013;<lpage>9</lpage>, <year>2016</year>. <uri xlink:href="https://doi.org/10.1109/PERCOM.2016.7456521">https://doi.org/10.1109/PERCOM.2016.7456521</uri>.</mixed-citation></ref>
<ref id="ref-15"><label>[15]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><given-names>A.</given-names> <surname>Gupta</surname></string-name>, <string-name><given-names>K.</given-names> <surname>Gupta</surname></string-name>, <string-name><given-names>K.</given-names> <surname>Gupta</surname></string-name> and <string-name><given-names>K.</given-names> <surname>Gupta</surname></string-name></person-group>, &#x201C;<article-title>A survey on human activity recognition and classification</article-title>,&#x201D; in <conf-name>Proc. ICCSP 2020</conf-name>, <conf-loc>Chennai, India</conf-loc>, pp. <fpage>0915</fpage>&#x2013;<lpage>0919</lpage>, <year>2020</year>. <uri xlink:href="https://doi.org/10.1109/ICCSP48568.2020.9182416">https://doi.org/10.1109/ICCSP48568.2020.9182416</uri>.</mixed-citation></ref>
<ref id="ref-16"><label>[16]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><given-names>C. Z.</given-names> <surname>Shan</surname></string-name>, <string-name><given-names>E. S. L.</given-names> <surname>Ming</surname></string-name>, <string-name><given-names>H. A.</given-names> <surname>Rahman</surname></string-name> and <string-name><given-names>Y. C.</given-names> <surname>Fai</surname></string-name></person-group>, &#x201C;<article-title>Investigation of upper limb movement during badminton smash</article-title>,&#x201D; in <conf-name>Proc. ASCC</conf-name>, <conf-loc>Kota Kinabalu</conf-loc>, pp. <fpage>1</fpage>&#x2013;<lpage>6</lpage>, <year>2015</year>. <uri xlink:href="https://doi.org/10.1109/ASCC.2015.7244605">https://doi.org/10.1109/ASCC.2015.7244605</uri>.</mixed-citation></ref>
<ref id="ref-17"><label>[17]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>S.</given-names> <surname>Winiarski</surname></string-name>, <string-name><given-names>M. L.</given-names> <surname>Ivan</surname></string-name> and <string-name><given-names>B.</given-names> <surname>Ziemowit</surname></string-name></person-group>, &#x201C;<article-title>The role of the non-playing hand during topspin forehand in table tennis</article-title>,&#x201D; <source>Symmetry</source>, vol. <volume>13</volume>, no. <issue>11</issue>, pp. <fpage>2054</fpage>, <year>2021</year>.</mixed-citation></ref>
<ref id="ref-18"><label>[18]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>I. A.</given-names> <surname>Lawal</surname></string-name> and <string-name><given-names>S.</given-names> <surname>Bano</surname></string-name></person-group>, &#x201C;<article-title>Deep human activity recognition with localisation of wearable sensors</article-title>,&#x201D; <source>IEEE Access</source>, vol. <volume>8</volume>, pp. <fpage>155060</fpage>&#x2013;<lpage>155070</lpage>, <year>2020</year>.</mixed-citation></ref>
<ref id="ref-19"><label>[19]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>D.</given-names> <surname>Gholamiangonabadi</surname></string-name>, <string-name><given-names>N.</given-names> <surname>Kiselov</surname></string-name> and <string-name><given-names>K.</given-names> <surname>Grolinger</surname></string-name></person-group>, &#x201C;<article-title>Deep neural networks for human activity recognition with wearable sensors: Leave-one-subject-out cross-validation for model selection</article-title>,&#x201D; <source>IEEE Access</source>, vol. <volume>8</volume>, pp. <fpage>133982</fpage>&#x2013;<lpage>133994</lpage>, <year>2020</year>.</mixed-citation></ref>
<ref id="ref-20"><label>[20]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>N.</given-names> <surname>Tufek</surname></string-name>, <string-name><given-names>M.</given-names> <surname>Yalcin</surname></string-name>, <string-name><given-names>M.</given-names> <surname>Altintas</surname></string-name>, <string-name><given-names>F.</given-names> <surname>Kalaoglu</surname></string-name>, <string-name><given-names>Y.</given-names> <surname>Li</surname></string-name> <etal>et al.,</etal></person-group> &#x201C;<article-title>Human action recognition using deep learning methods on limited sensory data</article-title>,&#x201D; <source>IEEE Sensors Journal</source>, vol. <volume>20</volume>, no. <issue>6</issue>, pp. <fpage>3101</fpage>&#x2013;<lpage>3112</lpage>, <year>2020</year>.</mixed-citation></ref>
<ref id="ref-21"><label>[21]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><given-names>L.</given-names> <surname>B&#x00FC;the</surname></string-name>, <string-name><given-names>U.</given-names> <surname>Blanke</surname></string-name>, <string-name><given-names>H.</given-names> <surname>Capkevics</surname></string-name> and <string-name><given-names>G.</given-names> <surname>Tr&#x00F6;ster</surname></string-name></person-group>, &#x201C;<article-title>A wearable sensing system for timing analysis in tennis</article-title>,&#x201D; in <conf-name>Proc. BSN</conf-name>, <conf-loc>San Francisco, CA, USA</conf-loc>, pp. <fpage>43</fpage>&#x2013;<lpage>48</lpage>, <year>2016</year>. <uri xlink:href="https://doi.org/10.1109/BSN.2016.7516230">https://doi.org/10.1109/BSN.2016.7516230</uri>.</mixed-citation></ref>
<ref id="ref-22"><label>[22]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>K.</given-names> <surname>Brzostowski</surname></string-name> and <string-name><given-names>P.</given-names> <surname>Szwach</surname></string-name></person-group>, &#x201C;<article-title>Data fusion in ubiquitous sports training: Methodology and application</article-title>,&#x201D; <source>Wireless Communications and Mobile Computing</source>, vol. <volume>2018</volume>, pp. <fpage>1</fpage>&#x2013;<lpage>14</lpage>, <year>2018</year>.</mixed-citation></ref>
<ref id="ref-23"><label>[23]</label><mixed-citation publication-type="conf-proc"><person-group person-group-type="author"><string-name><given-names>W.</given-names> <surname>Pei</surname></string-name>, <string-name><given-names>J.</given-names> <surname>Wang</surname></string-name>, <string-name><given-names>X.</given-names> <surname>Xu</surname></string-name>, <string-name><given-names>Z.</given-names> <surname>Wu</surname></string-name> and <string-name><given-names>X.</given-names> <surname>Du</surname></string-name></person-group>, &#x201C;<article-title>An embedded 6-axis sensor based recognition for tennis stroke</article-title>,&#x201D; in <conf-name>Proc. ICCE</conf-name>, <conf-loc>Las Vegas, NV, USA</conf-loc>, pp. <fpage>55</fpage>&#x2013;<lpage>58</lpage>, <year>2017</year>. <uri xlink:href="https://doi.org/10.1109/ICCE.2017.7889228">https://doi.org/10.1109/ICCE.2017.7889228</uri>.</mixed-citation></ref>
<ref id="ref-24"><label>[24]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>L. B.</given-names> <surname>Pardo</surname></string-name>, <string-name><given-names>D. B.</given-names> <surname>Perez</surname></string-name> and <string-name><given-names>C. O.</given-names> <surname>Uru&#x00F1;uela</surname></string-name></person-group>, &#x201C;<article-title>Detection of tennis activities with wearable sensors</article-title>,&#x201D; <source>Sensors</source>, vol. <volume>19</volume>, no. <issue>22</issue>, pp. <fpage>5004</fpage>, <year>2019</year>.</mixed-citation></ref>
<ref id="ref-25"><label>[25]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>C.-T.</given-names> <surname>Yen</surname></string-name>, <string-name><given-names>J.-X.</given-names> <surname>Liao</surname></string-name> and <string-name><given-names>Y.-K.</given-names> <surname>Huang</surname></string-name></person-group>, &#x201C;<article-title>Feature fusion of a deep-learning algorithm into wearable sensor devices for human activity recognition</article-title>,&#x201D; <source>Sensors</source>, vol. <volume>21</volume>, no. <issue>24</issue>, pp. <fpage>8294</fpage>, <year>2021</year>.</mixed-citation></ref>
<ref id="ref-26"><label>[26]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>W.</given-names> <surname>Sun</surname></string-name>, <string-name><given-names>X.</given-names> <surname>Chen</surname></string-name>, <string-name><given-names>X. R.</given-names> <surname>Zhang</surname></string-name>, <string-name><given-names>G. Z.</given-names> <surname>Dai</surname></string-name>, <string-name><given-names>P. S.</given-names> <surname>Chang</surname></string-name> <etal>et al.,</etal></person-group> &#x201C;<article-title>A multi-feature learning model with enhanced local attention for vehicle re-identification</article-title>,&#x201D; <source>Computers, Materials &#x0026; Continua</source>, vol. <volume>69</volume>, no. <issue>3</issue>, pp. <fpage>3549</fpage>&#x2013;<lpage>3560</lpage>, <year>2021</year>.</mixed-citation></ref>
<ref id="ref-27"><label>[27]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>W.</given-names> <surname>Sun</surname></string-name>, <string-name><given-names>G. C.</given-names> <surname>Zhang</surname></string-name>, <string-name><given-names>X. R.</given-names> <surname>Zhang</surname></string-name>, <string-name><given-names>X.</given-names> <surname>Zhang</surname></string-name> and <string-name><given-names>N. N.</given-names> <surname>Ge</surname></string-name></person-group>, &#x201C;<article-title>Fine-grained vehicle type classification using lightweight convolutional neural network with feature optimization and joint learning strategy</article-title>,&#x201D; <source>Multimedia Tools and Applications</source>, vol. <volume>80</volume>, no. <issue>20</issue>, pp. <fpage>30803</fpage>&#x2013;<lpage>30816</lpage>, <year>2021</year>.</mixed-citation></ref>
<ref id="ref-28"><label>[28]</label><mixed-citation publication-type="journal"><person-group person-group-type="author"><string-name><given-names>C.-T.</given-names> <surname>Yen</surname></string-name>, <string-name><given-names>J.-X.</given-names> <surname>Liao</surname></string-name> and <string-name><given-names>Y.-K.</given-names> <surname>Huang</surname></string-name></person-group>, &#x201C;<article-title>Human daily activity recognition performed using wearable inertial sensors combined with deep learning algorithms</article-title>,&#x201D; <source>IEEE Access</source>, vol. <volume>8</volume>, pp. <fpage>174105</fpage>&#x2013;<lpage>174114</lpage>, <year>2020</year>.</mixed-citation></ref>
</ref-list>
</back>
</article>