Comparison of regression models for estimation of isometric wrist joint torques using surface electromyography

Background Several regression models have been proposed for estimation of isometric joint torque using surface electromyography (SEMG) signals. Common issues related to torque estimation models are degradation of model accuracy with passage of time, electrode displacement, and alteration of limb posture. This work compares the performance of the most commonly used regression models under these circumstances, in order to assist researchers with identifying the most appropriate model for a specific biomedical application. Methods Eleven healthy volunteers participated in this study. A custom-built rig, equipped with a torque sensor, was used to measure isometric torque as each volunteer flexed and extended his wrist. SEMG signals from eight forearm muscles, in addition to wrist joint torque data were gathered during the experiment. Additional data were gathered one hour and twenty-four hours following the completion of the first data gathering session, for the purpose of evaluating the effects of passage of time and electrode displacement on accuracy of models. Acquired SEMG signals were filtered, rectified, normalized and then fed to models for training. Results It was shown that mean adjusted coefficient of determination (Ra2) values decrease between 20%-35% for different models after one hour while altering arm posture decreased mean Ra2 values between 64% to 74% for different models. Conclusions Model estimation accuracy drops significantly with passage of time, electrode displacement, and alteration of limb posture. Therefore model retraining is crucial for preserving estimation accuracy. Data resampling can significantly reduce model training time without losing estimation accuracy. Among the models compared, ordinary least squares linear regression model (OLS) was shown to have high isometric torque estimation accuracy combined with very short training times.


Background
SEMG is a well-established technique to non-invasively record the electrical activity produced by muscles. Signals recorded at the surface of the skin are picked up from all the active motor units in the vicinity of the electrode [1]. Due to the convenience of signal acquisition from the surface of the skin, SEMG signals have been used for controlling prosthetics and assistive devices [2][3][4][5][6][7], speech recognition systems [8], and also as a diagnostic tool for neuromuscular diseases [9].
However, analysis of SEMG signals is complicated due to nonlinear behaviour of muscles [10], as well as several other factors. First, cross talk between the adjacent muscles complicates recording signals from a muscle in isolation [11]. Second, signal behaviour is very sensitive to the position of electrodes [12]. Moreover, even with a fixed electrode position, altering limb positions have been shown to have substantial impact on SEMG signals [13]. Other issues, such as inherent noise in signal acquisition equipment, ambient noise, skin temperature, and motion artefact can potentially deteriorate signal quality [14,15].
The aforementioned issues necessitate utilization of signal processing and statistical modeling for estimation of muscle forces and joint torques based on SEMG signals. Classification [16] and regression techniques [17,18], as well as physiological models [19,20], have been considered by the research community extensively. Machine learning classification methods in aggregate have proven to be viable methods for classifying limb postures [21] and joint torque levels [22]. Park et al. [23] compared the performance of a Hill-based muscle model, linear regression and artificial neural networks for estimation of thumb-tip forces under four different configurations. In another study, performance of a Hillbased physiological muscle model was compared to a neural network for estimation of forearm flexion and extension joint torques [24]. Both groups showed that neural network predictions are superior to Hill-based predictions, but neural network predictions are task specific and require ample training before usage. Castellini et al. [22] and Yang et al. [25], in two distinct studies, estimated grasping forces using artificial neural networks (ANN), support vectors machines (SVM) and locally weighted projection regression (LWPR). Yang concluded that SVM outperforms ANN and LWPR.
This study was intended to compare performance of commonly utilized regression models for isometric torque estimation and identify their merits and shortcomings under circumstances where the accuracy of predictive models has been reported to be compromised. Wrist joint was chosen as its functionality is frequently impaired due to aging [26] or stroke [7], and robots (controlled by SEMG signals) are developed to train and assist affected patients [2,3]. Performance of five different models for estimation of isometric wrist flexion and extension torques are compared: a physiological based model (PBM), an ordinary least squares linear regression model (OLS), a regularized least squares linear regression model (RLS), and three machine learning techniques, namely SVM, ANN, and LWPR.

Physiological Based Model
Physiological based model (PBM) used in this study is a neuromusculoskeletal model used for estimation of joint torques from SEMG signals. Rectified and smoothed SEMG signals have been reported to result in poor estimations of muscle forces [27,28]. This is mainly due to (a) existence of a delay between SEMG and muscle tension onset (electromechanical delay) and (b) the fact that SEMG signals have a shorter duration than resulting forces. It has been shown that muscle twitch response can be modeled well by using a critically damped linear second order differential equation [29]. However since SEMG signals are generally acquired at discrete time intervals, it is appropriate to use a discretized form. Using backward differences, the differential equation takes the form of a discrete recursive filter [30]: where e j is the processed SEMG signal of muscle j at time t, d is the electromechanical delay, α is the gain coefficient, u j (t) is the post-processed SEMG signal at time t, and β 1 and β 2 the recursive coefficients for muscle j.
Electromechanical delay was allowed to vary between 10 and 100 ms as that is the range for skeletal muscles [31]. The recursive filter maps SEMG values e j (t) for muscle j into post-processed values u j (t). Stability of this equation is ensured by satisfying the following constraints [32]: Unstable filters will cause u j (t) values to oscillate or even go to infinity. To ensure stability of this filter and restrict the maximum neural activation values to one, another constraint is imposed: Neural activation values are conventionally restricted to values between zero and one, where zero implies no activation and one translates to full voluntary activation of the muscle.
Although isometric forces produced by certain muscles exhibit linear relationship with activation, nonlinear relationships are observed for other muscles. Nonlinear relationships are mostly witnessed for forces of up to 30% of the maximum isometric force [33]. These nonlinear relationships can be associated with exponential increases in firing rate of motor units as muscle forces increase [34]: where A is called the non-linear shape factor. A = -3 corresponds to highly exponential behaviour of the muscle and A = 0 corresponds to a linear one.
Once nonlinearities are explicitly taken into account, isometric forces generated by each muscle at neutral joint position at time t are computed using [35]: where F max,j is the maximum voluntary force produced by muscle j.
Isometric joint torque is computed by multiplying isometric force of each muscle by its moment arm: where MA j is moment arm at neutral wrist position for muscle j and τ j (t) is the torque generated by muscle j at time t. Moment arms for flexors and extensors were assigned positive and negative signs respectively to maintain consistency with measured values.
As not all forearm muscles were accessible by surface electrodes, each SEMG channel was assumed to represent intermediate and deep muscles in its proximity in addition to the surface muscle it was recording from. Torque values from each channel were then scaled using mean physiological cross-section area (PCSA) values tabulated by Jacobson et al. and Lieber et al. [36][37][38]. Joint torque estimation values have been shown not to be highly sensitive to muscle PCSA values and therefore these values were fixed and not a part of model calibration [39]. The isometric torque at the wrist joint was computed by adding individual scaled torque values: where M is the number of muscles used in the model, and ΣPCSA j is the summation of PCSA of the muscle represents by muscle j and PCSA of muscle j itself.
Models were calibrated to each volunteer by tuning model parameters. Yamaguchi [42] has summarized maximum isometric forces reported by different investigators. We used means as initial values and constrained F max to one standard deviation of the reported values.
Initial values for moment arms were fixed to the mean values in [43], and constrained to one standard deviation of the values reported in the same reference. Since these parameters are constrained within their physiologically acceptable values, calibrated models can potentially provide physiological insight [24]. Activation parameters A, C 1 , C 2 , and d were assumed to be constant for all muscles a model with too many parameters loses its predictive power due to overfitting [44]. Parameters x = {A, C 1 , C 2 , d, F max,1 , ..., F max,M , MA 1 , MA 2 , ..., MA M } were tuned by optimizing the following objective function while constraining parameters to values mentioned beforehand: Models were optimized by Genetic Algorithms (GA) using MATLAB Global Optimization Toolbox (details of GA implementation are available in [45]). GA has previously been used for tuning muscle models [20]. Default MATLAB GA parameters were used and models were tuned in less than 100 generations (MATLAB default value for the number of optimization iterations) [46].

Ordinary Least Squares Linear Regression Model
torques using processed SEMG signals [23]. Linear regression is presented as: (9) where N is the number of samples considered (observations), M is the number of muscles, τ m is a vector of measured torque values, SEMG is a matrix of processed SEMG signals, β is a vector of regression coefficients to be estimated, and ε is a vector of independent random variables.
Ordinary least squares (OLS) method is most widely used for estimation of regression coefficients [47]. Estimated vector of regression coefficients using least squares method (β) is computed using: Once the model is fitted, SEMG values can be used for estimation of torque values (τ e ) as shown:

Regularized Least Squares Linear Regression Model
The ℓ 1 -regularized least squares (RLS) method for estimation of regression coefficients is known to overcome some of the common issues associated with the ordinary least squares method [48]. Estimated vector of Figure 1 Steps and parameters involved in the PBM. regression coefficients using ℓ 1 -regularized least squares method (β) is computed through the following optimization: where l ≥ 0 is the regularization parameter which is usually set equal to 0.01 [49,50].
We used the Matlab implementation of the ℓ 1 -regularized least squares method [51].

Support Vector Machines
Support vectors machines (SVM) are machine learning methods used for classification and regression. Support vector regression (SVR) maps input data using a nonlinear mapping to a higher-dimensional feature space where linear regression can be applied. Unlike neural networks, SVR does not suffer from the local minima problem since model parameter estimation involves solving a convex optimization problem [52].
We used epsilon support vector regression (ε-SVR) available in the LibSVM tool for Matlab [53]. Details of ε-SVR problem formulation are available in [54]. ε-SVR has previously been utilized for estimation of grasp forces [22,25]. The Gaussian kernel was used as it enables nonlinear mapping of samples and has a low number of hyperparameters, which reduces complexity of model selection [55]. Eight-fold cross-validation to generalize error values and grid-search for finding the optimal values of hyperparameters C, γ and ε were carried out for each model.

Artificial Neural Networks
Artificial neural networks (ANN) have been used for SEMG classification and regression extensively [22,25,56,57]. Three layer neural networks have been shown to be adequate for modeling problems of any degree of complexity [58]. We used feed-forward back propagation network with one input layer, two hidden layers, and one output layer [59]. We also used BFGS quasi-Newton training that is much faster and more robust than simple gradient descent [60]. Network structure is depicted in Figure 2, where M is the number of processed SEMG channels used as inputs to the ANN and τ e is the estimated torque value.
ANN models were trained using Matlab Neural Network Toolbox. Hyperbolic tangent sigmoid activation functions were used to capture the nonlinearities of SEMG signals. For each model, the training phase was repeated ten times and the best model was picked out of those repetitions in order to overcome the local minima problem [52]. We also used early stopping and regularization in order to improve generalization and reduce the likelihood of overfitting [61].

Locally Weighted Projection Regression
Locally Weighted Projection Regression (LWPR) is a nonlinear regression method for high-dimensional spaces with redundant and irrelevant input dimensions [62]. LWPR employs nonparametric regression with locally linear models based on the assumption that high dimensional data sets have locally low dimensional distributions. However piecewise linear modeling utilized in this method is computationally expensive with high dimensional data.
We used Radial Basis Function (RBF) kernel and meta-learning and then performed an eight-fold cross validation to avoid overfitting. Finally we used grid search to find the initial values of the distance metric for receptive fields, as it is customary in the literature [22,25]. Models were trained using a Matlab version of LWPR [63].

Methods
A custom-built rig was designed to allow for measurement of isometric torques exerted about the wrist joint. Volunteers placed their palm on a plate and Velcro straps were used to secure their hand and forearm to the plate. The plate hinged about the axis of rotation shown in Figure 3.
A Transducer Techniques TRX-100 torque sensor, with an axis of rotation corresponding to that of the volunteer's wrist joint, was used to measure torques applied about the wrist axis of rotation. Volunteer's forearm was secured to the rig using two Velcro straps. This design allowed restriction of arm movements. Volunteer placed their elbow on the rig and assumed an upright position.  Once the MVC values for both flexion and extension were determined, the volunteer was asked to gradually flex her/his wrist to 50% of MVC. Once the 50% was reached the volunteer gradually decreased the exerted torque to zero. This procedure was repeated three times for flexion and then for extension. Finally the volunteer was asked to flex and extend her/his wrist to 25% of MVC three times. Figure 4 shows a sample of torque signals gathered. Positive values on the scale are for flexion and negative values are for extension. Following the completion of this protocol, volunteers were asked to supinate their forearm, and follow the same protocol as before. Figure 5 shows forearm in pronated and supinated positions.
Completion of protocols in both pronated and supinated forearm positions was called a session. Table 1 summarizes actions in protocols.
In order to capture the effects of passage of time on model accuracy, volunteers were asked to repeat the same session after one hour. This session was named session two. Electrodes were not detached in between the two sessions. After completion of session two, electrodes were removed from the volunteer's skin. The volunteer was asked to repeat another session in twenty four hours following session two while attaching new electrodes. This was intended to capture the effects of electrode displacement and further time passage.
Each volunteer was asked to supinate her/his forearm and exert isometric torques on the rig following the same protocol used before after completion of session 1. This was intended to study the effects of arm posture on model accuracy.

SEMG Acquisition
A commercial SEMG acquisition system (Noraxon Myosystem 1400L) was used to acquire signals from eight SEMG channels. Each channel was connected to a    Noraxon AgCl gel dual electrode that picked up signals from the muscles tabulated in Table 2.
It has been reported that the extrinsic muscles of the forearm have large torque generating contributions in isometric flexion and extension [64]. Therefore we considered three superficial secondary forearm muscles as well as the primary forearm muscles accessible via SEMG. The skin preparation procedure outlined in surface electromyography for the non-invasive assessment of muscles project (SENIAM) was followed to maximize SEMG signal quality [65]. Figure 6 shows the position of electrodes attached to a volunteer's forearm. SEMG signals were acquired at 1 kHz using a National Instruments (NI-USB-6289) data acquisition card. An application was developed using LabVIEW software that stored data on a computer and provided visual feedback for volunteers. Visual feedback consisted of a bar chart that visualized the magnitude of exerted torques, which aided volunteers to follow the protocol more accurately.

Signal Processing
Initially DC offset values of SEMG signals were removed. Signals were subsequently high-pass filtered using a zero-lag Butterworth fourth order filter (30 Hz cut-off frequency), in order to remove motion artefact. Signals were then low-pass filtered using a zero-lag Butterworth fourth order filter (6 Hz cut-off frequency), full-wave rectified and normalized to the maximum SEMG value for each channel. Figure 7 shows the signal processing scheme.
33,520 samples were acquired from each of the eight SEMG channels and the torque sensor for each volunteer. The data set was broken down into training and testing data. Figure 8 shows a sample of raw and processed SEMG signals.

Results and Discussion
Models were initially trained with the training data set. The performance of trained models was subsequently tested by comparing estimated torque values from the model and the actual torque values from the testing data set. Two accuracy metrics were used to compare the performance of different models: normalized root mean squared error (NRMSE) and adjusted coefficient of determination (R 2 a ) [64]. Root mean squared error (RMSE) is a measure of the difference between measured and estimated values. NRMSE is a dimensionless metric expressed as RMSE over the range of measured torques values for each volunteer: where τ e (i) is the estimated and τ m (i) is the measured torque value for sample i, n corresponds to the total number of samples tested, and τ m,flex and τ m,ext are the maximum flexion and extension torques exerted by each volunteer. The absolute value of τ m,ext is considered because of the negative sign assigned to extension torque values during signal acquisition. R 2 is a measure of the percentage of variation in the dependant variable (torque) collectively explained by the independent variables (SEMG signals):     (14) where τ m is the mean measured torque. However R 2 has a tendency to overestimate the regression as more independent variables are added to the model. For this reason, many researchers recommend adjusting R 2 for the number of independent variables: where R 2 a is the adjusted R 2 , n is the number of samples and k is the number of SEMG channels.
Models were trained using every 100 data resampled from the processed signals to save model training time. Data set was reduced to 335 samples with resampling. Training time t, was measured as the number of seconds it took for each model to be trained. All training and testing was performed on a computer with an Intel ® Core™2 Duo 2.5 GHz processor and 6 GB of RAM. Table 3 compares mean training times for models trained using the original and resampled data sets.
One-way Analysis of Variance (ANOVA) failed to reject the null hypothesis that NRMSE and R 2 a have different mean values for each model, meaning that the difference between means is not significant (with minimum P-value of 0.95). We used reduced data sets with data resampled every 100 samples for the rest of the study.

Number of Muscles
As merely one degree of freedom of the wrist was considered in this study, the possibility of training models using only two primary muscles was investigated initially. There are six combinations possible with one primary flexor and one primary extensor muscle: FCR-ECRL, FCR-ECRB, FCR-ECU, FCU-ECRL, FCU-ECRB, and FCU-ECU. Models were trained using 75% of the data for all six combinations and then tested on the remaining 25% and the model with the best performance was picked. Mean and standard deviation of NRMSE and R 2 a for models trained with two, five, and eight channels are presented in Table 4.
It is noteworthy that best performance was not consistently attributed to a single combination of muscles for the case of models trained with two channels. It is evident that models trained with five channels are superior to models trained with two. However models trained with eight channels do not have significant performance superiority. Figure 9 compares NRMSE and R 2 a for different number of training channels.
This result appears to be in contrast to the results obtained by Delp et al. [66] where extrinsic muscles of the hand are expected to contribute substantially to torque generation. However, due to the design of our testing rig, volunteers only generated torque by pushing   their palms against the torque-sensing plate and their fingers did not contribute to torque generation. Therefore the addition of SEMG signals of extrinsic muscles to the model did not result in a significant increase in accuracy.
It should be noted that using more data for training models increases accuracy for same session models. Table 5 compares NRMSE and R 2 a for two extreme cases where 25% and 90% of the data set is used for training models and the rest of the data set is used for testing using all SEMG channels.
Mean R 2 a values increased 19%, 21%, 18%, 14%, 32%, and 26% while mean NRMSE values decreased 47%, 48%, 50%, 54%, 60%, and 46% for PBM, OLS, RLS, SVM, ANN, and LWPR, respectively. Figure 10 visualizes NRMSE and R 2 a for the two cases. For PBM training with two and five channels, ΣPCSA term in equation 7 was modified. For the two channel case, equation 7 took the following form: τ e (t) = PCSA f1exors PCSA f1exor × τ flexor (t) where ΣPCSA flexors is the summation of PCSA of all flexor muscles, ΣPCSA extensors is the summation of PCSA of all extensor muscles, PCSA flexor is the PCSA of the flexor muscle used for training, PCSA extensor is the PCSA of the extensor muscle used for training, τ flexor (t) is the torque of the flexor muscle used for training at time t, and τ extensor (t) is the torque of the flexor muscle used for training at time t.
Similarly PBM training with the five primary wrist muscles was carried out with modified ΣPCSA terms.   Half of the summation of PCSA values for non-primary flexors was added to each of the two primary flexors while a third of the summation of PCSA values for nonprimary extensors was added to the ΣPCSA term of each of the three primary extensors. These modifications allowed tuned parameters to stay within their physiologically acceptable values, even though less SEMG channels were used for training models.

Cross Session
Passages of time as well as electrode displacement adversely affect accuracy of models trained with SEMG [22,25]. Models trained with session 1 were tested with data from session 2 (in 1 hour without detaching electrodes) and session 3 (in 24 hours and with new electrodes attached). Table 6 compares model performance for the two cases.
Results suggest that model reliability deteriorates with passage of time. Figure 11 compares mean and standard deviation of NRMSE and R 2 a of models trained with session 1 and tested with data from the same session, after 1 and 24 hours.
Mean R 2 a values after one hour decreased 34%, 28%, 25%, 34%, 35%, and 20% while mean NRMSE decreased 93%, 68%, 70%, 88%, 91%, and 79% for PBM, OLS, RLS, SVM, ANN, and LWPR, respectively. After twenty four hours mean NRMSE values decreased further. High standard deviations of NRMSE and R 2 a values suggest unreliability of model predictions with passage of time and electrode displacement. Therefore it is crucial for models trained using SEMG signals to be retrained frequently regardless of the model utilized.

Arm Posture
Arm posture changes SEMG signal characteristics [8]. A model trained with the forearm in pronated position was utilized to predict the measured values from the supinated position in the same session. Supinating the forearm resulted in the torque sensor readings for extension and flexion to be reversed. This was explicitly taken into account when processing signals. Prediction accuracy of the trained models reduced significantly with forearm supination as shown in Table 7.
ANOVA shows that the hypothesis that NRMSE and R 2 a of testing was the same is refuted with P < 0.01. Results from this experiment validate that trained models are very sensitive to arm posture. Forearm supination shifts SEMG signal space. Since models trained in the pronated position do not take this shift into consideration, accuracy decreases [22]. SEMG patterns change with different arm postures that models need to explicitly take into consideration [67,68]. Figure 12 shows the effects of forearm supination on prediction accuracy of models trained with forearm in pronated position. Mean NRMSE values increased 2.50, 2.10, 2.13, 2.04, 2.24, and 2.32 times for PBM, OLS, RLS, SVM, ANN, and LWPR.   Table 8 summarizes performance of models based on different criteria. One advantage of machine learning techniques is that these models can be trained with raw SEMG signals as they are capable of mapping the nonlinearities associated with raw SEMG signals. In contrast, PBM can only be trained with processed SEMG signals since inputs to the PBM represent neural activity of muscles (a value bounded between zero and one) [69].
Moreover, nonlinear behaviour of muscles [10] observed in raw SEMG signals precludes utilization of linear regression for mapping.

Conclusions
Eleven volunteers participated in this study. During the first session, 33,520 samples from eight SEMG channels and a torque sensor were acquired while volunteers followed a protocol consisting of isometric flexion and extension of the wrist. We then processed SEMG signals and resampled every 100 samples to save model training time. Subsequently we trained models using identical training data sets. When using 90% of data as training data set and the rest of the data as testing data, we attained R 2 a values of 0.96 ± 0.04, 0.97 ± 0.04, 0.97 ± 0.03, 0.97 ± 0.03, 0.96 ± 3, and 0.87 ± 0.07 for PBM, OLS, RLS, SVM, ANN, and LWPR respectively. All models performed in a very comparable fashion, except for LWPR that yielded lower R 2 a values and higher NRMSE values.
Models trained using the data set from session one were tested using two separate data sets gathered one hour and twenty four hours following session one. We showed that Mean R 2 a values after one hour decrease 34%, 28%, 25%, 34%, 35%, and 20% for PBM, OLS, RLS, SVM, ANN, and LWPR, respectively. Tests after twenty four hours showed even further performance deterioration. Therefore it was concluded that all models considered in this study are sensitive to passage of time and electrode displacement.
The effects of the number of SEMG channels used for training were explored. Models trained with SEMG channels from the five primary forearm muscles were shown to be of similar predictive ability compared to models trained with all eight SEMG channels. However, models trained with two SMEG channels resulted in predictions with lower R 2 a and higher NRMSE values. Finally models trained with forearm in a pronated position were tested with data gathered from forearm in the supinated position. Mean NRMSE values increased 2.50, 2.10, 2.13, 2.04, 2.24, and 2.32 times for PBM, OLS, RLS, SVM, ANN, and LWPR.   The substantial decrease in predictive ability of all models with passage of time, electrode displacement, and altering arm posture necessitates regular retraining of models in order to sustain estimation accuracy. We showed that resampling the data set substantially reduces the training time without sacrificing estimation accuracy of models. All models were trained in under 20 seconds while OLS was trained in under 10 ms. Low training times achieved in this work render regular retraining feasible.