Classification of rhythmic locomotor patterns in electromyographic signals using fuzzy sets

Background: Locomotor control is accomplished by a complex integration of neural mechanisms including a central pattern generator, spinal reflexes and supraspinal control centres. Patterns of muscle activation during walking exhibit an underlying structure in which groups of muscles seem to activate in united bursts. Presented here is a statistical approach for analyzing Surface Electromyography (SEMG) data with the goal of classifying rhythmic "burst" patterns that are consistent with a central pattern generator model of locomotor control.
Methods: A fuzzy model of rhythmic locomotor patterns was optimized and evaluated using SEMG data from a convenience sample of four able-bodied individuals. As well, two subjects with pathological gait participated: one with Parkinson's Disease, and one with incomplete spinal cord injury. Subjects walked overground and on a treadmill while SEMG was recorded from major muscles of the lower extremities. The model was fit to half of the recorded data using non-linear optimization and validated against the other half of the data. The coefficient of determination, $R^2$, was used to interpret the model's goodness of fit.
Results: Using four fuzzy burst patterns, the model was able to explain approximately 70-83% of the variance in muscle activation during treadmill gait and 74% during overground gait. When five burst functions were used, one function was found to be redundant. The model explained 81-83% of the variance in the Parkinsonian gait, and only 46-59% of the variance in spinal cord injured gait.
Conclusions: The analytical approach proposed in this article is a novel way to interpret multichannel SEMG signals by reducing the data into basic rhythmic patterns. This can help us better understand the role of rhythmic patterns in locomotor control.


Background
During gait, the Central Nervous System (CNS) activates the muscles of the lower extremities in rhythmic patterns that can be measured by surface electromyography (SEMG). These signals are not precisely periodic; they naturally vary from stride to stride due to responses to environmental stimuli and a number of complex mechanisms in the CNS that are not well understood. SEMG is often used in the study of the motor control of normal and pathological gait, because it contains important information about the timing and intensity of muscle commands that originate in the CNS [1]. There have been several attempts to statistically classify locomotor patterns from SEMG data; however, the majority of these approaches are a posteriori and identify patterns without regard for physiological theory. Here, we propose a new a priori analytical method involving fuzzy systems that is designed to classify rhythmic locomotor patterns in SEMG waveforms that fit a rudimentary model of open-loop Central Pattern Generator (CPG) control.
Interpretation of SEMG during gait is particularly challenging due to the complexity of the myoelectric signals, which are stochastic in nature and represent an interference pattern from multiple motor units. Furthermore, SEMG data are usually multi-dimensional and involve significant measurement error (noise) that can only be partially discriminated from true signal using filtering techniques [2]. A number of statistical techniques have been proposed to deal with the high dimensionality and uncertainty that is inherent to SEMG data [3,4]. Jansen et al. [5] used a hierarchical clustering procedure to classify different muscle patterns observed in gait, from which they were able to draw inferences about different walking strategies. Intra-class correlation coefficients have been used to identify characteristics of different patient populations [6]. Factor analysis has been used to capture the underlying correlations between muscles, which has led to a deeper understanding of how locomotor patterns are organized [7]. These advanced analytical approaches can contribute to a better understanding of the underlying neural mechanisms that control muscle activity during gait. However, these approaches are a posteriori and lead to identification of patterns independent of physiological theory. The method proposed here is built upon the specific theory of a CPG that provides open-loop control of locomotion using simplified, pre-programmed muscle commands.
The idea that human locomotion is driven by oscillating neural circuits located in the spinal cord has been advanced for decades [8]. These circuits, known as the CPG, provide rhythmic "bursts" of muscle activation signals that form the basis of locomotor control [9-11]. By analyzing the basic pattern of SEMG signals as well as the variability that occurs over multiple strides, we can gain valuable insight into the function of the CPG and its role in human locomotor control.
One of the most important challenges in gait analysis is to determine if a set of recorded signals represents normal gait or if it contains particular signatures of pathological gait. It is often desirable to compare one set of SEMG waveforms to another in order to determine if a subject's gait exhibits abnormal behavior, if an intervention was successful, or if walking under different conditions involves different muscle activation patterns. Some researchers have developed mathematical indices that quantify certain features of dynamic EMG waveforms for the purpose of quantifying impairment [12,13] or to evaluate stride-to-stride variability [14].
Many neurological disorders are associated with increased variability of gait [1,5,9,15]. This is due to errors in locomotor control caused by dysfunction of specific areas in the CNS. It is conceivable that some CNS disorders may actually reduce the amount of variability, due to a decrease in anticipatory control (supraspinal), a decrease in environmental interaction (spinal reflexes) and a relative increase in self-generated oscillatory commands from the spinal CPG. For example, Miller et al. [14] observed reduced timing variability of the gastrocnemius muscle in Parkinsonian gait. This is an interesting finding that suggests there may be other characteristics of pathological gait that produce abnormally invariant muscle activation signals.
This article describes a combined fuzzy and statistical approach that first classifies basic muscle activation patterns during different phases of the gait cycle, and then evaluates the degree to which recorded muscle signals are consistent with a rudimentary CPG model of locomotor control. This approach is unique in that it enables an estimate of how much of the variability in muscle activity in gait is due to recurring basic patterns and how much is due to error and non-rhythmic sources of control (e.g., anticipatory adjustments, aberrant reflexes and measurement error).

Methods
Subjects
SEMG recordings were collected from four able-bodied (AB) individuals with no neurological conditions, as well as one individual with Parkinson's Disease (PD) and one individual with incomplete Spinal Cord Injury (SCI). Descriptive data for the six subjects are provided in Table 1. The PD subject was classified according to the Hoehn & Yahr scale [16], and the SCI subject was classified according to the American Spinal Injury Association (ASIA) Impairment Scale [17]. PD is a neurological disorder in which the supraspinal centers are believed to generate erroneous signals for locomotion [18]. SCI was included as a case in which the pathways between supraspinal centers and spinal circuits are impaired. We expected to find abnormal features in the SEMG of both pathological subjects.

Instrumentation and protocol
Each subject was instrumented with an 8-channel SEMG system (Biometrics DataLOG, Biometrics Ltd, Ladysmith, VA, USA). Eight electrodes were carefully placed over the muscle belly of the following muscles bilaterally: vastus lateralis (VL), long head of biceps femoris (BF), tibialis anterior (TA) and gastrocnemius lateralis (LG). These particular muscles were selected as a representative set of the major actuators during gait [5]. The skin was cleaned and lightly abraded before the electrodes were attached with double-sided adhesive tape. SEMG signals were amplified, bandpass filtered, and recorded at 2000 Hz. A foot switch was placed in the right shoe directly under the heel to detect initial foot contact, which was used to mark the beginning and end of each gait cycle. Each subject performed two trials of overground walking (OG) for a distance of 10 m. Then each subject performed two trials of treadmill walking (TM) for a duration of 30 s. TM speed was set to the average walking speed of the subject's OG trials. The first trial of each set was used as training data for optimizing the model. The second trial was used to validate the model.
After recording, SEMG signals were rectified and filtered using a low-pass Butterworth filter with a cut-off frequency of 10 Hz, which is considered sufficient for noise removal without loss of signal [2]. All signals were then separated into individual gait cycles marked by right foot contact and time-normalized relative to the gait cycle using cubic spline interpolation at 100 evenly spaced points in time (0 to 99% of the gait cycle). All data processing was performed using Matlab software (The Mathworks, Inc., Natick, MA, USA).
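As a rough illustration of this processing chain, the following Python sketch (NumPy and SciPy, not the Matlab code used in the study) rectifies a signal, applies a 10 Hz low-pass Butterworth filter and resamples each gait cycle to 100 points. The filter order, the use of zero-phase filtering, the function names and the synthetic test signal are assumptions made for illustration; the text does not specify them.

```python
import numpy as np
from scipy.interpolate import CubicSpline
from scipy.signal import butter, filtfilt

FS = 2000.0    # sampling rate in Hz (from the protocol)
CUTOFF = 10.0  # low-pass cut-off in Hz (from the text)
ORDER = 4      # assumption: the filter order is not stated in the text


def linear_envelope(raw, fs=FS, cutoff=CUTOFF, order=ORDER):
    """Full-wave rectify, then low-pass filter with a Butterworth filter."""
    b, a = butter(order, cutoff, btype="low", fs=fs)
    # filtfilt runs the filter forward and backward (zero phase lag);
    # whether the study used zero-phase filtering is an assumption.
    return filtfilt(b, a, np.abs(raw))


def normalize_cycles(envelope, heel_strikes, n_points=100):
    """Cut the envelope at heel-strike sample indices and resample each
    gait cycle to n_points (0 to 99% of the cycle) with a cubic spline."""
    phase = np.arange(n_points) / n_points  # 0.00, 0.01, ..., 0.99
    cycles = []
    for start, stop in zip(heel_strikes[:-1], heel_strikes[1:]):
        segment = envelope[start:stop]
        t = np.arange(len(segment)) / len(segment)  # fraction of this cycle
        cycles.append(CubicSpline(t, segment)(phase))
    return np.array(cycles)  # shape: (n_cycles, n_points)


# Synthetic stand-in for one SEMG channel: noise modulated once per second.
rng = np.random.default_rng(0)
time = np.arange(10000) / FS  # 5 s of samples
raw = rng.standard_normal(time.size) * (1 + np.sin(2 * np.pi * time))
heel_strikes = np.arange(0, time.size + 1, int(FS))  # one contact per second
cycles = normalize_cycles(linear_envelope(raw), heel_strikes)
print(cycles.shape)
```

In a real recording the heel-strike indices would come from the foot switch signal rather than from a fixed one-second spacing.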

Algorithm
The rectified and filtered SEMG signals were coded according to fuzzy sets [3,19]. A set of $n$ Gaussian membership functions was used to represent specific bursts of muscle activity during the gait cycle. These are described by Equation (1). Gaussian functions represent a basic "burst" pattern and have been used previously to decompose SEMG data [20].
$$b_i(t) = \exp\left(-\frac{(t - \tau_i)^2}{2\sigma_i^2}\right) \tag{1}$$
where $b_i(t)$ is the $i$th burst function, $\tau_i$ is the time of maximum value, and $\sigma_i$ is the standard deviation. The values of $\tau_i$ and $\sigma_i$ were initially selected a priori to provide good coverage of the gait cycle: the $\tau_i$ were equally spaced throughout the gait cycle, and the $\sigma_i$ were all equal to 10% of the gait cycle. Figure 1A illustrates the burst functions for $n = 4$, for which the initial model parameters can be expressed as a vector of four equally spaced peak times, $\tau$, and a vector of widths, $\sigma = [10, 10, 10, 10]$ (% of the gait cycle). Each SEMG signal was treated as a weighted sum of the burst functions. Our model is described in Equation (2).
$$Y_j(t) = \sum_{i=1}^{n} w_{ji}\, b_i(t) \tag{2}$$
where $Y_j(t)$ is the SEMG signal of the $j$th muscle, $w_{ji}$ is the weighting coefficient for the $j$th muscle and the $i$th burst function, and $n$ is the number of burst functions. The weighting coefficients were determined by fitting the model to the recorded SEMG data using a least-squares linear regression (Matlab function lscov). Each muscle was therefore represented by a single $n$-element vector of phase coefficients, resulting in a major reduction in the information density of each signal. Each SEMG signal could then be reconstructed using $n$ coefficients, creating a basic underlying pattern of muscle activation during the gait cycle. These coefficients can be interpreted as the pre-programmed muscle activation patterns that are dispensed by the CPG at the different phases of the gait cycle.
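The weighted-sum model and its least-squares fit can be sketched in Python with NumPy. The original analysis used Matlab's lscov; numpy.linalg.lstsq is used here only as a stand-in. The unit-peak Gaussian form without wrap-around at the cycle boundary, the initial peak times, the function names and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def burst_matrix(tau, sigma, n_points=100):
    """Return an (n_points, n) matrix whose columns are Gaussian bursts.

    tau and sigma are in percent of the gait cycle. No wrap-around across
    the cycle boundary is applied (an assumption of this sketch).
    """
    t = np.arange(n_points, dtype=float)[:, None]  # 0..99 % of the gait cycle
    tau = np.asarray(tau, dtype=float)[None, :]
    sigma = np.asarray(sigma, dtype=float)[None, :]
    return np.exp(-((t - tau) ** 2) / (2.0 * sigma ** 2))


def fit_weights(B, Y):
    """Least-squares weights W (n, n_muscles) so that B @ W approximates Y."""
    W, *_ = np.linalg.lstsq(B, Y, rcond=None)
    return W


# Hypothetical initial parameters: equally spaced peaks, sigma = 10 %.
tau0 = [12.5, 37.5, 62.5, 87.5]
sigma0 = [10.0, 10.0, 10.0, 10.0]
B = burst_matrix(tau0, sigma0)

# Synthetic "recording": known weights for two muscles plus a little noise.
rng = np.random.default_rng(1)
W_true = np.array([[1.0, 0.0], [0.2, 0.5], [0.0, 1.5], [0.7, 0.1]])
Y = B @ W_true + 0.01 * rng.standard_normal((100, 2))

W_hat = fit_weights(B, Y)  # one column of n coefficients per muscle
print(np.round(W_hat, 2))
```

Each column of `W_hat` is the compact description of one muscle: four numbers in place of 100 samples per gait cycle.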
The model was optimized by finding the values of $\tau_i$ and $\sigma_i$ that produced the best fit. A Nelder-Mead simplex direct search algorithm (Matlab function fminsearch) was used to find the burst function parameters that maximized the goodness of fit, $R^2$, between the training data and the model output. We interpreted $R^2$ as the proportion of the variance in the SEMG signals that is explained by the model.
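A minimal sketch of this optimization step follows, using SciPy's Nelder-Mead implementation in place of Matlab's fminsearch. Several choices here are assumptions rather than details taken from the study: the total sum of squares is computed about each channel's mean, the weights fitted on the training data are reused for validation, and all parameter values and signals are synthetic.

```python
import numpy as np
from scipy.optimize import minimize

T = np.arange(100, dtype=float)[:, None]  # 0..99 % of the gait cycle


def bursts(params):
    """Gaussian bursts as columns; params = [tau_1..tau_n, sigma_1..sigma_n]."""
    tau, sigma = np.split(np.asarray(params, dtype=float), 2)
    return np.exp(-((T - tau) ** 2) / (2.0 * sigma ** 2))


def fit_weights(params, Y):
    """Least-squares weights for the bursts defined by params."""
    W, *_ = np.linalg.lstsq(bursts(params), Y, rcond=None)
    return W


def r_squared(params, W, Y):
    """R^2 pooled over all channels (total SS about each channel mean)."""
    ss_res = np.sum((Y - bursts(params) @ W) ** 2)
    ss_tot = np.sum((Y - Y.mean(axis=0)) ** 2)
    return 1.0 - ss_res / ss_tot


def objective(params, Y):
    """Negative R^2, because minimize() searches for a minimum."""
    return -r_squared(params, fit_weights(params, Y), Y)


# Synthetic training and validation "trials" sharing one underlying pattern.
rng = np.random.default_rng(2)
true_params = [10, 35, 60, 85, 8, 12, 9, 11]   # hypothetical values
W_true = np.array([[1.0, 0.1], [0.3, 0.8], [0.0, 1.2], [0.6, 0.2]])
clean = bursts(true_params) @ W_true
Y_train = clean + 0.02 * rng.standard_normal(clean.shape)
Y_valid = clean + 0.02 * rng.standard_normal(clean.shape)

x0 = [12.5, 37.5, 62.5, 87.5, 10, 10, 10, 10]  # hypothetical starting point
r2_init = -objective(x0, Y_train)
result = minimize(objective, x0, args=(Y_train,), method="Nelder-Mead")
r2_train = -result.fun
r2_valid = r_squared(result.x, fit_weights(result.x, Y_train), Y_valid)
print(r2_init, r2_train, r2_valid)
```

Because the weights are refitted by linear least squares inside the objective, the simplex search only has to explore the 2n burst parameters.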

Results
Testing
A 4-burst model was fit to the treadmill walking data and the overground walking data separately. Four bursts were initially chosen because models of the CPG typically consist of four synergies corresponding to a flexor pattern and an extensor pattern on each side of the body [8]. As shown in Figure 1, the burst function profiles of these two models differed only slightly. Figure 2 shows the SEMG data from one of the validation trials. Following optimization of the model, a separate $R^2$ was calculated for each subject under each walking condition (OG and TM) using the validation data. Figure 3 summarizes the $R^2$ values under each walking condition. Each value represents the extent to which the fuzzy model accounts for the variance of all SEMG signals in the validation walking trial. Initially, the model was designed with $n = 4$ burst functions. We tested for improved model performance by increasing the number of bursts from four to five. The best-fit solution resulted in two functions with identical parameter values for $\tau$ and $\sigma$. In other words, the 5-burst model degenerated to a 4-burst model. The fifth burst was redundant and provided no improvement to the fit of the model.
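To see why a duplicated burst cannot improve the fit, consider the small NumPy sketch below. It assumes unit-peak Gaussian bursts, hypothetical parameter values and random stand-in data; it illustrates the linear algebra only and does not reproduce the study's optimization.

```python
import numpy as np

T = np.arange(100, dtype=float)[:, None]  # 0..99 % of the gait cycle


def bursts(tau, sigma):
    """Gaussian bursts as columns of a (100, n) matrix."""
    tau = np.asarray(tau, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    return np.exp(-((T - tau) ** 2) / (2.0 * sigma ** 2))


def pooled_r2(B, Y):
    """R^2 pooled over channels after a least-squares fit of the weights."""
    W, *_ = np.linalg.lstsq(B, Y, rcond=None)
    ss_res = np.sum((Y - B @ W) ** 2)
    ss_tot = np.sum((Y - Y.mean(axis=0)) ** 2)
    return 1.0 - ss_res / ss_tot


rng = np.random.default_rng(3)
Y = rng.random((100, 8))  # stand-in for 8 time-normalized SEMG channels

tau4, sigma4 = [12.5, 37.5, 62.5, 87.5], [10.0] * 4  # hypothetical values
tau5, sigma5 = tau4 + [37.5], sigma4 + [10.0]        # fifth burst copies the second

r2_four = pooled_r2(bursts(tau4, sigma4), Y)
r2_five = pooled_r2(bursts(tau5, sigma5), Y)
print(abs(r2_four - r2_five) < 1e-9)
```

Two bursts with identical parameters produce identical columns in the regression, so they span the same space as a single burst and the explained variance is unchanged.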

Discussion
The approach presented in this article represents a form of fuzzy coding of muscle activation signals that can be used to determine an underlying temporal pattern of SEMG signals during gait. The basic structure of the predictor model consists of four overlapping Gaussian membership functions distributed across the gait cycle. This model is based on general theory of CPG control of locomotion. The Gaussian membership functions representing pre-programmed bursts from the CPG were optimized according to a set of training data and then tested against a set of validation data. Four burst functions were sufficient; when a fifth burst was added, the model degenerated into a four-burst model during optimization.
The model assumes that the CPG produces periodic signals that are exactly the same for every stride. Under this assumption, all stride-to-stride variability is attributed to mechanisms other than the CPG, e.g., anticipatory adjustments from supraspinal centers and reflex responses to external perturbations. In normal gait, the model was able to account for 70-84% of the variance in SEMG throughout the gait cycle. Similar results were found for the subject with Parkinson's Disease. The model was not able to account for the SEMG of the SCI subject very well, likely due to a lack of coordination and high stride-to-stride variability.
Our statistical approach differs significantly from other methods of interpreting SEMG data during gait. Many SEMG analyses focus on the ensemble average of all strides and do not take into account variability [3,21]. In our analysis, the stride-to-stride variability was essential in determining the goodness of fit of the fuzzy CPG model. Ivanenko et al. [7] used factor analysis to find common waveforms that were shared by multiple muscles. These waveforms are analogous to the Gaussian membership functions that we use in our model; however, they are more complex in shape. They were able to account for roughly 80% of the variance in normal gait, which is similar to our results [22].
There are some special considerations when using the analytical method described in this article. First, $R^2$ is very sensitive to measurement error, so great care should be taken to ensure that electrodes are placed correctly and securely. The calculation of $R^2$ is based on an estimation of variance using sums of squares. Considering the n-channel SEMG data as a set of points in n-dimensional space, the sums of squares are based on Euclidean distances, whereby each dependent variable has equal weight. This may not always be appropriate. For example, if recordings are taken from the soleus and both heads of gastrocnemius, the triceps surae will contribute three times as much to the sum of squares as other muscle groups that are recorded individually.
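The effect of equal channel weighting can be made concrete with a toy example. The NumPy sketch below uses sinusoids as stand-ins for SEMG envelopes and a hypothetical model that fits the calf channels perfectly and the remaining muscle not at all. The per-channel weights are one possible adjustment added for illustration; they are not part of the method described in this article.

```python
import numpy as np


def pooled_r2(Y, Y_hat, weights=None):
    """R^2 pooled over channels; weights scale each channel's sums of squares."""
    Y = np.asarray(Y, dtype=float)
    Y_hat = np.asarray(Y_hat, dtype=float)
    w = np.ones(Y.shape[1]) if weights is None else np.asarray(weights, dtype=float)
    ss_res = np.sum(w * np.sum((Y - Y_hat) ** 2, axis=0))
    ss_tot = np.sum(w * np.sum((Y - Y.mean(axis=0)) ** 2, axis=0))
    return 1.0 - ss_res / ss_tot


t = np.linspace(0.0, 1.0, 100, endpoint=False)
calf = np.sin(2 * np.pi * t)   # stand-in for one triceps surae channel
thigh = np.cos(2 * np.pi * t)  # stand-in for one other muscle

# Columns: soleus, two heads of gastrocnemius, one other muscle.
Y = np.column_stack([calf, calf, calf, thigh])
# Hypothetical model output: calf channels fitted exactly, thigh missed entirely.
Y_hat = np.column_stack([calf, calf, calf, np.zeros_like(thigh)])

r2_equal = pooled_r2(Y, Y_hat)  # every channel counts once
r2_grouped = pooled_r2(Y, Y_hat, weights=[1 / 3, 1 / 3, 1 / 3, 1.0])  # every muscle group counts once
print(round(r2_equal, 3), round(r2_grouped, 3))  # 0.75 0.5
```

With equal channel weights the three calf channels dominate the pooled statistic, so the same model appears to explain 75% of the variance instead of 50%.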

Conclusions
The analytical approach proposed in this article is a novel way to interpret multichannel SEMG signals by reducing the data into basic rhythmic patterns. This can help us better understand the role of rhythmic patterns in locomotor control, and provide insight about certain forms of pathological gait.

Authors' contributions
TAT conceived the basis for the study, designed the methodology and carried out the data processing and statistical analyses. JSW collected the bulk of the data and participated in the data processing. SF contributed to the design of the study, recruitment of subjects, and analysis of data. All authors have read and approved the final version of this article.