Reliability of the biceps brachii M-wave
© Calder et al. 2005
Received: 28 December 2004
Accepted: 06 December 2005
Published: 06 December 2005
The peak-to-peak (P-P) amplitude of the maximum M-wave and the area of the negative phase of the curve are important measures that serve as methodological controls in H-reflex studies, motor unit number estimation (MUNE) procedures, and normalization factors for voluntary electromyographic (EMG) activity. These methodologies assume, with little evidence, that M-wave variability is minimal. This study therefore examined the intraclass reliability of these measures for the biceps brachii.
Twenty-two healthy adults (4 males and 18 females) participated in 5 separate days of electrical stimulation of the musculocutaneous nerve supplying the biceps brachii muscle. Ten stimulations were recorded in each of the 5 test sessions, giving a total of 50 trials per subject for analysis. A two-factor repeated measures analysis of variance (ANOVA) evaluated the stability of the group means across test sessions. The consistency of scores within individuals was determined by calculating the intraclass correlation coefficient (ICC). The variance ratio (VR) was then used to assess the reproducibility of the shape of the maximum M-wave within individual subjects.
The P-P amplitude means ranged from 12.62 ± 4.33 mV to 13.45 ± 4.07 mV across test sessions. The group means were highly stable. ICC analysis also revealed that the scores were very consistent (ICC = 0.98). The group means for the area of the negative phase of the maximum M-wave were also stable (117 to 126 mV·ms). The ICC analysis also indicated a high degree of consistency (ICC = 0.96). The VR for the sample was 0.244 ± 0.169, which suggests that the biceps brachii maximum M-wave shape was in general very reproducible for each subject.
The results support the use of P-P amplitude of the maximum M-wave as a methodological control in H-reflex studies, and as a normalization factor for voluntary EMG. The area of the negative phase of the maximum M-wave is both stable and consistent, and the shape of the entire waveform is highly reproducible and may be used for MUNE procedures.
The massed action potential (M-wave) is also known as the compound muscle action potential (CMAP); it is an important investigative tool in several areas of neurophysiological research. Evoking the maximum M-wave (Mmax) by supramaximal stimulation is the electrical equivalent of recruiting all motor units within the motor neuron pool. The Mmax serves as a methodological control to ensure that the effective stimulus intensity to peripheral nerves is consistent across recording sessions. Using a stimulus intensity that produces M-wave responses corresponding to a consistent percentage of Mmax ensures that the same number of motor axons is recruited in each trial.
The area of the negative phase of the maximum M-wave is a critical part of motor unit number estimation for tracking the progression of neuromuscular disorders. The P-P amplitude of the maximum M-wave is used in Hoffmann reflex (H-reflex) studies to establish that variations in the H-reflex are neural in origin, and are not caused by changes in the muscle, recording conditions, or problems with instrumentation. This is accomplished by calculating the ratio between the maximum P-P amplitude of the H-reflex and that of the M-wave (Hmax/Mmax), which is considered an index of excitability of the H-reflex arc [1, 3, 5]. Similarly, the M-wave is used as a normalization factor to correct for day-to-day fluctuations in voluntary electromyographic (EMG) activity due to slight differences in electrode placement, muscle temperature, and other such considerations [6, 7].
The methodologies described above are based on the assumption that there is little variability in the M-wave. Given the importance of the M-wave as a clinical and investigative tool, surprisingly few studies have documented the variability of this waveform. Studies thus far have only examined simple test-retest reliability [4, 8–10], and only one of these studies has included both the P-P amplitude and the area of the negative phase of the maximum M-wave. Two studies have performed a more comprehensive analysis of M-wave reliability using the intraclass correlation coefficient (ICC), but neither the P-P amplitude nor the area of the negative phase of the maximum M-wave was specifically investigated [11, 12]. The purpose of this study was therefore to examine maximum M-wave reliability using both the P-P amplitude and the area of the negative phase of the curve for responses obtained from the biceps brachii muscle over a series of five test sessions.
Materials and methods
Table 1. Physical characteristics of the subjects (N = 22; 4 males, 18 females), including age, height, weight and body mass index (BMI).
Characteristic      Mean ± SD
Age (years)         22.87 ± 3.61
Height (m)          1.69 ± 0.09
Weight (kg)         65.05 ± 14.65
BMI (kg/m²)         22.50 ± 3.35
Measurement schedule and procedures
There were 5 days of testing with at least 24 hours between each session. All testing was done on the right arm while subjects lay prone on a gurney with their shoulder abducted to 90° at their side, palm facing up with the elbow slightly flexed. Prior to electrode placement, the skin on the right upper arm was lightly abraded and cleaned with rubbing alcohol to reduce signal impedance at the skin surface. The motor point was then determined for electrode placement. The motor point is defined as the region of the muscle where the lowest possible stimulus will produce a minimal muscle twitch. The motor point of the biceps brachii (BB) was located approximately midway between the glenohumeral joint and the cubital crease.
The cathode portion of the stimulating probe was placed in the estimated motor point region. With the train rate on the stimulator set at 10 pps, and the stimulus duration set at 1 msec, the cathode was moved around the muscle belly to find the motor point. Prior to placing the recording (G2) and reference (G1) electrodes, skin impedance was measured (Grass EZM Electrode Impedance Meter, Astro-Med Inc., Warwick, RI) and maintained below 10 kΩ. The G2 electrode was placed directly above the motor point of the BB muscle. The G1 electrode was placed on the biceps tendon. Both G1 and G2 were standard size (20 mm diameter) Ag/AgCl electrodes (Grass F-E9-40-5, Astro-Med Inc., Warwick, RI). A self-adhesive ground electrode was placed on the upper portion of the biceps muscle, between G2 and the point of stimulation on the musculocutaneous nerve. The EMG system (Grass P511, Astro-Med Inc., West Warwick, RI) amplified the evoked potentials (1000×) before they were band-pass filtered (3–1000 Hz).
To ensure that electrical stimulation was accurately placed over the musculocutaneous nerve, and that the BB was the only muscle being activated, bipolar surface electrodes (DE-2.1, Delsys Inc., Boston, MA) were positioned over the biceps and triceps brachii. One electrode was placed on the lower third of the biceps belly, below the motor point towards the distal tendon. The other was placed between the distal tendon and the top of the belly of the triceps lateral head to monitor activity in the antagonist muscles. A self-adhesive ground electrode was also secured over the collarbone. These signals were amplified with a fixed gain of 10 at the skin surface. The EMG system (Bagnoli 4, Delsys Inc., Boston, MA) further amplified the signals (100×) before they were band-pass filtered (20–450 Hz). Electrode placement was kept consistent by tracing all sites with indelible ink, and asking participants to preserve these markings for the duration of the study.
All signals were sent to a 16-bit A/D converter (BNC-2110, National Instruments), and sampled at 2048 Hz using a Computer-Based Oscillograph and Data Acquisition System (DASYLab, DASYTEC National Instruments, Amherst, NH). This recorded data was stored for further analysis on a Pentium III PC (Seanix Technology Inc., Blaine, WA).
Subjects were instructed to close their eyes and lie still throughout the session. The BB M-wave was obtained by stimulating the musculocutaneous nerve as depicted in Figure 1. Stimulations started below the response threshold, and increased in 4 V increments until Mmax was achieved. Stimulus intensity was then increased slightly beyond this point to confirm no further enlargement in the peak-to-peak amplitude, and subsequently returned to the lower intensity where the M response remained stable [14, 15]. Ten stimuli were delivered, each followed by a 15-second rest. The M-waves were recorded for each trial and saved for later analysis. This protocol was followed for each of the 5 days of testing.
Data reduction was conducted for all ten trials for each of the five test sessions. The P-P amplitude of the biceps brachii M-wave was calculated as the difference between the maximum and minimum of the signal. Area of the negative phase of the biceps brachii M-wave was calculated using trapezoidal integration:
where n is the number of data points, y_i is the data value at time t_i, and Δt is the sampling interval. The start of the negative phase of the biceps brachii M-wave was defined as the first point to cross the zero baseline after the stimulus artefact (t1, y1). The end of the negative phase of the biceps brachii M-wave was the last point before the second baseline crossing (tn, yn). Since trapezoidal integration is sensitive to interval width (Δt), the entire waveform was interpolated to a sampling rate of 10 kHz prior to calculating area under the curve. All data reduction was completed using MATLAB software (The Mathworks Inc., Natick, MA).
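The data reduction described above can be sketched in Python. This is a minimal illustration and not the authors' MATLAB code: the function name `mwave_measures`, the use of linear interpolation, and the synthetic test waveform are all assumptions. The area uses the standard composite trapezoidal rule, the sum of (y_i + y_{i+1})/2 · Δt over the n points of the phase, and the input is assumed to be baseline-corrected with the stimulus artefact already removed.

```python
import numpy as np


def mwave_measures(signal_mv, fs_hz=2048.0, fs_interp_hz=10_000.0):
    """Return (P-P amplitude in mV, area of the first phase in mV*ms).

    Assumes `signal_mv` is baseline-corrected and starts after the
    stimulus artefact, so the first run of samples off the baseline
    is the negative phase of the M-wave.
    """
    y = np.asarray(signal_mv, dtype=float)
    pp_amplitude = y.max() - y.min()

    # Resample to the finer grid; linear interpolation is an assumption.
    t = np.arange(y.size) / fs_hz
    t_fine = np.arange(0.0, t[-1], 1.0 / fs_interp_hz)
    y_fine = np.interp(t_fine, t, y)

    # First phase: from the first sample off the baseline up to the last
    # sample before the signal changes sign (second baseline crossing).
    sign = np.sign(y_fine)
    off_baseline = np.flatnonzero(sign != 0)
    if off_baseline.size == 0:
        return pp_amplitude, 0.0
    start = off_baseline[0]
    changed = np.flatnonzero(sign[start:] != sign[start])
    stop = start + changed[0] if changed.size else y_fine.size
    phase = y_fine[start:stop]

    # Composite trapezoidal rule: sum of (y_i + y_{i+1}) / 2 * dt.
    dt_ms = 1000.0 / fs_interp_hz
    area = abs(np.sum((phase[:-1] + phase[1:]) / 2.0) * dt_ms)
    return pp_amplitude, area


# Hypothetical waveform: one 20 ms sine cycle with a 6 mV peak.
t = np.arange(0.0, 0.04, 1.0 / 2048.0)
wave = np.where((t >= 0.005) & (t <= 0.025),
                6.0 * np.sin(2.0 * np.pi * (t - 0.005) / 0.02), 0.0)
pp, area = mwave_measures(wave)
print(f"P-P = {pp:.2f} mV, area = {area:.1f} mV*ms")
```

For this synthetic waveform the analytic values are 12 mV peak to peak and 120/π ≈ 38.2 mV·ms for the first half cycle, so the sketch can be checked against them. The absolute value is taken because the numerical sign of the "negative" phase depends on the recording polarity convention. Real recordings would also need a noise threshold around the baseline, which is omitted here.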
All statistical procedures were performed in SYSTAT (SPSS Inc., Chicago, IL). A significance level of P <0.05 was adopted for this study.
Intraclass correlation analysis of variance
Reliability analysis with the intraclass correlation coefficient (ICC) requires two different analysis of variance (ANOVA) models [15–18]. One is to establish the "consistency" of the measures. This is a fully nested model wherein trials are nested within days, which are in turn nested within subjects. When subjects are able to reproduce their own score, the scores are tightly grouped around the subject's own mean. In this way, the scores of one subject are very different from the scores of another, and the between-subjects mean square (MS) is high. This is also reflected as a high true score variance (σ²_true), as outlined below. Measures that are highly reliable have a true score variance that accounts for the greatest percentage of the total variance. The second ANOVA model is used to examine the "stability" of the means across test sessions. This ANOVA model has two factors (days × subjects). The repeated measurements (trials) on each subject in each day constituted a "within-cells" replication of measures [15–18]. A measure must therefore exhibit both consistency and stability to be considered reliable. The ICC was calculated in the following way:
The mean squares (MS) for subjects, days and trials were extracted from the fully nested ANOVA model to calculate the ICC in equation 2. In equations 3–5, a' is the number of days, n' is the number of trials, σ²_e2 is the error variance due to days, σ²_e1 is the error variance due to trials, and σ²_true is the true score variance. The total variance σ²_total was then calculated as the sum of the variances (σ²_true + σ²_e1 + σ²_e2). The proportions of variance attributable to day-to-day (σ²_e2/σ²_total), trial-to-trial (σ²_e1/σ²_total), and between-subjects (σ²_true/σ²_total) sources were computed to identify the amount of variability at each level of measurement [11, 12].
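The calculation described above can be sketched numerically. The sketch assumes the usual variance-component estimators for a fully nested design (trials within days within subjects); these are consistent with the definitions given here but may differ in form from the authors' equations 2–5. Under that assumption, the trial variance is MS_trials, the day variance is (MS_days − MS_trials)/n', the true score variance is (MS_subjects − MS_days)/(a'n'), and the ICC for the mean score is (MS_subjects − MS_days)/MS_subjects. The function name and the mean squares in the example are hypothetical; the mean squares were chosen so that the variance proportions match those reported in the Results for the area measure (80%, 15% and 5%).

```python
def nested_icc(ms_subjects, ms_days, ms_trials, n_days, n_trials):
    """Variance components and ICC from a fully nested ANOVA.

    Assumed estimators (standard for trials within days within subjects):
      var_e1   = MS_trials
      var_e2   = (MS_days - MS_trials) / n'
      var_true = (MS_subjects - MS_days) / (a' * n')
      ICC      = (MS_subjects - MS_days) / MS_subjects
    """
    var_e1 = ms_trials
    var_e2 = (ms_days - ms_trials) / n_trials
    var_true = (ms_subjects - ms_days) / (n_days * n_trials)
    var_total = var_true + var_e1 + var_e2
    return {
        "icc": (ms_subjects - ms_days) / ms_subjects,
        "prop_true": var_true / var_total,
        "prop_days": var_e2 / var_total,
        "prop_trials": var_e1 / var_total,
    }


# Hypothetical mean squares for a' = 5 days and n' = 10 trials per day.
result = nested_icc(ms_subjects=4155.0, ms_days=155.0, ms_trials=5.0,
                    n_days=5, n_trials=10)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

With these illustrative inputs the variance proportions are 0.80, 0.15 and 0.05 and the ICC is about 0.96, which agrees with the value reported for the area of the negative phase. This agreement suggests, but does not prove, that the assumed estimators match the ones used in the study.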
The ICC is a ratio of variance due to differences between subjects (signal) to the total variability in the data (signal and noise). Thus, the ICC is a relative measure of the ability to differentiate between individuals. Since clinical measures and normalization techniques are relative to the individual, there is a need for an additional measure that assesses the reliability of the biceps brachii M-wave within the individual. The variance ratio (VR) assesses the "reproducibility" of waveform shape for an individual subject [20, 21]. The more similar the waveforms are in shape, the closer the variance ratio is to 0; the more dissimilar they are, the closer it is to 1.
To calculate the VR, the biceps brachii M-wave for each subject had to be normalized in the time domain to the same number of data points. Only the negative and positive phases of the biceps brachii M-wave were analyzed. The start of the negative phase of the biceps brachii M-wave was defined as the first point to cross the zero baseline after the stimulus artefact (t1, y1). The end of the positive phase of the biceps brachii M-wave was the last point before the third baseline crossing (tn, yn). The waveform between these two points was then interpolated to 1000 data points (T = 1000). This was done for all 50 waveforms (N = 50) within a subject. The formula for calculating the VR was:
where y_{t,n} was the amplitude of the biceps brachii M-wave at time point t of trial n. To calculate the mean waveform, the biceps brachii M-wave was averaged across the 50 trials at each time point, which still yielded a 1000-point waveform. The grand mean was then a single number representing the mean of all data points across the 50 trials.
Table 2. The means (M) and standard deviations (SD) and analysis of variance (ANOVA) F-ratios for the peak-to-peak (P-P) amplitude and the area of the negative phase of the biceps brachii maximum M-wave for the ten trials across five days for all subjects (N = 22).
Day                  P-P amplitude (mV), M ± SD    Area (mV·ms), M ± SD
1                    12.90 ± 4.64                  125.8 ± 40.1
2                    12.63 ± 4.25                  122.6 ± 40.1
3                    12.66 ± 4.22                  122.7 ± 39.5
4                    12.62 ± 4.33                  117.1 ± 37.4
5                    13.45 ± 4.07                  126.0 ± 35.2
Minimum – Maximum    5.07 – 23.64                  33.7 – 208.9
Source of Variation
Day × Subjects
The between-subjects main effect was significant for the area of the negative phase of the biceps brachii M-wave (see Table 2). There was also a significant main effect for days, as the area measure was 4 to 7% lower on Day 4 than on any other day. The day × subjects interaction term was significant, indicating that not all subjects exhibited the same pattern of change in the area measure across test sessions. However, this slight lack of stability in group means was compensated for by a high degree of consistency within subjects. The between-subjects variability (σ²_true) accounted for 80% of the total variance, which is necessary for a high ICC. The trial-to-trial variability (σ²_e1) was 5% while the day-to-day variability (σ²_e2) was three-fold greater (15%). The resulting ICC was 0.96.
The reliability of the P-P amplitude and the area of the negative phase of the maximum M-wave was assessed for fifty trials distributed equally across five test sessions. The P-P amplitude of the maximum M-wave exhibited excellent reliability (ICC = 0.98), as did the area of the negative phase of the maximum M-wave (ICC = 0.96). These high ICC values are consistent with previous work from our laboratory on other muscle groups [14, 15]. In the following paragraphs we discuss the theoretical implications and practical application of our results.
The P-P amplitude of the maximum M-wave ranged from 12.62 ± 4.33 mV to 13.45 ± 4.07 mV across test days. Taylor et al. [22] reported a mean of 13.4 ± 4.2 mV while Allman and Rice [23] observed a mean of 15.3 ± 5.6 mV. Thus, our results are well within the range of values found in the literature. The area of the negative phase of the maximum M-wave is used for MUNE, but its value in absolute units is seldom reported. To the best of our knowledge, no comparative data exist for the biceps brachii. In this respect, the current work contributes normative data to the existing literature. Rutkove [24] reported a mean of 44.2 mV·ms for the abductor pollicis (thenar muscle) while Boe et al. [4] observed a nearly identical value (44.5 mV·ms). The latter research group [4] also found a mean of 29.2 mV·ms for the first dorsal interosseous/adductor pollicis muscle. The biceps brachii means observed here ranged from 117.1 to 126.0 mV·ms. Given the larger P-P amplitudes and longer durations of the biceps brachii M-wave compared to those of smaller muscles, the area values are quite reasonable.
The ICC analysis of variance technique resulted in an excellent reliability coefficient for the P-P amplitude of the maximum M-wave. A high reliability is obtained when the between-subjects variance is substantially larger than the variance in scores within subjects, and the variance of scores due to error is minimized. Individual subjects in the present study exhibited highly consistent P-P amplitude scores, so that the variation in scores between subjects could be clearly observed. Thus, the high ICC value indicates that the P-P amplitude of the maximum M-wave was a reliable estimate of complete activation of the associated motoneuron pool. These findings provide evidence that supports the use of the P-P amplitude of the maximum M-wave as a methodological standard against which other muscular responses, such as the H-reflex or voluntary EMG, can be assessed.
The area of the negative phase of the maximum M-wave also exhibited excellent reliability for the same reasons as P-P amplitude. The maximum M-wave represents complete activation of the muscle associated with the stimulated peripheral nerve. In the absence of a neuromuscular disorder, it should remain unchanged over time as the number of α-motoneurons remains constant. It is unreasonable to expect that any physiological measure would have perfect reliability, but an ICC of 0.96 does indicate that the area of the negative phase of the maximum M-wave can be a stable and consistent measure of the number of α-motoneurons. The results presented here therefore support its use in MUNE.
To date, reliability studies on MUNE have utilized quite limited statistical techniques. The Pearson correlation coefficient and the t-test were combined to evaluate MUNE values obtained on only two separate occasions [4, 9]. Even when combined, these two methods remain insensitive to the problem of consistency. One study did use the CV to evaluate the consistency of individual subjects, but the stability of the group means was not considered. The current study presents a comprehensive treatment of the reliability of the area of the negative phase of the maximum M-wave. This is important as MUNE is performed over multiple test sessions to monitor the progression of neuromuscular disorders.
Other investigators have reported that trial-to-trial variability accounted for the lowest percentage of the total variance [11, 12]. In the present study, the peripheral nerve was recruited by a hand-held stimulator, as would occur during clinical testing. The low trial-to-trial variability indicates that hand-held stimulation was not an issue in the present investigation. As might be expected, multiple test sessions introduce additional sources of error. There could be slight differences in the position of the stimulating and recording electrodes, changes in electrode-skin input impedance and/or muscle temperature [11, 12, 27]. Limb position is also critical as it alters the position of the nerve relative to the skin surface, making it harder or easier to evoke the potential. Changes in muscle geometry relative to the skin surface associated with different limb positions can alter the shape of the evoked potential [12, 27]. The low day-to-day variance observed in this study suggests that careful methodological controls can minimize these potential sources of error.
The ICC values in the current work are higher than those in previous publications [11, 12]. This may be due to fundamentally different methodologies. In addition to well-controlled electrode placement and methodology, there was strict adherence to a well-documented anatomic reference position and stimulation site for the peripheral nerve. Stimulating the peripheral nerve and recording the response at the measured motor point are key to obtaining crisp, reliable M-waves. The previous ICC studies [11, 12] used a non-clinical protocol: the motor point was stimulated electrically and the M-wave was recorded between the motor point and the distal tendon. An M-wave recorded in this way is more susceptible to distortions associated with temporal dispersion and a contracting muscle; it could not be used for MUNE.
Figure 2 was used to illustrate the relative nature of the ICC. Individual responses can have a certain degree of variability, but if differences between subjects can be detected, the ICC will be high. Thus, the VR was included in this study to assess reproducibility of M-wave shape for individual subjects. There is no generally accepted delineation of excellent or even acceptable ranges of VR as exists for the ICC. Jacobson et al. [21] reviewed the existing literature and set an upper limit of 0.40 as the criterion below which the same muscle group on the right and left legs would exhibit symmetrical profiles for linear envelope detected EMG during gait. In the current sample, two subjects had a VR much greater than 0.40 while a third was only slightly greater than 0.40. The remainder of the sample had VRs below 0.40. The results for the ICC and VR taken together support the earlier observation of Merletti et al. [11, 12] that shape features of M-waves are so reliable that visual identification of subjects is possible based on their M-waves.
The results support the use of P-P amplitude of the maximum M-wave as a methodological control in H-reflex studies, and as a normalization factor for voluntary EMG. The area of the negative phase of the maximum M-wave is both stable and consistent, and the shape of the entire waveform is highly reproducible and may be used for MUNE procedures. The intraclass correlation analysis of variance is necessary for establishing the reliability (stability and consistency) of EMG waveform measures, but not sufficient for investigating reproducibility of the EMG waveform shape. The variance ratio demonstrated that the shape of the biceps brachii maximum M-wave was very reproducible for all but a few subjects. Such information is important if the M-wave is to be used in tracking the progression of neuromuscular disorders.
This study was supported by NSERC of Canada.
1. Maffiuletti NA, Martin A, Babault N, Pensini M, Lucas B, Schieppati M: Electrical and mechanical Hmax-to-Mmax ratio in power- and endurance-trained athletes. J Appl Physiol 2001, 90:3–9.
2. Aagaard P, Simonsen EB, Andersen JL, Magnusson P, Dyhre-Poulsen P: Neural adaptation to resistance training: changes in evoked V-wave and H-reflex responses. J Appl Physiol 2002, 92:2309–2318.
3. Crone C, Johnsen LL, Hultborn H, Ørnes GB: Amplitude of the maximum motor response (Mmax) in human muscles typically decreases during the course of an experiment. Exp Brain Res 1999, 124:265–270.
4. Boe S, Stashuk DW, Doherty TJ: Motor unit number estimation by decomposition-enhanced spike-triggered averaging: control data, test-retest reliability, and contractile level effects. Muscle Nerve 2004, 29:693–699.
5. Scaglioni G, Ferri A, Minetti AE, Martin A, Van Hoeck J, Capodaglio P, Sartorio A, Narici MV: Plantar flexor activation capacity and H reflex in older adults: adaptations to strength training. J Appl Physiol 2002, 92:2292–2302.
6. Carolan B, Cafarelli E: Adaptations in coactivation after isometric resistance training. J Appl Physiol 1992, 73:911–917.
7. Rich C, Cafarelli E: Submaximal motor unit firing rates after 8 wk of isometric resistance training. Med Sci Sport Exerc 2000, 32:190–196.
8. Jaberzadeh S, Scutter S, Warden-Flood A, Nazeran H: Between-days reliability of H-reflexes in human flexor carpi radialis. Arch Phys Med Rehabil 2004, 85:1168–1173.
9. Shefner JM, Jillapalli D, Bradshaw DY: Reducing intersubject variability in motor unit number estimation. Muscle Nerve 1999, 22:1457–1460.
10. Williams LRT, Sullivan SJ, Seaborne DE, Morelli M: Reliability of individual differences for H-reflex recordings. Electromyogr Clin Neurophysiol 1992, 32:42–49.
11. Merletti R, Lo Conte LR, Sathyan D: Repeatability of electrically-evoked myoelectric signals from the human tibialis anterior muscle. J Electromyogr Kinesiol 1995, 5:67–80.
12. Merletti R, Fiorito A, Lo Conte LR, Cisari C: Repeatability of electrically-evoked EMG signals in the human vastus lateralis. Muscle Nerve 1998, 21:184–193.
13. Pierrot-Deseilligny E, Mazevet D: The monosynaptic reflex: a tool to investigate motor control in humans: interest and limit. Clin Neurophysiol 2000, 30:67–80.
14. Christie A, Lester S, LaPierre D, Gabriel DA: Reliability of a new measure of motoneuron excitability. Clin Neurophysiol 2004, 115:116–123.
15. Christie A, Inglis GI, Boucher JP, Gabriel DA: Reliability of the FCR H-reflex. J Clin Neurophysiol 2005, 22:204–209.
16. Feldt LS, McKee ME: Estimation of the reliability of skill tests. Res Quart 1958, 29:279–293.
17. Kroll W: Reliability of a selected measure of human strength. Res Quart 1962, 33:410–417.
18. Lindquist EF: Design and analysis of experiments in psychology and education. Boston: Houghton Mifflin Company 1956.
19. Weir JP: Quantifying test-retest reliability using the intraclass correlation coefficient and the SEM. J Strength Cond Res 2005, 19:231–240.
20. Hershler C, Milner M: An optimality criterion for processing electromyographic (EMG) signals relating to human locomotion. IEEE Trans Biomed Eng 1978, 5:413–420.
21. Jacobson WC, Gabel RH, Brand RA: Surface vs. fine-wire electrode ensemble-averaged signals during gait. J Electromyogr Kinesiol 1995, 5:37–44.
22. Taylor JL, Butler JE, Gandevia SC: Altered responses of human elbow flexors to peripheral-nerve and cortical stimulation during a sustained maximal voluntary contraction. Exp Brain Res 1999, 127:108–115.
23. Allman BL, Rice CL: Incomplete recovery of voluntary isometric fatigue is not affected by old age. Muscle Nerve 2001, 24:1156–1167.
24. Rutkove SB: Pseudofacilitation: a temperature-sensitive phenomenon. Muscle Nerve 2000, 23:115–118.
25. Magladery JW, Porter WE, Park AM, Teasdall RD: Electrophysiological studies of nerve and reflex activity in normal man. Bull Johns Hopkins Hosp 1951, 88:499–519.
26. Simmons Z, Epstein DK, Borg B, Mauger DT, Kothari MJ, Shefner JM: Reproducibility of motor unit number estimation. Muscle Nerve 2001, 24:467–473.
27. Preston DC, Shapiro BE: Electromyography and Neuromuscular Disorders: Clinical-Electrophysiologic Correlations. Boston: Butterworth-Heinemann 1998.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.