Validity and reliability of an accelerometer-based assessgame to quantify upper limb selective voluntary motor control

Introduction Current clinical assessments measure selective voluntary motor control (SVMC) on an ordinal scale. We introduce a playful, interval-scaled method to assess SVMC in children with brain lesions and evaluate its validity and reliability. Methods Thirty-one neurologically intact children (median [1st-3rd quartile]: 11.6 years [8.5–13.9]) and 33 patients (12.2 years [8.8–14.9]) affected by upper motor neuron lesions with mild to moderate impairments participated. Using accelerometers, they played a movement tracking game (assessgame) with isolated joint movements (shoulder, elbow, lower arm [pro−/supination], wrist, and fingers), yielding an accuracy score. Involuntary movements were recorded simultaneously and resulted in an involuntary movement score. Both scores were normalized to the performance of 33 neurologically intact adults (32.5 years [27.9; 38.3]), which represented physiological movement patterns. We correlated the assessgame outcomes with the Manual Ability Classification System, Selective Control of the Upper Extremity Scale, and a therapist rating of involuntary movements. Furthermore, a robust ANCOVA was performed with age as covariate, comparing patients to their healthy peers at the age levels of 7.5, 9, 10.5, 12, and 15 years. Intraclass correlation coefficients and smallest real differences indicated relative and absolute reliability. Results Correlations (Kendall/Spearman) for the accuracy score were τ = 0.29 (p = 0.035; Manual Ability Classification System), ρ = − 0.37 (p = 0.035; Selective Control of the Upper Extremity Scale), and ρ = 0.64 (p < 0.001; therapist rating). Correlations for the involuntary movement metric were τ = 0.37 (p = 0.008), ρ = − 0.55 (p = 0.001), and ρ = 0.79 (p < 0.001), respectively. 
The robust ANCOVAs revealed that patients performed significantly poorer than their healthy peers in both outcomes and at all age levels except for the dominant/less affected arm, where the youngest age group did not differ significantly. Robust intraclass correlation coefficients and smallest real differences were 0.80 and 1.02 (46% of median patient score) for the accuracy and 0.92 and 2.55 (58%) for involuntary movements, respectively. Conclusion While this novel assessgame is valid, the reliability might need to be improved. Further studies are needed to determine whether the assessgame is sensitive enough to detect changes in SVMC after a surgical or therapeutic intervention.


Introduction
Patients with upper motor neuron lesions, for example those affected by cerebral palsy (CP), traumatic brain injury, or stroke, often exhibit multiple symptoms contributing to their disability. These symptoms can be classified as either positive or negative motor signs. Positive signs are characterized by an increased frequency and amplitude of involuntary muscle activation, whilst patients with negative motor signs exhibit insufficient muscle activity or impaired control [1]. While negative motor signs might contribute more to a child's disability [2,3], they are also more difficult to measure [1]. The importance of these negative motor signs, especially selective voluntary motor control (SVMC), as a predictor of gross motor function has been demonstrated in children [4,5].
SVMC has been defined as the "ability to isolate the activation of muscles in a selected pattern in response to demands of a voluntary posture or movement" [1]. SVMC develops during childhood and might decline later in life. For example, neurologically intact children (NIC) up to the age of 10 years can display mirror movements when performing tasks with the upper extremities [6]. Mirror movements have also been observed in adults aged 50-80 years [7]. The manifestation of reduced SVMC in the form of observable involuntary movements, however, is task dependent [8]. Hence, it is essential to interpret results of patients with neurological disorders performing SVMC assessments in the context of the performance of not only young, neurologically intact adults (NIA) but also age-matched healthy peers.
With regard to the upper extremity movements of patients, reduced SVMC has been shown to impact activities of daily living in children with CP [9,10]. However, a clinical tool to solely assess reduced SVMC of the upper extremities has only recently become available [11]. The Selective Control of the Upper Extremity Scale (SCUES) [12] evaluates SVMC by letting patients perform isolated joint movements three times and assessing simultaneously occurring mirror movements (movements of the contralateral joint), movements of the trunk, or movements of any other joint apart from the target joint. In addition, if the patient displays an active range of motion that is smaller than the passive one, SVMC is also rated as impaired. Each target joint is tested and rated on a 4-point ordinal scale ranging from 0, no selective motor control, to 3, normal selective motor control.
Despite the SCUES being valuable to rate SVMC without much equipment, a drawback of the ordinal scale might be a lack of sensitivity to detect change occurring after interventions. Furthermore, rating the extent of involuntary movements and the active range of motion of the target joint can be somewhat subjective.
Here we evaluate the validity and reliability of a novel, interval-scaled assessment game (assessgame) that uses accelerometers to quantify SVMC of the upper extremities objectively.

Participants
The aim was to recruit 30 NIA between the ages of 18 and 50 years by convenience sampling, as well as 30 NIC aged 6 to 18 years by quota sampling. Because involuntary movements are reported more frequently for younger age categories [6,13,14], we aimed to recruit more participants in the age range of 6 to 10 years.
Patients with a diagnosed upper motor neuron lesion, aged between 6 and 18 years, with the ability to understand and follow simple instructions and sit upright for 1 h with backrest support, were included. Exclusion criteria were treatment with Botulinum toxin or any surgical intervention of the upper extremities in the past 6 months. We recruited in- and outpatients of the Swiss Children's Rehab, Affoltern am Albis, by convenience sampling.
All participants were characterized by using descriptors of age, gender, and handedness, defined by which hand is used to write/draw. We additionally recorded the diagnosis and more affected (or non-dominant) hand, determined by an occupational therapist.
All methods were in accordance with the necessary guidelines and approved by the ethical committee of the canton of Zurich, Switzerland (PB_2016_01843). The participant and/or the legal guardian gave written informed consent.

Assessgame
The assessgame was created in collaboration with Reha-Stim Medtech AG (previously YouRehab, Schlieren, Switzerland). Its technical features and an in-depth description can be found in a separate methods paper [15]. In short, the assessgame uses accelerometers and bend sensors (cyber-gloves for fingers) to capture target joint movements and (potentially) simultaneously occurring involuntary movements. The goal is to move the target joint in an isolated manner to steer the avatar on a star-studded, predefined path. The path challenges players within 90% of their active range of motion (Fig. 1a), calibrated before each movement. Accelerometer sensors were applied proximally (reference sensor) and distally of the joints (Fig. 1c) to ensure that compensatory movements had no influence on the avatar's path.
The assessgame provides two outcome measures, one for the accuracy of the target joint movement and one combined value for the involuntary movements. Further information on the exact algorithm used to generate these outputs can also be found in the methods paper [15]. In summary (Fig. 2), for the target joint accuracy score, the avatar position was calculated relative to the calibrated active range of motion. The absolute difference between the avatar position and the predefined path was calculated and divided by the standard deviation (SD) of the NIA around the path. This latter step allows deviations from the predefined path to be interpreted in relation to the difficulty of the trajectory (as in more difficult sections of the path, the SD of the NIA will be larger). Finally, we calculated an average accuracy score of the target joint. The score for the involuntary movements was calculated similarly. From the accelerometer data, joint angles were calculated, which were used to calculate the change in joint angle per time unit (derivative). The derivatives of the NIA served as reference path for all participants. The absolute difference between the derivative of the participant's path and the reference path was then divided by the SD of the NIA around the reference path. The same procedure, albeit without calculating joint angles, was used for the finger data generated by the bend sensors. The average of each joint was then averaged with the rest of the joints, resulting in the involuntary movement score.
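The two scores described above can be illustrated with a minimal sketch. It is not the algorithm of the methods paper [15]: filtering and calibration are omitted, the forward-difference derivative is an assumption, and all function names and numbers are hypothetical values chosen for illustration only.

```python
from statistics import mean


def standardized_error(values, reference, reference_sd):
    """Mean absolute deviation from the reference, in units of the adult (NIA) SD."""
    if not (len(values) == len(reference) == len(reference_sd)):
        raise ValueError("all sequences must have the same length")
    return mean(abs(v - r) / sd for v, r, sd in zip(values, reference, reference_sd))


def derivative(angles, dt):
    """Change in joint angle per time unit (simple forward difference, an assumption)."""
    return [(b - a) / dt for a, b in zip(angles, angles[1:])]


# Target joint accuracy: positions in % of the calibrated active range of motion.
avatar = [50.0, 62.0, 70.0, 64.0]   # hypothetical avatar positions
path = [50.0, 60.0, 75.0, 70.0]     # hypothetical predefined path
nia_sd = [2.0, 4.0, 5.0, 3.0]       # hypothetical adult SD around the path
accuracy_score = standardized_error(avatar, path, nia_sd)

# Involuntary movements: one non-target joint, angles in degrees (hypothetical).
angles = [10.0, 11.0, 13.0, 13.0]
joint_derivative = derivative(angles, dt=0.5)
nia_derivative = [0.0, 1.0, 0.0]    # hypothetical adult reference derivative
nia_derivative_sd = [1.0, 1.5, 0.5]
joint_score = standardized_error(joint_derivative, nia_derivative, nia_derivative_sd)

# The per-joint averages are averaged again across joints (second value is hypothetical).
involuntary_score = mean([joint_score, 0.5])

print(accuracy_score, involuntary_score)
```

Dividing by the adult SD at each point of the path is what makes a given deviation count less in sections where even adults scatter widely.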

Set up and procedure
Participants performed the German version of the SCUES [16] before starting with the assessgame. Table and chair height were adjusted such that the participant's hips, knees, ankles, and elbows were at a 90-degree angle.
Playing with the upper extremities, the avatar is steered by abducting and adducting the shoulder (with a 90-degree flexed elbow), flexing and extending the elbow (against gravity, vertical upper arm), wrist and fingers (both supported on a firm cushioning with pronated lower arms), and by pronating/supinating the lower arm on a table. The sensors were attached with Velcro straps; their positions are displayed in Fig. 1c. When playing with the shoulder and elbow joint, the participant's arms were unsupported; for the other joints the table and cushioning were used, allowing for easier movement execution and minimizing the impact of the sensors on performance.
After receiving verbal instructions on the game (e.g., follow the star-studded path and only move the target joint), participants played three trial rounds, familiarizing themselves with the different steering mechanisms, using the following joints: fingers, forearm, and either elbow or shoulder. Since pilot-testing had revealed that playing the game with pronation/supination of the lower arm was less intuitive, participants were asked to train that movement in the trial rounds. Playing with the fingers familiarized the participant with the cyber-gloves' bend sensors. Finally, the participants could decide if they preferred playing with the elbow or the shoulder joint because the steering mechanisms are the same and analogous to the wrist. After three trial rounds, the participant played with all 10 target joints in a randomized order to account for learning effects. While the participant was playing, the therapist noted any involuntary movements occurring. The possible descriptors were defined as mirror movements (movements in the contralateral, homologous joint), movements of any other joints, and of the trunk. The therapist could note any combination of descriptors. Administering the assessment took around 25 to 35 min, including the breaks that were allowed to avoid fatigue.
Fig. 1 (partial caption) Steering the owl, the goal is to follow the star-studded target path as accurately as possible without any involuntary movements. c Placement of the accelerometer sensors proximal and distal of all target joints. Abbreviations: sec = seconds
To evaluate the reliability of the measurements, the assessgame was repeated 1 to 3 days later with inpatients and 7 days after the initial measurement with outpatients.

Measurement tools
The assessgame measures target joint accuracy and involuntary movements occurring. Both metrics are measured on an interval scale where a score of zero is the theoretically possible perfect score and increasing values indicate a greater deviation from the target paths.
The involuntary movements that the therapist noted (henceforth called therapist opinion) were converted into a sum score for every target joint that was played. The score simply summed all descriptors the therapist noted. Therefore, a score of zero meant that the participant showed no involuntary movements while performing the target joint movement and a score of 3 indicated that all penalized involuntary movements were displayed (mirror and trunk movements and also movements of any additional joints).
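As a small illustration of this sum score, the sketch below counts the distinct descriptors noted for one target joint. The descriptor labels and the function name are hypothetical; only the three categories and the 0 to 3 range come from the text.

```python
# Hypothetical labels for the three descriptors the therapist could note.
DESCRIPTORS = {"mirror", "other_joint", "trunk"}


def therapist_opinion_score(noted):
    """Sum score for one target joint: number of distinct descriptors noted (0 to 3)."""
    noted = set(noted)
    unknown = noted - DESCRIPTORS
    if unknown:
        raise ValueError(f"unknown descriptors: {sorted(unknown)}")
    return len(noted)


print(therapist_opinion_score([]))                     # no involuntary movements
print(therapist_opinion_score(["mirror", "trunk"]))    # two descriptors noted
```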
The Manual Ability Classification System (MACS) [17] was validated in children with CP and classifies how they handle objects in daily activities. Patients with level 1 handle objects easily and successfully whereas level 5 indicates that patients do not handle objects at all [17]. Medical professionals in our rehabilitation center routinely assess the MACS level in children with CP. For this study, they also classified children with other diagnoses.
Fig. 2 Visualization of the assessgame's data analysis steps. The assessgame splits selective voluntary motor control (SVMC) into target joint accuracy and involuntary movements. We visualized the algorithm analyzing the raw accelerometer data resulting in standardized error scores for both outcome metrics. For the target joint, the numbers between 0 and 100 reflect the percentage joint position relative to the calibrated active range of motion. The involuntary movements were analyzed by first calculating the actual joint angle and then the derivative to quantify changes in position. This was done so that patients who were unable to maintain the starting position were not penalized. Finally, the standardized error expresses how many adult standard deviations the player was away from either the target path or the adult mean (involuntary movements) on average. The data were filtered using a 6th order Butterworth zero-phase low-pass filter with a cutoff frequency of 1.5 Hz (normalized cutoff frequency of 0.045). Reprinted from 'First validation of a novel assessgame quantifying selective voluntary motor control in children with upper motor neuron lesions' [15]
The SCUES [12] was also validated in children with CP and evaluates SVMC on a four-point ordinal scale, for each target joint separately. A score of zero means that there is no observable SVMC and a score of 3 that the participant performs the desired movement over the entire range of motion without displaying any involuntary movements. The SCUES tests the shoulder, elbow, lower arm, wrist, and fingers analogous to the assessgame with one exception: elbow flexion and extension are measured while the arm is in a horizontal position, thus mitigating the effect gravity has on performance.
Neither the MACS nor the SCUES has been validated in patients with upper motor neuron lesions other than CP. Therefore, the MACS and SCUES values of patients with other diagnoses should be considered to approximate the handling of objects in daily life and SVMC, respectively.
Before statistically analyzing the data, missing data points due to sensor errors were imputed with the mean value resulting from multiple imputation by chained equations. For detailed information we refer to our methods paper [15]. Convergent validity was tested by correlating the accuracy and involuntary movement metrics of the assessgame with the MACS, SCUES and therapist opinion for each individual joint, for the average score for the less/more affected side, and for the combined average of all joints. Kendall's tau-b [23] was chosen as correlation coefficient because it is specifically designed to handle ties in the data, of which there had to be many given the few levels the ordinal scales provide. In addition to tau-b, we calculated Spearman's rank correlation coefficient for the average scores per limb, because the sum scores were expected to show more dispersion, hence warranting both correlation types. For the total involuntary movement score, we expected a high, positive Spearman correlation (0.7 ≤ ρ < 0.9) with the therapist opinion, a moderate, negative correlation (− 0.5 ≥ ρ > − 0.7) with the SCUES, and a low, positive correlation (0.3 ≤ ρ < 0.5) with the MACS. Total target joint accuracy correlations were expected to be moderate and positive for the therapist opinion, low and negative for the SCUES, and low and positive for the MACS.
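To make the handling of ties concrete, the following sketch computes Kendall's tau-b directly from its pairwise definition. It illustrates the coefficient, not the software used in this study; the MACS levels and scores are invented example values.

```python
from itertools import combinations
from math import sqrt


def kendall_tau_b(x, y):
    """Kendall's tau-b: pairs tied in x or y shrink the denominator."""
    concordant = discordant = ties_x = ties_y = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue            # tied in both variables: counted nowhere
        elif dx == 0:
            ties_x += 1         # tied only in x
        elif dy == 0:
            ties_y += 1         # tied only in y
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    untied = concordant + discordant
    denominator = sqrt((untied + ties_x) * (untied + ties_y))
    if denominator == 0:
        raise ValueError("tau-b is undefined when one variable is constant")
    return (concordant - discordant) / denominator


# Invented example: ordinal MACS levels (many ties) against an interval-scaled score.
macs = [1, 1, 2, 2, 3, 3]
score = [0.8, 1.2, 1.1, 1.9, 2.5, 2.2]
tau = kendall_tau_b(macs, score)
print(round(tau, 3))
```

With only a handful of ordinal levels, many pairs are tied in the clinical scale, which is why the tie-corrected denominator matters for data of this kind.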
Discriminative validity was tested by comparing the NIC to the patient group for both assessgame metrics. This was done for the average scores of the less/more affected sides and the combined average scores of all joints. A robust, bootstrapped ANCOVA, as described by Mair and Wilcox [24], was used, entering age as a covariate. This method of analysis allows for robust comparisons at distinct levels of the covariate, in our case at the age levels 7.5, 9, 10.5, 12, and 15 years. The number of bootstrap samples was set to 2,000 and the data were not trimmed. Bootstrapped 95% confidence intervals were adjusted for the multiple comparison points. The span parameter (defining model flexibility) was set as low as possible (greater flexibility) but such that the group sizes at the comparison levels were at least 12, as suggested by Mair and Wilcox [24].
Due to the heterogeneous patient population, we used a conservative approach to determine the smallest real difference as a measure of absolute reliability. We divided the difference between the 97.5th and 2.5th percentiles by 2 and bootstrapped this metric 2,000 times. The upper limit was then taken as a robust estimate of absolute reliability (resembling the Bland-Altman approach for normally distributed data).
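A minimal sketch of this percentile-based estimate follows. Two points are assumptions rather than statements from the text: the percentiles are taken over test-retest differences, and the "upper limit" is read as the 97.5th percentile of the bootstrap distribution. The difference values are invented.

```python
import random


def percentile(data, q):
    """Percentile with linear interpolation between order statistics, q in [0, 100]."""
    ordered = sorted(data)
    position = (len(ordered) - 1) * q / 100
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (position - lower)


def half_range(differences):
    """Half the distance between the 97.5th and 2.5th percentiles."""
    return (percentile(differences, 97.5) - percentile(differences, 2.5)) / 2


def smallest_real_difference(differences, n_boot=2000, seed=0):
    """Upper limit (assumed: 97.5th percentile) of the bootstrapped half range."""
    rng = random.Random(seed)
    n = len(differences)
    boot = [
        half_range([rng.choice(differences) for _ in range(n)])
        for _ in range(n_boot)
    ]
    return percentile(boot, 97.5)


# Invented test-retest differences (second minus first measurement).
diffs = [-1.2, -0.4, -0.1, 0.0, 0.2, 0.3, 0.5, 0.9, 1.4, -0.7]
srd = smallest_real_difference(diffs)
print(half_range(diffs), srd)
```

Taking the upper limit of the bootstrap distribution rather than its center is what makes the estimate conservative: it cannot be smaller than most of the resampled half ranges.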

Results
A total of 33 NIA and 31 NIC participated and served as reference groups. Of the 41 patients who gave informed consent, 8 dropped out of the study, 4 due to the severity of their disability and 4 due to compliance issues. Of the remaining 33 patients, only 23 were available for a second (reliability) assessment. Their characteristics are listed in Table 1. Patients participating in the validity part were similar to the NIC with regard to median age and distribution, but the gender proportions differed noticeably. For all children with upper motor neuron lesions except one, the dominant arm was also the less affected one. Twenty patients (61%) had a diagnosis where one side was acknowledged as being more affected.

Convergent validity
In general, correlations of the assessgame metrics were stronger with therapist opinion than with the MACS and SCUES, especially for the involuntary movement metric, where they were high (Table 2). The MACS and the SCUES showed similar correlation coefficients, which were small for the assessgame accuracy score and moderate for the involuntary movement score. Furthermore, correlations were higher for the more affected arm compared to the less affected arm, for the averaged scores of all joints compared to the individual joints, and for the involuntary movement metric compared to the target joint accuracy.

Discriminative validity
The results for the discriminative validity tests can be found in Table 3. After correcting for multiple testing within each metric and side, the groups differed significantly at all levels, except for the young age categories of the less affected/dominant side. Figure 3 displays the results across all levels of the covariate. Both patients and NIC improved with age. The slope of improvement, however, was less pronounced in patients. The involuntary movement score for the patients' less affected side even worsened slightly with age.

Reliability
Relative and absolute reliability are displayed in Table 2. The ICCs ranged from good to excellent for all averaged values, whereas the ICCs for the individual joints are mostly moderate. Sporadically some individual joints had poor or good relative reliability. Smallest real differences in relation to the median performances were smaller for the accuracy score than for the involuntary movement score and larger for the more affected arm than the less affected one. For the individual joints of the involuntary movement score, smallest real differences were mostly larger than the median of the patients' scores. For the mean scores, the smallest real differences ranged from approximately 42 to 122% of the median of patient scores.

Discussion
This study assessed the validity and reliability of a novel accelerometer-based assessgame. Convergent and discriminative validity results indicate that the assessgame is valid and can discriminate between patients with upper motor neuron lesions and NIC. Relative reliability was good to excellent for the averaged scores but only moderate for individual joints. Absolute reliability, however, expressed as smallest real difference, ranged from 42 to 122% of the median of patient scores, indicating that the assessgame might not be as sensitive as predicted.

Convergent validity
Correlations of the assessgame with the therapist opinion were clearly the highest, as was hypothesized. The obvious reason is that as an involuntary movement occurs, the therapist notes it. Correlation coefficients might even have been higher if frequency and intensity of the involuntary movements had been graded too, as is done for other assessments, for example the Zurich Neuromotor Assessment [27]. The correlations with the SCUES were lower compared to the therapist opinion. A possible explanation for that may be attention modulation. During the SCUES, patients are asked to focus specifically on suppressing involuntary movements. Patients can even receive a second or third try if the assessor believes a better result is achievable. When playing the assessgame, patients were also asked to perform the target movement and suppress involuntary movements but were then left to play the game. Focusing on playing the game can be considered an external focus of attention, which may lead to playing without thinking about controlling involuntary movement. It has been shown in NIA [28] as well as NIC [29] that guiding attention to the involuntary movements (internal focus) leads to improved inhibition. The same can be said for children with CP, albeit suppression was possible to a lesser extent [30]. Conversely, studies have demonstrated that computer games, in simplicity comparable to our assessgame, can have a distracting effect [31,32], which might further divert attention from suppressing involuntary movements. The fact that correlations are higher for the involuntary movement metric compared to the accuracy metric could mirror the fact that both the SCUES and the therapist opinion are designed to capture involuntary movements. The correlations are, however, not very far apart, indicating that patients who showed more involuntary movements while playing also followed the target path less accurately.
Another observable pattern is that correlations within the involuntary movement metric are greater for the nondominant arm. It has been shown that children and adolescents affected by CP exhibit more mirror movements in their less affected hand when performing movements with their more affected hand [30,33]. Even though patients with stroke did not show increased EMG activation on the contralateral side, they did exhibit increased ipsilateral muscle activation [34]. In line with those results, Sukal et al. [35] found increased ipsilateral joint coupling for patients with hemiplegic CP. Patients in our study also exhibited more involuntary movements when playing with their more affected side (Fig. 3) and thus there was a clearer separation between scores, which might lead to higher rank correlations with therapist opinion.
Table 3 Results of the robust ANCOVAs comparing patients to their peers at predetermined covariate levels for both assessgame metrics with bootstrapped 95% confidence intervals. The span parameter was set to 1 for the patient group and 0.7 for their peers. Abbreviation: 95%-CI = bootstrapped 95% confidence interval, corrected for multiple testing

Discriminative validity
The fact that NIC improve in a more complex motor task whilst displaying fewer involuntary movements with older age has been demonstrated before [6,36] and is associated with the maturation of the corticospinal tract [37]. Expanding on that, Rosenbaum et al. [38] found that children with more severe CP show a less pronounced improvement and an earlier leveling off in gross motor function, which is in line with our assessgame results. These facts explain why the assessgame performance difference between patients and their healthy peers grows with age. The finding that the groups were not always significantly different at a young age highlights the importance of a reference group. A worse performance in such motor tasks and showing more involuntary movements may be physiological at younger ages.

Reliability
Relative reliability of the averaged scores was in the range that was seen in other studies investigating SVMC [39,40]. Keeping in mind that we chose a conservative way of estimating the smallest real difference, the resulting values still seemed rather large. In part, this might be due to factors such as motivation, time of day when testing, which was not standardized, and fatigue from therapy sessions. It needs to be determined whether this assessgame is sensitive enough to detect changes in SVMC after a surgical or therapeutic intervention.

Methodological considerations
An important consideration of SVMC measures in general is that certain movements are performed against gravity, which could also make strength a relevant factor. By letting patients play the game with their active range of motion and not their passive one, the idea was to minimize the influence of strength on the assessgame outcomes. This, however, needs to be confirmed with further research. Furthermore, the pro−/supination movement could have been visualized differently. We decided to keep it consistent over all movements, letting the avatar be steered upwards and downwards. On the one hand, this ensured that participants did not have to be familiarized with multiple visualizations. On the other hand, this seemed unintuitive for most participants, which likely caused an additional cognitive demand when controlling the game with this movement.
Fig. 3 Assessgame outcomes for patients and healthy peers by age. Robust ANCOVAs using running interval smoothers (means without trimming) compared patients to their peers at predetermined levels (dashed lines) of the covariate age (see Table 3 for exact numbers). The span parameter was set to 1 for the patient group and 0.7 for their peers
Moreover, the heterogeneity of neurological conditions included in this study needs to be addressed. Even though the group is representative of the patient population in our rehabilitation center, the different conditions (e.g., congenital versus acquired brain lesions) might introduce more variability to the data. However, even a group as 'specific' as children with unilateral CP can show very different joint torque coupling patterns, depending on whether the brain injury occurred pre-, peri-, or postnatally. The coupling patterns of children affected by postnatal injuries, for example, resemble those of adults with stroke [35]. This indicates that even children with the same diagnosis may vary strongly in the patterns of involuntary movements they exhibit. Statistically, we used a robust bootstrapping technique to capture this as best as possible.
Lastly, the size and quantity of sensors should be reduced for future clinical applications. It is conceivable to measure the same metrics with only 7 considerably smaller sensors.

Conclusion
In conclusion, the presented assessgame is valid and shows good relative reliability. We expect that absolute reliability needs to be improved to sensitively measure changes stemming from interventions aiming to improve SVMC.