Smoothness metrics for reaching performance after stroke. Part 1: which one to choose?

Background Smoothness is commonly used for measuring movement quality of the upper paretic limb during reaching tasks after stroke. Many different smoothness metrics have been used in stroke research, but a ‘valid’ metric has not been identified. A systematic review and subsequent rigorous analysis of smoothness metrics used in stroke research, in terms of their mathematical definitions and response to simulated perturbations, is needed to conclude whether they are valid for measuring smoothness. Our objective was to provide a recommendation for metrics that reflect smoothness after stroke based on: (1) a systematic review of smoothness metrics for reaching used in stroke research, (2) the mathematical description of the metrics, and (3) the response of metrics to simulated changes associated with smoothness deficits in the reaching profile.
Methods The systematic review was performed by screening electronic databases using combined keyword groups Stroke, Reaching and Smoothness. Subsequently, each metric identified was assessed with mathematical criteria regarding smoothness: (a) being dimensionless, (b) being reproducible, (c) being based on rate of change of position, and (d) not being a linear transform of other smoothness metrics. The resulting metrics were tested for their response to simulated changes in reaching using models of velocity profiles with varying reaching distances and durations, harmonic disturbances, noise, and sub-movements. Two reaching tasks were simulated: reach-to-point and reach-to-grasp. The metrics that responded as expected in all simulation analyses were considered to be valid.
Results The systematic review identified 32 different smoothness metrics, 17 of which were excluded based on mathematical criteria, and 13 more as they did not respond as expected in all simulation analyses. Ultimately, only Spectral Arc Length (SPARC) was found to be a valid metric for both reach-to-point and reach-to-grasp movements.
Conclusions Based on this systematic review and simulation analyses, we recommend the use of SPARC as a valid smoothness metric in both reach-to-point and reach-to-grasp tasks of the upper limb after stroke. However, further research is needed to understand the time course of smoothness measured with SPARC for the upper limb early post stroke, preferably in longitudinal studies. Supplementary Information The online version contains supplementary material available at 10.1186/s12984-021-00949-6.


Introduction
Stroke is one of the main causes of adult disability [7][8][9]. Goal-directed upper limb movements after stroke are characterized by slowness, spatial and temporal discontinuity (i.e., lack of smoothness), and abnormal stereotypic patterns of muscle activation or movement synergies [10,11].
Currently, stroke literature offers several ways for objective measurement of upper limb movement, and standardization is lacking [12,13]. Measuring changes in smoothness during reaching, pointing or grasping using the upper paretic limb is suggested to reflect quality of movement (QoM) early after stroke [5,6]. Smoothness of movement is regarded as the result of 'learned, coordinative processes in sensorimotor control', although the underlying neuronal and mechanical substrates that cause lack of smoothness in motor control are still poorly understood [14,15]. Smoothness is therefore interpreted as a reflection of the level of sensorimotor coordination and movement proficiency [16,17].
Balasubramanian and colleagues defined movement smoothness as continuity or non-intermittency of a movement, independent of its amplitude and duration [6]. Maximizing the smoothness of a movement is considered to be prioritized by the neuro-muscular system, as it reduces the control burden on the brain [18]. Nonetheless, the neurophysiological mechanisms of smoothness deficits after stroke are yet to be understood. Muscle activity patterns observed during reaching after stroke have been shown to be impaired [19]. Smoothness deficits could, for example, be caused by the inability to synchronize motor units or control agonists and antagonists in the right proportions [14,20], or may be due to changes in cortico-spinal tract excitability following stroke [21].
A prerequisite for investigating smoothness deficits after stroke is identifying a 'valid' smoothness metric. Unfortunately, there is currently no commonly accepted metric for quantifying movement smoothness, and many types have been used in the literature to investigate smoothness of reaching movements post stroke [13]. The use of many smoothness metrics in clinical research is limited by several methodological concerns. For instance, some metrics are not clearly described and therefore not reproducible. Other metrics depend on the duration or distance of reaching or are not dimensionless. In both cases, they could be confounded by the shape, i.e., the duration and amplitude, of the movement [16]. Some proposed smoothness metrics are based on position, and do not truly reflect smoothness per se [6,22] as they do not measure the rate of change of position. Furthermore, some metrics are linear transformations of other smoothness metrics, and are therefore proxies of existing metrics. Finally, some metrics lack robustness against measurement noise [6].
Several narrative reviews about smoothness have discussed the strengths and weaknesses of a limited set of available metrics [6,13,14,16]. The relations between these metrics and smoothness were assessed either by using simulation models, or by studying post-stroke correlations with clinical scales. However, these studies reviewed the literature narratively, rather than systematically. Therefore, a comprehensive overview of metrics used to measure smoothness after stroke is lacking. Furthermore, these metrics have not been validated in terms of whether they reflect smoothness [23]. As a result, proper recommendations for a valid smoothness metric are currently lacking in the literature.
Our goal was to identify the most valid metrics for quantifying smoothness of upper paretic limb movement after stroke during reaching tasks [24]. Reaching can be used to extend or point the hand/arms (reach-to-point) or touch or grasp something (reach-to-grasp). To this end, several subsidiary questions were formulated. Firstly, to identify available metrics, we addressed the question 'Which metrics have been used in the literature to assess movement smoothness in reaching by persons with stroke?'. Secondly, we filtered metrics sequentially, using a set of criteria derived from the literature to assess whether their mathematical definitions regarding smoothness were sound [6,14,16]. This was done to answer the question 'Which of the available metrics are mathematically defined, reproducible, not linear transforms of another metric, dimensionless, and defined using the rate of change in position?'. Thirdly, we assessed how each metric responds to smoothness deficits in the reaching task, to answer the question 'How does each smoothness metric respond to a simulated change in the velocity profile of a reaching task?'. In this study, metrics that satisfy the two latter questions can be said to be valid smoothness metrics that have been applied in stroke research.

… [26]. The references of the included articles were scanned for additional suitable articles. The review has been registered in the PROSPERO registry under CRD42020173211.

Metrics mathematically reflecting smoothness
Metrics should reflect the definition of movement smoothness, i.e., the continuity or non-intermittency of the movement profile, independent of its amplitude and duration [6]. Additionally, as smoothness reflects continuity, it should be based on rate of change of position or a higher derivative. Based on the requirements stated in the introduction above, the definition of a metric was not sound if: E1. the metric was not dimensionless, E2. the metric was not reproducible from the literature, E3. the metric was not based on velocity or a derivative of velocity, or E4. the metric was linearly related to another metric by (a) scaling or (b) addition of a constant.

Response of metrics to changes in velocity profile
The response of each metric to four different types of simulated perturbations, applied to two reaching velocity profiles, viz. reach-to-point and reach-to-grasp, was studied. A reach-to-point movement was simulated using a minimal jerk model [27]:

$$v_{mj}(t) = \frac{d_t}{T}\left(30\left(\frac{t}{T}\right)^{2} - 60\left(\frac{t}{T}\right)^{3} + 30\left(\frac{t}{T}\right)^{4}\right)$$

where v_mj is the minimal jerk velocity profile, d_t is the total reaching distance, T is the total movement time and t is the time scale from 0 to T. Using this, a symmetrical velocity profile (v_symm) was created with a d_t of 0.3 m, and a T of 1 s. While this velocity profile reflects a reach-to-point movement, it does not truly reflect reach-to-grasp movements [28], as the latter movements have to account for a higher accuracy when nearing the target position [28]. An initial analysis on healthy subjects showed that an asymmetrical velocity profile (v_asymm) was better suited for this purpose. This was modelled using a polynomial curve (Additional file 1.B). Both velocity profiles are shown in Additional file 1.C, and both were investigated further.
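As a minimal sketch of the symmetric base profile, the standard minimum-jerk polynomial for speed can be sampled as below. The 100 Hz sampling rate and the function name `min_jerk_velocity` are assumptions made for illustration; the asymmetric profile is not reproduced because its polynomial is given only in Additional file 1.B.

```python
def min_jerk_velocity(t, d_t=0.3, T=1.0):
    """Minimum-jerk speed (m/s) at time t for a reach of d_t metres lasting T seconds."""
    tau = t / T  # normalised time in [0, 1]
    return (d_t / T) * (30 * tau**2 - 60 * tau**3 + 30 * tau**4)

# Sample v_symm (d_t = 0.3 m, T = 1 s) at an assumed rate of 100 Hz.
fs = 100
times = [i / fs for i in range(fs + 1)]
v_symm = [min_jerk_velocity(t) for t in times]

peak_speed = max(v_symm)      # reached at t = T/2, equal to 1.875 * d_t / T
distance = sum(v_symm) / fs   # rectangle-rule integral of speed over the movement
```

Integrating the sampled speed recovers the reaching distance, which is a quick sanity check that the profile is scaled correctly.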
Of the four simulated perturbations, the first three are analytical evaluations of the smoothness metrics, and the last one is specifically based on theories regarding recovery of movement after stroke [14].
-Shape Simulation (SS) The movement duration and distance of the base velocity profiles were varied. The smoothness metric must not depend on either of these parameters.
The durations and distances of both velocity profiles were varied from 0.5 to 6.0 s in steps of 0.1 s, and from 0.2 to 0.7 m in steps of 0.01 m. A total of 2856 combinations were used to calculate the outcomes of the metrics. The ranges for movement duration and distance were chosen such that they were within the physiological range of human reaching.
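The parameter grid described above can be enumerated as follows. This is only an illustrative sketch: the variable names are hypothetical, and integer counters are used so that floating-point drift does not change the number of steps.

```python
# Durations 0.5 to 6.0 s in 0.1 s steps, distances 0.2 to 0.7 m in 0.01 m steps.
durations = [round(0.5 + 0.1 * i, 1) for i in range(56)]
distances = [round(0.2 + 0.01 * j, 2) for j in range(51)]

# Every (duration, distance) pair defines one rescaled base velocity profile.
grid = [(T, d) for T in durations for d in distances]
n_combinations = len(grid)  # 56 * 51
```

The count of 56 durations times 51 distances matches the 2856 combinations stated in the text.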
-Harmonic Disturbances (HD) In this analysis, tremor or weak control of reaching movement was simulated using harmonic disturbances added to the base velocity profiles [29]. This included sinusoids with varying amplitude and frequency. The relation between frequency or amplitude and the metric should be monotonic. Smoothness is expected to decrease with increasing amplitude for a given frequency, and also with increasing frequency for a given amplitude.
Sinusoids of frequencies between 2 and 25 Hz in steps of 0.5 Hz, and amplitudes between 0 and 0.2 m/s in steps of 0.005 m/s were added to the base velocity profile. A total of 1927 unique combinations were explored. The ranges chosen were within the physiological ranges of movement [4,30].
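A harmonic disturbance of this kind might be generated as sketched below. The zero starting phase of the sinusoid and the function names are assumptions; the text specifies only the frequency and amplitude ranges.

```python
import math

def min_jerk_velocity(t, d_t=0.3, T=1.0):
    tau = t / T
    return (d_t / T) * (30 * tau**2 - 60 * tau**3 + 30 * tau**4)

def disturbed_velocity(t, freq_hz, amp):
    """Base profile plus a sinusoid of the given frequency (Hz) and amplitude (m/s)."""
    # Assumption: the sinusoid starts with zero phase at movement onset.
    return min_jerk_velocity(t) + amp * math.sin(2 * math.pi * freq_hz * t)

freqs = [2 + 0.5 * i for i in range(47)]          # 2 to 25 Hz in 0.5 Hz steps
amps = [round(0.005 * j, 3) for j in range(41)]   # 0 to 0.2 m/s in 0.005 m/s steps
combos = [(f, a) for f in freqs for a in amps]
```

With 47 frequencies and 41 amplitudes, the grid contains the 1927 combinations reported in the text.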
-Measurement noise (MN) A more robust smoothness metric is less sensitive to measurement noise [6]. The noise was modelled as normally distributed white noise (mean = 0, standard deviation = 1) and added to the base velocity profiles.
The root mean square (RMS) of the noise was varied from 0 to 0.08 m/s in steps of 0.002 m/s. Twenty-five different realizations for each RMS were generated, and the metrics were estimated for each realization. The minimum, maximum, mean and standard deviation of the metrics were calculated and reported. In an additional analysis of noise we filtered the noise-added velocity profile using a zero phase 4th order low pass Butterworth filter with cut off of 20 Hz [6]. The mean of the metric outcome across the 25 realizations after filtering was determined.
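One way to produce noise realizations at a prescribed RMS is sketched below. Rescaling each realization to hit the target RMS exactly is an assumption (the authors may instead have scaled the standard deviation), the seed and the 101-sample length are arbitrary, and the Butterworth filtering step is omitted here.

```python
import math
import random

def scaled_noise(n, target_rms, rng):
    """Zero-mean Gaussian noise of length n, rescaled to the requested RMS (m/s)."""
    raw = [rng.gauss(0.0, 1.0) for _ in range(n)]
    rms = math.sqrt(sum(x * x for x in raw) / n)
    return [x * target_rms / rms for x in raw]

rng = random.Random(0)  # fixed seed so the sketch is repeatable
rms_levels = [round(0.002 * k, 3) for k in range(41)]  # 0 to 0.08 m/s

# 25 realizations per RMS level, each as long as a 1 s profile sampled at 100 Hz.
realizations = {r: [scaled_noise(101, r, rng) for _ in range(25)] for r in rms_levels}

sample = realizations[0.08][0]
sample_rms = math.sqrt(sum(x * x for x in sample) / len(sample))
```

Each realization would then be added to the base velocity profile before the metric is evaluated, and the minimum, maximum, mean and standard deviation taken across the 25 outcomes.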
-Sub-movement Simulation (SMS) A smoothness metric must reflect movement intermittency, and the change in the progressive blending of sub-movements [6,31]. The smoothness metric should therefore decrease monotonically with increasing number of sub-movements and increasing delays between each sub-movement.
This is an extension of previous work applied to a set of metrics [4,14]. The reaching profiles were modelled as a composition of two or more sub-movements, each defined as the base velocity profile with a duration of 1 s. The sub-movements were separated by a varying lag, denoted as Ks. Ks ranged from 0 s, where the sub-movements fully overlap, to 1.2 s, where there was 1.2 s between the starting points of two consecutive sub-movements. The lag was increased in steps of 0.02 s. Note that when the lag was greater than 1 s, there were instances of zero velocity between subsequent sub-movements. The total duration of the movement increased with Ks. Simulations were performed for 2-4 sub-movements.
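The composition of lagged sub-movements can be sketched as a sum of shifted base profiles. Giving every sub-movement the full base amplitude (0.3 m in 1 s) is an assumption, as are the function names.

```python
def min_jerk_velocity(t, d_t=0.3, T=1.0):
    """Minimum-jerk speed, defined as zero outside the movement interval [0, T]."""
    if t < 0 or t > T:
        return 0.0
    tau = t / T
    return (d_t / T) * (30 * tau**2 - 60 * tau**3 + 30 * tau**4)

def composite_velocity(t, n_sub, lag):
    """Sum of n_sub identical sub-movements whose onsets are `lag` seconds apart."""
    return sum(min_jerk_velocity(t - k * lag) for k in range(n_sub))

def total_duration(n_sub, lag, T=1.0):
    """Movement duration grows linearly with the lag Ks."""
    return T + (n_sub - 1) * lag

lags = [round(0.02 * i, 2) for i in range(61)]  # Ks from 0 to 1.2 s in 0.02 s steps
```

For a lag of 1.2 s and two sub-movements, the speed at t = 1.1 s is zero, which corresponds to the arrest period noted in the text for lags greater than 1 s.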

Analysis of the simulations
The responses of each metric to the four different simulated perturbations were individually assessed. For the Shape Simulation and Harmonic Disturbances, the percentage change (%Δ) of the metric from its value estimated using the respective base profile was identified as

$$\%\Delta = \frac{\mathrm{metric}_i - \mathrm{metric}_1}{\mathrm{metric}_1} \times 100$$

where metric_i corresponds to metric values for each combination of parameters in the simulations, and metric_1 is the value for the first combination used. For Shape Simulation, metric_1 corresponded to the smoothness of a base profile with reaching distance 0.2 m and duration 0.5 s. We considered a change of more than 10% as meaningful, and the maximum %Δ was identified.
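The percentage change relative to the first combination can be computed as in the following sketch. Taking the absolute value when searching for the maximum change is an assumption, since the text does not state whether the sign of the change was considered.

```python
def percent_change(metric_i, metric_1):
    """Percentage change of a metric value relative to the reference value metric_1."""
    return 100.0 * (metric_i - metric_1) / metric_1

def max_abs_percent_change(values):
    """Largest absolute change across a simulation; values[0] is the reference."""
    reference = values[0]
    return max(abs(percent_change(v, reference)) for v in values)

# Hypothetical metric outcomes for three parameter combinations.
example = max_abs_percent_change([2.0, 2.1, 1.5])
is_meaningful = example > 10.0  # 10% threshold used in the text
```

In this toy example the third value is 25% below the reference, so the change would be flagged as meaningful.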
For Harmonic Disturbances, metric_1 corresponded to a base profile of reaching distance 0.3 m and duration 1 s. The %Δ was estimated for each combination of frequency and amplitude. Then, a Combinations Exceeded (CE) parameter was marked as the percentage of the combinations that exceeded 10%. A higher value of CE meant that there were more combinations of frequency and amplitude that caused a meaningful change in the value of the metric from its base velocity profile.
For the Measurement Noise simulation, the ratio of signal-to-noise power (SNR) was estimated to quantify the robustness to noise. First, the power of the measurement noise was estimated. Then, the power of the signal was estimated as the power of the base velocity profile with added measurement noise. The lowest RMS of added noise was 0.002 m/s, which corresponds to SNRs of 45.0 dB for v_symm and 45.4 dB for v_asymm. Subsequently, the highest noise RMS added was 0.08 m/s, which corresponded to SNRs of 13.2 dB for v_symm and 13.6 dB for v_asymm. The SNR at which the mean value of the metric differed from the base velocity profile by at least 10% is reported. Metrics that reached a 10% threshold only at a high RMS of added measurement noise, and therefore a low SNR, were deemed to be more robust to noise. On the other hand, metrics that crossed the threshold at lower RMS values, and therefore a higher SNR, were highly sensitive to noise. An SNR threshold to distinguish between high and low robustness was determined using the distribution of the SNR values obtained at the 10% cut-off for each metric. Metrics with an SNR lower than the 25th percentile were considered to have high robustness to noise, and all others were deemed to have low robustness to noise.
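The CE parameter and the SNR can be illustrated with the sketch below. It assumes a 100 Hz sampling rate, uses absolute percentage changes for CE, and computes the expected SNR by treating noise and signal as uncorrelated, so that the power of the noisy signal is the sum of the two powers. Under these assumptions the values for the symmetric profile come out close to the reported 45.0 dB and 13.2 dB.

```python
import math

def min_jerk_velocity(t, d_t=0.3, T=1.0):
    tau = t / T
    return (d_t / T) * (30 * tau**2 - 60 * tau**3 + 30 * tau**4)

def combinations_exceeded(values, baseline, threshold=10.0):
    """Percentage of metric values whose absolute %change from baseline exceeds threshold."""
    changes = [100.0 * (v - baseline) / baseline for v in values]
    return 100.0 * sum(abs(c) > threshold for c in changes) / len(changes)

fs = 100  # assumed sampling rate (Hz)
v_symm = [min_jerk_velocity(i / fs) for i in range(fs + 1)]
p_base = sum(v * v for v in v_symm) / len(v_symm)  # mean power of the base profile

def expected_snr_db(noise_rms):
    """Expected SNR in dB: power of (base profile + noise) over power of the noise."""
    p_noise = noise_rms ** 2
    return 10.0 * math.log10((p_base + p_noise) / p_noise)

ce_example = combinations_exceeded([1.0, 1.05, 1.2, 0.8], baseline=1.0)
snr_low_noise = expected_snr_db(0.002)
snr_high_noise = expected_snr_db(0.08)
```

A single noise realization would give a slightly different SNR because of the cross term between signal and noise, which averages out over repeated realizations.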
Finally, in the Sub-movements Simulations, the change in the direction of the derivative of the metrics for increasing delays was assessed to study monotonicity. All computations were performed using MATLAB (2018b, The Mathworks, Natick, MA, USA).
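A monotonicity check based on the sign of successive differences could look like the sketch below. The `stride` argument is a hypothetical addition that lets the same check be repeated on a coarser lag grid, for example every third sample of a 0.02 s grid to obtain 0.06 s steps.

```python
def is_monotonic(values, stride=1):
    """True if the sampled sequence never changes direction (ties are allowed)."""
    sampled = values[::stride]
    diffs = [b - a for a, b in zip(sampled, sampled[1:])]
    return all(d <= 0 for d in diffs) or all(d >= 0 for d in diffs)

# Hypothetical metric outcomes for increasing lag, with one small reversal.
outcomes = [5.0, 4.0, 4.1, 3.0, 2.0, 1.5, 1.0]
fine_grid = is_monotonic(outcomes)              # evaluated at every lag step
coarse_grid = is_monotonic(outcomes, stride=3)  # evaluated at every third lag step
```

The example shows how a small local reversal breaks monotonicity on the fine grid but disappears on the coarser one.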

Data availability
The MATLAB scripts used to generate the different simulations, the scripts for estimating the smoothness metrics, and the resulting metrics are provided with this manuscript (Additional file 4).

Systematic literature review
A total of 476 unique articles were identified, 102 of which were found to be eligible for inclusion using Rayyan [32]. A total of 32 different metrics (Additional file 1.D, E) were identified. Figure 1 shows the PRISMA flow chart (Additional file 3 reports the PRISMA checklist). Table 1 shows an overview of all metrics identified from the literature, and the ones that did not meet the four exclusion criteria (E1-E4). The metrics identified in the systematic review were classified into categories based on their mathematical definitions. Metrics defined in the time domain were classified as 'Trajectory metrics', 'Velocity metrics', 'Acceleration metrics', or 'Jerk metrics'. Metrics defined in the frequency domain were classified as 'Frequency metrics'. Metrics that did not fit in any of these categories, or fitted in more than one category, were classified as 'Other metrics'.

Metrics mathematically reflecting smoothness
Trajectory-based smoothness metrics: The Index of Curvature (IC) [33] and the standard deviation of the position perpendicular to the movement direction (SD_XY) measured smoothness using only the discrete position information of the reaching movement. As these are not based on the rate of change of position as a function of time, they cannot be used to measure continuity and thereby smoothness of reaching (criterion E3). This holds for any proposed metric that belongs to this category.
Velocity-based smoothness metrics: Of the seven velocity-based metrics, Movement Arrest Period Ratio (MAPR), Speed Metric (SM), Number of Sub-movements (NOS), Velocity Arc Length (VAL) and Correlation Metric (CM) were found to be mathematically sound for measuring smoothness and were used for further analysis.
MAPR is the proportion of time that the movement speed exceeds a given percentage of the peak speed [34]. SM, defined as the mean speed of the whole movement normalized by the peak speed, was found to decrease with the severity of the stroke [14]. Normalized Reaching Speed (NRS) is the ratio of the difference in peak and mean speed over the peak speed [35]. As NRS = 1 − SM, it is a linear transform of the SM metric, and is expected to behave congruently. Therefore, NRS was excluded from further analysis (criterion E4). The definition and mathematical description of the Tent Metric (TM) was incomplete in the study [14], and therefore could not be evaluated further (criterion E2). NOS counts the sub-movements that make up the norm of the velocity profile [36] and has been used to assess smoothness in persons with stroke [37]. VAL [4] is based on the arc length of the speed profile normalized by the peak speed. It assumes that a bell-shaped velocity profile has a shorter arc length than one with velocity fluctuations. CM determines the correlation between the velocity profile extracted from the minimal jerk model and the actual hand velocity profile during reaching [38].
Acceleration-based smoothness metrics: In this category, six metrics were identified, of which Peaks and Inverse Number of Peaks and Valleys (IPV) were analysed further.
Peaks was the most frequently used metric (61 citations). The metric reflects the number of local maxima in the velocity profile for a given movement [39], which is inversely proportional to the smoothness of a movement. Peaks can also be defined as zero crossings in the acceleration domain when the derivative of the acceleration is negative. Peaks were additionally normalized either to the movement duration (NPt) [40] or to the movement distance (NPd) [41]. However, doing so causes the metric to be dependent on movement duration or movement distance. Therefore, these adapted definitions of Peaks (NPt and NPd) were excluded (criterion E1). Smoothness was also estimated using the Number of Valleys [42] or the Number of Valleys and Peaks [43]. Since these definitions are linear transforms of Peaks, they are assumed to show congruent behaviour to Peaks, and were excluded from further analysis (criterion E4). IPV, on the other hand, is not a linear transform of Peaks, and was included in further analysis [44]. Although a few studies employed additional criteria for peak detection [45,46], the choices for these criteria and the differences with Peaks were not explicitly provided, and they were not considered for the present study. The Acceleration Metric (AM) is the ratio between the mean acceleration and the peak acceleration [35]. A point-to-point reaching movement should have zero velocity both at the beginning and end of the movement, which implies that the mean acceleration over this movement must be zero. However, this was not the case in the referenced studies, suggesting that some aspect of its definition is missing [35,47]. According to the textual description, the metric definition is not face-valid, and it was therefore excluded (criterion E2).
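Counting local maxima in a sampled speed profile, as the Peaks metric does, can be sketched as follows. The strict and non-strict comparison used to handle flat segments, the 100 Hz sampling rate, and the 0.8 s lag in the example are choices made for this illustration, not details taken from the cited studies.

```python
def min_jerk_velocity(t, d_t=0.3, T=1.0):
    """Minimum-jerk speed, defined as zero outside the movement interval [0, T]."""
    if t < 0 or t > T:
        return 0.0
    tau = t / T
    return (d_t / T) * (30 * tau**2 - 60 * tau**3 + 30 * tau**4)

def count_peaks(speed):
    """Number of local maxima: samples above the previous and not below the next one."""
    return sum(
        1
        for i in range(1, len(speed) - 1)
        if speed[i - 1] < speed[i] >= speed[i + 1]
    )

fs = 100  # assumed sampling rate (Hz)
single = [min_jerk_velocity(i / fs) for i in range(fs + 1)]

# Two sub-movements whose onsets are 0.8 s apart (total duration 1.8 s).
double = [
    min_jerk_velocity(i / fs) + min_jerk_velocity(i / fs - 0.8)
    for i in range(int(1.8 * fs) + 1)
]

peaks_single = count_peaks(single)
peaks_double = count_peaks(double)
```

The smooth bell-shaped profile yields one peak, while the two partially overlapping sub-movements yield two.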
Jerk-based smoothness metrics: There were a total of 12 different jerk-based metrics, of which only two types of dimensionless squared jerk metrics, DSJt and DSJb, and their respective log transformations, LDSJt, and LDSJb, were further analysed.
Jerk, the third derivative of position, has often been used as a measure of smoothness in different ways; either as the integral of the squared jerk or the integral of the absolute jerk [3,14,16]. Furthermore, the results were scaled using different terms, which introduces a unit to the metric. As smoothness metrics have to be dimensionless (criterion E1), only the dimensionless jerk metrics were considered. Three types of dimensionless squared jerk metrics, DSJt [3], DSJb [4], and DSJm [2], were introduced to measure smoothness. The suffixed letter corresponds to the author's name. These jerk metrics differ in the normalizations used in their definitions. As DSJm is a linear transform of DSJt, it was excluded (criterion E4a). A natural logarithm transform of the DSJb metric was performed to improve its sensitivity (LDSJb) [4]. The same was applied to DSJt, thereby introducing LDSJt [5]. As LDSJb and LDSJt employ the peak velocity and the average velocity respectively in their equations, they are not linear transformations of each other. Rotational Jerk (RJ) measures movement smoothness using the orientations of the wrist during the movement [48]. This form of smoothness quantifies the variability of hand orientation. However, as we analysed changes to a tangential velocity profile, we have no models for the changes in orientation during the reaching movement. Therefore, this metric was not analysed further.
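To illustrate why normalization matters for criterion E1, the sketch below computes a squared-jerk integral scaled by T^3 / v_peak^2. This peak-speed normalization is an assumption in the spirit of the peak-velocity-based variant mentioned above; the exact definitions of DSJt and DSJb are those of the cited references and are not reproduced here. Jerk is approximated by central differences on 1001 samples, which is also an arbitrary choice.

```python
import math

def min_jerk_velocity(t, d_t, T):
    tau = t / T
    return (d_t / T) * (30 * tau**2 - 60 * tau**3 + 30 * tau**4)

def dimensionless_squared_jerk(d_t, T, n=1001):
    """Squared-jerk integral scaled by T**3 / v_peak**2 (assumed normalization)."""
    h = T / (n - 1)
    v = [min_jerk_velocity(i * h, d_t, T) for i in range(n)]
    # Jerk is the second derivative of velocity; central differences at interior points.
    jerk = [(v[i + 1] - 2 * v[i] + v[i - 1]) / h**2 for i in range(1, n - 1)]
    integral = sum(j * j for j in jerk) * h
    return (T**3 / max(v) ** 2) * integral

dsj_base = dimensionless_squared_jerk(0.3, 1.0)
dsj_scaled = dimensionless_squared_jerk(0.6, 2.5)  # longer and slower reach
log_dsj = math.log(dsj_base)  # natural-log transform, as used for the L-prefixed variants
```

For a minimum-jerk profile the analytical value of this quantity is 204.8 regardless of distance and duration, and the numerical estimate stays within about one percent of it for both reaches.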
Frequency-based smoothness metrics: The SPM, SPAL, and SPARC were developed by the same authors [1,4,6], and their values increase as the movement becomes smoother. The SPM measures smoothness as the sum of all peaks in the amplitude-normalized Fourier transform of the velocity profile [1]. The SPAL uses the negative arc length of the amplitude and the frequency-normalized Fourier transform of the velocity profile [4]. The frequency range used in SPAL was further limited in order to define SPARC [6]. Finally, SPMR expresses smoothness using the energy within a 0.2 Hz bin around the dominant frequency in the Fourier transform of the accelerations, normalized by the entire energy [49].
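A spectral arc length computation along these lines can be sketched with NumPy as below. The zero-padding level, the 20 Hz upper cut-off, the 0.05 amplitude threshold for the adaptive cut-off, and the 5 Hz test disturbance are assumptions chosen for illustration; they should be checked against the definition in [6] before any real use.

```python
import numpy as np

def sparc(speed, fs, pad_level=4, fc=20.0, amp_th=0.05):
    """Negative arc length of the normalized magnitude spectrum of a speed profile."""
    speed = np.asarray(speed, dtype=float)
    nfft = int(2 ** (np.ceil(np.log2(len(speed))) + pad_level))  # zero-padded FFT length
    freqs = np.arange(nfft) * fs / nfft
    mag = np.abs(np.fft.fft(speed, nfft))
    mag = mag / mag.max()                      # amplitude normalization
    in_band = freqs <= fc                      # fixed upper cut-off
    f_sel, m_sel = freqs[in_band], mag[in_band]
    above = np.nonzero(m_sel >= amp_th)[0]     # adaptive cut-off: last bin above threshold
    f_sel, m_sel = f_sel[: above[-1] + 1], m_sel[: above[-1] + 1]
    d_freq = np.diff(f_sel) / (f_sel[-1] - f_sel[0])  # frequency normalization
    d_mag = np.diff(m_sel)
    return -float(np.sum(np.sqrt(d_freq**2 + d_mag**2)))

fs = 100.0
t = np.linspace(0.0, 1.0, 101)
v_smooth = 0.3 * (30 * t**2 - 60 * t**3 + 30 * t**4)    # minimum-jerk profile, 0.3 m in 1 s
v_tremor = v_smooth + 0.1 * np.sin(2 * np.pi * 5 * t)   # added 5 Hz disturbance

s_smooth = sparc(v_smooth, fs)
s_tremor = sparc(v_tremor, fs)
s_scaled = sparc(2.0 * v_smooth, fs)
```

Because the spectrum is normalized in amplitude, doubling the speed leaves the value unchanged, whereas the added sinusoid lengthens the spectral arc and therefore lowers the value.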
Other metrics: Kostić and Popović [50] defined a smoothness metric (Combined Smoothness Metric [CSM]) in the context of a drawing task in which a patient, while seated at a desk, draws a pre-defined square. The smoothness metric uses information from the movement velocity and jerk, and consists of four different terms. As the formula uses different dimensions incorrectly, the metric was excluded (criterion E1).

Response of metrics to changes in velocity profile
In the previous section, fifteen metrics were identified as mathematically sound, and therefore subjected to further analysis: NOS, SM, MAPR, VAL, Peaks, IPV, DSJt, LDSJt, DSJb, LDSJb, CM, SPMR, SPM, SPAL and SPARC. Table 2 describes the selected metrics' range of feasible mathematical values obtained for each type of perturbation. The parameters used to interpret the response of metrics to the simulations (%Δ, CE, and SNR) are also shown. Metrics SM, MAPR, IPV, CM, SPM, SPMR, SPAL and SPARC should decrease with decreasing smoothness of movement. However, the other metrics increase with decreasing smoothness. To enable comparison across metrics, we append a * to these latter metrics. This includes NOS*, VAL*, Peaks*, DSJt*, LDSJt*, DSJb*, and LDSJb*.
In this section, we discuss the results of the simulation analyses using v_symm as the base velocity profile. As the changes in the values of the smoothness metrics for v_asymm were similar, their results have been placed in Additional file 1.F. The main difference between using the two base velocity profiles was the magnitude of the resulting values, as shown in Table 2. Where other differences in the response to the simulation analyses were found, they are addressed in the following sections.

Shape simulation (SS)
Figure 2 shows the response of each metric to changes in movement duration and movement distance for the symmetric velocity profile. The percentage of change (%Δ) shows that NOS*, VAL*, SPAL, and SPMR were sensitive to changes in this simulation for both velocity profiles (Table 2). The inconsistencies in the number of sub-movements as measured by the NOS* show that this metric is not suitable as a smoothness metric. Metrics SM, MAPR, Peaks*, IPV, LDSJt*, LDSJb*, CM, and SPM were truly insensitive to changes in this simulation.

Harmonic disturbances (HD)
Figure 3 shows the metric outcomes with added sines of varying frequencies and amplitudes. The algorithm used to estimate NOS* failed to converge to an optimal solution for higher frequencies (missing data in Fig. 3). All other metrics responded as expected in this simulation and showed a lower smoothness outcome as the amplitude of the added sine increased. However, all metrics except SM, MAPR and CM showed lower smoothness outcomes at higher frequencies for the same amplitude. SPAL and SPARC were insensitive to sine disturbances with frequencies higher than 20 Hz, as their definitions include the use of a cut-off frequency. The CE values for NOS*, MAPR, VAL*, and CM are less than 50% (Table 2), suggesting that these metrics are relatively less sensitive to harmonic disturbances, and might not be useful to reflect presence of tremor or weak control of reaching movement.

Measurement noise (MN)
NOS* is only capable of analysing the smoothness at low noise powers up to an RMS of 0.008 m/s (Fig. 4). For higher noise powers, the algorithm that counts NOS* fails to converge to an optimal solution (indicated by N.A. in Table 2 in the SNR column). The other metrics show lower outcomes of smoothness as the RMS of the noise is increased (Fig. 4). MAPR, CM, and SPAL did not cross the 10% threshold for any noise power included in the simulation (unfilled entries '-' in Table 2). This indicates that these metrics are robust to the range of measurement noises added in this study. Peaks*, IPV, and all jerk-based smoothness metrics were very sensitive to measurement noise.

Sub-movements simulation (SMS)
The algorithm used to estimate NOS* calculated incorrect values at certain instances (Fig. 5). This was because the algorithm did not converge to an optimal solution within the provided boundary constraints with increasing number of sub-movements. We found that only the VAL* was truly monotonic to changes in lag between sub-movements (Additional file 1.G). SPMR surprisingly increased with increasing numbers of sub-movements, which shows that the metric fails in this analysis. All other metrics showed a lower outcome for smoothness with increasing number of sub-movements and increasing delay between them. For Peaks* and IPV, a third peak was detected at 0.3 and 0.5 s (Fig. 5). Although non-monotonic overall, the metrics Peaks*, IPV, SPM, SPAL, and SPARC showed jumps only at certain discrete intervals. The CM was seen to be monotonic only if the delay between sub-movements was larger than 0.2 s. Further, when considering increases in delays (Ks) of 0.06 s, the SPAL and SPARC metrics also showed a monotonic change for delays larger than 0.2 s. Furthermore, the monotonicity was influenced by the base velocity profile used for all metrics except VAL*, SPMR, and SPARC (Additional file 1.G).

Table 3 summarizes the simulation analysis results and indicates whether the responses of each metric were as expected. For the measurement noise analysis, the robustness of each metric to added noise was studied. Descriptive statistics of the SNR values as shown in Table 2 were used to divide the metrics into two groups: high and low robustness to measurement noise. Note that a higher added RMS noise value corresponds to a lower SNR value, and hence to greater robustness to noise. We find that only SPARC responded as expected to the Shape Simulation, Harmonic Disturbances, and Measurement Noise simulations. For the Sub-movement Simulation, SPARC responded as expected by showing a monotonic change for increase in delays between sub-movements greater than 0.2 s (20% of sub-movement duration) only when the delay was increased in steps of 0.06 s (6% of sub-movement duration).

Discussion
The aim of this study was to identify valid smoothness metrics to investigate the QoM of the upper paretic limb during reaching tasks by persons with stroke. A smoothness metric used in stroke research was valid if it was mathematically sound, and responded to the simulation analyses as expected. The systematic literature review … [16]. Many metrics were sensitive to reaching distance and duration, or were not found to be useful to reflect presence of tremor or weak control of reaching movement, or were not robust to added measurement noise. We find that almost none of the metrics change monotonically with increasing delay between the sub-movements. Further, we observe in some cases (Table 3) that the reaching task influences the behaviour of the smoothness metric, which was a disadvantage for certain metrics. Our simulation analyses showed that Spectral Arc Length (SPARC) responded favourably in all simulation analyses, for both base velocity profiles, and therefore is a valid metric to measure smoothness of reach-to-point or reach-to-grasp movements post stroke. The simulation analyses performed in this study build on and agree with the trends for the shape, noise, and sub-movement simulations shown in literature [4,6,14,16]. However, this study offers an exhaustive analysis of all available smoothness measures and also offers insight on the influence of added sinusoids.

Clinical relevance
Smoothness is considered a result of learned coordinative processes, and increased motor control results in improved smoothness during reaching, pointing and grasping [5,6,14]. Identifying and using valid smoothness metrics is essential for proper clinical research, and results in accurate observations of the recovery of motor control while improving the identification of true treatment effects on QoM. The present study showed that only SPARC is a valid smoothness metric in spite of the plethora available in the literature.
Neurological recovery occurs spontaneously after stroke and results in normalization of neurological measures such as EEG patterns, whereas behavioural restitution is rather restricted to regaining normal behaviour, not denying that neuronal restitution is taking place [51,52]. Clinical assessments which are most closely related to behavioural restitution and thereby neurological recovery, take into account the ability to perform movements outside the pathologic synergies [53]. Whether smoothness metrics reflect neurological recovery after stroke can be determined by investigating the longitudinal association between clinical outcomes that measure behavioural restitution and smoothness metrics [54]. Furthermore, studying the associations between the recovery of neurological pathways and changes in movement smoothness will reveal the influence of behavioural restitution and compensation on smoothness. Additionally, identifying neurological recovery along with changes in movement smoothness post stroke and eventually the underlying physiology that governs smoothness, will Table 3 Summary of the analysis results 'Yes' means that the metric responded to the perturbations as expected, whereas 'No' means otherwise. 1 There was no instance in the analysis where the metric value crossed the 10% threshold. 2 The metric showed monotonic change for lag values greater than 0.2 s. 3 The metric showed monotonic change when the derivative was estimated using steps of 0.06 s for the lag between sub-movements. 4 The metric was robust to all noise values added in the simulation. + Incomplete data. Metrics provide an indication whether smoothness can be used as a target or outcome measure in training and in designing rehabilitation robotics. In these cases, smoothness measured during reaching in healthy age-and gender-matched individuals can be used as reference values [54]. This study used simulations to offer a systematic analysis of changes to the reaching profiles. 
In the harmonic disturbance analysis, the upper limit of the sinusoidal frequency range tested (25 Hz) was beyond known frequencies in stroke, and therefore covers all potential disturbances [55]. In the noise simulation analysis, the robustness of metrics to added measurement noise was tested. However, if the noise is a result of weak human control, the resulting movement would be less smooth, and this should be reflected by the smoothness metric. Therefore, efforts to distinguish between measurement noise and perturbations due to actual human motion control must be undertaken in order to distinguish abnormal, pathologically reduced movement smoothness from that seen in healthy, age- and gender-matched subjects.

Practical barriers
In order to measure smoothness, the measurement system should be capable of measuring velocity (or a higher derivative) of reaching. Measuring smoothness using motion tracking systems or high-end kinematic measurement sensors is relatively simple using the SPARC metric. However, practical requirements need to be considered when the metric is applied in either a clinical setting or an ambulatory or daily life setting. For ambulatory or daily life settings, metrics that can be estimated using wearable on-body sensors are preferred. Inertial and Magnetic Measurement Units (IMUs) are commonly used as wearable sensors for measuring the kinematics of movement. However, as an IMU measures accelerations, estimating velocity from it would require additional processing and is usually prone to drift [56]. In this study, we measured SPARC using linear velocities [6]. Alternatively, in a recent study, Melendez-Calderon and colleagues suggest that during reaching, SPARC can be measured using angular velocities obtained from IMUs [22]. However, techniques to correct drift due to strapdown integration [56] were not employed in their study, as the authors suggest that it warrants a systematic analysis of the errors introduced in the smoothness estimate [22]. Therefore, if the errors are accounted for, it should be possible to reliably measure SPARC using corrected linear velocities obtained from IMUs for a standardized pre-defined movement with a clear start and end posture. Given the advantages of using IMUs, their validity in measuring QoM after stroke requires further research [57].
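The drift problem mentioned above can be illustrated with a minimal sketch. It is not the processing used in this study or in [22]: the accelerometer bias (0.05 m/s^2), the sampling rate, the reach distance and duration, and the helper names `integrate_acceleration` and `remove_linear_drift` are all assumptions made for illustration. The correction step relies on the assumption that velocity is zero at a clearly defined start and end posture, and simply subtracts the straight line joining the first and last velocity samples.

```python
import numpy as np


def integrate_acceleration(acc, fs):
    """Rectangular (cumulative-sum) integration of acceleration to velocity."""
    return np.cumsum(acc) / fs


def remove_linear_drift(vel):
    """Subtract the straight line joining the first and last samples.

    Assumes the true velocity is zero at the start and end of the movement.
    """
    ramp = np.linspace(vel[0], vel[-1], len(vel))
    return vel - ramp


# Hypothetical recording settings (assumptions for this sketch).
fs = 100.0        # sampling rate in Hz
duration = 2.0    # movement duration in s
distance = 0.3    # reach distance in m
bias = 0.05       # constant accelerometer bias in m/s^2

# Reference movement: minimum jerk velocity profile.
n = int(round(duration * fs))
tau = np.arange(n) / n
true_vel = (distance / duration) * 30.0 * tau**2 * (1.0 - tau) ** 2
true_acc = np.gradient(true_vel, 1.0 / fs)

# A constant bias in acceleration integrates to a linearly growing error.
measured_acc = true_acc + bias
drifting_vel = integrate_acceleration(measured_acc, fs)
corrected_vel = remove_linear_drift(drifting_vel)

print("velocity error at end, uncorrected:", drifting_vel[-1] - true_vel[-1])
print("largest error after correction:", np.max(np.abs(corrected_vel - true_vel)))
```

In this idealized case the uncorrected error grows to roughly the bias multiplied by the movement duration. Real IMU errors also include noise and orientation errors, so a linear correction of this kind should be read as a starting point rather than a validated method.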

Generalizability of current findings
Besides stroke, smoothness is highly relevant for studying the impact of neurological disease in other populations, such as those with Parkinson's and Huntington's disease [16]. For instance, smoothness has been used to study fluidity of movement in the upper limb, reflecting bradykinesia and rigidity in patients with Parkinson's disease [58]. Furthermore, the generalizability of smoothness should be investigated for the lower limb, to differentiate between affected and healthy gait, to examine effects of medication on smoothness, and to identify fall risk [59]. In addition, the level of smoothness is highly relevant in sports as a measure of proficiency [60,61]. The present findings may serve as inspiration for related fields to determine how smoothness varies for the movement task they analyse.

Limitations and future directions
The first limitation of the current review was that it was restricted to smoothness metrics investigated in post-stroke reaching. Additional metrics for measuring movement smoothness could have been identified had our review not been limited to stroke studies. Generalization to other neurological diseases is therefore limited. The same is true for other movement tasks such as rhythmic drinking tasks [62] or self-paced, isolated elbow flexion movements [63]. Secondly, only English language articles were considered for our systematic review. Thirdly, we modelled different reaching tasks with different velocity profiles: reach-to-point or aiming movements with symmetrical velocity profiles based on minimum jerk models [27], and reach-to-grasp movements with an asymmetrical velocity profile based on a polynomial curve [28]. The minimum jerk profile was shown to be a good approximation for reaching in healthy individuals [14,64-69]. The asymmetric profile was modelled by applying a polynomial fit to reach-to-grasp movements in healthy individuals. This fit was found to be better than averaging the reaching profiles from the healthy individuals (Additional file 1.B). However, a true measure of smoothness should not be influenced by the movement profile.
Fourthly, the sub-movement analysis shows that a minimum detectable change in smoothness as measured by SPARC reflects a change in delay between sub-movements of at least 6% of the sub-movement duration. Furthermore, as the metric is non-monotonic for delays of less than 20% of the duration of a sub-movement, it should be used with caution when studying differences in smoothness amongst fully recovered or healthy individuals. This needs to be considered when studying populations with good recovery. Finally, smoothness metrics such as RJ are based on rotational movements and had to be rejected as they could not be tested with the current simulations. As QoM is studied by comparing task performance with normative values, CM could have been a suitable metric [70]. It is defined using correlation with a minimum jerk profile, and it might be interesting to consider a CM measure that uses correlation with a velocity profile that models the reaching task. However, in our analysis, we saw that the metric might not be useful in measuring tremor or weak control of reaching movement. Additionally, the need for prior knowledge of the intended reaching task is a major drawback of the metric.
Although our simulations mimicked features of reaching in persons with stroke, such as varying duration or distance, and sub-movement segmentation [11], they cannot truly replace actual reaching by subjects who have suffered a stroke. Moreover, longitudinal studies of patterns of smoothness metrics in patients early post stroke will show how sensitive the smoothness metric is to change over time and how these values relate to values measured in healthy age- and gender-matched subjects. We performed this analysis in our companion paper [71], where SPARC was seen to be responsive to change over time in the early phase post stroke and longitudinally associated with clinical measures of motor impairment within subjects.
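The kind of simulation discussed in this section can be sketched as follows. This is an illustrative reconstruction, not the code used in this study. The function names, the sampling rate, and the SPARC settings (zero-padding level 4, 10 Hz cut-off, amplitude threshold 0.05) are assumptions chosen for the sketch. It builds a minimum jerk velocity profile, composes a reach from two identical sub-movements separated by a lag, and computes SPARC as the negative arc length of the normalized Fourier magnitude spectrum of the speed profile.

```python
import numpy as np


def min_jerk_velocity(distance, duration, fs):
    """Minimum jerk speed profile: v = (D/T) * 30 * tau^2 * (1 - tau)^2."""
    n = int(round(duration * fs))
    tau = np.arange(n) / n
    return (distance / duration) * 30.0 * tau**2 * (1.0 - tau) ** 2


def two_submovements(distance, sub_duration, lag, fs):
    """Sum of two identical minimum jerk sub-movements, the second delayed by `lag` s."""
    sub = min_jerk_velocity(distance / 2.0, sub_duration, fs)
    shift = int(round(lag * fs))
    total = np.zeros(shift + len(sub))
    total[: len(sub)] += sub
    total[shift:] += sub
    return total


def sparc(speed, fs, padlevel=4, fc=10.0, amp_th=0.05):
    """Spectral arc length of a speed profile (parameter values are assumptions)."""
    # Zero-padded FFT magnitude, normalized by its maximum (makes it dimensionless).
    nfft = int(2 ** (np.ceil(np.log2(len(speed))) + padlevel))
    freq = np.arange(nfft) * fs / nfft
    mag = np.abs(np.fft.fft(speed, nfft))
    mag = mag / mag.max()

    # Keep frequencies up to the fixed cut-off fc.
    low = freq <= fc
    freq_sel, mag_sel = freq[low], mag[low]

    # Adaptive cut-off: keep the range between the first and last
    # spectral sample whose magnitude reaches the threshold.
    above = np.nonzero(mag_sel >= amp_th)[0]
    freq_sel = freq_sel[above[0] : above[-1] + 1]
    mag_sel = mag_sel[above[0] : above[-1] + 1]

    # Negative arc length of the spectrum over the normalized frequency axis.
    df = np.diff(freq_sel) / (freq_sel[-1] - freq_sel[0])
    dm = np.diff(mag_sel)
    return -np.sum(np.sqrt(df**2 + dm**2))


fs = 100.0
single = min_jerk_velocity(distance=0.3, duration=1.0, fs=fs)
segmented = two_submovements(distance=0.3, sub_duration=1.0, lag=1.0, fs=fs)

print("SPARC, single minimum jerk reach:", sparc(single, fs))
print("SPARC, two sub-movements, lag = 1.0 s:", sparc(segmented, fs))
```

With these settings the segmented reach yields a more negative (less smooth) value than the single reach, and scaling the speed profile leaves the value unchanged because the spectrum is normalized. The lag used here is deliberately large; as noted above, the metric is non-monotonic for short lags, so small differences in lag should not be interpreted from a sketch like this.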

Conclusion
We recommend the use of SPARC as a valid metric to measure the smoothness of upper limb reaching after stroke. Further longitudinal studies are required to understand the relationship between the time course of recovery and smoothness early post stroke.