From: Robot-aided assessment of lower extremity functions: a review

Property | Definition | Measure |
---|---|---|
Reliability | Consistency of the results obtained on repeated administrations of the same test by the same person (intra-rater or test-retest) or by different people (inter-rater). | Intraclass correlation coefficient (ICC): derived from ANOVA statistics as between-subjects variance / (between-subjects variance + error variance); six different computational methods are possible; 0 ≤ ICC ≤ 1, unitless [212, 213]. Acceptance levels for ICC depend on the application. However, a general classification of reliability has been proposed [214]: 0.00 ≤ ICC ≤ 0.10 – virtually none; 0.11 ≤ ICC ≤ 0.40 – slight; 0.41 ≤ ICC ≤ 0.60 – fair; 0.61 ≤ ICC ≤ 0.80 – moderate; 0.81 ≤ ICC ≤ 1.0 – substantial. Standard error of measurement (SEM): \( \mathrm{SEM} = \mathrm{SD}\sqrt{1-\mathrm{ICC}} \), where SD is the standard deviation of the scores from all subjects; SEM has the same unit as the measured variable [18]. Bland-Altman plots: mean of two measures vs their difference; limits of agreement (LOA) = mean difference ± 1.96∙SD of the differences [17]. Cohen’s kappa (κ): percent agreement among raters corrected for chance agreement [215]. |
Validity | Extent to which the instrument measures what it intends to measure. Concurrent validity: degree to which the measure correlates with a gold standard. Construct validity: ability of a test to measure the underlying concept of interest. | Correlation-based methods: Pearson (r) or Spearman (ρ) correlation coefficient, ICC [216]. For continuous measures of the same data type (e.g. two methods for measuring gait speed): root mean square error (RMSE) or Bland-Altman plots against the gold standard. |
Responsiveness | Ability to accurately detect changes. Internal responsiveness: ability of a measure to change over a particular specified time frame. External responsiveness: extent to which changes in a measure over a specified time frame relate to corresponding changes in a gold standard [217]. Minimal detectable change (MDC): minimal amount of change that is not likely to be due to random variation in measurement [218]. Minimal clinically important difference (MCID): smallest amount of change in an outcome that might be considered important by the patient or clinician [22]. Floor and ceiling effects: the extent to which scores cluster at the bottom or top, respectively, of the scale range. | Internal responsiveness: Cohen’s effect size: mean observed change in score divided by the SD of the baseline scores. Standardized response mean (SRM): mean observed change score divided by the SD of the change scores in the group. External responsiveness: ROC curves: sensitivity vs 1 − specificity based on an external criterion [217]. MDC: \( \mathrm{MDC} = \mathrm{SEM} \times 1.96 \times \sqrt{2} \) [18]. MCID: anchor-based methods (compare a change score with an external measure of clinically relevant change) or distribution-based methods (based on statistical characteristics of the sample) [218]. Floor and ceiling effects: percentage of scores clustered at the bottom/top of the scale. |
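
The reliability formulas in the table (SEM, MDC and the Bland-Altman limits of agreement) are simple enough to sketch in a few lines. The following is a minimal illustration, not the authors' implementation: the function names and the example values (SD = 0.20, ICC = 0.91) are hypothetical, and the limits of agreement are assumed to be centred on the mean difference and to use the sample SD of the paired differences.

```python
import math
import statistics


def sem_from_icc(sd_all_subjects, icc):
    """Standard error of measurement: SEM = SD * sqrt(1 - ICC)."""
    return sd_all_subjects * math.sqrt(1.0 - icc)


def mdc95(sem):
    """Minimal detectable change: MDC = SEM * 1.96 * sqrt(2)."""
    return sem * 1.96 * math.sqrt(2.0)


def limits_of_agreement(first, second):
    """Return (bias, lower LOA, upper LOA) for paired measurements.

    Assumption: LOA = mean difference +/- 1.96 * sample SD of differences.
    """
    diffs = [a - b for a, b in zip(first, second)]
    bias = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)
    return bias, bias - 1.96 * sd_diff, bias + 1.96 * sd_diff


# Hypothetical values: SD of all scores 0.20 (e.g. gait speed in m/s), ICC 0.91.
sem = sem_from_icc(sd_all_subjects=0.20, icc=0.91)
print("SEM:", round(sem, 3), "MDC:", round(mdc95(sem), 3))
```

Because SEM and MDC carry the unit of the measured variable, a change smaller than the MDC cannot be distinguished from measurement noise at the 95% level.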
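
For the validity measures in the table, the sketch below shows how RMSE against a gold standard and the Pearson correlation coefficient could be computed for paired continuous measurements. It is an illustration under assumptions: the function names are hypothetical, both inputs are taken to be equal-length sequences in the same unit, and Spearman's ρ (which additionally requires ranking) is left out.

```python
import math


def rmse(measured, reference):
    """Root mean square error of a measure against a gold standard."""
    squared_errors = [(m - r) ** 2 for m, r in zip(measured, reference)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def pearson_r(x, y):
    """Pearson correlation coefficient between two paired samples."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)
```

The two statistics answer different questions: a measure can correlate perfectly with the gold standard and still show a large RMSE when it is systematically offset or scaled, which is why the table lists agreement-based methods alongside correlation.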
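
The internal responsiveness indices and the floor/ceiling percentages from the table can be sketched in the same way. This is a hedged illustration only: the function names are hypothetical, sample standard deviations are assumed, and a score is counted as a floor or ceiling score only when it equals the scale minimum or maximum exactly, which is one possible reading of "clustered at the bottom/top".

```python
import statistics


def cohen_effect_size(baseline, follow_up):
    """Mean change divided by the SD of the baseline scores."""
    changes = [f - b for b, f in zip(baseline, follow_up)]
    return statistics.mean(changes) / statistics.stdev(baseline)


def standardized_response_mean(baseline, follow_up):
    """Mean change divided by the SD of the change scores."""
    changes = [f - b for b, f in zip(baseline, follow_up)]
    return statistics.mean(changes) / statistics.stdev(changes)


def floor_ceiling_percent(scores, scale_min, scale_max):
    """Percentage of scores at the scale minimum and maximum.

    Assumption: only scores exactly at the scale limits are counted.
    """
    n = len(scores)
    floor = 100.0 * sum(s == scale_min for s in scores) / n
    ceiling = 100.0 * sum(s == scale_max for s in scores) / n
    return floor, ceiling
```

The effect size and the SRM share a numerator and differ only in the denominator, so a group that changes very uniformly yields a high SRM even when its effect size is modest.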