Predicting functional performance via classification of lower extremity strength in older adults with exergame-collected data

Objective The goal of this article is to present and evaluate a sensor-based functional performance monitoring system. The system consists of an array of Wii Balance Boards (WBB) and an exergame that estimates whether the player can maintain physical independence, comparing the results with the 30 s Chair-Stand Test (30CST). Methods Sixteen participants recruited at a nursing home performed the 30CST and then played the exergame described here as often as desired during a period of 2 weeks. For each session, features related to walking and standing on the WBBs while playing the exergame were collected. Different classifier algorithms were used to predict the result of the 30CST on a binary basis as able or unable to maintain physical independence. Results By using a Logistic Model Tree, we achieved a maximum accuracy of 91% when estimating whether players' 30CST scores were over or under a threshold of 12 points. Our findings also suggest that predicting age- and sex-adjusted cutoff scores is feasible. Conclusion An array of WBBs seems to be a viable solution to estimate lower extremity strength and thereby functional performance in a non-invasive and continuous manner. This study provides proof of concept supporting the use of exergames to identify and monitor elderly subjects at risk of losing physical independence.


Introduction
Falls are an important cause of mortality and early placement in nursing homes in older adults. The main causes of falls are accidental and environment-related (31%), or related to gait imbalance (17%). Approximately 30 to 60% of older adults fall each year. Of these falls, 10 to 20% result in injury, hospitalization, or death. Risk assessment and exercise are among the most relevant factors in preventing these falls [1]. The role of sensor-based solutions with regard to fall risk has traditionally been focused on detecting falls once they occur. Both wearable and smartphone-based solutions for fall detection are readily available for this purpose [2]. Although this approach is useful, detecting elderly individuals who are at risk of losing physical independence, and thus may fall in the near future, would provide an additional method to prevent falls before they occur.
Exergames are active video games that incorporate physical movements, aiming to combine physical exercise with the fun associated with gaming. The main advantage of using exergames is that they increase motivation and thus adherence to training [3]. These exergames can be designed to require players to perform physical movements similar to those of fall risk prevention exercises. At the same time, and in the background, data of clinical relevance can be collected from the sensors used to control the exergame [4]. Furthermore, it is also possible to adapt the exergame to the specific needs of the user in real-time and without external intervention, based on how players perform in the game [5]. This holds promise for using exergames as rehabilitation tools able to provide continuous physical improvement [6].
Open Access. *Correspondence: Augusto.garcia@kom.tu-darmstadt.de. 1 Multimedia Communications Lab, Technische Universitaet Darmstadt, Rundeturmstr. 10, 64283 Darmstadt, Germany. Full list of author information is available at the end of the article.
The potential of the Wii Balance Board (WBB) to estimate whether the player can maintain physical independence has already been identified [7]. However, the relationship between WBB data and the estimation of clinically meaningful physical independence metrics is unclear. In this sense, Mertes et al. discussed that WBB data contain information that allows discrimination between elderly who previously fell and others who did not [8]. Their study achieved an accuracy of 76.6% when classifying fallers and non-fallers among 12 participants. Early evidence also shows that the WBB could be used to train balance in the elderly [9], and that there are statistically significant differences in the way elderly at falling risk interact with the WBB as compared to individuals with no falling risk. These differences correlate with clinical fall risk tests, further supporting our hypothesis that a direct relation between WBB data and clinical metrics for physical independence can be established. Yamada et al. [10] found statistically significant differences and moderate correlations (r = 0.69) in a study with 45 participants.
A limitation of the WBB is that, due to its small surface, it can only be used to estimate balance while standing, but not in movement. In a previous article, we presented PDDanceCity, a city map exergame that provides dual-task cognitive and physical rehabilitation [11]. The game is controlled with an array of six WBBs, which we call the Extended Balance Board (EBB) [12]. Thanks to its extended surface, EBB data can be used to estimate the balance of the player both while standing and while walking. We believe the data extracted from the EBB could be used to estimate the balance and gait skills of the player in the background, without the need to actively perform any specific test, or for any caregiver to be present.
To do so, this study aims to analyze the possibility of training a classifier to predict the clinical functional performance of a player based on EBB data collected in the background while playing PDDanceCity. This can be achieved by attempting to predict the score of a standardized test that can be used to assess the capability of maintaining physical independence. There are several such tests to measure lower extremity strength, for example, the 30-s Chair-Stand Test (30CST) [13], which is part of the Fullerton Fitness Test Battery and is fairly easy to administer. The Fullerton Fitness Test Battery is commonly employed in older adults in community settings. It can measure patterns of physical decline at advanced ages. Evidence suggests it could also be used as a screening test to estimate balance impairment in older adults [14,15]. The 30CST classifies participants as able or unable to maintain physical independence, depending on whether their test score is above or below an age- and sex-adjusted cutoff. We hypothesize that this binary prediction could be achieved with a classifier algorithm using data extracted from the EBB.
The goal of this study is to determine the potential of classifying EBB-extracted data to perform a binary prediction, that is, whether the player is able or unable to maintain physical independence. This prediction could be used to detect when individuals are at increased risk of losing physical independence and could be more prone to fall in the near future. We aim to validate this estimation basing the result on a prediction of the 30CST score. Data is collected while users are playing PDDanceCity to provide a very simple background screening process determining whether the player may be unable to maintain physical independence.

Methods
PDDanceCity [11] is a labyrinth navigation exergame designed for dual-tasking rehabilitation. The goal of the game is to navigate the labyrinth, which represents a city map, to reach a goal, where only two-dimensional movements are possible (up, down, right, and left). As an additional requirement, players are encouraged to reach the target with the least possible number of steps. In addition, they may be required to visit a given number of points of interest (for example a museum, monument, or café) which may, or may not, be directly on the shortest path (Fig. 1). The game offers dual-tasking rehabilitation, training visuospatial function, memory, balance, and physical coordination.
PDDanceCity is controlled with a system consisting of an array of six WBBs, called EBB [12] (Fig. 2). A controller receives all data from the WBBs and forwards it via a USB connection to a PC. Information sent through the USB interface contains the board identifier (ID), based on [...]
To use EBB data to control PDDanceCity, the center of mass com(t) is calculated as follows. We define S as the 6 × 4 matrix of sensor values (six WBBs with four sensors per board), and s_{i,j}(t) as the value of sensor (i, j) of S at instant t. We define C as the matrix of (x, y) coordinate vectors c_{i,j} assigned to each sensor based on its position (Fig. 3). We also define w(t) as the last total weight value calculated by all boards, that is, the weight of the player. com(t) is then calculated as the weight-normalized two-dimensional projection of the sensor values onto these coordinates. This results in a pair of values (com_x, com_y), each between −1 and 1, which can be used to determine intentionality. To achieve this, we define a directional intention based on two conditions: the main directional component must be equal to or greater than 0.5 in magnitude, and the other component must be equal to or less than 0.1 in magnitude. As an example, (0.1, 0.9) represents an upwards step, and (−0.8, 0.05) represents a leftwards movement. Between steps, the player is always required to return to the center (both values lower than or equal to 0.1 in magnitude). Figure 4 represents two examples of this directional intention.
We also define the instability factor if(t) as an approximation of the first-order differential of com(t). This parameter is a measure of how a player shifts their weight on the EBB. Very fast weight shifting, which causes a high value of if(t), would be an indicator of a potential lack (or loss) of balance among older adults, who are not expected to move quickly. It is calculated from the most recent value of com, at t, and the value prior to it, at (t − 1).
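Read literally, these definitions suggest the following form for both quantities. This is a sketch under assumptions rather than the authors' exact notation: the coordinates c_{i,j} are assumed to be scaled so that each component of com(t) falls between −1 and 1, and the use of the Euclidean norm of the difference for if(t) is also an assumption.

```latex
\mathrm{com}(t) = \frac{1}{w(t)} \sum_{i=1}^{6} \sum_{j=1}^{4} s_{i,j}(t)\, c_{i,j},
\qquad
\mathrm{if}(t) \approx \left\lVert \mathrm{com}(t) - \mathrm{com}(t-1) \right\rVert
```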
In this manner, when if(t) surpasses a certain threshold, a potential loss of balance might have occurred. For every level played, PDDanceCity stores an .xml file that includes the player's profile information, information about the level, steps taken, and all values of com(t) and if(t).
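The control logic described above can be sketched in Python as follows. The function names, the data layout (nested lists for S and C), and the use of the Euclidean distance between consecutive com values for the instability factor are assumptions made for illustration. Only the 0.5 and 0.1 magnitude limits and the two worked examples come from the text; mapping positive x to "right" and negative y to "down" is inferred from those examples.

```python
import math


def center_of_mass(s, c, w):
    """Weight-normalized projection of sensor values onto sensor coordinates.

    s[i][j] is the reading of sensor j on board i, c[i][j] is its (x, y)
    coordinate, and w is the player's total weight (assumed positive).
    """
    if w <= 0:
        raise ValueError("total weight must be positive")
    x = sum(s[i][j] * c[i][j][0] for i in range(len(s)) for j in range(len(s[i])))
    y = sum(s[i][j] * c[i][j][1] for i in range(len(s)) for j in range(len(s[i])))
    return (x / w, y / w)


def directional_intention(com, main_min=0.5, other_max=0.1):
    """Return 'center', 'up', 'down', 'left', 'right', or None if ambiguous."""
    x, y = com
    if abs(x) <= other_max and abs(y) <= other_max:
        return "center"
    if abs(y) >= main_min and abs(x) <= other_max:
        return "up" if y > 0 else "down"
    if abs(x) >= main_min and abs(y) <= other_max:
        return "right" if x > 0 else "left"
    return None


def instability_factor(com_now, com_prev):
    """Assumed form: distance between the current and the previous com value."""
    return math.hypot(com_now[0] - com_prev[0], com_now[1] - com_prev[1])
```

With these definitions, `directional_intention((0.1, 0.9))` yields an upwards step and `directional_intention((-0.8, 0.05))` a leftwards one, matching the examples in the text, while a value such as (0.4, 0.4) satisfies neither condition and is not treated as a step.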
Finally, we extract a series of features based on com(t) and if(t). These features are mostly related to average values, standard deviations, and maxima and minima of com(t) under different circumstances, as well as the number of times that if(t) exceeded different possible thresholds. In addition to these two elements, we also consider features related to the time intervals between steps, and the standard deviation of these intervals. A complete feature list is presented in Table 1. All features are calculated per playthrough, with no windowing. We used the Matlab software to calculate these features [16].
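A per-playthrough feature extraction of this kind could look roughly like the Python sketch below (the study itself used Matlab). All function names and the example thresholds are hypothetical. Counting the samples on which if(t) lies above a threshold is one possible reading of "number of times"; counting upward crossings would be another. The step features follow the denominators given for Step Avg and Step Std in Table 1 (n_Steps and n_Steps − 1, respectively), and the square root is assumed from the name of the feature.

```python
from math import sqrt
from statistics import mean, stdev


def com_features(values):
    """Average, standard deviation, maximum, and minimum of one com component."""
    return {"avg": mean(values), "std": stdev(values),
            "max": max(values), "min": min(values)}


def instability_counts(if_values, thresholds):
    """Number of samples on which if(t) lies above each candidate threshold."""
    return {th: sum(1 for v in if_values if v > th) for th in thresholds}


def step_intervals(step_times):
    """Time between consecutive steps; the first step has no interval."""
    return [b - a for a, b in zip(step_times, step_times[1:])]


def step_avg(step_times):
    """Sum of the intervals divided by the number of steps, as in Table 1."""
    if len(step_times) < 2:
        raise ValueError("at least two steps are required")
    return sum(step_intervals(step_times)) / len(step_times)


def step_std(step_times):
    """Spread of the intervals around step_avg, divided by n_steps - 1."""
    avg = step_avg(step_times)
    squared = sum((d - avg) ** 2 for d in step_intervals(step_times))
    return sqrt(squared / (len(step_times) - 1))
```

Because every function takes the full series of one playthrough, each call yields one value per playthrough, in line with the statement that no windowing is applied.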
To evaluate our system, we recruited 16 participants (median age 73, 6 males) at a nursing home in Darmstadt, Germany. A computer was installed in a common room, connected to a television and the EBB (Fig. 2). Participants were invited to play PDDanceCity as often as they desired for a period of 2 weeks. During the first session, nominal data (age and sex) was collected, and the 30CST was administered. The resulting 30CST scores ranged between 0 and 17, with a median of 13. All sessions took place under observation of one of the authors, to ensure that no falls occurred. Otherwise, the game sessions were unsupervised. We obtained the approval of the ethics committee of the Technical University of Darmstadt for this evaluation.

Fig. 2 PDDanceCity scenario setup
In total, these 16 participants played 87 levels of PDDanceCity during this period. The median number of levels per participant was 5. Each level of PDDanceCity takes approximately 2 to 3 min to complete, resulting in an approximate gameplay time of 10 to 15 min per participant. For each level, a single training instance was obtained. The data of 6 of these levels had to be discarded due to data failure, leaving 81 training instances for classification. Due to the reduced number of participants, and to minimize the risk of overfitting based on age and sex, we attempted to classify whether the player's predicted 30CST score was above or below a cutoff score of 12 points, without using these nominal data (age and sex) as features. We refer to players classified above this cutoff as fit, and those under the cutoff as not fit. This score was chosen to even out both groups, as eight participants had a 30CST score of 11 or lower. We also explore the possibility of predicting the adjusted cutoffs, which we discuss at the end of the "Results" section. All classification tasks were performed using Weka [17].
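The labeling step described here can be illustrated with a short Python sketch. It assumes that a score of 12 or higher counts as fit, which is inferred from the statement that eight participants scored 11 or lower. The function names and the data layout are hypothetical.

```python
CUTOFF = 12  # fixed cutoff from the text; not age- or sex-adjusted


def label_for(score, cutoff=CUTOFF):
    """Binary class for one participant's 30CST score (assumed: >= cutoff is fit)."""
    return "fit" if score >= cutoff else "not fit"


def build_instances(levels, scores):
    """One training instance per played level.

    levels: list of (participant_id, feature_dict), one entry per level.
    scores: mapping from participant_id to that participant's 30CST score.
    Age and sex are deliberately not added to the feature dictionaries.
    """
    return [(features, label_for(scores[pid])) for pid, features in levels]
```

Every level played by a participant inherits that participant's label, which is how a small number of participants can yield a larger number of training instances.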

Results
The best classification results are presented in Table 2. The corresponding decision tree used the average time between steps exclusively, with a value of 6.17 or lower indicating a participant able to maintain physical independence. A comparison of different classification algorithms is presented in Fig. 5. In all cases, we performed our classification using ten-fold cross-validation. Results of a feature selection analysis (information gain attribute evaluation) are included in Table 3. No features were excluded for classification.
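As an illustration of this evaluation procedure, the following Python sketch runs k-fold cross-validation on a single-feature threshold rule (a decision stump). It is not the Weka classifier used in the study: the stump, the function names, and the way folds are drawn are assumptions made for the example. Only the direction of the rule, with a lower average time between steps meaning able to maintain physical independence, is taken from the text.

```python
import random


def predict(threshold, x):
    """Rule of the form described in the text: a low value means 'fit'."""
    return "fit" if x <= threshold else "not fit"


def train_stump(xs, ys):
    """Choose the midpoint threshold that maximizes training accuracy."""
    values = sorted(set(xs))
    # Candidate thresholds lie halfway between neighboring feature values.
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])] or values
    best_threshold, best_correct = candidates[0], -1
    for threshold in candidates:
        correct = sum(predict(threshold, x) == y for x, y in zip(xs, ys))
        if correct > best_correct:
            best_threshold, best_correct = threshold, correct
    return best_threshold


def cross_validate(xs, ys, k=10, seed=0):
    """Fraction of instances classified correctly while held out."""
    indices = list(range(len(xs)))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    correct = 0
    for fold in folds:
        held_out = set(fold)
        train = [i for i in indices if i not in held_out]
        threshold = train_stump([xs[i] for i in train], [ys[i] for i in train])
        correct += sum(predict(threshold, xs[i]) == ys[i] for i in fold)
    return correct / len(xs)
```

Each instance is predicted exactly once by a rule trained without it, so the returned value is an accuracy over all instances rather than an average of per-fold accuracies.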
As a second potential scenario of analysis, we also aimed to predict the age- and sex-adjusted 30CST cutoff scores. The resulting accuracy was very high (99%) but, as discussed in the previous section, we suspect this to be due to overfitting to age and sex because of our limited sample size, as the classifier achieved 100% accuracy using exclusively age and sex as features. If we remove these two features in this scenario, we achieve a classification accuracy of 86% when predicting the age- and sex-adjusted 30CST outcome. For this reason, we believe that, given a sufficiently large and diverse sample of participants covering a wide range of ages and degrees of fitness, it should be possible to predict the age- and sex-adjusted 30CST binary result using the methods presented in this publication.
We complemented this classification with an analysis of the effect size of each feature between the fit and not fit groups, measured on the basis of Hedges' g, due to the low sample size and the disparity in standard deviations. We also evaluated statistical significance using a Welch t-test. These effect sizes are presented in Table 4. Features related to the instability factor and to the mean and standard deviation of the time between steps seem to contain the information most related to the 30CST. Following Cohen's rule of thumb (0.2 is a small, 0.5 a medium, and 0.8 a large effect size), the effect sizes of these features are large, with the differences between the fit and not fit groups being in most cases very significant (p < 0.001) or at least significant (p < 0.05).
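For reference, both statistics can be computed with the Python standard library, as in the sketch below. It uses the common textbook definitions: Hedges' g as Cohen's d on the pooled standard deviation multiplied by the small-sample correction 1 − 3/(4(n1 + n2) − 9), and the Welch statistic with Welch-Satterthwaite degrees of freedom. Whether the study used exactly these variants is an assumption, and the function names are hypothetical. The p-value is omitted because it requires the t-distribution, which the standard library does not provide.

```python
from math import sqrt
from statistics import mean, variance


def hedges_g(a, b):
    """Standardized mean difference with small-sample bias correction."""
    n1, n2 = len(a), len(b)
    pooled_var = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / (n1 + n2 - 2)
    d = (mean(a) - mean(b)) / sqrt(pooled_var)
    correction = 1 - 3 / (4 * (n1 + n2) - 9)
    return d * correction


def welch_t(a, b):
    """Welch t statistic and Welch-Satterthwaite degrees of freedom."""
    n1, n2 = len(a), len(b)
    se1, se2 = variance(a) / n1, variance(b) / n2
    t = (mean(a) - mean(b)) / sqrt(se1 + se2)
    df = (se1 + se2) ** 2 / (se1 ** 2 / (n1 - 1) + se2 ** 2 / (n2 - 1))
    return t, df
```

Applied to one feature at a time, `a` would hold the values of the fit group and `b` those of the not fit group, so the sign of g shows which group has the larger mean.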

Discussion
Despite the limited number of participants and training instances, we obtained high classification accuracy (up to 91% using a Logistic Model Tree). Generally, decision trees seem to provide the best performance in the proposed classification task.
Although we decided on using the 30CST to minimize the risk of falls while conducting the test, this test is correlated with physical independence, but not with the risk of falling. This is a limitation of this study since, in order to evaluate the feasibility of using EBB data to predict the risk of falling directly, an alternative assessment method, such as the Berg Balance Scale (BBS), should be used. A future study with a larger cohort should consider using the BBS instead of the 30CST to further support the hypothesis that EBB data can be used to accurately identify participants at an increased falling risk. In addition, further balance-related data from participants (Physiological Profile Assessment, functional balance, gait speed, or prior falls) should be collected as well.
Our design also presents some technical limitations. At the moment, the WBBs send the data via Bluetooth, which means they have to be manually connected for each play session. They also operate on batteries, and when these are low the data received is not reliable anymore, thus leading to data failure. Additionally, the EBB frame presents a risk depending on how the EBB is placed in its surroundings: if it is not set against a wall

Com Std Direction: Standard deviation of com, for each direction, as above. Eight features per playthrough.
Step Avg: Average time between steps, excluding the first step, defining StepTime(i) as the time in seconds in which step i occurred, and n_{Steps} as the total number of steps in the playthrough. One feature per playthrough. Formula: \frac{1}{n_{Steps}} \sum_{i=2}^{n_{Steps}} \left( StepTime(i) - StepTime(i-1) \right)
Step Std: Standard deviation of time between steps, excluding the first step. One feature per playthrough. Formula: \sqrt{\frac{1}{n_{Steps} - 1} \sum_{i=2}^{n_{Steps}} \left( StepTime(i) - StepTime(i-1) - StepAvg \right)^{2}}

Conclusions
This study provides proof of concept supporting the use of exergames to identify elderly subjects at risk of losing physical independence. Despite the aforementioned limitations, our results suggest that the EBB, as an extension of the WBB, can be used to screen the elderly population for individuals who are likely to lose physical independence in the near future, thus guiding therapeutic and rehabilitation adjustments. Nevertheless, a larger dataset is required to determine the feasibility of predicting whether a participant will be above or below their age- and sex-adjusted 30CST cutoff score. This could also open the possibility of predicting the result of similar tests, such as the Berg Balance Scale or the Ten-Meter Walk Test.
Once the technical limitations of the EBB are addressed, and considering that participants played without supervision, a home (or, more generally, unsupervised) scenario seems feasible. In the future, we aim to extend our evaluation by including features related to game performance and by conducting a similar evaluation concerning cognition. This could be done, for example, on the basis of the Mini-Mental State Examination.