Cross-validation of predictive models for functional recovery after post-stroke rehabilitation

Table 1 Description and optimisation range of the hyperparameters tuned for each trained algorithm

Classifier	Parameters (description)	Values range
Logistic Regression	C (inverse of the regularisation strength)	0.001–1000
	l1_ratio (to select the weight of L1 and L2 penalties)	0.1–0.9
kNN	n_neighbors (to select the number of neighbours)	10–50
	weights (to select a uniform or distance-based weight on the samples)	“uniform”, “distance”
	algorithm (to select the type of algorithm to compute the nearest neighbours)	“brute”, “ball_tree”, “kd_tree”
	leaf_size (leaf size, selectable only for the tree-based algorithms; affects their speed and memory use)	5–100
	p (power of the Minkowski metric for the distance calculation)	1–5
SVM	gamma (kernel coefficient)	10⁻⁶–10⁶
	C (inverse of the regularisation strength)	10⁻⁶–10⁶
	kernel (kernel type to be used in the algorithm)	“rbf”, “linear”
RF	n_estimators (number of trees in the forest)	5–25
	max_depth (maximum depth of the tree)	1–10
	max_features (to select the number of features to consider when looking for the best split)	2–10
	criterion (to select the function type to estimate the quality of the split)	“gini”, “entropy”
	min_samples_leaf (to select the minimum number of samples required at a leaf node)	3–10
	min_samples_split (to select the minimum number of samples required to split an internal node)	5–20
	bootstrap (to activate or not the bootstrap approach when building the trees)	“true”, “false”
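
As an illustration of how the ranges in Table 1 could be encoded and explored, the following is a minimal sketch using only the Python standard library. It is not the authors' implementation. Several points are assumptions: the table states neither the library nor the search strategy, so names and string values are spelled as in scikit-learn (for example `C`, `weights`, `ball_tree`), random sampling stands in for whatever optimiser was actually used, `C` and `gamma` are drawn log-uniformly because their ranges span many orders of magnitude, and integer-bounded ranges are drawn as integers. The names `SEARCH_SPACE`, `LOG_SCALE` and `sample_candidate` are hypothetical.

```python
import math
import random

# Search space transcribed from Table 1.
# Tuples are (low, high) numeric ranges; lists are categorical choices.
SEARCH_SPACE = {
    "Logistic Regression": {
        "C": (1e-3, 1e3),
        "l1_ratio": (0.1, 0.9),
    },
    "kNN": {
        "n_neighbors": (10, 50),
        "weights": ["uniform", "distance"],
        "algorithm": ["brute", "ball_tree", "kd_tree"],
        "leaf_size": (5, 100),
        "p": (1, 5),
    },
    "SVM": {
        "gamma": (1e-6, 1e6),
        "C": (1e-6, 1e6),
        "kernel": ["rbf", "linear"],
    },
    "RF": {
        "n_estimators": (5, 25),
        "max_depth": (1, 10),
        "max_features": (2, 10),
        "criterion": ["gini", "entropy"],
        "min_samples_leaf": (3, 10),
        "min_samples_split": (5, 20),
        "bootstrap": [True, False],
    },
}

# Assumption: these wide ranges are sampled on a log10 scale.
LOG_SCALE = {"C", "gamma"}


def sample_candidate(classifier, rng):
    """Draw one random hyperparameter configuration for a classifier."""
    candidate = {}
    for name, spec in SEARCH_SPACE[classifier].items():
        if isinstance(spec, list):
            # Categorical parameter: pick one of the listed options.
            candidate[name] = rng.choice(spec)
        elif name in LOG_SCALE:
            low, high = spec
            exponent = rng.uniform(math.log10(low), math.log10(high))
            candidate[name] = 10 ** exponent
        elif all(isinstance(bound, int) for bound in spec):
            # Assumption: integer bounds mean an integer-valued parameter.
            candidate[name] = rng.randint(*spec)
        else:
            candidate[name] = rng.uniform(*spec)
    return candidate


rng = random.Random(0)  # fixed seed so the draws are reproducible
for classifier in SEARCH_SPACE:
    print(classifier, sample_candidate(classifier, rng))
```

Each sampled dictionary could then be passed as keyword arguments to the corresponding estimator and scored within the cross-validation loop. Depending on the library, some combinations may need extra settings that Table 1 does not list.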

ISSN: 1743-0003