Paradigm design
To provide robust single-channel control, we implemented a synchronous binary approach to 2-D cursor control. Synchronous control uses a pre-defined time window for each user response, so the computer does not need to determine when a response occurs, but only into which class it falls. Binary control means that each response must be classified into one of only two classes, in contrast to control schemes in which a response can be assigned to one of a larger number of classes or ignored altogether. Synchronous binary classification is the simplest possible classification using EEG, and we hypothesized that this simplicity would yield high cursor movement accuracy.
The binary approach works as follows. The cursor moves in discrete steps, and each step is in one of four directions (up, down, left, right) as selected by the user through his or her EEG signal. To select a direction, the user effectively answers "yes" or "no" two times in a row, performing continuous right-hand movement to answer "yes," or abstaining from such movement to answer "no." The user has a short time to give each answer, during which the resultant ERD causes a power change in the EEG signal. The computer program measures the EEG power from a single optimum channel and frequency band over the pre-defined time window of the subject's answer. If the power is above a certain threshold the software algorithm interprets the answer as a "no," and if the power is below the threshold the software algorithm interprets the answer as a "yes." The program determines the threshold value prior to the user's first game by presenting a series of "yes" or "no" prompts that the user obeys directly, and using the associated power measurements from the appropriate location/band to optimize classification accuracy. This threshold determination does not have to be repeated before each game.
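The decision rule and the two-answer selection described above can be sketched as follows. This is a minimal illustration, not the authors' program: the function names, the layout of "yes" and "no" prompts around the cursor, and the treatment of a power exactly equal to the threshold (read here as "no") are all assumptions.

```python
def classify_answer(band_power, threshold):
    """Read power below the threshold as "yes" (movement), otherwise as "no".

    A power exactly equal to the threshold is treated as "no"; the text does
    not specify this case, so it is an assumption.
    """
    return "yes" if band_power < threshold else "no"


def select_direction(first_prompts, second_prompts, powers, threshold):
    """Decode one cursor move from two successive binary answers.

    first_prompts and second_prompts map each direction to the prompt shown
    there ("yes" or "no"). The layouts used below are hypothetical examples.
    """
    first = classify_answer(powers[0], threshold)
    # The first bit eliminates the two directions whose prompt was rejected.
    remaining = [d for d, prompt in first_prompts.items() if prompt == first]
    second = classify_answer(powers[1], threshold)
    chosen = [d for d in remaining if second_prompts[d] == second]
    if len(chosen) != 1:
        raise ValueError("prompt layout does not single out one direction")
    return chosen[0]


# Hypothetical prompt layouts for the first and second selection.
FIRST = {"up": "yes", "down": "yes", "left": "no", "right": "no"}
SECOND = {"up": "yes", "down": "no", "left": "yes", "right": "no"}

move = select_direction(FIRST, SECOND, powers=(2.0, 9.0), threshold=5.0)
```

With these example values the first power is below the threshold ("yes", keeping up and down) and the second is above it ("no"), so `move` is `"down"`.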
Under the 2-D cursor control paradigm, a cursor moves among squares of a grid towards a target while avoiding a trap. Sequential screen shots of one cursor move are shown in Figure 1. The subject is presented with the game grid, and is allowed to blink, shift gaze, and strategize for the next move. After presentation, everything but the cursor and the four adjacent squares is blacked out, and a prompt is presented in each of the possible movement directions. For all but one of the studies presented here, EEG signals were recorded with the subject making hand movements. One example is presented in which control was performed with only motor imagery. When movements are used, the subject initiates control by making continuous right-hand movement. The prompts remain cyan for a short time to allow the subject to interpret the prompt in the desired movement direction, and then the prompts turn green. While the prompts are green, the subject executes the desired task. To select a direction showing a "yes" prompt, the subject continues the right-hand movement. To select a direction showing a "no" prompt, the subject ceases the movement and remains motionless throughout the green prompt. In either case, while the prompt is green the subject must fixate on the prompt, remain relaxed, and avoid blinking, so as not to introduce artifacts. Once the program determines the first response (first bit), it eliminates the two rejected directions and repeats the prompting process. After the second response (second bit), the game grid again becomes visible, and the cursor moves to the new position. The entire process for one (two-bit) cursor move takes about 15 s. When the game is played without hand movements (as in one of our supplementary tests), the subject is asked instead to imagine a movement. When playing the game using motor imagery, the threshold-setting and control tasks are performed as normal.
Additional file 1: Example Video. A short video clip of the 2-D cursor control game that demonstrates the general sequence of a typical game. This file is provided only to demonstrate the appearance of the game. No audio. (WMV 7 MB)
While on any given movement the cursor moves in only one direction, the control is two-dimensional rather than one-dimensional because the direction of each movement can be any one of four choices in two dimensions. This is analogous to the two-dimensional control achieved by the P-300 detection method of Piccione et al. [15], which also uses a series of single cursor movements, each in one of four directions. Whereas Piccione's method relies on sequential emphasis of four stimuli to obtain the two bits of information required for each cursor move, our method obtains the first and second bits sequentially through two user selections, which together uniquely identify both the dimension and direction of each cursor move. This two-dimensional control is distinct from one-dimensional control, wherein the computer restricts the dimension of cursor movement, and the user is free to control only the direction of the movement.
The cursor control game incorporates several additional features. These include automatic recordkeeping, game scoring to hold player interest, and an optional adaptive threshold feature (which was used only for Subject F, as discussed below). Furthermore, the program avoids superfluous prompts; if the cursor is at an edge of the grid and the first prompt can uniquely determine cursor movement direction, then only one prompt is provided.
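The rule for skipping a superfluous second prompt can be illustrated with a short sketch. The grid coordinates, the function names, and the way the valid directions are split between the "yes" and "no" groups are hypothetical; the text only states that a single prompt is given when the first answer already determines the direction. The sketch assumes a grid of at least 2 x 2 squares.

```python
def valid_directions(row, col, n_rows, n_cols):
    """List the moves that keep the cursor on the grid (row 0 is the top)."""
    dirs = []
    if row > 0:
        dirs.append("up")
    if row < n_rows - 1:
        dirs.append("down")
    if col > 0:
        dirs.append("left")
    if col < n_cols - 1:
        dirs.append("right")
    return dirs


def first_prompt_groups(directions):
    """Split the valid directions into a "yes" group and a "no" group.

    Hypothetical split: the first half (rounded up) shows "yes".
    """
    half = (len(directions) + 1) // 2
    return directions[:half], directions[half:]


def prompts_needed(directions, first_answer):
    """Return 1 if the first answer already singles out a direction, else 2."""
    yes_group, no_group = first_prompt_groups(directions)
    chosen = yes_group if first_answer == "yes" else no_group
    return 1 if len(chosen) == 1 else 2
```

Under this split, a corner square (two valid directions) always needs one prompt, an edge square (three valid directions) needs one prompt only when the answer selects the single-direction group, and an interior square always needs two.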
Study procedures
A Neuroscan Synamp 1 amplifier (Neuroscan Inc., El Paso, TX, USA) amplified the EEG signal from 29 electrodes. Signals were sampled at 250 Hz from electrodes at FP1, F3, F7, C3A, C1, C3, C5, T3, C3P, P3, T5, O1, FP2, F4, F8, C4A, C2, C4, C6, T4, C4P, P4, T6, O2, FZ, CZA, CZ, PZA, and PZ, mounted in an elastic cap (Electro-Cap International, Inc., Eaton, OH, USA). Recordings from at most five of these 29 electrodes were used for each subject's cursor control, although all 29 electrodes were used once per subject for the initial channel/bin optimization step, which did not need to be repeated thereafter. A Hewlett-Packard workstation converted the amplified analog signal to a digital signal.
We determined the optimum single electrode location and frequency band for control for each subject from offline analysis of EEG recordings. First, each subject performed the threshold-setting task (although no threshold was set at this point) wherein single predetermined yes/no prompts were presented sequentially. This threshold-setting task consisted of 30 prompts, composed of 15 "yes" and 15 "no" prompts randomly interspersed. An offline feature analysis of the resultant EEG recordings was performed to identify the location and band for which power measurements provided the greatest yes/no class separability. Once the optimum location and band were identified, these were used for all subsequent testing with the subject. Thus, this optimization step, which required a relatively large number of electrodes (all 29 were analyzed), only needed to be performed once per subject, and then a reduced number of electrodes could be used (five electrodes if using Laplacian derivation, or one electrode if not).
Once the optimum location and band were identified, each subject repeated the threshold-setting task, and the power in the optimum location/band was again computed (now using the reduced number of electrodes). These measurements were used to set an optimum threshold. For these experiments, the threshold-setting task again consisted of 30 prompts, composed of 15 "yes" and 15 "no" prompts randomly interspersed. Completion of the entire threshold-setting task took less than 5 minutes. The threshold determined from this task was used for the subsequent 2-D cursor control task. Each subject repeated the threshold-setting task multiple times to practice his or her control strategy. However, each time the task was repeated, the program discarded all previously obtained data. Thus, the threshold set by the program was based solely on the 30 prompts from the subject's most recent performance of the threshold-setting task.
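As a small illustration of the task structure, the following sketch builds one randomly interspersed sequence of 15 "yes" and 15 "no" prompts. The function name and the optional seed are assumptions made for reproducibility of the example. Because each call returns a fresh sequence and nothing is kept between calls, it also mirrors the point that every repetition of the task starts from scratch.

```python
import random


def make_prompt_sequence(n_yes=15, n_no=15, seed=None):
    """Return a shuffled list of "yes"/"no" prompts for one threshold-setting run."""
    rng = random.Random(seed)  # a seed is only used to make the example repeatable
    prompts = ["yes"] * n_yes + ["no"] * n_no
    rng.shuffle(prompts)
    return prompts


sequence = make_prompt_sequence(seed=0)
```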
Finally, each subject performed the 2-D cursor control task. The program interpreted intended cursor movement direction online in real-time by comparing measured powers to the optimum threshold. The program also tagged the EEG recordings with the interpreted yes/no answers. An electromyography (EMG) channel recorded right hand movement during the cursor control task. The EMG signal was sampled at 250 Hz from a bipolar surface electrode located over each subject's right wrist extensor muscles. Visual inspection of the EMG recording was used to quantify the control accuracy through post-hoc offline analysis.
Computational method
For all prompts in the threshold-setting and cursor control tasks, the time window in which the subject gave each yes or no answer lasted 2 s. Band power measurements were computed for the final 1.5 s of this window only, to allow for subject response time. Power was determined using the Welch estimation method with an FFT length (NFFT) of 64 and a Hamming window with 50% overlap [17]. At the 250 Hz sampling rate of this study, this gave a frequency resolution of about 4 Hz (250/64 ≈ 3.9 Hz). For all measurements, the EEG signal was referenced using Laplacian derivation to reduce error. This means that each electrode was referenced to the average of the potentials from its four nearest orthogonal neighbors. For example, the program referenced the C3 channel to the average of C1, C3A, C5, and C3P, each of which was about 3 cm from C3, and calculated band power on C3 for the referenced signal.
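The Laplacian referencing and Welch power estimate described above can be sketched in plain Python as follows. This is an illustrative reimplementation rather than the authors' code: the function names are hypothetical, the spectrum uses two-sided density scaling without one-sided doubling, and no detrending is applied. A constant scale factor does not affect a threshold comparison within one channel and band.

```python
import cmath
import math

FS = 250.0  # sampling rate in Hz (from the text)
NFFT = 64   # segment and FFT length (from the text)


def laplacian_reference(center, neighbors):
    """Subtract the sample-wise mean of the neighbor channels from the center channel."""
    n = len(neighbors)
    return [c - sum(ch[i] for ch in neighbors) / n for i, c in enumerate(center)]


def welch_psd(signal, nfft=NFFT, fs=FS):
    """Average Hamming-windowed periodograms over 50%-overlapping segments."""
    window = [0.54 - 0.46 * math.cos(2 * math.pi * k / (nfft - 1)) for k in range(nfft)]
    scale = fs * sum(w * w for w in window)
    starts = range(0, len(signal) - nfft + 1, nfft // 2)
    if len(starts) == 0:
        raise ValueError("signal is shorter than one segment")
    psd = [0.0] * (nfft // 2 + 1)
    for s in starts:
        seg = [signal[s + k] * window[k] for k in range(nfft)]
        for b in range(len(psd)):
            # Direct DFT of the windowed segment at bin b.
            coeff = sum(x * cmath.exp(-2j * math.pi * b * k / nfft)
                        for k, x in enumerate(seg))
            psd[b] += abs(coeff) ** 2 / scale
    return [p / len(starts) for p in psd]


def bin_frequency(b, nfft=NFFT, fs=FS):
    """Center frequency in Hz of spectral bin b."""
    return b * fs / nfft


def response_band_power(window_2s, b, fs=FS):
    """Power in bin b over the final 1.5 s of a 2 s response window."""
    tail = window_2s[-int(1.5 * fs):]
    return welch_psd(tail)[b]
```

With these settings the final 1.5 s contains 375 samples, which yields ten overlapping 64-sample segments, and adjacent bins are 250/64 ≈ 3.9 Hz apart.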
To determine the optimum spatial location and frequency band for discrimination, we conducted a feature analysis by calculating Bhattacharyya distances from power measurements. Frequency bands were 4 Hz wide, corresponding to the 4 Hz resolution of the power measurement. We measured power using the Welch method for each yes/no response, for each EEG channel. Then, for each channel/bin pair, we calculated a Bhattacharyya distance based on the power measurements for all of the responses from both the "yes" and "no" classes. Higher Bhattacharyya distances corresponded to better yes/no class separability, and identified the more effective channels and frequency bands for control. We calculated each Bhattacharyya distance according to (1),

$$B = \frac{1}{8}(M_2 - M_1)^T\left[\frac{\Sigma_1 + \Sigma_2}{2}\right]^{-1}(M_2 - M_1) + \frac{1}{2}\ln\frac{\left|\frac{\Sigma_1 + \Sigma_2}{2}\right|}{\sqrt{|\Sigma_1|\,|\Sigma_2|}} \tag{1}$$

where $M_i$ and $\Sigma_i$ are the mean vector and covariance matrix of class $i$ ($i = 1, 2$), respectively [18]. As we measured the Bhattacharyya distance for each channel and frequency bin, $M_i$ is a scalar.
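For a single channel/bin pair the feature is one power value per response, so the Bhattacharyya distance reduces to a scalar expression in the class means and variances. The sketch below uses the standard Gaussian form of the distance for that scalar case; the function names and the dictionary layout are hypothetical, and sample variances are assumed (the text does not say which estimator was used). Both classes need nonzero variance.

```python
import math
from statistics import mean, variance


def bhattacharyya_distance(yes_powers, no_powers):
    """Scalar Gaussian Bhattacharyya distance between two sets of power values.

    B = (m1 - m2)^2 / (8 v) + 0.5 * ln(v / sqrt(v1 * v2)),  v = (v1 + v2) / 2
    """
    m1, m2 = mean(yes_powers), mean(no_powers)
    v1, v2 = variance(yes_powers), variance(no_powers)  # sample variances (assumed)
    v = (v1 + v2) / 2.0
    return (m1 - m2) ** 2 / (8.0 * v) + 0.5 * math.log(v / math.sqrt(v1 * v2))


def best_feature(powers):
    """Pick the (channel, bin) key whose classes are most separable.

    powers maps (channel, bin) -> (yes_powers, no_powers).
    """
    return max(powers, key=lambda key: bhattacharyya_distance(*powers[key]))


# Made-up numbers purely to exercise the functions.
example = {
    ("C3", 3): ([1.0, 2.0, 3.0], [5.0, 6.0, 7.0]),
    ("C4", 3): ([1.0, 2.0, 3.0], [2.0, 3.0, 4.0]),
}
chosen = best_feature(example)
```

In this toy input the C3 pair has well-separated class means with equal variances, so it yields the larger distance and is selected.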
After we identified the optimum location and frequency band (done only once per subject), we used these in our threshold-setting program, which no longer needed all EEG channels. This program measured power in the optimum location/band while the subject performed the threshold-setting task. After the task was complete, a receiver operating characteristic (ROC) curve was generated by determining the true positive and false positive fractions that would result from various threshold values. Here, "true positive fraction" refers to the fraction of intended "yes" answers that the program would interpret as "yes" answers given the particular threshold value (this is equivalent to sensitivity). "False positive fraction" is the fraction of intended "no" answers that the program would interpret as "yes" answers (this is equivalent to 1 − specificity). The threshold-setting program chose the optimal threshold as that which minimized the distance defined in (2).
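The threshold sweep can be sketched as follows. The definition of the distance in (2) is not reproduced in this text, so the sketch assumes a common choice: the Euclidean distance from each ROC point to the ideal corner (true positive fraction 1, false positive fraction 0). The function names and the use of midpoints between observed powers as candidate thresholds are also assumptions.

```python
import math


def roc_points(yes_powers, no_powers):
    """Return (threshold, TPF, FPF) for each candidate threshold.

    Candidates are midpoints between consecutive distinct observed powers.
    A response counts as "yes" when its power is below the threshold.
    """
    values = sorted(set(yes_powers) | set(no_powers))
    thresholds = [(a + b) / 2.0 for a, b in zip(values, values[1:])]
    points = []
    for t in thresholds:
        tpf = sum(p < t for p in yes_powers) / len(yes_powers)
        fpf = sum(p < t for p in no_powers) / len(no_powers)
        points.append((t, tpf, fpf))
    return points


def choose_threshold(yes_powers, no_powers):
    """Pick the threshold whose ROC point lies closest to (FPF=0, TPF=1).

    The distance criterion is an assumption standing in for equation (2).
    """
    best = min(roc_points(yes_powers, no_powers),
               key=lambda pt: math.hypot(1.0 - pt[1], pt[2]))
    return best[0]
```

For example, with "yes" powers `[1, 2, 3]` and "no" powers `[5, 6, 7]` the classes are perfectly separable, and the midpoint 4.0 reaches the ideal corner. The input must contain at least two distinct power values.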
Additional file 2: Overview summarizes the most important steps of the binary control computational method. The file shows examples of recorded EEG signals, and indicates how these signals can be classified based on their power spectral densities into "yes" and "no" classes. The file demonstrates the correspondence between higher Bhattacharyya distances and better class separability, and shows how choosing the optimum location/band can yield a high-quality ROC curve, from which a threshold can be set and subsequently used to achieve good control in the 2-D cursor control task.
To quantitatively assess the accuracy of the cursor control, we analyzed the recordings from the control task offline following each subject's session. We compared our program's yes/no interpretations with the recorded right wrist EMG trace to explicitly determine whether each classification and cursor move was correct.
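Once each response has been labeled from the EMG trace, the comparison reduces to simple counting. The sketch below assumes those labels are already available as "yes"/"no" strings (the visual inspection itself is not modeled) and uses hypothetical names; it reports accuracy per classification and per cursor move, where a move counts as correct only if all of its answers match.

```python
def control_accuracy(decoded_moves, emg_moves):
    """Return (classification accuracy, move accuracy).

    Each move is a tuple of one or two "yes"/"no" answers; decoded_moves holds
    the program's interpretations and emg_moves the EMG-derived intended answers.
    """
    pairs = [(d, e)
             for dm, em in zip(decoded_moves, emg_moves)
             for d, e in zip(dm, em)]
    bit_accuracy = sum(d == e for d, e in pairs) / len(pairs)
    move_accuracy = (sum(dm == em for dm, em in zip(decoded_moves, emg_moves))
                     / len(decoded_moves))
    return bit_accuracy, move_accuracy


# Made-up example: three moves, one of them needing only a single prompt.
decoded = [("yes", "no"), ("no", "no"), ("yes",)]
intended = [("yes", "no"), ("yes", "no"), ("yes",)]
bit_acc, move_acc = control_accuracy(decoded, intended)
```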
For motor imagery, no EMG signal was available for comparison, so we assessed the accuracy of yes/no classification from one of the threshold-setting task recordings. We divided the prompts into a training set consisting of the first 7 "yes" and first 7 "no" responses, and a testing set consisting of the last 8 "yes" and last 8 "no" responses. We used the training set to calculate an optimum threshold, which we then applied to the testing set to classify its responses. Because we knew the correct classifications of the responses, we were able to quantify the classification accuracy. We also used the entire threshold-setting task to set an optimum threshold with which the subject played the cursor control game. We then asked the subject to qualitatively evaluate her control after playing the game.
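The split into training and testing responses can be sketched as below. The threshold-fitting rule is passed in as a function so that any criterion can be plugged in; the `midpoint_of_means` rule shown is only a simple stand-in for illustration and is not the criterion used in the study. All names are hypothetical.

```python
def holdout_accuracy(responses, fit_threshold, n_train=7):
    """Fit a threshold on the first n_train responses per class, test on the rest.

    responses is a list of (label, power) pairs in presentation order.
    A test response is read as "yes" when its power is below the threshold.
    """
    yes = [p for label, p in responses if label == "yes"]
    no = [p for label, p in responses if label == "no"]
    threshold = fit_threshold(yes[:n_train], no[:n_train])
    test_yes, test_no = yes[n_train:], no[n_train:]
    correct = (sum(p < threshold for p in test_yes)
               + sum(p >= threshold for p in test_no))
    return correct / (len(test_yes) + len(test_no))


def midpoint_of_means(yes_powers, no_powers):
    """Stand-in fitting rule: halfway between the two class means."""
    mean_yes = sum(yes_powers) / len(yes_powers)
    mean_no = sum(no_powers) / len(no_powers)
    return (mean_yes + mean_no) / 2.0


# Made-up powers: 15 "yes" and 15 "no" responses, alternating.
demo = []
for k in range(15):
    demo.append(("yes", float(k + 1)))
    demo.append(("no", float(k + 21)))
accuracy = holdout_accuracy(demo, midpoint_of_means)
```

With 15 responses per class, the first 7 of each form the training set and the remaining 8 form the testing set, matching the split described above.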
Subjects and data acquisition
We tested the paradigm with four healthy subjects using hand movement. Subjects included three females and one male, aged 24 to 55 years. Subject A was female, age 53 years. Subject B was female, age 55 years. Subject C was female, age 24 years. Subject D was male, age 32 years.
We also carried out several supplementary tests. Subject B performed our paradigm using motor imagery, following her session using real movement. Subject E, a primary lateral sclerosis (PLS) patient, performed our paradigm using hand movement. PLS is a motor neuron disease, the symptoms of which include slowly progressive spasticity of unknown cause without clinical signs of lower motor neuron loss. Pathological studies show degeneration of the corticospinal tracts. Subject E was female, age 58 years, and had had the disease for 11 years. She was identified as a PLS-A patient with loss of motor-evoked potentials by transcranial magnetic stimulation, and her right finger tapping rate was 3.6 taps/s, which was significantly lower than the 5.8 taps/s of healthy controls [19]. Subject F performed our paradigm using hand movement, but with no Laplacian derivation referencing of the EEG channels. Subject F was male, age 23 years. We also performed a post-hoc offline analysis of data from Subject A with the Laplacian derivation removed.
None of the subjects had previous BCI experience. All subjects were right-handed according to the Edinburgh inventory [20]. All subjects gave written informed consent for the protocol, which was approved by the institutional review board.
We performed the real-time EEG data acquisition and processing using a self-developed, MATLAB-based hardware and software system. The MATLAB scripts accessed the digital signal and performed the power spectral estimation. Finally, the scripts decoded the power spectral signal to drive the cursor movement.