Detecting alpha spindle events in EEG time series using adaptive autoregressive models
© Lawhern et al.; licensee BioMed Central Ltd. 2013
Received: 30 May 2013
Accepted: 13 September 2013
Published: 18 September 2013
Skip to main content
© Lawhern et al.; licensee BioMed Central Ltd. 2013
Received: 30 May 2013
Accepted: 13 September 2013
Published: 18 September 2013
Rhythmic oscillatory activity is widely observed during a variety of subject behaviors and is believed to play a central role in information processing and control. A classic example of rhythmic activity is alpha spindles, which consist of short (0.5-2 s) bursts of high frequency alpha activity. Recent research has shown that alpha spindles in the parietal/occipital area are statistically related to fatigue and drowsiness. These spindles constitute sharp changes in the underlying statistical properties of the signal. Our hypothesis is that change point detection models can be used to identify the onset and duration of spindles in EEG. In this work we develop an algorithm that accurately identifies sudden bursts of narrowband oscillatory activity in EEG using techniques derived from change point analysis. Our motivating example is detection of alpha spindles in the parietal/occipital areas of the brain. Our goal is to develop an algorithm that can be applied to any type of rhythmic oscillatory activity of interest for accurate online detection.
In this work we propose modeling the alpha band EEG time series using discounted autoregressive (DAR) modeling. The DAR model uses a discounting rate to weigh points measured further in the past less heavily than points more recently observed. This model is used together with predictive loss scoring to identify periods of EEG data that are statistically significant.
Our algorithm accurately captures changes in the statistical properties of the alpha frequency band. These statistical changes are highly correlated with alpha spindle occurrences and form a reliable measure for detecting alpha spindles in EEG. We achieve approximately 95% accuracy in detecting alpha spindles, with timing precision to within approximately 150 ms, for two datasets from an experiment of prolonged simulated driving, as well as in simulated EEG. Sensitivity and specificity values are above 0.9, and in many cases are above 0.95, for our analysis.
Modeling the alpha band EEG using discounted AR models provides an efficient method for detecting oscillatory alpha activity in EEG. The method is based on statistical principles and can generally be applied to detect rhythmic activity in any frequency band or brain region.
Alpha waves ([8, 13] Hz) were among the earliest described functional oscillatory components in the human EEG , and research has supported the general notion that alpha band power is inversely related to brain activation [2–4] and reflects deactivated or inhibited cortical processes . Several different alpha rhythms have been identified (e.g., mu, sigma, tau, occipital) across various brain regions. In particular, rhythmic alpha activity has been observed to increase in posterior brain regions (parietal-occipital) during attentional lapses [6, 7] and during states of drowsiness relative to states of alertness [8–11].
A widely-studied characteristic of the alpha frequency band is the alpha spindle, a large narrowband burst of alpha activity that usually occurs over short (0.5-2 s) duration [8, 11, 12]. Alpha spindle spectral microstructures have peak amplitudes generally in the higher alpha frequencies ([10, 13] Hz) and have characteristics that resemble the “waxing and waning” of the alpha rhythm [12, 13]. Alpha spindles also have been shown to occur primarily in the parietal/occipital regions of the brain and have been correlated with fatigue, drowsiness and reduced driving performance in experiments of prolonged driving [11, 12]. Several methods have been developed to characterize alpha power changes associated with various tasks. These methods include fast Fourier transforms (FFT) [7, 10, 11], wavelets , phase , matching pursuit [16, 17], ERD/ERS , autoregressive modeling [6, 18, 19], adaptive filtering , neural network analysis , fuzzy systems [22, 23], and nonlinear EEG analyses [7, 24]. Other characterizations of alpha activity are the alpha band power of the signal  and power ratios such as the (alpha + theta)/beta ratio.
Recently, Simon et al.  developed an algorithm specifically for detecting alpha spindles in EEG by calculating the full width at half maximum (FWHM) of the amplitude spectral density of the alpha frequency range. Features of alpha spindles, such as alpha spindle rate and alpha spindle duration, were extracted from the EEG and correlated with fatigue and drowsiness in both real  and simulated  driving tasks. However, Simon et al. only validated algorithm accuracy in capturing high frequency narrowband alpha on simulated data and not real EEG. Techniques based on matching pursuit (MP) have also been used by Schönwald et al. to identify alpha spindles in sleep by using a dictionary of Gabor, Fourier and Dirac delta functions . They report sensitivity and specificity values of approximately 0.812 in detecting alpha sleep spindles across all stages of sleep. By changing some of the parameters in their model, they were able to increase the sensitivity and specificity to about 0.9; however, this change increased computational time exponentially.
The goal of this work is to develop an efficient algorithm to reliably detect sudden increases of narrowband oscillatory EEG activity with good temporal resolution. Our motivating example for the development of this algorithm is detecting alpha spindles in EEG. We hypothesize that alpha spindles in EEG represent a changes in the underlying neural dynamics that can be characterized and identified by their statistical properties. To detect these changes, we propose a method that is based on change-point detection, a class of time series models designed to discover changes in the underlying statistical properties of time series data [26–30]. These models aim to detect statistical irregularities in data (called change points) with high temporal resolution. Change point detection models have been used previously in many applications such as detecting denial of service (DoS) attacks by monitoring packet activity in computer networks [27, 31] and monitoring stock prices .
Since EEG signals are highly dynamic, we develop a change-point detection algorithm based on discounted autoregressive (DAR) models to represent the EEG time series adaptively in time. DAR models weigh points observed further in the past less heavily than points more recently observed, so they can adapt to the non-stationary nature of EEG. Features that are extracted from a DAR model are time dependent and can be used for analysis of transient EEG. The DAR model can also be updated sequentially in time using an algorithm called the sequential discounted autoregressive (SDAR) algorithm[27, 28]. This allows for the analysis and monitoring of EEG signals in near real-time, making it a computationally efficient method for EEG analysis.
We apply our SDAR algorithm together with predictive loss scoring to identify time periods in the alpha frequency range of EEG where the time-dependent DAR model cannot adequately describe future data points (periods of high loss scores) and correlate these time periods with alpha spindles in EEG. The model parameters, such as the time-varying AR model coefficients and the model variance, can be used for further analysis of these time periods. We demonstrate the efficacy of this approach both for simulated data as well as for expert-labeled EEG data from simulated driving tasks.
Autoregressive models (AR) represent each data point as a linear combination of a certain number (the model order) of previous data points. Discounted AR models assume that data points observed further in the past contribute less information than points more recently observed. An algorithm for the sequential computation of the DAR model parameters was proposed in . We adapt this algorithm for online detection of rhythmic oscillatory activity in EEG, with applications to detecting alpha spindles.
where is the estimate of the AR model parameters A = (A 1,…,A p )T and . The variance σ 2 can be estimated in a similar fashion .
where r ε(0,1) is the discounting rate, and is the discounted maximum likelihood estimate of the DAR model parameters. This model implies that time points observed earlier in the sequence have less influence on the overall likelihood than points observed more recently by a factor of (1-r) t-i . This model has been used previously in the analysis of time series signals in information network security [27, 28].
Using the DAR model, Urabe et al.  have derived an algorithm for the sequential optimization of the DAR model parameters, called the sequential discounted autoregressive (SDAR) algorithm. The SDAR algorithm sequentially estimates the DAR model parameters given a new data point in the time series at time t. The DAR model parameters have a subscript t to emphasize that they are now time-dependent. We use this algorithm to obtain the coefficients of the DAR model, which are used to calculate the mean and variance of the Gaussian distribution that describes the data at each time point. We then calculate a predictive loss function that measures how well the data is modeled by past data points.
Repeat Steps 2–4 until the end of the time series at t = n.
In Step 1 we initialize the parameters at time t = p, where p is the model order. Here, I p denotes a p × p identity matrix. We initialize the parameters M t and by the Burg estimates of the AR parameters and the noise variance, and , respectively, from an available training dataset, as these values produce stable algorithm performance. Usually we use an initial portion of the time series as training and estimate these values using a normal AR model.
In Step 2 we update the DAR parameters A t given the new data point x t . Step 3 updates the mean and variance of the Gaussian distribution using the newly estimated DAR model parameters. Step 4 calculates a quadratic loss score by comparing the data point to the mean of the Gaussian distribution at the current time. Steps 2–4 are repeated until the end of the recording at t = n. We then apply a temporal smoothing function to the quadratic loss score time series to reduce the impact of isolated outliers . We smooth the loss score time series using the mean of five (5) data points.
We calculate the True Positive (TP), False Positive (FP), and False Negative (FN) rates by comparing the alpha spindle time regions identified by an expert with those marked by the SDAR algorithm as having a smoothed loss score that exceeds the specified threshold. We use the weighted F-measure to take into account the highly unbalanced nature of the data, as alpha spindles occurred less than 1% of the total time. β = 1 reduces to the standard F-measure, which weights precision and recall equally. β = 0.5 emphasizes precision more than recall, and β = 2 emphasizes recall more than precision. We use β = 2 for this analysis to identify the loss score threshold value that produces a minimal number of false negatives. We split our data into two continuous equal halves; one half was used for finding the optimal threshold parameters, while the second half was used for validation. Note that the number of alpha spindles in each half may not be the same.
We report the hit rate (HR) of the algorithm, which is the number of times the algorithm identified a region that was also identified by the expert, regardless of the timing precision in the detected regions. We also report the Spindle Temporal Error (STE) as the ratio of the total time in the False Negative state to the total number of alpha spindles. This gives a summary of the temporal localization of the detected spindles.
If more than one EEG channel is modeled, the SDAR algorithm uses a voting strategy that only selects a time region if a certain percentage of the overall number of channels identified the same time region. This strategy reduces the impact of isolated outliers that may exist only in one EEG channel. We use a voting threshold of 33% (1/3) for analyzing all the parietal/occipital EEG channels. More or less stringent strategies can be used by changing the voting percentage required for identifying significant time regions. Detected alpha spindle regions separated by less than 250 ms were merged together to form a single alpha spindle. Since the literature suggests that alpha spindle duration is generally between 500 ms and 2 s, isolated alpha spindle regions shorter than 250 ms were removed from the data . These post-processing techniques mimic post-processing procedures used by other authors [11, 12].
In both simulations, the DAR model order was set to 2, and the discounting rate was set to r = 0.01.
To analyze the performance of the algorithm, we changed the signal-to-noise ratio (SNR) by changing the amplitude factor of the alpha spindle dipoles. For example, an amplitude factor of 6 means the SNR is 2 (6/3). We simulated the alpha spindles at SNR ratios of 1, 1.3, 1.6, 2 and 3 (corresponding to amplitude factors of 3, 4, 5, 6 and 9, respectively). ROC and F-measure analyses were performed for each SNR value.
The simulated EEG data was sampled at 300 Hz. The data was subsequently down-sampled to 128 Hz and band-pass filtered at [6, 15] Hz using an order 8 Butterworth filter prior to analysis. An EEG electrode mosaic with 33 channels was used for simulating the data, with a channel orientation following the international 10–10 system of electrode placement. We applied the SDAR algorithm with model order 1 and discounting rate r = 0.01 to the simulation output of all parietal/occipital channels, setting the voting percentage to be 33% and the fuzzy window parameter to be 0 s for comparison purposes.
The EEG was recorded using a 64-channel Biosemi ActiveTwo system, and offline referenced to the average of the two mastoids. Four external channels were used to record eye movements by EOG, although EOG data was not analyzed in this study. The experiment was originally sampled at 2048 Hz and then subsequently down-sampled to 128 Hz. Figure 3B shows a sample of the resulting signal during a period of visually identifiable alpha activity.
An expert with more than 10 years of EEG processing experience visually identified and marked alpha spindle events in the EEG data. Recent literature has suggested that alpha spindling occurring in the parietal and occipital regions of the brain is related to fatigue in experiments of prolonged driving [11, 38]. Therefore, the expert used EEG data from the parietal and occipital channels for identifying alpha spindle events. The expert used three different data operations to assist in the visual identification of alpha spindles: the EEG data band-passed from [1, 50] Hz, the data band-passed from [1, 15] Hz, and the Independent Component Analysis (ICA) [39–41] decomposition of [1, 50] Hz band-passed EEG data. The expert used ICA to isolate and remove eye blink and movement components if the eye activity prevented an accurate time identification of the alpha spindle events. All filters were order 8 IIR Butterworth filters from ERPLAB . We used the expert’s labeled events as the ground truth to evaluate the performance of the SDAR algorithm. The expert manually marked alpha spindles in the EEG using MATLAB (The Mathworks Inc., Natick, Massachusetts) tools and functions obtained from the DETECT Toolbox .
The data was processed in EEGLAB  using a band-pass filter of [6, 15] Hz with an order 8 IIR filter from ERPLAB  before applying the SDAR algorithm. No ICA preprocessing was performed prior to this analysis. The datasets were analyzed in two passes. In the first pass, we analyzed the full data and report the detection performance and accuracy using all available information. In the second pass we partitioned the full data into two continuous equal halves for training and testing purposes. The detection performance and accuracy were reported for both the training and testing data. Results from the testing data were obtained using the optimal threshold value estimated from the training data. The receiver operating characteristic (ROC) curves and analysis of classification performance using the F-measure were performed only on the full and training data.
We then analyzed the two datasets obtained from the simulated driving experiment (see Methods). These datasets are referred to as Driving Data 1 and Driving Data 2, respectively. We use an order 1 SDAR model with discounting rate r = 0.01 and 33% (1/3) voting percentage for the duration of the paper. The results of using an order 2 SDAR model were similar, and are included in the Additional file 1 included with this manuscript. Our first analysis compares the expert labeled data to the SDAR algorithm’s labeled data using a fuzzy window parameter of 0 s. Detected spindles of less than 250 ms were removed prior to analysis, while spindles separated by less than 250 ms were merged to form one spindle. An online implementation of this algorithm would have a delay of at least 250 ms (to verify that the activity is a spindle), plus any additional computational, data acquisition, and processing overhead. The processing of each channel can be done in parallel, reducing the computational burden.
Classification performance of the algorithm versus the ground truth events for Driving Data 1 for three data conditions: the full data without partitioning, the training data and the testing data, respectively
Spindle Temporal Error
Note that a fuzzy window parameter of 0 s indicates that very minor differences in labeled regions will count negatively against the performance of the algorithm. It is unrealistic in practice to assume an exact temporal agreement between an expert and the algorithm, or even among two different experts. Incorporating an allowable timing error in the comparison can produce a more appropriate comparison.
Classification performance of the algorithm versus the ground truth events for Driving Data 1 and Driving Data 2 for three data conditions: the full data without partitioning, the training data and the testing data, respectively
Spindle Temporal Error
We compare the ASD and SDAR approaches for Driving Data 1. In order to compare SDAR with ASD, we applied the ASD algorithm as follows. For the ASD algorithm we processed the data according to the procedure used in  (128 Hz sampling rate and [0.5, 48] Hz band-pass filter, ICA to minimize muscle and eye artifacts). For comparison purposes with the SDAR algorithm, we also applied the ASD algorithm without ICA artifact removal. For the SDAR algorithm we processed the data only using a [6, 15] Hz band-pass filter. We chose only one channel, PO7, in our analysis as this produced the highest detection accuracy for single-channel detection with the ASD algorithm. We used a fuzzy window of 0.1 s.
Comparison between the SDAR algorithm fitting and the ASD algorithm for channel PO7 of Driving Data 1
ASD algorithm without ICA
ASD algorithm with ICA
Spindle Temporal Error
In this paper we propose an efficient method for detecting large narrowband increases in oscillatory EEG activity using change point detection methods based on discounted autoregressive models. This technique was applied to the alpha frequency range where the goal of the method was to detect alpha spindling activity and to estimate features of the alpha spindling such as the spindle rate and temporal localization. Our results show that this approach successfully identifies alpha spindles in EEG time series with good time resolution, allowing for the possibility of using characteristics such as alpha spindle frequency and duration as features for other types of modeling approaches, including state classification, fatigue monitoring, and performance prediction.
Early work using change point detection models for EEG data analysis was done by Brodsky et al. . These authors use nonparametric modeling techniques for the analysis of alpha activity with eyes closed versus eyes open. In contrast to their work, our work uses parametric modeling of the EEG using the DAR model to detect oscillatory activity. As the EEG is naturally time-dependent, methods based on autoregressive models are appropriate representations of the dynamics of EEG. Our use of the DAR model is especially attractive in that the model can adapt to non-stationary time series, a feature that is often present in EEG. Our approach is also computationally efficient, only requiring a few matrix operations at each iteration of the algorithm, making it an attractive analysis technique for large EEG datasets.
There are some differences in processing between our SDAR method and the ASD method proposed by Simon et al.  that deserve mention. First, our method band-passes the data to [6, 15] Hz, while the ASD method uses a [0.5, 48] Hz band-pass. The band-pass from [6, 15] Hz effectively removes muscle in the higher frequency bands, while limiting the impact of eye-related artifacts in the lower frequency bands. This reduces the need for extensive artifact removal post-processing with ICA  and auto-regressive techniques  as used in . Also, the effects of eye-related movements in EEG are minimal, since we are only analyzing parietal and occipital channels. Finally, our approach is not based on frequency characteristics of the signal other than the initial [6, 15] Hz band-pass. Because of this, we are not limited by the time-resolution of FFT-based methods in short time windows.
The SDAR approach sequentially calculates a quadratic loss score at each time point and uses this score function to identify irregular periods in the data. We obtain an effective time resolution equal to the window size of our temporal smoothing function. One possible disadvantage of our approach is that it requires a priori knowledge of the frequency range of interest before analysis. This may not be much of a concern for the analysis of EEG, since researchers often use well-understood, predetermined bands for analysis.
One important parameter of the SDAR model is the discounting rate, r. This parameter controls the level of contribution past data points have in the current model. If the discounting rate is large (> .05) the model adapts very quickly to the new dynamics of the data, making detecting unusual behavior difficult. It also is less robust to minor variations in the signals which are not statistically significant. Therefore we suggest using slower learning rates to prevent the model from learning too much of the alpha spindle behavior while still adapting to slowly changing behavior. We found that discounting rates between 0.01 and 0.001 perform well in isolating alpha spindles of data sampled at 128 Hz. These learning rates would need to be adjusted for different sampling frequencies: the rates should be decreased when the sampling rate increases and the rates should be increased when the sampling rate decreases.
Previous research in autoregressive modeling of EEG data has shown that large model orders are needed to estimate the underlying dynamics of the EEG signal. For example,  used AR models of order 10 to analyze the spectral contents in short EEG signals, while  used AR models of order 6 for distinguishing among several different mental tasks. In contrast, our work has shown that using low model orders are sufficient to identify statistically irregular EEG data segments. A possible explanation for this is the fact that we narrowly band-pass the EEG signal to the alpha frequency range prior to analysis. This band-pass reduces the frequency content of the signal. Because of this, only a few model orders may be needed to capture the oscillatory dynamics of this narrow band signal and discounting captures model variation. For EEG data filtered at a wider band, more orders are likely needed. Another possible explanation is that we are primarily focused on when EEG data is statistically irregular and not necessarily why it is statistically irregular. A low order representation of the EEG may be sufficient purely for identifying when sections of data are statistically irregular.
Several techniques have been proposed for detecting oscillatory activity in higher frequency bands. For example, a technique based on FFT analysis in the [80, 500] Hz frequency range was proposed in . Another study of higher frequency data  proposed a technique for oscillatory event detection based on amplitude and duration thresholding of a short-time line length function. This technique is similar to our approach in that the short-time line length function bears a large degree of similarity to the order 1 autoregressive structure used in this work. Another technique based on an adaptive Hilbert Transform has been proposed for oscillatory EEG analysis in neonates . An adaptive Hilbert Transform could be used to capture a time-dependent amplitude envelope which could be thresholded to find alpha spindles. In contrast to these previous works, our work is based on an adaptive statistical representation of the alpha-band EEG.
The SDAR algorithm is based on an adaptive statistical representation of the EEG time series and is not limited to alpha spindle detection in EEG. Results of our simulation study show that alpha spindles can be detected reasonably well if the spindle signal strength is at least 50% stronger than then the background noise signal (Figure 6). This suggests that this algorithm can be potentially used to detect oscillatory activity in other EEG frequency bands if the expected SNR of the activity is at least 1.5. For example, Craig et al.  analyzed EEG oscillatory activity in a simulated driving paradigm and showed that as the subject tired, oscillatory activity in both theta and alpha bands increased over the entire cortex, while activity in the delta band showed no significant changes. Fast wave activity also showed a significant increase primarily in frontal areas.
Other types of oscillatory activity could be modeled using our SDAR approach. For example, oscillatory activity in the theta range was analyzed by Cruikshank et al.  using wavelet-based techniques to analyze cortical activity underlying sensorimotor integration in humans. High frequency oscillatory activity is also a major area of interest in epilepsy research . As new experiments are conducted at increasing sampling rates, oscillatory activity in several different EEG bands can be analyzed to find spatiotemporal patterns across a range of experimental paradigms . Our technique based on SDAR modeling of the EEG can be applied to find spatiotemporal patterns of changes that may be indicative of mental state changes. This will be investigated in future research.
Our method requires an initial band-pass of the EEG data in a relevant frequency range of interest prior to analysis. For our analysis we band-passed the data at [6, 15] Hz for detecting alpha spindle oscillations in EEG. Since eye movement artifacts in EEG are generally in frequency ranges smaller than 6 Hz and artifacts from muscle movements are generally greater than 15 Hz, no additional artifact preprocessing is required. However, analysis of other brain regions may require additional preprocessing to remove artifacts prior to analysis. This is especially true for [3, 7] Hz theta oscillations, as eye movement artifacts will be more prevalent and pervasive when compared to the analysis of the alpha band. Analysis of the gamma frequency range will require removal of high frequency muscle activity as well as the removal of power-line noise (either at 50 or 60 Hz) prior to modeling by the SDAR algorithm.
Our primary goal of this work was the development of an algorithm for accurately identifying alpha spindles in EEG. We focused primarily on the parietal-occipital EEG channels as alpha spindles generally occur in these regions. However, this approach could be applied to analyze all the EEG channels simultaneously. In this way, the changes in the EEG could be correlated across brain regions, thus revealing additional features that can be useful for analysis. This is currently the topic of future research.
Recent advances in sensor technologies have enabled the non-invasive recording of neural activity in a variety of scenarios [38, 54] with some technologies aimed at improving performance in healthy individuals . Efficient methods for detecting and identifying changes in EEG oscillatory activity may have practical applications in BCI contexts. For example, our work in detecting alpha rhythm changes (alpha bursts/spindles) may facilitate early detection of fatigue onset before lapses or microsleeps occur [7, 9–12, 38] and may be useful for preventing potentially dangerous situations such as attention lapses or microsleeps during tasks that require sustained levels of vigilance. Parameters that may be used for analysis of rhythmic alpha activity include frequency range, duration, rate, topography and peak frequency [11, 12, 56]. Another area where our approach might be useful is for real-time online monitoring of sleep. Rechtschaffen and Kales  published guidelines for manually scoring stages of sleep, including criteria for scoring alpha spindles in the form of k-complexes observed during stages I-II of sleep. Automated methods for sleep stage scoring based on the R&K gold standard visual scoring of EEG recordings have been published; however, none of these methods have reached the level of acceptance to the extent of R&K . An automated system that is robust and validated is still lacking today.
In this work we showed that discounted autoregressive models can be used to model the alpha band EEG time series for detecting alpha spindle events in EEG. Our method is based on statistical principles and can generally be applied to detect rhythmic activity in any frequency band or brain region. As the algorithm is based on a time-adaptive statistical representation of the signal, it can account for slowly non-stationary behavior, making it an attractive model for EEG data analysis.
The authors wish to thank W. David Hairston and Kaleb McDowell, both with the Army Research Laboratory, for helpful discussions, and DCS Corporation for the support and development of the experimental design used in this study. This research was sponsored by the Army Research Laboratory and was accomplished under Cooperative Agreement Number W911NF-10-2-0022. The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of the Army Research Laboratory or the U.S. Government. The U.S. Government is authorized to reproduce and distribute reprints for Government purposes notwithstanding any copyright notation herein.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.