Trialwise Manual Training Loop¶
In this example, we will use a convolutional neural network on the PhysioBank EEG Motor Movement/Imagery Dataset to decode two classes:
- Executed and imagined opening and closing of both hands
- Executed and imagined opening and closing of both feet
Enable logging¶
[2]:
import logging
import importlib
importlib.reload(logging) # see https://stackoverflow.com/a/21475297/1469195
log = logging.getLogger()
log.setLevel('INFO')
import sys
logging.basicConfig(format='%(asctime)s %(levelname)s : %(message)s',
level=logging.INFO, stream=sys.stdout)
Load data¶
You can load and preprocess your EEG dataset in any way; Braindecode only expects a 3D array of input signals X with shape (trials, channels, timesteps) and a vector of labels y (see below). In this tutorial, we will use the MNE library to load an EEG motor imagery/motor execution dataset. For a tutorial from MNE using Common Spatial Patterns to decode this data, see here. For another library useful for loading EEG data, take a look at Neo IO.
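Concretely, the expected format looks like the following minimal sketch (the shapes of 20 trials, 64 channels and 500 timesteps are invented purely for illustration; if your data is already a NumPy array like this, you do not need MNE at all):
[ ]:
import numpy as np
# Hypothetical data in the layout Braindecode expects:
# X: (trials, channels, timesteps) as float32, y: one int64 label per trial
n_trials, n_channels, n_timesteps = 20, 64, 500  # illustrative values only
X = np.zeros((n_trials, n_channels, n_timesteps), dtype=np.float32)
y = np.zeros(n_trials, dtype=np.int64)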
[3]:
import mne
from mne.io import concatenate_raws
# 5,6,9,10,13,14 are codes for executed and imagined hands/feet
subject_id = 22 # carefully cherry-picked to give nice results on such limited data :)
event_codes = [5, 6, 9, 10, 13, 14]
# This will download the files if you don't have them yet,
# and then return the paths to the files.
physionet_paths = mne.datasets.eegbci.load_data(subject_id, event_codes)
# Load each of the files
parts = [mne.io.read_raw_edf(path, preload=True, stim_channel='auto', verbose='WARNING')
         for path in physionet_paths]
# Concatenate them
raw = concatenate_raws(parts)
# Find the events in this dataset
events, _ = mne.events_from_annotations(raw)
# Use only EEG channels
eeg_channel_inds = mne.pick_types(raw.info, meg=False, eeg=True, stim=False, eog=False,
exclude='bads')
# Extract trials, only using EEG channels
epoched = mne.Epochs(raw, events, dict(hands=2, feet=3), tmin=1, tmax=4.1,
                     proj=False, picks=eeg_channel_inds, baseline=None, preload=True)
Convert data to Braindecode format¶
Braindecode has a minimalistic SignalAndTarget class, with attributes X for the signal and y for the labels. X should have the dimensions trials x channels x timesteps; y should have one label per trial.
[4]:
import numpy as np
# Convert data from volt to microvolt
# Pytorch expects float32 for input and int64 for labels.
X = (epoched.get_data() * 1e6).astype(np.float32)
y = (epoched.events[:,2] - 2).astype(np.int64) #2,3 -> 0,1
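As a quick sanity check before continuing, it can help to print the shapes and dtypes. Given the 40/30/20 split used below, you should see 90 trials in total with 64 EEG channels (the exact timestep count depends on the epoching window):
[ ]:
# Sanity check: expect (n_trials, n_channels, n_timesteps) float32 inputs
# and one int64 label per trial
print(X.shape, X.dtype)  # here: 90 trials x 64 EEG channels x n_timesteps
print(y.shape, y.dtype)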
We use the first 40 trials for training and the next 30 trials for validation. The validation accuracies can be used to tune hyperparameters such as the learning rate. The final 20 trials are kept apart as a hold-out evaluation set that is not part of any hyperparameter optimization. As mentioned before, this dataset is far too small to yield meaningful results and is only used here for quick demonstration purposes.
[5]:
from braindecode.datautil.signal_target import SignalAndTarget
train_set = SignalAndTarget(X[:40], y=y[:40])
valid_set = SignalAndTarget(X[40:70], y=y[40:70])
Create the model¶
Braindecode comes with some predefined convolutional neural network architectures for raw time-domain EEG. Here, we use the shallow ConvNet model from Deep learning with convolutional neural networks for EEG decoding and visualization.
[6]:
from braindecode.models.shallow_fbcsp import ShallowFBCSPNet
from torch import nn
from braindecode.torch_ext.util import set_random_seeds
# Set if you want to use GPU
# You can also use torch.cuda.is_available() to determine if cuda is available on your machine.
cuda = False
set_random_seeds(seed=20170629, cuda=cuda)
n_classes = 2
in_chans = train_set.X.shape[1]
# final_conv_length = auto ensures we only get a single output in the time dimension
model = ShallowFBCSPNet(in_chans=in_chans, n_classes=n_classes,
input_time_length=train_set.X.shape[2],
final_conv_length='auto').create_network()
if cuda:
model.cuda()
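The created network is a plain PyTorch module, so standard PyTorch tooling applies. As a small aside (nothing here is Braindecode-specific), you can print the architecture and count its trainable parameters:
[ ]:
# Inspect the layers and count trainable parameters (plain PyTorch)
print(model)
n_params = sum(p.numel() for p in model.parameters() if p.requires_grad)
print("Trainable parameters: {:d}".format(n_params))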
We use AdamW to optimize the parameters of our network together with Cosine Annealing of the learning rate. We supply some default parameters that we have found to work well for motor decoding, however we strongly encourage you to perform your own hyperparameter optimization using cross validation on your training data.
[7]:
from braindecode.torch_ext.optimizers import AdamW
from braindecode.torch_ext.schedulers import ScheduledOptimizer, CosineAnnealing
from braindecode.datautil.iterators import get_balanced_batches
from numpy.random import RandomState
rng = RandomState((2018,8,7))
#optimizer = AdamW(model.parameters(), lr=1*0.01, weight_decay=0.5*0.001) # these are good values for the deep model
optimizer = AdamW(model.parameters(), lr=0.0625 * 0.01, weight_decay=0)
# Need to determine number of batch passes per epoch for cosine annealing
n_epochs = 30
n_updates_per_epoch = len(list(get_balanced_batches(len(train_set.X), rng, shuffle=True,
batch_size=30)))
scheduler = CosineAnnealing(n_epochs * n_updates_per_epoch)
# schedule_weight_decay must be True for AdamW
optimizer = ScheduledOptimizer(scheduler, optimizer, schedule_weight_decay=True)
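Cosine annealing smoothly decays the learning rate (and, with schedule_weight_decay=True, also the weight decay) towards zero over the scheduled number of updates. Assuming the standard cosine schedule lr_t = lr_0 * 0.5 * (1 + cos(pi * t / T)), here is a short sketch of the resulting curve; this is illustrative only, not Braindecode API code:
[ ]:
import numpy as np
# Assumed standard cosine annealing curve over all scheduled updates
T = n_epochs * n_updates_per_epoch
t = np.arange(T + 1)
lr_curve = 0.0625 * 0.01 * 0.5 * (1 + np.cos(np.pi * t / T))
print(lr_curve[0], lr_curve[T // 2], lr_curve[-1])  # start, halfway, end (zero)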
Training loop¶
This is a conventional mini-batch stochastic gradient descent training loop:
- Get randomly shuffled batches of trials
- Compute outputs, loss and gradients on the batches of trials
- Update your model
- After iterating through all batches of your dataset, report some statistics like mean accuracy and mean loss.
[8]:
from braindecode.torch_ext.util import np_to_var, var_to_np
import torch.nn.functional as F
for i_epoch in range(n_epochs):
    i_trials_in_batch = get_balanced_batches(len(train_set.X), rng, shuffle=True,
                                             batch_size=30)
    # Set model to training mode
    model.train()
    for i_trials in i_trials_in_batch:
        # Have to add an empty fourth dimension to X
        batch_X = train_set.X[i_trials][:, :, :, None]
        batch_y = train_set.y[i_trials]
        net_in = np_to_var(batch_X)
        if cuda:
            net_in = net_in.cuda()
        net_target = np_to_var(batch_y)
        if cuda:
            net_target = net_target.cuda()
        # Remove gradients of last backward pass from all parameters
        optimizer.zero_grad()
        # Compute outputs of the network
        outputs = model(net_in)
        # Compute the loss
        loss = F.nll_loss(outputs, net_target)
        # Do the backpropagation
        loss.backward()
        # Update parameters with the optimizer
        optimizer.step()

    # Print some statistics each epoch
    model.eval()
    print("Epoch {:d}".format(i_epoch))
    for setname, dataset in (('Train', train_set), ('Valid', valid_set)):
        # Here, we use the entire dataset at once, which is still possible
        # for such a small dataset. Otherwise we would have to use batches.
        net_in = np_to_var(dataset.X[:, :, :, None])
        if cuda:
            net_in = net_in.cuda()
        net_target = np_to_var(dataset.y)
        if cuda:
            net_target = net_target.cuda()
        outputs = model(net_in)
        loss = F.nll_loss(outputs, net_target)
        print("{:6s} Loss: {:.5f}".format(
            setname, float(var_to_np(loss))))
        predicted_labels = np.argmax(var_to_np(outputs), axis=1)
        accuracy = np.mean(dataset.y == predicted_labels)
        print("{:6s} Accuracy: {:.1f}%".format(
            setname, accuracy * 100))
Epoch 0
Train Loss: 0.73690
Train Accuracy: 62.5%
Valid Loss: 1.13041
Valid Accuracy: 53.3%
Epoch 1
Train Loss: 1.17932
Train Accuracy: 57.5%
Valid Loss: 1.35246
Valid Accuracy: 50.0%
Epoch 2
Train Loss: 0.81899
Train Accuracy: 65.0%
Valid Loss: 0.96027
Valid Accuracy: 56.7%
Epoch 3
Train Loss: 0.53547
Train Accuracy: 75.0%
Valid Loss: 0.77725
Valid Accuracy: 66.7%
Epoch 4
Train Loss: 0.30195
Train Accuracy: 85.0%
Valid Loss: 0.60807
Valid Accuracy: 73.3%
Epoch 5
Train Loss: 0.18695
Train Accuracy: 90.0%
Valid Loss: 0.54184
Valid Accuracy: 76.7%
Epoch 6
Train Loss: 0.13377
Train Accuracy: 95.0%
Valid Loss: 0.50131
Valid Accuracy: 80.0%
Epoch 7
Train Loss: 0.11521
Train Accuracy: 95.0%
Valid Loss: 0.47909
Valid Accuracy: 80.0%
Epoch 8
Train Loss: 0.09841
Train Accuracy: 97.5%
Valid Loss: 0.47807
Valid Accuracy: 80.0%
Epoch 9
Train Loss: 0.08487
Train Accuracy: 97.5%
Valid Loss: 0.47951
Valid Accuracy: 80.0%
Epoch 10
Train Loss: 0.07319
Train Accuracy: 97.5%
Valid Loss: 0.48485
Valid Accuracy: 80.0%
Epoch 11
Train Loss: 0.06363
Train Accuracy: 100.0%
Valid Loss: 0.49065
Valid Accuracy: 80.0%
Epoch 12
Train Loss: 0.05509
Train Accuracy: 100.0%
Valid Loss: 0.49882
Valid Accuracy: 76.7%
Epoch 13
Train Loss: 0.04861
Train Accuracy: 100.0%
Valid Loss: 0.50564
Valid Accuracy: 80.0%
Epoch 14
Train Loss: 0.04340
Train Accuracy: 100.0%
Valid Loss: 0.51243
Valid Accuracy: 80.0%
Epoch 15
Train Loss: 0.03985
Train Accuracy: 100.0%
Valid Loss: 0.51889
Valid Accuracy: 83.3%
Epoch 16
Train Loss: 0.03644
Train Accuracy: 100.0%
Valid Loss: 0.52753
Valid Accuracy: 83.3%
Epoch 17
Train Loss: 0.03371
Train Accuracy: 100.0%
Valid Loss: 0.53611
Valid Accuracy: 83.3%
Epoch 18
Train Loss: 0.03111
Train Accuracy: 100.0%
Valid Loss: 0.54486
Valid Accuracy: 83.3%
Epoch 19
Train Loss: 0.02891
Train Accuracy: 100.0%
Valid Loss: 0.55211
Valid Accuracy: 83.3%
Epoch 20
Train Loss: 0.02694
Train Accuracy: 100.0%
Valid Loss: 0.55715
Valid Accuracy: 83.3%
Epoch 21
Train Loss: 0.02522
Train Accuracy: 100.0%
Valid Loss: 0.56237
Valid Accuracy: 80.0%
Epoch 22
Train Loss: 0.02372
Train Accuracy: 100.0%
Valid Loss: 0.56693
Valid Accuracy: 80.0%
Epoch 23
Train Loss: 0.02254
Train Accuracy: 100.0%
Valid Loss: 0.57061
Valid Accuracy: 80.0%
Epoch 24
Train Loss: 0.02160
Train Accuracy: 100.0%
Valid Loss: 0.57319
Valid Accuracy: 80.0%
Epoch 25
Train Loss: 0.02083
Train Accuracy: 100.0%
Valid Loss: 0.57516
Valid Accuracy: 80.0%
Epoch 26
Train Loss: 0.02021
Train Accuracy: 100.0%
Valid Loss: 0.57674
Valid Accuracy: 80.0%
Epoch 27
Train Loss: 0.01972
Train Accuracy: 100.0%
Valid Loss: 0.57793
Valid Accuracy: 80.0%
Epoch 28
Train Loss: 0.01934
Train Accuracy: 100.0%
Valid Loss: 0.57883
Valid Accuracy: 80.0%
Epoch 29
Train Loss: 0.01903
Train Accuracy: 100.0%
Valid Loss: 0.57959
Valid Accuracy: 80.0%
Eventually, we arrive at 80.0% accuracy, so 24 of the 30 validation trials are correctly predicted. In the Cropped Decoding Tutorial, you can learn how to do the same decoding using cropped decoding.
Evaluation¶
Once we have made all our hyperparameter and architectural choices, we can evaluate the accuracy to report in our publication by evaluating on the test set:
[9]:
test_set = SignalAndTarget(X[70:], y=y[70:])

model.eval()
# Here, we use the entire dataset at once, which is still possible
# for such a small dataset. Otherwise we would have to use batches.
net_in = np_to_var(test_set.X[:, :, :, None])
if cuda:
    net_in = net_in.cuda()
net_target = np_to_var(test_set.y)
if cuda:
    net_target = net_target.cuda()
outputs = model(net_in)
loss = F.nll_loss(outputs, net_target)
print("Test Loss: {:.5f}".format(float(var_to_np(loss))))
predicted_labels = np.argmax(var_to_np(outputs), axis=1)
accuracy = np.mean(test_set.y == predicted_labels)
print("Test Accuracy: {:.1f}%".format(accuracy * 100))
Test Loss: 0.31152
Test Accuracy: 80.0%
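Beyond overall accuracy, a confusion matrix shows which class the errors come from. Here is a minimal sketch using only NumPy and the predictions computed above:
[ ]:
# Confusion matrix: rows are true classes (0=hands, 1=feet), columns predictions
conf_mat = np.zeros((n_classes, n_classes), dtype=int)
for true_label, pred_label in zip(test_set.y, predicted_labels):
    conf_mat[true_label, pred_label] += 1
print(conf_mat)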
As an optional extension, the same loading and preprocessing can be scaled up to many subjects. The following cell uses the first 50 subjects for training and subjects 51 to 55 for validation; the resulting train_set and valid_set can be used with the same model and training loop as above.
[ ]:
import mne
import numpy as np
from mne.io import concatenate_raws
from braindecode.datautil.signal_target import SignalAndTarget

# First 50 subjects as train
physionet_paths = [mne.datasets.eegbci.load_data(sub_id, [4, 8, 12]) for sub_id in range(1, 51)]
physionet_paths = np.concatenate(physionet_paths)
parts = [mne.io.read_raw_edf(path, preload=True, stim_channel='auto')
         for path in physionet_paths]
raw = concatenate_raws(parts)
picks = mne.pick_types(raw.info, meg=False, eeg=True, stim=False, eog=False,
                       exclude='bads')
events, _ = mne.events_from_annotations(raw)
# Extract trials from 1 s to 4.1 s after cue onset, only using EEG channels
epoched = mne.Epochs(raw, events, dict(hands=2, feet=3), tmin=1, tmax=4.1,
                     proj=False, picks=picks, baseline=None, preload=True)

# Subjects 51-55 as validation
physionet_paths_valid = [mne.datasets.eegbci.load_data(sub_id, [4, 8, 12]) for sub_id in range(51, 56)]
physionet_paths_valid = np.concatenate(physionet_paths_valid)
parts_valid = [mne.io.read_raw_edf(path, preload=True, stim_channel='auto')
               for path in physionet_paths_valid]
raw_valid = concatenate_raws(parts_valid)
picks_valid = mne.pick_types(raw_valid.info, meg=False, eeg=True, stim=False, eog=False,
                             exclude='bads')
events_valid, _ = mne.events_from_annotations(raw_valid)
epoched_valid = mne.Epochs(raw_valid, events_valid, dict(hands=2, feet=3), tmin=1, tmax=4.1,
                           proj=False, picks=picks_valid, baseline=None, preload=True)

# Convert from volt to microvolt; PyTorch expects float32 inputs, int64 labels
train_X = (epoched.get_data() * 1e6).astype(np.float32)
train_y = (epoched.events[:, 2] - 2).astype(np.int64)  # 2,3 -> 0,1
valid_X = (epoched_valid.get_data() * 1e6).astype(np.float32)
valid_y = (epoched_valid.events[:, 2] - 2).astype(np.int64)  # 2,3 -> 0,1
train_set = SignalAndTarget(train_X, y=train_y)
valid_set = SignalAndTarget(valid_X, y=valid_y)
Dataset references¶
This dataset was created and contributed to PhysioNet by the developers of the BCI2000 instrumentation system, which they used in making these recordings. The system is described in:
Schalk, G., McFarland, D.J., Hinterberger, T., Birbaumer, N., Wolpaw, J.R. (2004) BCI2000: A General-Purpose Brain-Computer Interface (BCI) System. IEEE TBME 51(6):1034-1043.
PhysioBank is a large and growing archive of well-characterized digital recordings of physiologic signals and related data for use by the biomedical research community, and is further described in:
Goldberger AL, Amaral LAN, Glass L, Hausdorff JM, Ivanov PCh, Mark RG, Mietus JE, Moody GB, Peng C-K, Stanley HE. (2000) PhysioBank, PhysioToolkit, and PhysioNet: Components of a New Research Resource for Complex Physiologic Signals. Circulation 101(23):e215-e220.