
fMRI data preprocessing


import nibabel
from nilearn import plotting
import matplotlib
import numpy as np
import warnings
warnings.filterwarnings('ignore')

anat = nibabel.load('/data/ds000114/derivatives/fmriprep/sub-01/anat/sub-01_t1w_preproc.nii.gz')
fmri = nibabel.load('/data/ds000114/sub-01/ses-test/func/sub-01_ses-test_task-fingerfootlips_bold.nii.gz')

Preprocessing

In this workflow we will conduct the following steps:

1. Coregistration of functional images to anatomical images (according to FSL's FEAT pipeline)

Co-registration is the process of spatially aligning two images. The target image is also called the reference volume. The goodness of alignment is evaluated with a cost function.

We have to move the fMRI series from the fMRI native space:

plotting.view_img(nibabel.nifti1.Nifti1Image(fmri.get_fdata()[:, :, :, 1], affine=fmri.affine),
                  bg_img=anat,
                  threshold=0.1e3,
                  cut_coords=(0, 0, 0),
                  title='Anat and fmri misalignment')

to the native anatomical space:

2. Motion correction of functional images with FSL's MCFLIRT

The images are aligned with a rigid transformation (rotations and translations). Then spatial interpolation is applied so that the volumes appear as if there had been no movement.

Rigid transformation

3. Slice Timing correction

The brain slices are not acquired at the same time. Therefore, interpolation is done between the nearest timepoints.

Slice order

Slice timing correction in Python

4. Smoothing of coregistered functional images with FWHM set to 5 and 10 mm (see the sketch below)

5. Artifact Detection in functional images (to detect outlier volumes)

So, let's start!

Imports

First, let's import all the modules we will need later.

from nilearn import plotting
%matplotlib inline

from os.path import join as opj
import os
import json
from nipype.interfaces.fsl import (BET, ExtractROI, FAST, FLIRT, ImageMaths,
                                   MCFLIRT, SliceTimer, Threshold)
from nipype.interfaces.spm import Smooth
from nipype.interfaces.utility import IdentityInterface
from nipype.interfaces.io import SelectFiles, DataSink
from nipype.algorithms.rapidart import ArtifactDetect
from nipype import Workflow, Node

Experiment parameters

It's always a good idea to specify all parameters that might change between experiments at the beginning of your script. We will use one functional image from the fingerfootlips task for each of ten subjects.

experiment_dir = '/output'
# output_dir = '/home/neuro/nipype_tutorial/notebooks'
output_dir = 'datasink'
working_dir = 'workingdir'

# list of subject identifiers
subject_list = ['01', '02', '03', '04', '05', '06', '07', '08', '09', '10']
# subject_list = ['01', '02']

# list of session identifiers
task_list = ['fingerfootlips']

# Smoothing widths to apply
fwhm = [5, 10]

# TR of functional images (time from the application of an excitation pulse to the application of the next pulse)
with open('/data/ds000114/task-fingerfootlips_bold.json', 'rt') as fp:
    task_info = json.load(fp)
TR = task_info['RepetitionTime']

# Isometric resample of functional images to voxel size (in mm)
iso_size = 4

Specify Nodes for the main workflow

Initiate all the different interfaces (represented as nodes) that you want to use in your workflow.

# ExtractROI - skip dummy scans
# t_min - minimum index for the t-dimension
# t_size - size of the ROI in the t-dimension (-1 = all remaining volumes)
extract = Node(ExtractROI(t_min=4, t_size=-1, output_type='NIFTI'),
               name="extract")

# MCFLIRT - motion correction
# https://fsl.fmrib.ox.ac.uk/fsl/fslwiki/MCFLIRT
# mean_vol - volumes are averaged to create a new template
# normcorr cost - https://www.fmrib.ox.ac.uk/datasets/techrep/tr02mj1/tr02mj1/node4.html
# https://onlinelibrary.wiley.com/reader/content/15ee5e5c909/10.1002/hbm.20235/format/pdf/OEBPS/pages/bg2.png
# sinc interpolation - https://math.stackexchange.com/questions/1372632/how-does-sinc-interpolation-work
mcflirt = Node(MCFLIRT(mean_vol=True, save_plots=True, output_type='NIFTI'),
               name="mcflirt")

# SliceTimer - correct for slice-wise acquisition
# https://poc.vl-e.nl/distribution/manual/fsl-3.2/slicetimer/index.html
# more on https://matthew-brett.github.io/teaching/slice_timing.html
# interleaved = --odd
# top to bottom = --down
slicetimer = Node(SliceTimer(index_dir=False,
                             interleaved=True,
                             output_type='NIFTI',
                             time_repetition=TR),
                  name="slicetimer")

# Smooth - image smoothing (spm_smooth, 3D Gaussian smoothing)
smooth = Node(Smooth(), name="smooth")
smooth.iterables = ("fwhm", fwhm)

# ArtifactDetect - determines outliers in functional images via intensity and motion parameters
# http://web.mit.edu/swg/art/art.pdf
# norm_threshold - threshold for detecting motion-related outliers when composite motion is used
# zintensity_threshold - intensity Z-threshold for detecting images that deviate from the mean
# mask_type - 'spm_global' uses an SPM-global-like calculation to determine the brain mask
# parameter_source - source of the movement parameters
# use_differences - use differences between successive motion (first element) and
#                   intensity (second element) estimates to determine outliers
art = Node(ArtifactDetect(norm_threshold=2,
                          zintensity_threshold=3,
                          mask_type='spm_global',
                          parameter_source='FSL',
                          use_differences=[True, False],
                          plot_type='svg'),
           name="art")

Coregistration Workflow

Initiate a workflow that coregisters the functional images to the anatomical image (according to FSL's FEAT pipeline).

# BET - skullstrip the anatomical image
# https://www.fmrib.ox.ac.uk/datasets/techrep/tr00ss2/tr00ss2.pdf
bet_anat = Node(BET(frac=0.5, robust=True, output_type='NIFTI_GZ'),
                name="bet_anat")

# FAST - image segmentation
# https://fsl.fmrib.ox.ac.uk/fsl/fslwiki/FAST
# http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.200.3832&rep=rep1&type=pdf
segmentation = Node(FAST(output_type='NIFTI_GZ'),
                    name="segmentation", mem_gb=4)

# Select the WM segmentation file from the segmentation output
# to get better boundaries between white and gray matter
def get_wm(files):
    return files[-1]

# Threshold - threshold the WM probability image
threshold = Node(Threshold(thresh=0.5, args='-bin', output_type='NIFTI_GZ'),
                 name="threshold")

# FLIRT - pre-alignment of functional images to anatomical images
# https://fsl.fmrib.ox.ac.uk/fsl/fslwiki/FLIRT
coreg_pre = Node(FLIRT(dof=6, output_type='NIFTI_GZ'),
                 name="coreg_pre")

# FLIRT - coregistration of functional images to anatomical images with BBR (uses the segmentation)
# https://fsl.fmrib.ox.ac.uk/fsl/fslwiki/FLIRT_BBR
coreg_bbr = Node(FLIRT(dof=6,
                       cost='bbr',
                       schedule=opj(os.getenv('FSLDIR'), 'etc/flirtsch/bbr.sch'),
                       output_type='NIFTI_GZ'),
                 name="coreg_bbr")

# Apply the coregistration warp to the functional images
# apply_isoxfm - apply the transformation supplied by in_matrix_file
applywarp = Node(FLIRT(interp='spline', apply_isoxfm=iso_size, output_type='NIFTI'),
                 name="applywarp")

# Apply the coregistration warp to the mean file
applywarp_mean = Node(FLIRT(interp='spline', apply_isoxfm=iso_size, output_type='NIFTI_GZ'),
                      name="applywarp_mean")

# Create the coregistration workflow
coregwf = Workflow(name='coregwf')
coregwf.base_dir = opj(experiment_dir, working_dir)

# Connect all components of the coregistration workflow
coregwf.connect([(bet_anat, segmentation, [('out_file', 'in_files')]),
                 (segmentation, threshold, [(('partial_volume_files', get_wm), 'in_file')]),
                 (bet_anat, coreg_pre, [('out_file', 'reference')]),
                 (threshold, coreg_bbr, [('out_file', 'wm_seg')]),
                 (coreg_pre, coreg_bbr, [('out_matrix_file', 'in_matrix_file')]),
                 (coreg_bbr, applywarp, [('out_matrix_file', 'in_matrix_file')]),
                 (bet_anat, applywarp, [('out_file', 'reference')]),
                 (coreg_bbr, applywarp_mean, [('out_matrix_file', 'in_matrix_file')]),
                 (bet_anat, applywarp_mean, [('out_file', 'reference')]),
                 ])

Specify input & output stream

Specify where the input data can be found & where and how to save the output data.

# Infosource - a function-free node to iterate over the list of subject names
infosource = Node(IdentityInterface(fields=['subject_id', 'task_name']),
                  name="infosource")
infosource.iterables = [('subject_id', subject_list),
                        ('task_name', task_list)]

# SelectFiles - to grab the data
anat_file = opj('derivatives', 'fmriprep', 'sub-{subject_id}', 'anat',
                'sub-{subject_id}_t1w_preproc.nii.gz')
func_file = opj('sub-{subject_id}', 'ses-test', 'func',
                'sub-{subject_id}_ses-test_task-{task_name}_bold.nii.gz')
templates = {'anat': anat_file,
             'func': func_file}
selectfiles = Node(SelectFiles(templates, base_directory='/data/ds000114'),
                   name="selectfiles")

# DataSink - creates an output folder for important outputs
datasink = Node(DataSink(base_directory=experiment_dir,
                         container=output_dir),
                name="datasink")

# Use the following DataSink output substitutions
substitutions = [('_subject_id_', 'sub-'),
                 ('_task_name_', '/task-'),
                 ('_fwhm_', 'fwhm-'),
                 ('_roi', ''),
                 ('_mcf', ''),
                 ('_st', ''),
                 ('_flirt', ''),
                 ('.nii_mean_reg', '_mean'),
                 ('.nii.par', '.par'),
                 ]
subjFolders = [('fwhm-%s/' % f, 'fwhm-%s_' % f) for f in fwhm]
substitutions.extend(subjFolders)
datasink.inputs.substitutions = substitutions

Specify Workflow

Create a workflow and connect the interface nodes and the I/O stream to each other.

# Create the preprocessing workflow
preproc = Workflow(name='preproc')
preproc.base_dir = opj(experiment_dir, working_dir)

# Connect all components of the preprocessing workflow
preproc.connect([(infosource, selectfiles, [('subject_id', 'subject_id'),
                                            ('task_name', 'task_name')]),
                 (selectfiles, extract, [('func', 'in_file')]),
                 (extract, mcflirt, [('roi_file', 'in_file')]),
                 (mcflirt, slicetimer, [('out_file', 'in_file')]),

                 (selectfiles, coregwf, [('anat', 'bet_anat.in_file'),
                                         ('anat', 'coreg_bbr.reference')]),
                 (mcflirt, coregwf, [('mean_img', 'coreg_pre.in_file'),
                                     ('mean_img', 'coreg_bbr.in_file'),
                                     ('mean_img', 'applywarp_mean.in_file')]),
                 (slicetimer, coregwf, [('slice_time_corrected_file', 'applywarp.in_file')]),

                 (coregwf, smooth, [('applywarp.out_file', 'in_files')]),

                 (mcflirt, datasink, [('par_file', 'preproc.@par')]),
                 (smooth, datasink, [('smoothed_files', 'preproc.@smooth')]),
                 (coregwf, datasink, [('applywarp_mean.out_file', 'preproc.@mean')]),

                 (coregwf, art, [('applywarp.out_file', 'realigned_files')]),
                 (mcflirt, art, [('par_file', 'realignment_parameters')]),

                 (coregwf, datasink, [('coreg_bbr.out_matrix_file', 'preproc.@mat_file'),
                                      ('bet_anat.out_file', 'preproc.@brain')]),
                 (art, datasink, [('outlier_files', 'preproc.@outlier_files'),
                                  ('plot_files', 'preproc.@plot_files')]),
                 ])

Visualize the workflow

It always helps to visualize your workflow.

# Create the preproc output graph
preproc.write_graph(graph2use='colored', format='png', simple_form=True)

# Visualize the graph
from IPython.display import Image
Image(filename=opj(preproc.base_dir, 'preproc', 'graph.png'))
# Visualize the detailed graph
preproc.write_graph(graph2use='flat', format='png', simple_form=True)
Image(filename=opj(preproc.base_dir, 'preproc', 'graph_detailed.png'))

Run the Workflow

Now that everything is ready, we can run the preprocessing workflow. Change n_procs to the number of jobs/cores you want to use. Note that if you're using a Docker container and FLIRT fails to run without any good reason, you might need to change memory settings in the Docker preferences (6 GB should be enough for this workflow).

!rm -r /output/workingdir
!mkdir -p /output/datasink

If multiprocessing fails, run with the 'Linear' plugin instead, which executes the workflow serially, one job at a time. Read more: https://nipype.readthedocs.io/en/1.1.0/users/plugins.html

# args_dict = {'n_procs': -1}
# preproc.run('Linear')
preproc.run(plugin='MultiProc')

Inspect output

Let's check the structure of the output folder, to see if we have everything we wanted to save.

!ls /output/datasink/preproc/sub-01
!tree /output/datasink/preproc/sub-01/task-fingerfootlips

Visualize results

Let's check the effect of the different smoothing kernels.

import nilearn
from nilearn import image, plotting

out_path = '/output/datasink/preproc/sub-01/task-fingerfootlips/'
fmri_preproc = nilearn.image.load_img(f'{out_path}/sub-01_ses-test_task-fingerfootlips_bold_mean.nii.gz')
plotting.view_img(fmri_preproc, bg_img=anat, threshold=0.1e3, cut_coords=(0,0,0), title='Anat and fmri aligned, fwhm = 0 mm')
fmri_preproc_5 = image.mean_img(nilearn.image.load_img(f'{out_path}/fwhm-5_ssub-01_ses-test_task-fingerfootlips_bold.nii'))
plotting.view_img(fmri_preproc_5, bg_img=anat, threshold=0.1e3, cut_coords=(0, 0, 0),
                  title='Anat and fmri aligned, fwhm = 5 mm')
fmri_preproc_10 = image.mean_img(nilearn.image.load_img(f'{out_path}/fwhm-10_ssub-01_ses-test_task-fingerfootlips_bold.nii'))
plotting.view_img(fmri_preproc_10, bg_img=anat, threshold=0.1e3, cut_coords=(0, 0, 0),
                  title='Anat and fmri aligned, fwhm = 10 mm')

Now, let's investigate the motion parameters. How much did the subject move and turn in the scanner?

import numpy as np
import matplotlib.pyplot as plt

par = np.loadtxt('/output/datasink/preproc/sub-01/task-fingerfootlips/sub-01_ses-test_task-fingerfootlips_bold.par')
fig, axes = plt.subplots(2, 1, figsize=(15, 5))
axes[0].set_ylabel('rotation (radians)')
axes[0].plot(par[:, :3])
axes[1].plot(par[:, 3:])
axes[1].set_xlabel('time (TR)')
axes[1].set_ylabel('translation (mm)');

There seems to be rather drastic motion around volume 102. Let's check whether the outlier detection algorithm was able to pick this up.

!ls /data/ds000114/sub-01/ses-test/func/
import numpy as np

outlier_ids = np.loadtxt('/output/datasink/preproc/sub-01/task-fingerfootlips/art.sub-01_ses-test_task-fingerfootlips_bold_outliers.txt')
print('Outliers were detected at volumes: %s' % outlier_ids)

from IPython.display import SVG
SVG(filename='/output/datasink/preproc/sub-01/task-fingerfootlips/plot.sub-01_ses-test_task-fingerfootlips_bold.svg')

An alternative for motion artifact detection: ICA-based Automatic Removal Of Motion Artifacts (ICA-AROMA)

Dataset: A test-retest fMRI dataset for motor, language and spatial attention functions

Special thanks to Michael Notter for the wonderful nipype tutorial

Example of fmriprep run

!pip install fmriprep sentry_sdk awscli
!rm -r ./ds000114-download
!aws s3 sync --no-sign-request s3://openneuro.org/ds000114 \
    ds000114-download/ \
    --exclude='*' \
    --include='task-fingerfootlips_bold.json' \
    --include='dataset_description.json' \
    --include='sub-01/*'
input_dir = '/data'
output_dir = 'pwd'
!rm -r /output/fmriprep
!ls './ds000114-download/sub-01/'
!ls /data/ds000114/sub-01/ses-test/func/

Example of running fmriprep

Running fmriprep directly will cause an error: first, we have to create a BIDS-validated dataset.

!fmriprep ./ds000114-download/ /output/ participant --fs-license-file ./license.txt --fs-no-reconall -w /output --participant_label 01