Run MRIQC on a BIDS dataset
In this tutorial, you will learn how to use the Nipoppy CLI to run the MRIQC pipeline (version 23.1.0) on the ds004101 BIDS dataset available from OpenNeuro.
Concretely, we will:
- Initialize a Nipoppy dataset from a BIDS dataset
- Install and set up the MRIQC processing pipeline
- Run MRIQC on a single participant and session
- Track the output files to check if processing was successful
Note
If you have not installed Nipoppy yet, instructions are available here.
Attention
This tutorial assumes that Apptainer (or Singularity) is installed on your system. If that is not the case, then you will not be able to run MRIQC with Nipoppy.
Step 0: Download the BIDS dataset
We will use the ds004101 dataset from OpenNeuro, which includes structural and functional MRI data for 9 subjects with 2 sessions each. The dataset can be downloaded by following instructions here. If you do not have DataLad or Node.js installed, you can use the shell script method:
1. Click on the “Download shell script” link at the bottom of this page. This should download a file called ds004101-1.0.1.sh to your computer.
2. Move ds004101-1.0.1.sh to the directory you will use for this tutorial.
3. Open a Terminal window and go to the directory where you put the script. Then run:
$ bash ds004101-1.0.1.sh
Depending on your internet connection, the above command may take a few minutes. Once it is done, you should have a new directory ds004101-1.0.1 containing the BIDS dataset. Run tree ds004101-1.0.1 to see the dataset content.
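If the tree command is not available, the participant and session folders can also be listed with a few lines of Python. The following is a minimal sketch that assumes the standard BIDS layout of sub-<label>/ses-<label> directories; the function name list_bids_sessions is our own and is not part of Nipoppy.

```python
from pathlib import Path


def list_bids_sessions(bids_root):
    """Return sorted (participant, session) labels found under a BIDS root.

    Assumes the standard BIDS layout: <root>/sub-<label>/ses-<label>/.
    """
    pairs = []
    for sub_dir in Path(bids_root).glob("sub-*"):
        if not sub_dir.is_dir():
            continue  # skip stray files whose names start with "sub-"
        for ses_dir in sub_dir.glob("ses-*"):
            if ses_dir.is_dir():
                participant = sub_dir.name[len("sub-"):]
                session = ses_dir.name[len("ses-"):]
                pairs.append((participant, session))
    return sorted(pairs)


if __name__ == "__main__":
    # Directory name used in this tutorial; adjust if the data live elsewhere.
    for participant, session in list_bids_sessions("ds004101-1.0.1"):
        print(participant, session)
```

For this dataset, the listing should contain 18 participant-session pairs (9 participants with 2 sessions each).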
Step 1: Initialize the Nipoppy dataset
Run the following command to create a Nipoppy dataset and populate it with the BIDS data:
$ nipoppy init --dataset nipoppy_study --bids-source ds004101-1.0.1
This command creates a folder named nipoppy_study with subdirectories for raw data, processed (derivatives) data, pipeline configuration files, logs, etc. The tree command can be used to show the directory structure.
$ tree -L 1 nipoppy_study
nipoppy_study/
├── bids
├── code
├── containers
├── derivatives
├── global_config.json
├── logs
├── manifest.tsv
├── pipelines
├── scratch
├── sourcedata
└── tabular
Nipoppy automatically generated a manifest file from the BIDS dataset. This file is considered the ground truth for which participants and sessions are available for processing.
nipoppy_study/manifest.tsv
participant_id visit_id session_id datatype
09114 1pre 1pre ['anat', 'func']
09114 2post 2post ['fmap', 'func']
09160 1pre 1pre ['anat', 'func']
09160 2post 2post ['fmap', 'func']
09260 1pre 1pre ['anat', 'fmap', 'func']
09260 2post 2post ['fmap', 'func']
09300 1pre 1pre ['anat', 'fmap', 'func']
09300 2post 2post ['fmap', 'func']
09380 1pre 1pre ['anat', 'fmap', 'func']
09380 2post 2post ['fmap', 'func']
09381 1pre 1pre ['anat', 'fmap', 'func']
09381 2post 2post ['fmap', 'func']
10134 1pre 1pre ['anat', 'fmap', 'func']
10134 2post 2post ['fmap', 'func']
10332 1pre 1pre ['anat', 'fmap', 'func']
10332 2post 2post ['fmap', 'func']
10570 1pre 1pre ['anat', 'fmap', 'func']
10570 2post 2post ['fmap', 'func']
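The manifest is a plain tab-separated file, so it can be inspected with the Python standard library. The sketch below assumes that the datatype column is stored as a Python-style list literal, as it appears in the listing above; the helper names read_manifest and count_by_session are illustrative and not part of Nipoppy. Reading the IDs as strings keeps leading zeros intact (e.g., 09114).

```python
import ast
import csv
import io
from collections import Counter

# A few rows copied from the manifest above (tab-separated, as in manifest.tsv).
SAMPLE_MANIFEST = (
    "participant_id\tvisit_id\tsession_id\tdatatype\n"
    "09114\t1pre\t1pre\t['anat', 'func']\n"
    "09114\t2post\t2post\t['fmap', 'func']\n"
    "09260\t1pre\t1pre\t['anat', 'fmap', 'func']\n"
)


def read_manifest(tsv_file):
    """Parse manifest rows, turning the datatype column into a Python list."""
    rows = []
    for row in csv.DictReader(tsv_file, delimiter="\t"):
        # Assumption: the column holds a list literal such as "['anat', 'func']".
        row["datatype"] = ast.literal_eval(row["datatype"])
        rows.append(row)
    return rows


def count_by_session(rows):
    """Count how many participants appear in each session."""
    return Counter(row["session_id"] for row in rows)


rows = read_manifest(io.StringIO(SAMPLE_MANIFEST))
print(count_by_session(rows))
# Participant-sessions without anatomical data
print([(r["participant_id"], r["session_id"]) for r in rows if "anat" not in r["datatype"]])
```

To run this on the real file, replace the io.StringIO object with open("nipoppy_study/manifest.tsv", newline="").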
The nipoppy status command prints out a summary of the dataset, including the number of participants who are in BIDS or have completed a pipeline.
$ nipoppy status --dataset nipoppy_study
For now, the dataset only has BIDS data:
nipoppy status output
Participant counts by session at each Nipoppy checkpoint
╷ ╷ ╷ ╷
session_id │ in_manifest │ in_pre_reorg │ in_post_reorg │ in_bids
════════════╪═════════════╪══════════════╪═══════════════╪═════════
1pre │ 9 │ 0 │ 0 │ 9
2post │ 9 │ 0 │ 0 │ 9
╵ ╵ ╵ ╵
Step 2: Modify the global configuration file
The nipoppy init command created the configuration file at nipoppy_study/global_config.json. This file may need to be updated with information specific to your computing environment. Initially, it will look like this:
{
    "SUBSTITUTIONS": {
        "[[NIPOPPY_DPATH_CONTAINERS]]": "[[NIPOPPY_DPATH_ROOT]]/containers",
        "[[HPC_ACCOUNT_NAME]]": ""
    },
    "DICOM_DIR_PARTICIPANT_FIRST": true,
    "CONTAINER_CONFIG": {
        "COMMAND": "apptainer",
        "ARGS": [
            "--cleanenv"
        ],
        "ENV_VARS": {
            "PYTHONUNBUFFERED": "1"
        }
    },
    "HPC_PREAMBLE": [
        "export PYTHONUNBUFFERED=1"
    ],
    "PIPELINE_VARIABLES": {
        "BIDSIFICATION": {},
        "PROCESSING": {},
        "EXTRACTION": {}
    },
    "CUSTOM": {}
}
By default, this file does not contain any pipeline-specific information, since the dataset does not have any pipelines installed yet. Still, there are fields that may need to be modified depending on your setup:
- If you are on a system that still uses Singularity (which has been renamed to Apptainer), you will need to change CONTAINER_CONFIG -> COMMAND to "singularity" instead of "apptainer".
- If your group uses a shared directory for storing container image files, you can replace "[[NIPOPPY_DPATH_ROOT]]/containers" by the full path to that shared directory. Alternatively (recommended), you can create a symlink from <NIPOPPY_PROJECT_ROOT>/containers to that directory; this line in the configuration can then be deleted.
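The first change can be made by hand in a text editor, but it can also be scripted. The following Python sketch shows one possible way to pick the container command based on what is found on the PATH and to update the CONTAINER_CONFIG -> COMMAND field. The helper names are our own, and the config dictionary is a trimmed-down stand-in for the real file.

```python
import json
import shutil


def choose_container_command(candidates=("apptainer", "singularity")):
    """Return the first candidate executable found on PATH, or None."""
    for command in candidates:
        if shutil.which(command) is not None:
            return command
    return None


def set_container_command(config, command):
    """Return a copy of the config with CONTAINER_CONFIG -> COMMAND updated."""
    updated = json.loads(json.dumps(config))  # simple deep copy of JSON data
    updated["CONTAINER_CONFIG"]["COMMAND"] = command
    return updated


# Trimmed-down version of the default configuration shown above.
config = {"CONTAINER_CONFIG": {"COMMAND": "apptainer", "ARGS": ["--cleanenv"]}}
print(set_container_command(config, "singularity")["CONTAINER_CONFIG"]["COMMAND"])
# prints: singularity
```

To apply this to the real file, load nipoppy_study/global_config.json with json.load, update it, and write it back with json.dump (for example with indent=4).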
Step 3: Install the MRIQC pipeline into the dataset
The following command can be used to check which pipelines can be run with the dataset:
$ nipoppy pipeline list --dataset nipoppy_study
The output says that there are no available pipelines to be run:
INFO No available bidsification pipelines
INFO No available processing pipelines
INFO No available extraction pipelines
That is because a newly initialized Nipoppy dataset does not contain any pipelines. Pipeline configuration files are available on the Zenodo data repository and can be searched for directly from your terminal using the command nipoppy pipeline search. The configuration files for MRIQC can be downloaded by running the following:
$ nipoppy pipeline install --dataset nipoppy_study 15427844
When running nipoppy pipeline install, you will be asked if you would like to download the MRIQC container. Type y and press Enter to do so. The download/building process may take ~10 minutes. The container image will be downloaded as mriqc_23.1.0.sif inside the container store directory (i.e., nipoppy_study/containers or the custom path you set in the global_config.json file).
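Before moving on, it can be worth confirming that the image file is where you expect it. The snippet below is a small Python sketch of such a check; the function name find_container_image is illustrative, and the directory passed to it should be whichever container store path applies to your setup.

```python
from pathlib import Path


def find_container_image(container_dir, filename="mriqc_23.1.0.sif"):
    """Return the path to the image if it exists in container_dir, else None."""
    image_path = Path(container_dir) / filename
    return image_path if image_path.is_file() else None


if __name__ == "__main__":
    # Default container store location used in this tutorial.
    image = find_container_image("nipoppy_study/containers")
    print(image if image else "Container image not found")
```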
The pipeline installation process will add a TEMPLATEFLOW_HOME pipeline variable to the nipoppy_study/global_config.json file:
INFO Adding 1 variable(s) to the global config file:
INFO TEMPLATEFLOW_HOME Path to the directory where TemplateFlow will store templates (can be empty)
Open nipoppy_study/global_config.json and set TEMPLATEFLOW_HOME to a meaningful location.
"PIPELINE_VARIABLES": {
    "BIDSIFICATION": {},
    "PROCESSING": {
        "mriqc": {
            "23.1.0": {
                "TEMPLATEFLOW_HOME": null
            }
        }
    },
    "EXTRACTION": {}
},
In general, we recommend using a shared directory within your research group for all TemplateFlow files, but you can also set it to something like "[[NIPOPPY_DPATH_ROOT]]/templateflow" ([[NIPOPPY_DPATH_ROOT]] will be resolved to the full path of the nipoppy_study directory at runtime). This directory will be created at runtime by Nipoppy if it does not exist, so you do not need to create it yourself.
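As an illustration, the Python sketch below sets TEMPLATEFLOW_HOME in a loaded configuration and then shows what the placeholder would expand to. The resolve_placeholders helper only mimics the idea of the substitution with a plain string replacement; it is not Nipoppy's implementation, and /data/nipoppy_study is a hypothetical dataset location.

```python
def set_templateflow_home(config, path, version="23.1.0"):
    """Set TEMPLATEFLOW_HOME for MRIQC in a loaded global config (in place)."""
    mriqc_vars = config["PIPELINE_VARIABLES"]["PROCESSING"]["mriqc"][version]
    mriqc_vars["TEMPLATEFLOW_HOME"] = path
    return config


def resolve_placeholders(value, substitutions):
    """Illustrative only: replace each [[...]] placeholder with its value."""
    for placeholder, replacement in substitutions.items():
        value = value.replace(placeholder, replacement)
    return value


# Same structure as the PIPELINE_VARIABLES fragment shown above.
config = {
    "PIPELINE_VARIABLES": {
        "BIDSIFICATION": {},
        "PROCESSING": {"mriqc": {"23.1.0": {"TEMPLATEFLOW_HOME": None}}},
        "EXTRACTION": {},
    }
}
set_templateflow_home(config, "[[NIPOPPY_DPATH_ROOT]]/templateflow")

templateflow_home = config["PIPELINE_VARIABLES"]["PROCESSING"]["mriqc"]["23.1.0"]["TEMPLATEFLOW_HOME"]
# "/data/nipoppy_study" is a hypothetical location for the dataset root.
print(resolve_placeholders(templateflow_home, {"[[NIPOPPY_DPATH_ROOT]]": "/data/nipoppy_study"}))
# prints: /data/nipoppy_study/templateflow
```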
Step 4: Run MRIQC on a single participant and session
Use nipoppy process to run MRIQC on a single participant and session. This could take around 15 minutes to complete.
$ nipoppy process \
--dataset nipoppy_study \
--pipeline mriqc \
--pipeline-version 23.1.0 \
--participant-id 09114 \
--session-id 1pre
Pipeline outputs are written to nipoppy_study/derivatives/mriqc/23.1.0/output:
nipoppy_study/derivatives/mriqc/23.1.0/output/
├── dataset_description.json
├── logs
├── sub-09114
│ ├── figures
│ │ ├── sub-09114_ses-1pre_desc-background_T1w.svg
│ │ └── sub-09114_ses-1pre_desc-zoomed_T1w.svg
│ └── ses-1pre
│ └── anat
│ └── sub-09114_ses-1pre_T1w.json
└── sub-09114_ses-1pre_T1w.html
Log files can be found in nipoppy_study/logs/process/mriqc-23.1.0.
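For a quick manual check of the outputs, the expected file names can be derived from the participant and session labels. The Python sketch below builds the two T1w paths that appear in the output tree above (the HTML report and the JSON metrics file) and reports which ones are missing. It covers only these two files and is our own illustration, independent of the tracking that Nipoppy itself performs.

```python
from pathlib import Path


def expected_t1w_outputs(output_dir, participant_id, session_id):
    """Build the T1w report and metrics paths following the tree shown above."""
    output_dir = Path(output_dir)
    stem = f"sub-{participant_id}_ses-{session_id}_T1w"
    return [
        output_dir / f"{stem}.html",
        output_dir / f"sub-{participant_id}" / f"ses-{session_id}" / "anat" / f"{stem}.json",
    ]


def missing_outputs(output_dir, participant_id, session_id):
    """Return the expected files that do not exist on disk."""
    expected = expected_t1w_outputs(output_dir, participant_id, session_id)
    return [path for path in expected if not path.exists()]


if __name__ == "__main__":
    out = "nipoppy_study/derivatives/mriqc/23.1.0/output"
    for path in missing_outputs(out, "09114", "1pre"):
        print("missing:", path)
```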
Step 5: Track the pipeline processing status
Run nipoppy track-processing to determine the MRIQC processing status for each subject and session:
$ nipoppy track-processing \
--dataset nipoppy_study \
--pipeline mriqc \
--pipeline-version 23.1.0
The command will create a processing status file at nipoppy_study/derivatives/processing_status.tsv. This file should have a SUCCESS status (last column) for participant 09114, session 1pre, and FAIL statuses in every other row, like this:
participant_id bids_participant_id session_id pipeline_name pipeline_version pipeline_step bids_session_id status
09114 sub-09114 1pre mriqc 23.1.0 default ses-1pre SUCCESS
09114 sub-09114 2post mriqc 23.1.0 default ses-2post FAIL
09160 sub-09160 1pre mriqc 23.1.0 default ses-1pre FAIL
09160 sub-09160 2post mriqc 23.1.0 default ses-2post FAIL
09260 sub-09260 1pre mriqc 23.1.0 default ses-1pre FAIL
09260 sub-09260 2post mriqc 23.1.0 default ses-2post FAIL
09300 sub-09300 1pre mriqc 23.1.0 default ses-1pre FAIL
09300 sub-09300 2post mriqc 23.1.0 default ses-2post FAIL
09380 sub-09380 1pre mriqc 23.1.0 default ses-1pre FAIL
09380 sub-09380 2post mriqc 23.1.0 default ses-2post FAIL
09381 sub-09381 1pre mriqc 23.1.0 default ses-1pre FAIL
09381 sub-09381 2post mriqc 23.1.0 default ses-2post FAIL
10134 sub-10134 1pre mriqc 23.1.0 default ses-1pre FAIL
10134 sub-10134 2post mriqc 23.1.0 default ses-2post FAIL
10332 sub-10332 1pre mriqc 23.1.0 default ses-1pre FAIL
10332 sub-10332 2post mriqc 23.1.0 default ses-2post FAIL
10570 sub-10570 1pre mriqc 23.1.0 default ses-1pre FAIL
10570 sub-10570 2post mriqc 23.1.0 default ses-2post FAIL
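Because the processing status file is tab-separated with a header row, it is straightforward to filter by status. The following Python sketch uses two rows copied from the table above; the function name sessions_with_status is illustrative.

```python
import csv
import io

# Two rows copied from the table above (the real file is tab-separated).
SAMPLE_STATUS = (
    "participant_id\tbids_participant_id\tsession_id\tpipeline_name\t"
    "pipeline_version\tpipeline_step\tbids_session_id\tstatus\n"
    "09114\tsub-09114\t1pre\tmriqc\t23.1.0\tdefault\tses-1pre\tSUCCESS\n"
    "09114\tsub-09114\t2post\tmriqc\t23.1.0\tdefault\tses-2post\tFAIL\n"
)


def sessions_with_status(tsv_file, status):
    """Return (participant_id, session_id) pairs whose status matches."""
    reader = csv.DictReader(tsv_file, delimiter="\t")
    return [(row["participant_id"], row["session_id"]) for row in reader if row["status"] == status]


print(sessions_with_status(io.StringIO(SAMPLE_STATUS), "SUCCESS"))
# prints: [('09114', '1pre')]
```

To run this on the real file, pass open("nipoppy_study/derivatives/processing_status.tsv", newline="") instead of the io.StringIO object.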
Running nipoppy status --dataset nipoppy_study again will show a new column for the MRIQC pipeline, indicating that one participant has completed processing for the first session:
nipoppy status output
Participant counts by session at each Nipoppy checkpoint
╷ ╷ ╷ ╷ ╷
│ │ │ │ │ mriqc
│ │ │ │ │ 23.1.0
session_id │ in_manifest │ in_pre_reorg │ in_post_reorg │ in_bids │ default
════════════╪═════════════╪══════════════╪═══════════════╪═════════╪═════════
1pre │ 9 │ 0 │ 0 │ 9 │ 1
2post │ 9 │ 0 │ 0 │ 9 │ 0
╵ ╵ ╵ ╵ ╵
The processing status file can also be uploaded to the Neurobagel digest dashboard, which will produce interactive visualizations of pipeline processing statuses.
Finally, this file can be used directly as input to the Neurobagel CLI when generating participant-level metadata about processing pipeline results.
Step 6 (optional): Run MRIQC on the rest of the dataset
Use nipoppy process without the participant and session flags to process the rest of the dataset (in a loop). This will skip the participant-session that has previously been run successfully.
$ nipoppy process \
--dataset nipoppy_study \
--pipeline mriqc \
--pipeline-version 23.1.0
Then, run the tracking command again to update the processing status file:
$ nipoppy track-processing \
--dataset nipoppy_study \
--pipeline mriqc \
--pipeline-version 23.1.0
Note
For this dataset specifically, MRIQC will fail on all ses-2post sessions because they do not have anatomical data.
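Since the ses-2post sessions are expected to fail, you may prefer to process only selected participant-sessions. One option is to generate one nipoppy process call per pair, using the same flags as in Step 4. The Python sketch below only builds and prints the commands; the helper name build_process_command is our own, and the list of pending pairs is an example.

```python
import shlex


def build_process_command(dataset, participant_id, session_id,
                          pipeline="mriqc", version="23.1.0"):
    """Return the argument list for one `nipoppy process` call (not executed)."""
    return [
        "nipoppy", "process",
        "--dataset", dataset,
        "--pipeline", pipeline,
        "--pipeline-version", version,
        "--participant-id", participant_id,
        "--session-id", session_id,
    ]


# Example selection: two participants' first sessions from the manifest.
pending = [("09160", "1pre"), ("09260", "1pre")]
for participant_id, session_id in pending:
    command = build_process_command("nipoppy_study", participant_id, session_id)
    print(shlex.join(command))
    # To actually run it: subprocess.run(command, check=True)
```

Passing the arguments as a list (rather than a single string) avoids shell quoting issues if the command is later run with subprocess.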
And that’s it! You have successfully run MRIQC on a BIDS dataset using Nipoppy!
While other pipelines may require different pipeline-specific configuration, the general workflow described in this tutorial can be adapted to any other pipeline run with Nipoppy.