Creating a manifest from data directories on disk for a study with different imaging datatypes
In this example, we have a longitudinal study with imaging visits, but not all participants have the same imaging datatypes for all visits.
We do not have a tabular file indicating which datatypes are available for which participants and visits. However, this information can be obtained by looking at the data directories on disk:
data/
├── ABC001/
│   ├── BL/
│   │   ├── T1w/
│   │   │   └── ...
│   │   └── diffusion/
│   │       └── ...
│   └── M12/
│       └── T1w/
│           └── ...
└── ABC002/
    ├── BL/
    │   ├── T1w/
    │   │   └── ...
    │   └── diffusion/
    │       └── ...
    └── M12/
        ├── T1w/
        │   └── ...
        └── diffusion/
            └── ...
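To experiment with this layout without real imaging data, the tree can be recreated with empty directories. The following is a minimal sketch using only the standard library; the `LAYOUT` dictionary and the `make_example_tree` helper are hypothetical names introduced for illustration, and the imaging files (shown as `...` in the tree) are not created.

```python
import tempfile
from pathlib import Path

# Modality directories per participant and visit, copied from the tree above.
# LAYOUT is a hypothetical name used only for this sketch.
LAYOUT = {
    "ABC001": {"BL": ["T1w", "diffusion"], "M12": ["T1w"]},
    "ABC002": {"BL": ["T1w", "diffusion"], "M12": ["T1w", "diffusion"]},
}


def make_example_tree(root):
    """Create empty directories mirroring the example layout under root/data."""
    path_data = Path(root) / "data"
    for participant_id, visits in LAYOUT.items():
        for visit_id, modalities in visits.items():
            for modality in modalities:
                (path_data / participant_id / visit_id / modality).mkdir(
                    parents=True, exist_ok=True
                )
    return path_data


# Build the tree in a throwaway directory and list what was created
with tempfile.TemporaryDirectory() as tmp:
    path_data = make_example_tree(tmp)
    created = sorted(
        p.relative_to(path_data).as_posix() for p in path_data.glob("*/*/*")
    )
    print("\n".join(created))
```

Empty directories are enough for a dry run when only the presence of the `T1w` and `diffusion` directories matters, not their contents.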
Here is a script that creates a Nipoppy manifest for the directory structure above:
Attention
The script below was written for Python 3.11 with pandas 2.2.3. It may not work with older/different versions.
#!/usr/bin/env python
"""Manifest-generation script for Example 3."""

from pathlib import Path

import pandas as pd

if __name__ == "__main__":

    # get the path to the data directory
    # we assume that it is in the same directory as this script
    path_data = Path(__file__).parent / "data"

    data_for_manifest = []
    for path_participant in sorted(path_data.iterdir()):
        for path_participant_visit in sorted(path_participant.iterdir()):

            # participant_id and visit_id are the names of the directories
            participant_id = path_participant.name
            visit_id = path_participant_visit.name

            # use the visit_id as session_id
            session_id = visit_id

            # check which datatypes are present
            datatype = []
            if (path_participant_visit / "T1w").exists():
                datatype.append("anat")
            if (path_participant_visit / "diffusion").exists():
                datatype.append("dwi")

            # create the manifest entry
            data_for_manifest.append(
                {
                    "participant_id": participant_id,
                    "visit_id": visit_id,
                    "session_id": session_id,
                    "datatype": datatype,
                }
            )

    df_manifest = pd.DataFrame(data_for_manifest)

    # write the manifest in the same directory as this script
    df_manifest.to_csv(
        Path(__file__).parent / "example3-manifest.tsv", sep="\t", index=False
    )
Running this script creates a manifest that looks like this:
participant_id | visit_id | session_id | datatype |
---|---|---|---|
ABC001 | BL | BL | ['anat', 'dwi'] |
ABC001 | M12 | M12 | ['anat'] |
ABC002 | BL | BL | ['anat', 'dwi'] |
ABC002 | M12 | M12 | ['anat', 'dwi'] |
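Because each `datatype` value is a Python list, it ends up in the TSV file as the text form of that list (for example `['anat', 'dwi']`). A quick way to sanity-check the output is to read the rows back and convert that text into real lists. The sketch below uses only the standard library and embeds the expected file contents so that it runs on its own; the `parse_manifest` helper is a hypothetical name, and the embedded text assumes the file matches the table above.

```python
import ast
import csv
import io

# Expected contents of example3-manifest.tsv, per the table above
# (embedded so the sketch runs without the file on disk)
manifest_text = (
    "participant_id\tvisit_id\tsession_id\tdatatype\n"
    "ABC001\tBL\tBL\t['anat', 'dwi']\n"
    "ABC001\tM12\tM12\t['anat']\n"
    "ABC002\tBL\tBL\t['anat', 'dwi']\n"
    "ABC002\tM12\tM12\t['anat', 'dwi']\n"
)


def parse_manifest(text):
    """Return manifest rows with the datatype column parsed into a list."""
    rows = []
    for row in csv.DictReader(io.StringIO(text), delimiter="\t"):
        # literal_eval only evaluates Python literals, so it is safer
        # than eval for turning "['anat', 'dwi']" back into a list
        row["datatype"] = ast.literal_eval(row["datatype"])
        rows.append(row)
    return rows


rows = parse_manifest(manifest_text)
for row in rows:
    print(row["participant_id"], row["visit_id"], row["datatype"])
```

To check a real file, the same function could be given `Path("example3-manifest.tsv").read_text()` instead of the embedded string.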