Preparing a BIDS dataset by hand and from scratch
⚠️ Note that this is purely for learning purposes and it is NOT recommended to BIDSify real datasets by hand. ⚠️
Ingredients and tools
Get them fresh from your local market:
- MRI scanner 🧲
- EEG amplifier 🌩
- MEG squid 🦑
- …
🧠 some source data to be converted into BIDS
We will work with the multi-modal face dataset from SPM.
Very often MRI source data will be in DICOM format and will need to be converted.
Here the MRI data is in "3D" Nifti format (.hdr/.img) and we will need to change that to a "4D" Nifti format (.nii).
This dataset contains EEG, MEG and fMRI data on the same subject within the same paradigm.
We also extracted some of the information about the data from the SPM manual
and put it into source/README.md.
Similarly when you have DICOM data, it is usually a good idea to keep the PDF of MRI acquisition parameters with your source data.
🖋 a text editor
Several common options to choose from:
- Visual Studio Code
- Sublime
- Atom
- Notepad does not really count.
♻ some format conversion tools
For the MRI data we will be using some of the SPM built-in functions to convert Nifti files into the proper format.
📥 [OPTIONAL] BIDS validator
- Install Node.js (at least version 12.12.0).
- Update npm to be at least version 7 (npm install --global npm@^7).
- From a terminal run npm install -g bids-validator.
- Run bids-validator to start validating datasets.
[OPTIONAL] Datalad to version control your data
You can follow the installation instructions in the Datalad handbook.
Recipe
1. Preheat the oven: creating folders
- Create a raw folder to host your BIDS data and inside it create:
  - a sourcedata folder and put your source data in it
  - a code/conversion folder and put this README.md in it
  - a subject folder: sub-01
    - with session folder: ses-mri
      - with an anat folder for the structural MRI data
      - with a func folder for the functional MRI data
By now you should have this.
├── code
│   └── conversion
├── sourcedata
│   ├── multimodal_fmri
│   │   └── fMRI
│   │       ├── Session1
│   │       └── Session2
│   └── multimodal_smri
│       └── sMRI
└── sub-01
    └── ses-mri
        ├── anat
        └── func
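The same folders can also be created from a terminal. This is a minimal sketch, assuming a Unix-like shell and that you run it from the directory that should contain `raw`. The subfolders of `sourcedata` come with the downloaded source data, so they are not created here.

```shell
#!/bin/sh
set -eu

# Create the BIDS root and the folders listed in this step.
# mkdir -p also creates any missing parent folder.
mkdir -p raw/sourcedata
mkdir -p raw/code/conversion
mkdir -p raw/sub-01/ses-mri/anat
mkdir -p raw/sub-01/ses-mri/func

# List the resulting folders so they can be compared with the tree above.
find raw -type d | sort
```

Copy your source data into `raw/sourcedata` afterwards.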
2. Starters: converting the anatomical MRI file
- In Matlab launch SPM: spm fmri.
- In SPM:
  - use the SPM 3D to 4D module: Batch --> SPM --> Utils --> 3D to 4D File conversion
  - select the *.img file to convert
  - keep track of what you did by saving the batch in code/conversion
  - run the batch
a. Cooking is not just about the taste, it is also about how things look: naming files
- Move the .nii file you have just created into sub-01/ses-mri/anat.
- Give this file a valid BIDS filename.
✅ Valid BIDS filenames
- BIDS filenames are composed of:
  - an extension
  - a suffix preceded by a _
  - entity-label pairs separated by a _
- So a BIDS filename can look like: entity1-label1_entity2-label2_suffix.extension
- entities and labels can only contain letters and/or numbers.
- For a given suffix, some entities are required and some others are [optional].
- entity-label pairs have a specific order in which they must appear in the filename.
In case you do not remember which suffix to use and which entities are required or optional, the BIDS specification has:
- filename templates at the beginning of the section for each imaging modality,
- a summary entity table.
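As a quick sanity check before running the validator, the general shape described above can be tested with a regular expression. This is only a rough sketch: `is_bids_like` is a hypothetical helper written for this tutorial, and it does not check entity order, required entities, or whether the suffix exists in the specification. The BIDS validator remains the reference.

```shell
#!/bin/sh

# Rough shape check: one or more entity-label pairs, each followed by "_",
# then a suffix, then one or more extensions (for example .nii or .nii.gz).
is_bids_like() {
    printf '%s\n' "$1" |
        grep -Eq '^([A-Za-z0-9]+-[A-Za-z0-9]+_)+[A-Za-z0-9]+(\.[A-Za-z0-9]+)+$'
}

# Try a few candidate names: the second uses "_" instead of "-" inside a pair,
# the third has no extension.
for name in "sub-01_ses-mri_T1w.nii" "sub-01_ses_mri_T1w.nii" "sub-01_T1w"; do
    if is_bids_like "$name"; then
        echo "shape looks fine: $name"
    else
        echo "check this name:  $name"
    fi
done
```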
b. Taste your dish while you prepare it: using the BIDS validator
Try it directly in your browser.
c. Season to taste: adding missing files
README
dataset_description.json
You can get content for those files from:
- the BIDS specification (use the search bar)
- the BIDS starter templates
Suggestion:
Add the “table” output of the BIDS validator to your README to give a quick overview of the content of your dataset.
🚨 About JSON files
JSON files are text files to store key-value pairs.
If your editor cannot help you format them properly, you can always use the online editor.
More information on how to read and write JSON files is available on the BIDS starter kit.
JSON CONTENT EXAMPLE:
{
  "key": "value",
  "key2": "value2",
  "key3": {
    "subkey1": "subvalue1"
  },
  "array": [ 1, 2, 3 ],
  "boolean": true,
  "color": "gold",
  "null": null,
  "number": 123,
  "object": {
    "a": "b",
    "c": "d"
  },
  "string": "Hello World"
}
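For instance, a minimal dataset_description.json could be written from a terminal as sketched below, assuming you are inside the `raw` folder. `Name` and `BIDSVersion` are the two keys the specification requires in this file. Both values shown are placeholders, so replace them with your own dataset name and the version of the specification you are following.

```shell
#!/bin/sh

# Write a minimal dataset_description.json in the current folder.
# Both values are placeholders (assumptions), not taken from the source data.
cat > dataset_description.json << 'EOF'
{
    "Name": "Multi-modal face dataset (hand-made BIDS version)",
    "BIDSVersion": "1.8.0"
}
EOF

# Optional syntax check, only attempted if python3 happens to be installed.
if command -v python3 > /dev/null 2>&1; then
    python3 -m json.tool dataset_description.json
fi
```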
d. Icing on the cake: adding extra information
- Add a T1w.json file. Use information from source/README.md to create it.
- Add a participants.tsv file. You can use Excel or Google Sheets to create it.
🚨 About TSV files
A Tab-Separated Values (TSV) file is a text file where tab characters (\t) separate fields that are in the file.
It is structured as a table, with each column representing a field of interest,
and each row representing a single data point.
More information on how to read and write TSV files is available on the BIDS starter kit.
TSV CONTENT EXAMPLE:
participant_id\tage\tgender\n
sub-01\t34\tM
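If you prefer not to use a spreadsheet, the same content can be written from a terminal. This is a small sketch, assuming you are inside the `raw` folder. `printf` turns `\t` and `\n` into real tab and newline characters, which is easy to get wrong when typing a TSV file by hand.

```shell
#!/bin/sh

# Header row, then one row per participant; fields are separated by real tabs.
printf 'participant_id\tage\tgender\n' > participants.tsv
printf 'sub-01\t34\tM\n' >> participants.tsv

# Count the tab-separated fields on each line to verify the separators.
awk -F'\t' '{ print NF " columns: " $0 }' participants.tsv
```

Every line should report the same number of columns. A different count on one line usually means that spaces were typed instead of a tab.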
By now you should have this.
├── code
├── sourcedata
├── sub-01
│   └── ses-mri
│       ├── anat
│       │   ├── sub-01_ses-mri_T1w.json
│       │   └── sub-01_ses-mri_T1w.nii
│       └── func
├── README
├── participants.tsv
├── participants.json
└── dataset_description.json
e. BIDS is data jam: let’s preserve some
[OPTIONAL]
- Create a Datalad dataset
- make a commit when you have a valid dataset to use as a checkpoint.
datalad create --force -c text2git .
datalad save -m 'initial commit'
3. Main course: converting the functional MRI files
- Convert the 2 runs, each made of a series of 3D *.img files, into 2 single 4D *.nii images by using the same SPM module used for the anatomical conversion.
- Make sure to enter the repetition time in the interscan interval field.
- Give the output files valid BIDS filenames. You will need to use the task and the run entities.
- Use the BIDS validator and add any missing file (like a *_bold.json file).
- Create events.tsv: you can run the function multimodal/code/convert_func_event_mat.m to help you convert the files multimodal/source/multimodal_fmri/fMRI/trials_ses*.mat
- Put the events.tsv files in the func folder and give them valid BIDS names.
- Remove duplicate json files to make use of the “inheritance principle”.
By now you should have this.
├── code
├── sourcedata
├── sub-01
│   └── ses-mri
│       ├── anat
│       │   └── sub-01_ses-mri_T1w.nii
│       └── func
│           ├── sub-01_ses-mri_task-FaceSymmetry_run-1_bold.nii
│           ├── sub-01_ses-mri_task-FaceSymmetry_run-1_events.tsv
│           ├── sub-01_ses-mri_task-FaceSymmetry_run-2_bold.nii
│           └── sub-01_ses-mri_task-FaceSymmetry_run-2_events.tsv
├── README
├── participants.tsv
├── participants.json
├── T1w.json
├── task-FaceSymmetry_bold.json
└── dataset_description.json
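The last step of this section (removing duplicate json files) can be sketched as follows. The sketch assumes that you created one *_bold.json sidecar per run in the func folder and that both contain the same metadata. `hoist_sidecar` is a hypothetical helper written for this tutorial. It only compares files byte by byte, so two sidecars with the same content but different formatting are treated as different.

```shell
#!/bin/sh

# hoist_sidecar RUN1_JSON RUN2_JSON TOP_LEVEL_JSON
# If the two run-level sidecars are byte-identical, keep a single copy at
# TOP_LEVEL_JSON and delete the run-level duplicates; otherwise change nothing.
hoist_sidecar() {
    if cmp -s "$1" "$2" 2> /dev/null; then
        cp "$1" "$3" && rm "$1" "$2"
        echo "moved shared metadata to $3"
    else
        echo "sidecars differ or are missing, nothing changed"
    fi
}

# Example call, to be run from within the `raw` folder.
func_dir=sub-01/ses-mri/func
hoist_sidecar \
    "${func_dir}/sub-01_ses-mri_task-FaceSymmetry_run-1_bold.json" \
    "${func_dir}/sub-01_ses-mri_task-FaceSymmetry_run-2_bold.json" \
    "task-FaceSymmetry_bold.json"
```

Because of the inheritance principle, the top-level task-FaceSymmetry_bold.json then applies to every bold file of that task in the dataset. Run the BIDS validator again afterwards to confirm that no metadata went missing.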
4. Dessert: defacing, quality control, upload your data to GIN...
Defacing
- with SPM: Batch --> SPM --> Utils --> De-face images
MRIQC
# from within the `raw` folder
bids_dir=`pwd`
output_dir=${bids_dir}/../derivatives/mriqc
docker run -it --rm \
-v ${bids_dir}:/data:ro \
-v ${output_dir}:/out \
poldracklab/mriqc:0.16.1 /data /out \
--participant_label sub-01 \
--verbose-reports \
participant
Uploading your data to GIN
- create an account on GIN
- upload your public SSH key to GIN for SSH access
  - you might need to create one first
- create an empty repository on GIN
datalad siblings add --name gin --url git@gin.g-node.org:/your_username/your_repository.git
datalad push --to gin
- More information on the datalad handbook
Things to improve ?
Useful links
- BIDS specification
- BIDS starter kit
- BIDS validator
- BIDS examples
- Neurostars forum
- Other conversion tutorial
- Conversion tools
- Datalad handbook
- GIN