What's this?
A script, convert_bids.py, that can be used to convert participants.tsv files to the flat-data schema.
Prerequisites
Clone the repo:
git clone https://hub.psychoinformatics.de/inm7/annotate.inm7.de-data.git
cd annotate.inm7.de-data/tools
Create a virtual environment and install requirements:
python -m venv ~/my_env
source ~/my_env/bin/activate
pip install -r requirements.txt
Inputs
The script needs to be pointed to the INM-7 superdataset.
You have to:
- clone it
- install all subdatasets:
datalad get . -r -n
- get all participants.tsv files for the datasets that you want to convert
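Before running the conversion, it can help to check which participants.tsv files are actually present locally. The sketch below is not part of this repository. It assumes that a file whose content has not been retrieved shows up as a dangling symlink (as is common for annexed files); the function name find_participants_files is hypothetical.

```python
import os
from pathlib import Path


def find_participants_files(super_root):
    """Return (available, missing) lists of participants.tsv paths under super_root."""
    available, missing = [], []
    for dirpath, dirnames, filenames in os.walk(super_root):
        # Do not descend into git metadata directories.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        if "participants.tsv" in filenames:
            path = Path(dirpath) / "participants.tsv"
            # Path.exists() follows symlinks, so a dangling link counts as missing.
            if path.exists():
                available.append(path)
            else:
                missing.append(path)
    return sorted(available), sorted(missing)
```

Calling find_participants_files("<path-to-superdataset>") returns two lists; any path in the second list still needs its content retrieved before conversion.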
The script also needs to be pointed to a meta.json input file, which provides the input data required for conversion to the flat-data schema. The top-level object in meta.json has a datasets key whose value is another object; populate it with one entry for each dataset that is to be converted. Example structure:
{
  "datasets": {
    "<dataset-shortname>": {
      "path": "<path-to-bids-dataset-relative-to-super-root>",
      "name": "<human-readable-name-of-dataset>",
      "description": "<human-readable-description-of-dataset>",
      "dimensions": {
        "Age": {
          "column": "<name-of-age-column-in-participants-table>",
          "unit": "<year|month|day>"
        },
        "Sex": {
          "column": "<name-of-sex-column-in-participants-table>",
          "map": {
            "F": "female",
            "M": "male",
            "<optional-different-F-level>": "female",
            "<optional-different-M-level>": "male"
          }
        }
      }
    },
    ...
  },
  ...
}
The meta.json file included in this repository has only an empty object as the value of the datasets key; add an entry for each dataset you want to convert.
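As an illustration of the structure above, here is a minimal sketch that builds one dataset entry and checks whether every level in a Sex column is covered by the map. It is not taken from convert_bids.py: the helper names (make_dataset_entry, unmapped_levels), the "my-study" dataset, and all column names and values are hypothetical placeholders.

```python
import csv
import io
import json

ALLOWED_AGE_UNITS = {"year", "month", "day"}


def make_dataset_entry(path, name, description, age_column, age_unit,
                       sex_column, sex_map):
    """Build one entry for the "datasets" object, following the example structure."""
    if age_unit not in ALLOWED_AGE_UNITS:
        raise ValueError(f"unit must be one of {sorted(ALLOWED_AGE_UNITS)}")
    return {
        "path": path,
        "name": name,
        "description": description,
        "dimensions": {
            "Age": {"column": age_column, "unit": age_unit},
            "Sex": {"column": sex_column, "map": dict(sex_map)},
        },
    }


def unmapped_levels(tsv_file, column, level_map):
    """Return the values found in `column` that have no entry in `level_map`."""
    reader = csv.DictReader(tsv_file, delimiter="\t")
    return sorted({row[column] for row in reader} - set(level_map))


# Hypothetical dataset; every value below is a placeholder.
entry = make_dataset_entry(
    path="studies/my-study",
    name="My Study",
    description="An example dataset used to illustrate the structure",
    age_column="age",
    age_unit="year",
    sex_column="sex",
    sex_map={"F": "female", "M": "male", "f": "female", "m": "male"},
)
meta = {"datasets": {"my-study": entry}}

# Stand-in for the contents of a participants.tsv file.
participants = io.StringIO(
    "participant_id\tage\tsex\n"
    "sub-01\t34\tF\n"
    "sub-02\t29\tm\n"
    "sub-03\t41\tn/a\n"
)
sex = entry["dimensions"]["Sex"]
print(unmapped_levels(participants, sex["column"], sex["map"]))  # ['n/a']
print(json.dumps(meta, indent=2))
```

The printed JSON can be saved as meta.json. How the script itself treats levels that are missing from the map is not covered here, so the check only flags them for review.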
Usage
This is the script help:
>> python convert_bids.py -h
usage: convert_bids.py [-h] [--namespace NAMESPACE] [--output OUTPUT] [--post] [--url URL] [--summary] dataset_path metadata_path

positional arguments:
  dataset_path          Path to INM7 superdataset
  metadata_path         Path to meta.json helper file

options:
  -h, --help            show this help message and exit
  --namespace NAMESPACE
                        Main namespace URL to be used for PIDs; defaults to 'https://inm7.de/ns/datamgt/'
  --output OUTPUT       Output file name, e.g. 'output.json'; file will be created in the `tools` directory next to this script; prints output to 'stdout'
                        by default
  --post                In addition to data conversion, also POST data to the backend; a base URL should be provided; the X_DUMPTHINGS_TOKEN token, if
                        required, should be saved to a '.env' file before running the script
  --url URL             Base URL for the backend
  --summary             Print a summary of the transformed metadata
Example:
python convert_bids.py --output flattened_bids.json --summary <path-to-superdataset> <path-to-meta-file>
To convert the data and also POST the results to a dumpthings backend, pass --post together with --url. If the backend requires a token, first save X_DUMPTHINGS_TOKEN to a .env file in the same directory as the script.
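A quick pre-flight check can confirm that the token is in place before posting. The sketch below assumes the conventional KEY=value layout for the .env file, which this README does not spell out. It says nothing about how convert_bids.py itself loads the file, and the helper names read_env_file and has_token are hypothetical.

```python
from pathlib import Path


def read_env_file(env_path):
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    values = {}
    for raw in Path(env_path).read_text().splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def has_token(env_path, key="X_DUMPTHINGS_TOKEN"):
    """Return True if env_path exists and defines a non-empty value for key."""
    path = Path(env_path)
    return path.is_file() and bool(read_env_file(path).get(key))
```

For example, has_token(".env") run from the tools directory reports whether the file defines a non-empty X_DUMPTHINGS_TOKEN.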