The idea is that all information is generated using `things/v2+` features only. This would make this tool compatible with any `things`-derived schema, and not specific to something like `research-information`. The adaptation to a specific, derived schema could then be implemented by a generic (not BIDS-specific) tool that looks for information in a record that can be expressed (more natively) using predefined structural slots of a concrete class in a derived schema.
FLATBIDS
This is a tool for reporting metadata records on BIDS datasets (raw data and select derivatives).
At the moment, this is a development sketch for a dedicated audience. If you find this interesting, please get in touch.
Installation
Install directly from the development repository into a dedicated virtual environment:
uv tool install https://hub.psychoinformatics.de/orinoco/bids-things.git
This provides a bids-things command in the PATH.
Running uv tool uninstall bids-things removes the tool again.
Usage
The general usage is:
bids-things <command> <path> <output-prefix> ...
where <command> selects one of the supported processing modes, <path> identifies the DataLad dataset worktree to process, and <output-prefix> determines the location and naming of the output files.
The generated output files are in JSON-lines format (one record per line), and there is one output file for each generated record type. All files share the common <output-prefix>.
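Because each output file holds one JSON record per line, it can be consumed with the Python standard library alone. The following is a minimal sketch, not part of bids-things itself; the helper name read_jsonl is hypothetical, and no assumption is made about the fields inside a record.

```python
import json
from pathlib import Path
from typing import Iterator


def read_jsonl(path: Path) -> Iterator[dict]:
    """Yield one parsed record per non-empty line of a JSON-lines file.

    Hypothetical helper for illustration; bids-things does not ship it.
    """
    with Path(path).open(encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:  # tolerate blank lines between records
                yield json.loads(line)
```

Reading lazily, one line at a time, keeps memory use flat even for large output files.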
Examples
Process a BIDS raw dataset
bids-things parse-raw sourcedata /tmp/metadata-raw
This generates the following files:
/tmp/metadata-raw-DataItem.jsonl
/tmp/metadata-raw-Distribution.jsonl
/tmp/metadata-raw-Dataset.jsonl
/tmp/metadata-raw-Subject.jsonl
where DataItem, Distribution, Dataset, and Subject correspond to concepts defined at https://concepts.datalad.org.
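The file names in this example suggest the pattern <output-prefix>-<RecordType>.jsonl. Assuming that pattern holds in general (it is inferred from the examples here, not from a documented guarantee), the files belonging to one run can be located as in the sketch below. The function name output_files is hypothetical.

```python
from pathlib import Path


def output_files(prefix: str) -> dict[str, Path]:
    """Map record type -> output file for a given output prefix.

    Assumes the `<output-prefix>-<RecordType>.jsonl` naming seen in the
    examples; hypothetical helper, not part of bids-things.
    """
    prefix_path = Path(prefix)
    stem = prefix_path.name + "-"
    suffix = ".jsonl"
    found = {}
    # Plain string matching avoids surprises from glob metacharacters
    # that might occur in a user-chosen prefix.
    for path in sorted(prefix_path.parent.iterdir()):
        name = path.name
        if name.startswith(stem) and name.endswith(suffix):
            record_type = name[len(stem):-len(suffix)]
            found[record_type] = path
    return found
```

For the run above, output_files("/tmp/metadata-raw") would be expected to return entries keyed DataItem, Distribution, Dataset, and Subject. Note that a prefix which is itself the start of another prefix (for example /tmp/metadata) would also match those files.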
Process a MRIQC derivative dataset
bids-things \
parse-mriqc \
. \
mriqc \
071cc3df-cc31-11f0-be2c-dc97ba1c2528 \
inm7/projects/mriqc-openneuro-2025
produces
mriqc-DataItem.jsonl
mriqc-Dataset.jsonl
mriqc-Distribution.jsonl
where the DataItem file contains records of the individual image quality metrics for every MRI image that was processed by MRIQC.
Information common to all MRIQC runs is factored out into a common namespace that is given by the DataLad dataset ID of a super-dataset comprising all MRIQC output datasets (here: 071cc3df-cc31-11f0-be2c-dc97ba1c2528).
The value inm7/projects/mriqc-openneuro-2025 is an identifier of the activity that describes the effort or project that led to or involved the MRIQC runs. This activity record is supposed to capture all information that is common to the respective MRIQC executions, such as software versions, execution environments, and responsible persons. This identifier is also placed into the common namespace. Its complete form is datalad:071cc3df-cc31-11f0-be2c-dc97ba1c2528/inm7/projects/mriqc-openneuro-2025.
The composition of this PID also exemplifies that of other PIDs in the generated output. datalad:071cc3df-cc31-11f0-be2c-dc97ba1c2528 is a DataLad dataset identifier (here the identifier of a super-dataset that comprises all MRIQC output datasets). inm7 is an ad-hoc namespace label (here an organization label). projects/mriqc-openneuro-2025 is an internal identifier for the MRIQC execution activity. Because the DataLad dataset ID is itself globally unique, using it as an anchor makes the composed PID globally unique, even though the remaining components only require local coordination.
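The three-part structure described above can be sketched as a pair of helper functions. This is an illustration under stated assumptions, not code from bids-things: the names Pid, compose_pid, and parse_pid are hypothetical, and treating the first path component after the dataset ID as the namespace label is inferred from this single example.

```python
from typing import NamedTuple


class Pid(NamedTuple):
    dataset_id: str  # DataLad dataset ID, the globally unique anchor
    namespace: str   # ad-hoc namespace label, e.g. an organization
    local_id: str    # identifier that only needs local coordination


def compose_pid(dataset_id: str, namespace: str, local_id: str) -> str:
    """Build a PID of the form datalad:<dataset-id>/<namespace>/<local-id>."""
    return f"datalad:{dataset_id}/{namespace}/{local_id}"


def parse_pid(pid: str) -> Pid:
    """Split a PID into its parts (assumed layout, see the note above)."""
    scheme, _, rest = pid.partition(":")
    if scheme != "datalad":
        raise ValueError(f"unsupported PID scheme in {pid!r}")
    # Split at most twice so the local identifier may itself contain '/'.
    parts = rest.split("/", 2)
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected <dataset-id>/<namespace>/<local-id> in {pid!r}")
    return Pid(*parts)
```

With the values from the example, compose_pid("071cc3df-cc31-11f0-be2c-dc97ba1c2528", "inm7", "projects/mriqc-openneuro-2025") yields the complete form quoted above, and parse_pid recovers the three components from it.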