# DataLink Tools

_WORK IN PROGRESS - CAN CHANGE AT ANY MOMENT_

Tooling to support interoperability where linked data and DataLad meet. These tools are co-developed in order to integrate several other tools/frameworks:

- https://github.com/psychoinformatics-de/datalad-concepts
- https://github.com/psychoinformatics-de/shacl-vue
- https://datalad.org/
- https://neurobagel.org/
- https://www.trr379.de/

## Using DataLink Tools

1. In a virtual environment: install the latest version of `linkml` with `pip`, either from PyPI or from source at https://github.com/linkml/linkml
2. Clone this tools repository: `git clone https://hub.datalad.org/datalink/tools.git datalink_tools`
3. Make sure the content of the `datalad-concepts` submodule is available locally
4. Patch your LinkML installation to allow correct integration with `datalad-concepts` and the tools in this repository:

```
cd datalink_tools/datalad-concepts
tools/patch_linkml
```

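Before running the patch step, it can help to confirm that the earlier steps actually took effect. The following is a minimal sketch of such a check, not part of this repository: the `preflight` function is a hypothetical helper, and it assumes the clone lives in `datalink_tools` as in the steps above. It relies on the fact that an uninitialised Git submodule is an empty directory.

```python
import importlib.util
from pathlib import Path


def preflight(repo_root: Path) -> list[str]:
    """Return a list of human-readable problems; empty means ready to patch."""
    problems = []

    # Step 1: linkml must be importable in the active (virtual) environment.
    if importlib.util.find_spec("linkml") is None:
        problems.append("linkml is not installed in this environment")

    # Step 3: an uninitialised submodule shows up as an empty directory.
    submodule = repo_root / "datalad-concepts"
    if not submodule.is_dir() or not any(submodule.iterdir()):
        problems.append(f"submodule content missing: {submodule}")
    # Step 4 runs this script from inside the submodule.
    elif not (submodule / "tools" / "patch_linkml").exists():
        problems.append("tools/patch_linkml not found in the submodule")

    return problems


if __name__ == "__main__":
    # Assumes the current directory is the parent of the clone.
    for problem in preflight(Path("datalink_tools")):
        print("NOT READY:", problem)
```

If the script prints nothing, the environment and the submodule look ready for `tools/patch_linkml`.
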
## Generating UI-annotated SHACL shapes from a LinkML schema

LinkML can export its YAML schemas and validated YAML data into various formats. One such format for schemas is the Shapes Constraint Language (SHACL), a standard for validating RDF graph data. Apart from validation, SHACL schemas containing enough information can be used by applications or scripts to autogenerate user interfaces. An example is [`shacl-vue`](https://github.com/psychoinformatics-de/shacl-vue), a VueJS-based web application that uses SHACL for auto-generating metadata forms, viewers, and editors. To support such applications, the SHACL exported by LinkML should include UI annotations.

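To make "UI annotations" more concrete: the SHACL specification defines non-validating property shape characteristics such as `sh:name`, `sh:order`, and `sh:group` (pointing to an `sh:PropertyGroup`), which a form generator can use to label, sort, and group fields. The sketch below renders such a shape as Turtle using plain string formatting. It is an illustration only and does not reflect how `code/gen_shacl_ui.py` works internally; all `ex:` names and the helper functions are invented for this example.

```python
PREFIXES = (
    "@prefix sh: <http://www.w3.org/ns/shacl#> .\n"
    "@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .\n"
    "@prefix ex: <https://example.org/> .\n"
)


def property_shape(path, name, order, group):
    """Render one property shape carrying non-validating UI hints."""
    return (
        "    sh:property [\n"
        f"        sh:path {path} ;\n"
        f'        sh:name "{name}" ;\n'
        f"        sh:order {order} ;\n"
        f"        sh:group {group}\n"
        "    ]"
    )


def node_shape(shape, target_class, properties):
    """Render a node shape; property shapes are emitted sorted by their order."""
    ordered = sorted(properties, key=lambda p: p["order"])
    body = " ;\n".join(property_shape(**p) for p in ordered)
    return f"{shape} a sh:NodeShape ;\n    sh:targetClass {target_class} ;\n{body} .\n"


def property_group(group, label, order):
    """Render a property group that a UI can use as a form section."""
    return f'{group} a sh:PropertyGroup ;\n    rdfs:label "{label}" ;\n    sh:order {order} .\n'


# Hypothetical example data: two fields of an invented Person class.
fields = [
    {"path": "ex:email", "name": "Email", "order": 2, "group": "ex:BasicGroup"},
    {"path": "ex:name", "name": "Name", "order": 1, "group": "ex:BasicGroup"},
]
print(PREFIXES)
print(node_shape("ex:PersonShape", "ex:Person", fields))
print(property_group("ex:BasicGroup", "Basic information", 0))
```
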
### Update a LinkML schema with UI annotations

*TODO*

### Export SHACL

*TODO*

```
python code/gen_shacl_ui.py datalad-concepts/src/thing/unreleased.yaml data/dlco-shacl-groups.yaml
```

## Generating a data dictionary for [Neurobagel](https://neurobagel.org/) from a LinkML schema

*TODO: A previous version of this README contained a thorough description of how to generate a data dictionary using a now outdated approach. A new approach is to be developed and documented...*

For background, read these [DataLad-Neurobagel integration notes](https://hub.datalad.org/datalink/org/issues/2).

A structured [data dictionary](https://neurobagel.org/dictionaries/) is required by Neurobagel
when adding TSV data to a Neurobagel node. Our goal is to embed the information required by a
Neurobagel data dictionary into any `datalad-concepts`-derived schema that models data that could
end up being represented in a Neurobagel node. This would be useful because the schema would
simultaneously:
- model the data using `datalad-concepts`-compatible classes, e.g. a Scientific Data Distribution schema
- contain UI annotations that enable generating a SHACL shapes graph with sufficient information to drive `shacl-vue` form generation
- contain slots with Neurobagel annotations that enable the programmatic generation of a Neurobagel data dictionary

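Since the new approach is not yet defined, the following is only a rough sketch of what "programmatic generation" from annotated slots could look like. It assumes the schema has already been loaded into a plain dictionary of slots, each with a LinkML-style `description` and `annotations` mapping. The annotation names `nb_term_url` and `nb_label` are invented for this example, and the output is a simplified subset of a Neurobagel data dictionary (`Description` plus `Annotations`/`IsAbout`); consult the Neurobagel documentation for the full set of required fields.

```python
import json


def data_dictionary(slots: dict) -> dict:
    """Build a simplified Neurobagel-style dictionary from annotated slots."""
    dictionary = {}
    for column, slot in slots.items():
        annotations = slot.get("annotations", {})
        # Slots without Neurobagel annotations are left out of the dictionary.
        if "nb_term_url" not in annotations:
            continue
        dictionary[column] = {
            "Description": slot.get("description", ""),
            "Annotations": {
                "IsAbout": {
                    "TermURL": annotations["nb_term_url"],
                    # Fall back to the column name if no label was annotated.
                    "Label": annotations.get("nb_label", column),
                }
            },
        }
    return dictionary


# Hypothetical slots: only "age" carries Neurobagel annotations.
slots = {
    "age": {
        "description": "Age of the participant in years",
        "annotations": {"nb_term_url": "nb:Age", "nb_label": "Age"},
    },
    "notes": {"description": "Free-text remarks"},
}
print(json.dumps(data_dictionary(slots), indent=2))
```

Keeping the Neurobagel information next to each slot means that the schema remains the single place to edit, and the dictionary can be regenerated whenever the schema changes.
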
### CI workflow to register file uploads from shacl-vue editors

The script `code/register-upload.py` is a resource for CI actions that automatically register uploaded annex keys in target repositories. shacl-vue's file upload feature (see https://shacl-vue.psychoinformatics.de/features-file-upload.html#file-upload) essentially stores file content as unused, unreferenced keys in its configured upload repository. This tool identifies such unused keys and registers them in the target repository by creating an annex pointer file. The name and directory structure of the pointer file location are derived from metadata records of the dumpthings instance the file was uploaded to. You can see it being used in [this CI action](https://hub.trr379.de/knowledge-pool/public-files/actions).
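
To illustrate the registration step: a git-annex pointer file is a small text file containing `/annex/objects/` followed by the key and a newline. The sketch below writes such a file for a given key. It is not the code of `code/register-upload.py`; the record fields `collection` and `name` are hypothetical stand-ins for whatever the metadata records provide, and finding the unused keys (for example with `git annex unused`) as well as adding and committing the file with Git are left out.

```python
from pathlib import Path


def pointer_path(record: dict) -> Path:
    """Derive the target location from a metadata record (hypothetical fields)."""
    return Path(record["collection"]) / record["name"]


def write_pointer(repo: Path, record: dict, key: str) -> Path:
    """Create a git-annex pointer file for `key` below `repo`."""
    target = repo / pointer_path(record)
    target.parent.mkdir(parents=True, exist_ok=True)
    # A pointer file holds "/annex/objects/" plus the key, then a newline.
    target.write_text(f"/annex/objects/{key}\n")
    return target


if __name__ == "__main__":
    # Example key for a 3-byte file, in git-annex's BACKEND-sSIZE--HASH.ext form.
    key = "MD5E-s3--acbd18db4cc2f85cedef654fccc4a4d8.txt"
    record = {"collection": "public", "name": "foo.txt"}
    print(write_pointer(Path("target-repo"), record, key))
```
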