Copy from hub.datalad/datalink/org: Set up a neurobagel node #14

Open
opened 2025-03-01 12:14:11 +00:00 by jsheunis · 0 comments

Source: https://hub.datalad.org/datalink/org/issues/4


Chronological notes on progress

  • Starting to set up a neurobagel node. First I have to update my OS (which I have delayed for too long) so that I can run the newer version of docker that the neurobagel node setup requires.
  • Getting a node up and running locally with docker compose is easy enough following their instructions. It works as expected, using the public node data that are immediately available to a local node.
  • I'm now looking at podman. Installation on a Mac proceeds without issues, as do podman machine init and podman machine start. To work with the existing docker-compose.yml file, it looks like I'll need to install podman-compose. At the moment, the only reliable source I can find for it on Mac is brew.
  • Paraphrased comment from @mih: podman-compose and docker compose can both be replaced by podman's podlets
  • Continued local testing with docker. The local node works nicely out of the box. Then I tested adding my own data to it. I had to:
    • provide a data dictionary, which was generated from an annotated LinkML schema via the process documented here: https://hub.datalad.org/datalink/tools#generating-a-data-dictionary-for-neurobagel-https-neurobagel-org-from-a-linkml-schema
    • provide TSV data, which I just took from the Neurobagel example (https://github.com/neurobagel/neurobagel_examples/blob/main/data-upload/example_synthetic.tsv). Ideally these data would come from users annotating a dataset using shacl-vue. (A missing part of our future pipeline is generating this TSV directly from graph data captured by a shacl-vue form, where that form is generated from the SHACL exported from the same schema that contains the data dictionary annotations.)
    • generate graph-ready data by running bagel-cli via docker with these files as inputs:
docker run --rm --volume="$PWD:$PWD" -w "$PWD" neurobagel/bagelcli pheno \
    --pheno "data/example_synthetic.tsv" \
    --dictionary "data/data_dictionary.json" \
    --name "JSH synth data 1" \
    --output "data/jsh_synthdata1_pheno.jsonld"
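
Since the command above depends on both input files being present in the mounted working directory, a small pre-flight check can save a container start. This is only a sketch: `check_bagel_inputs` is a hypothetical helper name, not part of bagel-cli, and it only tests that each file exists and is non-empty. It does not validate the TSV against the data dictionary.

```shell
# Hypothetical helper (not part of bagel-cli): succeed only if every
# file passed as an argument exists and is non-empty.
check_bagel_inputs() {
    _missing=0
    for _f in "$@"; do
        if [ ! -s "$_f" ]; then
            echo "missing or empty input: $_f" >&2
            _missing=1
        fi
    done
    return "$_missing"
}

# Possible usage before the docker run command from these notes:
#   check_bagel_inputs data/example_synthetic.tsv data/data_dictionary.json \
#       && docker run --rm ... neurobagel/bagelcli pheno ...
```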

Then the node has to be restarted and the data show up in a query. The table data downloaded from a query result are protected by default, and this can be changed as part of the config.
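
The restart step mentioned above could look roughly like the following. This is a sketch under assumptions: it is run from the directory containing the node's docker-compose.yml, the node is managed with plain `docker compose` (or `podman-compose`), and no extra profile or project flags are needed. `restart_node`, `COMPOSE_CMD` and `DRY_RUN` are names made up for this example.

```shell
# Hypothetical wrapper: stop and start a compose-managed node.
# COMPOSE_CMD (assumption) defaults to "docker compose" and can be set to
# e.g. "podman-compose". With DRY_RUN set, the commands are only printed.
restart_node() {
    _compose=${COMPOSE_CMD:-docker compose}
    # $_compose is left unquoted on purpose so "docker compose" splits
    # into two words.
    ${DRY_RUN:+echo} $_compose down &&
    ${DRY_RUN:+echo} $_compose up -d
}
```

Running `DRY_RUN=1 restart_node` prints the two commands without touching any containers, which is a cheap way to check the settings first.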
