Improve performance by processing specific graph updates rather than recalculating all #348

Open
opened 2026-04-16 11:39:13 +00:00 by jsheunis · 2 comments
Owner

The main culprit being considered here is the getInstanceItems function that generates a list of records (for a specified class) to be displayed in shacl-vue. Code: https://hub.psychoinformatics.de/orinoco/shacl-vue/src/branch/main/src/composables/useRecords.js#L249-L341.

This function gets called several times during the app's normal operation, but most importantly:

  • on first load
  • when the graph changes (e.g. after more records were fetched because of a search term being entered, or because the user scrolled to the end of a specific records list)

This function recalculates all records every time, from "first principles", i.e. from the graph data. This is unnecessary (apart from the first time), because most records won't change; only individual ones will change or be added to the graph. Processing only the changed or new records should therefore improve performance, hopefully substantially.

Looking at the cycle from data coming into the graph until it ends up in the list, the starting point is typically the parseTTLandDedup function (https://hub.psychoinformatics.de/orinoco/shacl-vue/src/commit/1c622be04658a9e08774eb8a2a002d2ce3cc5496/src/classes/ReactiveRdfDataset.js#L95), which gets called every time data is fetched from the backend (see more in the useData composable: https://hub.psychoinformatics.de/orinoco/shacl-vue/src/branch/main/src/composables/useData.js). In some components in the app, individual quads are also added to or removed from the graph without running through the parseTTLandDedup function, specifically:

  • wizard functionality (added and removed)
  • saving a form

The parseTTLandDedup function already returns an array of added quads, which is not currently used in all cases. This array can be used directly to process added/changed quads, which is the perfect first step towards processing only updated records instead of recalculating the whole list. The main requirement is to use the quads with named node subjects to identify the new/updated nodes to be processed for the list, and additionally to traverse quads with blank node subjects up to their (grand)parent quad with a named node subject to identify more new/updated nodes.
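
The traversal described above could look roughly like the sketch below. It is not the shacl-vue implementation: collectChangedSubjects and findQuadsByObject are hypothetical names, and quads are assumed to be plain objects following the RDF/JS data model (subject/predicate/object terms with termType and value). With an RDF/JS DatasetCore, the parent lookup could presumably be done with dataset.match(null, null, term).

```javascript
// Hypothetical helper: given the quads that were just added and a function
// that returns all quads having a given term as object, collect the IRIs of
// the named nodes whose records need to be (re)processed.
function collectChangedSubjects(addedQuads, findQuadsByObject) {
  const changed = new Set();
  const visited = new Set(); // blank nodes already traversed (guards against cycles)

  const visit = (term) => {
    if (term.termType === 'NamedNode') {
      changed.add(term.value);
      return;
    }
    if (term.termType !== 'BlankNode' || visited.has(term.value)) return;
    visited.add(term.value);
    // Walk up: every quad with this blank node as object belongs to a parent.
    for (const parentQuad of findQuadsByObject(term)) {
      visit(parentQuad.subject);
    }
  };

  for (const quad of addedQuads) visit(quad.subject);
  return changed;
}

// Usage with a tiny in-memory graph (example data, not from the project).
const namedNode = (value) => ({ termType: 'NamedNode', value });
const blankNode = (value) => ({ termType: 'BlankNode', value });
const literal = (value) => ({ termType: 'Literal', value });

const exampleGraph = [
  { subject: namedNode('ex:alice'), predicate: namedNode('ex:address'), object: blankNode('b1') },
  { subject: blankNode('b1'), predicate: namedNode('ex:geo'), object: blankNode('b2') },
  { subject: blankNode('b2'), predicate: namedNode('ex:lat'), object: literal('52.5') },
  { subject: namedNode('ex:bob'), predicate: namedNode('ex:name'), object: literal('Bob') },
];
const lookupByObject = (term) =>
  exampleGraph.filter(
    (q) => q.object.termType === term.termType && q.object.value === term.value
  );

// Only the nested "lat" quad and Bob's name were added in this update.
const changedSubjects = collectChangedSubjects(
  [exampleGraph[2], exampleGraph[3]],
  lookupByObject
);
console.log([...changedSubjects]);
```

The visited set keeps the walk from looping if blank nodes happen to reference each other.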

For the cases where parseTTLandDedup is not used, we already have access to individual quads (e.g. a single record being saved).

The next step is to develop an event emit step and a queue to process the individual new/updated records and to inject those into the list when and where appropriate, instead of recalculating the whole list.
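
One possible shape for such a queue is sketched below, assuming that record pids are strings and that the graph side calls enqueue whenever it emits an update. RecordUpdateQueue, processRecord and schedule are hypothetical names, not part of shacl-vue. Pending pids are deduplicated and processed in one batch per microtask, so a burst of quads for the same record leads to a single update.

```javascript
// Hypothetical queue that batches record updates instead of recalculating
// the whole list. `processRecord` is whatever builds/injects one list item.
class RecordUpdateQueue {
  constructor(processRecord, schedule = (fn) => queueMicrotask(fn)) {
    this.processRecord = processRecord;
    this.schedule = schedule; // injectable so the batching can be tested
    this.pending = new Set(); // deduplicates pids within one batch
    this.scheduled = false;
  }

  // Called for every new/updated record, e.g. from a graph event listener.
  enqueue(pid) {
    this.pending.add(pid);
    if (!this.scheduled) {
      this.scheduled = true;
      this.schedule(() => this.flush());
    }
  }

  // Process everything that accumulated since the last flush.
  flush() {
    const batch = [...this.pending];
    this.pending.clear();
    this.scheduled = false;
    for (const pid of batch) this.processRecord(pid);
    return batch.length;
  }
}
```

Places that bypass parseTTLandDedup (the wizard, saving a form) could call enqueue directly with the pid of the record they touched.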

Author
Owner

I have a local implementation of the above and it seems to work fine, but I have yet to test the performance in detail compared to the previous approach. With the new approach, individual items are added to or updated in a reactive object as they are received, processed, and emitted by the graph. They are stored in the reactive object with record pids as keys and the complete items as values. As a result, the getInstanceItems function never has to be called. (Note to self: investigate the same approach for InstancesSelectEditor....)
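
As a rough illustration of that data structure (not the local implementation itself), the sketch below uses a plain object; in the app it would presumably be wrapped with Vue's reactive(). The function names and the item fields pid, classIRI and label are assumptions made for the example.

```javascript
// Store of list items keyed by record pid. Plain object here so the sketch
// runs standalone; in the app this would presumably be a Vue reactive object.
const itemsByPid = {};

// Insert a new item or overwrite the existing one for the same pid.
function upsertItem(store, item) {
  store[item.pid] = item;
}

// Derive the visible list for the selected classes without touching the graph.
function itemsForClasses(store, selectedClasses) {
  return Object.values(store).filter((item) => selectedClasses.has(item.classIRI));
}

upsertItem(itemsByPid, { pid: 'ex:alice', classIRI: 'ex:Person', label: 'Alice' });
upsertItem(itemsByPid, { pid: 'ex:lab', classIRI: 'ex:Organization', label: 'Lab' });
// A later update for the same pid replaces the item in place.
upsertItem(itemsByPid, { pid: 'ex:alice', classIRI: 'ex:Person', label: 'Alice S.' });
```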

The challenge I have currently is the interaction of a long list of items (all fetched items of all classes, in the case of the shacl-vue kickstarter project) with the search text being updated: there is a long delay specifically when the search text is cleared. This delay is shorter for fewer selected classes, and the longest when all classes are shown.

There are two things that I can think of to investigate here:

  • use vue virtual scroller
  • use some index-based (fuzzy) search library

I want to stay away from introducing more dependencies though. So I'll try out virtual scroller first.
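
For reference, the index-based option does not strictly require a library. Below is a minimal, dependency-free sketch of a token index with prefix matching (not fuzzy matching). TokenIndex and tokenize are hypothetical names, and the sketch does not handle removing or re-indexing an item whose text changed.

```javascript
// Split text into lowercase word tokens (Unicode letters and digits).
function tokenize(text) {
  return text.toLowerCase().split(/[^\p{L}\p{N}]+/u).filter(Boolean);
}

// Hypothetical inverted index: token -> set of record pids containing it.
class TokenIndex {
  constructor() {
    this.tokenToPids = new Map();
  }

  add(pid, text) {
    for (const token of tokenize(text)) {
      if (!this.tokenToPids.has(token)) this.tokenToPids.set(token, new Set());
      this.tokenToPids.get(token).add(pid);
    }
  }

  // Returns the pids that have, for every query term, at least one token
  // starting with that term. Returns null for an empty query ("no filter").
  search(query) {
    const terms = tokenize(query);
    if (terms.length === 0) return null;
    let result = null;
    for (const term of terms) {
      const matches = new Set();
      for (const [token, pids] of this.tokenToPids) {
        if (token.startsWith(term)) {
          for (const pid of pids) matches.add(pid);
        }
      }
      result = result === null
        ? matches
        : new Set([...result].filter((pid) => matches.has(pid)));
    }
    return result;
  }
}
```

A search then iterates over distinct tokens rather than over every item, and the resulting pids can be used as keys into a pid-keyed item store.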

Author
Owner

Ok, vue-virtual-scroller didn't seem to make any difference to the performance of the search text....
