Improve performance by processing specific graph updates rather than recalculating all #348
The main culprit being considered here is the `getInstanceItems` function that generates a list of records (for a specified class) to be displayed in `shacl-vue`. Code: https://hub.psychoinformatics.de/orinoco/shacl-vue/src/branch/main/src/composables/useRecords.js#L249-L341.
This function gets called several times during the app's normal operation, but most importantly:
This function recalculates all records every time, from "first principles", i.e. from the graph data. This is unnecessary (apart from the first time), because most records won't change, only individual ones will change or be added to the graph. So only processing changed or new records will allow some performance improvement, hopefully substantial.
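To make the difference concrete, here is a minimal sketch contrasting a full rebuild with an incremental update. It is not the code in `useRecords.js`: the `recordItems` map, the `rebuildAll` and `applyChanges` helpers, the `toItem` conversion and the record shape are all hypothetical, and plain objects stand in for graph data.

```javascript
// Hypothetical store of list items keyed by record id (not the shacl-vue API).
const recordItems = new Map();

// Full rebuild: converts every record on every call.
function rebuildAll(allRecords, toItem) {
  recordItems.clear();
  for (const record of allRecords) {
    recordItems.set(record.id, toItem(record));
  }
  return allRecords.length; // number of records processed
}

// Incremental update: converts only the records that were added or changed.
function applyChanges(changedRecords, toItem) {
  for (const record of changedRecords) {
    recordItems.set(record.id, toItem(record)); // insert or overwrite
  }
  return changedRecords.length; // number of records processed
}

// Stand-in for the (potentially expensive) conversion of a record to a list item.
const toItem = (record) => ({ title: record.label.toUpperCase() });

const records = [
  { id: "ex:a", label: "alpha" },
  { id: "ex:b", label: "beta" },
  { id: "ex:c", label: "gamma" },
];

// The first load still has to process everything.
const initialWork = rebuildAll(records, toItem);

// A later change to one record only touches that record.
const updateWork = applyChanges([{ id: "ex:b", label: "beta v2" }], toItem);
```

With this shape, the cost of an update scales with the number of changed records rather than with the size of the whole list.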
Looking at the cycle from data coming into the graph until it ends up in the list, the starting point is typically the `parseTTLandDedup` function, which gets called every time data is fetched from the backend (see more in the `useData` composable: https://hub.psychoinformatics.de/orinoco/shacl-vue/src/branch/main/src/composables/useData.js). In some components in the app, individual quads are also added to or removed from the graph without running through the `parseTTLandDedup` function, specifically:
The `parseTTLandDedup` function already returns an array of added quads, which is not currently used in all cases. This array can be used directly to process added/changed quads, which is a good first step towards processing only updates instead of recalculating the whole list. The main requirement is to use the quads with named node subjects to identify the new/updated nodes to be processed for the list, and additionally to traverse quads with blank node subjects up to their (grand)parent quad with a named node subject, to identify more new/updated nodes.
For the cases where `parseTTLandDedup` is not used, we already have access to individual quads (e.g. a single record being saved).
The next step is to develop an event emit step and a queue to process the individual new/updated records and to inject those into the list when and where appropriate, instead of recalculating the whole list.
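The traversal and queue described above could look roughly like the following sketch. It assumes quads are plain objects shaped like RDF/JS terms (`termType` of `"NamedNode"`, `"BlankNode"` or `"Literal"`, plus a `value`) and that the graph is available as a plain array of quads. `findAffectedSubjects`, `UpdateQueue` and the `ex:` example data are hypothetical names for illustration, not part of `shacl-vue`.

```javascript
// Small helpers to build RDF/JS-like terms and quads for the example.
const named = (value) => ({ termType: "NamedNode", value });
const blank = (value) => ({ termType: "BlankNode", value });
const literal = (value) => ({ termType: "Literal", value });
const quad = (subject, predicate, object) => ({ subject, predicate, object });

// Return the named-node subjects affected by a set of added quads.
// Blank node subjects are followed upwards until a named node is reached.
function findAffectedSubjects(addedQuads, graphQuads) {
  // Index: blank node id -> subjects of quads that point at that blank node.
  const parentsOfBlank = new Map();
  for (const q of graphQuads) {
    if (q.object.termType === "BlankNode") {
      if (!parentsOfBlank.has(q.object.value)) parentsOfBlank.set(q.object.value, []);
      parentsOfBlank.get(q.object.value).push(q.subject);
    }
  }
  const affected = new Set();
  const visitedBlanks = new Set(); // guards against cycles between blank nodes
  const stack = addedQuads.map((q) => q.subject);
  while (stack.length > 0) {
    const term = stack.pop();
    if (term.termType === "NamedNode") {
      affected.add(term.value);
    } else if (term.termType === "BlankNode" && !visitedBlanks.has(term.value)) {
      visitedBlanks.add(term.value);
      for (const parent of parentsOfBlank.get(term.value) ?? []) stack.push(parent);
    }
  }
  return affected;
}

// Queue that coalesces repeated updates to the same subject before processing.
class UpdateQueue {
  constructor(processSubject) {
    this.pending = new Set();
    this.processSubject = processSubject;
  }
  enqueue(subjectIds) {
    for (const id of subjectIds) this.pending.add(id);
  }
  flush() {
    const batch = [...this.pending];
    this.pending.clear();
    for (const id of batch) this.processSubject(id);
    return batch.length;
  }
}

// Example: ex:alice -> _:b1 -> _:b2, where a quad on _:b2 is newly added.
const addedQuads = [
  quad(blank("b2"), named("ex:lat"), literal("52.1")),
  quad(named("ex:bob"), named("ex:name"), literal("Bob")),
];
const graphQuads = [
  quad(named("ex:alice"), named("ex:address"), blank("b1")),
  quad(blank("b1"), named("ex:geo"), blank("b2")),
  ...addedQuads,
];

const affected = findAffectedSubjects(addedQuads, graphQuads);

const processed = [];
const queue = new UpdateQueue((id) => processed.push(id));
queue.enqueue(affected);
queue.enqueue(["ex:bob"]); // duplicate, coalesced with the earlier entry
const flushedCount = queue.flush();
```

In the app, `flush` would presumably be triggered by the emitted graph event or on a timer, and `processSubject` would build the list item for that one record. Both are left open here.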
I have a local implementation of the above and it seems to work fine, but I have yet to test its performance in detail compared to the previous approach. With the new approach, individual items are added to or updated in a reactive object as they are received, processed, and emitted by the graph. They are stored in the reactive object with record pids as keys and the complete items as values. As a result, the `getInstanceItems` function never has to be called. (Note to self: investigate the same approach for `InstancesSelectEditor`....)
The challenge I currently have is the interaction between a long list of items (all fetched items of all classes, in the case of the `shacl-vue` kickstarter project) and the search text being updated: there is a long delay, specifically when the search text is cleared. This delay is shorter when fewer classes are selected, and longest when all classes are shown.
There are two things that I can think of to investigate here:
I want to stay away from introducing more dependencies though. So I'll try out virtual scroller first.
Ok, `vue-virtual-scroller` didn't seem to make any difference to the performance of the search text....