Compare PID-values verbatim to support record addressing after prefixes have changed #177

Open
opened 2025-12-10 13:23:09 +00:00 by christian-monch · 0 comments
christian-monch commented 2025-12-10 13:23:09 +00:00 (Migrated from github.com)

If schema prefixes are changed or deleted, records that use the modified or removed prefixes cannot be addressed by their CURIEs anymore.

For example, a schema defines the prefix:

prefixes:
  abc: http://example.org/abc/

Suppose a record with the PID abc:something is stored in a dump-thing-service. The record is stored with the verbatim PID, i.e., abc:something, but the index contains the resolved entry http://example.org/abc/something.

If the prefix is modified or removed from the schema, the service will not be able to resolve abc:something to the same value as before. Therefore, it will not be able to locate the record in the index and will assume that the record does not exist. As a result, CURIE-PID-based retrieval or deletion will no longer work. (One can still address the record with the fully resolved PID, i.e., http://example.org/abc/something, but that requires knowledge of the previous definition of the prefix abc.)
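The failure described above can be illustrated with a minimal sketch. It does not reflect the actual dump-things-server code: the function names `resolve_curie` and `lookup`, the dictionary-based index, and the replacement namespace `http://example.org/new/` are all assumptions made for illustration.

```python
from typing import Optional

def resolve_curie(pid: str, prefixes: dict) -> Optional[str]:
    """Expand a CURIE to an IRI (hypothetical helper, not the service's API)."""
    prefix, sep, local = pid.partition(":")
    if sep and local.startswith("//"):
        return pid  # already a fully resolved IRI, use it as-is
    if not sep or prefix not in prefixes:
        return None  # prefix unknown: the CURIE cannot be resolved
    return prefixes[prefix] + local

# The index was built while the schema still defined the prefix `abc`.
old_prefixes = {"abc": "http://example.org/abc/"}
stored_record = {"pid": "abc:something"}  # the record keeps the verbatim PID
index = {resolve_curie("abc:something", old_prefixes): stored_record}

def lookup(pid: str, prefixes: dict) -> Optional[dict]:
    """Index-only lookup: resolve the PID, then search the index."""
    iri = resolve_curie(pid, prefixes)
    return index.get(iri) if iri is not None else None

found_before = lookup("abc:something", old_prefixes)
found_after_removal = lookup("abc:something", {})
found_after_change = lookup("abc:something", {"abc": "http://example.org/new/"})
found_by_full_iri = lookup("http://example.org/abc/something", {})
```

In this sketch only `found_before` and `found_by_full_iri` locate the record. Both lookups that rely on the current prefix map miss it, although the stored record still carries `abc:something`.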

Since the record itself still stores the PID abc:something, it should be possible to address records with CURIE-PIDs, independent of the schema, if the addressing is based on the record content.

We should enable PID-based addressing that compares against the record content, which is stable even when prefixes are changed or deleted. This could probably be enabled via a flag, since it would require additional computation. It could also serve as a fallback when the prefix is no longer known or the resolved PID does not exist in the index.
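One possible shape for this fallback is sketched below. It is not a proposal for the actual implementation: `find_record`, the `verbatim_fallback` flag, the in-memory `records` list, and the `pid` key are hypothetical names, and a real store would likely scan persisted records rather than a Python list.

```python
from typing import Iterable, Optional

def resolve_curie(pid: str, prefixes: dict) -> Optional[str]:
    """Expand a CURIE to an IRI (hypothetical helper)."""
    prefix, sep, local = pid.partition(":")
    if sep and local.startswith("//"):
        return pid  # already a fully resolved IRI
    if not sep or prefix not in prefixes:
        return None  # prefix is no longer known
    return prefixes[prefix] + local

def find_record(
    pid: str,
    prefixes: dict,
    index: dict,
    records: Iterable[dict],
    verbatim_fallback: bool = True,
) -> Optional[dict]:
    """Try the index first, then compare the PID verbatim with record content."""
    iri = resolve_curie(pid, prefixes)
    if iri is not None and iri in index:
        return index[iri]  # fast path: the resolved PID is in the index
    if verbatim_fallback:
        # Slow path: linear scan, which is the additional computation
        # that motivates putting this behind a flag.
        for record in records:
            if record.get("pid") == pid:
                return record
    return None

records = [{"pid": "abc:something", "name": "example"}]
index = {"http://example.org/abc/something": records[0]}

hit_prefix_removed = find_record("abc:something", {}, index, records)
hit_prefix_changed = find_record(
    "abc:something", {"abc": "http://example.org/new/"}, index, records
)
miss_without_flag = find_record(
    "abc:something", {}, index, records, verbatim_fallback=False
)
```

With the flag enabled, the record is found both when the prefix was removed and when it was redefined. With the flag disabled, the lookup behaves as it does today and reports the record as missing.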

Reference
orinoco/dump-things-server#177