Note on curation load aware schema design #73
1 changed files with 21 additions and 6 deletions
|
|
@ -11,11 +11,11 @@ uses their own data models. Each system allows for submission of additional
|
||||||
or edited records to a staging area where submissions can be subjected to
|
or edited records to a staging area where submissions can be subjected to
|
||||||
verification and curation, before they are accepted.
|
verification and curation, before they are accepted.
|
||||||
|
|
||||||
Metadata records from each system can be transformed to be compliant with a
|
Metadata records from each system can be losslessly transformed to be compliant
|
||||||
generic use case agnostic data model. This generic data model facilitates the
|
with a generic use case agnostic data model. This generic data model
|
||||||
integration of information across applications and workflows. Transformed
|
facilitates the integration of information across applications and workflows.
|
||||||
metadata records are, again, submitted for curation and integration into
|
Transformed metadata records are, again, submitted for curation and integration
|
||||||
a central knowledge base.
|
into a central knowledge base.
|
||||||
|
|
||||||
This central knowledge base can be queried to produce integrated reports.
|
This central knowledge base can be queried to produce integrated reports.
|
||||||
Knowledge base records can also be exported to the data models of individual
|
Knowledge base records can also be exported to the data models of individual
|
||||||
|
|
@ -66,11 +66,26 @@ consuming metadata, curation workflow differ substantially. The following sectio
|
||||||
collect some ideas and constraints to keep in mind when designing such workflows
|
collect some ideas and constraints to keep in mind when designing such workflows
|
||||||
in this context.
|
in this context.
|
||||||
|
|
||||||
|
### Design schemas to reduce churn
|
||||||
|
|
||||||
|
Data models should be designed to prefer linkage to broader, more slowly evolving,
|
||||||
|
less context-constrained entities. For example, the relationship between a
|
||||||
|
container-type entity and its parts should be implemented by a `part_of`
|
||||||
|
relationship, rather than a list of `parts` in the container. This enables
|
||||||
|
the addition of a new part via the creation of a single, additional record
|
||||||
|
-- as opposed to having to create the new record, and then also having to update
|
||||||
|
the part-list.
|
||||||
|
|
||||||
|
This design choice does not limit the on-demand construction of part-lists
|
||||||
|
for "runtime" representations of knowledge for query-focused applications.
|
||||||
|
But it reduces the load on data curation workflows by reducing the number of
|
||||||
|
events that require knowledge merge operations, in favor of plain additions.
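
The `part_of` pattern described above could be sketched as follows. This is only an illustration under stated assumptions: the `Record` class, its `pid` and `part_of` fields, the `build_part_lists` helper, and the `ex:` identifiers are hypothetical and not taken from any actual schema in this repository.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Record:
    """Hypothetical minimal record: a PID plus an optional link to a container."""
    pid: str
    part_of: Optional[str] = None  # PID of the container entity, if any


def build_part_lists(records):
    """Derive container -> parts lists on demand from the `part_of` links."""
    parts = defaultdict(list)
    for rec in records:
        if rec.part_of is not None:
            parts[rec.part_of].append(rec.pid)
    return dict(parts)


records = [
    Record("ex:box"),
    Record("ex:lid", part_of="ex:box"),
]

# Adding a part is a plain addition of one record;
# the container record "ex:box" is not touched.
records.append(Record("ex:hinge", part_of="ex:box"))

# A "runtime" part-list is computed only when a query needs it.
part_lists = build_part_lists(records)
```

With a `parts` list stored in the container instead, the same change would require creating the new record and also editing (and re-curating) the container record.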
|
||||||
|
|
||||||
### PIDs also require curation
|
### PIDs also require curation
|
||||||
|
|
||||||
Persistent identifiers (PID) play a key role in this metadata concept. Data
|
Persistent identifiers (PID) play a key role in this metadata concept. Data
|
||||||
models and vocabularies can change flexibly, but records still describe one and
|
models and vocabularies can change flexibly, but records still describe one and
|
||||||
the same `Thing` when the PID identical.
|
the same `Thing` when the PID is identical.
|
||||||
|
|
||||||
Persistent identifiers allow referencing entities in contexts where not all
|
Persistent identifiers allow referencing entities in contexts where not all
|
||||||
information about an entity is available. One can reference a `Person` without
|
information about an entity is available. One can reference a `Person` without
|
||||||
|
|
|
||||||