2025-05-07 14:20:48 +00:00
1 changed files with 70 additions and 2 deletions
--- a/extra-docs/about.md
+++ b/extra-docs/about.md
@@ -59,7 +59,75 @@ flowchart LR
    USER1 ~~~ USER2 ~~~ USER3
 ```
 ## Curation workflows
 Depending on the nature of the metadata and the respective audiences for producing
 and consuming metadata, curation workflows differ substantially. The following sections
 collect some ideas and constraints to keep in mind when designing such workflows
 in this context.
 ### PIDs also require curation
 Persistent identifiers (PIDs) play a key role in this metadata concept. Data
 models and vocabularies can change flexibly, but records still describe one and
 the same `Thing` when the PID is identical.
 Persistent identifiers allow referencing entities in contexts where not all
 information about an entity is available. One can reference a `Person` without
 having to reveal possibly sensitive information about that `Person` at the same
 time. For example, a public `Person` record about an academic may only contain
 a name and a work contact email (equivalent to the information available on
 a corresponding author in a journal publication). At the same time, an internal
 `Person` record would have additional information, like a private cell phone number.
 The public record can be generated from the richer, internal record by stripping
 information.
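 The stripping step can be sketched as follows. This is a minimal illustration, not a
 prescribed implementation: the field names (`name`, `work_email`, `private_phone`), the
 PID `example:person/1`, and the helper `to_public_record` are all hypothetical.

```python
# Hypothetical allow-list of fields considered safe to publish.
PUBLIC_PERSON_FIELDS = frozenset({"pid", "name", "work_email"})


def to_public_record(internal_record, public_fields=PUBLIC_PERSON_FIELDS):
    """Return a new record that keeps only the fields cleared for publication."""
    return {k: v for k, v in internal_record.items() if k in public_fields}


# Hypothetical internal record with one sensitive field.
internal = {
    "pid": "example:person/1",
    "name": "Jane Doe",
    "work_email": "jane.doe@example.org",
    "private_phone": "+00 000 0000000",
}
public = to_public_record(internal)
```

 An allow-list is used here rather than a deny-list, so that fields added to the internal
 record later are not published by default. Note that this sketch carries the PID over
 unchanged.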
 #### PIDs may require mapping
 However, an identifier itself can also carry information. For example, an ORCID
 identifier can typically be used to reveal the name of a person. Hence, when an
 ORCID is used as the PID for a metadata record, any place where the identifier
 is mentioned also reveals the name of the person. If the identifiers used for
 an internal, protected record and a corresponding public record are the same,
 cross-referencing may be enabled unintentionally.
 In such cases, it can be necessary to maintain mapping tables for PIDs of the
 same entity in different contexts.
 Maintaining a separate PID mapping is also an instrument to aid (future)
 anonymization of records. When the mapping is destroyed (and other conditions
 are fulfilled too), a PID-based re-identification is potentially made impossible.
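 A mapping table of this kind might look like the sketch below. The class `PidMapping`
 and the prefix `example:public/` are hypothetical, and an in-memory dictionary stands in
 for whatever protected storage a real deployment would use.

```python
import secrets


class PidMapping:
    """Maps internal PIDs to unrelated public PIDs (hypothetical helper)."""

    def __init__(self, public_prefix="example:public/"):
        self._prefix = public_prefix
        self._internal_to_public = {}

    def public_pid(self, internal_pid):
        """Return the public PID for an internal one, minting it on first use."""
        if internal_pid not in self._internal_to_public:
            # Random, so the public PID cannot be derived from the internal one.
            self._internal_to_public[internal_pid] = (
                self._prefix + secrets.token_hex(8)
            )
        return self._internal_to_public[internal_pid]

    def destroy(self):
        """Forget every link between internal and public PIDs."""
        self._internal_to_public.clear()
```

 Because the public PID is random rather than computed from the internal PID (for
 example, by hashing), the link between the two exists only in the table. Clearing the
 table covers only that one condition; any copies or backups of the mapping would have to
 be destroyed as well.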
 #### PIDs may require curation
 When metadata records are submitted by non-experts, these records already need to have
 PIDs in order to enable submission of multiple, interlinked records. It is advisable
 to use dedicated (actually only temporarily persistent) PIDs for this purpose.
 The reason is that a submitter cannot necessarily be trusted to use the PID of an
 existing record to make further statements. Instead, they may create a new record,
 with the same information as an existing one, and consequently use a new PID to link
 information to this entity. While curation could keep both records and declare them
 "same as" each other, this needlessly inflates the number of records, increases
 the maintenance load, and complicates queries.
 Instead, curation could merge the two records found to describe the same entity,
 and retain only the already existing one, and therefore just one relevant PID.
 Subsequently, all PID references of the duplicate record in the submission
 could be replaced with this original PID.
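 The replacement step could be sketched as below, assuming that a submission is a list of
 dictionaries with a `pid` key and that references are stored as plain PID strings. Both
 are assumptions for illustration, as are the function names and the example PIDs.

```python
def replace_pid(value, old_pid, new_pid):
    """Recursively replace every occurrence of old_pid with new_pid."""
    if isinstance(value, str):
        return new_pid if value == old_pid else value
    if isinstance(value, list):
        return [replace_pid(v, old_pid, new_pid) for v in value]
    if isinstance(value, dict):
        return {k: replace_pid(v, old_pid, new_pid) for k, v in value.items()}
    return value


def merge_duplicate(submission, duplicate_pid, existing_pid):
    """Drop the duplicate record and point all references to the existing PID."""
    kept = [r for r in submission if r.get("pid") != duplicate_pid]
    return [replace_pid(r, duplicate_pid, existing_pid) for r in kept]


# Hypothetical submission: a duplicate person record and a dataset citing it.
submission = [
    {"pid": "inm7:pending/abc", "name": "Jane Doe"},
    {"pid": "inm7:pending/def", "title": "Some dataset",
     "authors": ["inm7:pending/abc"]},
]
curated = merge_duplicate(submission, "inm7:pending/abc", "example:person/1")
```

 This sketch simply discards the duplicate. If the duplicate carries information that the
 existing record lacks, a curator would need to fold it into the existing record first.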
 Using a dedicated PID space for pre-curation PIDs, such as
 `inm7:pending/<random-id>`, can help the curation process by making them easier
 to detect. Moreover, using random, auto-generated PIDs for new, pre-curation
 records also eases the task for submitters. They do not have to learn and follow
 possible rules for PID generation, such as using particular PID systems for certain
 types of records (e.g., DOIs for publications, ORCIDs for researchers, ROR IDs for
 organizations, RRIDs for resources, etc.). This task could be left to professional
 curators.
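 Generating and detecting such pre-curation PIDs could be as simple as the following
 sketch. The use of a UUID for the `<random-id>` part is an assumption, as are the
 function names; the text above only fixes the `inm7:pending/` prefix.

```python
import uuid

PENDING_PREFIX = "inm7:pending/"


def new_pending_pid():
    """Mint a random pre-curation PID in the dedicated pending namespace."""
    return PENDING_PREFIX + uuid.uuid4().hex


def is_pending(pid):
    """Tell whether a PID still awaits curation."""
    return pid.startswith(PENDING_PREFIX)


pid = new_pending_pid()
```

 A curation tool could then use `is_pending` to list all records and references in a
 submission that still need to be assigned a final PID.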
 ## Acknowledgements
-This work was funded by
+This work was funded by the MKW-NRW: Ministerium für Kultur und Wissenschaft
-
+des Landes Nordrhein-Westfalen under the Kooperationsplattformen 2022 program,
 grant number: KP22-106A.