Add a timestamp annotation and make annotation classes configurable #145
13 changed files with 430 additions and 52 deletions
CHANGELOG.md (+12 lines)

@@ -1,3 +1,15 @@
+# 4.4.0 (2025-10-07)
+
+## New features
+
+- Record submission time in stored records.
+- Add a configuration option for the submitter-ID class and the submission-time class.
+- Improve configuration validation. Configurations with unknown keys are now
+  rejected.
+
 # 4.3.1 (2025-10-06)

 ## Bugfixes
README.md (+136 lines)

@@ -439,6 +439,142 @@ A Forgejo authentication source can authenticate Forgejo-tokens that have at lea
 - Repository (only if `repository` is set in the configuration): required to determine a team's access to the repository.
+
+#### Submission annotation tag
+
+The service annotates submitted records with a submitter ID and a timestamp.
+Annotations consist of an annotation tag, which defines the class of the annotation, and an annotation value.
+By default, the service uses the class `http://purl.obolibrary.org/obo/NCIT_C54269` for the submitter ID and the class `http://semanticscience.org/resource/SIO_001083` for the submission time.
+(Both tags are converted into CURIEs if the schema of the collection defines an appropriate prefix.)
+
+The default annotation tag classes can be overridden in the configuration on a per-collection basis.
+To override the default tags, add a `submission_tags` attribute to a collection definition.
+The `submission_tags` attribute should contain a mapping that maps `submitter_id_tag`, `submission_time_tag`, or both to an IRI or a CURIE.
+If the schema defines a matching prefix, IRIs are automatically converted to CURIEs before the record is stored.
+The service validates that the prefix of a CURIE is defined in the schema of the collection.
+
+```yaml
+type: collections
+version: 1
+collections:
+  collection_1:
+    default_token: basic_access
+    curated: curated
+    incoming: contributions
+    submission_tags:
+      submitter_id_tag: schema:user_id
+      submission_time_tag: schema:time
+
+...
+
+```
+
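The IRI-to-CURIE conversion described above can be sketched in isolation. This is a minimal illustration, not the service's code: the `prefixes` map and the helper names are assumptions, while the real service derives prefixes from the collection's LinkML schema.

```python
# Sketch of IRI -> CURIE compaction for annotation tags.
# The prefix map is an assumption for illustration; the service
# reads prefixes from the collection's LinkML schema.
import re

# A tag that starts with a URL scheme is treated as a full IRI.
url_regex = re.compile(r'^[a-zA-Z][a-zA-Z0-9+.-]*://')

prefixes = {'oxo': 'http://purl.obolibrary.org/obo/'}

def is_curie(tag: str) -> bool:
    # A CURIE contains a colon but is not a full URL.
    return ':' in tag and url_regex.match(tag) is None

def compact(tag: str) -> str:
    # Convert an IRI to a CURIE if a matching prefix is defined,
    # otherwise return the tag unchanged.
    if is_curie(tag):
        return tag
    for prefix, reference in prefixes.items():
        if tag.startswith(reference):
            return tag.replace(reference, prefix + ':', 1)
    return tag

print(compact('http://purl.obolibrary.org/obo/NCIT_C54269'))  # oxo:NCIT_C54269
print(compact('https://example.org/unknown'))  # no matching prefix: unchanged
```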
+The service currently supports the following backends for storing records:
+- `record_dir`: this backend stores records as YAML files in a directory structure that is defined [here](https://concepts.datalad.org/dump-things-storage-v0/). It reads the backend configuration from a "record collection configuration file" as described [here](https://concepts.datalad.org/dump-things-storage-v0/).
+
+- `sqlite`: this backend stores records in a SQLite database. There is an individual database file, named `__sqlite-records.db`, for each curated area and incoming area.
+
+- `record_dir+stl`: here `stl` stands for "schema-type-layer".
+  This backend stores records in the same format as `record_dir`, but adds special treatment for the `schema_type` attribute in records.
+  It removes `schema_type` attributes from the top-level mapping of a record before storing it as a YAML file. When records are read from this backend, a `schema_type` attribute is added back into the record, using a schema to determine the correct class URI.
+  In other words, all records stored with this backend will have no top-level `schema_type` attribute, and all records read with this backend will have a top-level `schema_type` attribute.
+
+- `sqlite+stl`: this backend stores records in the same format as `sqlite`, but adds the same special treatment for the `schema_type` attribute as `record_dir+stl`.
+
+Backends can be defined per collection in the configuration file.
+The backend is used for the curated area and for the incoming areas of the collection.
+If no backend is defined for a collection, the `record_dir+stl` backend is used by default.
+The `+stl` backends can be useful if an endpoint returns records of multiple classes, because they allow clients to determine the class of each result record.
+
+The service guarantees that backends of all types can co-exist independently in the same directory, i.e., there are no name collisions in files that are used for different backends (as long as no class name starts with `.` or `_`).
+
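The `schema_type` round trip performed by the `+stl` backends can be sketched as follows. This is a simplified stand-alone illustration: the real backends determine the class URI from the collection's schema, so the lookup table below is an assumption.

```python
# Sketch of the schema-type-layer (+stl) round trip:
# strip `schema_type` before storing, restore it on read.

# Hypothetical class-URI lookup; the service derives this from the schema.
class_uri_by_record_class = {'Person': 'abc:Person'}

def strip_schema_type(record: dict) -> dict:
    # Remove the top-level `schema_type` attribute before storage.
    return {k: v for k, v in record.items() if k != 'schema_type'}

def restore_schema_type(record: dict, record_class: str) -> dict:
    # Add `schema_type` back, using the schema-derived class URI.
    return {**record, 'schema_type': class_uri_by_record_class[record_class]}

record = {'pid': 'xyz:HenryAdams', 'schema_type': 'abc:Person'}
stored = strip_schema_type(record)             # no `schema_type` on disk
loaded = restore_schema_type(stored, 'Person')  # `schema_type` restored on read
assert loaded == record
```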
+The following configuration snippet shows how to define a backend for a collection:
+
+```yaml
+...
+collections:
+  collection_with_default_record_dir+stl_backend:
+    # This is a collection with the default backend, i.e. `record_dir+stl`, and
+    # the default authentication, i.e. config-based authentication.
+    default_token: anon_read
+    curated: collection_1/curated
+
+  collection_with_forgejo_authentication_source:
+    # This is a collection with the default backend, i.e. `record_dir+stl`, and
+    # a forgejo-based authentication source. That means it will use a forgejo
+    # instance to determine the permissions of a token for this collection.
+    # The instance is also used to determine the user-id and the incoming label.
+    # In the case of forgejo, the user-id and the incoming label are the
+    # forgejo login associated with the token.
+
+    # We still need the name of a default token. If the token is defined in this
+    # config file, its properties will be determined by the config file. If the
+    # token is not defined in the config file, its properties will be determined
+    # by the authentication sources, in this example by the forgejo instance at
+    # `https://forgejo.example.com`. If there is more than one authentication
+    # source, they will be tried in the order they are defined in the config file.
+    default_token: anon_read  # We still need a default token
+    curated: collection_2/curated
+
+    # Token permissions, user-ids (for record annotations), and the incoming
+    # label can be determined by multiple authentication sources.
+    # If no source is defined, `config` will be used, which reads token
+    # information from the config file.
+    # This example explicitly defines `config` and a second authentication
+    # source, a `forgejo` authentication source.
+    auth_sources:
+      - type: forgejo  # requires `user`-read and `organization`-read permissions on the token
+        # The API-URL of the forgejo instance that should be used
+        url: https://forgejo.example.com/api/v1
+        # An organization
+        organization: data_handling
+        # A team in the organization. The authorization of the team
+        # determines the permissions of the token.
+        team: data_entry_personal
+        # `label_type` determines how an incoming label is created for
+        # a Forgejo token. If `label_type` is `team`, the incoming label
+        # will be `forgejo-team-<organization>-<team>`. If `label_type`
+        # is `user`, the incoming label will be
+        # `forgejo-user-<user-login>`.
+        label_type: team
+        # An optional repository. The token will only be authorized
+        # if the team has access to the repository. Note: if `repository`
+        # is set, the token must have at least repository read
+        # permissions.
+        repository: reference-repository
+
+      # Fallback to the config file.
+      - type: config  # check tokens from the configuration file
+
+    # Multiple authorization sources are allowed. They will be tried in the
+    # order defined in the config file. If an authorization source returns
+    # permissions for a token, those permissions will be used and no other
+    # authorization sources will be queried.
+    # The default authorization source is `config`, which reads the token
+    # permissions, user-id, and incoming label from the config file.
+
+  collection_with_explicit_record_dir+stl_backend:
+    default_token: anon_read
+    curated: collection_3/curated
+    backend:
+      # The record_dir+stl backend is identified by the
+      # type "record_dir+stl". No further attributes are
+      # defined for this backend.
+      type: record_dir+stl
+
+  collection_with_sqlite_backend:
+    default_token: anon_read
+    curated: collection_4/curated
+    backend:
+      # The sqlite backend is identified by the
+      # type "sqlite". It requires a `schema` attribute
+      # that holds the URL of the schema that should
+      # be used in this backend.
+      type: sqlite
+      schema: https://concepts.inm7.de/s/flat-data/unreleased.yaml
+```
+
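The first-match semantics of `auth_sources` described in the comments above can be sketched in plain Python. The source functions here are hypothetical stand-ins, purely for illustration:

```python
# First-match resolution over an ordered list of authentication sources,
# mirroring the behavior described in the configuration comments.

def forgejo_source(token):
    # Stand-in for a remote forgejo lookup that does not know this token.
    return None

def config_source(token):
    # Stand-in for the config-file lookup acting as fallback.
    return {'mode': 'READ_COLLECTION'} if token == 'anon_read' else None

def resolve_token(token, sources):
    for source in sources:
        permissions = source(token)
        if permissions is not None:
            # The first source that knows the token wins; later
            # sources are not queried.
            return permissions
    return None

print(resolve_token('anon_read', [forgejo_source, config_source]))
```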
 ### Command line parameters:

 The service supports the following command line parameters:
@@ -1 +1 @@
-__version__ = '4.3.1'
+__version__ = '4.4.0'
@@ -17,6 +17,8 @@ import yaml
 from fastapi import HTTPException
 from pydantic import (
     BaseModel,
+    ConfigDict,
+    Field,
     ValidationError,
 )
 from yaml.scanner import ScannerError
@@ -29,8 +31,12 @@ from dump_things_service.backends.sqlite import (
     record_file_name as sqlite_record_file_name,
 )
 from dump_things_service.converter import get_conversion_objects
-from dump_things_service.exceptions import ConfigError
+from dump_things_service.exceptions import (
+    ConfigError,
+    CurieResolutionError,
+)
 from dump_things_service.model import get_model_for_schema
+from dump_things_service.resolve_curie import resolve_curie
 from dump_things_service.store.model_store import ModelStore
 from dump_things_service.token import (
     TokenPermission,
@@ -51,6 +57,10 @@ ignored_files = {'.', '..', config_file_name}
 _global_config_instance = None


+class StrictModel(BaseModel):
+    model_config = ConfigDict(extra='forbid')
+
+
 class MappingMethod(enum.Enum):
     digest_md5 = 'digest-md5'
     digest_md5_p3 = 'digest-md5-p3'
@@ -61,7 +71,7 @@ class MappingMethod(enum.Enum):
     after_last_colon = 'after-last-colon'


-class CollectionDirConfig(BaseModel):
+class CollectionDirConfig(StrictModel):
     type: Literal['records']
     version: Literal[1]
     schema: str
@@ -82,26 +92,27 @@ class TokenModes(enum.Enum):


 class TokenCollectionConfig(BaseModel):
+    model_config = ConfigDict(extra='forbid')
     mode: TokenModes
-    incoming_label: str
+    incoming_label: str = Field(strict=True)


-class TokenConfig(BaseModel):
+class TokenConfig(StrictModel):
     user_id: str
     collections: dict[str, TokenCollectionConfig]
     hashed: bool = False


-class BackendConfigRecordDir(BaseModel):
+class BackendConfigRecordDir(StrictModel):
     type: Literal['record_dir', 'record_dir+stl']


-class BackendConfigSQLite(BaseModel):
+class BackendConfigSQLite(StrictModel):
     type: Literal['sqlite', 'sqlite+stl']
     schema: str


-class ForgejoAuthConfig(BaseModel):
+class ForgejoAuthConfig(StrictModel):
     type: Literal['forgejo']
     url: str
     organization: str
@@ -110,19 +121,27 @@ class ForgejoAuthConfig(BaseModel):
     repository: str | None = None


-class ConfigAuthConfig(BaseModel):
+class ConfigAuthConfig(StrictModel):
     type: Literal['config'] = 'config'


-class CollectionConfig(BaseModel):
+class TagConfig(StrictModel):
+    submitter_id_tag: str = 'http://purl.obolibrary.org/obo/NCIT_C54269'
+    submission_time_tag: str = 'http://semanticscience.org/resource/SIO_001083'
+
+
+class CollectionConfig(StrictModel):
     default_token: str
     curated: Path
     incoming: Path | None = None
     backend: BackendConfigRecordDir | BackendConfigSQLite | None = None
     auth_sources: list[ForgejoAuthConfig | ConfigAuthConfig] = [ConfigAuthConfig()]
+    submission_tags: TagConfig = TagConfig()


-class GlobalConfig(BaseModel):
+class GlobalConfig(StrictModel):
+    model_config = ConfigDict(strict=True)
+
     type: Literal['collections']
     version: Literal[1]
     collections: dict[str, CollectionConfig]
@@ -399,6 +418,10 @@ def process_config_object(
     curated_store = ModelStore(
         schema=schema,
         backend=curated_store_backend,
+        tags={
+            'id': collection_info.submission_tags.submitter_id_tag,
+            'time': collection_info.submission_tags.submission_time_tag,
+        }
     )

     instance_config.curated_stores[collection_name] = curated_store
@@ -496,6 +519,14 @@ def process_config_object(
         msg = 'plain tokens clash with hashed tokens'
         raise ConfigError(msg)

+    # Check tags
+    for collection_name, collection_info in config_object.collections.items():
+        module = instance_config.model_info[collection_name][0]
+        try:
+            resolve_curie(module, collection_info.submission_tags.submission_time_tag)
+        except CurieResolutionError as e:
+            raise ConfigError(str(e)) from e
+
     return instance_config
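The effect of `extra='forbid'` on the strict config models is that configurations with unknown keys are rejected at load time. A stand-alone sketch of that behavior, using plain Python rather than pydantic (`ConfigError` here is a local stand-in, and the key set is copied from `TagConfig`):

```python
# Sketch of "reject unknown keys" config validation, mimicking what
# pydantic's ConfigDict(extra='forbid') does for the models above.

class ConfigError(Exception):
    pass

def validate_keys(mapping, allowed):
    # Reject any key that is not part of the model's declared fields.
    unknown = set(mapping) - set(allowed)
    if unknown:
        msg = f'unknown configuration keys: {sorted(unknown)}'
        raise ConfigError(msg)

# Declared fields of the TagConfig model.
tag_config_keys = {'submitter_id_tag', 'submission_time_tag'}

validate_keys({'submitter_id_tag': 'oxo:NCIT_C54269'}, tag_config_keys)  # accepted
try:
    validate_keys({'submiter_id_tag': 'oxo:NCIT_C54269'}, tag_config_keys)  # typo
except ConfigError as error:
    print(error)
```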
@@ -17,21 +17,30 @@ url_regex = re.compile(url_pattern)

 def resolve_curie(
     model: types.ModuleType,
-    curie: str,
+    curie_or_iri: str,
 ) -> str:
-    if ':' not in curie:
-        return curie
+    if ':' not in curie_or_iri:
+        return curie_or_iri

-    if url_regex.match(curie):
-        return curie
+    if not is_curie(curie_or_iri):
+        return curie_or_iri

-    prefix, identifier = curie.split(':', 1)
+    prefix, identifier = curie_or_iri.split(':', 1)
     prefix_value = model.linkml_meta.root.get('prefixes', {}).get(prefix)
     if prefix_value is None:
         msg = (
-            f'cannot resolve CURIE "{curie}". No such prefix: "{prefix}" in '
+            f'cannot resolve CURIE "{curie_or_iri}". No such prefix: "{prefix}" in '
             f'schema: {model.linkml_meta.root["id"]}'
         )
         raise CurieResolutionError(msg)

     return prefix_value['prefix_reference'] + identifier
+
+
+def is_curie(
+    curie_or_iri: str,
+):
+    if ':' not in curie_or_iri:
+        return False
+
+    return url_regex.match(curie_or_iri) is None
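The resolution logic above can be sketched stand-alone. The prefix map below is an assumption standing in for the LinkML schema's prefix metadata, and `CurieResolutionError` is defined locally:

```python
# Sketch of CURIE -> IRI resolution as performed by resolve_curie.
import re

url_regex = re.compile(r'^[a-zA-Z][a-zA-Z0-9+.-]*://')

class CurieResolutionError(Exception):
    pass

# Stand-in for the schema's prefix metadata.
prefixes = {'oxo': 'http://purl.obolibrary.org/obo/'}

def is_curie(curie_or_iri: str) -> bool:
    if ':' not in curie_or_iri:
        return False
    return url_regex.match(curie_or_iri) is None

def resolve_curie(curie_or_iri: str) -> str:
    # Plain names and full IRIs pass through unchanged.
    if ':' not in curie_or_iri or not is_curie(curie_or_iri):
        return curie_or_iri
    prefix, identifier = curie_or_iri.split(':', 1)
    if prefix not in prefixes:
        msg = f'cannot resolve CURIE "{curie_or_iri}". No such prefix: "{prefix}"'
        raise CurieResolutionError(msg)
    return prefixes[prefix] + identifier

print(resolve_curie('oxo:NCIT_C54269'))  # http://purl.obolibrary.org/obo/NCIT_C54269
```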
@@ -1,5 +1,6 @@
 from __future__ import annotations

+from datetime import datetime
 from itertools import chain
 from typing import TYPE_CHECKING
@@ -7,7 +8,7 @@ from dump_things_service.model import (
     get_model_for_schema,
     get_subclasses,
 )
-from dump_things_service.resolve_curie import resolve_curie
+from dump_things_service.resolve_curie import resolve_curie, is_curie
 from dump_things_service.utils import cleaned_json

 if TYPE_CHECKING:
@@ -31,10 +32,12 @@ class _ModelStore:
         self,
         schema: str,
         backend: StorageBackend,
+        tags: dict[str, str]
     ):
         self.schema = schema
         self.model = get_model_for_schema(self.schema)[0]
         self.backend = backend
+        self.tags = tags

     def get_uri(self) -> str:
         return self.backend.get_uri()
@@ -95,25 +98,30 @@ class _ModelStore:
         submitter: str,
     ) -> None:
         """Add submitter IRI to the record annotations, use CURIE if possible"""
-        submitter_iri = self.get_curie(
-            submitter_namespace,
-            submitter_class,
-        )
         if 'annotations' not in json_object:
             json_object['annotations'] = {}
-        json_object['annotations'][submitter_iri] = submitter
+        submitter_curie_or_iri = self.get_curie(self.tags['id'])
+        time_curie_or_iri = self.get_curie(self.tags['time'])
+        json_object['annotations'][submitter_curie_or_iri] = submitter
+        json_object['annotations'][time_curie_or_iri] = datetime.now().isoformat()

     def get_curie(
         self,
-        name_space: str,
-        class_name: str,
+        curie_or_iri: str,
     ) -> str:
+        if is_curie(curie_or_iri):
+            return curie_or_iri
         prefixes = self.model.linkml_meta.root.get('prefixes')
         if prefixes:
             for prefix_info in prefixes.values():
-                if prefix_info['prefix_reference'] == name_space:
-                    return f'{prefix_info["prefix_prefix"]}:{class_name}'
-        return f'{name_space}{class_name}'
+                reference = prefix_info['prefix_reference']
+                if curie_or_iri.startswith(reference):
+                    return curie_or_iri.replace(
+                        reference,
+                        prefix_info['prefix_prefix'] + ':',
+                        1,
+                    )
+        return curie_or_iri

     def extract_inlined(
         self,
@@ -206,6 +214,7 @@ _existing_model_stores = {}
 def ModelStore(  # noqa: N802
     schema: str,
     backend: StorageBackend,
+    tags: dict[str, str],
 ) -> _ModelStore:
     """
     Create a unique model store for the given schema and backend.
@@ -216,7 +225,7 @@ def ModelStore(  # noqa: N802
     """
     existing_model_store, _ = _existing_model_stores.get(id(backend), (None, None))
     if not existing_model_store:
-        existing_model_store = _ModelStore(schema, backend)
+        existing_model_store = _ModelStore(schema, backend, tags)
         # We store a pointer to the backend in the value to ensure that the
         # backend object exists while we use its `id` as a key.
         _existing_model_stores[id(backend)] = existing_model_store, backend
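The annotation step above can be sketched in isolation. `annotate` is a simplified stand-in for `_ModelStore`'s annotation logic (without the CURIE compaction); the tag values are the defaults from the configuration section:

```python
# Sketch of the submitter/timestamp annotation step.
from datetime import datetime

# Per-collection tags under the 'id' and 'time' keys, as in the diff.
tags = {
    'id': 'http://purl.obolibrary.org/obo/NCIT_C54269',
    'time': 'http://semanticscience.org/resource/SIO_001083',
}

def annotate(json_object: dict, submitter: str, tags: dict) -> dict:
    # Add submitter and submission-time annotations to a record.
    annotations = json_object.setdefault('annotations', {})
    annotations[tags['id']] = submitter
    annotations[tags['time']] = datetime.now().isoformat()
    return json_object

record = annotate({'pid': 'xyz:HenryAdams'}, 'test_user_1', tags)
print(record['annotations'][tags['id']])  # test_user_1
```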
@@ -37,58 +37,47 @@ collections:
     incoming: {incoming}
     backend:
       type: record_dir+stl
-      schema: {schema_path}
-      idfx: digest_md5
     auth_sources:
       - type: config
+    submission_tags:
+      submitter_id_tag: oxo:NCIT_C54269
+      submission_time_tag: https://time
   collection_2:
     default_token: basic_access
     curated: {curated}/collection_2
     incoming: incoming_2
     backend:
       type: record_dir+stl
-      schema: {schema_path}
-      idfx: digest_md5
   collection_3:
     default_token: basic_access
     curated: {curated}/collection_3
     incoming: incoming_3
     backend:
       type: record_dir+stl
-      schema: {schema_path}
-      idfx: digest_md5
   collection_4:
     default_token: basic_access
     curated: {curated}/collection_4
     incoming: incoming_4
     backend:
       type: record_dir+stl
-      schema: {schema_path}
-      idfx: digest_md5
   collection_5:
     default_token: basic_access
     curated: {curated}/collection_5
     incoming: incoming_5
     backend:
       type: record_dir+stl
-      schema: {schema_path}
-      idfx: digest_md5
   collection_6:
     default_token: basic_access
     curated: {curated}/collection_6
     incoming: incoming_6
     backend:
       type: record_dir+stl
-      schema: {schema_path}
-      idfx: digest_md5
   collection_7:
     default_token: basic_access
     curated: {curated}/collection_7
     incoming: incoming_7
     backend:
       type: record_dir+stl
-      schema: {schema_path}
-      idfx: digest_md5
   collection_8:
     default_token: basic_access
     curated: {curated}/collection_8
@@ -102,8 +91,6 @@ collections:
     incoming: {incoming}/collection_dlflatsocial-1
     backend:
       type: record_dir+stl
-      schema: https://concepts.datalad.org/s/flat-social/unreleased.yaml
-      idfx: digest_md5
   collection_dlflatsocial-2:
     default_token: basic_access
     curated: {curated}/collection_dlflatsocial-2
@@ -8,7 +8,7 @@ from dump_things_service.config import (
     ConfigError,
     GlobalConfig,
     process_config,
-    process_config_object,
+    process_config_object, get_config,
 )
@@ -56,3 +56,133 @@ tokens:
     global_dict = {}
     with pytest.raises(ConfigError):
         process_config_object(tmp_path, config_object, [], global_dict)
+
+
+def test_submission_tags_handling(dump_stores_simple):
+    config_object = GlobalConfig(
+        **yaml.load(
+            """
+            type: collections
+            version: 1
+            collections:
+              collection_1:
+                default_token: basic_access
+                curated: curated/in_token_1
+                incoming: contributions
+                submission_tags:
+                  submitter_id_tag: no_default_id_tag
+                  submission_time_tag: no_default_time_tag
+              collection_2:
+                default_token: basic_access
+                curated: curated/collection_2
+                incoming: contributions
+            tokens:
+              basic_access:
+                user_id: anonymous
+                collections:
+                  collection_1:
+                    mode: WRITE_COLLECTION
+                    incoming_label: incoming_anonymous
+                  collection_2:
+                    mode: WRITE_COLLECTION
+                    incoming_label: incoming_anonymous
+            """,
+            Loader=yaml.SafeLoader,
+        )
+    )
+
+    global_dict = {}
+    config = process_config_object(dump_stores_simple, config_object, [], global_dict)
+    # Check for specified tags in collection `collection_1`
+    assert config.collections['collection_1'].submission_tags.submission_time_tag == 'no_default_time_tag'
+    assert config.collections['collection_1'].submission_tags.submitter_id_tag == 'no_default_id_tag'
+    # Check for default tags in collection `collection_2`
+    assert config.collections['collection_2'].submission_tags.submission_time_tag == 'http://semanticscience.org/resource/SIO_001083'
+    assert config.collections['collection_2'].submission_tags.submitter_id_tag == 'http://purl.obolibrary.org/obo/NCIT_C54269'
+
+
+def test_submission_tags_resolving(dump_stores_simple):
+    config_object = GlobalConfig(
+        **yaml.load(
+            """
+            type: collections
+            version: 1
+            collections:
+              collection_1:
+                default_token: basic_access
+                curated: curated/in_token_1
+                incoming: contributions
+                submission_tags:
+                  submitter_id_tag: abc:id
+                  submission_time_tag: abc:time
+            tokens:
+              basic_access:
+                user_id: anonymous
+                collections:
+                  collection_1:
+                    mode: WRITE_COLLECTION
+                    incoming_label: incoming_anonymous
+            """,
+            Loader=yaml.SafeLoader,
+        )
+    )
+
+    global_dict = {}
+    process_config_object(dump_stores_simple, config_object, [], global_dict)
+
+
+def test_submission_tags_resolving_error(dump_stores_simple):
+    config_object = GlobalConfig(
+        **yaml.load(
+            """
+            type: collections
+            version: 1
+            collections:
+              collection_1:
+                default_token: basic_access
+                curated: curated/in_token_1
+                incoming: contributions
+                submission_tags:
+                  submitter_id_tag: non-existing:id
+              collection_2:
+                default_token: basic_access
+                curated: curated/in_token_1
+                incoming: contributions
+                submission_tags:
+                  submission_time_tag: non-existing:time
+              collection_3:
+                default_token: basic_access
+                curated: curated/in_token_1
+                incoming: contributions
+                submission_tags:
+                  submitter_id_tag: http://something/non-existing
+              collection_4:
+                default_token: basic_access
+                curated: curated/in_token_1
+                incoming: contributions
+                submission_tags:
+                  submission_time_tag: http://something/non-existing
+            tokens:
+              basic_access:
+                user_id: anonymous
+                collections:
+                  collection_1:
+                    mode: WRITE_COLLECTION
+                    incoming_label: incoming_anonymous
+                  collection_2:
+                    mode: WRITE_COLLECTION
+                    incoming_label: incoming_anonymous
+                  collection_3:
+                    mode: WRITE_COLLECTION
+                    incoming_label: incoming_anonymous
+                  collection_4:
+                    mode: WRITE_COLLECTION
+                    incoming_label: incoming_anonymous
+            """,
+            Loader=yaml.SafeLoader,
+        )
+    )
+
+    global_dict = {}
+    with pytest.raises(ConfigError) as e:
+        process_config_object(dump_stores_simple, config_object, [], global_dict)
@@ -165,6 +165,10 @@ def test_inline_extraction_locally():
     store = ModelStore(
         schema=str(schema_path),
         backend=None,
+        tags={
+            'id': 'abc:id',
+            'time': 'abc:time',
+        }
     )
     store.model = MockedModule()
     records = store.extract_inlined(inlined_object)
@@ -193,6 +197,10 @@ def test_dont_extract_empty_things_locally():
     store = ModelStore(
         schema=str(schema_path),
         backend=None,
+        tags={
+            'id': 'https://id',
+            'time': 'https://time',
+        }
     )
     store.model = MockedModule()
     records = store.extract_inlined(empty_inlined_object)
@@ -1,3 +1,4 @@
+import freezegun
 import pytest  # noqa F401

 from .. import HTTP_200_OK

@@ -19,14 +20,32 @@ xyz:HenryAdams a abc:Person ;
     abc:schema_type "abc:Person" .
 """

-ttl_result_record = """@prefix abc: <http://example.org/person-schema/abc/> .
+ttl_result_record_a = """@prefix abc: <http://example.org/person-schema/abc/> .
+@prefix oxo: <http://purl.obolibrary.org/obo/> .
+@prefix xyz: <http://example.org/person-schema/xyz/> .
+
+xyz:HenryAdams a abc:Person ;
+    abc:annotations [ a abc:Annotation ;
+            abc:annotation_tag <https://time> ;
+            abc:annotation_value "1970-01-01T00:00:00" ],
+        [ a abc:Annotation ;
+            abc:annotation_tag oxo:NCIT_C54269 ;
+            abc:annotation_value "test_user_1" ] ;
+    abc:given_name "Henryöäß" ;
+    abc:schema_type "abc:Person" .
+"""
+
+ttl_result_record_b = """@prefix abc: <http://example.org/person-schema/abc/> .
 @prefix oxo: <http://purl.obolibrary.org/obo/> .
 @prefix xyz: <http://example.org/person-schema/xyz/> .

 xyz:HenryAdams a abc:Person ;
     abc:annotations [ a abc:Annotation ;
             abc:annotation_tag oxo:NCIT_C54269 ;
-            abc:annotation_value "test_user_1" ] ;
+            abc:annotation_value "test_user_1" ],
+        [ a abc:Annotation ;
+            abc:annotation_tag <https://time> ;
+            abc:annotation_value "1970-01-01T00:00:00" ] ;
     abc:given_name "Henryöäß" ;
     abc:schema_type "abc:Person" .
 """

@@ -75,6 +94,7 @@ def test_json_ttl_json(fastapi_client_simple):
     assert json_object == json_record_out


+@freezegun.freeze_time('1970-01-01')
 def test_ttl_json_ttl(fastapi_client_simple):
     test_client, _ = fastapi_client_simple

@@ -115,5 +135,8 @@ def test_ttl_json_ttl(fastapi_client_simple):
     assert response.status_code == HTTP_200_OK
     assert (
         response.text.strip()
-        == ttl_result_record.replace('xyz:HenryAdams', new_json_pid).strip()
+        == ttl_result_record_a.replace('xyz:HenryAdams', new_json_pid).strip()
+    ) or (
+        response.text.strip()
+        == ttl_result_record_b.replace('xyz:HenryAdams', new_json_pid).strip()
     )
@@ -1,5 +1,10 @@
+import datetime
+from datetime import datetime as datetime_object
+
 import pytest  # noqa F401

+import freezegun
+
 from .. import HTTP_200_OK
 from ..utils import cleaned_json

@@ -33,7 +38,22 @@ dlflatsocial:test_john_ttl a dlflatsocial:Person ;
     dlsocialmx:given_name "Johnöüß" .
 """

-ttl_output_record = """@prefix dlflatsocial: <https://concepts.datalad.org/s/flat-social/unreleased/> .
+ttl_output_record_a = """@prefix dlflatsocial: <https://concepts.datalad.org/s/flat-social/unreleased/> .
+@prefix dlsocialmx: <https://concepts.datalad.org/s/social-mixin/unreleased/> .
+@prefix dlthings: <https://concepts.datalad.org/s/things/v1/> .
+@prefix obo: <http://purl.obolibrary.org/obo/> .
+
+dlflatsocial:test_john_ttl a dlflatsocial:Person ;
+    dlsocialmx:given_name "Johnöüß" ;
+    dlthings:annotations [ a dlthings:Annotation ;
+            dlthings:annotation_tag <http://semanticscience.org/resource/SIO_001083> ;
+            dlthings:annotation_value "1970-01-01T00:00:00" ],
+        [ a dlthings:Annotation ;
+            dlthings:annotation_tag obo:NCIT_C54269 ;
+            dlthings:annotation_value "test_user_1" ] .
+"""
+
+ttl_output_record_b = """@prefix dlflatsocial: <https://concepts.datalad.org/s/flat-social/unreleased/> .
 @prefix dlsocialmx: <https://concepts.datalad.org/s/social-mixin/unreleased/> .
 @prefix dlthings: <https://concepts.datalad.org/s/things/v1/> .
 @prefix obo: <http://purl.obolibrary.org/obo/> .

@@ -42,7 +62,10 @@ dlflatsocial:test_john_ttl a dlflatsocial:Person ;
     dlsocialmx:given_name "Johnöüß" ;
     dlthings:annotations [ a dlthings:Annotation ;
             dlthings:annotation_tag obo:NCIT_C54269 ;
-            dlthings:annotation_value "test_user_1" ] .
+            dlthings:annotation_value "test_user_1" ],
+        [ a dlthings:Annotation ;
+            dlthings:annotation_tag <http://semanticscience.org/resource/SIO_001083> ;
+            dlthings:annotation_value "1970-01-01T00:00:00" ] .
 """

 new_json_pid = 'dlflatsocial:another_john_ttl'

@@ -90,6 +113,7 @@ def test_json_ttl_json_dlflatsocial(fastapi_client_simple):
     assert json_object == json_record_out


+@freezegun.freeze_time('1970-01-01')
 def test_ttl_json_ttl_dlflatsocial(fastapi_client_simple):
     test_client, _ = fastapi_client_simple

@@ -131,5 +155,8 @@ def test_ttl_json_ttl_dlflatsocial(fastapi_client_simple):
     assert response.status_code == HTTP_200_OK
     assert (
         response.text.strip()
-        == ttl_output_record.replace('dlflatsocial:test_john_ttl', new_json_pid).strip()
+        == ttl_output_record_a.replace('dlflatsocial:test_john_ttl', new_json_pid).strip()
+    ) or (
+        response.text.strip()
+        == ttl_output_record_b.replace('dlflatsocial:test_john_ttl', new_json_pid).strip()
     )
@@ -385,7 +385,12 @@ def create_token_store(
     if extension == 'stl':
         token_store = SchemaTypeLayer(backend=token_store, schema=schema_uri)

-    model_store = ModelStore(backend=token_store, schema=schema_uri)
+    submission_tags = instance_config.collections[collection_name].submission_tags
+    tags = {
+        'id': submission_tags.submitter_id_tag,
+        'time': submission_tags.submission_time_tag,
+    }
+    model_store = ModelStore(backend=token_store, schema=schema_uri, tags=tags)
     instance_config.all_stores[store_dir] = (collection_name, model_store)

     return model_store
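The hunk above passes the configured tag classes to `ModelStore` as a `tags` mapping with `'id'` and `'time'` keys. As a rough illustration of the data flow (the helper name and dict shapes are assumptions for this sketch, not the service's actual internals), such a mapping plus a submitter id and a timestamp yields the tag/value pairs seen in the Turtle fixtures:

```python
from datetime import datetime

# Illustrative sketch only: 'build_annotations' is a hypothetical helper,
# not part of the service code. It pairs each configured annotation tag
# with its value: the submitter id and the ISO-formatted submission time.
def build_annotations(tags: dict, submitter_id: str, now: datetime) -> list[dict]:
    return [
        {'annotation_tag': tags['id'], 'annotation_value': submitter_id},
        {'annotation_tag': tags['time'], 'annotation_value': now.isoformat()},
    ]

annotations = build_annotations(
    tags={
        'id': 'http://purl.obolibrary.org/obo/NCIT_C54269',
        'time': 'http://semanticscience.org/resource/SIO_001083',
    },
    submitter_id='test_user_1',
    now=datetime(1970, 1, 1),
)
print(annotations[1]['annotation_value'])  # -> 1970-01-01T00:00:00
```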
@@ -102,6 +102,7 @@ run = "python -m dump_things_service.main {args}"
 default-args = ["dump_things_service"]
 extra-dependencies = [
     "dump_things_service",
+    "freezegun",
     "httpx",
     "pytest",
     "pytest-cov",