Add a timestamp annotation and make annotation classes configurable #145

Merged
christian-monch merged 3 commits from timestamp into master 2025-10-07 13:41:33 +00:00
13 changed files with 430 additions and 52 deletions


@@ -1,3 +1,15 @@
# 4.4.0 (2025-10-07)

## New features

- Record submission time in stored records.
- Add a configuration option for the submitter ID class and the submission time class.
- Improve configuration validation. Configurations with unknown keys are now
  rejected.

# 4.3.1 (2025-10-06)
## Bugfixes

README.md

@@ -439,6 +439,142 @@ A Forgejo authentication source can authenticate Forgejo-tokens that have at lea
- Repository (only if `repository` is set in the configuration): required to determine a team's access to the repository.
#### Submission annotation tag
The service annotates submitted records with a submitter ID and a timestamp.
Annotations consist of an annotation tag, which defines the class of the annotation, and an annotation value.
By default, the service uses the class `http://purl.obolibrary.org/obo/NCIT_C54269` for the submitter ID and the class `http://semanticscience.org/resource/SIO_001083` for the submission time.
(Both tags are converted into CURIEs if the schema of the collection defines an appropriate prefix.)
The default annotation tag classes can be overridden in the configuration on a per-collection basis.
To override the default tags, add a `submission_tags` attribute to a collection definition.
The `submission_tags` attribute should contain a mapping that maps `submitter_id_tag`, `submission_time_tag`, or both to an IRI or a CURIE.
If the schema defines a matching prefix, IRIs are automatically converted to CURIEs before the record is stored.
The service validates that the prefix of a CURIE is defined in the schema of the collection.
```yaml
type: collections
version: 1
collections:
  collection_1:
    default_token: basic_access
    curated: curated
    incoming: contributions
    submission_tags:
      submitter_id_tag: schema:user_id
      submission_time_tag: schema:time
  ...
```
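For illustration, a record submitted to `collection_1` above would carry annotations keyed by the configured tags; a hypothetical stored-record excerpt (the submitter ID and timestamp values are examples):

```yaml
# hypothetical stored record (excerpt)
pid: xyz:HenryAdams
annotations:
  schema:user_id: test_user_1
  schema:time: '2025-10-07T13:41:33'
```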
The service currently supports the following backends for storing records:
- `record_dir`: this backend stores records as YAML-files in a directory structure that is defined [here](https://concepts.datalad.org/dump-things-storage-v0/). It reads the backend configuration from a "record collection configuration file" as described [here](https://concepts.datalad.org/dump-things-storage-v0/).
- `sqlite`: this backend stores records in a SQLite database. There is an individual database file, named `__sqlite-records.db`, for each curated area and incoming area.
- `record_dir+stl`: here `stl` stands for "schema-type-layer".
This backend stores records in the same format as `record_dir`, but adds special treatment for the `schema_type` attribute in records.
It removes the `schema_type` attribute from the top-level mapping of a record before storing it as a YAML file. When records are read from this backend, a `schema_type` attribute is added back into the record, using a schema to determine the correct class-URI.
In other words, records stored with this backend have no top-level `schema_type` attribute, and records read with this backend always have one.
- `sqlite+stl`: This backend stores records in the same format as `sqlite`, but adds the same special treatment for the `schema_type` attribute as `record_dir+stl`.
Backends can be defined per collection in the configuration file.
The backend will be used for the curated area and for the incoming areas of the collection.
If no backend is defined for a collection, the `record_dir+stl`-backend is used by default.
The `+stl`-backends can be useful if an endpoint returns records of multiple classes, because it allows clients to determine the class of each result record.
The service guarantees that backends of all types can co-exist independently in the same directory, i.e., there are no name collisions between files used by different backends (as long as no class name starts with `.` or `_`).
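The schema-type-layer behavior described above can be sketched in a few lines (a simplified illustration, not the service's implementation: the backend is a plain dict, and the class URI is passed in directly instead of being derived from a schema):

```python
# Sketch of the "+stl" (schema-type-layer) behavior: strip `schema_type`
# before storing a record, add it back when reading.

def stl_store(record: dict, backend: dict) -> None:
    # Store the record without its top-level `schema_type` attribute.
    stored = {k: v for k, v in record.items() if k != 'schema_type'}
    backend[record['pid']] = stored

def stl_read(pid: str, backend: dict, class_uri: str) -> dict:
    # Re-add `schema_type` on read; the service determines `class_uri`
    # from the collection schema.
    record = dict(backend[pid])
    record['schema_type'] = class_uri
    return record

backend = {}
stl_store({'pid': 'p1', 'schema_type': 'abc:Person', 'name': 'x'}, backend)
out = stl_read('p1', backend, 'abc:Person')
```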
The following configuration snippet shows how to define a backend for a collection:
```yaml
...
collections:
  collection_with_default_record_dir+stl_backend:
    # This is a collection with the default backend, i.e. `record_dir+stl`, and
    # the default authentication, i.e. config-based authentication.
    default_token: anon_read
    curated: collection_1/curated
  collection_with_forgejo_authentication_source:
    # This is a collection with the default backend, i.e. `record_dir+stl`, and
    # a forgejo-based authentication source. That means it will use a forgejo
    # instance to determine the permissions of a token for this collection.
    # The instance is also used to determine the user-id and the incoming label.
    # In the case of forgejo, the user-id and the incoming label are the
    # forgejo login associated with the token.
    # We still need the name of a default token. If the token is defined in
    # this config file, its properties will be determined by the config file.
    # If the token is not defined in the config file, its properties will be
    # determined by the authentication sources, in this example by the forgejo
    # instance at `https://forgejo.example.com`.
    # If there is more than one authentication source, they will be tried
    # in the order they are defined in the config file.
    default_token: anon_read  # We still need a default token
    curated: collection_2/curated
    # Token permissions, user-ids (for record annotations), and the incoming
    # label can be determined by multiple authentication sources.
    # If no source is defined, `config` will be used, which reads token
    # information from the config file.
    # This example explicitly defines `config` and a second authentication
    # source, a `forgejo` authentication source.
    auth_sources:
      - type: forgejo  # requires `user`-read and `organization`-read permissions on the token
        # The API-URL of the forgejo instance that should be used
        url: https://forgejo.example.com/api/v1
        # An organization
        organization: data_handling
        # A team in the organization. The authorization of the team
        # determines the permissions of the token.
        team: data_entry_personal
        # `label_type` determines how an incoming label is created for
        # a Forgejo token. If `label_type` is `team`, the incoming label
        # will be `forgejo-team-<organization>-<team>`. If `label_type`
        # is `user`, the incoming label will be `forgejo-user-<user-login>`.
        label_type: team
        # An optional repository. The token will only be authorized
        # if the team has access to the repository. Note: if `repo`
        # is set, the token must have at least repository read
        # permissions.
        repo: reference-repository
      # Fallback to the config file.
      - type: config  # check tokens from the configuration file
      # Multiple authorization sources are allowed. They will be tried in the
      # order defined in the config file. If an authorization source returns
      # permissions for a token, those permissions will be used and no other
      # authorization sources will be queried.
      # The default authorization source is `config`, which reads the token
      # permissions, user-id, and incoming label from the config file.
  collection_with_explicit_record_dir+stl_backend:
    default_token: anon_read
    curated: collection_3/curated
    backend:
      # The record_dir+stl backend is identified by the
      # type "record_dir+stl". No further attributes are
      # defined for this backend.
      type: record_dir+stl
  collection_with_sqlite_backend:
    default_token: anon_read
    curated: collection_4/curated
    backend:
      # The sqlite backend is identified by the
      # type "sqlite". It requires a `schema` attribute
      # that holds the URL of the schema that should
      # be used in this backend.
      type: sqlite
      schema: https://concepts.inm7.de/s/flat-data/unreleased.yaml
```
### Command line parameters:
The service supports the following command line parameters:


@@ -1 +1 @@
-__version__ = '4.3.1'
+__version__ = '4.4.0'


@@ -17,6 +17,8 @@ import yaml
from fastapi import HTTPException
from pydantic import (
    BaseModel,
    ConfigDict,
    Field,
    ValidationError,
)
from yaml.scanner import ScannerError
@@ -29,8 +31,12 @@ from dump_things_service.backends.sqlite import (
    record_file_name as sqlite_record_file_name,
)
from dump_things_service.converter import get_conversion_objects
-from dump_things_service.exceptions import ConfigError
+from dump_things_service.exceptions import (
+    ConfigError,
+    CurieResolutionError,
+)
from dump_things_service.model import get_model_for_schema
+from dump_things_service.resolve_curie import resolve_curie
from dump_things_service.store.model_store import ModelStore
from dump_things_service.token import (
    TokenPermission,
@@ -51,6 +57,10 @@ ignored_files = {'.', '..', config_file_name}

_global_config_instance = None


class StrictModel(BaseModel):
    model_config = ConfigDict(extra='forbid')


class MappingMethod(enum.Enum):
    digest_md5 = 'digest-md5'
    digest_md5_p3 = 'digest-md5-p3'
@@ -61,7 +71,7 @@ class MappingMethod(enum.Enum):
    after_last_colon = 'after-last-colon'


-class CollectionDirConfig(BaseModel):
+class CollectionDirConfig(StrictModel):
    type: Literal['records']
    version: Literal[1]
    schema: str
@@ -82,26 +92,27 @@ class TokenModes(enum.Enum):

class TokenCollectionConfig(BaseModel):
    model_config = ConfigDict(extra='forbid')
    mode: TokenModes
-    incoming_label: str
+    incoming_label: str = Field(strict=True)


-class TokenConfig(BaseModel):
+class TokenConfig(StrictModel):
    user_id: str
    collections: dict[str, TokenCollectionConfig]
    hashed: bool = False


-class BackendConfigRecordDir(BaseModel):
+class BackendConfigRecordDir(StrictModel):
    type: Literal['record_dir', 'record_dir+stl']


-class BackendConfigSQLite(BaseModel):
+class BackendConfigSQLite(StrictModel):
    type: Literal['sqlite', 'sqlite+stl']
    schema: str


-class ForgejoAuthConfig(BaseModel):
+class ForgejoAuthConfig(StrictModel):
    type: Literal['forgejo']
    url: str
    organization: str
@@ -110,19 +121,27 @@ class ForgejoAuthConfig(BaseModel):
    repository: str | None = None


-class ConfigAuthConfig(BaseModel):
+class ConfigAuthConfig(StrictModel):
    type: Literal['config'] = 'config'


-class CollectionConfig(BaseModel):
+class TagConfig(StrictModel):
+    submitter_id_tag: str = 'http://purl.obolibrary.org/obo/NCIT_C54269'
+    submission_time_tag: str = 'http://semanticscience.org/resource/SIO_001083'
+
+
+class CollectionConfig(StrictModel):
    default_token: str
    curated: Path
    incoming: Path | None = None
    backend: BackendConfigRecordDir | BackendConfigSQLite | None = None
    auth_sources: list[ForgejoAuthConfig | ConfigAuthConfig] = [ConfigAuthConfig()]
+    submission_tags: TagConfig = TagConfig()


-class GlobalConfig(BaseModel):
+class GlobalConfig(StrictModel):
+    model_config = ConfigDict(strict=True)
    type: Literal['collections']
    version: Literal[1]
    collections: dict[str, CollectionConfig]
@@ -399,6 +418,10 @@ def process_config_object(
        curated_store = ModelStore(
            schema=schema,
            backend=curated_store_backend,
+            tags={
+                'id': collection_info.submission_tags.submitter_id_tag,
+                'time': collection_info.submission_tags.submission_time_tag,
+            },
        )
        instance_config.curated_stores[collection_name] = curated_store
@@ -496,6 +519,14 @@ def process_config_object(
            msg = 'plain tokens clash with hashed tokens'
            raise ConfigError(msg)

+    # Check tags
+    for collection_name, collection_info in config_object.collections.items():
+        module = instance_config.model_info[collection_name][0]
+        try:
+            resolve_curie(module, collection_info.submission_tags.submission_time_tag)
+        except CurieResolutionError as e:
+            raise ConfigError(str(e)) from e
+
    return instance_config


@@ -17,21 +17,30 @@ url_regex = re.compile(url_pattern)

def resolve_curie(
    model: types.ModuleType,
-    curie: str,
+    curie_or_iri: str,
) -> str:
-    if ':' not in curie:
-        return curie
-    if url_regex.match(curie):
-        return curie
-    prefix, identifier = curie.split(':', 1)
+    if ':' not in curie_or_iri:
+        return curie_or_iri
+    if not is_curie(curie_or_iri):
+        return curie_or_iri
+    prefix, identifier = curie_or_iri.split(':', 1)
    prefix_value = model.linkml_meta.root.get('prefixes', {}).get(prefix)
    if prefix_value is None:
        msg = (
-            f'cannot resolve CURIE "{curie}". No such prefix: "{prefix}" in '
+            f'cannot resolve CURIE "{curie_or_iri}". No such prefix: "{prefix}" in '
            f'schema: {model.linkml_meta.root["id"]}'
        )
        raise CurieResolutionError(msg)
    return prefix_value['prefix_reference'] + identifier


+def is_curie(
+    curie_or_iri: str,
+):
+    if ':' not in curie_or_iri:
+        return False
+    return url_regex.match(curie_or_iri) is None
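The distinction the new `is_curie` helper draws can be shown standalone: a string with a colon that does not parse as a URL is treated as a CURIE. Note that the URL pattern below is an assumption for illustration; the module compiles its own `url_pattern`.

```python
import re

# Assumed URL pattern; the service defines its own `url_pattern`.
url_regex = re.compile(r'^[a-zA-Z][a-zA-Z0-9+.-]*://')

def is_curie(curie_or_iri: str) -> bool:
    # A CURIE has a colon but does not look like a URL.
    if ':' not in curie_or_iri:
        return False
    return url_regex.match(curie_or_iri) is None

print(is_curie('oxo:NCIT_C54269'))                             # True
print(is_curie('http://purl.obolibrary.org/obo/NCIT_C54269'))  # False
```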


@@ -1,5 +1,6 @@
from __future__ import annotations

+from datetime import datetime
from itertools import chain
from typing import TYPE_CHECKING

@@ -7,7 +8,7 @@ from dump_things_service.model import (
    get_model_for_schema,
    get_subclasses,
)
-from dump_things_service.resolve_curie import resolve_curie
+from dump_things_service.resolve_curie import resolve_curie, is_curie
from dump_things_service.utils import cleaned_json

if TYPE_CHECKING:
@@ -31,10 +32,12 @@ class _ModelStore:
        self,
        schema: str,
        backend: StorageBackend,
+        tags: dict[str, str],
    ):
        self.schema = schema
        self.model = get_model_for_schema(self.schema)[0]
        self.backend = backend
+        self.tags = tags

    def get_uri(self) -> str:
        return self.backend.get_uri()
@@ -95,25 +98,30 @@ class _ModelStore:
        submitter: str,
    ) -> None:
        """Add submitter IRI to the record annotations, use CURIE if possible"""
-        submitter_iri = self.get_curie(
-            submitter_namespace,
-            submitter_class,
-        )
        if 'annotations' not in json_object:
            json_object['annotations'] = {}
-        json_object['annotations'][submitter_iri] = submitter
+        submitter_curie_or_iri = self.get_curie(self.tags['id'])
+        time_curie_or_iri = self.get_curie(self.tags['time'])
+        json_object['annotations'][submitter_curie_or_iri] = submitter
+        json_object['annotations'][time_curie_or_iri] = datetime.now().isoformat()

    def get_curie(
        self,
-        name_space: str,
-        class_name: str,
+        curie_or_iri: str,
    ) -> str:
+        if is_curie(curie_or_iri):
+            return curie_or_iri
        prefixes = self.model.linkml_meta.root.get('prefixes')
        if prefixes:
            for prefix_info in prefixes.values():
-                if prefix_info['prefix_reference'] == name_space:
-                    return f'{prefix_info["prefix_prefix"]}:{class_name}'
-        return f'{name_space}{class_name}'
+                reference = prefix_info['prefix_reference']
+                if curie_or_iri.startswith(reference):
+                    return curie_or_iri.replace(
+                        reference,
+                        prefix_info['prefix_prefix'] + ':',
+                        1,
+                    )
+        return curie_or_iri

    def extract_inlined(
        self,
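The rewritten `get_curie` compresses an IRI into a CURIE whenever the schema defines a matching prefix, and otherwise returns the input unchanged. The core logic can be sketched standalone with a plain prefix map (the schema-derived `linkml_meta` structure is replaced by a dict here):

```python
def iri_to_curie(iri: str, prefixes: dict[str, str]) -> str:
    # `prefixes` maps a prefix name to its namespace reference IRI.
    for prefix, reference in prefixes.items():
        if iri.startswith(reference):
            # Replace only the leading namespace with `prefix:`.
            return iri.replace(reference, prefix + ':', 1)
    return iri  # no matching prefix: keep the IRI as-is

prefixes = {'oxo': 'http://purl.obolibrary.org/obo/'}
print(iri_to_curie('http://purl.obolibrary.org/obo/NCIT_C54269', prefixes))  # oxo:NCIT_C54269
```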
@@ -206,6 +214,7 @@ _existing_model_stores = {}

def ModelStore(  # noqa: N802
    schema: str,
    backend: StorageBackend,
+    tags: dict[str, str],
) -> _ModelStore:
    """
    Create a unique model store for the given schema and backend.

@@ -216,7 +225,7 @@ def ModelStore(  # noqa: N802
    """
    existing_model_store, _ = _existing_model_stores.get(id(backend), (None, None))
    if not existing_model_store:
-        existing_model_store = _ModelStore(schema, backend)
+        existing_model_store = _ModelStore(schema, backend, tags)
    # We store a pointer to the backend in the value to ensure that the
    # backend object exists while we use its `id` as a key.
    _existing_model_stores[id(backend)] = existing_model_store, backend


@@ -37,58 +37,47 @@ collections:
    incoming: {incoming}
    backend:
      type: record_dir+stl
-      schema: {schema_path}
-      idfx: digest_md5
    auth_sources:
      - type: config
+    submission_tags:
+      submitter_id_tag: oxo:NCIT_C54269
+      submission_time_tag: https://time
  collection_2:
    default_token: basic_access
    curated: {curated}/collection_2
    incoming: incoming_2
    backend:
      type: record_dir+stl
-      schema: {schema_path}
-      idfx: digest_md5
  collection_3:
    default_token: basic_access
    curated: {curated}/collection_3
    incoming: incoming_3
    backend:
      type: record_dir+stl
-      schema: {schema_path}
-      idfx: digest_md5
  collection_4:
    default_token: basic_access
    curated: {curated}/collection_4
    incoming: incoming_4
    backend:
      type: record_dir+stl
-      schema: {schema_path}
-      idfx: digest_md5
  collection_5:
    default_token: basic_access
    curated: {curated}/collection_5
    incoming: incoming_5
    backend:
      type: record_dir+stl
-      schema: {schema_path}
-      idfx: digest_md5
  collection_6:
    default_token: basic_access
    curated: {curated}/collection_6
    incoming: incoming_6
    backend:
      type: record_dir+stl
-      schema: {schema_path}
-      idfx: digest_md5
  collection_7:
    default_token: basic_access
    curated: {curated}/collection_7
    incoming: incoming_7
    backend:
      type: record_dir+stl
-      schema: {schema_path}
-      idfx: digest_md5
  collection_8:
    default_token: basic_access
    curated: {curated}/collection_8

@@ -102,8 +91,6 @@ collections:
    incoming: {incoming}/collection_dlflatsocial-1
    backend:
      type: record_dir+stl
-      schema: https://concepts.datalad.org/s/flat-social/unreleased.yaml
-      idfx: digest_md5
  collection_dlflatsocial-2:
    default_token: basic_access
    curated: {curated}/collection_dlflatsocial-2


@@ -8,7 +8,7 @@ from dump_things_service.config import (
    ConfigError,
    GlobalConfig,
    process_config,
-    process_config_object,
+    process_config_object, get_config,
)

@@ -56,3 +56,133 @@ tokens:
    global_dict = {}
    with pytest.raises(ConfigError):
        process_config_object(tmp_path, config_object, [], global_dict)
def test_submission_tags_handling(dump_stores_simple):
    config_object = GlobalConfig(
        **yaml.load(
            """
            type: collections
            version: 1
            collections:
              collection_1:
                default_token: basic_access
                curated: curated/in_token_1
                incoming: contributions
                submission_tags:
                  submitter_id_tag: no_default_id_tag
                  submission_time_tag: no_default_time_tag
              collection_2:
                default_token: basic_access
                curated: curated/collection_2
                incoming: contributions
            tokens:
              basic_access:
                user_id: anonymous
                collections:
                  collection_1:
                    mode: WRITE_COLLECTION
                    incoming_label: incoming_anonymous
                  collection_2:
                    mode: WRITE_COLLECTION
                    incoming_label: incoming_anonymous
            """,
            Loader=yaml.SafeLoader,
        )
    )
    global_dict = {}
    config = process_config_object(dump_stores_simple, config_object, [], global_dict)

    # Check for specified tags in collection `collection_1`
    assert config.collections['collection_1'].submission_tags.submission_time_tag == 'no_default_time_tag'
    assert config.collections['collection_1'].submission_tags.submitter_id_tag == 'no_default_id_tag'

    # Check for default tags in collection `collection_2`
    assert config.collections['collection_2'].submission_tags.submission_time_tag == 'http://semanticscience.org/resource/SIO_001083'
    assert config.collections['collection_2'].submission_tags.submitter_id_tag == 'http://purl.obolibrary.org/obo/NCIT_C54269'
def test_submission_tags_resolving(dump_stores_simple):
    config_object = GlobalConfig(
        **yaml.load(
            """
            type: collections
            version: 1
            collections:
              collection_1:
                default_token: basic_access
                curated: curated/in_token_1
                incoming: contributions
                submission_tags:
                  submitter_id_tag: abc:id
                  submission_time_tag: abc:time
            tokens:
              basic_access:
                user_id: anonymous
                collections:
                  collection_1:
                    mode: WRITE_COLLECTION
                    incoming_label: incoming_anonymous
            """,
            Loader=yaml.SafeLoader,
        )
    )
    global_dict = {}
    process_config_object(dump_stores_simple, config_object, [], global_dict)
def test_submission_tags_resolving_error(dump_stores_simple):
    config_object = GlobalConfig(
        **yaml.load(
            """
            type: collections
            version: 1
            collections:
              collection_1:
                default_token: basic_access
                curated: curated/in_token_1
                incoming: contributions
                submission_tags:
                  submitter_id_tag: non-existing:id
              collection_2:
                default_token: basic_access
                curated: curated/in_token_1
                incoming: contributions
                submission_tags:
                  submission_time_tag: non-existing:time
              collection_3:
                default_token: basic_access
                curated: curated/in_token_1
                incoming: contributions
                submission_tags:
                  submitter_id_tag: http://something/non-existing
              collection_4:
                default_token: basic_access
                curated: curated/in_token_1
                incoming: contributions
                submission_tags:
                  submission_time_tag: http://something/non-existing
            tokens:
              basic_access:
                user_id: anonymous
                collections:
                  collection_1:
                    mode: WRITE_COLLECTION
                    incoming_label: incoming_anonymous
                  collection_2:
                    mode: WRITE_COLLECTION
                    incoming_label: incoming_anonymous
                  collection_3:
                    mode: WRITE_COLLECTION
                    incoming_label: incoming_anonymous
                  collection_4:
                    mode: WRITE_COLLECTION
                    incoming_label: incoming_anonymous
            """,
            Loader=yaml.SafeLoader,
        )
    )
    global_dict = {}
    with pytest.raises(ConfigError) as e:
        process_config_object(dump_stores_simple, config_object, [], global_dict)


@@ -165,6 +165,10 @@ def test_inline_extraction_locally():
    store = ModelStore(
        schema=str(schema_path),
        backend=None,
+        tags={
+            'id': 'abc:id',
+            'time': 'abc:time',
+        },
    )
    store.model = MockedModule()
    records = store.extract_inlined(inlined_object)

@@ -193,6 +197,10 @@ def test_dont_extract_empty_things_locally():
    store = ModelStore(
        schema=str(schema_path),
        backend=None,
+        tags={
+            'id': 'https://id',
+            'time': 'https://time',
+        },
    )
    store.model = MockedModule()
    records = store.extract_inlined(empty_inlined_object)


@@ -1,3 +1,4 @@
+import freezegun
import pytest  # noqa F401

from .. import HTTP_200_OK

@@ -19,14 +20,32 @@ xyz:HenryAdams a abc:Person ;
    abc:schema_type "abc:Person" .
"""

ttl_result_record_a = """@prefix abc: <http://example.org/person-schema/abc/> .
@prefix oxo: <http://purl.obolibrary.org/obo/> .
@prefix xyz: <http://example.org/person-schema/xyz/> .

xyz:HenryAdams a abc:Person ;
    abc:annotations [ a abc:Annotation ;
            abc:annotation_tag <https://time> ;
            abc:annotation_value "1970-01-01T00:00:00" ],
        [ a abc:Annotation ;
            abc:annotation_tag oxo:NCIT_C54269 ;
            abc:annotation_value "test_user_1" ] ;
    abc:given_name "Henryöäß" ;
    abc:schema_type "abc:Person" .
"""

ttl_result_record_b = """@prefix abc: <http://example.org/person-schema/abc/> .
@prefix oxo: <http://purl.obolibrary.org/obo/> .
@prefix xyz: <http://example.org/person-schema/xyz/> .

xyz:HenryAdams a abc:Person ;
    abc:annotations [ a abc:Annotation ;
            abc:annotation_tag oxo:NCIT_C54269 ;
            abc:annotation_value "test_user_1" ],
        [ a abc:Annotation ;
            abc:annotation_tag <https://time> ;
            abc:annotation_value "1970-01-01T00:00:00" ] ;
    abc:given_name "Henryöäß" ;
    abc:schema_type "abc:Person" .
"""
@@ -75,6 +94,7 @@ def test_json_ttl_json(fastapi_client_simple):
    assert json_object == json_record_out


+@freezegun.freeze_time('1970-01-01')
def test_ttl_json_ttl(fastapi_client_simple):
    test_client, _ = fastapi_client_simple
@@ -115,5 +135,8 @@ def test_ttl_json_ttl(fastapi_client_simple):
    assert response.status_code == HTTP_200_OK
    assert (
        response.text.strip()
        == ttl_result_record_a.replace('xyz:HenryAdams', new_json_pid).strip()
    ) or (
        response.text.strip()
        == ttl_result_record_b.replace('xyz:HenryAdams', new_json_pid).strip()
    )


@@ -1,5 +1,10 @@
+import datetime
+from datetime import datetime as datetime_object
+
import pytest  # noqa F401
+import freezegun

from .. import HTTP_200_OK
from ..utils import cleaned_json
@@ -33,7 +38,22 @@ dlflatsocial:test_john_ttl a dlflatsocial:Person ;
    dlsocialmx:given_name "Johnöüß" .
"""

ttl_output_record_a = """@prefix dlflatsocial: <https://concepts.datalad.org/s/flat-social/unreleased/> .
@prefix dlsocialmx: <https://concepts.datalad.org/s/social-mixin/unreleased/> .
@prefix dlthings: <https://concepts.datalad.org/s/things/v1/> .
@prefix obo: <http://purl.obolibrary.org/obo/> .

dlflatsocial:test_john_ttl a dlflatsocial:Person ;
    dlsocialmx:given_name "Johnöüß" ;
    dlthings:annotations [ a dlthings:Annotation ;
            dlthings:annotation_tag <http://semanticscience.org/resource/SIO_001083> ;
            dlthings:annotation_value "1970-01-01T00:00:00" ],
        [ a dlthings:Annotation ;
            dlthings:annotation_tag obo:NCIT_C54269 ;
            dlthings:annotation_value "test_user_1" ] .
"""

ttl_output_record_b = """@prefix dlflatsocial: <https://concepts.datalad.org/s/flat-social/unreleased/> .
@prefix dlsocialmx: <https://concepts.datalad.org/s/social-mixin/unreleased/> .
@prefix dlthings: <https://concepts.datalad.org/s/things/v1/> .
@prefix obo: <http://purl.obolibrary.org/obo/> .

dlflatsocial:test_john_ttl a dlflatsocial:Person ;
    dlsocialmx:given_name "Johnöüß" ;
    dlthings:annotations [ a dlthings:Annotation ;
            dlthings:annotation_tag obo:NCIT_C54269 ;
            dlthings:annotation_value "test_user_1" ],
        [ a dlthings:Annotation ;
            dlthings:annotation_tag <http://semanticscience.org/resource/SIO_001083> ;
            dlthings:annotation_value "1970-01-01T00:00:00" ] .
"""
new_json_pid = 'dlflatsocial:another_john_ttl'

@@ -90,6 +113,7 @@ def test_json_ttl_json_dlflatsocial(fastapi_client_simple):
    assert json_object == json_record_out


+@freezegun.freeze_time('1970-01-01')
def test_ttl_json_ttl_dlflatsocial(fastapi_client_simple):
    test_client, _ = fastapi_client_simple
@@ -131,5 +155,8 @@ def test_ttl_json_ttl_dlflatsocial(fastapi_client_simple):
    assert response.status_code == HTTP_200_OK
    assert (
        response.text.strip()
        == ttl_output_record_a.replace('dlflatsocial:test_john_ttl', new_json_pid).strip()
    ) or (
        response.text.strip()
        == ttl_output_record_b.replace('dlflatsocial:test_john_ttl', new_json_pid).strip()
    )


@@ -385,7 +385,12 @@ def create_token_store(
    if extension == 'stl':
        token_store = SchemaTypeLayer(backend=token_store, schema=schema_uri)

-    model_store = ModelStore(backend=token_store, schema=schema_uri)
+    submission_tags = instance_config.collections[collection_name].submission_tags
+    tags = {
+        'id': submission_tags.submitter_id_tag,
+        'time': submission_tags.submission_time_tag,
+    }
+    model_store = ModelStore(backend=token_store, schema=schema_uri, tags=tags)
    instance_config.all_stores[store_dir] = (collection_name, model_store)
    return model_store


@@ -102,6 +102,7 @@ run = "python -m dump_things_service.main {args}"
default-args = ["dump_things_service"]
extra-dependencies = [
    "dump_things_service",
+    "freezegun",
    "httpx",
    "pytest",
    "pytest-cov",