# 5.6.1 (2026-03-20)
## Bugfixes
- Ensure that pending gitaudit log entries are always persisted after
a defined *timeout*. In version 5.6.0, pending changes could remain
cached until the server was shut down and were only written then.
This change ensures that `dump-things-gitaudit-report` will report
all changes after *timeout* seconds.
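
The timeout behaviour described above can be sketched as follows. This is a minimal illustration, not the actual gitaudit implementation: the class `PendingAuditLog`, its methods, and the `persist` callback are hypothetical names, and the sketch assumes that something (for example a periodic task) calls `flush_if_due` regularly.

```python
import time
from typing import Callable, List, Optional


class PendingAuditLog:
    """Buffer audit entries and persist them once `timeout` seconds have passed.

    Hypothetical illustration; names are not taken from dump-things-service.
    """

    def __init__(
        self,
        persist: Callable[[List[str]], None],
        timeout: float,
        clock: Callable[[], float] = time.monotonic,
    ):
        self._persist = persist
        self._timeout = timeout
        self._clock = clock
        self._pending: List[str] = []
        # Arrival time of the oldest entry that has not been persisted yet.
        self._oldest: Optional[float] = None

    def add(self, entry: str) -> None:
        if self._oldest is None:
            self._oldest = self._clock()
        self._pending.append(entry)

    def flush_if_due(self) -> bool:
        """Persist pending entries if the oldest one has waited `timeout` seconds."""
        if self._oldest is None or self._clock() - self._oldest < self._timeout:
            return False
        self._persist(self._pending)
        self._pending, self._oldest = [], None
        return True
```

Because the deadline is measured from the oldest pending entry, no entry waits longer than *timeout* seconds plus the polling interval.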
# 5.6.0 (2026-03-20)
## New features
- Support for audit backends was added to `dump-things-service`. Currently there
is one audit backend type: `gitaudit`. The audit backend stores provenance information
about records, i.e., who changed what at which time. If defined for a collection,
every curated write will generate an entry in the audit log. The curated-write
endpoint has the new parameter `author_id`, which allows specifying an author ID
that will be written to the audit log for the respective modification. If the
author ID is not provided, the ID of the curator will be used as author ID.
- The new tool `dump-things-gitaudit-report` reports audit information for
individual PIDs, i.e., timestamps, user-id, associated diffs, and the
resulting record.
## Improvements
- Ensure uniform annotation structures when records are stored.
# 5.5.0 (2026-02-19)
## New features
- Forgejo authentication sources now generate incoming labels that are unique
to the user and the Forgejo instance. This keeps incoming areas of users on
different Forgejo instances separate, even if the user names are identical.
# 5.4.0 (2026-02-02)
## New features
- The `/server`-endpoint result now contains the names of the classes that are
supported by the collections, i.e., classes for which storage- and
validation-endpoints exist.
- Add `/maintenance`-endpoint to temporarily lock collections for non-curator
access.
# 5.3.6 (2026-01-13)
## Changes
- Redirect `/` to `/docs`
# 5.3.5 (2025-12-18)
## Bugfixes
- Curator- and Incoming-endpoints now return only the requested record and
no additional surrounding structures.
# 5.3.4 (2025-12-17)
## Bugfixes
- add a patch to fix the order of type declarations that are generated by
`linkml.generators.pythongen`
# 5.3.3 (2025-12-16)
## Bugfixes
- fix type designator handling in `RDFLibLoader.from_rdf_graph`. This fixes an
issue where subclasses of ranges, e.g. `dlidentifiers:Identifier`, were not
properly handled.
# 5.3.2 (2025-12-15)
## Bugfixes
- add a patch for faulty `ifabsent`-code generation in LinkML's Pydantic-code generator.
# 5.3.1 (2025-12-11)
## Bugfixes
- paginated curated- and incoming-reads failed if the respective result
set contained 1000 or more records.
# 5.3.0 (2025-12-02)
## New features
- The `/server`-endpoint result now contains the names of the collections and the
schemas that are used by the collections.
# 5.2.1 (2025-11-26)
## Changes
- Improve caching in Forgejo authentication sources. All calls to a
Forgejo-authentication source are now cached with expiration duration
between one and five minutes. Authentication-related changes in Forgejo
instances might therefore become visible after up to five minutes.
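
A minimal sketch of an expiring cache of the kind described above. It is not the code used by the Forgejo authentication source: the class `ExpiringCache`, the `fetch` callback, and the fixed per-cache `ttl_seconds` are assumptions made for illustration.

```python
import time
from typing import Any, Callable, Dict, Hashable, Tuple


class ExpiringCache:
    """Cache results of `fetch(key)` for `ttl_seconds` (hypothetical helper)."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        # Maps key -> (expiry time, cached value).
        self._entries: Dict[Hashable, Tuple[float, Any]] = {}

    def get_or_fetch(self, key: Hashable, fetch: Callable[[Hashable], Any]) -> Any:
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and now < hit[0]:
            return hit[1]  # still fresh, no remote call needed
        value = fetch(key)
        self._entries[key] = (now + self._ttl, value)
        return value
```

With a cache like this, a permission change on the remote side only becomes visible once the cached entry has expired, which is the delay mentioned in this entry.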
# 5.2.0 (2025-10-29)
## New features
- Add the key `use_classes` to the collection configuration mapping. If that
key is present, only the classes listed under this key will receive store-
and validate-endpoints. This solves [issue 100](
https://github.com/christian-monch/dump-things-server/issues/100)
# 5.1.1 (2025-10-29)
## Bugfixes
- Catch CURIE resolution problems in delete-operations and report them in
HTTP-4xx responses. This solves [issue 168](
https://github.com/christian-monch/dump-things-server/issues/168)
# 5.1.0 (2025-10-29)
## New features
- Add the key `ignore_classes` to the collection configuration mapping. All
classes listed under this key will be ignored when store- and
validate-endpoints are created. This solves [issue 100](
https://github.com/christian-monch/dump-things-server/issues/100)
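
The effect of `ignore_classes` (and of the `use_classes` key from version 5.2.0) on endpoint creation can be sketched as a simple filter. The function name `select_endpoint_classes` is hypothetical, and applying both keys together, with `ignore_classes` applied last, is an assumption of this sketch rather than documented behavior.

```python
from typing import Iterable, List, Optional


def select_endpoint_classes(
    schema_classes: Iterable[str],
    use_classes: Optional[Iterable[str]] = None,
    ignore_classes: Optional[Iterable[str]] = None,
) -> List[str]:
    """Return the classes that would receive store- and validate-endpoints."""
    selected = list(schema_classes)
    if use_classes is not None:
        # Only explicitly listed classes remain.
        allowed = set(use_classes)
        selected = [name for name in selected if name in allowed]
    if ignore_classes:
        # Listed classes are dropped.
        ignored = set(ignore_classes)
        selected = [name for name in selected if name not in ignored]
    return selected
```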
# 5.0.3 (2025-10-28)
## Bugfixes
- Improve auto-generated API docs. This solves [issue 165](
https://github.com/christian-monch/dump-things-server/issues/165)
# 5.0.2 (2025-10-27)
## Bugfixes
- Patch LinkML to use meaningful module names for compiled Pydantic code.
# 5.0.1 (2025-10-20)
## Bugfixes
- Dynamically increase the recursion limit when compiling code that is
generated by `PydanticGenerator`. (Until the root-cause for the usage of
excessive recursion is found, this workaround will be in place.)
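
One way to raise the recursion limit dynamically is to retry with a larger limit whenever a `RecursionError` occurs. The sketch below is an illustration under assumptions, not the code used in the service: the helper name, the doubling strategy, and the `max_limit` cap are all hypothetical.

```python
import sys
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_growing_recursion_limit(func: Callable[[], T], max_limit: int = 20_000) -> T:
    """Call `func`, doubling the recursion limit after each RecursionError."""
    original = sys.getrecursionlimit()
    limit = original
    try:
        while True:
            try:
                return func()
            except RecursionError:
                if limit >= max_limit:
                    raise  # give up instead of growing without bound
                limit = min(limit * 2, max_limit)
                sys.setrecursionlimit(limit)
    finally:
        # Restore the limit that was active before the call.
        sys.setrecursionlimit(original)
```

The cap keeps a genuinely unbounded recursion from exhausting the interpreter stack.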
# 5.0.0 (2025-10-17)
## Breaking changes
- Report an HTTP 404 error if the client tries to delete a record that does not
  exist. This changes the behavior of the following endpoints if the record
  identified by the provided PID is not found:
  - `DELETE /<collection>/record`
  - `DELETE /<collection>/curated/record`
  - `DELETE /<collection>/incoming/<label>/record`
# 4.7.0 (2025-10-16)
## New features
- all records that are submitted are validated against the schema before
storing them.
# 4.6.2 (2025-10-15)
## Bugfixes
- fix a bug that prevented the service from detecting externally authorized
incoming zones on disk.
# 4.6.1 (2025-10-15)
## Enhancements
- ensure that a `config` authentication source is defined for collections that
use tokens that are defined in the configuration file.
# 4.6.0 (2025-10-14)
## New features
- support authentication of curators via Forgejo authentication sources.
If the unit `repo.actions` of the respective team is set to `write`, the
token will authorize curator access to the collection, i.e., read and
write access to all incoming areas and to curated records (see the "curator"
and "incoming" endpoints in the API documentation).
# 4.5.3 (2025-10-14)
## Bugfixes
- Increase the recursion limit on Python version 3.11. This prevents a
`RecursionError` when Pydantic modules are generated.
# 4.5.2 (2025-10-11)
## Bugfixes
- Remove workaround for faulty schema definitions because the schema definitions
were fixed in PR https://github.com/psychoinformatics-de/datalad-concepts/pull/393
# 4.5.1 (2025-10-10)
## Bugfixes
- Add workaround for faulty schema definitions to schema merging tool.
# 4.5.0 (2025-10-08)
## New features
- Add the command `dump-things-create-merged-schema`.
This command creates a new schema that contains all schemas that the original schema imported.
The new schema is fully self-contained and does not reference any other schemas.
This is useful to freeze a version of a schema and make it independent of changes in imported schemas.
# 4.4.0 (2025-10-07)
## New features
- Record submission time in stored records.
- Add configuration option for submitter ID-class and the submitter time-class.
- Improve configuration validation. Configurations with unknown keys are now
rejected.
# 4.3.1 (2025-10-06)
## Bugfixes
- report CURIE resolution errors in an HTTP 400 response instead of raising an
internal server error.
# 4.3.0 (2025-10-01)
## New features
- Add validation endpoints: `/<collection>/validate/record/<class>`.
# 4.2.0 (2025-09-30)
## New features
- Use the name `__sqlite-records.db` for SQLite-backends.
Convert databases that use the old name to the new name.
This is not considered a breaking change, since the SQL DB-name is not
considered part of the public API/interface.
- Align CURIE resolution strategy with LinkML's strategy: CURIE-resolution code
now interprets all strings that start with a scheme, colon, and two slashes
as URIs.
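
The CURIE rule above can be sketched as follows. This is an illustration, not the resolver used by the service: `resolve_pid` and the `prefixes` mapping are hypothetical, the scheme pattern follows RFC 3986, and rejecting strings without a colon is an assumption of the sketch.

```python
import re
from typing import Mapping

# A scheme (RFC 3986 syntax), followed by a colon and two slashes.
_URI_PATTERN = re.compile(r"^[A-Za-z][A-Za-z0-9+.-]*://")


def resolve_pid(pid: str, prefixes: Mapping[str, str]) -> str:
    """Return `pid` unchanged if it is a URI, otherwise expand it as a CURIE."""
    if _URI_PATTERN.match(pid):
        return pid
    prefix, separator, local_part = pid.partition(":")
    if not separator:
        # Assumption of this sketch: strings without a colon are rejected.
        raise ValueError(f"neither a URI nor a CURIE: {pid!r}")
    if prefix not in prefixes:
        raise ValueError(f"unknown prefix {prefix!r} in {pid!r}")
    return prefixes[prefix] + local_part
```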
# 4.1.1 (2025-09-30)
## Bugfixes
- Allow two collections to share a store, if the collections use the same
schema.
# 4.1.0 (2025-09-30)
## New features
- Add the command `dump-things-pid-check`, which verifies that all PIDs in a
store that are in CURIE form can be resolved with the given schema.
# 4.0.0 (2025-09-29)
## Breaking changes
- Report an error if a `pid` in CURIE format, i.e., with a prefix, cannot be
resolved because the prefix is unknown. Depending on the schema, this might
break storing of records.
- Remove the error mode. This is not used and no longer needed.
- Remove export functionality. Export can be implemented via the curator API.
- Use DELETE to remove records
([issue #138](https://github.com/christian-monch/dump-things-server/issues/138)).
# 3.6.1 (2025-09-28)
## Bugfixes
- Fix a missing definition error that led to server crashes.
# 3.6.0 (2025-09-28)
## New features
- Support incoming labels from external authentication sources. This
will, for example, give access to Forgejo-authenticated incoming
areas in the `/<collection>/incoming/...`-endpoints.
# 3.5.0 (2025-09-27)
## New features
- Add the following endpoints, which return all records in a store that the
  token authorizes access to:
  - `GET /<collection>/records/`
  - `GET /<collection>/records/p/`
- Add the following endpoint, which deletes the record with pid `<pid>`:
  - `GET /<collection>/delete?pid=<pid>`
- Add `CURATOR`-token mode. If the token has access to the respective
  collection, this mode allows:
  - read/write access to the curated part of a collection
  - read/write access to individual incoming areas of a collection

  The access is low-level, i.e., there is no annotation and no inlined
  extraction.
- Add endpoints for curated area access.
  See https://github.com/christian-monch/dump-things-server/issues/118 for a
  description:
  - `POST /<collection>/curated/record/<class>`
  - `GET /<collection>/curated/records/<class>`
  - `GET /<collection>/curated/records/p/<class>`
  - `GET /<collection>/curated/records/`
  - `GET /<collection>/curated/records/p/`
  - `GET /<collection>/curated/record?pid=<pid>`
  - `GET /<collection>/curated/delete?pid=<pid>`
- Add endpoints to access incoming areas of a collection. See
  https://github.com/christian-monch/dump-things-server/issues/118 for a
  description:
  - `POST /<collection>/incoming/<label>/record/<class>`
  - `GET /<collection>/incoming/`
  - `GET /<collection>/incoming/<label>/records/<class>`
  - `GET /<collection>/incoming/<label>/records/p/<class>`
  - `GET /<collection>/incoming/<label>/records/`
  - `GET /<collection>/incoming/<label>/records/p/`
  - `GET /<collection>/incoming/<label>/record?pid=<pid>`
  - `GET /<collection>/incoming/<label>/delete?pid=<pid>`
# 3.4.0 (2025-09-25)
## New features
- Return detailed information on authentication failures
- Add tags to the OpenAPI definition to improve the API documentation
# 3.3.3 (2025-09-24)
## Bugfixes
- Fix an error in token processing that caused server crashes.
# 3.3.2 (2025-09-24)
## Bugfixes
- Read user ID from authentication information and not from the config file.
This fixes an error when posting with a non-config authentication source.
# 3.3.1 (2025-09-22)
## Bugfixes
- Do not overwrite existing type validators in rdflib. This fixes an error
when converting from JSON to TTL.
# 3.3.0 (2025-09-19)
## New features
- Add the endpoint `/server` which returns version information about the
running server.
## Removed features
- Remove the response header `X-Dumpthings-Service-Version`
# 3.2.1 (2025-09-19)
## Bugfixes
- Fix hashed token handling. This release removes a bug that led to server
crashes.
# 3.2.0 (2025-09-18)
## New features
- Support hashed tokens in the configuration file. The configuration file content
cannot be used to exfiltrate a hashed token.
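
The idea behind hashed tokens can be sketched as follows: the configuration stores only a digest, and an incoming token is hashed before it is compared. The hashing scheme is not stated in this changelog, so the use of SHA-256 and the function names below are assumptions made for illustration.

```python
import hashlib
import hmac


def hash_token(token: str) -> str:
    """Return the hex digest that would be stored instead of the clear-text token."""
    # Assumption: SHA-256; the scheme used by the service may differ.
    return hashlib.sha256(token.encode("utf-8")).hexdigest()


def token_matches(presented_token: str, stored_hash: str) -> bool:
    """Compare a presented token against a stored digest in constant time."""
    return hmac.compare_digest(hash_token(presented_token), stored_hash)
```

Because only the digest is stored, reading the configuration file does not reveal a token that could be presented to the service.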
# 3.1.0 (2025-09-17)
## New features
- Support multiple authentication sources for tokens. Currently two authentication
  sources are provided:
  - `config`: reads token information from the config file
  - `forgejo`: authenticates and authorizes users via a Forgejo token
# 3.0.1 (2025-09-11)
## Bugfixes
- Unpin pydantic-version. This gets rid of a `RecursionError` that appeared on certain schemas.
# 3.0.0 (2025-09-10)
## Breaking Changes
- Remove the `token-permissions`-endpoint.
## Bugfixes
- Fix failing tests
# 2.4.1 (2025-09-06)
## Bugfixes
- Improve performance of class-instance fetching with SQL-backends. (from 2.3.4)
- Catch pydantic `ValidationError` exceptions and convert them to HTTP 400 errors.
This prevents internal server errors in case of invalid input data. (from 2.3.3)
# 2.4.0 (2025-09-02)
## New features
- Search patterns in constrained searches are now matched only against values in
records, and no longer against keys.
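
A minimal sketch of value-only matching. The matching semantics of the service are not specified here, so the case-insensitive substring test and the helper names `iter_values` and `record_matches` are assumptions.

```python
from typing import Any, Iterator


def iter_values(node: Any) -> Iterator[str]:
    """Yield all leaf values of a nested record as strings, skipping keys."""
    if isinstance(node, dict):
        for value in node.values():
            yield from iter_values(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_values(item)
    else:
        yield str(node)


def record_matches(record: dict, pattern: str) -> bool:
    """Return True if `pattern` occurs in any value of `record`."""
    needle = pattern.lower()
    return any(needle in value.lower() for value in iter_values(record))
```

With this approach a search for `given_name` no longer matches every record that merely has a `given_name` key.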
# 2.3.2 (2025-09-01)
## Bugfixes
- This version improves the bugfix that was introduced in version 2.3.1.
A token provided for a collection that is not defined in the token
configuration object will now be ignored.
# 2.3.1 (2025-09-01)
## Bugfixes
- Fix a bug that caused internal server errors if a token was provided for a
collection that was not defined in the token configuration object. Now, this
situation will lead to a 401 Unauthorized error.
(There is no ignoring of the token and no fallback to a default token. To
use a default token, the request should be performed without a token.)
# 2.3.0 (2025-09-01)
## New features
- Support explicit type declarations in TTL input for all types that are
defined in the underlying schema.
# 2.2.0 (2025-08-29)
## New features
- Add an `x-dumpthings-service-version`-header to all API responses. This header
contains the version of the dump-things-service that generated the response.
## Bugfixes
- Fix a bug in matching-parameter handling that prevented class-selection.
# 2.1.0 (2025-08-28)
## New features
- Add a `matching`-parameter to the `/<collection>/records/<class>`- and
`/<collection>/records/p/<class>`-endpoints.
This parameter allows filtering records by simple text matching.
# 2.0.1 (2025-07-24)
## Bugfixes
- Ensure that format conversion errors are caught and reported correctly in
the API.
# 2.0.0 (2025-07-15)
## Breaking changes
- The `--export-to` command line parameter was removed from
`dump-things-service`. It is replaced with the functionally identical command
line parameter `--export-json`.
- The `--sort-by` command line parameter was removed from `dump-things-service`.
The reason is that sorting key modification requires rebuilding of the
`record_dir` indices. That would defeat the purpose of the persistent
`record_dir` index, which is fast startup. All results are now sorted by the
`pid` field by default.
- The backend that was previously called `record_dir` is now called
`record_dir+stl`.
`record_dir+stl` is now the default backend. It is functionally identical to
the previous `record_dir`-backend. Changes in configuration files are only
required if the `record_dir` backend was explicitly defined in the
configuration file.
## New features
- Factor out a Schema Type Layer (STL) from the `record_dir` backend. The STL
can be used with every backend. It removes top-level `schema_type`-entries
from records before they are stored. It also adds the correct top-level
`schema_type`-entry to records that are read from a store. This functionality
was previously built into the `record_dir` backend. Now it can be combined
with `record_dir` and `sqlite` backends.
- Add the backend extension `stl` which specifies that a backend should be used
with the STL. For example `record_dir+stl` defines a backend that uses
`record_dir` with the STL on top.
- The command `dump-things-copy-store` was added. It copies a data store from
one backend to another. This is useful for migrating to a different backend,
for example, from the `record_dir` backend to the `sqlite` backend.
- The command `dump-things-rebuild-index` was added. It allows rebuilding the
index of a `record_dir` data store, after the store was modified externally.
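
The Schema Type Layer described above can be pictured as a thin wrapper around a backend. In the sketch below a plain dictionary stands in for the backend, and the class name `SchemaTypeLayer`, its methods, and the `type_prefix` used to rebuild the `schema_type` value are hypothetical, not the service's API.

```python
from typing import Dict, Tuple


class SchemaTypeLayer:
    """Strip `schema_type` on store and re-add it on load (illustrative only)."""

    def __init__(self, backend: Dict[str, Tuple[str, dict]], type_prefix: str):
        self._backend = backend  # maps pid -> (class name, stored record)
        self._type_prefix = type_prefix

    def store(self, class_name: str, record: dict) -> None:
        # Remove the top-level `schema_type` entry before storing.
        stripped = {key: value for key, value in record.items() if key != "schema_type"}
        self._backend[record["pid"]] = (class_name, stripped)

    def load(self, pid: str) -> dict:
        class_name, stored = self._backend[pid]
        # Add the top-level `schema_type` entry that matches the stored class.
        return {**stored, "schema_type": f"{self._type_prefix}:{class_name}"}
```

Keeping this logic in a separate layer is what allows it to be combined with different backends.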
# 1.1.0 (2025-07-10)
## New features
- Add a `--export-tree` command line parameter to the `dump-things-service` command.
This parameter allows exporting a data store to a tree structure as described
[here](https://concepts.datalad.org/dump-things/).
## Bugfixes
- Fix a typo in the "tips and tricks" section of the README.md file.
# 1.0.0 (2025-07-09)
## Breaking changes
- The result of `/<collection>/records/<class>`-endpoint for the output format
TTL has changed its structure. It is now a JSON array where the individual
entries are strings. Each entry is a TTL document that describes a single record.
Earlier versions would return a single TTL document that contained all records.
Due to the high computational cost of combining multiple TTL documents into a
single document, this change was made. In addition, this change unified the
code used in the paginated and the non-paginated endpoints.
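
On the client side, the new TTL result shape can be handled as sketched below. The function name `split_ttl_response` is hypothetical and the sketch only checks the shape described above; it does not parse the TTL documents themselves.

```python
import json
from typing import List


def split_ttl_response(body: str) -> List[str]:
    """Decode a response body into a list of TTL documents, one per record."""
    documents = json.loads(body)
    if not isinstance(documents, list) or not all(isinstance(doc, str) for doc in documents):
        raise ValueError("expected a JSON array of TTL strings")
    return documents
```

Each returned string can then be passed to a TTL parser individually.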
## New features
- support for multiple backends. The default backend is `record_dir`, which is the
backend used in the previous versions. It is fully compatible with existing stores.
This version adds a new backend, `sqlite`, which uses a SQLite database. More
SQL backends will be added in the future. SQL backends should be able to support
far bigger record numbers than the `record_dir` backend (hundreds of thousands)
without performance degradation.
- an export method has been added. With the command line parameter `--export`,
the service exports all records of a data store and the schema information of
collections to a JSON file or to stdout.
# 0.5.0 (2025-06-27)
## New features
- support sorting of result record lists. By default, result records are sorted by
the field `pid`. The parameter `--sort-by` allows defining alternative fields
for sorting. Multiple fields can be specified by repeating the `--sort-by` parameter.
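
Sorting by several fields can be sketched with a tuple-valued sort key. The function `sort_records` is hypothetical, and treating a missing field as an empty string is an assumption of this sketch.

```python
from typing import Iterable, List, Sequence


def sort_records(records: Iterable[dict], sort_by: Sequence[str] = ("pid",)) -> List[dict]:
    """Sort records by the given fields, in the order the fields are listed."""

    def key(record: dict) -> tuple:
        # Assumption: missing fields sort as empty strings.
        return tuple(str(record.get(field, "")) for field in sort_by)

    return sorted(records, key=key)
```

Repeating `--sort-by` would correspond to passing a longer `sort_by` sequence, where earlier fields take precedence.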
# 0.4.0 (2025-06-25)
## New features
- limit the number of records that are returned via the `/<collection>/records/<class>`-endpoint.
The maximum number of JSON-records is 1200, the maximum number of TTL-records is 60 (due to the high cost of combining TTL-records).
An HTTP 413 error is returned if the number of records is exceeded.
This limits backward compatibility, as the previous behavior was to return all records.
## Bugfixes
- fix errors in README.md
# 0.3.0 (2025-06-25)
## New features
- pagination support for class instance retrieval. To keep backward-compatibility a new endpoint is added, i.e., `/<collection>/records/p/<class>`.
## Cleanup
- the datalad-concepts submodule was removed
- any calls to patch via post-install script were removed
# 0.2.7 (2025-06-24)
## Bugfixes
- monkeypatching was not triggered earlier, this is fixed now.
## Cleanup
- the datalad-concepts submodule was removed
- any calls to patch via post-install script were removed
# 0.2.6 (2025-06-22)
## Bugfixes
- ensure that the version number is correct
# 0.2.5 (2025-06-22)
## New feature
- dump-things-service now patches its environment as required. There is no longer
a need to provide a patched linkml-environment.
- the distribution package is now smaller, it does not contain the test directory anymore.
## Bugfixes
- ensure that non-existing collections are properly reported in unauthenticated requests
# 0.2.4 (2025-06-07)
## Bugfixes
- add missing entry point for `dump-things-service`-command.
# 0.2.3 (2025-06-06)
## Bugfixes
- bump the version to 0.2.3
# 0.2.2 (2025-06-06)
## Bugfixes
- describe the PyPI-installation and start of the service in the README
# 0.2.1 (2025-06-05)
## New feature
- dump-things-service is now available via PyPI.
- add `dump-things-service` command. This command can be used after installation to start the service
# 0.2.0 (2025-06-04)
## New features
- set `schema_type`-attribute in all JSON records that are returned when storing or retrieving records
- add mapping functions with 2-stage directory hierarchies: `digest_md5_p3_p3` and `digest_sha1_p3_p3`.
- move all dependency definitions into `pyproject.toml`, remove `requirements.txt` and `requirements-devel.txt`
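
A 2-stage mapping function such as `digest_md5_p3_p3` can be pictured as below. The exact layout is not given in this changelog: splitting the hex digest into two 3-character directory levels followed by the remainder, and the `yaml` suffix, are assumptions based on the function name.

```python
import hashlib
from pathlib import PurePosixPath


def digest_md5_p3_p3(pid: str, suffix: str = "yaml") -> PurePosixPath:
    """Map a PID to `<3 chars>/<3 chars>/<rest>.<suffix>` of its MD5 hex digest."""
    digest = hashlib.md5(pid.encode("utf-8")).hexdigest()
    return PurePosixPath(digest[:3], digest[3:6], f"{digest[6:]}.{suffix}")
```

Two directory levels keep the number of entries per directory small when a store holds many records.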
# 0.1.1 (2025-06-03)
## Bugfixes
- fix a logging call in dynamically created code
# 0.1.0 (2025-06-03)
## New features
- improve logging, add a --log-level command line parameter
- report the full path in the exception that is raised for records with colliding PIDs
- don't exit if PID collision is detected at startup, but log an error
- add a changelog
- improve resilience of record directory stores against faulty YAML content and non-YAML files
- omit creation of unused record directory stores
# 0.0.1
## Features
- add in-memory PID-index for record directory stores
- add --error-mode command line parameter which allows the service to report a critical error on all endpoints
- add token capability endpoint (`/<collection>/token_permissions`)
- update configuration to allow detailed directory specification for tokens and incoming directories