Improve README.md #189
1 changed files with 45 additions and 184 deletions
227
README.md
|
|
@ -13,7 +13,7 @@ The service supports schemas that are based on Datalad's *Thing* schema, i.e. on
|
||||||
It assumes that the classes of stored records are subclasses of `Thing`, and inherit the properties `pid` and `schema_type` from the `Thing`-baseclass.
|
It assumes that the classes of stored records are subclasses of `Thing`, and inherit the properties `pid` and `schema_type` from the `Thing`-baseclass.
|
||||||
|
|
||||||
The general workflow in the service is as follows.
|
The general workflow in the service is as follows.
|
||||||
We distinguish between two areas of a collection, an **incoming** are and a **curated** area.
|
We distinguish between two areas of a collection, an **incoming** area and a **curated** area.
|
||||||
Data written to a collection is stored in a collection-specific **incoming** area.
|
Data written to a collection is stored in a collection-specific **incoming** area.
|
||||||
A curation process, which is outside the scope of the service, moves data from the incoming area of a collection to the **curated** area of the collection.
|
A curation process, which is outside the scope of the service, moves data from the incoming area of a collection to the **curated** area of the collection.
|
||||||
|
|
||||||
|
|
@ -26,7 +26,7 @@ So any read- and write-operations on an incoming area are actually restricted to
|
||||||
Multiple tokens can share the same zone.
|
Multiple tokens can share the same zone.
|
||||||
That allows multiple submitters to work together when storing records in the service.
|
That allows multiple submitters to work together when storing records in the service.
|
||||||
|
|
||||||
The service provides a HTTP-based API to store and retrieve data objects, and to verify token capabilities.
|
The service provides an HTTP-based API to store and retrieve data objects, and to verify token capabilities.
|
||||||
|
|
||||||
### Installing the service
|
### Installing the service
|
||||||
|
|
||||||
|
|
@ -53,11 +53,13 @@ The following command line parameters are supported:
|
||||||
|
|
||||||
- `--root-path <path>`: Set the ASGI 'root_path' for applications submounted below a given URL path.
|
- `--root-path <path>`: Set the ASGI 'root_path' for applications submounted below a given URL path.
|
||||||
|
|
||||||
- `--sort-by <field>`: By default result records are sorted by the field `pid`.
|
- `--log-level`: set the log level for the service, allowed values are `ERROR`, `WARNING`, `INFO`, `DEBUG`. The default-level is `WARNING`.
|
||||||
This parameter allows overriding the sort field.
|
|
||||||
The parameter can be repeated to define secondary, tertiary, etc. sorting fields.
|
|
||||||
If a given field is not present in the record, the record will be sorted behind all records that possess the field.
|
|
||||||
|
|
||||||
|
```bash
|
||||||
|
dump-things-service /data-storage/store --host 127.0.0.1 --port 8000
|
||||||
|
```
|
||||||
|
|
||||||
|
The above command runs the service on the network location `127.0.0.1:8000` and provides access to the store under `/data-storage/store`.
|
||||||
|
|
||||||
### Configuration file
|
### Configuration file
|
||||||
|
|
||||||
|
|
@ -86,7 +88,7 @@ collections:
|
||||||
|
|
||||||
# The path to the curated data of the collection. This path should contain the
|
# The path to the curated data of the collection. This path should contain the
|
||||||
# ".dumpthings.yaml"-configuration for collections that is described
|
# ".dumpthings.yaml"-configuration for collections that is described
|
||||||
# here: <https://concepts.datalad.org/dump-things/>.
|
# here: <https://concepts.datalad.org/dump-things-storage-v0/>.
|
||||||
# A relative path is interpreted relative to the storage root, which is provided on
|
# A relative path is interpreted relative to the storage root, which is provided on
|
||||||
# service start. An absolute path is interpreted as an absolute path.
|
# service start. An absolute path is interpreted as an absolute path.
|
||||||
curated: curated/personal_records
|
curated: curated/personal_records
|
||||||
|
|
@ -108,7 +110,7 @@ collections:
|
||||||
# Optionally a list of classes that will be ignored when store- or validate-endpoints
|
# Optionally a list of classes that will be ignored when store- or validate-endpoints
|
||||||
# are created. If `use_classes` is present, the entries of this list will further reduce
|
# are created. If `use_classes` is present, the entries of this list will further reduce
|
||||||
# the classes that will receive endpoints. If `use_classes` is not present, the entries
|
# the classes that will receive endpoints. If `use_classes` is not present, the entries
|
||||||
# of this list will reduce the classes from the schema, the will receive endpoints.
|
# of this list will reduce the classes from the schema that will receive endpoints.
|
||||||
# The classes listed here must be listed in `use_classes` if that is defined. If
|
# The classes listed here must be listed in `use_classes` if that is defined. If
|
||||||
# `use_classes` is not defined, they must be listed in the schema.
|
# `use_classes` is not defined, they must be listed in the schema.
|
||||||
ignore_classes:
|
ignore_classes:
|
||||||
|
|
@ -136,7 +138,7 @@ tokens:
|
||||||
# access to the two collections: "rooms_and_buildings" and "fixed_data".
|
# access to the two collections: "rooms_and_buildings" and "fixed_data".
|
||||||
basic_access:
|
basic_access:
|
||||||
|
|
||||||
# The value of "user-id" will be added as an annotation to each record that is
|
# The value of "user_id" will be added as an annotation to each record that is
|
||||||
# uploaded with this token.
|
# uploaded with this token.
|
||||||
user_id: anonymous
|
user_id: anonymous
|
||||||
|
|
||||||
|
|
@ -238,7 +240,7 @@ The backend will be used for the curated area and for the incoming areas of the
|
||||||
If no backend is defined for a collection, the `record_dir+stl`-backend is used by default.
|
If no backend is defined for a collection, the `record_dir+stl`-backend is used by default.
|
||||||
The `+stl`-backends can be useful if an endpoint returns records of multiple classes, because it allows clients to determine the class of each result record.
|
The `+stl`-backends can be useful if an endpoint returns records of multiple classes, because it allows clients to determine the class of each result record.
|
||||||
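The schema-type-layer of the `+stl`-backends can be pictured as a pair of small transformations: `schema_type` is dropped from the top level of a record before it is stored, and added back when the record is read. The following is a minimal sketch of that idea, not the service's implementation. The function names and the `class_uris` mapping (a stand-in for looking up the class URI in a schema) are hypothetical.

```python
def strip_schema_type(record: dict) -> dict:
    """Write path: return a copy of `record` without a top-level `schema_type`."""
    return {key: value for key, value in record.items() if key != "schema_type"}


def restore_schema_type(stored: dict, class_name: str, class_uris: dict) -> dict:
    """Read path: return a copy of `stored` with `schema_type` added back.

    `class_uris` is a hypothetical mapping from class names to class URIs;
    a real backend would derive this information from the schema.
    """
    return {**stored, "schema_type": class_uris[class_name]}


# Hypothetical class URI, used only for illustration.
class_uris = {"Person": "ex:Person"}
record = {"pid": "ex:alice", "schema_type": "ex:Person"}

stored = strip_schema_type(record)
restored = restore_schema_type(stored, "Person", class_uris)
```

In this sketch `stored` has no `schema_type` key, while `restored` carries it again, which is what lets a client tell the class of each record in a mixed result.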
|
|
||||||
The service guarantees that backends of all types can co-exist independently in the same directory, i.e., there are no name collisions in files that are used for different backends (as long as no class name starts with `.` or `_`)).
|
The service guarantees that backends of all types can co-exist independently in the same directory, i.e., there are no name collisions in files that are used for different backends (as long as no class name starts with `.` or `_`).
|
||||||
|
|
||||||
The following configuration snippet shows how to define a backend for a collection:
|
The following configuration snippet shows how to define a backend for a collection:
|
||||||
|
|
||||||
|
|
@ -291,10 +293,10 @@ collections:
|
||||||
# `forgejo-user-<user-login>`
|
# `forgejo-user-<user-login>`
|
||||||
label_type: team
|
label_type: team
|
||||||
# An optional repository. The token will only be authorized
|
# An optional repository. The token will only be authorized
|
||||||
# if the team has access to the repository. Note: if `repo`
|
# if the team has access to the repository. Note: if `repository`
|
||||||
# is set, the token must have at least repository read
|
# is set, the token must have at least repository read
|
||||||
# permissions.
|
# permissions.
|
||||||
repo: reference-repository
|
repository: reference-repository
|
||||||
|
|
||||||
# Fallback to the config file.
|
# Fallback to the config file.
|
||||||
- type: config # check tokens from the configuration file
|
- type: config # check tokens from the configuration file
|
||||||
|
|
@ -304,14 +306,14 @@ collections:
|
||||||
# permissions for a token, those permissions will be used and no other
|
# permissions for a token, those permissions will be used and no other
|
||||||
# authorization sources will be queried.
|
# authorization sources will be queried.
|
||||||
# The default authorization source is `config`, which reads the token
|
# The default authorization source is `config`, which reads the token
|
||||||
# permissions, user-id, and incoming
|
# permissions, user-id, and incoming label from the config file.
|
||||||
|
|
||||||
collection_with_explicit_record_dir+stl_backend:
|
collection_with_explicit_record_dir+stl_backend:
|
||||||
default_token: anon_read
|
default_token: anon_read
|
||||||
curated: collection_3/curated
|
curated: collection_3/curated
|
||||||
backend:
|
backend:
|
||||||
# The record_dir-backend is identified by the
|
# The record_dir+stl backend is identified by the
|
||||||
# type: "record_dir". No more attributes are
|
# type: "record_dir+stl". No more attributes are
|
||||||
# defined for this backend.
|
# defined for this backend.
|
||||||
type: record_dir+stl
|
type: record_dir+stl
|
||||||
|
|
||||||
|
|
@ -351,7 +353,7 @@ If an identical authentication source is listed multiple time in the configurati
|
||||||
|
|
||||||
These authentication sources are available:
|
These authentication sources are available:
|
||||||
|
|
||||||
- config: use the configuration file to
|
- config: use the configuration file to authenticate tokens
|
||||||
- forgejo: use a Forgejo-instance to authenticate tokens
|
- forgejo: use a Forgejo-instance to authenticate tokens
|
||||||
|
|
||||||
All authentication source configurations contain the key `type`.
|
All authentication source configurations contain the key `type`.
|
||||||
|
|
@ -385,10 +387,10 @@ collections:
|
||||||
# `forgejo-user-<user-login>`
|
# `forgejo-user-<user-login>`
|
||||||
label_type: team
|
label_type: team
|
||||||
# An optional repository. The token will only be authorized
|
# An optional repository. The token will only be authorized
|
||||||
# if the team has access to the repository. Note: if `repo`
|
# if the team has access to the repository. Note: if `repository`
|
||||||
# is set, the token must have at least repository read
|
# is set, the token must have at least repository read
|
||||||
# permissions.
|
# permissions.
|
||||||
repo: reference-repository
|
repository: reference-repository
|
||||||
|
|
||||||
# Fallback to the config file.
|
# Fallback to the config file.
|
||||||
- type: config # check tokens from the configuration file
|
- type: config # check tokens from the configuration file
|
||||||
|
|
@ -398,7 +400,7 @@ collections:
|
||||||
# permissions for a token, those permissions will be used and no other
|
# permissions for a token, those permissions will be used and no other
|
||||||
# authorization sources will be queried.
|
# authorization sources will be queried.
|
||||||
# The default authorization source is `config`, which reads the token
|
# The default authorization source is `config`, which reads the token
|
||||||
# permissions, user-id, and incoming
|
# permissions, user-id, and incoming label from the config file.
|
||||||
|
|
||||||
...
|
...
|
||||||
|
|
||||||
|
|
@ -478,7 +480,7 @@ The default annotation tag classes can be overridden in the configuration on a p
|
||||||
To override the default tags, add a `submission_tags`-attribute to a collection definition.
|
To override the default tags, add a `submission_tags`-attribute to a collection definition.
|
||||||
The `submission_tags`-attribute should contain a mapping that maps either `submitter_id_tag`, or `submitter_time_tag` or both to an IRI or a CURIE.
|
The `submission_tags`-attribute should contain a mapping that maps either `submitter_id_tag`, or `submitter_time_tag` or both to an IRI or a CURIE.
|
||||||
If the schema defines a matching prefix, IRIs are automatically converted to CURIEs before storing the record.
|
If the schema defines a matching prefix, IRIs are automatically converted to CURIEs before storing the record.
|
||||||
The service validates that the prefix of a CURIE is defined in the schema of the collection.
|
If a tag is given as a CURIE, the service validates that the prefix of the CURIE is defined in the schema of the collection.
|
||||||
|
|
||||||
```yaml
|
```yaml
|
||||||
type: collections
|
type: collections
|
||||||
|
|
@ -496,152 +498,11 @@ collections:
|
||||||
|
|
||||||
```
|
```
|
||||||
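The handling of tag values described above (IRIs are converted to CURIEs when the schema defines a matching prefix, and CURIE prefixes must be defined in the schema) can be sketched as follows. This is an illustration under stated assumptions, not the service's code: the function names are hypothetical, `prefixes` stands in for the prefix map of a collection's schema, and preferring the longest matching namespace is a design choice of this sketch.

```python
def iri_to_curie(iri: str, prefixes: dict) -> str:
    """Return a CURIE for `iri` if a prefix namespace matches, else `iri` unchanged."""
    best = None
    for prefix, namespace in prefixes.items():
        # Prefer the longest matching namespace if several prefixes match.
        if iri.startswith(namespace) and (best is None or len(namespace) > len(best[1])):
            best = (prefix, namespace)
    if best is None:
        return iri
    return f"{best[0]}:{iri[len(best[1]):]}"


def check_curie_prefix(curie: str, prefixes: dict) -> None:
    """Raise ValueError if the prefix of `curie` is not defined in `prefixes`."""
    prefix, separator, _ = curie.partition(":")
    if not separator or prefix not in prefixes:
        raise ValueError(f"unknown prefix in CURIE: {curie!r}")


# Hypothetical prefix map, used only for illustration.
prefixes = {"ex": "https://example.org/terms/"}

tag = iri_to_curie("https://example.org/terms/submitter", prefixes)
check_curie_prefix(tag, prefixes)  # passes, because `ex` is a defined prefix
```

An IRI without a matching prefix is returned unchanged by `iri_to_curie`, so it would be stored as an IRI.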
|
|
||||||
The service currently supports the following backends for storing records:
|
|
||||||
- `record_dir`: this backend stores records as YAML-files in a directory structure that is defined [here](https://concepts.datalad.org/dump-things-storage-v0/). It reads the backend configuration from a "record collection configuration file" as described [here](https://concepts.datalad.org/dump-things-storage-v0/).
|
|
||||||
|
|
||||||
- `sqlite`: this backend stores records in a SQLite database. There is an individual database file, named `__sqlite-records.db`, for each curated area and incoming area.
|
|
||||||
|
|
||||||
- `record_dir+stl`: here `stl` stands for "schema-type-layer".
|
|
||||||
This backend stores records in the same format as `record_dir`, but adds special treatment for the `schema_type` attribute in records.
|
|
||||||
It removes `schema_type`-attributes from the top-level mapping of a record before storing it as YAML-file. When records are read from this backend, a `schema_type` attribute is added back into the record, using a schema to determine the correct class-URI.
|
|
||||||
In other words, all records stored with this backend will have no `schema_type`-attribute in the top-level, and all records read with this backend will have a `schema_type` attribute in the top-level.
|
|
||||||
|
|
||||||
- `sqlite+stl`: This backend stores records in the same format as `sqlite`, but adds the same special treatment for the `schema_type` attribute as `record_dir+stl`.
|
|
||||||
|
|
||||||
Backends can be defined per collection in the configuration file.
|
|
||||||
The backend will be used for the curated area and for the incoming areas of the collection.
|
|
||||||
If no backend is defined for a collection, the `record_dir+stl`-backend is used by default.
|
|
||||||
The `+stl`-backends can be useful if an endpoint returns records of multiple classes, because it allows clients to determine the class of each result record.
|
|
||||||
|
|
||||||
The service guarantees that backends of all types can co-exist independently in the same directory, i.e., there are no name collisions in files that are used for different backends (as long as no class name starts with `.` or `_`)).
|
|
||||||
|
|
||||||
The following configuration snippet shows how to define a backend for a collection:
|
|
||||||
|
|
||||||
```yaml
|
|
||||||
...
|
|
||||||
collections:
|
|
||||||
collection_with_default_record_dir+stl_backend:
|
|
||||||
# This is a collection with the default backend, i.e. `record_dir+stl` and
|
|
||||||
# the default authentication, i.e. config-based authentication.
|
|
||||||
default_token: anon_read
|
|
||||||
curated: collection_1/curated
|
|
||||||
|
|
||||||
collection_with_forgejo_authentication_source:
|
|
||||||
# This is a collection with the default backend, i.e. `record_dir+stl` and
|
|
||||||
# a forgejo-based authentication source. That means it will use a forgejo
|
|
||||||
# instance to determine the permissions of a token for this collection.
|
|
||||||
# The instance is also used to determine the user-id and the incoming label.
|
|
||||||
# In the case of forgejo, the user-id and the incoming label are the
|
|
||||||
# forgejo login associated with the token.
|
|
||||||
|
|
||||||
# We still need the name of a default token. If the token is defined in this
|
|
||||||
# config file, its properties will be determined by the
|
|
||||||
# config file. If the token is not defined in the config file, its
|
|
||||||
# properties will be determined by the authentication sources. In this
|
|
||||||
# example by the forgejo-instance at `https://forgejo.example.com`.
|
|
||||||
# If there is more than one authentication source, they will be tried
|
|
||||||
# in the order they are defined in the config file.
|
|
||||||
default_token: anon_read # We still need a default token
|
|
||||||
curated: collection_2/curated
|
|
||||||
|
|
||||||
# Token permissions, user-ids (for record annotations), and incoming
|
|
||||||
# label can be determined by multiple authentication sources.
|
|
||||||
# If no source is defined, `config` will be used, which reads token
|
|
||||||
# information from the config file.
|
|
||||||
# This example explicitly defines `config` and a second authentication
|
|
||||||
# source, a `forgejo` authentication source.
|
|
||||||
auth_sources:
|
|
||||||
- type: forgejo # requires `user`-read and `organization`-read permissions on token
|
|
||||||
# The API-URL of the forgejo instance that should be used
|
|
||||||
url: https://forgejo.example.com/api/v1
|
|
||||||
# An organization
|
|
||||||
organization: data_handling
|
|
||||||
# A team in the organization. The authorization of the team
|
|
||||||
# determines the permissions of the token
|
|
||||||
team: data_entry_personal
|
|
||||||
# `label_type` determines how an incoming label is created for
|
|
||||||
# a Forgejo token. If `label_type` is `team`, the incoming label
|
|
||||||
# will be `forgejo-team-<organization>-<team>`. If `label_type`
|
|
||||||
# is `user`, the incoming label will be
|
|
||||||
# `forgejo-user-<user-login>`
|
|
||||||
label_type: team
|
|
||||||
# An optional repository. The token will only be authorized
|
|
||||||
# if the team has access to the repository. Note: if `repo`
|
|
||||||
# is set, the token must have at least repository read
|
|
||||||
# permissions.
|
|
||||||
repo: reference-repository
|
|
||||||
|
|
||||||
# Fallback to the config file.
|
|
||||||
- type: config # check tokens from the configuration file
|
|
||||||
|
|
||||||
# Multiple authorization sources are allowed. They will be tried in the
|
|
||||||
# order defined in the config file. If an authorization source returns
|
|
||||||
# permissions for a token, those permissions will be used and no other
|
|
||||||
# authorization sources will be queried.
|
|
||||||
# The default authorization source is `config`, which reads the token
|
|
||||||
# permissions, user-id, and incoming
|
|
||||||
|
|
||||||
collection_with_explicit_record_dir+stl_backend:
|
|
||||||
default_token: anon_read
|
|
||||||
curated: collection_3/curated
|
|
||||||
backend:
|
|
||||||
# The record_dir-backend is identified by the
|
|
||||||
# type: "record_dir". No more attributes are
|
|
||||||
# defined for this backend.
|
|
||||||
type: record_dir+stl
|
|
||||||
|
|
||||||
collection_with_sqlite_backend:
|
|
||||||
default_token: anon_read
|
|
||||||
curated: collection_4/curated
|
|
||||||
backend:
|
|
||||||
# The sqlite-backend is identified by the
|
|
||||||
# type: "sqlite". It requires a schema attribute
|
|
||||||
# that holds the URL of the schema that should
|
|
||||||
# be used in this backend.
|
|
||||||
type: sqlite
|
|
||||||
schema: https://concepts.inm7.de/s/flat-data/unreleased.yaml
|
|
||||||
```
|
|
||||||
|
|
||||||
|
|
||||||
### Command line parameters:
|
|
||||||
|
|
||||||
The service supports the following command line parameters:
|
|
||||||
|
|
||||||
- `<storage root>`: this is a mandatory parameter that defines the directory that serves as root for relative `curated`- and `incoming`-paths. Unless the `-c/--config` option is given, the configuration is loaded from `<storage root>/.dumpthings.yaml`.
|
|
||||||
|
|
||||||
- `--host`: (optional): the IP address of the host the service should run on
|
|
||||||
|
|
||||||
|
|
||||||
- `--port`: the port number the service should listen on
|
|
||||||
|
|
||||||
|
|
||||||
- `-c/--config`: if set, the service will read the configuration from the given path. Otherwise it will try to read the configuration from `<storage root>/.dumpthings.yaml`.
|
|
||||||
|
|
||||||
|
|
||||||
- `--log-level`: set the log level for the service, allowed values are `ERROR`, `WARNING`, `INFO`, `DEBUG`. The default-level is `WARNING`.
|
|
||||||
|
|
||||||
|
|
||||||
- `--root-path`: set the ASGI `root_path` for applications sub-mounted below a given URL path.
|
|
||||||
|
|
||||||
|
|
||||||
The service can be started with the following command:
|
|
||||||
|
|
||||||
```bash
|
|
||||||
dump-things-service
|
|
||||||
```
|
|
||||||
In this example the service will run on the network location `0.0.0.0:8000` and provide access to the stores under `/data-storage/store`.
|
|
||||||
|
|
||||||
To run the service on a specific host and port, use the command line options `--host` and `--port`, for example:
|
|
||||||
|
|
||||||
```bash
|
|
||||||
dump-things-service /data-storage/store --host 127.0.0.1 --port 8000
|
|
||||||
```
|
|
||||||
|
|
||||||
### Endpoints
|
### Endpoints
|
||||||
|
|
||||||
Most endpoints require a *collection*. These correspond to the names of the "data record collection"-directories (for example `myschema-v3-fmta` in [Dump Things Service](https://concepts.datalad.org/dump-things/)) in the stores.
|
Most endpoints require a *collection*. These correspond to the names of the "data record collection"-directories (for example `myschema-v3-fmta` in [Dump Things Service](https://concepts.datalad.org/dump-things-storage-v0/)) in the stores.
|
||||||
|
|
||||||
The service provides the following user endpoints (in addition to user-endpoints there exist endpoints for curators, to view them check the `/docs`-path in an installed service):
|
The service provides the following user endpoints (in addition to user endpoints, there are endpoints for curators; to view them, check the `/docs`-path in an installed service):
|
||||||
|
|
||||||
- `POST /<collection>/record/<class>`: an object of type `<class>` (defined by the schema associated with `<collection>`) can be posted to this endpoint.
|
- `POST /<collection>/record/<class>`: an object of type `<class>` (defined by the schema associated with `<collection>`) can be posted to this endpoint.
|
||||||
It will be stored in the incoming area for this collection and the user defined by the provided token.
|
It will be stored in the incoming area for this collection and the user defined by the provided token.
|
||||||
|
|
@ -650,9 +511,9 @@ The service provides the following user endpoints (in addition to user-endpoints
|
||||||
It can be set to `json` (the default) or to `ttl` (Terse RDF Triple Language, a.k.a. Turtle).
|
It can be set to `json` (the default) or to `ttl` (Terse RDF Triple Language, a.k.a. Turtle).
|
||||||
If the `json`-format is selected, the content-type should be `application/json`.
|
If the `json`-format is selected, the content-type should be `application/json`.
|
||||||
If the `ttl`-format is selected, the content-type should be `text/turtle`.
|
If the `ttl`-format is selected, the content-type should be `text/turtle`.
|
||||||
The service supports extraction of inlined records as described in [Dump Things Service](https://concepts.datalad.org/dump-things/).
|
The service supports extraction of inlined records as described in [Dump Things Service](https://concepts.datalad.org/dump-things-storage-v0/).
|
||||||
On success, the endpoint will return a list of all stored records.
|
On success, the endpoint will return a list of all stored records.
|
||||||
This might be more than one record if the posted object contains inlined records.
|
The list may contain more than one record if the posted object contains inlined records.
|
||||||
|
|
||||||
- `POST /<collection>/validate/record/<class>`: an object of type `<class>` (defined by the schema associated with `<collection>`) can be posted to this endpoint.
|
- `POST /<collection>/validate/record/<class>`: an object of type `<class>` (defined by the schema associated with `<collection>`) can be posted to this endpoint.
|
||||||
It will validate the posted data.
|
It will validate the posted data.
|
||||||
|
|
@ -661,12 +522,12 @@ The service provides the following user endpoints (in addition to user-endpoints
|
||||||
It can be set to `json` (the default) or to `ttl` (Terse RDF Triple Language, a.k.a. Turtle).
|
It can be set to `json` (the default) or to `ttl` (Terse RDF Triple Language, a.k.a. Turtle).
|
||||||
If the `json`-format is selected, the content-type should be `application/json`.
|
If the `json`-format is selected, the content-type should be `application/json`.
|
||||||
If the `ttl`-format is selected, the content-type should be `text/turtle`.
|
If the `ttl`-format is selected, the content-type should be `text/turtle`.
|
||||||
The service supports extraction of inlined records as described in [Dump Things Service](https://concepts.datalad.org/dump-things/).
|
The service supports extraction of inlined records as described in [Dump Things Service](https://concepts.datalad.org/dump-things-storage-v0/).
|
||||||
On success, the endpoint will return a list of all stored records.
|
On success, the endpoint will return a list of all stored records.
|
||||||
This might be more than one record if the posted object contains inlined records.
|
The list may contain more than one record if the posted object contains inlined records.
|
||||||
|
|
||||||
- `GET /<collection>/records/<class>`: retrieve all readable objects from collection `<collection>` that are of type `<class>` or any of its subclasses.
|
- `GET /<collection>/records/<class>`: retrieve all readable objects from collection `<collection>` that are of type `<class>` or any of its subclasses.
|
||||||
Objects are readable if the default token for the collection allows reading of objects or if a token is provided that allows reading of objects in the collection.
|
Objects are readable if the default token or the token provided has the permission to read the objects in the collection.
|
||||||
Objects from incoming spaces will take precedence over objects from curated spaces, i.e. if there are two objects with identical `pid` in the curated space and in the incoming space, the object from the incoming space will be returned.
|
Objects from incoming spaces will take precedence over objects from curated spaces, i.e. if there are two objects with identical `pid` in the curated space and in the incoming space, the object from the incoming space will be returned.
|
||||||
The endpoint supports the query parameter `format`, which determines the format of the query result.
|
The endpoint supports the query parameter `format`, which determines the format of the query result.
|
||||||
It can be set to `json` (the default) or to `ttl`,
|
It can be set to `json` (the default) or to `ttl`,
|
||||||
|
|
@ -692,7 +553,7 @@ The service provides the following user endpoints (in addition to user-endpoints
|
||||||
}
|
}
|
||||||
```
|
```
|
||||||
|
|
||||||
- `GET /<collection>/record?pid=<pid>`: retrieve an object with the pid `<pid>` from the collection `<collection>`, if the provided token allows reading. If the provided token allows reading of incoming and curated spaces, objects from incoming spaces will take precedence.
|
- `GET /<collection>/record?pid=<pid>`: retrieve an object with the pid `<pid>` from the collection `<collection>` if the provided token allows reading. If the provided token allows reading of incoming and curated spaces, objects from incoming spaces will take precedence.
|
||||||
The endpoint supports the query parameter `format`, which determines the format of the query result.
|
The endpoint supports the query parameter `format`, which determines the format of the query result.
|
||||||
It can be set to `json` (the default) or to `ttl`,
|
It can be set to `json` (the default) or to `ttl`,
|
||||||
|
|
||||||
|
|
@ -717,7 +578,7 @@ The service provides the following user endpoints (in addition to user-endpoints
|
||||||
|
|
||||||
|
|
||||||
- `GET /<collection>/records/`: retrieve all readable objects from collection `<collection>`.
|
- `GET /<collection>/records/`: retrieve all readable objects from collection `<collection>`.
|
||||||
Objects are readable if the default token for the collection allows reading of objects or if a token is provided that allows reading of objects in the collection.
|
Objects are readable if the default token or the token provided has the permission to read the objects in the collection.
|
||||||
Objects from incoming spaces will take precedence over objects from curated spaces, i.e. if there are two objects with identical `pid` in the curated space and in the incoming space, the object from the incoming space will be returned.
|
Objects from incoming spaces will take precedence over objects from curated spaces, i.e. if there are two objects with identical `pid` in the curated space and in the incoming space, the object from the incoming space will be returned.
|
||||||
The endpoint supports the query parameter `format`, which determines the format of the query result.
|
The endpoint supports the query parameter `format`, which determines the format of the query result.
|
||||||
It can be set to `json` (the default) or to `ttl`,
|
It can be set to `json` (the default) or to `ttl`,
|
||||||
|
|
@ -743,17 +604,17 @@ The service provides the following user endpoints (in addition to user-endpoints
|
||||||
```
|
```
|
||||||
|
|
||||||
|
|
||||||
- `DELETE /<collection>/record?pid=<pid>`: delete an object with the pid `<pid>` from the incoming area of the collection `<collection>`, if the provided token allows writing to the incoming area.
|
- `DELETE /<collection>/record?pid=<pid>`: delete an object with the pid `<pid>` from the incoming area of the collection `<collection>` if the provided token allows writing to the incoming area.
|
||||||
The result is either `True` if the object was deleted or `False` if the object did not exist or was not deleted.
|
The result is either `True` if the object was deleted or `False` if the object did not exist or was not deleted.
|
||||||
|
|
||||||
|
|
||||||
- `GET /docs`: provides information about the API of the service, i.e. about all endpoints.
|
- `GET /docs`: provides information about the service's API, i.e. about all endpoints.
|
||||||
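As a rough illustration of the user endpoints listed above, the following sketch builds (but does not send) requests for `POST /<collection>/record/<class>` and `GET /<collection>/record?pid=<pid>` with Python's standard library. Several details are assumptions and not taken from this document: the helper names, the class name `Person`, the base URL, and in particular the token header name `X-DumpThings-Token`, which should be checked against the `/docs`-path of an installed service. The sketch also assumes that the store endpoint selects the format through the `format` query parameter, as the read endpoints do.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import Request

# Content types for the two formats named in the endpoint descriptions.
CONTENT_TYPES = {"json": "application/json", "ttl": "text/turtle"}


def build_store_request(base_url, collection, class_name, payload,
                        token=None, fmt="json",
                        token_header="X-DumpThings-Token"):  # header name is an assumption
    """Build a POST request for `/<collection>/record/<class>`."""
    if fmt not in CONTENT_TYPES:
        raise ValueError(f"unsupported format: {fmt!r}")
    url = (f"{base_url.rstrip('/')}/{quote(collection)}/record/"
           f"{quote(class_name)}?{urlencode({'format': fmt})}")
    # JSON payloads are serialized here; Turtle payloads are passed as text.
    body = json.dumps(payload) if fmt == "json" else payload
    headers = {"Content-Type": CONTENT_TYPES[fmt]}
    if token is not None:
        headers[token_header] = token
    return Request(url, data=body.encode("utf-8"), headers=headers, method="POST")


def build_get_record_request(base_url, collection, pid, fmt="json"):
    """Build a GET request for `/<collection>/record?pid=<pid>`."""
    query = urlencode({"pid": pid, "format": fmt})
    return Request(f"{base_url.rstrip('/')}/{quote(collection)}/record?{query}",
                   method="GET")


store_request = build_store_request(
    "http://127.0.0.1:8000", "personal_records", "Person",
    {"pid": "ex:alice"}, token="secret",
)
get_request = build_get_record_request(
    "http://127.0.0.1:8000", "personal_records", "ex:alice",
)
```

Passing one of these objects to `urllib.request.urlopen` would send it to a running service. Building the requests separately keeps the URL and header layout easy to inspect.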
|
|
||||||
#### Curation endpoints
|
#### Curation endpoints
|
||||||
|
|
||||||
The service support a set of curation-endpoints that give direct access to the curated area as well as to existing incoming areas.
|
The service supports a set of curation endpoints that allows direct access to the curated area as well as the incoming areas.
|
||||||
This access requires a `CURATOR`-token.
|
A `CURATOR`-token is required to access these endpoints.
|
||||||
Details about the curation-endpoints can be found in [this issue](https://github.com/christian-monch/dump-things-server/issues/118).
|
Details about the curation endpoints can be found in [this issue](https://github.com/christian-monch/dump-things-server/issues/118).
|
||||||
|
|
||||||
|
|
||||||
### Tips & Tricks
|
### Tips & Tricks
|
||||||
|
|
@ -812,7 +673,7 @@ For example, to migrate from a `record_dir+stl` backend, the command is similar,
|
||||||
```
|
```
|
||||||
(Note: a `record_dir:<path>` can be used to copy without the schema type layer from a `record_dir+stl` backend. But in this case the copied records will not have a `schema_type` attribute, because the `record_dir` backend does not "put it back in", unlike a `record_dir+stl` backend.)
|
(Note: a `record_dir:<path>` can be used to copy without the schema type layer from a `record_dir+stl` backend. But in this case the copied records will not have a `schema_type` attribute, because the `record_dir` backend does not "put it back in", unlike a `record_dir+stl` backend.)
|
||||||
|
|
||||||
If the source backend is a `record_dir` or `record_dir+stl` backend and the store was manually modified outside the service (for example, by adding or removing files), it is recommended to run the command `dump-things-rebuild-index` on the source store before copying. This ensures that the index is up to date and all records are copied.
|
If the source backend is a `record_dir` or `record_dir+stl` backend, and the store was manually modified outside the service (for example, by adding or removing files), it is recommended to run the command `dump-things-rebuild-index` on the source store before copying. This ensures that the index is up to date and all records will be copied.
|
||||||
|
|
||||||
If any backend is a `record_dir+stl` backend, a schema has to be supplied via the `-s/--schema` command line parameter. The schema is used to determine the `schema_type` attribute of the records that are copied.
|
If any backend is a `record_dir+stl` backend, a schema has to be supplied via the `-s/--schema` command line parameter. The schema is used to determine the `schema_type` attribute of the records that are copied.
|
||||||
|
|
||||||
|
|
@ -827,25 +688,25 @@ If any backend is a `record_dir+stl` backend, a schema has to be supplied via th
|
||||||
record_dir:<path-to-data>/penguis/curated \
|
record_dir:<path-to-data>/penguis/curated \
|
||||||
sqlite:<path-to-data>/penguis/curated
|
sqlite:<path-to-data>/penguis/curated
|
||||||
```
|
```
|
||||||
The copy command will add the copied records to any existing record in the destination store.
|
The copy command will add the copied records to any existing records in the destination store.
|
||||||
Note: when records are copied from a `record-dir` store, the index is used to locate the records in the source store. If the index is not up-to-date, the copied records might not be complete. In this case, it is recommended to run `dump-things-rebuild-index` on the source store before copying.
|
Note: when records are copied from a `record-dir` store, the index is used to locate the records in the source store. If the index is not up-to-date, some records may not be copied. To ensure all records are copied, it is recommended to run `dump-things-rebuild-index` on the source store before copying.
|
||||||
|
|
||||||
- `dump-things-pid-check`: this command checks the pids in all collections of a store to verify that they can be resolved (if they are in CURIE form).
|
- `dump-things-pid-check`: this command checks the pids in all collections of a store to verify that they can be resolved (if they are in CURIE form).
|
||||||
This is useful to validate the proper definition of prefixes after schema-changes.
|
This is useful to validate the proper definition of prefixes after schema-changes.
|
||||||
|
|
||||||
- `dump-things-create-merged-schema`: this command creates a new schema that statically contains all schemas that the original schema imported.
|
- `dump-things-create-merged-schema`: this command creates a new schema that statically contains all schemas that the original schema imports.
|
||||||
The new schema is fully self contained and does not reference any other schemas anymore.
|
The new schema is fully self-contained and does not reference any other schemas.
|
||||||
|
|
||||||
### If things go wrong
|
### If things go wrong
|
||||||
|
|
||||||
#### Delete a record manually
|
#### Delete a record manually
|
||||||
|
|
||||||
If a schema was changed, for example a prefix-definition changed, the service might not be able anymore to delete a record.
|
If a schema is changed, for example if a prefix definition is modified, the service may no longer be able to delete a record.
|
||||||
In this case the record can be deleted manually if you have access to the storage root.
|
In this case, the record can be deleted manually if you have access to the storage root.
|
||||||
|
|
||||||
To delete the record, open a shell and navigate (`cd`) to the directory where the store is located.
|
To delete the record, open a shell and navigate (`cd`) to the directory where the store is located.
|
||||||
The location can be determined from the configuration file.
|
The location can be determined from the configuration file.
|
||||||
Depending on the storage backend, the next steps are different.
|
Depending on the storage backend, the subsequent steps are different.
|
||||||
|
|
||||||
##### `record-dir` backend
|
##### `record-dir` backend
|
||||||
|
|
||||||
|
|
|
||||||