Use dump_things_pyclient to implement triple-tools #3
12 changed files with 610 additions and 280 deletions
README.md

@@ -19,27 +19,73 @@ Perform the following operations, preferably in a Python-virtual environment.
## The commands

This project provides the following CLI commands:

- auto-curate: automatically move records from inboxes to the curated area of a collection
- clean-incoming: delete all records from an inbox of a collection
- list-incoming: list records in inboxes of a collection
- post-records: read records from stdin and post them to the inbox or curated area of a collection
- read-pages: read records from a collection, the curated area of a collection, or specific inboxes
- read-paginated-url: read records from any paginated service endpoint
- build-local-triple-store: read all records from a collection and emit N-Triples

The following sections show the help messages for these commands.

#### read-pages

Read all pages from a paginated endpoint.

```
usage: read-pages [-h] [-c CLASS_NAME] [-f FORMAT] [-p PID] [-i LABEL] [-C] [-m MATCHING] [-s PAGE_SIZE] [-F FIRST_PAGE] [-l LAST_PAGE] [--stats] [-P] service_url collection

Get records from a collection on a dump-things-service

This command lists records that are stored in a dump-things-service. By
default all records that are readable with the given token, or the default
token, will be displayed. The output format is JSONL (JSON lines), where
every line contains a record or a record with paging information. If `ttl`
is chosen as the format of the output records, the record content will be a
string that contains a TTL document.

The command supports reading from the curated area only, reading from incoming
areas, or reading records with a given PID.

Pagination information is returned for paginated results, when requested with
`-P/--pagination`. All results are paginated except "get a record with a given PID"
and "get the list of incoming zone labels".

If the environment variable "DUMPTHINGS_TOKEN" is set, its content will be used
as a token to authenticate against the dump-things-service.

positional arguments:
  service_url
  collection

options:
  -h, --help            show this help message and exit
  -c, --class CLASS_NAME
                        only read records of this class, ignored if "--pid" is provided
  -f, --format FORMAT   format of the output records ("json" or "ttl")
  -p, --pid PID         the pid of the record that should be read
  -i, --incoming LABEL  read from the incoming area with the given label in the collection, if LABEL is "-", return the labels
  -C, --curated         read from the curated area of the collection
  -m, --matching MATCHING
                        return only records that have a matching value (use % as wildcard). Ignored if "--pid" is provided. (NOTE: not all endpoints and backends support matching.)
  -s, --page-size PAGE_SIZE
                        set the page size (1 - 100) (default: 100), ignored if "--pid" is provided
  -F, --first-page FIRST_PAGE
                        the first page to return (default: 1), ignored if "--pid" is provided
  -l, --last-page LAST_PAGE
                        the last page to return (default: None (return all pages)), ignored if "--pid" is provided
  --stats               show the number of records and pages and exit, ignored if "--pid" is provided
  -P, --pagination      show pagination information (each record from a paginated endpoint is returned as [<record>, <current page number>, <total number of pages>, <page size>, <total number of items>])
```

For a given `<service_url>` and `<collection>` the tool will read all pages
returned by `<service_url>/<collection>/records/p/`, or the respective inbox or
the curated area.

The tool reads a token from the environment variable `DUMPTHINGS_TOKEN` if set.
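With `-P/--pagination`, every output line is a five-element JSON array. A consumer can unpack such a line as sketched below; the record content and the numbers are made up for illustration:

```python
import json

# one illustrative JSONL line, as emitted by read-pages with -P/--pagination:
# [<record>, <current page>, <total pages>, <page size>, <total items>]
line = '[{"pid": "ex:thing-1"}, 1, 3, 100, 250]'
record, page, total_pages, page_size, total_items = json.loads(line)
print(record["pid"], page, total_pages)  # ex:thing-1 1 3
```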
@@ -73,10 +119,15 @@ The tool reads a token from the environment variable `DUMPTHINGS_TOKEN`.

Move records from inboxes into the curated part of a collection.

```
usage: auto-curate [-h] [--destination-service-url DEST_SERVICE_URL] [--destination-collection DEST_COLLECTION] [--destination-token DEST_TOKEN] [-e EXCLUDE] [-l] [-r] [-o] [-p PID] SOURCE_SERVICE_URL SOURCE_COLLECTION

Automatically move records from the incoming areas of a
collection to the curated area of the same collection, or to
the curated area of another collection.

The environment variable "DUMPTHINGS_TOKEN" must contain a token
which is used to authenticate the requests. The token must have
curator-rights.

positional arguments:
  SOURCE_SERVICE_URL
@@ -84,21 +135,21 @@ positional arguments:

options:
  -h, --help            show this help message and exit
  --destination-service-url DEST_SERVICE_URL
                        select a different dump-things-service, i.e. not SOURCE_SERVICE_URL, as destination for auto-curated records
  --destination-collection DEST_COLLECTION
                        select a different collection, i.e. not the SOURCE_COLLECTION of SOURCE_SERVICE_URL, as destination for auto-curated records
  --destination-token DEST_TOKEN
                        if provided, this token will be used for the destination service, otherwise $DUMPTHINGS_TOKEN will be used
  -e, --exclude EXCLUDE
                        exclude an inbox on the source collection (repeatable)
  -l, --list-labels     list the inbox labels of the given source collection, do not perform any curation
  -r, --list-records    list records in the inboxes of the given source collection, do not perform any curation
  -o, --list-only       [DEPRECATED: use "--list-records"] list records in the inboxes of the given source collection, do not perform any curation
  -p, --pid PID         if provided, process only records that match the given PIDs
```

`auto-curate` requires that the environment variable `DUMPTHINGS_TOKEN` is set, and contains a valid curator-token.


#### build-local-triple-store

@@ -149,7 +200,7 @@ options:

List the labels of all inboxes of a given collection

```
usage: list-incoming [-h] [-s] base_url collection

positional arguments:
  base_url
@@ -157,10 +208,10 @@ positional arguments:

options:
  -h, --help            show this help message and exit
  -s, --show-records    show the records in the inboxes as well
```

`list-incoming` requires that the environment variable `CURATOR_TOKEN` is set, and contains a valid curator-token.
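`list-incoming` prints JSON: a plain array of labels, or, with `-s/--show-records`, an object mapping each inbox label to its records. A consumer can post-process that object as sketched here; the labels and records are made up for illustration:

```python
import json

# illustrative shape of list-incoming output with -s/--show-records
payload = '{"inbox-alice": [{"pid": "ex:1"}], "inbox-bob": []}'
inboxes = json.loads(payload)
# keep only the labels of inboxes that actually contain records
non_empty = [label for label, records in inboxes.items() if records]
print(non_empty)  # ['inbox-alice']
```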
#### json2ttl
@@ -171,8 +222,14 @@ contain TTL-documents with one string per line.

```
usage: json2ttl [-h] schema

Read JSON records from stdin and convert them to TTL

This command reads one record per line, either in JSON format or as a JSON
string with a TTL document, from stdin, converts them to TTL or JSON, and
prints them to stdout.

positional arguments:
  schema                URL of the schema that should be used

options:
  -h, --help            show this help message and exit
@@ -187,6 +244,44 @@ records in a collection to TTL:

...
```
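Each input line for `json2ttl` must be a complete JSON record carrying a `schema_type` attribute; records without it are rejected. A minimal input line (with made-up values) can be produced like this:

```python
import json

# hypothetical record; json2ttl rejects records that lack `schema_type`
record = {"pid": "ex:thing-1", "schema_type": "dlthings:Thing"}
line = json.dumps(record)
```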
#### read-paginated-url

General tool to read from any paginated endpoint of a dump-things-service.

```
usage: read-paginated-url [-h] [-s PAGE_SIZE] [-F FIRST_PAGE] [-l LAST_PAGE] [--stats] [-f FORMAT] [-m MATCHING] [-p] url

Read paginated endpoint

This command lists all records that are available via paginated endpoints from
a dump-things-service, e.g., from:

https://<service-location>/<collection>/records/p/

If the environment variable "DUMPTHINGS_TOKEN" is set, its content will be used
as a token to authenticate against the dump-things-service.

positional arguments:
  url                   url of the paginated endpoint of the dump-things-service

options:
  -h, --help            show this help message and exit
  -s, --page-size PAGE_SIZE
                        set the page size (1 - 100) (default: 100)
  -F, --first-page FIRST_PAGE
                        the first page to return (default: 1)
  -l, --last-page LAST_PAGE
                        the last page to return (default: None (return all pages))
  --stats               show information about the number of records and pages and exit, the format is [<total number of pages>, <page size>, <total number of items>]
  -f, --format FORMAT   format of the output records ("json" or "ttl"). (NOTE: not all endpoints support the format parameter.)
  -m, --matching MATCHING
                        return only records that have a matching value (use % as wildcard). (NOTE: not all endpoints and backends support matching.)
  -p, --pagination      show pagination information (each record from a paginated endpoint is returned as [<record>, <current page number>, <total number of pages>, <page size>, <total number of items>])
```

`read-paginated-url` reads a token from the environment variable `DUMPTHINGS_TOKEN` if it is set.

## SPARQL search over a collection with qlever

To provide SPARQL search for a collection the following steps are necessary:

@@ -194,7 +289,7 @@ To provide SPARQL search for a collection the following steps are necessary:

1. Create an N-Triples representation of the records of the store
2. Build a qlever index
3. Start the qlever server
4. Use qlever query to send SPARQL queries to the server
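Step 1 produces one statement per line in N-Triples syntax; the file that qlever indexes in step 2 is just a sequence of such lines. The shape can be sketched as follows (the IRIs are illustrative, not from the project):

```python
# assemble two illustrative N-Triples statements, matching the line
# format that build-local-triple-store emits in step 1
triples = [
    ('<urn:ex:a>', '<urn:ex:p>', '"x"'),
    ('<urn:ex:b>', '<urn:ex:p>', '"y"'),
]
nt_document = '\n'.join(f'{s} {p} {o} .' for s, p, o in triples)
print(nt_document)
```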

----

pyproject.toml

@@ -24,6 +24,7 @@ classifiers = [
    "Programming Language :: Python :: Implementation :: PyPy",
]
dependencies = [
    "dump-things-pyclient",
    "dump-things-service",
    "progress",
    "qlever",

@@ -44,6 +45,7 @@ list-incoming = "triple_tools.list_incoming:main"
post-records = "triple_tools.post_records:main"
read-pages = "triple_tools.read_pages:main"
json2ttl = "triple_tools.json2ttl:main"
read-paginated-url = "triple_tools.read_paginated_url:main"

[tool.hatch.build.targets.wheel]
exclude = [

@@ -1 +1 @@
__version__ = '0.2.2'
__version__ = '0.2.3'

triple_tools/auto_curate.py

@@ -1,33 +1,47 @@
from __future__ import annotations

import argparse
import json
import logging
import os
import re
import sys

from dump_things_pyclient.communicate import (
    HTTPError,
    curated_write_record,
    incoming_delete_record,
    incoming_read_labels,
    incoming_read_records,
)

logger = logging.getLogger('auto_curate')

token_name = 'DUMPTHINGS_TOKEN'

stl_info = False

description = f"""
Automatically move records from the incoming areas of a
collection to the curated area of the same collection, or to
the curated area of another collection.

The environment variable "{token_name}" must contain a token
which is used to authenticate the requests. The token must have
curator-rights.
"""


def _main():
    argument_parser = argparse.ArgumentParser(
        description=description,
        formatter_class=argparse.RawDescriptionHelpFormatter,
    )
    argument_parser.add_argument('service_url', metavar='SOURCE_SERVICE_URL')
    argument_parser.add_argument('collection', metavar='SOURCE_COLLECTION')
    argument_parser.add_argument(
        '--destination-service-url',
        default=None,
        metavar='DEST_SERVICE_URL',
        help='select a different dump-things-service, i.e. not SOURCE_SERVICE_URL, as destination for auto-curated records',
    )
@@ -42,71 +56,144 @@ def main():
        '--destination-token',
        default=None,
        metavar='DEST_TOKEN',
        help=f'if provided, this token will be used for the destination service, otherwise ${token_name} will be used',
    )
    argument_parser.add_argument(
        '-e', '--exclude',
        action='append',
        default=[],
        help='exclude an inbox on the source collection (repeatable)',
    )
    argument_parser.add_argument(
        '-l', '--list-labels',
        action='store_true',
        help='list the inbox labels of the given source collection, do not perform any curation',
    )
    argument_parser.add_argument(
        '-r', '--list-records',
        action='store_true',
        help='list records in the inboxes of the given source collection, do not perform any curation',
    )
    argument_parser.add_argument(
        '-o', '--list-only',
        action='store_true',
        help='[DEPRECATED: use "--list-records"] list records in the inboxes of the given source collection, do not perform any curation',
    )
    argument_parser.add_argument(
        '-p', '--pid',
        action='append',
        help='if provided, process only records that match the given PIDs',
    )

    arguments = argument_parser.parse_args()

    # `--list-only` is a deprecated alias for `--list-records`
    if arguments.list_only:
        arguments.list_records = True

    curator_token = os.environ.get(token_name)
    if curator_token is None:
        print(f'ERROR: environment variable "{token_name}" not set', file=sys.stderr, flush=True)
        return 1

    destination_url = arguments.destination_service_url or arguments.service_url
    destination_collection = arguments.destination_collection or arguments.collection
    destination_token = arguments.destination_token or curator_token

    output = None

    # If --list-labels and --list-records are provided, keep only the latter,
    # because it includes listing of labels
    if arguments.list_records:
        if arguments.list_labels:
            print('WARNING: `-l/--list-labels` and `-r/--list-records` defined, ignoring `-l/--list-labels`', file=sys.stderr, flush=True)
            arguments.list_labels = False
        output = {}
    if arguments.list_labels:
        output = []

    for label in incoming_read_labels(
            service_url=arguments.service_url,
            collection=arguments.collection,
            token=curator_token):

        if label in arguments.exclude:
            logger.debug('ignoring excluded incoming label: %s', label)
            continue

        if arguments.list_labels:
            output.append(label)
            continue

        if arguments.list_records:
            output[label] = []

        for record, _, _, _, _ in incoming_read_records(
                service_url=arguments.service_url,
                collection=arguments.collection,
                label=label,
                token=curator_token):

            if arguments.pid:
                if record['pid'] not in arguments.pid:
                    logger.debug(
                        'ignoring record with non-matching pid: %s',
                        record['pid'])
                    continue

            if arguments.list_records:
                output[label].append(record)
                continue

            # Get the class name from the `schema_type` attribute. This requires
            # that the schema type is either stored in the record or that the
            # store has a "Schema Type Layer", i.e., the store type is
            # `record_dir+stl`, or `sqlite+stl`.
            try:
                class_name = re.search('([_A-Za-z0-9]*$)', record['schema_type']).group(0)
            except KeyError:
                global stl_info
                if not stl_info:
                    print(
                        f"""Could not find `schema_type` attribute in record with
pid {record['pid']}. Please ensure that `schema_type` is stored in
the records or that the associated incoming area store has a backend
with a "Schema Type Layer", i.e., "record_dir+stl" or
"sqlite+stl".""",
                        file=sys.stderr,
                        flush=True)
                    stl_info = True
                print(
                    f'WARNING: ignoring record with pid {record["pid"]}, `schema_type` attribute is missing.',
                    file=sys.stderr,
                    flush=True)
                continue

            # Store record in destination collection
            curated_write_record(
                service_url=destination_url,
                collection=destination_collection,
                class_name=class_name,
                record=record,
                token=destination_token)

            # Delete record from incoming area
            incoming_delete_record(
                service_url=arguments.service_url,
                collection=arguments.collection,
                label=label,
                pid=record['pid'],
                token=curator_token,
            )

    if output is not None:
        print(json.dumps(output, ensure_ascii=False))

    return 0


def main():
    try:
        return _main()
    except HTTPError as e:
        print(f'ERROR: {e}: {e.response.text}', file=sys.stderr, flush=True)
        return 1


if __name__ == '__main__':
    sys.exit(main())
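The class-name extraction from `schema_type` used by auto-curate can be tried in isolation; the `schema_type` value below is made up:

```python
import re

# the same regex auto-curate uses: take the trailing identifier of a
# CURIE-style `schema_type`
schema_type = 'dlthings:Thing'
class_name = re.search('([_A-Za-z0-9]*$)', schema_type).group(0)
print(class_name)  # Thing
```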
triple_tools/build_local_triple_store.py

@@ -9,10 +9,13 @@ import sys
from dump_things_service.converter import Format, FormatConverter
from rdflib import Graph

from dump_things_pyclient.communicate import (
    HTTPError,
    get_paginated,
)


def _main():
    argument_parser = argparse.ArgumentParser()
    argument_parser.add_argument('schema')
    argument_parser.add_argument('base_url')

@@ -22,8 +25,7 @@ def main():
    token = os.environ.get('DUMPTHINGS_TOKEN')
    if token is None:
        print('WARNING: environment variable DUMPTHINGS_TOKEN not set', file=sys.stderr, flush=True)

    print(f'Creating converter for schema {arguments.schema} ...', file=sys.stderr, end='', flush=True)
    converter = FormatConverter(

@@ -41,7 +43,7 @@ def main():
    )

    g = Graph()
    for json_object in get_paginated(url_base, page_size=100, token=os.environ.get('DUMPTHINGS_TOKEN')):
        object_class = json_object.get('schema_type')
        if object_class is None:
            raise ValueError(f'No schema_type in {json_object}')

@@ -51,7 +53,7 @@ def main():
        try:
            ttl = converter.convert(json_object, class_name)
        except ValueError as ve:
            print(f'WARNING: could not convert record {json_object["pid"]}: {ve}', file=sys.stderr, flush=True)
            continue
        g.parse(io.StringIO(ttl), format='n3')

@@ -59,5 +61,13 @@ def main():
    return 0


def main():
    try:
        return _main()
    except HTTPError as e:
        print(f'ERROR: {e}: {e.response.text}', file=sys.stderr, flush=True)
        return 1


if __name__ == '__main__':
    sys.exit(main())
triple_tools/clean_incoming.py

@@ -4,28 +4,29 @@ import argparse
import os
import sys

from dump_things_pyclient.communicate import (
    HTTPError,
    incoming_delete_record,
    incoming_read_records,
)


def _main():
    argument_parser = argparse.ArgumentParser()
    argument_parser.add_argument('base_url')
    argument_parser.add_argument('collection')
    argument_parser.add_argument('label')
    argument_parser.add_argument('--list-only', '-l', action='store_true', help="list records in the inbox, don't remove them")

    arguments = argument_parser.parse_args()

    curator_token = os.environ.get('CURATOR_TOKEN')
    if curator_token is None:
        print('ERROR: environment variable CURATOR_TOKEN not set', file=sys.stderr, flush=True)
        return 1

    for record, _, _, _, _ in incoming_read_records(
            service_url=arguments.base_url,
            collection=arguments.collection,
            label=arguments.label,
            token=curator_token,

@@ -35,13 +36,24 @@ def main():
            continue

        # Delete record from incoming area
        incoming_delete_record(
            service_url=arguments.base_url,
            collection=arguments.collection,
            label=arguments.label,
            pid=record['pid'],
            token=curator_token,
        )
    return 0


def main():
    try:
        return _main()
    except HTTPError as e:
        print(f'ERROR: {e}: {e.response.text}', file=sys.stderr, flush=True)
        return 1


if __name__ == '__main__':
    sys.exit(main())
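The `_main`/`main` split repeated in these modules funnels every `HTTPError` into a stderr diagnostic and exit code 1. The pattern in isolation looks like this (with a stand-in exception class, since `dump_things_pyclient` is not imported here):

```python
import sys


class HTTPError(Exception):
    """Stand-in for dump_things_pyclient's HTTPError, for illustration."""


def _main():
    # a real tool would talk to the dump-things-service here
    raise HTTPError('503 Service Unavailable')


def main():
    # wrapper: report the failure on stderr and signal it via the exit code
    try:
        return _main()
    except HTTPError as e:
        print(f'ERROR: {e}', file=sys.stderr, flush=True)
        return 1
```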
triple_tools/communicate.py (deleted)

@@ -1,130 +0,0 @@
from __future__ import annotations

from collections.abc import Iterable
from urllib.parse import quote_plus

import requests
from progress.bar import Bar


def _create_url(
    url_base: str,
    parameters: dict[str, str] | None = None,
    page_number: int | None = None,
):
    parameters = parameters or {}
    parameters.update({'page': str(page_number)})
    all_parameters = [f'{k}={quote_plus(v)}' for k, v in parameters.items()]
    return url_base + '?' + '&'.join(all_parameters)


def _get_page(
    url_base: str,
    token: str | None = None,
    parameters: Iterable[str] | None = None,
    page_number: int | None = None,
):
    return get_from_url(_create_url(url_base, parameters, page_number), token)


def get_all(
    url_base: str,
    token: str | None = None,
    parameters: dict[str, str] | None = None,
    show_progress: bool = False,
):
    # Get the first result and the number of pages
    result = _get_page(url_base, token, parameters, page_number=1)
    total_pages = result['pages']
    if total_pages == 0:
        return

    if show_progress:
        bar = Bar('Pages', max=total_pages, suffix='%(index)d/%(max)d - %(eta_td)s')
        yield from result['items']
        bar.next()
    else:
        yield from result['items']

    # Get remaining results
    for page in range(2, total_pages + 1):
        result = _get_page(url_base, token, parameters, page_number=page)
        yield from result['items']
        if show_progress:
            bar.next()

    if show_progress:
        bar.finish()


def check_result(
    result: requests.Response,
    method: str,
    url: str
):
    if not 200 <= result.status_code < 300:
        msg = f'HTTP {method} {url} failed: {result.status_code}: {result.text}'
        raise RuntimeError(msg)


def get_from_url(
    url: str,
    token: str,
):
    r = requests.get(
        url,
        headers=({
            'x-dumpthings-token': token,
        } if token else {}),
    )
    check_result(r, 'GET', url)
    return r.json()


def post_to_url(
    url: str,
    token: str | None,
    content: list | dict
):
    r = requests.post(
        url,
        headers=({
            'x-dumpthings-token': token,
        } if token else {}),
        json=content,
    )
    check_result(r, 'POST', url)
    return r.json()


def delete_url(
    url: str,
    token: str | None,
):
    r = requests.delete(
        url,
        headers=({
            'x-dumpthings-token': token,
        } if token else {}),
    )
    check_result(r, 'DELETE', url)
    return r.json()


def get_labels(
    url_base: str,
    collection: str,
    token: str | None = None,
):
    yield from get_from_url(f'{url_base}/{collection}/incoming/', token)


def get_records_from_label(
    url_base: str,
    collection,
    label: str,
    token: str | None = None,
    parameters: dict[str, str] | None = None,
):
    label_url = f'{url_base}/{collection}/incoming/{label}/records/p/'
    yield from get_all(label_url, token=token, parameters=parameters)
triple_tools/json2ttl.py

@@ -11,9 +11,21 @@ from dump_things_service.converter (
)


description = f"""Read JSON records from stdin and convert them to TTL

This command reads one record per line, either in JSON format or as a JSON
string with a TTL document, from stdin, converts them to TTL or JSON, and
prints them to stdout.

"""


def main():
    argument_parser = argparse.ArgumentParser(
        description=description,
        formatter_class=argparse.RawDescriptionHelpFormatter,
    )
    argument_parser.add_argument('schema', help='URL of the schema that should be used')

    arguments = argument_parser.parse_args()

@@ -26,16 +38,16 @@ def main():
    print(' done', file=sys.stderr, flush=True)

    error = False

    for line in sys.stdin:
        json_object = json.loads(line)

        object_class = json_object.get('schema_type')
        if object_class is None:
            error = True
            print(f'ERROR: No schema_type in {json_object}', file=sys.stderr, flush=True)
            continue

        class_name = re.search('([_A-Za-z0-9]*$)', object_class).group(0)

        try:
            ttl = converter.convert(json_object, class_name)
        except ValueError as ve:
@@ -1,45 +1,60 @@
 from __future__ import annotations

 import argparse
+import json
 import os
 import sys
+from collections import defaultdict

-from triple_tools.communicate import (
-    get_labels,
-    get_records_from_label,
+from dump_things_pyclient.communicate import (
+    HTTPError,
+    incoming_read_labels,
+    incoming_read_records,
 )


-def main():
+def _main():
     argument_parser = argparse.ArgumentParser()
     argument_parser.add_argument('base_url')
     argument_parser.add_argument('collection')
-    argument_parser.add_argument('--show-records', '-s', action='store_true')
+    argument_parser.add_argument('-s', '--show-records', action='store_true', help='show the records in the inboxes as well')

     arguments = argument_parser.parse_args()

     curator_token = os.environ.get('CURATOR_TOKEN')
     if curator_token is None:
-        print('ERROR: CURATOR_TOKEN not set', file=sys.stderr, flush=True)
+        print('ERROR: environment variable CURATOR_TOKEN not set', file=sys.stderr, flush=True)
         return 1

-    for label in get_labels(
-        url_base=arguments.base_url,
+    result = {}
+    for label in incoming_read_labels(
+        service_url=arguments.base_url,
         collection=arguments.collection,
         token=curator_token,
     ):
-        print(label)
+        result[label] = []
         if arguments.show_records:
-            for record in get_records_from_label(
-                url_base=arguments.base_url,
+            for record, _, _, _, _ in incoming_read_records(
+                service_url=arguments.base_url,
                 collection=arguments.collection,
                 label=label,
                 token=curator_token,
             ):
-                print('\t', record)
+                result[label].append(record)

+    if arguments.show_records is False:
+        result = list(result)
+    print(json.dumps(result, indent=2, ensure_ascii=False))
     return 0


+def main():
+    try:
+        return _main()
+    except HTTPError as e:
+        print(f'ERROR: {e}: {e.response.text}', file=sys.stderr, flush=True)
+        return 1


 if __name__ == '__main__':
     sys.exit(main())
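The rewritten `list-incoming` aggregates inbox labels (and, with `-s/--show-records`, their records) into a mapping before printing JSON. A small sketch of the two output shapes, with made-up data:

```python
import json

# label -> records, as assembled by the loop over incoming_read_labels()
result = {'inbox-a': [{'pid': 'x:1'}], 'inbox-b': []}

show_records = False
if show_records is False:
    # Without -s/--show-records only the label list is emitted.
    result = list(result)

print(json.dumps(result, indent=2, ensure_ascii=False))
```

With `-s` the full `{label: [record, ...]}` mapping is printed instead of the bare label list.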
@@ -5,42 +5,51 @@ import json
 import os
 import sys

-from triple_tools.communicate import post_to_url
+from dump_things_pyclient.communicate import (
+    collection_write_record,
+    curated_write_record,
+)


 def main():
     argument_parser = argparse.ArgumentParser()
     argument_parser.add_argument('base_url')
     argument_parser.add_argument('collection')
-    argument_parser.add_argument('cls')
-    argument_parser.add_argument('--curated', action='store_true')
+    argument_parser.add_argument('cls', metavar='class')
+    argument_parser.add_argument('--curated', action='store_true', help='bypass inbox, requires curator token')

     arguments = argument_parser.parse_args()

     token = os.environ.get('DUMPTHINGS_TOKEN')
     if token is None:
-        print('WARNING: DUMPTHINGS_TOKEN not set', file=sys.stderr, flush=True)
-
-    url = (
-        arguments.base_url
-        + ('' if arguments.base_url.endswith('/') else '/')
-        + arguments.collection
-        + '/'
-    )
+        print(
+            'WARNING: environment variable DUMPTHINGS_TOKEN not set',
+            file=sys.stderr,
+            flush=True,
+        )

     if arguments.curated:
-        url += f'curated/'
-    url += f'record/{arguments.cls}'
+        write_record = curated_write_record
+    else:
+        write_record = collection_write_record

     posted = False
     for line in sys.stdin:
-        rec = json.loads(line)
+        record = json.loads(line)
         try:
-            post_to_url(url, token, rec)
+            write_record(
+                service_url=arguments.base_url,
+                collection=arguments.collection,
+                class_name=arguments.cls,
+                record=record,
+                token=token,
+            )
         except Exception as e:
-            print(e)
+            print(f'Error: {e}', file=sys.stderr, flush=True)
         else:
             posted = True
             print('.', end='', flush=True)

     if posted:
         # final newline
         print('')
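`post-records` reads one JSON record per stdin line and hands each to the selected write function. A sketch of that loop with a stand-in writer (the stub and its data are hypothetical):

```python
import io
import json

def write_record(*, service_url, collection, class_name, record, token):
    # Stand-in for collection_write_record / curated_write_record.
    return record

stdin = io.StringIO('{"pid": "x:1"}\n{"pid": "x:2"}\n')
posted = False
for line in stdin:
    record = json.loads(line)
    try:
        write_record(service_url='https://example.org', collection='demo',
                     class_name='Thing', record=record, token=None)
    except Exception as e:
        print(f'Error: {e}')
    else:
        posted = True
        print('.', end='')  # one dot per posted record

if posted:
    print('')  # final newline
```

Failures are reported per record, so one malformed line does not abort the whole stream.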
@@ -4,41 +4,172 @@ import argparse
 import json
 import os
 import sys
+from functools import partial

-from triple_tools.communicate import get_all
+from dump_things_pyclient.communicate import (
+    HTTPError,
+    collection_read_records,
+    collection_read_records_of_class,
+    collection_read_record_with_pid,
+    curated_read_records,
+    curated_read_records_of_class,
+    curated_read_record_with_pid,
+    incoming_read_labels,
+    incoming_read_records,
+    incoming_read_records_of_class,
+    incoming_read_record_with_pid,
+)
+
+
+token_name = 'DUMPTHINGS_TOKEN'
+
+description = f"""Get records from a collection on a dump-things-service
+
+This command lists records that are stored in a dump-things-service. By
+default all records that are readable with the given token, or the default
+token, will be displayed. The output format is JSONL (JSON lines), where
+every line contains a record or a record with paging information. If `ttl`
+is chosen as the format of the output records, the record content will be a
+string that contains a TTL document.
+
+The command supports reading from the curated area only, reading from
+incoming areas, and reading records with a given PID.
+
+Pagination information is returned for paginated results, when requested with
+`-P/--pagination`. All results are paginated except "get a record with a given
+PID" and "get the list of incoming zone labels".
+
+If the environment variable "{token_name}" is set, its content will be used
+as token to authenticate against the dump-things-service.
+"""
+
+
+def _main():
+    argument_parser = argparse.ArgumentParser(
+        description=description,
+        formatter_class=argparse.RawDescriptionHelpFormatter,
+    )
+    argument_parser.add_argument('service_url')
+    argument_parser.add_argument('collection')
+    argument_parser.add_argument('-c', '--class', dest='class_name', help='only read records of this class, ignored if "--pid" is provided')
+    argument_parser.add_argument('-f', '--format', help='format of the output records ("json" or "ttl")')
+    argument_parser.add_argument('-p', '--pid', help='the pid of the record that should be read')
+    argument_parser.add_argument('-i', '--incoming', metavar='LABEL', help='read from the incoming area with the given label in the collection, if LABEL is "-", return the labels')
+    argument_parser.add_argument('-C', '--curated', action='store_true', help='read from the curated area of the collection')
+    argument_parser.add_argument('-m', '--matching', help='return only records that have a matching value (use %% as wildcard), ignored if "--pid" is provided (NOTE: not all endpoints and backends support matching)')
+    argument_parser.add_argument('-s', '--page-size', type=int, help='set the page size (1 - 100) (default: 100), ignored if "--pid" is provided')
+    argument_parser.add_argument('-F', '--first-page', type=int, help='the first page to return (default: 1), ignored if "--pid" is provided')
+    argument_parser.add_argument('-l', '--last-page', type=int, default=None, help='the last page to return (default: None, i.e. return all pages), ignored if "--pid" is provided')
+    argument_parser.add_argument('--stats', action='store_true', help='show the number of records and pages and exit, ignored if "--pid" is provided')
+    argument_parser.add_argument('-P', '--pagination', action='store_true', help='show pagination information (each record from a paginated endpoint is returned as [<record>, <current page number>, <total number of pages>, <page size>, <total number of items>])')
+
+    arguments = argument_parser.parse_args()
+
+    token = os.environ.get(token_name)
+    if token is None:
+        print(f'WARNING: {token_name} not set', file=sys.stderr, flush=True)
+
+    if arguments.incoming and arguments.curated:
+        print(
+            'ERROR: -i/--incoming and -C/--curated are mutually exclusive',
+            file=sys.stderr,
+            flush=True)
+        return 1
+
+    kwargs = dict(
+        service_url=arguments.service_url,
+        collection=arguments.collection,
+        token=token,
+    )
+
+    if arguments.incoming == '-':
+        result = incoming_read_labels(**kwargs)
+        print('\n'.join(
+            map(
+                partial(json.dumps, ensure_ascii=False),
+                result)))
+        return 0
+
+    elif arguments.pid:
+        for argument_value, argument_name in (
+            (arguments.matching, '-m/--matching'),
+            (arguments.page_size, '-s/--page-size'),
+            (arguments.first_page, '-F/--first-page'),
+            (arguments.last_page, '-l/--last-page'),
+            (arguments.stats, '--stats'),
+            (arguments.class_name, '-c/--class'),
+        ):
+            if argument_value:
+                print(
+                    f'WARNING: {argument_name} ignored because "-p/--pid" is provided',
+                    file=sys.stderr,
+                    flush=True)
+
+        kwargs['pid'] = arguments.pid
+        if arguments.curated:
+            result = curated_read_record_with_pid(**kwargs)
+        elif arguments.incoming:
+            kwargs['label'] = arguments.incoming
+            result = incoming_read_record_with_pid(**kwargs)
+        else:
+            kwargs['format'] = arguments.format
+            result = collection_read_record_with_pid(**kwargs)
+        print(json.dumps(result, ensure_ascii=False))
+        return 0
+
+    elif arguments.class_name:
+        kwargs.update(dict(
+            class_name=arguments.class_name,
+            matching=arguments.matching,
+            page=arguments.first_page or 1,
+            size=arguments.page_size or 100,
+            last_page=arguments.last_page,
+        ))
+        if arguments.curated:
+            result = curated_read_records_of_class(**kwargs)
+        elif arguments.incoming:
+            kwargs['label'] = arguments.incoming
+            result = incoming_read_records_of_class(**kwargs)
+        else:
+            kwargs['format'] = arguments.format
+            result = collection_read_records_of_class(**kwargs)
+    else:
+        kwargs.update(dict(
+            matching=arguments.matching,
+            page=arguments.first_page or 1,
+            size=arguments.page_size or 100,
+            last_page=arguments.last_page,
+        ))
+        if arguments.curated:
+            result = curated_read_records(**kwargs)
+        elif arguments.incoming:
+            kwargs['label'] = arguments.incoming
+            result = incoming_read_records(**kwargs)
+        else:
+            kwargs['format'] = arguments.format
+            result = collection_read_records(**kwargs)
+
+    if arguments.pagination:
+        for record in result:
+            print(json.dumps(record, ensure_ascii=False))
+    else:
+        for record in result:
+            print(json.dumps(record[0], ensure_ascii=False))
+    return 0
+
+
 def main():
-    argument_parser = argparse.ArgumentParser()
-    argument_parser.add_argument('base_url')
-    argument_parser.add_argument('collection')
-    argument_parser.add_argument('-s', '--size', type=int, default=100)
-    argument_parser.add_argument('-p', '--parameter', action='append', default=[])
-    argument_parser.add_argument('-c', '--class', default=None, dest='cls')
-
-    arguments = argument_parser.parse_args()
-
-    token = os.environ.get('DUMPTHINGS_TOKEN')
-    if token is None:
-        print('WARNING: DUMPTHINGS_TOKEN not set', file=sys.stderr, flush=True)
-
-    url_base = (
-        arguments.base_url
-        + ('' if arguments.base_url.endswith('/') else '/')
-        + arguments.collection
-        + f'/records/p/'
-    )
-    if arguments.cls:
-        url_base += f'{arguments.cls}/'
-
-    parameters = {'size': str(arguments.size)}
-    parameters.update({
-        param.split('=', 1)[0]: param.split('=', 1)[1]
-        for param in (arguments.parameter or [])
-    })
-
-    for json_object in get_all(url_base, token, parameters=parameters):
-        print(json.dumps(json_object))
+    try:
+        return _main()
+    except HTTPError as e:
+        print(f'ERROR: {e}: {e.response.text}', file=sys.stderr, flush=True)
+        return 1


 if __name__ == '__main__':
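The paginated readers used above yield one entry per record, each carrying the record plus its paging details. A minimal illustration of the `-P/--pagination` switch over fabricated rows:

```python
import json

# (record, current page, total pages, page size, total items) - fabricated
rows = [
    ({'pid': 'x:1'}, 1, 2, 1, 2),
    ({'pid': 'x:2'}, 2, 2, 1, 2),
]

pagination = False
for row in rows:
    if pagination:
        # -P/--pagination: emit the full tuple as a JSON array.
        print(json.dumps(row, ensure_ascii=False))
    else:
        # Default: emit the bare record only.
        print(json.dumps(row[0], ensure_ascii=False))
```

Either way the output stays one JSON value per line, i.e. valid JSONL.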
87
triple_tools/read_paginated_url.py
Normal file

@@ -0,0 +1,87 @@
from __future__ import annotations

import argparse
import json
import os
import sys

from dump_things_pyclient.communicate import (
    HTTPError,
    get_paginated,
)


token_name = 'DUMPTHINGS_TOKEN'

description = f"""Read paginated endpoint

This command lists all records that are available via paginated endpoints of
a dump-things-service, e.g., from:

    https://<service-location>/<collection>/records/p/

If the environment variable "{token_name}" is set, its content will be used
as token to authenticate against the dump-things-service.
"""


def _main():
    argument_parser = argparse.ArgumentParser(
        description=description,
        formatter_class=argparse.RawDescriptionHelpFormatter,
    )
    argument_parser.add_argument('url', help='URL of the paginated endpoint of the dump-things-service')
    argument_parser.add_argument('-s', '--page-size', type=int, default=100, help='set the page size (1 - 100) (default: 100)')
    argument_parser.add_argument('-F', '--first-page', type=int, default=1, help='the first page to return (default: 1)')
    argument_parser.add_argument('-l', '--last-page', type=int, default=None, help='the last page to return (default: None, i.e. return all pages)')
    argument_parser.add_argument('--stats', action='store_true', help='show information about the number of records and pages and exit, the result is returned as [<total number of pages>, <page size>, <total number of items>]')
    argument_parser.add_argument('-f', '--format', help='format of the output records ("json" or "ttl") (NOTE: not all endpoints support the format parameter)')
    argument_parser.add_argument('-m', '--matching', help='return only records that have a matching value (use %% as wildcard) (NOTE: not all endpoints and backends support matching)')
    argument_parser.add_argument('-p', '--pagination', action='store_true', help='show pagination information (each record from a paginated endpoint is returned as [<record>, <current page number>, <total number of pages>, <page size>, <total number of items>])')

    arguments = argument_parser.parse_args()

    token = os.environ.get(token_name)
    if token is None:
        print(f'WARNING: {token_name} not set', file=sys.stderr, flush=True)

    result = get_paginated(
        url=arguments.url,
        token=token,
        first_page=arguments.first_page,
        page_size=arguments.page_size,
        last_page=arguments.last_page,
        parameters={
            'format': arguments.format,
            **({'matching': arguments.matching}
               if arguments.matching is not None
               else {}
            ),
        }
    )

    if arguments.stats:
        record = next(result)
        print(json.dumps(record[2:], ensure_ascii=False))
        return 0

    if arguments.pagination:
        for record in result:
            print(json.dumps(record, ensure_ascii=False))
    else:
        for record in result:
            print(json.dumps(record[0], ensure_ascii=False))
    return 0


def main():
    try:
        return _main()
    except HTTPError as e:
        print(f'ERROR: {e}: {e.response.text}', file=sys.stderr, flush=True)
        return 1


if __name__ == '__main__':
    sys.exit(main())
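`--stats` consumes a single paginated row and prints only its trailing paging fields. A sketch with a fabricated first row:

```python
import json

# Fabricated first paginated row:
# (record, current page, total pages, page size, total items)
first_row = ({'pid': 'x:1'}, 1, 7, 100, 634)

stats = first_row[2:]  # (total pages, page size, total items)
print(json.dumps(stats, ensure_ascii=False))  # [7, 100, 634]
```

Only one page request is needed for the statistics, since every paginated row carries the collection-wide totals.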