Reference

API

To view the API’s documentation in development, run the server and open http://127.0.0.1:8000/api/schema/swagger-ui/ or http://127.0.0.1:8000/api/schema/redoc/.

The API is used for managing collections (see Kingfisher Collect and KINGFISHER_PROCESS_URL in the Data Registry).

Environment variables

See OCP’s approach to Django settings. New variables are:

LOG_LEVEL

The log level of the root logger

RABBIT_URL

The connection string for RabbitMQ

RABBIT_EXCHANGE_NAME

The name of the RabbitMQ exchange. Follow the pattern kingfisher_process_{service}_{environment}, for example: kingfisher_process_data_registry_production

SCRAPYD_URL

The base URL of Scrapyd, for example: http://localhost:6800

SCRAPYD_PROJECT

The project within Scrapyd

KINGFISHER_COLLECT_FILES_STORE

The directory from which to read the files written by Kingfisher Collect. If Kingfisher Collect and Kingfisher Process share a filesystem, this will be the same value for both services.

ENABLE_CHECKER

Whether to enable the checker worker

Setting REQUESTS_POOL_MAXSIZE to 20 is recommended. This variable sets the maximum number of connections to save in the connection pool used by the ocdsextensionregistry package. The value 20 matches the prefetch_count used by RabbitMQ consumers.
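As a minimal sketch of how the variables above might be read, the following collects them into a dictionary. The read_settings and exchange_name helpers, the fallback values, and the rule that any non-empty ENABLE_CHECKER value enables the checker are all assumptions made for illustration; they are not taken from the project's settings code.

```python
import os


def exchange_name(service, environment):
    """Follow the documented pattern kingfisher_process_{service}_{environment}."""
    return f"kingfisher_process_{service}_{environment}"


def read_settings(environ=os.environ):
    """Collect the variables described above into a dict.

    Every fallback value here is a placeholder chosen for this example only.
    """
    return {
        "LOG_LEVEL": environ.get("LOG_LEVEL", "INFO"),
        "RABBIT_URL": environ.get("RABBIT_URL", ""),
        "RABBIT_EXCHANGE_NAME": environ.get(
            "RABBIT_EXCHANGE_NAME", exchange_name("example", "development")
        ),
        "SCRAPYD_URL": environ.get("SCRAPYD_URL", ""),
        "SCRAPYD_PROJECT": environ.get("SCRAPYD_PROJECT", ""),
        "KINGFISHER_COLLECT_FILES_STORE": environ.get(
            "KINGFISHER_COLLECT_FILES_STORE", ""
        ),
        # Assumption: any non-empty value turns the checker on.
        "ENABLE_CHECKER": bool(environ.get("ENABLE_CHECKER", "")),
        # Assumption: fall back to the recommended value of 20.
        "REQUESTS_POOL_MAXSIZE": int(environ.get("REQUESTS_POOL_MAXSIZE", "20")),
    }


settings = read_settings({"SCRAPYD_URL": "http://localhost:6800", "ENABLE_CHECKER": "1"})
print(settings["SCRAPYD_URL"], settings["ENABLE_CHECKER"])
print(exchange_name("data_registry", "production"))
```

Passing a plain dictionary instead of os.environ, as in the last lines, makes it easy to try different configurations without touching the real environment.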

Message routing

In each worker and command, the queue name and the routing key of published messages (with one exception) are set by a routing_key variable. The binding keys are set by a consume_routing_keys variable. Queue names and routing keys are prefixed by the exchange name, which is set by the RABBIT_EXCHANGE_NAME environment variable.
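The naming rule above can be illustrated with a short sketch. The WorkerRouting class is hypothetical, and the underscore used to join the exchange name and the key is an assumption, since the text only says that names are prefixed. The one exception mentioned above is not modelled. The example values are those of the checker worker in this section.

```python
from dataclasses import dataclass, field


@dataclass
class WorkerRouting:
    """Hypothetical holder for one worker's routing configuration."""

    exchange: str
    routing_key: str
    consume_routing_keys: list = field(default_factory=list)

    def _prefixed(self, name):
        # Assumption: an underscore joins the exchange name and the key.
        return f"{self.exchange}_{name}"

    @property
    def queue_name(self):
        # The queue name is derived from routing_key.
        return self._prefixed(self.routing_key)

    @property
    def publish_routing_key(self):
        # Published messages use the same routing_key.
        return self._prefixed(self.routing_key)

    @property
    def binding_keys(self):
        # Binding keys are derived from consume_routing_keys.
        return [self._prefixed(key) for key in self.consume_routing_keys]


checker = WorkerRouting(
    exchange="kingfisher_process_data_registry_production",
    routing_key="checker",
    consume_routing_keys=["file_worker", "addchecks"],
)
print(checker.queue_name)
print(checker.binding_keys)
```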

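Each actor is listed below with its consumer routing keys (input), publisher routing keys (output) and processing step.

load command
  • Consumer routing keys (input): N/A
  • Publisher routing keys (output): loader for each collection file
  • Processing step: Create LOAD for each collection file

addfiles command
  • Consumer routing keys (input): N/A
  • Publisher routing keys (output): loader for each collection file
  • Processing step: Create LOAD for each collection file

addchecks command
  • Consumer routing keys (input): N/A
  • Publisher routing keys (output): addchecks for each collection file with missing checks
  • Processing step: Create CHECK for each collection file

closecollection command
  • Consumer routing keys (input): N/A
  • Publisher routing keys (output): collection_closed for the original and derived collections
  • Processing step: N/A

close_collection API
  • Consumer routing keys (input): N/A
  • Publisher routing keys (output): collection_closed for the original and derived collections
  • Processing step: N/A

wipe_collection API
  • Consumer routing keys (input): N/A
  • Publisher routing keys (output): wiper for the collection
  • Processing step: N/A

api_loader worker
  • Consumer routing keys (input): api
  • Publisher routing keys (output): api_loader for the collection file
  • Processing step: Create LOAD for the collection file

file_worker worker
  • Consumer routing keys (input): api_loader, loader
  • Publisher routing keys (output): file_worker for the collection file in the original and upgraded collections
  • Processing step: Delete LOAD for the collection file; create CHECK for the collection file in the original and upgraded collections, if the ENABLE_CHECKER environment variable is set

checker worker
  • Consumer routing keys (input): file_worker, addchecks
  • Publisher routing keys (output): checker for the collection file
  • Processing step: Delete CHECK for the collection file

compiler worker
  • Consumer routing keys (input): file_worker, collection_closed
  • Publisher routing keys (output): compiler_record for each OCID among records in the collection file; compiler_release for each OCID among releases in the entire collection
  • Processing step: For release packages, do nothing if a LOAD remains; create COMPILE for each OCID

record_compiler worker
  • Consumer routing keys (input): compiler_record
  • Publisher routing keys (output): record_compiler for the OCID
  • Processing step: Delete COMPILE for the OCID

release_compiler worker
  • Consumer routing keys (input): compiler_release
  • Publisher routing keys (output): release_compiler for the OCID
  • Processing step: Delete COMPILE for the OCID

finisher worker
  • Consumer routing keys (input): file_worker, checker, record_compiler, release_compiler, collection_closed
  • Publisher routing keys (output): N/A
  • Processing step: Do nothing if a step remains

wiper worker
  • Consumer routing keys (input): wiper
  • Publisher routing keys (output): N/A
  • Processing step: N/A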
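To make the message flow easier to follow, the routing listed above can be written as a small lookup table. This is only an illustrative sketch: the ROUTES dictionary, consumers_of and reachable_from are hypothetical names, routing keys are shown without the exchange prefix, and the per-file or per-OCID fan-out is ignored.

```python
# actor: (consumer routing keys, publisher routing keys), copied from this section.
ROUTES = {
    "load": ((), ("loader",)),
    "addfiles": ((), ("loader",)),
    "addchecks": ((), ("addchecks",)),
    "closecollection": ((), ("collection_closed",)),
    "close_collection": ((), ("collection_closed",)),
    "wipe_collection": ((), ("wiper",)),
    "api_loader": (("api",), ("api_loader",)),
    "file_worker": (("api_loader", "loader"), ("file_worker",)),
    "checker": (("file_worker", "addchecks"), ("checker",)),
    "compiler": (
        ("file_worker", "collection_closed"),
        ("compiler_record", "compiler_release"),
    ),
    "record_compiler": (("compiler_record",), ("record_compiler",)),
    "release_compiler": (("compiler_release",), ("release_compiler",)),
    "finisher": (
        ("file_worker", "checker", "record_compiler", "release_compiler", "collection_closed"),
        (),
    ),
    "wiper": (("wiper",), ()),
}


def consumers_of(routing_key):
    """Return the actors whose consumer routing keys include routing_key."""
    return [actor for actor, (consumes, _) in ROUTES.items() if routing_key in consumes]


def reachable_from(actor):
    """Return every actor that a message published by actor can eventually reach."""
    seen = []
    pending = [actor]
    while pending:
        current = pending.pop()
        for key in ROUTES[current][1]:
            for consumer in consumers_of(key):
                if consumer not in seen:
                    seen.append(consumer)
                    pending.append(consumer)
    return sorted(seen)


print(consumers_of("file_worker"))
print(reachable_from("load"))
```

For example, consumers_of("file_worker") returns the checker, compiler and finisher workers, which matches the three entries that list file_worker as an input.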