Data Model


Collections

Collections are distinct sets of OCDS data. They are the largest unit on which this tool operates.

A collection is uniquely identified by the combination of:

  • Name (source_id): A string. If the collection was created by Kingfisher Scrape, this is the name attribute of the spider.
  • Date (data_version): The date and time at which the collection was created. If the collection was created by Kingfisher Scrape, this is the start_time statistic of the crawl.
  • Sample (sample): A boolean. Whether the collection is only a sample of the data from the source.
  • Base collection (transform_from_collection_id): An integer. The ID of the collection that was transformed into this collection.
  • Transform type (transform_type): A string. The identifier of the transformer that was used to produce this collection.

Each collection is given an integer ID; this is used to refer to the collection in the Command-line tool and the database.
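
The identifying fields above can be pictured as a composite key. The following is a minimal sketch, not the tool's actual implementation: the field names come from the list above, but the CollectionKey class, the assign_ids helper, the use of None for non-transformed collections, and the "example-transform" value are assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class CollectionKey:
    """Hypothetical composite key built from the five identifying fields."""
    source_id: str
    data_version: datetime
    sample: bool
    # Assumption: None means the collection was not produced by a transform.
    transform_from_collection_id: Optional[int] = None
    transform_type: Optional[str] = None


def assign_ids(keys):
    """Give each distinct key an integer ID in first-seen order (illustrative only)."""
    ids = {}
    for key in keys:
        ids.setdefault(key, len(ids) + 1)
    return ids


started = datetime(2020, 1, 1, 12, 0, 0)
full = CollectionKey("example_spider", started, sample=False)
sample = CollectionKey("example_spider", started, sample=True)
# "example-transform" is a placeholder, not a real transform type.
derived = CollectionKey("example_spider", started, False, 1, "example-transform")

ids = assign_ids([full, sample, full, derived])
```

Here the full and sample collections share a name and date but differ in the sample field, so they receive different IDs; the repeated key maps back to the ID it already had.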

Collections are created by Kingfisher Scrape, the web API, or the new-collection command.

Schema check flags

Collections have flags that indicate what operations to perform on them. These are:

  • Run CoVE schema checks on the data in this collection
  • Force OCDS 1.1 checks to be run on OCDS 1.0 data (instead of OCDS 1.0 checks)

To configure the default values for these flags, see Configuration.

Transformed collections

Presently, the tool offers two transformers:

  • Upgrade a collection’s data from OCDS 1.0 to OCDS 1.1
  • Merge a collection’s releases into compiled releases

To transform a collection, create a new collection that refers to the base collection, with either the new-transform-compile-releases or new-transform-upgrade-1-0-to-1-1 command, then run the transform-collection command.


Files

A collection contains one or more files. A file is uniquely identified by its collection and filename. Files can have:

  • Errors: The file could not be retrieved. Presently, errors are either reported by Kingfisher Scrape or caught by the local-load command.
  • Warnings: The file contents had to be modified in order to be stored. Presently, the only warning is about the removal of control characters.

File types

The local-load command must be given the type of the file to load:

  • A single record
  • A single release
  • A JSON array of records, like [ { record-1 }, { record-2 } ]
  • A JSON array of releases
  • A single record package
  • A single release package
  • A JSON array of record packages, like [ { record-package-1 }, { record-package-2 } ]
  • A JSON array of release packages
  • Line-delimited JSON, in which each line is a record package
  • Line-delimited JSON, in which each line is a release package
  • A JSON object with a results key whose value is a JSON array of record packages, like { "results": [ { record-package-1 }, { record-package-2 } ] }
  • A JSON object with a results key whose value is a JSON array of release packages
  • A JSON object with a results key whose value is a JSON array of JSON objects, each of which has an ocdsReleasePackage key whose value is a release package
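
To make the layouts above concrete, here is a minimal sketch of how each one could be split into its OCDS resources. It is not the local-load implementation: the split_items function and the shape labels ("single", "array", "json_lines", "results", "results_wrapped") are hypothetical names chosen for this example, and are not the type values that the command accepts.

```python
import json


def split_items(text, shape):
    """Return the list of OCDS resources held in a file of the given layout."""
    if shape == "single":
        # One record, release, record package or release package.
        return [json.loads(text)]
    if shape == "array":
        # A JSON array of resources.
        return list(json.loads(text))
    if shape == "json_lines":
        # Line-delimited JSON: one package per non-empty line.
        return [json.loads(line) for line in text.splitlines() if line.strip()]
    if shape == "results":
        # A JSON object whose results key holds an array of packages.
        return list(json.loads(text)["results"])
    if shape == "results_wrapped":
        # Each entry of results wraps a release package in ocdsReleasePackage.
        return [entry["ocdsReleasePackage"] for entry in json.loads(text)["results"]]
    raise ValueError(f"unknown shape: {shape}")


lines = '{"releases": []}\n{"releases": []}\n'
wrapped = '{"results": [{"ocdsReleasePackage": {"releases": []}}]}'

from_lines = split_items(lines, "json_lines")
from_wrapped = split_items(wrapped, "results_wrapped")
```

The same five branches cover all of the layouts, because the record and release variants of each layout differ only in the kind of resource they contain, not in how the file is structured.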


Items

A file contains one or more items. An item is an OCDS resource: a release, record, release package or record package. An item is uniquely identified by its index within the file. Indices are 0-based.

Files of the type record, release, record_package, or release_package have one item only. Files of other types have one or more items.
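
Putting the identification rules together: a file is identified by its collection and filename, and an item by its index within the file. The sketch below shows one way to express that as a tuple. The item_keys helper and the filenames are hypothetical, and treating the three values as a single key is an inference from the rules above rather than a description of the database schema.

```python
def item_keys(collection_id, filename, items):
    """Return one (collection_id, filename, index) tuple per item, using 0-based indices."""
    return [(collection_id, filename, index) for index in range(len(items))]


# A file holding a single release package has exactly one item, at index 0.
single = item_keys(1, "package.json", [{"releases": []}])

# A file holding an array of three release packages has items at indices 0, 1 and 2.
many = item_keys(1, "packages.json", [{"releases": []}, {"releases": []}, {"releases": []}])
```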