Submit your data

Dataset selection procedures

The Published Data Library (PDL) operates on a model in which multiple complete copies of datasets are stored indefinitely. Because storage capacity is finite, there are limits to the size and number of datasets the PDL can support, so clear procedures are required to determine which datasets are included. The PDL is designed for base datasets suitable for future re-use in other applications, rather than data that have been reworked specifically for a single research publication (often referred to as “data behind the graph”).

The PDL includes two distinct types of datasets:

  1. Datasets that have already been ingested into the BODC system and subsequently exported. Candidate datasets of this type will be identified through negotiation between the scientists who supplied the data and the BODC data managers responsible for their ingestion, in consultation with BODC management. The technical quality of these datasets is BODC's responsibility.
  2. Datasets that have not yet been ingested into the BODC system but are destined for future ingestion. Candidate datasets will be identified through negotiation between the data-submitting scientists and the BODC data managers responsible for those data. The data originator is responsible for ensuring that the technical quality of these datasets (including metadata) meets a standard deemed acceptable by BODC. BODC will also judge the acceptability of candidate datasets in terms of their completeness, but not in terms of their scientific quality or value.

The PDL can archive and publish high-volume datasets on the CEDA Archive hosted on JASMIN. While the majority of PDL datasets remain housed in BODC's own archive, this extended infrastructure provides additional capacity for storing and disseminating large-scale datasets. Datasets stored there can be accessed directly from BODC's archive space at CEDA.

Please note that, because storage and operational capacity are limited, demand for inclusion may exceed availability. Large-volume datasets or submissions with short deadlines may not be accepted.

PDL datasets are openly accessible and not subject to the full BODC access control system. Users do not need to register or log in to download data. As a result, user activity tracking is limited to basic statistics: what was accessed and when.