Data processing steps
General data processing for projects
Our general data processing procedure for data that have been collected as part of a scientific marine project is as follows:
1. Archive original data. When data are first received they go through our Accession procedure. The data are securely archived in their original form along with any associated documentation.
2. Quality control is carried out to ensure that data are correctly linked to the appropriate sampling event (e.g. CTD bottle, non-toxic sample, zooplankton net, optics rig deployment). This is done by investigation using log sheets, cruise reports and other data sets. Scientists may be contacted if the problem cannot be clearly resolved.
3. Load data into the relevant data table in the project database. Derived parameters easily calculated from other parameters (e.g. ratios between two concentrations) are not loaded. Equations are provided in accompanying documentation to allow users to calculate derived parameters themselves.
- A code from the BODC Parameter Dictionary is assigned to each data value to standardise parameter descriptions. The code can also carry additional items of information about the measurement, such as sample processing or analytical technique.
- Mean values are taken from replicate samples (excluding any outliers).
- Data are converted into BODC standard units so that all data loaded for a particular parameter are directly comparable.
- Any data that the scientist regards as suspect are flagged with an "L" flag. Any data that BODC believe to be suspect are flagged with an "M" flag. BODC flagging is restricted to obvious problems, or to cases where a mean value has been taken from a number of samples with a high standard deviation.
- The data are linked through a table field to the data originator or the Principal Investigator for the group. They may be contacted by users of the data if necessary.
4. Documentation is written to accompany the data. This is based on text about the collection method supplied by the data originator. Any data processing steps and comments about data quality are included.
5. The data are audited by a different member of staff to check that no errors have been made by the data scientist.
6. Distribution and delivery. Data are distributed to project participants on request or via our web delivery service.
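The replicate averaging, unit conversion and flagging described under step 3 can be illustrated with a minimal Python sketch. This is not BODC's actual code. The outlier rule (a median absolute deviation test), the "high standard deviation" threshold, the parameter code `EXAMPLE01`, the conversion factor and all function and class names are assumptions made for illustration, since the procedure above does not specify them. Only the "L" and "M" flag meanings come from the text, and the choice to let "L" take precedence over "M" is also an assumption.

```python
from dataclasses import dataclass
from statistics import mean, median, stdev

# Assumed thresholds: the procedure does not state the real criteria.
OUTLIER_MAD_FACTOR = 5.0  # reject values more than 5 MADs from the median
HIGH_RELATIVE_SD = 0.2    # relative standard deviation that triggers an "M" flag


@dataclass
class LoadedValue:
    parameter_code: str  # placeholder, not a real BODC Parameter Dictionary code
    value: float         # mean of the kept replicates, in the standard unit
    flag: str            # "" = unflagged, "L" = originator suspect, "M" = BODC suspect


def reject_outliers(values):
    """Drop replicates far from the median (simple MAD rule, an assumption)."""
    if len(values) < 3:
        return list(values)
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        # More than half the replicates are identical; this simple rule
        # cannot scale the deviations, so keep everything.
        return list(values)
    return [v for v in values if abs(v - med) / mad <= OUTLIER_MAD_FACTOR]


def process_replicates(replicates, parameter_code, to_standard_unit,
                       originator_suspect=False):
    """Average replicates, convert to the standard unit and assign a flag."""
    kept = reject_outliers(replicates)
    converted = [v * to_standard_unit for v in kept]
    avg = mean(converted)

    flag = ""
    if originator_suspect:
        # The scientist regards the data as suspect.
        flag = "L"
    elif len(converted) > 1 and avg != 0:
        # Mean taken from replicates with a high spread.
        if stdev(converted) / abs(avg) > HIGH_RELATIVE_SD:
            flag = "M"
    return LoadedValue(parameter_code, avg, flag)


# Example: four replicates in a hypothetical source unit, converted with an
# assumed factor of 0.001 into the hypothetical standard unit.
example = process_replicates([10.0, 10.2, 9.8, 50.0], "EXAMPLE01", 0.001)
```

In this sketch the replicate value 50.0 is rejected before averaging, so the stored mean is based on the three remaining values. A real implementation would also record which replicates were excluded, so that the accompanying documentation can describe the processing.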
Related BODC pages
| Overview of all data processing steps at BODC | Specific moored instrument processing for data destined for the National Oceanographic Database (NODB) |
| Specific CTD data processing for projects | BODC parameter codes |
| Specific underway data processing for projects | Code and format definitions |