Edserplo — BODC's visualisation program
The following is a technical summary of the capabilities of Edserplo — BODC's visualisation and tidal processing tool.
Contents
- Background
- Data types
- An introduction to Edserplo
- Invoking Edserplo
- Input and output functionality
- The series page
- The series location map
- The time series page
- The profile (CTD) page
- Scatter plots
- Thermistor chain or ADCP (TCAD) page
- Wave data
- HF Radar surface currents — Ocean Surface Current Radar (OSCR)
- Processing tidal data
1. Background
Until 2005, BODC employed a suite of visualisation programs to quickly assess the validity of marine data. Skilled data scientists used the programs to screen and flag oceanographic data. Flagging is a process where suspect data are highlighted, but the actual data values are not changed.
We are also responsible for the processing of data collected by the UK Tide Gauge Network. To process these data efficiently we developed Edteva — a program for editing, visualisation and analysis of tidal data.
These programs depended on the Silicon Graphics, Inc (SGI) operating system, IRIX, and associated workstations. They consisted of some 60,000 lines of code written in Fortran and C++.
In July 2003, we initiated a programme of development for a replacement (Edserplo) for the existing visualisation programs. With the low prices of Intel-based hardware and associated graphics cards it was prudent to change to Linux and/or Windows platforms.
The new software had to
- Have similar functionality to the original four programs — reducing learning overheads for existing users, minimising development time and providing a benchmark for programmers.
- Respond as fast as, or faster than, the software it was replacing.
- Be Windows compatible — to accommodate BODC's data scientists.
- Ideally provide Unix and Linux compatibility — to accommodate users from our hosting laboratory, the National Oceanography Centre (NOC).
- Utilise a mainstream programming language — to safeguard future software support.
An initial prototype indicated that Java, with its cross platform capabilities, provided the facilities and performance to support our visualisation operations.
The first version of Edserplo became operational in 2005. This replaced our suite of visualisation programs. It is expected that the software will be continually enhanced.
2. Data types
Edserplo can present any data conforming to the BODC series model which have pressure, depth, altitude and/or time as independent variables. Typical data types include
- CTD (Conductivity Temperature Depth).
- Underway — continuous measurements of sea surface data (e.g. salinity, temperature, attenuance, chlorophyll, nutrients), meteorology, navigation and bathymetry.
- Radiosonde.
- XBT (eXpendable BathyThermograph).
- Tide gauge.
- Current meter.
- ADCP (Acoustic Doppler Current Profiler) — static and shipborne.
- Wave statistics (Hs, Tz).
- One-dimensional wave spectra.
- Two-dimensional wave spectra.
- Thermistor chain.
- HF Radar surface currents.
- Other moored instruments.
3. An introduction to Edserplo
Edserplo consists of a suite of presentation pages. Each page is displayed on the screen in its own window and provides the user with the ability to perform a specific data screening or data processing function.
Special factors relating to the processing of UK Tide Gauge Network data require that there are dedicated pages within Edserplo for this purpose. However, the software is still of a generalised nature and is by no means limited to this particular task. These dedicated pages are identified by (E) below and are described in the processing tidal data section.
Graphical presentation pages
All the pages support panning, zooming and superposition.
- Time series with inset track map
- Profile (CTD) page
- Thermistor chain or ADCP (TCAD)
- Wave spectrum
- Wave histogram
- HF Radar surface current (OSCR)
- Current meter scatter plot
- X-Y scatter plot
- Port time series (E)
Selection and other control pages
- Series page
- Series location map
- Port page (E)
- Output page (E)
- Tidal analysis page (E)
4. Invoking Edserplo
To invoke the program, the user must provide individual data file names or a file of file names (the driver file). The user's preferred display settings may also be defined in a settings file.
5. Input and output functionality
Data file formats
Currently, Edserplo supports the reading of thirteen file formats. They include
- A format for multi-series
- Two formats where the data values themselves are updateable
- Four writeable formats
All formats originate from BODC with one exception, the National Oceanography Centre, Southampton (NOCS) format PSTAR.
Edserplo has been constructed so that further formats can be added without changing the kernel of the program. Each file format is supported through a dedicated class, and each of these classes extends a BODC abstract class. Edserplo automatically recognises each input file format. Additional format classes may be added by an end user who is familiar with Java, following instructions in the user guide.
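As a rough illustration of this kind of design, the sketch below shows how one abstract base class with one subclass per format could support automatic format recognition. All names (`SeriesFormat`, `recognises`, `DemoFormat`, `FormatRegistry`) and the "DEMO" header are hypothetical and are not taken from Edserplo's source or user guide.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Hypothetical base class: each supported file format is one subclass.
abstract class SeriesFormat {
    /** Returns true if the first line of a file looks like this format. */
    abstract boolean recognises(String firstLine);

    /** Short format name, used here only for reporting. */
    abstract String name();
}

// Illustrative format whose files are assumed to start with "DEMO".
final class DemoFormat extends SeriesFormat {
    @Override boolean recognises(String firstLine) { return firstLine.startsWith("DEMO"); }
    @Override String name() { return "demo"; }
}

// Registry that asks each registered format in turn; the kernel never
// needs to know which concrete formats exist.
final class FormatRegistry {
    private final List<SeriesFormat> formats = new ArrayList<>();

    void register(SeriesFormat format) { formats.add(format); }

    Optional<SeriesFormat> detect(String firstLine) {
        return formats.stream().filter(f -> f.recognises(firstLine)).findFirst();
    }
}
```

Under this arrangement, supporting a new format means writing one more subclass and registering it, with no change to the code that calls `detect`.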
Data parameters
Data file parameters are normally defined using the BODC parameter dictionary. Each parameter is assigned a unique 8-character code. To enable easy comparison of like parameters, codes can be aliased by inserting information into the driver file.
The definition for a particular parameter code may be obtained as hover text (tooltip) via a database connection on the appropriate pages. Edserplo can operate without tooltips if the database is not available.
Generating derived quantities
Edserplo has the functionality to generate derived quantities. For example
- Eastings and Northings for wind and current data — generated through user supplied information in the driver file.
- Tide and residuals for UK tide gauge data — automatic generation using information stored in a harmonic constants library.
For each transformation, an appropriate mapping has to be defined between the flags in the derived channels and those in the input channels. This may be done by defining a precedence hierarchy amongst the possible flags.
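The two ideas above, deriving components and carrying flags across, can be sketched as follows. This is a minimal example under stated assumptions: direction is taken as degrees clockwise from north in the direction of travel (wind directions are often reported as the direction the wind comes from, which would reverse both signs), and the flag characters and their ordering in `PRECEDENCE` are hypothetical rather than BODC's actual scheme.

```java
import java.util.List;

final class DerivedChannels {
    // Hypothetical flag characters, listed from lowest to highest precedence.
    // Flags missing from the list rank below all listed flags (indexOf gives -1).
    static final List<Character> PRECEDENCE = List.of(' ', 'M', 'N');

    /** East component of a vector from its speed and direction of travel. */
    static double easting(double speed, double directionDeg) {
        return speed * Math.sin(Math.toRadians(directionDeg));
    }

    /** North component of a vector from its speed and direction of travel. */
    static double northing(double speed, double directionDeg) {
        return speed * Math.cos(Math.toRadians(directionDeg));
    }

    /** Flag for a derived value: whichever input flag ranks higher. */
    static char derivedFlag(char speedFlag, char directionFlag) {
        return PRECEDENCE.indexOf(speedFlag) >= PRECEDENCE.indexOf(directionFlag)
                ? speedFlag
                : directionFlag;
    }
}
```

With this ordering, a derived easting built from an unflagged speed and a direction flagged 'M' would itself carry 'M'.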
Tidal analysis and statistics
The UK Tide Gauge Network data require that tidal analyses are retrieved from a library of harmonic constants. However, Edserplo also has the capability of generating a tidal analysis from the data loaded. These analyses may be output to a database and stored for future use. Tidal statistics, including mean sea level, surges and extremes can also be generated and stored in a relational database.
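To show how a tide and residual can be generated from harmonic constants, the sketch below sums cosine terms for each constituent and subtracts the result from the observation. It is deliberately simplified: nodal corrections and equilibrium arguments, which a full prediction applies, are omitted, and the class and field names are assumptions rather than Edserplo's own.

```java
final class TidePrediction {
    /** One harmonic constituent (simplified: no nodal corrections). */
    static final class Constituent {
        final double amplitude;        // metres
        final double speedDegPerHour;  // angular speed, degrees per hour
        final double phaseLagDeg;      // phase lag, degrees

        Constituent(double amplitude, double speedDegPerHour, double phaseLagDeg) {
            this.amplitude = amplitude;
            this.speedDegPerHour = speedDegPerHour;
            this.phaseLagDeg = phaseLagDeg;
        }
    }

    /** Predicted level at a time given in hours from the chosen time origin. */
    static double predict(double meanLevel, Constituent[] constituents, double hours) {
        double level = meanLevel;
        for (Constituent c : constituents) {
            double angleDeg = c.speedDegPerHour * hours - c.phaseLagDeg;
            level += c.amplitude * Math.cos(Math.toRadians(angleDeg));
        }
        return level;
    }

    /** Residual: observed level minus predicted tide. */
    static double residual(double observed, double predicted) {
        return observed - predicted;
    }
}
```

For a single constituent of amplitude 2 m, speed 30 degrees per hour and zero phase lag about a mean level of 5 m, the prediction is 7 m at time zero and 3 m six hours later.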
File output
When edits are limited to flag channels the input data files may be overwritten — if the file format permits. The user may toggle between read and write mode on the series page and opt to save as required. The user is prompted on exit if a file has been edited and a save has not been performed.
After data value edits are performed during tidal processing, the data are written out to a protected 'dump' file. The dump file retains port, channel and other setting information as well as the tidal data set and can be accessed during the user's next Edserplo session. This procedure allows for the editing of data values within a series without changing the files from which they were derived. Additionally, newly generated concatenated series may be output as files from the output page.
Creating hardcopy
A screen capture of each page can be generated via PNG files. This can be useful to communicate potential problems to data originators.
6. The series page
The series page lists series cross-referenced against their parameters, scales, bases and colours. It allows the user to select or deselect series and parameters of interest for display on subsequent pages.
The display colour for each series and the colour, scale, base and axis for each parameter may be defined directly within the series page. Alternatively, settings may be imported via a settings file.
If the settings are not user defined, Edserplo computes the scales and bases from the assessed data limits for each parameter, randomly selects a colour and sets the display axis to one.
The user may toggle between read and write mode and opt to save data files or setting files as required.
7. The series location map
It can be useful to see information about the location of data series collected at fixed positions. For example, neighbouring series for current meter moorings can be checked against each other. The series location map provides this information. Series locations are displayed on a map and selections can be made. The chosen series are mirrored in the series page.
The same technique can also be used to select CTD series recorded in the course of a research cruise. In this case the position is not recorded in the individual series but is obtained from a cruise database.
8. The time series page
Using the settings from the series page, the chosen series and parameters are displayed on the time series page. The parameter base setting is subtracted from the data value before multiplication by the parameter scale to define a pixel distance from the associated axis.
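The base and scale arithmetic described above amounts to a single expression. The sketch below states it in Java, together with one possible way of choosing a default scale from the data limits. The method names and the default-scale rule are assumptions; the source says only that scales and bases are computed from the assessed data limits.

```java
final class AxisMapping {
    /** Pixel distance from the axis: subtract the base, then multiply by the scale. */
    static double pixelOffset(double value, double base, double scale) {
        return (value - base) * scale;
    }

    /**
     * Assumed default: a scale (pixels per data unit) that spreads the range
     * [min, max] across axisPixels. Falls back to 1.0 for a degenerate range.
     */
    static double scaleFromLimits(double min, double max, double axisPixels) {
        return max > min ? axisPixels / (max - min) : 1.0;
    }
}
```

With a base of 10 and a scale of 20 pixels per unit, a value of 12.5 is drawn 50 pixels from the axis.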
The data can be examined as
- All selected series
- One series at a time
- All selected parameters for a series
- One parameter at a time
- For two-dimensional data — one bin at a time
Series, parameters and bins can be cycled through by using the numeric keypad. There are differential options to contrast parameters from different series or bins within a series. Bin data are stacked at a separation chosen by the user.
The display can be quickly panned and zoomed horizontally via the keyboard. The time axis graduation and associated annotation can go from months to seconds. A vertical axis can also be set up by the user.
Data values can be displayed in a cycle window which is tied to the current series, the current pointer position and, for two-dimensional data, the current bin. The current point in the series is selected by its proximity to the mouse's X coordinate.
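Selecting the current point by proximity to the mouse's X coordinate can be done efficiently when the times are sorted. The following is one possible approach using a binary search; it assumes the mouse position has already been converted to a time value, and the class and method names are illustrative only.

```java
import java.util.Arrays;

final class CyclePicker {
    /**
     * Index of the sample whose time is closest to target.
     * Assumes times is non-empty and sorted in ascending order.
     */
    static int nearestIndex(double[] times, double target) {
        int pos = Arrays.binarySearch(times, target);
        if (pos >= 0) {
            return pos;                        // exact match
        }
        int insertion = -pos - 1;              // index of first element greater than target
        if (insertion == 0) {
            return 0;                          // target precedes the series
        }
        if (insertion == times.length) {
            return times.length - 1;           // target follows the series
        }
        double before = target - times[insertion - 1];
        double after = times[insertion] - target;
        return before <= after ? insertion - 1 : insertion;
    }
}
```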
Editing can occur on the time series page but is limited to modifying flags. Flags can be toggled individually by the middle mouse button or collectively by defining a box.
The track map
A large part of BODC's workload is associated with processing data recorded whilst a ship is in transit, where multiple sensors may make recordings every few seconds. Therefore, it is useful to be able to plot the ship's position on a track map. The time series page supports this functionality.
Currently the map is limited to a world coastline at two resolutions, but in future it should be possible to include GEBCO bathymetry. Nonetheless, the system as it stands can cope with data at vastly different scales (contrast the Solent with the whole of the North and South Atlantic) without seriously slowing down the associated time series display. The current ship position is clearly identified on the map.
9. The profile (CTD) page
The profile page or CTD page is predominantly used in BODC for screening CTD (Conductivity, Temperature, Depth) data. However, it can support any data where the independent variable of interest is pressure, depth or altitude.
These variables may not be, and frequently are not, strictly monotonic within a series. Therefore, using the proximity of the mouse's X coordinate to determine the current point (as used for time series) is not necessarily helpful. It could, for instance, take you between points which are not adjacent in time.
The solution is to chop the series up into 'casts', each of which can be considered predominantly monotonic. The segmentation into casts is performed automatically during invocation. Information supplied by the user in the driver file ensures that ship movement and other factors do not lead to a plethora of minuscule casts.
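One way such a segmentation might work is sketched below. It is not Edserplo's actual algorithm: the `tolerance` parameter is a hypothetical stand-in for the information supplied in the driver file, and a new cast is started only when the variable reverses from its running extreme by more than that tolerance, so that small wobbles do not create tiny casts.

```java
import java.util.ArrayList;
import java.util.List;

final class CastSplitter {
    /**
     * Returns the start index of each cast. A new cast begins at a turning
     * point once the variable has moved back from the running extreme by
     * more than tolerance.
     */
    static List<Integer> castStarts(double[] pressure, double tolerance) {
        List<Integer> starts = new ArrayList<>();
        if (pressure.length == 0) {
            return starts;
        }
        starts.add(0);
        int direction = 0;     // +1 pressure increasing, -1 decreasing, 0 not yet known
        int extremeIndex = 0;  // index of the running extreme within the current cast
        for (int i = 1; i < pressure.length; i++) {
            double delta = pressure[i] - pressure[extremeIndex];
            if (direction == 0) {
                if (Math.abs(delta) > tolerance) {
                    direction = delta > 0 ? 1 : -1;
                    extremeIndex = i;
                }
            } else if (delta * direction >= 0) {
                extremeIndex = i;              // still progressing: new extreme
            } else if (Math.abs(delta) > tolerance) {
                starts.add(extremeIndex);      // reversal beyond tolerance: new cast
                direction = -direction;
                extremeIndex = i;
            }
        }
        return starts;
    }
}
```

For pressures 0, 5, 10, 9.5, 15, 20, 12, 5, 0 and a tolerance of 2, the small dip at 9.5 is ignored and the series splits into a down cast starting at index 0 and an up cast starting at index 5.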
Using the settings from the series page, the chosen series, cast and parameters are displayed on the profile page. The Y axis normally increases down the screen but can be flipped, by an option, for a height variable. The zoom orientation is vertical.
The editing of flags is permitted via the profile page. Flags can be toggled individually by the middle mouse button or collectively by defining a box.
10. Scatter plots
Edserplo supports two types of scatter plot, the Current meter scatter plot and the X-Y scatter plot, both of which are used in connection with the time series page. As with all plots in Edserplo, panning and zooming are standard and both scatter plots support outlier chasing.
Outlier chasing allows points selected (by defining a box) on the scatter plot to be highlighted on the time series page. Each point can be visited in turn and flagged on the time series page if required.
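The box selection behind outlier chasing can be sketched as a simple filter that returns the indices of the selected points, which can then be visited in turn on the time series page. The names used here are hypothetical, and the box is assumed to be given in data coordinates.

```java
import java.util.ArrayList;
import java.util.List;

final class OutlierChase {
    /** Indices of the points (x[i], y[i]) that fall inside the box, bounds included. */
    static List<Integer> indicesInBox(double[] x, double[] y,
                                      double xMin, double xMax,
                                      double yMin, double yMax) {
        List<Integer> selected = new ArrayList<>();
        for (int i = 0; i < x.length; i++) {
            boolean insideX = x[i] >= xMin && x[i] <= xMax;
            boolean insideY = y[i] >= yMin && y[i] <= yMax;
            if (insideX && insideY) {
                selected.add(i);
            }
        }
        return selected;
    }
}
```

Because the result is a list of cycle indices rather than of plotted positions, the same list can be used to highlight the points on a different page.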
Current meter scatter plot
To enable the current meter scatter plot, Easting and Northing channels must exist either as parameters or as derived quantities generated at invocation. Up to eight chosen series can be displayed to allow comparison between current meter series from the same mooring. The X and Y scales are always locked together.
The selected series can be superimposed and the order of plotting can be reversed, so that smaller plots are not obliterated by larger ones. The plot also displays
- A compass rose — which can be rotated to assess the dominant axis direction in the tidal ellipse.
- A circle — which can be adjusted to gauge maximum and minimum currents.
An extension to support two-dimensional (ADCP) data is still to be completed.
X-Y scatter plot
Any two parameters selected via the series page may be plotted on the X-Y scatter plot. If necessary, the X and Y scales can be independently adjusted.
11. Thermistor chain or ADCP (TCAD) page
Any two-dimensional data selected via the series page, with time as one independent variable and depth or pressure as the other, can be presented as a series of profiles on the TCAD page.
The zoom axis is vertical. The number of profiles presented at one time can be adjusted by the user and there is a differential mode to distinguish the current profile from its neighbours.
This page can be locked together with the time series page so movement between cycles on one page translates to movement between cycles on the other. Flag editing is permitted on the TCAD or the time series page.
12. Wave data
Edserplo supports two pages for assessing wave data: the wave histogram page for wave statistics (significant wave height, Hs, and zero-crossing period, Tz) and the wave spectrum page. Data flags may be edited directly on the spectrum page and indirectly via the histogram page. Both presentations are used in conjunction with the time series page.
Wave histogram
The wave histogram may be used for an immediate assessment of the 'shape' of the data. A double histogram of cells 0.5 metres by 0.5 seconds is presented with Hs on the Y axis. The page supports five double histograms derived from the data series: one for each of the four seasons and a fifth covering all seasons together (the whole year).
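A double histogram of this kind is a two-dimensional count of (Hs, Tz) pairs. The sketch below bins values into cells of 0.5 m by 0.5 s; the grid dimensions, the handling of out-of-range values and the names are assumptions made for illustration.

```java
final class WaveHistogram {
    static final double CELL = 0.5; // cell size: 0.5 m in Hs and 0.5 s in Tz

    /**
     * Counts (Hs, Tz) pairs per cell. counts[row][col] has Hs on the rows
     * (Y axis) and Tz on the columns. Missing values (NaN) and values
     * outside the grid are skipped.
     */
    static int[][] count(double[] hs, double[] tz, int rows, int cols) {
        int[][] counts = new int[rows][cols];
        for (int i = 0; i < hs.length; i++) {
            if (Double.isNaN(hs[i]) || Double.isNaN(tz[i])) {
                continue;
            }
            int row = (int) Math.floor(hs[i] / CELL);
            int col = (int) Math.floor(tz[i] / CELL);
            if (row >= 0 && row < rows && col >= 0 && col < cols) {
                counts[row][col]++;
            }
        }
        return counts;
    }
}
```

A seasonal histogram would be produced in the same way after first selecting the cycles that fall within that season.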
Curves associated with the theoretical wave steepness limit may also be displayed on the page. Any data values which fall above or to the left of these steepness curves are not feasible and should be flagged. Provided certain conditions are fulfilled, flagging can be done by highlighting a cell and using the update command. By locking the page to the time series page you can check which data points will be flagged.
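For illustration, a steepness check can be written using the deep-water wavelength L = g·Tz²/(2π), giving a steepness of Hs/L = 2π·Hs/(g·Tz²). The source does not state which limiting value Edserplo's curves use, so the limit is left as a parameter here; 1/7 is a commonly quoted limiting steepness for individual waves and is used below only as an example. All names are hypothetical.

```java
final class WaveSteepness {
    static final double G = 9.81; // gravitational acceleration, m/s^2

    /** Steepness Hs / L, with L the deep-water wavelength for period tz. */
    static double steepness(double hs, double tz) {
        return 2.0 * Math.PI * hs / (G * tz * tz);
    }

    /** True if the (Hs, Tz) pair lies beyond the supplied steepness limit. */
    static boolean exceeds(double hs, double tz, double limit) {
        return steepness(hs, tz) > limit;
    }
}
```

Since steepness grows with Hs and falls with Tz squared, infeasible points lie above or to the left of a curve of constant steepness when Hs is on the Y axis, which matches the description above.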
Wave spectrum
A wave series may include spectral data. For example, a wave buoy’s motion can be monitored for 15 minutes every three hours and the results Fourier-transformed to yield a series of power spectra. The wave spectrum page plots spectral (including directional spectra) channels against frequency. The page functionality mimics that of the TCAD page with one exception — the zoom orientation and the independent axis are horizontal.
13. HF Radar surface currents — the OSCR page
The Ocean Surface Current Radar (OSCR) system is used for monitoring surface currents synoptically across an area, normally in coastal waters. OSCR data are inherently noisy, as ships can produce many spurious returns, so a convenient mechanism for flagging such data is required. Other systems collecting similar data can also be presented on the OSCR page.
The OSCR page is used in connection with the time series page. A set of cells is defined by a set of latitudes and longitudes and these points are displayed with the associated current vector against a background map. The positioning information for the cells is integral to the series. The page can be zoomed and panned in the usual way and cells to be edited can be defined by a box.
14. Processing tidal data
BODC is responsible for providing data processing for the UK National Tide Gauge Network. Edserplo has been designed with this in mind. Along with its other functions, Edserplo allows users to
- Process overlapping data
- Visually assess continuity between weeks
- Edit data and not just flags
- Aggregate weekly data files into month and year files
- Generate tides and residuals on a port-by-port basis
- Produce statistics
- Tidally analyse data
To invoke Edserplo in tidal processing mode, the user must provide a file containing a list of data file names with port/channel mapping (the driver file) and a 'dump' file. The presence of a dump file is required to switch Edserplo to tidal processing mode. The user's preferred display settings may also be defined in a settings file.
The port page
The port page has similar functionality to the series page. It lists the ports of interest, cross-referenced against channel names, scales, bases and colours. There are normally several series associated with each port. Although series can be edited, the files from which they derive are not altered; instead, the data are retained after each invocation in the 'dump' file. Data can be added to the dump file via a driver file.
The port time series page
The port time series page looks very similar to the time series page; however, there are many differences between the two, some subtle and some not. One peculiarity is that base and scale are defined for each port and channel combination because tidal ranges differ by an order of magnitude around the UK.
In addition to flag editing, the user may perform data value editing. Editing includes data substitution, data replacement by interpolation, gap filling, linear time stretching or shrinking and constant offset adjustment.
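Two of these edits, replacement by interpolation and constant offset adjustment, are simple enough to sketch. The example below assumes evenly spaced samples and operates on a plain array in place; the names are illustrative and do not describe how Edserplo implements its edits.

```java
final class ValueEdits {
    /**
     * Replaces the values strictly between indices a and b with a straight
     * line from values[a] to values[b]. Assumes a < b and even sampling.
     */
    static void interpolate(double[] values, int a, int b) {
        for (int i = a + 1; i < b; i++) {
            double fraction = (double) (i - a) / (b - a);
            values[i] = values[a] + fraction * (values[b] - values[a]);
        }
    }

    /** Adds a constant offset to every value from index from to index to, inclusive. */
    static void addOffset(double[] values, int from, int to, double offset) {
        for (int i = from; i <= to; i++) {
            values[i] += offset;
        }
    }
}
```

Applied to the values 1, 99, 99, 4 between indices 0 and 3, the interpolation replaces the two middle values with 2 and 3.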
The output page
Having selected ports and parameters on the port page, the user goes to the output page to generate new files. The user selects
- The format
- The time span
- The filename template
- The file segmentation scheme (year, month, none, etc)
- The filtering mechanism to be employed (if necessary)
Output can only be produced if the overlaps between the series involved have already been flagged for deletion.
The output page is also used to spawn tidal analyses (Doodson harmonic analysis), generate tidal statistics and save deleted series in the dump file. The deletion takes effect on exit when the dump file is updated. A security mechanism ensures that only one person can update a specified dump file at any one time.
The tidal analysis page
Tidal analyses are reviewed on the tidal analysis page. It is possible to compare selected subsets of constituents of various analyses in an interleaved fashion. Tidal analysis is performed using a Java version of the POL TIRA analysis program.
Related BODC pages
| Software engineering at BODC | The BODC series model | |
| Former BODC visualisation programs | The BODC Transfer system | |
| Edteva — our former tidal processing program | BODC's Underway Data Processing System (BUDS) | |
| Visualisation development — a Java prototype | GEBCO project pages at BODC |
Related external links
| National Oceanography Centre (NOC) |