Specification

SDF files

Fig. 2 shows the flow diagram that defines a well-formed SDF file. Items in red rectangles are text that appears literally in the SDF file, while items in rectangles with round corners are defined further below. Following the arrows in the diagram shows that an SDF file contains exactly one root element: either a single (root) workspace or a single dataset. For example, a single SDF file cannot contain two datasets that are not embedded in a workspace.

SDF file

Fig. 2 The general frame of an SDF file. Such a file can either contain a (single) workspace or a (single) dataset.
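The single-root rule can be checked with a few lines of Python. This is only a minimal sketch, not the reference implementation: the helper name root_kind is hypothetical, and the root tag name dataset is an assumption (only the workspace tag is spelled out in this text).

```python
import xml.etree.ElementTree as ET

# Assumed root tag names; "dataset" is inferred, not confirmed by the text.
ALLOWED_ROOTS = {"workspace", "dataset"}


def root_kind(sdf_text):
    """Return the tag of the single root element of an SDF document."""
    # ET.fromstring raises ParseError if there is more than one root element.
    root = ET.fromstring(sdf_text)
    if root.tag not in ALLOWED_ROOTS:
        raise ValueError(f"unexpected root element: {root.tag!r}")
    return root.tag
```

Two sibling datasets at the top level are already rejected by the XML parser itself, because an XML document can have only one root element.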

Workspaces

Workspaces are designed as containers for other workspaces or for datasets. Workspaces can also contain named parameters that serve as global parameters for all workspaces and datasets contained in the workspace. In this way one can compare workspaces with folders in a directory tree.

The definition of a workspace in XML is shown in Fig. 3. As the figure shows, a workspace is enclosed within the two tags <workspace> and </workspace>. Inside this frame a workspace can (optionally) contain a name, a date, an owner, a section with the sample description, and a section for the instrument settings. Each of those elements may appear only once within the frame of a workspace. Furthermore, workspaces can contain an arbitrary number of named parameter values. The syntax of all these tags is explained further below.

workspace

Fig. 3 The XML definition of a workspace.

Datasets

Datasets are containers for data values. In addition, a dataset contains information such as its name, the date of its creation, its owner (creator) and more. Datasets can come with additional named parameters and with individual instrument settings. All those items are essentially the same as the corresponding items in the workspace. Differentiating between workspace parameters and dataset parameters allows us to construct very heterogeneous workspaces, e.g., handling a project with many datasets originating from a variety of instruments.

The definition of a dataset in XML is shown in Fig. 4. Items like name, date, owner, etc., are the same as their workspace counterparts and are described further below.

dataset

Fig. 4 The XML definition of a dataset.

A dataset contains exactly one datablock. The dataset type attribute determines the type of that wrapped block.

Depending on the wrapped data, the datablock can have different structures, and there can be optional child elements for storing metadata such as the unit or column names.

Todo

Include data type-specific metadata children in flowchart

Dataset type sc: Single-column numeric data

  • required attributes: rows (integer), cols (integer, must be 1), type

  • text content: whitespace-separated values, format depends on type

  • optional dataset children:

    • unit element with value attribute

  • types

    • int: integer values like 123

    • float: floating point values like 1.123 or 1.234e-10

    • hex: linearly transformed float values, written as hexadecimal

      • requires additional attributes offset (float) and multiplier (float)

      • values like 12ABF

      • interpretation: read hex values as integers, multiply by multiplier, add offset
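The three value types listed above can be decoded as in the following sketch. The function name parse_single_column and its argument names are hypothetical; only the interpretation rule for hex (read as integer, multiply by multiplier, add offset) is taken from the list above.

```python
def parse_single_column(text, value_type, offset=0.0, multiplier=1.0):
    """Convert whitespace-separated text content into a list of numbers."""
    tokens = text.split()
    if value_type == "int":
        return [int(tok) for tok in tokens]
    if value_type == "float":
        return [float(tok) for tok in tokens]
    if value_type == "hex":
        # Read each token as a base-16 integer, scale it, then shift it.
        return [int(tok, 16) * multiplier + offset for tok in tokens]
    raise ValueError(f"unknown value type: {value_type!r}")
```

For example, with multiplier 0.5 and offset 1.0 the hex token FF is read as the integer 255 and becomes 255 * 0.5 + 1.0 = 128.5.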

Dataset type mc: Multi-column numeric data

  • required attributes: type (int or float), and either rows (integer) and cols (integer), or shape like (rows, cols)

  • text content: whitespace-separated values (see above)

  • example: with rows=2 and cols=2, the text 1 2 3 4 will be read as two lines, 1, 2 and 3, 4
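The row-by-row reading described in the example can be sketched as follows. Both helper names are hypothetical, and the exact syntax accepted for the shape attribute is an assumption based on the "(rows, cols)" form shown above.

```python
def parse_shape(shape_text):
    """Parse a shape attribute such as "(2, 3)" into (rows, cols).

    Assumption: the attribute is two integers in parentheses, comma-separated.
    """
    rows, cols = shape_text.strip().strip("()").split(",")
    return int(rows), int(cols)


def reshape_values(values, rows, cols):
    """Split a flat list of values into `rows` lists of `cols` values each."""
    if len(values) != rows * cols:
        raise ValueError(f"expected {rows * cols} values, got {len(values)}")
    return [values[r * cols:(r + 1) * cols] for r in range(rows)]
```

With rows=2 and cols=2, the flat list [1, 2, 3, 4] becomes [[1, 2], [3, 4]], matching the example above.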

Dataset type img: Images

  • attributes:

    • dtype allowed for backwards-compatibility (can be anything)

    • encoding: e.g. base64

    • type: MIME type, e.g. image/png

  • text content: the text representation of the image, according to encoding and type

Note

The reference implementation currently does not check the attribute values and always assumes base64-encoded PNG images.
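Decoding an image block under the same assumption (base64-encoded PNG) might look like the sketch below. The function name decode_image_block is hypothetical and this is not the reference implementation; the signature check is an extra safeguard added here for illustration.

```python
import base64

# Every PNG file starts with this fixed 8-byte signature.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def decode_image_block(text):
    """Decode base64 text content and return the raw PNG bytes."""
    # Remove line breaks and indentation before decoding.
    raw = base64.b64decode("".join(text.split()))
    if not raw.startswith(PNG_SIGNATURE):
        raise ValueError("decoded data is not a PNG image")
    return raw
```

The returned bytes can be written to a file or passed to an image library of choice.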

Todo

Add flowchart for datablocks

Instrument Settings

The instrument settings can be specified at the workspace level (if the settings apply to all datasets in the workspace) or, individually, at the dataset level. Instrument settings consist of an instrument name and a list of instrument parameters.

The flow diagram of the XML definition of the instrument settings is shown in Fig. 5.

instrument

Fig. 5 The XML definition of the instrument settings.

Parameters in XML

Every parameter has a name. Beyond that, a parameter either has the attributes value and (optionally) unit, or has neither of them and instead contains any number of parameter children (possibly none).

The flow diagram of the XML definition of a parameter is shown in Fig. 6.

parameter

Fig. 6 The XML definition of a parameter.
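The two forms of a parameter (a leaf with a value, or a group of nested parameters) can be modelled as in the following sketch. The class and its validate method are hypothetical and independent of the actual XML tag names, which are given only in the figure.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Parameter:
    name: str
    value: Optional[str] = None
    unit: Optional[str] = None
    children: List["Parameter"] = field(default_factory=list)

    def validate(self):
        """Raise ValueError if the parameter mixes the two allowed forms."""
        if self.value is None and self.unit is not None:
            raise ValueError(f"{self.name}: unit given without a value")
        if self.value is not None and self.children:
            raise ValueError(f"{self.name}: has both a value and children")
        for child in self.children:
            child.validate()
```

A group such as Parameter("laser", children=[Parameter("power", "5", "mW")]) passes validation, while a parameter that has both a value and children is rejected.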

Sample Description in XML

A sample description contains a name and a comment.

The flow diagram of the XML definition of the sample description is shown in Fig. 7.

sample

Fig. 7 The XML definition of the sample description.

Name and Owner

The name and owner elements have no attributes or children, and contain a single line of text (the name).

Date

The date element has an optional dateformat attribute, no children, and contains a single line of text (a timestamp).

Comment

The comment element has no attributes or children, and contains arbitrary text.

Todo

Add flowcharts for these elements