SDF.data_model: Access, Generate and Modify SDF Objects
*******************************************************
The Python representation of SDF objects can be found in the :py:mod:`SDF.data_model` submodule and aims to mirror the
behavior of built-in data structures like :py:class:`dict`, :py:class:`set` and :py:class:`list` as much as possible.
.. py:currentmodule:: SDF.data_model
XMLWritable
===========
The abstract class :py:class:`~XMLWritable` is the interface for all classes in :py:mod:`SDF.data_model` that directly
represent an XML element. It has two abstract methods: :py:meth:`~XMLWritable.to_xml_element`, which converts the object
to an XML element, and :py:meth:`~XMLWritable.from_xml_element`, which creates an object from an XML element.
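
A minimal sketch of such an interface, assuming the standard library's ``xml.etree.ElementTree`` for elements. The names
``XMLWritableSketch`` and ``TitleElement`` are hypothetical, and treating ``from_xml_element`` as a class method is an
assumption; the actual signatures in :py:mod:`SDF.data_model` may differ.

.. code-block:: python

    from abc import ABC, abstractmethod
    from xml.etree.ElementTree import Element, tostring


    class XMLWritableSketch(ABC):
        """Hypothetical stand-in for the XMLWritable interface."""

        @abstractmethod
        def to_xml_element(self) -> Element:
            """Convert this object to an XML element."""

        @classmethod
        @abstractmethod
        def from_xml_element(cls, element: Element) -> "XMLWritableSketch":
            """Create an object from an XML element."""


    class TitleElement(XMLWritableSketch):
        """Hypothetical concrete class that wraps a single string."""

        def __init__(self, text: str):
            self.text = text

        def to_xml_element(self) -> Element:
            element = Element("title")
            element.text = self.text
            return element

        @classmethod
        def from_xml_element(cls, element: Element) -> "TitleElement":
            return cls(element.text or "")


    # Round trip: object -> element -> object
    original = TitleElement("hello")
    element = original.to_xml_element()
    restored = TitleElement.from_xml_element(element)
    xml_string = tostring(element, encoding="unicode")

Keeping both directions on the same class means that every element type carries its own serialization logic.
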
Name, Date, Owner, Comment
==========================
These classes represent atomic values: :py:class:`~Name`, :py:class:`~Owner` and :py:class:`~Comment` are just wrappers
around strings (:py:class:`str`), a :py:class:`~Date` wraps a timestamp (:py:class:`datetime.datetime`). Users don't
have to interact with these classes directly, as each class containing a :py:class:`~Name`, :py:class:`~Date`,
:py:class:`~Owner` or :py:class:`~Comment` has a property with that name (in lowercase) that returns the appropriate
value.
.. inheritance-diagram:: SDF.data_model.Name SDF.data_model.Owner SDF.data_model.Comment SDF.data_model.Date
SDF.data_model.NameElement
:caption: Inheritance Diagram
:parts: 1
.. doctest::
>>> from SDF.data_model import Workspace
>>> from datetime import datetime
>>> ws = Workspace(name="My Workspace", owner="Me")
>>> ws.name
'My Workspace'
>>> ws.owner
'Me'
>>> ws.owner = "You"
>>> ws.owner
'You'
>>> ws.date = datetime.now() # works
>>> ws.comment = "My comment" # works
Details
-------
- :py:class:`~Name` s are immutable, because they are often used as keys in :py:class:`dict`-like data structures
- :py:class:`~Name` s cannot be empty and cannot contain multiple lines
- Name elements (``...``) are implemented in the class :py:class:`~NameElement`, which subclasses
:py:class:`~Name` and also :py:class:`~XMLWritable`
- :py:class:`~Owner` strings will be normalized, such that multiple whitespace characters will be collapsed into one
space
- Multiline :py:class:`~Comment` s will be de-indented, such that relative indentation is preserved and the least
  indented line ends up with no indentation
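
The rules above can be approximated with the standard library. This is a sketch under assumptions, not the code used by
:py:class:`~Name`, :py:class:`~Owner` or :py:class:`~Comment`: the helper names are hypothetical, and details such as
the handling of leading and trailing whitespace may differ in the library.

.. code-block:: python

    import textwrap


    def is_valid_name(name: str) -> bool:
        """A name must be non-empty and must fit on a single line."""
        return bool(name) and "\n" not in name and "\r" not in name


    def normalize_owner(owner: str) -> str:
        """Collapse every run of whitespace into one space.

        Note: this also strips leading and trailing whitespace.
        """
        return " ".join(owner.split())


    def dedent_comment(comment: str) -> str:
        """Remove the indentation shared by all lines, keeping relative indentation."""
        return textwrap.dedent(comment)


    owner = normalize_owner("Jane \t  Doe")
    comment = dedent_comment("    first line\n        nested line\n    last line")
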
Samples
=======
Each :py:class:`~Sample` has a name and a comment. :py:class:`~Workspace` s and :py:class:`~Dataset` s can have multiple
:py:class:`~Sample` s, which they expose through a :py:attr:`~SDFObject.samples` property. This property behaves like a
:py:class:`dict` that maps sample names to comments. Thus, users will most likely never interact with
:py:class:`~Sample` objects directly.
.. doctest::
>>> from SDF.data_model import Workspace
>>> ws = Workspace("My Workspace")
>>> ws.samples["sample 1"] = "Comment 1"
>>> ws.samples["sample 2"] = "Comment 2"
>>> ws.samples["sample 1"]
'Comment 1'
.. inheritance-diagram:: SDF.data_model.Sample SDF.data_model.SampleSet
:caption: Inheritance Diagram
:parts: 1
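
One way such a :py:class:`dict`-like property could be built is a mapping that stores sample objects internally but
reads and writes plain comment strings. The sketch below assumes this design; ``SampleSketch`` and ``SampleMapping`` are
hypothetical names and do not reflect how :py:class:`~SampleSet` is actually implemented.

.. code-block:: python

    from collections.abc import MutableMapping
    from dataclasses import dataclass
    from typing import Dict, Iterator


    @dataclass
    class SampleSketch:
        """Hypothetical stand-in for a sample: a name plus a comment."""
        name: str
        comment: str


    class SampleMapping(MutableMapping):
        """Stores sample objects, but exposes them as name -> comment."""

        def __init__(self) -> None:
            self._samples: Dict[str, SampleSketch] = {}

        def __getitem__(self, name: str) -> str:
            return self._samples[name].comment

        def __setitem__(self, name: str, comment: str) -> None:
            self._samples[name] = SampleSketch(name, comment)

        def __delitem__(self, name: str) -> None:
            del self._samples[name]

        def __iter__(self) -> Iterator[str]:
            return iter(self._samples)

        def __len__(self) -> int:
            return len(self._samples)


    samples = SampleMapping()
    samples["sample 1"] = "Comment 1"
    samples["sample 2"] = "Comment 2"
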
Parameters
==========
Parameters can either be single :py:class:`~Parameter` s or :py:class:`~ParameterSet` s. Both are subclasses of
:py:class:`~ParameterType`.
Single Parameters
-----------------
Single parameters are represented by the :py:class:`~Parameter` class. Its
attributes :py:attr:`~Parameter.name`, :py:attr:`~Parameter.value` and :py:attr:`~Parameter.unit` are strings (unit can
be ``None``). The constructor can take any value type, but it will internally be converted to a :py:class:`str`. The
:py:class:`~Parameter` class has the special property :py:attr:`~Parameter.parsed_value`, which tries to parse the value
string as a Python literal.
Users will directly interact with :py:class:`~Parameter` objects, but won't have to generate them manually, as will be
seen in the section about :py:class:`~ParameterSet` s.
.. doctest::
>>> import numpy as np
>>> from SDF.data_model import Parameter
>>> p1 = Parameter("par1", 1)
>>> p1.name
'par1'
>>> p1.value
'1'
>>> p1.parsed_value
1
>>> p1.unit is None
True
>>> p2 = Parameter("par2", np.arange(4), "N/m")
>>> p2.value
'[0, 1, 2, 3]'
>>> p2.parsed_value
[0, 1, 2, 3]
>>> p2.unit
'N/m'
.. inheritance-diagram:: SDF.data_model.Parameter
:caption: Inheritance Diagram
:parts: 1
Details
^^^^^^^
- There is no guarantee that :py:attr:`~Parameter.parsed_value` will return the originally passed value,
as there are endless possible cases
- :py:class:`numpy.ndarray` will be represented as (possibly nested) :py:class:`list` s
- :py:class:`bytes` will often be represented as ASCII :py:class:`str` s
- :py:class:`list` s cannot be parameter values, since that would make parsing much more complicated.
Use :py:class:`tuple` or :py:class:`numpy.ndarray` instead.
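
Why the original value is not always recovered can be seen in a small sketch that stores a value as a string and parses
it back with :py:func:`ast.literal_eval`. That :py:attr:`~Parameter.parsed_value` works this way is an assumption, and
``parse_value`` is a hypothetical helper.

.. code-block:: python

    import ast


    def parse_value(value: str):
        """Read ``value`` as a Python literal; fall back to the raw string."""
        try:
            return ast.literal_eval(value)
        except (ValueError, SyntaxError):
            # Not a literal (e.g. free text), so keep the string as it is
            return value


    # Values are stored as strings first, then parsed on demand
    from_int = parse_value(str(1))            # int survives the round trip
    from_tuple = parse_value(str((1, 2, 3)))  # tuple survives the round trip
    from_text = parse_value("value1")         # free text stays a string
    from_digits = parse_value(str("1"))       # the string "1" comes back as an int

The last case shows the information loss: once a value is stored as a string, the string ``"1"`` and the integer ``1``
can no longer be told apart.
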
Parameter Sets
--------------
Sets of parameters (``...``) are represented by the :py:class:`~ParameterSet` class. It mostly
resembles :py:class:`dict` with :py:class:`str` keys (the names) and :py:class:`~Parameter` or :py:class:`~ParameterSet`
values.
:py:class:`~Workspace` and :py:class:`~Dataset` objects have a :py:attr:`~SDFObject.parameters` attribute.
.. inheritance-diagram:: SDF.data_model.ParameterSet
:caption: Inheritance Diagram
:parts: 1
There are two ways to add parameters to a :py:class:`~ParameterSet`: a more user-friendly :py:class:`dict`-like way and
a type-safe way.
Dict-like
^^^^^^^^^
ParameterSets can be handled similarly to the built-in :py:class:`dict` class.
.. doctest::
>>> from SDF.data_model import Workspace
>>> ws = Workspace("My Workspace")
>>> # single parameters
>>> ws.parameters["par1"] = 1.8
>>> ws.parameters["par1"].parsed_value
1.8
>>> ws.parameters["par2"] = 3.1, "um"
>>> ws.parameters["par2"].parsed_value
3.1
>>> ws.parameters["par2"].unit
'um'
>>> # parameter sets
>>> ws.parameters["parset1"] = [("name1", "value1"), ("name2", "value2")] # tuples are single parameters
>>> ws.parameters["parset1"]["name2"].value
'value2'
>>> ws.parameters["parset2"] = {"name3": (1, 2, 3), "name4": (3.14, "mm")}
>>> ws.parameters["parset2"]["name3"].parsed_value
(1, 2, 3)
>>> ws.parameters["parset2"]["name4"].unit
'mm'
Type-safe
^^^^^^^^^
Instead of relying on our parsing mechanism, users can explicitly create :py:class:`~Parameter` and
:py:class:`~ParameterSet` objects and add them to :py:class:`~ParameterSet` objects by using its inherited
:py:meth:`~ElementSet.add` method.
.. doctest::
>>> from SDF.data_model import Workspace, Parameter, ParameterSet
>>> ws = Workspace("My Workspace")
>>> # single parameters
>>> par1 = Parameter("par1", "value1")
>>> ws.parameters.add(par1)
>>> ws.parameters["par1"].value
'value1'
>>> # parameter sets
>>> parset1 = ParameterSet("parset1")
>>> parset1.add(Parameter("name2", 123, "V/m"))
>>> ws.parameters.add(parset1)
>>> ws.parameters["parset1"]["name2"].unit
'V/m'
Instruments
===========
:py:class:`~Instrument` s are implemented like :py:class:`~ParameterSet` s (both share the same base class). They cannot
be added to other :py:class:`~Instrument` s or :py:class:`~ParameterSet` s, but users can add :py:class:`~Parameter` s
and :py:class:`~ParameterSet` s to them in the same way as with :py:class:`~ParameterSet` s.
:py:class:`~Workspace` s and :py:class:`~Dataset` s have an
:py:attr:`~SDFObject.instruments` property, which behaves like a set of
:py:class:`~Instrument` s or like a :py:class:`~ParameterSet` with
:py:class:`~Instrument` instances as first-level children.
.. doctest::
>>> from SDF.data_model import Workspace, Instrument, Parameter
>>> ws = Workspace("My Workspace")
>>> # dict-like
>>> ws.instruments["inst1"] = {"par1": "val1", "par2": ("val2", "unit2")}
>>> ws.instruments["inst1"]["par1"].value
'val1'
>>> # type-safe
>>> inst2 = Instrument("inst2")
>>> inst2.add(Parameter("par3", "val3"))
>>> ws.instruments.add(inst2)
>>> ws.instruments["inst2"]["par3"].value
'val3'
.. inheritance-diagram:: SDF.data_model.Instrument SDF.data_model.InstrumentSet
:caption: Inheritance Diagram
:parts: 1
Data Blocks
===========
Data blocks (``...``) are represented by the abstract class :py:class:`~Data`.
Currently there are three implementations:
- :py:class:`~ArrayData1D`: Wraps a 1-dimensional :py:class:`numpy.ndarray` of float or integer values
- :py:class:`~ArrayData2D`: Wraps a 2-dimensional :py:class:`numpy.ndarray`
- :py:class:`~ImageData`: Wraps a :py:class:`PIL.Image.Image`
Users rarely interact with these wrappers directly, since the :py:class:`~Dataset` classes
abstract them and provide direct access to the objects wrapped by :py:class:`~Data`.
.. inheritance-diagram:: SDF.data_model.ArrayData1D SDF.data_model.ArrayData2D SDF.data_model.ImageData
:caption: Inheritance Diagram
:parts: 1
SDF Objects
===========
The abstract class :py:class:`~SDFObject` implements the behavior and properties shared by
:py:class:`~Dataset` and :py:class:`~Workspace`:
- :py:class:`~Name`
- Optional :py:class:`~Owner`
- Optional :py:class:`~Date`
- Optional :py:class:`~Comment`
- Optional :py:class:`~Sample` s
- Optional :py:class:`~Parameter` s
- Optional :py:class:`~Instrument` s
Datasets
========
Besides the properties inherited from :py:class:`~SDFObject`, datasets contain a single :py:class:`~Data` object and
optional metadata specific to the respective :py:class:`~Data` type (:py:attr:`~Data.type_for_xml`).
- :py:class:`~ArrayDataset1D`: Wraps :py:class:`~ArrayData1D` and has an optional
:py:class:`str` property :py:attr:`~ArrayDataset1D.unit`
- :py:class:`~ArrayDataset2D`: Wraps :py:class:`~ArrayData2D`
- :py:class:`~ImageDataset`: Wraps :py:class:`~ImageData`
These :py:class:`~Dataset` implementations provide access to the object wrapped by
:py:class:`~Data` via their :py:attr:`data` property. This, however, is just a common attribute of the
current :py:class:`~Dataset` implementations and not a requirement for future implementations.
.. doctest::
>>> import numpy as np
>>> from SDF.data_model import ArrayDataset1D
>>> ds_1d = ArrayDataset1D("Dataset 1", np.arange(10), unit="s", comment="Comment 1")
>>> ds_1d.data
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>>> ds_1d.comment
'Comment 1'
>>> from SDF.data_model import ArrayDataset2D
>>> ds_2d = ArrayDataset2D("Dataset 2", np.random.random((2, 30)),
... samples=dict(cell1="wild type", cell2="mutant"))
>>> ds_2d.data.shape
(2, 30)
>>> ds_2d.samples["cell1"]
'wild type'
>>> from PIL import Image
>>> from SDF.data_model import ImageDataset
>>> img = Image.fromarray(np.random.randint(0, 256, (20, 20), dtype=np.uint8)) # random grayscale image, 20x20
>>> ds_img = ImageDataset("Dataset 3", img, owner="Santa")
>>> ds_img.data.size
(20, 20)
>>> ds_img.owner
'Santa'
.. inheritance-diagram:: SDF.data_model.ArrayDataset1D SDF.data_model.ArrayDataset2D SDF.data_model.ImageDataset
:caption: Inheritance Diagram
:parts: 1
Details
-------
Since there are many optional properties, the constructor only accepts the name and data object as positional arguments.
Other arguments (owner, parameters, ...) must be passed by keyword.
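
In Python, this kind of signature is typically written with a bare ``*`` that separates positional from keyword-only
parameters. The sketch below illustrates the pattern with a hypothetical ``DatasetSketch`` class; the real constructors
take more arguments and may be declared differently.

.. code-block:: python

    from typing import Optional


    class DatasetSketch:
        """Hypothetical class: only name and data may be passed positionally."""

        def __init__(self, name: str, data, *, owner: Optional[str] = None,
                     comment: Optional[str] = None):
            self.name = name
            self.data = data
            self.owner = owner
            self.comment = comment


    ds = DatasetSketch("Dataset 1", [1, 2, 3], owner="Me")  # accepted

    try:
        DatasetSketch("Dataset 1", [1, 2, 3], "Me")  # owner passed positionally
    except TypeError:
        positional_rejected = True
    else:
        positional_rejected = False
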
Workspaces
==========
Besides the properties inherited from :py:class:`~SDFObject`, :py:class:`~Workspace` s can wrap multiple child
:py:attr:`~Workspace.datasets` and :py:attr:`~Workspace.workspaces`, which enables SDF files to be hierarchical.
.. doctest::
>>> import numpy as np
>>> from SDF.data_model import Workspace, ArrayDataset1D
>>> ws2 = Workspace("Child workspace")
>>> ds1 = ArrayDataset1D("First dataset", np.array([1, 2, 3]))
>>> ds2 = ArrayDataset1D("Second dataset", np.array([4, 5, 6]))
>>> # ws1 is initialized with ds1 and ws2 as children
>>> ws1 = Workspace("Parent workspace", datasets=[ds1], workspaces=[ws2])
>>> ws2 in ws1.workspaces
True
>>> ws1.workspaces["Child workspace"] is ws2 # access by name
True
>>> ds1 in ws1.datasets
True
>>> ds2 in ws1.datasets
False
>>> # items can be added or removed later
>>> ws1.datasets.add(ds2)
>>> ws1.datasets.remove(ds1)
.. inheritance-diagram:: SDF.data_model.Workspace
:caption: Inheritance Diagram
:parts: 1
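
Because workspaces can contain other workspaces, code that processes a whole file is usually recursive. The following
sketch walks such a hierarchy depth-first. It uses a hypothetical ``Node`` class in place of :py:class:`~Workspace` and
plain strings in place of datasets, so it illustrates the traversal pattern only, not the library API.

.. code-block:: python

    from dataclasses import dataclass, field
    from typing import Iterator, List


    @dataclass
    class Node:
        """Hypothetical stand-in for a workspace."""
        name: str
        datasets: List[str] = field(default_factory=list)
        workspaces: List["Node"] = field(default_factory=list)


    def iter_dataset_paths(node: Node, prefix: str = "") -> Iterator[str]:
        """Yield 'workspace/.../dataset' paths, own datasets before children."""
        path = f"{prefix}/{node.name}" if prefix else node.name
        for dataset in node.datasets:
            yield f"{path}/{dataset}"
        for child in node.workspaces:
            yield from iter_dataset_paths(child, path)


    child = Node("Child workspace", datasets=["Second dataset"])
    root = Node("Parent workspace", datasets=["First dataset"], workspaces=[child])
    paths = list(iter_dataset_paths(root))
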
Full Inheritance Diagram
========================
.. inheritance-diagram:: SDF.data_model.ArrayData1D SDF.data_model.ArrayData2D SDF.data_model.ArrayDataset1D
SDF.data_model.ArrayDataset2D SDF.data_model.Comment SDF.data_model.Data SDF.data_model.Dataset
SDF.data_model.Date SDF.data_model.ElementSet SDF.data_model.ImageData
SDF.data_model.ImageDataset SDF.data_model.Instrument SDF.data_model.InstrumentSet
SDF.data_model.Name SDF.data_model.Owner SDF.data_model.AnonymousParameterSet
SDF.data_model.Parameter SDF.data_model.ParameterSet SDF.data_model.Sample
SDF.data_model.SampleSet SDF.data_model.SDFObject SDF.data_model.SourceParameters
SDF.data_model.Workspace SDF.data_model.XMLWritable SDF.data_model.NameElement
:parts: 1
:caption: Inheritance Diagram