Loughborough University
Leicestershire, UK
LE11 3TU
+44 (0)1509 222222

IT Services : High Performance Computing

Metadata


An important part of reproducibility is recording metadata for the results you create. This is rarely exciting, but it is useful when you need to work out what a results file relates to. In the above example, the last step in the pseudocode is to commit changes along with a message. Because that message can be traced back to a revision in the revision control system, it offers a base level of metadata.

Metadata is typically stored either embedded in the data file itself (e.g. the EXIF data that digital cameras place in JPEG files; similar techniques are used by MRI scanners and so on) or in additional files kept alongside it. It tends to be written as key-value pairs (e.g. Date: 15-10-2013), JSON, or XML. XML versions often implement Dublin Core plus additional terms that describe the data, drawn from an accepted ontology that is often a derivative of OWL.
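
To illustrate the first two layouts, the sketch below renders one record as key-value lines and as JSON. The field names and values are hypothetical, chosen only for this example, and the helper names to_key_value and to_json are not part of any existing tool.

```python
import json

# Hypothetical metadata record; field names and values are illustrative only.
record = {
    "Date": "15-10-2013",
    "Owner": "A. Researcher",
    "Program": "program.exe",
}


def to_key_value(metadata):
    """Render a flat dict as 'Key: value' lines, one pair per line."""
    return "\n".join(f"{key}: {value}" for key, value in metadata.items())


def to_json(metadata):
    """Render the same dict as indented JSON text."""
    return json.dumps(metadata, indent=2)


key_value_text = to_key_value(record)
json_text = to_json(record)

if __name__ == "__main__":
    print(key_value_text)
    print(json_text)
```

Key-value lines are easy to read and to search with standard text tools, while JSON copes better once the metadata becomes nested.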

A metadata file might include things such as the owner of the data, the organisation, the date created, information about the data used to create it, the programs used, etc., set out in a way that is logical and descriptive.
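
One possible way to produce such a file is sketched below: it writes a JSON "sidecar" next to a results file and describes each input by its name, size and SHA-256 checksum. The function names, the field names and the .meta.json suffix are assumptions made for this example, not an established convention.

```python
import hashlib
import json
from datetime import date
from pathlib import Path


def describe_input(path):
    """Return the name, size in bytes and SHA-256 checksum of one input file."""
    data = Path(path).read_bytes()
    return {
        "name": Path(path).name,
        "bytes": len(data),
        "sha256": hashlib.sha256(data).hexdigest(),
    }


def write_sidecar(result_path, owner, organisation, inputs, program):
    """Write <result_path>.meta.json beside the results file and return its path."""
    metadata = {
        "owner": owner,
        "organisation": organisation,
        "date_created": date.today().isoformat(),
        "inputs": [describe_input(p) for p in inputs],
        "program": program,
    }
    # Hypothetical naming scheme: results.dat -> results.dat.meta.json
    sidecar = Path(str(result_path) + ".meta.json")
    sidecar.write_text(json.dumps(metadata, indent=2))
    return sidecar
```

Recording a checksum for each input means that a later reader can check whether the file they hold is the one that was actually used to produce the result.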

A fictitious example (XML fragment) of metadata associated with an output file from a simulation:

<simulationresult>
  <date day="10" month="1" year="2014"/>
  <inputs>
    <pressure>10</pressure>
    <temperature>300</temperature>
  </inputs>
  <parameters>
    <pressure>10</pressure>
    <temperature>300</temperature>
  </parameters>
  <program>
    <path>/home/it/itat/program.exe</path>
    <version>1.1</version>
  </program>
  <nodes>
    <node>hydra24</node>
    <node>hydra25</node>
  </nodes>
</simulationresult>
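
A fragment like the one above need not be written by hand. The following is a minimal sketch, assuming Python 3.9 or later and only the standard library, of how a job script might generate it. The build_result function and its arguments are hypothetical; the element names simply mirror the fictitious example.

```python
import xml.etree.ElementTree as ET


def build_result(day, month, year, inputs, parameters, program_path, version, nodes):
    """Build a <simulationresult> element shaped like the fragment above."""
    root = ET.Element("simulationresult")
    ET.SubElement(root, "date", day=str(day), month=str(month), year=str(year))
    # <inputs> and <parameters> share the same layout: one child per named value.
    for tag, values in (("inputs", inputs), ("parameters", parameters)):
        group = ET.SubElement(root, tag)
        for name, value in values.items():
            ET.SubElement(group, name).text = str(value)
    program = ET.SubElement(root, "program")
    ET.SubElement(program, "path").text = program_path
    ET.SubElement(program, "version").text = version
    nodes_element = ET.SubElement(root, "nodes")
    for host in nodes:
        ET.SubElement(nodes_element, "node").text = host
    return root


root = build_result(
    10, 1, 2014,
    inputs={"pressure": 10, "temperature": 300},
    parameters={"pressure": 10, "temperature": 300},
    program_path="/home/it/itat/program.exe",
    version="1.1",
    nodes=["hydra24", "hydra25"],
)
ET.indent(root)  # pretty-print in place; available from Python 3.9
xml_text = ET.tostring(root, encoding="unicode")

if __name__ == "__main__":
    print(xml_text)
```

Building the tree through a library rather than by string concatenation keeps the output well-formed and escapes any special characters in the values.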