Serialization Tutorial

As an example, suppose we have the following parameterized classes and instances:

import param
class TrainingHyperparameters(param.Parameterized):
    lr = param.Number(1e-5, doc='The learning rate')
    max_epochs = param.Integer(10)
    model_regex = param.String(
        "model-{epoch:05d}.pkl",
        doc='Regular exp for storing model weights after every epoch')

t_params = TrainingHyperparameters()

class ModelHyperparameters(param.Parameterized):
    layers = param.ListSelector(
        [], objects=['conv', 'fc', 'recurrent'],
        doc='Sequence of layers by type, bottom-first')
    activations = param.ObjectSelector('relu', objects=['tanh', 'relu'])

m_params = ModelHyperparameters()
m_params.layers = ['conv', 'conv', 'fc']

param_dict = {
  'training': t_params,
  'model': m_params,
}

We can serialize these easily into JSON, YAML, or INI using pydrobert.param.serialization:

import pydrobert.param.serialization as serial
serial.serialize_to_json('conf.json', param_dict)
serial.serialize_to_yaml('conf.yaml', param_dict)  # requires ruamel.yaml or pyyaml
serial.serialize_to_ini('conf.ini', param_dict)

where we get

{
  "training": {
    "lr": 1e-05,
    "max_epochs": 10,
    "model_regex": "model-{epoch:05d}.pkl"
  },
  "model": {
    "activations": "relu",
    "layers": [
      "conv",
      "conv",
      "fc"
    ]
  }
}

or

training:
  lr: 1e-05  # The learning rate
  max_epochs: 10
  model_regex: model-{epoch:05d}.pkl  # Regular exp for storing model weights after every epoch
model:
  activations: relu  # Choices: "tanh", "relu"
  layers:  # Sequence of layers by type, bottom-first. Element choices: "conv", "fc", "recurrent"
    - conv
    - conv
    - fc

or

# == Help ==
# [training]
# lr: The learning rate
# model_regex: Regular exp for storing model weights after every epoch

# [model]
# activations: Choices: "tanh", "relu"
# layers: Sequence of layers by type, bottom-first. A JSON string. Element choices: "conv", "fc", "recurrent"


[training]
lr = 1e-05
max_epochs = 10
model_regex = model-{epoch:05d}.pkl

[model]
activations = relu
layers = ["conv", "conv", "fc"]

respectively.
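INI has no native list type, which is why layers appears in the INI output as a JSON string. As a minimal sketch using only the standard library (not pydrobert's own code path), such a value can be read back with configparser and json. The inline INI text simply mirrors the [model] section shown above.

```python
import configparser
import json

# Inline copy of the [model] section from the INI output above.
INI_TEXT = """
[model]
activations = relu
layers = ["conv", "conv", "fc"]
"""

parser = configparser.ConfigParser()
parser.read_string(INI_TEXT)

# Scalars come back as plain strings; the list must be decoded from JSON.
activations = parser["model"]["activations"]
layers = json.loads(parser["model"]["layers"])

print(activations, layers)
```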

Deserialization proceeds similarly. Files can be used to populate parameters in existing parameterized instances.

t_params.lr = 10000.
assert t_params.lr == 10000.
serial.deserialize_from_yaml('conf.yaml', param_dict)
assert t_params.lr == 1e-05
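
The same populate-in-place pattern can be illustrated without the library. The sketch below is a standard-library analogue, assuming a hypothetical Hyperparameters stand-in class rather than a param.Parameterized. It writes values to JSON, overwrites one in memory, and then restores it from the file.

```python
import json
import os
import tempfile


class Hyperparameters:
    """Hypothetical stand-in for a parameterized object."""

    def __init__(self):
        self.lr = 1e-5
        self.max_epochs = 10


def dump_to_json(path, obj):
    # Write the instance's attributes as a JSON object.
    with open(path, "w") as f:
        json.dump(vars(obj), f, indent=2)


def load_from_json(path, obj):
    # Update the existing instance in place instead of creating a new one.
    with open(path) as f:
        for name, value in json.load(f).items():
            setattr(obj, name, value)


hp = Hyperparameters()
path = os.path.join(tempfile.mkdtemp(), "conf.json")
dump_to_json(path, hp)

hp.lr = 10000.0           # clobber the value in memory
load_from_json(path, hp)  # restore it from the file
restored_lr = hp.lr
```

The key design point is that loading mutates the object that already exists, so any other code holding a reference to it sees the restored values.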

pydrobert.param.argparse contains convenience functions for (de)serializing config files right from the command line.

import argparse, pydrobert.param.argparse as pargparse
parser = argparse.ArgumentParser()
pargparse.add_parameterized_read_group(parser, parameterized=param_dict)
pargparse.add_parameterized_print_group(parser, parameterized=param_dict)
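
To make the idea concrete, here is a minimal sketch of how a "read a config file" option can be built on argparse with a custom action. It uses only the standard library. The ReadJsonAction class, its target keyword, and the --config-json flag are hypothetical names chosen for this sketch and should not be read as the interface or implementation of pydrobert.param.argparse.

```python
import argparse
import json
import os
import tempfile


class ReadJsonAction(argparse.Action):
    """Hypothetical action: load a JSON file into an existing dict."""

    def __init__(self, option_strings, dest, target=None, **kwargs):
        super().__init__(option_strings, dest, **kwargs)
        self.target = {} if target is None else target

    def __call__(self, parser, namespace, values, option_string=None):
        with open(values) as f:
            loaded = json.load(f)
        # Overwrite defaults section by section, keeping unspecified keys.
        for section, entries in loaded.items():
            self.target.setdefault(section, {}).update(entries)
        setattr(namespace, self.dest, self.target)


defaults = {"training": {"lr": 1e-5, "max_epochs": 10}}

parser = argparse.ArgumentParser()
parser.add_argument(
    "--config-json", action=ReadJsonAction, target=defaults,
    dest="params", default=defaults)

# Simulate a config file that overrides only the learning rate.
path = os.path.join(tempfile.mkdtemp(), "conf.json")
with open(path, "w") as f:
    json.dump({"training": {"lr": 0.01}}, f)

args = parser.parse_args(["--config-json", path])
```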

Sometimes the default (de)serialization routines are unsuited to the data. For example, INI files have no standard format for lists of values, so lists and many other container types are written and parsed using JSON syntax. If we want to handle lists differently, for example as comma-delimited values, we can write a custom serializer and deserializer for our layers parameter:

class CommaSerializer(serial.DefaultListSelectorSerializer):
    def help_string(self, name, parameterized):
        # Prepend a note on the format to the default help text
        choices_help_string = super().help_string(name, parameterized)
        return 'Elements separated by commas. ' + choices_help_string

    def serialize(self, name, parameterized):
        # Join the default serializer's elements into a single string
        val = super().serialize(name, parameterized)
        return ','.join(str(x) for x in val)

class CommaDeserializer(serial.DefaultListSelectorDeserializer):
    def deserialize(self, name, block, parameterized):
        # Split the string into a list, then defer to the default handling
        block = block.split(',')
        super().deserialize(name, block, parameterized)

serial.serialize_to_ini(
    'conf.ini', param_dict,
    # (de)serialize by type
    serializer_type_dict={param.ListSelector: CommaSerializer()},
)
serial.deserialize_from_ini(
    'conf.ini', param_dict,
    # or by name!
    deserializer_name_dict={'model': {'layers': CommaDeserializer()}},
)

The resulting conf.ini is:

# == Help ==
# [training]
# lr: The learning rate
# model_regex: Regular exp for storing model weights after every epoch

# [model]
# activations: Choices: "tanh", "relu"
# layers: Sequence of layers by type, bottom-first. Elements separated by commas. Element choices: "conv", "fc", "recurrent"


[training]
max_epochs = 10
model_regex = model-{epoch:05d}.pkl
lr = 1e-05

[model]
activations = relu
layers = conv,conv,fc
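
The comma-delimited convention itself is easy to check in isolation. The following is a rough standard-library sketch, assuming hypothetical helper names join_layers and split_layers that are not part of pydrobert. It also treats an empty string as an empty list, an edge case that a plain str.split(',') would turn into [''].

```python
CHOICES = ("conv", "fc", "recurrent")


def join_layers(layers):
    # Serialize a list of layer names as a comma-delimited string.
    return ",".join(layers)


def split_layers(text):
    # Parse the string back, ignoring surrounding whitespace and empty fields.
    layers = [part.strip() for part in text.split(",") if part.strip()]
    for layer in layers:
        if layer not in CHOICES:
            raise ValueError(f"unknown layer type: {layer!r}")
    return layers


encoded = join_layers(["conv", "conv", "fc"])
decoded = split_layers(encoded)
empty = split_layers("")
```

A comma-delimited format only round-trips cleanly when no element can itself contain a comma, which holds for the fixed set of choices used here.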