Persistent Storage - Cells and Pools
A PFA scoring engine has four types of persistent storage:
- cells (private)
- cells (shared)
- pools (private)
- pools (shared)
These storage areas are like local symbols in that they store Avro-typed data. Unlike local symbols, they have global scope and keep their values between action invocations, as well as between the begin and end methods.
Cells vs Pools
- both are persistent storage, and both can be shared
- cells are global variables that cannot be created or destroyed at runtime (only reassigned)
- pools are like environments in R: collections of key-value pairs whose items can be created and destroyed at runtime; concurrent access is controlled at the level of a single pool item
Cells and pools are both specified as JSON objects with the same fields, though init is required for cells and not for pools.
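As a rough illustration of that shared layout, the sketch below builds one cell declaration and one pool declaration as Python dictionaries and prints them as JSON. The names longest and wordCounts are hypothetical, and only the type, init, and shared fields are shown; this is a sketch of the declaration shape, not a complete PFA document.

```python
import json

# Hypothetical storage names chosen for illustration only.
# A cell must carry an "init" value.
cells = {
    "longest": {"type": "string", "init": "", "shared": False},
}

# A pool uses the same fields, but "init" may be omitted,
# in which case the pool starts with no items.
pools = {
    "wordCounts": {"type": "int", "shared": True},
}

# Fragment of a PFA document holding both kinds of declaration.
fragment = {"cells": cells, "pools": pools}
print(json.dumps(fragment, indent=2, sort_keys=True))
```

For a pool, type describes each item's value, so wordCounts here would map string keys to integers.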
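from titus.genpy import PFAEngine

# Scoring engine that remembers the longest string seen so far in a cell.
pfa = """
input: string
output: string
cells:
  longest: {type: string, init: ""}
action:
  - if:
      ">":
        - {s.len: input}
        - {s.len: {cell: longest}}
    then:
      - {cell: longest, to: input}   # overwrite the cell with the new longest string
      - input
    else:
      - {cell: longest}              # otherwise return the stored value
"""
engine, = PFAEngine.fromYaml(pfa)
engine.action("abc")     # "abc" is longer than "", so it is stored and returned
engine.action("abcdf")   # "abcdf" replaces "abc" and is returned
engine.action("abc")     # shorter than the stored value, so "abcdf" is returned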
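For comparison, the same logic can be written in plain Python. This is only a stand-in to show what the cell does, not part of the Titus API: the class name LongestSoFar is hypothetical, and its attribute plays the role of the cell.

```python
# Plain-Python stand-in for the engine above; not the Titus API.
class LongestSoFar(object):
    def __init__(self):
        # Plays the role of the cell's init value.
        self.longest = ""

    def action(self, input):
        if len(input) > len(self.longest):
            # Corresponds to {cell: longest, to: input}.
            self.longest = input
            return input
        # Corresponds to {cell: longest}.
        return self.longest


model = LongestSoFar()
results = [model.action(s) for s in ["abc", "abcdf", "abc"]]
print(results)  # ['abc', 'abcdf', 'abcdf']
```

The value survives between calls because it lives on the object rather than in a local variable, which is the property a cell gives a scoring engine.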
Notes
Cells store individual, named values of a specific type.
The scoring engine above reproduces the fold-method example by storing the tally in a cell of type string.
Using a persistent cell is somewhat more cumbersome than using the fold method, but a few interacting cells can perform more complex tasks than the fold method alone (an example appears later).
Cells cannot be created or destroyed at runtime, and they must be initialized before the begin method. In the above case, the initial value is an empty string.
Pools are persistent storage elements without this restriction. They can be used to gather data into tables.
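A possible shape for such a table is sketched below: a PFA document, built as a Python dictionary, that keeps a count per input string in a pool. The pool name counts and the parameter name old are hypothetical. The block only constructs and serializes the document; loading it into an engine (for example with PFAEngine.fromJson) is left as a commented-out assumption.

```python
import json

pfa_doc = {
    "input": "string",
    "output": "int",
    # No "init": the pool starts with no items.
    "pools": {"counts": {"type": "int"}},
    "action": [
        # Update the item keyed by the input string. "init" supplies the
        # starting value for a key that is not yet in the pool.
        {
            "pool": "counts",
            "path": ["input"],
            "to": {
                "params": [{"old": "int"}],
                "ret": "int",
                "do": {"+": ["old", 1]},
            },
            "init": 0,
        },
        # Read back the stored value for this key.
        {"pool": "counts", "path": ["input"]},
    ],
}

pfa_json = json.dumps(pfa_doc, indent=2)
print(pfa_json)

# Assumed usage with Titus installed (not executed here):
# from titus.genpy import PFAEngine
# engine, = PFAEngine.fromJson(pfa_json)
```

Unlike a cell, nothing about the set of keys is fixed in advance: each new input string creates a new pool item at runtime.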