PFA Document Structure
A PFA document is a JSON/YAML document with additional constraints. The JSON/YAML content describes algorithms, data types, model parameters, and other aspects of the scoring engine. Some structures have no effect on the scoring procedure and are only intended for archival purposes.
from titus.genpy import PFAEngine
input, output and action
YAML
pfa_yml = """
input: int
output: int
action: input
"""
engine, = PFAEngine.fromYaml(pfa_yml)
engine.action(1)
JSON
pfa_json = """
{
  "input": "int",
  "output": "int",
  "action": "input"
}
"""
engine, = PFAEngine.fromJson(pfa_json)
engine.action(1)
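The two examples above use only the three fields named in this section's heading. As a rough illustration, the sketch below checks a JSON document for those fields before it is handed to an engine. `check_top_level` and `REQUIRED_FIELDS` are hypothetical names introduced here, not part of Titus, and treating all three fields as mandatory is an assumption taken from the heading; the check looks only at the top level and is no substitute for the validation Titus performs.

```python
import json

# Hypothetical helper, not part of Titus: inspects the top-level keys only.
REQUIRED_FIELDS = ("input", "output", "action")

def check_top_level(pfa_text):
    """Return a list of required top-level fields missing from a PFA JSON document."""
    doc = json.loads(pfa_text)
    return [name for name in REQUIRED_FIELDS if name not in doc]

good = '{"input": "int", "output": "int", "action": "input"}'
bad = '{"input": "int"}'

print(check_top_level(good))  # []
print(check_top_level(bad))   # ['output', 'action']
```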
method, zero and merge
PFA supports 3 methods:
- map
- emit
- fold
1. Map
The map method is simply a mathematical function: one input yields one output.
YAML
pfa_yml = """
input: double
output: double
method: map
action:
- {m.sqrt: input}"""
engine, = PFAEngine.fromYaml(pfa_yml)
print(engine.action(2.0))
JSON
pfa_json = """
{
  "input": "double",
  "output": "double",
  "method": "map",
  "action": [
    {"m.sqrt": "input"}
  ]
}
"""
engine, = PFAEngine.fromJson(pfa_json)
print(engine.action(2.0))
2. Emit
Of the three types of PFA scoring engine (map, emit, and fold), emit requires special attention in scoring. Map and fold engines yield results as the return value of the action function (fold engines do so cumulatively), but emit engines always return None.
The only way to get results from an emit engine is to pass it a callback function.
YAML
engine, = PFAEngine.fromYaml('''
input: double
output: double
method: emit
action:
  - if:
      ==: [{"%": [input, 2]}, 0]
    then:
      - emit: input
      - emit: {/: [input, 2]}
''')
# Callback that receives every value the engine emits
def newEmit(x):
    print("output:", x)

engine.emit = newEmit

for x in range(1, 6):
    print("input:", x)
    engine.action(x)
JSON
engine, = PFAEngine.fromJson('''
{
  "input": "double",
  "output": "double",
  "method": "emit",
  "action": [{
    "if": {
      "==": [{
        "%": ["input", 2]
      }, 0]
    },
    "then": [{
      "emit": "input"
    }, {
      "emit": {
        "/": ["input", 2]
      }
    }]
  }]
}
''')
# Callback that receives every value the engine emits
def newEmit(x):
    print("output:", x)

engine.emit = newEmit

for x in range(1, 6):
    print("input:", x)
    engine.action(x)
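To make the callback pattern concrete without needing Titus, here is a plain-Python stand-in for the engine above. `make_engine` is a hypothetical name used only for this sketch. It mirrors the same logic (even inputs emit themselves and then half their value) and shows that results arrive only through the callback, while the action itself always returns None.

```python
# Plain-Python stand-in for the emit engine above; no Titus required.
def make_engine(emit):
    """Build an action that reports results only through the emit callback."""
    def action(x):
        if x % 2 == 0:
            emit(float(x))   # first result for this input
            emit(x / 2.0)    # second result for the same input
        return None          # emit-style actions never return a value
    return action

collected = []
action = make_engine(collected.append)   # collect results instead of printing them
returned = [action(x) for x in range(1, 6)]

print(collected)  # [2.0, 1.0, 4.0, 2.0]
print(returned)   # [None, None, None, None, None]
```

Collecting into a list rather than printing is often more convenient when the emitted values feed a later processing step.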
3. Fold
The fold method is for aggregation. Rather than waiting until the end of the (potentially infinite) dataset, folding engines return a partial result with each call. The previous partial result is available to the next action as the symbol tally. If you are only interested in the total, ignore all but the last output.
engine, = PFAEngine.fromYaml('''
input: double
output: double
method: fold
zero: 0
action:
- {"-": [input, tally]}
merge:
- {"+": [tallyOne, tallyTwo]}
''')
print(engine.action(1)) # 1 - 0 = 1 (tally starts at zero and is now 1)
print(engine.action(2)) # 2 - 1 = 1
print(engine.action(3)) # 3 - 1 = 2
print(engine.action(4)) # 4 - 2 = 2
print(engine.action(5)) # 5 - 2 = 3
engine, = PFAEngine.fromYaml('''
input: int
output: string
method: fold
zero: ""
action:
- {s.concat: [tally, {s.int: input}]}
merge:
- {s.concat: [tallyOne, tallyTwo]}
''')
print(engine.action(1))
print(engine.action(2))
print(engine.action(3))
print(engine.action(4))
print(engine.action(5))
The zero and merge sections are required for fold engines, and must not be present in map or emit engines.
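The roles of zero, action, and merge can be sketched in plain Python. The `fold` generator and the helper names below are hypothetical stand-ins, not Titus code: zero seeds the tally, action produces the next tally from the input and the previous one, and merge is assumed here to combine tallies from two engines that each folded over part of the data.

```python
def fold(zero, action, inputs):
    """Yield the partial result after each input, starting from zero."""
    tally = zero
    for x in inputs:
        tally = action(x, tally)   # the new output becomes the next tally
        yield tally

def subtract(x, tally):
    return x - tally               # same action as the first example above

def add(x, tally):
    return tally + x               # a running sum

def merge(tally_one, tally_two):
    return tally_one + tally_two   # combine two independent partial sums

data = [1, 2, 3, 4, 5]
print(list(fold(0, subtract, data)))  # [1, 1, 2, 2, 3]
print(list(fold(0, add, data)))       # [1, 3, 6, 10, 15]

# Fold each half separately, then merge the two final tallies.
left = list(fold(0, add, data[:2]))[-1]
right = list(fold(0, add, data[2:]))[-1]
print(merge(left, right))             # 15
```

For the running sum, merging the two halves gives the same total as a single pass. That equivalence holds only when the action and merge are compatible in this way; the subtraction example would not satisfy it.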
begin, end, fcns, randseed
pfa = """
{
  "input": "string",
  "output": {"type": "array", "items": "string"},
  "cells": {
    "accumulate": {"type": {"type": "array", "items": "string"},
                   "init": []}},
  "method": "map",
  "begin":
    {"log": {"rand.gaussian": [0.0, 1.0]}},
  "action":
    {"cell": "accumulate",
     "to": {"fcn": "u.addone", "fill": {"newitem": "input"}}},
  "end":
    {"log": {"rand.choice": {"cell": "accumulate"}}},
  "fcns":
    {"addone":
      {"params": [{"old": {"type": "array", "items": "string"}},
                  {"newitem": "string"}],
       "ret": {"type": "array", "items": "string"},
       "do": {"a.append": ["old", "newitem"]}}},
  "randseed": 12345,
  "name": "ExampleScoringEngine",
  "version": 1,
  "doc": "Doesn't do much.",
  "metadata": {"does": "notmuch"},
  "options": {"timeout": 1000}
}
"""
engine, = PFAEngine.fromJson(pfa)
engine.action("abc")
Fibonacci in PFA (Recursion)
pfa = """
{
  "input": "int",
  "output": "int",
  "method": "map",
  "action": [{"u.fib": ["input"]}],
  "fcns": {
    "fib": {
      "params": [{"n": "int"}],
      "ret": "int",
      "do": {
        "cond": [
          {"if": {"==": ["n", 0]}, "then": 0},
          {"if": {"==": ["n", 1]}, "then": 1}],
        "else": {"+": [
          {"u.fib": [{"-": ["n", 1]}]},
          {"u.fib": [{"-": ["n", 2]}]}
        ]}
      }
    }
  }
}
"""
engine, = PFAEngine.fromJson(pfa)
engine.action(12)
Fibonacci in PFA (Loops)
pfa = """
{
  "input": "int",
  "output": "int",
  "method": "map",
  "action": [{"u.fib": ["input"]}],
  "fcns": {
    "fib": {
      "params": [{"n": "int"}],
      "ret": "int",
      "do": [
        {"let": {"now": 0, "next": 1}},
        {"for": {"i": "n"},
         "while": {">": ["i", 1]},
         "step": {"i": {"-": ["i", 1]}},
         "do": [
           {"let": {"tmp": {"+": ["now", "next"]}}},
           {"set": {"now": "next",
                    "next": "tmp"}}
         ]
        },
        {"if": {"==": ["n", 0]}, "then": 0, "else": "next"}
      ]
    }
  }
}
"""
engine, = PFAEngine.fromJson(pfa)
engine.action(12)
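For comparison, the two PFA functions above translate almost line for line into plain Python. The sketch below is only a reading aid: the names `fib_recursive` and `fib_loop` are introduced here, and the loop version keeps the same countdown from n to 1 and the same special case for n equal to 0.

```python
def fib_recursive(n):
    """Mirror of the cond/else form: two base cases, then the recursive sum."""
    if n == 0:
        return 0
    if n == 1:
        return 1
    return fib_recursive(n - 1) + fib_recursive(n - 2)

def fib_loop(n):
    """Mirror of the for/while/step form: count i down from n while i > 1."""
    now, nxt = 0, 1
    i = n
    while i > 1:
        tmp = now + nxt          # the "let tmp" step
        now, nxt = nxt, tmp      # the "set now, next" step
        i -= 1
    return 0 if n == 0 else nxt  # the final if/else expression

print(fib_recursive(12), fib_loop(12))  # 144 144
```

Both versions return 144 for an input of 12, which is the value to expect from either engine above.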
begin and end
In some cases, you may want to perform special actions at the beginning and end of a data stream.
PFA has begin and end routines for this purpose.
The begin and end routines do not accept input and do not return output; they only manipulate persistent storage.
engine.begin()
engine.end()
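The intended call order is begin once, action once per datum, then end once. The plain-Python class below is a hypothetical stand-in (not the Titus engine class) that illustrates this lifecycle: begin and end take no input and return nothing, and all three phases communicate only through the persistent storage held in `cells`.

```python
class SketchEngine(object):
    """Hypothetical stand-in for a scoring engine; this is not the Titus class."""

    def __init__(self):
        self.cells = {}                      # persistent storage shared by all phases

    def begin(self):
        self.cells["seen"] = []              # prepare storage; no input, no output

    def action(self, item):
        self.cells["seen"] = self.cells["seen"] + [item]
        return list(self.cells["seen"])      # one output per input (map-style)

    def end(self):
        self.cells["total"] = len(self.cells["seen"])   # summarize; no input, no output

sketch = SketchEngine()
began = sketch.begin()
outputs = [sketch.action(item) for item in ["a", "b", "c"]]
ended = sketch.end()

print(outputs[-1])            # ['a', 'b', 'c']
print(sketch.cells["total"])  # 3
print(began, ended)           # None None
```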
Locator Marks & Names
Any JSON object in a PFA document may include "@" as a string-valued field. This string is used to provide a line number from the original source file so that errors can be traced back to their source.
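As an illustration of how such marks could be used, the sketch below walks a parsed document and lists every object that carries an "@" field together with its position in the JSON tree. The function `find_marks`, the path notation, and the mark text "example.yaml, line 5" are all hypothetical; the format of the mark string is whatever the tool that generated the document chose to write.

```python
import json

def find_marks(node, path="$"):
    """Collect (path, mark) pairs for every JSON object carrying an "@" string."""
    found = []
    if isinstance(node, dict):
        if isinstance(node.get("@"), str):
            found.append((path, node["@"]))
        for key, value in node.items():
            if key != "@":
                found.extend(find_marks(value, path + "." + key))
    elif isinstance(node, list):
        for index, value in enumerate(node):
            found.extend(find_marks(value, "%s[%d]" % (path, index)))
    return found

doc = json.loads('''
{"input": "double", "output": "double",
 "action": [{"@": "example.yaml, line 5", "m.sqrt": "input"}]}
''')

print(find_marks(doc))  # [('$.action[0]', 'example.yaml, line 5')]
```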
Following Avro convention, names of PFA identifiers start with [A-Za-z_] and subsequently contain only [A-Za-z0-9_].
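The naming rule can be written as a single regular expression. The helper below is a small sketch for checking candidate names before they go into a document; `is_valid_name` is a name introduced here, and the check covers plain identifiers only, not dotted function references such as u.fib.

```python
import re

# First character: letter or underscore; the rest: letters, digits, or underscores.
NAME_PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

def is_valid_name(name):
    """Return True if the whole string is a valid identifier under the rule above."""
    return NAME_PATTERN.fullmatch(name) is not None

for candidate in ["tally", "_tmp1", "1st", "new-item", ""]:
    print(repr(candidate), is_valid_name(candidate))
```

The first two candidates pass; the others fail because they start with a digit, contain a hyphen, or are empty.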