function-pipe¶
Introduction¶
The function-pipe Python module defines the class FunctionNode (FN) and decorators to create the derived-class PipeNode (PN). FNs are wrappers of callables that permit returning new FNs after applying operators, composing callables, or partialing. This supports the flexible combination of functions in a lazy and declarative manner.
PipeNodes (PNs) are FNs prepared for extended function composition or dataflow programming. PNs, through a decorator-provided, two-stage calling mechanism, expose to wrapped functions both predecessor output and a common initial input. Rather than strictly linear pipelines, sequences of PNs can be stored and reused; PNs can be provided as arguments to other PNs; and results from PNs can be stored for later recall in the same pipeline.
Code: https://github.com/InvestmentSystems/function-pipe
Docs: http://function-pipe.readthedocs.io
Packages: https://pypi.python.org/pypi/function-pipe
Getting Started¶
FunctionNode and PipeNode are abstract tools for linking Python functions, and are applicable in many domains. The best way to get to know them is to follow some examples and experiment. This documentation provides examples in a number of domains, including processing strings and Pandas DataFrames.
Installation¶
A standard setuptools installer is available via PyPI:
https://pypi.python.org/pypi/function-pipe
Or, install via pip3:
pip3 install function-pipe
Source code can be obtained from the GitHub repository linked above.
History¶
The function-pipe tools were developed within Investment Systems, the core development team of Research Affiliates, LLC. Many of the approaches implemented were first created by Max Moroz in 2012. Christopher Ariza subsequently refined and extended those approaches into the current models of FunctionNode and PipeNode. The first public release of function-pipe was in January 2017. After that, Charles Burkland improved the quality and user-friendliness of the library, along with the addition of some more explicit functions. The second public release was made in April 2022.
API¶
function_pipe¶
- class FunctionNode(function: Any, *, doc_function: Optional[Callable] = None, doc_args: Tuple[Any, ...] = (), doc_kwargs: Optional[Dict[str, Any]] = None)¶
A wrapper for a callable that can reside in an expression of numerous FunctionNodes, or be modified with unary or binary operators.
- __init__(function: Any, *, doc_function: Optional[Callable] = None, doc_args: Tuple[Any, ...] = (), doc_kwargs: Optional[Dict[str, Any]] = None) None ¶
Args:
    function: a callable or value. If given a value, will create a function that simply returns that value.
    doc_function: the function to display in the repr; will be set to function if not provided.
    doc_args: the positional arguments to display in the repr.
    doc_kwargs: the keyword arguments to display in the repr.
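Example (a minimal usage sketch, not taken from the library's own docstrings): wrapping a lambda and calling the resulting FunctionNode.

>>> import function_pipe as fpn
>>> fn = fpn.FunctionNode(lambda x: x * 2)
>>> fn(3)
6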
- property unwrap: Callable¶
The doc_function should be set to the core function being wrapped, no matter the level of wrapping.
- __call__(*args: Any, **kwargs: Any) Any ¶
Call the wrapped function with args and kwargs.
- partial(*args: Any, **kwargs: Any) function_pipe.core.function_pipe.FunctionNode ¶
Return a new FunctionNode with a partialed function with args and kwargs.
- __neg__() function_pipe.core.function_pipe.FN ¶
Return a new FunctionNode that, when evaluated, will negate the result of self.
- __invert__() function_pipe.core.function_pipe.FN ¶
Return a new FunctionNode that, when evaluated, will invert the result of self.
- NOTE:
This is generally expected to be a Boolean inversion, such as ~ (not) applied to NumPy, Pandas, or StaticFrame objects.
- __abs__() function_pipe.core.function_pipe.FN ¶
Return a new FunctionNode that, when evaluated, will find the absolute value of the result of self.
- __eq__(rhs: Any) function_pipe.core.function_pipe.FN ¶
Return self==value.
- __lt__(rhs: Any) function_pipe.core.function_pipe.FN ¶
Return self<value.
- __le__(rhs: Any) function_pipe.core.function_pipe.FN ¶
Return self<=value.
- __gt__(rhs: Any) function_pipe.core.function_pipe.FN ¶
Return self>value.
- __ge__(rhs: Any) function_pipe.core.function_pipe.FN ¶
Return self>=value.
- __ne__(rhs: Any) function_pipe.core.function_pipe.FN ¶
Return self!=value.
- __rshift__(rhs: Callable) function_pipe.core.function_pipe.FN ¶
Composes a new FunctionNode that will call lhs first, and then feed its result into rhs.
- __rrshift__(lhs: Callable) function_pipe.core.function_pipe.FN ¶
Composes a new FunctionNode that will call lhs first, and then feed its result into rhs.
- __lshift__(rhs: Callable) function_pipe.core.function_pipe.FN ¶
Composes a new FunctionNode that will call rhs first, and then feed its result into lhs.
- __rlshift__(lhs: Callable) function_pipe.core.function_pipe.FN ¶
Composes a new FunctionNode that will call rhs first, and then feed its result into lhs.
- __or__(rhs: function_pipe.core.function_pipe.FN) function_pipe.core.function_pipe.FN ¶
Only implemented for PipeNode.
- __ror__(lhs: function_pipe.core.function_pipe.FN) function_pipe.core.function_pipe.FN ¶
Only implemented for PipeNode.
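A short usage sketch (the names here are illustrative, not from the library's docs) combining partialing, >> composition, and a unary operator:

>>> import function_pipe as fpn
>>> add = fpn.FunctionNode(lambda x, y: x + y)
>>> double = fpn.FunctionNode(lambda x: x * 2)
>>> f = add.partial(y=10) >> double  # left-to-right composition
>>> f(1)
22
>>> (-f)(1)  # unary operators produce new FunctionNodes
-22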
- class PipeNode(function: Any, *, doc_function: Optional[Callable] = None, doc_args: Tuple[Any, ...] = (), doc_kwargs: Optional[Dict[str, Any]] = None, call_state: Optional[function_pipe.core.function_pipe.PipeNode.State] = None, predecessor: Optional[function_pipe.core.function_pipe.PN] = None)¶
This encapsulates the node that will be used in a pipeline.
It is not expected to be created directly; rather, it is created through usage of the pipe_node (and related) decorators.
PipeNodes will be in (or move between) one of three states, depending on where they were created and on the current state of pipeline evaluation.
- partial(*args: str, **kwargs: str) function_pipe.core.function_pipe.PN ¶
Partialing PipeNodes is prohibited. Use the pipe_node_factory (and related) decorators to pass in expression-level arguments.
- property call_state: Optional[State]¶
The current call state of the Node
- property predecessor: Optional[function_pipe.core.function_pipe.PN]¶
The PipeNode preceding this Node in a pipeline. Can be None.
- __or__(rhs: function_pipe.core.function_pipe.PN) function_pipe.core.function_pipe.PN ¶
Invokes rhs, passing in self as the kwarg PREDECESSOR_PN.
- __ror__(lhs: function_pipe.core.function_pipe.PN) function_pipe.core.function_pipe.PN ¶
Invokes lhs, passing in self as the kwarg PREDECESSOR_PN.
- __getitem__(pn_input: Any) Any ¶
Invokes self, passing in pn_input as the kwarg PN_INPUT.
- NOTE:
If pn_input is None, this will evaluate self with a default PipeNodeInput instance. If the user desires the initial input to be literally None, use (**{PN_INPUT: None}) instead.
- __call__(*args: Any, **kwargs: Any) Any ¶
Call the wrapped function with args and kwargs.
- class PipeNodeInput¶
PipeNode input to support store and recall; subclassable to expose other attributes and parameters.
- store(key: str, value: Any) None ¶
Store key and value in the underlying store.
- recall(key: str) Any ¶
Recall key from the underlying store. Can raise a KeyError.
- property store_items: ItemsView[str, Any]¶
Return an items view of the underlying store.
- pipe_node(*key_positions: typing.Union[typing.Callable, str], core_decorator: typing.Callable[[typing.Any], typing.Callable] = <function _core_logger>, self_keyword: str = 'self') Union[Callable, function_pipe.core.function_pipe.PipeNode] ¶
Decorates a function to become a PipeNode that takes no expression-level args.
This can either be used as a decorator or a decorator factory, similar to functools.lru_cache.
Examples:
>>> @pipe_node
>>> def func(**kwargs):
>>>     pass

>>> @pipe_node()
>>> def func():
>>>     pass

>>> @pipe_node(PN_INPUT)
>>> def func(pn_input):
>>>     pass

>>> class Example:
>>>     @pipe_node(PN_INPUT)
>>>     def method(self, pn_input):
>>>         pass

>>> from functools import partial
>>> español_pipe_node = partial(pipe_node, self_keyword="uno_mismo")
>>> ...
>>> class Ejemplo:
>>>     @español_pipe_node(PN_INPUT)
>>>     def método(uno_mismo, pn_input):
>>>         pass
- Args:
    key_positions: either a single callable, or a list of keywords that will be positionally bound to the decorated function.
    core_decorator: a decorator that will be applied to the core_callable. This is typically a logger; by default, it will print to stdout.
    self_keyword: which keyword to look for when decorating instance methods.
- pipe_node_factory(*key_positions: typing.Union[typing.Callable, str], core_decorator: typing.Callable[[typing.Any], typing.Callable] = <function _core_logger>, self_keyword: str = 'self') Union[Callable, Callable[[Any], function_pipe.core.function_pipe.PipeNode]] ¶
Decorates a function to become a pipe node factory that, when given expression-level arguments, will return a PipeNode.
This can either be used as a decorator or a decorator factory, similar to functools.lru_cache.
Examples:
>>> @pipe_node_factory
>>> def func(a, b, **kwargs):
>>>     pass
>>> ...
>>> func(1, 2) # This is now a PipeNode!

>>> @pipe_node_factory()
>>> def func(*, a, b):
>>>     pass
>>> ...
>>> func(a=1, b=2) # This is now a PipeNode!

>>> @pipe_node_factory(PN_INPUT, PREDECESSOR_RETURN)
>>> def func(pn_input, previous_value, a, *, b):
>>>     # pn_input will be given the PN_INPUT from the pipeline
>>>     # previous_value will be given the PREDECESSOR_RETURN from the pipeline
>>>     pass
>>> ...
>>> func(1, b=2) # This is now a PipeNode!

>>> class Example:
>>>     @pipe_node_factory(PN_INPUT, PREDECESSOR_RETURN)
>>>     def method(self, pn_input, previous_value, a, *, b):
>>>         pass
>>> ...
>>> Example().method(1, b=2) # This is now a PipeNode!

>>> from functools import partial
>>> español_pipe_node_factory = partial(pipe_node_factory, self_keyword="uno_mismo")
>>> ...
>>> class Ejemplo:
>>>     @español_pipe_node_factory(PN_INPUT, PREDECESSOR_RETURN)
>>>     def método(uno_mismo, pn_input, valor_anterior, a, *, b):
>>>         pass
>>> ...
>>> Ejemplo().método(1, b=2) # This is now a PipeNode!
- Args:
    key_positions: either a single callable, or a list of keywords that will be positionally bound to the decorated function.
    core_decorator: a decorator that will be applied to the core_callable. This is typically a logger; by default, it will print to stdout.
    self_keyword: which keyword to look for when decorating instance methods.
- compose(*funcs: Callable) function_pipe.core.function_pipe.FN ¶
Given a list of functions, execute them from right to left, passing the returned value of the right function to the left function. The reduced function is stored in a FunctionNode.
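A minimal sketch (the lambdas here are illustrative):

>>> import function_pipe as fpn
>>> f = fpn.compose(lambda x: x * 2, lambda x: x + 1)  # right to left: add 1, then double
>>> f(3)
8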
- classmethod_pipe_node(*key_positions: typing.Union[typing.Callable, str], core_decorator: typing.Callable[[typing.Any], typing.Callable] = <function _core_logger>) Union[Callable, function_pipe.core.function_pipe.PipeNode] ¶
Decorates a function to become a classmethod PipeNode that takes no expression-level args.
This can either be used as a decorator or a decorator factory, similar to functools.lru_cache.
This is a convenience decorator that is the mental equivalent of this pseudo-code:
>>> @classmethod
>>> @pipe_node(...)
>>> def func(...)
Examples:
>>> @classmethod_pipe_node
>>> def func(cls, **kwargs):
>>>     pass

>>> @classmethod_pipe_node()
>>> def func(cls):
>>>     pass

>>> @classmethod_pipe_node(PN_INPUT)
>>> def func(cls, pn_input):
>>>     pass
- Args:
    key_positions: either a single callable, or a list of keywords that will be positionally bound to the decorated function.
    core_decorator: a decorator that will be applied to the core_callable. This is typically a logger; by default, it will print to stdout.
- classmethod_pipe_node_factory(*key_positions: typing.Union[typing.Callable, str], core_decorator: typing.Callable[[typing.Any], typing.Callable] = <function _core_logger>) Callable ¶
Decorates a function to become a classmethod pipe node factory that, when given expression-level arguments, will return a PipeNode.
This can either be used as a decorator or a decorator factory, similar to functools.lru_cache.
This is a convenience decorator that is the mental equivalent of this pseudo-code:
>>> @classmethod
>>> @pipe_node_factory(...)
>>> def func(...)
Examples:
>>> @classmethod_pipe_node_factory
>>> def func(cls, a, b, **kwargs):
>>>     pass
>>> ...
>>> SomeClass.func(1, 2) # This is now a PipeNode!

>>> @classmethod_pipe_node_factory()
>>> def func(cls, *, a, b):
>>>     pass
>>> ...
>>> SomeClass.func(a=1, b=2) # This is now a PipeNode!

>>> @classmethod_pipe_node_factory(PN_INPUT, PREDECESSOR_RETURN)
>>> def func(cls, pn_input, previous_value, a, *, b):
>>>     # ``pn_input`` will be given the PN_INPUT from the pipeline
>>>     # ``previous_value`` will be given the PREDECESSOR_RETURN from the pipeline
>>>     pass
>>> ...
>>> SomeClass.func(1, b=2) # This is now a PipeNode!
- Args:
    key_positions: either a single callable, or a list of keywords that will be positionally bound to the decorated function.
    core_decorator: a decorator that will be applied to the core_callable. This is typically a logger; by default, it will print to stdout.
- staticmethod_pipe_node(*key_positions: typing.Union[typing.Callable, str], core_decorator: typing.Callable[[typing.Any], typing.Callable] = <function _core_logger>) Union[Callable, function_pipe.core.function_pipe.PipeNode] ¶
Decorates a function to become a staticmethod PipeNode that takes no expression-level args.
This can either be used as a decorator or a decorator factory, similar to functools.lru_cache.
This is a convenience decorator that is the mental equivalent of this pseudo-code:
>>> @staticmethod
>>> @pipe_node(...)
>>> def func(...)
Examples:
>>> @staticmethod_pipe_node
>>> def func(**kwargs):
>>>     pass

>>> @staticmethod_pipe_node()
>>> def func():
>>>     pass

>>> @staticmethod_pipe_node(PN_INPUT)
>>> def func(pn_input):
>>>     pass
- Args:
    key_positions: either a single callable, or a list of keywords that will be positionally bound to the decorated function.
    core_decorator: a decorator that will be applied to the core_callable. This is typically a logger; by default, it will print to stdout.
- staticmethod_pipe_node_factory(*key_positions: typing.Union[typing.Callable, str], core_decorator: typing.Callable[[typing.Any], typing.Callable] = <function _core_logger>) Callable ¶
Decorates a function to become a staticmethod pipe node factory that, when given expression-level arguments, will return a PipeNode.
This can either be used as a decorator or a decorator factory, similar to functools.lru_cache.
This is a convenience decorator that is the mental equivalent of this pseudo-code:
>>> @staticmethod
>>> @pipe_node_factory(...)
>>> def func(...)
Examples:
>>> @staticmethod_pipe_node_factory
>>> def func(a, b, **kwargs):
>>>     pass
>>> ...
>>> SomeClass.func(1, 2) # This is now a PipeNode!

>>> @staticmethod_pipe_node_factory()
>>> def func(*, a, b):
>>>     pass
>>> ...
>>> SomeClass.func(a=1, b=2) # This is now a PipeNode!

>>> @staticmethod_pipe_node_factory(PN_INPUT, PREDECESSOR_RETURN)
>>> def func(pn_input, previous_value, a, *, b):
>>>     # ``pn_input`` will be given the PN_INPUT from the pipeline
>>>     # ``previous_value`` will be given the PREDECESSOR_RETURN from the pipeline
>>>     pass
>>> ...
>>> SomeClass.func(1, b=2) # This is now a PipeNode!
- Args:
    key_positions: either a single callable, or a list of keywords that will be positionally bound to the decorated function.
    core_decorator: a decorator that will be applied to the core_callable. This is typically a logger; by default, it will print to stdout.
- store(pni: PipeNodeInput, ret_val: tp.Any, label: str) tp.Any ¶
Store ret_val (the value returned from the previous PipeNode) to pni under label. Forward ret_val.
- recall(pni: PipeNodeInput, label: str) tp.Any ¶
Recall label from pni and return it. Can raise a KeyError.
- call(*pns: PipeNode) tp.Any ¶
Broadcasts pns, and returns the result of pns[-1].
Since pns are all PipeNodes, they will all be evaluated before being passed in as values to this function.
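A sketch of how call might be used, assuming (as the DataFrame examples later suggest) that it returns a node evaluable like any other PipeNode; the node names are illustrative:

>>> import function_pipe as fpn
>>> @fpn.pipe_node(fpn.PN_INPUT)
>>> def side_effect_a(pn_input):
>>>     return ('a', pn_input)
>>> @fpn.pipe_node(fpn.PN_INPUT)
>>> def side_effect_b(pn_input):
>>>     return ('b', pn_input)
>>> combined = fpn.call(side_effect_a, side_effect_b)
>>> combined[10]  # both nodes are evaluated; the result of the last is returned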
- pretty_repr(f: Any) str ¶
Provide a pretty string representation of a FN, PN, or anything. If the object is a FN or PN, it will recursively represent any nested FNs/PNs.
- is_unbound_self_method(core_callable: Union[classmethod, staticmethod, Callable], *, self_keyword: str) bool ¶
Inspects a given callable to determine whether it is both unbound and has self_keyword as the first argument in its signature.
Intuitively Understanding Pipelines¶
Tutorial Alias
PN: pipe node
This tutorial will teach the foundational concepts of function_pipe
PN pipelines through clear and intuitive steps. After reading, you will:
Know how to build and use a PN and/or PN pipeline
Understand the difference between the creation and evaluation phases of a PN
Understand how to link PNs together
Understand what a PN input is, and how to share data across PNs
Be able to debug issues in your own PNs
Introduction¶
Function pipelines happen in two stages: creation & evaluation.
Creation is the step in which a pipeline is defined, understood as either a single PN, or multiple PNs chained together using the |
operator. Here is a pseudo-code example of this:
pipeline = (pn_a | pn_b | pn_c | ...)
# OR
pipeline = pn
Evaluation is the step in which the pipeline is actually called, where the function code inside each PN is actually run:
pipeline["initial input"] # Evaluate the pipeline by using __getitem__, and passing in some initial input
Visualizing the Distinction Between Creation & Evaluation¶
To get started, we will create two simple PNs, put them into a pipeline expression, and then evaluate that expression. Creation followed by evaluation.
To do this, we will use the fpn.pipe_node decorator, and define functions which take **kwargs. (**kwargs will be explained later!)
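The exact code from the original example is not reproduced here; the following is a sketch, consistent with the output shown below, that relies on the default core logger (which prints each node's wrapped function as it is evaluated):

import function_pipe as fpn

@fpn.pipe_node
def pipe_node_1(**kwargs):
    print("pipe_node_1 has been evaluated")

@fpn.pipe_node
def pipe_node_2(**kwargs):
    print("pipe_node_2 has been evaluated")

print("Start creation")
pipeline = (pipe_node_1 | pipe_node_2)
print("End creation")

print("Start pipeline evaluation")
pipeline[None]  # evaluate with a default PipeNodeInput
print("End pipeline evaluation")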
Now, let's look at the output produced by running code along these lines.
Start creation
End creation
Start pipeline evaluation
| <function pipe_node_1 at 0x7f582c428ca0>
pipe_node_1 has been evaluated
| <function pipe_node_2 at 0x7f582c428b80>
pipe_node_2 has been evaluated
End pipeline evaluation
As you can see, none of the PNs are called (evaluated) until the pipeline expression itself has been created and then invoked.
What Is The Deal With Kwargs¶
In the previous example, we used **kwargs on each function (if we hadn't, the code would have failed!). Why did we need this, and what are they? Let's investigate!
To investigate, we will build up a slightly longer pipeline, and expand the nodes to return some values:
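The original code is not reproduced here; the sketch below is consistent with the output that follows, except that a plain print(kwargs) is used, so the quoting in your output will differ slightly:

import function_pipe as fpn

@fpn.pipe_node
def pipe_node_1(**kwargs):
    print(kwargs)
    return 1

@fpn.pipe_node
def pipe_node_2(**kwargs):
    print(kwargs)
    return 2

@fpn.pipe_node
def pipe_node_3(**kwargs):
    print(kwargs)
    return 3

pipeline = (pipe_node_1 | pipe_node_2 | pipe_node_3)
pipeline["original_input"]
print(f"{repr(pipeline) = }")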
Running the above code will produce the following output:
| <function pipe_node_1 at 0x7f582cceb700>
{"pn_input": "original_input"}
| <function pipe_node_2 at 0x7f582c2d30d0>
{"pn_input": "original_input", "predecessor_pn": <PN: pipe_node_1>, "predecessor_return": 1}
| <function pipe_node_3 at 0x7f582c33b820>
{"pn_input": "original_input", "predecessor_pn": <PN: pipe_node_1 | pipe_node_2>, "predecessor_return": 2}
repr(pipeline) = '<PN: pipe_node_1 | pipe_node_2 | pipe_node_3>'
There are a few things happening here worth observing.
Every node is given the kwarg pn_input.
Each node (except the first) is given the kwargs predecessor_pn and predecessor_return.
The first node is special: in the context of the pipeline it lives in, there are no PNs preceding it, hence predecessor_pn and predecessor_return are not passed in!
For every other node, it is intuitive what the values of predecessor_pn and predecessor_return will be: they contain the node instance of the one before, and the return value of that node once it has been evaluated.
As we can observe on pipe_node_3
, the repr of predecessor_pn
shows how its predecessor is actually a pipeline of PNs instead of a single PN. Additionally, printing the repr of pipeline
shows how it is a pipeline of multiple PNs.
Note
From now on, we will refer to the three strings above by their symbolic constant handles in the function_pipe module. They are fpn.PN_INPUT
, fpn.PREDECESSOR_PN
, and fpn.PREDECESSOR_RETURN
, respectively.
Using the Kwargs¶
Now that we know what will be passed in through each PN’s **kwargs
based on where it is in the pipeline, let’s write some code that takes advantage of that.
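Here is a sketch of such code (the node names init, add_1, and multiply_by_input are illustrative, standing in for the original example):

import function_pipe as fpn

@fpn.pipe_node
def init(**kwargs):
    # The first node only receives the initial input.
    return kwargs[fpn.PN_INPUT]

@fpn.pipe_node
def add_1(**kwargs):
    return kwargs[fpn.PREDECESSOR_RETURN] + 1

@fpn.pipe_node
def multiply_by_input(**kwargs):
    return kwargs[fpn.PREDECESSOR_RETURN] * kwargs[fpn.PN_INPUT]

pipeline_1 = (init | add_1 | multiply_by_input)
pipeline_2 = (init | multiply_by_input | add_1)

print(pipeline_1[4])  # (4 + 1) * 4 -> 20
print(pipeline_2[4])  # (4 * 4) + 1 -> 17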
As you can see, PNs have the ability to use the return values from their predecessors, or the fpn.PN_INPUT
whenever they need to.
You can also observe that pipeline_2
reversed the order of the latter two PNs from their order in pipeline_1
. This worked seamlessly, since each of the PNs was accessing information from the predecessor’s return value. Had we tried something like:
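For example, with the hypothetical nodes from the sketch above:

# add_1 is now the first node, but it asks for fpn.PREDECESSOR_RETURN,
# which the first node in a pipeline never receives.
failing_pipeline = (add_1 | multiply_by_input)
failing_pipeline[4]  # raises a KeyError for 'predecessor_return' in this sketch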
it would have failed, since the first PN is never given fpn.PREDECESSOR_RETURN
as a kwarg.
Note
fpn.PREDECESSOR_PN
is a kwarg that is almost never used in regular PNs or pipelines. If you are reaching for this kwarg, you are probably doing something wrong! Its primary purpose is to ensure the internals of the function_pipe.PipeNode module are working properly, not for use by end users.
Hiding the Kwargs¶
Now that we know how to use **kwargs
, we can see that manually extracting the pipeline kwargs we care about each time is tedious and error-prone! On top of that, it's highly undesirable to require the signatures of all PNs to accept arbitrary **kwargs
.
Lucky for us, the fpn.pipe_node
decorator can be optionally given the desired kwargs we want to positionally bind in the actual function signature.
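Here is a sketch of the previous example rewritten with bound kwargs (node names remain illustrative; add_7 matches the node referred to in the next section):

import function_pipe as fpn

@fpn.pipe_node(fpn.PN_INPUT)
def init(pn_input):
    # fpn.PN_INPUT is bound to the first positional argument.
    return pn_input

@fpn.pipe_node(fpn.PREDECESSOR_RETURN)
def add_7(previous_value):
    return previous_value + 7

@fpn.pipe_node(fpn.PN_INPUT, fpn.PREDECESSOR_RETURN)
def multiply_by_input(pn_input, previous_value):
    return pn_input * previous_value

pipeline = (init | add_7 | multiply_by_input)
print(pipeline[3])  # (3 + 7) * 3 -> 30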
Ah. That’s much better. It clears up the function signature, and makes it clear what each PN function needs in order to process properly.
To restate what’s happening, arguments given to the decorator will be extracted from the pipeline, and implicitly passed in as the first positional arguments defined in the function signature.
What About Other Arguments¶
So far, we have most of the basics. However, there is one essential use case missing: how do I define additional arguments on my function? Let’s say instead of a PN called add_7
, I want to have a PN called add
, that takes an argument that will be added to the predecessor return value. Here’s a pseudo-code example:
@fpn.pipe_node(fpn.PREDECESSOR_RETURN)
def add(previous_value, value_to_add):
return previous_value + value_to_add
pipeline = (... | ... | add(13) | ... )
Ideally, there should be a mechanism that allows the user to bind (or partial) custom args & kwargs, giving their pipelines all the flexibility needed.
Welcome To the Factory¶
Thankfully, such a mechanism exists: it’s called fpn.pipe_node_factory
. This is the other key decorator we need to know for building PNs.
The previous example would work exactly as expected had we replaced the fpn.pipe_node
decorator with the fpn.pipe_node_factory
decorator!
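A sketch of what that looks like (reusing the hypothetical init node from earlier):

import function_pipe as fpn

@fpn.pipe_node(fpn.PN_INPUT)
def init(pn_input):
    return pn_input

@fpn.pipe_node_factory(fpn.PREDECESSOR_RETURN)
def add(previous_value, value_to_add):
    # value_to_add is an expression-level argument, supplied when the pipeline is built.
    return previous_value + value_to_add

pipeline = (init | add(13) | add(-3))
print(pipeline[1])  # 1 + 13 - 3 -> 11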
To reiterate what’s happening here, the fpn.pipe_node_factory
decorates the function in such a way that it can be thought of as a factory that builds PNs. This is essential, since every element in a pipeline must be a PN! The PN factories allow us to use PNs bound (or partialed) with arbitrary args/kwargs.
A Common Factory Mistake¶
A common failure when using fpn.pipe_node_factory
is forgetting to call the decorated factory before it is put into the pipeline!
Building on the previous example, let's see what happens if we forget to give add its argument.
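Continuing the sketch above, the mistake looks like this:

# ``add`` itself (a factory), rather than ``add(...)`` (a PN), is placed in the pipeline.
pipeline = (init | add)  # raises the ValueError shown below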
Let’s see the failure message this will raise:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
...
ValueError: Either you put a factory in a pipeline (i.e. not a pipe node), or your factory was given a reserved pipeline kwarg ('pn_input', 'predecessor_pn', 'predecessor_return').
This failure should make sense now! Every node in a pipeline must be a PN. Since add
was not given a factory argument, it was a PN factory, not a PN.
PN Input (pni)¶
Code Alias
pni: pn_input (argument conventionally bound to fpn.PN_INPUT
)
Up until now, the usage of pni
(i.e. the argument conventionally bound to fpn.PN_INPUT
) has been relatively varied. This is because fpn.PN_INPUT
refers to the initial input to the pipeline, and as such, can be any value. For these simple examples, I have been providing integers, but real-world cases typically rely on the fpn.PipeNodeInput
class.
fpn.PipeNodeInput
is a subclassable object, which has the ability to:
Store results from previous PNs
Recall values from previous PNs
Share state across PNs.
Let’s observe the following example, where we subclass fpn.PipeNodeInput
in order to share some state across PNs.
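A sketch of such a subclass and a pipeline that uses it (the offset attribute and node names are illustrative):

import function_pipe as fpn

class PNI(fpn.PipeNodeInput):
    def __init__(self, offset):
        super().__init__()
        self.offset = offset  # state shared with every node in the pipeline

@fpn.pipe_node(fpn.PN_INPUT)
def init(pni):
    return pni.offset

@fpn.pipe_node_factory(fpn.PN_INPUT, fpn.PREDECESSOR_RETURN)
def add_offset_times(pni, previous_value, factor):
    return previous_value + pni.offset * factor

pipeline = (init | add_offset_times(2) | add_offset_times(3))
print(pipeline[PNI(offset=10)])  # 10 + 10*2 + 10*3 -> 60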
This is also a good opportunity to highlight how pipeline expressions can be easily reused to provide different results when given different initial inputs. Using the above example, giving a different pni
will give us a totally different result:
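Continuing the sketch above:

print(pipeline[PNI(offset=1)])   # 1 + 1*2 + 1*3 -> 6
print(pipeline[PNI(offset=-5)])  # -5 + -5*2 + -5*3 -> -30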
Store & Recall¶
One of the main benefits of using a fpn.PipeNodeInput
subclass is the ability to use fpn.store
and fpn.recall
. These utility methods will store & recall results from a cache privately stored on the pni
.
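A sketch of store and recall across two pipelines sharing one pni (the node names are illustrative):

import function_pipe as fpn

@fpn.pipe_node(fpn.PN_INPUT)
def init(pni):
    return 10

@fpn.pipe_node_factory(fpn.PREDECESSOR_RETURN)
def add(previous_value, value):
    return previous_value + value

pni = fpn.PipeNodeInput()

# fpn.store forwards the predecessor value while caching it on the pni.
pipeline_a = (init | add(5) | fpn.store("first_result") | add(100))
pipeline_a[pni]

# Any later pipeline evaluated with the same pni can recall the cached value.
pipeline_b = (fpn.recall("first_result") | add(1))
print(pipeline_b[pni])  # (10 + 5) + 1 -> 16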
As you can see, once results have been stored using fpn.store
, they are retrievable using fpn.recall
for any other pipeline that is evaluated with that same pni!
Additionally, you can see that fpn.store
and fpn.recall
simply forward along the previous return values so that they can be seamlessly inserted anywhere into a pipeline.
Note
fpn.store
and fpn.recall
only work when the initial input is a valid instance or subclass instance of fpn.PipeNodeInput
.
Advanced - Instance/Class/Static Methods¶
The final section in this tutorial explains the tools needed for turning classmethods
and staticmethods
into PNs. To do this, we can take advantage of special classmethod/staticmethod tools built into the function_pipe library!
Note
Normal “instance” methods (i.e. functions that expect self (i.e. the instance) passed in as the first argument) work exactly as expected with the fpn.pipe_node
and fpn.pipe_node_factory
decorators, as long as the name of the argument is “self”.
Building on everything we’ve seen so far, let’s take a look at the class below, which demonstrates usage of fpn.classmethod_pipe_node
, fpn.classmethod_pipe_node_factory
, fpn.staticmethod_pipe_node
and fpn.staticmethod_pipe_node_factory
.
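A sketch of such a class (the Operations name and node logic are illustrative):

import function_pipe as fpn

class Operations:
    OFFSET = 10

    @fpn.classmethod_pipe_node(fpn.PN_INPUT)
    def init(cls, pn_input):
        return pn_input + cls.OFFSET

    @fpn.classmethod_pipe_node_factory(fpn.PREDECESSOR_RETURN)
    def multiply(cls, previous_value, factor):
        return previous_value * factor

    @fpn.staticmethod_pipe_node(fpn.PREDECESSOR_RETURN)
    def negate(previous_value):
        return -previous_value

    @fpn.staticmethod_pipe_node_factory(fpn.PREDECESSOR_RETURN)
    def add(previous_value, value):
        return previous_value + value

pipeline = (Operations.init | Operations.multiply(2) | Operations.negate | Operations.add(1))
print(pipeline[3])  # -((3 + 10) * 2) + 1 -> -25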
To help explain the decorators a bit more, here is a quick pseudo-code example showing an alternative way to understand them:
@fpn.classmethod_pipe_node
# Behaves like you think this would:
@classmethod
@fpn.pipe_node
# ------------------------------------------------------------
@fpn.staticmethod_pipe_node_factory
# Behaves like you think this would:
@staticmethod
@fpn.pipe_node_factory
# etc...
Miscellaneous¶
__getitem__¶
For this entire tutorial, PNs and pipeline expressions have been evaluated using __getitem__
. There is actually another way to do this. As we learned, the first node in a pipeline only receives fpn.PN_INPUT
as a kwarg. Not only that, but it must receive it as a kwarg. The call that kicks off a PN/pipeline evaluation must give a single kwarg: fpn.PN_INPUT
Thus, we can actually evaluate a PN/pipeline expression this way:
some_pipe_node(**{fpn.PN_INPUT: pni})
Obviously, this approach is not very pretty, and it’s quite a lot to type for the privilege of evaluation. Thus, the __getitem__
syntactic sugar was introduced so that the user isn't required to unpack a single kwarg whenever they want to evaluate a pipeline.
Note
__getitem__
has special handling for when the key is None
. This will evaluate the PN/pipeline expression with a bare instance of fpn.PipeNodeInput
. If the user desires to evaluate their expression with the literal value None
, they must kwarg unpack like so: pn(**{fpn.PN_INPUT: None})
.
Common Mistakes¶
Placing a bare factory in a pipeline (see: A Common Factory Mistake).
Calling a PN directly (with the exception of unpacking the single kwarg fpn.PN_INPUT).
Partialing a method wrapped with fpn.pipe_node or fpn.pipe_node_factory.
Using @classmethod or @staticmethod decorators instead of the special decorators designed for working with classmethods/staticmethods.
Decorating a function with fpn.pipe_node whose signature expects args/kwargs outside either those bound from the pipeline, or **kwargs.
Broadcasting¶
A feature of fpn.pipe_node_factory
is how it handles args/kwargs that are themselves PNs. For these types of arguments, it will evaluate them as isolated PNs with fpn.PN_INPUT
forwarded, and then use the evaluated value in place of that PN. (This is referred to as broadcasting).
Example:
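A sketch (node names are illustrative):

import function_pipe as fpn

@fpn.pipe_node(fpn.PN_INPUT)
def init(pn_input):
    return pn_input

@fpn.pipe_node(fpn.PN_INPUT)
def double_input(pn_input):
    return pn_input * 2

@fpn.pipe_node_factory(fpn.PREDECESSOR_RETURN)
def add(previous_value, value):
    # ``value`` may itself be a PN; it is broadcast (evaluated with the same
    # fpn.PN_INPUT) before this function is called.
    return previous_value + value

pipeline = (init | add(double_input))
print(pipeline[5])  # 5 + (5 * 2) -> 15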
As we can see, when factories are given PNs as args/kwargs, they are evaluated with the fpn.PN_INPUT
given to the original PN/expression being evaluated.
Arithmetic¶
A helpful feature of PNs is the ability to perform arithmetic operations on the pipeline during creation. Supported operators are listed below, with a short sketch after the list:
Unary: -, ~, and abs()
Binary: +, -, *, /, **, ==, !=, >, <, <=, and >=
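A minimal sketch of what such an expression might look like, assuming the operator-combined node is evaluated just like any other pipeline (node names are illustrative):

import function_pipe as fpn

@fpn.pipe_node(fpn.PN_INPUT)
def init(pn_input):
    return pn_input

@fpn.pipe_node(fpn.PREDECESSOR_RETURN)
def double(previous_value):
    return previous_value * 2

pipeline = -(init | double) + 100
print(pipeline[3])  # -(3 * 2) + 100 -> 94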
Conclusion¶
After going through this tutorial, you should now have an understanding of:
The creation and evaluation stages of a pipeline
The fpn.pipe_node decorator, and when to use it.
The fpn.pipe_node_factory decorator, and when to use it.
How to positionally bind the first argument(s) of a pipeline to fpn.PN_INPUT and/or fpn.PREDECESSOR_RETURN.
How to use fpn.store and fpn.recall to store and recall results from a pipeline.
How to use fpn.PipeNodeInput.
How to make instance methods, classmethods, and staticmethods into PNs.
Why __getitem__ is used to evaluate a pipeline, and what an alternative calling method is.
How to identify and address the most common mistakes when using PNs.
What broadcasting is and how to use it.
How to use arithmetic unary/binary operators in a pipeline.
Here are all of the code examples we have seen so far:
String Processing with FunctionNode and PipeNode¶
Introduction¶
Simple examples of FunctionNode
and PipeNode
can be provided with string processing functions. While not serving any practical purpose, these examples demonstrate core features. Other usage examples will provide more practical demonstrations.
Importing function-pipe¶
Throughout these examples function-pipe will be imported as follows.
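That import, using the fpn alias seen throughout this documentation, is:

import function_pipe as fpn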
This assumes the function_pipe.py module has been installed in site-packages
or is otherwise available via sys.path
.
FunctionNodes for Function Composition¶
FunctionNodes wrap callables. These callables can be lambdas, functions, or instances of callable classes. We can wrap them directly by calling FunctionNode
or use FunctionNode
as a decorator.
Using lambda
callables for brevity, we can start with a number of simple functions that concatenate a string to an input string.
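For example (these particular lambdas are a sketch; the name a matches the function referred to later):

a = fpn.FunctionNode(lambda s: s + 'a')
b = fpn.FunctionNode(lambda s: s + 'b')
c = fpn.FunctionNode(lambda s: s + 'c')
d = fpn.FunctionNode(lambda s: s + 'd')
e = fpn.FunctionNode(lambda s: s + 'e')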
With or without the FunctionNode
decorator, we can call and compose these in Python with nested calls, such that the return of the inner function is the argument to the outer function.
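With the sketch above, a nested call looks like this:

x = e(d(c(b(a('*')))))
# x == '*abcde'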
This approach does not return a new function we can use repeatedly with different inputs. To do so, we can wrap the same nested calls in a lambda
. The initial input is the input provided to the resulting composed function.
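Continuing the sketch:

f = lambda x: e(d(c(b(a(x)))))
# f('*') == '*abcde'; f('+') == '+abcde'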
While this works, it can be hard to maintain. By using FunctionNodes, we can make this composition more readable through its linear >>
or <<
operators.
Both of these operators return a FunctionNode that, when called, pipes inputs to outputs (>>: left to right; <<: right to left). As with the lambda example above, we can reuse the resulting FunctionNode with different inputs.
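Continuing the sketch, with >>:

f = a >> b >> c >> d >> e
# f('*') == '*abcde'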
Depending on your perspective, a linear presentation from left to right may not map well to the nested presentation initially given. The <<
operator can be used to process from right to left:
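Continuing the sketch, with <<:

f = e << d << c << b << a
# f('*') == '*abcde'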
And even though it is ill-advised on grounds of poor readability and unnecessary conceptual complexity, you can do bidirectional composition too:
The FunctionNode
overloads standard binary and unary operators to produce new FunctionNodes
that encapsulate operator operations. Operators can be mixed with composition to create powerful expressions.
We can create multiple FunctionNode expressions and combine them with operators and other compositions. Notice that the initial input “*” is made available to both innermost expressions, p
and q
.
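One way such an expression could look, with p and q each receiving the initial input:

p = a >> b
q = a >> c
f = (p + q) >> d
# f('*') == '*ab' + '*ac' + 'd' == '*ab*acd'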
In the preceding examples the functions took only the value of the predecessor return as their input. Each function thus has only one argument. Functions with additional arguments are much more useful.
As is common in approaches to function composition, we can partial multi-argument functions so as to compose them in a state where they only require the predecessor return as their input.
The FunctionNode
exposes a partial
method that simply calls functools.partial
on the wrapped callable, and returns that new partialed function re-wrapped in a FunctionNode
.
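A sketch of partialing a two-argument function (the cat name anticipates the PipeNode version defined later):

cat = fpn.FunctionNode(lambda s, chars: s + chars)
f = a >> b >> cat.partial(chars='!')
# f('*') == '*ab!'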
PipeNodes for Extended Function Composition¶
At a higher level of complexity, FunctionNode expressions
can start to become difficult to understand or maintain. The PipeNode
class (a subclass of FunctionNode
) and its associated decorators makes extended function composition practical, readable, and maintainable. Rather than using the >>
or <<
operators used by FunctionNode
, PipeNode
uses only the |
operator to express left-to-right composition.
We will build on the tutorial from earlier (see Intuitively Understanding Pipelines above), and now explore more complex string processing functions using PipeNode
.
Using the function a
from before, we will instead create it as a PipeNode
, using the pipe_node
decorator.
Recall that PNs that receive fpn.PREDECESSOR_RETURN
must have a preceding PN. In our case, we want an initial PN that receives an initial input from the user. We will do this by positionally binding fpn.PN_INPUT
to the first argument.
Finally, we can generalize string concatenation with a cat
function that, given an arbitrary string, concatenates it to its predecessor return value. Since this function takes an expression-level argument, we must use the pipe_node_factory
decorator.
Now we can create a pipeline expression that evaluates to a single function f
. In order to evaluate the pipeline, recall that we must use the __getitem__
syntax with some initial input.
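The definitions below are a sketch of what such a pipeline can look like; the exact code from the original example is not reproduced here:

import function_pipe as fpn

@fpn.pipe_node(fpn.PN_INPUT)
def init(pn_input):
    # the innermost node simply forwards the initial input
    return pn_input

@fpn.pipe_node(fpn.PREDECESSOR_RETURN)
def a(s):
    return s + 'a'

@fpn.pipe_node_factory(fpn.PREDECESSOR_RETURN)
def cat(s, chars):
    return s + chars

f = init | a | cat('!') | cat('?')
f['*']  # '*a!?'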
Each node in a PipeNode
expression has access to the fpn.PN_INPUT
. This can be used for many applications. A trivial application below replaces initial input characters found in the predecessor return with characters provided with the expression-level argument chars
.
As already shown, a callable decorated with pipe_node_factory
can take expression-level arguments. With a PipeNode
expression, these arguments can be PipeNode
expressions. The following function interleaves expression-level arguments with those of the predecessor return value.
We can break PipeNode
expressions into pieces by storing and recalling results. This requires that the initial input is a PipeNodeInput
or a subclass. The following PNI
class exposes the __init__
based chars
argument as an instance attribute. Alternative designs for PipeNodeInput
subclasses can provide a range of input data preparation. Since our initial input has changed, we need a new innermost node. The input_init
node defined below simply returns the chars
attribute from the PNI
instance passed as key-word argument fpn.PN_INPUT
.
The function-pipe module provides store
and recall
nodes. The store
node stores a predecessor value. The recall
node returns a stored value as an output later in the expression. A recall
node, for example, can be used as an argument to pipe_node_factory
functions. The call
PipeNode
, also provided in the function-pipe module, will call any number of passed PipeNode
expressions in sequence.
While these string processors do not do anything useful, they demonstrate common approaches in working with FunctionNode
and PipeNode
.
Conclusion¶
After going through this tutorial, you should now have an understanding of:
How to use fpn.FunctionNode for function composition
The directionality of FunctionNode composition (i.e. >> and <<)
How to partial expression-level arguments into FunctionNode expressions
The fpn.pipe_node decorator, and when to use it
The fpn.pipe_node_factory decorator, and when to use it
How to use fpn.PipeNode for function composition
Here are all of the code examples we have seen so far:
DataFrame Processing with FunctionNode and PipeNode¶
Introduction¶
The FunctionNode
and PipeNode
were built in large part to handle data processing pipelines with Pandas Series
and DataFrame
. The following examples do simple things with data, but provide a framework that can be expanded to meet a wide range of needs.
Tutorial Data Source¶
Following an example in Wes McKinney's Python for Data Analysis, 2nd Edition (2017), these examples will use U.S. child birth name records from the Social Security Administration. This data is available from the Social Security Administration website; we will write Python code to automatically download it.
DataFrame Processing with FunctionNode¶
FunctionNode
wrapped functions can be used to link functions in linear compositions. What is passed to the nodes can change, as long as a node is prepared to receive the value of its predecessor. As before, core callables are called only after the complete composition expression is evaluated to a single function and called with the initial input.
We will use the following imports throughout these examples. The requests
and pandas
third-party packages can be installed using pip
.
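A plausible set of imports for the examples that follow (a sketch; the original listing is not reproduced here):

import os
import webbrowser
import zipfile
from collections import OrderedDict

import function_pipe as fpn
import pandas as pd
import requests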
We will introduce the FunctionNode
decorated functions one at a time. We start with a function that, given a destination file path, will download the dataset (if it does not already exist), read the zip, and load the data into an OrderedDictionary
of DataFrame
keyed by year. Each DataFrame
has a column for “name”, “gender”, and “count”. We will for now store the URL as a module-level constant.
Next, we have a function that, given that same dictionary, produces a single DataFrame
that lists, for each year, the total number of males and females recorded with columns for “M” and “F”. Notice that the approach used below strictly requires the usage of an OrderedDict
.
Given row data that represent parts of whole, a utility function can be used to convert the previously created DataFrame
into percent floats.
A utility function can be used to select a contiguous year range from a DataFrame
indexed by integer year values. We expect the start
and end
parameters to be provided through partialing, and the DataFrame
to be provided from the predecessor return value:
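A sketch of such a utility (the name year_range is illustrative):

@fpn.FunctionNode
def year_range(df, start, end):
    # .loc slicing on the integer year index is inclusive of both endpoints
    return df.loc[start:end]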
We can plot any DataFrame
using Pandas’ interface to matplotlib
(which will need to be installed and configured separately). The function takes an optional argument for destination file path and returns the same path after writing an image file.
Finally, to open the resulting plot for viewing, we will use Python’s webbrowser
module.
With all functions decorated as FunctionNode
, we can create a composition expression. The partialed start
and end
arguments permit selecting different year ranges. Notice that the data passed between nodes changes, from an OrderedDict
of DataFrame
, to a DataFrame
, to a file path string. To call the composition expression f
, we simply pass the necessary argument of the innermost load_data_dict
function.

If, for the sake of display, we want to convert the floating-point percents to integers before plotting, we do not need to modify the FunctionNode
implementation. As FunctionNode
supports operators, we can simply scale the output of the percent
FunctionNode
by 100.

While this approach is illustrative, it is limited. Using simple linear composition, as above, it is not possible with the same set of functions to produce multiple plots with the same data, or both write plots and output DataFrame
data in Excel. This and more is possible with PipeNode
.
DataFrame Processing with PipeNode¶
Building on the tutorial from earlier (see Intuitively Understanding Pipelines above), we will now explore processing dataframes using PipeNode
.
While not required to use pipelines, it is useful to create a PipeNodeInput
subclass that will share state across the pipeline.
The following implementation of a PipeNodeInput
subclass stores the URL as the class attribute URL_NAMES
, and stores the output_dir
argument as an instance attribute. The load_data_dict
function is essentially the same as before, though here it is a classmethod
that reads URL_NAMES
from the class. The resulting data_dict
instance attribute is stored in the PipeNodeInput
, making it available to every node.
We can generalize the gender_count_per_year
function from above to count names per gender per year. Names often have variants, so we can match names with a passed-in function name_match
. As this node takes an expression-level argument, we decorate it with pipe_node_factory
. Setting this function to lambda n: True
results in exactly the same functionality as the gender_count_per_year
function. Recall how we can access data_dict
from the positionally bound pni
argument.
A number of functions used above as FunctionNode
can be recast as PipeNode
by simply binding fpn.PREDECESSOR_RETURN
as the first positional argument. Recall that PNs that need expression-level arguments are decorated with pipe_node_factory
. The plot
node now takes a file_name
argument, to be combined with the output directory set in the PipeNodeInput
instance.
With these nodes defined, we can create many different processing pipelines. For example, to plot two graphs, one each for the distribution of names that start with “lesl” and “dana”, we can create the following expression. Notice that, for maximum efficiency, load_data_dict
is called only once in the PipeNodeInput
. Further, now that plot
takes a file name argument, we can uniquely name our plots.


To support graphing the gender distribution for multiple names simultaneously, we can create a specialized node to merge PipeNode
expressions passed as key-word arguments. We will then merge all those DataFrame
key-value pairs.
Now we can create two expressions for each name we are investigating. These are then passed to merge_gender_data
as key-word arguments. In all cases the raw data DataFrame
is now retained with the store
PipeNode
. After plotting and viewing, we can retrieve and iterate over stored keys and DataFrame
by accessing the store_items
property of PipeNodeInput
. In this example, we load each DataFrame
into a sheet of an Excel workbook.


These examples demonstrate organizing data processing routines with PipeNode
expressions. Using PipeNodeInput
subclasses, data access routines can be centralized and made as efficient as possible. Further, PipeNodeInput
subclasses can provide common parameters, such as output directories, to all nodes. Finally, the results of sub-expressions can be stored and recalled within PipeNode
expressions, or extracted after PipeNode
execution for writing to disk.
Conclusion¶
After going through this tutorial, you should now have an understanding of:
How to use fpn.FunctionNode to do DataFrame processing
How to use fpn.PipeNode to do DataFrame processing
Here are all of the code examples we have seen so far:
Numpy Array Processing with PipeNode¶
Introduction¶
This example will present a complete command-line program to print an equal-space, bitmap / pixel-font display of the current Python version (or, with extension, something else more useful). The display will be configurable with (1) a scaling factor and (2) a variable character to be used per pixel. For example:
% python3 pyv.py --scale=1 --pixel=*
**** * * ***** ***** ** *****
* * * * * * * * *
***** ***** *** ***** * *****
* * * * * * *
* * ***** * ***** * * *****
% python3 pyv.py --scale=2 --pixel=.
........ .. .. .......... .......... .... ..........
........ .. .. .......... .......... .... ..........
.. .. .. .. .. .. .. .. ..
.. .. .. .. .. .. .. .. ..
.......... .......... ...... .......... .. ..........
.......... .......... ...... .......... .. ..........
.. .. .. .. .. .. ..
.. .. .. .. .. .. ..
.. .. .......... .. .......... .. .. ..........
.. .. .......... .. .......... .. .. ..........
Tutorial¶
Rather than explicitly defining each character as a fixed bit map, we can use simple PipeNode
functions to define characters as pipeline operations on Boolean NumPy arrays. Operations include creating an empty frame, drawing horizontal or vertical lines, shifting those lines, selectively inverting specific pixels, and taking the union or intersection of any number of frames. Since we want to model linear pipelining of frames through transformational nodes, but also need to expose a scale
parameter to numerous nodes, we will use PipeNode
functions and a PipeNodeInput
instance rather than simple function composition.
We will use the following imports throughout these examples. The numpy
third-party package can be installed with pip
.
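A plausible set of imports (a sketch; the original listing is not reproduced here):

import argparse

import function_pipe as fpn
import numpy as np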
In order to minimize the number of function_pipe
stdout logs, we will partial in a forwarding lambda that does not print.
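A sketch of how that might look, using the fpn import above and assuming a no-op decorator is an acceptable core_decorator:

from functools import partial

# quiet variants of the decorators: the core callable is passed through unchanged
pipe_node = partial(fpn.pipe_node, core_decorator=lambda f: f)
pipe_node_factory = partial(fpn.pipe_node_factory, core_decorator=lambda f: f)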
A derived PipeNodeInput
class can specify fixed (as class attributes) or configurable (as arguments passed at initialization and set to instance attributes) parameters, available to all PipeNode
functions when called. For this example, we set a fixed frame shape of 5 by 5 pixels as SHAPE
, and expose scale
and pixel
as instance attributes.
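A sketch of such a class (the attribute handling follows the description above; PixelFontInput matches the name used at the end of this tutorial):

class PixelFontInput(fpn.PipeNodeInput):
    SHAPE = (5, 5)  # fixed unit frame shape

    def __init__(self, pixel: str, scale: int) -> None:
        super().__init__()
        self.pixel = pixel
        self.scale = scale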
Next, we define pipe_node
decorated functions (that take no expression-level arguments) for creating an empty matrix, a vertical line, and a horizontal line. The frame
function serves in the innermost position to provide an empty two-dimensional NumPy array filled with False. In the innermost position it only has access to the fpn.PN_INPUT
key-word argument. From the fpn.PN_INPUT
it can read the SHAPE
and scale
attributes to correctly construct the frame. The v_line
and h_line
functions expect a frame passed via fpn.PREDECESSOR_RETURN
, and use the scale
attribute from fpn.PN_INPUT
to write correctly sized Boolean True values in a vertical or horizontal line through the origin (index 0, 0, or the upper left corner) on that frame.
Next, we can create some transformation functions that, given a frame via fpn.PREDECESSOR_RETURN
, transform and return a new frame. The pipe_node_factory
decorated functions v_shift
and h_shift
use the NumPy roll function to shift the two-dimensional array vertically or horizontally by the steps
argument, passed via expression-level arguments. The steps
passed are interpreted at the unit level, and are thus multiplied by scale
via fpn.PN_INPUT
. As a convenience to users (and to catch an error made while developing these tools), we check and raise an Exception if we try to do a meaningless shift, such as vertically shifting a vertical line, or horizontally shifting a horizontal line. The PipeNode.unwrap
attribute exposes the core callable wrapped by the PipeNode
, permitting direct comparison regardless of PipeNode
state.
We will need at times to draw points directly, either setting a False pixel to True or vice versa. The pipe_node_factory
decorated function flip
will, given coordinate pairs in positional arguments, invert the Boolean value found. Again, we use the fpn.PN_INPUT
to get the scale
argument so coordinates can be passed at the unit level, independent of the scale.
The following pipe_node_factory
decorated functions combine variable numbers of PipeNode
instances passed via positional arguments. The union and intersect
functions perform logical OR and logical AND, respectively, on all positional arguments. The concat
function concatenates frames into a longer frame, inserting a unit-width space between frames.
We will need a function to print any frame to standard out. For this, we can create a pipe_node
decorated function that, given a frame via fpn.PREDECESSOR_RETURN
, simply walks over the rows and prints the fpn.PN_INPUT
defined pixel
when a frame value is True, a space otherwise. Since this node returns the fpn.PREDECESSOR_RETURN
unchanged, it can be used anywhere in an expression to view a frame mid-pipeline.
We have the tools now to define pipelines to produce the individual characters we need. We will define these in a dictionary, named chars
, so that we can map string characters to PipeNode
expressions, pass them to concat
, and then pipe the results to display
. For brevity, we will not define a complete alphabet. For most characters the process involves taking the union of a number of lines (some shifted) and then flipping a few pixels. The font here is based on the Visitor font:
http://www.dafont.com/visitor.font
We need a function to produce the final PipeNode
expression. The msg_display_pipeline
function, given a string message, will return the PipeNode
expression combining concat
and display
, where concat
is called with PipeNode positional arguments, mapped from chars
, for each character passed in msg
. We map the “_” character for any characters not defined in chars
.
Finally, we can define the outer-most application function, which will parse command-line arguments for pixel
and scale
with argparse.ArgumentParser
. The msg_display_pipeline
function is called with the prepared msg
string, returning f
, a PipeNode
function configured to generate and display the msg
as a banner. A PixelFontInput
instance is created with the pixel
and scale
arguments received from the command line. At last, all core callables are called with the invocation of f
with the __getitem__
syntax, passing the PixelFontInput
instance pixel_font_input
.
Conclusion¶
After going through this tutorial, you should now have an understanding of:
How to use
fpn.PipeNode
to do complex NumPy array data pipeline processing.
Here are all of the code examples we have seen so far: