Functional Programming illustrated in Python: Part 2

Function composition with wrapped values

From the Functional Programming illustrated in Python series

In the previous article, I showed three functions f, g and h, applied in that order to a value. There were several ways to achieve that. One was to nest them, giving “inside out” evaluation:

result = h(g(f(v0)))

Another was to assign to intermediate variables, so that the program flows top-to-bottom:

v1 = f(v0)
v2 = g(v1)
v3 = h(v2)

And a third was to wrap the value, where the wrapper class has an operator which applies the function on the right to the value on the left.

result = Value(v0) >> f >> g >> h
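As a reminder of how that third form works, here is a minimal sketch of such a wrapper. The class body and the placeholder functions f, g and h are assumptions made for illustration; the version in the previous article may differ in detail.

```python
class Value:
    """Hypothetical minimal wrapper around a plain value."""

    def __init__(self, value):
        self.value = value

    def __rshift__(self, func):
        # Apply the function on the right to the wrapped value on the left,
        # then re-wrap the result so that the chain can continue.
        return Value(func(self.value))


# Placeholder functions, assumed for illustration only.
def f(x):
    return x * x


def g(x):
    return x + 1


def h(x):
    return x * 2


result = Value(4) >> f >> g >> h
print(result.value)  # same answer as h(g(f(4)))
```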

Returning logs

I am going to add a new requirement to the example application: inside two of the functions, f and h, I want to generate some log messages so I can understand what’s going on inside. I want the combined log output from those functions returned at the end.

In regular Python, I could just add print() calls, and the logs would be output as a side effect of the function execution. But we’re trying to see how this can be done using pure functions only.

With a pure function there is really only one choice: we must return the log message as part of the function’s return value. This means our function f could look something like this:

def f(x):
    return (x*x, "My log message")

Instead of returning just a value, it now returns a value and a message, as a tuple (a Python data structure which, in this case, contains two values). So far, so good.

But now we have other problems. Firstly, we can’t pass this value directly to function g; it expects only a single value as its input. We can no longer use the direct h(g(f(v0))) form. Let’s switch to the variable assignment form for now, so we can unpick the tuples.

Secondly, function h is also going to produce a log message. Somewhere in our program we’ll need to combine the two log messages from f and h so that we can return the complete set of logs to the user at the end.

Doing this messes up the flow of our main program, which instead of just passing values along, has to deal with ad-hoc log plumbing:
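A minimal sketch of that plumbing might look like the following. The bodies of g and h, the starting value and the wording of the log messages are assumptions chosen for illustration.

```python
# f and h return (value, log) tuples; g still returns a plain value.
def f(x):
    return (x * x, "f was called. ")


def g(x):
    return x + 1


def h(x):
    return (x * 2, "h was called. ")


v0 = 4
v1, log1 = f(v0)    # unpack the value and f's log
v2 = g(v1)          # g has no log to unpack
v3, log3 = h(v2)    # unpack the value and h's log
log = log1 + log3   # combine the logs by hand
print(v3, log)
```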

Yuk. Our main flow has been disrupted, and even in such a simple program, it’s now very hard to see how the steps connect together. There must be a better way.

Passing a compound value

As a first attempt to get back to the original simple program logic, let’s try making functions f, g and h all accept and return a tuple of value and log. This means we should be able to plug them together in the simple way that we did before.
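One way this could be sketched, again assuming placeholder bodies for g and h and made-up log messages:

```python
# Every function now accepts and returns a (value, log) tuple.
def f(data):
    x, log = data
    return (x * x, log + "f was called. ")


def g(data):
    x, log = data
    return (x + 1, log)   # nothing to log, but the log must be passed on


def h(data):
    x, log = data
    return (x * 2, log + "h was called. ")


v0 = (4, "")   # start with an empty log
v1 = f(v0)
v2 = g(v1)
v3 = h(v2)
print(v3[0], v3[1])
```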

Well, that’s… different. Our program’s main logic is simple, but now the functions f, g and h are complex. Each function takes a tuple as its input. It has to split the tuple into the value and the log, and then calculate the answer, and append its log message onto any existing log passed in. Function g, which doesn’t even generate a log message, still had to be modified to extract the log and pass it on unchanged. And at the end we have to extract the components of the tuple as v3[0] and v3[1].

It can be made more Pythonic by using a real class, instead of a tuple:
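A sketch of the class-based version, under the same assumptions about g, h and the log messages; the attribute names value and log match those used in the rest of this article.

```python
class ValueAndLog:
    def __init__(self, value, log):
        self.value = value
        self.log = log


def f(data):
    return ValueAndLog(data.value * data.value, data.log + "f was called. ")


def g(data):
    return ValueAndLog(data.value + 1, data.log)   # log passed on unchanged


def h(data):
    return ValueAndLog(data.value * 2, data.log + "h was called. ")


v0 = ValueAndLog(4, "")
v1 = f(v0)
v2 = g(v1)
v3 = h(v2)
print(v3.value, v3.log)
```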

That’s a little clearer, but still verbose.

Factoring out the common logic

There are a couple of things which are now common to f, g and h:

  1. Extracting the value from the input. It’s a pain to write “data.value” instead of just the original parameter “x”.
  2. Appending the output log message to the input log message.

In fact, the input log value is not of any interest to these functions at all — they don’t care what logs have been generated by previous functions — except in order to be able to append their own logs.

So let’s rewrite these functions so that they only take a single value as input, and return just their own log message, then move the logic which combines the log with the previous log into a helper function, called bind. This is what we end up with:
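Here is a sketch of that arrangement. The two lines in the body of bind are the ones examined in this section; the bodies of g and h and the log messages are assumptions for illustration.

```python
class ValueAndLog:
    def __init__(self, value, log):
        self.value = value
        self.log = log

    @staticmethod
    def bind(func, data):
        # Call func on the plain value, then merge the old and new logs.
        result = func(data.value)
        return ValueAndLog(result.value, data.log + result.log)


# Each function takes a plain value and returns only its own log.
def f(x):
    return ValueAndLog(x * x, "f was called. ")


def g(x):
    return ValueAndLog(x + 1, "")   # no log message of its own


def h(x):
    return ValueAndLog(x * 2, "h was called. ")


v0 = ValueAndLog(4, "")
v1 = ValueAndLog.bind(f, v0)
v2 = ValueAndLog.bind(g, v1)
v3 = ValueAndLog.bind(h, v2)
print(v3.value, v3.log)
```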

Look at functions f, g and h first. Those are pretty simple now, huh?

We have this new function called bind. Its body consists of only two lines, but it’s worth stepping through carefully to understand it fully.

The input to bind is two values: a function, and a composite value (that is, a ValueAndLog instance).

The first thing it does is to extract just the value part:

              data.value
              ^^^^^^^^^^

… and pass it as an argument to the provided function (which could be one of f, g or h)

result = func(data.value)
^^^^^^^^^^^^^^          ^

This means that f, g and h are relieved of the need to extract the value themselves. They just receive a plain value, not a wrapped ValueAndLog.

The second line builds a ValueAndLog result. The value part is just the value part of the ValueAndLog returned by the function we just called:

return ValueAndLog(result.value, data.log + result.log)
                   ^^^^^^^^^^^^

and the log part is the log from the original input to bind, concatenated with the log output from the function just called.

return ValueAndLog(result.value, data.log + result.log)
                                 ^^^^^^^^^^^^^^^^^^^^^

Put another way: bind applies the given function to the given value, but performs pre-processing of the argument to the function and post-processing of the result from the function.

Figure: “bind” takes two arguments, a function and a composite value

That’s really quite clever, for two lines of code. I think this happens a lot with functional programming: a small amount of code can encapsulate something quite deep. You end up staring at it for a while, or taking it to pieces like I just did above, until you finally understand it, and then fit this piece into the larger jigsaw of your code.

Why is this function called bind? You can think of it as an adapter. It binds a function which only takes a simple value parameter, to an argument which is a composite value (ValueAndLog). And it combines the composite value with the function’s result.

Personally, I like to think of bind as indirect function application. Given a function and a value, it applies the function to the value but with some extra processing.

In the above code, bind is a static method¹ of ValueAndLog, which means it’s really just a plain old function inside the ValueAndLog namespace. You don’t call it on any particular instance of ValueAndLog, you just call it with two values. It could have been made a top level function; it’s just neater to keep it inside ValueAndLog, since its functionality is very much specific to logging.

Chaining the functions (again)

The remaining problem is that the main program is still not exactly pretty:

v0 = ValueAndLog(4, "")
v1 = ValueAndLog.bind(f, v0)
v2 = ValueAndLog.bind(g, v1)
v3 = ValueAndLog.bind(h, v2)

This can now be written as nested function calls, and I’ll leave that as an exercise for the reader (hint: where you see v2 as an argument, substitute the expression that calculates v2, and so on).

But it’s messier: instead of calling f(v0) we are calling bind(f, v0), which applies f to v0 indirectly.

Think back to the previous article. We were able to chain values left to right, using a wrapper around the data value. Can we do the same here? Yes! In fact we already have a suitable wrapper, ValueAndLog itself.

All we need to do is to define the >> operator. But instead of just applying the function on the right directly to the value on the left, it has to do the bind pre- and post-processing as well.

The result:
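A sketch of ValueAndLog with the >> operator defined, using the same assumed bodies for g and h and the same made-up log messages as before:

```python
class ValueAndLog:
    def __init__(self, value, log):
        self.value = value
        self.log = log

    def __rshift__(self, func):
        # Same logic as bind, with self playing the role of data.
        result = func(self.value)
        return ValueAndLog(result.value, self.log + result.log)


def f(x):
    return ValueAndLog(x * x, "f was called. ")


def g(x):
    return ValueAndLog(x + 1, "")


def h(x):
    return ValueAndLog(x * 2, "h was called. ")


result = ValueAndLog(4, "") >> f >> g >> h
print(result.value, result.log)
```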

The arguments have been swapped around, and the __rshift__ function is an instance method, where “self” is the ValueAndLog object on the left-hand side — but otherwise it is the same as the bind function before. This is starting to look pretty neat.

Tidying up

There are still some small improvements to be made.

Firstly, the ValueAndLog(4, "") which starts the chain is a bit messy: it exposes the plumbing, in particular what the initial value of an empty log is. We can fix this by making a new function, called unit², which takes a plain value and wraps it in a ValueAndLog.

class ValueAndLog:
    @staticmethod
    def unit(v):
        return ValueAndLog(v, "")

Secondly, we still had to modify function g, which doesn’t do any logging, to return an empty log message. To avoid that, it would be good to have a way to convert a function which returns a plain value, into one which returns a value with an empty log. This function is called lift. Here’s a direct way to write it:

class ValueAndLog:
    @staticmethod
    def lift(func):
        return lambda val: ValueAndLog(func(val), "")

You can read this as: “ValueAndLog.lift(func) returns a new function which, when called with some argument val, invokes func with the same val argument, then wraps the result in a ValueAndLog with an empty log.”

Figure: Comparing g with lift(g)

However, there’s a simpler way to write this. We already have a function which takes a raw value and wraps it in ValueAndLog: it’s called unit. We can compose our function with unit to cause it to wrap its output. That is, lift(g) is just compose(unit, g), or in Haskell, unit . g.

Figure: lift(g) built as compose(unit, g)

Here’s the final, tidied up version: notice how g is completely untouched by the logging.
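A sketch of how the tidied-up program could look. The compose helper is written out here so that the listing is self-contained; its exact form, the bodies of g and h and the log messages are assumptions for illustration.

```python
def compose(outer, inner):
    # compose(outer, inner)(x) is outer(inner(x))
    return lambda x: outer(inner(x))


class ValueAndLog:
    def __init__(self, value, log):
        self.value = value
        self.log = log

    def __rshift__(self, func):
        result = func(self.value)
        return ValueAndLog(result.value, self.log + result.log)

    @staticmethod
    def unit(v):
        return ValueAndLog(v, "")

    @staticmethod
    def lift(func):
        return compose(ValueAndLog.unit, func)


def f(x):
    return ValueAndLog(x * x, "f was called. ")


def g(x):
    return x + 1   # a plain function that knows nothing about logging


def h(x):
    return ValueAndLog(x * 2, "h was called. ")


result = ValueAndLog.unit(4) >> f >> ValueAndLog.lift(g) >> h
print(result.value, result.log)
```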

Types

A brief note about types.

Python is a dynamically typed language, which means our ValueAndLog class can hold values of any type — int, float, str, whatever. (Try it: for example, change function f to take int and return str, and function g to take str and return int).

Some other languages are stricter. In those, the class might be called ValueAndLog A or ValueAndLog[A], which means “a class which holds a value of some type A, and a log”. A specific instance of that type would then be, say, ValueAndLog[int] if it carries an int, and it would be an error to try to put anything other than an int in it. In such languages, the compiler can help catch logic errors in your program, before it runs.

Even in a statically typed language, you can make a chain where the type changes along the way. You could have function f which takes int and returns ValueAndLog[str], and function g which takes str and returns ValueAndLog[int].
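Python’s optional type hints can express the same idea. The following is only a sketch: the annotations are not enforced when the program runs, so catching a mismatch would need a separate static checker (mypy is one example). The bodies of f and g are made up for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Generic, TypeVar

A = TypeVar("A")
B = TypeVar("B")


@dataclass(frozen=True)
class ValueAndLog(Generic[A]):
    value: A
    log: str

    def __rshift__(self, func: "Callable[[A], ValueAndLog[B]]") -> "ValueAndLog[B]":
        result = func(self.value)
        return ValueAndLog(result.value, self.log + result.log)


def f(x: int) -> ValueAndLog[str]:
    # int in, str out
    return ValueAndLog(str(x * x), "f made a str. ")


def g(s: str) -> ValueAndLog[int]:
    # str in, int out
    return ValueAndLog(len(s), "g made an int. ")


result: ValueAndLog[int] = ValueAndLog(4, "") >> f >> g
print(result.value, result.log)
```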

Further uses of this pattern

Although so far it’s been all about logging (and it was a lot of work just for some logs!), this is an example of a more general composition pattern which consists of:

  • A type (class) which wraps some data
  • A “unit” function, which wraps a plain value inside an instance of this type
  • A “bind” function, which selectively applies another function to the unwrapped value, and manipulates that function’s return value³

It turns out there are a whole bunch of other ways to use this pattern, each with its own (class + unit + bind). You could almost say that they form a “category” of their own. Examples include:

  • Dealing with nil or missing values. If the input value is not nil, then the bind method calls the provided function. If the input value is nil, then the bind method skips the function entirely, and returns nil.
  • Dealing with errors. The return value from a function can be either a value or an error message. If the value passed along the chain to bind is an error, then the error propagates directly to the output.
  • Managing shared state. A state value is passed along the chain, and each function says how it wants to update that state

And many others.
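To make the first of these concrete, here is a minimal sketch of the missing-value case. The class name Maybe and the functions reciprocal and double are hypothetical, chosen only for illustration; None stands in for nil, and >> plays the role of bind.

```python
class Maybe:
    def __init__(self, value):
        self.value = value   # None means "missing"

    @staticmethod
    def unit(v):
        return Maybe(v)

    def __rshift__(self, func):
        # bind: if there is no value, skip the function and stay missing.
        if self.value is None:
            return self
        return func(self.value)


def reciprocal(x):
    # Has no sensible answer for zero, so it reports a missing value.
    return Maybe(None) if x == 0 else Maybe(1 / x)


def double(x):
    return Maybe(x * 2)


ok = Maybe.unit(4) >> reciprocal >> double
missing = Maybe.unit(0) >> reciprocal >> double
print(ok.value, missing.value)
```

Once reciprocal reports a missing value, double is never called, and the missing value flows through to the end of the chain.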

The next article takes a detour to look at how assignments are done in functional languages.

Acknowledgements and further reading

This article was heavily inspired by:

[1] If you prefer class methods, then it could be written instead as

class ValueAndLog:
    @classmethod
    def bind(klass, func, data):
        result = func(data.value)
        return klass(result.value, data.log + result.log)

[2] In Haskell this function is called return, but return is a reserved keyword in Python. In any case, it’s just a regular function.

[3] We also mentioned lift(f), but that’s always just compose(unit, f).