Functional Programming illustrated in Python: Part 4

Monads

7 min readNov 1, 2020

From the Functional Programming illustrated in Python series

If you’re reading about functional programming, sooner or later you’re going to come across Monads. People seem to find Monads a very difficult topic to understand, so strap yourself in for a bumpy ride.

Err, no. I already covered Monads back in part 2 of this series. Move along please — nothing to see here. Go back and read part 2 again if you need to.

Actually, all I did there was write some code involving indirect application of functions to wrapped values, and I avoided using the M-word. Maybe the code contained a Monad, maybe it didn’t. The code works, and I hope you understood it. Does it even matter what it’s called? You can still enjoy a burrito, even if you don’t know that it’s called “a burrito”. It’s easier to order another one if you know the right word for it though.

Silly analogies aside: with a small amount of extra work, there’s a more general pattern waiting to pop out.

A monadic tribesman from Bosra, Syria (Photo: Jadd Haidar, vandalized by me. I apologise)

What is a Monad anyway?

It’s pretty much what you’ve seen already, using ValueAndLog as a concrete example, but more generally:

A type (i.e. class), which wraps some data
A unit function, which wraps a plain value inside an instance of this type — in other words, a constructor
A bind function, which accepts a function and a wrapped value, and returns a wrapped value. This is the bit that does the interesting work: a kind of “helper function”.

The function passed to bind must accept a plain unwrapped value, and return a wrapped value. bind can call it zero or more times.

Finally, before your code can get its Monad Licence, it also has to pass three tests, known as the Monad Laws.

Can’t you tell me more simply what a Monad is?

Well, if I had to summarise in one sentence, I’d say something like: “Monads are a way of composing functions in a continuation-passing style.”

But that doesn’t tell you what a Monad is, only what it’s normally used for.

It’s like if you asked me what a burrito is, and I said “it’s something that you eat”. That doesn’t tell you what a burrito actually is, still less how to make one, but at least it gives you an idea of what you are supposed to do with it.

What are the Monad Laws?

They are¹:

Left identity: unit(a) >> f is the same as f(a)
Right identity: m >> unit is the same as m
Associativity²: (m >> f) >> g is the same as
m >> (lambda x: (f(x) >> g))

where a is any unwrapped value, m is any wrapped value (an instance of the class in question), and f and g are arbitrary functions from unwrapped to wrapped.

You should be able to prove those properties, and this is a straightforward job for our ValueAndLog class. I’ll do the first one for you:

unit(a) returns ValueAndLog(a, "")

Next evaluate ValueAndLog(a, "") >> f, by stepping through the definition of the bind function:

    def __rshift__(self, func):
        result = func(self.value)
        return ValueAndLog(result.value, self.log + result.log)

On entry, self is ValueAndLog(a, "")

The first line calculates result = func(self.value), which is func(a). This must return some instance of our class, say ValueAndLog(x, y)

The second line calculates a new ValueAndLog. The first part is result.value (which is x), and the second part is ""+result.log, which is ""+y which in turn is just y.

Therefore it returns ValueAndLog(x, y), which is the same as func(a). QED.

The others are similarly straightforward to prove, and therefore the ValueAndLog class is in fact a Monad.

People who don’t like proofs have unit tests instead:

Now, could I have written the ValueAndLog class so that it had similar functionality, but didn’t qualify as a Monad? Certainly. For example, perhaps it crossed your mind that the bind function could add the trailing newline onto log messages automatically:

def __rshift__(self, func):
    result = func(self.value)
    return ValueAndLog(result.value, self.log + result.log + "\n")

That now breaks the Monad Laws — try re-running the unit tests with this modification. Also it changes the behaviour of our sample program in an important way. Remember that the original function g didn’t output a log message? When we use this bind function, we find that calling g adds a blank line to the logs. Oops.

What’s the point of the Monad Laws?

Here’s a hand-waving answer: if a type fulfils all the requirements for being a Monad, then it joins a club. Members of this club share a higher-order concept which can be manipulated in its own right, much as higher-order functions manipulate functions. For example, you can write a Monad Transformer which acts on a Monad (any Monad).

Another hand-waving answer: Monads which follow the laws “do what you expect” when you make a chain of bind operations.

If it helps, you could also think of Monads as sharing a common superclass, with the Monad Laws giving the behaviour which you can depend on in all its subclasses.

class Monad:
    ...class ValueAndLog(Monad):
    ...

In Python, there is some benefit in doing this. For example, the lift function is the same for all Monads, so it could live in the superclass:

class Monad:
    @classmethod
    def lift(klass, func):
        return compose(klass.unit, func)

Monads as a flow-control structure

You’ve seen this before:

Value(x) >> f >> g >> h

At each step, the bind operator >> takes a wrapped value on the left, and applies the function on the right to the unwrapped value.

Except, it doesn’t have to. bind is its own master, and can make its own decisions about what to do next. For particular input values, bind could decide to return a value of its own and not call the right-hand side at all. This then terminates the left-to-right flow entirely, and the value of the entire expression becomes what was chosen at that point. This gives a way to short-circuit evaluation, returning early — such as returning an exception if something goes wrong.

Or bind could decide to call the right-hand side multiple times (with different arguments) and combine the results.

This is very powerful. Switching to Haskell for a moment, because its syntactic sugar is difficult to show in Python:

add :: Maybe Int -> Maybe Int -> Maybe Int
add mx my = do
  x <- mx
  y <- my
  return (x + y)

Maybe Int wraps an Int value. The Maybe wrapper carries either a value of that type, or the special value Nothing meaning “no value”. It’s used to capture the idea of a value which may or may not be present, in a strongly typed language.

The first line says: “add is a function, which takes two Maybe Int values and returns a Maybe Int value”. The second line starts the function definition, where mx and my are the parameters to the function.

x <- mx unwraps the value from the first argument, and y <- my unwraps the value from the second argument (see the article on Assignment), and the last line calculates the sum and wraps it: return in Haskell is what we’ve called unit until now.

But what happens if one of the arguments is Nothing?

The <- syntax hides the bind operation which takes place from one line to the next. Let me write the bind¹ operation >> sideways:

  x <- mx
     v
     v
  y <- my
     v
     v
  return (x + y)

(This is only an approximate representation. The x really appears on the right-hand side of a bind operation, as mx >> (lambda x: ...). The article on Assignment explains the exact transformation)

But in essence, if the expression x <- mx cannot provide an unwrapped value — because the wrapped value is Nothing — then the bind operation terminates then and there, returning Nothing and not calling the subsequent lines. The same for y <- my. Therefore we’ll only get to the addition if two not-Nothing values are available.

Those few lines of Haskell are expressing something rather powerful. It’s like being able to add your own control structures to the language.

Writing in idiomatic Python, you’d have to remember to check the arguments explicitly, or risk a crash:

def add(x, y):
    if x is None:
        return None
    if y is None:
        return None
    return x + y

You may have come across a similar problem before in object-oriented programming, when chaining methods:

    foo.bar().baz().qux()

What happens if one of those methods returns None? Your program blows up:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'NoneType' object has no attribute 'qux'

Fixing it is ugly — here’s one way:

a = foo.bar() if foo is not None else None
b = a.baz() if a is not None else None
c = b.qux() if b is not None else None

Some languages have special operators just for this use case. There was even a proposal to do this for Python too, adding complexity. The Maybe monad means you don’t need it in Haskell.

But the main reason you care about Monads

… is because functional code can cleanly interact with non-functional, stateful code (such as calling the operating system for input and output), when the non-functional code has a Monad veneer slapped on top of it.

That’s the subject of the next article.

Acknowledgements and further reading

Recommended non-reading

…and while addition and multiplication are both monoids over the positive natural numbers, a monad is a monoid object in a category of endofunctors: return is the unit, and join is the binary operation. It couldn’t be more simple. If that confuses you, it might be helpful to see a Monad as a lax functor from a terminal bicategory.

(from Haskell wiki, What a monad is not)

As an engineer rather than a mathematician, that makes my eyes glaze over. Maybe one day I’ll take it apart to understand each of the pieces.

[1] Please remember this is Python not Haskell. I’m using >> for the bind operator where Haskell uses >>=. Unfortunately, Haskell uses >> for something similar, but not the same (bind where the value is discarded).

[2] That doesn’t look like the normal mathematical definition of associativity, but under the surface it is. For the technical description, see the Haskell wiki. If you rewrite f as lambda x: f(x) in the first line then the symmetry may be more obvious.