Functional Programming illustrated in Python: Part 3

Assignment as parameter binding

From the Functional Programming illustrated in Python series

Consider the following sequence of computations, using assignment to temporary variables:

`v0 = 4v1 = v0*v0v2 = 2*v1v3 = 1+v2print(v3)`

Can it be done in a purely functional form, without `var = expr` assignment? It certainly can.

Binding to function parameters

When we invoke a function, its parameters are set (“bound”) to the arguments passed. For example:

`def myfunc(x):    ... more stuffmyfunc(5)`

During the execution of `myfunc` x is bound to the value 5. Eliminating the named function using a lambda, and split over multiple lines for clarity, this becomes:

`(lambda x:    ... more stuff)(5)`

That provides the trick: define a function and invoke it immediately, to bind its parameter(s) to the supplied argument(s). It gives the transformation:

`var = expr                           (lambda var:... more stuff            ⟾             ... more stuff                                     )(expr)`

Now let’s apply that, one line at a time, to the code posted at the start of this article. First, `v0 = 4`

`(lambda v0:    v1 = v0*v0    v2 = 2*v1    v3 = 1+v2    print(v3))(4)`

Unfortunately, at this point this is no longer runnable Python code, because Python does not permit assignments in the body of a lambda (only an expression). Never mind, let’s press on with the next one:

`(lambda v0:    (lambda v1:        v2 = 2*v1        v3 = 1+v2        print(v3)    )(v0*v0))(4)`

Still invalid, again:

`(lambda v0:    (lambda v1:        (lambda v2:            v3 = 1+v2            print(v3)        )(2*v1)    )(v0*v0))(4)`

Once more:

At last, the code runs, and does what was expected.

With `print(v3)` right inside, this is all impure code. No matter; this can be fixed by returning the value of v3, instead of printing it, and moving the print to the outside.

Now the code inside doing the calculation is pure, and only the outer `print` is impure.

Because the lambdas are nested, all the previously-bound values are available in inner lambdas. Suppose we wanted to calculate x²(2x²+1). We already have x² in v1, so this would be

`v3 = v1 * (1+v2)`

That works just fine:

Observe that the nested lambdas also capture the idea of “dependencies” or “sequencing”. An expression which uses the value of v1 depends on that value having already been calculated. In impure Python, that comes from the rule that you must execute statements strictly in the order that they are written. In the lambda form, such a rule is not needed, because the arguments of a function must be evaluated before the function can be applied to them.

`(lambda v1:    ... expr which uses v1 ...)(... expr which calculates v1 ...)`

Code transformation

The code we’ve ended up with, unfortunately, is horrible to read: the expression calculating the value to bind to a variable is now far-removed from the place it is used.

In practice, you wouldn’t write code like this yourself. Most functional languages¹ provide “syntactic sugar” along the lines of

`LET var = expr... more stuff`

which they mechanically translate into the equivalent, but inconvenient, form shown above, as part of compiling your code.

Can I show that in Python? Not easily. To do this, I would have to write some Python code which reads a Python program into some intermediate form (such as an Annotated Syntax Tree), applies the transformation, and then converts the AST back into Python code. This sort of thing is easy to do in a language like LISP though, where the program itself is just a nested list value.

Left-to-right application

There is, however, another approach which is available directly in Python.

Suppose we could put the argument to a lambda in front of the lambda body, so that

`(lambda x:    ...)(4)`

became something like

`4 passed_to (lambda x:    ...)`

That would keep the value and the use of the value close to each other.

You’ve seen how to do this in Python already, in part 1. The trick is to wrap the value in some other class (say `Value`) and then define the `>>` operator on that class for applying the function on the right to that value.

`x = 4... more code`

then becomes:

`Value(4) >> (lambda x:    ... more code)`

Transforming the original program this way gives:

Again, feel free to move the `print` outside, so that it becomes pure.

What you see here is an example of the “continuation passing” style of programming. The left side of `>>` calculates a value; the right side of `>>` is some function (the “continuation”) which says what to do next with that value. The steps are composed, so that the outermost continuation represents “all the subsequent steps” to be applied.

Although the values are wrapped, in each line it’s the plain (unwrapped) value which is bound to a variable. That’s good, because it avoids the need to do arithmetic on wrapped values.

Of course, our “assignment” is still back-to-front. Without changing the code at all, I can add some comments, showing the intention of each line:

Assignment with compound values

In part 2 I introduced a composite type, `ValueAndLog`, which held two things: a value (the result of a computation) and a log message — forming a sort of “side channel” of log messages.

We ended up with a “bind” operator, `>>`, which passed along the value of a computation step, and combined that step’s log output with the previously accumulated log messages.

`wv3 = ValueAndLog.unit(4) >> f >> ValueAndLog.lift(g) >> h`

Suppose we preferred writing that in an assignment style. As a first step, we could try assigning the intermediate values from each step:

`wv0 = ValueAndLog.unit(4)wv1 = wv0 >> fwv2 = wv1 >> ValueAndLog.lift(g)wv3 = wv2 >> h`

I have called the variables `wv0` (etc) to emphasise that these are wrapped, composite values.

Then, putting the code for functions f, g and h inline, the result is:

`wv0 = ValueAndLog.unit(4)wv1 = wv0 >> (lambda x: ValueAndLog(x*x, "Squared %d\n" % x))wv2 = wv1 >> ValueAndLog.lift(lambda x: 2*x)wv3 = wv2 >> (lambda x: ValueAndLog(1+x, "Added 1 to %d\n" % x))`

For simplicity and consistency, I will expand out the `ValueAndLog.lift` (which if you remember, was just to make function g wrap its plain result in a ValueAndLog)

`wv0 = ValueAndLog.unit(4)wv1 = wv0 >> (lambda x: ValueAndLog(x*x, "Squared %d\n" % x))wv2 = wv1 >> (lambda x: ValueAndLog.unit(2*x))wv3 = wv2 >> (lambda x: ValueAndLog(1+x, "Added 1 to %d\n" % x))`

(Runnable code here)

Now, that’s not pretty. The unwrapped values are available as x inside each lambda; unfortunately, each lambda only has access to a single unwrapped value. If I wanted to do a single computation which used both v1 and v2, I’d need some way to unwrap them both.

I can rename the arguments:

`wv0 = ValueAndLog.unit(4)wv1 = wv0 >> (lambda v0: ValueAndLog(v0*v0, "Squared %d\n" % v0))wv2 = wv1 >> (lambda v1: ValueAndLog.unit(2*v1))wv3 = wv2 >> (lambda v2: ValueAndLog(1+v2, "Added 1 to %d\n" % v2))`

But that doesn’t help; the value of v1 (say) is still only available inside its own lambda.

Doing the `LET` transformation doesn’t help much either. It makes the wrapped values wv0, wv1 etc available, but not the unwrapped ones. The calculations don’t need the wrapped values at all.

As usual, there is a trick. Return to the chain form without intermediate assignments:

`wv3 = (    ValueAndLog.unit(4) >>    (lambda v0: ValueAndLog(v0*v0, "Squared %d\n" % v0)) >>    (lambda v1: ValueAndLog.unit(2*v1)) >>    (lambda v2: ValueAndLog(1+v2, "Added 1 to %d\n" % v2)))`

At this point, it is possible to nest the lambdas so that bound values are available in subsequent steps:

`wv3 = (    ValueAndLog.unit(4) >> (lambda v0:        ValueAndLog(v0*v0, "Squared %d\n" % v0) >> (lambda v1:            ValueAndLog.unit(2*v1) >> (lambda v2:                ValueAndLog(1+v2, "Added 1 to %d\n" % v2)            )        )    ))`

This gives the same structure as the continuation passing form seen before: instead of `Value(...)` we have `ValueAndLog(...)`.

The complete code is:

This form turns out to be so useful that Haskell provides a special syntactic sugar for it, the “do” block. Inside a “do” block, expressions are transformed as follows (except I’m showing Python on the right, not Haskell²):

`do                               expr >> (lambda x:  x <- expr             ⟾          do ...more code...  ...more code...                )`

Remember that expr is a composite (wrapped) value, whereas x is a plain (unwrapped) value. Those are the rules for the bind operator `>>`: it extracts the value from the wrapper before invoking the lambda.

“do” blocks can also contain plain `LET` syntactic sugar:

`do                               (lambda x:  let x = expr          ⟾          do ...more code...  ...more code ...               )(expr)`

In this case, both expr and x are plain unwrapped values. You can see that no bind is taking place in the transformed version.

Adding the `LET` syntax simplifies the code, as it lets you work with plain values without having to wrap them only to unwrap them again. The example becomes:

Again, it’s much more convenient to write the code shown in the comments on the right, and have it translated automatically to the code on the left. In a nutshell, that’s what happens with Haskell’s `do` block under the covers.

No rebinds

The special form `LET x = 1` is, however, not the same as Python’s `x = 1`. The latter is an assignment to a variable, and the variable can have its value changed subsequently, such as `x = 2` or even `x = x + 1`. That’s not permitted in functional code³.

The variable is bound once, as the parameter to the lambda. If you want to change it, you have to call the lambda again with a new argument. This may mean that a function needs to call itself, recursively. That’s a topic for another time.

Instead, the next article introduces a word which begins with the letter “M”.

[1] Even BASIC, the first programming language I learned, and decidedly non-functional, has `LET X=1`. But in BASIC that’s assignment to a mutable variable.

[2] Haskell uses `>>=` for bind and `\x -> …` for lambda.

[3] You could write:

`LET x = 1... do stuffLET x = 2... do more stuff`

but the second x would be a completely different variable, which just happens to have the same name. The original variable still exists but is inaccessible. This is called “shadowing” the variable.