# Functional Programming

It seems fashionable to state that functionable programming is better than object-oriented programming, and there are many articles and blog posts that support this view by comparing Excel or R (which are mostly functional) with VBA or Python (which are mostly object-oriented). My own interest in this subject goes deeper than just a list of pros and cons and I’d like to share some of what I have learned by looking into the rich history of functional programming, the lambda calculus that underpins the theory, and the clever solutions that make functional programming languages so powerful. Although this article dives into some technical areas (there may even be a few formulas!) I believe that anyone with an interest in logic and problem solving should learn how to “think functionally”.

## A brief history of programming

The history of computer programming is a fascinating one, and unfortunately much too involved for a single short article. But there are some key moments in history that are worth understanding to contextualise the zoo of programming languages and styles that we have today.

Humans have been using physical objects to expand their computational abilities for a very long time – that all humans at some point use their fingers as a computational aid to help add and count is quite a profound fact. Unfortunately, fingers cannot encode all computable procedures.

The first *general-purpose* computer was conceived by Charles Babbage and Ada Lovelace in 1837, called the *Analytical Engine*. It was based on an earlier design of Babbage’s to calculate the results of mathematical formulas. Babbage’s great epiphany was that he could build the machine such that it would output instructions for itself to follow at a later point – this is the idea upon which all modern digital computers are based. It was Lovelace, however, who had the insight that the machine could do more than merely calculate mathematical formulas – she conjectured that it could ultimately be configured to do anything abstract, such as write music. She also wrote down the first program for this machine and as such is widely considered to be the first computer programmer. Unfortunately, the machine was never built, and it was not until over 100 years later that an equivalent machine was created.

Throughout this apparently dormant period though, there was a golden age in theoretical computer science (even though nobody knew it at the time). Philosophers like Bertrand Russell had been investigating the properties of so-called *formal systems*. David Hilbert proposed his program to finally set all of mathematics atop a secure foundation – a goal that was later proven impossible by Kurt Gödel[1]. Moses Schönfinkel invented his system of combinatory logic (an alternative formulation of predicate logic), which was shortly thereafter rediscovered by Haskell Curry. Alonzo Church developed the *lambda calculus*, and his doctoral student Alan Turing was contemplating his famous machine and its capabilities. Though seemingly disparate endeavours, all of these had implications for what programs could be computed, and how such programs could be precisely expressed.

This is all to say that the story of modern computers, and all the technology that depends on them, is not only the story of physical machines, but also the story of the abstract formalisms developed with remarkable pace and ingenuity over the first half of the 20^{th} century.

A little more than a decade after the first universal computer was built in the early 1940s, the first commercially available programming language FORTRAN (short for Formula Translation) was developed at IBM. Around the same time, a programming language called IPL (short for Information Processing Language) was invented; this would soon be succeeded by the programming language LISP (short for List Processor). Since then, a vast number of programming languages have been created, and continue to be created every year. Although all of these are unique in their own ways, the vast majority are inspired (directly or indirectly) by FORTRAN and LISP, with remarkably few exceptions. To take two popular examples: Python fundamentally behaves a lot like FORTRAN, and R is heavily inspired by LISP. Neither one looks much like their predecessor on the surface – and both are fully versatile “multi-paradigmatic” programming languages – but each has a deep bias towards a particular mode of computation.

## What is functional programming?

As the name suggests, functional programming is about building programs out of *functions.* *Functions* simply take inputs and return outputs. Many languages will allow the programmer to write *side-effects* into functions – for example writing out a file – but here we use *function* interchangeably with the slightly more precise term: *pure function*. *Pure functions* are functions that have no such side-effects. One consequence of avoiding side-effects is that the same inputs to a pure function will always do the exact same thing.

An alternative way of thinking about *side-effects* is to think about the *state *of the machine. Turing Machines (and machines of that ilk) perform computation by using instructions to **transform the state of the machine**. Functional programs, on the other hand, compute purely by **substituting symbolic expressions**; a *functional program* is effectively a specification of what substitutions are allowed for a given set of symbols. LISP was the first language to fully realise this mode of computation – though originally much more like FORTRAN, purely functional LISP “dialects” were developed in the 1970s.

When supplied with appropriate inputs, a functional program can be “executed” by carrying out substitutions until the program is in the simplest form. If this is conjuring memories of solving algebra puzzles in school, then you’re on the right track.

It is not obvious that such a process should be just as powerful as a machine that can modify its own instructions. And indeed, a *pure function* is an abstraction that cannot really be realised in the physical world. We live in a stateful universe, and for our computers to do anything useful, they must make physical changes to the state of world (e.g. sending a signal to your screen). Nevertheless, when it comes to expressing abstract *algorithms*, both approaches are equally capable.

## Lambda Calculus

So-called functional languages are based on various *lambda calculi*, which play the role of a sort of abstract machine code.[2] The lambda calculus is an entirely novel system of logic based purely on function application. It can get very abstract – even numbers are not taken for granted in the pure lambda calculus and must be constructed in terms of functions. We won’t go into that here!

For the sake of seeing some of the magic underneath functional programming, however, it’s worth taking a look at a couple of simple lambda programs: This is a *lambda expression* which can accept an argument. The blue part is referred to as the “head” of the expression, and the red part the “body”. Applying this expression to an argument would amount to replacing all the red *f *with the supplied argument, and then discarding the blue part when we’re finished. Effectively, this expression is saying: “If you give me an *f*, I’ll apply it twice”.

In standard function notation, this would be most similar to (but not quite the same as) something like f²(x). The difference is that in standard mathematical function notation, we always have the intuition that our functions are mapping from numbers to numbers, and that computing a function always take the form of some kind of arithmetic. The pure lambda calculus doesn’t have numbers or arithmetic though! It’s lambda expressions all the way down.

Quite surprisingly, this curious system forms a Turing Complete language – which means that it can express a simulation of a Turing Machine, and therefore can express any computable program.[3]

Extremely observant readers might notice: the lambda calculus does not appear to have the facility for looping or control flow (i.e., *if…then* statements), which are both quite fundamental to most programming languages. Indeed, these are usually two of the main ingredients required for a system to be Turing Complete.

Though not immediately obvious, both can be done. We’ll start with looping.

## Looping with lambdas

Consider the following lambda expression[4]: This creature is called the *Y combinator*. This might look intimidating at first, but some squinting will reveal how it works. Suppose we feed this a function *f’*. The first step is to replace all instances of *f* (the outermost lambda term) with *f’, *and then drop the leading head. I’ll use colours a bit differently here to draw attention to the substitutions: The next step is a little harder to see: the bracketed expression on the right is the argument to the one on the left (purely by virtue of its position in the expression), so we substitute all instances of *g* in the left expression with the whole expression on the right (once again dropping the leading head and superfluous brackets), giving: You’ll see that this is just the same as the expression in the previous step, but with an additional *f’* at the front. That’s recursion! This can adequately perform the role of looping in other languages.

## Control flow

Control flow, on the other hand, is a bit more abstract (yes, more abstract). Remember that we do not have the notion of “true” or “false” out of the box in lambda calculus. So we have to do something a little strange, and come up with some working definitions: The first of these is saying “if you give me an expression, I’ll give you a new expression that returns that expression, regardless of what you give it”. The second is saying “If you give me an expression, I’ll give you a new expression that just returns whatever you give it”. This is probably not the way you usually think about true and false, but it turns out to work for all the same purposes. We can see this with the following expression: The brave reader can try substituting the above definitions of true and false for *a* in this expression and see what happens![5] You should find that it represents the familiar *=IF(logical_test, [value_if_true], [value_if_false])* from Excel.

## What about data types?

One of the interesting things about the lambda calculus is that it has many variants with slightly different properties. The one we’ve been looking at so far is *untyped*, meaning that any lambda expression can be an argument to any other lambda expression. Other variants add type constraints to the lambda calculus, which restrict this behaviour. The so-called *simply typed lambda calculi* lose their Turing Completeness, because the type of the Y combinator is not computable (and thus cannot be written down within the rules of the system). On the plus side, every program in the simply typed lambda calculus is guaranteed to eventually finish running.

Nevertheless, even the purest functional programming languages have found ways (the details of which are well beyond the scope of this article) to include type constraints whilst also maintaining Turing Completeness. The trade-off is that they must permit programs that might run forever – this is indeed a fundamental trade-off for any system that might reasonably be said to be capable of universal computation.

## Functional vs. Non-functional programming

We’ve had a little look at lambda calculus – the logical framework that underlies functional programming. But we also established that the lambda calculus does not do anything magical – it can just do anything a more traditional programming language can do. So why would we care to use it?

Fundamentally, the difference between functional and non-functional programs comes down to how we reason about them. The solutions to problems using these toolkits typically look very different, and some problems are just more naturally expressed in one system than the other. For example, consider the following examples implementing the *Quicksort *algorithm. The code on the left is a naïve implementation of *Quicksort *in the language *C*; the code on the right is a naïve implementation in the purely functional programming language Haskell [6]:

Though simple, sorting is actually quite a rich example to study the differences between functional and non-functional languages. At a glance, the functional program uses much less code, which most people (even programmers) consider a good thing. This is actually very common for quite a subtle reason – just like with looping, non-trivial data structures in functional languages need to be defined recursively, and so there is a natural marriage between recursive functions and recursive data that leads to very concise code.

The main downside of the functional algorithm is far from obvious to the uninitiated. The *C* algorithm, though certainly less pleasant to look at, is doing something very clever – it is modifying the original list of numbers *in-place. *Because *C* does not have any self-imposed constraints about being able to carry out side-effects, it is able to just shuffle around the data directly in the memory of the computer[7]. The functional solution, in contrast, has to construct an entirely new list in the correct order, because it is not allowed to do such things. This is much less efficient in terms of both speed and memory.

## The functional programming paradigm

There is nothing to stop you from writing a quasi-functional program in a non-functional language, and equally it is entirely possible to simulate a stateful machine in a functional language – the style that we write a program in (colloquially referred to as a *paradigm*) is rarely dictated by the language[8].

Having said that, I do personally favour the functional style most of the time. It is not uncommon for an otherwise simple program to become incredibly complicated with the introduction of state, and with modern hardware it is usually more important to write and maintain the program easily than it is to be able to run it as efficiently as possible. The fact that purely functional languages are kind of obscure is perhaps the main reason that the functional style is not more common.

One of the great things about reading pure functional programs is that they read like a list of definitions. For programs that run through sequential instructions, there is an ever-present anxiety that you *might *just be interpreting the program in the wrong order. The order of instructions is of fundamental concern in a non-functional program; in a functional program, any required ordering is explicit in function definitions, and beyond that, any order is equally valid (because all orders of execution will necessarily yield the same result). This generally makes for a much more pleasant reading experience.

On top of all that, actuarial work tends to be quite mathematical, and as such the functional approach would seem to be the natural choice more often than not. The fact that few actuarial tools are purely functional is the main obstacle to employing it more widely (with the notable exception of Excel[9]), but that rarely stops one from using the main ideas anyway.

## What next

This has been a very brief and high-level overview of what functional programming is, where it comes from, and why it is valuable. Although hopefully interesting in its own right, what I hope is that readers will become aware of something that they were not before, namely: there are many different programming languages which differ only in small or superficial details, but there are also fundamentally different modes of computation on offer. Becoming familiar with the functional mode doesn’t just give you a few new morsels of syntax to add to the collection – it grants you a totally new vocabulary which completely changes the way you think about certain problems.

For those interested in learning more about functional programming, I recommend trying out the *Haskell *programming language. This is the most sophisticated expression of functional programming on offer, and indeed is where many more recent functional ideas have been born. Though popular enough, it is a relatively esoteric language, intended primarily for programming language research, and as such will likely never be directly useful to your actuarial team. But there are plenty of great resources out there for beginners, and it will teach you totally new ways of thinking about programming that certainly *will* be useful to you and your team.

##### Joseph Barnett

April 2022

[1] This is a fact profound enough to warrant a footnote, even though it is not strictly relevant to the article. Colloquially, Gödel proved that any logical system capable of expressing arithmetic will contain some number of true statements which are unprovable. Whilst this is obviously consequential for mathematics and philosophy, it also less obviously puts constraints on the kinds of computer programs that can run in a finite number of steps.

[2] Machine code is the code which is expressed in terms that your computer can understand, and it’s not nice to read. Just as all legal programs can be translated to machine code, all legal functional programs can be translated to an extended form of the lambda calculus (which in turn can be translated to real machine code to run on your PC). Most languages, even functional ones, do not actually do this (opting to cut out the lambda calculus middleman), but it can be done in principle, and a few do take this approach. Regardless of how the computer “really” treats them, the lambda calculus is the theory that supports the grammar of such languages.

[3] This is a bit of a tautology – “computable” is typically taken to mean anything that a Turing Machine can do with infinite time and resources.

[4] Note that brackets don’t mean anything special, as they do for *f(x) *in standard function notation. They just indicate the precedence of function application.

[5] A word of advice for those who embark on this journey: although the pure lambda calculus does not provide arithmetic or numbers, it can be helpful to pretend they exist to test your intuition.

[6] Both examples are taken from the Haskell wiki. Rest assured, they have not deliberately made the *C* code look worse than it needs to be!

[7] This is one of the reasons that C is such a notoriously hard language to use – programmers can wreak all kinds of havoc with this power. But by the same token, expert C programmers are able to squeeze phenomenal performance out of their code.

[8] With a few notable exceptions, such as Java (which enforces OOP) and Haskell (which is necessarily purely functional). Some languages also just lack facilities that would make certain paradigms convenient. For example it is very difficult to do any sophisticated functional work in VBA.

[9] Excel formulas cannot save secret variables in some persistent location, nor can they cause side effects – they can just be evaluated, The Excel formula language is purely functional for this reason.