APR Modelling: Inheriting Actuarial Code
When researching a career as an actuary, you will often find a lot of talk about risk management and the mathematical side of actuarial work. What is often not mentioned, however, is that actuaries need strong computer skills – including the ability to write computer code.
VBA (Visual Basic for Applications) is probably the most common form of coding actuaries will be exposed to, mainly due to its integration within Microsoft Excel, which, for better or worse, is often the actuary’s program of choice. VBA allows Excel processes to be automated so that they take only a fraction of the time and user input. That said, actuaries may also be required to carry out large-scale data manipulation, requiring the use of SQL, R or SAS, or even to make code changes to proprietary modelling software such as Prophet or Tyche.
What problems are there with actuarial code?
The problem is that the background of actuaries is usually mathematical or statistical; in the early years actuarial students are bombarded with so much new information relating to actuarial exams and technical skills that they often receive little to no formal training in coding – APR being an exception!
Without proper training on best practice and alternative approaches, it’s easy for an actuary – junior or senior – to slip into bad habits. This can result in inefficient code, code which doesn’t follow accepted best practice, or even code which doesn’t fulfil its intended purpose! A lack of training can also lead to code being difficult to understand for the user or for another actuary who has picked up the code. This situation is what this article intends to discuss.
Major issues can arise through a lack of clear ownership of the code. All too often, a team’s unofficial VBA expert will write the code and develop tools, with little input from other members of the team who are perhaps less strong in their coding skills. When this person leaves, any modifications or maintenance to the code will need to be carried out by another member of the team, who was not necessarily involved in developing the tool in the first place. It is likely that this other person will have a very different style of coding or will perhaps just be less experienced in writing and understanding code.
This can make maintenance of these coded tools a difficult task, with code often becoming bloated and unwieldy over time.
APR coding projects
APR staff routinely work on coding projects, or projects where the use of coded tools for automation can improve efficiency and add value.
Some examples of the types of projects where APR staff have been involved in the coding process include:
- Writing tools in VBA to automate data checking processes
- Rewriting asset and liability models for with-profits business in VBA
- Modelling of complex assets in R
- Writing large-scale data cleansing and manipulation tools in SQL
As each of these staff return to APR from client placement, they bring with them a further wealth of knowledge and practical experience on how to build and maintain coded models and tools. In this article we try to distil that experience into a small number of key principles for successfully rewriting code. We consider the situation where a coded model must be reworked rather than started anew, and focus our tips on how best to do this.
Another APR article, dedicated to creating a model from scratch, can be found here; if you have the scope to start anew with a model (whether it is code-based or not), we would suggest that as your starting point.
Rewriting code that you haven’t seen before
Before starting, probably the first thing to think about is whether the existing structure of the code is sufficient for any developments which need to be made.
A big-picture view can be really helpful in improving the efficiency of code. A strong and elegant algorithm is almost always a more important factor in efficient code than tweaks to individual lines. However, even if the algorithm is elegant and efficient, bad coding can still make for a bad model. So the first step is to ensure the code is clearly laid out and well written, and often this can mean a little bit of rewriting before you start making any other changes.
One helpful method for rewriting code is to learn to read the code ‘backwards’. Starting from the final required outputs and tracing them back gives a clear picture of which lines of code feed into these outputs and which do not. A helpful property of code here is that it’s generally linear – code further down is executed after code further up. Using “find” functionality helps to locate the next or previous time a variable is used, and so to work out whether a variable is a ‘dead end’ or not.
Restructuring the code first can help with this process. Without necessarily changing the length or content of the code, properly structured code can be far more readable and easier to manage before beginning the process of simplification and modification. Also add comments to remind yourself later if any parts are particularly difficult to unravel.
Sometimes, the first thing to do with poorly-structured code is to simply rearrange the code using proper indentation and spacing. This can make the code more legible and can therefore help with understanding any “proper” changes which must be made to improve the efficiency of the code.
It is also useful to restructure code such that the lines of code related to one particular output are located in the same section of code. Structuring code in this manner means that anyone who will later read the code will be able to clearly see how the values of one particular output are being generated, rather than having to scroll up and down to look at different processes and functions.
Once the code has been restructured, it is helpful to start by looking at a single final output variable, tracing the other variables and calculations which fed into it. Drawing rough diagrams can help with intuition here.
These other intermediate variables can then be traced back further, and so on, until a clear picture emerges of the intention for the output variable in question. All of this related code can then be analysed in isolation to see if there are any possible simplifications or ways of improving the code’s readability.
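As a rough illustration of reading code ‘backwards’, the VBA sketch below is a hypothetical routine (the function name, variable names and formula are assumptions made for this example, not taken from a real model). Starting from the output on the last line and tracing back, Expenses is found to feed the result, while Solvency_ratio is never used again and is therefore a ‘dead end’ that could be removed.

```vba
Option Explicit

' Hypothetical routine: the only output is the function's return value.
Public Function NetAssets(ByVal Assets As Double, _
                          ByVal Liabilities As Double, _
                          ByVal Expense_ratio As Double) As Double
    Dim Expenses As Double
    Dim Solvency_ratio As Double   ' dead end: never feeds the output

    ' Feeds the output: used in the final line below
    Expenses = Expense_ratio * Assets

    ' Searching for Solvency_ratio finds no later use, so this block is redundant
    If Liabilities <> 0 Then
        Solvency_ratio = Assets / Liabilities
    End If

    ' Final output: start here and trace each right-hand-side variable back
    NetAssets = Assets - Liabilities - Expenses
End Function
```

Deleting the Solvency_ratio lines would leave the output unchanged, which is exactly the kind of simplification the tracing exercise is meant to reveal.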
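The example below shows code written to calculate the value of assets after capital gains tax has been paid. The original version performs the whole calculation in one line, overwriting the input variable:

    Assets = Assets - CGT_rate * (Assets - Assets_time_0)

The rewritten version splits the calculation into two steps and uses distinct variable names:

    Capital_gains_tax = CGT_rate * (Assets_Pre_CGT - Assets_time_0)
    Assets_Post_CGT = Assets_Pre_CGT - Capital_gains_tax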
It is not always desirable to combine too many steps of a calculation. Particularly for complex calculations, using several intermediate steps and making sure to properly annotate the code can drastically improve readability, allowing for easier debugging and maintenance. Indeed, depending on the specifics of the purpose of the code, it may be preferred to take single lines of code and split these into several lines for ease of understanding.
Here, the calculation has been split onto two rows for the benefit of the reader and the next person to review the code. There is often a trade-off between brevity and how easy the code is to read.
To further improve the readability of the code, it is also often useful to rename variables. In the example above, the single variable “Assets”, which held both the pre-tax and post-tax values, has been replaced by “Assets_Pre_CGT” and “Assets_Post_CGT”. The clearer variable names mean that understanding the code and what each variable actually represents is far easier at first glance.
It is important to note that each company is likely to have its own preferences regarding the best approach to writing and documenting code. Any new code or amendments to existing code should satisfy any requirements in terms of maintaining consistency with the client’s suite of other models.
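As a minimal sketch of how the split calculation might look once wrapped in a reusable, commented function, the VBA below keeps the variable names from the example. The function name AssetsPostCGT and the figures in the demo call are illustrative assumptions; the formula is applied exactly as written above, so the treatment of capital losses is not addressed.

```vba
Option Explicit

' Returns the asset value after capital gains tax (CGT) has been deducted.
'   Assets_Pre_CGT : asset value before tax
'   Assets_time_0  : asset value at time 0 (the base for measuring the gain)
'   CGT_rate       : tax rate applied to the gain, e.g. 0.2 for 20%
Public Function AssetsPostCGT(ByVal Assets_Pre_CGT As Double, _
                              ByVal Assets_time_0 As Double, _
                              ByVal CGT_rate As Double) As Double
    Dim Capital_gains_tax As Double

    ' Step 1: tax is charged on the gain since time 0
    Capital_gains_tax = CGT_rate * (Assets_Pre_CGT - Assets_time_0)

    ' Step 2: deduct the tax from the pre-tax asset value
    AssetsPostCGT = Assets_Pre_CGT - Capital_gains_tax
End Function

' Example call with made-up figures (result is written to the Immediate window)
Public Sub DemoAssetsPostCGT()
    Debug.Print AssetsPostCGT(120, 100, 0.2)
End Sub
```

Passing the inputs as arguments, rather than overwriting a shared variable, means the pre-tax value is still available to any later calculation that needs it.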
Know your coding platform
VBA is a particularly common platform for writing code in the actuarial world, which is the main reason we have used it as our base language in this article. However, several different coding languages and platforms are in use, which means it is likely that the actuary who wrote a tool or a piece of code was not writing in their “first language”.
This can introduce redundancies and inefficiencies into the code which could be avoided by properly knowing the coding platform.
Of course, this is easier said than done. However, there are certain tricks that can be used to help shorten code and boost the efficiency of calculations.
The example below makes use of one feature of VBA which can be used to shorten code. Of course, other languages and platforms are better suited to certain tasks than VBA, and similar savings can be made in those languages using other such tricks.
Several coding languages require that a variable be initialised when it is first introduced, otherwise running the code can result in an error. Code that initialises variables could look like the example below:
    Dim assets As Double
    Dim liabilities As Double
    assets = 0
    liabilities = 0

However, VBA does not require variables to be explicitly initialised. Any variable which has been declared with a Dim statement is automatically initialised by VBA: numeric variables are set to zero and strings to an empty string. In the above example, the two lines of code setting the initial values of assets and liabilities equal to zero are therefore redundant and can be deleted.
On one such APR project, understanding this feature of VBA allowed for the removal of hundreds of lines of unused code. This shows that it really helps to be aware of the capabilities and limitations of the language that you’re using.
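The short VBA sketch below checks this behaviour directly. The function name and the two extra variables (policyCount and productName) are hypothetical and included only to show that the same default initialisation applies to other data types.

```vba
Option Explicit

' Returns True if freshly declared variables hold their default values
' without any explicit "= 0" assignment.
Public Function DefaultsAreInitialised() As Boolean
    Dim assets As Double
    Dim liabilities As Double
    Dim policyCount As Long      ' hypothetical extra variable
    Dim productName As String    ' hypothetical extra variable

    ' Numeric variables start at zero; variable-length strings start empty
    DefaultsAreInitialised = (assets = 0) _
                         And (liabilities = 0) _
                         And (policyCount = 0) _
                         And (Len(productName) = 0)
End Function

Public Sub DemoDefaultInitialisation()
    Debug.Print DefaultsAreInitialised()
End Sub
```

One caveat when deleting such lines: a variable is initialised only once, when the procedure starts. An assignment that resets a variable part-way through a routine, for example an accumulator that is zeroed at the top of each pass through a loop, is not redundant and should be kept.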
Profiling a piece of code
A profiler, simply put, is a tool which gives the user statistics on the speed and efficiency of certain parts of code. Running a profiler on a piece of code can help identify inefficiencies such as bottlenecks or memory leaks.
Often, humans cannot intuitively understand how a computer will interpret the instructions given by code. This means that code can run more slowly than its author expects, introducing inefficiencies into the tool. Profiling can help to highlight areas of weakness in a piece of code, such as by identifying sections which are taking longer to complete than the actual complexity of the calculation suggests.
This can be useful where a calculation has been written in sub-optimal code, but where the issues in the piece of code are not immediately or intuitively obvious to a human writing or reading the code.
Profiling can be run against a whole piece of code for a big-picture view but can be as detailed as checking the efficiency of individual lines of code.
This all means that profiling is an incredibly powerful tool in highlighting inefficiencies in a piece of code.
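Where a dedicated profiling tool is not to hand, a crude form of profiling can be done by timing sections of code by hand. The VBA sketch below uses the built-in Timer function, which returns the number of seconds since midnight. The function name and the loop being timed are placeholders assumed for illustration; in practice the loop would be replaced by the section of the model under investigation.

```vba
Option Explicit

' Runs a placeholder calculation and returns the elapsed time in seconds.
' Note: Timer resets at midnight, so a run that spans midnight will give
' a misleading (negative) result.
Public Function TimeProjectionLoop(ByVal nSteps As Long) As Double
    Dim startTime As Double
    Dim i As Long
    Dim total As Double

    startTime = Timer   ' seconds elapsed since midnight

    ' Placeholder for the section of code being measured
    For i = 1 To nSteps
        total = total + Sqr(i)
    Next i

    TimeProjectionLoop = Timer - startTime
End Function

Public Sub DemoTiming()
    Debug.Print "Elapsed seconds: "; TimeProjectionLoop(1000000)
End Sub
```

Timing several sections in this way and comparing the results gives a first indication of where the bottleneck lies. Because the resolution of Timer is limited, very fast sections are best timed over many repetitions.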
Documentation and comments
After rewriting the functional code itself, real value can be added by improving the communication of the code to stakeholders and other users. This communication should cover what the code is intended to achieve and the specific mechanisms it uses to meet these objectives. By helping others to understand how the code fulfils its intended purpose, the tool can be more easily maintained; each time a piece of development is carried out, the coder will not need to carry out an in-depth investigation of how the code functions. Even if documentation feels like a big time investment, future time savings will almost always make up for it.
There are a few devices which can be used to improve the communication around a piece of code. For example, a change specification could be written to explain what changes were made and why. A code commentary document could be written to explain how each section of the code works, and perhaps why a particular calculation is being performed. This piece of work could also be used to document any areas where the code looks to be functional, but the underlying calculation looks suspicious according to actuarial understanding.
Perhaps the most useful communication tool in the coder’s arsenal is the comment: explanatory text within the code that is not part of the functional code but is there purely to impart information to people reading it. Different audiences and tools call for different approaches to commentary within the code. However, in the context of an actuarial placement on client site, where any code is likely to be handed to another party at the end of the project, making plentiful use of comments can really help the code to be understood and maintained.
Comments can be written for a variety of reasons. A comment could preface an entire block of code to explain the intention of the whole section, including brief discussion of the input variables, the output and the actuarial rationale for the intermediate calculations. Alternatively, a comment could act as a side note to a particularly complicated calculation, which either cannot be simplified or requires several lines of code to perform.
Adding comments to further understanding of a piece of code is almost never a bad idea – this really is an important way to add value to a section of code!
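The VBA sketch below illustrates both kinds of comment described above: a header block that prefaces a whole function, and a side note on an individual calculation. The function itself (PresentValue, with its parameter names and flat-rate discounting) is a hypothetical example chosen for illustration, and the header layout is just one possible convention; a client’s own standards should take precedence.

```vba
Option Explicit

'------------------------------------------------------------------
' Purpose   : Discounts a single cash flow back to time 0.
' Inputs    : cashFlow   - amount payable at the end of the term
'             annualRate - flat annual discount rate, e.g. 0.03
'             termYears  - term in years until payment
' Output    : present value of the cash flow at time 0
' Rationale : future payments must be expressed in today's money
'             before they can be compared with assets held now.
'------------------------------------------------------------------
Public Function PresentValue(ByVal cashFlow As Double, _
                             ByVal annualRate As Double, _
                             ByVal termYears As Double) As Double
    Dim discountFactor As Double

    ' Side note: compound annual discounting, v^t with v = 1 / (1 + i)
    discountFactor = (1 + annualRate) ^ (-termYears)

    PresentValue = cashFlow * discountFactor
End Function
```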
Key Points Recap
There are a multitude of tips and principles we could have chosen from at this point, but at APR we think these are some of the key things to keep in mind when building on an existing model.
- If you ever need to modify someone else’s code, take the time to work out exactly how it works before making your change. If it’s a large piece of code, understand the big picture of the whole routine and your section in detail. This will help you to modify the code more efficiently and also identify sections that are now no longer needed in light of your change.
- Structure your code so that it’s easy to read. This includes simple things like indentation, but also keeping related variables and their calculations in one place. Use subroutines, user-defined functions and classes to keep your code well-structured.
- Use profiling to help understand your code, identify the areas that are inefficient and focus any work you do to improve efficiency.
- Comment and document your code! This can be useful for other people reading your code later on, but also for yourself if you ever have to come back to code you wrote a long time ago.
- Use descriptive but concise variable names and avoid using variables to represent more than one thing within the same block of code.
- Know what your platform is capable of and always be ready to learn something new. If you’re struggling to do something efficiently, it might be that there is a simple method you’re unaware of. Don’t be afraid to ask around or search online for help!
 This article is an updated version of one we originally posted in April 2016.