Python in Excel – an Analyst’s Dream?
There’s no doubt that Python and Excel are two of the most popular tools in an actuary’s toolkit. Python is one of the most popular programming languages in the world, with a wealth of packages offering powerful functionality for working with data; Excel is ubiquitous when it comes to spreadsheets, with near-unparalleled familiarity and ease.
The announcement by Microsoft of the official integration of Python into Excel – something previously only possible through third-party packages such as xlwings – is likely to have many contemplating the actuarial applications. In this article, we aim to explore a few of these potential uses and put this new functionality through its paces.
Exploration
To start using Python in Excel, you’ll need to be in Microsoft’s ‘Beta Channel’. If you can see ‘Python in Excel’ in the ribbon (under the ‘Formulas’ tab), then you should be good to go.
If you can’t see this in the ribbon, you may just need to make it visible (try File > Options > Customise Ribbon), or you may not yet have been granted access to the feature.
A great way to get started is with the ‘Try Python samples’ option, which provides a number of examples of various data analysis tasks using Python in Excel. Clicking any of these samples will create a new sheet in your Excel workbook containing the appropriate data and Python code.
To start writing code, type =PY() or use the “Insert Python” button from the ribbon. Python code can then be written directly in the formula bar. A few points to note here:
- By default, Python in Excel imports a handful of useful packages (numpy, pandas, matplotlib, statsmodels, seaborn and excel) – see ‘Initialization’ in the ribbon
- Further packages can be imported by putting their import statements directly in your code (as we see when we import the sklearn package later in this article).
- Your code may take a little while to run at first as the cloud container that runs your code boots up.
- The order in which Python code will be run depends both on the ordering of cells in a given sheet (left-to-right, then top-to-bottom) and the ordering of worksheets (left-to-right).
- Python formulas will be recalculated every time a dependent value of a Python cell changes; to improve performance, turn on ‘Partial’ or ‘Manual Calculation’ mode in the Formulas ribbon and use ‘Calculate Now’ (or F9) to re-run when desired.
- You’ll need to press Ctrl + Enter to confirm a PY function call (as opposed to just Enter).
For our article today, we will look at a fairly simple actuarial task: principal components analysis on yield curve data. This is something that an insurer might want to do as part of calibrating interest rate stresses in a Solvency II model, for example; however, Excel’s linear algebra functions are not quite up to the task! We’ve loaded our data into a table in Excel (forwards curves courtesy of the BoE), and now we’re going to dive in.
We start by importing the PCA class from sklearn – the sklearn library, also known as scikit-learn, is a free Python library often used for machine learning-related tasks. We then read our forwards data from our Excel table into a pandas DataFrame; the xl function that we see below is the means by which we ‘send’ data from Excel to Python.
Looking good! Note that the default here is to return the value as a Python Object; however, we are also able to return data as an Excel Value using the toggle next to the formula bar, as we’ll highlight in the next step.
In the next block of code, we do a few things in quick succession:
- We reshape the forwards data to remove the date column and convert to a numpy array.
- We carry out the Principal Components Analysis – note how the PCA class from the sklearn package makes this a simple two-line task!
- We put the components output from this PCA into a pandas DataFrame, adding column and row labels.
Notice how we’re freely able to refer to variables defined in previous cells– all .
This example illustrates how a task that would have been tricky to carry out purely in Excel becomes nearly trivial with the help of Python. However, let’s not stop there – time for some data visualization!
Using the seaborn and matplotlib libraries in Python (referred to by the shorthand ‘sns’ and ‘plt’ here), we can plot a graph of our three principal components and output it to Excel as an image. To the right, we see a similar graph plotted using Excel for comparison.
Of course, where libraries like seaborn really shine are in more complex data visualisations, like the below illustration of seaborn’s pairplot function:
This is just a small taste of the functionality that Python in Excel might enable: we’ve only touched the tip of the iceberg of what’s available in the default packages, and with over 400 further packages available to play with, there’s sure to be something that captures your attention.
Limitations
Of course, it won’t always be the case that ‘Python in Excel’ is the best option available – different use cases may be better suited to just Excel, just Python, or a different solution entirely (such as R). For more complex processes or workflows, Python in Excel may be too limited; a more robust Python development environment may be required, with connectors to read in / out data as appropriate.
Equally, there may be tasks for which Python is unnecessary or uncomfortably complex; it may be difficult for less technical users to follow the flow of calculations or troubleshoot errors. Vanilla Excel may be the right solution there.
Another risk to be aware of is that making any key actuarial processes depend on Python in Excel is reliant on Microsoft maintaining support for this feature going forwards. Microsoft isn’t averse to changing the nature of their offerings or discontinuing them entirely – see Wolfram data types, Money in Excel, or Power View for a few recent examples – so proceed with caution!
Conclusion
Python in Excel is a great tool for those looking to supercharge Excel’s existing capabilities, or for those who wish to explore what Python and its packages have to offer without the hassle of configuring a full development environment. Readers who are interested in previewing this for themselves in will need to get access to the Beta Channel of the Microsoft 365 Insider Program – and keep their eyes peeled for a move across to General Availability.
Jacob Warbrick
October 2023