Password Encryption in VBA
Programming languages such as R and Python may be making waves in the realm of insurance, but ask any actuary, and they’ll tell you that Excel still reigns supreme. This makes VBA (widely reviled in programmer communities and no longer updated by Microsoft) wildly useful – but knowledge of it is quite rare.
This strong reliance on spreadsheet software, as well as the sensitive data and processes often adapted into Excel, makes it reasonable to assume strong password protection is commonplace. However, practices in offices vary. In this article I will lay out some ideas on how to implement some standard Excel protection rules, and will propose a specific VBA solution to this problem.
Here are some types of generic protection often used in spreadsheets:
- Protect worksheets with / without a password.
- Protect the workbook structure so it cannot be tampered with (prevents hiding / unhiding worksheets, adding / deleting worksheets…).
- Protect the VBA code so it cannot be accessed and changed.
Generally, all these protections have one thing in common: a password.
Usually, this comes in the form of a shared password that all members of a team know. These passwords stay in circulation for a few months, and eventually everyone who uses a spreadsheet knows how to unlock it. This isn’t just the developers; it could be testers, users, managers…
The Problem
Is this a problem? On the surface, it isn’t so bad. For one thing, Excel’s protection is easily breakable by anyone with an internet connection and a spare lunch break. Equally, protection should be designed to meet its need. There are two reasons concern might be minimal:
- Malpractice is simply not very common around Excel spreadsheets.
- Excel spreadsheets rarely contain valuable enough information to make malpractice desirable – or at least, they shouldn’t!
As such, the real reason we use protection in spreadsheets tends to be restriction of access rather than to protect against crime. Consider the following examples, drawn from personal client experience:
- A user for a spreadsheet notices the need to change a locked cell due to an assumption change. They are provided with the password for the spreadsheet and they update the assumption every time they run a case. The spreadsheet produces a surrender value calculation, and the result looks reasonable. But what if the assumption only changed temporarily and the user keeps blindly making the change? What if the change produced an incorrect surrender value?
- A tester is checking the annuity rate calculated in a workbook. They unlock the spreadsheet and do some perfunctory calculations in the results next to the annuity calculation to verify it matches expectation. They accidentally save it, and the code which produces the results breaks down since an invisible calculated cell has been overwritten.
Both of the above examples are fairly trivial to fix, but identifying the breach after it has happened leads to potential issues with customers. Obviously, some basic version control processes and a stricter handling of passwords would have prevented both happening. However, theory and practice aren’t always best friends, and invariably little concessions happen.
To address this issue, we are proposing a finer handling of spreadsheet access. In particular, we will borrow the ideas of account creation and encryption from any online portal and adapt it to Excel via some VBA code. The result should be:
- A user-validation process in Excel which requires each user to have their own log-in.
- Fully encrypted passwords such that a password is impossible to reverse engineer.
- A finer handling of account permissions; developers can have carte blanche, users might be presented with a black box.
Before we get started though we will need to introduce an element of cryptography!
Encryption – Hashing
What happens when you create an account online? You’re asked to provide a username (which could be an e-mail address) and a password. This password has to meet certain security requirements, mainly to ensure potential hackers cannot simply guess yours by knowing a little bit about you. Then, a few hours later, you want to log into the app on your tablet device. Almost unbelievably, you’ve already forgotten your password. Luckily, there’s a handy “forgotten password?” button on the same page, taunting you even as it offers its assistance.
A password reset e-mail is sent to you, which you then click on. You provide no information; you just need to click on the link in the e-mail and then you get to decide a new password. You swear at the screen. Why can’t they just tell you your current password? Now you have to invent a whole new one, which you do begrudgingly… only to discover your new password was actually your old one!
The process described above might be so vivid in its description because the author of this article has gone through it several times this month.
There are two reasons the password reset e-mail cannot provide a current password:
- It is more prudent not to display a full password on screen, in case e-mails are being read by the wrong target.
- When passwords are created, they are run through an encryption algorithm, which will usually make them unrecoverable.
To see how this works as a process:
- At the initial stage of creation, the account holder provides a password. For simplicity, let’s say this is Password1.
- The online portal will take this value, and run it directly through a hashing algorithm, designed to make an easily guessable word (“Password”) into something near-unintelligible.
- This value, after the hashing process, is then stored in the company database as the official access password of the account.
- When the account holder next logs on, they input their password using keystrokes. The log-in portal then runs it through a hashing algorithm to return the same value, and matches that against the database to validate the log-in request.
To illustrate the power of hashing, we have used SHA-256 encryption on the password above:
Before: Password 1
After: 19513FDC9DA4FB72A4A05EB66917548D3C90FF94D5419E1F2363EEA89DFEE1DD
Much more secure than the initial password!
What makes this better is that it’s impossible to reverse engineer a password algorithmically, if it has been hashed by a good algorithm. Therefore, there is only one place where the series of keys required to access the account is stored: in the account holder’s head!
Salt Values
As an additional form of security, consider the scenario where two users share the same password. In theory, it is possible for user A to guess the password for the account of user B, or accidentally access it, if the two have similar usernames.
To prevent this from causing trouble, a salt value is used. A salt value is essentially a random string, which is prepended to the original password. This entire string is then fed through the hashing algorithm to lead to two different results. To see the effect of even a simple salt, see the table below:
Password | Salt | Pre-Hash | Hash |
Password1 | A | APassword1 | 6098F531662ED964162675212061C8821D4EB3EAC603DC27DDDA72B6B2B5534 |
Password1 | B | BPassword1 | C23E4DA4E2CC2DD195627F1DDF8BC6E723347D4E6F8417CE78649A23645C |
Notice that even a single character as salt is enough to create wildly different hash values. Salt values are usually unique to the username, not the password, meaning each user will have a randomly generated salt value (that even they don’t know), which then goes into their hashing algorithm process.
Generally, salt values will be longer than the above.
How does this work with spreadsheets?
The idea of using a bespoke hashing algorithm and appending a salt value to a username may seem something like overkill for a spreadsheet. However;
- The idea of user validation can have additional benefits. For example, in a development-test-release cycle, the developer, user and tester of a spreadsheet should be three different people. Some form of user validation can prepare the spreadsheet for use by a specific role.
- Developers can have carte blanche to make all changes.
- Testers might have heavily protected sections of the calculations which prevents accidental change.
- Users might have most worksheets hidden and protected so the model can appear as a black box. Or at the very least a Perspex box that they cannot open!
- Spreadsheets can contain sensitive personal data, whose loss could result in GDPR breaches.
- Most of the time, for consistent protection, all spreadsheets being managed by a team will need to go through a protection process. It is much easier to simply attach shared functionality which will enable the protection process for every spreadsheet.
Meeting these desires will require a fairly robust piece of code that is:
- Versatile: it can be appended to any spreadsheet.
- Consistent: it is held as shared functionality, so the same treatment is guaranteed for all spreadsheets.
- Simple to update: as a single piece of code, changes do not need to be made to multiple spreadsheets; instead, all changes can be made to the shared code and tested on one spreadsheet for assurance.
- Controlled: Excel offers read-only options and certain protections; this code could allow for greater customisation and partial write access, for example.
The Solution
To account for the versatility requirement, we would propose an Excel add-in which is accessed by some Workbook_Open code in the relevant spreadsheet. Using an Excel add-in means a single piece of written code is sufficient for all spreadsheets using it. This also allows for quick updating when changes are needed.
In order to simulate the experience of “logging in”, a VBA userform should be created with a Username / Password combination required from the user. The userform could autofill a username based on the current user running the code.
When a password is entered, the code goes into a database (this could be an SQL server database, a Microsoft Access database, or simply within a workbook somewhere) and looks up the username. It pulls back the salt value (which, recall, is unique to the user). This is then prepended to the password entered and run through the hashing algorithm of choice.
The resulting string of characters is validated against the password stored against the username. One of two things can occur:
- The password will be valid. The code will pull through an access level (e.g 1 = Developer, 2 = Tester…). This will determine further code to run to “prepare” the spreadsheet for the user.
- The password will be invalid. The code will spit out a generic invalid password message and the user is invited to try again. Crucially, a failed log-in means no access to the spreadsheet.
With some fairly standard tweaks and preparation, this piece of code can be attached to any spreadsheet (which should be structured with some consistency to allow the access level rules to be managed by the add-in code), and the user validated fully for access.
An Example
We have developed a very simple model which performs the required steps. It is meant to be illustrative of encryption and the process of validation, so a number of changes to the solution outlined above have been made to accommodate this:
- Instead of a bespoke database (which may be inaccessible to most users), the username / salt / password / access_level combination is stored within the spreadsheet itself, for easy perusal.
- Instead of an add-in, the spreadsheet has been developed as a macro-enabled workbook, purely to help illustrate the effect of different access levels.
- Instead of only a log-in functionality, a Register functionality has been added so anyone in possession of this workbook can themselves add their details and see the effect of different access levels. To be clear, this solution implemented in a real-world context would NOT offer registration; the decision was made to do so here to illustrate the effect.
- A practical implementation of this theory would probably involve some additional work on the part of the user – particularly in setting up a series of rules on which worksheets to hide / show depending on access level.
Below is a short walkthrough of the development of this model.
First, we need to look at the user interface. A specially-made VBA userform is needed to allow the user to select whether this is an attempt to log in or create a new account. This has been designed in the standard fashion, with three userforms. One will be the selection screen:
One will be the registration screen; in practice, someone with admin privileges would perform this step for every user:
The final one will be the sign-in screen:
Now, the process of creating a new user or validating an existing one sounds like it belongs in a nice class module wrapper; in particular, to make the code easily portable and intuitive when used. We go about writing the code for that; the bones of the class module are examinable in the file, but we expose the following to the rest of the workbook:
- UserName (Property)
- Password (Property)
- Role (Property)
- CreateUser (Method)
- ValidateUser (Method)
The code will work like this:
CreateUser
This requires a username, password and role. All three properties will be set, and then CreateUser will:
- Generate a salt unique to the individual username (make the salt random enough that it is unique).
- Use the salt with the password to generate a hashed password.
- Store the row pertaining to the user in the worksheet designed to hold login information.
ValidateUser
This requires only a username / password combination. It will look up the salt, perform the hashing and check it against the HashedPassword column above. The access level will be retrieved, and this will determine what the user sees.
Each type of user role will have a different treatment!
Final Thoughts
With some very simple code and a couple of hours, it is possible to design strong protection to validate Excel users against accidental changes / errors. The example model above is designed to be played around with by any interested readers, so it required the simplifying assumptions made above to make it viable as a single, simple workbook. Additionally, a more professional product would:
- Ensure adequate protection of worksheets / workbook / code is observed at all times.
- Ensure the class module functionality is better encapsulated – in particular with respect to the access level handling process.
The full example model is available from APR. Please contact John.Nicholls@aprllp.com for a copy of the code and further commentary on its adaptability to client offices.
Hopefully this article provides good ground for an initial understanding of password encryption, and some information over how it can be done in Excel, ubiquitous as it is in the industry. For those interested in the concept but who wish to see a product developed not for illustration but for use, please get in touch to discuss.
John Nicholls
November 2020