Cyber Security: SQL Injection
With GDPR regulation ambling its way into the spotlight in recent years, insurers are under pressure to demonstrate just how much of a priority it is. In come carefully worded statements to clients about their data, how it is kept and disposed of. In come Data Protection policies and training videos for staff.
Meanwhile, the Venn diagram intersection of those who feel strongly about GDPR and those who feel prepared to share their sense of humour with the world have stirred into action, fighting to become the first person on Twitter to note that Santa’s naughty/nice list is not in strict keeping with GDPR regulations. He should be looking to create a robust Data Protection policy if he wishes to continue distribution operations in the EEA.
For this author, however, GDPR has just become a terrific excuse to not remember any of their friends’ names or personal details without explicit written consent.
Yet while companies improve data processes, cyber-attacks continue to be a threat. In this article, we will introduce a very basic type of cyber-attack, explore its theory, and how we might combat it.
What is SQL Injection?
To explain this, it is necessary to go back a step or two to databases. Most companies hold data on some form of DBMS (Database Management System). A database is a structured way of holding information – for example, using tables. When it is necessary to retrieve information from a database, a user would need to write some form of query, a question to send the database which will return an answer. This answer could be one word, or it could be an entire table. This query has to be recognizable by the DBMS; the most common language used to communicate with it is called SQL (Structured Query Language).
SQL Injection is a term used to refer to the placement of malicious code in SQL statements. This can lead to very serious consequences when interacting with DBMS, such as:
- The access of sensitive data.
- The destruction of database structures.
- Mistaken authentication; someone could log into an application without needing to provide proper credentials.
Despite potentially having very dangerous consequences, SQL injection is simple in theory.
How Does it Work?
Most instances of SQL injection happen in authentication processes (i.e signing-in). Let’s imagine this scenario:
- There’s a database containing a list of famous tennis players. It contains bank account holdings split by funds. For the purposes of this article, please note all information is randomized.
- You would like to see the bank account information of all the tennis players in the list.
Imagine there’s an online platform to sign in – in theory, a tennis player wanting to see their balance could just put in the right credentials. Let’s walk through how this might work in practice:
First, imagine a log-in form; it should look something like the below, and should be very familiar.
Usually, a user would enter the details above, and press the log in button. Then, the code would look inside the database to check the user’s details exist. If it finds a positive match, the user would be allowed to log in and access the online portal, including their bank balance. In this case, we’ve used two-time Wimbledon champion Andy Murray.
At this point, the web ‘server’ would ping a message to the database, and say something like “hey, here are my details, can you log me in?”. The database would check it has that username/password combination, and, it does! So it logs him in, and returns his bank balance information:
It looks really simple in theory. Under the hood, what’s happening is the database provides an SQL ‘query’, basically a question written with a specific syntax, for the log-in portal to fill in. Here’s an example of what it would look like:
SELECT Name, FundCode, FundValue FROM FundTable WHERE Username = ‘InsertUsername’ and Password = ‘InsertPassword’
Then, the log-in portal would put in the credentials above, replacing the InsertUsername and InsertPassword with those credentials:
SELECT Name, FundCode, FundValue FROM FundTable WHERE Username = ‘AMurray’ and Password = ‘**********’
That query then gets sent to the database, which uses it to check the username and password against a list of ‘allowed users’. It finds a match, so it returns the bank balance linked to that user. The important thing to remember is, if you had put in the wrong details (e.g misspelled a password, like this author does apparently on every possible occasion), the database would have not found a match and would return something empty.
Now, going back to the scenario we’ve laid out, you want to see the details of every player, not just one. So you need to find a way to put in a username/password that will fool the database into thinking you’ve asked it for all players’ information and will return it. For this purpose, you’d need to enter something like this:
Let’s try to unpack why this works! Imagine the same query as before:
SELECT Name, FundCode, FundValue FROM FundTable WHERE Username = ‘InsertUsername’ and Password = ‘InsertPassword’
Remember that what the user enters in the Username box in the picture above replaces the word InsertUsername, and similar for the Password box. If we go ahead and do this, using what’s in that box above:
SELECT Name, FundCode, FundValue FROM FundTable WHERE Username = ‘’ OR 1=1’ and Password = ‘’
To understand why this works, we need to briefly talk about the rules of SQL (i.e syntax):
- A double dash ‘
’ is used to comment out everything after it on the same line. In other words, the database will see it, and ignore everything that comes after. Developers use it to commentate long SQL scripts to make understanding easier.
- OR works just like in English; if we are checking a condition to be true, and include it, either of the clauses around it being true would make the whole thing true. For example, “Roll a die. If you roll a 3 or a 4, you win”.
- 1=1 is a statement that is always true, like “a bird is a bird” or “batman is the best superhero”.
Using the first point above, we ignore everything after the double dash to simplify, since the database thinks it’s just a comment:
SELECT Name, FundCode, FundValue FROM FundTable WHERE Username = ‘’ OR 1=1
Now, that conditional check is always true, because 1 is always equal to 1, so it doesn’t matter what the username is. Every single line in the database has a username, but every single line meets that condition that the username is either blank or 1=1. So the database believes (wrongly) that the user is validated to see every line. Thus, it returns this:
So, what have we learned? We’ve learned that by having a bit of knowledge of the query language, we can fool a database into validating a user when it shouldn’t, and returning all of its information here. In this case, that data is very sensitive, and could be used to harm people (if it weren’t all fabricated for this example). It’s important that there is good protection against this.
Are there other types of injection?
Yes. What is illustrated in the example above is a way to get information from the database without being a valid user. It’s also possible to:
- Delete tables in the database if the hacker knows what they are named.
- Obtain information about the database structure and type by examining error messages from the host application.
- Change values within database tables if the appropriate permissions are not well set up.
- Gain administrative privileges in a log in.
As should be clear, such a simple process can lead to a lot of dangerous outcomes. In fact, it is estimated that SQL injection attacks form two thirds of all web application attacks.
How can we stop it?
Unsurprisingly, with such a simple method of attack, there are a number of measures that can combat the possibility of it.
1: Encryption
It’s relatively simple to require that the user input be encrypted according to some security hash. That way, when the user inputs the potentially malicious string, it doesn’t go directly into the SQL query; instead, it gets run through an encryption, becomes benign, and then user authentication will simply fail. For more details, please see this article on encryption:
2: Use Parameters to add a layer of security
There’s a point in the application where it uses the information provided by the user to complete the SQL query. At this point, the injection outlined above would work because a simple replace-text method is used to complete the query. However, many languages have safer ways of interacting with databases, where the code must use the user input to create a parameter that must go in a specific place of the query. In doing so, parameters tend to have certain data validation requirements, and so an issue with the input “Username” in this case would be flagged as an error long before the query was run.
It is also possible to perform some code-side validation when building the application. This is, however, not recommended, as it will be difficult to ensure all methods of attack are covered.
3: Test any self-built applications thoroughly
Ensure that, when applications which interact with a database are self-built, an appropriate level of rigour is applied in testing. For example, try submitting the single-quote character ‘ in an authentication box, or attempt to submit self-true statements such as 1=1 (as above). By trying to SQL inject an application, it’s possible to find where it is vulnerable to attack.
Note that some applications are naturally resistant to some SQL injection. Microsoft Access, for example, a defunct and largely unused database structure, doesn’t allow comments in its SQL queries, and won’t allow semicolons either (semicolons tend to allow a user to execute two separate SQL statements, one after the other). It’s important to know the syntax of your database language.
4: Whitelist or Blacklist
Blacklisting refers to the practice of banning the entry of certain characters which may be problematic. Whitelisting is an even more security-conscious method whereby only certain characters are approved for use.
5: Get Advice!
There are third-party consultants who can provide helpful advice on routes of attack, possible mitigation strategies and perhaps even provide a ready-made application which has such strategies implemented. It is often useful to have an expert look over applications which, for one reason or another, are particularly exposed to security issues.
Summary
SQL injection is an incredibly simple method of attack where a malicious user can cause very serious consequences; it’s considered to be one of the most common forms of attack on web-based applications. It works by taking advantage of SQL, a language which communicates with databases, primarily through attempting to fool a database into returning information it shouldn’t be returning. While this article illustrates a simple example of how it might work in action, there are many ways of reducing the risk of it happening. These can range from common sense when building and testing applications to the use of more sophisticated techniques.
John Nicholls
May 2022