Sunday 20 November 2011

Why parameterized queries stop SQL injection attacks

An updated version of this article can be found here (How do prepared statements protect against SQL Injection?)

I've recently got a new job, and as such was having to go through a lot of documentation, and 'recommended reading' (which I actually read because I had so much free time), but one of the many things was various type of vulnerabilities and how they work, and surprisingly they told you various ways to actually help against these attacks.
Now, the other day I was down the pub, talking with a friend who has recently got a developer job (we're both just out of university, thus many people I know either have just got a job, or will be hopefully in the near future), and he didn't understand about parameterized queries, and how they actually stopped SQL attacks. Now this I found totally understandable, as when articles talk about parameterized queries stopping SQL attacks they don't really explain why, it's often a case of "It does, so don't ask why" -- possibly because they don't know themselves. A sure sign of a bad educator is one that can't admit they don't know something. But I digress.
When I say I found it totally understandable to be confused is simple. Imagine a dynamic SQL query

sqlQuery='SELECT * FROM custTable WHERE User=' + Username + ' AND Pass=' + password

so a simple sql injection would be just to put the Username in as ' OR 1=1--
This would effectively make the sql query:

sqlQuery='SELECT * FROM custTable WHERE User='' OR 1=1-- ' AND PASS=' + password

This says select all customers where they're username is blank ('') or 1=1, which is a boolean, equating to true. Then it uses -- to comment out the rest of the query. So this will just print out all the customer table, or do whatever you want with it, if logging in, it will log in with the first user's privileges, which can often be the administrator.

Now parameterized queries do it differently, with code like:

sqlQuery='SELECT * FROM custTable WHERE User=? AND Pass=?'

parameters.add("User", username)
parameters.add("Pass", password)

where username and password are variables pointing to the associated inputted username and password

Now at this point, you may be thinking, this doesn't change anything at all. Surely you could still just put into the username field something like Nobody OR 1=1'--, effectively making the query:

sqlQuery='SELECT * FROM custTable WHERE User=Nobody OR 1=1'-- AND Pass=?'

And this would seem like a valid argument. But, you would be wrong.

The way parameterized queries work, is that the sqlQuery is sent as a query, and the database knows exactly what this query will do, and only then will it insert the username and passwords merely as values. This means they cannot effect the query, because the database already knows what the query will do. So in this case it would look for a username of "Nobody OR 1=1'--" and a blank password, which should come up false.

This isn't a complete solution though, and input validation will still need to be done, since this won't effect other problems, such as XSS attacks, as you could still put javascript into the database. Then if this is read out onto a page, it would display it as normal javascript, depending on any output validation. So really the best thing to do is still use input validation, but using parameterized queries or stored procedures to stop any SQL attacks.


  1. This is an excellent explanation but my eyes are bleeding.

  2. Handy, seems to be quite hard to find a straight-forward explanation of *how/why* parameterized queries make the difference as opposed to 'just use them' - I think this explains it nicely.

  3. I cannot thank you enough for that. as you said most of the sites were like "it works, dont ask why"

  4. Good simple explanation, I've been trying to understand this for awhile. Thank you

  5. thanks for the wonderful explanation. it was simple, and to the point. went through 4 websites until i stumbled across yours, and understood how this helps :)

  6. Thanks for the understandable explanation!
    Is there any advantage in using 'parameters.add' compared to 'getRequestString' as described here: in the section 'Parameters for Protection' or are they comparable.

    1. They're just different language implementations, so should effectively work the same. My example was merely a generic example of what it may look like.

      Some implementations may use the OO implementation like the one I suggested and others may go the way Microsoft have with a dynamically written method signature (having a variable number of optional parameters ..similar to how the printf series of functions in C are created)