SQL Formatting & Guidelines

I've never really seen a decent standard on writing SQL statements, that may be because I haven't looked for any standards or I didn't like what I have seen. For the most part when reading SQL statements written in stored procedures, SQL text files, embedded in data access layers etc, I cringe most times realizing the smashed blob of characters is actually a SQL statement.

 

I have designed my own personal formatting guidelines that have worked really well in writing, maintaining and most importantly understanding SQL statements well into the future. All it honestly takes is a little discipline and extra care when writing them in the first place (no matter what dialect of SQL you happen to be using).

Typical Bad Example:

select p.*, t.Name, s.Title as MyTitle from 
Product p, join Producttype t on t.productID = p.ProductID
join sales s on s.productid = p.productid where s.qty > 500 order
by p.name

 

I may have embellished that a little, but the fact of the matter is that most SQL I've run across in open source projects, commercial projects, and some of my own old code, looks something along these lines - basically a mess. This is a rather simple example, many SQL statements are far more complex and writing them this way just compounds the problems of maintaining and understanding the SQL in the future.

SQL Formatting Guidelines


1. Consistency

 

If you are going to name tables as plural (such as Products as opposed to Product), simply be consistent with those naming conventions. I won't get into a war over whether you should choose plural or singular names, it honestly doesn't matter, as long as you're consistent in the long run, name all tables in a plural or singular fashion.

Some folks like to throw underscores between joiner tables such as Product_Cart, others like to use camel caps such as ProductCart. While I would suggest using camel caps over the former version just because it eliminates the need to type a pesky underscore, again it really doesn't matter, just be consistent, don't name one joiner table Product_Cart and another ProductSale.

Consistency boils down to just that, pick a convention, in naming, structure, etc, and stick with it, and ensure all team members adhere to it.

2. Use of Case

While SQL statements can be written in lower and/or upper case characters, it is better to pick one and go with it, this ties in with consistency. I personally suggest staying away from upper case, due to the fact that it is reliant on caps lock or using shift which I find annoying and also for the mere fact that it can have an impact when filtering your statements, its easy to forget you use caps and didn't mean to etc. However, some people like to upper case the SQL keywords and keep everything else lower case and many variations beyond that. Whatever your preference is, be consistent with it.

3. Structure

A solid structure is important in writing understandable code and the same applies to writing SQL. I always break up SQL statements on the major parts of a SQL statement, namely the SELECT, FROM, JOIN, WHERE, GROUP BY and ORDER BY phrases. These should all appear on the left side of your statements, and the information being operated on by these keywords should show up on the right, very similar to writing in virtually any programming language. I also indent between the left and right sides of the statement so that each side lines up visually when looking at the statement. Statements can be written over multiple lines, while it may seem ugly (with all the whitespace, tabs etc), its also human readable which is more important, the SQL engine really doesn't care and that little bit of extra space isn't honestly going to have a profound impact on speed.
For more advanced SQL, those involving derived tables and such, the rules can become a little harder to follow, but try your best to stick to them. Indent sub-queries within the parent query, so as to set them off visually.

4. Proper Use of Alias'

Every major SQL dialect supports name alias' of some kind. While alias' have their purpose, for the most part I notice they are used incorrectly and for completely the wrong reasons. The majority of the time, folks writing SQL statements will begin joining in other tables and immediately apply an alias 'p' to the table Product, for really no other purpose than saving them from typing six more characters every time they have to reference that table. Alias' should not be used to save you typing, they should be used to better clarify what a table represents (they are better applied to more advanced SQL statements, particularly those with derived tables or when joining on the same table multiple times). Changing a bunch of legible table names into a variety of one or two letters alias' for the mere purpose of saving keystrokes is just bad practice, its lazy and doesn't help anyone but the person writing it at that moment, even he won't understand what he meant without analyzing it for some time in six months when it comes time to modify it. Above all, when using alias' use meaningful names. It's always been considered bad practice to name a variable 'myvar' or 't', again, the same applies to SQL alias names, make them meaningful or don't use them.

Good Example:

select		Product.Name,
		Product.Price,
		Product.Dimensions,
		ProductType.Name,
		Sale.Title
from		Product
join		ProductType on
		ProductType.ProductID = Product.ProductID
join		Sale on
		Sale.ProductID = Product.ProductID
where		Sale.Qty > 500
order by		Product.Name

 

While it may look a little more cumbersome and bloated (it is), I think it more than shows how much better SQL statements can be written with just a few rules to follow. Obviously more goes into solid database design (which cascades into solid SQL statement design) but these few suggestions can go a long way to writing cleaner and more understandable SQL statements. It becomes even more useful when profiling SQL statements at runtime, so you can easily copy/paste them into a SQL analyzer tool without the laborious job of converting sloppy SQL to readable SQL come time to debug an issue.

Regardless of whether you find these suggestions useful or not, its certainly worth taking some time to analyze how you write your own code as well as SQL statements to see what kind of things you can change about your writing style to save yourself time now and later when it comes time to fix a bug, optimize a statement or altogether modify it due to new requirements. Above all, while saving keystrokes now might seem like a good idea, its worth the extra 30 seconds typing now, to save yourself another 15-30 just trying to understand it later.

It's always worth it to be aware of your habits and practices in order to identify simple ways to improve how you work. Little changes and tweaks can go along way to being more productive and saving yourself time and grief later.

blog comments powered by Disqus
What can we do for you?

About

A blog by InetSolution about programming, security, design and marketing for banks, credit unions and e-commerce.

Subscribe to our feed