Monday, July 16, 2012

Difference between COALESCE and ISNULL

I have often been asked about the difference between ISNULL and COALESCE. In most cases they work the same, but there are some fundamental differences. Let's review some of them...

ISNULL is a T-SQL function while Coalesce is an ANSI Standard.

ISNULL only takes two arguments and basically evaluates the first argument for null value, if true, it returns the second value back. For example, if you execute the following command

SELECT ISNULL(NULL,NULL) you will get NULL as your result set. Now try executing the same statement with COALESCE -- SELECT COALESCE(NULL,NULL), you will receive the following error

At least one of the arguments to COALESCE must be a typed NULL. 

What does this error mean? Does it mean you can't pass two null arguments to Coalesce? What it really means is that at least one of the argument must be of a particular data type even if it is NULL. For Example, I can do the following..

DECLARE @First Varchar(10)
DECLARE @Second varchar(10) and execute the following statements, they will all work fine.
SELECT COALESCE(@First,@Second)
SELECT COALESCE(NULL, @Second)
SELECT COALESCE(@First,NULL)

Let's run another test...

DECLARE @First char(2) and run the following statements
SELECT ISNULL(@First,'This is a Test') -- returns Th
SELECT COALESCE(@First,'This is a Test') -- returns This is a Test

The data type in ISNULL is determined by the first argument. In this case, the data type of the first argument is Char(2) so it only returns 2 characters of the non-null argument which is the second argument. Coalesce on the other hand determines the data type based on the precedence. Check out the following article about data type precedence ( http://msdn.microsoft.com/en-us/library/ms190309.aspx).

Depending on how you use one or the other, you could actually get different results or an error. For example, if I run the following code...

DECLARE @First varchar(4)='Test'
DECLARE @Second INT = 1
SELECT ISNULL(@First,@Second)
SELECT COALESCE(@First,@Second)


ISNULL will work just fine but COALESCE will generate an error. This is because an INT has higher precedence than varchar and COALESCE will try to convert @First to INT, generating an error. If @First is NULL, then it will work fine and return you the value of @Second.

COALESCE can take more than 2 arguments and will work like Switch/Case (Select/Case) working down the cases until it finds the non-null value or until it has evaluated all the arguments.

Performance wise they both are same in most situations, except there are cases where SQL Server generates different execution plan based on whether you use ISNULL or COALESCE. This generally happens when you are using scalar sub-queries as part of your arguments to ISNULL or COALESCE.

Thank you.

Monday, July 9, 2012

T-SQL Best Practices

When writing a T-SQL either as an ad-hoc query or a stored procedure, besides ensuring the queries are fully optimized, you should keep few things in mind. Although none of these are required, but trust me someone reading your queries in the future will thank you for it, and it may even save you a headache in the future. I don't have many, but there are few guidelines / best practices that I try to adhere to.

Indentation
 Properly Indenting your query makes it so much easier to read and / or troubleshoot. 
 For example, consider the following query..
SELECT FirstName,LastName,EmailAddress,Address1,Address2,Phone,Fax FROM Customers


Can be rewritten as...
SELECT
    FirstName,
    LastName,
    EmailAddress,
    Address1,
    Address2,
    Phone,
    Fax
FROM Customers

It is certainly easier to read and you can quickly figure out what columns are being selected. You can make it even more useful by making the following change...


SELECT
    FirstName
    ,LastName
    ,EmailAddress
    ,Address1
    ,Address2
    ,Phone
    ,Fax
FROM Customers

By using "," as a prefix in subsequent columns, you can easily comment out a line containing the column that you don't want your query to return instead of having to rearrange it.


Using Square Brackets
Although square brackets are not necessary unless you have a space in column name, using it ensures that if your column name ever becomes a reserved word in SQL, you query won't break. For example, you can write the above query as follows...

SELECT
    [FirstName]
    ,[LastName]
    ,[EmailAddress]
    ,[Address1]
    ,[Address2]
    ,[Phone]
    ,[Fax]
FROM [Customers]

Using Statement Terminators
Using semicolon (;) at the end of your SQL Statement. If you use C# or C++ or Java for application development, this may be second nature, but not so much for VB Developers. Although SQL Server currently doesn't require semicolon in most cases, it may change in the future as it is part of the ANSI SQL-92 standard. So to avoid future headaches, always use semicolon. Thus the proper way to write the above query would be...

SELECT
    [FirstName]
    ,[LastName]
    ,[EmailAddress]
    ,[Address1]
    ,[Address2]
    ,[Phone]
    ,[Fax]
FROM [Customers];


If you have multiple statements in your query, use semicolon after each one. For example, I can add SET NOCOUNT ON in my query above.

SET NOCOUNT ON;
SELECT
    [FirstName]
    ,[LastName]
    ,[EmailAddress]
    ,[Address1]
    ,[Address2]
    ,[Phone]
    ,[Fax]
FROM [Customers];

Capitalize all T-SQL Keywords
Capitalizing T-SQL Keywords such as SELECT, FROM, WHERE, JOIN etc. makes it so much easier to read your query and to differentiate between the T-SQL keywords and your own columns etc.

Happy Programming!




Friday, July 6, 2012

SQL SERVER - Little things can save you a big headache

Recently I came across a situation where someone was running a query, basically copying data from one table to another.

The query was simple enough. Basically it was using a fast forward cursor, copying one record from one table into another table. Using SCOPE_IDENTITY, the query retrieved the identity value of the newly inserted record and then inserted another record in a child table, using newly generated identity key. There were several hundred millions of records in the table from which the data was being moved to a new table. Whether this could have been done another way is not the topic for today.

In this particular instance, the query was running in Query Analyzer and it will run for a while, but then will throw an out of memory exception. Since the query didn't track what records were inserted already in the new table before out of memory exception was thrown, the developer truncated the new table and started all over again, only to hit the same issue again.

The reason query was throwing out of memory exception was because the query didn't have SET NOCOUNT ON clause at the beginning. As a result, the query was outputting messages for each statement in this query, not only resulting in memory consumption but also degrading the performance and increasing the network traffic. SET NOCOUNT ON basically prevents the sending of DONE_IN_PROC messages to the client.

The query need not run in a loop like above or in query analyzer to make use of SET NOCOUNT ON. If you don't need messages being returned from the SQL Server or aren't capturing them, be sure to add this clause to your stored procedures.

Thank you.

Monday, July 2, 2012

SQL Server - Degree of Parallelism

In SQL Server 2008, Microsoft introduced a concept called max degree of parallelism. Max degree of parallelism allows the engine to take advantage of all the available processors on the server up to 64 processors to execute a single SQL statement. By default, SQL Server is configured to use all available processors, however, there may be instances where you want to limit the number of processors a query should use. For example, you may have a low priority SQL agent job running in the background and you want to limit it to only use one processor. You can set max degree of parallelism to 1.

The query below will only use up to 2 processors for parallel operation regardless of the number of processors available.

USE AdventureWorks
GO
SELECT ProductID, OrderQty, SUM(LineTotal) As Total
FROM Sales.SalesOrderDetail 
WHERE UnitPrice < $10.00
GROUP BY ProductID, OrderQty
ORDER BY ProductID, OrderQty
OPTION (MAXDOP 2)
GO

To set the option for SQL Server to always use up to X number of processors, you can use SP_Configure.

For example, the following query forces SQL engine to only use 4 processors.

USE Master
GO
SP_Configure 'Show Advanced Options', 1
GO
RECONFIGURE
GO
SP_Configure 'Max Degree of Parallelism', 4
GO
RECONFIGURE
GO

Default "Max Degree of Parallelism" option is 0, which means use all available processors up to 64.

Thank you.