SQL Server Query Optimization – Sargable Functions

When a function is used as part of a SQL join or predicate, if the query optimizer is able to make efficient  use of indexes to speed up the query, then the function is sargable (Search ARGument ABLE). Not all Transact-SQL functions are sargable. So care must be taken to avoid non-sargable functions when creating queries.

In this blog I’m going to demonstrate some workarounds for non-sargable pitfalls.

Setup

Let’s create a table, load it with data (10,000 rows), and create some non-clustered indexes to target when optimizing our queries.

CREATE TABLE Product (
	ProductId int IDENTITY(1,1) NOT NULL,
	ProductName nvarchar(100) NOT NULL,
	ProductLine nchar(2) NULL,
	CreatedDT datetime,
	CONSTRAINT PK_Product PRIMARY KEY CLUSTERED
	(
		ProductId ASC
	)
) ON [PRIMARY];
GO

INSERT INTO Product (ProductName, ProductLine, CreatedDT)
VALUES		('Product 1', 'L1', DATEADD(year, -1, GETDATE())), (' Product 1 ', 'L1', DATEADD(year, -2, GETDATE())),
			('Product 2', 'L1', DATEADD(year, -3, GETDATE())), (' Product 2 ', 'L1', DATEADD(year, -4, GETDATE())),
			('Product 3', 'L2', DATEADD(year, -5, GETDATE())), (' Product 3 ', 'L2', DATEADD(year, -6, GETDATE())),
			('Product 4', 'L2', DATEADD(year, -7, GETDATE())), (' Product 4 ', 'L3', DATEADD(year, -8, GETDATE())),
			('Product 5', 'L4', DATEADD(year, -9, GETDATE())), (' Product 5 ', NULL, DATEADD(year, -10, GETDATE()));
GO 1000

CREATE NONCLUSTERED INDEX IX_Product_ProductName ON dbo.Product
(
	ProductName ASC
) INCLUDE (ProductId) ON [PRIMARY];

CREATE NONCLUSTERED INDEX IX_Product_CreatedDT ON dbo.Product
(
	CreatedDT ASC
) INCLUDE (ProductId, ProductName) ON [PRIMARY];

CREATE NONCLUSTERED INDEX IX_Product_ProductLine ON dbo.Product
(
	ProductLine ASC
) INCLUDE (ProductId, ProductName) ON [PRIMARY];

LTRIM and RTRIM

First off, we  want to optimize a query that returns Product records with ProductName equal to “Product 1”. But note that a subset of the products are prefixed and suffixed with spaces, and we also want to return these.

Here’s how NOT to do it –

SELECT	ProductId, ProductName
FROM	Product
WHERE	LTRIM(RTRIM(ProductName)) = 'Product 1';

We are targeting the IX_Product_ProductName index, and the query plan shows that the optimizer has chosen to use this index.

query1

However, it is scanning all 10,000 rows of the index in order to return the expected 2000. The LTRIM and RTRIM functions are not sargable.

If ProductName was only prefixed with spaces, and not suffixed, then we could optimize the query by using the LIKE operator.

SELECT	ProductId, ProductName
FROM	Product
WHERE	ProductName LIKE 'Product 1%';

But as we need to account for the suffixed spaces, this will not do for us.

To optimize the query, we add a computed column which uses the LTRIM and RTRIM functions to strip the spaces, and then create a non-clustered index on this column –

ALTER TABLE Product ADD ProductNameTrimmed AS LTRIM(RTRIM(ProductName));

CREATE NONCLUSTERED INDEX IX_Product_ProductNameTrimmed ON dbo.Product
(
	ProductNameTrimmed ASC
) INCLUDE (ProductId, ProductName) ON [PRIMARY];

If we now modify the query to use the computed column in the predicate –

SELECT	ProductId, ProductName
FROM	Product
WHERE	ProductNameTrimmed = 'Product 1';

The query optimizer is now able to perform a seek on the non-clustered index –

query2

The number of logical reads has reduced from 50 to 16, and the execution time from 193ms to 67ms. And if the query returned a smaller subset of records, then the performance improvement would be even greater.

DateTime Functions

If we now query on a specific CreatedDT datetime –

SELECT	ProductID, ProductName, CreatedDT
FROM	Product
WHERE	CreatedDT = '2014-11-21 14:08:42.593';

The optimizer performs an efficient seek on the IX_Product_CreatedDT index.

However, if we want to return rows for a date, irrespective of time, we may try the following widely used query –

SELECT	ProductID, ProductName, CreatedDT
FROM	Product
WHERE	DATEADD(dd, 0, DATEDIFF(dd, 0, CreatedDT)) = '2014-11-21';

But the DATEADD and DATEDIFF functions are not sargable so the query optimizer is unable to perform a seek on the IX_Product_CreatedDT non-clustered index. Instead it scans the clustered index, reading all 10,000 rows.

Instead let’s try the following –

SELECT	ProductID, ProductName, CreatedDT
FROM	Product
WHERE	CONVERT(date, CreatedDT) = '2014-11-21';

The query optimizer is now able to perform the seek on the index.

query3

The logical reads have Now lestbeen reduced from 54 to 8. The CONVERT function is this case is sargable.

If we look at the details of the index seek, we see that the query optimizer is using the following seek predicate –

Seek Keys[1]: 
Start: [Sandbox].[dbo].[Product].CreatedDT > Scalar Operator([Expr1005]), 
End: [Sandbox].[dbo].[Product].CreatedDT < Scalar Operator([Expr1006])

So it is translating the CONVERT predicate into a greater than and less than query. This gives us a clue as to another method for enabling the optimizer to efficiently utilize the index –

SELECT	ProductID, ProductName, CreatedDT
FROM	Product
WHERE	CreatedDT > '2014-11-21' AND CreatedDT < '2014-11-22';

Now let’s try and return all rows created in year 2014 –

SELECT	ProductID, ProductName, CreatedDT
FROM	Product
WHERE	YEAR(CreatedDT) = '2014';

Not surprisingly, the query optimizer scans all rows as the YEAR function is not sargable.

To enable the optimizer to use the IX_Product_CreatedDT index we can re-write the query as follows –

SELECT	ProductID, ProductName, CreatedDT
FROM	Product
WHERE	CreatedDT > '2014-1-1' AND CreatedDT < '2015-1-1';

query4

The number of logical reads is reduced from 54 to 8.

ISNULL and COALESCE

The following query returns all Product records with ProductLine equal to “L4”.

SELECT	ProductID, ProductName, ProductLine
FROM	Product
WHERE	ProductLine = 'L4';

The query optimizer performs an efficient seek on the IX_Product_ProductLine index.

query5

The ProductLine column is nullable, and if we want to return these as well we could use a ISNULL or COALESCE statement as follows –

SELECT	ProductID, ProductName, ProductLine
FROM	Product
WHERE	COALESCE(ProductLine, 'L4') = 'L4';

However, the optimizer now chooses a non-optimal scan on the index.

query6

The COALESE function, and also the ISNULL function, is not sargable. The fix is simply to change the predicate to use an OR .. IS NULL as shown below –

SELECT	ProductID, ProductName, ProductLine
FROM	Product
WHERE	ProductLine = 'L4' OR ProductLine IS NULL;

The optimizer now chooses an optimal seek on the index.

query7

Rule of Thumb

Sargable functions in SQL Server are few and far between. As a rule of thumb, I’d recommend that SQL Server developers, without a great deal of experience in optimizing queries, err on the side of caution by not applying functions, or operators, to the columns used in joins or predicates. If there is a imperative to do so, then consider whether there are alternative such as computed columns.

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s