If you Google "Temp Tables vs Table Variables" you will turn up a myriad of results, all of which contain excellent information. Some examples are:
SQL Server Planet – gives a fantastic matrix of the various things that can be accomplished through each option
Stack Overflow – answers from Rory and from Martin Smith – each of which gives you insight into the advantages and disadvantages of one vs. the other: things like scoping, transactional behavior, indexing, recompilation, locking and statistics. This is a tremendous amount of information and can really help you learn the internals of the two options.
Code Project – gives a nice little run-through of the performance of each type, with time to complete for varying record set sizes
Pretty much anything that you could want to know about temp tables and table variables you can find in one of these links.
What I want to discuss here is why table variables make such a big difference on larger data sets, since each of the above links clearly states the circumstances under which you do not want to use them. Below I have a very simple set of queries that demonstrates what I see as the biggest issue with using table variables on larger datasets.
First I will create my table variable and my temp table with the same dataset:
USE AdventureWorks2014;
GO

DECLARE @tbl table (
      SalesOrderID int
    , CarrierTrackingNumber varchar(40)
    , OrderQty int
    , ProductId int
    , UnitPrice float
);

INSERT INTO @tbl
SELECT SalesOrderID
     , CarrierTrackingNumber
     , OrderQty
     , ProductID
     , UnitPrice
FROM [Sales].[SalesOrderDetail];

IF OBJECT_ID('tempdb..#t') IS NOT NULL
    DROP TABLE #t;

CREATE TABLE #t (
      SalesOrderID int
    , CarrierTrackingNumber varchar(40)
    , OrderQty int
    , ProductId int
    , UnitPrice float
);

INSERT INTO #t
SELECT SalesOrderID
     , CarrierTrackingNumber
     , OrderQty
     , ProductID
     , UnitPrice
FROM [Sales].[SalesOrderDetail];
And from there I will select out the records, doing a simple join to a second table on a column that exists in both the temp table and the table variable.
SELECT t.SalesOrderID
     , soh.AccountNumber
     , soh.BillToAddressID
     , soh.OrderDate
FROM @tbl t
JOIN Sales.SalesOrderHeader soh
    ON t.SalesOrderID = soh.SalesOrderID;

SELECT t.SalesOrderID
     , soh.AccountNumber
     , soh.BillToAddressID
     , soh.OrderDate
FROM #t t
JOIN Sales.SalesOrderHeader soh
    ON t.SalesOrderID = soh.SalesOrderID;
Using SQL Sentry Plan Explorer, we can see first off that the estimated cost for the select against the table variable is less than that of the select against the temp table. In my case, this is against a table of about 120,000 rows.
Don't put too much stock in that number, because it is not always accurate. The cost is just an estimate, even when you are looking at the actual execution plan.
So, looking at the query plans, we see the following for the table variable:
This looks like a pretty good plan, right? Clustered index seek on the real table, nested loops to bring it together…nothing alarming about this at first glance.
Heck, compare it to the plan for the temp table
And it almost makes the table variable plan seem better. The temp table plan has a table scan against the real table, a hash match, and it even has that darn warning (in this case it is just a missing index suggestion, but at first glance you might think it is worse).
HOWEVER, this is actually the problem. The estimated cost makes the table variable plan seem better, and it even kind of looks better at first glance, but in reality it is a MUCH WORSE plan. Why does this happen?
Because that table variable ALWAYS has an estimated row count of 1 if you do not put optimizer hints on the query. Because of that, the optimizer assumes that doing index seeks instead of scans is a much better approach. In a case like this one, where we have over 100,000 rows, that is most assuredly NOT the case.
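If you want to see that 1-row estimate for yourself without opening a graphical plan, one option is SET STATISTICS PROFILE, which prints the actual and estimated row counts for every plan operator. This is a minimal sketch assuming the same AdventureWorks2014 database; the table variable here is a cut-down stand-in for the one above:

```sql
-- Sketch: surface the optimizer's row estimates as text output.
-- SET STATISTICS PROFILE adds Rows and EstimateRows columns for each
-- operator in the plan, so you can compare actual rows to the estimate.
SET STATISTICS PROFILE ON;

DECLARE @demo table (SalesOrderID int);

INSERT INTO @demo
SELECT SalesOrderID
FROM Sales.SalesOrderDetail;

-- On the scan of @demo, compare the Rows column (the real count,
-- over 100,000) against EstimateRows (which shows 1).
SELECT COUNT(*) FROM @demo;

SET STATISTICS PROFILE OFF;
```

Run the whole thing as a single batch, since the table variable goes out of scope at the batch boundary.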
That index seek with the nested loops to bring it together ends up doing over 360,000 reads in total, where the table scan with the hash match and the "warning" only does 1,246.
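Those read counts are easy to reproduce with SET STATISTICS IO, which reports per-table logical reads in the Messages output after each statement. A sketch, assuming it runs in the same batch as the setup above so @tbl is still in scope and #t still exists:

```sql
-- Sketch: compare logical reads for the two joins.
SET STATISTICS IO ON;

-- Table variable: the nested loops plan seeks SalesOrderHeader once
-- per row, so expect a very large logical read count.
SELECT t.SalesOrderID, soh.AccountNumber
FROM @tbl t
JOIN Sales.SalesOrderHeader soh
    ON t.SalesOrderID = soh.SalesOrderID;

-- Temp table: the scan plus hash match plan reads each table once,
-- so expect only a few thousand reads in total.
SELECT t.SalesOrderID, soh.AccountNumber
FROM #t t
JOIN Sales.SalesOrderHeader soh
    ON t.SalesOrderID = soh.SalesOrderID;

SET STATISTICS IO OFF;
```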
We can fairly easily remedy this situation by putting a simple Option (Recompile) at the end of the query that accesses the table variable, as you can see below. When that is done, the query plan the optimizer creates ends up being exactly the same as that of the temp table.
SELECT t.SalesOrderID
     , soh.AccountNumber
     , soh.BillToAddressID
     , soh.OrderDate
FROM @tbl t
JOIN Sales.SalesOrderHeader soh
    ON t.SalesOrderID = soh.SalesOrderID
OPTION (RECOMPILE);
This is a good solution, but it comes at the cost of expecting everyone who writes these queries to know this quirk and how to work around it.
Table variables have their place, but that estimated row count of 1 can definitely have an impact on the kinds of query plans the optimizer will choose. So when you are making that decision, take this into account and choose wisely – or remember that Option (Recompile).