中文
这个问题在最近写存储过程时困扰了N久,用BAIDU查遍国内的各网站使用没有一个好的解决方案,最终通过GOOGLE搜索关键字"SQL SERVER DYNAMIC SQ"L,再通过各国外论坛才找到一个不起眼的网站的某页面.
http://www.sommarskog.se/dynamic_sql.html
文章的标题是<The Curse and Blessings of Dynamic SQL>,由此可见此问题让人恨,而一旦解决就能让人爱哩......
正如文章中所称,该文是为了解决诸如此类的问题:
SELECT * FROM @tablename
SELECT @colname FROM tbl
SELECT * FROM tbl WHERE x IN (@list)
而我最头疼的就是第一条,试图通过 select @name=name from @tablename来获取某动态表的变量.但文章中明确指出,同时也是我试了N次得到的结论
动态SQL只能通过如
@SQL='select @name=name from '+@tablename
要运行这条,大多数人都是通过EXEC(@SQL),但是,@name类似程序中局部变量,生存期只有在EXEC中.无法在SQL存储过程中使用.
在文章中终于找到解决办法......
范例程序
DECLARE @sql nvarchar(4000), -- nvarchar(MAX) on SQL 2005.
@col sysname,
@min varchar(20)
SELECT @col = N'au_fname'
SELECT @sql = N'SELECT @min = convert(varchar(20), MIN(' + @col +
N')) FROM authors'
EXEC sp_executesql @sql, N'@min varchar(20) OUTPUT', @min OUTPUT
SELECT @min
终于可以解决哩.......但要注意的是,sp_executesql后的参数只能是NVARCHAR,VARCHAR之类的字符型.
终于解决了.......
该文中还有一些防止SQL注入之类的讨论,以及对新手讲解SQL存储过程的意义何在.
做SQL数据库,并转向存储开发过程的人必看.....
技术,还是老外牛X......而且老外更倾向技术的交流,而国内号称有许多高手,真正乐意分享的太少太少......
英文原文
The Curse and Blessings of Dynamic SQL
An SQL text by Erland Sommarskog, SQL Server MVP.
An earlier version of this article is also available in Korean, German, Spanish and Vietnamese. Translations provided by ASP MVP Jongshin Kim, SQL Server MVP Frank Kalis, Simon Hayes and Tam Vu respectively.
If you follow the various newsgroups on Microsoft SQL Server, you often see people asking why they can't do:
SELECT * FROM @tablenameSELECT @colname FROM tblSELECT * FROM tbl WHERE x IN (@list)
For all three examples you can expect someone to answer Use dynamic SQL and give a quick example on how to do it. Unfortunately, for all three examples above, dynamic SQL is a poor solution. On the other hand, there are situations where dynamic SQL is the best or only way to go.
In this article I will discuss the use of dynamic SQL in stored procedures and to a minor extent from client languages. To set the scene, I start with a very quick overview on application architecture for data access. I then proceed to describe the feature dynamic SQL as such, with a quick introduction followed by the gory syntax details. Next, I continue with a discussion on SQL injection, a security issue that you absolutely must have good understanding of when you work with dynamic SQL. This is followed by a section where I discuss why we use stored procedures, and how that is affected by the use of dynamic SQL. I carryon with a section on good practices and tips for writing dynamic SQL. I conclude by reviewing a number of situations where you could use dynamic SQL and whether it is a good or bad idea to do it.
The article covers all versions of SQL Server from SQL 6.5 to SQL 2005, with emphasis on SQL 2000 and SQL 2005.
Contents:
Introducing Dynamic SQL
A First Encounter
sp_executesql
SQL Injection – a Serious Security Issue
Dynamic SQL and Stored Procedures
The Permission System
Caching Query Plans
Reducing Network Traffic
Encapsulating Logic
Keeping Track of what Is Used
Easiness of Writing SQL Code
Addressing Bugs and Problems
Good Coding Practices and Tips for Dynamic SQL
Use Debug Prints!
Nested Strings
Spacing and Formatting
Dealing with Dynamic Table and Column Names
Quotename, Nested Strings and Quotestring
QUOTED_IDENTIFIER
sp_executesql and Long SQL Strings in SQL 2000
Dynamic SQL in User-Defined Functions
Cursors and Dynamic SQL
Common Cases when to (Not) Use Dynamic SQL
SELECT * FROM @tablename
SELECT * FROM sales + @yymm
UPDATE tbl SET @colname = @value WHERE keycol = @keyval
SELECT * FROM @dbname + '..tbl'
SELECT * FROM tbl WHERE col IN (@list)
SELECT * FROM tbl WHERE @condition
Dynamic Search Conditions
SELECT * FROM tbl ORDER BY @col
SELECT TOP @n FROM tbl
CREATE TABLE @tbl
CREATE TABLE with Unknown Columns
Linked Servers
OPENQUERY
Dynamic Column Widths
Dynamic SQL and Maintenance Tasks
Acknowledgements and Feedback
Revision History
Note: many of the code samples in this text works against the pubs and Northwind databases that ship with SQL 2000 and SQL 7, but not with SQL 2005. You can download these databases from Microsoft's web site.
Accessing a Data from an Application
Before I describe dynamic SQL, I like to briefly discuss the various ways you can access data from an application to give an overview of what I'll be talking about in this article.
(Note: all through this text I will refer to client as anything that accesses SQL Server from the outside. In the overall application architecture that may in fact be a middle tier or a business layer, but as that is of little interest to this article, I use client in the sake of brevity.)
There are two main roads to go, and then there are forks and sub-forks.
- Send SQL statements from the client to SQL Server.
- Rely on SQL generated by the client API, using options like CommandType.TableDirect and methods like .Update.
- Compose the SQL strings in the client code.
- Build the entire SQL string with parameter values expanded.
- Use parameterised queries.
- Perform access through stored procedures.
- Stored procedures in
T-SQL - Use static SQL only.
- Use dynamic SQL together with static SQL.
- Stored procedures in a CLR language such as C# or VB .Net. (SQL 2005 only.)
- Stored procedures in
Fork
Many applications are built along the principles of fork
The main focus for this text is sub-fork
Finally, fork
Introducing Dynamic SQL
In this chapter I will first look at some quick examples of dynamic SQL and point out some very important implications of using dynamic SQL. I will then describe sp_executesql and
A First Encounter
Understanding dynamic SQL itself is not difficult. Au contraire, it's rather too easy to use. Understanding the fine details, though, takes a little longer time. If you start out using dynamic SQL casually, you are bound to face accidents when things do not work as you have anticipated.
One of the problems listed in the introduction was how to write a stored procedure that takes a table name as its input. Here are two examples, based on the two ways to do dynamic SQL in Transact-SQL:
CREATE PROCEDURE general_select1 @tblname sysname,@key varchar(10) ASDECLARE @sql nvarchar(4000)SELECT @sql = ' SELECT col1, col2, col3 ' +' FROM dbo.' + quotename(@tblname) +' WHERE keycol = @key'EXEC sp_executesql @sql, N'@key varchar(10)', @key
CREATE PROCEDURE general_select2 @tblname nvarchar(127),@key varchar(10) ASEXEC('SELECT col1, col2, col3FROM ' + @tblname + 'WHERE keycol = ''' + @key + '''')Before I say anything else, permit me to point out that these are examples of bad usage of dynamic SQL. Passing a table name as a parameter is not how you should write stored procedures, and one aim of this article is to explain this in detail. Also, the two examples are not equivalent. While both examples are bad, the second example has several problems that the first does not have. What these problems are will be apparent as you read this text.
Whereas the above looks very simple and easy, there are some very important things to observe. The first thing is permissions. You may know that when you use stored procedures, users do not need permissions to access the tables accessed by the stored procedure. This does not apply when you use dynamic SQL! For the procedures above to execute successfully, the users must have SELECT permission on the table in @tblname. In SQL 2000 and earlier this is an absolute rule with no way around it. SQL 2005 provides alternative ways, something I will come back to in the section The Permission System.
Next thing to observe is that the dynamic SQL is not part of the stored procedure, but constitutes its own scope. Invoking a block of dynamic SQL is akin to call a nameless stored procedure created ad-hoc. This has a number of consequences:
- Within the block of dynamic SQL, you cannot access local variables (including table variables) or parameters of the calling stored procedure. But you can pass parameters – in and out – to a block of dynamic SQL if you use sp_executesql.
- Any USE statement in the dynamic SQL will not affect the calling stored procedure.
- Temp tables created in the dynamic SQL will not be accessible from the calling procedure since they are dropped when the dynamic SQL exits. (Compare to how temp tables created in a stored procedure go away when you exit the procedure.) The block of dynamic SQL can however access temp tables created by the calling procedure.
- If you issue a SET command in the dynamic SQL, the effect of the SET command lasts for the duration of the block of dynamic SQL only and does not affect the caller.
- The query plan for the stored procedure does not include the dynamic SQL. The block of dynamic SQL has a query plan of its own.
As you've seen there are two ways to invoke dynamic SQL, sp_executesql and
sp_executesql
sp_executesql is a built-in stored procedure that takes two pre-defined parameters and any number of user-defined parameters.
The first parameter @stmt is mandatory, and contains a batch of one or more SQL statements. The data type of @stmt is ntext in SQL 7 and SQL 2000, and nvarchar(MAX) in SQL 2005. Beware that you must pass an nvarchar/ntext value (that is, a Unicode value). A varchar value won't do.
The second parameter @params is optional, but you will use it 90% of the time. @params declares the parameters that you refer to in @stmt. The syntax of @params is exactly the same as for the parameter list of a stored procedure. The parameters can have default values and they can have the OUTPUT marker. Not all parameters you declare must actually appear in the SQL string. (Whereas all variables that appear in the SQL string must be declared, either with a DECLARE inside @stmt, or in @params.) Just like @stmt, the data type of @params in SQL 7/2000 is ntext and nvarchar(MAX) in SQL 2005.
The rest of the parameters are simply the parameters that you declared in @params, and you pass them as you pass parameters to a stored procedure, either positional or named. To get a value back from your output parameter, you must specify OUTPUT with the parameter, just like when you call a stored procedure. Note that @stmt and @params must be specified positionally. You can provide the parameter names for them, but these names are blissfully ignored.
Let's look at an example. Say that in your database, many tables have a column LastUpdated, which holds the time a row last was updated. You want to be able to find out how many rows in each table that were modified at least once during a period. This is not something you run as part of the application, but something you run as a DBA from time to time, so you just keep it as a script that you have a around. Here is how it could look like:
DECLARE @tbl sysname,@sql nvarchar(4000),@params nvarchar(4000),@count intDECLARE tblcur CURSOR STATIC LOCAL FORSELECT object_name(id) FROM syscolumns WHERE name = 'LastUpdated'ORDER BY 1OPEN tblcurWHILE 1 = 1BEGINFETCH tblcur INTO @tblIF @@fetch_status <> 0BREAKSELECT @sql =N' SELECT @cnt = COUNT(*) FROM dbo.' + quotename(@tbl) +N' WHERE LastUpdated BETWEEN @fromdate AND ' +N' coalesce(@todate, ''99991231'')'SELECT @params = N'@fromdate datetime, ' +N'@todate datetime = NULL, ' +N'@cnt int OUTPUT'EXEC sp_executesql @sql, @params, '20060101', @cnt = @count OUTPUTPRINT @tbl + ': ' + convert(varchar(10), @count) + ' modified rows.'ENDDEALLOCATE tblcur
I've put the lines that pertain directly to the dynamic SQL in bold face. You can see that I have declared the @sql and @params variables to be of the maximum length for nvarchar variables in SQL 2000. In SQL 2005, you may want to make it a routine to make @sql nvarchar(MAX), more about this just below.
When I assign the @sql variable, I am careful to format the statement so that it is easy to read, and I leave in spaces to avoid that two concatenated parts are glued together without space in between, which could cause a syntax error. I put the table name in '' around the date literal – the rule in
In this example, the dynamic SQL has three parameters: one mandatory input parameter, one optional input parameter, and one output parameter. I've assumed that this time the DBA wanted to see all changes made after 2006-01-01, which is why I've left out @todate in the call to sp_executesql. Since I left out one variable, I must specify the last, @cnt by name – the same rules as when you call a stored procedure. Note also that the variable is called @cnt in the dynamic SQL, but @count in the surrounding script. Normally, you might want to use the same name, but I wanted to stress that the @cnt in the dynamic SQL is only visible within the dynamic SQL, whereas @count is not visible there.
You may note that I've prepend the string literals with N to denote that they are Unicode strings. As @sql and @params are declared as nvarchar, technically this is not necessary (as long as you stick your 8-bit character set). However, would you provide any of the strings directly in the call to sp_executesql, you must specify the N, as in this fairly silly example:
EXEC sp_executesql N'SELECT @x', N'@x int', @x = 2
If you remove any of the Ns, you will get an error message. Since sp_executesql is a built-in stored procedure, there is no implicit conversion from varchar.
You may wonder why I do not pass @tbl as a parameter as well. The answer is that you can't. Dynamic SQL is just like any other SQL. You can't specify a table name through a variable in
If you are on SQL 2000 or SQL 7, there is a limitation with sp_executesql when it comes to the length of the SQL string. While the parameter is ntext, you cannot use this data type for local variables. Thus, you will have to stick to nvarchar(4000). In many cases this will do fine, but it is not uncommon to exceed that limit. In this case, you will need to use
On SQL 2005, this is not an issue. Here you can use the new data type nvarchar(MAX) which can hold as much data as ntext, but without the many restrictions of ntext.
EXEC()
FETCH tblcur INTO @tblIF @@fetch_status <> 0 BREAKEXEC('UPDATE STATISTICS [' + @tbl + '] WITH FULLSCAN')In the example with sp_executesql, I used
EXEC('UPDATE STATISTICS ' + quotename(@tbl) + ' WITH FULLSCAN')Best practice is to always use a variable to hold the SQL statement, so the example would better read:
FETCH tblcur INTO @tblIF @@fetch_status <> 0 BREAKSELECT @sql = 'UPDATE STATISTICS ' + quotename(@tbl) + ' WITH FULLSCAN'EXEC(@sql)
The fact that you can concatenate strings within
EXEC(@sql1 + @sql2 + @sql3)
Where all of @sql1, @sql2 and @sql3 can be 4000 characters long – or even 8000 characters as
Since you cannot use parameters, you cannot as easily get values out from
In SQL 2005,
EXEC(@sql) AS USER = 'mitchell'EXEC(@sql) AS LOGIN = 'CORDOBA\Miguel'
This is mainly a syntactical shortcut that saves you from embedding the invocation of dynamic SQL in EXECUTE AS and REVERT. (I discuss these statements more in detail in my article Granting Permissions Through Stored Procedures.)
SQL 2005 adds a valuable extension to
SQL Injection – a Serious Security Issue
Before you start to use dynamic SQL all over town, you need to learn about SQL injection and how you protect your application against it. SQL injection is a technique whereby an intruder enters data that causes your application to execute SQL statements you did not intend it to. SQL injection is possible as soon there is dynamic SQL which is handled carelessly, be that SQL statements sent from the client, dynamic SQL generated in
Here is an example. The purpose of the procedure below is to permit users to search for orders by various conditions. A real-life example of such a procedure would have many more parameters, but I've cut it down to two to be brief. (This is, by the way, a problem for which dynamic SQL is a very good solution.) As the procedure is written, it is open for SQL injection:
CREATE PROCEDURE search_orders @custid nchar(5) = NULL,@shipname nvarchar(40) = NULL ASDECLARE @sql nvarchar(4000)SELECT @sql = ' SELECT OrderID, OrderDate, CustomerID, ShipName ' +' FROM dbo.Orders WHERE 1 = 1 'IF @custid IS NOT NULLSELECT @sql = @sql + ' AND CustomerID LIKE ''' + @custid + ''''IF @shipname IS NOT NULLSELECT @sql = @sql + ' AND ShipName LIKE ''' + @shipname + ''''EXEC(@sql)
Before we look at a real attack, let's just discuss this from the point of view of user-friendliness. Assume that the input for the parameters @custid and @shipname comes directly from the user and a naïve and innocent user wants to look for orders where ShipName is Let's Stop N Shop, so he enters Let's. Do you see what will happen? Because @shipname includes a single quote, he will get a syntax error. So even if you think that SQL injection is no issue to you, because you trust your users, you still need to read this section, so that they can search for Brian O'Brien and Samuel Eto'o.
So this is the starting point. A delimiter, usually a single quote, affects your dynamic SQL, and a malicious user can take benefit of this. For instance, consider this input for @shipname:
' DROP TABLE Orders --
The resulting SQL becomes:
SELECT * FROM dbo.Orders WHERE 1 = 1 AND ShipName LIKE '' DROP TABLE orders --'
This is a perfectly legal batch of
Typically, an attacker first tests what happens if he enters a single quote (') in an input field or a URL. If this yields a syntax error, the attacker knows that there is a vulnerability. He then finds out if he needs any extra tokens to terminate the query, and then he can add his own SQL statement. Finally he adds a comment character to kill the rest of the SQL string to avoid syntax errors. Single quote is the most common character to reveal openings for SQL injection, but if you have dynamic table and column names, there are more options an attacker could succeed with. Take this dreadful version of general_select:
CREATE PROCEDURE general_select2 @tblname nvarchar(127),@key varchar(10) ASEXEC('SELECT col1, col2, col3FROM ' + @tblname + 'WHERE keycol = ''' + @key + '''')and assume that @tblname comes from a URL. There are quite some options that an attacker could use to take benefit of this hole.
And don't overlook numeric values: they can very well be used for SQL injection. Of course, in a
Keep in mind that user input comes from more places than just input fields on a form. The most commonly used area for injection attacks on the Internet is probably parameters in URLs and cookies. Thus, be very careful how you handle anything that comes from the user.
You may think that it takes not only skill, but also luck for someone to find and exploit a hole for SQL injection. But remember that there are too many hackers out there with too much time on their hands. SQL injection is a serious security issue, and you must take precautions to protect your applications against it.
Thankfully, it is not difficult at all. I've seen mentioning of various ways to validate input data, but all that is a joke. There are three steadfast principles you need to follow:
- Never run with more privileges than necessary. Users that log into an application with their own login should normally only have EXEC permissions on stored procedures. If you use dynamic SQL, it should be confined to reading operations so that users only need SELECT permissions. A web site that logs into a database should not have any elevated privileges, preferably only EXEC and (maybe) SELECT permissions. Never let the web site log in as sa!
- For web applications: never expose error messages from SQL Server to the end user.
- Always used parameterised statements. That is, in a
T-SQL procedure use sp_executesql, notEXEC() .
The first point is mainly a safeguard, so that if there is a injection hole, the intruder will not be able to do that much harm. The second point makes the task for the attacker more difficult as he cannot get feedback from his attempts.
But it is the third point that is the actual protection, and that we will look a little closer at. The procedure search_orders above should be coded as:
CREATE PROCEDURE search_orders @custid nchar(5) = NULL,@shipname nvarchar(40) = NULL ASDECLARE @sql nvarchar(4000)SELECT @sql = ' SELECT OrderID, OrderDate, CustomerID, ShipName ' +' FROM dbo.Orders WHERE 1 = 1 'IF @custid IS NOT NULLSELECT @sql = @sql + ' AND CustomerID LIKE @custid 'IF @shipname IS NOT NULLSELECT @sql = @sql + ' AND ShipName LIKE @shipname 'EXEC sp_executesql @sql, N'@custid nchar(5), @shipname nvarchar(40)',@custid, @shipname
Since the SQL string does include any user input, there is no opening for SQL injection. It's as simple as that. By the way, note that since we can include parameters in the parameter list, even if they don't actually appear in the SQL string, we don't need any complicated logic to build the parameter list, but can keep it static. In the same vein, we can always pass all input parameters to the SQL string.
As you may recall, you cannot pass everything as parameters to dynamic SQL, for instance table and column names. In this case you must enclose all such object names in
The example above was for dynamic SQL in a
Set cmd = CreateObject("ADODB.Command")Set cmd.ActiveConnection = cnncmd.CommandType = adCmdTextcmd.CommandText = " SELECT OrderID, OrderDate, CustomerID, ShipName " & _" FROM dbo.Orders WHERE 1 = 1 "If custid <> "" Thencmd.CommandText = cmd.CommandText & " AND CustomerID LIKE ? "cmd.Parameters.Appendcmd.CreateParameter("@custid", adWChar, adParamInput, 5, custid)End IfIf shipname <> "" Thencmd.CommandText = cmd.CommandText & " AND ShipName LIKE ? "cmd.Parameters.Append cmd.CreateParameter("@shipname", _adVarWChar, adParamInput, 40, shipname)End IfSet rs = cmd.ExecuteSince the main focus of this text is dynamic SQL in ? as a parameter marker, and you can only pass parameters that actually appear in the SQL string. (If you specify too many parameters, you will get a completely incomprehensible error message.) If you use the SQL Profiler to see what ADO sends to SQL Server, you will find that it invokes – sp_executesql.
Protection against SQL injection is not the only advantage of using parameterised queries. In the section Caching Query Plans, we will look more in detail on parameterised queries and at a second very important reason to use them. This section also includes an example of composing and sending a parameterised SQL statement for SqlClient in VB .Net.
You may think that an even better protection against SQL injection is to use stored procedures with static SQL only. Yes, this is true, but! It depends on how you call your stored procedures from the client. If you compose an EXEC command into which you interpolate the input values, you are back on square one and you are as open to SQL injection as ever. In ADO, you need to call your procedure with the command type adCmdStoredProc and use .CreateParameter to specify the parameters. By specifying adCmdStoredProc, you call the stored procedure through RPC, Remote Procedure Call, which not only protects you against SQL injection, but it is also more efficient. Similar measures apply to other client APIs; all APIs I know of supply a way to call a stored procedure through RPC.
Dynamic SQL and Stored Procedures
In the introduction, I presented various strategies for data-access for an application, and I said that in many shops all data access is through stored procedures. In this section, I will look a little closer at the advantages with using stored procedures over sending SQL statements from the client. I will also look at what happens when you use dynamic SQL in a stored procedure, and show that you lose some of the advantages with stored procedures, whereas other are unaffected.
The Permission System
Historically, using stored procedures has been the way to give users access to data. In a locked-down database, users do not have permissions to access tables directly. Instead, the application performs all access through stored procedures that retrieve and update data in a controlled way, so that users only get to see data they have access to, and they cannot perform updates that violate business rules. This works as long as the procedure and the tables have the same owner, typically dbo (the database owner), through a mechanism known as ownership chaining.
As I have already mentioned, ownership chaining does not work when you use dynamic SQL. The reason for this is very simple: the block of dynamic SQL is not a procedure and does not have any owner. Thus the chain is broken.
SQL 2005
In SQL 2005 this can be addressed by signing a procedure that uses dynamic SQL with a certificate. You associate the certificate with a user, and grant that user (which is a user that cannot log in) the rights needed for the dynamic SQL to execute successfully. A second method in SQL 2005 is to use the EXECUTE AS clause to impersonate a user that has been granted the necessary permissions. This method is easier to use, but has side effects that can have unacceptable consequences for auditing, row-level security schemes and system monitoring. For this reason, my strong recommendation is to use certificates.
Describing these methods more closely, would take up too much space here. Instead I've written a separate article about them, Giving Permissions through Stored Procedures, where I discusses both certificates and impersonation in detail, and I also take a closer look on ownership chaining.
If you write CLR procedures that perform data access, the same is true for them. Ownership chaining never applies since all data access in a CLR procedure is through dynamic SQL. But you can use certificates or impersonation to avoid having to give users direct permissions on the tables.
SQL 2000 and earlier
On SQL 2000 there is no way to combine dynamic SQL with the encapsulation of permissions that you can get through stored procedures. Any use of dynamic SQL requires that the users have direct permissions on the accessed tables. If your security scheme precludes giving users permissions to access tables directly, you cannot use dynamic SQL. It is that plain and simple. Depending on the sensitivity of the data in the application, it may be acceptable to give the users SELECT permissions on the tables (or on some tables) to permit the use of dynamic SQL. I strongly recommend against granting users INSERT, UPDATE and DELETE rights on tables only to permit dynamic SQL in some occasional procedure.
There are nevertheless two alternatives, application roles and "application proxies", but they require you to change the application architecture, so it is nothing you introduce at whim.
Application roles were introduced in SQL 7. Users log into SQL Server but have no permissions on their own beyond the database access. Instead, the application activates the application role by sending a password somehow embedded into it, and this application role has permissions to read and update tables. With application roles, it does not really matter if you use stored procedures or not. The same is true for "application proxies" where the application authenticates the users outside SQL Server and logs into SQL Server on their behalf with a proxy login. This proxy login impersonates the users in SQL Server, and thus their permissions apply. However, since the users do not have any login, they cannot log into SQL Server outside the application. In Giving Permissions..., I discuss these two methods a little further.
For both ofr these methods, keep in mind about SQL injection, and don't grant your application role or the login-less users anything beyond SELECT, INSERT, UPDATE and DELETE permissions on tables. (And preferably only SELECT, and stick all updates into stored procedures with static SQL).
Caching Query Plans
Every query you run in SQL Server requires a query plan. When you run a query the first time, SQL Server builds a query plan for it – or as the terminology goes – it compiles the query. SQL Server saves the plan in cache, and next time you run the query, the plan is reused. The query plan stays in cache until it's aged out because it has not been used for a while, or it is invalidated for some reason. (Why this happens falls outside the scope of this article.)
The reuse of cached query plans is very important for the performance of queries where the compilation time is in par with the execution time or exceeds it. If a query needs to run for four minutes, it does not matter much if the query is recompiled for an extra second each time. On the other hand, if the execution time of the query is 40 ms but it takes one second to compile the query, there is a huge gain with the cached plan, particularly if the query is executed over and over again.
Up to SQL 6.5 the only plans there were put into the cache were plans for stored procedures. Loose batches of SQL were compiled each time. And since the query plan for dynamic SQL is not part of the stored procedure, that includes dynamic SQL as well. Thus in SQL 6.5, the use of dynamic SQL nullified the benefit with stored procedures in this regard.
Starting with SQL 7, SQL Server also caches the plans for bare statements sent from a client or generated through dynamic SQL. Say that you send this query from the client, or execute it with
SELECT O.OrderID, SUM(OD.UnitPrice * OD.Quantity)FROM Orders OJOIN [Order Details] OD ON O.OrderID = OD.OrderIDWHERE O.OrderDate BETWEEN '19980201' AND '19980228'AND EXISTS (SELECT *FROM [Order Details] OD2WHERE O.OrderID = OD2.OrderIDAND OD.ProductID = 76)GROUP BY O.OrderID
The query returns the total order amount for the orders in February 1998 that contained the product Lakkalikööri. SQL Server will put the plan into the cache, and next time you run this query, the plan will be reused. But only if it is exactly the same query. Since the cache lookup is by a hash value computed from the query text, the cache is space- and case-sensitive. Thus, if you add a single space somewhere, the plan is not reused. More importantly, it is not unlikely that next time you want to run the query for a different product, or a different period.
All this changes, if you instead use sp_executesql to run your query with parameters:
DECLARE @sql nvarchar(2000)SELECT @sql = 'SELECT O.OrderID, SUM(OD.UnitPrice * OD.Quantity)FROM dbo.Orders OJOIN dbo.[Order Details] OD ON O.OrderID = OD.OrderIDWHERE O.OrderDate BETWEEN @from AND @toAND EXISTS (SELECT *FROM dbo.[Order Details] OD2WHERE O.OrderID = OD2.OrderIDAND OD.ProductID = @prodid)GROUP BY O.OrderID'EXEC sp_executesql @sql, N'@from datetime, @to datetime, @prodid int','19980201', '19980228', 76
The principle for cache lookup is the same as for a non-parameterised query: SQL Server hashes the query text and looks up the hash value in the cache, still in a case- and space-sensitive fashion. But since the parameter values are not part of the query text, the same plan can be reused even when the input changes.
To make this really efficient there is one more thing you need to observe. Do you see that I've prefixed all tables in the query with dbo? There is a very important reason for this. On SQL 2000, this is an absolute must for effecient use of the query-plan cache. If you leave out dbo from a single table, each user will get his own copy of the plan in the cache. This is because on SQL 2000, each user has a default schema which is equal to the username. So when user1 runs a query that goes "SELECT ... FROM Orders", SQL Server must first check if there is a table user1.Orders, before it looks for dbo.Orders. Since user1.Orders could appear on the scene at any time, SQL Server needs to have a separate plan for each user.
The recommendation to include dbo applies very much to SQL 2005 as well, but it is not an absolute must. This is because on SQL 2005, owner and schema has been separated from each other so that users can have dbo as their default schema, in which case they can share a plan even when tables are not prefixed with dbo. (Since only Orders will be unambiguous.) Note here that for this to happen, users must have been created with the new command CREATE USER. If you – or the DBA – use sp_adduser out of habit, users still have their own default schema, and the situation is the same as on SQL 2000. Thus, to be safe, always prefix your tables with dbo in your dynamic SQL on SQL 2005 as well. (Unless, of course, you are actually using different schemas and not only the dbo schema, something which is a lot easier to do now. If you want to read more about owner/schema separation, there is a section on it in my article Granting Permissions through Stored Procedures.)
If you instead use stored procedures, it is not equally important to prefix tables with dbo. Microsoft still recommends that you do, but even if you don't, users with different default schema can share the same query plan.
From what I have said here, it follows that if you use dynamic SQL with
So far, I've only talked about dynamic SQL in stored procedures. But in this regard there is very little difference to SQL statements sent from the client, or SQL statements generated in CLR procedures. The same rules apply: unparameterised statements are cached but with little probability for reuse, whereas parameterised queries can be as efficient as stored procedures if you remember to always prefix the tables with dbo. (And still with the caveat that the cache lookup is space- and case-sensitive.) Most client APIs implement parameterised queries by calling sp_executesql under the covers.
In the section on SQL Injection, I included an example on how to do parameterised queries with ADO and VB6. Here is an example with VB .Net and SqlClient:
cmd.CommandType = System.Data.CommandType.Textcmd.CommandText = _" SELECT O.OrderID, SUM(OD.UnitPrice * OD.Quantity)" & _" FROM dbo.Orders O " & _" JOIN dbo.[Order Details] OD ON O.OrderID = OD.OrderID" & _" WHERE O.OrderDate BETWEEN @from AND @to" & _" AND EXISTS (SELECT *" & _" FROM dbo.[Order Details] OD2" & _" WHERE O.OrderID = OD2.OrderID" & _" AND OD.ProductID = @prodid)" & _" GROUP BY O.OrderID"cmd.Parameters.Add("@from", SqlDbType.Datetime)cmd.Parameters("@from").Value = "1998-02-01"cmd.Parameters.Add("@to", SqlDbType.Datetime)cmd.Parameters("@to").Value = "1998-02-28"cmd.Parameters.Add("@prodid", SqlDbType.Int)cmd.Parameters("@prodid").Value = 76In difference to ADO, SqlClient uses names with @ for parameters. The syntax for defining parameters is similar to ADO, but not identical. This article is long enough, so I will not go into details on how the Parameters collection works. Instead, I refer you to MSDN where both SqlClient and ADO are documented in detail. Whatever client API you are using, please learn how to use parameterised commands with it. Yes, there is a tone of desperation in my voice. I don't know how many posts I've seen on the newsgroups over the years where people build their SQL strings by interpolating the values from input fields into the SQL string, and thereby degrading the performance of their application, and worst of all opening their database to SQL injection.
... and just when you thought you were safe, I need to turn this upside down. Recall what I said in the beginning of this section, that if the query is going to run for four minutes, one second extra for compilation is not a big deal. And if that recompilation slashes the execution time from forty minutes to four, there is a huge gain. Most queries benefit from cached parameterised plans, but not all do. Say that you have a query where the user can ask for data for some time span. If the user asks for a summary for a single day, there is a good non-clustered index that can be used for a sub-second response time. But if the request is for the entire year, the same index would be a disaster, and a table scan is to prefer. On SQL 2005 you can force a query to be recompiled each time it is executed by adding OPTION (RECOMPILE) to the end of the query, and thus you can still use sp_executesql to get the best protection against SQL injection. On SQL 2000 and earlier, it may in fact be better to interpolate critical parameters into the query string when you need to force recompilation each time.
For the sake of completeness, I should mention that SQL Server is able to auto-parameterise queries. If you submit:
SELECT OrderID, OrderDate FROM dbo.Orders WHERE CustomerID = N'ALFKI'
SQL Server may recast this as
SELECT OrderID, OrderDate FROM dbo.Orders WHERE CustomerID = @P1
so if next time you submit BERGS instead of ALFKI, the query plan will be reused. Auto-parameterisation comes in two flavours: simple and forced. Simple is the default and is the only option on SQL 2000 and earlier. With simple parameterisation, auto-parameterisation happens only with very simple queries, and, it seems, with some inconsistency. With forced parameterisation, SQL Server parameteries all queries that comes its way (with some exceptions documented in Books Online). Forced parameterisation is, in my opinion, mainly a setting to cover up for poorly designed third-party application that uses unparameterised dynamic SQL. For your own development you should not rely on any form of auto-parameterisation. (But in the situation you really a want a new query plan each time, you may have to verify that it doesn't happen when you don't want to.)
They say seeing is believing. Here is a demo that you can try on yourself, if you have SQL 2005. First create this database:
CREATE DATABASE many_spsgoUSE many_spsgoDECLARE @sql nvarchar(4000),@x intSELECT @x = 200WHILE @x > 0BEGINSELECT @sql = 'CREATE PROCEDURE abc_' + ltrim(str(@x)) +'_sp @orderid int ASSELECT O.OrderID, O.OrderDate, O.CustomerID, C.CompanyName,Prodcnt = OD.cnt, Totalsum = OD.totalFROM Northwind..Orders OJOIN Northwind..Customers C ON O.CustomerID = C.CustomerIDJOIN (SELECT OrderID, cnt = COUNT(*), total = SUM(Quantity * UnitPrice)FROM Northwind..[Order Details]GROUP BY OrderID) AS OD ON OD.OrderID = O.OrderIDWHERE O.OrderID = @orderid'EXEC(@sql)SELECT @x = @x - 1END
Then in SQL Server Management Studio, press F7 navigate down to the list of stored procedures. Select all procedures. Then from the context menu select to script them as CREATE TO to a new query window. How long time this takes depends on your hardware, but on my machine it took 90 seconds and at the same time SQL Server grabbed over 250 MB of memory. If you use the Profiler to see what Mgmt Studio is up to, you will see that for each procedure, Mgmt Studio emits a couple of queries with the procedure name embedded. That is, no parameterised statements. Once scripting is complete, issue this command:
ALTER DATABASE many_sps SET PARAMETERIZATION FORCED
and redo the operation. On my machine scripting now completed in five seconds!. This demonstrates that the difference between parameterised and unparameterised can be dramatic. (And that Microsoft can not use their own products properly.) If you run SQL Server on your local machine, you can see this from one more angle, you can stop and restart SQL Server before the two scripting operations, and then use Task Manager to see how much physical memory SQL Server uses in the two cases. That difference lies entirely in the plan cache.
Reducing Network Traffic
Another advantage with stored procedures over SQL sent from the client is that less bytes travel the network. Rather than sending a 50-line query over the network, you only need to pass the name of a stored procedure and a few parameters. This gets more significant if the computation requires several queries, possibly with logic in between. If all logic is outside the database, this could mean that data has to travel up to the client, only to travel back in the next moment. With stored procedures you can use temp tables to hold intermediate results. (You can use temp tables from outer layers as well, although it may require some careful use of your client API.)
In this case, the dividing line goes between sending SQL from the client or running stored procedures. If the stored procedures use static SQL only, or invoke dynamic SQL does not matter, nor does it matter if it is a CLR procedure. You still get the gains of reduced network traffic.
Encapsulating Logic
This is not a question of security or performance, but one of good programming practice and modularising your code. By using stored procedures, you don't have to bog down your client code with the construction of SQL statements. Then again, it depends a little on what you put into those stored procedure. Myself, I am of the school that the business logic should be where the data is, and in this case there is no dispute that you should use stored procedures to encapsulate your logic.
But there are also people who prefer to see the database as a unintelligent container of data, and who prefer to have the business logic elsewhere. In this case, the arguments for using stored procedures for encapsulation may not be equally compelling. You could just as well employ careful programming practices in your client language and send SQL strings.
Nothing of this changes if you use dynamic SQL in your stored procedures. The stored procedure is still a container for some piece of logic, and how it looks on the inside does not matter. I'm here assuming that most of your procedures use static SQL only. If all your stored procedures generate dynamic SQL, then you are probably better off in this regard to do it all in client code. Then again, sometimes there is no other application than Query Analyzer or SQL Server Management Studio. (Typically this would be tasks that are run by an admin.) In this case, the only container of logic available is stored procedures, and it's immaterial whether they use dynamic SQL or not.
Keeping Track of what Is Used
In a complex system with hundreds of tables, you may need to know where a certain table or column is referenced, because you are considering changing or dropping it. If all access to tables is from static SQL in stored procedures, you may be able find all references by using the system stored procedure sp_depends or query a system table directly. (sysdepends in SQL 2000, sys.sql_dependencies in SQL 2005.) I say may, because it is very difficult to maintain complete dependency information in SQL Server. If you drop and recreate a table, all dependency information for the table is lost. What I do myself is to regularly build an empty database from our version-control system, and since our build tool loads all tables before any stored procedure or trigger, I know that I can trust the dependency information in that database.
If you throw dynamic SQL into the mix – be that SQL sent from client, dynamic SQL in
While the main dividing line here is between static SQL and any form of dynamic SQL, dynamic SQL in
In any case, an occasional stored procedure that uses dynamic SQL is not likely cause the Armageddon I pictured above. But it is a good argument for being restrictive with dynamic SQL in any form.
Easiness of Writing SQL Code
One distinct advantage of writing stored
It has to be admitted that the strength of this argument is somewhat reduced by the fact that
Another side of this coin is that when you write dynamic SQL, you embed the SQL code into strings, which makes programming far more complex. Your SQL code is a string delimited by single quotes('), and this string may include strings itself, and to include a single quote into the string you need to double it. You can easily get lost in a maze of quotes if you don't watch out. (In the section Good Coding Practices and Tips for Dynamic SQL, we will look a little closer on how to deal with this problem.) The most commonly used client languages with ") as their string delimiter, so dynamic SQL in client code or CLR stored procedures is less prone to that particular problem. Then again, in VB you don't have multi-line strings, so at the end of each line you have to have a double quote, an ampersand and an underscore for continuation. It sure does not serve to make coding easier. You are relieved from all this hassle, if you use stored procedures with static SQL only.
Addressing Bugs and Problems
Somewhat surprisingly, one of the strongest arguments for stored procedures today may be that they permit you to quickly address bugs and performance problems in the application.
Say that you generate SQL statements in your application, and that there is an error in it. Or that it simply performs unbearably slow. To fix it, you need to build a new executable or DLL, which is likely to contain other code that also has changed since the module was shipped. This means that before the fix can be put into production, the module will have to go through QA and testing.
On the other hand, if the problem is in a stored procedure, and the fix is trivial, you may be able to deploy a fix into production within an hour after the problem was reported.
This difference is even more emphasised, if you are an ISV and you ship a product that the customer is supposed administer himself. If your application uses stored procedures, a DBA may be able to address problems directly without opening support cases. For instance, if a procedure runs unacceptably slow, he may be able to fix that by adding an index hint. In contrast, with an application that generates SQL in the client, his hands will be tied. Of course, as an ISV you may not want your customers to poke around in your code, even less to change it. You may also prefer to ship your procedures WITH ENCRYPTION to protect your intellectual property, but this is best controlled through license agreements. (If you encrypt your procedures, the DBA can still change them, as long as he is able to find a way to decrypt them. Which any DBA that knows how to use Google can do.)
In this case, it does not matter whether the stored procedure uses static SQL only, or if it also uses dynamic SQL. For CLR procedures it depends on many objects you have in your assemblies. If you have one assembly per object, installing a new version of a CLR procedure is as simple as replacing a
(I should add that SQL 2005 offers a new feature that permits the DBA to change the plan for a query without altering the code, by adding a plan guide. This is quite an advanced feature, and I refer to Books Online for details.)
Good Coding Practices and Tips for Dynamic SQL
Writing dynamic SQL is a task that requires discipline to avoid that you lose control over your code. If you just go ahead, your code can become very messy, and be difficult to read, troubleshoot and maintain. In this section, we will look at how to avoid this. I will also discuss some special cases: how you can use sp_executesql for input longer than 4000 chars in SQL 2000, and how to use dynamic SQL with cursors, and the combination of dynamic SQL and user-defined functions.
Use Debug Prints!
When you write a stored procedure that generates dynamic SQL, you should always include a @debug parameter:
CREATE PROCEDURE dynsql_sp @par1 int,...@debug bit = 0 AS...IF @debug = 1 PRINT @sql
When you get a syntax error from the dynamic SQL, it can be very confusing, and you may not even discern where it comes from. And even when you do, it can be very difficult to spot the error only by looking at the code that constructs the SQL. Once the SQL code is slapped in your face, the error is much more likely to be apparent to you. So always include a @debug parameter and a PRINT!
Nested Strings
As I've already mentioned, one problem with dynamic SQL is that you often need to deal with nested string delimiters. For instance, in the beginning of this article, I showed you the procedure general_select2. Here it is again:
CREATE PROCEDURE general_select2 @tblname nvarchar(127),@key varchar(10) ASEXEC('SELECT col1, col2, col3FROM ' + @tblname + 'WHERE keycol = ''' + @key + '''')(Again, I like to emphasise that this sort of procedure is poor use of dynamic SQL.)
SQL is one of those language where the method to include a string delimiter itself in a string literal is to double it. So those four consecutive single quotes ('''') is a string literal with the value of a one single quote ('). This is a fairly simple example, it can get a lot worse. If you work with dynamic SQL, you must learn to master nested strings. Obviously, in this case you can easily escape the mess by using sp_executesql instead – yet another reason to use parameterised statements. However, there are situations when you need to deal with nested quotes even with sp_executesql. For instance, earlier in this article, I had this code:
N' WHERE LastUpdated BETWEEN @fromdate AND 'N' coalesce(@todate, ''99991231'')'
We will look at some tips of dealing with nested strings later in this section.
Spacing and Formatting
Another thing to be careful with is the spacing as you concatenate the parts of a query. Here is an example where it goes wrong:
EXEC('SELECT col1, col2, col3FROM' + @tblname + 'WHERE keycol = ''' + @key + '''')See that there is a space missing after FROM? When you compile the stored procedure you will get no error, but when you run it, you will be told that the columns keycol, col1, col2, col3 are missing. And since you know that the table you passed to the procedure has these columns you will be mighty confused. But this is the actual code generated, assuming the parameters foo and abc:
SELECT col1, col2, col3FROMfooWHERE keycol = 'abc'
This is not a syntax error, because FROMfoo is a column alias to col3. And, yes, it's legal to use a WHERE clause, even if there is no FROM clause. But since the columns cannot exist out of the blue, you get an error for that.
This is also a good example why you should use debug prints. If the code looks like this:
SELECT @sql =' SELECT col1, col2, col3FROM' + @tblname + 'WHERE keycol = ''' + @key + ''''IF @debug = 1 THEN PRINT @sqlEXEC(@sql)
It would be much easier to find the error by running the procedure with @debug = 1. (Obviously, had we included the dbo prefix, this error could not occur at all.)
Overall, good formatting is essential when working with dynamic SQL. Try to write the query as you would have written it in static SQL, and then add the string delimiters outside of that.
You may prefer, though, to have a string terminator on each line. A tip in such case is to do something like this:
EXEC(' SELECT col1, col2, col3 ' +' FROM ' + @tblname +' WHERE keycol = ''' + @key + '''')As you see, I have a space after the opening single quote on each line to avoid syntax problems due to missing spaces.
Dealing with Dynamic Table and Column Names
Passing table and column names as parameters to a procedure with dynamic SQL is rarely a good idea for application code. (It can make perfectly sense for admin tasks). As I've said, you cannot pass a table or a column name as a parameter to sp_executesql, but you must interpolate it into the SQL string. Still you should protect it against SQL injection, as a matter of routine. It could be that bad it comes from user input.
To this end, you should use the built-in function quotename('Orders') returns [Orders]. .
Note that when you work with names with several components, each component should be quoted separately. returns , but that is a table in an unknown schema of which the first four characters are d, b, o and dot. As long as you only work with the dbo schema, best practice is to add dbo in the dynamic SQL and only pass the table name. If you work with different schemas, pass the schema as a separate parameter. (Although you could use the built-in function
While general_select still is a poor idea as a stored procedure, here is nevertheless a version that summarises some good coding virtues for dynamic SQL:
CREATE PROCEDURE general_select @tblname nvarchar(128),@key varchar(10),@debug bit = 0 ASDECLARE @sql nvarchar(4000)SET @sql = 'SELECT col1, col2, col3FROM dbo.' + quotename(@tblname) + 'WHERE keycol = @key'IF @debug = 1 PRINT @sqlEXEC sp_executesql @sql, N'@key varchar(10)', @key = @key
- I'm using sp_executesql rather than
EXEC() . - I'm prefixing the table name with dbo.
- I'm wrapping @tblname in
quotename() . - There is a @debug parameter.
Quotename, Nested Strings and Quotestring
The main purpose of
IF @custname IS NOT NULLSELECT @sql = @sql + ' AND custname = ' + quotename(@custname, '''')
Say that @custname has the value D'Artagnan. This part of the dynamic SQL becomes:
AND custname = 'D''Artagnan'
There is a limitation with
CREATE FUNCTION quotestring(@str nvarchar(1998)) RETURNS nvarchar(4000) ASBEGINDECLARE @ret nvarchar(4000),@sq char(1)SELECT @sq = ''''SELECT @ret = replace(@str, @sq, @sq + @sq)RETURN(@sq + @ret + @sq)END
This version is for SQL 2000. On SQL 2005 replace 1998 and 4000 with MAX, so that it works for any string length. Here is an example of using this function:
IF @custname IS NOT NULLSELECT @sql = @sql + ' AND custname = ' + dbo.quotestring(@custname)
The result is the same as above.
On SQL 7, you would have to implement quotestring as a stored procedure. SQL 6.5 does not have
So with
(I should add that I got the suggestion to use
QUOTED_IDENTIFIER
Another alternative to escape the mess of nested quotes, is make use of the fact that ") as a string delimiter. The default for this setting depends on context, but the preferred setting is ON, and it must be ON in order to use XQuery, indexed views and indexes on computed columns. Thus, this is not a first-rate alternative, but if you are aware of the caveats, you can do this:
CREATE PROCEDURE general_select @tblname nvarchar(127),@key key_type,@debug bit = 0 ASDECLARE @sql nvarchar(4000)SET @sql = 'SET QUOTED_IDENTIFIER OFFSELECT col1, col2, col3FROM dbo.' + quotename(@tblname) + 'WHERE keycol = "' + @key + '"'IF @debug = 1 PRINT @sqlEXEC(@sql)
Since there are two different quote characters, the code is much easier to read. The single quotes are for the SQL string and the double quotes are for the embedded string literals.
All and all, this is an inferior method to both sp_executesql and
sp_executesql and Long SQL Strings in SQL 2000
There is a limitation with sp_executesql on SQL 2000 and SQL 7, since you cannot use longer SQL strings than 4000 characters. (On SQL 2005, use nvarchar(MAX) to avoid this problem.) If you want to use sp_executesql despite you query string is longer, because you want to make use of parameterised query plans, there is actually a workaround. To wit, you can wrap sp_executesql in
DECLARE @sql1 nvarchar(4000),@sql2 nvarchar(4000),@state char(2)SELECT @state = 'CA'SELECT @sql1 = N'SELECT COUNT(*)'SELECT @sql2 = N'FROM dbo.authors WHERE state = @state'EXEC('EXEC sp_executesql N''' + @sql1 + @sql2 + ''',N''@state char(2)'',@state = ''' + @state + '''')This works, because the @stmt parameter to sp_executesql is ntext, so by itself, it does not have any limitation in size.
You can even use output parameters by using INSERT-EXEC, as in this example:
CREATE TABLE #result (cnt int NOT NULL)DECLARE @sql1 nvarchar(4000),@sql2 nvarchar(4000),@state char(2),@mycnt intSELECT @state = 'CA'SELECT @sql1 = N'SELECT @cnt = COUNT(*)'SELECT @sql2 = N'FROM dbo.authors WHERE state = @state'INSERT #result (cnt)EXEC('DECLARE @cnt intEXEC sp_executesql N''' + @sql1 + @sql2 + ''',N''@state char(2),@cnt int OUTPUT'',@state = ''' + @state + ''',@cnt = @cnt OUTPUTSELECT @cnt')SELECT @mycnt = cnt FROM #resultYou have my understanding if you think this is too messy to be worth it.
Dynamic SQL in User-Defined Functions
This very simple: you cannot use dynamic SQL from used-defined functions written in
I've seen more than one post on the newsgroups where people have been banging their head against this. But if you want to use dynamic SQL in a UDF, back out and redo your design. You have hit a roadblock, and in SQL 2000 there is no way out.
In SQL 2005, you could implement your function as a CLR function. Recall that all data access from the CLR is dynamic SQL. (You are safe-guarded, so that if you perform an update operation from your function, you will get caught.) A word of warning though: data access from scalar UDFs can often give performance problems. If you say
SELECT ... FROM tbl WHERE dbo.MyUdf(somecol) = @value
and MyUdf performs data access, you have more or less created a hidden cursor.
Cursors and Dynamic SQL
Not that cursors are something you should use very frequently, but people often ask about using dynamic SQL with cursors, so I give an example for the sake of completeness. You cannot say DECLARE CURSOR
SELECT @sql = 'DECLARE my_cur INSENSITIVE CURSOR FOR ' +'SELECT col1, col2, col3 FROM ' + @tableEXEC sp_executesql @sql
You may be used to using the LOCAL keyword with your cursors. However, it is important to understand that you must use a global cursor, as a local cursor will disappear when the dynamic SQL exits. (Because, as you know by now, the dynamic SQL is its own scope.) Once you have declared the cursor in this way, you can use the cursor in a normal fashion. You must be extra careful with error-handling though, so that you don't exit the procedure without deallocating the cursor.
There is however a way to use locally-scoped cursors with dynamic SQL. Anthony Faull pointed out to me that you can achieve this with cursor variables, as in this example:
DECLARE @my_cur CURSOREXEC sp_executesqlN'SET @my_cur = CURSOR STATIC FORSELECT name FROM dbo.sysobjects;OPEN @my_cur',N'@my_cur cursor OUTPUT', @my_cur OUTPUTFETCH NEXT FROM @my_cur
You refer to a cursor variable, just like named cursors, but there is an @ in front, and, as you see from the example, you can pass them as a parameters. (I have to confess I have never seen any use for cursor variables until Anthony Faull was kind to send me this example.)
EXEC() at Linked Server
A special feature added in SQL 2005 is that you can use
EXEC('SELECT COUNT(*) FROM ' + @db + '.dbo.sysobjects') AT SQL2KSQL2K is here a linked server that has been defined with sp_addlinkedserver.
There is one thing that you can do with ?) as parameter holders. Say that you are on an SQL 2005 box, and you are dying to know how many orders VINET had in the Northwind database. Unfortunately, SQL 2005 does not ship with Northwind, but you have a linked server set up to an instance of SQL 2000 with Northwind. You can run this:
DECLARE @cnt intEXEC('SELECT ? = COUNT(*) FROM Northwind.dbo.Orders WHERE CustomerID = ?',@cnt OUTPUT, N'VINET') AT SQL2KSELECT @cntNote here that the parameter values must appear in the order the parameter markers appear in the query. When passing a parameter, you can either specify a constant value or a variable.
You may ask why the inconsistency with a different parameter marker from sp_executesql? Recall that linked servers in SQL Server are always accessed through an OLE DB provider, and OLE DB uses ? as the parameter marker, a convention inherited from ODBC. OLE DB translates that parameter marker as is appropriate for the data source on the other end. (Not all RDBMS use @ for variables.)
As with regular
EXEC('SELECT COUNT(*) FROM ' + @db + '.dbo.sysobjects')AS USER = 'davidson' AT SQL2KThis begs the question: is davidson here a local user or a remote user at SQL2K? Books Online is not very clear about this, but I did some quick experimenting, and found that what you are impersonating is a local user or login, not a login on the remote server. (The login to use on the remote server can be defined with sp_addlinkedsrvlogin.)
Common Cases when to (Not) Use Dynamic SQL
When you read the various newsgroups on SQL Server, there is almost every day someone who asks a question that is answered with use dynamic SQL with a quick example to illustrate, but ever so often the person answering forgets to tell about the implications on permissions or SQL injection. On top of that, far too many examples uses
So, in this section I will explore some situations where you could use dynamic SQL. You will see that sometimes dynamic SQL is a good choice, but also that in many cases that it is an outright bad idea.
SELECT * FROM @tablename
A common question is why the following does not work:
CREATE PROCEDURE my_proc @tablename sysname ASSELECT * FROM @tablename
As we have seen, we can make this procedure work with help of dynamic SQL, but it should also be clear that we gain none of the advantages with generating that dynamic SQL in a stored procedure. You could just as well send the dynamic SQL from the client. So, OK: 1) if the SQL statement is very complex, you save some network traffic and you do encapsulation. 2) As we have seen, on SQL 2005 there are methods to deal with permissions. Nevertheless, this is a bad idea.
There seems to be several reasons why people want to parameterise the table name. One camp appears to be people who are new to SQL programming, but have experience from other languages such as C++, VB etc where parameterisation is a good thing. Parameterising the table name to achieve generic code and to increase maintainability seems like good programmer virtue.
But it is just that when it comes to database objects, the old truth does not hold. In a proper database design, each table is unique, as it describes a unique entity. (Or at least it should!) Of course, it is not uncommon to end up with a dozen or more look-up tables that all have an id, a name column and some auditing columns. But they do describe different entities, and their semblance should be regarded as mere chance, and future requirements may make the tables more dissimilar.
Furthermore, when it comes to building a query plan, each table has its set of statistics and presumptions that are by no means interchangeable, as far as SQL Server is concerned. Finally, in a complex data model, it is important to get a grip of what's being used. When you start to pass table and column names as parameters, you definitely lose control.
So if you want to do the above (save the fact that SELECT * should not be used in production code), to save some typing, you are on the wrong path. It is much better to write ten or twenty stored procedures, even if they are similar to each other.
(If your SQL statements are complex, so that there actually is a considerable gain in maintainability to only have them in one place, despite different tables being used, you could consider using a pre-processor like the one in C/C++. You would still have one set of procedures per table, but the code would be in one single include file.)
SELECT * FROM sales + @yymm
This is a variation of the previous case, where there is a suite of tables that actually do describe the same entity. All tables have the same columns, and the name includes some partitioning component, typically year and sometimes also month. New tables are created as a new year/month begins.
In this case, writing one stored procedure per table is not really feasible. Not the least, because the user may want to specify a date range for a search, so even with one procedure per table you would still need a dynamic dispatcher.
Now, let's make this very clear: this is a flawed table design. You should not have one sales table per month, you should have one single sales table, and the month that appear in the table name, should be the first column of the primary key in the united sales table. At least logically. Sometimes, when you have huge tables (say over 10 GB in size), partitioning can be a good idea, but you should do it right and use partitioned views, that we will look at in this section. I like to make the point that since SQL Server is a enterprise RDBMS, it can handle very large tables very efficiently, as long as you keep in mind that good indexing is essential. A few million rows is no cause for concern.
If you have a legacy application, it may be prohibitively expensive to make a redesign. Then again, the complexity of dynamic SQL also comes with a cost. Fortunately, there are alternatives. A simple approach is to define a view like this:
CREATE VIEW sales ASSELECT year = '2006', * FROM dbo.sales2006UNION ALLSELECT year = '2005', * FROM dbo.sales2005UNION ALL...
(For a view like this, SELECT * could be considered OK.) Instead of composing the table name dynamically, you can now say:
SELECT ... FROM sales WHERE year = '2006' AND ...
Unfortunately, this view is not terribly efficient, as the query will access all three tables. Furthermore, the view is not updateable. While this is a partitioned view in some sense, the view does not fulfil the rules for partitioned views in the SQL Server sense. Such views can be very efficient, because for queries that include the partitioning column in the WHERE clause, SQL Server will only access the relevant table(s). And such a view is updatable, so you can insert data into it, and the data will end up in the right table.
Here is a quick example/demo on how to properly set up a partitioned view. Assume that as legacy of a poor design we have these three tables:
SELECT * INTO Orders96 FROM Northwind..Orders WHERE year(OrderDate) = 1996ALTER TABLE Orders96 ADD CONSTRAINT pk96 PRIMARY KEY (OrderID)SELECT * INTO Orders97 FROM Northwind..Orders WHERE year(OrderDate) = 1997ALTER TABLE Orders97 ADD CONSTRAINT pk97 PRIMARY KEY (OrderID)SELECT * INTO Orders98 FROM Northwind..Orders WHERE year(OrderDate) = 1998ALTER TABLE Orders98 ADD CONSTRAINT pk98 PRIMARY KEY (OrderID)
First step is to a add Year column to each table. These columns need a default (so that processes that insert directly into these tables are unaffected) and a CHECK constraint. Here is how it looks for Orders96:
ALTER TABLE Orders96 ADD Year char(4) NOT NULLCONSTRAINT def96 DEFAULT '1996'CONSTRAINT check96 CHECK (Year = '1996')
This column must be the first column in the primary key, so we need to drop the current primary key and recreate it:
ALTER TABLE Orders DROP CONSTRAINT pk96ALTER TABLE Orders96 ADD CONSTRAINT u96 UNIQUE (Year, OrderID)
Again, this must be performed for all three tables. Finally, you can create the view:
CREATE VIEW Orders ASSELECT * FROM dbo.Orders96UNION ALLSELECT * FROM dbo.Orders97UNION ALLSELECT * FROM dbo.Orders98
You now have a proper partitioned view that you can perform inserts and updates through. And if you run a query like:
SELECT OrderID, OrderDate, EmployeeIDFROM OrdersWHERE Year = @yearAND CustomerID = N'BERGS'
SQL Server will at run-time only access the OrdersNN table that maps to @year. If you look at the query plan casually, it may seem that all three tables are accessed, but if you check the Filter operators you will find something called STARTUP EXPR. This means that SQL Server determines at run-time whether to access the table or not.
For your real-world case you may find it prohibitive to change the primary key. In this case you could add a UNIQUE constraint with the partitioning column + the real primary key. This will not be a proper partitioned view, and the view will not be updatable, but with some luck SQL Server may still apply startup expressions, and access only one of the base tables. At least I got it to work, when I ran a quick test. You should verify that it works for your situation.
When a new table is added with a new year, the view needs to be redefined. If this happens frequently, for instance by each month, you should probably set up a job for this. I leave out example code, but it requires running a cursor over sysobjects to compose a CREATE VIEW statement that you then execute with sp_executesql or
This was a very concentrated introduction to partitioned views, a feature which was introduced in SQL 2000. What Microsoft had in mind was truly big tables where partitioning is desired not only for performance but also manageability. You can find the full rules for partitioned views under the topic for CREATE VIEW in Books Online. Good reading is also Stefan Delmarco's detailed article SQL Server 2000 Partitioned Views.
For completeness sake, I should mention that it is also possible to define distributed partitioned views with tables spread out over several servers. Furthermore, SQL 2005 adds another partitioning feature, partitioned tables. I'm not detailing any of them here.
UPDATE tbl SET @colname = @value WHERE keycol = @keyval
In this case people want to update a column which they select at run time. The above is actually legal in
In this case dynamic SQL would call for the user to have UPDATE permissions on the table, something which is not to take lightly. So there is all reason to avoid it. Here is a fairly simple workaround:
UPDATE tblSET col1 = CASE @colname WHEN 'col1' THEN @value ELSE col1 END,col2 = CASE @colname WHEN 'col2' THEN @value ELSE col2 END,...
If you don't know about the CASE expression, please look it up in Books Online. It's a very powerful SQL feature.
Then again, one would wonder why people want to do this. Maybe it's because their tables look like this:
CREATE TABLE products (prodid prodid_type NOT NULL,prodname name_type NOT NULL,...sales_1 money NULL,sales_2 money NULL,...sales_12 money NULL,PRIMARY KEY (prodid))
It could make more sense to move these sales_n columns to a second table:
CREATE TABLE product_sales (prodid prodid_type NOT NULL,month tinyint NOT NULL,sales money NOT NULL,PRIMARY KEY (prodid, month))
SELECT * FROM @dbname + '..tbl'
In this case the table is in another database which is somehow determined dynamically. There seems to be several reasons why people want to do this, and depending on your underlying reason, the solution is different.
Get Data from another Database
If you for some reason have your application spread over two databases, what you absolutely not should do is to have code that says:
SELECT ... FROM otherdb.dbo.tbl JOIN ...
This is bad, because if someone asks for a second environment on the same server, you have a lot of code to change.
The best solution for this particular problem on SQL 2005, is to use synonyms:
CREATE SYNONYM otherdbtbl FOR otherdb.dbo.tbl
You can then refer to otherdb.dbo.tbl as just otherdbtbl. If there is a need for a second set of databases, you only have to update the synonyms, and there is no need to use dynamic SQL.
Yet a way to avoid dynamic SQL is to use stored procedures for all inter-database communication. That is, if you are in db1 and need to get data from db2, you call a stored procedure in db2. This can be dynamic, because EXEC permits you to specify a variable that holds the name of the procedure to execute.
SELECT @dbname = quotename(dbname) FROM ...SELECT @sp = @dbname + '..some_sp'EXEC @ret = @sp @par1, @par2...
If you want to get result sets back from db2, look at my article How to Share Data between Stored Procedures for suggestions.
There may still be cases you may find that dynamic SQL is the only feasible situation. This can be done in two ways. The most obvious is:
SELECT @dbname = quotename(dbname) FROM ...SELECT @sql = ' SELECT ... FROM ' + @dbname + ' .dbo.otherdbtbl ' +' JOIN dbo.localtbl ... 'EXEC sp_executesql @sql, @params, ...
But, if the query is complex, and most of the tables are in the remote database you can also do:
SELECT @sql = ' SELECT ... FROM dbo.othertbl ' +' JOIN ' + quotename(db_name()) + '.dbo.localtbl ... 'SELECT @dbname = quotename(dbname) FROM ...SELECT @sp_executesql = @dbname + '..sp_executesql'EXEC @sp_executesql @sql, @params, ...
As above, I make use of that you can specify the procedure name dynamically with EXEC. The trick here is that when you specify a system stored procedure in three-part notation with the database name, the procedure executes in the context of that database. Thus, the dynamic SQL in this example runs in @dbname, not the current database.
Do Something in Every Database
This sounds to me like some sysadmin venture, and for sysadmin tasks dynamic SQL is usually a fair game, because neither caching nor permissions are issues. Nevertheless there is an kind of alternative: sp_MSforeachdb, demonstrated by this example:
sp_MSforeachdb 'SELECT ''?'', COUNT(*) FROM sysobjects'
As you might guess, sp_MSforeachdb uses dynamic SQL internally, so what you win is that you don't have to write the control loop yourself. I should hasten to add that sp_MSforeachdb is not documented in Books Online, which also means that use of it is not supported by Microsoft and it could be changed or withdrawn from SQL Server without notice.
A "Master" Database
The scenario here is that you have a suite of databases with identical schema. The typical reason they are different databases and not one, is that every database serves a different customer, and each customer can access his database (but of course no one else's). Some people see a problem with the same stored procedures in fifty databases, and believe that they face a maintenance nightmare. So they get the idea that they should put the procedures in a "master" database. Yes, you can do that. It will give you a much bigger maintenance problem, because your code will entirely littered with dynamic SQL. In fact, if you feel that this is the only alternative, you are better off skipping stored procedures altogther and do all acecss from client code instead. In such case there is only one place you need to specify the database: the connection string.
What else can you do? Some people might suggest that you should collapse the databases into one, and employ a strict row-level security scheme. Personally, I would never accept such a solution as a potential customer. In a complex application, bugs can easily lead to that information is exposed to people who should not see it. Besides, row-level security cannot be implemented entirely waterproof in SQL Server. Whereas queries only would return the data they should, query plans and error messages may indirectly disclose information to users who are not authorised to see it.
Another wild approach is to use SQL Server's own master database and install the application procedures as system procedures. This works at least in SQL 2000 and earlier. I have not verified that it still does in SQL 2005 where Microsoft's own system procedures now reside in the invisible resource database. In any case, this is entirely unsupported. So while I mention the possibility, I don't give you the details on how to do it and I strongly recommend that you don't go there.
What then is the real solution? Install the stored procedures in each database and develop rollout routines for your SQL objects. You need this anyway, the day you want to update the table definitions. This also permits you to have some flexibility. Some custo