SQL Performance Tips and Best Practices


Good database design and coding will achieve all the required project functionality and deliver acceptable performance.  It takes into quality delivery to your customer.

·         Optimize queries using the tools provided by SQL Server.

Use the graphical execution plan in Query Analyzer or SHOWPLAN_TEXT or SHOWPLAN_ALL commands to analyze your queries. Make sure your queries do an "Index seek" instead of an "Index scan" or a "Table scan." A table scan or an index scan is a highly undesirable and should be avoided where possible.

EXAMPLE 1:

IF EXISTS (SELECT 1 FROM dbo.Action WITH (NOLOCK) WHERE ActionID = 50)
Instead Of: 
IF EXISTS (SELECT COUNT(ActionID) FROM dbo.Action WHERE ActionID = 50)


EXAMPLE 2:

Consider the following query to find the second highest offer MaxManPayAmount from the Operator table:

SELECT MIN(MaxManPayAmount) FROM Operator WHERE OperatorID IN
(SELECT TOP 2 OperatorID FROM Operator ORDER BY MaxManPayAmount Desc)

The same query can be re-written using a derived table, as shown below, and it performs generally as fast as the above query:
      
   SELECT MIN(opr.MaxManPayAmount) FROM (SELECT TOP 2 MaxManPayAmount FROM dbo.Operator  WITH (NOLOCK) ORDER BY MaxManPayAmount DESC) AS opr


·         Do not use SELECT *,  instead practice writing required column names after the SELECT statement.

·         Do not use SQL keywords as the name of database objects, In cases where this is necessary, surround the object name with brackets, such as [Year].

·         Use table WITH (NOLOCK) command for SELECT statement which avoids a deadlock occurrence.

·         Avoid dynamic SQL statements as much as possible.

o    Dynamic SQL tends to be slower than static SQL, as SQL Server must generate an execution plan at runtime. IF and CASE statements come in handy to avoid dynamic SQL.

·         Avoid using <> as a comparison operator.

o    Use ID IN (1,3,4,5) instead of ID <> 2.

·         Avoid unnecessary use of temporary tables.

o    Use 'Derived tables' or CTE (Common Table Expressions) wherever possible, as they perform better.

·         Use SET NOCOUNT ON at the beginning of stored procedures, SQL batches and Triggers. This improves the performance of stored  procedures.

o    SET NOCOUNT ON suppresses messages like '(1 row(s) affected)' after executing INSERT, UPDATE, DELETE and SELECT statements. Performance is improved due to the reduction of network traffic.

·         Place all DECLARE statements before any other code in the stored procedure.

·         Do not use cursors or application loops to do inserts.

o    Try to avoid server side cursors as much as possible. Always stick to a 'set-based approach' instead of a 'procedural approach' for accessing and manipulating data. Cursors can often be avoided by using SELECT statements instead. If a cursor is unavoidable, use a WHILE loop instead. For a WHILE loop to replace a cursor, however, you need a column (primary key or unique key) to identify each row uniquely.

·         Fully qualify tables and column names.

o    Prefix table names with the owner name. This improves readability and avoids unnecessary confusion. Microsoft SQL Server Books Online states that qualifying table names with owner names help in execution plan reuse, further boosting performance.

Example: dbo.FeatureAction

·         Use ANSI 92  Joins instead of old style joins his improves readability and avoids unnecessary confusion.

False code:
            SELECT * FROM dbo.FeatureAction, Feature  
WHERE Feature.FeatureID = FeatureAction.FeatureID
            
True code:
SELECT FeatureAction.FeatureID, Feature.FeatureName FROM dbo.FeatureAction 
INNER JOIN dbo.Feature WITH (NOLOCK) ON Feature.FeatureID = FeatureAction.FeatureID

·         Avoid the use of cross joins, if possible.

·         Avoid to define default values for parameters in stored procedure unless until it is needed.

o    If a default is needed, the front end will supply the value.

·         Do not use the RECOMPILE option for stored procedures.

·         Do not use column numbers in the ORDER BY clause. Use appropriate column name.

·         Do not use GOTO.

·  Check the global variable @@ERROR immediately after executing a data manipulation statement (like INSERT/UPDATE/DELETE), so that you can rollback the transaction if an error occurs.
OR  
  Use TRY/CATCH

·         Do basic validations in the front-end itself during data entry.

·   Off-load tasks, like string manipulations, concatenations, row numbering, case  conversions, type conversions etc., to the front-end applications if these operations  are going to consume more CPU cycles on the database server.

·         Always use a column list in your INSERT statements.

o    This helps avoid problems when the table structure changes.(like adding or   dropping a column)

·    When executing an UPDATE or DELETE statement, use the primary key in the WHERE condition, if possible. This reduces error possibilities.

o    Each table must have a primary key, In most cases it should be an IDENTITY column named <tablename>ID.

o    Tables will usually have a Primary Key. The only time a Primary Key can be omitted from table design is for a limited size table (10-100 rows) that is stand-alone (not significantly referenced by other tables), is not involved in complex joins or queries, and not expected to receive any significant updates or insertion. 

·         Normalize data to third normal form

o    Do not compromise on performance, sometimes, a little de-normalization results in better performance

·         Necessary INDEX should be created on tables for fast retrieval of data. The same index column order needs to be followed in WHERE clause.

·  Proper LOCK mechanism should be defined for CLUSTRED INDEX (PRIMARY KEY) and NON CLUSTERED INDEX.

Following instructions help to avoid deadlocks. Other things to keep in mind to avoid deadlocks are:

  • Always access tables in the same order in all stored procedures and triggers consistently.

  • Keep transactions as short as possible.

  • Touch the minimum amount of data possible during a transaction.

  • Never wait for user input in the middle of a transaction.

  • Do not use higher level locking hints or restrictive isolation levels unless they are absolutely needed. Use table WITH (NOLOCK) command for SELECT statement which avoids the deadlock occurs on table level.

  • Define the appropriate LOCKs Option during Primary Key, Unique Key, and Index creation. [ALLOW_ROW_LOCKS =  ON,   ALLOW_PAGE_LOCKS = ON,  hint tells the SQL engine to start out with row locks. If more than twenty five row locks are needed, that might be escalated into a page lock, and so on. This is the granularity of the lock].  

·         Minimize the use of NULLs, as they often confuse front-end applications, unless the applications are coded intelligently to eliminate NULLs or convert the NULLs into some other form.

o    Any expression that deals with NULL results in a NULL output.

o    The ISNULL and COALESCE functions are helpful in dealing with NULL values.

·         Do not use the identitycol or rowguidcol unless until it is needed.

·         Default constraints must be defined at the column level.

o    Not to specify DEFAULT CONSTRAINTS for Columns allow with NULL value.  

·         Define all constraints (that apply to more than one column) at the table level other than defaults.

·         Referential Integrity (Foreign Key) within the database should be enforced.

·    Avoid rules, database level defaults that must be bound or user-defined data types. While these are legitimate database constructs, opt for constraints and column defaults to hold the database consistent for development and conversion coding.

·         Use the CHAR data type for a column only when the column content is non-nullable and fixed length.

·         Use the VARCHAR data type for a column when the column content is variable size and could allow null.

o CHAR(100), when NULL, will consume 100 bytes, resulting in space wastage. Preferably, use VARCHAR(100) in this situation. Variable-length columns have very little processing overhead compared with fixed-length columns.

o    VARCHAR data columns use an empty string wherever needed.

·         Avoid using TEXT or NTEXT datatypes for storing large textual data. use the maximum allowed characters of VARCHAR(MAX) and NVARCHAR(MAX).

o    Use the maximum allowed characters of VARCHAR instead.

o    You cannot directly write or update text data using the INSERT or UPDATE statements. Instead, you have to use special statements like READTEXTWRITETEXT and UPDATETEXT. So, if you don't have to store more than 8KB of text, use the CHAR(8000) or VARCHAR(8000)  data type instead.

·         Do not use white space in identifiers.

Example [Birth Date] date

·   Create STORED PROCEDURE as much as possible with business logic on the same database for accessing data.

o    Do not prefix stored procedures with ‘sp_’.

o    The prefix sp_ is reserved for system stored procedures that ship with SQL Server. Whenever SQL Server encounters a procedure name starting with sp_, it first tries to locate the procedure in the master database, then it looks for any qualifiers  (database, owner) provided, then it tries dbo as the owner. Time spent locating the stored procedure can be saved by avoiding the "sp_" prefix.

o    Do prefix stored procedures with “USP_”(user defined stored procedure).

·        You might need the length of a string variable in many places of your stored procedure, but don't call the LEN function whenever it's needed. Instead, call the LEN function once and store the result in a variable for later use.

·      Multiple records to be stored /deleted to table at an attempt, pass XML data as parameter value to stored procedure for BULK INSERT OR BULK UPDATE OR BULK DELETE. XML data should created/formed from application server and pass it xml.

Example: Bulk Insert.

DECLARE @XML AS XML
--SET @XML = (select * from ED for xml auto)
SET @XML = '<ED EID="1" DID="1" SN="1"/>
                     <ED EID="1" DID="2" SN="2"/>
                     <ED EID="1" DID="3" SN="3"/>
                     <ED EID="2" DID="4" SN="1"/>
                     <ED EID="2" DID="5" SN="2"/>
                     <ED EID="3" DID="7" SN="2"/>
                     <ED EID="4" DID="7" SN="1"/>
                     <ED EID="4" DID="8" SN="2"/>
                     <ED EID="5" DID="4" SN="1"/>
                     <ED EID="5" DID="5" SN="2"/>
                     <ED EID="5" DID="6" SN="3"/>
                     <ED EID="6" DID="7" SN="1"/>
                     <ED EID="6" DID="8" SN="2"/>
                     <ED EID="6" DID="9" SN="3"/>
                     <ED EID="7" DID="0" SN="1"/>'
INSERT INTO EmployeeDegree
([EmployeeID], [DegreeID], [Sequence]) 
SELECT
    DOC.col.value('@EID', 'INT') as EmployeeID,
    DOC.col.value('@DID', 'INT') as DegreeID,
    DOC.col.value('@SN', 'INT') as Sequence
       FROM @XML.nodes('ED') DOC(col)



·         Access tables in the same order in your stored procedures and triggers consistently.

·         Do not call functions repeatedly within your stored procedures, triggers, functions and batches.

·        The RETURN statement is meant for returning the execution status only, but not data.

·    Return multiple result sets from one stored procedure to avoid trips from the application server to SQL server.

·    Avoid wild-card characters at the beginning of a word while searching using the LIKE keyword, as these results in an index scan, which defeats the purpose of an index.

1 comment: