Prepared by Bihag Thaker
Good Indentation makes queries more readable and easy to debug. Try to break the queries into different lines by different query clauses like UPDATE, FROM, JOIN, ON, WHERE etc.
Wherever possible, place targeted column names in separate lines one level indented to the right side. If a column is not the first column in the list, place comma (,) separator before the name of a column.
This helps queries easier to read, debug and maintain. While debugging a query, anytime any line containing a column name can be easily commented out or uncommented as per the need.
Following is an example of properly indented code for UPDATE statement:
12345678 UPDATE [dbo].[tbl_Customers]SETFirstName = 'FirstNameUpdated',LastName = 'LastNameUpdated',BirthDate = '19841231',EmailAddress = 'EmailAddressUpdated@domain.com',Gender = 'M'WHERE CustomerID = 'CUST001'
While writing UPDATE queries, make sure that the correct set of rows will be updated by including proper WHERE clause. UPDATE operation once committed cannot be undone.
Writing and executing SELECT query with the similar WHERE criteria as in UPDATE statement on the table should be exercised before implementing actual UPDATE queries. In this way, it can be confirmed that the correct set of rows will be updated.
When it is required to return updated records to the calling application with old values and new updated values, consider using OUTPUT clause in the UPDATE statement rather than using two separate UPDATE and SELECT statements to retrieve ‘just-updated’ records as shown below:
12345678910111213141516171819 UPDATE [dbo].[tbl_Customers]SETFirstName = 'FirstName',LastName = 'LastName',BirthDate = '20111212',EmailAddress = 'EmailAddress',Gender = 'M'OUTPUTdeleted.FirstName AS Old_FirstName,inserted.FirstName AS New_FirstName,deleted.LastName AS Old_LastName,inserted.LastName AS New_LastName,deleted.BirthDate AS Old_BirthDate,inserted.BirthDate AS New_BirthDate,deleted.EmailAddress AS Old_EmailAddress,deleted.EmailAddress AS New_EmailAddress,deleted.Gender AS Old_Gender,inserted.Gender AS New_GenderWHERE CustomerID='CUST006'
Prepared by Bihag Thaker
Good Indentation makes queries more readable and easy to debug. Try to break the queries into different lines by different query clauses like INSERT and VALUES or INSERT… SELECT.
Wherever possible, place targeted column names in separate lines one level indented to the right side
enclosed with the left parenthesis and right parenthesis.If a column is not the first column in the list, place comma (,) separator before the name of a column. Apply same formatting convention to VALUES clause also.
This helps queries easier to read, debug and maintain. While debugging a query, anytime any line containing a column name can be easily commented out or uncommented as per the need.
Following is an example of properly indented code for INSERT statement:
123456789101112131415161718 INSERT INTO [dbo].[tbl_Customers](CustomerID,FirstName,LastName,BirthDate,EmailAddress,Gender)VALUES('CUST001','SomeFirstName','SomeLastName','19820101','SomeEmailAddress@domain.com','M')
While writing INSERT statements, always specify explicitly the list of targeted columns for which the values are supplied with the VALUES clause. Do not rely on implicit order of the columns in table.
This can break a query if some field is added to or deleted from the table. Thus, instead of writing an insert query without specifying the list of targeted columns as shown below:
12345678910 INSERT INTO [dbo].[tbl_Customers]VALUES('CUST001','SomeFirstName','SomeLastName','19820101','SomeEmailAddress@domain.com','M')Write the above query by explicitly specifying the list of column names within a pair of parenthesis in INSERT statement as shown below:
123456789101112131415161718 INSERT INTO [dbo].[tbl_Customers](CustomerID,FirstName,LastName,BirthDate,EmailAddress,Gender)VALUES('CUST001','SomeFirstName','SomeLastName','19820101','SomeEmailAddress@domain.com','M')
When inserting multiple records with multiple INSERT statements as shown below:
123456789101112131415161718192021222324252627282930313233343536 INSERT INTO [dbo].[tbl_Customers](CustomerID,FirstName,LastName,BirthDate,EmailAddress,Gender)VALUES('CUST001','SomeFirstName','SomeLastName','19820101','SomeEmailAddress@domain.com','M')INSERT INTO [dbo].[tbl_Customers](CustomerID,FirstName,LastName,BirthDate,EmailAddress,Gender)VALUES('CUST002','SomeAnotherFirstName','SomeAnotherLastName','19820101','SomeAnotherEmailAddress@domain.com','F')Instead re-write the same query by replacing multiple INSERT statements with single INSERT statement and by specifying values for multiple rows with VALUES clause as shown below:
12345678910111213141516171819202122232425262728293031323334 INSERT INTO [dbo].[tblCustomers](CustomerID,FirstName,LastName,BirthDate,EmailAddress,Gender)VALUES('CUST001','SomeFirstName1','SomeLastName1','19820101','SomeEmailAddress1@domain.com','M'),('CUST002','SomeFirstName2','SomeLastName2','19820101','SomeEmailAddress2@domain.com','F'),('CUST003','SomeFirstName3','SomeLastName3','19820101','SomeEmailAddress3@domain.com','M')This is the new feature of SQL Server 2008 called row constructor that allows us to specify values for multiple records in a single INSERT statement which increases code readability and manageability.
When it is required to return newly inserted records to the calling application, consider using OUTPUT clause in the INSERT statement rather than using two separate INSERT and SELECT statements to retrieve ‘just-inserted’ records as shown below:
12345678910111213141516171819202122232425 INSERT INTO [dbo].[tbl_Customers](CustomerID,FirstName,LastName,BirthDate,EmailAddress,Gender)OUTPUTinserted.CustomerID,inserted.FirstName,inserted.LastName,inserted.BirthDate,inserted.EmailAddress,inserted.GenderVALUES('CUST006','SomeFirstName1','SomeLastName1','19820101','SomeEmailAddress1@domain.com','M')
Prepared by Bihag Thaker
Do not use ORDER BY clause unnecessarily in queries until the result set is strictly required to be sorted in a particular order. ORDER BY clause is an expensive operation and can cause performance issues.
Always specify the name of the columns in ORDER BY clause rather than specifying the position of the column in the SELECT list. For Example, write the query as:
12345 SELECTOrderID,OrderDateFROM [dbo].[tbl_Orders]ORDER BY OrderDate DESCDo not write above query as follows:
12345 SELECTOrderID,OrderDateFROM [dbo].[tbl_Orders]ORDER BY 2 DESC
Whenever it is safe and OK to for reading dirty data, include WITH (NOLOCK) query hint in SELECT queries. It reduces locking issues and retrieves data faster. See the following example:
123456 SELECTOrderID,CustomerID,OrderDateFROM [dbo].[tbl_Orders] WITH (NOLOCK)WHERE OrderDate>='20180101' AND OrderDate<='20180401'
Do not use old syntax for joining the tables in queries. Use ANSI-SQL standard syntax and use keywords for joining the tables like INNER JOIN, LEFT OUTER JOIN, RIGHT OUTER JOIN etc.
Beware while joining tables in queries. Prefer INNER JOIN over OUTER JOIN wherever it is possible to use them and do not use OUTER JOIN unnecessarily as they are less efficient then INNER JOIN.
Beware of using OUTER JOIN correctly. If you have LEFT OUTER JOIN and have specified a WHERE condition on a column from the right table, then the query is equal to INNER JOIN and is logically incorrect. Such queries must be most likely to use INNER JOIN instead of OUTER JOIN.
While joining the tables, make use of proper aliases for the tables and prefix each column name with its associated table alias. Do not use long names for table aliases.
Keep them short and abbreviated by specifying the first character of a word in the table name. For example, use OD as an alias for table tblOrderDetails and PC for table tblProductCategories. Following is an example of table alias in join queries.
12345678910 SELECTC.CustomerID,C.FirstName,C.LastName,COUNT(O.OrderID) AS TotalOrdersFROM [dbo].[tbl_Customer] AS CINNER JOIN [dbo].[tbl_Orders] AS O ON C.CustomerID = O.CustomerIDWHERE OrderDate>='20180101'GROUP BY C.CustomerIDHAVING COUNT(OrderID) > 3
Do not use DISTINCT keyword when GROUP BY clause is present in the same query because GROUP BY clause itself results in an only unique combination of columns. Thus DISTINCT is not of meaning when GROUP BY clause is present.
While specifying conditions in WHERE clause, try not to use <> and NOT operators wherever possible. They tend to introduce query performance issues.
Avoid using LIKE operator as much as possible. Using like operator is not efficient and degrades the query performance. LIKE operator should be avoided in WHERE clause when the first character(s) is specified with a wildcard character.
This prevents the use of the index and retrieves rows inefficiently. So avoid queries similar to below:
12345678 SELECTFirstName,LastName,BirthDate,EmailAddressFROM [dbo].[tbl_Customers]WHERE LastName LIKE '%Patel%'ORDER BY LastName,FirstName
While specifying date literals in WHERE clause, do not use date formats which rely on date format set for a particular session. Always use the universal date format which is ‘YYYYMMDD’.
This format guarantees to work correctly irrespective of date format set for a session. For example to retrieve all orders for year 2010, use the following date values in WHERE clause:
123456 SELECTOrderID,OrderDateFROM [dbo].[tbl_Orders]WHERE OrderDate >= '20180101' AND OrderDate <= '20180430'ORDER BY OrderDate DESC
Whenever possible, try to avoid the use of scalar functions in WHERE clause. For example, the previous query retrieves all the orders for the year 2010. Same results can be achieved by using YEAR() function as shown below:
123456 SELECTOrderID,OrderDateFROM [dbo].[tbl_Orders]WHERE YEAR(OrderDate) = 2010ORDER BY OrderDate DESCSuch careless use of functions in WHERE clause should be avoided when there is an alternative solution. Such usages of functions in queries can degrade the performance of the queries.
Likewise, avoid unnecessary use of CAST() and CONVERT() functions to convert between the data types. Also, minimize the use of other scalar functions as they degrade the performance of the queries.
In case, the same value of a scalar function needs to be used at multiple places in a query or a single batch, then consider storing the value of a scalar function in a variable and subsequently use that variable at other places.
Never use comparison operators like ‘equals to’ (=), ‘not equal to’ (<>) to compare NULL values in WHERE clause. Always use IS NULL and IS NOT NULL operators to compare NULL values. So, instead of writing the query as:
1234567 SELECTFirstName,LastName,BirthDate,EmailAddressFROM [dbo].[tbl_Customers]WHERE EmailAddress=NULLWrite the above query correctly as follows:
1234567 SELECTFirstName,LastName,BirthDate,EmailAddressFROM [dbo].[tbl_Customers]WHERE EmailAddress IS NOT NULLThe result of comparing NULL values with comparison operators depends on the SQL Server setting
SET ANSI_NULLS. So, it should never be used.
Prepared by Bihag Thaker
Good Indentation makes queries more readable and easy to debug.
Try to break the queries into different lines by different query clauses like SELECT, FROM, WHERE, INNER JOIN, GROUP BY, HAVING, ORDER BY etc. Wherever possible, place column names on separate lines one level
indented to the right side.If a column is not the first column in the list, place comma (,) separator before the name of the column. This helps queries easier to read and debug.
While debugging a query, anytime any line containing a column name can be easily commented out or
uncommented as per the need. Following is the example of the same:
1234567891011 SELECTC.CustomerID,C.FirstName,C.LastName,COUNT(O.OrderID) AS TotalOrdersFROM [dbo].[tbl_Customer] AS CINNER JOIN [dbo].[tbl_Orders] AS OON C.CustomerID = O.CustomerIDWHERE OrderDate>='20180101'GROUP BY C.CustomerIDHAVING COUNT(OrderID) > 3
Do not use ‘SELECT *’ in SELECT queries. Always specify the list of columns in SELECT queries.
Avoid following types of queries:
1 SELECT * FROM [dbo].[tbl_Customers]Instead specify the list of columns even if all the columns are required in result set as shown below:
1234567 SELECTFirstName,LastName,BirthDate,EmailAddress,GenderFROM [dbo].[tbl_Customers]
Use of ‘SELECT *’ guarantees the same order of the column in result set every time.
This behaviour may cause problems when a column is inserted to or deleted from the table and can break the application when application is expecting only required fields.
If a column is removed from the table and query is using ‘SELECT *’, then it can go unnoticed and logical bugs can be introduced in code.
Also retrieve only columns which are required in the output to be given to the application.
Only retrieve the records which are required by the application. Try to limit the result set of queries by applying various filters with clauses like ON, WHERE and HAVING wherever possible.
This has a significant performance impact on very large tables. This practice also reduces network traffics by requiring the less number of bytes to transfer across the network.
For calculated columns, always specify aliases with ‘AS’ keyword rather than leaving a column unnamed as shown in following example:
1234567 SELECTFirstName + ' ' + LastName AS FullName,CASE GenderWHEN 'M' THEN 'Male'WHEN 'F' THEN 'Female'END AS GenderFROM [dbo].[tbl_Customers]However column aliases can be assigned in a few different ways as shown below:
12345678910111213 SELECTFirstName + ' ' + LastName AS FullNameFROM [dbo].[tbl_Customers]SELECTFullName = FirstName + ' ' + LastNameFROM [dbo].[tbl_Customers]SELECTFirstName + ' ' + LastName FullNameFROM [dbo].[tbl_Customers]Always use the technique of using AS keyword as a standard practice to specify column aliases.
Also observe the indentation of CASE expression used to distinguish the gender of a customer in previous query. Try to put every case on separate line one level indented to the right side enclosed with CASE and END body.
This practice makes very complex queries easier to understand and debug. An alias name must be assigned to such column expressions.
Prepared by Bihag Thaker
Always try to write queries that follow the latest ANSI-SQL standards. Try to avoid using features which are non-standard as they may tend to be depreciated in future versions of the product.
Always write all T-SQL Keywords like CREATE, ALTER, DROP, INSERT, UPDATE, DELETE, TRUNCATE, SELECT, FROM, WHERE, JOIN, INNER, OUTER, FULL, ON, WITH, GROUP BY, ORDER BY, HAVING, AS, CASE, IF, WHILE, BEGIN, END and all the other T-SQL keywords in UPPER CASE only.
This makes queries more readable and user-defined objects referenced in the queries can be identified quickly from the SQL Keywords. Following is a simple example of a query with all its T-SQL Keywords in UPPER CASE.
12345678 SELECTFirstName,LastName,BirthDate,EmailAddressFROM [dbo].[tbl_Customers]WHERE LastName LIKE 'J%'ORDER BY LastName,FirstName
Always reference the objects in queries by two-part referencing convention like [SchemaName].[ObjectName] for appropriate object name resolution. Try to avoid following types of object reference in queries:
1234567 SELECTFirstName,LastName,BirthDate,EmailAddressFROM tbl_CustomersWHERE EmailAddress IS NULL
Instead reference the object by explicitly specifying the Schema Name:
1234567 SELECTFirstName,LastName,BirthDate,EmailAddressFROM [dbo].[tbl_Customers]WHERE EmailAddress IS NULLThis way, it is guaranteed that always the correct intended object will be referenced if there are
two objects with the same name in two different schemas.
Always, write clear and useful comments for complex SQL Statements which are not obvious and require explanation. This helps developers to understand the code easily written by someone else.
Use comments in Stored Procedures, Views, User Defined Functions, Triggers and anything which is complex and needs explanation. Explain what a particular statement is doing and how by placing comments above the statements.
Use single line comments with double hyphen (–) characters and multiline comments spanning across multiple lines using /* */ commenting blocks.
At the beginning of an object definition, use multiline comments to specify the name of the object, purpose of the object, name of the author.
Additionally, any revision or change in the object definition should be specified with the Change Date, the name of the person who makes a change and description of the change. Some examples of the same can be seen in Sample Code Snippet sections.
Use of GOTO statements should be avoided.
Statement blocks within BEGIN END, BEGIN TRANSACTION…COMMIT TRANSACTION, BEGIN TRY…END TRY and BEGIN CATCH…END CATCH should be indented one tab right side.
IF and ELSE block is more readable when using in following format:
1234567 IF SomeConditionBEGIN--IF Statement BlockENDELSE BEGIN--ELSE Statement BlockEND
Prefer using static queries and avoid dynamic SQL queries in T-SQL batches as much as possible.
When it is mandatory to use dynamic query, do not execute a dynamic query with EXECUTE statement. Rather use sp_executesql stored procedure and make it parameterized dynamic query by passing the appropriate parameters to sp_executesql stored procedure.
Follow the set-based theory and do not use cursors until there is not any alternative solution to using cursors. Generally most of the tasks can be achieved with set-based theory by using complex DML Statements and combination of temporary tables.
Use Cursors only as a last resort as cursors are very resource intensive and poor in performance.
Even if there is a compelling reason of using cursor, consider if same functionality can be achieved with the WHILE loop. Try both, WHILE loop approach and CURSOR approach, compare them and use the one which is efficient.
If cursor has to be used then consider to make it read-only, forward-only cursor.
Prepared by Bihag Thaker
To standardize the naming conventions for different objects in SQL Server for consistency across different applications using the same database, following naming conventions should be practised:
For user-defined tables: use the prefix ‘tbl_’ + name of the table. e.g. “tbl_OrderDetails”
For user-defined stored procedures: use the prefix ‘usp_’ + name of the stored procedure. Do not use prefix ‘sp’ as it is intended to recognize the SQL Server’s system stored procedures. e.g. “usp_DeleteCustomer”
For user defined views: use the prefix ‘vw_’ + Name of the view. e.g. “vw_GetOrderDetails”
For user defined functions: use prefix ‘udf_’ + name of user defined functions. e.g. “udf_ValidateEmailAddress()”
For triggers: use prefix ‘trg_’ + TableName +’_’ + ‘[INSTEADOF]/[AFTER]’ + ‘_’ + ‘INSERT/UPDATE/DELETE[_ INSERT/UPDATE/DELETE]’. e.g. “trg_tbl_Orders_AFTER_INSERT_UPDATE”
For Indexes: use prefix ‘idx_’ + TableName + ’_’ + ColumnNames separated with underscore (_) character. e.g. “idx_tbl_Orders_OrderDate”
For Primary Key Constraints: use prefix ‘pk_’ + TableName + ’_’ + ColumnNames separated with underscore (_) character. e.g. “pk_tbl_Orders_OrderId”
For Foreign Key Constraints: use prefix ‘fk_’ + TableName + ’_’ + ColumnNames separated with underscore (_) character. e.g. “fk_tbl_OrderDetails_OrderId”
For Unique Key Constraints: use prefix ‘uk_’ + TableName + ’_’ + ColumnNames separated with underscore (_) character. e.g. “uk_tbl_Customers_SocialSecurityNumber”
For Check Constraints: use prefix ‘ck_’ + TableName + ’_’ + ColumnName. e.g. “ck_tbl_OrderDetails_Quantity”
For Default Constraints: use prefix ‘df_’ + TableName + ’_’ + ColumnName. “df_tbl_Orders _OrderDate”
Prepared by Bihag Thaker
There are various types of objects in SQL Server like Tables, Views, Stored Procedures, User Defined Functions, Triggers, Constraints, Indexes, Schemas and Variables and so on. These objects should be assigned appropriate identifiers with which they are referenced in the database. Object Identifiers are of two types in SQL Server: (1) Regular Identifiers and (2) Delimited Identifiers
Regular Identifiers are the one that does comply with the format of identifiers in SQL Server. Delimited Identifiers are the one which does not comply with the format of identifiers.
Delimited identifiers must be delimited or enclosed by double quotation marks (“) or a pair of square brackets ([ ]).
For example, a delimited identifier not complying with the rules of the format of regular identifiers can be a T-SQL Reserved Word or can contain space within or any character which is not generally allowed in regular identifiers.
For example, even if ORDER is the reserved word in SQL Server, it can be assigned as an identifier to an object if it is delimited either by double quotation mark (“) or a pair of square brackets ([ ]) in the following manner in its definition
1 2 3 4 5 6 7 8 9 10 11 |
CREATE TABLE [Order] ( OrderID INT ) --OR CREATE TABLE "Order" ( OrderID INT ) |
Delimiting identifiers with a double quotation mark (“) depends on SET QUOTED_IDENTIFIER setting of SQL Server.
If SET QUOTED_IDENTIFIER is set to OFF then all literal strings enclosed with a double quotation mark (“) are interpreted as string constants and not as identifiers. Due to this, delimiting identifiers with a double quotation mark (“) should never be used.
Following is an example of regular identifier assigned to a table which is not delimited:
1 2 3 4 |
CREATE TABLE tblCustomerOrders ( OrderID INT ) |
Even if, SQL Server allows the use of delimited identifiers, its use is strongly discouraged and should notbe used. Use of regular identifiers to identify the objects should be practiced.
Following are some of the common guild lines along with best practices which follow the rules of regular identifiers and should be practiced while assigning identifiers to various objects:
Maximum length of identifiers should not exceed 128 characters and 116 characters for local
temporary tables.
First character must be a letter which can be any from either a-z or A-Z.
Even if SQL Server allows underscore (_), at sign (@) and number sign (#) as the first character, they should not be used as the first character in identifiers. Some of these characters have special meaning in SQL Server.
For example, an identifier that starts with a number sign (#) denotes a local temporary table, one that starts with ‘double number sign’ (##) denotes a global temporary table and one that starts with ‘at sign’ (@) denotes a variable name or parameter name.
Some of the functions in SQL Server starts with double at signs (@@). Thus, these special characters should be avoided while specifying identifiers.
The subsequent characters can be any letter or at sign (@), a dollar sign ($) or an underscore (_) character. However, use of these special characters in identifiers should be avoided. Underscore (_) character may be used for word separation.
Space should not be used in identifiers. For word separation, either an underscore (_) or Pascal Casing should be used. Wherever possible, usage of underscore (_) should be minimized by using Pascal Casing in identifiers.
For example, an identifier should be specified as ‘CustomerAddress’ instead of ‘Customer_Address’. It achieves the same readability by use of Pascal Casing and also reduces the length of identifiers.
Do not use T-SQL Keywords as object identifiers
Identifiers should be self-explanatory rather than being abbreviations and acronyms and they should indicate their purpose. This helps the developer in understanding the code written by someone else even if comments are not present.
For example, rather than using variable names @A or @i for ‘Sum’ and ‘Loop Counter’ respectively, they should be defined as @Sum and @Counter.