Recently, I was helping one of the community attendees on the SQL Server Query Time-Out issue. Once we are done with troubleshooting, he asked a very excellent question;
“I am dealing with very critical OLTP systems. As my company has been expanding the business, the data is rapidly growing. It is causing a lot of Store Procedures to experience query time-out issue. Application team wants to know which all the SPs are Time-Out prone so that they can actively work on the problematic SPs and address the issue before it happens.”
Now, when we are mainly talking about OLTP workloads application, we assume very high throughput and concurrency. It means the response times to finish the workload should in sub-seconds. Considering the OLTP workloads characteristics, your application team may have hardcoded the query time-out limit in application code. It could be the standard 60 seconds or maybe a little bit higher or lower.
Demonstration
To demonstrate it, I am using AdventureworksDW2016CTP3 database and assuming that Application team has hardcoded the query time-out limit 60 seconds in application code for most of the SPs. Now, let’s explore step-by-step approach how we can pro-actively look at any workloads that might be getting closer your query time-out value.
Setting up execution Time-Out limit
- Let’s first see how you can set execution time-out limit to 60 seconds
1) Click on the connect to server option
2) Click on the Options button
3) Go to Connection Properties
4) Change the execution time-out from zero to 60 seconds
5) Click on the Connect
Note: In case, your SQL instance is already connected. I would recommend to disconnect and re-connect by following the steps.
6) Once you established the connection successfully, run the below query to test the execution time-out limit.
SELECT TOP 1 * FROM [AdventureworksDW2016CTP3].[sales].[SalesOrderDetail] WAITFOR DELAY '00:01:10'
We can see that the moment query took more than 60 seconds to run, we got “Execution (connection) Timeout” Expire error.
Simulating the Scenario
- Let’s create exactly same two stored procedures, except the wait for delay time will be different for the stored procedures.
a) SP1 – sel_nottimeoutbound – This SP got wait for a delay of 30 seconds.
USE [AdventureworksDW2016CTP3] GO SET ANSI_NULLS ON GO SET QUOTED_IDENTIFIER ON GO CREATE PROCEDURE [dbo].[sel_nottimeoutbound] AS SELECT [SalesOrderID], SUM([UnitPrice]) AS TotalUnitPrice FROM [AdventureworksDW2016CTP3].[sales].[SalesOrderDetail] GROUP BY [SalesOrderID] WAITFOR DELAY '00:00:30'
b) SP2 – sel_timeoutbound- This SP got wait for a delay of 58 seconds. It is close to querying time-out value.
USE [AdventureworksDW2016CTP3] GO SET ANSI_NULLS ON GO SET QUOTED_IDENTIFIER ON GO CREATE PROCEDURE [dbo].[sel_timeoutbound] AS SELECT [SalesOrderID], SUM([UnitPrice]) AS TotalUnitPrice FROM [AdventureworksDW2016CTP3].[sales].[SalesOrderDetail] GROUP BY [SalesOrderID] WAITFOR DELAY '00:00:58'
- Execute both the SPs in separate-separate query windows
a) Executed the SP “sel_nottimeoutbound” and completed in 30 seconds.
USE AdventureworksDW2016CTP3 GO EXEC [dbo].[sel_nottimeoutbound]
b) Executed the SP “sel_timeoutbound” and completed in 58 seconds.
USE AdventureworksDW2016CTP3 GO EXEC [dbo].[sel_timeoutbound]
Finding Time-Out Prone Queries from Buffer Cache
- Now regarding viewing proactively which queries are getting near to the timeout value, we can use the DMV sys.dm_exec-procedure_stats and we can look at max_elapsed_time. Let’s run the below script to get all the SPs’ max_elapsed_time.
SELECT CASE WHEN database_id = 32767 THEN 'Resource' ELSE DB_NAME(database_id) END AS dbname ,OBJECT_SCHEMA_NAME (OBJECT_ID, database_id) AS [schema_name] ,OBJECT_NAME(OBJECT_ID, database_id) AS object_name ,MAX (max_elapsed_time)/1000000 AS max_elapsed_time_sec ,MAX (cached_time) as 'cached_time' ,MAX (last_execution_time) AS 'last_execution_time' ,MAX (execution_count) AS 'execution_count' FROM sys.dm_exec_procedure_stats WHERE type = 'P' AND database_id NOT IN (1,2,3,4,32767) GROUP BY database_id,[object_id]
Query Result:
You see the SP “sel_nottimeoutbound” max elapsed time touched to 30 seconds, and the SP “sel_timeoutbound” max_elapsed_time reached the 58 seconds mark. As we assumed that Application team had hardcoded the query time-out limit to 60 seconds in the application code, It means the SP “sel_timeoutbound” will be experiencing query time-out issue soon in future.
Conclusion
Using the DMV sys.dm_exec-procedure_stats, you can proactively figure out time-out prone queries. That is how you can actively inform your application team that these are queries which are going to experience query time-out issue soon in future. So that Application team can start optimizing them dedicatedly before they start timing out in production.