Since the DMF sys.dm_io_virtual_file_stats returns cumulative I/O statistics for the database files (accumulated since the instance last started), you can't run a single query against it and conclude that the database files are experiencing a high latency issue. The overall aggregation will mask spikes of poor performance.
It is recommended to capture the latency data over a period before reaching a conclusion. In this tip, we will explore how to capture I/O latency statistics over a period using the DMF.
But before that, I would request you to have a look at the first part of this blog, which will help you understand all the information you will be capturing over the period – SQL SERVER DATABASE FILES WISE I/O LATENCY PART#1.
Let’s follow a step-by-step approach to capturing the period data:
Step 1:
- Create a table ‘SQL_database_file_wise_I_O_latency’ under the master database, if it does not already exist
- Capture the output of sys.dm_io_virtual_file_stats into the table
USE master;

IF NOT EXISTS (SELECT * FROM [master].[sys].[objects]
               WHERE [name] = N'SQL_database_file_wise_I_O_latency')
BEGIN
    CREATE TABLE [dbo].[SQL_database_file_wise_I_O_latency](
        [database name]      [nvarchar](128) NULL,
        [avg_read_stall_ms]  [numeric](10, 1) NULL,
        [avg_write_stall_ms] [numeric](10, 1) NULL,
        [avg_io_stall_ms]    [numeric](10, 1) NULL,
        [file size (mb)]     [decimal](18, 2) NULL,
        [physical_name]      [nvarchar](260) NOT NULL,
        [type_desc]          [nvarchar](60) NULL,
        [io_stall_read_ms]   [bigint] NOT NULL,
        [num_of_reads]       [bigint] NOT NULL,
        [io_stall_write_ms]  [bigint] NOT NULL,
        [num_of_writes]      [bigint] NOT NULL,
        [io_stalls]          [bigint] NULL,
        [total_io]           [bigint] NULL,
        [capturetime]        [datetime] NOT NULL
    ) ON [PRIMARY];
END

INSERT INTO SQL_database_file_wise_I_O_latency
SELECT DB_NAME(fs.database_id) AS [database name],
       CAST(fs.io_stall_read_ms / (1.0 + fs.num_of_reads) AS NUMERIC(10,1)) AS [avg_read_stall_ms],
       CAST(fs.io_stall_write_ms / (1.0 + fs.num_of_writes) AS NUMERIC(10,1)) AS [avg_write_stall_ms],
       CAST((fs.io_stall_read_ms + fs.io_stall_write_ms)
            / (1.0 + fs.num_of_reads + fs.num_of_writes) AS NUMERIC(10,1)) AS [avg_io_stall_ms],
       CONVERT(DECIMAL(18,2), mf.size / 128.0) AS [File Size (MB)],
       mf.physical_name,
       mf.type_desc,
       fs.io_stall_read_ms,
       fs.num_of_reads,
       fs.io_stall_write_ms,
       fs.num_of_writes,
       fs.io_stall_read_ms + fs.io_stall_write_ms AS [io_stalls],
       fs.num_of_reads + fs.num_of_writes AS [total_io],
       GETDATE() AS capturetime
FROM sys.dm_io_virtual_file_stats(NULL, NULL) AS fs
INNER JOIN sys.master_files AS mf WITH (NOLOCK)
        ON fs.database_id = mf.database_id
       AND fs.[file_id] = mf.[file_id]
ORDER BY avg_io_stall_ms DESC
OPTION (RECOMPILE);
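After the script has run a few times, you can sanity-check what is being collected. A minimal verification query (assuming the table was created in master, as above):

```sql
-- Inspect the most recent captures, worst latency first
SELECT TOP (50)
       [database name],
       physical_name,
       avg_read_stall_ms,
       avg_write_stall_ms,
       avg_io_stall_ms,
       capturetime
FROM master.dbo.SQL_database_file_wise_I_O_latency
ORDER BY capturetime DESC, avg_io_stall_ms DESC;
```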
Step 2:
- Archive data older than two weeks from the table. You can modify the archiving period as per your requirement.
DELETE FROM master.dbo.SQL_database_file_wise_I_O_latency
WHERE capturetime <= DATEADD(WEEK, -2, GETDATE());
Script to create the SQL Server job directly
- I created a SQL Server Agent job using the above scripts and scripted it out below, so that you can run the script to create the job directly and start capturing database file wise I/O latency details. Kindly run the script on the SQL Server instance where you want to capture the data.
Note: On the first successful execution of the job, the table “SQL_database_file_wise_I_O_latency” will be created in the master database. If you want to keep the table in a different database, replace master with your database name in the job step commands.
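For example, if you keep the table in a dedicated DBA utility database (the name DBATools here is a hypothetical placeholder, not part of the original scripts), the archive step would become:

```sql
-- Hypothetical example: table kept in a utility database named DBATools
-- instead of master; the capture step's @database_name must match.
DELETE FROM DBATools.dbo.SQL_database_file_wise_I_O_latency
WHERE capturetime <= DATEADD(WEEK, -2, GETDATE());
```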
USE [msdb]
GO

/****** Object: Job [SQL_database_file_wise_I_O_latency] Script Date: 24/6/2017 11:41:48 AM ******/
BEGIN TRANSACTION
DECLARE @ReturnCode INT
SELECT @ReturnCode = 0

/****** Object: JobCategory [[Uncategorized (Local)]] Script Date: 24/6/2017 11:41:48 AM ******/
IF NOT EXISTS (SELECT name FROM msdb.dbo.syscategories
               WHERE name = N'[Uncategorized (Local)]' AND category_class = 1)
BEGIN
    EXEC @ReturnCode = msdb.dbo.sp_add_category @class=N'JOB', @type=N'LOCAL', @name=N'[Uncategorized (Local)]'
    IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback
END

DECLARE @jobId BINARY(16)
EXEC @ReturnCode = msdb.dbo.sp_add_job @job_name=N'SQL_database_file_wise_I_O_latency',
        @enabled=1,
        @notify_level_eventlog=0,
        @notify_level_email=0,
        @notify_level_netsend=0,
        @notify_level_page=0,
        @delete_level=0,
        @description=N'No description available.',
        @category_name=N'[Uncategorized (Local)]',
        @owner_login_name=N'sa',
        @job_id = @jobId OUTPUT
IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback

/****** Object: Step [Capture SQL_database_file_wise_I_O_latency] ******/
EXEC @ReturnCode = msdb.dbo.sp_add_jobstep @job_id=@jobId,
        @step_name=N'Capture SQL_database_file_wise_I_O_latency',
        @step_id=1,
        @cmdexec_success_code=0,
        @on_success_action=3,
        @on_success_step_id=0,
        @on_fail_action=2,
        @on_fail_step_id=0,
        @retry_attempts=0,
        @retry_interval=0,
        @os_run_priority=0,
        @subsystem=N'TSQL',
        @command=N'IF NOT EXISTS (SELECT * FROM [master].[sys].[objects]
               WHERE [name] = N''SQL_database_file_wise_I_O_latency'')
BEGIN
    CREATE TABLE [dbo].[SQL_database_file_wise_I_O_latency](
        [database name]      [nvarchar](128) NULL,
        [avg_read_stall_ms]  [numeric](10, 1) NULL,
        [avg_write_stall_ms] [numeric](10, 1) NULL,
        [avg_io_stall_ms]    [numeric](10, 1) NULL,
        [file size (mb)]     [decimal](18, 2) NULL,
        [physical_name]      [nvarchar](260) NOT NULL,
        [type_desc]          [nvarchar](60) NULL,
        [io_stall_read_ms]   [bigint] NOT NULL,
        [num_of_reads]       [bigint] NOT NULL,
        [io_stall_write_ms]  [bigint] NOT NULL,
        [num_of_writes]      [bigint] NOT NULL,
        [io_stalls]          [bigint] NULL,
        [total_io]           [bigint] NULL,
        [capturetime]        [datetime] NOT NULL
    ) ON [PRIMARY];
END

INSERT INTO SQL_database_file_wise_I_O_latency
SELECT DB_NAME(fs.database_id) AS [database name],
       CAST(fs.io_stall_read_ms / (1.0 + fs.num_of_reads) AS NUMERIC(10,1)) AS [avg_read_stall_ms],
       CAST(fs.io_stall_write_ms / (1.0 + fs.num_of_writes) AS NUMERIC(10,1)) AS [avg_write_stall_ms],
       CAST((fs.io_stall_read_ms + fs.io_stall_write_ms)
            / (1.0 + fs.num_of_reads + fs.num_of_writes) AS NUMERIC(10,1)) AS [avg_io_stall_ms],
       CONVERT(DECIMAL(18,2), mf.size / 128.0) AS [File Size (MB)],
       mf.physical_name,
       mf.type_desc,
       fs.io_stall_read_ms,
       fs.num_of_reads,
       fs.io_stall_write_ms,
       fs.num_of_writes,
       fs.io_stall_read_ms + fs.io_stall_write_ms AS [io_stalls],
       fs.num_of_reads + fs.num_of_writes AS [total_io],
       GETDATE() AS capturetime
FROM sys.dm_io_virtual_file_stats(NULL, NULL) AS fs
INNER JOIN sys.master_files AS mf WITH (NOLOCK)
        ON fs.database_id = mf.database_id
       AND fs.[file_id] = mf.[file_id]
ORDER BY avg_io_stall_ms DESC
OPTION (RECOMPILE);',
        @database_name=N'master',
        @flags=0
IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback

/****** Object: Step [Archive SQL_database_file_wise_I_O_latency] ******/
EXEC @ReturnCode = msdb.dbo.sp_add_jobstep @job_id=@jobId,
        @step_name=N'Archive SQL_database_file_wise_I_O_latency',
        @step_id=2,
        @cmdexec_success_code=0,
        @on_success_action=1,
        @on_success_step_id=0,
        @on_fail_action=2,
        @on_fail_step_id=0,
        @retry_attempts=0,
        @retry_interval=0,
        @os_run_priority=0,
        @subsystem=N'TSQL',
        @command=N'DELETE FROM master.dbo.SQL_database_file_wise_I_O_latency
WHERE capturetime <= DATEADD(WEEK, -2, GETDATE());',
        @database_name=N'master',
        @flags=0
IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback

EXEC @ReturnCode = msdb.dbo.sp_update_job @job_id = @jobId, @start_step_id = 1
IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback

EXEC @ReturnCode = msdb.dbo.sp_add_jobschedule @job_id=@jobId,
        @name=N'Schedule - SQL_database_file_wise_I_O_latency',
        @enabled=1,
        @freq_type=4,
        @freq_interval=1,
        @freq_subday_type=4,
        @freq_subday_interval=10,
        @freq_relative_interval=0,
        @freq_recurrence_factor=0,
        @active_start_date=20170624,
        @active_end_date=99991231,
        @active_start_time=0,
        @active_end_time=235959,
        @schedule_uid=N'74653a85-dca0-427d-b97f-2dc995402d3d'
IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback

EXEC @ReturnCode = msdb.dbo.sp_add_jobserver @job_id = @jobId, @server_name = N'(local)'
IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback

COMMIT TRANSACTION
GOTO EndSave
QuitWithRollback:
    IF (@@TRANCOUNT > 0) ROLLBACK TRANSACTION
EndSave:
GO
- Once you have run the above script successfully, verify that the job was created.
- Verify the job steps; the job contains two steps – capturing and archiving the data.
- Check the job schedule – by default, the job captures the data every 10 minutes, every day.
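Because each captured row stores cumulative counters, per-interval latency can be derived by differencing consecutive snapshots for the same file. A sketch using the LAG window function (requires SQL Server 2012 or later; column names assume the table created above). Note that the DMF counters reset when the instance restarts, so deltas spanning a restart can be negative and should be discarded:

```sql
-- Per-interval latency: difference each capture against the previous snapshot of the same file
WITH snaps AS (
    SELECT [database name], physical_name, capturetime,
           io_stall_read_ms, num_of_reads, io_stall_write_ms, num_of_writes,
           LAG(io_stall_read_ms)  OVER (PARTITION BY physical_name ORDER BY capturetime) AS prev_stall_read,
           LAG(num_of_reads)      OVER (PARTITION BY physical_name ORDER BY capturetime) AS prev_reads,
           LAG(io_stall_write_ms) OVER (PARTITION BY physical_name ORDER BY capturetime) AS prev_stall_write,
           LAG(num_of_writes)     OVER (PARTITION BY physical_name ORDER BY capturetime) AS prev_writes
    FROM master.dbo.SQL_database_file_wise_I_O_latency
)
SELECT [database name], physical_name, capturetime,
       CAST((io_stall_read_ms  - prev_stall_read)
            / (1.0 + num_of_reads  - prev_reads)  AS NUMERIC(10,1)) AS interval_avg_read_stall_ms,
       CAST((io_stall_write_ms - prev_stall_write)
            / (1.0 + num_of_writes - prev_writes) AS NUMERIC(10,1)) AS interval_avg_write_stall_ms
FROM snaps
WHERE prev_reads IS NOT NULL              -- skip the first snapshot of each file
  AND io_stall_read_ms >= prev_stall_read -- skip intervals spanning an instance restart
ORDER BY interval_avg_read_stall_ms DESC;
```

This is what exposes the spikes the cumulative averages hide: a file that looks fine overall may show individual 10-minute intervals with very high stall times.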
Do you have a better way to capture the data? Please share it via the comments to help others.