Identifying Top Wait Events in SQL Server

Introduction
Identifying Top Wait Events
Actionable Insights
Conclusion

Introduction

In the world of managing databases, making them run faster and more efficiently is a never-ending goal. For people who manage SQL Server systems, known as database administrators (DBAs), understanding and handling wait events is incredibly important. Wait events are basically periods when a session in SQL Server is doing nothing, just waiting for something specific to happen before it can get back to work.
Understanding these wait events is key to figuring out where the problems are in SQL Server and how to make things better. That's why learning how to use SQL queries to dig into these wait events is really valuable. In this blog post, we'll talk about how to do just that—how to use SQL queries to find out important stuff about the top wait events.

Identifying Top Wait Events

Now, let's interpret the key components of the query results:
  • Wait Type: This column identifies the type of wait event, such as I/O, CPU, or network-related waits.
  • Wait Time: Indicates the total time spent waiting for the specified event to occur.
  • Signal Wait Time:Represents the time spent waiting for the CPU after the event has occurred.
  • Percentage of Signal Waits:Provides insights into the proportion of signal waits relative to total waits. A higher percentage may indicate CPU contention issues.
  • Number of Waits: Indicates the frequency of occurrence for each wait event.
  • Average Time per Wait:Calculated by dividing total wait time by the number of waits, offering insights into the average duration of each wait event

with wait_stats as
(
    select
        server_utc_start_time,
        sample_utc_time,
        x.sample_time,
        x.wait_type,
        sum(x.sum_wait_time_ms) as wait_time_ms,
        sum(x.sum_signal_wait_time_ms) as signal_wait_time_ms,
        sum(x.sum_waiting_tasks) as waiting_tasks
    from (
        select
            (select dateadd(hh, datediff(hh, getdate(), getutcdate()), sqlserver_start_time) from sys.dm_os_sys_info) as server_utc_start_time,
            getutcdate() as sample_utc_time,
            sysdatetimeoffset() as sample_time,
            owt.wait_type,
            sum(owt.wait_duration_ms) over (partition by owt.wait_type, owt.session_id) as sum_wait_time_ms,
            0 as sum_signal_wait_time_ms,
            0 as sum_waiting_tasks
        from
            sys.dm_os_waiting_tasks owt
        where
            owt.session_id > 50
        and
            owt.wait_duration_ms >= 0
        union all
        select
            (select dateadd(hh, datediff(hh, getdate(), getutcdate()), sqlserver_start_time) from sys.dm_os_sys_info) as server_utc_start_time,
            getutcdate() as sample_utc_time,
            sysdatetimeoffset() as sample_time,
            os.wait_type,
            sum(os.wait_time_ms) over (partition by os.wait_type) as sum_wait_time_ms,
            sum(os.signal_wait_time_ms) over (partition by os.wait_type) as sum_signal_wait_time_ms,
            sum(os.waiting_tasks_count) over (partition by os.wait_type) as sum_waiting_tasks
        from
            sys.dm_os_wait_stats os
    ) x
    group by
        server_utc_start_time,
        sample_utc_time,
        x.sample_time,
        x.wait_type
)
select
    'WAIT_STATS' as wait_stats,
    waits.server_utc_start_time as server_utc_startup_time,
    waits.sample_utc_time as sample_utc_time,
    waits.sample_time as sample_time,
    cast(datediff(mi, waits.server_utc_start_time, waits.sample_time) / 60.0 as decimal(18,1)) as hours_sample,
    waits.wait_type as wait_type,
    cast(c.wait_time_seconds / 60.0 / 60 as decimal(18,1)) as wait_time_hours,
    cast(waits.wait_time_ms / 1000.0 / cores.cpu_count / datediff(ss, waits.server_utc_start_time, waits.sample_time) as decimal(18,1)) AS per_core_per_hour,
    cast(c.signal_wait_time_seconds / 60.0 / 60 as decimal(18,1)) as signal_wait_time_hours,
    case when c.wait_time_seconds > 0 then
        cast(100.*(c.signal_wait_time_seconds / c.wait_time_seconds) as numeric(4,1))
    else
        0
    end as percent_signal_waits,
    waits.waiting_tasks as number_of_waits,
    case when waits.waiting_tasks > 0 then
        cast(waits.wait_time_ms / (1.0 * (waits.waiting_tasks)) as numeric(12,1))
    else
        0
    end as avg_ms_per_wait
from
    wait_stats waits
    cross apply
        (
            select sum(1) as cpu_count
            from sys.dm_os_schedulers
            where status = 'VISIBLE ONLINE'
            and is_online = 1
        ) as cores
    cross apply
        (
            select
                cast(waits.wait_time_ms/1000. as numeric(12,1)) as wait_time_seconds,
                cast(waits.signal_wait_time_ms/1000. as numeric(12,1)) as signal_wait_time_seconds
        ) as c
where
    waits.waiting_tasks > 0
and
    waits.wait_time_ms > 0
and
    waits.wait_type not in (
        'BROKER_EVENTHANDLER', 'BROKER_RECEIVE_WAITFOR', 'BROKER_TASK_STOP',
        'BROKER_TO_FLUSH', 'BROKER_TRANSMITTER', 'CHECKPOINT_QUEUE',
        'CLR_AUTO_EVENT', 'CLR_MANUAL_EVENT', 'CLR_SEMAPHORE',
        'DBMIRROR_DBM_EVENT', 'DBMIRROR_DBM_MUTEX', 'DBMIRROR_EVENTS_QUEUE',
        'DBMIRROR_WORKER_QUEUE', 'DBMIRRORING_CMD', 'DIRTY_PAGE_POLL',
        'DISPATCHER_QUEUE_SEMAPHORE', 'FT_IFTS_SCHEDULER_IDLE_WAIT',
        'FT_IFTSHC_MUTEX', 'FT_IFTSISM_MUTEX', 'HADR_CLUSAPI_CALL',
        'HADR_FABRIC_CALLBACK', 'HADR_FILESTREAM_IOMGR_IOCOMPLETION',
        'HADR_LOGCAPTURE_WAIT', 'HADR_NOTIFICATION_DEQUEUE',
        'HADR_TIMER_TASK', 'HADR_WORK_QUEUE', 'LAZYWRITER_SLEEP',
        'LOGMGR_QUEUE', 'ONDEMAND_TASK_QUEUE', 'PARALLEL_REDO_DRAIN_WORKER',
        'PARALLEL_REDO_LOG_CACHE', 'PARALLEL_REDO_TRAN_LIST',
        'PARALLEL_REDO_TRAN_TURN', 'PARALLEL_REDO_WORKER_SYNC',
        'PARALLEL_REDO_WORKER_WAIT_WORK', 'PREEMPTIVE_HADR_LEASE_MECHANISM',
        'PREEMPTIVE_SP_SERVER_DIAGNOSTICS', 'PREEMPTIVE_XE_DISPATCHER',
        'QDS_ASYNC_QUEUE', 'QDS_CLEANUP_STALE_QUERIES_TASK_MAIN_LOOP_SLEEP',
        'QDS_PERSIST_TASK_MAIN_LOOP_SLEEP', 'QDS_SHUTDOWN_QUEUE',
        'REDO_THREAD_PENDING_WORK', 'REQUEST_FOR_DEADLOCK_SEARCH',
        'SLEEP_SYSTEMTASK', 'SLEEP_TASK', 'SOS_WORK_DISPATCHER',
        'SP_SERVER_DIAGNOSTICS_SLEEP', 'SQLTRACE_BUFFER_FLUSH',
        'SQLTRACE_INCREMENTAL_FLUSH_SLEEP', 'UCS_SESSION_REGISTRATION',
        'WAIT_XTP_OFFLINE_CKPT_NEW_LOG', 'WAITFOR', 'XE_DISPATCHER_WAIT',
        'XE_LIVE_TARGET_TVF', 'XE_TIMER_EVENT'
    )
order by
    wait_time_seconds desc;


Actionable Insights

Armed with these insights, DBAs can take targeted actions to optimize SQL Server performance. For instance:
  • Identifying Bottlenecks: By focusing on wait events with the highest wait times, DBAs can pinpoint performance bottlenecks and prioritize optimization efforts accordingly.
  • Tuning Resource Allocation: Understanding the distribution of signal waits versus total waits can help DBAs fine-tune resource allocation, such as CPU resources, to alleviate contention issues.
  • Optimizing Queries: Analyzing wait events associated with specific queries can guide query optimization strategies, leading to improved overall performance.

Conclusion

In the complex landscape of SQL Server performance tuning, analyzing wait events serves as a cornerstone for identifying and addressing performance bottlenecks. By leveraging SQL queries to extract and interpret wait event statistics, DBAs can gain valuable insights into system behavior and take proactive measures to enhance performance, ensuring optimal responsiveness and reliability for SQL Server environments.

Related content



Rate Your Experience

: 89 : 1


Last updated in July, 2024

Online Tests
Read more

Cloud Technology
Read more

Oracle Database
Read more

MSSQL Database
Read more

PostGres Database
Read more

Linux
Read more

ASP/C#
Read more

Quick Access