Introduction
CPU utilization is a critical performance metric for any Oracle database, but where do you go to find the most accurate information? While many DBAs and developers may be familiar with the standard sources of CPU utilization data from operating system and other tools, there are other, often-overlooked sources of valuable information buried deep within Oracle’s Automated Workload Repository (AWR).
In addition to the usual suspect within Oracle (i.e. DBA_HIST_SYSMETRIC_SUMMARY) there are other sources of CPU utilization data that can be expertly used if you apply the proper calculations. These sources include DBA_HIST_ACTIVE_SESS_HISTORY (example use cases: user-level or SQL-level CPU utilization), DBA_HIST_SERVICE_STAT (example use case: service-level CPU utilization), DBA_HIST_SYSSTAT (various Oracle process level CPU statistics), and DBA_HIST_SYS_TIME_MODEL (various Oracle process level CPU statistics). By tapping into these hidden sources that need to be calculated, you can gain a more granular understanding of how your database is using CPU resources which can help with a variety of tasks such as workload relocation and maximizing performance.
In this exciting new blog post, we'll take a deep dive into the world of CPU utilization in Oracle databases, exploring the various sources of data available, and providing practical tips and examples for using that information to improve your system's performance. From analyzing historical trends to understanding cumulative statistics, we'll cover the basics of what you need to know to get the most out of your CPU utilization data. Join us on this journey of discovery and uncover the hidden secrets of CPU utilization in Oracle databases!
Uncovering Peak Machine Workload
When it comes to analyzing CPU utilization trends in Oracle databases, there is an often-overlooked metric that provides valuable insights with minimal effort - 'Host CPU Utilization (%)' from DBA_HIST_SYSMETRIC_SUMMARY. Not only do the values for this metric come from the operating system itself, but this metric is commonly understood and can easily be used to reveal critical periods of high workload, helping you pinpoint performance bottlenecks and optimize resource allocation.
One of the compelling reasons I use 'Host CPU Utilization (%)' is its ease of use. Since CPU metrics are just data, what can be easier than talking SQL to your data? Unlike sources of operating system generated CPU metrics that require o.s.-specific accounts, specialized tools, or complex [to some] commands, this metric can be obtained directly from the Oracle database using simple SQL queries. No additional privileges [other than being able to read the Oracle data dictionary] or external tools are required, making it accessible to a broader range of users, including DBAs, developers, and performance analysts. [I only wish I could talk SQL to the interesting operating system sources of data, but that’s a subject for a future post.]
Here's a simple query you can use to get a time-series on CPU Utilization from the AWR view DBA_HIST_SYSMETRIC_SUMMARY:
select to_char(begin_interval_time, 'YYYY-MM-DD HH24:MI') begin_interval_time
, snap_id, instance_number
, sum(case metric_name when 'Host CPU Utilization (%)'
then average end) as "AVG Host CPU Utilization (%)"
, sum(case metric_name when 'Host CPU Utilization (%)'
then maxval end) as "MAX Host CPU Utilization (%)"
from dba_hist_sysmetric_summary natural join dba_hist_snapshot
where begin_interval_time >= trunc(sysdate)
and metric_name in ('Host CPU Utilization (%)')
group by to_char(begin_interval_time, 'YYYY-MM-DD HH24:MI') , snap_id, instance_number
order by snap_id, instance_number
;
Sidebar Note: I frequently use this pattern to pivot across the top many metrics (there are about 160 metrics in DBA_HIST_SYSMETRIC_SUMMARY). While I call this the “small model approach” (i.e. hand picking a few metrics from among tens of thousands) and is not suitable for solving any arbitrary performance problem, pivoting out “sysmetric” metrics can be very helpful in getting a quick view into what is going on with your database workload. For example, I will often also look at IO metrics such as: ‘Average Synchronous Single-Block Read Latency’, ‘Physical Reads Per Sec’, ‘Physical Writes Per Sec’, ‘Physical Read Total Bytes Per Sec’, ‘Physical Write Total Bytes Per Sec’; also workload metrics such as: ‘Average Active Sessions’, ‘Database CPU Time Ratio’, ‘Database Wait Time Ratio’, ‘Redo Generated Per Sec’, ‘Logons Per Sec’, and many more.
The simplicity of 'Host CPU Utilization (%)' doesn't compromise its effectiveness. Although the granularity is based on average and maximum values over the snapshot interval, it is often more than sufficient for identifying periods of high workload. By reviewing the time-series data, you can easily spot spikes in CPU utilization and correlate them with specific timeframes, queries, or application activities. This can be invaluable when troubleshooting performance issues, identifying resource-intensive SQL statements, or assessing the impact of workload changes.
Moreover, 'Host CPU Utilization (%)' provides a concise and easily interpretable measure of CPU load. The percentage value clearly indicates the proportion of CPU resources consumed by processes during a given interval. This makes it a powerful tool for capacity planning, performance monitoring, and workload management.
By leveraging 'Host CPU Utilization (%)', you gain valuable insights into the overall CPU consumption of the Oracle database without the need for extensive expertise or complex system-level monitoring. It empowers you to make informed decisions, optimize resource allocation, and proactively address performance challenges.
So, next time you find yourself in need of a quick and accessible CPU utilization metric, don't overlook the hidden gem of 'Host CPU Utilization (%)' in DBA_HIST_SYSMETRIC_SUMMARY. Unlock its potential, and let it guide you towards a deeper understanding of your database workload, enabling you to unlock a realm of optimization possibilities and ensure the smooth operation of your Oracle environment.
The Power of Obtaining CPU Utilization at User/Process/SQL_ID Level in Oracle Databases
Obtaining CPU utilization metrics at the user/process/SQL_ID level is a common need for database administrators and developers. While overall Host CPU Utilization (%) is a commonly understood metric, it does not provide granular insights into the workload and its impact on the database performance. Instead, metrics such as DB CPU and DB Time provide more accurate and useful information on how much time the database spends on CPU and how much time is spent waiting for I/O, network, or other resources.
However, despite the usefulness of these metrics, they are not easy to obtain. Oracle does not provide % CPU utilization at the user/process/SQL_ID level natively, so database administrators and developers must resort to custom calculations or third-party tools to get this information.
Knowing CPU utilization at the user/process/SQL_ID level is important for workload analysis, capacity planning, and resource allocation. By understanding which users, processes, or SQL statements are consuming what percentage of CPU resources, database administrators and developers can identify performance bottlenecks and optimize the workload distribution. This information can also help in making decisions on whether to reallocate workloads to another hosting environment or to the cloud, based on the available capacity and the expected workload demands.
Rationale for Calculating CPU % from Other Sources:
Even though Oracle by default does not provide the percent CPU usage of a particular user or process, there are several use cases where someone may want to know this information. With DBA_HIST_ACTIVE_SESS_HISTORY, it is commonly known that you can get the approximate number of seconds a session was ON CPU, but that does not tell you anything in comparison to the Host CPU Utilization (%); for this you’ll need to put your thinking cap on and do some calculations [read on for details].
Here are some common scenarios for needing a more granular understanding of CPU utilization:
Performance troubleshooting: When troubleshooting performance issues in an Oracle database, it can be helpful to identify which user or process is using a significant amount of CPU resources or what proportion of CPU resources. By monitoring CPU usage over time, you can identify patterns and correlations between CPU usage and specific queries or transactions.
Resource allocation: In a multi-tenant environment, where multiple users or applications are sharing the same database instance, it can be useful to monitor CPU usage by user or process to allocate resources fairly and optimize performance.
Chargeback/show-back: In some organizations, IT services are charged back or show-backed to internal customers based on resource usage. By tracking CPU usage by user or process, you can accurately allocate costs and ensure that users are being charged appropriately for the resources they consume.
Security: It may be of value to monitor CPU usage by user or process to detect potential security threats.
Overall, tracking CPU usage by user or process can help identify performance bottlenecks, allocate resources efficiently, and ensure database security, making it an essential task for database administrators in many organizations.
Calculating CPU Utilization from ASH
The need to develop custom calculations or queries to extract CPU utilization metrics by user or process in an Oracle database arises from the fact that Oracle does not provide these metrics natively. While Oracle provides several system-level metrics for CPU usage, such as DB CPU Time and Host CPU Utilization (%), these metrics do not provide insights into CPU usage by individual users or processes.
By developing custom calculations or queries to extract CPU utilization metrics by user or process, you can gain a more granular understanding of how CPU resources are being consumed in your Oracle database. This can be especially useful in identifying performance bottlenecks and optimizing resource usage.
For example, you may want to track CPU usage by user to identify which users or applications are consuming what proportion of CPU resources and optimize resource allocation accordingly. Alternatively, you may want to track CPU usage by SQL ID to identify which queries are consuming what proportion of CPU resources and optimize query performance.
Overall, developing custom calculations or queries to extract CPU utilization metrics by user or process can help you gain deeper insights into how CPU resources are being consumed in your Oracle database, enabling you to optimize performance, allocate resources more efficiently, and ensure compliance with regulatory requirements.
The basis for doing this from DBA_HIST_ACTIVE_SESS_HISTORY (ASH) is based on the documented assumption that a row in ASH represents approximately 10 seconds of time. If you take into consideration the number of CPU’s and the amount of time in a snapshot interval, you can come up with a CPU utilization (%) figure for any grouping in ASH. The calculation is as follows, for your grouping, take:
sum(sec_on_cpu) / (minutes_in_snapshot_interval * seconds_per_minute * num_cpus * 100)
Here’s an example that you can modify for your requirements (e.g. group on SQL_ID):
-- CPU % Calculated from ASH
-- N.B. ASH may over or under count because it's a sampling mechanism
-- and we infer seconds on CPU rather than it being directly provided
-- grouping by username
with db_cpu_ash as
(
select instance_number
, (select username from dba_users
where dba_users.user_id = dba_hist_active_sess_history.user_id) as username
, begin_interval_time, snap_id
, count(*) * 10 as sec_on_cpu
from dba_hist_active_sess_history natural join dba_hist_snapshot
where session_state = 'ON CPU'
and trunc(begin_interval_time) = trunc(sysdate)
group by instance_number, user_id, begin_interval_time, snap_id
)
, params as
(
select value num_cpus
, 10 minutes_in_interval
, 60 seconds_per_minute
from v$osstat where stat_name = 'NUM_CPUS'
)
select to_char(begin_interval_time, 'YYYY-MON-DD HH24:MI') begin_interval_time
, instance_number, username
, round(sum(sec_on_cpu)) "DB CPU Sec"
, round(sum(sec_on_cpu) / (minutes_in_interval * seconds_per_minute * num_cpus) * 100 , 3) "CPU %"
from db_cpu_ash, params
group by to_char(begin_interval_time, 'YYYY-MON-DD HH24:MI')
, instance_number, username
, minutes_in_interval , seconds_per_minute , num_cpus
order by begin_interval_time, instance_number, 4 desc
;
Edit: 19-May-23 : missed off the “FROM” clause and misplaced parenthesis in the above query.
From this example query, you can easily adjust the group by to come up with, say CPU utilization per SQL_ID, service (via SERVICE_HASH) [actually I would do this from DBA_HIST_SERVICE_STAT for a more accurate measure, but that will have to be the subject of a future post], MODULE, MACHINE, … .
Summary
To wrap this conversation up, I would point out that calculating CPU utilization from the Active Session History (ASH) can provide valuable insights into the workload and performance of specific users, processes, or SQL statements in an Oracle database. By leveraging the available data from ASH, DBA_HIST_SYSMETRIC_SUMMARY, DBA_HIST_SERVICE_STAT, and other related views, you can determine the percentage of CPU resources utilized by various components.
While Oracle does not provide a direct metric for CPU utilization at the user/process/SQL level, you can derive it through calculations based on available data points. This approach allows you to monitor and analyze CPU usage patterns, identify resource-intensive activities, and make informed decisions for optimizing database performance and resource allocation.
By understanding how to extract and interpret CPU utilization information from different sources, you can gain a deeper understanding of the workload's impact on system resources. This knowledge enables you to effectively manage and allocate resources, identify performance bottlenecks, and optimize the overall performance of your Oracle database.
Remember that each method of calculating or extracting CPU utilization has its own considerations and limitations. It's important to choose the approach that aligns best with your specific requirements and circumstances. Regular monitoring and analysis of CPU utilization trends can help you proactively address performance issues, ensure efficient resource utilization, and provide an optimal user experience within your Oracle environment.
Up Next
I hope in a future blog post to detail out extracting and calculating CPU utilization from other sources including, DBA_HIST_SERVICE_STAT (example use case: service-level CPU utilization), DBA_HIST_SYSSTAT (various Oracle process level CPU statistics), and DBA_HIST_SYS_TIME_MODEL (various Oracle process level CPU statistics). Again, by tapping into these hidden sources that need to be calculated, you can gain a more granular understanding of how your database is using CPU resources which can help with a variety of tasks such as workload relocation and maximizing performance.