Oracle Database Tuning Statistics

This page is now a little dated, having been written for early versions of Oracle 8. However, the SQL is still useful as a quick reference. Since the release of Oracle 8i the best way to gather statistics on a regular basis is with Statspack. Being able to graph the data collected is extremely useful, and orca for oracle is a good way of doing this.
-- April 2005
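
As a sketch of the Statspack approach (assuming Statspack has been installed with spcreate.sql and the default PERFSTAT account): take a snapshot at the start and end of a representative period, then generate a report between the two snapshot ids.
connect perfstat/perfstat
-- take a snapshot at the start of the period of interest
execute statspack.snap;
-- ...wait for a representative workload period, then take a second snapshot
execute statspack.snap;
-- produce a report; spreport prompts for the begin and end snapshot ids
@?/rdbms/admin/spreport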

It is possible to measure a vast number of statistics for an Oracle database. An important element of any useful tuning exercise is to be able to make comparisons over time: collecting and comparing statistics is the only way to make tuning a proactive exercise. At the least, before and after snapshots are necessary to see what effect changes have had.  

The SQL statistics presented here are reduced to percentage figures to allow long-term comparisons without needing to allow for changes in, for example, the number of transactions occurring or the amount of memory available to the SGA. Be aware, however, that this approach can also hide a multiplicity of sins.  For example, a small percentage of shared pool reloads may still be a damagingly large number.

In addition, each query rounds its result to two decimal places. Whilst this results in tidier output, be aware that rounding can turn very small or very large values into exactly 0% or 100%.

Cache Hit Ratios

There are two main figures

Buffer Cache Hit Ratio

select round((1-(pr.value/(bg.value+cg.value)))*100,2)
from v$sysstat pr, v$sysstat bg, v$sysstat cg
where pr.name='physical reads'
and bg.name='db block gets'
and cg.name='consistent gets'
The buffer cache hit ratio is a measure of the proportion of requests for data which are satisfied by data already in the buffer cache. Higher ratios are better as access to data in memory is speedier than an IO operation to disk. There comes a point of diminishing returns when increasing the size of the database buffer. Also, remember that this is part of the SGA and it may be more important to use additional memory for other parts of the SGA. It is vital that the whole SGA fits within main memory, as paging of the SGA is disastrous for performance.

Optimum High
init.ora parameter DB_BLOCK_BUFFERS
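
To put the ratio in context, the current buffer cache configuration can be checked as follows (a simple sketch; the cache size in bytes is db_block_buffers multiplied by db_block_size):
select name, value
from v$parameter
where name in ('db_block_buffers','db_block_size')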

Dictionary Cache Hit Ratio

select sum(gets-getmisses)*100/sum(gets) 
from v$rowcache
The dictionary cache hit ratio is a measure of the proportion of requests for information from the data dictionary, the collection of database tables and views containing reference information about the database, its structures, and its users. On instance startup, the data dictionary cache contains no data, so any SQL statement issued is likely to result in cache misses. As more data is read into the cache, the likelihood of cache misses should decrease. Eventually the database should reach a "steady state" in which the most frequently used dictionary data is in the cache.

The dictionary cache resides within the Shared Pool, part of the SGA, so increasing the shared pool size should improve the dictionary cache hit ratio.

Optimum High
init.ora parameter SHARED_POOL_SIZE
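
A low ratio can be investigated by breaking the figure down per dictionary cache; the query below is a sketch along those lines, ordering the caches by the number of misses:
select parameter, gets, getmisses,
round(getmisses*100/greatest(gets,1),2) miss_pct
from v$rowcache
order by getmisses desc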

Sorts in Memory

select round((mem.value/(mem.value+dsk.value))*100,2)
from v$sysstat mem, v$sysstat dsk
where mem.name='sorts (memory)'
and dsk.name='sorts (disk)'
This is a measure of the proportion of data sorts which occur within memory rather than on disk. Sorts on disk make use of the user's temporary tablespace. The maximum size of sort which will occur in memory is defined by the sort area size, an area allocated within the PGA. Each Oracle process which sorts may allocate up to this much memory, though it may not need all of it. Use of memory for this purpose reduces that available to the SGA.

Optimum High
init.ora parameter SORT_AREA_SIZE
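
To see which sessions are responsible for sorts to disk, the per-session statistics can be examined; a sketch:
select s.sid, s.username, n.name, st.value
from v$session s, v$sesstat st, v$statname n
where st.sid=s.sid
and st.statistic#=n.statistic#
and n.name in ('sorts (memory)','sorts (disk)')
order by st.value desc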

Shared Pool

There are two important figures

Shared Pool Free

select round((sum(decode(name,'free memory',bytes,0))/sum(bytes))*100,2)
from v$sgastat
The percentage of the shared pool not currently in use. If a large proportion of the shared pool is always free, it is likely that the size of the shared pool can be reduced. Low free values are not a cause for concern unless other factors also indicate problems, e.g. a poor dictionary cache hit ratio or large proportion of reloads occurring.

Optimum Small but non-zero
init.ora parameter SHARED_POOL_SIZE
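
A breakdown of how SGA memory is being used, including the 'free memory' entry used above, can be obtained from the same view (on 8i and later a POOL column additionally identifies which entries belong to the shared pool):
select name, bytes
from v$sgastat
order by bytes desc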

Shared Pool Reloads

select round(sum(reloads)/sum(pins)*100,2)
from v$librarycache
where namespace in ('SQL AREA','TABLE/PROCEDURE','BODY','TRIGGER')
This is similar to a Library Cache Miss Ratio, but is specific to SQL and PL/SQL blocks. Shared pool reloads occur when Oracle has to implicitly reparse SQL or PL/SQL at the point when it attempts to execute it. A larger shared pool will reduce the number of times that code needs to be reloaded. Also, ensuring that similar pieces of SQL are written identically will increase sharing of code.

To take advantage of additional memory available for shared SQL areas, you may also need to increase the number of cursors permitted for a session. You can do this by increasing the value of the initialization parameter OPEN_CURSORS.

Optimum Low
init.ora parameter SHARED_POOL_SIZE
init.ora parameter OPEN_CURSORS
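
Statements which have been reloaded repeatedly can be identified from the shared SQL area; a sketch:
select sql_text, executions, loads, invalidations
from v$sqlarea
where loads > 1
order by loads desc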

Library Cache Hit Ratios

These values are closely related to shared pool reloads. Improving these figures is carried out in the same manner as discussed in that section. There are two figures:

Library Cache Get Hit Ratio

select round(sum(gethits)/sum(gets)*100,2)
from v$librarycache
The proportion of requests for a lock on an object which were satisfied by finding that object's handle already in memory.

Optimum High
init.ora parameter SHARED_POOL_SIZE
init.ora parameter OPEN_CURSORS

Library Cache Pin Hit Ratio

select round(sum(pinhits)/sum(pins)*100,2)
from v$librarycache
The proportion of attempts to pin an object which were satisfied by finding all the pieces of that object already in memory.

Optimum High
init.ora parameter SHARED_POOL_SIZE
init.ora parameter OPEN_CURSORS
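
The library cache view also holds per-namespace hit ratios directly, which can help pin down where the misses are occurring; a sketch:
select namespace, gets, gethitratio, pins, pinhitratio, reloads
from v$librarycache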

Recursive Calls/Total Calls

select round((rcv.value/(rcv.value+usr.value))*100,2)
from v$sysstat rcv, v$sysstat usr
where rcv.name='recursive calls'
and usr.name='user calls'
A high ratio of recursive calls to total calls may indicate, among other things, heavy data dictionary activity (for example because the dictionary cache is too small), dynamic space allocation as segments extend, or extensive use of triggers and stored PL/SQL, all of which cause Oracle to issue recursive SQL on the user's behalf.
Optimum Low

Short/Total Table Scans

select round((shrt.value/(shrt.value+lng.value))*100,2)
from v$sysstat shrt, v$sysstat lng
where shrt.name='table scans (short tables)'
and lng.name='table scans (long tables)'
This is the proportion of full table scans which are occurring on short tables. Short tables may be scanned by Oracle when this is quicker than using an index. Full table scans of long tables are generally bad for overall performance. Low figures for this graph may indicate a lack of indexes on large tables, or poorly written SQL which fails to use existing indexes or is returning a large percentage of the table.

Optimum High
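
Candidate statements for index or SQL tuning can often be spotted in the shared SQL area by looking for heavy disk readers; a rough sketch (the threshold is arbitrary):
select sql_text, disk_reads, executions
from v$sqlarea
where disk_reads > 10000
order by disk_reads desc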

Redo Activity

The redo log buffer is a circular buffer in the SGA that holds information about changes made to the database. This graph gives an indication of the level of contention occurring for redos. It is made up of two figures

Redo Space Wait Ratio

select round((req.value/wrt.value)*100,2)
from v$sysstat req, v$sysstat wrt
where req.name= 'redo log space requests'
and wrt.name= 'redo writes'
A redo space wait is when there is insufficient space in the redo buffer for a transaction to write redo information. It is an indication that the redo buffer is too small given the rate of transactions occurring in relation to the rate at which the log writer is writing data to the redo logs.

Optimum Very Low
init.ora parameter LOG_BUFFER

Redo Log Allocation Contention

There are two latches:

The Redo Allocation Latch

select round(greatest(
(sum(decode(ln.name,'redo allocation',misses,0))
/greatest(sum(decode(ln.name,'redo allocation',gets,0)),1)),
(sum(decode(ln.name,'redo allocation',immediate_misses,0))
/greatest(sum(decode(ln.name,'redo allocation',immediate_gets,0))
+sum(decode(ln.name,'redo allocation',immediate_misses,0)),1))
)*100,2)
from v$latch l,v$latchname ln
where l.latch#=ln.latch#
The redo allocation latch controls the allocation of space for redo entries in the redo log buffer. To allocate space in the buffer, an Oracle user process must obtain the redo allocation latch. Since there is only one redo allocation latch, only one user process can allocate space in the buffer at a time. The single redo allocation latch enforces the sequential nature of the entries in the buffer.

After allocating space for a redo entry, the user process may copy the entry into the buffer. This is called "copying on the redo allocation latch". A process may only copy on the redo allocation latch if the redo entry is smaller than a threshold size.

The maximum size of a redo entry that can be copied on the redo allocation latch is specified by the initialization parameter LOG_SMALL_ENTRY_MAX_SIZE.

Redo Copy Latches

select round(greatest(
(sum(decode(ln.name,'redo copy',misses,0))
/greatest(sum(decode(ln.name,'redo copy',gets,0)),1)),
(sum(decode(ln.name,'redo copy',immediate_misses,0))
/greatest(sum(decode(ln.name,'redo copy',immediate_gets,0))
+sum(decode(ln.name,'redo copy',immediate_misses,0)),1)) )*100,2)
from v$latch l,v$latchname ln
where l.latch#=ln.latch#
The user process first obtains the copy latch. Then it obtains the allocation latch, performs allocation, and releases the allocation latch. Next the process performs the copy under the copy latch, and releases the copy latch. The allocation latch is thus held for only a very short period of time, as the user process does not try to obtain the copy latch while holding the allocation latch.

If the redo entry is too large to copy on the redo allocation latch, the user process must obtain a redo copy latch before copying the entry into the buffer. While holding a redo copy latch, the user process copies the redo entry into its allocated space in the buffer and then releases the redo copy latch.

With multiple CPUs the redo log buffer can have multiple redo copy latches. These allow multiple processes to copy entries to the redo log buffer concurrently. The number of redo copy latches is determined by the parameter LOG_SIMULTANEOUS_COPIES.

Optimum Very Low
init.ora parameter LOG_SMALL_ENTRY_MAX_SIZE
init.ora parameter LOG_SIMULTANEOUS_COPIES
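
The raw figures behind both redo latch ratios can be seen side by side; a sketch:
select ln.name, l.gets, l.misses, l.immediate_gets, l.immediate_misses
from v$latch l, v$latchname ln
where l.latch#=ln.latch#
and ln.name in ('redo allocation','redo copy')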

Table Contention

There are two figures which give indications of how well table storage is working. Figures are averaged across all tables in use. This means one table may be seriously at fault or many tables may have low level problems.

Chained Fetch Ratio

select round((cont.value/(scn.value+rid.value))*100,2)
from v$sysstat cont, v$sysstat scn, v$sysstat rid
where cont.name= 'table fetch continued row'
and scn.name= 'table scan rows gotten'
and rid.name= 'table fetch by rowid'
This is the proportion of all rows fetched which resulted in a chained row continuation. Such a continuation means that the data for the row is spread across more than one block, which can occur in either of two ways:
Row Migration
This occurs when an update to a row cannot fit within the current block. In this case, the data for the row is migrated to a new block leaving a pointer to the new location in the original block.
Row Chaining
This occurs when a row cannot fit into a single data block, e.g. due to having large or many fields. In this case, the row is spread over two or more blocks.
Optimum Very Low
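
Tables suffering from chained or migrated rows can be identified once they have been analyzed, since ANALYZE populates the CHAIN_CNT column; a sketch (SCOTT.EMP is just an example table name):
analyze table scott.emp compute statistics;

select table_name, num_rows, chain_cnt
from dba_tables
where chain_cnt > 0
order by chain_cnt desc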

Free List Contention

select round((fl.count/(bg.value+cg.value))*100,2)
from v$waitstat fl, v$sysstat bg, v$sysstat cg
where fl.class='free list'
and bg.name='db block gets'
and cg.name='consistent gets'
Free list contention occurs when more than one process is attempting to insert data into a given table. The table header structure maintains one or more lists of blocks which have free space for insertion. If more processes are attempting to insert than there are free lists, some will have to wait for access to a free list.

Optimum Very Low
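
The raw wait figures behind this ratio come from V$WAITSTAT; a sketch:
select w.class, w.count, w.time
from v$waitstat w
where w.class='free list'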

CPU Parse Overhead

select round((prs.value/(prs.value+exe.value))*100,2)
from v$sysstat prs, v$sysstat exe
where prs.name like 'parse count (hard)'
and exe.name= 'execute count'
The CPU parse overhead compares the number of hard parses with the total number of statement executions, giving an indication of the proportion of database CPU time being spent parsing SQL and PL/SQL code. High values of this figure indicate that either a large amount of once-only code is being used by the database or that the shared SQL area is too small.

Optimum Low
init.ora parameter SHARED_POOL_SIZE
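
Statements being parsed repeatedly can be picked out of the shared SQL area; a sketch, comparing parse calls with executions:
select sql_text, parse_calls, executions
from v$sqlarea
order by parse_calls desc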

Latches

Latches are simple, low-level serialization mechanisms to protect shared data structures in the SGA. When attempting to get a latch a process may or may not be willing to wait for it, hence this graph includes two figures. See also redo log allocation latches.

Willing to Wait Latch Gets

select round(((sum(gets) - sum(misses)) / sum(gets))*100,2)
from v$latch
A process making a willing-to-wait request for a latch will sleep and retry until it obtains the latch.

Optimum High

Immediate Latch Gets

select round(((sum(immediate_gets) - sum(immediate_misses)) / sum(immediate_gets))*100,2)
from v$latch
An immediate latch get is a request for which the process is not prepared to wait; if the latch cannot be obtained at once the attempt fails and is recorded as an immediate miss.

Optimum High
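
When either ratio is poor, the individual latches responsible can be listed; a sketch, ordering by sleeps (retries by willing-to-wait requests):
select name, gets, misses, sleeps, immediate_gets, immediate_misses
from v$latch
order by sleeps desc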

Rollback Segment Contention

select round(sum(waits)/sum(gets)*100,2)
from v$rollstat
This figure is an indication of whether a process had to wait to get access to a rollback segment. To improve figures, increase the number of rollback segments available.

Optimum Low
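
The waits can also be broken down by individual rollback segment; a sketch:
select n.name, s.gets, s.waits
from v$rollname n, v$rollstat s
where n.usn=s.usn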

Further Information

There are many sources of further information on all the subjects included here.

Online

A valuable source is the Oracle Technology Network, particularly the documentation on Concepts and Tuning.

Offline

A large number of books have been written on the subject. Two publishers worth checking out are O'Reilly and Oracle Press.