Sunday, January 28, 2007

Handling and resolving unshared cursors/large version_counts(MetaLink)

Doc ID: Note:296377.1 Type: TROUBLESHOOTING
Last Revision Date: 18-JAN-2007 Status: REVIEWED

Troubleshooting Guide to high version_counts
This document is intended to explain how SQL sharing works and give examples of diagnostics which can be used to help determine why SQL sharing may not occur
RDBMS
DBPERF
Version [All versions]

Author: James Cremonini
The information in this document has been reviewed and is current as of 01-JAN-2005.


--------------------------------------------------------------------------------
Instructions for the reader: The Troubleshooting Guide is provided to assist in debugging SQL sharing issues. When possible, diagnostic tools are included in the document to assist in troubleshooting problems. This document does not contain bugs/patches as these topics are addressed in the articles referenced at the bottom of this document.
--------------------------------------------------------------------------------
Troubleshooting Summary
Introduction: What is shared SQL ?
Diagnostics: How do I see the versions and why they are not shared ?
Documentation: What do the reasons given in v$SQL_SHARED_CURSOR mean?
Best Practises: How do I make SQL more shareable - are there any parameters which can help?
Q&A: Are there any times when a high version count is expected even though BINDS are being used?
Prob: What articles, white papers, or manuals should I read for more information on [subcomponent]?

--------------------------------------------------------------------------------
Troubleshooting Details
1. What is shared SQL ?

The first thing to remember is that all SQL is implicitly sharable. When a SQL statement is entered, the RDBMS will create a hash value for text of the statement and that hash value then helps the RDBMS to easily find SQL already in the shared pool. It is not in the scope of this article to discuss this in any great detail, so let's just assume entering a series of text results in a hash value being created

For instance :- 'select count(*) from emp' hashes to the value 4085390015

We now create a parent cursor for this sql and a single child. It does not matter that a SQL statement may never be shared - when it is first parsed a parent and a single child are created. The easy way to think of this is that the PARENT cursor is a representation of the hash value and the child cursor(s) represent the metadata for that SQL

What is 'SQL Metadata'?

Metadata is all the information which enables a statement to run. For instance, in the example I have given EMP is owned by scott and therefore has an OBJECT_ID which points to the EMP table owned by this user. When the user SCOTT logged in, optimizer parameters are initialised in that session for use by the statement, so this too is used by the optimizer and is therefore metadata. There are other examples of Metadata which will be mentioned further in this document.

Let's say this session logs out and back in again now. It then runs the same command again (as the same user). This time we already have the SQL in the shared pool (but we don't know this yet). What we do is hash the statement and then search for that hash value in the shared pool. If we find it, we can then search through the children to determine if any of them are usable by us (ie the metadata is the same). If it is, then we can share that SQL statement
I would still have one version of that SQL in the shared pool because the metadata enabled me to share the statement with the already existent child. The fundementals are that the parent is not shared, it is the children which determine shareability.

Now - another user 'TEST' has it's own version of EMP. If that user was to now run the select statement above then what would happen is :-

1. The statement is hashed - it is hashed to the value 4085390015
2. The SQL will be found in the shared pool as it already exists
3. The children are scanned (at this point we have one child)
4. Because the OBJECT_ID of the EMP table owned by TEST is different the OBJECT_ID owned by scott we have a 'mismatch'

(Essentially, what happens here is that we have a linked list of children which we traverse, comparing the metadata of the current SQL with that of all the children. If there were 100 children then we would scan each of them (looking for a possible mismatch and moving on) until we found one we could share. If we cannot share any (ie. have exhausted the list of children) then we need to create a new child)

5. We therefore have to create a new child - we now have 1 PARENT and 2 CHILDREN.



[top]2. How do I see the versions and why they are not shared ?

Lets use the example above and take a look at what SQL we can use to see this in the shared pool.

SCOTT runs select count(*) from emp

I can now run the following to see the PARENT statement and it's hash value and address

select sql_text, hash_value,address from v$sqlarea where sql_text like 'select count(*) from emp%';



SQL_TEXT HASH_VALUE ADDRESS
------------------------------------- ----------------
select count(*) from emp 4085390015 0000000386BC2E58
To see the CHILDREN (I expect to see 1 at this point) :-

9i - select * from v$sql_shared_cursor where kglhdpar = '0000000386BC2E58'

10G - select * from v$sql_shared_cursor where address = '0000000386BC2E58'


ADDRESS KGLHDPAR U S O O S L S E B P I S T A B D L T R I I R L I O S M U T N F
---------------- ---------------- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
0000000386BC2D08 0000000386BC2E58 N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N


We can see we have a single child (ADDRESS 0000000386BC2D08). The mismatch information (USOOSL etc) is all N because this is the first child. Now, if I log in as another user and run the same select (select count(*) from emp) and look again I will get the following output:-


ADDRESS KGLHDPAR U S O O S L S E B P I S T A B D L T R I I R L I O S M U T N F
---------------- ---------------- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
0000000386BC2D08 0000000386BC2E58 N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N
0000000386A91AA0 0000000386BC2E58 N N N N N N N N N N N N N Y N N N Y N N N N N N N N N N N N N


We can now see the 2nd child ( 0000000386A91AA0) and also the reasons why it could not be shared with the first (The 'Y's denote a mismatch) The reasons are (1) AUTH_CHECK_MISMATCH and (2) TRANSLATION_MISMATCH This is basically because the objects under my new user do not map to those of SCOTT (the current child). So , authentication fails because I cannot access SCOTTs objects and translation fails because we have different object_ids.
[top]
3. What do the reasons given in v$SQL_SHARED_CURSOR mean?

Below are the list of reasons given as well as some workable examples (Those denoted by ** are the ones most often seen) :-

UNBOUND_CURSOR - The existing child cursor was not fully built (in other words, it was not optimized)

SQL_TYPE_MISMATCH - The SQL type does not match the existing child cursor

**OPTIMIZER_MISMATCH - The optimizer environment does not match the existing child cursor



select count(*) from emp; -> 1 PARENT, 1 CHILD
alter session set optimizer_mode=ALL_ROWS
select count(*) from emp; -> 1 PARENT, 2 CHILDREN (The optimizer mode has changed and therefore the existing child cannot be reused)

(The same applies with events - if I turned on tracing with 10046 than I would get the OPTIMIZER_MISMATCH again and a 3rd child)
OUTLINE_MISMATCH - The outlines do not match the existing child cursor


If my user had created stored outlines previously for this command and they were stored in seperate categories (say "OUTLINES1" and "OUTLINES2") running:-

alter session set use_stored_outlines = OUTLINES1;
select count(*) from emp;
alter session set use_stored_oulines= OUTLINES2;
select count(*) from emp; --> Would create a 2nd child as the outline used is different than the first run.
STATS_ROW_MISMATCH - The existing statistics do not match the existing child cursor



Check that 10046/sql_trace is not set on all sessions as this can cause this.

LITERAL_MISMATCH - Non-data literal values do not match the existing child cursor

SEC_DEPTH_MISMATCH - Security level does not match the existing child cursor

EXPLAIN_PLAN_CURSOR - The child cursor is an explain plan cursor and should not be shared


Explain plan statements will generate a new child by default - the mismatch will be this.
BUFFERED_DML_MISMATCH - Buffered DML does not match the existing child cursor

PDML_ENV_MISMATCH - PDML environment does not match the existing child cursor

INST_DRTLD_MISMATCH - Insert direct load does not match the existing child cursor

SLAVE_QC_MISMATCH -The existing child cursor is a slave cursor and the new one was issued by the coordinator

(or, the existing child cursor was issued by the coordinator and the new one is a slave cursor).

TYPECHECK_MISMATCH - The existing child cursor is not fully optimized

AUTH_CHECK_MISMATCH - Authorization/translation check failed for the existing child cursor

**BIND_MISMATCH - The bind metadata does not match the existing child cursor


variable a varchar2(100);
select count(*) from emp where ename = :a -> 1 PARENT, 1 CHILD
variable a varchar2(400);
select count(*) from emp where ename = :a -> 1 PARENT, 2 CHILDREN (The bind 'a' has now changed in definition)


DESCRIBE_MISMATCH - The typecheck heap is not present during the describe for the child cursor

LANGUAGE_MISMATCH - The language handle does not match the existing child cursor

TRANSLATION_MISMATCH - The base objects of the existing child cursor do not match

ROW_LEVEL_SEC_MISMATCH - The row level security policies do not match

INSUFF_PRIVS - Insufficient privileges on objects referenced by the existing child cursor

INSUFF_PRIVS_REM - Insufficient privileges on remote objects referenced by the existing child cursor

REMOTE_TRANS_MISMATCH - The remote base objects of the existing child cursor do not match


USER1: select count(*) from table@remote_db
USER2: select count(*) from table@remote_db (Although the SQL is identical, the dblink pointed to by remote_db may be a private dblink which resolves
to a different object altogether)
LOGMINER_SESSION_MISMATCH

INCOMP_LTRL_MISMATCH

OVERLAP_TIME_MISMATCH - error_on_overlap_time mismatch

SQL_REDIRECT_MISMATCH - sql redirection mismatch

MV_QUERY_GEN_MISMATCH - materialized view query generation

USER_BIND_PEEK_MISMATCH - user bind peek mismatch

TYPCHK_DEP_MISMATCH - cursor has typecheck dependencies

NO_TRIGGER_MISMATCH - no trigger mismatch

FLASHBACK_CURSOR - No cursor sharing for flashback

ANYDATA_TRANSFORMATION - anydata transformation change

INCOMPLETE_CURSOR - incomplte cursor

TOP_LEVEL_RPI_CURSOR - top level/rpi cursor

DIFFERENT_LONG_LENGTH - different long length

LOGICAL_STANDBY_APPLY - logical standby apply mismatch

DIFF_CALL_DURN - different call duration

BIND_UACS_DIFF - bind uacs mismatch

PLSQL_CMP_SWITCHS_DIFF - plsql compiler switches mismatch

CURSOR_PARTS_MISMATCH - cursor-parts executed mismatch

STB_OBJECT_MISMATCH - STB object different (now exists)

ROW_SHIP_MISMATCH - row shipping capability mismatch

PQ_SLAVE_MISMATCH - PQ slave mismatch

Check you want to be using PX with this reason code, as the problem could be caused by running lots of small SQL statements which do not really need PX. If you are on < 11i you may be hitting BUG4367986

TOP_LEVEL_DDL_MISMATCH - top-level DDL cursor

MULTI_PX_MISMATCH - multi-px and slave-compiled cursor

BIND_PEEKED_PQ_MISMATCH - bind-peeked PQ cursor

MV_REWRITE_MISMATCH - MV rewrite cursor

ROLL_INVALID_MISMATCH - rolling invalidation window exceeded

I suspect this can occur when you see a mix of this reason and some other one in v$sql_shared_cursor, together with a library cache latch issue. My suspicion is this is as a result of that 'other' reason so you should address that first to relieve the latch.

OPTIMIZER_MODE_MISMATCH - optimizer mode mismatch

PX_MISMATCH - parallel query mismatch

MV_STALEOBJ_MISMATCH - mv stale object mismatch

FLASHBACK_TABLE_MISMATCH - flashback table mismatch

LITREP_COMP_MISMATCH - literal replacement compilation mismatch

[top]
4. What further tracing is available ?

Solution:
In 10G it is possible to use CURSORTRACE to aid the investigation of why cursors are not being shared. This event should only be used under the guidance of support and the resultant trace file is undocumented. To get the trace for a particular SQL statement you first of all need to get the hash_value (See the above select from v$sqlarea). You then set the trace on using:-

alter system set events
'immediate trace name cursortrace level 577, address hash_value';

This will write a trace file to user_dump_dest each time we try to reuse the cursor.

[top]
5. Are there any times when a high version count is expected even though BINDS are being used?

Solution:
Consider the following where cursor_sharing=SIMILAR

select /* TEST */ * from emp where sal > 100;
select /* TEST */ * from emp where sal > 101;
select /* TEST */ * from emp where sal > 102;
select /* TEST */ * from emp where sal > 103;
select /* TEST */ * from emp where sal > 104;

SELECT sql_text,version_count,address
FROM V$SQLAREA
WHERE sql_text like 'select /* TEST */%';

SELECT * FROM V$SQL_SHARED_CURSOR WHERE kglhdpar = '&my_addr';

You will see several versions , each with no obvious reason for not being shared

Explanation:
One of the cursor sharing criteria when literal replacement is enabled with cursor_sharing as similar is that bind value should match initial bind value
if the execution plan is going to change depending on the value of the literal. The reason for this is we _might_ get a sub optimal plan if we use the same
cursor. This would typically happen when depending on the value of the literal optimizer is going to chose a different plan. Thus in this test case we have a
predicate with > , if this was a equality we would always share the same child cursor. If application developers are ready to live with a sub-optimal plan and
save on memory , then they need to set the parameter to force.

"The difference between SIMILAR and FORCE is that SIMILAR forces similar statements to share the SQL area without deteriorating execution plans.
Setting CURSOR_SHARING to FORCE forces similar statements to share the SQL area potentially deteriorating execution plans."

No comments: