PLSQL NOTES: TOPIC

Showing posts with label TOPIC. Show all posts

29 December, 2015

Oracle | Forward Declaration

Forward Declaration of Elements

There can be cases where the elements have to called in program code before they are coded in the program. But Oracle is very fuzzy about the declaration of program units. It will throw you an error because if the element is not defined, Oracle treats it as wrong semantic.

To resolve cases, where mutual recursion is present, meaning a program element calls another program unit which in turn calls the first program unit, this issue can be resolved using forward declarations.

In forward declaration, the actual element is not coded, but rather a declaration of it is made before calling it in another program unit. This ensures that Oracle knows the code for the declared program unit is somewhere later present in the code.

There are few guidelines for forward declaration as well:

1. A variable or a cursor cannot be forward declared.

2. Only modules like a procedure or a function can be forward declared.

3. The definition of the forward declared element should be present in the declaration part of the same PL/SQL block where the element is declared.

08 July, 2015

Oracle Collections : Introduction

A collection is an ordered group of elements which have the same data type and we use subscripts to actually access those elements.

There are three types of collections provided by Oracle:

Index-by Tables / Associative Arrays
Nested Table
Variable size array / Varray

Index by Table / Associative Array
Each stored data unit is in the form of KEY-VALUE pair. The keys are unique and it can be either a string or an integer.

TYPE type_name IS TABLE OF element_type [NOT NULL] INDEX BY sucscript_type;
table_name type_name;

Below is an example showing how to use associatibe array:

TYPE salary IS TABLE OF NUMBER INDEX BY varchar2(20);
salary_list salary;

--Adding elements to this ass. array
salary_list('ram') = 15000;
salary_list('lakshman') = 7050;
salary_list('bharat') = 35000;
salary_list('shatrugan') = 5000;

--Now using the collection elements
name := salary_list.FIRST;
WHILE name IS NOT NULL LOOP
DBMS_OUTPUT.PUT_LINE ('Salary of '||name||' is '||TO_CHAR(salary_list(name));

name := salary_list.NEXT(name);

END LOOP;
END;
/

Elements of a index-by table can be %ROWTYPE data of a database table; or it could be %TYPE of a single column of the table.

NESTED TABLES

Its like a one-dimensional array where the elements are stored with their subscript.
But its not conventional type of array as the no. of elements in the nested table is unbound compared to normal array. And its also not as dense as arrays are because the elements keep on changing making it sparse.

The syntax of nested tables is similar to that of the index-by collection except the fact that nested tables do not use INDEX BY clause.

Nested tables can be stored in the database tables where as index-by doesn't.

06 July, 2015

Analytical Functions - Introduction

1. What we can do with analytical functions can be done through simple joins and subqueries, but aggregate functions are always faster or atleast equal to native SQL queries.

2. The “group by” works similar to the aggregate function, but the “non-group by” columns are not allowed in the query. In aggregate func, this restriction is omitted and we can see the non-group by data also and the grouped value is repeated for all the results.

select student_class, count(*) cnt from students

where class in ('6A','6B')

group by class

select student_class, house, count(*) over (partition by student_class) cnt from students

where class in ('CF','CHILD')

3. The Analytical functions are computed after all the where, join, group by and having clauses are complete and before the order by clause. So, the analytical functions can appear only in the select clause and main order by clause of the query.

4. If the condition inside the OVER() clause is left empty, the analytical functions works on the entire resultset. We can use any non-analytical function in the OVER() clause.

select status, ext_batch_id, count(*) over () cnt from gefi_batch_header

where status in ('CF','CHILD')

5. The functions SUM, COUNT, AVG, MIN, MAX are the common analytic functions the result of which does not depend on the order of the records.

Functions like LEAD, LAG, RANK, DENSE_RANK, ROW_NUMBER, FIRST, FIRST VALUE, LAST, LAST VALUE depends on order of records. In the next example we will see how to specify that. We can order the records using the ORDER BY clause inside the OVER() clause.

Now , we will explore each of the above mentioned analytical functions in detail:

ROW_NUMBER, RANK and DENSE_RANK

The order by clause of the row_number() function, does not need to be in the select part of the query. This is not possible in the normal order by clause used in the query.

select status, ext_batch_id, row_number() over (partition by status order by batch_id) cnt from gefi_batch_header

where status in ('CF','CHILD')

You can even omit the partitioning clause and still use the order by in the OVER() clause.

select status, ext_batch_id, row_number() over (order by batch_id) cnt from gefi_batch_header

where status in ('CF','CHILD')

RANK and DENSE_RANK, both provide a rank to the resultset rows, but there is a slight difference in the 2 approaches:

SELECT empno, deptno, sal,

RANK() OVER (PARTITION BY deptno

ORDER BY sal DESC NULLS LAST) RANK,

DENSE_RANK() OVER (PARTITION BY

deptno ORDER BY sal DESC NULLS

LAST) DENSE_RANK

FROM emp

WHERE deptno IN (10, 20)

ORDER BY 2, RANK;

EMPNO  DEPTNO   SAL  RANK DENSE_RANK

------ ------- ----- ----- ----------

  7839      10  5000     1          1

  7782      10  2450     2          2

  7934      10  1300     3          3

  7788      20  3000     1          1

  7902      20  3000     1          1

  7566      20  2975     3          2

  7876      20  1100     4          3

  7369      20   800     5          4

8 rows selected.

As you can see from the query and its result, the RANK skips the rank number, if it finds 2 results with the same rank level. The next in order is assigned the next possible rank. But in case if DENSE_RANK, the values are not skipped and hence it is dense[may be].

05 July, 2015

Oracle Storage Structures : Pointers

There are different storage structures available in Oracle. The choice depends upon the purpose and usage of the table.

There are the following types :

Heap Tables (used by default)
Clusters (master-detail type tables)
Index Organized Tables (data stored acc to its key value)

Now adding few basic detials about each of these types.

HEAP STRUCTURED TABLE

Got its name from the way data is added/removed from it. When inserting, the first free storage block is used by oracle.
And when any data is deleted from heap table, it can be used by new records.
Default structure for oracle.
Each row has its unique ROWID associated with it and if the record position changes, this ROWID is also changed.

CLUSTER

Clusters are further divided into 2 types - hash cluster and index cluster.
Index cluster uses index to maintain the records in table.
Hash cluster uses a hash algorithm to locate a row in the table and are better suited for searches where equality operator is used.

These are few points of difference between hash and index based clusters.

INDEX ORGANIZED TABLES

Normally the index and the rowdata are stored separately; it increases the storage space . But IOTs store the rowdata in the index itself; which makes it fast and low on memory.
Uses a B-Tree index structure to store index.
The PK is a must for using IOTs. Data is stored in the PK itself.

Honorable mention for a storage structure used by oracle internally.

To minimize the loss of data, oracle uses redo log files. The size of the files can be configured.
The lesser the size, the smaller the loss of data be.
And to create even more safeguard for failure, we can use redo log groups.
redo log files can be multiplexed as well and in this, multiple copies of the updates are saved.These multiple copies create a group.

The process of archiving the redo log files is called archiving.

But server has to be run in the archivelog mode.
Log Writer process(LGWR) is not allowed to reuse or overwrite redo log files until its archived.

Oracle can start multiple archived processes and this can be initiated multiple times. At a single time maximum of 10 archival processes can be started.

SELECT * FROM v$log;

If you want to see the information on redo log files.

02 July, 2015

RDBMS Performance Tuning Material | Part one

1. There are two levels of perf. tuning ; one is at the application level and other is at the database level.

2. Application Tuning - deals with the tuning of applications like forms, reports; an application is a program which interacts with the database and hense

application tuning involves controlling the frequency and amount of data the app requests the database to send and receive.

3. Pointers for application tuning

3.1 Generate EXPLAIN PLAN for all the queries used in the application and ensure tuning is done.

3.2 Generate EXPLAIN PLAN for the db views as well. Views actually are complex compared to tables as actually a DB request to a view is

translated into a table call internally. And a view might have its own joins so can itself hamper performance if not created correctly. Also take due note if you

have joins of views in the queries.

3.3 As the size of data increases, there may be a need to create indexes on table columns to ensure performance. But it also involves ensuring

that the queries use the indexes created and there are not too many indexes created.

3.4 Always prefer to use PK or indexed columns.

3.5 Use specific scenarios to ensure good throughput. Ex- using ROWID is better compared to using the LIKE operator.

3.6 Reuse the queries as much as possible as it facilitates the use of shated SQL which helps improve performance.

3.7 Always remember -"Tuning does not solve the problems of poor design"

4. Pointers for database tuning

4.1 At DB level, there can three areas where tuning can be done -

4.1.1 Memory Tuning - cache and buffers

4.1.2 I/O Tuning - data access efficient

4.1.3 Contention Tuning - resolve resource availability issues

4.2 Four basic steps involved in the process of DB tuning which are true for all the above mentioned three areas -

4.2.1 Gather information

4.2.2 Determine optimal changes

4.2.3 Implement changes

4.2.4 Monitor database

5. Performance Tools

6. Begin Statistics Utility (UTLBSTAT) and End Statistics Utility(UTLESTAT)

These scripts help in taking snapshot of oracle instance at a specific interval of time. They use oracle's dynamic performance views(V$) to gather this info.

First, utlBstat is run and after that oracle instance starts gathering all the performance statistics. It keeps on doing that until instance is stopped or utlEstat

is run. After you run utlEstat, the DB creates a REPORTS.TXT file which contains all the statistical info gathered.

7. EXPLAIN PLAN

This utility helps DBA to pass a query to optimizer and see how it performs within it.

To run explain plan, you must have a table created in your schema where you want to gather the statistics.

>$ORACLE_HOME/rdbms/admin/utlxplain.sql;

The query used to get the explain plan for any sql statement is:

EXPLAIN PLAN

SET STATEMENT_ID = 'QUERY1'

INTO CORE_TABLE FOR

SELECT transid, policyno, canx_code

FROM policy

WHERE policyno = 100;

Explain plan works for only DML statements.

8. SQL*Trace and TKPROF

Unlike explain plan which only generates the query execution path of the optimizer; sqlTrace also generates info like the cpu usage, no of rows fetched

etc.

Parameters which need to be set in INIT.ORA file before you can run sqlTrace are: MAX_DUMP_FILE_SIZE - max size of the trace file, SQL_TRACE - makes

a trace result to be written to file for every database user; not preferred though, USER_DUMP_DEST - destination where the dump files are written.

ALTER SESSION SET SQL_TRACE = TRUE; The syntax of the query used to start sqlTrace in sql session.

For generating the trace file for an application, we can use the PLSQL code; the syntax is:

BEGIN

DBMS_SESSION.SET_SQL_TRACE(TRUE);

-- the code goes here whose statistics are to be gathered

DBMS_SESSION.SET_SQL_TRACE(FALSE);

END

Once the file is generated , it needs to be converted into a readable format. TKPROF utility is used for that. The syntax -

% tkprof trace_file.trc new_file.log

TKPROF utility converts the unreadable trc file into a log file. There are certain optional parameters present in the syntax.

EXPLAIN - username password

INSERT - where to dump both the sql statements and data for each statement

PRINT - no of queries to be picked in the trace file

RECORD - specify the output file

SORT - order the results of tkprof

9. With both explain plan and trace files, few pointers help to estimate the performance parameters:

9.1 There should be a few logical I/O blocks versus a large no of rows returned. Optimal ratio is 2:1

9.2 mostly, execute value should be higher than the parse value. but if no of parses is higher compared to the no of executions, try incresing the

size of the shared pool.

9.3 No of logical I/O should be higher compared to the physical I/O.

10. Dynamic performance tables - are actually views despite the name.

are created when the db instance starts.

these are SGA held memory structures.

useful for perf. tuning and backup and recovery.

01 July, 2015

SQL*LOADER : INTRODUCTION

There are two ways to load data; conventional and direct path.
Direct path removes much of dB overhead and writes data directly to dB files.

Posted on the go..

28 June, 2015

ORACLE EXPLAIN PLAN

What's an explain plan?

An explain plan is a representation of the access path that is taken when a query is executed within Oracle.

Query processing can be divided into 7 phases:

[1] Syntactic	Checks the syntax of the query
[2] Semantic	Checks that all objects exist and are accessible
[3] View Merging	Rewrites query as join on base tables as opposed to using views
[4] Statement Transformation	Rewrites query transforming some complex constructs into simpler ones where appropriate (e.g. subquery merging, in/or transformation)
[5] Optimization	Determines the optimal access path for the query to take. With the Rule Based Optimizer (RBO) it uses a set of heuristics to determine access path. With the Cost Based Optimizer (CBO) we use statistics to analyze the relative costs of accessing objects.
[6] QEP Generation	QEP = Query Evaluation Plan
[7] QEP Execution	QEP = Query Evaluation Plan

Steps [1]-[6] are handled by the parser. Step [7] is the execution of the statement.

The explain plan is produced by the parser. Once the access path has been decided upon it is stored in the library cache together with the statement itself. We store queries in the library cache based upon a hashed representation of that query. When looking for a statement in the library cache, we first apply a hashing algorithm to the statement and then we look for this hash value in the library cache. This access path will be used until the query is reparsed.

Terminology

Row Source	A set of rows used in a query may be a select from a base object or the result set returned by joining 2 earlier row sources
Predicate	where clause of a query
Tuples	rows
Driving Table	This is the row source that we use to seed the query. If this returns a lot of rows then this can have a negative affect on all subsequent operations
Probed Table	This is the object we lookup data in after we have retrieved relevant key data from the driving table.

How does Oracle access data?

At the physical level Oracle reads blocks of data. The smallest amount of data read is a single Oracle block, the largest is constrained by operating system limits (and multiblock i/o). Logically Oracle finds the data to read by using the following methods:

Full Table Scan (FTS)
Index Lookup (unique & non-unique)
Rowid

Explain plan Hierarchy

Query Plan
-----------------------------------------
SELECT STATEMENT     [CHOOSE] Cost=1234
  TABLE ACCESS FULL LARGE [:Q65001] [ANALYZED]

The rightmost uppermost operation of an explain plan is the first thing that the explain plan will execute. In this case TABLE ACCESS FULL LARGE is the first operation. This statement means we are doing a full table scan of table LARGE. When this operation completes then the resultant row source is passed up to the
next level of the query for processing. In this case it is the SELECT STATEMENT which is the top of the query.

[CHOOSE] is an indication of the optimizer_goal for the query. This DOES NOT necessarily indicate that plan has actually used this goal. The only way to confirm this is to check the
cost= part of the explain plan as well. For example the following query indicates that the CBO has been used because there is a cost in the cost field:

SELECT STATEMENT     [CHOOSE] Cost=1234

However the explain plan below indicates the use of the RBO because the cost field is blank:

SELECT STATEMENT     [CHOOSE] Cost=

The cost field is a comparative cost that is used internally to determine the best cost for particular plans. The costs of different statements are not really directly comparable.

[:Q65001] indicates that this particular part of the query is being executed in parallel. This number indicates that the operation will be processed by a parallel query slave as opposed to being executed serially.

[ANALYZED] indicates that the object in question has been analyzed and there are currently statistics available for the CBO to use. There is no indication of the 'level' of analysis done.

Access Methods in detail

In a FTS operation, the whole table is read up to the high water mark (HWM). The HWM marks the last block in the table that has ever had data written to it. If you have deleted all the rows then you will still read up to the HWM. Truncate resets the HWM back to the start of the table. FTS uses multiblock i/o to read the blocks from disk. Multiblock i/o is controlled by the parameter <PARAM:db_block_multi_block_read_count>.

This defaults to:

db_block_buffers / ( (PROCESSES+3) / 4 )

Maximum values are OS dependant

Buffers from FTS operations are placed on the Least Recently Used (LRU) end of the buffer cache so will be quickly aged out. FTS is not recommended for large tables unless you are reading >5-10% of it (or so) or you intend to run in parallel.

Example FTS explain plan:

SQL> explain plan for select * from dual;

Query Plan
-----------------------------------------
SELECT STATEMENT     [CHOOSE] Cost=
  TABLE ACCESS FULL DUAL

Data is accessed by looking up key values in an index and returning rowids. A rowid uniquely identifies an individual row in a particular data block. This block is read via single block i/o.

In this example an index is used to find the relevant row(s) and then the table is accessed to lookup the ename column (which is not included in the index):

SQL> explain plan for
select empno,ename from emp where empno=10;
Query Plan

------------------------------------
SELECT STATEMENT [CHOOSE] Cost=1
TABLE ACCESS BY ROWID EMP [ANALYZED]
    INDEX UNIQUE SCAN EMP_I1

Notice the 'TABLE ACCESS BY ROWID' section. This indicates that the table data is not being accessed via a FTS operation but rather by a rowid lookup. In this case the rowid has been produced by looking up values in the index first. The index is being accessed by an 'INDEX UNIQUE SCAN' operation. This is explained below. The index name in this case is EMP_I1. If all the required data resides in the index then a table lookup may be unnecessary and all you will see is an index access with no table access.

In the following example all the columns (empno) are in the index. Notice that no table access takes place:

SQL> explain plan for
select empno from emp where empno=10;

Query Plan
------------------------------------
SELECT STATEMENT [CHOOSE] Cost=1
  INDEX UNIQUE SCAN EMP_I1

Indexes are presorted so sorting may be unecessary if the sort order required is the same as the index.

SQL> explain plan for select empno,ename from emp
where empno > 7876 order by empno;

Query Plan
-------------------------------------------------------------
SELECT STATEMENT   [CHOOSE] Cost=1
TABLE ACCESS BY ROWID EMP [ANALYZED]
  INDEX RANGE SCAN EMP_I1 [ANALYZED]

In this case the index is sorted so ther rows will be returned in the order of the index hence a sort is unecessary.

SQL> explain plan for
select /*+ Full(emp) */ empno,ename from emp
where empno> 7876 order by empno;

Query Plan
-------------------------------------------------------------
SELECT STATEMENT   [CHOOSE] Cost=9
  SORT ORDER BY
    TABLE ACCESS FULL EMP [ANALYZED]  Cost=1 Card=2 Bytes=66

Because we have forced a FTS the data is unsorted and so we must sort the data
after it has been retrieved.

There are 4 methods of index lookup:

index unique scan
index range scan
index full scan
index fast full scan

Method for looking up a single key value via a unique index. Always returns a single value You must supply AT LEAST the leading column of the index to access data via the index, However this may return > 1 row as the uniqueness will not be guaranteed.

SQL> explain plan for
select empno,ename from emp where empno=10;

Query Plan
------------------------------------
SELECT STATEMENT [CHOOSE] Cost=1
TABLE ACCESS BY ROWID EMP [ANALYZED]
    INDEX UNIQUE SCAN EMP_I1

Method for accessing multiple column values You must supply AT LEAST the leading column of the index to access data via the index Can be used for range operations (e.g. > < <> >= <= between)

SQL> explain plan for select empno,ename from emp
where empno > 7876 order by empno;

Query Plan
-------------------------------------------------------
SELECT STATEMENT   [CHOOSE] Cost=1
TABLE ACCESS BY ROWID EMP [ANALYZED]
  INDEX RANGE SCAN EMP_I1 [ANALYZED]

A non-unique index may return multiple values for the predicate col1 = 5 and will use an index range scan

SQL> explain plan for select mgr from emp where mgr = 5

Query plan
--------------------
SELECT STATEMENT [CHOOSE] Cost=1
  INDEX RANGE SCAN EMP_I2 [ANALYZED]

In certain circumstances it is possible for the whole index to be scanned as opposed to a range scan (i.e. where no constraining predicates are provided for a table). Full index scans are only available in the CBO as otherwise we are unable to determine whether a full scan would be a good idea or not. We choose an index Full Scan when we have statistics that indicate that it is going to be more efficient than a Full table scan and a sort.

For example we may do a Full index scan when we do an unbounded scan of an index and want the data to be ordered in the index order. The optimizer may decide that selecting all the information from the index and not sorting is more efficient than doing a FTS or a Fast Full Index Scan and then sorting.

An Index full scan will perform single block i/o's and so it may prove to be inefficient. Index BE_IX is a concatenated index on big_emp (empno,ename)

SQL> explain plan for select empno,ename
     from big_emp order by empno,ename;

Query Plan
------------------------------------------------------------
SELECT STATEMENT   [CHOOSE] Cost=26
  INDEX FULL SCAN BE_IX [ANALYZED]

Scans all the block in the index Rows are not returned in sorted order Introduced in 7.3 and requires V733_PLANS_ENABLED=TRUE and CBO may be hinted using INDEX_FFS hint uses multiblock i/o can be executed in parallel can be used to access second column of concatenated indexes. This is because we are selecting all of the index.

Note that INDEX FAST FULL SCAN is the mechinism behind fast index create and recreate. Index BE_IX is a concatenated index on big_emp (empno,ename)

SQL> explain plan for select empno,ename from big_emp;

Query Plan
------------------------------------------
SELECT STATEMENT   [CHOOSE] Cost=1
  INDEX FAST FULL SCAN BE_IX [ANALYZED]

Selecting the 2nd column of concatenated index:

SQL> explain plan for select ename from big_emp;

Query Plan
------------------------------------------
SELECT STATEMENT   [CHOOSE] Cost=1
  INDEX FAST FULL SCAN BE_IX [ANALYZED]

This is the quickest access method available Oracle simply retrieves the block specified and extracts the rows it is interested in. Most frequently seen in explain plans as Table access by Rowid

SQL> explain plan for select * from dept where rowid = ':x';

Query Plan
------------------------------------
SELECT STATEMENT [CHOOSE] Cost=1
TABLE ACCESS BY ROWID DEPT [ANALYZED]

Table is accessed by rowid following index lookup:

SQL> explain plan for
select empno,ename from emp where empno=10;

Query Plan
------------------------------------
SELECT STATEMENT [CHOOSE] Cost=1
TABLE ACCESS BY ROWID EMP [ANALYZED]
    INDEX UNIQUE SCAN EMP_I1

A Join is a predicate that attempts to combine 2 row sources We only ever join 2 row sources together Join steps are always performed serially even though underlying row sources may have been accessed in parallel. Join order - order in which joins are performed

The join order makes a significant difference to the way in which the query is executed. By accessing particular row sources first, certain predicates may be satisfied that are not satisfied by with other join orders. This may prevent certain access paths from being taken.

Suppose there is a concatenated index on A(a.col1,a.col2). Note that a.col1 is the leading column. Consider the following query:

select A.col4
from   A,B,C
where  B.col3 = 10
and    A.col1 = B.col1
and    A.col2 = C.col2
and    C.col3 = 5

We could represent the joins present in the query using the following schematic:

  B     <---> A <--->    C
col3=10                col3=5

There are really only 2 ways we can drive the query: via B.col3 or C.col3. We would have to do a Full scan of A to be able to drive off it. This is unlikely to be efficient with large tables;

If we drive off table B, using predicate B.col3=10 (as a filter or lookup key) then we will retrieve the value for B.col1 and join to A.col1. Because we have now filled the leading column of the concatenated index on table A we can use this index to give us values for A.col2 and join to A.

However if we drive of table c, then we only get a value for a.col2 and since this is a trailing column of a concatenated index and the leading column has not been supplied at this point, we cannot use the index on a to lookup the data.

So it is likely that the best join order will be B A C. The CBO will obviously use costs to establish whether the individual access paths are a good idea or not.

If the CBO does not choose this join order then we can hint it by changing the from
clause to read:

from B,A,C

and using the /*+ ordered */ hint. The resultant query would be:

select /*+ ordered */ A.col4
from   B,A,C
where  B.col3 = 10
and    A.col1 = B.col1
and    A.col2 = C.col2
and    C.col3 = 5

Join Types

Sort Merge Join (SMJ)
Nested Loops (NL)
Hash Join

Rows are produced by Row Source 1 and are then sorted Rows from Row Source 2 are then produced and sorted by the same sort key as Row Source 1. Row Source 1 and 2 are NOT accessed concurrently Sorted rows from both sides are then merged together (joined)

                   MERGE
                 /      \
            SORT        SORT
             |             |
        Row Source 1  Row Source 2

If the row sources are already (known to be) sorted then the sort operation is unecessary as long as both 'sides' are sorted using the same key. Presorted row sources include indexed columns and row sources that have already been sorted in earlier steps. Although the merge of the 2 row sources is handled serially, the row sources could be accessed in parallel.

SQL> explain plan for
select /*+ ordered */ e.deptno,d.deptno
from emp e,dept d
where e.deptno = d.deptno
order by e.deptno,d.deptno;

Query Plan
-------------------------------------
SELECT STATEMENT [CHOOSE] Cost=17
  MERGE JOIN
    SORT JOIN
      TABLE ACCESS FULL EMP [ANALYZED]
    SORT JOIN
      TABLE ACCESS FULL DEPT [ANALYZED]

Sorting is an expensive operation, especially with large tables. Because of this, SMJ is often not a particularly efficient join method.

First we return all the rows from row source 1 Then we probe row source 2 once for each row returned from row source 1

Row source 1
~~~~~~~~~~~~
Row 1 --------------       -- Probe ->       Row source 2
Row 2 --------------       -- Probe ->       Row source 2
Row 3 --------------       -- Probe ->       Row source 2

Row source 1 is known as the outer table
Row source 2 is known as the inner table

Accessing row source 2 is known a probing the inner table For nested loops to be efficient it is important that the first row source returns as few rows as possible as this directly controls the number of probes of the second row source. Also it helps if the access method for row source 2 is efficient as this operation is being repeated once for every row returned by row source 1.

SQL> explain plan for
select a.dname,b.sql
from dept a,emp b
where a.deptno = b.deptno;

Query Plan
-------------------------
SELECT STATEMENT [CHOOSE] Cost=5
  NESTED LOOPS
    TABLE ACCESS FULL DEPT [ANALYZED]
    TABLE ACCESS FULL EMP [ANALYZED]

New join type introduced in 7.3 More efficient in theory than NL & SMJ Only accessible via the CBO Smallest row source is chosen and used to build a hash table and a bitmap The second row source is hashed and checked against the hash table looking for joins. The bitmap is used as a quick lookup to check if rows are in the hash table and are especially useful when the hash table is too large to fit in memory.

SQL> explain plan for
select /*+ use_hash(emp) */ empno
from emp,dept
where emp.deptno = dept.deptno;

Query Plan
----------------------------
SELECT STATEMENT  [CHOOSE] Cost=3
  HASH JOIN
    TABLE ACCESS FULL DEPT
    TABLE ACCESS FULL EMP

Hash joins are enabled by the parameter HASH_JOIN_ENABLED=TRUE in the init.ora or session. TRUE is the default in 7.3

A Cartesian Product is done where they are no join conditions between 2 row sources and there is no alternative method of accessing the data Not really a join as such as there is no join! Typically this is caused by a coding mistake where a join has been left out. It can be useful in some circumstances - Star joins uses cartesian products.

Notice that there is no join between the 2 tables:

SQL> explain plan for
select emp.deptno,dept,deptno
from emp,dept

Query Plan
------------------------------
SLECT STATEMENT [CHOOSE] Cost=5
  MERGE JOIN CARTESIAN
    TABLE ACCESS FULL DEPT
    SORT JOIN
      TABLE ACCESS FULL EMP

The CARTESIAN keyword indicate that we are doing a cartesian product.

Operations

Operations that show up in explain plans

sort
filter
view

There are a number of different operations that promote sorts

order by clauses
group by
sort merge join

Note that if the row source is already appropriately sorted then no sorting is required. This is now indicated in 7.3:

SORT GROUP BY NOSORT
     INDEX FULL SCAN .....

In this case the group by operation simply groups the rows it does not do the sort operation as this has already been completed.

Sorts are expensive operations especially on large tables where the rows do not fit in memory and spill to disk. By default sort blocks are placed into the buffer cache. This may result in aging out of other blocks that may be reread by other processes. To avoid this you can use the parameter <Parameter:SORT_DIRECT_WRITES> which does not place sort blocks into the buffer cache.

Has a number of different meanings used to indicate partition elimination may also indicate an actual filter step where one row source is filtering another functions such as min may introduce filter steps into query plans

In this example there are 2 filter steps. The first is effectively like a NL except that it stops when it gets something that it doesn't like (i.e. a bounded NL). This is there because of the not in. The second is filtering out the min value:

SQL> explain plan for select * from emp
     where empno not in (select min(empno)
     from big_emp group by empno);

Query Plan
------------------
SELECT STATEMENT [CHOOSE]  Cost=1
  FILTER     **** This is like a bounded nested loops
    TABLE ACCESS FULL EMP [ANALYZED]
     FILTER   **** This filter is introduced by the min
        SORT GROUP BY NOSORT
          INDEX FULL SCAN BE_IX

This example is also interesting in that it has a NOSORT function. The group by does not need to sort because the index row source is already pre sorted.

When a view cannot be merged into the main query you will often see a projection view operation. This indicates that the 'view' will be selected from directly as opposed to being broken down into joins on the base tables. A number of constructs make a view non mergeable. Inline views are also non mergeable.

In the following example the select contains an inline view which cannot be merged:

SQL> explain plan for
select ename,tot
from emp,
    (select empno,sum(empno) tot from big_emp group by empno) tmp
where emp.empno = tmp.empno;

Query Plan
------------------------
SELECT STATEMENT [CHOOSE]
  HASH JOIN
    TABLE ACCESS FULL EMP [ANALYZED]
    VIEW
      SORT GROUP BY
        INDEX FULL SCAN BE_IX

In this case the inline view tmp which contains an aggregate function cannot be merged into the main query. The explain plan shows this as a view step.

Allows a large table to be broken up into a number of smaller partitions which can be queried much more quickly than the table as a whole a union all view is built over the top to provide the original functionality Check constraints or where clauses provide partition elimination capabilities

SQL> explain plan for
select /*+ use_nl(p1,kbwyv1) ordered */  sum(prc_pd)
from parent1 p1,  kbwyv1
where p1.class = 22
and   kbwyv1.bitm_numb = p1.bitm_numb
and   kbwyv1.year = 1997
and   kbwyv1.week between 32 and 33 ;

Query Plan
-----------------------------------------
SELECT STATEMENT   [FIRST_ROWS] Cost=1780
  SORT AGGREGATE
    NESTED LOOPS   [:Q65001] Ct=1780 Cd=40 Bt=3120
      TABLE ACCESS FULL PARENT1 [:Q65000] [AN] Ct=20 Cd=40 Bt=1040
      VIEW  KBWYV1 [:Q65001]
        UNION-ALL PARTITION  [:Q65001]
          FILTER   [:Q64000]
            TABLE ACCESS FULL KBWYT1 [AN] Ct=11 Cd=2000 Bt=104000
          TABLE ACCESS FULL KBWYT2 [AN] Ct=11 Cd=2000 Bt=104000
          TABLE ACCESS FULL KBWYT3 [AN] Ct=11 Cd=2000 Bt=104000
          FILTER   [:Q61000]
            TABLE ACCESS FULL KBWYT4 [AN] Ct=11 Cd=2000 Bt=104000

KBWYV1 is a view on 4 tables KBWYT1-4. KBWYT1-4 contain rows for week 31-34 respectively and are maintained by check constraints. This query should only return rows from partions 2 & 3. The filter operation indicates this. Partitions 1 & 4 are eliminated at execution time. The view line indicates that the view is not merged. The union-all partion information indicates that we have recognised this as a partition view. Note that the tables can be accessed in parallel.

Remote Queries

Only shows remote in the OPERATION column OTHER column shows query executed on remote node OTHER_NODE shows where it is executed Different operational characteristics for RBO & CBO

RBO - Drags everything across the link and joins locally
CBO - Uses cost estimates to determine whether to execute remotely or locally

SQL>  explain plan for
select *
from dept@loop_link;

Query Plan
-------------------------------------------------------
SELECT STATEMENT REMOTE  [CHOOSE] Cost=1
  TABLE ACCESS FULL DEPT [SJD.WORLD] [ANALYZED]

In this case the whole query has been sent to the remote site. The other column shows nothing.

SQL> explain plan for
select a.dname,avg(b.sal),max(b.sal)
from dept@loop_link a, emp b
where a.deptno=b.deptno
group by a.dname
order by max(b.sal),avg(b.sal) desc;

Query Plan
-----------------------------------------------------
SELECT STATEMENT   [CHOOSE] Cost=20
  SORT ORDER BY  [:Q137003] [PARALLEL_TO_SERIAL]
    SORT GROUP BY  [:Q137002] [PARALLEL_TO_PARALLEL]
      NESTED LOOPS   [:Q137001] [PARALLEL_TO_PARALLEL]
        REMOTE   [:Q137000] [PARALLEL_FROM_SERIAL]
        TABLE ACCESS FULL EMP [:Q137001] [ANALYZED]
        [PARALLEL_COMBINED_WITH_PARENT]

Bind Variables

Bind variables are recommended in most cases because they promote sharing of sql code
At parse time the parser has NO IDEA what the bind variable contains. With RBO this makes no difference but with CBO, which relies on accurate statistics to produce plans, this can be a problem.

Defining bind variables in sqlplus:

variable x varchar2(18);
assigning values:
begin
:x := 'hello';
end;
/

SQL> explain plan for
select *
from dept
where rowid = ':x';

Query Plan
------------------------------------
SELECT STATEMENT [CHOOSE] Cost=1
  TABLE ACCESS BY ROWID DEPT [ANALYZED]

Parallel Query

Main indicators that a query is using PQO:

[:Q1000004] entries in the explain plan
Checkout the other column for details of what the slaves are executing
v$pq_slave will show any parallel activity

Columns to look in for information

other - contains the query passed to the slaves
other_tag - describes the contents of other
object_node - indicates order of pqo slaves

Parallel Query operates on a producer/consumer basis. When you specify parallel degree 4 oracle tries to allocate 4 producer slaves and 4 consumer slaves. The producers can feed any of the consumers. If there are only 2 slaves available then we use these. If there is only 1 slave available then we go serial If there are none available then we use serial. If parallel_min_percent is set then we error ora 12827 instead of using a lower number of slaves or going serial

Consumer processes typically perform a sorting function. If there is no requirement for the data to be sorted then the consumer slaves are not produced and we end up with the number of slaves used matching the degree of parallelism as opposed to being 2x the degree.

Parallel Terms

PARALLEL_FROM_SERIAL	This means that source of the data is serial but it is passed to a parallel consumer
PARALLEL_TO_PARALLEL	Both the consumer and the producer are parallel
PARALLEL_COMBINED_WITH_PARENT	This operation has been combined with the parent operator. For example in a sort merge join the sort operations would be shown as PARALLEL_COMBINED_WITH_PARENT because the sort and the merge are handled as 1 operation.
PARALELL_TO_SERIAL	The source of the data is parallel but it is passed to a serial consumer. This typically will happen at the top of the explain plan but could occur anywhere

Assumptions

OPTIMIZER_MODE = CHOOSE
DEPT is small compared to EMP
DEPT has an index (DEPT_INDX) on deptno column

Three examples are presented

Query #1: Serial
Query #2: Parallel
Query #3: Parallel, with forced optimization to RULE and forced usage of DEPT_INDX

Sample Query #1 (Serial)

select A.dname, avg(B.sal), max(B.sal)
from  dept A, emp B
where A.deptno = B.deptno
group by A.dname
order by max(B.sal), avg(B.sal) desc;

Execution Plan #1 (Serial)

OBJECT_NAME                      OBJECT_NODE OTHER
-------------------------------  ----------- -------
SELECT STATEMENT
 SORT ORDER BY
   SORT GROUP BY
     MERGE JOIN
       SORT JOIN
         TABLE ACCESS FULL emp
       SORT JOIN
         TABLE ACCESS FULL dept

Notice that the object_node and other columns are empty

Sample Query #2 (Query #1 with parallel hints)

select /*+ parallel(B,4) parallel(A,4) */
A.dname, avg(B.sal), max(B.sal)
from  dept A, emp B
where A.deptno = B.deptno
group by A.dname
order by max(B.sal), avg(B.sal) desc;

Execution Plan #2 (Parallel)

OBJECT_NAME                      OBJECT_NODE OTHER
-------------------------------  ----------- -------
SELECT STATEMENT      Cost = ??
 SORT ORDER BY                   :Q55004     **[7]**
   SORT GROUP BY                 :Q55003     **[6]**
     MERGE JOIN                  :Q55002     **[5]**
       SORT JOIN                 :Q55002     **[4]**
         TABLE ACCESS FULL emp   :Q55001     **[2]**
       SORT JOIN                 :Q55002     **[3]**
         TABLE ACCESS FULL dept  :Q55000     **[1]**

Execution Plan #2  -- OTHER column

**[1]**  (:Q55000) "PARALLEL_FROM_SERIAL"

Serial execution of SELECT DEPTNO, DNAME FROM DEPT

**[2]**  (:Q55001) "PARALLEL_TO_PARALLEL"

        SELECT /*+ ROWID(A1)*/
        A1."DEPTNO" C0, A1."SAL" C1
        FROM "EMP" A1
        WHERE ROWID BETWEEN :1 AND :2

**[3]**  (:Q55002) "PARALLEL_COMBINED_WITH_PARENT"
**[4]**  (:Q55002) "PARALLEL_COMBINED_WITH_PARENT"
**[5]**  (:Q55002) "PARALLEL_TO_PARALLEL"

        SELECT /*+ ORDERED USE_MERGE(A2)*/
        A2.C1 C0, A1.C1 C1
        FROM :Q55001 A1,:Q55000 A2
        WHERE A1.C0=A2.C0

**[6]**  (:Q55003) "PARALLEL_TO_PARALLEL"

        SELECT MAX(A1.C1) C0, AVG(A1.C1) C1, A1.C0 C2
        FROM :Q55002 A1
        GROUP BY A1.C0

**[7]**  (:Q55004) "PARALLEL_FROM_SERIAL"

        SELECT A1.C0 C0, A1.C1 C1, A1.C2 C2
        FROM :Q55003 A1
        ORDER BY A1.CO, A1.C1 DESC

Sample Query #3 (Query #2 with fudged hints)

select /*+ index(A dept_indx) parallel(B,4) parallel(A,4) */
      A.dname, avg(B.sal), max(B.sal)
from  dept A, emp B
where A.deptno = B.deptno
group by A.dname
order by max(B.sal), avg(B.sal) desc;

Execution Plan #3 (Parallel)

OBJECT_NAME                         OBJECT_NODE OTHER
----------------------------------- ----------- -------
SELECT STATEMENT          Cost = ??
 SORT ORDER BY                      :Q58002     **[6]**
   SORT GROUP BY                    :Q58001     **[5]**
     NESTED LOOPS JOIN              :Q58000     **[4]**
       TABLE ACCESS FULL emp        :Q58000     **[3]**
       TABLE ACCESS BY ROWID dept   :Q58000     **[2]**
         INDEX RANGE SCAN dept_indx :Q58000     **[1]**

Execution Plan #3  -- OTHER column

**[1]**  (:Q58000) "PARALLEL_COMBINED_WITH_PARENT"
**[2]**  (:Q58000) "PARALLEL_COMBINED_WITH_PARENT"
**[3]**  (:Q58000) "PARALLEL_COMBINED_WITH_PARENT"
**[4]**  (:Q58000) "PARALLEL_TO_PARALLEL"

        SELECT /*+ ORDERED USE_NL(A2) INDEX(A2) */
        A2."DNAME" C0, A1.C0 C1
        FROM
          (SELECT /*+ ROWID(A3) */
           A3."SAL" CO, A3."DEPTNO" C1
           FROM "EMP" A3
           WHERE ROWID BETWEEN :1 AND :2) A1,
          "DEPT" A2
        WHERE A2."DEPTNO" = A1.C1

**[5]**  (:Q58001) "PARALLEL_TO_PARALLEL"

        SELECT MAX(A1.C1) C0, AVG(A1.C1) C1, A1.C0 C2
        FROM :Q58000 A1
        GROUP BY A1.C0

**[6]**  (:Q58002) "PARALLEL_TO_SERIAL"

        SELECT A1.C0 C0, A1.C1 C1, A1.C2 C2
        FROM :Q58001 A1
        ORDER BY A1.C0, A1.C1 DESC

How to obtain explain plans

Main advantage is that it does not actually run the query - just parses the sql. This means that it executes quickly. In the early stages of tuning explain plan gives you an idea of the potential performance of your query without actually running it. You can then make a judgement as to any modifications you may choose to make.

Autotrace can be configured to run the sql & gives a plan and statistics afterwards or just give you an explain plan without executing the query.

Analyzes trace file

Source - www.akadia.com