5 Indexing XMLType Data

You can create indexes on your XML data, to focus on particular parts of it that you query often, and thus improve performance. This chapter includes guidelines for doing this. It describes various ways that you can index XMLType data, whether schema-based or non-schema-based, and regardless of the XMLType storage model you use (binary XML, unstructured, hybrid, or structured).

This chapter contains these topics:

Oracle XML DB Tasks Involving Indexes
Overview of Indexing XMLType Data
Function-Based Indexes on XMLType Data
XMLIndex
Oracle Text Indexes on XML Data

Note:

The explain plans shown here are for illustration only. If you run the examples presented here in your environment your explain plans might not be identical.

See Also:

Oracle Database Concepts for an overview of indexing
Oracle Database Advanced Application Developer's Guide for information about using indexes in application development

Oracle XML DB Tasks Involving Indexes

Table 5-1 identifies the documentation for some user tasks involving indexes.

Table 5-1 Oracle XML DB Tasks Involving Indexes

For information about how to...	See...
Create a function-based index on unstructured `XMLType` data	"Creating Function-Based Indexes on Unstructured XMLType Tables and Columns"
Create a function-based index on structured `XMLType` data	"Creating Function-Based Indexes on Structured XMLType Tables and Columns"
Create B-tree indexes on objects underlying structured `XMLType` data	"Using Indexes to Tune Queries on Collections Stored as OCTs"
Create an `XMLIndex` index	Example 5-9, Example 5-14, Example 5-16
Name the path table when creating an `XMLIndex` index	Example 5-14
Specify storage options when creating an `XMLIndex` index	Example 5-16
Obtain the name of an `XMLIndex` index for a given table or column	Example 5-12
Rename an XMLIndex index	Example 5-13
Drop an XMLIndex index	Example 5-13
Show all existing secondary indexes on an `XMLIndex` path table	Example 5-17, Example 5-23
Obtain the name of a path table for an `XMLIndex` index	Example 5-15
Obtain the name of an `XMLIndex` index, given its path table	Example 5-25
Create a secondary index on an `XMLIndex` path table	"Creating Additional Secondary Indexes on an XMLIndex Path Table"
Create a function-based index on a path-table `VALUE` column	Example 5-18
Create a numeric index on a path-table `VALUE` column	Example 5-20
Create a date index on a path-table `VALUE` column	Example 5-21
Create an Oracle Text `CONTEXT` index on a path-table `VALUE` column	Example 5-22
Show whether an `XMLIndex` index is used in evaluating a query	"How to Tell If XMLIndex is Used"
Turn off use of an `XMLIndex` index	"Turning Off Use of XMLIndex"
Extract an XML fragment using `XMLIndex`	Example 5-26
Exclude or include particular XPath expressions from use by an `XMLIndex` index	"XMLIndex Path Subsetting: Specifying the Paths You Want to Index"
Specify namespace prefixes for XPath expressions used for `XMLIndex`	"XMLIndex Path Subsetting: Specifying the Paths You Want to Index"
Create an `XMLIndex` on Oracle XML DB Repository	"Using XMLIndex on Oracle XML DB Repository"
Query Oracle XML DB Repository using `XMLIndex`	"Using XMLIndex on Oracle XML DB Repository"
Specify that an `XMLIndex` index should be created and maintained using parallel processes	"XMLIndex Parallelism"
Change the parallelism of an `XMLIndex` path table to tune index performance	"XMLIndex Parallelism"
Schedule maintenance for an `XMLIndex` index	"Asynchronous (Deferred) Maintenance of XMLIndex Indexes"
Manually synchronize an `XMLIndex` index and its base table	"Asynchronous (Deferred) Maintenance of XMLIndex Indexes"
Collect statistics on a table or index for the cost-based optimizer	Example 5-37
Create an Oracle Text `CONTEXT` index	Example 5-38
Use an Oracle Text `CONTEXT` index for full-text search of XML data	Example 5-39
Show whether an Oracle Text `CONTEXT` index is used in a query	Example 5-39

Overview of Indexing XMLType Data

Database indexes improve performance by providing faster access to table data. The use of indexes is particularly recommended for online transaction processing (OLTP) environments involving few updates.

Problem: Fine-Grained Structure of XML Data

You can create indexes on one or more table columns, or on a functional expression. XML data, however, has its own, fine-grained structure, which is not necessarily reflected in the structure of the database tables used to store it. For this reason, effectively indexing XML data can be a bit different from indexing most database data.

B-tree Indexes Are Appropriate for Structured Storage

For structured XML storage, XML objects such as elements and attributes correspond to object-relational columns and tables, so creating B-tree indexes on those columns and tables provides an excellent way to effectively index the corresponding XML objects. Here, the storage model directly reflects the fine-grained structure of the XML data, so there is no special problem for indexing structured XML data.

Unstructured and Hybrid Storage Present an Indexing Problem for XML Data

For unstructured and hybrid XML storage, indexing a database column using the standard sorts of index (B-tree, bitmap) is generally not helpful for accessing particular parts of an XML document. If an XMLType column that contains an XML document is stored as a CLOB instance, then the details within that document are inaccessible to the column index — the entire document acts as a single unit as far as the column index is concerned. In hybrid storage, part of an XML document is broken up and stored object-relationally (structured storage), but one or more XML fragments are stored as CLOB instances (unstructured storage). A typical use case here is mapping an XML-schema complexType or a complex element to CLOB storage, because the entire fragment is generally accessed as a unit. For standard indexes, it acts as a unit for indexing as well.

Solution: XMLIndex

XMLIndex provides a general, XML-specific index that indexes the internal structure of XML data. One of its main purposes is to overcome the indexing limitation presented by unstructured and hybrid storage of XML data, that is, CLOB storage. It does this by indexing the XML tags of your document and identifying document fragments based on XPath expressions that target them. It can also index scalar node values, to provide quick lookup based on individual values or ranges of values. It also records document hierarchy information for each node it indexes: relations parent–child, ancestor–descendant, and sibling.

See Also:

"XMLIndex"

Other Indexes for XML Data

In addition to XMLIndex, you can use function-based indexes and Oracle Text indexes with XML data. In releases prior to Oracle Database 11g Release 1 (11.1), CTXXPath indexes are also sometimes appropriate for use with XML data.

Function-Based Indexes

In many cases where an XPath expression targets a singleton node, a function-based index can be effective in increasing access performance. In particular, SQL functions XMLQuery, XMLExists, XMLCast, extract, extractValue, and existsNode are useful candidates for this kind of index on XML data. If a functional expression in a query WHERE clause matches a function-based index, then access to the corresponding data can sometimes be faster even than that provided by a more general XMLIndex index. In addition, for structured storage, defining an index based on SQL function extractValue often leads, by XPath rewrite, to automatic creation of (B-tree) indexes on the underlying objects. In this case also, the XPath target must be a singleton element or attribute.

Oracle Text Indexes

Besides accessing XML nodes such as elements and attributes, it is sometimes important to provide fast access to particular passages of text within XML text nodes. This is the purpose of Oracle Text indexes: they index full-text strings. An Oracle Text CONTEXT index enables SQL function contains for full-text search over XML. With structured storage, XPath rewrite can often rewrite XPath function ora:contains to SQL function contains, so in those cases too an Oracle Text index can be employed. Full-text indexing is particularly useful for document-centric applications, which often contain a mix of XML elements and text-node content. Full-text searching can often be made more powerful, more focused, by combining it with structural XML searching, that is, by restricting it to certain parts of an XML document, which are identified by using XPath expressions.

See Also:

"Oracle Text Indexes on XML Data"

CTXXPath Indexes

Another type of index that is available for indexing XML data, CTXXPath, is deprecated, starting with Oracle Database 11g Release 1 (11.1). It has been superseded by XMLIndex, and it is made available only for use with older database releases. It cannot help in extracting an XML fragment, and it acts only as a preliminary filter for equality predicates; after such filtering, XPath expressions are evaluated functionally (that is, without the benefit of XPath rewrite).

Note:

The CTXSYS.CTXXPath index is deprecated in Oracle Database 11g Release 1 (11.1). The functionality that was provided by CTXXPath is now provided by XMLIndex.

Oracle recommends that you replace CTXXPath indexes with XMLIndex indexes. The intention is that CTXXPath will no longer be supported in a future release of the database.

Optimization Chooses Indexes

Which indexes are used when more than one might apply in a given case? Cost-based optimization determines the index or indexes to use, so that performance is maximized. Oracle Text indexes apply only to text, which, for XML data, means text nodes. Whenever text nodes are targeted and a corresponding Oracle Text index is defined, it is used. If other indexes are also appropriate in a particular context, then they can be used as well. However, just because an index is defined and might appear applicable in a given situation does not mean that it will be used — it will not be used if the cost-based optimizer deems that its use would not be cost-effective.

Function-Based Indexes on XMLType Data

You can create a function-based index on an XMLType table or column, whether the data is XML schema-based or not, and whether the XMLType storage is structured, unstructured, or binary XML. A function-based index is created by evaluating the specified functions for each row in the target table or column and storing the value in the index. You can create a function-based index as either a B-tree index or a bitmap index.

Creating Function-Based Indexes on Unstructured XMLType Tables and Columns

If a function-based index is defined on XML data that is not managed using structured storage, then the index is created by invoking the function on the XML content and indexing the result. (This is not necessarily what happens when you try to create a function-based index on XML data in structured storage — see "Creating Function-Based Indexes on Structured XMLType Tables and Columns".)

Example 5-1 shows the creation of an index based on SQL function extractValue, where the XML data is in unstructured (CLOB) storage. The data is first copied to CLOB-based table po_clob from structured-storage table purchaseorder of standard database schema OE. Both of these tables will be used in examples throughout this chapter.

Example 5-1 Creating a Function-Based Index on a CLOB XMLType Instance

The CREATE INDEX statement in this example creates a function-based index on the value of the text node of element Reference. This index enforces the uniqueness constraint on the text-node value.

CREATE TABLE po_clob OF XMLType
  XMLTYPE STORE AS CLOB
  ELEMENT "http://localhost:8080/source/schemas/poSource/xsd/purchaseOrder.xsd#PurchaseOrder";
 
Table created.
 
INSERT INTO po_clob SELECT OBJECT_VALUE FROM OE.purchaseorder;
 
132 rows created.

CREATE UNIQUE INDEX po_fn_based_ix ON po_clob
  (extractValue(OBJECT_VALUE, '/PurchaseOrder/Reference'));
 
Index created.
 
INSERT INTO po_clob
  VALUES (XMLType(bfilename('XMLDIR', 'EABEL-20021009123335791PDT.xml'),
                  nls_charset_id('AL32UTF8')));
INSERT INTO po_clob
*
ERROR at line 1:
ORA-00001: unique constraint (OE.PO_FN_BASED_IX) violated

The cost-based optimizer considers using a function-based index only when the function that is included in the WHERE clause is identical to the function that was used to create the index.

To see this, consider the queries and explain plans in Example 5-2. These queries each find a purchase-order document based on the text node of element Reference. The first query, which uses function existsNode to locate the document, does not use the index created in Example 5-1; the second query, which uses function extractValue, does use the index. This is because the index was created using extractValue.

Example 5-2 Function-Based Index Is Used Only by a Matching Query

EXPLAIN PLAN FOR
  SELECT OBJECT_VALUE FROM po_clob
    WHERE existsNode(OBJECT_VALUE, '/PurchaseOrder[Reference="EABEL-20021009123335791PDT"') = 1;
 
Explained.
 
--
SET ECHO OFF
 
PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------
Plan hash value: 2803800196
 
-----------------------------------------------------------------------------
| Id  | Operation         | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |         |    42 | 84084 |    23  (27)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| PO_CLOB |    42 | 84084 |    23  (27)| 00:00:01 |
-----------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   1 - filter(EXISTSNODE(SYS_MAKEXML('0BAD982DA615296BE040578CB00B1198',
              3460,"PO_CLOB"."XMLDATA"),'/PurchaseOrder[Reference="EABEL-2002100912333
              5791PDT"')=1)
 
15 rows selected.
 
EXPLAIN PLAN FOR
  SELECT OBJECT_VALUE FROM po_clob
    WHERE extractValue(OBJECT_VALUE, '/PurchaseOrder/Reference') = 'EABEL-20021009123335791PDT';
 
Explained.
 
SET ECHO OFF
 
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------
Plan hash value: 2594805861
 
----------------------------------------------------------------------------------------------
| Id  | Operation                   | Name           | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                |     1 |  2002 |     1   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| PO_CLOB        |     1 |  2002 |     1   (0)| 00:00:01 |
|*  2 |   INDEX UNIQUE SCAN         | PO_FN_BASED_IX |     1 |       |     0   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   2 - access(EXTRACTVALUE(SYS_MAKEXML('0BAD982DA615296BE040578CB00B1198',3460,"XMLDAT
              A"),'/PurchaseOrder/Reference')='EABEL-20021009123335791PDT')
 
Note
-----
   - dynamic sampling used for this statement
 
19 rows selected.

Creating Function-Based Indexes on Structured XMLType Tables and Columns

When you use structured XMLType storage, there are a few things to be aware of when creating a function-based index:

The element or attribute being targeted by the function must be a singleton, that is, it must occur only once in the XML Document — a function-based index must not target a collection. See "No XPath Rewrite for EXTRACTVALUE Applied to a Collection".
If the function being indexed is extractValue, then Oracle XML DB tries to rewrite your function-based CREATE INDEX statement to a different CREATE INDEX statement that is not function-based and does not use the XPath-expression argument as such. Instead, the resulting index is a B-tree index that operates directly on the underlying objects. This B-tree index takes the place of the function-based index that you specify. See "XPath Rewrite for EXTRACTVALUE Indexes on Singleton Elements or Attributes".
If Oracle XML DB cannot rewrite the CREATE INDEX statement, then a function-based index is created, just as for unstructured storage. This is what happens if the function is not extractValue, and it can also happen in a few cases for extractValue.

XPath Rewrite for EXTRACTVALUE Indexes on Singleton Elements or Attributes

Example 5-3 shows a CREATE INDEX statement for a structured XMLType table, purchaseorder, in sample database schema OE. This statement tries to create a function-based index using SQL function extractValue, and it targets a singleton XML element, Reference — each purchase-order document has a single Reference element.

Example 5-3 CREATE INDEX with EXTRACTVALUE on a Singleton Element or Attribute

CREATE INDEX po_fn_based_ix ON purchaseorder
  (extractValue(OBJECT_VALUE, '/PurchaseOrder/Reference'));

Index created.

Oracle XML DB rewrites this CREATE INDEX statement into the statement in Example 5-4. The index created is a B-tree index on the underlying object-relational columns. To see this, you can look at the COLUMN_NAME column for the index in table USER_IND_COLUMNS. This shows that the underlying object-relational column REFERENCE is used.

SELECT INDEX_NAME, TABLE_NAME, COLUMN_NAME FROM USER_IND_COLUMNS
  WHERE INDEX_NAME = 'PO_FN_BASED_IX' AND TABLE_NAME = 'PURCHASEORDER';
 
INDEX_NAME       TABLE_NAME       COLUMN_NAME
---------------- ---------------- ----------------------
PO_FN_BASED_IX   PURCHASEORDER    "XMLDATA"."REFERENCE"
 
1 row selected.

Example 5-4 XPath Rewrite of an EXTRACTVALUE Index on a Singleton Element or Attribute

CREATE INDEX po_fn_based_ix ON purchaseorder p (p."XMLDATA"."REFERENCE");

Index created.

Example 5-3 and Example 5-4 yield the same result; they are two ways of creating the same index.

SELECT INDEX_NAME, TABLE_NAME, COLUMN_NAME FROM USER_IND_COLUMNS
  WHERE INDEX_NAME = 'PO_FN_BASED_IX' AND TABLE_NAME = 'PURCHASEORDER';
 
INDEX_NAME       TABLE_NAME       COLUMN_NAME
---------------- ---------------- ----------------------
PO_FN_BASED_IX   PURCHASEORDER    "XMLDATA"."REFERENCE"
 
1 row selected.

No XPath Rewrite for EXTRACTVALUE Applied to a Collection

A function-based index must target a singleton, not a collection. This section demonstrates this for structured storage.

If a collection is stored in a CLOB instance, you cannot directly access its members. In structured storage, a collection is stored as an ordered collection table or an XMLType instance, which means that you can directly access its members. Because the structured storage model directly reflects the fine-grained structure of the XML data, you can create indexes that target collection members.

What you cannot do is create an index based on function extractValue that targets a collection with an XPath expression, and expect that Oracle XML DB will rewrite your CREATE INDEX statement, creating the necessary indexes on the underlying objects.

If you want such indexes, you must create them yourself. To do this, you must understand the structure of the SQL object that is used to manage the collection. Given this information, you can create the required (B-tree) indexes directly on the appropriate SQL-object attributes using conventional object-relational SQL.

The aim of this section is to show that you cannot count on XPath rewrite to create the appropriate indexes for you. Refer to "Using Indexes to Tune Queries on Collections Stored as OCTs" for an example of how to create such indexes manually.

This section deliberately tries to create a function-based index on a repeating attribute in a collection, to see what happens. This is tried in a couple of ways, without success.

Suppose that you want to create an index on attribute Id of element Part. You might try to create the index as in Example 5-5. However, when an element or attribute being indexed occurs multiple times in a document, a CREATE INDEX operation on an XPath expression that targets it fails, because SQL function extractValue can return only a single value for each row that it processes. In this case, there can be multiple occurrences of attribute Id in a purchase-order document, because its ancestor element LineItem is a collection.

Example 5-5 Trying to Create a Function-Based Index on a Repeating Attribute

CREATE INDEX po_fn_based_ix ON purchaseorder
  (extractValue(OBJECT_VALUE, '/PurchaseOrder/LineItems/LineItem/Part/@Id'));
CREATE INDEX po_fn_based_ix ON purchaseorder
*
ERROR at line 1:
ORA-19025: EXTRACTVALUE returns value of only one node

You can instead create an index by replacing function extractValue with a combination of function extract and XMLType method getStringVal(), as shown in Example 5-6.

Example 5-6 Creating a Function-Based Index Using EXTRACT and getStringVal()

CREATE INDEX po_fn_based_ix
  ON purchaseorder (extract(OBJECT_VALUE,
                            'PurchaseOrder/LineItems/LineItem/Part/@Id').getStringVal());
 
Index created.

This CREATE INDEX operation succeeds in creating a function-based index, but the index is not what you might expect. The index is created by invoking SQL function extract and XMLType method getStringVal() for each row in the table, and then indexing that result against the rowid of the row.

The problem with this technique is that extract returns multiple nodes. The result of applying function extract is a single XMLType fragment that contains all of the matching nodes.

Note:

In general, avoid creating an index based on SQL function extract. It is unlikely that the index will be useful.

The result of then invoking method getStringVal() on the XMLType instance that contains this fragment is a concatenation of the nodes in the fragment. What is indexed is therefore the concatenation of the three UPC codes (attribute Id), and not, as intended, each of the individual UPC codes. This is shown in Example 5-7.

Example 5-7 Function-Based Index on Concatenated Nodes

SELECT extract(OBJECT_VALUE, '/PurchaseOrder/LineItems') XML,
       extract(OBJECT_VALUE, 'PurchaseOrder/LineItems/LineItem/Part/@Id').getStringVal() INDEX_VALUE
  FROM purchaseorder
  WHERE existsNode(OBJECT_VALUE, '/PurchaseOrder[Reference="SBELL-2002100912333601PDT"]') = 1;
 
XML                                                               INDEX_VALUE
----------------------------------------------------------------- -----------------------------------
<LineItems>                                                       71551500905837429140222715515011020
  <LineItem ItemNumber="1">
    <Description>A Night to Remember</Description>
    <Part Id="715515009058" UnitPrice="39.95" Quantity="2"/>
  </LineItem>
  <LineItem ItemNumber="2">
    <Description>The Unbearable Lightness Of Being</Description>
    <Part Id="37429140222" UnitPrice="29.95" Quantity="2"/>
  </LineItem>
  <LineItem ItemNumber="3">
    <Description>Sisters</Description>
    <Part Id="715515011020" UnitPrice="29.95" Quantity="4"/>
  </LineItem>
</LineItems>

1 row selected.

The way to resolve this problem is to use structured storage with ordered collection tables (OCTs), to force each XML node that is indexed to be stored as a separate row. The index can be created directly on the appropriate OCT, using object-relational SQL similar to that generated by XPath rewrite.

In sum, you can effectively index XML data that is stored object-relationally (structured storage) by indexing the database columns that correspond to XML nodes. In the singleton extractValue case, Oracle XML DB does this for you automatically as a convenience; you use the simple syntax of creating an index based on function extractValue with an XPath-expression argument, and Oracle XML DB does the rest.

See Also:

"Using Indexes to Tune Queries on Collections Stored as OCTs" for an example of indexing an ordered collection table

XMLIndex

This section contains these topics:

Advantages of XMLIndex
XPath Expressions Not Indexed by XMLIndex
Components of an XMLIndex Index
Data Dictionary Static Public Views Related to XMLIndex
Creating, Dropping, Altering, and Examining an XMLIndex Index
Creating Additional Secondary Indexes on an XMLIndex Path Table
How to Tell If XMLIndex is Used
Turning Off Use of XMLIndex
XMLIndex Path Subsetting: Specifying the Paths You Want to Index
Using XMLIndex on Oracle XML DB Repository
XMLIndex Parallelism
Asynchronous (Deferred) Maintenance of XMLIndex Indexes
Collecting Statistics on XMLIndex Objects For the Cost-Based Optimizer
Guidelines for Using XMLIndex
PARAMETERS Clause for CREATE INDEX and ALTER INDEX

B-tree indexes can be used advantageously with structured storage — they provide sharp focus by targeting the underlying objects directly. They are generally ineffective, however, in addressing the detailed structure (elements and attributes) of an XML document or fragment stored in a CLOB instance. That is the special domain of XMLIndex: unstructured and hybrid storage.

A typical use case for XMLIndex is where you generally expect to access certain portions of a document in their entirety, so you pack those portions into one or more CLOB instances. You might nevertheless sometimes need to query within these document portions. XMLIndex can help here, whereas XPath rewrite is ineffective in this case.

Another use case is where an XML schema contains xsd:any elements, for lack of any specific knowledge of the document structure and data types involved. The data corresponding to these elements is stored in CLOB instances, and XMLIndex can be used to speed access to it.

In addition to CLOB storage of XML, you can also use XMLIndex to index binary XMLType data.

Unlike a B-tree index, which you define for a specific database column that represents an individual XML element or attribute, an XMLIndex index is, by default, very general: Unless you specify a more narrow focus by detailing specific XPath expressions to use or not to use in indexing, indexing with XMLIndex applies to all possible XPath expressions for your XML data.

Advantages of XMLIndex

An XMLIndex index can be used for SQL functions XMLQuery, XMLTable, XMLExists, XMLCast, extract, extractValue, and existsNode. It presents the following advantages over other indexing methods:

An XMLIndex index is effective in any part of a query; it is not limited to use in a WHERE clause. This is not the case for any of the other kinds of indexes you might use with XML data.
XMLIndex can thus speed access to both SELECT list data and FROM list data, making it useful for XML fragment extraction, in particular. Function-based indexes (and CTXXPath indexes, which are deprecated) cannot be used to extract document fragments.
You need no prior knowledge of the XPath expressions that will be used in queries. XMLIndex is completely general. This is not the case for function-based indexes. If you do have such prior knowledge, then you can often improve performance by tailoring XMLIndex indexing to those paths most queried.
You can use an XMLIndex index with either XML schema-based or non-schema-based data. It can be used with unstructured storage, hybrid storage, and binary XML storage. B-tree indexing is appropriate only for XML schema-based data that is stored object-relationally (structured storage); it is ineffective for XML schema-based data stored in a CLOB instance.
For hybrid storage of XML schema-based data, XMLIndex can handle XPath expressions that target document fragments that are stored within a CLOB instance. XPath rewrite is ineffective in such cases.
You can use an XMLIndex index for searches with XPath expressions that target collections, that is, nodes that occur multiple times within a document. This is not the case for function-based indexes.
XMLIndex indexing — both index creation and index maintenance — can be carried out in parallel, using multiple database processes. This is not the case for function-based indexes (and CTXXPATH indexes, which are deprecated).
Updating of XMLIndex indexes on binary XML storage can be accomplished in a piecewise manner, improving DML performance considerably. This is not the case for any of the other kinds of indexes you might use with XML data.

XPath Expressions Not Indexed by XMLIndex

The following types of XPath expressions are not indexed by XMLIndex:

Applications of XPath functions, except ora:contains, , , , and (these are all indexed). In particular, user-defined XPath functions are not indexed.
Axes other than child, descendant, and attribute, that is, axes parent, ancestor, following-sibling, preceding-sibling, following, preceding, and ancestor-or-self.
Expressions using the union operator, | (vertical bar).

Components of an XMLIndex Index

XMLIndex is a domain index; it is designed specifically for the domain of XML data. It is a logical index, which has three components:

A path index – This indexes the XML tags of a document and identifies its various document fragments.
An order index – This indexes the hierarchical positions of the nodes in an XML document. It keeps track of parent–child, ancestor–descendant, and sibling relations.
A value index – This indexes the values of an XML document. It provides lookup by either value equality or value range. A value index is used for values in query predicates (WHERE clause).

XMLIndex is implemented using a path table and a set of (local) secondary indexes corresponding to its components. These are all owned by the owner of the base table upon which the XMLIndex index is created. The path table contains one row for each indexed node in the XML document. For each indexed node, the path table stores:

The corresponding rowid of the table that stores the document.
A locator, which provides fast access to the corresponding document fragment. For binary XML storage of XML schema-based data, it also stores data-type information.
An order key, to record the hierarchical position of the node in the document. You can think of this as a Dewey decimal key like that used in library cataloging and Internet protocol SNMP. In such a system, the key 3.21.5 represents the node position of the fifth child of the twenty-first child of the third child of the document root node.

Table 5-2 shows the main information^Foot 1 that is in the path table. The path index and the order index each use two columns of the path table. Together, columns PATHID and RID represent the path index; columns ORDER_KEY and RID represent the order index. Secondary indexes are created automatically for columns PATHID and ORDER_KEY.

Table 5-2 XMLIndex Path Table

Column	Data Type	Description
`PATHID`	`RAW(8)`	Unique identifier for the XPath path to the node.
`RID`	`ROWID`	Rowid of the table used to store the XML data.
`ORDER_KEY`	`RAW(1000)`	Decimal order key that identifies the hierarchical position of the node. (Document ordering is preserved.)
`LOCATOR`	`RAW(2000)`	Fragment-location information. Used for fragment extraction. For binary XML storage of XML schema-based data, data-type information is also stored here.
`VALUE`	`VARCHAR2(4000)`	Text of an attribute node or a simple element node.

Example 5-8 explores the contents of the path table for two purchase-order documents.

Example 5-8 Path Table Contents for Two Purchase Orders

<PurchaseOrder>
 <Reference>SBELL-2002100912333601PDT</Reference>
 <Actions>
  <Action>
   <User>SVOLLMAN</User>
  </Action>
 </Actions>
 . . .
</PurchaseOrder>

<PurchaseOrder>
 <Reference>ABEL-20021127121040897PST</Reference>
 <Actions>
  <Action>
   <User>ZLOTKEY</User>
  </Action>
  <Action>
   <User>KING</User>
  </Action>
 </Actions>
 . . .
</PurchaseOrder>

An XMLIndex index on an XMLType table or column storing these purchase orders will include a path table that has one row for each indexed node in the XML documents. Suppose that the system assigns the following PATHIDs when indexing the nodes according to their XPath expressions:

PATHID	Indexed XPath
`1`	`/PurchaseOrder`
`2`	`/PurchaseOrder/Reference`
`3`	`/PurchaseOrder/Actions`
`4`	`/PurchaseOrder/Actions/Action`
`5`	`/PurchaseOrder/Actions/Action/User`

The resulting path table would then be something like this (column LOCATOR is not shown):

PATHID	RID	ORDER_KEY	VALUE
`1`	`R1`	`1`	——
`2`	`R1`	`1.1`	`SBELL-2002100912333601PDT`
`3`	`R1`	`1.2`	——
`4`	`R1`	`1.2.1`	——
`5`	`R1`	`1.2.1.1`	`SVOLLMAN`
`1`	`R2`	`1`	——
`2`	`R2`	`1.1`	`ABEL-20021127121040897PST`
`3`	`R2`	`1.2`	——
`4`	`R2`	`1.2.1`	——
`5`	`R2`	`1.2.1.1`	`ZLOTKEY`
`4`	`R2`	`1.2.2`	——
`5`	`R2`	`1.2.2.1`	`KING`

Ignore the Path Table; It Is Transparent

Though you might create secondary indexes on path-table columns, you can generally ignore the path table itself. You cannot access the path table, other than to DESCRIBE it and create (secondary) indexes on it. You need never explicitly gather statistics on the path table. You need only collect statistics on the XMLIndex index or the base table on which the XMLIndex index is defined; statistics are collected and maintained on the path table and its secondary indexes transparently.

Column VALUE of the XMLIndex Path Table

A secondary index on column VALUE is used with XPath expressions in a WHERE clause that have predicates involving string matches. For example:

/PurchaseOrder[Reference/text() = "SBELL-2002100912333601PDT"]

Column VALUE stores the effective text value of a simple element node (no children) or an attribute node. For an element, this is obtained by concatenating all of the text nodes of the element — comments and processing instructions are ignored during indexing.

Column VALUE is a fixed size, VARCHAR2(4000). Any overflow (beyond 4000 bytes) during index creation or update is truncated, but the LOCATOR value for that row is then flagged so that the full value can be retrieved from the base table when needed.

In addition to the 4000-byte limit for column VALUE, there is a limit on the size of a key for the secondary index created on this column. This is the case for B-tree and function-based indexes as well; it is not an XMLIndex limitation. The index-key size limit is a function of the block size for your database. It is this limit that determines how much of VALUE is indexed.

In sum, only the first 4000 bytes of the effective text value are stored in column VALUE, and only the first N bytes of column VALUE are indexed, where N is the index-key size limit (N < 4000). Because of the index-key size limit, the index on column VALUE acts only as a preliminary filter for the effective text value.

For example, suppose that your database block size requires that the VALUE index be no larger than 800 bytes, so that only the first 800 bytes of the effective text value is indexed. The first 800 bytes of the effective text value is first tested, using XMLIndex, and only if that text prefix matches the query value is the rest of the effective text value tested.

The secondary index on column VALUE is an index on SQL function substr (substring equality), because that function is used to test the text prefix. This function-based index is created automatically as part of the implementation of XMLIndex for column VALUE.

For example, the XPath expression /PurchaseOrder[Reference/text() = :1] in a query WHERE clause might, in effect, be rewritten to a test something like this:

substr(VALUE, 1 800) = substr(:1, 1, 800) AND VALUE = :1;

This conjunction contains two parts, which are processed from left to right. The first test uses the index on function substr as a preliminary filter, to eliminate text whose first 800 bytes do not match the first 800 bytes of the value of bind variable :1.

Only the first test uses an index — the full value of column VALUE is not indexed. After preliminary filtering by the first test, the second test checks the entire effective text value — that is, the full value of column VALUE — for full equality with the value of :1. This check does not use an index.

Even if only the first 800 bytes of text is indexed, it is important for query performance that up to 4000 bytes be stored in column VALUE, because that provides quick, direct access to the data, instead of requiring, for example, extracting it from deep within a CLOB-instance XML document. If the effective text value is greater than 4000 bytes, then the second test in the WHERE-clause conjunction will require accessing the base-table data.

Note that neither the VALUE column 4000-byte limit nor the index-key size affect query results in any way; they can affect only performance.

Default Text for XML Schema-Based Data

As mentioned, XMLIndex can be used with XML schema-based data. If an XML schema specifies a defaultValue value for a given element or attribute, and a particular document does not specify a value for that element or attribute, then the defaultValue value is used for the VALUE column.

Creating Secondary Indexes on Column VALUE

If you do not specify a secondary index for column VALUE when you create the XMLIndex index, then a default secondary index is created on column VALUE. This default index has the default properties — in particular, it is an index for text (string-valued) data only.

You can, however, create a VALUE index of a different type. For example, you can create a number-valued index if that is appropriate for many of your queries. You can create multiple secondary indexes on the VALUE column. An index of a particular type is used only when it is appropriate. For example, a number-valued index is used only when the VALUE column is a number; it is ignored for other values. Secondary indexes on path-table columns are treated like any other secondary indexes — you can alter them, drop them, mark them unusable, and so on.

See Also:

"Creating Additional Secondary Indexes on an XMLIndex Path Table" for examples of creating secondary indexes on column VALUE
"PARAMETERS Clause for CREATE INDEX and ALTER INDEX" for the syntax of the PARAMETERS clause

Data Dictionary Static Public Views Related to XMLIndex

Information about the standard database indexes is available in static public views USER_INDEXES, ALL_INDEXES, and DBA_INDEXES. Similar information about XMLIndex indexes is available in static public views USER_XML_INDEXES, ALL_XML_INDEXES, and DBA_XML_INDEXES. Table 5-3 describes the columns in each of these views.

Table 5-3 XMLIndex Static Public Views

Column Name	Type	Description
`INDEX_OWNER`	`VARCHAR2`	Owner of the index. Not available for `USER_XML_INDEXES`.
`INDEX_NAME`	`VARCHAR2`	Name of the `XMLIndex` index.
`TABLE_OWNER`	`VARCHAR2`	Owner of the base table on which the index is defined.
`TABLE_NAME`	`VARCHAR2`	Name of the base table on which the index is defined.
`PATH_TABLE_NAME`	`VARCHAR2`	Name of the `XMLIndex` path table.
`PARAMETERS`	`XMLType`	Parameters specific to the index. These can include the set of XPath paths defining path-subsetting and the name of a scheduler job for synchronization.
`ASYNC`	`VARCHAR2`	Asynchronous index updating specification. See "Asynchronous (Deferred) Maintenance of XMLIndex Indexes".
`PEND_TABLE_NAME`	`VARCHAR2`	Name of the table that records base-table DML operations since the last index synchronization. See "Asynchronous (Deferred) Maintenance of XMLIndex Indexes".

Creating, Dropping, Altering, and Examining an XMLIndex Index

You create an XMLIndex index by declaring the index type to be XDB.XMLIndex, as illustrated in Example 5-9.

Example 5-9 Creating an XMLIndex Index on XMLType Unstructured Storage

CREATE INDEX po_xmlindex_ix ON po_clob (OBJECT_VALUE) INDEXTYPE IS XDB.XMLIndex;
 
Index created.

This creates an XMLIndex index named po_xmlindex_ix on XMLType table po_clob.

Note:

Although you can partition an XMLType table or column, you cannot create an XMLIndex index on such a table or column.

You can create an XMLIndex index on CLOB portions of hybrid XMLType storage, that is, on CLOB data that is embedded within object-relational storage. Example 5-10 illustrates this. It assumes that the XML schema used maps the LineItems element to CLOB, such as is shown in Example 5-10.

Example 5-10 Creating an XMLIndex Index on XMLType Hybrid Storage

CREATE INDEX po_xmlindex_hybrid_ix ON li_clob
  (extract(OBJECT_VALUE, '/PurchaseOrder/LineItems'))
  INDEXTYPE IS XDB.XMLIndex;

Example 5-11 XML Schema Fragment that Maps LineItems to CLOB Storage

<xs:element name="LineItems" type="LineItemsType" xdb:SQLName="LINEITEMS"                 
           xdb:SQLType="CLOB"/>

You can obtain the name of an XMLIndex index on a particular XMLType table (or column), as shown in Example 5-12. You can also select INDEX_NAME from DBA_INDEXES or ALL_INDEXES, as appropriate.

Example 5-12 Obtaining the Name of an XMLIndex Index on a Particular Table

SELECT INDEX_NAME FROM USER_INDEXES
  WHERE TABLE_NAME = 'PO_CLOB' AND ITYP_NAME = 'XMLINDEX';

INDEX_NAME
---------------
PO_XMLINDEX_IX
 
1 row selected.

You rename or drop an XMLIndex index just as you would any other index, as illustrated in Example 5-13. This renaming changes the name of the XMLIndex index only. It does not change the name of the path table — you can rename the path table separately.

Example 5-13 Renaming and Dropping an XMLIndex Index

ALTER INDEX po_xmlindex_ix RENAME TO new_name_ix;

Index altered.

DROP INDEX new_name_ix;

Index dropped.

Similarly, you can change other index properties using other ALTER INDEX options, such as REBUILD. XMLIndex is no different from other index types in this respect.

You can use the PARAMETERS clause of a CREATE INDEX statement to name the path table. Example 5-14 names the path table "my_path_table".

Example 5-14 Naming the Path Table of an XMLIndex Index

CREATE INDEX po_xmlindex_ix ON po_clob (OBJECT_VALUE) INDEXTYPE IS XDB.XMLIndex
  PARAMETERS ('PATH TABLE my_path_table');

If you do not name the path table this way, then its name is generated by the system, using the index name you provide to CREATE INDEX as a base. Example 5-15 shows this for the index created in Example 5-9.

Example 5-15 Determining the System-Generated Name of an XMLIndex Path Table

SELECT PATH_TABLE_NAME FROM USER_XML_INDEXES
  WHERE TABLE_NAME = 'PO_CLOB' AND INDEX_NAME = 'PO_XMLINDEX_IX';
 
PATH_TABLE_NAME
------------------------------
SYS72060_PO_XMLINDE_PATH_TABLE
 
1 row selected.

By default, the storage options of the XMLIndex path table and its secondary indexes are derived from the storage properties of the base table on which the XMLIndex index is created. You can specify different storage options by using a PARAMETERS clause when you create the index, as shown in Example 5-16. The PARAMETERS clause of CREATE INDEX (and ALTER INDEX) must be between single quotation marks (').

See Also:

"PARAMETERS Clause for CREATE INDEX and ALTER INDEX" for the syntax of the PARAMETERS clause

Example 5-16 Specifying Storage Options When Creating an XMLIndex Index

CREATE INDEX po_xmlindex_ix ON po_clob (OBJECT_VALUE) INDEXTYPE IS XDB.XMLIndex
  PARAMETERS
    ('PATH TABLE po_path_table
      (PCTFREE 5 PCTUSED 90 INITRANS 5
       STORAGE (INITIAL 1k NEXT 2k MINEXTENTS 3 BUFFER_POOL KEEP)
       NOLOGGING ENABLE ROW MOVEMENT PARALLEL 3)
      PATH ID INDEX   po_path_id_ix (LOGGING PCTFREE 1 INITRANS 3)
      ORDER KEY INDEX po_order_key_ix (LOGGING PCTFREE 1 INITRANS 3)
      VALUE INDEX     po_value_ix (LOGGING PCTFREE 1 INITRANS 3)');
 
Index created.

Because XMLIndex is a logical domain index, not a physical index, all physical attributes are either zero (0) or NULL.

In addition to specifying storage options for the path table, Example 5-16 names the secondary indexes. An index is created on column VALUE because it is specified; otherwise, no such index would be created.

Like the name of the path table, the names of the secondary indexes on the path-table columns are generated automatically using the index name as a base, unless you specify them in the PARAMETERS clause. Example 5-17 illustrates this, and shows how you can determine these names using public view USER_IND_COLUMNS. It also shows that the path index and the order index each use two columns, including column RID. Note, too, that no VALUE index was created, by default.

Example 5-17 Determining the Names of the Secondary Indexes of an XMLIndex Index

SELECT INDEX_NAME, COLUMN_NAME, COLUMN_POSITION FROM USER_IND_COLUMNS
  WHERE TABLE_NAME IN (SELECT PATH_TABLE_NAME FROM USER_XML_INDEXES
                         WHERE INDEX_NAME = 'PO_XMLINDEX_IX')
  ORDER BY INDEX_NAME, COLUMN_NAME;
 
INDEX_NAME                     COLUMN_NAME  COLUMN_POSITION
------------------------------ ------------ ---------------
SYS73321_PO_XMLINDE_ORDKEY_IX  ORDER_KEY                  2
SYS73321_PO_XMLINDE_ORDKEY_IX  RID                        1
SYS73321_PO_XMLINDE_PATHID_IX  PATHID                     1
SYS73321_PO_XMLINDE_PATHID_IX  RID                        2
 
4 rows selected.

See Also:

Example 5-23 for a similar, but more complex example

The RENAME clause of an ALTER INDEX statement for XMLIndex applies only to the XMLIndex index itself. To rename the path table and secondary indexes, you must determine the names of these objects and use appropriate ALTER INDEX statements on them directly. Similarly, to retrieve the physical properties of the secondary indexes or alter them in any other way, you will need to obtain their names, as in Example 5-17.

You can use ALTER INDEX to modify any of the index parameters of the primary or secondary indexes . As a convenience, an alternative way to change properties of a path-id, order-key, or value index is to use ALTER INDEX on the parent XMLIndex index, providing a PARAMETERS clause with the new properties.

See Also:

"PARAMETERS Clause for CREATE INDEX and ALTER INDEX" for the syntax of the PARAMETERS clause

Creating Additional Secondary Indexes on an XMLIndex Path Table

This section adds extra secondary indexes to the XMLIndex index created in Example 5-16.

You can create any number of additional secondary indexes on the VALUE column of the path table of an XMLIndex index. These can be of different types, including function-based indexes and Oracle Text indexes.

Whether or not a given index is used for a given element occurrence when processing a query is determined by whether it is of the appropriate type for that value and whether it is cost-effective to use it.

Example 5-18 creates a function-based index on column VALUE of the path table using SQL function substr. This might be useful if your queries often use substr applied to the text nodes of XML elements.

Example 5-18 Creating a Function-Based Index on Path-Table Column VALUE

CREATE INDEX fn_based_ix ON po_path_table (substr(VALUE, 1, 100));
 
Index created.

If you have many elements whose text nodes represent numeric values, then it can make sense to create a numeric index on the column VALUE. However, doing so directly, in a manner analogous to Example 5-18, raises an ORA-01722 error (invalid number) if some of the element values are not numbers. This is illustrated in Example 5-19.

Example 5-19 Trying to Create a Numeric Index on Path-Table Column VALUE Directly

CREATE INDEX direct_num_ix ON po_path_table (to_number(VALUE));
CREATE INDEX direct_num_ix ON po_path_table (to_number(VALUE))
                                             *
ERROR at line 1:
ORA-01722: invalid number

What is needed is an index that will be used for numeric-valued elements but will be ignored for element occurrences that do not have numeric values. Procedure createNumberIndex of package DBMS_XMLINDEX exists specifically for this purpose. You pass it the names of the database schema, the XMLIndex index, and the numeric index to be created. Creation of a numeric index is illustrated in Example 5-20.

Example 5-20 Creating a Numeric Index on Column VALUE with Procedure createNumberIndex

CALL DBMS_XMLINDEX.createNumberIndex('OE', 'PO_XMLINDEX_IX', 'API_NUM_IX');

Note that because such an index is specifically designed to ignore elements that do not have numeric values, its use will not detect their presence. If there are non-numeric elements and, for whatever reason, the XMLIndex index is not used in some query, then an ORA-01722 error will be raised. However, if the index is used, no such error will be raised, because the index ignores non-numeric data. As always, the use of an index will never change the result set — it will never give you different results, but use of an index can prevent you from detecting erroneous data.

Creating a date-valued index is similar to creating a numeric index; you use procedure DBMS_XMLINDEX.createDateIndex. Example 5-21 shows this.

Example 5-21 Creating a Date Index on Column VALUE with Procedure createDateIndex

CALL DBMS_XMLINDEX.createDateIndex('OE', 'PO_XMLINDEX_IX', 'API_DATE_IX', 
                                   'dateTime');

Example 5-22 creates an Oracle Text CONTEXT index on column VALUE. This is useful for full-text queries on text values of XML elements. XPath predicates that use XPath function ora:contains are rewritten to applications of SQL function contains on column VALUE. If a CONTEXT index is defined on column VALUE, then it will be used during predicate evaluation. An Oracle Text index is independent of all other VALUE-column indexes.

Example 5-22 Creating an Oracle Text CONTEXT Index on Path-Table Column VALUE

CREATE INDEX po_otext_ix ON po_path_table (VALUE)
  INDEXTYPE IS CTXSYS.CONTEXT
    PARAMETERS('TRANSACTIONAL');
 
Index created.

See Also:

"From ora:contains to contains" for information about parameter TRANSACTIONAL

The query in Example 5-23 shows all of the secondary indexes created on the path table of an XMLIndex index. The indexes created explicitly are in bold. Note, in particular, that some indexes, such as the function-based index created on column VALUE, do not appear as such; the column name listed for it them a system-generated name, such as SYS_NC00006$. This means that you cannot see these columns by executing a query with COLUMN_NAME = 'VALUE' in the WHERE clause.

Example 5-23 Showing All Secondary Indexes on an XMLIndex Path Table

SELECT c.INDEX_NAME, c.COLUMN_NAME, c.COLUMN_POSITION, e.COLUMN_EXPRESSION
  FROM USER_IND_COLUMNS c LEFT OUTER JOIN USER_IND_EXPRESSIONS e
    ON (c.INDEX_NAME = e.INDEX_NAME)
  WHERE c.TABLE_NAME IN (SELECT PATH_TABLE_NAME FROM USER_XML_INDEXES
                           WHERE INDEX_NAME = 'PO_XMLINDEX_IX')
  ORDER BY c.INDEX_NAME, c.COLUMN_NAME;
 
INDEX_NAME           COLUMN_NAME  COLUMN_POSITION COLUMN_EXPRESSION
-------------------- ------------ --------------- ----------------------
API_DATE_IX          SYS_NC00008$               1 SYS_EXTRACT_UTC(SYS_XMLCONV("V
                                                  ALUE",3,8,0,0,181))
API_NUM_IX           SYS_NC00007$               1 TO_BINARY_DOUBLE("VALUE")
FN_BASED_IX          SYS_NC00006$               1 SUBSTR("VALUE",1,100)
PO_ORDER_KEY_IX      ORDER_KEY                  2
PO_ORDER_KEY_IX      RID                        1
PO_OTEXT_IX          VALUE                      1
PO_PATH_ID_IX        PATHID                     1
PO_PATH_ID_IX        RID                        2
PO_VALUE_IX          VALUE                      1
 
9 rows selected.

See Also:

Oracle Database PL/SQL Packages and Types Reference for information on PL/SQL procedures createNumberIndex and createDateIndex in package DBMS_XMLINDEX
"Oracle Text Indexes Are Used Independently of Other Indexes" for information on using Oracle Text indexes

How to Tell If XMLIndex is Used

It is at query compile time that it is determined whether or not a given XMLIndex index can be used, that is, whether the query can be rewritten into a query against the index. If it cannot be determined at compile time that an XPath expression in the query is a subset of the paths you specified to be used for indexing, then an XMLIndex index is not used. For example, if the path /PurchaseOrder/LineItems//* is included for indexing, then a query with /PurchaseOrder/LineItems/LineItem/Description can use the index, but a query with //Description cannot. The latter also matches potential Description elements that are not children of /PurchaseOrder/LineItems, and it is not possible at compile time to know if such additional Description elements will be present in the data.

To know whether a particular XMLIndex index has been used in resolving a query, you can examine an explain plan of the query. If the index is used, then its path table, order key, or path id will be referenced in the explain plan. The explain plan will not directly indicate that a domain index was used; it will not refer to the XMLIndex index by name.

See Also:

Example 5-24 shows that the XMLIndex index created in Example 5-14 is used in a particular query.

Example 5-24 Examining an Explain Plan to See If XMLIndex Is Used

The reference to MY_PATH_TABLE in the explain plan here indicates that the XMLIndex index (created in Example 5-14) was used in this query. Similarly, reference to columns LOCATOR, ORDER_KEY, and PATHID indicates the same thing.

SET AUTOTRACE ON EXPLAIN
 
SELECT XMLQuery('/PurchaseOrder/Requestor' PASSING OBJECT_VALUE RETURNING CONTENT) FROM po_clob
  WHERE XMLExists('/PurchaseOrder[Reference="SBELL-2002100912333601PDT"]' PASSING OBJECT_VALUE);
 
XMLQUERY('/PURCHASEORDER/REQUESTOR'PASSINGOBJECT_VALUERETURNINGCONTENT)
-----------------------------------------------------------------------
<Requestor>Sarah J. Bell</Requestor>
 
1 row selected.
 
 
Execution Plan
. . .
----------------------------------------------------------------------------------------------------------------
| Id  | Operation                        | Name                          | Rows  | Bytes |Cost(%CPU)| Time     |
----------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                 |                               |     1 |    24 |   15  (7)| 00:00:01 |
|   1 |  SORT GROUP BY                   |                               |     1 |  3524 |          |          |
|*  2 |   TABLE ACCESS BY INDEX ROWID    | MY_PATH_TABLE                 |     2 |  7048 |    2  (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN              | SYS55148_PO_XMLINDE_PATHID_IX |     6 |       |    1  (0)| 00:00:01 |
|   4 |  NESTED LOOPS                    |                               |     1 |    24 |   15  (7)| 00:00:01 |
|   5 |   VIEW                           | VW_SQ_1                       |     1 |    12 |   13  (0)| 00:00:01 |
|   6 |    HASH UNIQUE                   |                               |     1 |  5046 |          |          |
|   7 |     NESTED LOOPS                 |                               |       |       |          |          |
|   8 |      NESTED LOOPS                |                               |     1 |  5046 |   13  (0)| 00:00:01 |
|*  9 |       TABLE ACCESS BY INDEX ROWID| MY_PATH_TABLE                 |     1 |  3524 |   11  (0)| 00:00:01 |
|* 10 |        INDEX RANGE SCAN          | SYS55148_PO_XMLINDE_PATHID_IX |     1 |       |    2  (0)| 00:00:01 |
|* 11 |       INDEX RANGE SCAN           | SYS55148_PO_XMLINDE_PATHID_IX |     6 |       |    1  (0)| 00:00:01 |
|* 12 |      TABLE ACCESS BY INDEX ROWID | MY_PATH_TABLE                 |     1 |  1522 |    2  (0)| 00:00:01 |
|  13 |   TABLE ACCESS BY USER ROWID     | PO_CLOB                       |     1 |    12 |    1  (0)| 00:00:01 |
----------------------------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   2 - filter(SYS_XMLI_LOC_ISNODE("SYS_P0"."LOCATOR")=1)
   3 - access("SYS_P0"."PATHID"=HEXTORAW('74C39DFE')  AND "SYS_P0"."RID"=:B1)
   9 - filter("SYS_P5"."PATHID"=HEXTORAW('6F7C')  AND SYS_XMLI_LOC_ISNODE("SYS_P5"."LOCATOR")=1)
  10 - access("SYS_P5"."VALUE"='SBELL-2002100912333601PDT')
  11 - access("SYS_P2"."PATHID"=HEXTORAW('093CA37E')  AND "SYS_P5"."RID"="SYS_P2"."RID")
  12 - filter(SYS_XMLI_LOC_ISNODE("SYS_P2"."LOCATOR")=1 AND "SYS_P2"."ORDER_KEY"<"SYS_P5"."ORDER_KEY" AND
              "SYS_P5"."ORDER_KEY"<SYS_ORDERKEY_MAXCHILD("SYS_P2"."ORDER_KEY") AND
              SYS_ORDERKEY_DEPTH("SYS_P2"."ORDER_KEY")+1=SYS_ORDERKEY_DEPTH("SYS_P5"."ORDER_KEY"))
. . .

Given the name of a path table from an explain plan such as this, you can obtain the name of its XMLIndex index as shown in Example 5-25. (This is more or less opposite to the query in Example 5-15.)

Example 5-25 Obtaining the Name of an XMLIndex Index from Its Path-Table Name

SELECT INDEX_NAME FROM USER_XML_INDEXES WHERE PATH_TABLE_NAME = 'MY_PATH_TABLE';
 
INDEX_NAME
------------------------------
PO_XMLINDEX_IX
 
1 row selected.

XMLIndex can be used for XPath expressions in the SELECT list, the FROM list, and the WHERE clause of a query, and it is useful for SQL functions XMLQuery, XMLTable, XMLExists, XMLCast, extractValue, existsNode, and extract. Unlike function-based indexes (and CTXXPath indexes, which are deprecated), XMLIndex indexes can be used when you extract an XML fragment from a document. Example 5-26 illustrates this.

Example 5-26 Using XMLIndex to Extract an XML Fragment

SET AUTOTRACE ON EXPLAIN
 
SELECT li.description, li.itemno
  FROM po_clob, XMLTable('/PurchaseOrder/LineItems/LineItem'
                         PASSING OBJECT_VALUE
                         COLUMNS "DESCRIPTION" VARCHAR(40) PATH '/LineItem/Description',
                                 "ITEMNO"      INTEGER     PATH '/LineItem/@ItemNumber') li
  WHERE XMLExists('/PurchaseOrder[Reference="SBELL-2002100912333601PDT"]'
                  PASSING OBJECT_VALUE);
 
DESCRIPTION                                  ITEMNO
---------------------------------------- ----------
A Night to Remember                               1
The Unbearable Lightness Of Being                 2
Sisters                                           3
 
3 rows selected.

The explain plan for this query shows, by referring to the path table, that XMLIndex is used. It also shows the use of internal SQL function sys_orderkey_depth — see "Guidelines for Using XMLIndex".

Execution Plan
. . .
----------------------------------------------------------------------------------------------------------------
| Id  | Operation                          | Name                          | Rows  | Bytes |Cost(%CPU)| Time   |
----------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                   |                               |     1 |  1546 | 8  (13)| 00:00:01 |
|*  1 |  FILTER                            |                               |       |       |        |          |
|*  2 |   TABLE ACCESS BY INDEX ROWID      | MY_PATH_TABLE                 |     1 |  3524 | 2   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN                | SYS63727_PO_XMLINDE_PATHID_IX |     6 |       | 1   (0)| 00:00:01 |
|*  4 |  FILTER                            |                               |       |       |        |          |
|*  5 |   TABLE ACCESS BY INDEX ROWID      | MY_PATH_TABLE                 |     1 |  3524 | 2   (0)| 00:00:01 |
|*  6 |    INDEX RANGE SCAN                | SYS63727_PO_XMLINDE_PATHID_IX |     1 |       | 1   (0)| 00:00:01 |
. . .
|* 14 |         TABLE ACCESS BY INDEX ROWID| MY_PATH_TABLE                 |     1 |  3524 | 2   (0)| 00:00:01 |
|* 15 |          INDEX RANGE SCAN          | SYS63727_PO_XMLINDE_VALUE_IX  |     1 |       | 1   (0)| 00:00:01 |
|* 16 |         INDEX RANGE SCAN           | SYS63727_PO_XMLINDE_ORDKEY_IX |     1 |       | 1   (0)| 00:00:01 |
|* 17 |        TABLE ACCESS BY INDEX ROWID | MY_PATH_TABLE                 |     1 |  1522 | 2   (0)| 00:00:01 |
|  18 |     TABLE ACCESS BY USER ROWID     | PO_CLOB                       |     1 |    12 | 1   (0)| 00:00:01 |
|* 19 |    INDEX RANGE SCAN                | SYS63727_PO_XMLINDE_PATHID_IX |     6 |       | 1   (0)| 00:00:01 |
|* 20 |   TABLE ACCESS BY INDEX ROWID      | MY_PATH_TABLE                 |     1 |  1522 | 2   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   1 - filter(:B1<SYS_ORDERKEY_MAXCHILD(:B2))
   2 - filter("SYS_P3"."ORDER_KEY">:B1 AND SYS_XMLI_LOC_ISNODE("SYS_P3"."LOCATOR")=1 AND
              "SYS_P3"."ORDER_KEY"<SYS_ORDERKEY_MAXCHILD(:B2) AND SYS_ORDERKEY_DEPTH("SYS_P3"."ORDER_KEY")=
              SYS_ORDERKEY_DEPTH(:B3)+1)
   3 - access("SYS_P3"."PATHID"=HEXTORAW('54393E4C')  AND "SYS_P3"."RID"=:B1)
   4 - filter(:B1<SYS_ORDERKEY_MAXCHILD(:B2))
   5 - filter("SYS_P6"."ORDER_KEY">:B1 AND SYS_XMLI_LOC_ISNODE("SYS_P6"."LOCATOR")=1 AND
              "SYS_P6"."ORDER_KEY"<SYS_ORDERKEY_MAXCHILD(:B2) AND SYS_ORDERKEY_DEPTH("SYS_P6"."ORDER_KEY")=
              SYS_ORDERKEY_DEPTH(:B3)+1)
   6 - access("SYS_P6"."PATHID"=HEXTORAW('7DE452AA')  AND "SYS_P6"."RID"=:B1)
       filter(SYS_PATHID_IS_NMSPC("SYS_P6"."PATHID")=0)
  14 - filter("SYS_P11"."PATHID"=HEXTORAW('6F7C')  AND SYS_XMLI_LOC_ISNODE("SYS_P11"."LOCATOR")=1)
  15 - access("SYS_P11"."VALUE"='SBELL-2002100912333601PDT')
  16 - access("SYS_P11"."RID"="SYS_P8"."RID" AND "SYS_P8"."ORDER_KEY"<"SYS_P11"."ORDER_KEY")
  17 - filter("SYS_P8"."PATHID"=HEXTORAW('093CA37E')  AND SYS_XMLI_LOC_ISNODE("SYS_P8"."LOCATOR")=1)
  19 - access("SYS_P0"."PATHID"=HEXTORAW('7676FDEA')  AND "SYS_P0"."RID"="PO_CLOB".ROWID)
  20 - filter(SYS_XMLI_LOC_ISNODE("SYS_P0"."LOCATOR")=1)
. . .

Turning Off Use of XMLIndex

You can turn off the use of XMLIndex in any of these ways:

Use optimizer hint /*+ NO_XMLINDEX_REWRITE */
Use optimizer hint /*+ NO_XMLINDEX_REWRITE_IN_SELECT */
Use optimizer hint /*+ NO_XML_QUERY_REWRITE */

Each of these turns off the use of all XMLIndex indexes. In addition to turning off use of XMLIndex, NO_XML_QUERY_REWRITE turns off all XPath rewrite (XMLIndex is part of XPath rewrite).

Hint NO_XMLINDEX_REWRITE_IN_SELECT turns off the use of XMLIndex indexes only for XPath expressions in the SELECT list; XMLIndex indexes can still be used for XPath expressions in other query parts, such as a WHERE clause or a FROM clause. This hint can be especially useful with XML data that is stored as binary XML, in cases where streaming evaluation of XPath expressions in a SELECT list provides better performance than XMLIndex.

Example 5-27 shows the use of these optimizer hints.

Example 5-27 Using Optimizer Hints to Turn Off XMLIndex

SELECT /*+ NO_XMLINDEX_REWRITE */ 
  count(*) FROM po_clob WHERE existsNode(OBJECT_VALUE, '/*') = 1;

SELECT /*+ NO_XMLINDEX_REWRITE_IN_SELECT */
  extractValue(li.OBJECT_VALUE, '/LineItem/Description')
  FROM po_clob p,
       table(XMLSequence(extract(p. OBJECT_VALUE,
                                 '/PurchaseOrder/LineItems/LineItem'))) li;

SELECT /*+ NO_XMLINDEX_REWRITE_IN_SELECT */
  li.description
  FROM po_clob p,
       XMLTable('/PurchaseOrder/LineItems/LineItem' PASSING OBJECT_VALUE
                COLUMNS "DESCRIPTION" VARCHAR(40)
                PATH '/LineItem/Description') li;

SELECT /*+ NO_XML_QUERY_REWRITE */
  count(*) FROM po_clob WHERE existsNode(OBJECT_VALUE, '/*') = 1;

In each of the queries that uses hint NO_XMLINDEX_REWRITE_IN_SELECT, the XPath expression in the SELECT list does not use XMLIndex, but the XPath expression in the FROM clause, /PurchaseOrder/LineItems/LineItem, might use XMLIndex. Note that in the query that uses function XMLTable, the XPath expression that corresponds to column li.description does not appear in the SELECT list textually, but it is treated as if it does because of XPath rewrite. That is, XPath rewrite treats the XPath expression as if it were present in the SELECT list.

Note:

The NO_INDEX optimizer hint does not apply to XMLIndex.

See Also:

"Processing XMLType Methods and XML-Specific SQL Functions" for information about streaming evaluation of binary XML data

XMLIndex Path Subsetting: Specifying the Paths You Want to Index

One of the advantages of XMLIndex is that it is very general: you need not specify which XPath locations to index; you need no prior knowledge of the XPath expressions that will be queried. By default, XMLIndex indexes all possible XPath locations in your XML data.

However, if you are aware of the XPath expressions that you are most likely to query, you can narrow the focus of XMLIndex indexing and thus improve performance. Having fewer unnecessary indexes means that less space is required for indexing, which improves index maintenance during DML operations. Having fewer indexed nodes improves DDL performance, and having a smaller path table improves query performance.

You narrow the focus of indexing by pruning the set of XPath expressions (paths) corresponding to XML fragments to be indexed, specifying a subset of all possible paths. You can do this in two alternative ways:

Exclusion – Start with the default behavior of including all possible XPath expressions, and exclude some of them from indexing.
Inclusion – Start with an empty set of XPath expressions to be used in indexing, and add paths to this inclusion set.

You can specify path subsetting either when you create an XMLIndex index using CREATE INDEX or when you modify it using ALTER INDEX. In both cases, you provide the subsetting information in the PATHS parameter of the statement's PARAMETERS clause. For exclusion, you use keyword EXCLUDE. For inclusion, you use keyword INCLUDE for ALTER INDEX and no keyword for CREATE INDEX (list the paths to include). You can also specify namespace mappings for the nodes targeted by the PATHS parameter.

For ALTER INDEX, keyword INCLUDE or EXCLUDE is followed by keyword ADD or REMOVE, to indicate whether the list of paths that follows the keyword is to be added or removed from the inclusion or exclusion list. For example, this statement adds path /PurchaseOrder/Reference to the list of paths to be excluded from indexing:

ALTER INDEX po_xmlindex_ix REBUILD
  PARAMETERS ('PATHS (EXCLUDE ADD (/PurchaseOrder/Reference))');

To alter an XMLIndex index so that it includes all possible paths, you can use keyword ALL in place of an explicit list of paths – use either EXCLUDE REMOVE (ALL) or INCLUDE ADD (ALL); they are equivalent. (You cannot exclude all paths.)

See Also:

"PARAMETERS Clause for CREATE INDEX and ALTER INDEX" for the syntax of the PARAMETERS clause

Examples of XMLIndex Path Subsetting

This section presents some examples of defining XMLIndex indexes on subsets of XPath expressions.

Example 5-28 XMLIndex Path Subsetting With CREATE INDEX

CREATE INDEX po_xmlindex_ix ON po_clob (OBJECT_VALUE) INDEXTYPE IS XDB.XMLINDEX
  PARAMETERS ('PATHS (INCLUDE (/PurchaseOrder/LineItems//* 
                               /PurchaseOrder/Reference))');

This statement creates an index that indexes only top-level element PurchaseOrder and some of its children, as follows:

All LineItems elements and their descendants
All Reference elements

It does that by including the specified paths, starting with an empty set of paths to be used for the index.

Example 5-29 XMLIndex Path Subsetting With ALTER INDEX

ALTER INDEX po_xmlindex_ix REBUILD
  PARAMETERS ('PATHS (INCLUDE ADD (/PurchaseOrder/Requestor 
                                   /PurchaseOrder/Actions/Action//*))');

This statement adds two more paths to those used for indexing. These paths index element Requestor and descendants of element Action (as well as their ancestors).

Example 5-30 XMLIndex Path Subsetting Using a Namespace Prefix

If an XPath expression to be used for XMLIndex indexing uses namespace prefixes, you can use a NAMESPACE MAPPING clause to the PATHS list, to specify those prefixes. Here is an example:

CREATE INDEX po_xmlindex_ix ON po_clob (OBJECT_VALUE) INDEXTYPE IS XDB.XMLIndex
  PARAMETERS ('PATHS INCLUDE (/PurchaseOrder/LineItems//*   /PurchaseOrder/ipo:Reference)
                     NAMESPACE MAPPING (xmlns="http://xmlns.oracle.com"
                                        xmlns:ipo="http://xmlns.oracle.com/ipo"))');

XMLIndex Path-Subsetting Rules

The following rules apply to XMLIndex path subsetting:

The paths must reference only child and descendant axes, and they must test only element and attribute nodes or their names (possibly using wildcards). In particular, the paths must not involve predicates.
You cannot specify both path exclusion and path inclusion; choose one method or the other.
If an index was created using path exclusion (inclusion), then you can modify it using only path exclusion (inclusion) — index modification must either further restrict or further extend the path subset. For example, you cannot create an index that includes certain paths and subsequently modify it to exclude certain paths.

Using XMLIndex on Oracle XML DB Repository

A database administrator (DBA) can create an XMLIndex index on resources in Oracle XML DB Repository to improve querying of XML data or metadata (system-defined or user-defined).

Creating an XMLIndex Index on Repository Resources

Only a user with database role XDBADMIN can create an XMLIndex index on Oracle XML DB Repository. After creating such an index, you restrict it to those resources that will actually be queried.

You cannot index resources that contain a REF (in their contents or metadata) to a row in a view or to a row in a table that is not hierarchy-enabled. An attempt to do so will raise an error.

Follow this procedure as a database administrator to create an XMLIndex index on specific repository resources:

Create the XMLIndex index on the repository. This excludes repository paths /Resource/ACL, /Resource/RefCount, and /Resource/RCList from the index created. It creates the path table and all secondary indexes, but it does not index any resources.
```
CALL DBMS_XDB_ADMIN.CreateRepositoryXMLIndex();
```
No resources are indexed at this point. They are indexed after step 2.
Specify the repository resources to index, by repository path. You use PL/SQL procedure XMLIndexAddPath in package DBMS_XDB_ADMIN to do this.
```
DBMS_XDB_ADMIN.XMLIndexAddPath(<path to index>, <disable secondary indexes?>);
```
The first parameter to XMLIndexAddPath is a repository path expression that targets a resource (file or folder) to index. (Note: A repository path expression is not an XPath expression.) The second parameter is a Boolean value that, if TRUE, disables loading of the path-table secondary indexes during the execution of XMLIndexAddPath. Set this to TRUE only if there are few existing rows in the path table and you are adding many new rows. Disabling the secondary indexes while a path is being indexed prevents updating of the path table index, specifically, loading (populating) of the secondary indexes each time a row is added to it. This can speed up bulk loading of resources.

To drop an XMLIndex index created on the repository, do this:

CALL DBMS_XDB_ADMIN.DropRepositoryXMLIndex();

See Also:

Chapter 21, "Accessing Oracle XML DB Repository Data" for information on repository resources and resource paths
Oracle Database PL/SQL Packages and Types Reference for information on the procedures in PL/SQL package DBMS_XDB_ADMIN

Removing Repository Resources From Indexing With XMLIndex

Follow this procedure as a database administrator to remove a resource from indexing:

Repeat this procedure for each ancestor of the resource, to first remove them from indexing. You cannot remove a resource from indexing until you have first removed all of its ancestors from indexing.
Specify the repository resource to be removed from indexing, by repository path. You use PL/SQL procedure XMLIndexRemovePath in package DBMS_XDB_ADMIN to do this.
```
DBMS_XDB_ADMIN.XMLIndexRemovePath(<path to index>, <recursively?>);
```
The first parameter to XMLIndexRemovePath is a repository path expression that targets a resource to remove from indexing. The second parameter is a Boolean value that, if TRUE, removes the targeted resource and all of its descendants from indexing; if FALSE, only the targeted resource is removed from indexing.

See Also:

Oracle Database PL/SQL Packages and Types Reference for information on PL/SQL procedure DBMS_XDB_ADMIN.XMLIndexREmovePath

Querying Repository Data and Metadata Indexed With XMLIndex

For an XMLIndex on repository data or metadata to be used when you query the repository, all of the targeted resources must be indexed. Use SQL functions under_path and equals_path to target resources by repository path when you query. If you do not use either of these functions, then under_path('/') is implied.

Example 5-31 presents a query of resource data, PurchaseOrder elements whose Reference element has value TFOX-20021009123335520PDT.

Example 5-31 Using XMLIndex When Querying Resource Data

SELECT ANY_PATH FROM RESOURCE_VIEW
  WHERE XMLExists('declare namespace r = "http://xmlns.oracle.com/xdb/XDBResource.xsd"; (: :)
                   /r:Resource/r:Contents/PurchaseOrder[Reference="TFOX-20021009123335520PDT"]'
                  PASSING RES)
    AND under_path (RES, '/home/OE/PurchaseOrders/2002/') = 1;
 
ANY_PATH
--------------------------------------------------------------
/home/OE/PurchaseOrders/2002/Jan/TFOX-20021009123335520PDT.xml
 
1 row selected.

The under_path expression here limits the paths queried so that they include only data indexed using XMLIndex. The existsNode expression targets XML nodes using an XPath expression.

You can query repository metadata similarly. To do that, simply target resource elements that represent metadata, either system-defined or user-defined. Example 5-32 shows a query of the CreationDate element in system metadata.

Example 5-32 Using XMLIndex When Querying Resource Metadata

SELECT ANY_PATH FROM RESOURCE_VIEW
  WHERE XMLExists('declare namespace r = "http://xmlns.oracle.com/xdb/XDBResource.xsd"; (: :)
                   /r:Resource/r:CreationDate/text()=xs:dateTime("2005-02-07T18:31:53.093179")'
                  PASSING RES)
   AND under_path (RES, '/home/OE/PurchaseOrders/2002/') = 1;
 
ANY_PATH
-----------------------------------------------------------------
/home/OE/PurchaseOrders/2002/Apr
/home/OE/PurchaseOrders/2002/Apr/AMCEWEN-20021009123336171PDT.xml
/home/OE/PurchaseOrders/2002/Apr/AMCEWEN-20021009123336271PDT.xml
/home/OE/PurchaseOrders/2002/Apr/EABEL-20021009123336251PDT.xml
/home/OE/PurchaseOrders/2002/Apr/PTUCKER-20021009123336191PDT.xml
/home/OE/PurchaseOrders/2002/Apr/PTUCKER-20021009123336291PDT.xml
. . .

144 rows selected.

Dropping an XMLIndex Index on Created on Repository Resources

Only a user with database role XDBADMIN can drop an XMLIndex index created on Oracle XML DB Repository. To drop an XMLIndex index created on the repository, do this:

CALL DBMS_XDB_ADMIN.DropRepositoryXMLIndex();

See Also:

Oracle Database PL/SQL Packages and Types Reference for information on PL/SQL procedure DBMS_XDB_ADMIN.DropRepositoryXMLIndex

XMLIndex Parallelism

You can use a PARALLEL clause (with optional degree) when creating or altering an XMLIndex index, to ensure that the index creation and maintenance are carried out in parallel. This can improve the performance of index creation and maintenance. It can also consume more storage, because storage parameters apply separately to each query server process. For example, an index created with an INITIAL value of 5M and a parallelism degree of 12 consumes at least 60M of storage during index creation.

The syntax for the parallelism clause for CREATE INDEX and ALTER INDEX is the same as for other domain indexes:

{ NOPARALLEL | PARALLEL [ integer ] }

Example 5-33 creates an XMLIndex index with a parallelism degree of 10.

Example 5-33 Creating an XMLIndex Index in Parallel

CREATE INDEX po_xmlindex_ix ON sale_info (sale_po_clob) INDEXTYPE IS XDB.XMLIndex 
  PARALLEL 10;

In Example 5-33, the path table and the secondary indexes are created with the same parallelism degree as the XMLIndex index itself, 10, by inheritance. You can specify different parallelism degrees for these by using separate PARALLEL clauses. Example 5-34 demonstrates this.

Example 5-34 Using Different PARALLEL Degrees for XMLIndex Internal Objects

CREATE INDEX po_xmlindex_ix ON sale_info (sale_po_clob) INDEXTYPE IS XDB.XMLIndex 
  PARAMETERS ('PATH TABLE po_path_table (PARALLEL 10)
               PATH ID INDEX po_pathid_ix
               ORDER KEY INDEX po_orderkey_ix (PARALLEL 5)') NOPARALLEL;

In Example 5-34, the XMLIndex index itself is created serially, because of NOPARALLEL. The secondary index on path-table column PATHID is also populated serially, because no parallelism is specified explicitly for it; it inherits the parallelism of the XMLIndex index. The path table itself will be created with a parallelism degree of 10, and the secondary index on path-table column ORDER_KEY will be populated with a degree of 5, due to their explicit parallelism specifications.

Any parallelism you specify for an XMLIndex index, its path table, or its secondary indexes is exploited during subsequent DML operations and queries.

See Also:

Oracle Database SQL Language Reference for information on the CREATE INDEX parallel clause
"PARAMETERS Clause for CREATE INDEX and ALTER INDEX" for the syntax of the PARAMETERS clause

Asynchronous (Deferred) Maintenance of XMLIndex Indexes

By default, XMLIndex indexing is updated (maintained) at each DML operation, so that it remains in sync with the base table. In some situations, you might not require this, and using possibly stale indexes might be acceptable. In that use case, you can decide to defer the cost of index maintenance, performing at commit time only or at some time when database load is reduced. This can improve DML performance. It can also improve index maintenance performance by enabling bulk loading of unsynchronized index rows when an index is synchronized.

Using a stale index has no effect, other than performance, on DML operations. It can have an effect on query results, however: If the index is not up-to-date at query time, then the query results might not be up-to-date either. Even if only one column of a base table is of data type XMLType, all queries on that table reflect the database data as of the last synchronization of the XMLIndex index on the XMLType column.

You can specify index maintenance deferment using the parameters clause of a CREATE INDEX or ALTER INDEX statement.

Be aware that even if you defer synchronization for an XMLIndex index, the following database operations will automatically synchronize the index:

Any DDL operation on the index – ALTER INDEX or creation of secondary indexes
Any DDL operation on the base table – ALTER TABLE or creation of another index

Table 5-4 lists the synchronization options and the ASYNC clause syntax you use to specify them. The ASYNC clause is used in the PARAMETERS clause of a CREATE INDEX or ALTER INDEX statement for XMLIndex.

Table 5-4 Index Synchronization

When to Synchronize	ASYNC Clause Syntax
Always	`ASYNC (ALWAYS)` This is the default behavior. You can specify it explicitly, to cancel a previous `ASYNC` specification.
Upon commit	`ASYNC (ON COMMIT)`
Periodically	`ASYNC (EVERY "repeat_interval")` `repeat_interval` is the same as for the calendaring syntax of `DBMS_SCHEDULER` To use `EVERY`, you must have the `CREATE JOB` privilege.
Manually, on demand	`ASYNC (MANUAL)` You can manually synchronize the index using PL/SQL procedure `DBMS_XMLINDEX.SyncIndex`.

Optional ASYNC syntax parameter STALE is intended for possible future use; you need never specify it explicitly. It has value FALSE whenever ASYNC is ALWAYS; otherwise it has value TRUE. Specifying an explicit STALE value that contradicts this rule raises an error.

Example 5-35 creates an XMLIndex index that is synchronized every Monday at 3:00 pm, starting tomorrow.

Example 5-35 Specifying Deferred Synchronization for XMLIndex

CREATE INDEX po_xmlindex_ix ON po_clob (OBJECT_VALUE) INDEXTYPE IS XDB.XMLIndex
  PARAMETERS ('ASYNC (SYNC EVERY "FREQ=HOURLY; INTERVAL = 1")');

Example 5-36 manually synchronizes the index created in Example 5-35.

Example 5-36 Manually Synchronizing an XMLIndex Index Using SYNCINDEX

EXEC DBMS_XMLINDEX.SyncIndex('OE', 'PO_XMLINDEX_IX');

When XMLIndex index synchronization is deferred, all DML changes (inserts, updates, deletions) made to the base table since the last index synchronization are recorded in a table, one row per DML operation. The name of this table is the value of column PEND_TABLE_NAME of static public views USER_XML_INDEXES, ALL_XML_INDEXES, and DBA_XML_INDEXES.

You can examine this table to determine when synchronization might be appropriate for a given XMLIndex index. The more rows, the more the index is likely to be in need of synchronization.

See Also:

Oracle Database PL/SQL Packages and Types Reference, section "Calendaring Syntax", for the syntax of repeat_interval
Oracle Database PL/SQL Packages and Types Reference for information on PL/SQL procedure DBMS_XMLINDEX.SyncIndex

Collecting Statistics on XMLIndex Objects For the Cost-Based Optimizer

The Oracle Database cost-based optimizer determines how to most cost-effectively evaluate a given query, including which indexes, if any, to use. For it to be able to do this accurately, you must collect statistics on various database objects.

For XMLIndex, you normally need to collect statistics on only the base table on which the XMLIndex index is defined (using, for example, procedure DBMS_STATS.gather_table_stats). This automatically collects statistics for the XMLIndex index itself, as well as the path table and its secondary indexes. If you delete statistics on the base table (using procedure DBMS_STATS.delete_table_stats), then statistics on the other objects are also deleted. Similarly, if you collect statistics on the XMLIndex index (using procedure DBMS_STATS.gather_index_stats), then statistics are also collected on the path table and its secondary indexes.

Example 5-37 collects statistics on the base table po_clob. Statistics are automatically collected on the XMLIndex index, its path table, and the secondary path-table indexes.

Example 5-37 Automatic Collection of Statistics on XMLIndex Objects

CALL DBMS_STATS.gather_table_stats(USER, 'PO_CLOB');

Guidelines for Using XMLIndex

The following are some guidelines for using XMLIndex. These guidelines are applicable only when the two alternatives discussed return the same result set.

Avoid prefixing // with ancestor elements. For example, use //c, not /a/b//c, provided these return the same result set.
Avoid prefixing /* with ancestor elements. For example, use /*/*/*, not /a/*/*, provided these return the same result set.
Be aware that if an XPath expression indicates that the full XMLType table or column is needed anyway, then an XMLIndex will not be used, since it would not improve performance. This is the case, for instance, if you access the document root (/), as in this query:
SELECT COUNT(*) FROM po_clob WHERE existsNode(OBJECT_VALUE, '/') = 1;
When you expect a single result, use extractValue instead of extract plus method getStringVal(), so that an index on column VALUE of the path table can be used. (You must thus provide keyword VALUE in the PARAMETERS clause when you create the XMLIndex index.)

Use count(*), not count(extractValue(…)) in a SELECT clause, when possible. For example, if you know that a LineItem element in a purchase-order document has only one Description child, use this:

SELECT count(*) FROM po_clob, XMLTable('//LineItem' PASSING OBJECT_VALUE);

Do not use this:

SELECT count(li.value)
 FROM po_clob p, XMLTable('//LineItem' PASSING p.OBJECT_VALUE
                          COLUMNS value VARCHAR2(30) PATH '/LineItem/Description') li;

Reduce the number of XPath expressions used in a query FROM list as much as possible. For example, use this:

SELECT li.description
  FROM po_clob p,
       XMLTable('PurchaseOrder/LineItems/LineItem' PASSING p.OBJECT_VALUE
                COLUMNS description VARCHAR2(256)
                PATH '/LineItem/Description') li;

Do not use this:

SELECT li.description
  FROM po_clob p,
       XMLTable('PurchaseOrder/LineItems' PASSING p.OBJECT_VALUE) ls,
       XMLTable('LineItems/LineItem'      PASSING ls.OBJECT_VALUE
                COLUMNS description VARCHAR2(256)
                PATH '/LineItem/Description') li;

If you use an XPath expression in a query to drill down inside a virtual table (created, for example, using SQL function XMLTable), then create a secondary index on the order key of the path table using SQL function sys_orderkey_depth. Here is an example of such a query; the selection navigates to element Description inside virtual line-item table li.
```
SELECT li.description
  FROM po_clob p,
       XMLTable('PurchaseOrder/LineItems/LineItem' PASSING p.OBJECT_VALUE
                COLUMNS description VARCHAR2(256)
                PATH '/LineItem/Description') li;
```
Such queries are evaluated using function sys_orderkey_depth, which returns the depth of the order-key value. Because the order index uses two columns, the index needed is a composite index over columns ORDER_KEY and RID, as well as over function sys_orderkey_depth applied to the ORDER_KEY value. For example:
```
CREATE INDEX depth_ix ON my_path_table
  (RID, sys_orderkey_depth(ORDER_KEY), ORDER_KEY);
```
See Also:
Example 5-26 for an example that shows the use of sys_orderkey_depth

PARAMETERS Clause for CREATE INDEX and ALTER INDEX

This section presents the syntax for the PARAMETERS clause of SQL statements CREATE INDEX and ALTER INDEX for use with XMLIndex.

PARAMETERS Clause Syntax for CREATE INDEX and ALTER INDEX

The PARAMETERS clause is PARAMETERS ('XMLIndex_parameters'), where XMLIndex_parameters is one or more repetitions of XMLIndex_parameter_clause:

XMLIndex_parameters ::=

Description of the illustration xmlindex_parameters.gif

XMLIndex_parameter_clause ::=

Description of the illustration xmlindex_parameter_clause.gif

create_index_paths_clause ::=

Description of the illustration create_index_paths_clause.gif

alter_index_paths_clause ::=

Description of the illustration alter_index_paths_clause.gif

namespace_mapping_clause ::=

Description of the illustration namespace_mapping_clause.gif

path_table_clause ::=

Description of the illustration path_table_clause.gif

path_id_clause ::=

Description of the illustration path_id_clause.gif

order_key_clause ::=

Description of the illustration order_key_clause.gif

xml_index_value_clause ::=

Description of the illustration xmlindex_value_clause.gif

Usage of XMLIndex_parameters

There can be at most one XMLIndex_parameter_clause of each type in any XMLIndex_parameters clause. For example, there can be at most one PATHS clause, at most one path_table_clause, and so on.

Usage of XMLIndex_parameter_clause for ALTER INDEX

Clause XMLIndex_parameter_clause can be used with ALTER INDEX only at the table level, and only to add and remove paths.

Usage of PATHS Clause

The following considerations apply to using the PATHS clause.

There can be at most one PATHS clause in a CREATE INDEX statement. That is, there can be at most one occurrence of PATHS followed by create_index_paths_clause.
Clause create_index_paths_clause is used only with CREATE INDEX; alter_index_paths_clause is used only with ALTER INDEX.

Usage of create_index_paths_clause and alter_index_paths_clause

The following considerations apply to using create_index_paths_clause and alter_index_paths_clause.

The INDEX_ALL_PATHS clause rebuilds the index with all paths included.
An explicit list of paths to index can include wildcards and //.
XPaths_list is a list of one or more XPath expressions, each of which includes only child axis, descendant axis, name test, and wildcard (*) constructs.
If XPaths_list is omitted, then everything in all documents is indexed.
For each unique namespace prefix that is used in an XPath expression in XPaths_list, a standard XML namespace declaration is needed, to provide the corresponding namespace information.
You can change an index in ways that are not reflected directly in the syntax by dropping it and then creating it again as needed. For example, to change an index that was defined by including paths to one that is defined by excluding paths, drop it and then create it using EXCLUDE.

Usage of xml_index_value_clause

The following considerations apply to using xml_index_value_clause.

Column VALUE is created as VARCHAR2(4000).
If clause xml_index_value_clause consists only of the keyword VALUE, then the value index is created with the usual default attributes.
If clause path_id_clause consists only of the keywords PATH ID, then the path-id index is created with the usual default attributes.
If clause order_key_clause consists only of the keywords ORDER KEY, then the order-key index is created with the usual default attributes.

Usage of ASYNC Clause

The following considerations apply to using the ASYNC clause.

ALWAYS means automatic synchronization occurs for each DML statement.
MANUAL means no automatic synchronization occurs. You must manually synchronize the index using DBMS_XMLINDEX.SyncIndex.
EVERY repeat_interval means automatically synchronize the index at interval repeat_interval. The syntax of repeat_interval is the same as that for PL/SQL package DBMS_SCHEDULER, and it must be enclosed in double-quotes ("). To use EVERY you must have the CREATE JOB privilege.
ON COMMIT means synchronize the index immediately after a commit operation. The commit does not return until the synchronization is complete. Since the synchronization is performed as a separate transaction, there can be a short period when the data is committed but index changes are not yet committed.
STALE is optional. A value of TRUE means that query results might be stale; a value of FALSE means that query results are always up-to-date. The default value, and the only permitted explicitly specified value, is as follows.
- For ALWAYS, STALE is TRUE.
- For any other ASYNC option besides ALWAYS, STALE is FALSE.

See Also:

Oracle Database SQL Language Reference for the syntax of index_attributes
Oracle Database SQL Language Reference for the syntax of segment_attributes_clause
Oracle Database SQL Language Reference for the syntax of table_properties
Oracle Database SQL Language Reference for the syntax of parallel_clause
Oracle Database SQL Language Reference for additional information about the syntax and semantics of CREATE INDEX
Oracle Database SQL Language Reference for additional information about the syntax and semantics of ALTER INDEX
Oracle Database PL/SQL Packages and Types Reference, section "Calendaring Syntax", for the syntax of repeat_interval

Oracle Text Indexes on XML Data

You can create an Oracle Text index on an XMLType column. An Oracle Text CONTEXT index enables SQL function contains for full-text search over XML. With structured storage, XPath rewrite can often rewrite XPath function ora:contains to SQL function contains, so in those cases too an Oracle Text index can be employed.

See Also:

Chapter 11, "Full-Text Search Over XML Data" for more information about using Oracle Text operations with Oracle XML DB

Creating and Using Oracle Text Indexes

To create an Oracle Text index, use CREATE INDEX, specifying the INDEXTYPE as CTXSYS.CONTEXT, as illustrated in Example 5-38.

Example 5-38 Creating an Oracle Text Index

CREATE INDEX po_otext_ix ON po_clob (OBJECT_VALUE) INDEXTYPE IS CTXSYS.CONTEXT;
 
Index created.

You can also perform Oracle Text operations such as contains and score on XMLType columns. Example 5-39 shows an Oracle Text search using SQL function contains.

Example 5-39 Searching XML Data Using SQL Function CONTAINS

SELECT DISTINCT extractValue(OBJECT_VALUE, 
                             '/PurchaseOrder/ShippingInstructions/address') "Address"
  FROM po_clob
  WHERE contains(OBJECT_VALUE,
                 '$(Fortieth) INPATH (PurchaseOrder/ShippingInstructions/address)') > 0;
 
Address
------------------------------
1200 East Forty Seventh Avenue
New York
NY
10024
USA
 
1 row selected.

The explain plan for this query shows, in two ways, that the Oracle Text CONTEXT index is used: it references the index explicitly, as a domain index, and it refers to SQL function contains in the predicate information.

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------
Plan hash value: 274475732
 
--------------------------------------------------------------------------------------------
| Id  | Operation                    | Name        | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |             |     7 | 14098 |    10  (10)| 00:00:01 |
|   1 |  HASH UNIQUE                 |             |     7 | 14098 |    10  (10)| 00:00:01 |
|   2 |   TABLE ACCESS BY INDEX ROWID| PO_CLOB     |     7 | 14098 |     9   (0)| 00:00:01 |
|*  3 |    DOMAIN INDEX              | PO_OTEXT_IX |       |       |     4   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("CTXSYS"."CONTAINS"(SYS_MAKEXML('2B0A2483AB140B35E040578C8A173FEC',523
              3,"XMLDATA"),'$(Fortieth) INPATH (PurchaseOrder/ShippingInstructions/address)')>0)
 
20 rows selected.

Oracle Text Indexes Are Used Independently of Other Indexes

Oracle Text indexing is completely orthogonal to the other types of indexing described in this chapter. Whenever SQL function contains or XPath function ora:contains is used, an Oracle Text index can be used for full-text search.

Example 5-40 demonstrates this in the case where both an XMLIndex index and an Oracle Text index are defined on the same XML data. The query is the same as in Example 5-39. The Oracle Text index is created on the VALUE column of the XMLIndex path table of Example 5-14.

Example 5-40 Using an Oracle Text Index and an XMLIndex Index

CREATE INDEX po_otext_ix ON my_path_table (VALUE) INDEXTYPE IS CTXSYS.CONTEXT;
 
Index created.

EXPLAIN PLAN FOR
  SELECT DISTINCT extractValue(OBJECT_VALUE, '/PurchaseOrder/ShippingInstructions/address') "Address"
    FROM po_clob
    WHERE contains(OBJECT_VALUE, '$(Fortieth) INPATH (PurchaseOrder/ShippingInstructions/address)') > 0;
 
Explained.
 
--
SET ECHO OFF;
 
PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------------------------------------
Plan hash value: 2664483039
 
-------------------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name                          | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                               |     1 |  2014 |     3  (34)| 00:00:01 |
|*  1 |  TABLE ACCESS BY INDEX ROWID| MY_PATH_TABLE                 |     1 |  3524 |     1   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | SYS78942_PO_XMLINDE_ORDKEY_IX |     1 |       |     2   (0)| 00:00:01 |
|   3 |  HASH UNIQUE                |                               |     1 |  2014 |     3  (34)| 00:00:01 |
|*  4 |   TABLE ACCESS FULL         | PO_CLOB                       |     1 |  2014 |     2   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   1 - filter("SYS_P0"."PATHID"=HEXTORAW('35EF580A')  AND SYS_XMLI_LOC_ISNODE("SYS_P0"."LOCATOR")=1)
   2 - access("SYS_P0"."RID"=:B1)
       filter("SYS_P0"."RID"=:B1)
   4 - filter("CTXSYS"."CONTAINS"(SYS_MAKEXML("PO_CLOB"."XMLDATA"),'$(Fortieth) INPATH
              (PurchaseOrder/ShippingInstructions/address)')>0)
 
Note
-----
   - dynamic sampling used for this statement
 
24 rows selected.

The explain plan in Example 5-40 references both the XMLIndex index and the Oracle Text index, indicating that both are used.

The XMLIndex index is indicated by its path table, MY_PATH_TABLE, and its order-key index, SYS78942_PO_XMLINDE_ORDKEY_IX.
The Oracle Text index is indicated by the reference to SQL function contains in the predicate information.

Footnote Legend

Footnote 1: The actual path table implementation may be slightly different.