Oracle® XML DB Developer's Guide 11g Release 1 (11.1) Part Number B28369-01 |
|
|
View PDF |
You can create indexes on your XML data, to focus on particular parts of it that you query often, and thus improve performance. This chapter includes guidelines for doing this. It describes various ways that you can index XMLType
data, whether schema-based or non-schema-based, and regardless of the XMLType
storage model you use (binary XML, unstructured, hybrid, or structured).
This chapter contains these topics:
Note:
The explain plans shown here are for illustration only. If you run the examples presented here in your environment your explain plans might not be identical.See Also:
Oracle Database Concepts for an overview of indexing
Oracle Database Advanced Application Developer's Guide for information about using indexes in application development
Table 5-1 identifies the documentation for some user tasks involving indexes.
Table 5-1 Oracle XML DB Tasks Involving Indexes
For information about how to... | See... |
---|---|
Create a function-based index on unstructured |
"Creating Function-Based Indexes on Unstructured XMLType Tables and Columns" |
Create a function-based index on structured |
"Creating Function-Based Indexes on Structured XMLType Tables and Columns" |
Create B-tree indexes on objects underlying structured |
"Using Indexes to Tune Queries on Collections Stored as OCTs" |
Create an |
|
Name the path table when creating an |
|
Specify storage options when creating an |
|
Obtain the name of an |
|
Rename an XMLIndex index |
|
Drop an XMLIndex index |
|
Show all existing secondary indexes on an |
|
Obtain the name of a path table for an |
|
Obtain the name of an |
|
Create a secondary index on an |
"Creating Additional Secondary Indexes on an XMLIndex Path Table" |
Create a function-based index on a path-table |
|
Create a numeric index on a path-table |
|
Create a date index on a path-table |
|
Create an Oracle Text |
|
Show whether an |
"How to Tell If XMLIndex is Used" |
Turn off use of an |
|
Extract an XML fragment using |
|
Exclude or include particular XPath expressions from use by an |
"XMLIndex Path Subsetting: Specifying the Paths You Want to Index" |
Specify namespace prefixes for XPath expressions used for |
"XMLIndex Path Subsetting: Specifying the Paths You Want to Index" |
Create an |
"Using XMLIndex on Oracle XML DB Repository" |
Query Oracle XML DB Repository using |
"Using XMLIndex on Oracle XML DB Repository" |
Specify that an |
|
Change the parallelism of an |
|
Schedule maintenance for an |
"Asynchronous (Deferred) Maintenance of XMLIndex Indexes" |
Manually synchronize an |
"Asynchronous (Deferred) Maintenance of XMLIndex Indexes" |
Collect statistics on a table or index for the cost-based optimizer |
|
Create an Oracle Text |
|
Use an Oracle Text |
|
Show whether an Oracle Text |
|
Database indexes improve performance by providing faster access to table data. The use of indexes is particularly recommended for online transaction processing (OLTP) environments involving few updates.
You can create indexes on one or more table columns, or on a functional expression. XML data, however, has its own, fine-grained structure, which is not necessarily reflected in the structure of the database tables used to store it. For this reason, effectively indexing XML data can be a bit different from indexing most database data.
For structured XML storage, XML objects such as elements and attributes correspond to object-relational columns and tables, so creating B-tree indexes on those columns and tables provides an excellent way to effectively index the corresponding XML objects. Here, the storage model directly reflects the fine-grained structure of the XML data, so there is no special problem for indexing structured XML data.
For unstructured and hybrid XML storage, indexing a database column using the standard sorts of index (B-tree, bitmap) is generally not helpful for accessing particular parts of an XML document. If an XMLType
column that contains an XML document is stored as a CLOB
instance, then the details within that document are inaccessible to the column index — the entire document acts as a single unit as far as the column index is concerned. In hybrid storage, part of an XML document is broken up and stored object-relationally (structured storage), but one or more XML fragments are stored as CLOB
instances (unstructured storage). A typical use case here is mapping an XML-schema complexType
or a complex element to CLOB
storage, because the entire fragment is generally accessed as a unit. For standard indexes, it acts as a unit for indexing as well.
XMLIndex
provides a general, XML-specific index that indexes the internal structure of XML data. One of its main purposes is to overcome the indexing limitation presented by unstructured and hybrid storage of XML data, that is, CLOB
storage. It does this by indexing the XML tags of your document and identifying document fragments based on XPath expressions that target them. It can also index scalar node values, to provide quick lookup based on individual values or ranges of values. It also records document hierarchy information for each node it indexes: relations parent–child, ancestor–descendant, and sibling.
See Also:
"XMLIndex"In addition to XMLIndex
, you can use function-based indexes and Oracle Text indexes with XML data. In releases prior to Oracle Database 11g Release 1 (11.1), CTXXPath
indexes are also sometimes appropriate for use with XML data.
In many cases where an XPath expression targets a singleton node, a function-based index can be effective in increasing access performance. In particular, SQL functions XMLQuery
, XMLExists
, XMLCast
, extract
, extractValue
, and existsNode
are useful candidates for this kind of index on XML data. If a functional expression in a query WHERE
clause matches a function-based index, then access to the corresponding data can sometimes be faster even than that provided by a more general XMLIndex
index. In addition, for structured storage, defining an index based on SQL function extractValue
often leads, by XPath rewrite, to automatic creation of (B-tree) indexes on the underlying objects. In this case also, the XPath target must be a singleton element or attribute.
See Also:
"Function-Based Indexes on XMLType Data"Besides accessing XML nodes such as elements and attributes, it is sometimes important to provide fast access to particular passages of text within XML text nodes. This is the purpose of Oracle Text indexes: they index full-text strings. An Oracle Text CONTEXT
index enables SQL function contains
for full-text search over XML. With structured storage, XPath rewrite can often rewrite XPath function ora:contains
to SQL function contains
, so in those cases too an Oracle Text index can be employed. Full-text indexing is particularly useful for document-centric applications, which often contain a mix of XML elements and text-node content. Full-text searching can often be made more powerful, more focused, by combining it with structural XML searching, that is, by restricting it to certain parts of an XML document, which are identified by using XPath expressions.
See Also:
"Oracle Text Indexes on XML Data"Another type of index that is available for indexing XML data, CTXXPath
, is deprecated, starting with Oracle Database 11g Release 1 (11.1). It has been superseded by XMLIndex
, and it is made available only for use with older database releases. It cannot help in extracting an XML fragment, and it acts only as a preliminary filter for equality predicates; after such filtering, XPath expressions are evaluated functionally (that is, without the benefit of XPath rewrite).
Note:
TheCTXSYS.CTXXPath
index is deprecated in Oracle Database 11g Release 1 (11.1). The functionality that was provided by CTXXPath
is now provided by XMLIndex
.
Oracle recommends that you replace CTXXPath
indexes with XMLIndex
indexes. The intention is that CTXXPath
will no longer be supported in a future release of the database.
Which indexes are used when more than one might apply in a given case? Cost-based optimization determines the index or indexes to use, so that performance is maximized. Oracle Text indexes apply only to text, which, for XML data, means text nodes. Whenever text nodes are targeted and a corresponding Oracle Text index is defined, it is used. If other indexes are also appropriate in a particular context, then they can be used as well. However, just because an index is defined and might appear applicable in a given situation does not mean that it will be used — it will not be used if the cost-based optimizer deems that its use would not be cost-effective.
You can create a function-based index on an XMLType
table or column, whether the data is XML schema-based or not, and whether the XMLType
storage is structured, unstructured, or binary XML. A function-based index is created by evaluating the specified functions for each row in the target table or column and storing the value in the index. You can create a function-based index as either a B-tree index or a bitmap index.
If a function-based index is defined on XML data that is not managed using structured storage, then the index is created by invoking the function on the XML content and indexing the result. (This is not necessarily what happens when you try to create a function-based index on XML data in structured storage — see "Creating Function-Based Indexes on Structured XMLType Tables and Columns".)
Example 5-1 shows the creation of an index based on SQL function extractValue
, where the XML data is in unstructured (CLOB
) storage. The data is first copied to CLOB
-based table po_clob
from structured-storage table purchaseorder
of standard database schema OE
. Both of these tables will be used in examples throughout this chapter.
Example 5-1 Creating a Function-Based Index on a CLOB XMLType Instance
The CREATE INDEX
statement in this example creates a function-based index on the value of the text node of element Reference
. This index enforces the uniqueness constraint on the text-node value.
CREATE TABLE po_clob OF XMLType XMLTYPE STORE AS CLOB ELEMENT "http://localhost:8080/source/schemas/poSource/xsd/purchaseOrder.xsd#PurchaseOrder"; Table created. INSERT INTO po_clob SELECT OBJECT_VALUE FROM OE.purchaseorder; 132 rows created. CREATE UNIQUE INDEX po_fn_based_ix ON po_clob (extractValue(OBJECT_VALUE, '/PurchaseOrder/Reference')); Index created. INSERT INTO po_clob VALUES (XMLType(bfilename('XMLDIR', 'EABEL-20021009123335791PDT.xml'), nls_charset_id('AL32UTF8'))); INSERT INTO po_clob * ERROR at line 1: ORA-00001: unique constraint (OE.PO_FN_BASED_IX) violated
The cost-based optimizer considers using a function-based index only when the function that is included in the WHERE
clause is identical to the function that was used to create the index.
To see this, consider the queries and explain plans in Example 5-2. These queries each find a purchase-order document based on the text node of element Reference
. The first query, which uses function existsNode
to locate the document, does not use the index created in Example 5-1; the second query, which uses function extractValue
, does use the index. This is because the index was created using extractValue
.
Example 5-2 Function-Based Index Is Used Only by a Matching Query
EXPLAIN PLAN FOR SELECT OBJECT_VALUE FROM po_clob WHERE existsNode(OBJECT_VALUE, '/PurchaseOrder[Reference="EABEL-20021009123335791PDT"') = 1; Explained. -- SET ECHO OFF PLAN_TABLE_OUTPUT ----------------------------------------------------------------------------- Plan hash value: 2803800196 ----------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ----------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 42 | 84084 | 23 (27)| 00:00:01 | |* 1 | TABLE ACCESS FULL| PO_CLOB | 42 | 84084 | 23 (27)| 00:00:01 | ----------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter(EXISTSNODE(SYS_MAKEXML('0BAD982DA615296BE040578CB00B1198', 3460,"PO_CLOB"."XMLDATA"),'/PurchaseOrder[Reference="EABEL-2002100912333 5791PDT"')=1) 15 rows selected. EXPLAIN PLAN FOR SELECT OBJECT_VALUE FROM po_clob WHERE extractValue(OBJECT_VALUE, '/PurchaseOrder/Reference') = 'EABEL-20021009123335791PDT'; Explained. SET ECHO OFF PLAN_TABLE_OUTPUT ---------------------------------------------------------------------------------------------- Plan hash value: 2594805861 ---------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ---------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 2002 | 1 (0)| 00:00:01 | | 1 | TABLE ACCESS BY INDEX ROWID| PO_CLOB | 1 | 2002 | 1 (0)| 00:00:01 | |* 2 | INDEX UNIQUE SCAN | PO_FN_BASED_IX | 1 | | 0 (0)| 00:00:01 | ---------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - access(EXTRACTVALUE(SYS_MAKEXML('0BAD982DA615296BE040578CB00B1198',3460,"XMLDAT A"),'/PurchaseOrder/Reference')='EABEL-20021009123335791PDT') Note ----- - dynamic sampling used for this statement 19 rows selected.
When you use structured XMLType
storage, there are a few things to be aware of when creating a function-based index:
The element or attribute being targeted by the function must be a singleton, that is, it must occur only once in the XML Document — a function-based index must not target a collection. See "No XPath Rewrite for EXTRACTVALUE Applied to a Collection".
If the function being indexed is extractValue
, then Oracle XML DB tries to rewrite your function-based CREATE INDEX
statement to a different CREATE INDEX
statement that is not function-based and does not use the XPath-expression argument as such. Instead, the resulting index is a B-tree index that operates directly on the underlying objects. This B-tree index takes the place of the function-based index that you specify. See "XPath Rewrite for EXTRACTVALUE Indexes on Singleton Elements or Attributes".
If Oracle XML DB cannot rewrite the CREATE INDEX
statement, then a function-based index is created, just as for unstructured storage. This is what happens if the function is not extractValue
, and it can also happen in a few cases for extractValue
.
Example 5-3 shows a CREATE INDEX
statement for a structured XMLType
table, purchaseorder
, in sample database schema OE
. This statement tries to create a function-based index using SQL function extractValue
, and it targets a singleton XML element, Reference
— each purchase-order document has a single Reference
element.
Example 5-3 CREATE INDEX with EXTRACTVALUE on a Singleton Element or Attribute
CREATE INDEX po_fn_based_ix ON purchaseorder (extractValue(OBJECT_VALUE, '/PurchaseOrder/Reference')); Index created.
Oracle XML DB rewrites this CREATE INDEX
statement into the statement in Example 5-4. The index created is a B-tree index on the underlying object-relational columns. To see this, you can look at the COLUMN_NAME
column for the index in table USER_IND_COLUMNS
. This shows that the underlying object-relational column REFERENCE
is used.
SELECT INDEX_NAME, TABLE_NAME, COLUMN_NAME FROM USER_IND_COLUMNS
WHERE INDEX_NAME = 'PO_FN_BASED_IX' AND TABLE_NAME = 'PURCHASEORDER';
INDEX_NAME TABLE_NAME COLUMN_NAME
---------------- ---------------- ----------------------
PO_FN_BASED_IX PURCHASEORDER "XMLDATA"."REFERENCE"
1 row selected.
Example 5-4 XPath Rewrite of an EXTRACTVALUE Index on a Singleton Element or Attribute
CREATE INDEX po_fn_based_ix ON purchaseorder p (p."XMLDATA"."REFERENCE"); Index created.
Example 5-3 and Example 5-4 yield the same result; they are two ways of creating the same index.
SELECT INDEX_NAME, TABLE_NAME, COLUMN_NAME FROM USER_IND_COLUMNS
WHERE INDEX_NAME = 'PO_FN_BASED_IX' AND TABLE_NAME = 'PURCHASEORDER';
INDEX_NAME TABLE_NAME COLUMN_NAME
---------------- ---------------- ----------------------
PO_FN_BASED_IX PURCHASEORDER "XMLDATA"."REFERENCE"
1 row selected.
A function-based index must target a singleton, not a collection. This section demonstrates this for structured storage.
If a collection is stored in a CLOB
instance, you cannot directly access its members. In structured storage, a collection is stored as an ordered collection table or an XMLType
instance, which means that you can directly access its members. Because the structured storage model directly reflects the fine-grained structure of the XML data, you can create indexes that target collection members.
What you cannot do is create an index based on function extractValue
that targets a collection with an XPath expression, and expect that Oracle XML DB will rewrite your CREATE INDEX
statement, creating the necessary indexes on the underlying objects.
If you want such indexes, you must create them yourself. To do this, you must understand the structure of the SQL object that is used to manage the collection. Given this information, you can create the required (B-tree) indexes directly on the appropriate SQL-object attributes using conventional object-relational SQL.
The aim of this section is to show that you cannot count on XPath rewrite to create the appropriate indexes for you. Refer to "Using Indexes to Tune Queries on Collections Stored as OCTs" for an example of how to create such indexes manually.
This section deliberately tries to create a function-based index on a repeating attribute in a collection, to see what happens. This is tried in a couple of ways, without success.
Suppose that you want to create an index on attribute Id
of element Part
. You might try to create the index as in Example 5-5. However, when an element or attribute being indexed occurs multiple times in a document, a CREATE INDEX
operation on an XPath expression that targets it fails, because SQL function extractValue
can return only a single value for each row that it processes. In this case, there can be multiple occurrences of attribute Id
in a purchase-order document, because its ancestor element LineItem
is a collection.
Example 5-5 Trying to Create a Function-Based Index on a Repeating Attribute
CREATE INDEX po_fn_based_ix ON purchaseorder
(extractValue(OBJECT_VALUE, '/PurchaseOrder/LineItems/LineItem/Part/@Id'));
CREATE INDEX po_fn_based_ix ON purchaseorder
*
ERROR at line 1:
ORA-19025: EXTRACTVALUE returns value of only one node
You can instead create an index by replacing function extractValue
with a combination of function extract
and XMLType
method getStringVal()
, as shown in Example 5-6.
Example 5-6 Creating a Function-Based Index Using EXTRACT and getStringVal()
CREATE INDEX po_fn_based_ix ON purchaseorder (extract(OBJECT_VALUE, 'PurchaseOrder/LineItems/LineItem/Part/@Id').getStringVal()); Index created.
This CREATE INDEX
operation succeeds in creating a function-based index, but the index is not what you might expect. The index is created by invoking SQL function extract
and XMLType
method getStringVal()
for each row in the table, and then indexing that result against the rowid of the row.
The problem with this technique is that extract
returns multiple nodes. The result of applying function extract
is a single XMLType
fragment that contains all of the matching nodes.
Note:
In general, avoid creating an index based on SQL functionextract
. It is unlikely that the index will be useful.The result of then invoking method getStringVal()
on the XMLType
instance that contains this fragment is a concatenation of the nodes in the fragment. What is indexed is therefore the concatenation of the three UPC codes (attribute Id
), and not, as intended, each of the individual UPC codes. This is shown in Example 5-7.
Example 5-7 Function-Based Index on Concatenated Nodes
SELECT extract(OBJECT_VALUE, '/PurchaseOrder/LineItems') XML,
extract(OBJECT_VALUE, 'PurchaseOrder/LineItems/LineItem/Part/@Id').getStringVal() INDEX_VALUE
FROM purchaseorder
WHERE existsNode(OBJECT_VALUE, '/PurchaseOrder[Reference="SBELL-2002100912333601PDT"]') = 1;
XML INDEX_VALUE
----------------------------------------------------------------- -----------------------------------
<LineItems> 71551500905837429140222715515011020
<LineItem ItemNumber="1">
<Description>A Night to Remember</Description>
<Part Id="715515009058" UnitPrice="39.95" Quantity="2"/>
</LineItem>
<LineItem ItemNumber="2">
<Description>The Unbearable Lightness Of Being</Description>
<Part Id="37429140222" UnitPrice="29.95" Quantity="2"/>
</LineItem>
<LineItem ItemNumber="3">
<Description>Sisters</Description>
<Part Id="715515011020" UnitPrice="29.95" Quantity="4"/>
</LineItem>
</LineItems>
1 row selected.
The way to resolve this problem is to use structured storage with ordered collection tables (OCTs), to force each XML node that is indexed to be stored as a separate row. The index can be created directly on the appropriate OCT, using object-relational SQL similar to that generated by XPath rewrite.
In sum, you can effectively index XML data that is stored object-relationally (structured storage) by indexing the database columns that correspond to XML nodes. In the singleton extractValue
case, Oracle XML DB does this for you automatically as a convenience; you use the simple syntax of creating an index based on function extractValue
with an XPath-expression argument, and Oracle XML DB does the rest.
See Also:
"Using Indexes to Tune Queries on Collections Stored as OCTs" for an example of indexing an ordered collection tableThis section contains these topics:
Creating, Dropping, Altering, and Examining an XMLIndex Index
Creating Additional Secondary Indexes on an XMLIndex Path Table
XMLIndex Path Subsetting: Specifying the Paths You Want to Index
Collecting Statistics on XMLIndex Objects For the Cost-Based Optimizer
B-tree indexes can be used advantageously with structured storage — they provide sharp focus by targeting the underlying objects directly. They are generally ineffective, however, in addressing the detailed structure (elements and attributes) of an XML document or fragment stored in a CLOB
instance. That is the special domain of XMLIndex
: unstructured and hybrid storage.
A typical use case for XMLIndex
is where you generally expect to access certain portions of a document in their entirety, so you pack those portions into one or more CLOB
instances. You might nevertheless sometimes need to query within these document portions. XMLIndex
can help here, whereas XPath rewrite is ineffective in this case.
Another use case is where an XML schema contains xsd:any
elements, for lack of any specific knowledge of the document structure and data types involved. The data corresponding to these elements is stored in CLOB
instances, and XMLIndex
can be used to speed access to it.
In addition to CLOB
storage of XML, you can also use XMLIndex
to index binary XMLType
data.
Unlike a B-tree index, which you define for a specific database column that represents an individual XML element or attribute, an XMLIndex
index is, by default, very general: Unless you specify a more narrow focus by detailing specific XPath expressions to use or not to use in indexing, indexing with XMLIndex
applies to all possible XPath expressions for your XML data.
An XMLIndex
index can be used for SQL functions XMLQuery
, XMLTable
, XMLExists
, XMLCast
, extract
, extractValue
, and existsNode
. It presents the following advantages over other indexing methods:
An XMLIndex
index is effective in any part of a query; it is not limited to use in a WHERE
clause. This is not the case for any of the other kinds of indexes you might use with XML data.
XMLIndex
can thus speed access to both SELECT
list data and FROM
list data, making it useful for XML fragment extraction, in particular. Function-based indexes (and CTXXPath
indexes, which are deprecated) cannot be used to extract document fragments.
You need no prior knowledge of the XPath expressions that will be used in queries. XMLIndex
is completely general. This is not the case for function-based indexes. If you do have such prior knowledge, then you can often improve performance by tailoring XMLIndex
indexing to those paths most queried.
You can use an XMLIndex
index with either XML schema-based or non-schema-based data. It can be used with unstructured storage, hybrid storage, and binary XML storage. B-tree indexing is appropriate only for XML schema-based data that is stored object-relationally (structured storage); it is ineffective for XML schema-based data stored in a CLOB
instance.
For hybrid storage of XML schema-based data, XMLIndex
can handle XPath expressions that target document fragments that are stored within a CLOB
instance. XPath rewrite is ineffective in such cases.
You can use an XMLIndex
index for searches with XPath expressions that target collections, that is, nodes that occur multiple times within a document. This is not the case for function-based indexes.
XMLIndex
indexing — both index creation and index maintenance — can be carried out in parallel, using multiple database processes. This is not the case for function-based indexes (and CTXXPATH
indexes, which are deprecated).
Updating of XMLIndex
indexes on binary XML storage can be accomplished in a piecewise manner, improving DML performance considerably. This is not the case for any of the other kinds of indexes you might use with XML data.
The following types of XPath expressions are not indexed by XMLIndex
:
Applications of XPath functions, except ora:contains
, , , , and (these are all indexed). In particular, user-defined XPath functions are not indexed.
Axes other than child
, descendant
, and attribute
, that is, axes parent
, ancestor
, following-sibling
, preceding-sibling
, following
, preceding
, and ancestor-or-self
.
Expressions using the union operator, |
(vertical bar).
XMLIndex
is a domain index; it is designed specifically for the domain of XML data. It is a logical index, which has three components:
A path index – This indexes the XML tags of a document and identifies its various document fragments.
An order index – This indexes the hierarchical positions of the nodes in an XML document. It keeps track of parent–child, ancestor–descendant, and sibling relations.
A value index – This indexes the values of an XML document. It provides lookup by either value equality or value range. A value index is used for values in query predicates (WHERE
clause).
XMLIndex
is implemented using a path table and a set of (local) secondary indexes corresponding to its components. These are all owned by the owner of the base table upon which the XMLIndex
index is created. The path table contains one row for each indexed node in the XML document. For each indexed node, the path table stores:
The corresponding rowid of the table that stores the document.
A locator, which provides fast access to the corresponding document fragment. For binary XML storage of XML schema-based data, it also stores data-type information.
An order key, to record the hierarchical position of the node in the document. You can think of this as a Dewey decimal key like that used in library cataloging and Internet protocol SNMP. In such a system, the key 3.21.5
represents the node position of the fifth child of the twenty-first child of the third child of the document root node.
Table 5-2 shows the main informationFoot 1 that is in the path table. The path index and the order index each use two columns of the path table. Together, columns PATHID
and RID
represent the path index; columns ORDER_KEY
and RID
represent the order index. Secondary indexes are created automatically for columns PATHID
and ORDER_KEY
.
Table 5-2 XMLIndex Path Table
Column | Data Type | Description |
---|---|---|
|
|
Unique identifier for the XPath path to the node. |
|
|
Rowid of the table used to store the XML data. |
|
|
Decimal order key that identifies the hierarchical position of the node. (Document ordering is preserved.) |
|
|
Fragment-location information. Used for fragment extraction. For binary XML storage of XML schema-based data, data-type information is also stored here. |
|
|
Text of an attribute node or a simple element node. |
Example 5-8 explores the contents of the path table for two purchase-order documents.
Example 5-8 Path Table Contents for Two Purchase Orders
<PurchaseOrder> <Reference>SBELL-2002100912333601PDT</Reference> <Actions> <Action> <User>SVOLLMAN</User> </Action> </Actions> . . . </PurchaseOrder> <PurchaseOrder> <Reference>ABEL-20021127121040897PST</Reference> <Actions> <Action> <User>ZLOTKEY</User> </Action> <Action> <User>KING</User> </Action> </Actions> . . . </PurchaseOrder>
An XMLIndex
index on an XMLType
table or column storing these purchase orders will include a path table that has one row for each indexed node in the XML documents. Suppose that the system assigns the following PATHID
s when indexing the nodes according to their XPath expressions:
PATHID | Indexed XPath |
---|---|
1 |
/PurchaseOrder |
2 |
/PurchaseOrder/Reference |
3 |
/PurchaseOrder/Actions |
4 |
/PurchaseOrder/Actions/Action |
5 |
/PurchaseOrder/Actions/Action/User |
The resulting path table would then be something like this (column LOCATOR
is not shown):
PATHID | RID | ORDER_KEY | VALUE |
---|---|---|---|
1 |
R1 |
1 |
—— |
2 |
R1 |
1.1 |
SBELL-2002100912333601PDT |
3 |
R1 |
1.2 |
—— |
4 |
R1 |
1.2.1 |
—— |
5 |
R1 |
1.2.1.1 |
SVOLLMAN |
1 |
R2 |
1 |
—— |
2 |
R2 |
1.1 |
ABEL-20021127121040897PST |
3 |
R2 |
1.2 |
—— |
4 |
R2 |
1.2.1 |
—— |
5 |
R2 |
1.2.1.1 |
ZLOTKEY |
4 |
R2 |
1.2.2 |
—— |
5 |
R2 |
1.2.2.1 |
KING |
Though you might create secondary indexes on path-table columns, you can generally ignore the path table itself. You cannot access the path table, other than to DESCRIBE
it and create (secondary) indexes on it. You need never explicitly gather statistics on the path table. You need only collect statistics on the XMLIndex
index or the base table on which the XMLIndex
index is defined; statistics are collected and maintained on the path table and its secondary indexes transparently.
A secondary index on column VALUE
is used with XPath expressions in a WHERE
clause that have predicates involving string matches. For example:
/PurchaseOrder[Reference/text() = "SBELL-2002100912333601PDT"]
Column VALUE
stores the effective text value of a simple element node (no children) or an attribute node. For an element, this is obtained by concatenating all of the text nodes of the element — comments and processing instructions are ignored during indexing.
Column VALUE
is a fixed size, VARCHAR2(4000)
. Any overflow (beyond 4000 bytes) during index creation or update is truncated, but the LOCATOR
value for that row is then flagged so that the full value can be retrieved from the base table when needed.
In addition to the 4000-byte limit for column VALUE
, there is a limit on the size of a key for the secondary index created on this column. This is the case for B-tree and function-based indexes as well; it is not an XMLIndex
limitation. The index-key size limit is a function of the block size for your database. It is this limit that determines how much of VALUE
is indexed.
In sum, only the first 4000 bytes of the effective text value are stored in column VALUE
, and only the first N bytes of column VALUE
are indexed, where N is the index-key size limit (N < 4000). Because of the index-key size limit, the index on column VALUE
acts only as a preliminary filter for the effective text value.
For example, suppose that your database block size requires that the VALUE
index be no larger than 800 bytes, so that only the first 800 bytes of the effective text value is indexed. The first 800 bytes of the effective text value is first tested, using XMLIndex
, and only if that text prefix matches the query value is the rest of the effective text value tested.
The secondary index on column VALUE
is an index on SQL function substr
(substring equality), because that function is used to test the text prefix. This function-based index is created automatically as part of the implementation of XMLIndex
for column VALUE
.
For example, the XPath expression /PurchaseOrder[Reference/text() = :1]
in a query WHERE
clause might, in effect, be rewritten to a test something like this:
substr(VALUE, 1 800) = substr(:1, 1, 800) AND VALUE = :1;
This conjunction contains two parts, which are processed from left to right. The first test uses the index on function substr
as a preliminary filter, to eliminate text whose first 800 bytes do not match the first 800 bytes of the value of bind variable :1
.
Only the first test uses an index — the full value of column VALUE
is not indexed. After preliminary filtering by the first test, the second test checks the entire effective text value — that is, the full value of column VALUE
— for full equality with the value of :1
. This check does not use an index.
Even if only the first 800 bytes of text is indexed, it is important for query performance that up to 4000 bytes be stored in column VALUE
, because that provides quick, direct access to the data, instead of requiring, for example, extracting it from deep within a CLOB
-instance XML document. If the effective text value is greater than 4000 bytes, then the second test in the WHERE
-clause conjunction will require accessing the base-table data.
Note that neither the VALUE
column 4000-byte limit nor the index-key size affect query results in any way; they can affect only performance.
As mentioned, XMLIndex
can be used with XML schema-based data. If an XML schema specifies a defaultValue
value for a given element or attribute, and a particular document does not specify a value for that element or attribute, then the defaultValue
value is used for the VALUE
column.
If you do not specify a secondary index for column VALUE
when you create the XMLIndex
index, then a default secondary index is created on column VALUE
. This default index has the default properties — in particular, it is an index for text (string-valued) data only.
You can, however, create a VALUE
index of a different type. For example, you can create a number-valued index if that is appropriate for many of your queries. You can create multiple secondary indexes on the VALUE
column. An index of a particular type is used only when it is appropriate. For example, a number-valued index is used only when the VALUE
column is a number; it is ignored for other values. Secondary indexes on path-table columns are treated like any other secondary indexes — you can alter them, drop them, mark them unusable, and so on.
See Also:
"Creating Additional Secondary Indexes on an XMLIndex Path Table" for examples of creating secondary indexes on column VALUE
"PARAMETERS Clause for CREATE INDEX and ALTER INDEX" for the syntax of the PARAMETERS
clause
Information about the standard database indexes is available in static public views USER_INDEXES
, ALL_INDEXES
, and DBA_INDEXES
. Similar information about XMLIndex
indexes is available in static public views USER_XML_INDEXES
, ALL_XML_INDEXES
, and DBA_XML_INDEXES
. Table 5-3 describes the columns in each of these views.
Table 5-3 XMLIndex Static Public Views
Column Name | Type | Description |
---|---|---|
|
|
Owner of the index. Not available for |
|
|
Name of the |
|
|
Owner of the base table on which the index is defined. |
|
|
Name of the base table on which the index is defined. |
|
|
Name of the |
|
|
Parameters specific to the index. These can include the set of XPath paths defining path-subsetting and the name of a scheduler job for synchronization. |
|
|
Asynchronous index updating specification. See "Asynchronous (Deferred) Maintenance of XMLIndex Indexes". |
|
|
Name of the table that records base-table DML operations since the last index synchronization. See "Asynchronous (Deferred) Maintenance of XMLIndex Indexes". |
You create an XMLIndex
index by declaring the index type to be XDB.XMLIndex
, as illustrated in Example 5-9.
Example 5-9 Creating an XMLIndex Index on XMLType Unstructured Storage
CREATE INDEX po_xmlindex_ix ON po_clob (OBJECT_VALUE) INDEXTYPE IS XDB.XMLIndex;
Index created.
This creates an XMLIndex
index named po_xmlindex_ix
on XMLType
table po_clob
.
Note:
Although you can partition anXMLType
table or column, you cannot create an XMLIndex
index on such a table or column.You can create an XMLIndex
index on CLOB
portions of hybrid XMLType
storage, that is, on CLOB
data that is embedded within object-relational storage. Example 5-10 illustrates this. It assumes that the XML schema used maps the LineItems
element to CLOB
, such as is shown in Example 5-10.
Example 5-10 Creating an XMLIndex Index on XMLType Hybrid Storage
CREATE INDEX po_xmlindex_hybrid_ix ON li_clob (extract(OBJECT_VALUE, '/PurchaseOrder/LineItems')) INDEXTYPE IS XDB.XMLIndex;
Example 5-11 XML Schema Fragment that Maps LineItems to CLOB Storage
<xs:element name="LineItems" type="LineItemsType" xdb:SQLName="LINEITEMS" xdb:SQLType="CLOB"/>
You can obtain the name of an XMLIndex
index on a particular XMLType
table (or column), as shown in Example 5-12. You can also select INDEX_NAME
from DBA_INDEXES
or ALL_INDEXES
, as appropriate.
Example 5-12 Obtaining the Name of an XMLIndex Index on a Particular Table
SELECT INDEX_NAME FROM USER_INDEXES WHERE TABLE_NAME = 'PO_CLOB' AND ITYP_NAME = 'XMLINDEX'; INDEX_NAME --------------- PO_XMLINDEX_IX 1 row selected.
You rename or drop an XMLIndex
index just as you would any other index, as illustrated in Example 5-13. This renaming changes the name of the XMLIndex
index only. It does not change the name of the path table — you can rename the path table separately.
Example 5-13 Renaming and Dropping an XMLIndex Index
ALTER INDEX po_xmlindex_ix RENAME TO new_name_ix;
Index altered.
DROP INDEX new_name_ix;
Index dropped.
Similarly, you can change other index properties using other ALTER INDEX
options, such as REBUILD
. XMLIndex
is no different from other index types in this respect.
You can use the PARAMETERS
clause of a CREATE INDEX
statement to name the path table. Example 5-14 names the path table "my_path_table".
Example 5-14 Naming the Path Table of an XMLIndex Index
CREATE INDEX po_xmlindex_ix ON po_clob (OBJECT_VALUE) INDEXTYPE IS XDB.XMLIndex
PARAMETERS ('PATH TABLE my_path_table');
If you do not name the path table this way, then its name is generated by the system, using the index name you provide to CREATE INDEX
as a base. Example 5-15 shows this for the index created in Example 5-9.
Example 5-15 Determining the System-Generated Name of an XMLIndex Path Table
SELECT PATH_TABLE_NAME FROM USER_XML_INDEXES WHERE TABLE_NAME = 'PO_CLOB' AND INDEX_NAME = 'PO_XMLINDEX_IX'; PATH_TABLE_NAME ------------------------------ SYS72060_PO_XMLINDE_PATH_TABLE 1 row selected.
By default, the storage options of the XMLIndex
path table and its secondary indexes are derived from the storage properties of the base table on which the XMLIndex
index is created. You can specify different storage options by using a PARAMETERS
clause when you create the index, as shown in Example 5-16. The PARAMETERS
clause of CREATE INDEX
(and ALTER INDEX
) must be between single quotation marks ('
).
See Also:
"PARAMETERS Clause for CREATE INDEX and ALTER INDEX" for the syntax of thePARAMETERS
clauseExample 5-16 Specifying Storage Options When Creating an XMLIndex Index
CREATE INDEX po_xmlindex_ix ON po_clob (OBJECT_VALUE) INDEXTYPE IS XDB.XMLIndex PARAMETERS ('PATH TABLE po_path_table (PCTFREE 5 PCTUSED 90 INITRANS 5 STORAGE (INITIAL 1k NEXT 2k MINEXTENTS 3 BUFFER_POOL KEEP) NOLOGGING ENABLE ROW MOVEMENT PARALLEL 3) PATH ID INDEX po_path_id_ix (LOGGING PCTFREE 1 INITRANS 3) ORDER KEY INDEX po_order_key_ix (LOGGING PCTFREE 1 INITRANS 3) VALUE INDEX po_value_ix (LOGGING PCTFREE 1 INITRANS 3)'); Index created.
Because XMLIndex
is a logical domain index, not a physical index, all physical attributes are either zero (0
) or NULL
.
In addition to specifying storage options for the path table, Example 5-16 names the secondary indexes. An index is created on column VALUE
because it is specified; otherwise, no such index would be created.
Like the name of the path table, the names of the secondary indexes on the path-table columns are generated automatically using the index name as a base, unless you specify them in the PARAMETERS
clause. Example 5-17 illustrates this, and shows how you can determine these names using public view USER_IND_COLUMNS
. It also shows that the path index and the order index each use two columns, including column RID
. Note, too, that no VALUE
index was created, by default.
Example 5-17 Determining the Names of the Secondary Indexes of an XMLIndex Index
SELECT INDEX_NAME, COLUMN_NAME, COLUMN_POSITION FROM USER_IND_COLUMNS WHERE TABLE_NAME IN (SELECT PATH_TABLE_NAME FROM USER_XML_INDEXES WHERE INDEX_NAME = 'PO_XMLINDEX_IX') ORDER BY INDEX_NAME, COLUMN_NAME; INDEX_NAME COLUMN_NAME COLUMN_POSITION ------------------------------ ------------ --------------- SYS73321_PO_XMLINDE_ORDKEY_IX ORDER_KEY 2 SYS73321_PO_XMLINDE_ORDKEY_IX RID 1 SYS73321_PO_XMLINDE_PATHID_IX PATHID 1 SYS73321_PO_XMLINDE_PATHID_IX RID 2 4 rows selected.
See Also:
Example 5-23 for a similar, but more complex exampleThe RENAME
clause of an ALTER INDEX
statement for XMLIndex
applies only to the XMLIndex
index itself. To rename the path table and secondary indexes, you must determine the names of these objects and use appropriate ALTER INDEX
statements on them directly. Similarly, to retrieve the physical properties of the secondary indexes or alter them in any other way, you will need to obtain their names, as in Example 5-17.
You can use ALTER INDEX
to modify any of the index parameters of the primary or secondary indexes . As a convenience, an alternative way to change properties of a path-id, order-key, or value index is to use ALTER INDEX
on the parent XMLIndex
index, providing a PARAMETERS
clause with the new properties.
See Also:
"PARAMETERS Clause for CREATE INDEX and ALTER INDEX" for the syntax of thePARAMETERS
clauseThis section adds extra secondary indexes to the XMLIndex
index created in Example 5-16.
You can create any number of additional secondary indexes on the VALUE
column of the path table of an XMLIndex
index. These can be of different types, including function-based indexes and Oracle Text indexes.
Whether or not a given index is used for a given element occurrence when processing a query is determined by whether it is of the appropriate type for that value and whether it is cost-effective to use it.
Example 5-18 creates a function-based index on column VALUE
of the path table using SQL function substr
. This might be useful if your queries often use substr
applied to the text nodes of XML elements.
Example 5-18 Creating a Function-Based Index on Path-Table Column VALUE
CREATE INDEX fn_based_ix ON po_path_table (substr(VALUE, 1, 100)); Index created.
If you have many elements whose text nodes represent numeric values, then it can make sense to create a numeric index on the column VALUE
. However, doing so directly, in a manner analogous to Example 5-18, raises an ORA-01722 error (invalid number) if some of the element values are not numbers. This is illustrated in Example 5-19.
Example 5-19 Trying to Create a Numeric Index on Path-Table Column VALUE Directly
CREATE INDEX direct_num_ix ON po_path_table (to_number(VALUE)); CREATE INDEX direct_num_ix ON po_path_table (to_number(VALUE)) * ERROR at line 1: ORA-01722: invalid number
What is needed is an index that will be used for numeric-valued elements but will be ignored for element occurrences that do not have numeric values. Procedure createNumberIndex
of package DBMS_XMLINDEX
exists specifically for this purpose. You pass it the names of the database schema, the XMLIndex
index, and the numeric index to be created. Creation of a numeric index is illustrated in Example 5-20.
Example 5-20 Creating a Numeric Index on Column VALUE with Procedure createNumberIndex
CALL DBMS_XMLINDEX.createNumberIndex('OE', 'PO_XMLINDEX_IX', 'API_NUM_IX');
Note that because such an index is specifically designed to ignore elements that do not have numeric values, its use will not detect their presence. If there are non-numeric elements and, for whatever reason, the XMLIndex
index is not used in some query, then an ORA-01722
error will be raised. However, if the index is used, no such error will be raised, because the index ignores non-numeric data. As always, the use of an index will never change the result set — it will never give you different results, but use of an index can prevent you from detecting erroneous data.
Creating a date-valued index is similar to creating a numeric index; you use procedure DBMS_XMLINDEX.createDateIndex
. Example 5-21 shows this.
Example 5-21 Creating a Date Index on Column VALUE with Procedure createDateIndex
CALL DBMS_XMLINDEX.createDateIndex('OE', 'PO_XMLINDEX_IX', 'API_DATE_IX',
'dateTime');
Example 5-22 creates an Oracle Text CONTEXT
index on column VALUE
. This is useful for full-text queries on text values of XML elements. XPath predicates that use XPath function ora:contains
are rewritten to applications of SQL function contains
on column VALUE
. If a CONTEXT
index is defined on column VALUE
, then it will be used during predicate evaluation. An Oracle Text index is independent of all other VALUE
-column indexes.
Example 5-22 Creating an Oracle Text CONTEXT Index on Path-Table Column VALUE
CREATE INDEX po_otext_ix ON po_path_table (VALUE)
INDEXTYPE IS CTXSYS.CONTEXT
PARAMETERS('TRANSACTIONAL');
Index created.
The query in Example 5-23 shows all of the secondary indexes created on the path table of an XMLIndex
index. The indexes created explicitly are in bold. Note, in particular, that some indexes, such as the function-based index created on column VALUE
, do not appear as such; the column name listed for it them a system-generated name, such as SYS_NC00006$
. This means that you cannot see these columns by executing a query with COLUMN_NAME = 'VALUE'
in the WHERE
clause.
Example 5-23 Showing All Secondary Indexes on an XMLIndex Path Table
SELECT c.INDEX_NAME, c.COLUMN_NAME, c.COLUMN_POSITION, e.COLUMN_EXPRESSION FROM USER_IND_COLUMNS c LEFT OUTER JOIN USER_IND_EXPRESSIONS e ON (c.INDEX_NAME = e.INDEX_NAME) WHERE c.TABLE_NAME IN (SELECT PATH_TABLE_NAME FROM USER_XML_INDEXES WHERE INDEX_NAME = 'PO_XMLINDEX_IX') ORDER BY c.INDEX_NAME, c.COLUMN_NAME; INDEX_NAME COLUMN_NAME COLUMN_POSITION COLUMN_EXPRESSION -------------------- ------------ --------------- ---------------------- API_DATE_IX SYS_NC00008$ 1 SYS_EXTRACT_UTC(SYS_XMLCONV("V ALUE",3,8,0,0,181)) API_NUM_IX SYS_NC00007$ 1 TO_BINARY_DOUBLE("VALUE") FN_BASED_IX SYS_NC00006$ 1 SUBSTR("VALUE",1,100) PO_ORDER_KEY_IX ORDER_KEY 2 PO_ORDER_KEY_IX RID 1 PO_OTEXT_IX VALUE 1 PO_PATH_ID_IX PATHID 1 PO_PATH_ID_IX RID 2 PO_VALUE_IX VALUE 1 9 rows selected.
See Also:
Oracle Database PL/SQL Packages and Types Reference for information on PL/SQL procedures createNumberIndex
and createDateIndex
in package DBMS_XMLINDEX
"Oracle Text Indexes Are Used Independently of Other Indexes" for information on using Oracle Text indexes
It is at query compile time that it is determined whether or not a given XMLIndex
index can be used, that is, whether the query can be rewritten into a query against the index. If it cannot be determined at compile time that an XPath expression in the query is a subset of the paths you specified to be used for indexing, then an XMLIndex
index is not used. For example, if the path /PurchaseOrder/LineItems//*
is included for indexing, then a query with /PurchaseOrder/LineItems/LineItem/Description
can use the index, but a query with //Description
cannot. The latter also matches potential Description
elements that are not children of /PurchaseOrder/LineItems
, and it is not possible at compile time to know if such additional Description
elements will be present in the data.
To know whether a particular XMLIndex
index has been used in resolving a query, you can examine an explain plan of the query. If the index is used, then its path table, order key, or path id will be referenced in the explain plan. The explain plan will not directly indicate that a domain index was used; it will not refer to the XMLIndex
index by name.
Example 5-24 shows that the XMLIndex
index created in Example 5-14 is used in a particular query.
Example 5-24 Examining an Explain Plan to See If XMLIndex Is Used
The reference to MY_PATH_TABLE
in the explain plan here indicates that the XMLIndex
index (created in Example 5-14) was used in this query. Similarly, reference to columns LOCATOR
, ORDER_KEY
, and PATHID
indicates the same thing.
SET AUTOTRACE ON EXPLAIN SELECT XMLQuery('/PurchaseOrder/Requestor' PASSING OBJECT_VALUE RETURNING CONTENT) FROM po_clob WHERE XMLExists('/PurchaseOrder[Reference="SBELL-2002100912333601PDT"]' PASSING OBJECT_VALUE); XMLQUERY('/PURCHASEORDER/REQUESTOR'PASSINGOBJECT_VALUERETURNINGCONTENT) ----------------------------------------------------------------------- <Requestor>Sarah J. Bell</Requestor> 1 row selected. Execution Plan . . . ---------------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes |Cost(%CPU)| Time | ---------------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 24 | 15 (7)| 00:00:01 | | 1 | SORT GROUP BY | | 1 | 3524 | | | |* 2 | TABLE ACCESS BY INDEX ROWID | MY_PATH_TABLE | 2 | 7048 | 2 (0)| 00:00:01 | |* 3 | INDEX RANGE SCAN | SYS55148_PO_XMLINDE_PATHID_IX | 6 | | 1 (0)| 00:00:01 | | 4 | NESTED LOOPS | | 1 | 24 | 15 (7)| 00:00:01 | | 5 | VIEW | VW_SQ_1 | 1 | 12 | 13 (0)| 00:00:01 | | 6 | HASH UNIQUE | | 1 | 5046 | | | | 7 | NESTED LOOPS | | | | | | | 8 | NESTED LOOPS | | 1 | 5046 | 13 (0)| 00:00:01 | |* 9 | TABLE ACCESS BY INDEX ROWID| MY_PATH_TABLE | 1 | 3524 | 11 (0)| 00:00:01 | |* 10 | INDEX RANGE SCAN | SYS55148_PO_XMLINDE_PATHID_IX | 1 | | 2 (0)| 00:00:01 | |* 11 | INDEX RANGE SCAN | SYS55148_PO_XMLINDE_PATHID_IX | 6 | | 1 (0)| 00:00:01 | |* 12 | TABLE ACCESS BY INDEX ROWID | MY_PATH_TABLE | 1 | 1522 | 2 (0)| 00:00:01 | | 13 | TABLE ACCESS BY USER ROWID | PO_CLOB | 1 | 12 | 1 (0)| 00:00:01 | ---------------------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - filter(SYS_XMLI_LOC_ISNODE("SYS_P0"."LOCATOR")=1) 3 - access("SYS_P0"."PATHID"=HEXTORAW('74C39DFE') AND "SYS_P0"."RID"=:B1) 9 - filter("SYS_P5"."PATHID"=HEXTORAW('6F7C') AND SYS_XMLI_LOC_ISNODE("SYS_P5"."LOCATOR")=1) 10 - access("SYS_P5"."VALUE"='SBELL-2002100912333601PDT') 11 - access("SYS_P2"."PATHID"=HEXTORAW('093CA37E') AND "SYS_P5"."RID"="SYS_P2"."RID") 12 - filter(SYS_XMLI_LOC_ISNODE("SYS_P2"."LOCATOR")=1 AND "SYS_P2"."ORDER_KEY"<"SYS_P5"."ORDER_KEY" AND "SYS_P5"."ORDER_KEY"<SYS_ORDERKEY_MAXCHILD("SYS_P2"."ORDER_KEY") AND SYS_ORDERKEY_DEPTH("SYS_P2"."ORDER_KEY")+1=SYS_ORDERKEY_DEPTH("SYS_P5"."ORDER_KEY")) . . .
Given the name of a path table from an explain plan such as this, you can obtain the name of its XMLIndex
index as shown in Example 5-25. (This is more or less opposite to the query in Example 5-15.)
Example 5-25 Obtaining the Name of an XMLIndex Index from Its Path-Table Name
SELECT INDEX_NAME FROM USER_XML_INDEXES WHERE PATH_TABLE_NAME = 'MY_PATH_TABLE'; INDEX_NAME ------------------------------ PO_XMLINDEX_IX 1 row selected.
XMLIndex
can be used for XPath expressions in the SELECT
list, the FROM
list, and the WHERE
clause of a query, and it is useful for SQL functions XMLQuery
, XMLTable
, XMLExists
, XMLCast
, extractValue
, existsNode
, and extract
. Unlike function-based indexes (and CTXXPath
indexes, which are deprecated), XMLIndex
indexes can be used when you extract an XML fragment from a document. Example 5-26 illustrates this.
Example 5-26 Using XMLIndex to Extract an XML Fragment
SET AUTOTRACE ON EXPLAIN SELECT li.description, li.itemno FROM po_clob, XMLTable('/PurchaseOrder/LineItems/LineItem' PASSING OBJECT_VALUE COLUMNS "DESCRIPTION" VARCHAR(40) PATH '/LineItem/Description', "ITEMNO" INTEGER PATH '/LineItem/@ItemNumber') li WHERE XMLExists('/PurchaseOrder[Reference="SBELL-2002100912333601PDT"]' PASSING OBJECT_VALUE); DESCRIPTION ITEMNO ---------------------------------------- ---------- A Night to Remember 1 The Unbearable Lightness Of Being 2 Sisters 3 3 rows selected.
The explain plan for this query shows, by referring to the path table, that XMLIndex
is used. It also shows the use of internal SQL function sys_orderkey_depth
— see "Guidelines for Using XMLIndex".
Execution Plan . . . ---------------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes |Cost(%CPU)| Time | ---------------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 1546 | 8 (13)| 00:00:01 | |* 1 | FILTER | | | | | | |* 2 | TABLE ACCESS BY INDEX ROWID | MY_PATH_TABLE | 1 | 3524 | 2 (0)| 00:00:01 | |* 3 | INDEX RANGE SCAN | SYS63727_PO_XMLINDE_PATHID_IX | 6 | | 1 (0)| 00:00:01 | |* 4 | FILTER | | | | | | |* 5 | TABLE ACCESS BY INDEX ROWID | MY_PATH_TABLE | 1 | 3524 | 2 (0)| 00:00:01 | |* 6 | INDEX RANGE SCAN | SYS63727_PO_XMLINDE_PATHID_IX | 1 | | 1 (0)| 00:00:01 | . . . |* 14 | TABLE ACCESS BY INDEX ROWID| MY_PATH_TABLE | 1 | 3524 | 2 (0)| 00:00:01 | |* 15 | INDEX RANGE SCAN | SYS63727_PO_XMLINDE_VALUE_IX | 1 | | 1 (0)| 00:00:01 | |* 16 | INDEX RANGE SCAN | SYS63727_PO_XMLINDE_ORDKEY_IX | 1 | | 1 (0)| 00:00:01 | |* 17 | TABLE ACCESS BY INDEX ROWID | MY_PATH_TABLE | 1 | 1522 | 2 (0)| 00:00:01 | | 18 | TABLE ACCESS BY USER ROWID | PO_CLOB | 1 | 12 | 1 (0)| 00:00:01 | |* 19 | INDEX RANGE SCAN | SYS63727_PO_XMLINDE_PATHID_IX | 6 | | 1 (0)| 00:00:01 | |* 20 | TABLE ACCESS BY INDEX ROWID | MY_PATH_TABLE | 1 | 1522 | 2 (0)| 00:00:01 | ---------------------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter(:B1<SYS_ORDERKEY_MAXCHILD(:B2)) 2 - filter("SYS_P3"."ORDER_KEY">:B1 AND SYS_XMLI_LOC_ISNODE("SYS_P3"."LOCATOR")=1 AND "SYS_P3"."ORDER_KEY"<SYS_ORDERKEY_MAXCHILD(:B2) AND SYS_ORDERKEY_DEPTH("SYS_P3"."ORDER_KEY")= SYS_ORDERKEY_DEPTH(:B3)+1) 3 - access("SYS_P3"."PATHID"=HEXTORAW('54393E4C') AND "SYS_P3"."RID"=:B1) 4 - filter(:B1<SYS_ORDERKEY_MAXCHILD(:B2)) 5 - filter("SYS_P6"."ORDER_KEY">:B1 AND SYS_XMLI_LOC_ISNODE("SYS_P6"."LOCATOR")=1 AND "SYS_P6"."ORDER_KEY"<SYS_ORDERKEY_MAXCHILD(:B2) AND SYS_ORDERKEY_DEPTH("SYS_P6"."ORDER_KEY")= SYS_ORDERKEY_DEPTH(:B3)+1) 6 - access("SYS_P6"."PATHID"=HEXTORAW('7DE452AA') AND "SYS_P6"."RID"=:B1) filter(SYS_PATHID_IS_NMSPC("SYS_P6"."PATHID")=0) 14 - filter("SYS_P11"."PATHID"=HEXTORAW('6F7C') AND SYS_XMLI_LOC_ISNODE("SYS_P11"."LOCATOR")=1) 15 - access("SYS_P11"."VALUE"='SBELL-2002100912333601PDT') 16 - access("SYS_P11"."RID"="SYS_P8"."RID" AND "SYS_P8"."ORDER_KEY"<"SYS_P11"."ORDER_KEY") 17 - filter("SYS_P8"."PATHID"=HEXTORAW('093CA37E') AND SYS_XMLI_LOC_ISNODE("SYS_P8"."LOCATOR")=1) 19 - access("SYS_P0"."PATHID"=HEXTORAW('7676FDEA') AND "SYS_P0"."RID"="PO_CLOB".ROWID) 20 - filter(SYS_XMLI_LOC_ISNODE("SYS_P0"."LOCATOR")=1) . . .
You can turn off the use of XMLIndex
in any of these ways:
Use optimizer hint /*+ NO_XMLINDEX_REWRITE */
Use optimizer hint /*+ NO_XMLINDEX_REWRITE_IN_SELECT */
Use optimizer hint /*+ NO_XML_QUERY_REWRITE */
Each of these turns off the use of all XMLIndex
indexes. In addition to turning off use of XMLIndex
, NO_XML_QUERY_REWRITE
turns off all XPath rewrite (XMLIndex
is part of XPath rewrite).
Hint NO_XMLINDEX_REWRITE_IN_SELECT
turns off the use of XMLIndex
indexes only for XPath expressions in the SELECT
list; XMLIndex
indexes can still be used for XPath expressions in other query parts, such as a WHERE
clause or a FROM
clause. This hint can be especially useful with XML data that is stored as binary XML, in cases where streaming evaluation of XPath expressions in a SELECT
list provides better performance than XMLIndex
.
Example 5-27 shows the use of these optimizer hints.
Example 5-27 Using Optimizer Hints to Turn Off XMLIndex
SELECT /*+ NO_XMLINDEX_REWRITE */ count(*) FROM po_clob WHERE existsNode(OBJECT_VALUE, '/*') = 1; SELECT /*+ NO_XMLINDEX_REWRITE_IN_SELECT */ extractValue(li.OBJECT_VALUE, '/LineItem/Description') FROM po_clob p, table(XMLSequence(extract(p. OBJECT_VALUE, '/PurchaseOrder/LineItems/LineItem'))) li; SELECT /*+ NO_XMLINDEX_REWRITE_IN_SELECT */ li.description FROM po_clob p, XMLTable('/PurchaseOrder/LineItems/LineItem' PASSING OBJECT_VALUE COLUMNS "DESCRIPTION" VARCHAR(40) PATH '/LineItem/Description') li; SELECT /*+ NO_XML_QUERY_REWRITE */ count(*) FROM po_clob WHERE existsNode(OBJECT_VALUE, '/*') = 1;
In each of the queries that uses hint NO_XMLINDEX_REWRITE_IN_SELECT
, the XPath expression in the SELECT
list does not use XMLIndex
, but the XPath expression in the FROM
clause, /PurchaseOrder/LineItems/LineItem
, might use XMLIndex
. Note that in the query that uses function XMLTable
, the XPath expression that corresponds to column li.description
does not appear in the SELECT
list textually, but it is treated as if it does because of XPath rewrite. That is, XPath rewrite treats the XPath expression as if it were present in the SELECT
list.
Note:
TheNO_INDEX
optimizer hint does not apply to XMLIndex
.See Also:
"Processing XMLType Methods and XML-Specific SQL Functions" for information about streaming evaluation of binary XML dataOne of the advantages of XMLIndex
is that it is very general: you need not specify which XPath locations to index; you need no prior knowledge of the XPath expressions that will be queried. By default, XMLIndex
indexes all possible XPath locations in your XML data.
However, if you are aware of the XPath expressions that you are most likely to query, you can narrow the focus of XMLIndex
indexing and thus improve performance. Having fewer unnecessary indexes means that less space is required for indexing, which improves index maintenance during DML operations. Having fewer indexed nodes improves DDL performance, and having a smaller path table improves query performance.
You narrow the focus of indexing by pruning the set of XPath expressions (paths) corresponding to XML fragments to be indexed, specifying a subset of all possible paths. You can do this in two alternative ways:
Exclusion – Start with the default behavior of including all possible XPath expressions, and exclude some of them from indexing.
Inclusion – Start with an empty set of XPath expressions to be used in indexing, and add paths to this inclusion set.
You can specify path subsetting either when you create an XMLIndex
index using CREATE INDEX
or when you modify it using ALTER INDEX
. In both cases, you provide the subsetting information in the PATHS
parameter of the statement's PARAMETERS
clause. For exclusion, you use keyword EXCLUDE
. For inclusion, you use keyword INCLUDE
for ALTER INDEX
and no keyword for CREATE INDEX
(list the paths to include). You can also specify namespace mappings for the nodes targeted by the PATHS
parameter.
For ALTER INDEX
, keyword INCLUDE
or EXCLUDE
is followed by keyword ADD
or REMOVE
, to indicate whether the list of paths that follows the keyword is to be added or removed from the inclusion or exclusion list. For example, this statement adds path /PurchaseOrder/Reference
to the list of paths to be excluded from indexing:
ALTER INDEX po_xmlindex_ix REBUILD
PARAMETERS ('PATHS (EXCLUDE ADD (/PurchaseOrder/Reference))');
To alter an XMLIndex
index so that it includes all possible paths, you can use keyword ALL
in place of an explicit list of paths – use either EXCLUDE REMOVE (ALL)
or INCLUDE ADD (ALL)
; they are equivalent. (You cannot exclude all paths.)
See Also:
"PARAMETERS Clause for CREATE INDEX and ALTER INDEX" for the syntax of thePARAMETERS
clauseThis section presents some examples of defining XMLIndex
indexes on subsets of XPath expressions.
Example 5-28 XMLIndex Path Subsetting With CREATE INDEX
CREATE INDEX po_xmlindex_ix ON po_clob (OBJECT_VALUE) INDEXTYPE IS XDB.XMLINDEX PARAMETERS ('PATHS (INCLUDE (/PurchaseOrder/LineItems//* /PurchaseOrder/Reference))');
This statement creates an index that indexes only top-level element PurchaseOrder
and some of its children, as follows:
All LineItems
elements and their descendants
All Reference
elements
It does that by including the specified paths, starting with an empty set of paths to be used for the index.
Example 5-29 XMLIndex Path Subsetting With ALTER INDEX
ALTER INDEX po_xmlindex_ix REBUILD PARAMETERS ('PATHS (INCLUDE ADD (/PurchaseOrder/Requestor /PurchaseOrder/Actions/Action//*))');
This statement adds two more paths to those used for indexing. These paths index element Requestor
and descendants of element Action
(as well as their ancestors).
Example 5-30 XMLIndex Path Subsetting Using a Namespace Prefix
If an XPath expression to be used for XMLIndex
indexing uses namespace prefixes, you can use a NAMESPACE MAPPING
clause to the PATHS
list, to specify those prefixes. Here is an example:
CREATE INDEX po_xmlindex_ix ON po_clob (OBJECT_VALUE) INDEXTYPE IS XDB.XMLIndex PARAMETERS ('PATHS INCLUDE (/PurchaseOrder/LineItems//* /PurchaseOrder/ipo:Reference) NAMESPACE MAPPING (xmlns="http://xmlns.oracle.com" xmlns:ipo="http://xmlns.oracle.com/ipo"))');
The following rules apply to XMLIndex
path subsetting:
The paths must reference only child and descendant axes, and they must test only element and attribute nodes or their names (possibly using wildcards). In particular, the paths must not involve predicates.
You cannot specify both path exclusion and path inclusion; choose one method or the other.
If an index was created using path exclusion (inclusion), then you can modify it using only path exclusion (inclusion) — index modification must either further restrict or further extend the path subset. For example, you cannot create an index that includes certain paths and subsequently modify it to exclude certain paths.
A database administrator (DBA) can create an XMLIndex
index on resources in Oracle XML DB Repository to improve querying of XML data or metadata (system-defined or user-defined).
Only a user with database role XDBADMIN
can create an XMLIndex
index on Oracle XML DB Repository. After creating such an index, you restrict it to those resources that will actually be queried.
You cannot index resources that contain a REF
(in their contents or metadata) to a row in a view or to a row in a table that is not hierarchy-enabled. An attempt to do so will raise an error.
Follow this procedure as a database administrator to create an XMLIndex
index on specific repository resources:
Create the XMLIndex
index on the repository. This excludes repository paths /Resource/ACL
, /Resource/RefCount
, and /Resource/RCList
from the index created. It creates the path table and all secondary indexes, but it does not index any resources.
CALL DBMS_XDB_ADMIN.CreateRepositoryXMLIndex();
No resources are indexed at this point. They are indexed after step 2.
Specify the repository resources to index, by repository path. You use PL/SQL procedure XMLIndexAddPath
in package DBMS_XDB_ADMIN
to do this.
DBMS_XDB_ADMIN.XMLIndexAddPath(<path to index>, <disable secondary indexes?>);
The first parameter to XMLIndexAddPath
is a repository path expression that targets a resource (file or folder) to index. (Note: A repository path expression is not an XPath expression.) The second parameter is a Boolean value that, if TRUE
, disables loading of the path-table secondary indexes during the execution of XMLIndexAddPath
. Set this to TRUE
only if there are few existing rows in the path table and you are adding many new rows. Disabling the secondary indexes while a path is being indexed prevents updating of the path table index, specifically, loading (populating) of the secondary indexes each time a row is added to it. This can speed up bulk loading of resources.
To drop an XMLIndex
index created on the repository, do this:
CALL DBMS_XDB_ADMIN.DropRepositoryXMLIndex();
See Also:
Chapter 21, "Accessing Oracle XML DB Repository Data" for information on repository resources and resource paths
Oracle Database PL/SQL Packages and Types Reference for information on the procedures in PL/SQL package DBMS_XDB_ADMIN
Follow this procedure as a database administrator to remove a resource from indexing:
Repeat this procedure for each ancestor of the resource, to first remove them from indexing. You cannot remove a resource from indexing until you have first removed all of its ancestors from indexing.
Specify the repository resource to be removed from indexing, by repository path. You use PL/SQL procedure XMLIndexRemovePath
in package DBMS_XDB_ADMIN
to do this.
DBMS_XDB_ADMIN.XMLIndexRemovePath(<path to index>, <recursively?>);
The first parameter to XMLIndexRemovePath
is a repository path expression that targets a resource to remove from indexing. The second parameter is a Boolean value that, if TRUE
, removes the targeted resource and all of its descendants from indexing; if FALSE
, only the targeted resource is removed from indexing.
See Also:
Oracle Database PL/SQL Packages and Types Reference for information on PL/SQL procedureDBMS_XDB_ADMIN.XMLIndexREmovePath
For an XMLIndex
on repository data or metadata to be used when you query the repository, all of the targeted resources must be indexed. Use SQL functions under_path
and equals_path
to target resources by repository path when you query. If you do not use either of these functions, then under_path('/')
is implied.
Example 5-31 presents a query of resource data, PurchaseOrder
elements whose Reference
element has value TFOX-20021009123335520PDT
.
Example 5-31 Using XMLIndex When Querying Resource Data
SELECT ANY_PATH FROM RESOURCE_VIEW WHERE XMLExists('declare namespace r = "http://xmlns.oracle.com/xdb/XDBResource.xsd"; (: :) /r:Resource/r:Contents/PurchaseOrder[Reference="TFOX-20021009123335520PDT"]' PASSING RES) AND under_path (RES, '/home/OE/PurchaseOrders/2002/') = 1; ANY_PATH -------------------------------------------------------------- /home/OE/PurchaseOrders/2002/Jan/TFOX-20021009123335520PDT.xml 1 row selected.
The under_path
expression here limits the paths queried so that they include only data indexed using XMLIndex
. The existsNode
expression targets XML nodes using an XPath expression.
You can query repository metadata similarly. To do that, simply target resource elements that represent metadata, either system-defined or user-defined. Example 5-32 shows a query of the CreationDate
element in system metadata.
Example 5-32 Using XMLIndex When Querying Resource Metadata
SELECT ANY_PATH FROM RESOURCE_VIEW
WHERE XMLExists('declare namespace r = "http://xmlns.oracle.com/xdb/XDBResource.xsd"; (: :)
/r:Resource/r:CreationDate/text()=xs:dateTime("2005-02-07T18:31:53.093179")'
PASSING RES)
AND under_path (RES, '/home/OE/PurchaseOrders/2002/') = 1;
ANY_PATH
-----------------------------------------------------------------
/home/OE/PurchaseOrders/2002/Apr
/home/OE/PurchaseOrders/2002/Apr/AMCEWEN-20021009123336171PDT.xml
/home/OE/PurchaseOrders/2002/Apr/AMCEWEN-20021009123336271PDT.xml
/home/OE/PurchaseOrders/2002/Apr/EABEL-20021009123336251PDT.xml
/home/OE/PurchaseOrders/2002/Apr/PTUCKER-20021009123336191PDT.xml
/home/OE/PurchaseOrders/2002/Apr/PTUCKER-20021009123336291PDT.xml
. . .
144 rows selected.
Only a user with database role XDBADMIN
can drop an XMLIndex
index created on Oracle XML DB Repository. To drop an XMLIndex
index created on the repository, do this:
CALL DBMS_XDB_ADMIN.DropRepositoryXMLIndex();
See Also:
Oracle Database PL/SQL Packages and Types Reference for information on PL/SQL procedureDBMS_XDB_ADMIN.DropRepositoryXMLIndex
You can use a PARALLEL
clause (with optional degree) when creating or altering an XMLIndex
index, to ensure that the index creation and maintenance are carried out in parallel. This can improve the performance of index creation and maintenance. It can also consume more storage, because storage parameters apply separately to each query server process. For example, an index created with an INITIAL
value of 5M and a parallelism degree of 12 consumes at least 60M of storage during index creation.
The syntax for the parallelism clause for CREATE INDEX
and ALTER INDEX
is the same as for other domain indexes:
{ NOPARALLEL | PARALLEL [ integer ] }
Example 5-33 creates an XMLIndex
index with a parallelism degree of 10.
Example 5-33 Creating an XMLIndex Index in Parallel
CREATE INDEX po_xmlindex_ix ON sale_info (sale_po_clob) INDEXTYPE IS XDB.XMLIndex
PARALLEL 10;
In Example 5-33, the path table and the secondary indexes are created with the same parallelism degree as the XMLIndex
index itself, 10, by inheritance. You can specify different parallelism degrees for these by using separate PARALLEL
clauses. Example 5-34 demonstrates this.
Example 5-34 Using Different PARALLEL Degrees for XMLIndex Internal Objects
CREATE INDEX po_xmlindex_ix ON sale_info (sale_po_clob) INDEXTYPE IS XDB.XMLIndex PARAMETERS ('PATH TABLE po_path_table (PARALLEL 10) PATH ID INDEX po_pathid_ix ORDER KEY INDEX po_orderkey_ix (PARALLEL 5)') NOPARALLEL;
In Example 5-34, the XMLIndex
index itself is created serially, because of NOPARALLEL
. The secondary index on path-table column PATHID
is also populated serially, because no parallelism is specified explicitly for it; it inherits the parallelism of the XMLIndex
index. The path table itself will be created with a parallelism degree of 10, and the secondary index on path-table column ORDER_KEY
will be populated with a degree of 5, due to their explicit parallelism specifications.
Any parallelism you specify for an XMLIndex
index, its path table, or its secondary indexes is exploited during subsequent DML operations and queries.
See Also:
Oracle Database SQL Language Reference for information on the CREATE INDEX
parallel clause
"PARAMETERS Clause for CREATE INDEX and ALTER INDEX" for the syntax of the PARAMETERS
clause
By default, XMLIndex
indexing is updated (maintained) at each DML operation, so that it remains in sync with the base table. In some situations, you might not require this, and using possibly stale indexes might be acceptable. In that use case, you can decide to defer the cost of index maintenance, performing at commit time only or at some time when database load is reduced. This can improve DML performance. It can also improve index maintenance performance by enabling bulk loading of unsynchronized index rows when an index is synchronized.
Using a stale index has no effect, other than performance, on DML operations. It can have an effect on query results, however: If the index is not up-to-date at query time, then the query results might not be up-to-date either. Even if only one column of a base table is of data type XMLType
, all queries on that table reflect the database data as of the last synchronization of the XMLIndex
index on the XMLType
column.
You can specify index maintenance deferment using the parameters clause of a CREATE INDEX
or ALTER INDEX
statement.
Be aware that even if you defer synchronization for an XMLIndex
index, the following database operations will automatically synchronize the index:
Any DDL operation on the index – ALTER INDEX
or creation of secondary indexes
Any DDL operation on the base table – ALTER TABLE
or creation of another index
Table 5-4 lists the synchronization options and the ASYNC
clause syntax you use to specify them. The ASYNC
clause is used in the PARAMETERS
clause of a CREATE INDEX
or ALTER INDEX
statement for XMLIndex
.
Table 5-4 Index Synchronization
When to Synchronize | ASYNC Clause Syntax |
---|---|
Always |
This is the default behavior. You can specify it explicitly, to cancel a previous |
Upon commit |
|
Periodically |
To use |
Manually, on demand |
You can manually synchronize the index using PL/SQL procedure |
Optional ASYNC
syntax parameter STALE
is intended for possible future use; you need never specify it explicitly. It has value FALSE
whenever ASYNC
is ALWAYS
; otherwise it has value TRUE
. Specifying an explicit STALE
value that contradicts this rule raises an error.
Example 5-35 creates an XMLIndex index that is synchronized every Monday at 3:00 pm, starting tomorrow.
Example 5-35 Specifying Deferred Synchronization for XMLIndex
CREATE INDEX po_xmlindex_ix ON po_clob (OBJECT_VALUE) INDEXTYPE IS XDB.XMLIndex
PARAMETERS ('ASYNC (SYNC EVERY "FREQ=HOURLY; INTERVAL = 1")');
Example 5-36 manually synchronizes the index created in Example 5-35.
Example 5-36 Manually Synchronizing an XMLIndex Index Using SYNCINDEX
EXEC DBMS_XMLINDEX.SyncIndex('OE', 'PO_XMLINDEX_IX');
When XMLIndex
index synchronization is deferred, all DML changes (inserts, updates, deletions) made to the base table since the last index synchronization are recorded in a table, one row per DML operation. The name of this table is the value of column PEND_TABLE_NAME
of static public views USER_XML_INDEXES
, ALL_XML_INDEXES
, and DBA_XML_INDEXES
.
You can examine this table to determine when synchronization might be appropriate for a given XMLIndex
index. The more rows, the more the index is likely to be in need of synchronization.
See Also:
Oracle Database PL/SQL Packages and Types Reference, section "Calendaring Syntax", for the syntax of repeat_interval
Oracle Database PL/SQL Packages and Types Reference for information on PL/SQL procedure DBMS_XMLINDEX.SyncIndex
The Oracle Database cost-based optimizer determines how to most cost-effectively evaluate a given query, including which indexes, if any, to use. For it to be able to do this accurately, you must collect statistics on various database objects.
For XMLIndex
, you normally need to collect statistics on only the base table on which the XMLIndex
index is defined (using, for example, procedure DBMS_STATS.gather_table_stats
). This automatically collects statistics for the XMLIndex
index itself, as well as the path table and its secondary indexes. If you delete statistics on the base table (using procedure DBMS_STATS.delete_table_stats
), then statistics on the other objects are also deleted. Similarly, if you collect statistics on the XMLIndex
index (using procedure DBMS_STATS.gather_index_stats
), then statistics are also collected on the path table and its secondary indexes.
Example 5-37 collects statistics on the base table po_clob
. Statistics are automatically collected on the XMLIndex
index, its path table, and the secondary path-table indexes.
The following are some guidelines for using XMLIndex
. These guidelines are applicable only when the two alternatives discussed return the same result set.
Avoid prefixing //
with ancestor elements. For example, use //c
, not /a/b
//c
, provided these return the same result set.
Avoid prefixing /*
with ancestor elements. For example, use /*/*/*
, not /a
/*/*
, provided these return the same result set.
Be aware that if an XPath expression indicates that the full XMLType
table or column is needed anyway, then an XMLIndex
will not be used, since it would not improve performance. This is the case, for instance, if you access the document root (/
), as in this query:
When you expect a single result, use extractValue
instead of extract
plus method getStringVal()
, so that an index on column VALUE
of the path table can be used. (You must thus provide keyword VALUE
in the PARAMETERS
clause when you create the XMLIndex
index.)
Use count(*)
, not count(extractValue(…))
in a SELECT
clause, when possible. For example, if you know that a LineItem
element in a purchase-order document has only one Description
child, use this:
SELECT count(*) FROM po_clob, XMLTable('//LineItem' PASSING OBJECT_VALUE);
Do not use this:
SELECT count(li.value) FROM po_clob p, XMLTable('//LineItem' PASSING p.OBJECT_VALUE COLUMNS value VARCHAR2(30) PATH '/LineItem/Description') li;
Reduce the number of XPath expressions used in a query FROM
list as much as possible. For example, use this:
SELECT li.description FROM po_clob p, XMLTable('PurchaseOrder/LineItems/LineItem' PASSING p.OBJECT_VALUE COLUMNS description VARCHAR2(256) PATH '/LineItem/Description') li;
Do not use this:
SELECT li.description FROM po_clob p, XMLTable('PurchaseOrder/LineItems' PASSING p.OBJECT_VALUE) ls, XMLTable('LineItems/LineItem' PASSING ls.OBJECT_VALUE COLUMNS description VARCHAR2(256) PATH '/LineItem/Description') li;
If you use an XPath expression in a query to drill down inside a virtual table (created, for example, using SQL function XMLTable
), then create a secondary index on the order key of the path table using SQL function sys_orderkey_depth
. Here is an example of such a query; the selection navigates to element Description
inside virtual line-item table li
.
SELECT li.description FROM po_clob p, XMLTable('PurchaseOrder/LineItems/LineItem' PASSING p.OBJECT_VALUE COLUMNS description VARCHAR2(256) PATH '/LineItem/Description') li;
Such queries are evaluated using function sys_orderkey_depth
, which returns the depth of the order-key value. Because the order index uses two columns, the index needed is a composite index over columns ORDER_KEY
and RID
, as well as over function sys_orderkey_depth
applied to the ORDER_KEY
value. For example:
CREATE INDEX depth_ix ON my_path_table (RID, sys_orderkey_depth(ORDER_KEY), ORDER_KEY);
This section presents the syntax for the PARAMETERS
clause of SQL statements CREATE INDEX
and ALTER INDEX
for use with XMLIndex
.
The PARAMETERS
clause is PARAMETERS ('
XMLIndex_parameters
')
, where XMLIndex_parameters
is one or more repetitions of XMLIndex_parameter_clause
:
XMLIndex_parameters ::=
XMLIndex_parameter_clause ::=
create_index_paths_clause ::=
alter_index_paths_clause ::=
namespace_mapping_clause ::=
path_table_clause ::=
path_id_clause ::=
order_key_clause ::=
xml_index_value_clause ::=
There can be at most one XMLIndex_parameter_clause
of each type in any XMLIndex_parameters
clause. For example, there can be at most one PATHS
clause, at most one path_table_clause
, and so on.
Clause XMLIndex_parameter_clause
can be used with ALTER INDEX
only at the table level, and only to add and remove paths.
The following considerations apply to using the PATHS
clause.
There can be at most one PATHS
clause in a CREATE INDEX
statement. That is, there can be at most one occurrence of PATHS
followed by create_index_paths_clause
.
Clause create_index_paths_clause
is used only with CREATE INDEX
; alter_index_paths_clause
is used only with ALTER INDEX
.
The following considerations apply to using create_index_paths_clause
and alter_index_paths_clause
.
The INDEX_ALL_PATHS
clause rebuilds the index with all paths included.
An explicit list of paths to index can include wildcards and //
.
XPaths_list
is a list of one or more XPath expressions, each of which includes only child axis, descendant axis, name test, and wildcard (*
) constructs.
If XPaths_list
is omitted, then everything in all documents is indexed.
For each unique namespace prefix that is used in an XPath expression in XPaths_list
, a standard XML namespace
declaration is needed, to provide the corresponding namespace information.
You can change an index in ways that are not reflected directly in the syntax by dropping it and then creating it again as needed. For example, to change an index that was defined by including paths to one that is defined by excluding paths, drop it and then create it using EXCLUDE
.
The following considerations apply to using xml_index_value_clause
.
Column VALUE
is created as VARCHAR2(4000)
.
If clause xml_index_value_clause
consists only of the keyword VALUE
, then the value index is created with the usual default attributes.
If clause path_id_clause
consists only of the keywords PATH ID
, then the path-id index is created with the usual default attributes.
If clause order_key_clause
consists only of the keywords ORDER KEY
, then the order-key index is created with the usual default attributes.
The following considerations apply to using the ASYNC
clause.
ALWAYS
means automatic synchronization occurs for each DML statement.
MANUAL
means no automatic synchronization occurs. You must manually synchronize the index using DBMS_XMLINDEX.SyncIndex
.
EVERY
repeat_interval
means automatically synchronize the index at interval repeat_interval
. The syntax of repeat_interval
is the same as that for PL/SQL package DBMS_SCHEDULER
, and it must be enclosed in double-quotes ("
). To use EVERY
you must have the CREATE JOB
privilege.
ON COMMIT
means synchronize the index immediately after a commit operation. The commit does not return until the synchronization is complete. Since the synchronization is performed as a separate transaction, there can be a short period when the data is committed but index changes are not yet committed.
STALE
is optional. A value of TRUE
means that query results might be stale; a value of FALSE
means that query results are always up-to-date. The default value, and the only permitted explicitly specified value, is as follows.
For ALWAYS
, STALE
is TRUE
.
For any other ASYNC
option besides ALWAYS
, STALE
is FALSE
.
See Also:
Oracle Database SQL Language Reference for the syntax of index_attributes
Oracle Database SQL Language Reference for the syntax of segment_attributes_clause
Oracle Database SQL Language Reference for the syntax of table_properties
Oracle Database SQL Language Reference for the syntax of parallel_clause
Oracle Database SQL Language Reference for additional information about the syntax and semantics of CREATE INDEX
Oracle Database SQL Language Reference for additional information about the syntax and semantics of ALTER INDEX
Oracle Database PL/SQL Packages and Types Reference, section "Calendaring Syntax", for the syntax of repeat_interval
You can create an Oracle Text index on an XMLType
column. An Oracle Text CONTEXT
index enables SQL function contains
for full-text search over XML. With structured storage, XPath rewrite can often rewrite XPath function ora:contains
to SQL function contains
, so in those cases too an Oracle Text index can be employed.
See Also:
Chapter 11, "Full-Text Search Over XML Data" for more information about using Oracle Text operations with Oracle XML DBTo create an Oracle Text index, use CREATE INDEX
, specifying the INDEXTYPE
as CTXSYS.CONTEXT
, as illustrated in Example 5-38.
Example 5-38 Creating an Oracle Text Index
CREATE INDEX po_otext_ix ON po_clob (OBJECT_VALUE) INDEXTYPE IS CTXSYS.CONTEXT; Index created.
You can also perform Oracle Text operations such as contains
and score
on XMLType
columns. Example 5-39 shows an Oracle Text search using SQL function contains
.
Example 5-39 Searching XML Data Using SQL Function CONTAINS
SELECT DISTINCT extractValue(OBJECT_VALUE,
'/PurchaseOrder/ShippingInstructions/address') "Address"
FROM po_clob
WHERE contains(OBJECT_VALUE,
'$(Fortieth) INPATH (PurchaseOrder/ShippingInstructions/address)') > 0;
Address
------------------------------
1200 East Forty Seventh Avenue
New York
NY
10024
USA
1 row selected.
The explain plan for this query shows, in two ways, that the Oracle Text CONTEXT
index is used: it references the index explicitly, as a domain index, and it refers to SQL function contains
in the predicate information.
PLAN_TABLE_OUTPUT -------------------------------------------------------------------------------------------- Plan hash value: 274475732 -------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | -------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 7 | 14098 | 10 (10)| 00:00:01 | | 1 | HASH UNIQUE | | 7 | 14098 | 10 (10)| 00:00:01 | | 2 | TABLE ACCESS BY INDEX ROWID| PO_CLOB | 7 | 14098 | 9 (0)| 00:00:01 | |* 3 | DOMAIN INDEX | PO_OTEXT_IX | | | 4 (0)| 00:00:01 | -------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 3 - access("CTXSYS"."CONTAINS"(SYS_MAKEXML('2B0A2483AB140B35E040578C8A173FEC',523 3,"XMLDATA"),'$(Fortieth) INPATH (PurchaseOrder/ShippingInstructions/address)')>0) 20 rows selected.
Oracle Text indexing is completely orthogonal to the other types of indexing described in this chapter. Whenever SQL function contains
or XPath function ora:contains
is used, an Oracle Text index can be used for full-text search.
Example 5-40 demonstrates this in the case where both an XMLIndex
index and an Oracle Text index are defined on the same XML data. The query is the same as in Example 5-39. The Oracle Text index is created on the VALUE
column of the XMLIndex
path table of Example 5-14.
Example 5-40 Using an Oracle Text Index and an XMLIndex Index
CREATE INDEX po_otext_ix ON my_path_table (VALUE) INDEXTYPE IS CTXSYS.CONTEXT; Index created. EXPLAIN PLAN FOR SELECT DISTINCT extractValue(OBJECT_VALUE, '/PurchaseOrder/ShippingInstructions/address') "Address" FROM po_clob WHERE contains(OBJECT_VALUE, '$(Fortieth) INPATH (PurchaseOrder/ShippingInstructions/address)') > 0; Explained. -- SET ECHO OFF; PLAN_TABLE_OUTPUT ------------------------------------------------------------------------------------------------------------- Plan hash value: 2664483039 ------------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 2014 | 3 (34)| 00:00:01 | |* 1 | TABLE ACCESS BY INDEX ROWID| MY_PATH_TABLE | 1 | 3524 | 1 (0)| 00:00:01 | |* 2 | INDEX RANGE SCAN | SYS78942_PO_XMLINDE_ORDKEY_IX | 1 | | 2 (0)| 00:00:01 | | 3 | HASH UNIQUE | | 1 | 2014 | 3 (34)| 00:00:01 | |* 4 | TABLE ACCESS FULL | PO_CLOB | 1 | 2014 | 2 (0)| 00:00:01 | ------------------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter("SYS_P0"."PATHID"=HEXTORAW('35EF580A') AND SYS_XMLI_LOC_ISNODE("SYS_P0"."LOCATOR")=1) 2 - access("SYS_P0"."RID"=:B1) filter("SYS_P0"."RID"=:B1) 4 - filter("CTXSYS"."CONTAINS"(SYS_MAKEXML("PO_CLOB"."XMLDATA"),'$(Fortieth) INPATH (PurchaseOrder/ShippingInstructions/address)')>0) Note ----- - dynamic sampling used for this statement 24 rows selected.
The explain plan in Example 5-40 references both the XMLIndex
index and the Oracle Text index, indicating that both are used.
The XMLIndex
index is indicated by its path table, MY_PATH_TABLE
, and its order-key index, SYS78942_PO_XMLINDE_ORDKEY_IX
.
The Oracle Text index is indicated by the reference to SQL function contains
in the predicate information.
Footnote Legend
Footnote 1: The actual path table implementation may be slightly different.