Oracle® Text Reference
11g Release 1 (11.1)
Part Number B28304-01
Contents
List of Tables
Title and Copyright Information
Preface
Audience
Documentation Accessibility
Related Documentation
Conventions
What's New in Oracle Text?
Oracle Database 11g Release 1 (11.1) New Features in Oracle Text
Oracle Database 10g Release 2 (10.2) New Features in Oracle Text
1
Oracle Text SQL Statements and Operators
ALTER INDEX
ALTER TABLE: Supported Partitioning Statements
CATSEARCH
CONTAINS
CREATE INDEX
DROP INDEX
MATCHES
MATCH_SCORE
SCORE
2
Oracle Text Indexing Elements
2.1
Overview
2.1.1
Creating Preferences
2.2
Datastore Types
2.2.1
DIRECT_DATASTORE
2.2.1.1
DIRECT_DATASTORE CLOB Example
2.2.2
MULTI_COLUMN_DATASTORE
2.2.2.1
Indexing and DML
2.2.2.2
MULTI_COLUMN_DATASTORE Restriction
2.2.2.3
MULTI_COLUMN_DATASTORE Example
2.2.2.4
MULTI_COLUMN_DATASTORE Filter Example
2.2.2.5
Tagging Behavior
2.2.2.6
Indexing Columns as Sections
2.2.3
DETAIL_DATASTORE
2.2.3.1
Synchronizing Master/Detail Indexes
2.2.3.2
Example Master/Detail Tables
2.2.4
FILE_DATASTORE
2.2.4.1
PATH Attribute Limitations
2.2.4.2
FILE_DATASTORE and Security
2.2.4.3
FILE_DATASTORE Example
2.2.5
URL_DATASTORE
2.2.5.1
URL Syntax
2.2.5.2
URL_DATASTORE Attributes
2.2.5.3
URL_DATASTORE and Security
2.2.5.4
URL_DATASTORE Example
2.2.6
USER_DATASTORE
2.2.6.1
Constraints
2.2.6.2
Editing Procedure after Indexing
2.2.6.3
USER_DATASTORE with CLOB Example
2.2.6.4
USER_DATASTORE with BLOB_LOC Example
2.2.7
NESTED_DATASTORE
2.2.7.1
NESTED_DATASTORE Example
2.3
Filter Types
2.3.1
CHARSET_FILTER
2.3.1.1
UTF-16 Big- and Little-Endian Detection
2.3.1.2
Indexing Mixed-Character Set Columns
2.3.2
AUTO_FILTER
2.3.2.1
Indexing Formatted Documents
2.3.2.2
Explicitly Bypassing Plain Text or HTML in Mixed Format Columns
2.3.2.3
Character Set Conversion With AUTO_FILTER
2.3.3
NULL_FILTER
2.3.3.1
Indexing HTML Documents
2.3.4
MAIL_FILTER
2.3.4.1
Filter Behavior
2.3.4.2
About the Mail Filter Configuration File
2.3.4.3
MAIL_FILTER Example
2.3.5
USER_FILTER
2.3.5.1
User Filter Example
2.3.6
PROCEDURE_FILTER
2.3.6.1
Parameter Order
2.3.6.2
Procedure Filter Execute Requirements
2.3.6.3
Error Handling
2.3.6.4
Procedure Filter Preference Example
2.4
Lexer Types
2.4.1
AUTO_LEXER
2.4.1.1
AUTO_LEXER Attributes Inherited from BASIC_LEXER
2.4.1.2
AUTO_LEXER Language-Independent Attributes
2.4.1.3
AUTO_LEXER Language-Dependent Attributes
2.4.1.4
AUTO_LEXER User-Defined Dictionary Attributes
2.4.2
BASIC_LEXER
2.4.2.1
Stemming User-Dictionaries
2.4.2.2
BASIC_LEXER Example
2.4.3
MULTI_LEXER
2.4.3.1
Multi-language Stoplists
2.4.3.2
MULTI_LEXER Example
2.4.3.3
Querying Multi-Language Tables
2.4.4
CHINESE_VGRAM_LEXER
2.4.4.1
CHINESE_VGRAM_LEXER Attribute
2.4.4.2
Character Sets
2.4.5
CHINESE_LEXER
2.4.5.1
CHINESE_LEXER Attribute
2.4.5.2
Customizing the Chinese Lexicon
2.4.6
JAPANESE_VGRAM_LEXER
2.4.6.1
JAPANESE_VGRAM_LEXER Attributes
2.4.6.2
JAPANESE_VGRAM_LEXER Character Sets
2.4.7
JAPANESE_LEXER
2.4.7.1
Customizing the Japanese Lexicon
2.4.7.2
JAPANESE_LEXER Attributes
2.4.7.3
JAPANESE_LEXER Character Sets
2.4.7.4
Japanese Lexer Example
2.4.8
KOREAN_MORPH_LEXER
2.4.8.1
Supplied Dictionaries
2.4.8.2
Supported Character Sets
2.4.8.3
Unicode Support
2.4.8.4
KOREAN_MORPH_LEXER Attributes
2.4.8.5
Limitations
2.4.8.6
KOREAN_MORPH_LEXER Example: Setting Composite Attribute
2.4.9
USER_LEXER
2.4.9.1
Limitations
2.4.9.2
USER_LEXER Attributes
2.4.9.3
INDEX_PROCEDURE
2.4.9.4
INPUT_TYPE
2.4.9.5
QUERY_PROCEDURE
2.4.9.6
Encoding Tokens as XML
2.4.9.7
XML Schema for No-Location, User-defined Indexing Procedure
2.4.9.8
XML Schema for User-defined Indexing Procedure with Location
2.4.9.9
XML Schema for User-defined Lexer Query Procedure
2.4.10
WORLD_LEXER
2.4.10.1
WORLD_LEXER Attribute
2.4.10.2
WORLD_LEXER Example
2.5
Wordlist Type
2.5.1
BASIC_WORDLIST
2.5.2
BASIC_WORDLIST Example
2.5.2.1
Enabling Fuzzy Matching and Stemming
2.5.2.2
Enabling Sub-string and Prefix Indexing
2.5.2.3
Setting Wildcard Expansion Limit
2.6
Storage Types
2.6.1
BASIC_STORAGE
2.6.1.1
Storage Default Behavior
2.6.1.2
Storage Examples
2.7
Section Group Types
2.7.1
Section Group Examples
2.7.1.1
Creating Section Groups in HTML Documents
2.7.1.2
Creating Section Groups in XML Documents
2.7.1.3
Automatic Sectioning in XML Documents
2.8
Classifier Types
2.8.1
RULE_CLASSIFIER
2.8.2
SVM_CLASSIFIER
2.9
Cluster Types
2.9.1
KMEAN_CLUSTERING
2.10
Stoplists
2.10.1
Multi-Language Stoplists
2.10.2
Creating Stoplists
2.10.3
Modifying the Default Stoplist
2.10.3.1
Dynamic Addition of Stopwords
2.11
System-Defined Preferences
2.11.1
Data Storage
2.11.1.1
CTXSYS.DEFAULT_DATASTORE
2.11.1.2
CTXSYS.FILE_DATASTORE
2.11.1.3
CTXSYS.URL_DATASTORE
2.11.2
Filter
2.11.2.1
CTXSYS.NULL_FILTER
2.11.2.2
CTXSYS.AUTO_FILTER
2.11.3
Lexer
2.11.3.1
CTXSYS.DEFAULT_LEXER
2.11.3.2
CTXSYS.BASIC_LEXER
2.11.4
Section Group
2.11.4.1
CTXSYS.NULL_SECTION_GROUP
2.11.4.2
CTXSYS.HTML_SECTION_GROUP
2.11.4.3
CTXSYS.AUTO_SECTION_GROUP
2.11.4.4
CTXSYS.PATH_SECTION_GROUP
2.11.5
Stoplist
2.11.5.1
CTXSYS.DEFAULT_STOPLIST
2.11.5.2
CTXSYS.EMPTY_STOPLIST
2.11.6
Storage
2.11.6.1
CTXSYS.DEFAULT_STORAGE
2.11.7
Wordlist
2.11.7.1
CTXSYS.DEFAULT_WORDLIST
2.12
System Parameters
2.12.1
General System Parameters
2.12.2
Default Index Parameters
2.12.2.1
CONTEXT Index Parameters
2.12.2.2
CTXCAT Index Parameters
2.12.2.3
CTXRULE Index Parameters
2.12.2.4
Viewing Default Values
2.12.2.5
Changing Default Values
3
Oracle Text CONTAINS Query Operators
3.1
Operator Precedence
3.1.1
Group 1 Operators
3.1.2
Group 2 Operators and Characters
3.1.3
Procedural Operators
3.1.4
Precedence Examples
3.1.5
Altering Precedence
ABOUT
ACCUMulate ( , )
AND (&)
Broader Term (BT, BTG, BTP, BTI)
DEFINEMERGE
DEFINESCORE
EQUIValence (=)
Fuzzy
HASPATH
INPATH
MDATA
MINUS (-)
Narrower Term (NT, NTG, NTP, NTI)
NEAR (;)
NOT (~)
OR (|)
Preferred Term (PT)
Related Term (RT)
SDATA
soundex (!)
stem ($)
Stored Query Expression (SQE)
SYNonym (SYN)
threshold (>)
Translation Term (TR)
Translation Term Synonym (TRSYN)
Top Term (TT)
weight (*)
wildcards (% _)
WITHIN
4
Special Characters in Oracle Text Queries
4.1
Grouping Characters
4.2
Escape Characters
4.2.1
Querying Escape Characters
4.3
Reserved Words and Characters
5
CTX_ADM Package
MARK_FAILED
RECOVER
SET_PARAMETER
6
CTX_CLS Package
TRAIN
CLUSTERING
7
CTX_DDL Package
ADD_ATTR_SECTION
ADD_FIELD_SECTION
ADD_INDEX
ADD_MDATA
ADD_MDATA_COLUMN
ADD_MDATA_SECTION
ADD_SDATA_COLUMN
ADD_SDATA_SECTION
ADD_SPECIAL_SECTION
ADD_STOPCLASS
ADD_STOP_SECTION
ADD_STOPTHEME
ADD_STOPWORD
ADD_SUB_LEXER
ADD_ZONE_SECTION
COPY_POLICY
CREATE_INDEX_SET
CREATE_POLICY
CREATE_PREFERENCE
CREATE_SECTION_GROUP
CREATE_SHADOW_INDEX
CREATE_STOPLIST
DROP_INDEX_SET
DROP_POLICY
DROP_PREFERENCE
DROP_SECTION_GROUP
DROP_SHADOW_INDEX
DROP_STOPLIST
EXCHANGE_SHADOW_INDEX
OPTIMIZE_INDEX
POPULATE_PENDING
RECREATE_INDEX_ONLINE
REMOVE_INDEX
REMOVE_MDATA
REMOVE_SECTION
REMOVE_STOPCLASS
REMOVE_STOPTHEME
REMOVE_STOPWORD
REMOVE_SUB_LEXER
REPLACE_INDEX_METADATA
SET_ATTRIBUTE
SYNC_INDEX
UNSET_ATTRIBUTE
UPDATE_POLICY
8
CTX_DOC Package
FILTER
GIST
HIGHLIGHT
IFILTER
MARKUP
PKENCODE
POLICY_FILTER
POLICY_GIST
POLICY_HIGHLIGHT
POLICY_MARKUP
POLICY_SNIPPET
POLICY_THEMES
POLICY_TOKENS
SET_KEY_TYPE
SNIPPET
THEMES
TOKENS
9
CTX_OUTPUT Package
ADD_EVENT
ADD_TRACE
DISABLE_QUERY_STATS
ENABLE_QUERY_STATS
END_LOG
END_QUERY_LOG
GET_TRACE_VALUE
LOG_TRACES
LOGFILENAME
REMOVE_EVENT
REMOVE_TRACE
RESET_TRACE
START_LOG
START_QUERY_LOG
10
CTX_QUERY Package
BROWSE_WORDS
COUNT_HITS
EXPLAIN
HFEEDBACK
REMOVE_SQE
STORE_SQE
11
CTX_REPORT Package
11.1
Procedures in CTX_REPORT
11.2
Using the Function Versions
DESCRIBE_INDEX
DESCRIBE_POLICY
CREATE_INDEX_SCRIPT
CREATE_POLICY_SCRIPT
INDEX_SIZE
INDEX_STATS
QUERY_LOG_SUMMARY
TOKEN_INFO
TOKEN_TYPE
12
CTX_THES Package
ALTER_PHRASE
ALTER_THESAURUS
BT
BTG
BTI
BTP
CREATE_PHRASE
CREATE_RELATION
CREATE_THESAURUS
CREATE_TRANSLATION
DROP_PHRASE
DROP_RELATION
DROP_THESAURUS
DROP_TRANSLATION
HAS_RELATION
NT
NTG
NTI
NTP
OUTPUT_STYLE
PT
RT
SN
SYN
THES_TT
TR
TRSYN
TT
UPDATE_TRANSLATION
13
CTX_ULEXER Package
WILDCARD_TAB
14
Oracle Text Utilities
14.1
Thesaurus Loader (ctxload)
14.1.1
Text Loading
14.1.2
ctxload Syntax
14.1.2.1
Mandatory Arguments
14.1.2.2
Optional Arguments
14.1.3
ctxload Examples
14.1.3.1
Thesaurus Import Example
14.1.3.2
Thesaurus Export Example
14.2
Knowledge Base Extension Compiler (ctxkbtc)
14.2.1
Knowledge Base Character Set
14.2.2
ctxkbtc Syntax
14.2.3
ctxkbtc Usage Notes
14.2.4
ctxkbtc Limitations
14.2.5
ctxkbtc Constraints on Thesaurus Terms
14.2.6
ctxkbtc Constraints on Thesaurus Relations
14.2.7
Extending the Knowledge Base
14.2.7.1
Example for Extending the Knowledge Base
14.2.8
Adding a Language-Specific Knowledge Base
14.2.8.1
Limitations for Adding a Knowledge Base
14.2.9
Order of Precedence for Multiple Thesauri
14.2.10
Size Limits for Extended Knowledge Base
14.3
Lexical Compiler (ctxlc)
14.3.1
Syntax of ctxlc
14.3.1.1
Mandatory Arguments
14.3.1.2
Optional Arguments
14.3.2
Performance Considerations
14.3.3
ctxlc Usage Notes
14.3.4
Example
15
Oracle Text Alternative Spelling
15.1
Overview of Alternative Spelling Features
15.1.1
Alternate Spelling
15.1.2
Base-Letter Conversion
15.1.2.1
Generic Versus Language-Specific Base-Letter Conversions
15.1.3
New German Spelling
15.2
Overriding Alternative Spelling Features
15.2.1
Overriding Base-Letter Transformations with Alternate Spelling
15.3
Alternative Spelling Conventions
15.3.1
German Alternate Spelling Conventions
15.3.2
Danish Alternate Spelling Conventions
15.3.3
Swedish Alternate Spelling Conventions
A
Oracle Text Result Tables
A.1
CTX_QUERY Result Tables
A.1.1
EXPLAIN Table
A.1.1.1
Operation Column Values
A.1.1.2
OPTIONS Column Values
A.1.2
HFEEDBACK Table
A.1.2.1
Operation Column Values
A.1.2.2
OPTIONS Column Values
A.1.2.3
CTX_FEEDBACK_TYPE
A.2
CTX_DOC Result Tables
A.2.1
Filter Table
A.2.2
Gist Table
A.2.3
Highlight Table
A.2.4
Markup Table
A.2.5
Theme Table
A.2.6
Token Table
A.3
CTX_THES Result Tables and Data Types
A.3.1
EXP_TAB Table Type
B
Oracle Text Supported Document Formats
B.1
About Document Filtering Technology
B.1.1
Latest Updates for Patch Releases
B.1.2
Restrictions on Format Support
B.1.3
Supported Platforms for AUTO_FILTER Document Filtering Technology
B.1.3.1
Supported Platforms
B.1.4
Environment Variables
B.1.5
General Limitations
B.2
Supported Document Formats
B.2.1
Text and Markup
B.2.2
Word Processing Formats
B.2.2.1
Word Processing Filtering Limitations
B.2.3
Spreadsheet Formats
B.2.3.1
Spreadsheet Format Limitations
B.2.4
Presentation Formats
B.2.4.1
Presentation Format Limitations
B.2.5
Display Formats
B.2.5.1
Filtering of PDF Format Documents
B.2.5.2
PDF Filtering Limitations
B.2.6
Graphic Formats
B.2.6.1
Graphics Formats Limitations
C
Text Loading Examples for Oracle Text
C.1
SQL INSERT Example
C.2
SQL*Loader Example
C.2.1
Creating the Table
C.2.2
Issuing the SQL*Loader Command
C.2.2.1
Example Control File: loader1.dat
C.2.2.2
Example Data File: loader2.dat
C.3
Structure of ctxload Thesaurus Import File
C.3.1
Alternate Hierarchy Structure
C.3.2
Usage Notes for Terms in Import Files
C.3.3
Usage Notes for Relationships in Import Files
C.3.4
Examples of Import Files
C.3.4.1
Example 1 (Flat Structure)
C.3.4.2
Example 2 (Hierarchical)
C.3.4.3
Example 3
D
Oracle Text Multilingual Features
D.1
Introduction
D.2
Indexing
D.2.1
Multilingual Features for Text Index Types
D.2.1.1
CONTEXT Index Type
D.2.1.2
CTXCAT Index Type
D.2.1.3
CTXRULE Index Type
D.2.2
Lexer Types
D.2.3
Auto Lexer Features
D.2.4
Basic Lexer Features
D.2.4.1
Theme Indexing
D.2.4.2
Alternate Spelling
D.2.4.3
Base Letter Conversion
D.2.4.4
Composite
D.2.4.5
Index Stems
D.2.5
Multi Lexer Features
D.2.6
World Lexer Features
D.3
Querying
D.3.1
ABOUT Operator
D.3.2
Fuzzy Operator
D.3.3
Stem Operator
D.4
Supplied Stop Lists
D.5
Knowledge Base
D.5.1
Knowledge Base Extension
D.6
Multilingual Features Matrix
E
Oracle Text Supplied Stoplists
E.1
English Default Stoplist
E.2
Chinese Stoplist (Traditional)
E.3
Chinese Stoplist (Simplified)
E.4
Danish (dk) Default Stoplist
E.5
Dutch (nl) Default Stoplist
E.6
Finnish (sf) Default Stoplist
E.7
French (f) Default Stoplist
E.8
German (d) Default Stoplist
E.9
Italian (i) Default Stoplist
E.10
Portuguese (pt) Default Stoplist
E.11
Spanish (e) Default Stoplist
E.12
Swedish (s) Default Stoplist
F
The Oracle Text Scoring Algorithm
F.1
Scoring Algorithm for Word Queries
F.1.1
Word Scoring Example
F.1.2
DML and Scoring Algorithm
G
Oracle Text Views
G.1
CTX_CLASSES
G.2
CTX_FILTER_BY_COLUMNS
G.3
CTX_INDEXES
G.4
CTX_INDEX_ERRORS
G.5
CTX_INDEX_OBJECTS
G.6
CTX_INDEX_PARTITIONS
G.7
CTX_INDEX_SETS
G.8
CTX_INDEX_SET_INDEXES
G.9
CTX_INDEX_SUB_LEXERS
G.10
CTX_INDEX_SUB_LEXER_VALUES
G.11
CTX_INDEX_VALUES
G.12
CTX_OBJECTS
G.13
CTX_OBJECT_ATTRIBUTES
G.14
CTX_OBJECT_ATTRIBUTE_LOV
G.15
CTX_ORDER_BY_COLUMNS
G.16
CTX_PARAMETERS
G.17
CTX_PENDING
G.18
CTX_PREFERENCES
G.19
CTX_PREFERENCE_VALUES
G.20
CTX_SECTIONS
G.21
CTX_SECTION_GROUPS
G.22
CTX_SQES
G.23
CTX_STOPLISTS
G.24
CTX_STOPWORDS
G.25
CTX_SUB_LEXERS
G.26
CTX_THESAURI
G.27
CTX_THES_PHRASES
G.28
CTX_TRACE_VALUES
G.29
CTX_USER_FILTER_BY_COLUMNS
G.30
CTX_USER_INDEXES
G.31
CTX_USER_INDEX_ERRORS
G.32
CTX_USER_INDEX_OBJECTS
G.33
CTX_USER_INDEX_PARTITIONS
G.34
CTX_USER_INDEX_SETS
G.35
CTX_USER_INDEX_SET_INDEXES
G.36
CTX_USER_INDEX_SUB_LEXERS
G.37
CTX_USER_INDEX_SUB_LEXER_VALS
G.38
CTX_USER_INDEX_VALUES
G.39
CTX_USER_ORDER_BY_COLUMNS
G.40
CTX_USER_PENDING
G.41
CTX_USER_PREFERENCES
G.42
CTX_USER_PREFERENCE_VALUES
G.43
CTX_USER_SECTIONS
G.44
CTX_USER_SECTION_GROUPS
G.45
CTX_USER_SQES
G.46
CTX_USER_STOPLISTS
G.47
CTX_USER_STOPWORDS
G.48
CTX_USER_SUB_LEXERS
G.49
CTX_USER_THESAURI
G.50
CTX_USER_THES_PHRASES
G.51
CTX_VERSION
H
Stopword Transformations in Oracle Text
H.1
Understanding Stopword Transformations
H.1.1
Word Transformations
H.1.2
AND Transformations
H.1.3
OR Transformations
H.1.4
ACCUMulate Transformations
H.1.5
MINUS Transformations
H.1.6
NOT Transformations
H.1.7
EQUIValence Transformations
H.1.8
NEAR Transformations
H.1.9
Weight Transformations
H.1.10
Threshold Transformations
H.1.11
WITHIN Transformations
I
AUTO_LEXER Parts-of-Speech Tagging
I.1
Tagging in Arabic
I.2
Tagging in Catalan
I.3
Tagging in Chinese - Traditional and Simplified
I.4
Tagging in Croatian
I.5
Tagging in Danish
I.6
Tagging in Dutch
I.7
Tagging in English
I.8
Tagging in Farsi
I.9
Tagging in Finnish
I.10
Tagging in French
I.11
Tagging in German
I.12
Tagging in Italian
I.13
Tagging in Japanese
I.14
Tagging in Korean
I.15
Tagging in Bokmal
I.16
Tagging in Nynorsk
I.17
Tagging in Portuguese
I.18
Tagging in Russian
I.19
Tagging in Slovak
I.20
Tagging in Slovenian
I.21
Tagging in Spanish
I.22
Tagging in Swedish
Index