Integrating Oracle Context
Discussion
Oracle Context is a full-text search extension to the Oracle Database engine. Among other features, it supports mixing full-text qualifiers with traditional relational qualifiers and creating GISTS and THEMES in the background. Because EOF supports custom SQL qualifiers, it is fairly easy to integrate Context searches with EOF.
CONTAINS()
Oracle Context adds a new SQL keyword, CONTAINS, to the SQL parser inside the Oracle database. CONTAINS takes three parameters, and its result is compared against a score threshold:
CONTAINS(columnName, 'some text to search for', containsId) > score
- ColumnName is a column within the selected table that has a special Context index created for it.
- ContainsId is an integer number that uniquely identifies a given CONTAINS clause within a Select statement. This identifier enables you to use logical operators to combine multiple CONTAINS clauses within a single Select statement.
- Score is the score threshold you must overcome to consider a record a hit (usually set to zero).
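The clause described above can be assembled mechanically. The following is a minimal sketch in Python, used here only to show the shape of the SQL text; the function name `contains_clause` and the column name `DOC_TEXT` are hypothetical, and the sketch only escapes single quotes for the SQL literal. It does not interpret any operators of the Context query language.

```python
def contains_clause(column, text, contains_id=1, threshold=0):
    """Build an Oracle Context CONTAINS predicate as a SQL string.

    column      -- name of a column that has a Context index (assumed)
    text        -- the text to search for
    contains_id -- integer identifying this CONTAINS clause in the Select
    threshold   -- score a record must exceed to count as a hit
    """
    # Double any single quotes so the search text stays inside the SQL literal.
    escaped = text.replace("'", "''")
    return f"CONTAINS({column}, '{escaped}', {int(contains_id)}) > {int(threshold)}"


# Hypothetical column name, default containsId of 1 and threshold of 0.
print(contains_clause("DOC_TEXT", "enterprise objects"))
```

Each CONTAINS clause in the same Select statement would need its own `contains_id` value.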
SCORE()
You can obtain the score of each matching record by adding SCORE(containsId) to the SQL Select statement. However, this is a problem for EOF because Select statements are usually generated automatically from the fixed list of attributes in an entity. If you always search with a fixed number of CONTAINS clauses, you can make a copy of the entity, add SCORE(containsId) as a derived column for each clause, and fetch from this special entity that has the SCORE attribute(s).
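To make the relationship between SCORE and CONTAINS concrete, here is a small Python sketch of the Select statement that such a copied entity would need to produce. The table and column names (`DOCUMENT`, `DOC_ID`, `TITLE`, `DOC_TEXT`) and the function name are assumptions for illustration; EOF would generate the actual SQL from the entity's attributes.

```python
def select_with_score(table, columns, indexed_column, text, contains_id=1):
    """Build a Select that returns SCORE(n) alongside the regular columns.

    The same contains_id must appear in both SCORE() and CONTAINS() so that
    the score refers to the matching CONTAINS clause.
    """
    escaped = text.replace("'", "''")  # keep the text inside the SQL literal
    select_list = ", ".join(list(columns) + [f"SCORE({int(contains_id)})"])
    return (
        f"SELECT {select_list} FROM {table} "
        f"WHERE CONTAINS({indexed_column}, '{escaped}', {int(contains_id)}) > 0"
    )


# Hypothetical table and column names.
print(select_with_score("DOCUMENT", ["DOC_ID", "TITLE"], "DOC_TEXT", "oracle"))
```

In the copied entity, the trailing `SCORE(1)` entry is what the derived column definition would supply.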
EOSQLQualifier for the CONTAINS Clause
In EOF, you can create an EOSQLQualifier to represent the CONTAINS clause and then AND or OR that EOSQLQualifier with another EOQualifier to obtain a complete compound qualifier.
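The effect of combining the two qualifiers can be pictured at the SQL level. The Python sketch below is not EOF code; it only illustrates, under assumed column names (`DOC_TEXT`, `STATUS`), how a CONTAINS predicate and an ordinary relational predicate end up joined in one WHERE clause. The helper name `combine` is hypothetical.

```python
def combine(operator, *clauses):
    """Join SQL predicate strings with AND or OR, parenthesizing each one."""
    if operator not in ("AND", "OR"):
        raise ValueError("operator must be 'AND' or 'OR'")
    if not clauses:
        raise ValueError("at least one clause is required")
    return f" {operator} ".join(f"({clause})" for clause in clauses)


# Full-text predicate (the part an EOSQLQualifier would carry).
text_match = "CONTAINS(DOC_TEXT, 'webobjects', 1) > 0"
# Ordinary relational predicate (the part a regular EOQualifier would express).
relational = "STATUS = 'PUBLISHED'"

print(combine("AND", text_match, relational))
```

Parenthesizing each predicate keeps the grouping unambiguous when AND and OR are nested.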
Overcoming DISTINCT Problems
A frequent problem when working with Context is that you might get duplicate records matching your query. The obvious way to eliminate the duplicates is to use the DISTINCT keyword in your Select (see "Fetching Distinct Results"). The problem arises because the column that is indexed with Oracle Context usually contains a LONG, a LONG RAW, or a BLOB, and Oracle does not support using DISTINCT with these types of columns. This is one way to solve this problem:
- Put the LONG RAW data you are indexing in a separate table with just the primary key and the LONG RAW column (for example, DOCUMENT_DATA).
- Make an entity for the table with all the other attributes (for example, DOCUMENT).
- Make a to-one relationship that propagates the primary key to the table with the LONG RAW (for example, toDocumentData).
- Make the inverse relationship from the LONG RAW table to the real table (for example, toDocument).
- Query the DOCUMENT entity with an EOFetchSpecification that has setUsesDistinct:YES, using CONTAINS(toDocumentData.data,'some string',1)>0.
- Then, if you really want the LONG RAW data, traverse the toDocumentData.data key path to get it.
Breaking up the long raw data into separate tables is a good convention because many databases have restrictions on using DISTINCT with long raw (memo) columns.
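The point of the split is that DISTINCT is applied only to columns of the DOCUMENT table, while the LONG RAW column appears only inside CONTAINS. The Python sketch below shows the kind of join this arrangement implies. DOCUMENT and DOCUMENT_DATA come from the steps above; the column names `DOC_ID`, `TITLE`, and `DATA`, the table aliases, and the function name are assumptions, and the SQL that EOF actually generates may differ in form.

```python
def distinct_document_query(text, columns=("DOC_ID", "TITLE")):
    """Build a DISTINCT query over DOCUMENT that searches DOCUMENT_DATA.

    Only DOCUMENT columns are selected, so DISTINCT never touches the
    LONG RAW column; that column is referenced only inside CONTAINS.
    """
    escaped = text.replace("'", "''")  # keep the text inside the SQL literal
    select_list = ", ".join(f"t0.{column}" for column in columns)
    return (
        f"SELECT DISTINCT {select_list} "
        "FROM DOCUMENT t0, DOCUMENT_DATA t1 "
        f"WHERE CONTAINS(t1.DATA, '{escaped}', 1) > 0 "
        "AND t0.DOC_ID = t1.DOC_ID"  # join on the propagated primary key
    )


print(distinct_document_query("some string"))
```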
GISTS and THEMES
Oracle Context has advanced facilities for GISTS and THEMES. GISTS are summaries of the documents, and THEMES are a small set of topics covering all of the documents. You can set up Oracle to automatically update the GISTS and THEMES tables that correspond to your full-text data. You can create entities in EOModeler that correspond to the GISTS and THEMES tables, and you can create relationships in EOModeler from the GISTS and THEMES entities to the actual data entity. Then, you can present the user with a list of all the themes. You can also create a to-one relationship from the data entity to the GISTS entity to provide an easy way to show the summary of a fetched data record.
See Also
- Fetching Distinct Results
- Working with Unnormalized Tables
- EOSQLQualifier class specification in the Enterprise Objects Framework Reference
- Fetching with an Editing Context
- EOQualifier class specification in the Enterprise Objects Framework Reference
- EOFetchSpecification class specification in the Enterprise Objects Framework Reference
Revision History
21 July, 1998. David Scheck. First Draft.
19 November, 1998. Clif Liu. Second Draft.
© 1999 Apple Computer, Inc.