Labels

accessibility (2) ADF (1) archiver (3) cmu (1) contributor (13) cookie (1) DAM (3) date (3) download (3) dynamic list (4) ephox (5) fatwire (1) fck (1) filters (1) folders (4) headers (2) IBR (3) ImageAlchemy (3) java (4) javascript (2) layout/template (4) link (6) locale (2) multilingual (1) rendition (3) replicator (4) rules (1) schema (1) search (11) sites (1) sitestudio (24) ssp/sspu (5) SSUrlFieldName (2) stellent (4) timezone (1) urm (1) weblogic (1) workflow (2)

Monday 5 December 2011

OracleTextSearch: sorting by multiple fields = multiple problems

OracleTextSearch mangles the multi-sorting functionality of UCM search results. The MultiSort or SearchSortOptions components won't work, I don't know if there are newer versions that will. But the ability to sort on multiple fields is still possible, read on to figure out how to do it.

An attempt to sort on multiple fields uses the variable "SortSpec" and add it to the search request. It typically looks like this:
SortSpec="ORDER BY dDocTitle ASC, dDocName DESC"

This will sort results by title ascending then content ID descending. It works fine on DATABASE.FULLTEXT and DATABASE.METADATA systems and the two sorting components automagically assemble this sorting clause for the user.

OraceTextSearch systems don't work this way. They "optimise" text fields and add a prefix of "sd" when used for ORDER BY sorting. The format becomes this:
SortSpec="ORDER BY sddDocTitle ASC, sddDocName DESC"

But there are further complications...

The main headache is that only optimised fields can be used for multiple sorting and a maximum of 32 fields need to be manually selected for optimisation. This can be done from "Administration > Text Search Admin" in 10g and in 11g use the Configuration Manager applet and click the "Advanced Search Configuration" button. But wait...

Some systems will automatically use up all 32 fields and every search will fail, other systems will never optimise any fields - it depends on how & when it is installed. And that's 32 fields minus dDocName and dDocTitle who can't be excluded, minus all custom date fields and integer fields that are configured for indexing, minus all the system date & integer fields. If you want the full list, Oracle Support gives you some SQL that extracts a txt file that you need to wade through. Strewth.

I also found it fascinating that an exact match search on non-optimised fields will be treated as a "contains" search. This means "foo" is an exact match of "foo bar" apparently. I'm sure that would be polluting option-list metadata field searches somewhere. Make sure you optimise those option lists!

Optimised fields supposedly have to be less than 250 characters. I guess that rules out sorting by longtext or memo fields... I'm puzzled though about how dDocTitle works when Folders_g changes it to 255 chars.

Sorting by year or month is not possible, using an EXTRACT(YEAR BY dInDate) fails.

Frankly these issues have me as mad than a cut snake, I'd be hard pressed to recommend using OracleTextSearch. The benefits are supposedly faster indexing & searches, support for stemming, thesaurus & spell-checking (nothing provided of course) and the four drill-down filters.

One final important thing - some environments will add the clause ORDER BY and others will not. Put it in and take it out to see if you need it. I think it might be affected by the two multi-sort components and the server version.

No comments:

Post a Comment