
Sunday 29 December 2013

Adding value to search

UCM search is probably the most fundamental part of the product yet it has barely changed since inception. Here are some tips & tricks for finding what you want.

User Search Tweaks

The simplest search is the Quick Search, at the top right of the Content Server interface. Just hit the button and the most recent items are listed. This simplicity often confuses new adopters and breeds a lazy habit in those familiar with the system. For example, users on brand new systems will complain that the search results are polluted with developer objects, such as templates and definitions. What they don't realise is that those items are simply the most recent and will "disappear" to the end of the list once users start putting real content into the system. Resist requests to "hide" items.

To get better value out of the quick search, set up some targeted searches. The admin user might add a quicksearch that returns all fragments. (My first reaction to targeted search was "yuck, need to click the target icon to turn it on & off" but then I realised it could use a selector key. For example, the fragment search might use "f" so to search for all fragments I'd type "f *".)

I recommend the following setup to every user:
  1. Set up a targeted quick search with a query where the author is the current user. (Should be done by an admin so everyone gets it.)
  2. Run the search and save the query.
  3. Go to your profile settings and set the query as your default.
  4. Then add the default search to your homepage.
Adding the search for your own items to your homepage is a big boost to your productivity, putting your recent items up front where they can be accessed immediately.

Another value-adding feature that is not used enough is Search Within. Run the query that returns your own content and then use Search Within to narrow the results by date or type. (Note that in 11g Search Within was replaced by the "Search Form" link, which allows further modifications to the original search form; see Kyle's blog.)

System Search Cache & Maximum Results

Now for some server tweaks. I realised that looped pagination to defeat the hard limit of 200 results would perform too slowly because each search would invalidate the cache. This was particularly noticeable on things like news websites, where a Dynamic List of news archives stretching back a few years would take up to 15 seconds to process. I believe the default caching limits are too low to be useful.


The search cache is configured to hold only the first 200 results. Any query within the first 200 is served from the cache, but any query beyond 200 not only misses the cache but also invalidates the cached first 200. Even a repeated search will then miss the cache. I recommend putting these settings in your config:
  • MaxResults=1000
  • SearchCacheHardLimit=1000
  • SearchCacheSoftLimit=40
This gives a huge cached archive of 1000 items, plenty for news archives. (Any more than that is impractical for a casual public website.) The soft limit is used for every search request made. The default is 20, but the system is designed to return 25 per request if you don't specify a ResultCount. It's a bit of a strange oversight. The Content Server interface uses a ResultCount of 20 but most users won't go beyond two pages, so I set the soft limit at 40. It's not a big change, so it won't fill up the server memory. In the case of Dynamic Lists, the archive will go into the hard limit, which is now set at 1000. This keeps it in the cache and means no more looping through queries. It chews a bit of memory but it should be ok. I found most systems rarely went beyond 10% caching memory on default settings, so there is plenty of room for these changes.
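For reference, the three settings above might sit together in the Content Server config file like this. It is only a sketch: the file location (usually config/config.cfg under the instance directory) and the need for a restart are assumptions to check against your own install, and the comments just restate the reasoning above.

```
# Allow up to 1000 rows per search instead of the 200 hard limit
MaxResults=1000
# Cache up to 1000 rows so long Dynamic Lists stay in the cache
SearchCacheHardLimit=1000
# Cache two pages of 20 results for ordinary interface searches
SearchCacheSoftLimit=40
```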

One other config setting that may be helpful: the MaxExternalSearchResultCount=XX flag separates MaxResults for internal and external URM searches. On these systems, use &SearchEngineName=database to avoid the 2000 limit. In fact, always use database search. The OracleTextSearch option is great for indexing but horrid for search. Use a global rule for search and set a side effect to use the database search engine; see my comment here. (Note you'll lose the search filters, but they're rarely set to useful fields anyway.)
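To show where &SearchEngineName=database ends up, here is a rough Python sketch that builds a search URL. The host name, the build_search_url helper and the sample query are made up for illustration; GET_SEARCH_RESULTS, QueryText, ResultCount and StartRow are the usual search service parameters, but check them against your own version before relying on this.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical address; replace with your own Content Server URL.
BASE_URL = "http://ucm.example.com/cs/idcplg"


def build_search_url(query_text, result_count=20, start_row=1, engine="database"):
    """Return a search URL that forces the given search engine (illustrative helper)."""
    params = {
        "IdcService": "GET_SEARCH_RESULTS",
        "QueryText": query_text,
        "ResultCount": result_count,
        "StartRow": start_row,
        # The parameter from the post: route this request to database search.
        "SearchEngineName": engine,
    }
    # urlencode escapes spaces, backticks and angle brackets in the query text.
    return BASE_URL + "?" + urlencode(params)


# Illustrative query only: one request for the whole archive, no looping.
url = build_search_url("dDocType <matches> `News`", result_count=1000)
print(url)
```

With MaxResults raised to 1000 as above, a single request like this could replace the looped pagination, provided the ResultCount stays within the hard limit.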
