
Wednesday 3 August 2011

The Secret to Unlimited Search Results

Content Server search results can't go past the first 200 items. This frustrates users and developers alike. By comparison, Google reports millions of results. So why does UCM, with potentially millions of content items, stop at 200? The answer is that just because the interface stops doesn't mean you have to.

A default Content Server search returns 10 pages of 20 items, giving a maximum of 200 results. The 20 items per page is set by a variable called ResultCount. Developers usually try to work around the limit by setting ResultCount as high as possible... but it is also capped at 200 items. Administrators try to raise this ceiling by setting various configuration parameters, but ultimately these are hard limits and increasing them will degrade server performance. These tricks are trying to solve the wrong problem.

When someone does a Google search and gets a million results, do they need all of them? Nope. Does Google display all of them on one page? Of course not. It uses pagination to break the results into readable pages, usually 10 per page. Most people don't go past the first three pages. In fact, most people don't go past the first page. They're looking for anything that resembles their query, so they just click the first link that might do. I guess someone at Stellent used that reasoning to decide that 10 pages of 20 items was ample. But content management is a bit different because the user is looking for something specific. They might try browsing through more than the first three pages, or even ten pages... but are they really going to browse through a million items? Probably not. Just like Google, UCM doesn't need to return all the results at once. It just needs to let you navigate through more of the result pages.

The secret to UCM's unlimited search results is that only the built-in pagination gets capped at 200 items - so just ignore the pagination and calculate your own.

UCM uses the values of ResultCount (the number of items per page) and StartRow (the item to start the page from) to build a page of results. You can see these values in the URL. Just change StartRow to, say, 401 to get a page with search results for items 401 to 420. This is the way to reach any 20 items in a result set of any size.
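To make the URL trick concrete, here is a minimal Python sketch that rewrites the StartRow and ResultCount parameters of a search URL. The host, path and the other query parameters in the example are placeholders I made up for illustration, and the helper name with_start_row is hypothetical; only StartRow and ResultCount come from the post.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_start_row(search_url: str, start_row: int, result_count: int = 20) -> str:
    """Return search_url with StartRow and ResultCount set to the given values.

    Hypothetical helper: it only edits the query string and does not
    contact the server.
    """
    parts = urlsplit(search_url)
    # Keep blank values so parameters such as an empty query text survive.
    params = dict(parse_qsl(parts.query, keep_blank_values=True))
    params["StartRow"] = str(start_row)
    params["ResultCount"] = str(result_count)
    return urlunsplit(parts._replace(query=urlencode(params)))


# Placeholder URL (assumed shape, not taken from a real system).
example_url = (
    "http://example.com/cs/idcplg"
    "?IdcService=GET_SEARCH_RESULTS&QueryText=&ResultCount=20&StartRow=1"
)

# Jump straight to the page holding items 401 to 420.
jumped_url = with_start_row(example_url, 401)
print(jumped_url)
```

The same helper works for any starting item, so a link to "items 1001 to 1020" is just with_start_row(example_url, 1001).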

To recalculate the pagination, you also have the PageNumber and TotalRows variables. Combine all four variables with a bit of maths and you'll have your own unlimited pagination for cycling through Dynamic Lists. You can then write a simple component to replace the Content Server pagination with your unlimited pagination.
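The "bit of maths" might look something like the Python sketch below. It assumes StartRow and PageNumber both count from 1, which matches the 401 to 420 example above but is otherwise my assumption, and the function names are hypothetical. A real component would do the same arithmetic in the server's own scripting, so treat this as an illustration of the calculation only.

```python
def page_count(total_rows: int, result_count: int = 20) -> int:
    """Number of pages needed to show total_rows items (ceiling division)."""
    return (total_rows + result_count - 1) // result_count


def start_row_for_page(page_number: int, result_count: int = 20) -> int:
    """StartRow of a given page, assuming pages and rows both count from 1."""
    return (page_number - 1) * result_count + 1


def page_for_start_row(start_row: int, result_count: int = 20) -> int:
    """PageNumber that contains the given StartRow (inverse of the above)."""
    return (start_row - 1) // result_count + 1


def page_links(start_row: int, total_rows: int, result_count: int = 20):
    """Return (previous StartRow, next StartRow); None where there is no such page."""
    current = page_for_start_row(start_row, result_count)
    last = page_count(total_rows, result_count)
    prev_row = start_row_for_page(current - 1, result_count) if current > 1 else None
    next_row = start_row_for_page(current + 1, result_count) if current < last else None
    return prev_row, next_row


# With 20 items per page, item 401 opens page 21.
print(page_for_start_row(401), start_row_for_page(21), page_links(401, 36000))
```

Because nothing here depends on the built-in page navigation, the previous and next links keep working no matter how far past item 200 you go.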

A side-effect of unlimited pagination is improved user behaviour. Previously, users just chucked in a simple (or empty) query, moaned about the browsing limitations and gave up. But once users can go beyond 200 items, they quickly realise that endlessly browsing results is a waste of time. Instead they invest their time in refining their queries. Ultimately this improves their use of, and confidence in, the system. Just like Google does.

7 comments:

  1. I thought I was using this with success, but it seems the pagination on my 11g system stops spitting out data once I get to StartRow=2000. Is the system capped at 2000? Using OracleText Index. Thanks!

  2. Yeah, the geniuses at Oracle have recently decided to impose the URM max results settings onto the UCM as well. Might be new to PS5. There was a new setting somewhere to control it... I'll have to look into it. If I find something useful I'll blog about it.

  3. Yikes, that sucks. We're running 11.1.1.5. I see that there is no such restriction in 10g. Lovely! Thanks for digging in to this.

  4. From Oracle Support:
    "This is expected behavior, because OracleTextSearch has a limit of 2048 search results."

    Ah yeah, good ol' OTS. So anyway, there's a database patch 12582138 to overcome the limitation, but meanwhile UCM got capped at 2000 so we don't see this bug. I'm not sure if the cap applies to non-OTS search; try appending &SearchEngineName=DATABASE to the URL.

  5. I did find some recommendations on using SearchEngineName=DATABASE. I will test that out. I wonder if the query has to be converted to pure SQL though?

  6. Answering myself on this one... I was able to figure it out. along with pagination worked for me. I got past 2048 results (all the way to 36000, in fact).

  7. The SearchEngineName=DATABASE param in search saved my life. Awesome. Thanks.
