Wednesday, 10 October 2012

404 Page is Found!

For the unthinkable scenario where there's a broken link on your site, Site Studio provides a custom Error page. For reasons unfathomable it doesn't return a 404 HTTP header. Here's how to avoid all the problems this creates by adding the header.

I assume you know what a 404 header is: the "Not Found" HTTP status. Web crawlers and other robots need it in order to drop the URL from their indexes. Without it, for example, a search of your site in Google could return all your broken URLs instead of real pages. There's also the dilemma of the SiteStudioPublisher endlessly churning out broken pages instead of halting or discarding them. Cue #facepalm.

I had a client who complicated things with a service error on their error page, which meant it returned 503 headers instead. A 503 tells a crawler the problem is temporary and isn't something servers will cache, so the GoogleBot kept returning to the site every few hours to get a fresh copy. With pages being removed from their site daily, the pile of broken URLs kept growing and so did the performance drain on their system - no users were on their site but it was getting hammered by bots and flat out like a lizard drinking! Ugh.

Another fatal hazard I've seen is where SSP/SSPU is publishing a site and the error page references a broken URL, creating a recursive loop.

In the past the 404 problem was addressed by a custom component that could change the headers... but unfortunately these days it's nowhere to be found. Fortunately the feature has since been rolled into the Site Studio core. I'm not sure exactly when; it could be the June 2011 release.

The first step is to create an error page on your website. Without it, the standard Content Server error page is displayed, which is, uh, undesirable. Simply create a Section, assign a Page Template and then mark the Section as the Error Handler.

Now, to mark the page with a 404 HTTP header, edit the template and right at the top, insert this code:
This tells the page to return a 404 header. Now that your site correctly returns a "404 Page not found" status you can spend some time creating a beautiful/witty/helpful 404 page, like these: or these:
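Before polishing the design, it's worth confirming that a broken URL really does come back as a 404 and not as a 200 or a 503. Here's a minimal sketch of such a check in Python, using only the standard library. It isn't part of Site Studio; the function names `fetch_status` and `describe` are my own for illustration, and the wording of the messages is just one way to label the cases discussed above.

```python
import urllib.error
import urllib.request


def fetch_status(url, timeout=10):
    """Return the HTTP status code for url, including error statuses."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx responses; the code is what we're after.
        return err.code


def describe(status):
    """Label the status a deliberately broken URL came back with."""
    if status == 404:
        return "ok: broken URL reports 404"
    if status == 200:
        # The error page was served, but robots will treat it as a real page.
        return "soft 404: error page served with 200"
    if status == 503:
        # The scenario above: a service error on the error page itself.
        return "service error: the error page itself is failing (503)"
    return "unexpected status %d" % status
```

To try it, point it at an address you know doesn't exist on your site, for example `print(describe(fetch_status("http://www.example.com/no-such-page")))`, where the URL is a placeholder for your own.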
