
As of January, 2012, this site is no longer updated, due to work and health issues.


Testing Search Indexing and Date Problems

This page is testing what search indexers do with really strange page dates. Date errors make indexers waste cycles re-indexing unchanged content, but worse, they lie to searchers about the content currency, which is a vital element in assessing the value of a search result. For more information see the Report on Document Date Issues for Search Indexing.

For more robot indexing tests, see the List of Search Indexing Tests.

Normal Modification Date

This is the standard case: a static file with a modification date that is correctly passed on by the server.

Servers Lying: error dates

Some file servers and web servers are easily misconfigured and generate ludicrously wrong modification dates, in the far past or the far future. These include Win 2003 and ATG servers, among others.
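As a rough illustration of how an indexer might guard against such dates, the sketch below parses a Last-Modified header value and rejects anything implausibly old or in the future. The function name `plausible_last_modified` and both thresholds are assumptions made for this example; they are not taken from any particular search engine.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime

# Hypothetical thresholds chosen for illustration only.
EARLIEST_PLAUSIBLE = datetime(1990, 1, 1, tzinfo=timezone.utc)
FUTURE_TOLERANCE = timedelta(days=1)  # allow for modest clock skew


def plausible_last_modified(header_value, now=None):
    """Return the parsed date if it looks believable, otherwise None."""
    now = now or datetime.now(timezone.utc)
    try:
        parsed = parsedate_to_datetime(header_value)
    except (TypeError, ValueError):
        # Unparseable header: treat it the same as a missing date.
        return None
    if parsed.tzinfo is None:
        # No usable zone information: assume UTC for the comparison.
        parsed = parsed.replace(tzinfo=timezone.utc)
    if parsed < EARLIEST_PLAUSIBLE or parsed > now + FUTURE_TOLERANCE:
        return None
    return parsed
```

An indexer that gets None back could fall back to a date tag in the page, or to the date it first saw the document.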

Dynamic Dates

Technologies such as SSI, Perl, PLP, Python, ASP and others can generate pages on demand: the pages don't exist until someone sends a request for that URL. Because the information could be entirely new, these systems default to setting the date to the moment the page was generated.

Dynamic dates are a problem for search because the indexer can't tell whether the content has changed, so it has to re-index the content, and because they make old content look misleadingly new.
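One workaround an indexer could use when the date is unreliable is to compare a fingerprint of the page body against the one stored from the previous crawl. This is a minimal sketch, not something this page tests; the names `content_fingerprint`, `needs_reindex` and the `seen` dictionary are hypothetical.

```python
import hashlib


def content_fingerprint(body: bytes) -> str:
    """Return a SHA-256 hex digest of the page body."""
    return hashlib.sha256(body).hexdigest()


def needs_reindex(url: str, body: bytes, seen: dict) -> bool:
    """Report whether the body differs from the last one recorded for url.

    `seen` maps URL -> fingerprint and is updated in place.
    """
    digest = content_fingerprint(body)
    if seen.get(url) == digest:
        return False  # same bytes as last time, whatever the date says
    seen[url] = digest
    return True
```

Note that this still requires fetching the full page, and it does nothing to fix the date shown to searchers. A page that prints the current time in its body will also look changed on every request.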

Date Overrides

Date tags in the page itself can identify the meaningful content date, which avoids the problems caused by both dynamic dates and server errors.

HTTP Equivalent tags

While not explicitly part of the HTTP/HTML recommendations, some servers will extract HTTP-EQUIV tags from pages and include them in the headers sent back to the requesting client. You can use these tags to send a more useful date as part of the server response, although it may not get processed correctly for "HEAD" requests.

<meta http-equiv="Last-Modified" content="01 Jan 2000 01:00:00 GMT">

Not all search engines will honor these tags, but they ought to!

(This solution suggested by "The Sad State of Dates" in the New Idea Engineering Newsletter.)
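For an indexer that wants to read the tag directly from the page rather than rely on the server, a sketch along these lines could work. It uses only the Python standard library; the class name `HttpEquivDateParser` and the helper `http_equiv_last_modified` are made up for this example.

```python
from email.utils import parsedate_to_datetime
from html.parser import HTMLParser


class HttpEquivDateParser(HTMLParser):
    """Collect the first parseable <meta http-equiv="Last-Modified"> date."""

    def __init__(self):
        super().__init__()
        self.last_modified = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.last_modified is not None:
            return
        # Attribute values can be None when written without a value.
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("http-equiv", "").lower() != "last-modified":
            return
        try:
            self.last_modified = parsedate_to_datetime(attr.get("content", ""))
        except (TypeError, ValueError):
            pass  # leave as None if the date cannot be parsed


def http_equiv_last_modified(html: str):
    """Return the tag's date as a datetime, or None if absent or invalid."""
    parser = HttpEquivDateParser()
    parser.feed(html)
    return parser.last_modified
```

Called on a page containing the tag shown above, `http_equiv_last_modified` returns a timezone-aware datetime for 1 January 2000, 01:00 GMT.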

Meta tags

There are examples of schema Date meta tags in the HTML recommendations, and there is support for the Dublin Core date tags, which are standard across the Web. Again, these will probably not be sent in response to "HEAD" requests, but enterprise search engines can extract them during indexing to get accurate dates.

When displaying dates, W3C recommends the ISO 8601 format, YYYY-MM-DD. The Dublin Core example looks like this:

<meta name="DC.date" content="2001-07-18" />

Not all search engines will honor these tags, but they ought to!
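Extracting the Dublin Core date during indexing might look something like the following sketch. It assumes the full YYYY-MM-DD form used in the example; the names `DublinCoreDateParser` and `dc_date` are hypothetical, and a real indexer would likely need to accept other date forms as well.

```python
from datetime import date
from html.parser import HTMLParser


class DublinCoreDateParser(HTMLParser):
    """Collect the first valid <meta name="DC.date"> value."""

    def __init__(self):
        super().__init__()
        self.dc_date = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.dc_date is not None:
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() != "dc.date":
            return
        try:
            # Assumes the YYYY-MM-DD form shown in the example tag.
            self.dc_date = date.fromisoformat(attr.get("content", "").strip())
        except ValueError:
            pass  # ignore values that are not valid ISO 8601 dates


def dc_date(html: str):
    """Return the DC.date value as a date, or None if absent or invalid."""
    parser = DublinCoreDateParser()
    parser.feed(html)
    return parser.dc_date
```

An indexer could prefer this value over the server's modification date whenever it is present and valid.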


Page Updated: 2011-03-08