Home
Guide
Tools List
News
Background
Search
About Us

Search Tools for Web Sites and Intranets

 

Current Search Tools News

Sphinx search: New SearchTools Report

Sphinx is a free, open source search engine, written in C++, that uses both SQL and custom index files to provide very fast text search. The architecture scales to over a billion records by distributing indexing and querying among multiple virtual and real processors. Read more and tell me about your Sphinx experience here. -- July 11, 2008

X-Robots-Tag

The final new element in the recent agreement on the Robots Exclusion Protocol is the "x-robots-tag". This is an addition to the HTTP header sent in response to a URL request. The header can carry the same values as the Robots META tag: noindex, nofollow, noarchive, nosnippet and noodp. But unlike the meta tag, it's not limited to HTML: it can be applied to non-HTML items, such as PDF, text, office and CAD documents, which may not have useful Properties metadata. More... -- July 10, 2008
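As a rough illustration of the idea, here is a minimal Python sketch of a server-side helper that adds the header to non-HTML responses. The function name, the extension list and the chosen directives are all assumptions made up for this example; they are not part of the agreement or of any particular web server.

```python
import mimetypes
import os

# Hypothetical policy for this example only: file types to keep out of search indexes.
NOINDEX_EXTENSIONS = {".pdf", ".doc", ".dwg"}


def response_headers(path):
    """Return HTTP response headers for a file path as (name, value) pairs.

    Adds an X-Robots-Tag header when the file extension is covered by the
    example policy above; HTML pages would use the Robots META tag instead.
    """
    content_type = mimetypes.guess_type(path)[0] or "application/octet-stream"
    headers = [("Content-Type", content_type)]

    extension = os.path.splitext(path)[1].lower()
    if extension in NOINDEX_EXTENSIONS:
        # Same values as the Robots META tag, comma-separated.
        headers.append(("X-Robots-Tag", "noindex, noarchive"))
    return headers


if __name__ == "__main__":
    for name, value in response_headers("/docs/annual-report.pdf"):
        print(f"{name}: {value}")
```

In practice most sites would set this header in the web server configuration rather than in application code; the sketch only shows what ends up in the response.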

Crawling and Indexing Interactive Content

Adobe has a special Flash client for search engine indexing of SWF (Flash) files, beyond the static text. Google is implementing it now; Yahoo and MSN may follow. It's not clear how valuable the text in Flash files is, how the robots will extract it, or what happens with JavaScript and external XML files. This reminded me that Google has been auto-filling forms for a while now. More in my new Indexing Interactive Content page or comment on the blog entry. -- July 2, 2008

The Long Tail and Short Head of Search

I've just posted an article on the Long Tail, Short Head and Search. Every site, intranet and enterprise search log I've analyzed fits the Long Tail model: a very few very popular search terms, then a rapid drop-off to unique queries (the Long Tail), forming a Zipf curve.

The Short Head -- the few most frequently used search terms -- is the best place to start in analyzing search engine usage. My article also gives some suggestions for using that information to improve a search engine. -- June 26, 2008
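For anyone who wants to try this on their own logs, here is a minimal Python sketch of a Short Head count. It assumes the log has already been reduced to a plain list of query strings; the function name, the normalization (trim and lowercase) and the sample queries are illustrative assumptions, not taken from the article.

```python
from collections import Counter


def short_head(queries, top_n=10):
    """Summarize a search log given as an iterable of raw query strings.

    Returns a tuple of:
      - the top_n most frequent normalized queries with their counts,
      - the share of all searches covered by those top_n queries,
      - the number of queries that occurred exactly once (the far Long Tail).
    """
    # Treat "Vacation" and "vacation " as the same query; skip empty searches.
    counts = Counter(q.strip().lower() for q in queries if q.strip())
    total = sum(counts.values())

    head = counts.most_common(top_n)
    head_share = sum(count for _, count in head) / total if total else 0.0
    unique_queries = sum(1 for count in counts.values() if count == 1)
    return head, head_share, unique_queries


if __name__ == "__main__":
    # Made-up sample data for illustration.
    sample = ["Vacation", "vacation ", "benefits", "vacation",
              "parking map", "benefits", "401k rollover"]
    head, share, uniques = short_head(sample, top_n=2)
    print(head, f"{share:.0%} of searches", f"{uniques} one-off queries")
```

A real log would need more cleanup first (robot traffic, repeated submits, misspellings), but even a simple count like this shows where the Short Head ends.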

HCIR 2008: Workshop on Human-Computer Interaction and Information Retrieval

Making the connection between interface and search, this workshop is focused this year on complex search tasks. The 2007 Workshop presentations ranged from visual text analysis to online consumer choice. This year's workshop will be 23 October, 2008, in Redmond, Washington, USA.

Article on the REP

My article is up on InfoToday: New Robots Exclusion Protocol Agreement Among Yahoo!, Google, and Microsoft Live Search. Nothing earthshaking, just a summary from a library point of view, and a quote from Danny Sullivan saying that this is an important first step. -- June 15, 2008


Recent & Interesting Search Links (my Furl archive)

For more news about search engines, see the News page.

What is a Search Tool and Why Would I Want One?

Good question! If you have serious content, a site or intranet search engine will allow your visitors to jump directly to the topic they want. More...

Guide to Choosing & Using Search Tools

The guide will help you learn more about site searching. Or try the remote search services on this site.

Overviews, Reviews, Links, and Surveys

Includes articles and books providing general site search information, and those with more specific product reviews.

Search Tools Product Listings

Alphabetical List or divided by platform: Java, Mac, Perl, Unix, and Windows; also Remote Search Hosting Services, Code Libraries, and Open Source Search Engines

Background Topics and Analysis

Multimedia Search, Faceted Metadata Search, PDF and Web Site Search, Intranets and EIPs, XML and Search, Information Architecture, Web Indexing Robot Spiders, and more.

This site provides information, news and advice about web site searching technology. It is maintained by Avi Rappoport, associate AJ Summers and various Search Tools Consulting interns as a service to the Web community. We welcome your comments and suggestions: just contact us. We are also available for search tools needs analysis, competitive analysis, search tools installation and more: for information, see the Consulting page or use our contact form.
Disclosure: Search Tools Consulting presents the SearchTools.com site as a free service to the web development community and is not sponsored by any advertisers. Search Tools Consulting also provides analysis and information to sites and institutions installing search engines, and to some search engine developers. We do not give them site visitor or survey personal information or allow our relationships with any vendors to change any product review or analysis.
Blog and Comment System

The SearchTools blog on LiveJournal provides an opportunity for you to tell me what you think about enterprise search tools for web sites and intranets, and about the SearchTools.com web site. You do not need an account to post; you can reply anonymously. All comments are screened, so there will be no blog spam.


RSS feed available

 

Last Update: 2008-07-11


SearchTools.com is Copyright © 1998-2007 Avi Rappoport / Search Tools Consulting
This work is licensed under a Creative Commons Sampling Plus 1.0 License.