Current Search Tools News
Sphinx search: New SearchTools Report
Sphinx is a free, open source search engine, written in C++, that uses both SQL and custom index files to provide very fast text search. The architecture scales to over a billion records by distributing the index and the querying among multiple virtual and real processors. Read more and tell me about your Sphinx experience here. -- July 11, 2008
X-Robots-Tag
The final new element in the recent agreement on the Robots Exclusion Protocol is the "X-Robots-Tag". It is an additional HTTP header sent in response to a URL request. The header can carry the same values as the Robots META tag: noindex, nofollow, noarchive, nosnippet and noodp. Unlike the META tag, it is not limited to HTML: it can be applied to non-HTML items such as PDF, text, office and CAD documents, which may not have useful Properties metadata. More... -- July 10, 2008
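As a rough illustration of how a crawler might honor this header, here is a minimal Python sketch. It assumes the response headers are already available as a plain dictionary, and it recognizes only the five values named above. The function names parse_x_robots_tag and may_index are hypothetical, invented for this example rather than taken from any real robot.

```python
# Directive names taken from the paragraph above; other tokens are ignored here.
KNOWN_DIRECTIVES = {"noindex", "nofollow", "noarchive", "nosnippet", "noodp"}


def parse_x_robots_tag(header_value):
    """Return the set of recognized directives in one X-Robots-Tag value."""
    directives = set()
    for token in header_value.split(","):
        token = token.strip().lower()
        if token in KNOWN_DIRECTIVES:
            directives.add(token)
    return directives


def may_index(headers):
    """Return False if the response headers carry a noindex directive.

    `headers` is assumed to be a dict of header name -> value.
    Header names are compared case-insensitively, as HTTP requires.
    """
    for name, value in headers.items():
        if name.lower() == "x-robots-tag":
            if "noindex" in parse_x_robots_tag(value):
                return False
    return True


# Hypothetical response headers for a PDF that should stay out of the index.
pdf_headers = {
    "Content-Type": "application/pdf",
    "X-Robots-Tag": "noindex, noarchive",
}
print(sorted(parse_x_robots_tag(pdf_headers["X-Robots-Tag"])))  # ['noarchive', 'noindex']
print(may_index(pdf_headers))  # False
```

Because the check reads only the HTTP headers, it works the same way for a PDF or CAD file as for an HTML page, which is the point of the new header.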
Crawling and Indexing Interactive Content
Adobe has a special Flash client that lets search engines index SWF (Flash) files beyond their static text. Google is implementing it now; Yahoo and MSN may follow. It's not clear how valuable the text in Flash files is, how the robots will extract it, or what happens with JavaScript and external XML files. That reminded me that Google has been auto-filling forms for a while now. More in my new Indexing Interactive Content page, or comment on the blog entry. -- July 2, 2008
The Long Tail and Short Head of Search
I've just posted an article on the Long Tail, Short Head and Search. Every site, intranet and enterprise search log I've analyzed fits the Long Tail model: a very few very popular search terms, then a quick drop-off to unique queries (the Long Tail), forming a Zipf curve.
The Short Head -- the few most frequently used search terms -- is the best place to start in analyzing search engine usage. My article also gives some suggestions for using that information to improve a search engine. -- June 26, 2008
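For readers who want to try this on their own logs, here is a minimal Python sketch of a Short Head report. It assumes the search log has already been reduced to a list of raw query strings. The function names and the sample queries are hypothetical, made up for illustration, and are not taken from the article.

```python
from collections import Counter


def normalize(query):
    """Lowercase and collapse whitespace so trivial variants count together."""
    return " ".join(query.lower().split())


def count_queries(queries):
    """Count normalized queries, skipping empty searches."""
    return Counter(normalize(q) for q in queries if q.strip())


def short_head(queries, top_n=10):
    """Return (term, count, share of all searches) for the top_n queries."""
    counts = count_queries(queries)
    total = sum(counts.values())
    return [(term, n, n / total) for term, n in counts.most_common(top_n)]


def long_tail_size(queries):
    """Number of distinct queries that were searched exactly once."""
    return sum(1 for n in count_queries(queries).values() if n == 1)


# Hypothetical log extract: one query string per search.
log = ["Hours", "hours ", "jobs", "hours", "parking map", "jobs"]
for term, n, share in short_head(log, top_n=2):
    print(f"{term}: {n} searches ({share:.0%})")
print("queries seen only once:", long_tail_size(log))
```

On a real log the top handful of terms usually covers a large share of all searches, so checking that each of them returns a good first result is a cheap first improvement.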
HCIR 2008: Workshop on Human-Computer Interaction and Information Retrieval
Making the connection between interface and search, this workshop is focused this year on complex search tasks. The 2007 Workshop presentations ranged from visual text analysis to online consumer choice. This year's workshop will be 23 October, 2008, in Redmond, Washington, USA.
Article on the REP
My article is up on InfoToday: New Robots Exclusion Protocol Agreement Among Yahoo!, Google, and Microsoft Live Search. Nothing earthshaking, just a summary from a library point of view, and a quote from Danny Sullivan saying that this is an important first step. -- June 15, 2008
For more news about search engines, see the News page.
Good question! If you have serious content, a site or intranet search engine will allow your visitors to jump directly to the topic they want. More...
The guide will help you learn more about site searching. Or try the remote search services on this site.
Includes articles and books providing general site search information, and those with more specific product reviews.
Alphabetical List or divided by platform: Java, Mac, Perl, Unix, and Windows; also Remote Search Hosting Services, Code Libraries, and Open Source Search Engines.
Multimedia Search, Faceted Metadata Search, PDF and Web Site Search, Intranets and EIPs, XML and Search, Information Architecture, Web Indexing Robot Spiders, and more.
This site provides information, news and advice about web site searching technology. It is maintained by Avi Rappoport, associate AJ Summers and various Search Tools Consulting interns as a service to the Web community. We welcome your comments and suggestions: just contact us.
We are also available for search tools needs analysis, competitive analysis, search tools installation and more: for information, see the Consulting page or use our contact form.
Disclosure: Search Tools Consulting presents the SearchTools.com site as a free service to the web development community and is not sponsored by any advertisers. Search Tools Consulting also provides analysis and information to sites and institutions installing search engines, and to some search engine developers. We do not give them site visitor or survey personal information or allow our relationships with any vendors to change any product review or analysis.
The SearchTools blog on LiveJournal provides an opportunity for you to tell me what you think about enterprise search tools for web sites and intranets, and about the SearchTools.com web site. You do not need an account to post; you can reply anonymously. All comments are screened, so there will be no blog spam.
Leave a comment