SmartDiscovery, the overall product, includes a search engine, taxonomy
creation and categorization, entity extraction, document summarization,
related documents, etc.
MetaText Server automatically extracts and stores topical metadata about
documents: summaries, related documents, people, places and things. Includes
a metadata repository and a search engine with concept search, full-text,
and Boolean queries.
VizServer offers visualization technologies for data in relational databases
and unstructured data repositories, improving data mining and text mining
processes.
Categorizer automatically assigns documents to categories from a subject taxonomy
using linguistic and statistical algorithms.
Platform: Windows 2000, Unix: Solaris 2.7
Price: contact company
Features (Search and Retrieval)
Web robot crawler, file system indexer.
Integrates with Documentum and Lotus Notes, and with databases via JDBC/ODBC.
Uses the security model of the system it indexes.
Metadata repository based on entity extraction, which identifies companies,
locations and key people; the metadata can be stored in XML in a database, or
back in the original environment.
Additional modules for Simplified and Traditional Chinese, Japanese and
Korean.
File formats supported include Microsoft Word, Microsoft Excel, PDF, XML,
Lotus WordPro, AmiPro, Corel Presentations, WordPerfect, Lotus Freelance,
Microsoft Works, Corel Quattro Pro, HTML, Lotus 1-2-3, ASCII and approximately
70 other file formats.
Keyword and Boolean searching.
Concept searching, based on noun-phrase co-occurrence, which finds other topics
related to the search words.
Browser-based administration and editing tools for defining collections and scheduling.
Collection Explorer for results pages, optional visualization.
Supports faceted metadata search display and browsing.
Includes a taxonomy display using the Star Tree Java visualization engine,
which is also used for taxonomy management.
APIs in Java or C, XML access to content.
Sample code and extensive documentation.
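The concept searching listed above is described only as being based on
noun-phrase co-occurrence. As a rough illustration of that general idea, and
not of Inxight's actual implementation, the sketch below counts how often
phrases appear together in the same document and suggests the most frequent
companions of a search phrase. The function names and the toy data are
hypothetical, and the noun phrases are assumed to have been extracted already.

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_cooccurrence(docs):
    """Count how often each pair of phrases appears in the same document."""
    cooc = defaultdict(Counter)
    for phrases in docs:
        # Use a set so a phrase repeated within one document counts once.
        for a, b in combinations(sorted(set(phrases)), 2):
            cooc[a][b] += 1
            cooc[b][a] += 1
    return cooc


def related_topics(cooc, phrase, k=3):
    """Return up to k phrases that co-occur most often with `phrase`.

    Ties are broken alphabetically so the result is deterministic.
    """
    ranked = sorted(cooc.get(phrase, {}).items(), key=lambda kv: (-kv[1], kv[0]))
    return [p for p, _ in ranked[:k]]


# Hypothetical toy data: each document is a list of pre-extracted noun phrases.
DOCS = [
    ["search engine", "taxonomy", "entity extraction"],
    ["search engine", "taxonomy", "visualization"],
    ["search engine", "visualization"],
    ["taxonomy", "entity extraction"],
]
COOC = build_cooccurrence(DOCS)
print(related_topics(COOC, "search engine"))
# -> ['taxonomy', 'visualization', 'entity extraction']
```

A production system would normally weight these counts (for example by how
common each phrase is overall) rather than rank on raw frequency.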
Other Modules
LinguistX Library:
A collection of components for many languages that provide word and phrase
analysis, stemming, tokenization, parts of speech analysis, noun phrase
extraction, language identification, summarization, etc.
Murax - natural-language search engine that organizes search results into
clusters, using co-occurrence analysis. Also shows snippets of documents
rather than the entire file.
WhizBang Engine - designed to crawl internal and external sources, classify
them according to type, and extract key entities. This applies external
structure to otherwise unstructured data.
Articles
Unstructured Data Management: the elephant in the corner
(guest or customer access required) the(451) Report, November 2002, by Nick
Patience and Rachel Chalmers
Describes the general features of SmartDiscovery, which combines MetaText,
VizServer and Categorizer with taxonomy generation bought from WhizBang Labs.
Praises the language recognition and natural-language processing features.
From search to find
InfoWorld, September 28, 2001, by Mario Apicella
Describes new search engines seeking to make it easier for buyers to find
products. Mentions H5 Technologies' categorization of documents, Ultraseek's
(then Inktomi Search Software) access to multiple file formats, and Inxight's
Star Tree and Table Lens visualization features.
Examples:
NASS website - the USDA's National Agricultural Statistics Service (NASS) has
census data for about two million farms. They use Inxight VizServer and the
included Table Lens software to visualize this data.
The Inxight site uses Star Tree for visual navigation and the Ultraseek engine
for text search.