Grapeshot Features
Features Table
Grapeshot is a versatile platform for building high-performance indexing and search systems. It can point to external data (as Google does) or store data in its own XML database (for example, a news wire database). The API lets you build a search system inside your own application that combines probabilistic algorithms with rigid Boolean methods, helping users find concepts within specific search parameters (date range, source, document type, etc.).
Here is a list of features to help you make comparisons:
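As a rough illustration of how probabilistic ranking can sit alongside rigid filters, the sketch below applies hard constraints (source, date range) first and then ranks the surviving documents with a textbook BM25 formula. It is not the Grapeshot API: the corpus, the field names, and the `bm25_scores` and `search` functions are all hypothetical.

```python
import math
from collections import Counter
from datetime import date

# Hypothetical toy corpus; the field names are assumptions for illustration.
DOCS = [
    {"id": 1, "source": "wire", "date": date(2006, 3, 1), "text": "oil prices rise as supply falls"},
    {"id": 2, "source": "blog", "date": date(2006, 3, 2), "text": "oil painting exhibition opens"},
    {"id": 3, "source": "wire", "date": date(2005, 1, 9), "text": "crude oil supply and oil prices"},
]

def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each document against the query with a standard BM25 formula."""
    tokenised = [d["text"].split() for d in docs]
    n = len(docs)
    avg_len = sum(len(t) for t in tokenised) / n
    # Document frequency: number of documents containing each term.
    df = Counter(term for t in tokenised for term in set(t))
    scores = {}
    for doc, tokens in zip(docs, tokenised):
        tf = Counter(tokens)
        score = 0.0
        for term in query.split():
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores[doc["id"]] = score
    return scores

def search(query, docs, source=None, start=None, end=None):
    """Apply the hard (Boolean/parametric) filters, then rank by BM25."""
    kept = [
        d for d in docs
        if (source is None or d["source"] == source)
        and (start is None or d["date"] >= start)
        and (end is None or d["date"] <= end)
    ]
    if not kept:
        return []
    # Simplification: term statistics are computed over the filtered set only.
    scores = bm25_scores(query, kept)
    ranked = sorted(kept, key=lambda d: scores[d["id"]], reverse=True)
    return [d["id"] for d in ranked if scores[d["id"]] > 0]
```

For example, `search("oil prices", DOCS, source="wire")` ranks document 3 ahead of document 1 because "oil" occurs twice in it. Adding `start=date(2006, 1, 1)` removes document 3 from consideration altogether.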
| User Navigation | |
|---|---|
| More Like This | Automatically find related documents at the click of a button. Uses "Term Profiles" to find documents which share the same concepts. No thesaurus or topic tree required. |
| Auto Suggestions | Automatically suggests useful related words for users to include in their search. No thesaurus or topic tree required. |
| Concept Searching | Lets users start a search by highlighting a sentence in an email or a paragraph of text |
| Bypass The Search Box | Design embedded searches to drive off whole documents, or invite the user to highlight chosen texts. No typing required. |
| Parametric Searches | Users can restrict searches to certain categories, or Boolean keywords, or to certain price or date ranges |
| Search | |
| Term Weighting | Uses probabilistic information retrieval (Bayesian inference) alongside the latest BM25 (Best Match) algorithms |
| Concept Searching | Uses a collection of "term profiles" to sense the concept |
| Adjust Term Weights | Term weights automatically adjust during a user search session - personalises the search experience and improves rankings |
| Thesaurus Support | Can include thesauri to modify search |
| Fuzzy Match | Tri-gram analysis can be deployed to assist lookup even when the user types in a misspelling or word variant |
| Language Support | Can be adapted for French, Italian, Russian Cyrillic, Finnish etc |
| Stop List Support | Stop words can be applied during a search to improve recall, yet disabled when performing a phrase search |
| Phrase Searching | Matches exact phrases, including stop words |
| Term Highlighting | Each document can be indexed with word offset positions to facilitate term highlighting |
| Results Clustering | Scripts can be used to perform post-search clustering of the results list. You can define the maximum number of clusters permitted and how each cluster is labelled |
| Search Alert | Possible to create saved searches which operate as alerts |
| Alert Halo | Automatically suggests additional words to add to an alert profile |
| Group Alerts | Organise alerts for everyone, automatically based on current news-flow or recent additions to the dataset |
| Query Logging | Record search activity |
| Indexing | |
| Categorisation | Can categorise new documents by matching against a database of thesaurus leaves/nodes |
| Summarisation | Can determine appropriate summaries for each document, sentence by sentence |
| Personal Summaries | A document summary can be adjusted or influenced by referring to the words in the user's search |
| Bulk Indexing | Batch-process large files while users continue searching |
| Incremental Indexing | Dribble fresh data into the index, for example live news |
| Multiple Collections | Possible to search across multiple indexes |
| System Design | |
| Collapsed Search Stack | Simplified protocol for exchanging data between user, search system and distributed indexes |
| XML Throughout | XML protocols used to send and receive queries, as well as store and recall index information; alongside XML scripting. |
| PIPs | Powerful Information Packets: XML packets that can be transported in peer-to-peer systems, where each packet contains Grapeshot queries, results, source data or B-tree data |
| Multiple Platform | Compiles for a variety of platforms such as Windows and Linux |
| Small Footprint | Just 300k in size - very high performance and versatile! |
| Database Management | |
| Simultaneous Update | Fresh data can be indexed while users carry out their searches. No need to lock out data files. |
| Scalable & Robust | Designs benchmarked and proven by Reuters. Search systems indexing 500+ million documents have already been created. |
| Roll-Back | Possible to roll back to a prior version of the database |
| Encryption | Possible to encrypt data flows |
| Distributed Indexes | Possible to create distributed indexes, with each index closer to the dynamic changes of local content |
| Commercial | |
| Expert Heritage | Grapeshot was designed by Dr. Martin Porter (author of the Porter stemming algorithm) |
| References | Grapeshot has a commercial track record, drawing on experience from Muscat and a new Grapeshot install base |
| Pricing | Flexible pricing for OEMs and System Integrators |
| Training | Available onsite or at our Cambridge UK offices |
| Documentation | Complete API documentation available |
| Support | Available to help you fast-track software developments using Grapeshot |
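The Fuzzy Match row above mentions tri-gram analysis for coping with misspellings and word variants. A minimal sketch of that general technique follows, assuming padded character trigrams compared by Jaccard similarity. The function names and the 0.3 threshold are hypothetical choices for illustration and do not describe Grapeshot's actual implementation.

```python
def trigrams(word):
    """Return the set of character trigrams of a word, padded with spaces."""
    padded = f"  {word.lower()} "
    return {padded[i:i + 3] for i in range(len(padded) - 2)}

def similarity(a, b):
    """Jaccard similarity between the trigram sets of two words."""
    ta, tb = trigrams(a), trigrams(b)
    return len(ta & tb) / len(ta | tb)

def fuzzy_lookup(word, vocabulary, threshold=0.3):
    """Return vocabulary entries whose trigram similarity meets the threshold,
    best match first (ties broken alphabetically)."""
    scored = [(similarity(word, v), v) for v in vocabulary]
    hits = [(s, v) for s, v in scored if s >= threshold]
    return [v for s, v in sorted(hits, key=lambda p: (-p[0], p[1]))]
```

With the vocabulary `["retrieval", "retail", "reuters", "search"]`, the misspelling `"retreival"` shares 6 of 14 distinct trigrams with `"retrieval"` (similarity about 0.43) and falls below the threshold for the other entries, so `fuzzy_lookup` returns `["retrieval"]`.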




