2.8.5 (December 29 2010)

New Features

ENGINE-36 Text & fieldedText Dimensions Can be Memory Resident

Any text or fieldedText type dimension can now be made fully memory resident. To enable this feature, use the new “inMemory” dimension attribute.

ENGINE-100 Improved Text Matching When Using Phrases, Starts With, Stemming and Phonetic Analyzers

The engine now supports a new feature for dimensions that can improve the relevance of text searches when stemming and phonetic analyzers (soundexes and metaphones) are used. With stemming and phonetic analysis, the engine discards the original text value and replaces it with a normalized version that can facilitate searches. As a result, at query time the engine no longer knows whether what the user searched for was an exact match to the original word. This new feature resolves that shortcoming. It can be enabled with the dimension attribute “storeOriginalWord”. This feature opens rich possibilities for using phonetic analyzers on proper names.

At query time, “startsWith” searches can be enhanced such that exact matches to the query are ranked higher than incomplete matches. To enable this feature use the new criteria field “scoring”.

Additionally, the proximity query parser will apply a new scoring option to add a boost to the query phrase when “expandPhrases” is in effect. This allows customers to determine how much an exact phrase match is weighted compared to matches against individual words in the phrase.

ENGINE-100 New Query Criteria Scoring and Relevance Options

The query request criterion can now include an optional “scoring” field that can further refine scoring for an individual criterion.

Additionally, the proximity query parser will apply a new scoring option to add a boost to the query phrase when “expandPhrases” is in effect. This allows customers to determine how much an exact phrase match is weighted compared to matches against individual words in the phrase. The new scoring field option is “phraseMatchBoost”.

For “startsWith” searches, the scoring field style option “difference” lets customers set the per-character scoring penalty that the engine will apply for matches that are not identical to the search term prefix. Alternatively, customers can choose the scoring style “constant” to assign a specific score to identical matches (“scoreEqual”) versus starts with matches (“scoreDifferent”).
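The two “startsWith” scoring styles can be pictured with a small sketch. This is only an illustration of the idea, not the engine's implementation: the class name, method names, the baseScore parameter, and the exact arithmetic (a linear penalty per extra character, floored at zero) are all assumptions. Only the style names “difference” and “constant” and the options “scoreEqual” and “scoreDifferent” come from the notes above.

```java
// Hypothetical illustration only; the engine's real scoring formula is not documented here.
final class StartsWithScoringSketch {

    // "difference" style (assumed formula): subtract a fixed penalty for every
    // character by which the stored value is longer than the query prefix.
    static double differenceScore(String query, String value,
                                  double baseScore, double penaltyPerChar) {
        if (!value.startsWith(query)) {
            return 0.0; // not a starts-with match at all
        }
        int extraChars = value.length() - query.length();
        return Math.max(0.0, baseScore - extraChars * penaltyPerChar);
    }

    // "constant" style: one fixed score for identical matches (scoreEqual)
    // and another for prefix-only matches (scoreDifferent).
    static double constantScore(String query, String value,
                                double scoreEqual, double scoreDifferent) {
        if (!value.startsWith(query)) {
            return 0.0;
        }
        return value.equals(query) ? scoreEqual : scoreDifferent;
    }
}
```

Under these assumptions, a query for “john” scores an item “john” at the full base score, while “johnson” loses the penalty three times with the “difference” style or receives the flat “scoreDifferent” value with the “constant” style.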

When a dimension applies the new “storeOriginalWord” attribute, the scoring field can be used to determine the scoring logic to apply, for example, how to score results where the search term matches the original word versus results where it matched a stemmed or phonetically analyzed form. To determine the score boost to apply when the query finds an identical match to the search term, use the new scoring field option “termMatchBoost”.

Finally, if the text query requests multiple terms, customers can specify that a match to any one of the terms results in a maximal score. Previously, if two terms were included in the query, then both terms would have to match in order for the query component to have a maximal score. With the new scoring field option “multiValueMatch”, any term match results in the maximum criteria score. To enable constant scoring, set “multiValueMatch” to “any”.
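The difference between the earlier behavior and “multiValueMatch” set to “any” can be sketched as follows. This is a hedged illustration, not the engine's code: the class and method names are hypothetical, and the proportional score used for the earlier behavior (matched terms divided by total terms) is an assumption. The notes only state that all terms previously had to match for a maximal score.

```java
import java.util.List;
import java.util.Set;

// Hypothetical illustration of "multiValueMatch"; names and formula are assumptions.
final class MultiValueMatchSketch {

    static double score(List<String> queryTerms, Set<String> itemTerms,
                        boolean matchAny, double maxScore) {
        if (queryTerms.isEmpty()) {
            return 0.0;
        }
        int matched = 0;
        for (String term : queryTerms) {
            if (itemTerms.contains(term)) {
                matched++;
            }
        }
        if (matchAny) {
            // "any": a single matching term already earns the maximum score.
            return matched > 0 ? maxScore : 0.0;
        }
        // Assumed earlier behavior: the maximum is reached only when every term matches.
        return maxScore * matched / queryTerms.size();
    }
}
```

With two query terms and an item that contains only one of them, the sketch returns half of the maximum score by default and the full maximum when matchAny is true.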

ENGINE-216 ENGINE-303 Admin Tool Enhancements

The engine’s admin tool has been significantly enhanced to make management easier. Any setting that alters the server behavior is now made persistent. There is no longer any need to manually edit discovery.properties. At startup, the engine will automatically migrate appropriate settings from discovery.properties to the new settings schema. The settings are managed on a new page labeled “Settings”. The Admin tab has been renamed appropriately to “Feeds”.

The new settings are located in discovery.settings.

Where practical, all settings take immediate effect. The following parameters are available on the new Settings page:

  • Engine Name
  • Admin Tool Time Zone
  • Locale
  • Custom CSS
  • Item Cache Maximum Item Count (Cache usage is also displayed).
  • Startup options: whether the engine responds to queries during startup, and whether the engine rebuilds all indices from scratch when restarting.

ENGINE-315 JDK 1.4 Logging

The engine now uses Java 1.4 logging. Customers familiar with JDK 1.4 logging may configure their environment accordingly. Log rotation is automatically enabled in this version.
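For readers less familiar with JDK 1.4 logging (the java.util.logging package), the following minimal sketch shows how size-based log rotation is typically set up with the standard FileHandler class. It is not the engine's own configuration: the class name, logger name, file pattern, size limit, and file count are all assumptions chosen for illustration.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.logging.FileHandler;
import java.util.logging.Level;
import java.util.logging.Logger;
import java.util.logging.SimpleFormatter;

// Hypothetical example of java.util.logging rotation; not the engine's actual setup.
final class RotatingLogSketch {

    static Logger rotatingLogger(String name, Path dir,
                                 int limitBytes, int fileCount) throws IOException {
        // %g is replaced by the generation number: 0 is the active file,
        // higher numbers are older, rotated files.
        String pattern = dir.resolve(name + ".%g.log").toString();

        // Rotate when the active file reaches limitBytes, keep fileCount files,
        // and append to an existing file instead of truncating it at startup.
        FileHandler handler = new FileHandler(pattern, limitBytes, fileCount, true);
        handler.setFormatter(new SimpleFormatter());
        handler.setLevel(Level.ALL);

        Logger logger = Logger.getLogger(name);
        logger.setUseParentHandlers(false); // avoid duplicate console output
        logger.addHandler(handler);
        logger.setLevel(Level.INFO);
        return logger;
    }
}
```

A call such as rotatingLogger("engine-sketch", logDir, 1_000_000, 5) would write to engine-sketch.0.log and roll older content into engine-sketch.1.log and onward as the size limit is reached.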

Enhancements

ENGINE-305 Start/Stop Script Restart Option Should Check if Engine is Running

The engine’s start/stop script “restart” option has been improved to check whether the engine is running before taking the appropriate action.

ENGINE-17 Engine needs to report ANY changeset application, dimension indexing progress, checkpoint creation

The engine now reports on any behavior that needs to be logged. For example, small changesets and dimension changes are now always reported. Previously, only long-running events were reported.

ENGINE-301 Display of Comma Lists in Dimension Attributes

The admin tool Dimensions page now formats values in the “key” attribute such that the page layout remains whole.

ENGINE-295 ENGINE-297 Clearing Changesets

Clearing the item changeset database has been significantly optimized, reducing the overall cost of reloading an index and also reducing the periodic cost of creating a checkpoint.

ENGINE-304 Text Type Dimension Query Optimization Improvement

When the new fieldedText type dimension was added, no recommendation was made to migrate from the existing type to the new version. Because of the way the fieldedText type dimension is implemented, there may be a slight query performance advantage to using this dimension type. Customers may choose to update their dimension definitions, replacing all text types with fieldedText types. The relevance of matches may change, so customers should verify that the fieldedText version meets their query relevance needs.

ENGINE-183 ENGINE-153 Multiple improvements to the stability and performance of text dimensions

Multiple improvements to the stability and performance of text dimensions.

Behavioral Changes

ENGINE-319 Empty Changeset HTTP status code

If you use curl or another application to submit an empty changeset to the engine, the new response status code is 204 “No Content”. Previously, the engine would respond with HTTP status code 500 “Internal Server Error”.
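Client code that submits changesets may need to treat 204 as a normal outcome rather than an error. The sketch below shows one way a Java client could do that with the standard HttpURLConnection class. The class and method names are hypothetical, and the endpoint URL and request details (such as any required headers) are assumptions that depend on the deployment.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;

// Hypothetical client sketch; the changeset endpoint and payload format are assumptions.
final class EmptyChangesetSketch {

    // Posts the given bytes and returns the HTTP status code.
    static int postChangeset(URL endpoint, byte[] body) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) endpoint.openConnection();
        try {
            conn.setRequestMethod("POST");
            conn.setDoOutput(true);
            conn.setFixedLengthStreamingMode(body.length);
            try (OutputStream out = conn.getOutputStream()) {
                out.write(body);
            }
            return conn.getResponseCode();
        } finally {
            conn.disconnect();
        }
    }

    // Interprets the status codes mentioned in this release note.
    static String describe(int status) {
        if (status == HttpURLConnection.HTTP_NO_CONTENT) {
            return "empty changeset, nothing to apply";
        }
        if (status == HttpURLConnection.HTTP_INTERNAL_ERROR) {
            return "server error";
        }
        return "status " + status;
    }
}
```

Clients written against earlier versions that treated every non-200 response as a failure may want to add an explicit branch for 204.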

ENGINE-315 JDK 1.4 Logging

The engine’s log file no longer rolls over when the engine starts. The JDK 1.4 logging system manages engine log rollover when the log file reaches capacity.

ENGINE-324 Rename engine “lucene” folder to “text”

The folder that holds the engine’s text dimensions, when stored in the file system, has been renamed from “lucene” to “text”. After installing version 2.8.5, the original “lucene” folder can be deleted.

Bug Fixes

ENGINE-4 ENGINE-88 Minor indexing issues were addressed

Minor indexing issues were addressed.

ENGINE-309 Get Properties Empty String Query Value

The top-level query field “properties” would flag an error if the property list was the empty string. This has been corrected. The empty string is now interpreted to mean all properties.