3.7.0 (June 6 2012)

New Features

ENGINE-700 Full-Width CJK Characters Converted to ASCII

This feature is in preparation for future support of CJK (Chinese, Japanese & Korean) character sets. When enabled for text dimensions, any full-width CJK character will be converted to its ASCII (half-width) equivalent.

Here is an incomplete list of full-width characters that will be converted: !"#$%&'()*+,-./;<=>?@ABCXYZ[\]^`abc ... xyz{|}~ and 0123456789 (0xff01 through 0xff5e).

To enable the conversion, set the new text type dimension attribute normalizeFullWidthChars="true".
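
The conversion amounts to a fixed offset: each code point in the range 0xff01 through 0xff5e sits exactly 0xfee0 above its ASCII counterpart (0x21 through 0x7e). The Python sketch below illustrates that arithmetic only. It is not the engine's implementation, and the function name to_half_width is hypothetical.

```python
# Illustrative sketch only; not the engine's implementation.
FULL_WIDTH_START = 0xFF01  # full-width '!'
FULL_WIDTH_END = 0xFF5E    # full-width '~'
OFFSET = 0xFEE0            # distance from a full-width form to its ASCII equivalent


def to_half_width(text: str) -> str:
    """Replace full-width forms (0xff01-0xff5e) with their ASCII equivalents."""
    return "".join(
        chr(ord(ch) - OFFSET) if FULL_WIDTH_START <= ord(ch) <= FULL_WIDTH_END else ch
        for ch in text
    )
```

Characters outside that range, including ordinary ASCII and CJK ideographs, pass through unchanged.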

Improvements

ENGINE-744 Large Changeset Transfer Progress

The log file now shows progress when transferring the bytes of large changesets. Previously, the log file did not indicate any progress during the transfer.

ENGINE-735 Changeset Application Performance

Applying changesets on engine startup and during feed synchronization is now dramatically faster.

ENGINE-701 ENGINE-677 Text Dimension Improvements

Synonyms are no longer applied to stemmed variations of words. When synonyms expand to phrases, they must match the phrase exactly (no slop allowed). Word split logic on delimiters has been improved.

For example, searching for pit-bull matches pit-bull, pitbull, or pit bull; it does not match the individual words “pit” or “bull” alone.

The same logic holds for synonym expansion. Searching for 24 (with a synonym mapping to twenty-four) no longer matches documents that contain only twenty or only four.
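
The matching rule described above can be sketched in Python as follows. This is a simplified illustration, not the engine's code: the delimiter set and the function names query_variants and matches are assumptions, and real tokenization is more involved.

```python
import re

# Illustrative sketch only; the delimiter set and function names are assumptions.
DELIMITERS = re.compile(r"[-_/]")


def query_variants(term: str) -> set[str]:
    """Return the forms a delimited term is allowed to match."""
    parts = [p for p in DELIMITERS.split(term) if p]
    if len(parts) < 2:
        return {term}
    # The original form, the joined form, and the exact phrase.
    return {term, "".join(parts), " ".join(parts)}


def matches(term: str, document: str) -> bool:
    """True if the document contains any variant as an exact, adjacent word sequence."""
    doc_words = document.lower().split()
    for variant in query_variants(term.lower()):
        words = variant.split()
        n = len(words)
        if n == 0:
            continue  # nothing to match for an empty term
        # Words must appear adjacent and in order (no slop).
        if any(doc_words[i:i + n] == words for i in range(len(doc_words) - n + 1)):
            return True
    return False
```

Under these assumptions, matches("pit-bull", "my pit bull is friendly") is true, while matches("pit-bull", "a bull in a pit") is false because the two words are not adjacent and in order.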

Behavioral Changes

ENGINE-738 Invalid SortBy Dimension Log Level

Previously, if a query sortBy criterion referred to an undefined dimension, a warning message was written to the log file. The log level for this message has been reduced to FINE, so it is now suppressed from the log, which reduces file bloat.

ENGINE-701 Default Value ignoreInverseDocumentFrequency is now true

Previously, ignoreInverseDocumentFrequency defaulted to false (inverse document frequency was not ignored). Tests with the improved text type dimension features demonstrated that ignoreInverseDocumentFrequency should default to true.

ENGINE-701 Obsolete Features

The ignoreTermFrequency feature has been made obsolete by the improved text pipeline. The experimental Lucene query has also been removed because its features have been superseded by the latest text type dimension improvements.

The text dimension query criterion key expandPhrases has been removed; use matchRequires instead. The values for matchRequires are any and all, and the default is any. This option culls items from the result set: any does not cull at all, while all culls every item that does not match all of the words in the query.
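
The difference between any and all can be sketched in Python as below. This is an illustration of the culling rule only, not the engine's API: the function cull and its parameters are hypothetical, items are represented as plain strings, and the input is assumed to hold candidates that already matched at least one query word.

```python
# Illustrative sketch only; names other than the values "any" and "all" are hypothetical.
def cull(items, query, match_requires="any"):
    """Filter candidate items according to a matchRequires-style setting."""
    if match_requires not in ("any", "all"):
        raise ValueError("matchRequires must be 'any' or 'all'")
    if match_requires == "any":
        return list(items)  # no culling at all
    query_words = set(query.lower().split())
    # Keep only items that contain every word in the query.
    return [item for item in items if query_words <= set(item.lower().split())]
```

With the candidates "red running shoes", "red hat" and "running club" and the query "red running", the any setting keeps all three, while all keeps only "red running shoes".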

Bug Fixes

ENGINE-726 Text Dimension Matches with Phrases Across Fields

In certain text dimension configurations using multiple fields, a search whose words span field boundaries now properly scores items that match more exactly across field boundaries, relative to items in which all of the query words are found anywhere in a single field.

Assuming that we’re looking for a person who is associated with a company, we might create a text dimension with Name, Company and Bio fields.

ENGINE-737 Poison Pill on Engine Startup / Invalid Feed Interval

If an engine feed interval was set to 0 while the feed was still enabled, the engine would fail on startup. This has been fixed.

ENGINE-765 HTTP Client Timeouts

Incoming and outgoing HTTP requests were not correctly handling timeout situations. This release corrects the logic for handling timeouts, particularly as it relates to the phone-home monitor.