4.2 (May 24 2017)

Compatibility Changes

ENGINE-1057 Paged search results now have a soft limit of 10,000

Any combination of startIndex and pageSize that exceeds this limit will act as if the search only contained 10,000 matches.

See max-paged-docs to learn about relaxing this limit.
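
As a rough client-side model of this behavior, the sketch below computes how many items a request can return once the limit is applied. It assumes the limit works exactly as described above, by treating the result set as if it held at most 10,000 matches. The function name visible_count and its parameters are hypothetical and are not part of the engine's API.

```python
MAX_PAGED_DOCS = 10_000  # default soft limit described above


def visible_count(start_index, page_size, total_size, max_paged_docs=MAX_PAGED_DOCS):
    """Return how many items a page request can contain.

    Client-side model only: assumes the engine acts as if the search
    contained at most max_paged_docs matches.
    """
    effective_total = min(total_size, max_paged_docs)
    remaining = max(0, effective_total - start_index)
    return min(page_size, remaining)


print(visible_count(9995, 10, 50_000))    # prints 5
print(visible_count(10_000, 10, 50_000))  # prints 0
```

A page that straddles the limit is shortened, and a page that starts at or beyond it comes back empty.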

ENGINE-1054 No longer restricts startIndex to be less than totalSize

If you request a page that starts after the last result, the returned startIndex is no longer capped at the totalSize of the results.

Query:

{
  "startIndex": 100,
  "pageSize": 10,
  "criteria":[{"dimension":"example"}]
}

Now preserves startIndex:

{
  "itemIds": [],
  "exactMatches": [],
  "relevanceValues": [],
  "pageSize": 10,
  "currentPageSize": 0,
  "startIndex": 100,
  "exactSize": 24,
  "totalSize": 59,
  "datasetSize": 5000
}

Previously startIndex was capped based on totalSize:

{
  "itemIds": [],
  "exactMatches": [],
  "relevanceValues": [],
  "pageSize": 10,
  "currentPageSize": 0,
  "startIndex": 59,
  "exactSize": 24,
  "totalSize": 59,
  "datasetSize": 5000
}
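
Because startIndex is now echoed back unchanged, a client that used the capped value to detect the end of the results may want an explicit check instead. The following is a minimal sketch, assuming the response has been parsed into a dict with the fields shown above; the helper name is_past_end is hypothetical.

```python
def is_past_end(response):
    """True when the requested page starts at or after the last result."""
    return (
        response["currentPageSize"] == 0
        and response["startIndex"] >= response["totalSize"]
    )


# The 4.2 response shown above, reduced to the fields the check needs.
response = {
    "itemIds": [],
    "pageSize": 10,
    "currentPageSize": 0,
    "startIndex": 100,
    "totalSize": 59,
}
print(is_past_end(response))  # prints True
```

The same check also holds for the older capped response, where startIndex equals totalSize.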

Improvements

ENGINE-1041 Migrate away from legacy numeric fields to the new Lucene 6 point fields

Lucene 6 introduced better support for indexing numeric data with its new N-dimensional point fields, which replace the previous term/trie-based numeric fields. The engine now takes advantage of the new field types for integer, double, long, time, and geoloc dimensions.

ENGINE-1064 No longer show zero count rows on the indices tab for keyword dimensions

Rows with a count of zero are now filtered out when displaying the indices tab for keyword dimensions. This is helpful for quick data navigation when combined with the provider and content type filters.

ENGINE-1049 Upgrades Apache Lucene from 6.2.1 to 6.5.1

The Apache Lucene library has been upgraded from 6.2.1 to 6.5.1.

ENGINE-1057 Adds soft limit for deep paging and groupBy topN

Using startIndex and pageSize to obtain search results past the first 10,000 hits is now prevented. Similarly, a previously undocumented limit of 100 for the groupBy topN option is now documented and configurable.

See max-paged-docs and max-groupby-topn to learn about relaxing these limits.

ENGINE-1050 Support for specifying different sort criteria for fuzzy tail

Adds sortByFuzzy as an option that allows you to specify a different sort order for the fuzzy tail. When enabled, this forces exact matches to be ordered first.

Example to randomize exact matches and order fuzzy matches by distance to a location:

{
  "criteria": [{"dimension":"example"}],
  "pageSize": 20,
  "sortBy": [
    {"builtin":"random"}
  ],
  "sortByFuzzy": [
    {"dimension":"location","longitude":-74.04,"latitude":40.69}
  ]
}
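
A client might assemble the same request body programmatically. The sketch below is illustrative only: the helper name fuzzy_sorted_query is hypothetical, and the field names are taken from the example above.

```python
import json


def fuzzy_sorted_query(criteria, page_size, sort_by, sort_by_fuzzy):
    """Build a query body with separate sort orders for exact and fuzzy matches."""
    return {
        "criteria": criteria,
        "pageSize": page_size,
        "sortBy": sort_by,            # applied to exact matches
        "sortByFuzzy": sort_by_fuzzy,  # applied to the fuzzy tail
    }


query = fuzzy_sorted_query(
    criteria=[{"dimension": "example"}],
    page_size=20,
    sort_by=[{"builtin": "random"}],
    sort_by_fuzzy=[{"dimension": "location", "longitude": -74.04, "latitude": 40.69}],
)
body = json.dumps(query)  # JSON text ready to send as the request body
```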

ENGINE-1056 Debug API no longer exposes placeholder fuzzy queries when there is no fuzzy tail

The debug API response will no longer contain queryFuzzy or explainFuzzy when there is no fuzzy tail. Previously a fake empty fuzzy tail would be described.

ENGINE-1055 Invalid queries now return an HTTP 400 instead of 500

Queries that failed validation previously returned an HTTP 500 status (server error). They now return an HTTP 400 status (bad request) and log the payload.

You can trigger this by POSTing a query with either invalid JSON or a negative startIndex.
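
A client can catch both of these cases before sending a request. The sketch below is a hypothetical pre-flight check covering only the two triggers named above. The engine's actual validation rules are broader and are not reproduced here, and the function name validation_errors is an assumption made for illustration.

```python
import json


def validation_errors(payload_text):
    """Return a list of problems that would make the query invalid.

    Covers only invalid JSON and a negative startIndex; this is not
    the engine's own validation logic.
    """
    try:
        query = json.loads(payload_text)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]
    start_index = query.get("startIndex", 0) if isinstance(query, dict) else 0
    if isinstance(start_index, (int, float)) and start_index < 0:
        return ["startIndex must not be negative"]
    return []


print(validation_errors('{"startIndex": -1}'))
# prints ['startIndex must not be negative']
```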

ENGINE-1058 Upgrades dependent libraries

Upgrades dependent libraries.

library	previous	current
icu4j	56.1	58.2
jetty	8.1.15	9.4.1
springframework	4.1.8	4.3.6
commons-fileupload	1.3.1	1.3.2
commons-io	2.4	2.5
slf4j	1.7.13	1.7.22

Bug Fixes

ENGINE-1063 Indices tab forgets the selected dimension when you change the provider or content type

If you have a provider or content type filter dimension configured in the settings tab of the admin interface, your current selection was lost whenever you changed the provider or content type while viewing a single dimension’s data on the indices tab. You can now change these selections without losing your place.

ENGINE-1059 XML entity limit in Java 8u101

In release 4.1 we migrated from our older bundled Woodstox XML parser to the one provided by the JVM. Oracle updated this parser in Java 8u101 to enforce a default limit on the number of entity expansions that can happen. Some client changesets hit this limit and cannot be processed, generating an error like this:

[20170105 10:04:10,698] [0000001c] [ERROR] [com.t11e.discovery.lucene.ChangesetIndexUpdater] [index-0] Problem processing changesets, will no longer process changes
java.lang.RuntimeException: java.lang.RuntimeException: javax.xml.stream.XMLStreamException: ParseError at [row,col]:[929140,10699]
Message: JAXP00010004: The accumulated size of entities is "50,000,001" that exceeded the "50,000,000" limit set by "FEATURE_SECURE_PROCESSING".
        at com.t11e.discovery.lucene.ChangesetRecoverer.apply(ChangesetRecoverer.java:472)

The engine now sets the appropriate JAXP property to ensure this limit is not enforced.

This affects Java 8u101 and higher; clients still using Java 8u91 or lower are not affected by this bug. If you aren’t ready to upgrade to release 4.2 and need a temporary workaround for release 4.1, you can add -Djdk.xml.totalEntitySizeLimit=0 to the jvm.args line in your discovery.properties file.
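
To decide whether a given 4.1 installation needs the workaround, a script could inspect the Java version string. The sketch below is a hypothetical helper that assumes Java 8 style version strings such as "1.8.0_101" and applies the 8u101 threshold stated above. Any other format returns False rather than guessing.

```python
import re


def hits_entity_limit(java_version):
    """True for Java 8 update 101 or later, e.g. "1.8.0_101".

    Only Java 8 style version strings are recognised; this is an
    illustrative check, not part of the engine.
    """
    match = re.match(r"^1\.8\.0_(\d+)", java_version)
    return bool(match) and int(match.group(1)) >= 101


print(hits_entity_limit("1.8.0_101"))  # prints True
print(hits_entity_limit("1.8.0_91"))   # prints False
```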

ENGINE-1052 Queries that use groupBy and a custom sortBy can have an incorrectly populated exactMatches array

This bug is triggered when using groupBy with a custom sortBy that does not place exact matches first. When populating the exactMatches array, the engine incorrectly marked a group as exact only if its first matching document (based on the current sortBy) was exact, instead of marking a group as exact when any of its matching documents is exact. With this change, the contents of the exactMatches array agree with exactSize and any facet or drillDown counts.

Query:

{
  "criteria": [{"dimension":"example"}],
  "groupBy": {"dimension":"group"},
  "sortBy": [{"builtin":"exactMatch","reverse":true}],
  "pageSize": 10
}

Could previously return:

{
  "itemIds": ["g1","g2","g3"],
  "exactMatches": [false,false,true],
  "relevanceValues": [1.0,1.0,1.0],
  "isGrouped": true,
  "pageSize": 10,
  "currentPageSize": 3,
  "startIndex": 0,
  "exactSize": 2,
  "totalSize": 3,
  "datasetSize": 5000
}

And will now return:

{
  "itemIds": ["g1","g2","g3"],
  "exactMatches": [true,false,true],
  "relevanceValues": [1.0,1.0,1.0],
  "isGrouped": true,
  "pageSize": 10,
  "currentPageSize": 3,
  "startIndex": 0,
  "exactSize": 2,
  "totalSize": 3,
  "datasetSize": 5000
}
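
The difference between the two rules can be sketched in a few lines. This is an illustration of the logic described above, not the engine's implementation. The per-group document data is hypothetical and chosen to reproduce the exactMatches arrays in the example.

```python
def group_is_exact_before(docs_exact_in_sort_order):
    """Pre-4.2 rule: only the first document in sort order was checked."""
    return docs_exact_in_sort_order[0]


def group_is_exact(docs_exact_in_sort_order):
    """4.2 rule: a group is exact when any matching document is exact."""
    return any(docs_exact_in_sort_order)


# Hypothetical data: the exact flag of each group's matching documents,
# ordered by a sortBy that places fuzzy matches first.
groups = {
    "g1": [False, True],
    "g2": [False],
    "g3": [True],
}

print([group_is_exact_before(docs) for docs in groups.values()])
# prints [False, False, True]
print([group_is_exact(docs) for docs in groups.values()])
# prints [True, False, True]
```

Under the new rule two groups are exact, which agrees with the exactSize of 2 in the response.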

ENGINE-1053 Returned page can be too large when spanning the exact/fuzzy boundary with groupBy enabled

If the current page spanned the exact and fuzzy boundary when using groupBy, the parallel arrays in the response would be too long. This bug was introduced in release 4.0.

Query:

{
  "criteria": [{"dimension":"mysearch"}],
  "groupBy": {"dimension": "mygroup"},
  "startIndex": 1,
  "pageSize": 2
}

Could previously return:

{
  "itemIds": ["g2","g3","g4"],
  "exactMatches": [true,false,false],
  "relevanceValues": [1.0,0.0,0.0],
  "isGrouped": true,
  "pageSize": 2,
  "currentPageSize": 3,
  "startIndex": 1,
  "exactSize": 2,
  "totalSize": 35,
  "datasetSize": 5000
}

And will now return:

{
  "itemIds": ["g2","g3"],
  "exactMatches": [true,false],
  "relevanceValues": [1.0,0.0],
  "isGrouped": true,
  "pageSize": 2,
  "currentPageSize": 2,
  "startIndex": 1,
  "exactSize": 2,
  "totalSize": 35,
  "datasetSize": 5000
}
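
A client or regression test could guard against this class of bug by checking that the parallel arrays agree with each other and with the page size. The sketch below assumes the response has been parsed into a dict with the fields shown above; the function name page_is_consistent is hypothetical.

```python
def page_is_consistent(response):
    """Check that the parallel arrays match currentPageSize and fit in pageSize."""
    lengths = {
        len(response["itemIds"]),
        len(response["exactMatches"]),
        len(response["relevanceValues"]),
    }
    return (
        lengths == {response["currentPageSize"]}
        and response["currentPageSize"] <= response["pageSize"]
    )


before = {
    "itemIds": ["g2", "g3", "g4"],
    "exactMatches": [True, False, False],
    "relevanceValues": [1.0, 0.0, 0.0],
    "pageSize": 2,
    "currentPageSize": 3,
}
after = {
    "itemIds": ["g2", "g3"],
    "exactMatches": [True, False],
    "relevanceValues": [1.0, 0.0],
    "pageSize": 2,
    "currentPageSize": 2,
}
print(page_is_consistent(before))  # prints False
print(page_is_consistent(after))   # prints True
```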
