Search Services 2.0.0 Release


angelborroy
Alfresco Employee

Alfresco Search Services 2.0.0 has been released.

 

Obtaining this release

The ZIP Distribution file can be downloaded from the following URL:

https://download.alfresco.com/cloudfront/release/community/SearchServices/2.0.0/alfresco-search-serv...

If you are using Docker, you can get the new Image by typing:

 

docker pull alfresco/alfresco-search-services:2.0.0

 

And finally, the source code for this version is available in:

https://github.com/Alfresco/SearchServices/tree/2.0.0

 

Alfresco Insight Engine 2.0.0 has also been released.

Insight Engine and Insight Engine Zeppelin artifacts can be downloaded from Quay.io using Enterprise credentials.

 

Upgrading

Search 2.0.0 will require a full re-index upon deployment. This is necessary to accommodate the removal of the index store and improved storage of date fields.

 

New Features

Search Services

  • Alfresco Index Store (Solr content store) removal
    • Solr schema simplification
    • Using atomic updates approach instead of removing and creating documents for updating operations
      • This feature relies on the Solr transaction log, which is now enabled by default. The transaction log increases the storage size of your indexes, but it must not be disabled in the solrconfig.xml file because atomic updates require it
    • Improved management of replication relying on default SOLR mechanism
    • Review your backup and restore procedures, as the folder $SOLR_HOME/contentstore is no longer created
  • Fine control for FIX tool in Search Services
    • Allow targeting of FIX operations to transactions in a particular timeframe
    • Addition of dry run option for FIX tool to analyse effects before committing
    • Limitations on the number of simultaneous processes that can be run
    • Implementation of a Disable command allowing FIX actions to be cancelled
    • New parameters added to the Core admin FIX tool (at request time and in configuration):
      • dryRun: optional request parameter, which can be true or false (defaults to true). When true, the health report is generated, but the reindex work is not scheduled
      • fromTxCommitTime: optional request parameter that sets the lower bound (the minimum transaction commit time) of the target transactions to check/fix
      • toTxCommitTime: optional request parameter that sets the upper bound (the maximum transaction commit time) of the target transactions to check/fix
      • maxScheduledTransactions: optional request parameter that controls the maximum number of transactions that will be scheduled. If it is not specified in the request, the system falls back to the following property in solrcore.properties: alfresco.admin.fix.maxScheduledTransactions = 500
      • New response shape
        {
          "responseHeader": {
            "QTime": 1,
            "status": 0
          },
          "action": {
            "status": "scheduled",
            "txToReindex": {
              "txInIndexNotInDb": {
                "192": 282,  <- Tx 192 is associated with 282 nodes (they will be deleted)
                "827": 99,   <- Tx 827 is associated with 99 nodes (they will be deleted)
                ...
              },
              "duplicatedTx": {
                "992": 8,    <- Tx 992 is associated with 8 nodes (they will be deleted)
                "127": 82,   <- Tx 127 is associated with 82 nodes (they will be deleted)
                ...
              },
              "missingTx": {
                "888": 84,   <- Tx 888 is associated with 84 nodes (they will be added/replaced in the index)
                "929": 12,   <- Tx 929 is associated with 12 nodes (they will be added/replaced in the index)
                ...
              }
            },
            "aclChangeSetToReindex": {
              // this is pretty much the same,
              // ACLTXID -> ACLs counts instead of TXID -> DBID
            }
          }
        }
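
As a rough illustration of the FIX parameters and response shape described above, here is a minimal Python sketch. It only builds the request URL and summarises a report; it does not call Solr. The host, port and /solr/admin/cores path are assumptions based on the usual Solr Core admin layout, the commit time is assumed to be expressed in epoch milliseconds, and the helper names build_fix_url and summarise_fix_report are hypothetical.

```python
from urllib.parse import urlencode

# Assumption: Core admin endpoint of a local Search Services instance.
ADMIN_URL = "http://localhost:8983/solr/admin/cores"


def build_fix_url(dry_run=True, from_tx_commit_time=None,
                  to_tx_commit_time=None, max_scheduled_transactions=None):
    """Build a FIX request URL from the optional request parameters."""
    params = {"action": "FIX", "dryRun": str(dry_run).lower()}
    if from_tx_commit_time is not None:
        params["fromTxCommitTime"] = from_tx_commit_time
    if to_tx_commit_time is not None:
        params["toTxCommitTime"] = to_tx_commit_time
    if max_scheduled_transactions is not None:
        params["maxScheduledTransactions"] = max_scheduled_transactions
    return ADMIN_URL + "?" + urlencode(params)


def summarise_fix_report(action):
    """Count transactions and nodes per category under "txToReindex"."""
    summary = {}
    for category, txs in action.get("txToReindex", {}).items():
        summary[category] = {
            "transactions": len(txs),
            "nodes": sum(txs.values()),
        }
    return summary


# Sample "action" section using the figures from the response shape above.
sample_action = {
    "status": "scheduled",
    "txToReindex": {
        "txInIndexNotInDb": {"192": 282, "827": 99},
        "duplicatedTx": {"992": 8, "127": 82},
        "missingTx": {"888": 84, "929": 12},
    },
}

print(build_fix_url(from_tx_commit_time=1577836800000,
                    max_scheduled_transactions=500))
print(summarise_fix_report(sample_action))
```

Running a dry run first and inspecting the summary before repeating the request with dryRun=false keeps the reindex work from being scheduled until the health report has been reviewed.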

 

  • Improvements to exact term search behaviour
    • Use of the "=" operator now provides exact term match rather than full field match
    • Using the following search string in Share "=Taxi Driver", you'll find documents containing exactly these words in this order.
  • Refactoring of cascade tracker for large transactions for improved indexing performance

Increasing the Transaction Batch Size for nodes and ACLs improves performance only until the maximum useful value for your deployment is reached. Beyond that point, you can keep increasing the batch size, but performance will not change.

alfresco.transactionDocsBatchSize (default 2000)
alfresco.changeSetAclsBatchSize   (default 2000)

Increasing the Node Batch Size can improve performance while you are below the right value for your deployment. Beyond that point, you can keep increasing the batch size, but performance will be penalised.

alfresco.nodeBatchSize                   (default 50)
alfresco.cascade.tracker.nodeBatchSize   (default 10)
alfresco.contentUpdateBatchSize          (default 2000)
alfresco.aclBatchSize                    (default 100)

Increasing the maximum number of Parallel Threads improved performance until the maximum number for our deployment was reached. However, in a real-world deployment it may be useful to use a lower number to avoid impacting other processes.

alfresco.metadata.tracker.maxParallelism   (default 32)
alfresco.cascade.tracker.maxParallelism    (default 32)
alfresco.content.tracker.maxParallelism    (default 32)
alfresco.acl.tracker.maxParallelism        (default 32)
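
When tuning these values it can help to see at a glance which ones differ from the defaults listed above. The following Python sketch is one way to do that. It assumes the properties are set in a solrcore.properties style file (key=value lines); the function name find_overrides and the sample fragment are hypothetical.

```python
# Defaults as listed in the release notes above.
DEFAULTS = {
    "alfresco.transactionDocsBatchSize": 2000,
    "alfresco.changeSetAclsBatchSize": 2000,
    "alfresco.nodeBatchSize": 50,
    "alfresco.cascade.tracker.nodeBatchSize": 10,
    "alfresco.contentUpdateBatchSize": 2000,
    "alfresco.aclBatchSize": 100,
    "alfresco.metadata.tracker.maxParallelism": 32,
    "alfresco.cascade.tracker.maxParallelism": 32,
    "alfresco.content.tracker.maxParallelism": 32,
    "alfresco.acl.tracker.maxParallelism": 32,
}


def find_overrides(properties_text):
    """Return {name: (default, configured)} for values that differ from the defaults."""
    overrides = {}
    for line in properties_text.splitlines():
        line = line.strip()
        # Skip blank lines, comments and anything that is not key=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, value = (part.strip() for part in line.split("=", 1))
        if name in DEFAULTS and int(value) != DEFAULTS[name]:
            overrides[name] = (DEFAULTS[name], int(value))
    return overrides


# Hypothetical properties fragment, for illustration only.
sample = """
# tuning for a small test deployment
alfresco.nodeBatchSize=100
alfresco.acl.tracker.maxParallelism = 8
alfresco.aclBatchSize=100
"""

print(find_overrides(sample))
```

Changing one property at a time and measuring indexing throughput after each change makes it easier to spot the point where a larger value stops helping.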

  • Improved storage of date fields
# Date/Datetime fields only: if this property is set to true (default value) each date/datetime field
#
# - will be indexed as a whole value
# - will generate additional fields corresponding to its constituent parts (year, month, day, hour, minute, second)
#
# If this property is set to false then only the whole value is indexed.
# This will result in a smaller index, but date function support will be disabled.
#
alfresco.destructureDateFields=true
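
To make the effect of this property more concrete, the Python sketch below shows what destructuring a datetime into its constituent parts could look like. It is an illustration of the idea only, not the Search Services implementation: the field name cm_created and the "_year", "_month" style suffixes are hypothetical, as is the function name destructure_date_field.

```python
from datetime import datetime, timezone

PARTS = ("year", "month", "day", "hour", "minute", "second")


def destructure_date_field(field_name, value, destructure=True):
    """Return the fields an indexer might emit for one date/datetime value."""
    # The whole value is always indexed.
    fields = {field_name: value.isoformat()}
    if destructure:
        # One additional field per constituent part (hypothetical naming).
        for part in PARTS:
            fields[f"{field_name}_{part}"] = getattr(value, part)
    return fields


created = datetime(2020, 3, 15, 10, 30, 45, tzinfo=timezone.utc)

# destructure=True: whole value plus six part fields
print(destructure_date_field("cm_created", created))

# destructure=False: whole value only, hence the smaller index
print(destructure_date_field("cm_created", created, destructure=False))
```

The extra part fields are what allow date functions to filter or group by year, month and so on, which is why disabling the property also disables date function support.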

 

Insight Engine

  • Added support for Excel and Tableau to Alfresco Search & Insight Engine
  • Includes multiple improvements to SQL support, particularly for date functions
  • Requires 3rd party ODBC/JDBC drivers from CData - https://www.cdata.com/drivers/alfresco/

 

Third Party Product Versions

  • Solr 6.6.5
  • Jetty 9.3.27.v20190418
  • Zeppelin 0.8.2

 

Compatibility

  • Search Services 2.0.0 works with ACS 6.2
  • Insight Engine 2.0.0 works with ACS 6.2 and AGS 3.2.0
  • JVM 11 is required

 

Details

 

  • SEARCH-1315: SQL compliant date/datetime management in InsightEngine
  • SEARCH-1825: Support for Date operations
  • SEARCH-1826: Field types and data type conversion issue with Date fields
  • SEARCH-2151: Support for Date Functions in SELECT Clause
  • SEARCH-2166: Support for Date Functions in WHERE Clause
  • SEARCH-2167: Support for Date Functions in GROUP BY Clause
  • SEARCH-2173: Support for Date Functions in ORDER BY Clause
  • SEARCH-2171: Index dates in a decomposed way (year, month, day and etc)
  • SEARCH-2296: Support for QUARTER function
  • SEARCH-2304: Support for CAST AS TIMESTAMP function for Date and Datetime fields
  • SEARCH-2316: Error is produced when trying to produce a graph for number of nodes created by year and type
  • SEARCH-2317: Can't apply a filter to a date field
  • SEARCH-2353: Support for TIMESTAMPADD(timeUnit, integer, datetime)
  • SEARCH-2354: Support for DAYOFMONTH, DAYOFWEEK, DAYOFYEAR
  • SEARCH-2366: Support SQL TIMESTAMP format
  • SEARCH-2132: InsightEngine builds virtual "alfresco" table calling IndexSearcher methods several times

 

  • SEARCH-1369: Update/Upgrade Solr configuration for enabling graph filters support
  • SEARCH-1371: Lucene Match Version Upgrade (from 4.9 to LATEST)
  • SEARCH-2227: Improve fields and field types definition in SolrSchema
  • SEARCH-1688: Remove AlfrescoClusteringComponent in favour of the built-in Solr Clustering Component
  • SEARCH-1689: Changes on CachedDocTransformer
  • SEARCH-1692: Remove SolrContentStoreTest
  • SEARCH-1693: Changes on Highlighter as consequence of ContentStore removal
  • SEARCH-1694: Changes on Fingerprint as consequence of ContentStore removal
  • SEARCH-1702: Changes on SolrInformationServer as consequence of ContentStore removal
  • SEARCH-1707: Changes on schema.xml as consequence of ContentStore removal
  • SEARCH-2024: Replace SolrContentStore with Fake implementation
  • SEARCH-2025: Remove SolrContentStore classes
  • SEARCH-2044: New DocTransformer for converting Date fields
  • SEARCH-2059: Changes required by the new content status detection mechanism

 

  • SEARCH-2233: Expose additional parameters for FIX tool in REST API
  • SEARCH-2248: Add a limit to a number of txns per run of FIX tool
  • SEARCH-2330: Implement disable/enable indexing commands for tracking process

 

  • SEARCH-2356: Support setting default search log level with docker installations
  • SEARCH-2372: Update base docker image
  • SEARCH-2127: Load balancer config for multiple search slave nodes
  • SEARCH-2118: Official support of OpenSSL 1.0.2 series seems to be ended
  • SEARCH-2129: AclTracker is indexing ACLs more than one time
  • SEARCH-2331: SolrException: Invalid Number when searching on d:int property with value exceeding Integer.MAX_VALUE
  • SEARCH-2126: In sharded mode, the facets with 0 results are suppressed in the response unlike standalone mode

 

Known Issues

  • SEARCH-2316: Error is produced in Tableau when trying to produce a graph for number of nodes created by year and type
  • SEARCH-2297: Fields in a custom model defined as a DATE can get recognised as a STRING
  • SEARCH-2400: Default values for "shared.properties" have changed from 1.4.
  • SEARCH-2145: Slow reindex speed in 2.0 comparing to 1.4 in 50 million repository
  • SEARCH-2461: Exact term queries behave differently against the DB and SOLR
  • SEARCH-2460: Solr Action REINDEX doesn't work when using "nodeid" parameter
  • SEARCH-2496: No Helm Charts are available
  • SEARCH-2538: Solr properties belonging to an Aspect are not removed when the aspect is removed
About the Author
Angel Borroy is a Hyland Developer Evangelist. Over the last 15 years, he has been working as a software architect on Java, BPM, document management and electronic signatures. In recent years he has worked with Alfresco to customize several implementations in large organizations and to provide add-ons to the Community based on Record Management and Electronic Signature. He writes (sometimes) on his personal blog http://angelborroy.wordpress.com. He is a (proud) member of the Order of the Bee.