We are upgrading our Alfresco 5.2 community edition with Solr 4 index to Alfresco 6.1 with Solr 6.
Everything went fine until we tried the upgrade with our production data. The Solr 6 index is built successfully within 1-2 hours and searching works, but the Cascade Tracker, which updates the paths of child nodes when a parent node is moved, will run for months.
The problem is that getting the node metadata from the repository takes 5-10 s for each child node, and if a child node needs to be updated in the index, the update takes another 30-50 s.
There are roughly 100,000 cascading transactions, each with hundreds of child nodes, so at this rate it will take about a year to finish.
To find this out, I cloned Alfresco Search Service and added logging and time measurements to the CascadeTracker and SolrInformationServer.
After some hours, Solr consumes all available memory (no matter how much we give it), uses a lot of CPU, and slows down the whole system. Meanwhile, the PostgreSQL database seems very quiet and has little to do.
I tried more RAM (32 GB), more CPUs (6), and more worker threads, but nothing helps.
I have not found anything about this problem and I am running out of ideas.
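For anyone who wants to check the estimate above, here is a rough back-of-the-envelope sketch in Python. The function name and the parallelism parameter are hypothetical, and the example inputs (100 child nodes per transaction, 5 s per child node, one worker) are illustrative values taken from the low end of the ranges mentioned, not measured averages. It also ignores the extra 30-50 s for nodes that actually need an index update.

```python
SECONDS_PER_DAY = 86_400


def estimate_days(transactions, children_per_tx, seconds_per_child, parallelism=1):
    """Estimate the wall-clock days needed to process all cascading transactions.

    Assumes every child node costs the same time and that work is spread
    evenly over `parallelism` workers (a simplification, not measured).
    """
    total_seconds = transactions * children_per_tx * seconds_per_child
    return total_seconds / parallelism / SECONDS_PER_DAY


# Illustrative inputs only: 100,000 cascading transactions,
# an assumed 100 child nodes each, an assumed 5 s metadata fetch per node.
print(f"{estimate_days(100_000, 100, 5):.0f} days with one worker")
print(f"{estimate_days(100_000, 100, 5, parallelism=4):.0f} days with four workers")
```

The result depends heavily on the assumed averages, so this only shows the order of magnitude: even with optimistic inputs the tracker would need many months.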
Some facts about our setup:
- approx. 5,000,000 nodes and 1,000,000 transactions
- 5000 sites
- Index size: 2GB
- alf_data: 75GB
- Alfresco Search Service 188.8.131.52
- ACS 6.1.2-ga
- Postgres 11.5
- docker-compose deployment
Any ideas are welcome!