Mismatched Solr Index numDocs and Size on Disk

Member II


Hi,

Can anyone offer some advice, please? We have a load-balanced Alfresco CE 4 environment using Solr for indexing. The environment has two front-end Alfresco servers, each pointing to its own Solr server. The Alfresco environments are synchronized, but we are seeing a difference in the stats and disk space used on the Solr servers: Solr server 2 uses 4 GB more disk space even though its stats show 41558 fewer documents. Is it worth running a repair on the indexes? Rebuilding the indexes takes 4-5 days and leaves the system unusable, so I would rather not go down that road unless it is essential.

Solr Server Index Data

9 Replies
Alfresco Employee

Re: Mismatched Solr Index numDocs and Size on Disk

Is this Alfresco CE 4.0.d?

Software Engineer in Alfresco Search Team.
Member II

Re: Mismatched Solr Index numDocs and Size on Disk

CE 4.2f

Alfresco Employee

Re: Mismatched Solr Index numDocs and Size on Disk

That version does not support clustering, so you will probably run into other errors in addition to the Solr one you found.

Software Engineer in Alfresco Search Team.
Member II

Re: Mismatched Solr Index numDocs and Size on Disk

Clustering isn't configured; the two front-end servers are load balanced without clustering. That isn't the problem.

Alfresco Employee

Re: Mismatched Solr Index numDocs and Size on Disk

So you have two Alfresco webapps, each using its own Solr but sharing the same database and filesystem?

Software Engineer in Alfresco Search Team.
Member II

Re: Mismatched Solr Index numDocs and Size on Disk

Yep.

Alfresco Employee

Re: Mismatched Solr Index numDocs and Size on Disk

When you say that reindexing leaves the system unusable, is that because the process consumes a lot of resources?

If you cannot reindex to bring both Solr indexes back in sync, you can inspect each node to identify and classify the errors.

What looks odd is that maxDoc is 2... in Node 1 but 3... in Node 2. This number should be more or less the same on both nodes, as it is the max value for the ID in the ALF_NODE table.
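A quick way to put the two sets of stats side by side is sketched below. It rests on one fact about stock Lucene/Solr: numDocs counts live documents, while maxDoc also includes documents that were deleted but not yet merged away, so a large gap between the two can be one possible reason for an index using more disk with fewer documents. The function names and the sample numbers are hypothetical placeholders, not the values from the servers in this thread.

```python
def summarize_index(name, num_docs, max_doc):
    """Summarize one Solr core from its numDocs and maxDoc stats.

    maxDoc - numDocs is the number of deleted documents that still
    occupy space on disk until their segments are merged.
    """
    if num_docs > max_doc:
        raise ValueError("numDocs cannot exceed maxDoc")
    deleted = max_doc - num_docs
    ratio = deleted / max_doc if max_doc else 0.0
    return {
        "name": name,
        "numDocs": num_docs,
        "maxDoc": max_doc,
        "deleted": deleted,
        "deleted_ratio": ratio,
    }


def compare_indexes(a, b):
    """Return the differences (b minus a) between two summaries."""
    return {
        "numDocs_diff": b["numDocs"] - a["numDocs"],
        "maxDoc_diff": b["maxDoc"] - a["maxDoc"],
        "deleted_diff": b["deleted"] - a["deleted"],
    }


if __name__ == "__main__":
    # Placeholder values for illustration only (NOT the real stats).
    node1 = summarize_index("node1", num_docs=1000, max_doc=1100)
    node2 = summarize_index("node2", num_docs=950, max_doc=1600)
    print(node1)
    print(node2)
    print(compare_indexes(node1, node2))
```

With real values copied from each server's stats page, a much higher deleted count on one node would suggest that the size difference comes from unmerged deletions rather than from missing documents.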

Software Engineer in Alfresco Search Team.
Member II

Re: Mismatched Solr Index numDocs and Size on Disk

Yes, during a reindex the Solr server doesn't have enough resources to respond to new queries in a timely manner.

The max value in the ALF_NODE table is: 18341293

Partner

Re: Mismatched Solr Index numDocs and Size on Disk

If you trust server 2, I would recommend copying the index directories from that server to server 1. This is a well-known procedure during cluster upgrades (that is, creating the index on a single server and then copying the files to the other servers).

As usual, keep a backup copy of index 1, just in case. Ideally, you should stop both servers before taking the backup and copying the files. Although that may mean a short period of downtime, it will be much faster than performing a full reindex.
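The backup-then-copy steps could be scripted roughly as follows. This is only a minimal sketch: it assumes both Solr servers are already stopped and that the trusted index from server 2 is reachable as a local path on server 1 (for example, after transferring or mounting it). The function name `replace_index` and all paths are hypothetical; adjust them to your own installation.

```python
import shutil
import time
from pathlib import Path


def replace_index(trusted_index: Path, target_index: Path, backup_root: Path) -> Path:
    """Back up target_index, then replace it with a copy of trusted_index.

    Assumes Solr is stopped on both servers so no index files change
    while they are being moved or copied. Returns the backup location.
    """
    if not trusted_index.is_dir():
        raise FileNotFoundError(f"trusted index not found: {trusted_index}")
    if not target_index.is_dir():
        raise FileNotFoundError(f"target index not found: {target_index}")

    backup_root.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    backup = backup_root / f"{target_index.name}-backup-{stamp}"
    if backup.exists():
        raise FileExistsError(f"backup already exists: {backup}")

    # Keep the old index so it can be restored if anything goes wrong.
    shutil.move(str(target_index), str(backup))
    # Copy the trusted index into the place of the old one.
    shutil.copytree(trusted_index, target_index)
    return backup
```

Moving the old index aside instead of deleting it keeps a rollback path: if server 1 misbehaves after the restart, the backup directory can be moved back into place.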

Let us know how it went.

Cheers,

Luis