What is a good way to troubleshoot the Solr 6 database deployed with Alfresco Search Services 1.1.0?
I am using Alfresco Community 201707EA with a Linux / Postgres config, and originally was using the Solr 4 that was deployed by default. My repository has approximately 70K PDFs, and had a 7 GB 'Alfresco' core.
I tested with the easiest Search Services setup: a non-SSL install running in the $ALFRESCO home directory as the same user that runs Alfresco. The only change I made to solr.in.sh was to set SOLR_JAVA_HOME to the same JVM that the Alfresco install is using. I made the appropriate changes to the Alfresco properties files, removed solr4 from the webapps directory, and started both Solr 6 and Alfresco with no errors.
The index starts to build, and I see the amount of CPU activity I would expect from indexing. After maybe 20 minutes the Solr 6 admin page shows 160K for both num docs and max docs, the Alfresco core is under 300 MB, and based on the CPU activity it is finished. This is way too small, obviously. There are NO errors or info messages in any of the logs.
- How do I force a reindex? There is a 'reindex' button, but it doesn't seem to do anything. In Solr 4, I would delete the files in
- Any tips on troubleshooting what is not being indexed? Some content must be indexed, because a search on 'the' returns a few thousand records, but the Solr 6 section of the docs is a little light on troubleshooting.
The total size of the Solr 6 disk usage is 2.6 GB, which is smaller than the 7 GB Solr 4 disk usage. I was expecting it to be larger with the fingerprint hash, but I will start to take a closer look. For instance, I have duplicate documents that aren't showing up as the same or similar with the FTS fingerprint query. I suspect that something is not completing, but there are no info/warn/error messages being posted in the Solr or Alfresco logs. Some documents have fingerprint results, but I'll start another thread with some specific Fingerprint-y questions.
You should compare in terms of number of documents indexed.
But when comparing in terms of size, take into consideration that old deleted documents in Solr 4 cores may still occupy disk space. When a document is deleted in Alfresco, the index in Solr is updated (the entry is not removed). So many deletions, or massive deletions, without a complete reindex may increase the size of your indexes.
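As a rough way to compare document counts rather than disk size, here is a minimal Python sketch that reads the standard Solr CoreAdmin STATUS response and reports how many index entries are deleted-but-not-purged (maxDoc minus numDocs). The base URL, the core name `alfresco`, and the sample numbers are assumptions for illustration only; adjust them to your install, and note that a secured (SSL) setup would need client certificates that this sketch does not handle.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def status_url(base_url, core):
    """Build the CoreAdmin STATUS URL for one core."""
    query = urlencode({"action": "STATUS", "core": core, "wt": "json"})
    return f"{base_url.rstrip('/')}/admin/cores?{query}"


def fetch_status(base_url, core):
    """Fetch the STATUS response as a dict (needs a reachable, non-SSL Solr)."""
    with urlopen(status_url(base_url, core)) as resp:
        return json.load(resp)


def summarize_index(status, core):
    """Extract document counts from a parsed STATUS response."""
    index = status["status"][core]["index"]
    num_docs = index["numDocs"]
    max_doc = index["maxDoc"]
    deleted = max_doc - num_docs  # entries marked deleted but still on disk
    return {
        "numDocs": num_docs,
        "maxDoc": max_doc,
        "deletedDocs": deleted,
        "deletedRatio": deleted / max_doc if max_doc else 0.0,
        "sizeInBytes": index.get("sizeInBytes"),
    }


# Made-up sample response, shaped like a CoreAdmin STATUS reply.
SAMPLE = {
    "status": {
        "alfresco": {
            "index": {"numDocs": 150000, "maxDoc": 200000, "sizeInBytes": 7000000000}
        }
    }
}

if __name__ == "__main__":
    # Offline example; for a live core use:
    #   summarize_index(fetch_status("http://localhost:8983/solr", "alfresco"), "alfresco")
    print(summarize_index(SAMPLE, "alfresco"))
```

Running the same summary against the Solr 4 core and the Solr 6 core gives a like-for-like comparison: if numDocs matches but maxDoc is much higher on the old core, deleted entries explain at least part of the size difference.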
The comparison here would be: delete and reindex with Solr 4 and get a 7 GB Alfresco core. Switch to Solr 6, reindex the same content, and get a 2.5 GB core. That strikes me as a little suspicious, but as you say, I am focusing on the wrong metric, and I should be comparing the indexed node count. Based on what I THINK is inconsistent Fingerprint behavior, I need to get some more specifics to ask better questions.