I have an Alfresco Community (201707) installation which I am using to compare the default Solr 4 against Solr 6 from the alfresco-search-services-1.1.0 install.
After a full index with Solr 4, I get the following info from the solr4 admin page:
Num Docs: 163458
Max Docs: 163458
...
Deleted Docs: 0
...
Master (Searching) 1524504594659 159 6.5 GB
...
Nodes in Index: 70921
Transactions in Index: 80844
Approx transactions remaining: 0...
Unindexed Nodes: 11441
Error Nodes in Index: 0
In the Solr 4 SUMMARY report, I can see that it's done:
Node count with FTSStatus Clean 69165
Node count with FTSStatus Dirty 0
Node count with FTSStatus New 0
When I test the solr 6 setup, I stop the alfresco app, make the changes to the alfresco install for Solr 6, start the solr server and the alfresco server, and let it re-index. It plugs along for a few hours, and then completes with the following stats:
Num Docs: 164357
Max Doc: 164357
...
Deleted Docs: 0
...
Master (Searching) 1524581958240 586 2.48 GB
And in the SUMMARY report:
Alfresco Nodes in Index 70937
Alfresco Transactions in Index 81470
Alfresco Unindexed Nodes 11698
Alfresco Error Nodes in Index 0
Node count with FTSStatus Clean 69181
Node count with FTSStatus Dirty 0
Node count with FTSStatus New 0
When I run the ERROR query I get nothing:
{
"responseHeader":{
"status":0,
"QTime":0,
"params":{
"q":"ERROR*",
"wt":"json"}},
"response":{"numFound":0,"start":0,"docs":[]
}}
So the indexer looks done and comparable volume-wise to the solr4 setup.
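To put numbers on "comparable volume-wise", here is a minimal Python sketch that diffs the figures quoted above. The dictionary keys and the `relative_diffs` helper are names made up for illustration; only the values come from the admin pages and SUMMARY reports.

```python
# Figures copied from the Solr 4 and Solr 6 admin pages / SUMMARY reports above.
# The key names are illustrative labels, not Solr or Alfresco field names.
solr4 = {
    "num_docs": 163458,
    "nodes_in_index": 70921,
    "transactions_in_index": 80844,
    "unindexed_nodes": 11441,
    "fts_clean": 69165,
}
solr6 = {
    "num_docs": 164357,
    "nodes_in_index": 70937,
    "transactions_in_index": 81470,
    "unindexed_nodes": 11698,
    "fts_clean": 69181,
}


def relative_diffs(a, b):
    """Return {metric: (absolute delta, percent change relative to a)}."""
    return {k: (b[k] - a[k], 100.0 * (b[k] - a[k]) / a[k]) for k in a}


for name, (delta, pct) in relative_diffs(solr4, solr6).items():
    print(f"{name:24s} {delta:+6d} ({pct:+.2f}%)")
```

Every metric is slightly higher on the Solr 6 side, which may simply reflect content added between the two runs.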
What first concerned me was the significantly smaller index size after a complete reindex: 6.5 GB on Solr 4 vs 2.5 GB on Solr 6, when I was expecting a 15% size increase with the introduction of fingerprints.
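The gap between expected and actual size can be worked out quickly. This is only back-of-the-envelope arithmetic in Python, using the sizes reported on the admin pages (6.5 GB and 2.48 GB) and the 15% growth I was expecting; the variable names are my own.

```python
solr4_size_gb = 6.5     # Solr 4 index size from the admin page
solr6_size_gb = 2.48    # Solr 6 index size from the admin page
expected_growth = 0.15  # the 15% increase I expected from fingerprints

# Size Solr 6 would have if it were Solr 4 plus 15%.
expected_solr6_gb = solr4_size_gb * (1 + expected_growth)
# How large the Solr 6 index actually is relative to Solr 4.
actual_ratio = solr6_size_gb / solr4_size_gb
# How far the actual size falls short of the expectation.
shortfall_gb = expected_solr6_gb - solr6_size_gb

print(f"expected ~{expected_solr6_gb:.2f} GB, got {solr6_size_gb:.2f} GB")
print(f"Solr 6 index is {actual_ratio:.0%} of the Solr 4 size")
print(f"shortfall vs expectation: {shortfall_gb:.2f} GB")
```

So instead of roughly 7.5 GB, the Solr 6 index comes in at a bit over a third of the Solr 4 size.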
Some docs never show up in a full-text search result set, even though they have the index aspect attached. I tried reindexing one of those docs with the following, but no luck:
http://[myip]:8983/solr/admin/cores?action=reindex&query=sys%5C%3Anode%5C-dbid%3A135156
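Since the escaping in that URL is easy to get wrong by hand, here is a small Python sketch that builds it. It assumes the same host placeholder, port and query shape as the URL above; `reindex_url` is a hypothetical helper name, and the sketch only constructs the string without sending any request.

```python
from urllib.parse import quote, urlencode


def reindex_url(host, dbid, port=8983):
    """Build the reindex URL for a single node, identified by its DBID."""
    # Unencoded query: sys\:node\-dbid:<dbid> (backslashes escape ':' and '-').
    query = f"sys\\:node\\-dbid:{dbid}"
    # quote_via=quote percent-encodes '\' as %5C and ':' as %3A, leaving '-' as is.
    params = urlencode({"action": "reindex", "query": query}, quote_via=quote)
    return f"http://{host}:{port}/solr/admin/cores?{params}"


# "[myip]" is the same placeholder used in the post, not a real address.
print(reindex_url("[myip]", 135156))
```

For DBID 135156 this reproduces the URL above character for character, so the encoding itself does not look like the problem.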
At reindex time I saw a few
"FlateFilter: stop reading corrupt stream due to a DataFormatException"
and
"An error occured when reading table hmtx"
But no more than I saw on the Solr 4 setup.
Any thoughts on how best to troubleshoot the inconsistencies?
Also, I know I can't upgrade to pdfbox 2.0.X in 5.2, but has anyone been able to replace the pdfbox-1.8.10.jar and pdfbox-1.8.10.jar with pdfbox-1.8.13.jar and pdfbox-1.8.13.jar to get past the pdfbox problems?
I re-read a previous question that Cesar Capillas had answered, and I think his answer covers the likely cause of the size discrepancy (https://community.alfresco.com/message/830710-request-for-solr-6-search-services-troubleshooting-adv...). I needed to look at my shared.properties, so thanks for the previous answer.