Hello and thank you for your response!
From what I understand you are suggesting:
- create a custom content model, with a property that contains the full document text, and set it accordingly
- in the context of Alfresco Share, create an advanced search field that can search in the new property
- when searching keywords, the document should appear
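To check that I have understood the suggestion, here is a minimal sketch of the kind of search request I imagine this leading to. The property name `my:fullText` and the helper `build_search_request` are made up for illustration, and the body shape is my assumption of what the Alfresco Search REST API expects (an AFTS query plus a `highlight` section). I do not know whether the `highlight` part would actually apply to a custom property.

```python
import json


def build_search_request(keywords, text_property="my:fullText"):
    """Build a search request body restricted to one text property.

    `my:fullText` is a hypothetical custom property holding the full
    document text; replace it with the property from the real model.
    """
    # Escape backslashes and double quotes so the keywords stay inside
    # the quoted phrase of the query string.
    escaped = keywords.replace("\\", "\\\\").replace('"', '\\"')
    return {
        "query": {
            "language": "afts",
            "query": f'{text_property}:"{escaped}"',
        },
        # Assumption: highlighting is requested per field, with the
        # matched terms wrapped in the given prefix and postfix.
        "highlight": {
            "prefix": "<em>",
            "postfix": "</em>",
            "fields": [{"field": text_property}],
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_search_request("annual report"), indent=2))
```

This only builds the request body; it does not send anything to a server.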
1. Are these new properties sent to Solr for indexing by default, or are there extra steps to be done?
The problem I can see with this approach is that when searching for keywords in Alfresco Share, I can find the document, but the actual keywords would not be highlighted.
What I am hoping to achieve is to tap into Alfresco's own document data extraction mechanism and supply the extracted text myself.
2. One thing I am not sure about is which component does the data extraction: Alfresco or Solr?
- does Alfresco extract the data from a document and then send that data to Solr to be indexed?
- or does Alfresco send the entire document (with or without properties) to Solr, and Solr does all the data extraction (using Apache Tika) and indexing work?
- from what I can gather, both are capable of doing data extraction and that confuses me
3. Can I tap into that mechanism, with code or some API, to actually do some of the work myself?
My end goal here is to provide an intuitive search experience, in terms of both discoverability and user friendliness (by showing the searched text highlighted).