Increase Max File Size That Solr Indexes

Member II

Hello everyone,

I have installed Alfresco Community Edition version 5.2 on Windows (using the exe installer). As I noticed in my log file, when I upload a PDF file larger than 10 MB, Alfresco (Solr) does not extract its text, and therefore the file content cannot be searched. The log file says:

Metadata extraction rejected, Extracter: org.alfresco.repo.content.metadata.PdfBoxMetadataExtracter@39882d66 Reason: Max doc size exceeded 10 MB.

I would appreciate it if someone could tell me how I can increase this size. I have already tried some solutions (for example, increasing alfresco.contentStreamLimit in the files alfresco-community/solr4/archive-SpaceStore/conf/solrcore and alfresco-community/solr4/workspace-SpaceStore/conf/solrcore).

Thanks a lot in advance.

1 Reply
Senior Member

Re: Increase Max File Size That Solr Indexes

The limit is defined in your Alfresco repository, which converts the PDF to text. Please check your transformer configuration; the defaults are defined in alfresco-ce-repository/ at 5.2.g-patched · ecm4u/alfresco-ce-repository · Git... (sorry, I didn't find a valid tag in the Alfresco git repo for 5.2).

Depending on which transformer handles the task, you should increase its maxSourceSizeKBytes.
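As a rough sketch of what such an override might look like in alfresco-global.properties: the transformer name (PdfBox), the full property key, and the 51200 value (50 MB) are assumptions for illustration only. Look up the exact key for your transformer in the default transformer configuration before copying this.

```properties
# Hypothetical override: raise the source size limit for PDF -> text.
# The key below assumes the PdfBox transformer handles the conversion;
# verify the exact name against the default transformer configuration.
# 51200 KB = 50 MB (illustrative value, not a recommendation).
content.transformer.PdfBox.extensions.pdf.txt.maxSourceSizeKBytes=51200
```

A repository restart is normally needed for changes in alfresco-global.properties to take effect.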



Also enable debug logging in your log4j properties to find out which transformer is actually running for your documents, and/or install the OOTBee Support Tools addon (GitHub - OrderOfTheBee/ootbee-support-tools) to debug and modify the transformation config from your browser.
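For the log4j part, a minimal example of the kind of entry meant here, assuming the stock log4j 1.x properties setup that ships with 5.2 and that the TransformerDebug logger is the one you want to watch:

```properties
# Log which transformers are considered and selected for each
# transformation, including why a transformer was skipped
# (for example, because the source exceeded its size limit).
log4j.logger.org.alfresco.repo.content.transform.TransformerDebug=debug
```

Upload one of the large PDFs again after changing this and check alfresco.log for the transformer names that appear.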