Re: Not able to index content of large pdfs in database mysql
Well, it's not really cross-posting, since the original poster is different. But the answer in the other thread is definitely spot on for a similar issue with transformers. What the other thread doesn't mention is that the transformer configuration is also documented.
But in this case we are talking about metadata extractors, and these have separately configured limits. In fact, the PdfBox extractor is about the only one that has a configured limit, which is set via the global property content.metadataExtracter.pdf.maxDocumentSizeMB.
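
As a rough sketch, raising that limit could look something like the fragment below. I'm assuming the override goes in alfresco-global.properties, and the value 50 is purely illustrative, not a recommendation. Check the default shipped with your version before changing it, and expect to need a restart for it to take effect.

```properties
# Assumed location: alfresco-global.properties
# Maximum PDF size (in MB) for which the PdfBox metadata extractor will run.
# 50 is an illustrative value only; pick one that suits your memory budget.
content.metadataExtracter.pdf.maxDocumentSizeMB=50
```

Larger values mean the extractor will attempt bigger PDFs, at the cost of more memory per extraction.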