Transformation triggers huge CPU load

Active Member II

Hello everybody, a few days ago I uploaded a folder of files (.doc, .txt, .pdf, ...) to my Alfresco server (201707GA). Afterwards I noticed that the CPU load kept climbing, reaching 400% and over. I tried restarting the Alfresco server, but the CPU load started increasing again until the server was unusable. After searching the forums, I enabled debug logging for Alfresco transformations:

log4j.logger.org.alfresco.repo.content.transform.TransformerDebug=DEBUG
log4j.logger.org.alfresco.repo.content.transform=DEBUG

After enabling these settings, I noticed a lot of lines in the log files about the transformation of a single file:

2020-02-07 00:05:30,299 DEBUG [content.transform.TransformerDebug] [http-bio-8443-exec-700] 68 pdf txt document_of_deployment_ugpa.delay_NO DATA_.pdf 803.9 KB -- index -- SolrIndexer
2020-02-07 00:05:30,310 DEBUG [content.transform.TransformerDebug] [http-bio-8443-exec-700] 68 workspace://SpacesStore/84abc9c6-e665-4285-b263-f61b6e972f68
2020-02-07 00:05:30,310 DEBUG [content.transform.TransformerDebug] [http-bio-8443-exec-700] 68 **a) [50] PdfBox < 25 MB 0 ms
2020-02-07 00:05:30,310 DEBUG [content.transform.TransformerDebug] [http-bio-8443-exec-700] 68 b) [120] TikaAuto < 25 MB 0 ms
2020-02-07 00:05:30,311 DEBUG [content.transform.TransformerDebug] [http-bio-8443-exec-700] 68.1 pdf txt document_of_deployment_ugpa.delay_NO DATA_.pdf 803.9 KB PdfBox

This sequence of lines was repeated roughly every 5 minutes, about 150 times. I deleted that file from Alfresco and the server load returned to normal.
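For anyone hitting the same symptom, a small script can help spot which document is being transformed over and over. This is only a rough sketch: the regular expression assumes the TransformerDebug line layout shown in the excerpt above (request number, source extension, target extension, file name, size), and the names `count_transform_requests` and `print_top` are made up for this example, not part of Alfresco.

```python
import re
from collections import Counter

# Assumed layout, taken from the log excerpt in this post:
# <timestamp> DEBUG [content.transform.TransformerDebug] [<thread>] <id> <src> <dst> <file name> <size> <unit> ...
# Only top-level requests (id without a ".1" suffix) are matched, so each
# transformation attempt is counted once. The unit list is an assumption.
LINE_RE = re.compile(
    r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3} DEBUG "
    r"\[content\.transform\.TransformerDebug\] \[[^\]]+\] "
    r"(?P<req>\d+) (?P<src>\w+) (?P<dst>\w+) "
    r"(?P<name>.+?) (?P<size>\d+(?:\.\d+)?) (?P<unit>bytes|KB|MB|GB)\b"
)


def count_transform_requests(lines):
    """Count top-level transformation requests per (file name, source, target)."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match:
            key = (match.group("name"), match.group("src"), match.group("dst"))
            counts[key] += 1
    return counts


def print_top(log_path, limit=10):
    """Print the most frequently transformed files found in a log file."""
    with open(log_path, encoding="utf-8", errors="replace") as log:
        counts = count_transform_requests(log)
    for (name, src, dst), n in counts.most_common(limit):
        print(f"{n:5d}  {src} -> {dst}  {name}")
```

Calling `print_top("alfresco.log")` (the path is just an example) lists the files with the most transformation requests first; a file that shows up far more often than the rest is a good candidate for the one causing the load.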

Now my question is: is there a way to avoid this behaviour without disabling Alfresco transformations, for example by limiting the number of retries?

Is it an issue related to document preview or indexing?

1 Reply
Community Manager

Re: Transformation triggers huge CPU load

Have you looked at this JIRA issue? It seems to describe a similar problem. I haven't read the entire issue, but there may be some suggestions there: https://issues.alfresco.com/jira/browse/MNT-7666

 

There are also some parameters that can be used to change the transformer timeouts and related limits: https://docs.alfresco.com/community/references/dev-extension-points-content-transformer.html