Content Transformation Debug

cancel
Showing results for 
Search instead for 
Did you mean: 

Content Transformation Debug

resplin
Intermediate
0 0 5,999

Obsolete Pages{{Obsolete}}

The official documentation is at: http://docs.alfresco.com




Introduction


In the 4.0 nightly community builds, extra log4j debug was added (code from 4.0.1 and 3.4.8 service packs) to track which content transformers have been used, which are available, why they have been selected and why they are not available.


Detail


Debug


The following section shows how to turn on debug and includes an example sequence of uploading a Microsoft Excel file and then previewing it in Share.


Turning on debug


This is achieved by adding the following log4j property.



log4j.logger.org.alfresco.repo.content.transform.TransformerDebug=DEBUG

To debug conversions that call a shell programm (e.g. imagamagick convert) you can also set:





log4j.logger.org.alfresco.util.exec.RuntimeExec=DEBUG

Creating document library thumbnail



DEBUG 1         xlsx png  Small1.xlsx 12.7 KB ContentService.transform(...)
DEBUG 1         **a) transformer.complex.OpenOffice.Image<Complex>>   0 ms
DEBUG 1           b) transformer.complex.JodConverter.Image<Complex>> 0 ms
DEBUG 1.1       xlsx pdf  Small1.xlsx 12.7 KB transformer.OpenOffice<Proxy>>
DEBUG 1.2       pdf  png  <TemporaryFile>> 16.1 KB transformer.complex.PDF.Image<Complex>>
DEBUG 1.2.1     pdf  png  <TemporaryFile>> 16.1 KB transformer.PdfToImage<Failover>>
DEBUG 1.2.1.1   pdf  png  <TemporaryFile>> 16.1 KB failover.transformer.PdfRenderer.PdfToImage
DEBUG 1.2.2     png  png  <TemporaryFile>> 34.9 KB transformer.ImageMagick<Proxy>>
DEBUG 1         Finished in 2,361 ms

  • Debug lines for each request are prefixed with the same number.
  • This request shows the creation of a png file for the thumbnail used to represent the Excel file in the Document Library.
  • As can be seen, this uses a complex transformer, which means it is composed of lower level transformers.
    • In this case the xlsx (Excel extension) is converted to pdf (1.1) then to a png (1.2).
    • This example looks a little strange as the png created in 1.2.1 (by 1.2.1.1) is then transformed into another png in 1.2.2 which turns it into a thumbnail.
  • Each nested transformer is given a nested number.
  • The name of the source file (Small1.xlsx) to each transformer is included. Were an intermediate file is used there is no name and <TemporaryFile>> appears.
  • Don't worry too much about all the detail yet.

Indexer checks to see if it should run on upload



DEBUG 2         xlsx txt  Small1-1.xlsx 12.7 KB ADMLuceneIndexer
DEBUG 2         **a) transformer.Poi                    58 ms
DEBUG 2         Run later. Transformer average (58 ms) > 20 ms
DEBUG 2         Finished in 2 ms Transformer NOT called

  • The Indexer makes a check to see if it is worth converting the xlsx to text at this stage.
  • As can be seen, it does not do so, as the average transformation time is 58 ms which is above the 20 ms threshold.
  • As a result file will get indexed later in an background thread.

Indexer does not index the thumbnail



DEBUG 3         png  txt  doclib 10.2 KB ADMLuceneIndexer NO transformers
DEBUG 3         Finished in 2 ms Transformer NOT called

  • The Indexer now checks to see if it can index the png file that was just created.
  • It does not attempt to transform the png as there are no transformers that can do this as shown on line 2.
  • In this case the name (such as doclib or preview) indicates the class of thumbnail.

Indexer runs in background



DEBUG 4         xlsx txt  Small1.xlsx 12.7 KB ADMLuceneIndexer
DEBUG 4         **a) transformer.Poi                    58 ms
DEBUG 4         Finished in 64 ms

  • The Indexer now transforms the xlsx file to text so that it may add it to the index. This is taking place in a background thread.

Preview



DEBUG 5         xlsx swf  Small1.xlsx 12.7 KB ContentService.transform(...)
DEBUG 5         **a) transformer.complex.JodConverter.Pdf2swf<Complex>> 0 ms
DEBUG 5           b) transformer.complex.OpenOffice.Pdf2swf<Complex>>   0 ms
DEBUG 5.1       xlsx pdf  Small1.xlsx 12.7 KB transformer.JodConverter<Proxy>>
DEBUG 5.2       pdf  swf  <TemporaryFile>> 16.1 KB transformer.Pdf2swf<Proxy>>
DEBUG 5         Finished in 916 ms

  • Moving to the preview page, the xlsx needs to be transformed into the mimetype 'application/x-shockwave-flash' (extension swf).
  • Again this requires a complex transformer. In this case there are two, one called ' 'transformer.complex.JodConverter.Pdf2swf' and the other is called  'transformer.complex.OpenOffice.Pdf2swf'.
  • The '0 ms' value after the  transformer name is the average time it has taken to transform content in the past. In this case neither have been used before.
  • The '**'  indicates which one is going to be used. The one with the lowest value is used.
  • As the swf file is stored, you will not see this transformation take place again.
  • You can also see the size of the xlsx file (12.7 KB) and the intermediate pdf file (16.1 KB) are shown. This is useful when setting  up limits on transformations.
  • You may have spotted that the second line indicates where the request is coming from. In this case a public API ContentService.transform(...), which converts a file.

Indexer does not index the preview



DEBUG 6         swf  txt  preview 87.9 KB ADMLuceneIndexer NO transformers
DEBUG 6         Finished in 2 ms Transformer NOT called

  • The Indexer then checks to see if it can index the swf file just created, in the same way it did for the png in request 3.

More debug


Store values


To gain a little more background context to the transformations, you may wish to turn on some more log4j output. The following Content Audit trace shows high level operations on nodes performed by the user. The following example shows the upload and the preview of the file Small1.xlsx.



log4j.appender.File.Threshold=trace
log4j.logger.org.alfresco.repo.audit.access.AccessAuditor=trace


TRACE [org.alfresco.repo.audit.access.AccessAuditor]
        /alfresco-access/transaction/action=CREATE
        /alfresco-access/transaction/aspects/add/cm:author=null
        /alfresco-access/transaction/aspects/add/cm:titled=null
        /alfresco-access/transaction/node=workspace://SpacesStore/d262bae1-cebe-4d33-bf73-9e728a8ce63f
        /alfresco-access/transaction/path=/app:company_home/st:sites/cm:aaa/cm:documentLibrary/cm:Small1.xlsx
        /alfresco-access/transaction/properties/add/cm:author=adavis
        /alfresco-access/transaction/properties/add/cm:content=contentUrl=store://2012/2/13/14/6/0b401def-49f9-4fd2-b1e1-fb3d14a59e5f.bin|mimetype=application/vnd.openxmlformats-officedocument.spreadsheetml.sheet|size=13017|encoding=UTF-8|locale=en_GB_|id=754
        /alfresco-access/transaction/properties/add/cm:created=Mon Feb 13 14:06:02 GMT 2012
        /alfresco-access/transaction/properties/add/cm:creator=admin
        /alfresco-access/transaction/properties/add/cm:modified=Mon Feb 13 14:06:03 GMT 2012
        /alfresco-access/transaction/properties/add/cm:modifier=admin
        /alfresco-access/transaction/properties/add/cm:name=Small1.xlsx
        /alfresco-access/transaction/properties/add/sys:locale=en_GB
        /alfresco-access/transaction/properties/add/sys:node-dbid=1581
        /alfresco-access/transaction/properties/add/sys:node-uuid=d262bae1-cebe-4d33-bf73-9e728a8ce63f
        /alfresco-access/transaction/properties/add/sys:store-identifier=SpacesStore
        /alfresco-access/transaction/properties/add/sys:store-protocol=workspace
        /alfresco-access/transaction/sub-actions=createNode updateNodeProperties createContent readContent addNodeAspect
        /alfresco-access/transaction/type=cm:content
        /alfresco-access/transaction/user=admin

TRACE [org.alfresco.repo.audit.access.AccessAuditor]
        /alfresco-access/transaction/action=addNodeAspect
        /alfresco-access/transaction/aspects/add/rn:renditioned=null
        /alfresco-access/transaction/node=workspace://SpacesStore/d262bae1-cebe-4d33-bf73-9e728a8ce63f
        /alfresco-access/transaction/path=/app:company_home/st:sites/cm:aaa/cm:documentLibrary/cm:Small1.xlsx
        /alfresco-access/transaction/sub-actions=addNodeAspect readContent
        /alfresco-access/transaction/type=cm:content
        /alfresco-access/transaction/user=admin

TRACE [org.alfresco.repo.audit.access.AccessAuditor]
        /alfresco-access/transaction/action=READ
        /alfresco-access/transaction/node=workspace://SpacesStore/d262bae1-cebe-4d33-bf73-9e728a8ce63f
        /alfresco-access/transaction/path=/app:company_home/st:sites/cm:aaa/cm:documentLibrary/cm:Small1.xlsx
        /alfresco-access/transaction/sub-actions=readContent
        /alfresco-access/transaction/type=cm:content
        /alfresco-access/transaction/user=admin

Transformer Limits



DEBUG 7         txt  png  Large1.txt 24.5 MB ContentService.transform(...)
DEBUG 7         **a) transformer.complex.Text.Image<Complex>> 0 ms
DEBUG 7         --b) transformer.PdfToImage<Failover>>        > 1 MB
DEBUG 7.1       txt  pdf  Large1.txt 24.5 MB transformer.PdfBox.TextToPdf
DEBUG 7.1       Page limit (1) reached. Returning EOF
DEBUG 7.2       pdf  png  <TemporaryFile>> 1.8 KB transformer.complex.PDF.Image<Complex>>
DEBUG 7.2.1     pdf  png  <TemporaryFile>> 1.8 KB transformer.PdfToImage<Failover>>
DEBUG 7.2.1.1   pdf  png  <TemporaryFile>> 1.8 KB failover.transformer.PdfRenderer.PdfToImage
DEBUG 7.2.2     png  png  <TemporaryFile>> 50.8 KB transformer.ImageMagick<Proxy>>
DEBUG 7         Finished in 664 ms

  • When a transformer is not selected because of a source size limit, this is reflected in the output with a '--' prefix. This example shows a transformation request from txt to png. However in this case, transformer.PdfToImage is not able to handle source files larger than 1 MB, because it would be too slow or it is known that another transformer is faster or better for another reason.
  • Also note the 'Page limit (1) reached. Returning EOF' log message indicating that another limit has been applied. In this case only one page of pdf has been generated as there is no need to create more when generating a png thumbnail for the Document Library.

Trace


Turning on trace


More information is available if the TransformerDebug logging level is set to TRACE rather than DEBUG.



log4j.logger.org.alfresco.repo.content.transform.TransformerDebug=TRACE

Extra trace in in transformation requests



TRACE 8         store://2012/2/13/14/2/8ec08fe3-8690-4f6f-9d1f-2aef176a63d7.bin
TRACE 8         application/vnd.openxmlformats-officedocument.spreadsheetml.sheet application/x-shockwave-flash
DEBUG 8         xlsx swf  Small1.xlsx 12.7 KB ContentService.transform(...)
DEBUG 8         **a) transformer.complex.OpenOffice.Pdf2swf<Complex>> 0 ms
DEBUG 8         --b) transformer.not.enabled<Complex>> > 0 bytes
TRACE 8.1       application/vnd.openxmlformats-officedocument.spreadsheetml.sheet application/pdf
DEBUG 8.1       xlsx pdf  Small1.xlsx 12.7 KB transformer.OpenOffice<Proxy>>
TRACE 8.1       Finished in 502 ms
TRACE 8.2       store://D:\workspaces\V4.0.0\apache-tomcat-6.0.29\temp\Alfresco\ComplextTransformer_intermediate_xlsx_2713205269292030258.pdf
TRACE 8.2       application/pdf application/x-shockwave-flash
DEBUG 8.2       pdf  swf  <TemporaryFile>> 8.1 KB transformer.Pdf2swf<Proxy>>
TRACE 8.2       Finished in 199 ms
DEBUG 8         Finished in 720 ms

This includes:


  • The full source and target mimetypes (including intermediate stages), which is useful when configuring new transformations or limits on transformations.
  • The time taken to finish each sub transformer.
  • Transformers that have a limit of 0 bytes for a given mimetype. In other words they have been turned off.

Trace getMaxSourceSizeBytes



TRACE 9         application/vnd.openxmlformats-officedocument.spreadsheetml.sheet image/png
TRACE 9         xlsx png  ContentService.getMaxSourceSizeBytes() = 1 MB
TRACE 9         **a) transformer.complex.OpenOffice.Image<Complex>> 2,275 ms
TRACE 9         Finished in 3 ms Transformer NOT called

TRACE 10         application/vnd.openxmlformats-officedocument.spreadsheetml.sheet application/x-shockwave-flash
TRACE 10         xlsx swf  ContentService.getMaxSourceSizeBytes() = 1 MB
TRACE 10         **a) transformer.complex.OpenOffice.Pdf2swf<Complex>> 0 ms
TRACE 10         Finished in 7 ms Transformer NOT called

TRACE 11         application/vnd.openxmlformats-officedocument.spreadsheetml.sheet image/jpeg
TRACE 11         xlsx jpg  ContentService.getMaxSourceSizeBytes() = 1 MB
TRACE 11         **a) transformer.complex.OpenOffice.Image<Complex>> 2,275 ms
TRACE 11         Finished in 6 ms Transformer NOT called

  • There will also be extra requests to the ContentService that have not resulted in a transformation taking place. For example, when the first file of a given mimetype is previewed in Share, the trace will show all the transformers that are available when creating various thumbnails. You will note there is call from ContentService.getMaxSourceSizeBytes() for each type of thumbnail. The call is setting up a cache and includes a maximum size that may be converted. This trace includes no filename or size, as the caller is gathering general information.

Trace extra calls to getTransformer



TRACE 12         store://2012/2/13/14/2/8ec08fe3-8690-4f6f-9d1f-2aef176a63d7.bin
TRACE 12         application/vnd.openxmlformats-officedocument.spreadsheetml.sheet application/x-shockwave-flash
TRACE 12         xlsx swf  Small1.xlsx 12.7 KB ContentService.getTransformer(...)
TRACE 12         **a) transformer.complex.OpenOffice.Pdf2swf<Complex>> 0 ms
TRACE 12         Finished in 3 ms Transformer NOT called

TRACE 13         store://2012/2/13/14/2/8ec08fe3-8690-4f6f-9d1f-2aef176a63d7.bin
TRACE 13         application/vnd.openxmlformats-officedocument.spreadsheetml.sheet application/x-shockwave-flash
TRACE 13         xlsx swf  Small1.xlsx 12.7 KB ContentService.getTransformer(...)
TRACE 13         **a) transformer.complex.OpenOffice.Pdf2swf<Complex>> 0 ms
TRACE 13         Finished in 5 ms Transformer NOT called

TRACE 14         store://2012/2/13/14/2/8ec08fe3-8690-4f6f-9d1f-2aef176a63d7.bin
TRACE 14         application/vnd.openxmlformats-officedocument.spreadsheetml.sheet application/x-shockwave-flash
DEBUG 14         xlsx swf  Small1.xlsx 12.7 KB ContentService.transform(...)
                 . . .

  • With trace turned on you will see extra requests not normally shown that are made to work out if it is possible to transform content before actually making the request. In the example above there are two such calls before the actual transform.

Active transformers


Sometimes it is useful to discover all the active transformers in the system, which transformations they support, any size limits they might have, how long they have taken on average and if there are any other transformers that have declared the same combination of mimetypes as an explicit transformation. Explicit transformations are used in preference to all other transformers that are not explicit.

This debug is produced when a file called debugTransformers.txt that contains 18 bytes exactly is uploaded to Share. This is to avoid the debug being generated accidentally. When done, all active and inactive transformers are listed. The debug is normally almost a couple of thousand lines long. The following is a small snippet.



DEBUG 15         Active and inactive transformers (list not reduced by 'explicit' settings)
DEBUG 15.1       transformer.ImageMagick<Proxy>> 107 ms
DEBUG 15.1           1) eps  eps   unlimited
DEBUG 15.1           2) eps  jp2   unlimited
DEBUG 15.1           3) eps  cgm   unlimited
                     . . .
DEBUG 15.1         361) dwt  dwt   unlimited
DEBUG 15.2       failover.transformer.PdfRenderer.PdfToImage 324 ms
DEBUG 15.2           1) pdf  png   unlimited
DEBUG 15.3       failover.transformer.PdfBox.PdfToImage 0 ms
DEBUG 15.3           1) pdf  png   unlimited
                     . . .
DEBUG 15.6       transformer.PdfBox 0 ms
DEBUG 15.6           1) pdf  txt   unlimited EXPLICIT
DEBUG 15.6           2) pdf  html  unlimited
DEBUG 15.6           3) pdf  xhtml unlimited
DEBUG 15.6           4) pdf  xml   unlimited
                     . . .
DEBUG 15.10      transformer.complex.OpenOffice.Image<Complex>> 3,400 ms
DEBUG 15.10          1) txt  eps   1 MB
DEBUG 15.10          2) txt  jp2   1 MB
DEBUG 15.10          3) txt  cgm   1 MB
DEBUG 15.10          4) txt  gif   1 MB
                     . . .
DEBUG 15.31 transformer.complex.JodConverter.Pdf2swf<Complex>> 0 ms INACTIVE
DEBUG 15         Finished in 2,026 ms Transformer NOT called

Shell excecution


Thelog4j instruction



log4j.logger.org.alfresco.util.exec.RuntimeExec=DEBUG

will in addition log the command run:



2013-02-07 13:05:30,879  DEBUG [util.exec.RuntimeExec] [defaultAsyncAction1] Execution result:
   os:         Linux
   command:    [convert, /opt/alfresco412Pgsolr/tomcat/temp/Alfresco/ImageMagickContentTransformerWorker_source_765123998470929903.jpg[0], -auto-orient, -resize, 100x100>, /opt/alfresco412Pgsolr/tomcat/temp/Alfresco/ImageMagickContentTransformerWorker_target_5572158630074900053.png]
   succeeded:  true
   exit code:  0
   out:       
   err:       

Content Transformation
Troubleshooting