CMIS query taking long - performance issue?

cancel
Showing results for 
Search instead for 
Did you mean: 
Highlighted
Established Member

CMIS query taking long - performance issue?

Alfresco Community Edition 5.2 (Default settings)

Has anyone done anything with the logging on the Postgresql database side for Alfresco?

In the postgresql.conf file, I have the following properties

log_destination  = 'stderr'

logging_collector = on

log_directory = 'pg_log'

log_filename = 'postgresql-%Y-%m-%d_%H%M%S.log'

log_min_messages = debug5

log_min_error_statement = debug5

log_min_duration_statement = 0

log_checkpoints = on
log_connections = on
log_disconnections = on
log_duration = on

log_error_verbosity = default

log_lock_waits = on

log_statement = 'none'

I tried different values for few of the properties above.

I could never make it to work for my situation. 

Sometimes, a log file has been created in this pg_log folder, but always with 0 bytes. Nothing gets written there.

For some settings, no file has been created.

My purpose is to get the Postgresql query for the CMIS select query that I am running from my application.

We know this CMIS query takes more than 20 seconds consistently.

This query has our own metadata custom (aspect) properties in the where clause. 

In order to tune the database with indices, I would like to see the raw database query on the other side.

Any ideas?

6 Replies
Highlighted
Alfresco Employee

Re: Postgresql logging of long running queries

Not sure about this, but probably the query is served by SOLR instead of database. Can you check it?

Software Engineer in Alfresco Search Team.
Highlighted
Established Member

Re: Postgresql logging of long running queries

How would I check if my CMIS is going through Solr or hitting PostgreSQL directly?

I am not clear on this yet. 

However, following are my CMIS queries.

Following some material online is not clear to me.

Unless we see somewhere in the log or some kind of indicator that this is hitting Solr or hitting database, we cannot say for sure.

SELECT doc.*, saloalias.* FROM cmis:document AS doc JOIN salo:documentProperties AS saloalias ON doc.cmisSmiley SurprisedbjectId = saloalias.cmisSmiley SurprisedbjectId  WHERE  IN_TREE(doc, 'workspace://SpacesStore/0a813931-e5d4-4a4a-82ad-8ccbbd1c2405')  AND (saloalias.salo:lo_link = 1965) AND (saloalias.salo:lo_category = 'Bond')


SELECT doc.*, saloalias.* FROM cmis:document AS doc JOIN salo:documentProperties AS saloalias ON doc.cmisSmiley SurprisedbjectId = saloalias.cmisSmiley SurprisedbjectId WHERE IN_FOLDER(doc, 'workspace://SpacesStore/9315813f-ac89-4b05-9281-ce04eda1cf7d') AND (saloalias.salo:lo_link = 1965) AND (saloalias.salo:lo_category = 'Pers')

salo:documentProperties is our own custom aspect for our own type.

Please note that all the conditions in the where clause has our custom aspect properties.

What do you think?

Highlighted
Established Member

Re: Postgresql logging of long running queries

BTW, I was able to generate file logs. I also did application level p6spy logging. 

I am getting lots of log information. So many queries.

I am doing this initial work on my development machine first before getting on to the production server.

In my development machine, I have the same environment pretty much. I mean Alfresco 5.2 community and Postgresql 9.4, etc. And our J2EE project connecting to this Alfresco.

For example, in the p6spy approach, I get queries like this logged in the p6spy.log. These queries look similar to the queries that are logged in the postgres log files as well.

select             assoc.id                    as id,             parentNode.id               as parentNodeId,             parentNode.version          as parentNodeVersion,             parentStore.protocol        as parentNodeProtocol,             parentStore.identifier      as parentNodeIdentifier,             parentNode.uuid             as parentNodeUuid,             childNode.id                as childNodeId,             childNode.version           as childNodeVersion,             childStore.protocol         as childNodeProtocol,             childStore.identifier       as childNodeIdentifier,             childNode.uuid              as childNodeUuid,             assoc.type_qname_id         as type_qname_id,             assoc.child_node_name_crc   as child_node_name_crc,             assoc.child_node_name       as child_node_name,             assoc.qname_ns_id           as qname_ns_id,             assoc.qname_localname       as qname_localname,             assoc.is_primary            as is_primary,             assoc.assoc_index           as assoc_index                         from             alf_child_assoc assoc             join alf_node parentNode on (parentNode.id = assoc.parent_node_id)             join alf_store parentStore on (parentStore.id = parentNode.store_id)             join alf_node childNode on (childNode.id = assoc.child_node_id)             left join alf_store childStore on (childStore.id = childNode.store_id)               where             parentNode.id = 452                      order by             assoc.assoc_index ASC,             assoc.id ASC

select acs.id from alf_acl_change_set acs where acs.commit_time_ms = ( select max(commit_time_ms) from alf_acl_change_set where commit_time_ms <= 1540411920042 )

update alf_lock set version = version + 1, lock_token = 'not-locked', start_time = 0, expiry_time = 0 where excl_resource_id = 7 and lock_token = '90d3e150-8886-4307-8746-593805757475'

select txn.id as id, txn.commit_time_ms as commit_time_ms, count(case when node.type_qname_id != 149 then 1 end) as updates, count(case when node.type_qname_id = 149 then 1 end) as deletes from alf_transaction txn join alf_node node on (txn.id = node.transaction_id) WHERE txn.commit_time_ms >= 1540408271763 and txn.commit_time_ms < 1540411871763 group by txn.commit_time_ms, txn.id order by txn.commit_time_ms ASC, txn.id ASC

Oh boy, now I am into this mystery of identifying the (PostgreSQL) database queries that exactly correspond to my CMIS query.

Still I am not clear on this question - Are my CMIS queries going through Solr or directly to the database?

As you said, first I need to be sure on this.

Any directions for this would be a great help.

Highlighted
Alfresco Employee

Re: Postgresql logging of long running queries

Probably this can help:

https://www.zylk.net/en/web-2-0/blog/-/blogs/how-to-get-logs-for-alfresco-querys

Software Engineer in Alfresco Search Team.
Highlighted
Alfresco Employee

Re: Postgresql logging of long running queries

For an equivalent CMIS query:

SELECT doc.*, title.* 
FROM cmis:document AS doc JOIN cm:titled AS title
  ON doc.cmis:objectId = title.cmis:objectId
WHERE IN_TREE(doc, 'workspace://SpacesStore/b457a0ba-7bb6-4ee6-94b0-adca866ab777')
      AND title.cm:description like 'A%'

I have following output in the logs

alfresco | 2018-10-25 07:57:36,860 DEBUG [impl.solr.DbOrIndexSwitchingQueryLanguage] [http-nio-8080-exec-4] Using SOLR query: db-cmis for SearchParameters [language=cmis-alfresco, query=SELECT doc.*, title.* FROM cmis:document AS doc JOIN cm:titled AS title ON doc.cmis:objectId = title.cmis:objectId WHERE IN_TREE(doc, 'workspace://SpacesStore/b457a0ba-7bb6-4ee6-94b0-adca866ab777') AND title.cm:description like 'A%', stores=[workspace://SpacesStore], queryParameterDefinitions=[], excludeDataInTheCurrentTransaction=false, sortDefinitions=[], locales=[en], mlAnalaysisMode=EXACT_LANGUAGE_AND_ALL, limitBy=FINAL_SIZE, permissionEvaluation=EAGER, limit=100, allAttributes=[], textAttributes=[], maxItems=100, skipCount=0, defaultFTSOperator=AND, defaultFTSFieldOperator=AND, queryTemplates={}, namespace=http://www.alfresco.org/model/content/1.0, maxPermissionChecks=-1, maxPermissionCheckTimeMillis=-1, defaultFieldName=TEXT, fieldFacets=[], facetQueries=[], filterQueries=[], pivots=[], stats=null, useInMemorySort=null, maxRawResultSetSizeForInMemorySort=null, extraParameters={cmisVersion=CMIS_1_1}, excludeTenantFilter=false, isBulkFetchEnabled=true, queryConsistency=DEFAULT, sinceTxId=null, searchTerm=null, highlight=null, interval=null, range=null, timezone=null, spellCheck=false]
alfresco | 2018-10-25 07:57:37,759 DEBUG [impl.solr.SolrQueryHTTPClient] [http-nio-8080-exec-4] Sent :/solr/alfresco/cmis?wt=json&fl=DBID%2Cscore&rows=100&df=TEXT&start=0&locale=en&alternativeDic=DEFAULT_DICTIONARY&cmisVersion=CMIS_1_1&fq=%7B%21afts%7DAUTHORITY_FILTER_FROM_JSON&fq=%7B%21afts%7DTENANT_FILTER_FROM_JSON
alfresco | 2018-10-25 07:57:37,761 DEBUG [impl.solr.SolrQueryHTTPClient] [http-nio-8080-exec-4] with: {"tenants":[""],"defaultNamespace":"http://www.alfresco.org/model/content/1.0","defaultFTSFieldOperator":"AND","query":"SELECT doc.*, title.* FROM cmis:document AS doc JOIN cm:titled AS title ON doc.cmis:objectId = title.cmis:objectId WHERE IN_TREE(doc, 'workspace://SpacesStore/b457a0ba-7bb6-4ee6-94b0-adca866ab777') AND title.cm:description like 'A%'","templates":[],"allAttributes":[],"queryConsistency":"DEFAULT","authorities":["GROUP_EVERYONE","ROLE_ADMINISTRATOR","ROLE_AUTHENTICATED","admin"],"locales":["en_"],"textAttributes":[],"defaultFTSOperator":"AND","anyDenyDenies":true,"mlAnalaysisMode":"EXACT_LANGUAGE_AND_ALL"}
alfresco | 2018-10-25 07:57:37,763 DEBUG [impl.solr.SolrQueryHTTPClient] [http-nio-8080-exec-4] Got: 1 in 638 ms
alfresco | 2018-10-25 07:57:37,764 DEBUG [impl.solr.DbOrIndexSwitchingQueryLanguage] [http-nio-8080-exec-4] SOLR returned 1 results in 903ms

So it looks that the query is performed by SOLR.

I have enabled following loggers at /opt/alfresco/tomcat/shared/classes/alfresco/extension/custom-log4j.properties

log4j.logger.org.alfresco.repo.search.impl.solr.SolrQueryHTTPClient=DEBUG
log4j.logger.org.alfresco.repo.search.impl.solr.DbOrIndexSwitchingQueryLanguage=DEBUG
Software Engineer in Alfresco Search Team.
Highlighted
Established Member

Re: Postgresql logging of long running queries

Thx for your assistance. 

Following is the alfresco log. I did not check the Solr log.

I have both the IN_TREE query and the IN_FOLDER query. First time creating a document for an entity, it will create a folder under the specific site (for the client). Until it creates a folder, it will call IN_TREE query. Once the folder is created for an entity, then IN_FOLDER query will be called.

Following is the log. First one is for IN_TREE, and the last one is the IN_FOLDER. Scroll all the way down to see IN_FOLDER case.

As I see, IN_TREE goes through Solr. IN_FOLDER goes to DB directly.

For here, I forced calling this IN_TREE from CMIS workbench.

Am I reading/interpreting the log correctly?

If yes, I am wondering, what makes the IN_FOLDER query to take more than 20 seconds to return in our production environment. 

Note this organization/structure of documents in our repository: Site->Document Library->Folder for each document category->Folder for each entity that belongs to a category->documents that belong to the entity.

Now, when I try to retrieve documents of an entity that belongs to a category whose folder contains several million documents, it takes 20 seconds.

But when I retrieve documents of an entity that belongs to a category whose folder contains less than 100K documents, then it comes back in few milliseconds.

I also notice one thing. First time, I retrieve these documents of this large folder, it is fast. After the first time, it is consistently slow. I have tested this several times and confirmed this behavior in our production environment.

This is the performance issue I am seeing and trying to address. 

As other experts in the forum suggested, since the IN_FOLDER against a large folder is taking more than 20 seconds, and it is going to DB directly, we can put some indices in place and see if it helps.

I am having a hard time in finding/identifying a one to one mapping between my CMIS retrieval query to a postgresql query.

To create the indices, do I need to find the postgresql query and do explain plan, etc.

Any guidance on the direction/step that I could take, would help me a lot.

Alfresco log information

IN_TREE related log

2018-10-25 09:23:56,343 DEBUG [org.alfresco.repo.search.impl.solr.DbOrIndexSwitchingQueryLanguage] [http-apr-8180-exec-6]

Trying db query for db-cmis for SearchParameters [

language=cmis-alfresco,

query=SELECT doc.*, SALO_ALIAS.* FROM cmis:document AS doc JOIN salo:documentProperties AS SALO_ALIAS ON doc.cmisSmiley SurprisedbjectId = SALO_ALIAS.cmisSmiley SurprisedbjectId WHERE IN_TREE(doc, 'workspace://SpacesStore/0a813931-e5d4-4a4a-82ad-8ccbbd1c2405') AND (SALO_ALIAS.salo:lo_link = 1965) AND (SALO_ALIAS.salo:lo_category = 'Lic') , stores=[workspace://SpacesStore], queryParameterDefinitions=[], excludeDataInTheCurrentTransaction=false, sortDefinitions=[], locales=[en], mlAnalaysisMode=EXACT_LANGUAGE_AND_ALL, limitBy=FINAL_SIZE, permissionEvaluation=EAGER, limit=100, allAttributes=[], textAttributes=[], maxItems=100, skipCount=0, defaultFTSOperator=AND, defaultFTSFieldOperator=AND, queryTemplates={}, namespace=http://www.alfresco.org/model/content/1.0, maxPermissionChecks=-1, maxPermissionCheckTimeMillis=-1, defaultFieldName=TEXT, fieldFacets=[], facetQueries=[], filterQueries=[], pivots=[], stats=null, useInMemorySort=null, maxRawResultSetSizeForInMemorySort=null, extraParameters={cmisVersion=CMIS_1_1}, excludeTenantFilter=false, isBulkFetchEnabled=true, queryConsistency=DEFAULT, sinceTxId=null, searchTerm=null, highlight=null, interval=null, range=null, timezone=null, spellCheck=false]

2018-10-25 09:23:56,472 DEBUG [org.alfresco.repo.search.impl.solr.DbOrIndexSwitchingQueryLanguage] [http-apr-8180-exec-6]

DB query failed for db-cmis for SearchParameters [

language=cmis-alfresco,

query=SELECT doc.*, SALO_ALIAS.* FROM cmis:document AS doc JOIN salo:documentProperties AS SALO_ALIAS ON doc.cmisSmiley SurprisedbjectId = SALO_ALIAS.cmisSmiley SurprisedbjectId WHERE IN_TREE(doc, 'workspace://SpacesStore/0a813931-e5d4-4a4a-82ad-8ccbbd1c2405') AND (SALO_ALIAS.salo:lo_link = 1965) AND (SALO_ALIAS.salo:lo_category = 'Lic') , stores=[workspace://SpacesStore], queryParameterDefinitions=[], excludeDataInTheCurrentTransaction=false, sortDefinitions=[], locales=[en], mlAnalaysisMode=EXACT_LANGUAGE_AND_ALL, limitBy=FINAL_SIZE, permissionEvaluation=EAGER, limit=100, allAttributes=[], textAttributes=[], maxItems=100, skipCount=0, defaultFTSOperator=AND, defaultFTSFieldOperator=AND, queryTemplates={}, namespace=http://www.alfresco.org/model/content/1.0, maxPermissionChecks=-1, maxPermissionCheckTimeMillis=-1, defaultFieldName=TEXT, fieldFacets=[], facetQueries=[], filterQueries=[], pivots=[], stats=null, useInMemorySort=null, maxRawResultSetSizeForInMemorySort=null, extraParameters={cmisVersion=CMIS_1_1}, excludeTenantFilter=false, isBulkFetchEnabled=true, queryConsistency=DEFAULT, sinceTxId=null, searchTerm=null, highlight=null, interval=null, range=null, timezone=null, spellCheck=false]
org.alfresco.repo.search.impl.querymodel.QueryModelException: 09250171 Descendant/IN_TREE() is unsupported
at org.alfresco.repo.search.impl.querymodel.impl.db.functions.DBDescendant.prepare(DBDescendant.java:75)
at org.alfresco.repo.search.impl.querymodel.impl.db.DBFunctionalConstraint.prepare(DBFunctionalConstraint.java:80)
at org.alfresco.repo.search.impl.querymodel.impl.db.DBConjunction.prepare(DBConjunction.java:83)
at org.alfresco.repo.search.impl.querymodel.impl.db.DBQuery.prepare(DBQuery.java:225)
at org.alfresco.repo.search.impl.querymodel.impl.db.DBQueryEngine.executeQuery(DBQueryEngine.java:207)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
...

2018-10-25 09:23:56,474 DEBUG [org.alfresco.repo.search.impl.solr.DbOrIndexSwitchingQueryLanguage] [http-apr-8180-exec-6]

Using SOLR query: db-cmis for SearchParameters [

language=cmis-alfresco,

query=SELECT doc.*, SALO_ALIAS.* FROM cmis:document AS doc JOIN salo:documentProperties AS SALO_ALIAS ON doc.cmisSmiley SurprisedbjectId = SALO_ALIAS.cmisSmiley SurprisedbjectId WHERE IN_TREE(doc, 'workspace://SpacesStore/0a813931-e5d4-4a4a-82ad-8ccbbd1c2405') AND (SALO_ALIAS.salo:lo_link = 1965) AND (SALO_ALIAS.salo:lo_category = 'Lic') , stores=[workspace://SpacesStore], queryParameterDefinitions=[], excludeDataInTheCurrentTransaction=false, sortDefinitions=[], locales=[en], mlAnalaysisMode=EXACT_LANGUAGE_AND_ALL, limitBy=FINAL_SIZE, permissionEvaluation=EAGER, limit=100, allAttributes=[], textAttributes=[], maxItems=100, skipCount=0, defaultFTSOperator=AND, defaultFTSFieldOperator=AND, queryTemplates={}, namespace=http://www.alfresco.org/model/content/1.0, maxPermissionChecks=-1, maxPermissionCheckTimeMillis=-1, defaultFieldName=TEXT, fieldFacets=[], facetQueries=[], filterQueries=[], pivots=[], stats=null, useInMemorySort=null, maxRawResultSetSizeForInMemorySort=null, extraParameters={cmisVersion=CMIS_1_1}, excludeTenantFilter=false, isBulkFetchEnabled=true, queryConsistency=DEFAULT, sinceTxId=null, searchTerm=null, highlight=null, interval=null, range=null, timezone=null, spellCheck=false]

2018-10-25 09:23:56,668 DEBUG [org.alfresco.repo.search.impl.solr.SolrQueryHTTPClient] [http-apr-8180-exec-6] Sent :/solr4/alfresco/cmis?wt=json&fl=DBID%2Cscore&rows=100&df=TEXT&start=0&locale=en&alternativeDic=DEFAULT_DICTIONARY&cmisVersion=CMIS_1_1&fq=%7B%21afts%7DAUTHORITY_FILTER_FROM_JSON&fq=%7B%21afts%7DTENANT_FILTER_FROM_JSON

2018-10-25 09:23:56,668 DEBUG [org.alfresco.repo.search.impl.solr.SolrQueryHTTPClient] [http-apr-8180-exec-6] with: {"tenants":[""],"defaultNamespace":"http://www.alfresco.org/model/content/1.0","defaultFTSFieldOperator":"AND","query":"SELECT doc.*, SALO_ALIAS.* FROM cmis:document AS doc JOIN salo:documentProperties AS SALO_ALIAS ON doc.cmisSmiley SurprisedbjectId = SALO_ALIAS.cmisSmiley SurprisedbjectId WHERE IN_TREE(doc, 'workspace://SpacesStore/0a813931-e5d4-4a4a-82ad-8ccbbd1c2405') AND (SALO_ALIAS.salo:lo_link = 1965) AND (SALO_ALIAS.salo:lo_category = 'Lic') ","templates":[],"allAttributes":[],"queryConsistency":"DEFAULT","authorities":["GROUP_EVERYONE","ROLE_ADMINISTRATOR","ROLE_AUTHENTICATED","admin"],"locales":["en_"],"textAttributes":[],"defaultFTSOperator":"AND","anyDenyDenies":false,"mlAnalaysisMode":"EXACT_LANGUAGE_AND_ALL"}
2018-10-25 09:23:56,668 DEBUG [org.alfresco.repo.search.impl.solr.SolrQueryHTTPClient] [http-apr-8180-exec-6] Got: 7 in 76 ms
2018-10-25 09:23:56,668 DEBUG [org.alfresco.repo.search.impl.solr.DbOrIndexSwitchingQueryLanguage] [http-apr-8180-exec-6] SOLR returned 7 results in 194ms

IN_FOLDER query related log
2018-10-25 10:25:12,062 DEBUG [org.alfresco.repo.search.impl.solr.DbOrIndexSwitchingQueryLanguage] [http-apr-8180-exec-10]

Trying db query for db-cmis for SearchParameters [

language=cmis-alfresco,

query=SELECT doc.*, SALO_ALIAS.* FROM cmis:document AS doc JOIN salo:documentProperties AS SALO_ALIAS ON doc.cmisSmiley SurprisedbjectId = SALO_ALIAS.cmisSmiley SurprisedbjectId WHERE IN_FOLDER(doc, 'workspace://SpacesStore/9315813f-ac89-4b05-9281-ce04eda1cf7d') AND (SALO_ALIAS.salo:lo_link = 1965) AND (SALO_ALIAS.salo:lo_category = 'Lic'), stores=[workspace://SpacesStore], queryParameterDefinitions=[], excludeDataInTheCurrentTransaction=false, sortDefinitions=[], locales=[en], mlAnalaysisMode=EXACT_LANGUAGE_AND_ALL, limitBy=FINAL_SIZE, permissionEvaluation=EAGER, limit=100, allAttributes=[], textAttributes=[], maxItems=100, skipCount=0, defaultFTSOperator=AND, defaultFTSFieldOperator=AND, queryTemplates={}, namespace=http://www.alfresco.org/model/content/1.0, maxPermissionChecks=-1, maxPermissionCheckTimeMillis=-1, defaultFieldName=TEXT, fieldFacets=[], facetQueries=[], filterQueries=[], pivots=[], stats=null, useInMemorySort=null, maxRawResultSetSizeForInMemorySort=null, extraParameters={cmisVersion=CMIS_1_1}, excludeTenantFilter=false, isBulkFetchEnabled=true, queryConsistency=DEFAULT, sinceTxId=null, searchTerm=null, highlight=null, interval=null, range=null, timezone=null, spellCheck=false]
2018-10-25 10:25:12,070 DEBUG [org.alfresco.repo.search.impl.solr.DbOrIndexSwitchingQueryLanguage] [http-apr-8180-exec-10] DB returned 7 results in 8ms