During tuning long time queries, we observed this query
select distinct RES.*, DEF.KEY_ as PROC_DEF_KEY_, DEF.NAME_ as PROC_DEF_NAME_, DEF.VERSION_ as PROC_DEF_VERSION_, DEF.DEPLOYMENT_ID_ as DEPLOYMENT_ID_, VAR.ID_ as VAR_ID_, VAR.NAME_ as VAR_NAME_, VAR.VAR_TYPE_ as VAR_TYPE_, VAR.REV_ as VAR_REV_, VAR.PROC_INST_ID_ as VAR_PROC_INST_ID_, VAR.EXECUTION_ID_ as VAR_EXECUTION_ID_, VAR.TASK_ID_ as VAR_TASK_ID_, VAR.BYTEARRAY_ID_ as VAR_BYTEARRAY_ID_, VAR.DOUBLE_ as VAR_DOUBLE_, VAR.TEXT_ as VAR_TEXT_, VAR.TEXT2_ as VAR_TEXT2_, VAR.LAST_UPDATED_TIME_ as VAR_LAST_UPDATED_TIME_, VAR.LONG_ as VAR_LONG_ from ACT_HI_PROCINST RES left outer join ACT_RE_PROCDEF DEF on RES.PROC_DEF_ID_ = DEF.ID_ left outer join ACT_HI_VARINST VAR ON RES.PROC_INST_ID_ = VAR.EXECUTION_ID_ and VAR.TASK_ID_ is null order by RES.START_TIME_ desc, VAR.LAST_UPDATED_TIME_ asc LIMIT 20 OFFSET 0;
This query search process instance with process variables.
This query without distinct return same count of record like with it.
But time for query decrease for 6 times.
This query name is "selectHistoricProcessInstancesByQueryCriteria" from HistoricProcessInstance.xml.
I think, distinct should be removed from this query.
There was a reason why the distinct keyword was added. Maybe not needed for all databases but in some it used to return duplicate records. You can use a native query or execute custom SQL to get a better performant query Activiti User Guide