Exact Term Queries in Search Services 2.0

Alfresco Employee

One of the smaller changes in Search Services 2.0 is to the behaviour of the = operator. Previously this operator performed an exact field match, although it has always been documented as an exact term match. For example, this is from the docs page for ACS 5.2:

To search for an exact term, prefix the term with "=".

In Search Services 2.0 we have fixed the behaviour to match the documentation. This only affects queries against tokenised fields; results for untokenised fields are unchanged.

For example, here are some queries against the (tokenised) cm:name field with a corpus of two documents:

[Figure: Examples of using exact term queries against a corpus of two documents.]

When we query for "=Driver" we now return the document called "Taxi Driver". Previously, in Search Services 1.4.0, we would not get any results. When we query for "=Taxi =Driver" we get both documents returned, since both have the term "Taxi" in the name (and the default operator is OR). Previously we would only get results where the name was exactly "Taxi" or exactly "Driver".
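The difference between the two behaviours can be sketched in a few lines of Python. This is only an illustration, not how Search Services is implemented: the whitespace tokeniser stands in for the real cm:name analyser, case handling is ignored, the function names are made up for this example, and the second document name ("Taxi") is an assumption because the original figure is not reproduced here.

```python
# Toy model of the "=" operator before and after Search Services 2.0.
# Assumptions: whitespace tokenisation, OR as the default operator,
# and a hypothetical two-document corpus.

CORPUS = ["Taxi", "Taxi Driver"]  # second name is assumed for illustration


def tokenise(name):
    """Split a field value into terms (stand-in for the real analyser)."""
    return name.split()


def exact_field_match(corpus, values):
    """Old behaviour: the whole field value must equal one of the query values."""
    wanted = set(values)
    return [name for name in corpus if name in wanted]


def exact_term_match(corpus, values):
    """New behaviour: any single term in the field may equal a query value."""
    wanted = set(values)
    return [name for name in corpus if wanted & set(tokenise(name))]


if __name__ == "__main__":
    for query in (["Driver"], ["Taxi", "Driver"]):
        print(query, "field match:", exact_field_match(CORPUS, query))
        print(query, "term match: ", exact_term_match(CORPUS, query))
```

With this toy corpus, the field-match version finds nothing for "Driver", while the term-match version finds "Taxi Driver", which mirrors the change described above.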

Having discussed this at length within the team, we realise that it can be quite hard to envisage the impact of this change in all situations, and we felt this was particularly complex when combined with phrase queries. Here's a table showing various queries and document names, with highlighting where the behaviour has changed in Search Services 2.0.

[Table: Changes in behaviour for various phrase and exact term queries.]

Full details of this fix can be found in SEARCH-2228.

As mentioned, this is one of the smaller changes that has gone into Search Services 2.0.0. There is a more detailed list of features published here, and for a more in-depth tour you can register for Tech Talk Live #123: Discovering the "2" in Search Services 2.0.

2 Comments
Master

Hmm... I wonder if this change does not cause more inconsistencies between TMQ and SOLR search. Can you elaborate on how this change would affect queries such as

=cm:name:Driver
=cm:name:"Taxi Driver"
=cm:name:Taxi =cm:name:Driver

when explicitly executed against SOLR? Because the behaviour of TMQ for those queries is exactly what 1.4 (and previous versions) yielded, and if 2.0 now works differently, people can see sudden changes in search results when they use the default TRANSACTIONAL_IF_POSSIBLE, and some (other) part of the query changes to transparently switch between DB and SOLR execution.

Alfresco Employee

This is a great point @afaust - thanks for raising it.

Using the corpus from the table above, when we call the v1 REST API then we see:

{
  "query": {
    "query": "=cm:name:Driver"
  }
}

This query gets sent to the DB and returns 0 results. However, if we send:

{
  "query": {
    "query": "=cm:name:Driver AND cm:name:*"
  }
}

then we get 3 results (the same as sending the first query to Solr directly).

I've raised a JIRA ticket here to investigate this further: https://issues.alfresco.com/jira/browse/SEARCH-2461
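
For anyone wanting to reproduce the comparison, here is a minimal Python sketch that builds the two request bodies shown above. The helper name `search_body` and its `force_solr` flag are hypothetical, and the idea that appending `AND cm:name:*` moves the query from the DB to Solr comes from the observation above rather than from documented behaviour. Sending the body to a repository is deliberately left out.

```python
import json


def search_body(query, force_solr=False):
    """Build a search request body like the ones shown above.

    force_solr appends a wildcard clause which, per the observation above,
    appears to stop the query being executed against the DB (assumption).
    """
    if force_solr:
        query = f"{query} AND cm:name:*"
    return {"query": {"query": query}}


# The same exact term query, with and without the extra wildcard clause.
db_body = search_body("=cm:name:Driver")
solr_body = search_body("=cm:name:Driver", force_solr=True)

if __name__ == "__main__":
    print(json.dumps(db_body, indent=2))
    print(json.dumps(solr_body, indent=2))
```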