Search Documentation

resplin · ‎6 Jun 2015

The official documentation is at: http://docs.alfresco.com

Search
The repository can be searched in two ways.

Directly against the node service
Against an index using the Searcher component

XPath search using the node service

The node service has two methods that support XPath style searches.

/**
 * Select nodes using an xpath expression.
 * 
 * @param contextNode - the context node for relative expressions etc
 * @param XPath - the xpath string to evaluate
 * @param parameters - parameters to bind in to the xpath expression
 * @param namespacePrefixResolver - prefix to namespace mappings
 * @param followAllParentLinks - if false '..' follows only the primary parent links, if true it follows all 
 * @return a list of all the child assoc relationships to the selected nodes
 */
 public List<ChildAssocRef> selectNodes(NodeRef contextNode, 
                                        String XPath, 
                                        QueryParameterDefinition[] parameters,                                 
                                        NamespacePrefixResolver namespacePrefixResolver, 
                                        boolean followAllParentLinks);

/**
 * Select properties using an xpath expression 
 * 
 * @param contextNode - the context node for relative expressions etc
 * @param XPath - the xpath string to evaluate
 * @param parameters - parameters to bind in to the xpath expression
 * @param namespacePrefixResolver - prefix to namespace mappings
 * @param followAllParentLinks - if false '..' follows only the primary parent links, if true it follows all 
 * @return a list of property values 
 * TODO: Should be returning a property object 
 */
 public List<Serializable> selectProperties(NodeRef contextNode, 
                                            String XPath, 
                                            QueryParameterDefinition[] parameters, 
                                            NamespacePrefixResolver namespacePrefixResolver, 
                                            boolean followAllParentLinks);

Overview

Jaxen is used to evaluate the XPath expression using a custom document navigator, so a full XPath implementation is available. The downside is that it uses the node service to navigate the node structure, essentially in the same way an XPath expression would be evaluated against an XML DOM model. So whilst it is complete, some queries will not perform well, particularly unconstrained full text search.

There are two methods, to distinguish selecting only attributes from selecting only elements. At the moment there is no support for selecting a mixture of properties and nodes. The XPath implementation supports the standard $namespace:name variable substitution and has the additional functions required by the JSR 170 specification (which hide some inbuilt functions such as contains()).

Function extensions

like (SQL like pattern expressions using ? to match a single character and % to match a string)
contains (Google like full text search - in fact, the Lucene way)
deref (to follow references from reference properties to nodes)

To support these functions the node service interface defines the like() and contains() methods, which are optionally supported by node service implementations. The indexing node service supports them fully.

Note: the functions contains() and like() can contain wild card elements at the start of the query but this may have performance issues.
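
To make the wildcard semantics of like() concrete, the sketch below translates a like pattern into a java.util.regex pattern, treating ? as one character and % as any string. The class LikePatternSketch and its methods are hypothetical and are not part of the repository API; the real node service implementation may differ, for example in how it handles escaping or the * wildcard.

```java
import java.util.regex.Pattern;

/** Hypothetical helper, not part of the Alfresco API: illustrates like() wildcards. */
final class LikePatternSketch {

    private LikePatternSketch() {
    }

    /** Converts a like pattern ('?' = one character, '%' = any string) into a regex. */
    static Pattern toRegex(String likePattern) {
        StringBuilder regex = new StringBuilder();
        StringBuilder literal = new StringBuilder();
        for (char c : likePattern.toCharArray()) {
            if (c == '?' || c == '%') {
                // Flush any pending literal text, then emit the wildcard.
                appendLiteral(regex, literal);
                regex.append(c == '?' ? "." : ".*");
            } else {
                literal.append(c);
            }
        }
        appendLiteral(regex, literal);
        return Pattern.compile(regex.toString(), Pattern.DOTALL);
    }

    /** Quotes literal text so regex metacharacters in it are matched verbatim. */
    private static void appendLiteral(StringBuilder regex, StringBuilder literal) {
        if (literal.length() > 0) {
            regex.append(Pattern.quote(literal.toString()));
            literal.setLength(0);
        }
    }

    /** Returns true if the whole value matches the like pattern. */
    static boolean matches(String likePattern, String value) {
        return toRegex(likePattern).matcher(value).matches();
    }
}
```

A pattern that starts with a wildcard, such as %monkey, cannot be anchored on a prefix, which is one way to picture why leading wild cards may be slow.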

The repository supports nodes with multiple parents. This is a much better alternative to reference nodes and can be used to avoid the limitations of the deref() function. It does raise an ambiguity: the meaning of '..' in a path expression could be 'find all parents' or 'find my primary parent'. This behaviour is controlled using the followAllParentLinks parameter on the method calls.
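
The two meanings of '..' can be pictured with a small in-memory model. The ParentAxisSketch class, its Node type and the convention that the first parent added is the primary one are all assumptions made for illustration; they are not the node service API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical in-memory model (not the node service API) of multi-parent nodes. */
final class ParentAxisSketch {

    private ParentAxisSketch() {
    }

    static final class Node {
        final String name;
        // Assumption of this sketch: the first parent added is the primary parent.
        final List<Node> parents = new ArrayList<>();

        Node(String name) {
            this.name = name;
        }

        void addParent(Node parent) {
            parents.add(parent);
        }
    }

    /** Resolves '..' for a node: every parent, or only the primary parent. */
    static List<Node> parentAxis(Node node, boolean followAllParentLinks) {
        if (node.parents.isEmpty()) {
            return Collections.emptyList();
        }
        if (followAllParentLinks) {
            return Collections.unmodifiableList(node.parents);
        }
        return Collections.singletonList(node.parents.get(0));
    }
}
```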

XPath expressions are executed in the context of a given node - so that relative xpath expressions can be evaluated.
The store root node is '/'.

A name space prefix resolver is always required. For any XPath implementation there needs to be a way to map from the prefixes used in the XPath expression to the actual URIs that they represent. For example '/alf:space' would need to map 'alf' to the alfresco URI. The name space prefix resolver provides this support (as well as the information to navigate the name space axis, if required).

Parameters are provided as parameter definitions where the default value is used as the actual value. The definition identifies the fully qualified name and the type used to select the appropriate XPath type. Lists of nodes and attributes as parameters are not supported. We do not have property types to support this at the moment.

NOTE:

Property Objects
Collection types?

XPath Functions Available

Functions on Boolean Values

boolean
not
false
true

Functions on Numeric Values

number
ceiling
floor
round

Aggregate Functions

count
sum

Context Functions

last
position

Functions that Generate Sequences

id
document

Functions on Strings

string
concat
contains
normalize-space
starts-with
string-length
substring-after
substring-before
substring
translate

Functions on Nodes

name
namespace-uri
lang

Extension Functions

matrix-concat
evaluate
lower-case
upper-case
ends-with
subtypeOf
hasAspect
deref
like
contains
first

JCR Functions

jcr:like
jcr:score
jcr:contains
jcr:deref

Comparison with JSR 170

There are some differences between JSR 170 and what is provided here:

This is a complete XPath implementation
the like function currently also uses * meaning the same as % (this could be fixed by changing the lucene implementation of wild card queries to use ? and %, not ? and *)
the contains function will look at all attributes on a node and the full text representation of any content
the contains function can be constrained to one attribute and the full text index
the deref function can not be used in a path (as shown in the JSR 170 examples) - Jaxen does not support this
the deref function must be given the full path of the attribute to dereference

Examples

Two simple examples to illustrate use in code.

// A name space resolver is required - this could be the name space service
DynamicNamespacePrefixResolver namespacePrefixResolver = new DynamicNamespacePrefixResolver(null);
namespacePrefixResolver.addDynamicNamespace(NamespaceService.ALFRESCO_PREFIX, NamespaceService.ALFRESCO_URI);
namespacePrefixResolver.addDynamicNamespace(NamespaceService.ALFRESCO_TEST_PREFIX, NamespaceService.ALFRESCO_TEST_URI);

// Select all nodes below the context node
List<ChildAssocRef> answer = searchService.selectNodes(rootNodeRef, "*", null, namespacePrefixResolver, false);
// Find all the property values for @alftest:animal
List<Serializable> attributes = searchService.selectProperties(rootNodeRef, "//@alftest:animal", null, namespacePrefixResolver, false);

Other xpath examples and explanations:

Find all nodes with an @alftest:animal property equal to 'monkey'
"//.[@alftest:animal='monkey']"

Find all nodes directly linked to the current node
"*"

Find all nodes with one node between them and the current node
"*/*"

Find all nodes with two nodes between them and the current node
"*/*/*"

Find all nodes with three nodes between them and the current node
"*/*/*/*"

Find the parents of all nodes with three nodes between them and the current node
"*/*/*/*/.."
This may not be the same as "*/*/*" as nodes can have multiple parents.
e.g. Going down we may follow a non primary child relationship and then navigate up the primary child relationship
     We may go up all parent relationships
     (We could control navigating only primary relationships)

Find all nodes below the context node (excluding the context node)
"*//."

Follow a named path from the current context node
"alftest:root_p_n1"

Find all nodes below the context node (excluding the context node) that have an @alftest:animal property
"*//.[@alftest:animal]"

Find all nodes below the context node (excluding the context node) that have an @alftest:animal property equal to 'monkey'
"*//.[@alftest:animal='monkey']"

Find all nodes that have an @alftest:animal property equal to 'monkey'
(This will navigate to all nodes in the store and will have performance issues)
"//.[@alftest:animal='monkey']"

Find all nodes that have an @alftest:animal property equal to the value of the variable $alf:test
"//.[@alftest:animal=$alf:test]"

Find the principal parent or all parents of the current context node
".."

Find the values of all properties @alftest:animal
Again this will have performance issues as it will visit all nodes and all properties
"//@alftest:animal"

Find the values of all properties @alftest:reference
Again this will have performance issues as it will visit all nodes and all properties
"//@alftest:reference"

Dereference the node identified by the attribute at /alftest:root_p_n1/alftest:n1_p_n3/@alftest:reference
The second argument of the deref() function is not used at the moment
"deref(/alftest:root_p_n1/alftest:n1_p_n3/@alftest:reference, )"

Find all nodes in the store that have an attribute @alftest:animal ending with monkey
Again, this will visit all nodes in the repository.
"//*[like(@alftest:animal, '*monkey')]"

Find all nodes in the store that have an attribute @alftest:animal ending with monkey
Again, this will visit all nodes in the repository.
"//*[like(@alftest:animal, '%monkey')]"

Find all nodes in the store that have an attribute @alftest:animal starting with monk
Again, this will visit all nodes in the repository.
"//*[like(@alftest:animal, 'monk*')]"

Find all nodes in the store that have an attribute @alftest:animal starting with monk
Again, this will visit all nodes in the repository.
"//*[like(@alftest:animal, 'monk%')]"

Find all nodes in the store that have an attribute @alftest:animal equal to monk%
Again, this will visit all nodes in the repository.
TODO: check the requirements for escaping here
"//*[like(@alftest:animal, 'monk\%')]"

Find all the nodes with any attribute or content containing 'monkey'
This query will have the worst performance. It visits all nodes and searches an appropriate index for the full text
and all attribute values. It is much more efficient to use the searcher API.
"//*[contains('monkey')]"

Find all the values of any attribute where the attribute or content contains 'monkey'
This query will have the worst performance. It visits all nodes and searches an appropriate index for the full text
and all attribute values. It is much more efficient to use the searcher API.
"//@*[contains('monkey')]"

Find all the nodes with any attribute or content containing 'mon?ey' e.g. monkey, monaey, ...
This query will have the worst performance. It visits all nodes and searches an appropriate index for the full text
and all attribute values. It is much more efficient to use the searcher API.
"//*[contains('mon?ey')]"

Find all the values of any attribute where the attribute or content contains 'mon?ey'
This query will have the worst performance. It visits all nodes and searches an appropriate index for the full text
and all attribute values. It is much more efficient to use the searcher API.
"//@*[contains('mon?ey')]"

Similar pattern examples to the above
"//*[contains('m*y')]"
"//@*[contains('mon*')]"
"//*[contains('*nkey')]"
"//@*[contains('?onkey')]"

Searching using the searcher component

The searcher component decides which indexing service to call for queries against a given store. Each store may support different indexing and different query languages. The default indexing store uses lucene to provide indexing and query support. It supports two languages: lucene and a very limited, but optimised, XPath implementation. The functionality of this XPath implementation will be expanded over time.

Optimised XPath language

This can be called using the 'xpath' language specifier (case insensitive).

The implementation currently supports the following axes:

child
descendant
descendant-or-self
parent
self

It does not currently support the attribute axis or predicates.
These are next on the road map.

Parameterisation using $namespace:name is not supported.

However, text replacement is supported using ${namespace:name}.

These queries can be canned in the query register.
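
The ${namespace:name} text replacement can be pictured as a plain string substitution performed before the query is parsed. The sketch below illustrates that general technique only; it is not the searcher's actual code, and the class name and the use of a Map keyed by prefixed name are hypothetical.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical illustration of ${namespace:name} text replacement in a query string. */
final class TextReplacementSketch {

    // Matches ${...} and captures the name between the braces.
    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{([^}]+)\\}");

    private TextReplacementSketch() {
    }

    /** Replaces each ${name} in the query with the value registered for that name. */
    static String substitute(String query, Map<String, String> values) {
        Matcher matcher = PLACEHOLDER.matcher(query);
        StringBuffer result = new StringBuffer();
        while (matcher.find()) {
            String name = matcher.group(1);
            String value = values.get(name);
            if (value == null) {
                throw new IllegalArgumentException("No value for parameter: " + name);
            }
            // quoteReplacement keeps '$' and '\' in the value from being interpreted.
            matcher.appendReplacement(result, Matcher.quoteReplacement(value));
        }
        matcher.appendTail(result);
        return result.toString();
    }
}
```

Because the replacement is purely textual, a whole query can be supplied through a single placeholder, and the substituted value is parsed as query syntax rather than bound as a typed parameter.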

Examples

The optimised xpath syntax is identical to that used for the PATH field in lucene queries.
Any PATH content below in the lucene query examples is also a valid xpath query.

Find all the attributes available for all nodes at any level (excluding the root node)
ResultSet results = searcher.query(storeRef, "xpath", "//*", null, null);

Generate a query entirely by variable substitution
QueryParameterDefinition paramDef = new QueryParameterDefImpl(QName.createQName("alf:query", namespacePrefixResolver), (PropertyTypeDefinition) null, true, "//./*");
ResultSet results = searcher.query(storeRef, "xpath", "${alf:query}", null, new QueryParameterDefinition[] { paramDef });

Lucene Language

This is the recommended language as it is supported by the recommended indexer.

The query language is described on the Lucene site http://lucene.apache.org/java/2_4_0/queryparsersyntax.html. The QueryParser has been modified to allow wild cards at the start of wild card query elements; otherwise the syntax is the same.

Note that certain characters need to be escaped in the query string. A static method on the LuceneQueryParser supports this.
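
As an illustration of what such escaping involves, the sketch below puts a backslash in front of the characters that the Lucene query syntax treats as special. It is a hypothetical stand-alone helper, not the LuceneQueryParser method itself, and the exact set of special characters depends on the Lucene version in use.

```java
/** Hypothetical stand-alone escaper; not the LuceneQueryParser static method. */
final class LuceneEscapeSketch {

    // Assumed set of characters with special meaning in Lucene query syntax.
    private static final String SPECIAL = "\\+-!():^[]\"{}~*?|&";

    private LuceneEscapeSketch() {
    }

    /** Returns the text with every special character preceded by a backslash. */
    static String escape(String text) {
        StringBuilder escaped = new StringBuilder(text.length() + 8);
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (SPECIAL.indexOf(c) >= 0) {
                escaped.append('\\');
            }
            escaped.append(c);
        }
        return escaped.toString();
    }
}
```

With this sketch, a fully qualified name such as {wibble}wobble becomes \{wibble\}wobble, so the braces are matched literally instead of being read as range query syntax.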

The following fields are available

ASPECT
- All the aspects of the node
- Tokenised as the fully qualified qname of each aspect
FTSSTATUS
- Indicates if there are attributes waiting to be indexed in the background. Could be used to indicate that full text search matches may be out of date
ID
- The id from the node reference - all nodes in the index are from the same store
- A UUID
PARENT
- All the parent IDs (UUIDs)
PATH
- An XPATH expression used to select nodes
- This should only be accessed via a phrase query (i.e. in double quotes) as it requires special tokenisation
PRIMARYPARENT
- The ID of the primary parent node
QNAME
- All the QNames by which this node is known in its parents
- Should be queried using phrases as it requires special tokenisation
TEXT
- The full text representation of the node content
TYPE
- The fully qualified type of the node

Attributes as fields

@{namespace-uri}name

Attributes should be searched using phrase expressions.

The following fields are used internally

ANCESTOR
ISCONTAINER
ISROOT
ISNODE
TX

Examples

// Find all the nodes under the root node by QName namespace:one
// The prefix must be resolved to a URI
ResultSet results = searcher.query(rootNodeRef.getStoreRef(), "lucene", "PATH:\"/namespace:one\"", null, null);
results = searcher.query(storeRef, "lucene", "PATH:\"/namespace:one/namespace:five\"", null, null);
results = searcher.query(storeRef, "lucene", "PATH:\"/namespace:one/namespace:five/namespace:twelve\"", null, null);
results = searcher.query(storeRef, "lucene", "PATH:\"/namespace:*\"", null, null);
results = searcher.query(storeRef, "lucene", "PATH:\"/namespace:*/namespace:*\"", null, null);
results = searcher.query(storeRef, "lucene", "PATH:\"/namespace:*/namespace:*/namespace:*\"", null, null);
results = searcher.query(storeRef, "lucene", "PATH:\"/namespace:one/namespace:*\"", null, null);
results = searcher.query(storeRef, "lucene", "PATH:\"/namespace:*/namespace:five/namespace:*\"", null, null);
results = searcher.query(storeRef, "lucene", "PATH:\"/namespace:one/namespace:*/namespace:nine\"", null, null);
results = searcher.query(storeRef, "lucene", "PATH:\"/*\"", null, null);
results = searcher.query(storeRef, "lucene", "PATH:\"/*/*\"", null, null);
results = searcher.query(storeRef, "lucene", "PATH:\"/*/namespace:five\"", null, null);
results = searcher.query(storeRef, "lucene", "PATH:\"/*/*/*\"", null, null);
results = searcher.query(storeRef, "lucene", "PATH:\"/namespace:one/*\"", null, null);
results = searcher.query(storeRef, "lucene", "PATH:\"/*/namespace:five/*\"", null, null);
results = searcher.query(storeRef, "lucene", "PATH:\"/namespace:one/*/namespace:nine\"", null, null);
results = searcher.query(storeRef, "lucene", "PATH:\"//.\"", null, null);
results = searcher.query(storeRef, "lucene", "PATH:\"//*\"", null, null);
results = searcher.query(storeRef, "lucene", "PATH:\"//*/.\"", null, null);
results = searcher.query(storeRef, "lucene", "PATH:\"//*/./.\"", null, null);
results = searcher.query(storeRef, "lucene", "PATH:\"//./*\"", null, null);
results = searcher.query(storeRef, "lucene", "PATH:\"//././*/././.\"", null, null);
// Examples using the default namespace
results = searcher.query(storeRef, "lucene", "PATH:\"//common\"", null, null);
results = searcher.query(storeRef, "lucene", "PATH:\"/one//common\"", null, null);
results = searcher.query(storeRef, "lucene", "PATH:\"/one/five//*\"", null, null);
results = searcher.query(storeRef, "lucene", "PATH:\"/one/five//.\"", null, null);
results = searcher.query(storeRef, "lucene", "PATH:\"/one//five/nine\"", null, null);
results = searcher.query(storeRef, "lucene", "PATH:\"/one//thirteen/fourteen\"", null, null);
results = searcher.query(storeRef, "lucene", "PATH:\"/one//thirteen/fourteen//.\"", null, null);
results = searcher.query(storeRef, "lucene", "PATH:\"/one//thirteen/fourteen//.//.\"", null, null);

Type based queries.

escapeQName uses the QueryParser static method to escape the string.

QName qname = QName.createQName(NamespaceService.ALFRESCO_URI, "int-ista");
results = searcher.query(storeRef, "lucene", "\\@" + escapeQName(qname) + ":\"01\"", null, null);

qname = QName.createQName(NamespaceService.ALFRESCO_URI, "long-ista");
results = searcher.query(storeRef, "lucene", "\\@" + escapeQName(qname) + ":\"2\"", null, null);

qname = QName.createQName(NamespaceService.ALFRESCO_URI, "float-ista");
results = searcher.query(storeRef, "lucene", "\\@" + escapeQName(qname) + ":\"3.4\"", null, null);

results = searcher.query(storeRef, "lucene", "\\@" + escapeQName(QName.createQName(NamespaceService.ALFRESCO_URI, "double-ista")) + ":\"5.6\"", null, null);

Date date = new Date();
String sDate = CachingDateFormat.getDateFormat().format(date);
results = searcher.query(storeRef, "lucene", "\\@" + escapeQName(QName.createQName(NamespaceService.ALFRESCO_URI, "date-ista")) + ":\"" + sDate + "\"", null, null);

results = searcher.query(storeRef, "lucene",
               "\\@" + escapeQName(QName.createQName(NamespaceService.ALFRESCO_URI, "datetime-ista")) + ":\"" + sDate + "\"", null, null);

results = searcher.query(storeRef, "lucene", "\\@" + escapeQName(QName.createQName(NamespaceService.ALFRESCO_URI, "boolean-ista")) + ":\"true\"", null,
               null);

results = searcher.query(storeRef, "lucene", "\\@" + escapeQName(QName.createQName(NamespaceService.ALFRESCO_URI, "qname-ista")) + ":\"{wibble}wobble\"",
               null, null);
results = searcher.query(storeRef, "lucene", "\\@" + escapeQName(QName.createQName(NamespaceService.ALFRESCO_URI, "guid-ista")) + ":\"My-GUID\"", null,
               null);

results = searcher.query(storeRef, "lucene", "\\@" + escapeQName(QName.createQName(NamespaceService.ALFRESCO_URI, "category-ista")) + ":\"CategoryId\"",
               null, null);

results = searcher.query(storeRef, "lucene", "\\@" + escapeQName(QName.createQName(NamespaceService.ALFRESCO_URI, "noderef-ista")) + ":\"" + n1 + "\"",
               null, null);

results = searcher.query(storeRef, "lucene", "\\@" + escapeQName(QName.createQName(NamespaceService.ALFRESCO_URI, "path-ista")) + ":\""
               + nodeService.getPath(n3) + "\"", null, null);

Queries based on type.

results = searcher.query(storeRef, "lucene", "TYPE:\"" + testType.toString() + "\"", null, null);

results = searcher.query(storeRef, "lucene", "TYPE:\"" + testSuperType.toString() + "\"", null, null);

results = searcher.query(storeRef, "lucene", "ASPECT:\"" + testAspect.toString() + "\"", null, null);

results = searcher.query(storeRef, "lucene", "ASPECT:\"" + testSuperAspect.toString() + "\"", null, null);

Full text search examples

results = searcher.query(storeRef, "lucene", "TEXT:\"fox\"", null, null);

QName queryQName = QName.createQName("alf:test1", namespacePrefixResolver);
results = searcher.query(storeRef, queryQName, null);

Canned queries and query parameters

queryQName = QName.createQName("alf:test2", namespacePrefixResolver);
results = searcher.query(storeRef, queryQName, null);

queryQName = QName.createQName("alf:test2", namespacePrefixResolver);
QueryParameter qp = new QueryParameter(QName.createQName("alf:banana", namespacePrefixResolver), "woof");
results = searcher.query(storeRef, queryQName, new QueryParameter[] { qp });

queryQName = QName.createQName("alf:test3", namespacePrefixResolver);
qp = new QueryParameter(QName.createQName("alf:banana", namespacePrefixResolver), "/one/five//*");
results = searcher.query(storeRef, queryQName, new QueryParameter[] { qp });

// TODO: should not have a null property type definition
QueryParameterDefImpl paramDef = new QueryParameterDefImpl(QName.createQName("alf:lemur", namespacePrefixResolver), (PropertyTypeDefinition) null, true, "fox");
results = searcher.query(storeRef, "lucene", "TEXT:\"${alf:lemur}\"", null, new QueryParameterDefinition[] { paramDef });

paramDef = new QueryParameterDefImpl(QName.createQName("alf:intvalue", namespacePrefixResolver), (PropertyTypeDefinition) null, true, "1");
qname = QName.createQName(NamespaceService.ALFRESCO_URI, "int-ista");
results = searcher.query(storeRef, "lucene", "\\@" + escapeQName(qname) + ":\"${alf:intvalue}\"", null, new QueryParameterDefinition[] { paramDef });

Other

results = searcher.query(rootNodeRef.getStoreRef(), "lucene", "PARENT:\"" + rootNodeRef.toString() + "\"", null, null);

results = searcher.query(rootNodeRef.getStoreRef(), "lucene", "+PARENT:\"" + rootNodeRef.toString() + "\" +QNAME:\"one\"", null, null);
