The official documentation is at: http://docs.alfresco.com
From Alfresco 4.0, Solr is available to support search within the Alfresco repository.
The existing embedded Lucene index will continue to be available in 4.x.
Choosing Solr for search support has many advantages, but you will not be able to use it if you require:
Search has been moved into a sub-system with a 'solr' and 'lucene' implementation.
The Alfresco Solr Search sub-system supports the same query languages as the embedded Lucene sub-system. The same fields (ID, PARENT, properties) are also available. The only minor difference is that Solr supports only the OpenCMIS-based CMIS query language, which adheres more strictly to the CMIS specification: type and aspect names are case sensitive.
The Solr sub-system has the following improvements:
To configure Alfresco to use Solr, set the following properties:
These properties can also be set via JMX (MBeans - Alfresco - Configuration - Search) and the Share admin page if you are using Alfresco Enterprise. You can switch between Lucene and Solr - in JMX this is done by setting the manager sourceBeanName to 'solr' or 'lucene'. The subsystems make available their own related properties. The 'managed - solr' instance exposes solr.base.url. The Lucene sub-system exposes all the properties that had to be set at start up. These can now be configured live and the sub-system redeployed.
The search sub-systems can also be configured using the administration screens in the Enterprise product.
The distribution will contain a zip file named something like
This archive contains:
In the following instructions
These may be the same or different directories, depending on whether you have chosen to install Solr on a standalone server or the same server as Alfresco.
Use these instructions to replace or update the keys used to secure communications between Alfresco and Solr, using secure keys specific to your Alfresco installation.
The following instructions assume that Solr has been extracted and a keystore directory has already been created, either automatically by the Alfresco installer, or manually by following the instructions in #Installing Solr.
It is possible to set properties via the solr.xml file. To configure a property in solr.xml, remove it from the two core properties files and add it to solr.xml either as a common property or one that is core specific.
<solr persistent='true' sharedLib='lib'>
  <cores adminPath='/admin/cores' adminHandler='org.alfresco.solr.AlfrescoCoreAdminHandler'>
    <property name='data.dir.root' value='w:/woof' />
    <core name='alfresco' instanceDir='workspace-SpacesStore' />
    <core name='archive' instanceDir='archive-SpacesStore'>
      <property name='data.dir.root' value='w:/woof' />
    </core>
  </cores>
</solr>
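To check which properties a solr.xml file sets as common and which as core specific, the file can be read with a short script. The following is a minimal sketch, not part of the Alfresco distribution: the function name list_properties is hypothetical, and the sample document is the fragment above with its closing tags added so that it parses.

```python
import xml.etree.ElementTree as ET

# Sample solr.xml modelled on the fragment above (closing tags added as an assumption).
SOLR_XML = """
<solr persistent='true' sharedLib='lib'>
  <cores adminPath='/admin/cores' adminHandler='org.alfresco.solr.AlfrescoCoreAdminHandler'>
    <property name='data.dir.root' value='w:/woof' />
    <core name='alfresco' instanceDir='workspace-SpacesStore' />
    <core name='archive' instanceDir='archive-SpacesStore'>
      <property name='data.dir.root' value='w:/woof' />
    </core>
  </cores>
</solr>
"""


def list_properties(xml_text):
    """Return (common, per_core) property dictionaries found in a solr.xml document."""
    root = ET.fromstring(xml_text.strip())
    cores = root.find("cores")
    # Properties directly under <cores> apply to every core.
    common = {p.get("name"): p.get("value") for p in cores.findall("property")}
    # Properties nested inside a <core> apply to that core only.
    per_core = {
        core.get("name"): {p.get("name"): p.get("value") for p in core.findall("property")}
        for core in cores.findall("core")
    }
    return common, per_core


common, per_core = list_properties(SOLR_XML)
print("common:", common)
print("per core:", per_core)
```

A property that appears here should no longer appear in the corresponding core properties files, as described above.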
See http://wiki.apache.org/solr/SolrInstall for details on how to set up Solr in other containers.
Installing from source is very similar to installing from the distribution.
The source contains a project ..\HEAD\root\projects\solr, and within that a directory with the basis of what the distribution contains, but without the libraries. This is ..\HEAD\root\projects\solr\source\solr\instance. To build the libraries, change to the directory ..\HEAD\root and run the command 'ant deploy-solr'. The directory ..\HEAD\root\projects\solr\source\solr\instance should now match what you would extract from the distribution.
Then follow the steps above to complete the installation.
Get the new Solr distribution and set it up as above.
Apply your recorded local changes to the new distribution (or diff the old and new configurations).
The release notes will indicate if a rebuild of the index is required and the old indexes should not be used.
The installer carries out the above steps to install Solr in the same Tomcat container as Alfresco. The connection URL is unchanged from the default. The Solr home is in the Alfresco data directory, which also contains the Solr data files.
<SOLR-ARCHIVE>\apache-solr-1.4.1.war' debug='0' crossContext='true'>
Communications between the repository and Solr are protected by SSL with mutual authentication out of the box. Both the repository and Solr have their own public/private key pair, which are stored in their own respective keystores. These keystores are bundled with Alfresco; the customer SHOULD create their own (see 'Generating New SSL Keys' below). Otherwise, other Alfresco installations will potentially be able to read their Solr-Repository traffic, because the HTTP endpoints with which Alfresco Solr communicates are secured only by SSL and expose potentially sensitive data such as content.
See also the wiki entry Data Encryption, section 'Alfresco Keystores'.
The repository has two keystores it uses for SSL:
These key stores can be stored wherever the customer desires; the following properties need to be updated accordingly in alfresco-global.properties.
Each Solr core similarly has two SSL keystores, the 'ssl.repo.client.keystore' containing a Solr public/private RSA key pair and the 'ssl.repo.client.truststore' containing the trusted Alfresco Certificate Authority certificate (which has been used to sign both the repository and Solr certificates).
Instructions for Generating Alfresco Repository SSL keystores
Instructions for Generating Alfresco Solr SSL keystores
When using JDK 7 and the default keystores, you may see this exception in the startup log:
Caused by: sun.security.validator.ValidatorException: PKIX path validation failed: java.security.cert.CertPathValidatorException: Path does not chain with any of the trust anchors
It is not clear what exactly the issue is with the default keystores (see ALF-14311), but the issue does not occur if you generate your own private key / CA and SSL keys / keystores, which you should be doing anyway if you plan to use SSL to secure the communication between the repository and the indexing server.
A sample shell script to automate the generation of these keys and certificates is available here. SHA1SUM: 9195386ad7e68cca8c5e544be3c7ad1422e13b6f.
The script uses the default key aliases and passwords, though you may want to change them. If you do, you will have to reflect the change in both the repository configuration (alfresco-global.properties) and each Solr core configuration (solrcore.properties for each core, by default archive-SpacesStore and workspace-SpacesStore).
All URLs for the Solr web application bundled with Alfresco are protected by SSL. In order to use these from a browser you need to import a browser-compatible keystore to allow mutual authentication and decryption to work. Follow these steps to import the keystore into your browser (these relate to Firefox; other browsers will have a similar mechanism):
(i) Open the Firefox Certificate Manager
(ii) Import the browser keystore 'browser.p12' that is located in your WEB-INF/classes/alfresco/keystore directory.
The password is 'alfresco'. This should result in a dialog indicating that the keystore has been imported successfully, as the following image shows.
'Your Certificates' should now contain the imported keystore with the Alfresco repository certificate.
(iii) In your browser, navigate to a Solr URL, e.g. http://localhost:8080/solr. This will result in the browser displaying an error dialog of the form shown in the image.
This is probably a result of the fact that the Alfresco certificate presented to the browser is not tied to the server IP address. In this case, simply view the certificate and confirm that it is signed by the Alfresco CA, by expanding 'I Understand the Risks' and selecting 'Add Exception', then click 'View' to view the certificate.
Confirm that the certificate was issued by 'Alfresco CA' and then confirm the Security Exception (you may also want to uncheck the 'Permanently store this exception' checkbox).
Access to Solr will be granted as shown in the image below.
In alfresco-global.properties, set the property 'solr.secureComms' to 'none' and ensure that the property 'solr.port' is set to the correct non-SSL port of the application server in which Solr is running. Similarly, in each solrcore.properties file, set the property 'alfresco.secureComms' to 'none' and ensure that the property 'alfresco.port' is set to the correct non-SSL port of the application server in which your repository is running.
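The property changes described above can be scripted. The following is a minimal sketch, assuming simple key=value lines (it does not handle the ':' separator or line continuations that Java properties files also allow). The helper name set_properties and the port value 8080 are illustrative assumptions; the property names are the ones named above.

```python
def set_properties(text, updates):
    """Return properties-file text with the given keys set, appending any that are missing."""
    remaining = dict(updates)
    out = []
    for line in text.splitlines():
        stripped = line.strip()
        is_comment = stripped.startswith(("#", "!"))
        key = stripped.split("=", 1)[0].strip() if "=" in stripped else None
        if not is_comment and key in remaining:
            # Replace the existing value for this key.
            out.append(f"{key}={remaining.pop(key)}")
        else:
            # Keep comments, blank lines and unrelated properties untouched.
            out.append(line)
    # Append keys that were not present in the original text.
    out.extend(f"{k}={v}" for k, v in remaining.items())
    return "\n".join(out) + "\n"


# Repository side (alfresco-global.properties); 8080 is an assumed non-SSL port.
repo_text = "solr.secureComms=https\nsolr.port=8443\n"
new_repo_text = set_properties(repo_text, {"solr.secureComms": "none", "solr.port": "8080"})

# Solr side (one solrcore.properties per core); 8080 is an assumed non-SSL port.
new_core_text = set_properties("", {"alfresco.secureComms": "none", "alfresco.port": "8080"})

print(new_repo_text)
print(new_core_text)
```

In practice the text would be read from and written back to the real files, once for alfresco-global.properties and once for each core's solrcore.properties.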
In the repository web.xml, remove the following configuration:
In the solr web.xml, remove the following configuration:
For community releases of 4.0, upgrade to Alfresco 4.0 - with the Lucene sub-system enabled. (Some patches may depend on search)
It is possible to upgrade to Enterprise versions of Alfresco 4.0 and later with Solr in place at upgrade time.
Set up the Solr web app and then let it track the repository (you can confirm this by turning on the debug logging described below).
You can use the Lucene search sub-system while this is going on.
Configure the solr search subsystem properties (it does not have to be active to configure it via JMX or Share)
Check the Solr tracking status using the admin tools.
When the Solr index is sufficiently up to date, switch the search sub-system.
You can always switch back - the Lucene index will rebuild from where it was as the sub-system starts.
Quick summary report
With multi-threaded tracking (available from 4.0.2) there are additional tracking details and tracking statistics
Additional information in 4.1.3 and later
General report - including the last TX indexed and the time
ACL TX specific report
Node specific report
ACL specific report
The next time the index commits, the caching used for PATH and ACL evaluation will be exhaustively checked and fixed up if in error.
Fix any issues as reported by the REPORT option
Can be used to remove transactions, acl transactions, nodes and acls from the index
May be used to create holes for testing.
This will create entries in the index. It will not delete the entry first so can be used to create duplicates
Re-read the default log4j configuration.
An optional resource parameter can be provided to specify a location on the classpath or file from which to load the log4j configuration.
As of 4.1.3 the packaged logging with SOLR has been replaced with log4j and cannot be configured in the SOLR UI.
By default, SOLR specific logging configuration is in log4j-solr.properties anywhere on the classpath or in <SOLR_HOME>.
An optional core can be specified for the report. If it is absent, a report is produced for each core.
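As an illustration of the optional core parameter, the sketch below builds a report URL with and without a core. It is hedged throughout: the base URL https://localhost:8443/solr is an assumed deployment, the /admin/cores path is taken from the adminPath in the solr.xml example, and the query parameter names 'action' and 'core' follow the usual Solr core admin convention and should be checked against your installation. The function name report_url is hypothetical.

```python
from urllib.parse import urlencode


def report_url(base_url, core=None):
    """Build a core admin REPORT URL, optionally restricted to a single core."""
    params = {"action": "REPORT"}
    if core:
        # Only add the core parameter when a specific core is requested.
        params["core"] = core
    return f"{base_url.rstrip('/')}/admin/cores?{urlencode(params)}"


# Assumed base URL; adjust host, port and context path for your installation.
all_cores = report_url("https://localhost:8443/solr")
one_core = report_url("https://localhost:8443/solr", core="alfresco")
print(all_cores)
print(one_core)
```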
Solr index status (the normal Solr stuff)
The status of the index can also be checked via JMX.
MBeans - Alfresco - solrIndexes - <store alias>
The default Solr core summary is the default view.
The operations can run the same consistency checks available by URLs above.
You can also fix index issues, check the index cache and backup individual indexes via JMX.
Each core of the Solr index can be backed up in its own right by URL or using cluster aware cron jobs from Alfresco.
The location and cron expression for each core can be setup via Share admin or JMX.
Ad hoc index backups can be done via JMX or directly against Solr.
Solr will create a time-stamped sub-directory for each index back up you make.
It will contain a full index back up.
The Enterprise product allows a limit on the number of backups to be set via JMX or Share.
SOLR can also be backed up direct using
Copy a backup index to the data directory for each core.
Restart Solr - it will start to track based on the state of the restored index.
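The copy step can be sketched as follows. This is only an illustration under stated assumptions: the directory layout, the file name segments_1 and the helper name restore_core_index are hypothetical, and the sketch assumes Solr is stopped while the copy takes place. Rather than deleting the current index, it moves it aside with an '.old' suffix.

```python
import shutil
import tempfile
from pathlib import Path


def restore_core_index(backup_dir, core_index_dir):
    """Copy a backed-up index into a core's index directory, keeping the old index aside."""
    backup = Path(backup_dir)
    target = Path(core_index_dir)
    if not backup.is_dir():
        raise FileNotFoundError(f"backup directory not found: {backup}")
    if target.exists():
        # Move the current index out of the way instead of deleting it.
        target.rename(target.with_name(target.name + ".old"))
    shutil.copytree(backup, target)


# Demonstration in a temporary directory with made-up contents.
with tempfile.TemporaryDirectory() as tmp:
    backup = Path(tmp) / "backup" / "snapshot"
    backup.mkdir(parents=True)
    (backup / "segments_1").write_text("backup data")
    target = Path(tmp) / "data" / "index"
    target.mkdir(parents=True)
    (target / "stale_file").write_text("old data")

    restore_core_index(backup, target)
    restored = sorted(p.name for p in target.iterdir())
    kept_old = (target.with_name("index.old") / "stale_file").exists()

print(restored, kept_old)
```

The same call would be repeated for each core before restarting Solr.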
Comment out the properties in this file (for each solr core) if you wish to set them via solr.xml
Configuration for multi-threaded tracking - 4.0.2 and later
HTTP Client configuration.
The Solr cache configuration from solrconfig.xml exposed as properties.
Check the performance of these caches for each core using the Solr admin pages.
Warming - set the number of cache entries to pre-build after each index update before the new index goes live.
The more warming you do the longer an index takes to become live - but the less time is spent as a result of cache misses.
These options require a bit of thought and tuning for individual use cases.
You can disable permission checks if they are not required.
17-Nov-2011 11:22:45 org.alfresco.solr.tracker.CoreTracker trackRepository
INFO: .... from Transaction [id=374, commitTimeMs=1321528572425, updates=1, deletes=0]
17-Nov-2011 11:22:45 org.alfresco.solr.tracker.CoreTracker trackRepository
INFO: .... to Transaction [id=374, commitTimeMs=1321528572425, updates=1, deletes=0]
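Tracker log lines of this form are easy to parse when you want to see how far tracking has progressed. The following is a minimal sketch that assumes the line format shown above; the function name parse_transaction is hypothetical, and the commit time is interpreted as milliseconds since the Unix epoch in UTC.

```python
import re
from datetime import datetime, timezone

# Matches the "Transaction [...]" part of the tracker log lines shown above.
TX_PATTERN = re.compile(
    r"Transaction \[id=(\d+), commitTimeMs=(\d+), updates=(\d+), deletes=(\d+)\]"
)


def parse_transaction(line):
    """Extract the transaction details from a tracker log line, or return None."""
    match = TX_PATTERN.search(line)
    if match is None:
        return None
    tx_id, commit_ms, updates, deletes = (int(g) for g in match.groups())
    return {
        "id": tx_id,
        "commitTimeMs": commit_ms,
        # Convert epoch milliseconds to a timezone-aware UTC datetime.
        "commitTime": datetime.fromtimestamp(commit_ms / 1000, tz=timezone.utc),
        "updates": updates,
        "deletes": deletes,
    }


line = "INFO: .... from Transaction [id=374, commitTimeMs=1321528572425, updates=1, deletes=0]"
tx = parse_transaction(line)
print(tx["id"], tx["commitTime"].isoformat(), tx["updates"], tx["deletes"])
```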
From Alfresco version 4.0.3, it is possible to dynamically add, remove and configure Solr cores to track any store in Alfresco. The Solr search sub-system can be configured via properties or JMX to support queries against those cores.
Prerequisite: Alfresco 4.0.3 or greater, configured to use the Solr search sub-system.
Note you can set any property normally set in the solrcore.properties file on the url.
In the above example we have set data.dir.store=carrot. On the URL, the property name needs to be prefixed by 'property.'.
You can update any property and cause the Solr core to be reloaded with that setting using:
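The 'property.' prefixing rule can be illustrated with a short helper that builds the query string. This is a sketch only: the function name core_admin_query is hypothetical, and the base parameter shown (action=EXAMPLE_ACTION) is a placeholder, not a real Alfresco action name. Only the prefixing of data.dir.store comes from the text above.

```python
from urllib.parse import urlencode


def core_admin_query(base_params, core_properties):
    """Build a query string in which each core property name is prefixed with 'property.'."""
    params = dict(base_params)
    for name, value in core_properties.items():
        # solrcore.properties names are passed on the URL with a 'property.' prefix.
        params[f"property.{name}"] = value
    return urlencode(params)


# EXAMPLE_ACTION is a placeholder; substitute the parameters your request needs.
query = core_admin_query({"action": "EXAMPLE_ACTION"}, {"data.dir.store": "carrot"})
print(query)
```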
This can be used to set the data dir as above (which will cause a new copy of the index to be started)
You could use this method to adjust (and persist) a new value for the query cache size.
If you are done with the core or want to start again:
The Solr sub-system now supports a dynamic mapping of Alfresco stores to a Solr instance where the index for a store resides. Via properties this could be set as:
The same changes can be made via JMX. Add ',solrMappingSystem' to the solr.store.mappings attribute of the Solr search sub-system. A new entry for this mapping will appear below the sub-system alongside the two existing mappings. The attributes of the mapping can then be configured.