Alfresco Solr Demo Script


resplin

This page is obsolete.

The official documentation is at: http://docs.alfresco.com




Alfresco Tech Talk Live (TTL) Demo Script


This is the demo script Martin used for Episode #57 of Tech Talk Live. It still needs to be cleaned up a bit...




  1. From a fresh install show the Share Admin page plus related Solr dirs and files

    (Solr is installed in the same Tomcat container as Alfresco, and the connection URL is unchanged from the default. The Solr home is in the Alfresco data directory, which also contains the Solr data files.)

    1. Show search section under More... | More... logged in as admin

      Search Manager page shows which search subsystem is enabled

      Show the Solr config page. Note: changes made here are stored in the DB, so alfresco-global.properties cannot be used for these settings afterwards

      Backup runs at 2am every night

      Want to keep config in puppet for production installations

    2. Walkthrough of Solr related files and installation directories

      alfresco-global.properties

      Show rest of properties in repository.properties

      Solr Webapp in tomcat/conf/Catalina/localhost/solr.xml

      Specifies where Solr Home is and where to pick up WAR

      alf_data/solr/solr.xml specifies the cores and their config dirs; explain the core index dir and the core config dir

      \workspace-SpacesStore\alfrescoModels

      alfrescoResources

      conf

      solrcore.properties: how the core connects to Alfresco; only talk about the basic properties (4.0.2 multi-threaded tracking)

      schema.xml for core

      talk about AlfrescoDataType and how it reads alfrescoModels

      solrconfig.xml

      This is where the request handlers are configured and any extra search components are hooked up. No standard request handlers are used, only specific ones for alfresco, afts, cmis

      show the /alfresco one and its components and queryparser config



  2. From a fresh install show the Solr Admin interface and what is relevant



    1. https://localhost:8443/solr/

    2. Show importing cert from blog

    3. Point out that the config and schema are accessible from the Admin interface, so file system access is not needed

    4. Show Schema Browser that displays all fields being indexed via dynamic field in schema

    5. Demo search: https://localhost:8443/solr/alfresco/select/?q=alfresco&version=2.2&start=0&rows=10&indent=on (will not work because the wrong request handler is used)

    6. Demo search correct handler: https://localhost:8443/solr/alfresco/alfresco/?q=alfresco&version=2.2&start=0&rows=10&indent=on

    7. AFTS: https://localhost:8443/solr/alfresco/afts?q=@cm\:name:alfresco&indent=on (gives authority container, folders, content)

      add rows

      https://localhost:8443/solr/alfresco/afts?q=@cm\:name:alfresco&start=0&rows=20&indent=on

      (The Alfresco Solr search subsystem supports the same query languages as the embedded Lucene subsystem,

      and the same fields (ID, PARENT) are also available.)

      search only for folders

      https://localhost:8443/solr/alfresco/afts?q=TYPE:cm\:folder%20AND%20@cm\:name:alfresco&start=0&rows=...
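The AFTS demo URLs above can also be assembled in code rather than typed by hand. The following is a minimal sketch, assuming the embedded Solr from this demo on port 8443; `afts_url` is a hypothetical helper written for this page, not part of Alfresco or Solr. It percent-encodes every reserved character, so the result looks different from the hand-typed URLs but decodes to the same query.

```python
from urllib.parse import urlencode, quote

# Assumed base URL: the embedded Solr instance from the demo, "alfresco" core.
SOLR_CORE_URL = "https://localhost:8443/solr/alfresco"


def afts_url(query, start=0, rows=10, base=SOLR_CORE_URL):
    """Return a GET URL for the /afts request handler (hypothetical helper)."""
    params = {"q": query, "start": start, "rows": rows, "indent": "on"}
    # quote_via=quote encodes spaces as %20 rather than "+".
    return base + "/afts?" + urlencode(params, quote_via=quote)


# Folders only, name matching "alfresco". The backslash-escaped colons
# follow the query strings used in the demo URLs.
folders_only = afts_url(r"TYPE:cm\:folder AND @cm\:name:alfresco", rows=20)
print(folders_only)
```

Building the query string with `urlencode` avoids mistakes with characters such as `@`, `\` and spaces when the query gets longer than the ones shown in the demo.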

  3. Turn off Solr and use Lucene on the fresh Alfresco install



    1. Stop Alfresco

    2. Config...



      ### Solr indexing ###

      #index.subsystem.name=solr

      #dir.keystore=${dir.root}/keystore

      #solr.port.ssl=8443

      ### Lucene indexing ###

      index.subsystem.name=lucene


    3. Set index.recovery.mode=FULL

    4. Start Alfresco

    5. Show logs while starting

    6. Remove index.recovery.mode=FULL

    7. Can remove /alf_data/solr, tomcat/conf/Catalina/localhost/solr.xml, tomcat/webapps/solr
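Before restarting, it can help to confirm which settings are actually active, since the commented-out Solr lines are easy to misread. The following is a minimal sketch that parses the fragment from the steps above; `active_properties` is a hypothetical helper and handles only plain `key=value` lines, not the full Java properties syntax (no `:` separators, no line continuations).

```python
def active_properties(text):
    """Parse simple key=value lines, skipping blanks and '#' comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


# The fragment from step 2 plus the setting from step 3, as they might
# appear together in alfresco-global.properties (assumed for illustration).
SAMPLE = """\
### Solr indexing ###
#index.subsystem.name=solr
#dir.keystore=${dir.root}/keystore
#solr.port.ssl=8443
### Lucene indexing ###
index.subsystem.name=lucene
index.recovery.mode=FULL
"""

props = active_properties(SAMPLE)
print(props["index.subsystem.name"])   # lucene
if props.get("index.recovery.mode") == "FULL":
    print("reminder: remove index.recovery.mode=FULL after the rebuild")
```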

  4. Configure the fresh Alfresco install (the one that now runs with Lucene) to use a stand-alone Solr running in its own Tomcat

SETUP TOMCAT WITH SOLR



  1. Install Tomcat 6.0.35 in a new dir

  2. Create data dir under new dir

  3. Unpack Solr distribution ZIP into data dir (point out there are no index dirs or model dirs)

  4. Copy keystore from tomcat\webapps\alfresco\WEB-INF\classes\alfresco\keystore to data dir

  5. Copy solr-tomcat-context.xml to tomcat/conf/Catalina/localhost/solr.xml

  6. Update the solr.xml so paths match the installation, set the location of the Solr war file and the location of the Solr home directory

  7. Update each core's configuration and tell it where Alfresco is running and where its data dir is (EACH CORE)

    data.dir.root=C:/AlfrescoSolrTTL/data

    alfresco.host=localhost

  8. Alfresco needs to talk to Solr over HTTPS, so configure that in server.xml. Change all ports so they do not clash with Alfresco. For the HTTPS connector, change the first 8 to 9:

    <Connector port='9443' protocol='org.apache.coyote.http11.Http11Protocol' SSLEnabled='true'

    maxThreads='150' scheme='https'

    keystoreFile='C:\AlfrescoSolrTTL/data/keystore/ssl.keystore' keystorePass='kT9X6oe68t' keystoreType='JCEKS'

    secure='true' connectionTimeout='240000'

    truststoreFile='C:\AlfrescoSolrTTL/data/keystore/ssl.truststore' truststorePass='kT9X6oe68t' truststoreType='JCEKS'

    clientAuth='false' sslProtocol='TLS' allowUnsafeLegacyRenegotiation='true' />

  9. Allow the Alfresco Repository to SSL authenticate with Solr

    <tomcat-users>

    <user username='CN=Alfresco Repository, OU=Unknown, O=Alfresco Software Ltd., L=Maidenhead, ST=UK, C=GB' roles='repository' password='null'/>

    </tomcat-users>

  10. set JAVA_HOME=C:\Alfresco4.0.2TTL\java

  11. Start Tomcat

CONFIGURE ALFRESCO TO USE NEW TOMCAT WITH SOLR



  1. Stop Alfresco

  2. Config...

    ### Solr indexing ###

    index.subsystem.name=solr

    dir.keystore=${dir.root}/keystore

    solr.port.ssl=9443

    solr.host=localhost

    ### Lucene indexing ###

    #index.subsystem.name=lucene

  3. Start Alfresco

    If you start from an Alfresco 4.0 install without Solr and SSL, then server.xml needs:

    <Connector port='8443' protocol='org.apache.coyote.http11.Http11Protocol' SSLEnabled='true'

    maxThreads='150' scheme='https' keystoreFile='C:\Alfresco4.0.2TTL/alf_data/keystore/ssl.keystore' keystorePass='kT9X6oe68t' keystoreType='JCEKS'

    secure='true' connectionTimeout='240000' truststoreFile='C:\Alfresco4.0.2TTL/alf_data/keystore/ssl.truststore' truststorePass='kT9X6oe68t' truststoreType='JCEKS'

    clientAuth='false' sslProtocol='TLS' allowUnsafeLegacyRenegotiation='true' maxSavePostSize='-1' />

    and in tomcat-users.xml

    <tomcat-users>

    <user username='CN=Alfresco Repository Client, OU=Unknown, O=Alfresco Software Ltd., L=Maidenhead, ST=UK, C=GB' roles='repoclient' password='null'/>

    <user username='CN=Alfresco Repository, OU=Unknown, O=Alfresco Software Ltd., L=Maidenhead, ST=UK, C=GB' roles='repository' password='null'/>

    </tomcat-users>

    and in web.xml



    <security-constraint>

    <web-resource-collection>

    <web-resource-name>SOLR</web-resource-name>

    <url-pattern>/service/api/solr/*</url-pattern>

    </web-resource-collection>

    <auth-constraint>

    <role-name>repoclient</role-name>

    </auth-constraint>

    <user-data-constraint>

    <transport-guarantee>CONFIDENTIAL</transport-guarantee>

    </user-data-constraint>

    </security-constraint>

    <security-constraint>

    <web-resource-collection>

    <web-resource-name>SOLR</web-resource-name>

    <url-pattern>/s/api/solr/*</url-pattern>

    </web-resource-collection>

    <auth-constraint>

    <role-name>repoclient</role-name>

    </auth-constraint>

    <user-data-constraint>

    <transport-guarantee>CONFIDENTIAL</transport-guarantee>

    </user-data-constraint>

    </security-constraint>

    <security-constraint>

    <web-resource-collection>

    <web-resource-name>SOLR</web-resource-name>

    <url-pattern>/wcservice/api/solr/*</url-pattern>

    </web-resource-collection>

    <auth-constraint>

    <role-name>repoclient</role-name>

    </auth-constraint>

    <user-data-constraint>

    <transport-guarantee>CONFIDENTIAL</transport-guarantee>

    </user-data-constraint>

    </security-constraint>

    <security-constraint>

    <web-resource-collection>

    <web-resource-name>SOLR</web-resource-name>

    <url-pattern>/wcs/api/solr/*</url-pattern>

    </web-resource-collection>

    <auth-constraint>

    <role-name>repoclient</role-name>

    </auth-constraint>

    <user-data-constraint>

    <transport-guarantee>CONFIDENTIAL</transport-guarantee>

    </user-data-constraint>

    </security-constraint>

    <login-config>

    <auth-method>CLIENT-CERT</auth-method>

    <realm-name>Repository</realm-name>

    </login-config>

    <security-role>

    <role-name>repoclient</role-name>

    </security-role>

  4. Show searches when Solr is running and when it is stopped

    Stopping Solr gives 0 search results and no error message

    You can see which parts of the Share UI use canned DB queries

  5. Show search logging (Tomcat Log Valve)

    1. Turn on AccessLogValve in Solr Tomcat

    2. Search for alfresco

      127.0.0.1 - CN=Alfresco Repository, OU=Unknown, O=Alfresco Software Ltd., L=Maidenhead, ST=UK, C=GB [01/Aug/2012:15:00:08 +0200]

      'POST /solr/alfresco/afts?q=%28%28alfresco++AND

      +%28%2BTYPE%3A%22cm%3Acontent%22

      +%2BTYPE%3A%22cm%3Afolder%22%29%29+AND+

      -TYPE%3A%22cm%3Athumbnail%22+AND+

      -TYPE%3A%22cm%3AfailedThumbnail%22+AND+

      -TYPE%3A%22cm%3Arating%22%29+AND+

      NOT+ASPECT%3A%22sys%3Ahidden%22&

      wt=json&

      fl=DBID%2Cscore&

      rows=502&

      df=keywords&

      start=0&

      locale=en_GB&

      fq=%7B%21afts%7D

      AUTHORITY_FILTER_FROM_JSON&fq=%7B%21afts%7D

      TENANT_FILTER_FROM_JSON HTTP/1.1' 200 2983
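The logged request is URL-encoded, which makes the query hard to read during the demo. The following is a minimal sketch that decodes it with the Python standard library. It assumes the line breaks in the log entry above were added only for display, so the pieces are joined without separators; the trailing status code and byte count are left out.

```python
from urllib.parse import urlsplit, parse_qs

# Request portion of the access log entry, joined back into one line
# (assumption: the breaks in the script are display wrapping only).
REQUEST = (
    "POST /solr/alfresco/afts?q=%28%28alfresco++AND"
    "+%28%2BTYPE%3A%22cm%3Acontent%22"
    "+%2BTYPE%3A%22cm%3Afolder%22%29%29+AND+"
    "-TYPE%3A%22cm%3Athumbnail%22+AND+"
    "-TYPE%3A%22cm%3AfailedThumbnail%22+AND+"
    "-TYPE%3A%22cm%3Arating%22%29+AND+"
    "NOT+ASPECT%3A%22sys%3Ahidden%22&"
    "wt=json&"
    "fl=DBID%2Cscore&"
    "rows=502&"
    "df=keywords&"
    "start=0&"
    "locale=en_GB&"
    "fq=%7B%21afts%7D"
    "AUTHORITY_FILTER_FROM_JSON&fq=%7B%21afts%7D"
    "TENANT_FILTER_FROM_JSON HTTP/1.1"
)

method, target, protocol = REQUEST.split(" ")
parts = urlsplit(target)
# parse_qs decodes %XX escapes and "+" and collects repeated keys in lists.
params = parse_qs(parts.query)

print(method, parts.path)
for name, values in params.items():
    for value in values:
        print(f"{name} = {value}")
```

The decoded output shows that Share sends the search through the afts handler, restricts results to content and folders, excludes thumbnails, ratings and hidden nodes, and passes two `fq` filters for authority and tenant.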

  6. Set up a plain HTTP connection (production environment inside a firewall, performance improvements, fewer packages, calls can be inspected)

ALFRESCO SIDE



  1. Stop Alfresco

  2. Config...

    solr.port=9090

    solr.secureComms=none

  3. In the Alfresco web.xml, remove the following configuration:

    <security-constraint>

    <web-resource-collection>

    <url-pattern>/service/api/solr/*</url-pattern>

    </web-resource-collection>

    <auth-constraint>

    <role-name>repoclient</role-name>

    </auth-constraint>

    <user-data-constraint>

    <transport-guarantee>CONFIDENTIAL</transport-guarantee>

    </user-data-constraint>

    </security-constraint>

    <login-config>

    <auth-method>CLIENT-CERT</auth-method>

    <realm-name>Repository</realm-name>

    </login-config>

    <security-role>

    <role-name>repoclient</role-name>

    </security-role>

  4. Start Alfresco

SOLR SIDE



  1. Stop Tomcat

  2. For each core:
    alfresco.secureComms=none

  3. In the Solr web.xml, remove the following configuration:

    <security-constraint>

    <web-resource-collection>

    <url-pattern>/*</url-pattern>

    </web-resource-collection>

    <auth-constraint>

    <role-name>repository</role-name>

    </auth-constraint>

    <user-data-constraint>

    <transport-guarantee>CONFIDENTIAL</transport-guarantee>

    </user-data-constraint>

    </security-constraint>

    <login-config>

    <auth-method>CLIENT-CERT</auth-method>

    <realm-name>Solr</realm-name>

    </login-config>

    <security-role>

    <role-name>repository</role-name>

    </security-role>

  4. Start Tomcat

  5. How to rebuild the index from scratch

    Note: index.recovery.mode=FULL is not used by Solr, only by Lucene

    1. Stop the Solr web app

    2. Delete the index data directory for each core

    3. Optionally, delete the models cached on the Solr side for each core (e.g. ...\archive-SpacesStore\alfrescoModels\*)

    4. Start the Solr web app
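The delete steps of the rebuild can be scripted so that no core is forgotten. The following is a minimal sketch, not an official tool: `prepare_reindex` is a hypothetical helper, and the directory layout in the demonstration is invented for the example. Pass in the real index and model directories of your own Solr home, and run it only while the Solr web app is stopped.

```python
import shutil
import tempfile
from pathlib import Path


def prepare_reindex(index_dirs, model_dirs=()):
    """Delete each index directory and empty each model cache directory.

    Returns the list of paths that were touched.
    """
    touched = []
    for index_dir in map(Path, index_dirs):
        if index_dir.is_dir():
            shutil.rmtree(index_dir)            # step 2: drop the index data
            touched.append(index_dir)
    for model_dir in map(Path, model_dirs):
        if model_dir.is_dir():
            for child in model_dir.iterdir():   # step 3: alfrescoModels\*
                if child.is_dir() and not child.is_symlink():
                    shutil.rmtree(child)
                else:
                    child.unlink()
            touched.append(model_dir)
    return touched


# Demonstration against a throwaway tree. The layout is an assumption made
# for this example and is not taken from a real Solr home.
root = Path(tempfile.mkdtemp())
index = root / "archive-SpacesStore" / "index"
models = root / "archive-SpacesStore" / "alfrescoModels"
index.mkdir(parents=True)
models.mkdir(parents=True)
(index / "segments_1").write_text("dummy")
(models / "cm_contentmodel.xml").write_text("dummy")

touched = prepare_reindex([index], [models])
print([p.name for p in touched])
```

The model directory itself is kept and only its contents are removed, matching the `alfrescoModels\*` pattern in step 3.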


  6. Turn on more detailed logging in Solr

    http://localhost:9090/solr/

    Select the "[LOGGING]" link

    For info about tracking set INFO level logging for:

    org.alfresco.solr.tracker.CoreTracker

    org.alfresco.solr.tracker.CoreTrackerJob

    org.alfresco.solr.tracker.CoreWatcherJob

    For query debug set FINE for:

    org.alfresco.solr.query.AbstractQParser

    org.alfresco.solr.query.AlfrescoFTSQParserPlugin

    org.alfresco.solr.query.AlfrescoLuceneQParserPlugin

    org.alfresco.solr.query.CmisQParserPlugin

    For response timing (query time and match reporting) set INFO for:

    org.apache.solr.core.SolrCore

    Use the "set" option at the bottom of the page to save the changes

  7. Solr status summary for cores:

    http://localhost:9090/solr/admin/cores?action=SUMMARY&wt=xml

  8. Solr overall status report

    http://localhost:9090/solr/admin/cores?action=REPORT&wt=xml

  9. Solr index status (the normal Solr stuff)

    http://localhost:9090/solr/admin/cores?action=STATUS&wt=xml

  10. Show index inspection with the Luke Lucene tool

    C:\Users\mbergljung\Downloads>java -jar lukeall-3.5.0.jar

    Check term frequency for field

  11. Solr Alfresco Schema and how to extend it, see blog
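The three status URLs in steps 7 to 9 differ only in the `action` parameter, so they can be generated and fetched from one place. The following is a minimal sketch assuming the plain-HTTP Solr on port 9090 set up earlier in this script; `cores_admin_url` and `fetch_report` are hypothetical helpers, and `fetch_report` needs a running Solr, so it is defined but not called here.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed base URL: the stand-alone Solr reachable over plain HTTP.
SOLR_BASE = "http://localhost:9090/solr"

# The three actions shown in the demo script.
ACTIONS = ("SUMMARY", "REPORT", "STATUS")


def cores_admin_url(action, base=SOLR_BASE, wt="xml"):
    """Build the admin/cores URL for one of the actions used in the demo."""
    if action not in ACTIONS:
        raise ValueError(f"unexpected action: {action!r}")
    return f"{base}/admin/cores?" + urlencode({"action": action, "wt": wt})


def fetch_report(action, timeout=10):
    """Fetch the report body as text; requires Solr to be running."""
    with urlopen(cores_admin_url(action), timeout=timeout) as response:
        return response.read().decode("utf-8")


for action in ACTIONS:
    print(cores_admin_url(action))
```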
