The official documentation is at: http://docs.alfresco.com
This page explains the design of Alfresco's Lucene indexes and how they relate to each other. Version 2 of the index structure was introduced in Alfresco 1.4.
The index structure is made up of indexes and deltas.
All the information from a transaction is stored as an index delta, which consists of a deletion list and an index. To apply a delta, its deletions are applied to all preceding indexes and deltas, and its index is then appended to the result.
For example, if the first transaction to an index committed nodes 1, 2, 3 and 4, its delta would contain a deletion list for 1, 2, 3 and 4, followed by the index containing the information for nodes 1, 2, 3 and 4.
Index
Delta 1:
Deletions: 1 2 3 4
Index: 1 2 3 4
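The overall index is 1, 2, 3 and 4.
Suppose the next transaction deletes node 1, updates node 2 and adds node 5. This creates Delta 2. An update appears as a deletion of the old entry plus a new entry in the delta's index.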
Index
Delta 1:
Deletions: 1 2 3 4
Index: 1 2 3 4
Delta 2:
Deletions: 1 2 5
Index: 2 5
The overall index is 3, 4, 2 and 5.
Let's say the next transaction updates node 4, deletes node 5 and adds node 6. This creates Delta 3.
Index
Delta 1:
Deletions: 1 2 3 4
Index: 1 2 3 4
Delta 2:
Deletions: 1 2 5
Index: 2 5
Delta 3:
Deletions: 4 5 6
Index: 4 6
The overall index is 3, 2, 4 and 6.
This index could be simplified in two ways:
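The rule used in the examples so far can be sketched in a few lines of Python. This is only an illustration of the bookkeeping on plain lists of node ids, not Alfresco or Lucene code; the function name `overall_index` and the `(deletions, index)` tuple layout are assumptions made for the example.

```python
def overall_index(deltas):
    """Return the visible node ids in order of addition.

    Each delta is a (deletions, index) pair. A delta's deletions are
    applied to everything that came before it, then its index is appended.
    """
    visible = []
    for deletions, index in deltas:
        doomed = set(deletions)
        # Apply this delta's deletions to all earlier content.
        visible = [node for node in visible if node not in doomed]
        # Then append the delta's own index.
        visible.extend(index)
    return visible


deltas = [
    ([1, 2, 3, 4], [1, 2, 3, 4]),  # Delta 1
    ([1, 2, 5], [2, 5]),           # Delta 2
    ([4, 5, 6], [4, 6]),           # Delta 3
]
print(overall_index(deltas[:2]))  # [3, 4, 2, 5]
print(overall_index(deltas))      # [3, 2, 4, 6]
```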
- applying the deletions
- merging into a bigger index
The first stage would be to apply the deletions defined in Delta 1. Nothing precedes Delta 1, so no entries are removed and Delta 1 simply becomes Index 1:
Index
Index 1:
Index: 1 2 3 4
Delta 2:
Deletions: 1 2 5
Index: 2 5
Delta 3:
Deletions: 4 5 6
Index: 4 6
The next stage is to apply the deletions from Delta 2 to Index 1, to give:
Index
Index 1':
Index: 3 4
Index 2:
Index: 2 5
Delta 3:
Deletions: 4 5 6
Index: 4 6
The final stage is to apply the deletions from Delta 3 to both earlier indexes, to give:
Index
Index 1'':
Index: 3
Index 2':
Index: 2
Index 3:
Index: 4 6
These indexes could then be merged to produce:
Index
Index 4:
Index: 3 2 4 6
The order of additions to the index is preserved.
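The staged simplification above can be sketched as follows. This is a toy model on lists of node ids; the dictionary layout and the names `apply_next_delta` and `merge` are hypothetical and are not taken from the Alfresco code base.

```python
def apply_next_delta(entries):
    """Apply the deletions of the first pending delta to all earlier entries.

    Each entry is a dict with "index" (a list of node ids) and "deletions"
    (a list, or None once the deletions have been applied).
    Returns False when no delta is left to process.
    """
    for pos, entry in enumerate(entries):
        if entry["deletions"] is not None:
            doomed = set(entry["deletions"])
            for earlier in entries[:pos]:
                earlier["index"] = [n for n in earlier["index"] if n not in doomed]
            entry["deletions"] = None  # the delta is now a plain index
            return True
    return False


def merge(entries):
    """Concatenate plain indexes, keeping the order of additions."""
    if any(e["deletions"] is not None for e in entries):
        raise ValueError("apply all deletions before merging")
    return [n for e in entries for n in e["index"]]


entries = [
    {"deletions": [1, 2, 3, 4], "index": [1, 2, 3, 4]},  # Delta 1
    {"deletions": [1, 2, 5], "index": [2, 5]},           # Delta 2
    {"deletions": [4, 5, 6], "index": [4, 6]},           # Delta 3
]
while apply_next_delta(entries):
    print([e["index"] for e in entries])  # one line per stage
print(merge(entries))  # [3, 2, 4, 6]
```

Each pass through the loop corresponds to one of the stages listed above, and the final merge yields the single Index 4.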
The application of deletes and merging of indexes can be done in the background.
In this design, the prepare phase has to position a delta in the list. Once done, its position is fixed. Creating the index delta and storing the deletes is done by the transaction thread. Commit is then just a status change, which needs to be persisted to the IndexInfo file.
For recovery, a duplicate index info file is kept, so that a failure while writing one copy leaves the other intact.
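One way to picture the duplicate-file approach is the sketch below. The on-disk format (a CRC32 prefix followed by JSON), the file names and the function names are all assumptions made for illustration; the page does not describe the real IndexInfo format.

```python
import json
import os
import tempfile
import zlib

# Hypothetical file names for the main copy and its duplicate.
COPIES = ("IndexInfo", "IndexInfoBackup")


def write_info(directory, state):
    """Write the same state to both copies, one after the other."""
    payload = json.dumps(state, sort_keys=True).encode("utf-8")
    record = b"%08x" % zlib.crc32(payload) + payload
    for name in COPIES:
        with open(os.path.join(directory, name), "wb") as f:
            f.write(record)
            f.flush()
            os.fsync(f.fileno())  # this copy is on disk before the next is touched


def read_info(directory):
    """Return the state from the first copy whose checksum verifies."""
    for name in COPIES:
        try:
            with open(os.path.join(directory, name), "rb") as f:
                record = f.read()
        except FileNotFoundError:
            continue
        checksum, payload = record[:8], record[8:]
        if len(record) > 8 and checksum == b"%08x" % zlib.crc32(payload):
            return json.loads(payload.decode("utf-8"))
    raise IOError("no valid index info copy found")


with tempfile.TemporaryDirectory() as d:
    write_info(d, {"delta-2": "COMMITTED"})
    # Simulate a crash that left the main copy truncated.
    with open(os.path.join(d, "IndexInfo"), "wb") as f:
        f.write(b"0000")
    print(read_info(d))  # falls back to the duplicate copy
```

Because the copies are written in sequence, a crash can damage at most one of them, and the reader falls back to whichever copy still verifies.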
A Lucene reader is required that applies deletes to one reader and appends another.
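The idea of such a reader can be sketched without Lucene. The classes below are stand-ins over lists of node ids and do not use or mirror the Lucene `IndexReader` API; the names `ListReader` and `DeletingAppendingReader` are invented for this example.

```python
class ListReader:
    """Stand-in for an index reader over a fixed list of node ids."""

    def __init__(self, nodes):
        self._nodes = list(nodes)

    def documents(self):
        return iter(self._nodes)


class DeletingAppendingReader:
    """Hides deleted nodes in `base`, then appends everything in `delta`."""

    def __init__(self, base, deletions, delta):
        self._base = base
        self._deletions = frozenset(deletions)
        self._delta = delta

    def documents(self):
        # Deletions are applied on the fly; the base reader is not modified.
        for node in self._base.documents():
            if node not in self._deletions:
                yield node
        yield from self._delta.documents()


deltas = [
    ([1, 2, 3, 4], [1, 2, 3, 4]),  # Delta 1
    ([1, 2, 5], [2, 5]),           # Delta 2
    ([4, 5, 6], [4, 6]),           # Delta 3
]
reader = ListReader([])
for deletions, index in deltas:
    # Stack one wrapper per delta on top of everything before it.
    reader = DeletingAppendingReader(reader, deletions, ListReader(index))
print(list(reader.documents()))  # [3, 2, 4, 6]
```

Stacking one wrapper per delta gives the overall view without rewriting any of the underlying components.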
The components of the index remain unchanged unless they have deletions applied or are merged together.
IndexReaders are valid until these actions occur. They can be cached and reused until they go out of date.
This is managed via reference counting and an out of date flag.
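A minimal sketch of reference counting combined with an out-of-date flag is shown below. The class and method names are hypothetical, and the `closed` attribute stands in for releasing the real underlying resources.

```python
import threading


class CountedReader:
    """Cached reader that is closed once it is stale and no longer in use."""

    def __init__(self, name):
        self.name = name
        self.closed = False
        self._refs = 0
        self._stale = False
        self._lock = threading.Lock()

    def acquire(self):
        with self._lock:
            if self._stale:
                raise RuntimeError("reader is out of date")
            self._refs += 1
        return self

    def release(self):
        with self._lock:
            self._refs -= 1
            self._close_if_unused()

    def mark_out_of_date(self):
        # Called when deletions are applied to, or a merge replaces, the index.
        with self._lock:
            self._stale = True
            self._close_if_unused()

    def _close_if_unused(self):
        if self._stale and self._refs == 0:
            self.closed = True


reader = CountedReader("index-1")
reader.acquire()
reader.mark_out_of_date()  # still in use, so it stays open
print(reader.closed)       # False
reader.release()           # last user is done
print(reader.closed)       # True
```

Existing users can finish their work on a stale reader, while new callers are refused and must obtain a fresh one.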
After merges, unused index information is cleaned up in the background.
The merge and deletion algorithm aims for a fixed number of index files waiting to be merged or deltas waiting to have deletions applied.
NIO is used for file locking, RandomAccessFile buffering and flushing to disk. This is sufficient to provide locking for sharing indexes between multiple repositories. Additional work is required to verify and update index state as altered by other readers, and to remove pending operations that hang because the JVM dies or loses communication.
Version 1 has several issues which are addressed by version 2.