Derek Tasks

resplin · ‎6 Jun 2015

Obsolete Pages{{Obsolete}}

The official documentation is at: http://docs.alfresco.com

Archived

Copy and Checkout

Copy And Checkout Refactor

QName Persistence

Problem

Apart from the child association localname, all QNames in Alfresco are static. Duplicated QNames are:

Persistence of Static QName
Table	Column	Length	Load Factor
node	type_qname	255	1.0
node_aspects	qname	200	3.0
node_properties	qname	200	8.0
node_assoc	type_qname	255	minor
child_assoc	type_qname	255	1.0

For each node, then, there are approximately 2710 characters stored that are duplicated between nodes. This significantly bloats the database as well as the size of the L1 and L2 caches.

Solution

A solution was written for the BRANCHES/SCHEMA_1.3 but were rolled out before the merge. It involves having a StaticQNameEntity and associated static_qname table that is referenced by foreign key relationships from the above tables. Only the unique ID of the static_qname will be stored in the node tables.

Because the static_qname values are immutable within the server, a non-clustered, static cache can be used to map between the ID and QName - and vice versa. Hibernate could manage the ID to QName mapping, but it would hydrate entities from either the database or the L2 cache each time the QName was accessed.

Most static QName usage currently uses QNameUserType in Hibernate. This can be replaced by a StaticQNameUserType for all static QName mappings. The child_assoc.qname column will keep use of the QNameUserType as it is not a static QName.

Auditable Aspect Properties

Problem

The cm:auditable aspect is applied to everything. This wasn't the case when the system was first written. Subsequently, with the AVM tables, the aspect's properties have been put directly onto the AVM node table equivalent. This can be done with the normal repository's alf_node table as well.

Solution

Put the auditable properties such as cm:modified directly on the alf_node table for speedier access.

NodeStatus

Problem

The NodeStatus entity serves two functions. The first is to keep track of the transaction ID of of existing and deleted nodes so that the continuous index updating components can ensure that deleted nodes are removed from the indexes. The second is to act as a map from the protocol-identifier-guid' key (NodeRef) to the unique node ID. The use of this entity to access nodes adds overhead to all node operations by:

A relatively high proportion of cache misses and therefore extra database queries when accessing nodes
Keeping data related to deleted nodes for long periods of time - and there isn't currently a clean-up process
Duplicated NodeRef information
Weak foreign key relationships between NodeStatus, Node and Store.

The index recovery components need to keep transaction IDs, which are time ordered, in order to be able to rebuild partial indexes in original transaction order without double checking the existence of all nodes. The change_txn_id column must therefore be preserved for both existing and deleted nodes.

Solution

NodeStatusRefactor.png

The NodeStatus will cease to exist. Instead, the change transaction ID will be put directly onto the Node entity for creations and updates. Deletions will be recorded in the DeletedNode entity. The NodeRef key will be enforced by the database for both existing and deleted nodes. For both tables, a unique key will be enforced for the NodeRef.

The NodeStatus currently provides a cache of NodeRef->Node. This can be replaced with a much simpler internal cache that maps NodeRef->List

Derek Tasks

Derek Tasks

Table of Contents

Copy and Checkout

QName Persistence

Problem

Solution

Auditable Aspect Properties

Problem

Solution

NodeStatus

Problem

Solution

We use cookies on this site to enhance your user experience