Versioning Improvements

afaust
Master

This is a speculative document from BeeCon Hackathon 2016, collecting ideas for improving Alfresco's Versioning.

Current Implementation

  • Versioning behaviors are available on content that has the cm:versionable aspect
  • Relevant aspects:
    • cm:versionable
  • Relevant properties:
    • cm:initialVersion
    • cm:autoVersion
    • cm:versionLabel
    • cm:autoVersionOnUpdateProps
  • Relevant stores:
    • version2Store
    • lightWeightVersionStore (legacy)
  • Upload in Share (5.0/5.1) ensures versionable aspect is set and reads default values from upload.post.desc.xml (which may be different from defaults in contentModel.xml and thus result in inconsistent behaviour)
  • Creation of a document via CMIS (5.0/5.1) ensures versionable aspect is set and defaults cm:autoVersion and cm:autoVersionOnUpdateProps to false regardless of contentModel.xml values
  • Updating content stream via CMIS always creates a minor version (CMIS spec does not require it: 'A repository MAY automatically create new document versions as part of this service method.')
  • Legacy VTI module ensures versionable aspect is set immediately before executing an operation that is version-aware (initial version will be the state before the operation, but not the initial state of the document) (see: MNT-3342, ALF-6344)
  • Associations to versionable nodes are stored only referencing the live node (not the version)
  • Restoring a version creates a new version with properties and associations based on the old version
  • Some technical properties are stored on the versioned node without an underlying content model, causing “residual” properties and value persistence in the DB as serializable blobs

See the attached document for a depiction of how the current versioning works with metadata. Though the behavior is not obvious, it is consistent.

Current Limitations

  • Version store locking for big repositories causes poor performance
    • e.g. parallel transactions that create versionable documents may run into optimistic locking failures (fail and retry), because creating the document-specific version root locks the version store root
  • When a document has a long version history, rendering of the details page becomes slow
    • All versions are always loaded by the backend data web script
  • When the user interacts with Alfresco through different interfaces, the cm:versionable aspect isn’t always applied. As a result, the user perceives versioning as inconsistent.
  • The initial version of a document can be lost if it is added to the repository and then edited: the cm:versionable aspect is added on save, and the initial version created is the edited document, not the original document.
  • Bootstrapping content does not support simple bootstrapping of a version history for the node (but does allow bootstrapping a custom initial version label)
  • Versions are not access controllable - anyone with access to main document can access full version history
  • Restoring an old version with an association to a non-existing node will not restore the association but still delete the current association
    • In the child-association case this can mean deleting a primary child with no simple way to restore it (reverting to the version from before the revert does not restore the child)
  • The lightWeightVersionStore is no longer used and should be removed.
  • Inconsistent behaviour with regard to (auto) versioning depending on the interface being used and the pre-conditions of the node

Customer Need

  • Business user needs to track all versions and all metadata changes
    • Including the first version of uploaded content.
  • Business users, administrators or developers need the optional ability to define the initial version label for a new document at the document or type/aspect level (some customers want to start at a specific version, say 0.1 instead of 1.0)
    • Type specific labeling / state specific labeling
  • Business users, administrators or developers need the optional ability to define “business version” labels in addition / as an alternative to technical version labels
  • Administrators or developers often want the ability to easily define specific events (e.g. a state property transition) that automatically create a major version (and optionally discard “work” versions)
  • End users want to have more explicit control over which versions are preserved
  • End users want to be able to branch and continue separate version history when copying a document
  • End users want to have the option of a more concise version comparison utility focusing on the changes instead of the full state of two versions
  • End users want to see versioned associations when looking at a specific version or version comparison
    • End users want to be able to look at details of associated, versioned nodes
  • End users want to have the option to easily compare two specific versions (not necessarily direct predecessor / successor pairs)
  • End users don’t always want simple property update / auto-version to create a new minor version
    • Auto-version should create “checkpoint” type of versions with the same major.minor label as the base document
    • “Checkpoint” version can be explicitly promoted to a new version
  • End users want to have the option of reverting individual property / association changes instead of the entire version
    • Administrators / developers may want to define which properties or associations can be individually restored / reverted
  • Alfresco administrator needs to control repository growth
    • Can’t store too many versions
  • Developers might want to update the content of the current version without creating a new version (for example to add a signature to a PDF)
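
The comparison needs above (changes only, any two versions) can be illustrated with a small property diff. This is only a sketch under assumptions: VersionDiffSketch and changedProperties are hypothetical names, and the property maps stand in for whatever metadata a version actually stores. It does not use any Alfresco API.

```java
import java.util.Map;
import java.util.Objects;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** Hypothetical helper: reports only what differs between two versions' properties. */
final class VersionDiffSketch {

    private VersionDiffSketch() {
    }

    /**
     * Compares the property maps of any two versions (they do not have to be
     * direct predecessor / successor) and returns "before -> after" per changed name.
     */
    static Map<String, String> changedProperties(Map<String, ?> older, Map<String, ?> newer) {
        // Union of property names so that added and removed properties are reported too
        Set<String> names = new TreeSet<>(older.keySet());
        names.addAll(newer.keySet());

        Map<String, String> changes = new TreeMap<>();
        for (String name : names) {
            Object before = older.get(name);
            Object after = newer.get(name);
            if (!Objects.equals(before, after)) {
                changes.put(name, before + " -> " + after);
            }
        }
        return changes;
    }
}
```

Unchanged properties are left out of the result, so a long metadata set with a single edit yields a single entry.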

Requirements

Version types

  • A version can be either a “business version” (major/minor - as before) or a “checkpoint version”.
  • “Business versions” have labels applied as before
  • “Checkpoint versions” share the label of the previous “business version”
  • “Checkpoint versions” act as “working versions” that are created whenever an auto-version operation executes (including a simple SharePoint save on a non checked-out document)
  • A “checkpoint version” can be promoted to a new “business version” with an explicit action or a checkin
  • Administrators can configure (via global properties) if handling of “checkpoint versions” is enabled for a specific interface (WebDAV, FTP, CIFS, CMIS, Share)
  • Administrators can configure (via global properties) if handling of “checkpoint versions” is enabled for a specific type (using inheritance during config lookup)
  • Administrators / users can configure if handling of “checkpoint versions” is enabled for a specific node (node properties similar to auto-version)
  • Administrators can configure (via global properties) if “checkpoint versions” are to be kept when promoting one to a new business version (default: “checkpoint versions” are cleared when a new “business version” is created)
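
To make the intended label behaviour concrete, here is a minimal in-memory sketch of the rules above. Everything in it is an assumption made for illustration: CheckpointHistorySketch, its methods and the keepCheckpointsOnPromote flag are hypothetical names, not part of the Alfresco version service, and promotion is modelled as replacing the latest checkpoint with a business entry.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical model of "business" and "checkpoint" versions for one document. */
final class CheckpointHistorySketch {

    /** One history entry; internalId gives the technical / chronological order. */
    static final class Entry {
        final int internalId;
        final String label;
        final boolean checkpoint;

        Entry(int internalId, String label, boolean checkpoint) {
            this.internalId = internalId;
            this.label = label;
            this.checkpoint = checkpoint;
        }
    }

    private final List<Entry> entries = new ArrayList<>();
    private final boolean keepCheckpointsOnPromote;
    private int nextId = 1;
    private int major;
    private int minor;

    CheckpointHistorySketch(int initialMajor, int initialMinor, boolean keepCheckpointsOnPromote) {
        this.major = initialMajor;
        this.minor = initialMinor;
        this.keepCheckpointsOnPromote = keepCheckpointsOnPromote;
        entries.add(new Entry(nextId++, label(), false)); // initial business version
    }

    private String label() {
        return major + "." + minor;
    }

    /** Auto-version: adds a checkpoint that shares the current business label. */
    Entry createCheckpoint() {
        Entry entry = new Entry(nextId++, label(), true);
        entries.add(entry);
        return entry;
    }

    /** Explicit action or check-in: turns the latest checkpoint into a business version. */
    Entry promoteLatestCheckpoint(boolean majorChange) {
        Entry last = entries.get(entries.size() - 1);
        if (!last.checkpoint) {
            throw new IllegalStateException("latest version is not a checkpoint");
        }
        if (majorChange) {
            major++;
            minor = 0;
        } else {
            minor++;
        }
        entries.remove(entries.size() - 1);
        if (!keepCheckpointsOnPromote) {
            entries.removeIf(e -> e.checkpoint); // default: older checkpoints are cleared
        }
        Entry promoted = new Entry(last.internalId, label(), false);
        entries.add(promoted);
        return promoted;
    }

    List<Entry> entries() {
        return Collections.unmodifiableList(new ArrayList<>(entries));
    }
}
```

Starting at 1.0, two auto-version saves produce two checkpoints labelled 1.0; promoting the latest one as a minor change yields 1.1 and, with the default setting, drops the remaining checkpoint.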

Version storage

  • Versions are stored with the actual version label and an internal version identifier (for technical / chronological ordering)
  • Version service API allows setting the version label in an explicit call to createVersion
  • Version service allows configuring a default initial version label (via global properties) for a specific type (using inheritance during config lookup)
  • Version service allows configuring a default initial version label for a specific node via a property of the cm:versionable aspect
  • Version roots for individual documents are stored in a structured (bucketed) hierarchy aimed at reducing chance of database contention / locking during history creation
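
One possible way to derive such a bucketed location is sketched below. The two-level layout and the names VersionRootBuckets and bucketPath are assumptions made for illustration, not an agreed design; the sketch only relies on node identifiers being UUIDs.

```java
import java.util.UUID;

/** Hypothetical helper that maps a node id to a bucketed version root path. */
final class VersionRootBuckets {

    private VersionRootBuckets() {
    }

    /**
     * Returns a path of the form "ab/cd/<node id>", using the first four hex
     * digits of the id as two bucket levels (256 buckets per level).
     */
    static String bucketPath(UUID nodeId) {
        String hex = nodeId.toString().replace("-", "");
        return hex.substring(0, 2) + "/" + hex.substring(2, 4) + "/" + nodeId;
    }
}
```

With such a layout, version roots created in parallel would mostly be added under different bucket nodes rather than under the single store root. Whether that actually removes the contention depends on how the repository locks parent nodes, which this sketch does not model.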

Version Branching

  • Versions of documents are stored in a separate structured (bucketed) hierarchy and included in the version roots of documents via secondary child associations
  • A version of a document can be contained in multiple version roots for different documents to allow reuse of history
  • Deleting a version of a document that is included in multiple roots only deletes the secondary child association that links it into the root
  • Deleting a version of a document that is included in only one root deletes the version node
  • The copy service API / copy behaviour callback API allow specifying if the history of a node should be copied into the target node
  • Administrators can configure (via global properties) if “checkpoint versions” of node histories should be copied into target nodes during copying
  • The copy service API / copy behaviour callback API allow specifying if “checkpoint versions” of a node’s history should be copied into the target node
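
The deletion rules for shared versions amount to reference counting over the links from version roots. The following sketch models that with plain collections; SharedVersionStoreSketch and its methods are hypothetical names, and the string ids stand in for version roots and version nodes.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical model of versions that can be linked into several version roots. */
final class SharedVersionStoreSketch {

    // version id -> ids of the version roots that link to it (secondary child associations)
    private final Map<String, Set<String>> rootsByVersion = new HashMap<>();

    /** Links a version into the version root of a document. */
    void link(String rootId, String versionId) {
        rootsByVersion.computeIfAbsent(versionId, k -> new HashSet<>()).add(rootId);
    }

    /** Copy with history: the target root links to every version of the source root. */
    void copyHistory(String sourceRootId, String targetRootId) {
        for (Set<String> roots : rootsByVersion.values()) {
            if (roots.contains(sourceRootId)) {
                roots.add(targetRootId);
            }
        }
    }

    /**
     * Removes the version from the given root.
     * Returns true if the version node itself was deleted (no other root linked it),
     * false if only the link was removed.
     */
    boolean deleteVersion(String rootId, String versionId) {
        Set<String> roots = rootsByVersion.get(versionId);
        if (roots == null || !roots.remove(rootId)) {
            throw new IllegalArgumentException(versionId + " is not linked into " + rootId);
        }
        if (roots.isEmpty()) {
            rootsByVersion.remove(versionId);
            return true;
        }
        return false;
    }

    boolean exists(String versionId) {
        return rootsByVersion.containsKey(versionId);
    }
}
```

After copying a document with its history, deleting a version from the original only removes that link; the version node disappears once the copy deletes it as well.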

Workaround solutions that may help

  • When KeenSoft sets up Alfresco, they always add a behavior so that a version gets automatically created by default when new content is created in the repository.

4 Comments

resplin
Intermediate

I wanted to thank again everyone who worked on this document last year: Axel Faust, Mittal Patoliya, iblanco.

These sorts of fundamental changes to the repository take a long time to consider and implement, but I have found this document to be very useful and will continue to refer to it.

afaust
Master

I should really try and find some time to add various issues / aspects I have recently discussed with people from customer(s) that were confused by versioning behaviour resulting from one of their customisations and argued vehemently that core versioning is at fault...

resplin
Intermediate

I should also thank Mikel Asla since he was at the hack-a-thon in person (Projects and Teams BeeCon Hackathon 2016) and discussed this with me. It doesn't look like he has an account here to reward with a badge, so a public "thank you" will have to suffice. [Badge now awarded.]

mitpatoliya
Moderator

Thank you, Richard. Always happy to contribute :)