This page is obsolete. The official documentation is at: http://docs.alfresco.com
DRAFT/WIP
Introduction
Alfresco Activities is part of Alfresco's open source social computing platform. Alfresco v3.0 provides support for a news/activity feed in the context of an enterprise generating and acting upon content. This document outlines the requirements for Alfresco Activities and links to a design that supports those requirements. It is a forward-looking document whose purpose is to ensure that the scope and direction are in the right ballpark, rather than a statement of fact.
Requirements & scope
Alfresco 3.0 is primarily targeted at the extranet / site collaboration use case. The requirements list is subject to change, and some priorities may move up or down. Requirements are scoped using high-level MoSCoW priorities:
- [M]ust
- [S]hould
- [C]ould
- [W]on't - at least not for this (3.0) release
Activities
What is an activity?
- activity represents an action that has taken place within an Alfresco client interface (app/tool)
- activity is typically initiated by the Alfresco app/tool/component/service on behalf of a user (it is not necessarily initiated by the underlying repository)
- activity is of a given/named type specified by the Alfresco app/tool (e.g. document added)
- activity is performed at a particular point in time (post date)
- activity may have associated data dependent on type of activity
- activity may be performed within a given site/network context
- activity may be performed within a given app/tool context
- activity may be sensitive, that is, associated with data that is permission controlled; the activity itself may therefore be permission controlled (can or can't be read)
- activity may be rendered into one or more UI views (activity summary)
TODO: clarify - list of app/tool/component/service types, e.g. doclib, wiki, blog, forum, calendar, site (e.g. managing site membership)
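The activity properties listed above can be pictured as a simple record. The following Python sketch is purely illustrative: the class name `Activity`, every field name, and the sample values are assumptions made for this example and are not part of any Alfresco API.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class Activity:
    """Hypothetical record mirroring the activity properties listed above."""
    activity_type: str               # uniquely named type, e.g. "document-added"
    post_user: str                   # user on whose behalf the activity was raised
    post_date: datetime              # point in time the activity was performed
    site: Optional[str] = None       # optional site/network context
    app_tool: Optional[str] = None   # optional app/tool context, e.g. "doclib"
    data: dict = field(default_factory=dict)  # type-dependent associated data


# Sample values are invented for illustration only.
example = Activity(
    activity_type="document-added",
    post_user="jsmith",
    post_date=datetime(2008, 6, 1, 12, 0, tzinfo=timezone.utc),
    site="project-x",
    app_tool="doclib",
    data={"name": "plan.doc"},
)
```

The site and app/tool contexts are optional in this sketch because the list above says an activity "may" be performed within them.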
Activity Types
Activities may be raised by one or more Alfresco applications. The posted activity must have a uniquely named activity type.
Here is a candidate list of pre-defined activity types for Alfresco 3.0. The exact list is subject to change, as we develop the Alfresco 3.0 app/tools:
- Added, updated, and deleted documents
  - triggered on versioning
  - includes changes to metadata (explicitly denoted in feed) - TODO: clarify - how?
  - does not include updates to tags (point to validate post 3.0 Community)
- Uploaded and expanded ZIP
- Added and deleted folders
- Added and removed members (person joined/left site)
- User role changes (change of user role for a site)
- New comments (on any artifact in a site, including documents, blog entries, etc.)
- Workflow-generated activities (requires explicit posting via customizing workflow definition)
- Added, updated, and deleted events (calendar entries)
- Published, updated, and deleted wiki pages
- Published, updated, and deleted blog entries
- Blog entry published to external blog engine
- New forum topics and new forum posts
User Connections
The activity service will need to be able to get the set of connections for a given user.
- each user will belong to zero or more sites/networks [M]
- each site will have one or more members (site has at least one site admin - a site represents a group (e.g. project) who have a common interest) [M]
- each user may manage a personal list of direct (mutually trusted) friends/colleagues [C]
- each user may have one or more indirect friends/colleagues (all members of networks they belong to) [C]
In summary, the list of connections for a user will be the set of all members of sites to which a user belongs. It could also include the list of direct colleagues (if available).
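As a minimal sketch of the summary above, the connection set can be computed as the union of the members of every site the user belongs to, plus any direct colleagues. The function name, parameter names, and sample data are hypothetical, and excluding the user from their own connection set is an assumption of this example.

```python
def connections(user, site_members, direct_colleagues=None):
    """Return the set of users connected to `user`.

    site_members: dict mapping site name -> set of member user names
    direct_colleagues: optional dict mapping user -> set of direct colleagues
    """
    connected = set()
    for members in site_members.values():
        if user in members:
            connected |= members  # all members of a site the user belongs to
    if direct_colleagues:
        connected |= direct_colleagues.get(user, set())
    connected.discard(user)  # assumption: a user is not their own connection
    return connected


# Invented sample data.
sites = {
    "project-x": {"alice", "bob"},
    "project-y": {"bob", "carol"},
    "hr": {"dave"},
}
bob_connections = connections("bob", sites)
```

Here `bob_connections` contains alice and carol, but not dave, who shares no site with bob.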
Assumptions:
- activities are typically raised within the context of a site (and optionally within the context of an app/tool)
- if an activity is raised outside of a given site context, then it will only be visible to others if they are direct colleagues
Activities Feed
- feed per user/site
  - each user is presented with a personal activities feed [M]
  - each site has an activities feed [S]
  - each user also has a 'my activities' feed (i.e. activities posted by the user) [W]
  - each group has an activities feed [W]
  - special group 'everyone' has an activities feed [W]
  - user can subscribe (opt-in) to specific activities [W]
- activities for a given user feed are filtered out (excluded)
  - if the activity meets any opt-out criteria for the user - see feed controls below [M]
  - if the user does not have read permission on the activity - see privacy controls below [C]
- each feed is of a finite number of activities
  - query limited to a maximum number of activities [M]
  - feed limited to activities of a maximum age - after which they will be purged [M]
  - feed limited to maximum number of activities - after which they will be purged [S]
  - above feed controls are initially system wide [M]
  - above feed controls are user specific [W]
NOTE: this may mean some activities are never seen by a user if the post frequency is high enough
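The feed limits can be sketched as follows. This is an illustration only: the function names and the two limit values are assumptions (the maximum feed size is kept deliberately tiny so the effect is visible), and a feed entry is reduced to a `(post_date, title)` tuple.

```python
from datetime import datetime, timedelta

MAX_AGE = timedelta(days=30)  # assumed system-wide maximum age
MAX_FEED_SIZE = 3             # assumed system-wide maximum feed length


def purge(feed, now, max_age=MAX_AGE, max_size=MAX_FEED_SIZE):
    """Drop entries older than max_age, then keep only the newest max_size."""
    fresh = [entry for entry in feed if now - entry[0] <= max_age]
    fresh.sort(key=lambda entry: entry[0], reverse=True)  # newest first
    return fresh[:max_size]


def query(feed, max_items):
    """Return at most max_items entries from an already purged feed."""
    return feed[:max_items]


now = datetime(2008, 7, 1)
feed = [(now - timedelta(days=d), f"activity {d} days old") for d in (40, 1, 5, 2, 10)]
kept = purge(feed, now)
```

In this run the 40-day-old entry is purged by age, and the 10-day-old entry is purged by the size limit even though it is recent enough, which is the situation the NOTE describes.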
Feed Presentation
- an activities feed may be rendered by one or more applications
  - Alfresco application component (AJAX/Flex Client)
  - A Feed Reader (RSS/ATOM compliant)
  - Custom application
- by default, an activities feed is presented in descending order of activity post date
- typically, an activities feed client will display the activities list as
  - collapsed; display of title only - user explicitly expands each item to reveal an activity summary
  - expanded; display of title and summary
- an activity summary provides a detailed view of an activity
  - each activity type has its own summary view
  - view may be rendered:
    - server-side (e.g. XHTML) for clients such as a Feed Reader
    - client-side (e.g. AJAX/Flex) for clients such as Alfresco Application (TODO: clarify this)
  - view may present information or action controls (e.g. view document)
  - an activity summary view may be customized by a customer (TODO: clarify)
  - activity type may generate multiple views, where each view has a different format
- each user may control how the news feed is presented (see query/sort criteria)
Feed Controls
The activity service must allow the user to manage opt-out feed controls (via the application) to reduce/filter the number of activities received.
- explicit filter - each user may control which activities are listed in their feed, based on opt-out criteria
  - any activity originating from a given site [M]
  - any activity originating from a given site and given application [M]
  - any activity originating from a given application (across sites) [S]
  - any activity initiated by a given user [W - revisit for 3.1]
  - any activity of a given type [W - revisit for 3.1]
- implicit filter - activities for a given user feed may be excluded
  - if the user does not have read permission on the activity - depends on site permission model [C]
- priority criteria - requirements currently unspecified [W]
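One way to picture the explicit opt-out filter is as a list of per-user controls, each naming a site, an application, or both. The sketch below covers only the site and application criteria; the class name `FeedControl`, the function `is_excluded`, and the sample names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FeedControl:
    """Hypothetical opt-out control: site only, app/tool only, or both."""
    site: Optional[str] = None
    app_tool: Optional[str] = None

    def matches(self, activity_site, activity_app):
        if self.site is None and self.app_tool is None:
            return False  # an empty control opts out of nothing
        site_ok = self.site is None or self.site == activity_site
        app_ok = self.app_tool is None or self.app_tool == activity_app
        return site_ok and app_ok


def is_excluded(controls, activity_site, activity_app):
    """True if any of the user's opt-out controls matches the activity."""
    return any(c.matches(activity_site, activity_app) for c in controls)


# Invented sample controls for one user.
controls = [
    FeedControl(site="project-x"),                   # everything from a site
    FeedControl(site="project-y", app_tool="wiki"),  # one app within a site
    FeedControl(app_tool="calendar"),                # one app across all sites
]
```

A control with only `app_tool` set matches that application in every site, which corresponds to the "across sites" criterion.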
Privacy Controls
The application could allow the user to manage opt-out privacy controls, to prevent certain activities being posted. This is also related to fine-grained permissions, which should provide implicit privacy control.
- activities must have implicit privacy controls, where appropriate, based on permissions [C]
- user can explicitly control which of their activities are private i.e. not posted/syndicated [W]
Non-functional
Scalable
The Activity Service should be designed and implemented to:
- provide a scalable activity feed
  - run in-process on a single server [M]
  - but is architected so that it can become distributed (e.g. grid nodes) [S]
- run in an Alfresco cluster
  - run without explicit configuration of specific cluster nodes [S]
  - distribute work across the Alfresco cluster [C]
Performance
The aim is to take Alfresco to the majority of users in an enterprise.
- sub-second response for 'retrieve user activity feed', regardless of the number of employees in the enterprise, user network topology, number of content items and frequency of activities
  - NOTE: It is not essential for the activity feed to reflect up-to-the-second activity posts. As with e-mail or other syndication feeds, an end user typically does not sit in their client checking for new items every second. However, a delay of 1 hour is probably too much. A delay of up to 10 minutes is probably OK.
- posting of an activity should have minimal impact on write operations
  - asynchronous posting is acceptable, and guaranteed delivery is not required
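A minimal sketch of asynchronous, best-effort posting, assuming an in-process bounded queue drained by a background thread. The class name, the queue size, and the `processed` list (a stand-in for real feed generation) are all inventions of this example, not a description of the eventual design.

```python
import queue
import threading


class AsyncActivityPoster:
    """Hypothetical poster: the caller never blocks, and overflow is dropped."""

    def __init__(self, max_pending=1000):
        self._pending = queue.Queue(maxsize=max_pending)
        self.processed = []  # stand-in for background feed generation
        worker = threading.Thread(target=self._drain, daemon=True)
        worker.start()

    def post(self, activity):
        """Queue an activity; return False if it had to be dropped."""
        try:
            self._pending.put_nowait(activity)
            return True
        except queue.Full:
            return False  # acceptable: guaranteed delivery is not required

    def _drain(self):
        while True:
            activity = self._pending.get()
            self.processed.append(activity)
            self._pending.task_done()

    def wait_until_idle(self):
        """Block until every queued activity has been processed (for tests)."""
        self._pending.join()


poster = AsyncActivityPoster()
accepted = poster.post({"type": "document-added", "user": "alice"})
poster.wait_until_idle()
```

Because `post` only enqueues, the cost added to the originating write operation is small, at the price of possibly losing activities when the queue is full.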
The result of 'retrieve user activity feed' is unlikely to be calculated on each user request; instead it will be pre-calculated in the background. Given this, the following figures provide rough estimates of how much data the activity service needs to handle.
The following figures assume the simplest implementation where each user feed is completely pre-calculated (i.e. no unions etc on query). Posts/Hr represents the number of user activity posts while Activities/Hr represents the number of generated activities to fulfill all relevant user feeds.
| Parameter | Value |
|---|---|
| Concurrent Writers (% of users) | 2 |
| Minutes between posts | 20 |
| Activities Filtered % (policies etc) | 0 |
| Activity Size (in bytes) | 512 |
| Activity History (per user) | 100 |

| Users | Concurrent | Posts/Hr | Friends | Network Members | Activities/Hr | Activities/Hr/Usr | Size(MB)/Hr | Activities/Sec | Total Storage (GB) |
|---|---|---|---|---|---|---|---|---|---|
| 10 | 0.2 | 0.6 | 5 | 5 | 6 | 0.6 | 0.00 | 0.002 | 0.0005 |
| 100 | 2 | 6 | 20 | 10 | 180 | 1.8 | 0.09 | 0.050 | 0.0048 |
| 1000 | 20 | 60 | 40 | 50 | 5400 | 5.4 | 2.64 | 1.500 | 0.0477 |
| 10000 | 200 | 600 | 80 | 100 | 108000 | 10.8 | 52.73 | 30.000 | 0.4768 |
| 100000 | 2000 | 6000 | 100 | 500 | 3600000 | 36 | 1757.81 | 1000.000 | 4.7684 |
| 1000000 | 20000 | 60000 | 150 | 1000 | 69000000 | 69 | 33691.41 | 19166.667 | 47.6837 |
Source Spreadsheet
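The figures above can be reproduced with a few lines of arithmetic. The formulas in this sketch are inferred from the published numbers rather than taken from the source spreadsheet, and the function and key names are hypothetical. In particular, how the "Activities Filtered %" parameter is applied is an assumption, since the table uses 0.

```python
def estimate(users, friends, network_members,
             concurrent_pct=2, minutes_between_posts=20,
             filtered_pct=0, activity_bytes=512, history_per_user=100):
    """Rough sizing for fully pre-calculated user feeds (inferred formulas)."""
    concurrent = users * concurrent_pct / 100
    posts_per_hr = concurrent * 60 / minutes_between_posts
    # Each post fans out to every friend and network member of the poster.
    fan_out = friends + network_members
    # Assumption: filtering removes a flat percentage of generated activities.
    activities_per_hr = posts_per_hr * fan_out * (1 - filtered_pct / 100)
    return {
        "concurrent": concurrent,
        "posts_per_hr": posts_per_hr,
        "activities_per_hr": activities_per_hr,
        "activities_per_hr_per_user": activities_per_hr / users,
        "mb_per_hr": activities_per_hr * activity_bytes / 2**20,
        "activities_per_sec": activities_per_hr / 3600,
        "total_storage_gb": users * history_per_user * activity_bytes / 2**30,
    }


row = estimate(users=1000, friends=40, network_members=50)
```

For 1000 users this gives 60 posts per hour fanned out to 90 connections each, i.e. 5400 activities per hour, matching the corresponding table row.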
Extension Points
The Alfresco system will provide out-of-the-box support for activities. However, the following extension points should be provided for the development of new Alfresco plug-ins and applications.
- ability to post custom activities for user-defined (uniquely named) activity type [S]
- definition of activity summary view(s) for given activity type - for presentation in clients - via activity templates [S]
- allow activity templates to be localised [S]
Ideally, the above definitions can be created/registered without re-starting the Alfresco system.
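The extension points could look roughly like a registry of localisable templates keyed by activity type. This sketch uses Python's standard `string.Template` purely for illustration; the registry class, its methods, the fallback-to-English rule, and the sample templates are assumptions, not a proposed Alfresco API.

```python
from string import Template


class ActivityTemplateRegistry:
    """Hypothetical runtime registry of summary templates per activity type."""

    def __init__(self, default_locale="en"):
        self._default_locale = default_locale
        self._templates = {}  # (activity_type, locale) -> Template

    def register(self, activity_type, text, locale=None):
        """Register or replace a template without restarting anything."""
        key = (activity_type, locale or self._default_locale)
        self._templates[key] = Template(text)

    def render(self, activity_type, data, locale=None):
        """Render a summary, falling back to the default locale if needed."""
        wanted = (activity_type, locale or self._default_locale)
        fallback = (activity_type, self._default_locale)
        template = self._templates.get(wanted) or self._templates.get(fallback)
        if template is None:
            raise KeyError(f"no template registered for {activity_type!r}")
        return template.safe_substitute(data)


registry = ActivityTemplateRegistry()
registry.register("document-added", "$user added $name")
registry.register("document-added", "$user a ajouté $name", locale="fr")
summary_fr = registry.render(
    "document-added", {"user": "alice", "name": "plan.doc"}, locale="fr"
)
```

Because registration is an ordinary method call on a live object, new activity types and templates can be added while the system is running, which is the behaviour the paragraph above asks for.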
Activities Service Specification
Public APIs
Summary
Alfresco clients at a minimum must be able to post an activity event and retrieve a corresponding activities feed. The APIs include:
- post an activity [M]
- retrieve user connections
  - site members [M]
  - direct friends [C]
  - indirect friends [W]
- retrieve feeds (supporting feed presentation requirements)
  - retrieve user activities feed [M]
  - retrieve site activities feed [S]
  - retrieve user 'my activities' feed [C]
- manage opt-out feed controls for a user
  - get feed controls for a user [M]
  - set opt-out by site [M]
  - set opt-out by site and app/tool [S]
  - set opt-out by app/tool (across sites) [C]
  - set opt-out by user [C]
  - set opt-out by path (across sites and app/tools) [W]
  - delete specified feed control [M]
Post Activity
Posting an activity could be:
- either: immediate - with all associated data [M]
- or: pending - with secondary repository lookup based on a reference (e.g. nodeRef) to get associated data [S]
NOTE: activities for deleted items will need to be immediate, since the associated data will not be available for subsequent lookup
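The difference between the two posting modes can be sketched as follows. The dictionary keys, the `resolve` function, and the in-memory `repository` standing in for the real repository lookup are all hypothetical.

```python
def resolve(post, lookup):
    """Return the associated data for a posted activity.

    post: dict with a 'type' and either 'data' (immediate post)
          or 'node_ref' (pending post, resolved by a later lookup)
    lookup: callable taking a node reference and returning a dict,
            or None if the node no longer exists
    """
    if "data" in post:
        return post["data"]  # immediate: everything was supplied up front
    data = lookup(post["node_ref"])  # pending: secondary repository lookup
    if data is None:
        raise LookupError(f"node {post['node_ref']!r} is no longer available")
    return data


# Invented stand-in for the repository: one node still exists.
repository = {"node-1": {"name": "plan.doc"}}

pending_add = {"type": "document-added", "node_ref": "node-1"}
immediate_delete = {"type": "document-deleted", "data": {"name": "old.doc"}}
pending_delete = {"type": "document-deleted", "node_ref": "node-2"}
```

Resolving `pending_delete` fails because the node is already gone, which is why the NOTE requires deletions to be posted in immediate mode.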
Retrieve Activities Feed
Retrieve feed for a user or site.
- user feed to get feed entries for currently logged in user [M]
- site feed to get feed entries for specified site - public site accessible by everyone, private site requires currently logged in user to be a member of the site [M]
- admin user should be able to get feed entries for a specified user feed or site feed [S]
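The access rule for site feeds described in the list above can be written as a small predicate. The function name, the dictionary layout of a site, and the sample sites are assumptions made for this sketch.

```python
def can_read_site_feed(user, site, is_admin=False):
    """Decide whether `user` may retrieve the feed of `site`.

    site: dict with 'visibility' ('public' or 'private') and 'members' (set)
    """
    if is_admin:
        return True  # admin may read any specified site feed
    if site["visibility"] == "public":
        return True  # public site feeds are accessible by everyone
    return user in site["members"]  # private: members only


# Invented sample sites.
public_site = {"visibility": "public", "members": {"alice"}}
private_site = {"visibility": "private", "members": {"alice"}}
```

With these samples, bob can read the public site's feed but not the private one, unless the call is made with `is_admin=True`.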
Feed will be sorted in descending date order by default, and will also be pre-filtered by the feed generator based on feed (opt-out) controls.
It is assumed that the client can perform additional filtering/sorting, although we could also allow the client to query/filter on some of the following:
- originating site [S]
- originating app/tool [C]
- originating user [C]
- activity type [C]
- date/time [C]
- date/time range [C]
- paged [C]
- max items (