Docker Alfresco Resources?

josh_barrett · ‎16 Mar 2017

I am trying to figure out how to Dockerize Alfresco Enterprise (5.1 or higher) and am not sure exactly where to start.

The company I work for has Docker infrastructure built out to production and some critical applications running on it. I have been asked to try to get our Alfresco up and running in Docker containers.

I don't think Alfresco has a Docker Trusted Registry that is open to pull from. I did see some projects out there but not sure where to go. Even an example Dockerfile would be a great start.

I hear Alfresco uses Docker internally for development. I haven't seen any documentation or artifacts on this.

It could be I am not looking in the right place.

Any help getting me started would be greatly appreciated.

angelborroy · ‎16 Mar 2017

Alfresco has only internal Dockerfile configurations, which it does not plan to share with customers or external developers.

There are other Community resources for Docker like this one: https://github.com/keensoft/alfresco-docker-template

Hyland Developer Evangelist

resplin · ‎17 Mar 2017

I am excited to see the discussion around this question, because I would like to hear what people are doing with Docker.

One of my product responsibilities is the installation experience for Alfresco Content Services. We recognize that the installer is rather dated, and we have looked at alternatives including distributing Alfresco as a set of official Docker containers. I have two concerns:

* Is Docker widely adopted enough that it can be considered acceptable to require new users of Alfresco (potentially non-technical people) to set up Docker in order to obtain Alfresco?
* Is Docker mature enough that when someone takes one of our images and puts it in production they are likely to be successful? (This blog post has me particularly nervous.)

We have recently had a few customers (such as josh_barrett) tell us that they are betting on Docker, which has us motivated to move forward faster.

Our engineering team has experimented a bit with Docker. Particularly, we have three projects in different stages of planning:

* Converting our internal developer and QA tooling to use Docker images. The work on the tooling is done, but the images are just monolithic containers built with chef-alfresco. We figured we would improve the images later.
* Producing a developer oriented Dockerfile that can be consumed by new developers as part of the SDK. We have made progress on the image for Alfresco Process Services (Activiti), but haven't yet started working on Alfresco Content Services (Alfresco Community Edition).
* Creating a "reference deployment", which is a set of docker images that the documentation team can refer to as a reference for a production deployment in a variety of scenarios. We are currently evaluating what this sort of effort would look like if we were to tackle it this year, but we don't yet have a team committed to it.

I know that the Order of the Bee's Honeycomb Edition of Alfresco has also experimented with docker.

A few useful conversations in that community:

[OOTB-hive] Thoughts on Honeycomb that need your input (Richard Esplin)

[OOTB-hive] Thoughts on Honeycomb that need your input (Andreas Steffan)

GitHub - marsbard/docker-alfresco: Containerised Alfresco

Other unofficial Docker images:

https://hub.docker.com/r/pdubois/docker-alfresco/

GitHub - DrWolf-OSS/docker-alfresco

https://hub.docker.com/r/gui81/alfresco/

https://github.com/gui81/docker-alfresco

I am very interested in hearing what you are planning to do with Docker, and how you think an official Alfresco Docker image should look.

* Edited to fix some typos and clarify a confusing sentence.

jamen · ‎17 Mar 2017

Docker being a target platform is of interest, but I'd also like to take a step back on this and consider containers generally. How does one scale Alfresco components elastically UP or HORIZONTALLY? What are the barriers for this using community tools as well as more closed Enterprise environments?

To that end I'm very interested in a container-type strategy that allows one to have:

* Centrally controlled and common component installation builds

* Instance specific configuration that can be registered in a central configuration manager (like Zookeeper) or simpler instance centric configuration technique (e.g. environment variable)

Some of the challenges I've seen so far (which could be to do with constraints in the environments I've seen) are as follows:-

* Component configurations seem to be more declarative in nature rather than dynamic

* Limitations in some environments can exacerbate the former (e.g. on clustering some environments require TCP for clustering as flooding the network with packets is frowned upon)

* How configurations are performed don't easily allow for injection at a container level (e.g. surf configuration sets properties in Share for userHeader and URL's in an XML file, the best that can be done is injecting in a filepath or URL to make things more dynamic)

So it would be great to ensure that when we are talking about containers, we do consider the macro-architecture and associated configurations. If this is sound (as a checkpoint), then the ensuing implementation, Docker or otherwise, will follow.

angelborroy · ‎18 Mar 2017

And don't forget about the one I referred to in my reply, which is based on work by Mikel Asla that was presented at BeeCon 2016: BeeCon 2016 >> Talks

This template was also picked by Alfresco (Enzo Rivello) to start with its own development last year...

Hyland Developer Evangelist

pdubois · ‎25 Mar 2017

I have explored Docker's possibilities a bit with support in mind. One problem in support is reproducing complex client architectures with limited resources. In that situation containers can certainly help.

* Centrally controlled and common component installation builds

Possible answer: It is possible to configure automatic builds of images, see: https://github.com/pdubois/docker-alfresco Every time something is pushed to GitHub, or a new tag is created, the corresponding image is built automatically on Docker Hub, see: https://hub.docker.com/r/pdubois/docker-alfresco/tags/ Docker Hub is configured to detect events from GitHub for the specific project.

It is possible to create distinct images for every Alfresco component, i.e. SOLR, Alfresco, file transfer receiver, … The general idea is to have images in which those components are installed, and to have those images tagged and built automatically. When deploying a configuration you can use docker-compose to assemble your components, connect them, express dependencies and scale them vertically or horizontally (see also docker swarm). A small example is provided in « docker-alfresco », combining an Alfresco container with an external database container.

Another example can be found here https://hub.docker.com/r/pdubois/solr6/ It deploys a stack made of 3 containers (DB, Alfresco, Solr6 search server 1.0.0).

Note: another advantage is that some pre-built images from Docker Hub can be used as building blocks for your deployments, meaning you do not have to maintain them (e.g. DB, nginx, ...).

In the experimental docker-alfresco project, installation is done during the image build phase and modules are installed during initial startup. Modules are copied from your host into the image and installed unattended. If you need another set of modules, rebuild the image, stop the running container, delete it and start it again with new parameters.

The containers must be throw-away (disposable) containers. All Alfresco-related state is stored outside of the container on container data volumes. This is not shown in the yml examples, but only small changes are required to do so.
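A minimal sketch of the kind of stack described above: one Alfresco container, one external database container, a declared dependency, and state kept on named volumes. It builds the Compose-style structure as a Python dict and prints it as JSON (YAML parsers generally accept JSON, but check this against your docker-compose version). The image name pdubois/docker-alfresco comes from this thread; the postgres image, the volume names and the Alfresco data path are assumptions for illustration only.

```python
import json


def build_stack(alfresco_image="pdubois/docker-alfresco", db_image="postgres"):
    """Return a Compose-file-like dict: one Alfresco service plus one database."""
    return {
        "version": "2",
        "services": {
            "db": {
                "image": db_image,
                # Database files live on a named volume, not in the container.
                "volumes": ["db_data:/var/lib/postgresql/data"],
            },
            "alfresco": {
                "image": alfresco_image,
                # Express the dependency so the database is started first.
                "depends_on": ["db"],
                # Hypothetical mount point; check where your image keeps alf_data.
                "volumes": ["alf_data:/opt/alfresco/alf_data"],
            },
        },
        # Named volumes outlive any single (disposable) container.
        "volumes": {"db_data": {}, "alf_data": {}},
    }


if __name__ == "__main__":
    print(json.dumps(build_stack(), indent=2))
```

Because the containers hold no state of their own, they can be deleted and recreated from the same images without losing content or database data.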

* Instance specific configuration that can be registered in a central configuration manager (like Zookeeper) or simpler instance centric configuration technique (e.g. environment variable)

Possible answer:

All the configuration is passed to the container as parameters and is seen from inside the container as environment variables. Before startup, scripts apply that configuration to the Alfresco configuration files by substituting or adding settings. This is done when executing the entry point of the container. A script substitutes the environment variables into the configuration files, so the same image with the same parameters always produces a container configured the same way.
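The substitution step can be sketched as follows. This is not the script used in docker-alfresco, just a minimal illustration using Python's string.Template; the variable names DB_HOST and DB_PORT and the template line are assumptions.

```python
import os
from string import Template


def render_properties(template_text, env=None):
    """Replace ${NAME} placeholders with values from the environment.

    Raises KeyError if a placeholder has no matching variable, so a
    misconfigured container fails at startup instead of running half-configured.
    """
    env = os.environ if env is None else env
    return Template(template_text).substitute(env)


if __name__ == "__main__":
    # Hypothetical template line and variable names, for illustration only.
    template = "db.url=jdbc:postgresql://${DB_HOST}:${DB_PORT}/alfresco\n"
    print(render_properties(template, {"DB_HOST": "db", "DB_PORT": "5432"}), end="")
```

Note that string.Template treats every `$` as the start of a placeholder, so a literal dollar sign in the template has to be written as `$$`.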

* Component configurations seem to be more declarative in nature rather than dynamic

Possible answer :

It is true that Dockerfiles and the yml files for docker-compose are declarative, but they can be made more dynamic using variable substitution. Inside your containers you can have scripts that are triggered during distinct phases of the container life cycle. You can use docker swarm to scale up or down.

To centralize, orchestrate, scaling up and down your deployments you can use

* Limitations in some environments can exacerbate the former (e.g. on clustering some environments require TCP for clustering as flooding the network with packets is frowned upon)

Answer :

You can create a dedicated network for all of your deployments. Docker-compose implicitly creates a network for every instance of your stack that is deployed, which prevents the issue from occurring. See: https://docs.docker.com/engine/userguide/networking/

Another possibility is to use Docker Cloud: https://cloud.docker.com Docker Cloud can use yml files similar to those of docker-compose, but it can also manage your infrastructure to some extent. Docker Cloud is a cloud application (SaaS) that is compatible with multiple cloud providers, see https://docs.docker.com/machine/get-started-cloud/ but you can also bring in your own nodes hosted on premise.

* How configurations are performed don't easily allow for injection at a container level (e.g. surf configuration sets properties in Share for userHeader and URL's in an XML file, the best that can be done is injecting in a filepath or URL to make things more dynamic)

Possibility:

Alfresco Modules containing configuration can also be used.

jlesage · ‎5 May 2017

After reading this discussion, I took the time to share the work I have done on Alfresco and Docker. I built 3 images, one for each main component (Alfresco, Share and Solr). All are built automatically by Docker Cloud. These images are smaller than images based on the Alfresco bundle and are easily extensible to add custom configuration or AMPs.

You can find details on my project page : http://jeci.fr/projets/alfresco-docker-cloud.html

This solution is very flexible. Developers can try it locally with docker-compose, or you can make a docker stack to run it in a swarm. But in swarm mode you have to take care of persistence. For the database there are solutions like Galera. For the ContentStore you can use the Alfresco S3 connector, and I have developed connectors for Swift, Ceph and OpenIO that I will open source soon. For Solr I have not found an easy solution.

Currently I am working on a LibreOffice container, which is almost complete.

Resources :

Using Docker — Galera Cluster Documentation

My connector announcements (in French, sorry):

Connecteur Alfresco OpenIO

Connecteur Alfresco Ceph

Connecteur Swift pour Alfresco

Jérémie Lesage
https://jeci.fr

resplin · ‎12 May 2017

The new Docker containers for Alfresco Process Services 1.6 allow injecting configuration through properties:

Installing Process Services using Docker | Alfresco Documentation

It would be interesting to use the same approach for Alfresco Content Services containers.

jamen · ‎13 May 2017

It has taken me some time to respond to this. These are all very productive suggestions for improving the product. There are some design changes I think we can feed into the stack, though, based on a current project we are working on to roll out a farm of Alfresco services. Although some of the above are practical solutions in smaller enterprise environments, they are not practical in all.

One observation is that some organisations with a service-based deployment model may offer an a la carte choice of One, Solr, Share, or Activiti. In such a scenario, which components are primary and which are secondary? If the customer procurement experience is "push a button and get a One repository", Solr or Share might be optional. But there is a chicken-and-egg sort of problem with Share and Solr: configuration is declarative in such a way that, at process instantiation, you may not know where to resolve another component. Sure, it's true that Solr and Share depend functionally on One from startup, but the same can't be said of One. So I would ask whether some configurations should be settable after process instantiation rather than at bootstrap. This might be possible already, but it's certainly not the standard way most of us deal with our setups, where we normally set the properties on all components before we start anything up.

On the network side of things, a lot of larger enterprise environments just don't allow the creation of a new network. In fact, I've seen environments where only TCP transport is allowed for applications like Alfresco. This then relies on the application resolving the appropriate network interface to use (if there are multiple bind addresses). This is potentially not an easy problem if you're just being dynamically given a container with multiple network interfaces.
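One way to approach the interface-resolution problem, sketched under the assumption that the cluster subnet is known in advance and passed in as configuration: given the addresses the container has been handed, pick the one that falls inside the allowed subnet. The function name, the example addresses and the subnet are all hypothetical; how the candidate addresses are discovered is left out.

```python
import ipaddress


def pick_bind_address(candidates, allowed_cidr):
    """Return the first candidate address that lies inside allowed_cidr.

    Raises LookupError when no interface is on the cluster subnet, so the
    caller can fail fast instead of binding to the wrong network.
    """
    network = ipaddress.ip_network(allowed_cidr)
    for addr in candidates:
        if ipaddress.ip_address(addr) in network:
            return addr
    raise LookupError("no candidate address in " + allowed_cidr)


if __name__ == "__main__":
    # Hypothetical addresses of a container attached to two networks.
    print(pick_bind_address(["172.17.0.2", "10.20.0.7"], "10.20.0.0/16"))
```

The chosen address could then be injected into the clustering configuration at container start, so the image itself stays independent of any particular network layout.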

Lastly on application layer configuration, it's quite easy to override and extend surf configuration capabilities such that container centric (instance specific) files are includeable rather than having to build a bespoke war for every installation.

resplin · ‎3 Jan 2018

I recognize that we have gone a while without communicating about our work on this project, so I posted an update here: ‌