This long blog post explains how we are determining the future of shared network drive support in Alfresco Content Services. Specifically, we are considering dropping support for using the SMB protocol and instead increasing our investment in WebDAV for these use cases. We want to explain our reasoning and seek your feedback.
Analysis of the Problem
Alfresco has long worked to make the power of enterprise applications available to users who don't want to understand all the details of an ECM or BPM system. Early in the product's life, Alfresco made the Content Repository accessible as a shared network drive so that knowledge workers could receive the benefits of ECM while continuing their habit of "throwing everything in the Z: drive". We ended up implementing this capability three separate times: the CIFS dialect of SMBv1 in our JLAN module, standards-compliant WebDAV, and the Windows-specific WebDAV implemented in our AOS module. But the broad adoption of this capability has made it worthwhile.
That conversation mentions a number of ways to implement SMBv3, all of which we investigated: upgrading our current implementation, using 3rd party open source libraries (there aren't any mature implementations of the server), implementing a storage back-end for Samba, and using proprietary libraries. Recognizing that the effort involved in implementing SMBv3 would slow down our progress on the other priorities described on our Content Repository Roadmap 2017, we also looked at alternative ways to meet the same use cases.
WebDAV is an obvious choice to replace SMB, as our implementation is mature and it is widely used by our customers. It is also more robust than SMB on high-latency networks, such as when the Content Repository is deployed in a cloud environment like AWS, which is an increasingly common use case. In many ways, WebDAV is a better fit for ECM use cases than SMB, which is intended to be used by high-performance filesystems. Customers who attempt to use ACS as a file server are sometimes disappointed because a content repository makes a different set of trade-offs from a file server; it has many more capabilities but lower total throughput. Specifically, a file server uses SMB to allow mid-file access and high-performance operations by exposing raw file handles to client applications, but this is not possible when the content is encrypted, is stored in an object store like S3 or Centera, or is stored in a cheap, high-latency, infrequent-access storage tier.
Many of the use cases where customers have expressed a preference for SMB over WebDAV require high-frequency mid-file access. These use cases are not suitable for pulling directly from a content repository because they don't allow it to perform ECM functions. Instead, customers should synchronize the desired content to the client machine and back to the repository once work on the file is finished. Our proprietary Desktop Sync (https://community.alfresco.com/community/ecm/blog/2016/12/02/desktop-sync-100-ga-for-windows?sr=sear...) offers this capability, and is one of many solutions provided by both Alfresco partners and the open source community that can be used for this purpose.
Though our analysis suggests that WebDAV is an adequate replacement for SMBv1 in most use cases, we wanted to hear from a larger set of customers.
We sent a survey to 150 customers who have previously indicated that they use either the CIFS or WebDAV shared network drives, and 52 responded. Important findings included:
Concerns with SSO access to shared network drives. NTLMv1 is also insecure, and the survey showed that Kerberos is much more widely used.
We specifically asked customers why they don't use WebDAV in every circumstance, and a few key reasons surfaced:
Concerns about performance: WebDAV does not perform as well as CIFS on a local network. Part of this performance is due to the mid-file access that CIFS can provide, but we believe there is room for optimization in our implementation which will help address this concern.
Concerns about compatibility: Some applications, such as Adobe products, struggle when accessing large files over WebDAV. One reason why CIFS performs better for these applications is the direct file handles for mid-file access that we discussed earlier. The second reason is that our CIFS implementation intelligently handles the file shuffling these applications do during write operations. We plan to port this shuffling from our CIFS implementation to our WebDAV libraries.
Concerns about path length: Multiple customers reported that the Windows 255-character path limit affects WebDAV folders. Our plan is to use repository shortcuts to make it easy to mount deep folders on short paths.
The largest file that can be shared with WebDAV is 4GB. Desktop Sync is a better way to work with such files, as working with a 4GB file over WebDAV would require many round-trips of the full file.
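To illustrate the path-length concern raised above, here is a minimal sketch of why mounting a deep folder directly shortens the path a Windows client sees. The `client_path` helper, the folder names, and the `Z:` drive letter are hypothetical, and the 255-character figure is simply the one quoted in the survey feedback; none of this reflects how the repository shortcuts will actually be implemented.

```python
from pathlib import PurePosixPath, PureWindowsPath

PATH_LIMIT = 255  # figure quoted in the survey feedback; treated as an assumption


def client_path(repo_path: str, mount_point: str = "/", drive: str = "Z:") -> str:
    """Return the path a Windows client would see for `repo_path` when the
    repository folder `mount_point` is mapped to `drive` (hypothetical helper)."""
    relative = PurePosixPath(repo_path).relative_to(mount_point)
    return str(PureWindowsPath(drive + "\\", *relative.parts))


# Hypothetical deep structure: ten nested folders containing one document.
folders = ["Department Shared Documents"] * 10
deep_file = "/" + "/".join(folders) + "/report.docx"
deep_mount = "/" + "/".join(folders[:8])  # mount eight levels down instead of at the root

from_root = client_path(deep_file)
from_shortcut = client_path(deep_file, mount_point=deep_mount)

# Print each path length and whether it stays within the limit.
print(len(from_root), len(from_root) <= PATH_LIMIT)
print(len(from_shortcut), len(from_shortcut) <= PATH_LIMIT)
```

The same document exceeds the limit when the drive is mapped to the repository root but fits comfortably when the drive is mapped to a folder closer to it.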
For those who are interested, here are the detailed results from the survey. Note that the questions are usually multi-select, so results do not add to 100%. Also, there was an "Other" option where respondents could enter additional text, which accounts for the last few results in many questions. I apologize for the truncation in the answers.
Truncated options: Shared network drive, Custom application, An official Alfresco Connector, Publication through a web portal or public web site, Other: From Jive or Liferay Portlets.
Truncated options: Engineering Designs, Graphic Design, Other: all types of research files.
As a result of this analysis and research, we intend to take the following actions:
It is expected that Alfresco Content Services 5.2 will be the last release with a CIFS implementation. Along with retiring CIFS, we will be retiring NTLM and the ACS Windows Explorer shortcuts. Instead we will recommend the use of WebDAV and Kerberos.
We will compensate for the identified shortcomings in WebDAV by:
Implementing smart file shuffling with WebDAV to increase compatibility with commonly used applications.
Making it easy to deep-link into the Alfresco Content Repository over WebDAV to avoid issues with path length.
Focusing on improving WebDAV performance at scale.
Continuing to improve the performance of Desktop Sync when used with very large files.
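For readers unfamiliar with file shuffling, the sketch below shows one common save pattern and how a repository might treat it as a new version of the same document rather than a delete followed by a create. This post does not describe the actual algorithm, so everything here is an assumption for illustration: the `Node` and `ShuffleAwareFolder` classes are hypothetical, the pattern (write a temp file, rename the original to a backup, rename the temp file into place, delete the backup) is only one of several that desktop applications use, and a real implementation would also need to expire remembered names after a short time.

```python
from dataclasses import dataclass, field
from itertools import count

_next_id = count(1)


@dataclass
class Node:
    """Hypothetical repository node: current content, a stable id, older versions."""
    content: bytes
    node_id: int = field(default_factory=lambda: next(_next_id))
    versions: list = field(default_factory=list)


class ShuffleAwareFolder:
    """In-memory stand-in for a folder that recognizes a rename-based save."""

    def __init__(self):
        self.files = {}    # file name -> Node
        self.vacated = {}  # name a node was renamed away from -> that Node

    def create(self, name, content):
        self.files[name] = Node(content)
        return self.files[name]

    def delete(self, name):
        del self.files[name]

    def rename(self, src, dst):
        moved = self.files.pop(src)
        original = self.vacated.pop(dst, None)
        # Where does the node that used to be called `dst` live now, if anywhere?
        backup_name = next((n for n, f in self.files.items() if f is original), None)
        if backup_name is None:
            # Ordinary rename; remember the vacated name in case a shuffle follows.
            self.files[dst] = moved
            self.vacated[src] = moved
            return
        # Shuffle detected: keep the original node (id, history) under its old
        # name, store the new bytes as its current content, and leave a
        # throwaway copy of the old bytes under the backup name.
        previous = original.content
        original.versions.append(previous)
        original.content = moved.content
        self.files[backup_name] = Node(previous)
        self.files[dst] = original


# The save sequence an application might issue:
folder = ShuffleAwareFolder()
doc = folder.create("report.doc", b"v1")
folder.create("~tmp01", b"v2")             # new content written to a temp file
folder.rename("report.doc", "report.bak")  # original moved aside
folder.rename("~tmp01", "report.doc")      # temp file moved into place
folder.delete("report.bak")                # backup removed
```

After this sequence, `report.doc` is still the node created at the start, now holding the new content with the old content kept as a previous version. Without shuffle handling, the client's renames would leave `report.doc` pointing at a brand-new node and the original's history would be deleted along with the backup.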
We are also considering SAML support for shared network drives, though that work is not currently scheduled and won't be available in the next release.
If there is sufficient customer demand for an SMBv3 implementation, we will reconsider that development effort. In order to lower the cost of development, it is likely that we would leverage a proprietary 3rd party library. As such, any future SMBv3 functionality is not expected to be part of our open source offerings.
I look forward to discussing the implications of this change in the comments below.