We need a free software license for distributed systems

It is good for users to be able to run, modify, distribute, and distribute modified versions of the software they use. If for no other reason, such abilities make software much better and much cheaper, as our experience with free software has shown. Everyone can agree this is good.

When software runs on a single local machine and the user has access to its source code under a free software license, it is easy for the user to run modified versions of the software and to distribute those modified versions to others for collaboration. That collaboration creates a feedback loop that makes the free software even better.

When software runs as a service, as part of a broader distributed system on nodes not controlled by the user, it is relatively difficult for the user to run and distribute modified versions of the software, even with access to the source code under a free software license.

Such software is more difficult to run because the user must, in effect, administer their own distributed system, composed of the service, other supporting services, and clients for that service, all spread across multiple nodes.

It's known that it's more difficult to run distributed systems than local software, but why?

In some respects, this is a simple technical problem: We have a good understanding of how to write and run individual processes on a single machine, and of how even novice users can administer such a machine with relative ease. But we have little understanding of how to do the same for distributed systems, spread across multiple nodes.

In some other respects, however, this is a social problem: The scripts, software, and techniques that one user (who may be a single person or an entire organization) develops to run a specific distributed system are typically not shared with other users. And, crucially, those scripts and techniques are not available to the end users of a service. If an end user wishes to run the service themselves, they start from nothing and have to recreate the distributed system from scratch.

Now, it's not clear how useful such scripts would be. This comes back to the technical issues: We don't understand how to run distributed systems very well at a theoretical level.

However, imagine a world where many large users (Facebook, Amazon, Netflix, Google, etc.) published the scripts they use to run their distributed systems. The most useful, portable, and flexible of those scripts would be adopted by others and extended to be even more useful. This is the same dynamic that we see for all free software.

Linux is not "well-designed" on a theoretical level. However, in practice, it's high quality, relatively easy to use, and used extremely widely for many different purposes. It made it to this state because it was free software, and everyone contributed to the shared Linux codebase; they had to, if they wanted to ship products using Linux.

Today, there's no such feedback loop for software for running distributed systems. The software that's actually used to run the largest distributed systems isn't available, even though it incorporates large amounts of free software in order to provide services. We might use those services, but we're not allowed to see or use the code for the distributed systems that run them.

When a piece of "local" software links against a free library licensed under the GPL and is then distributed, that software is required to provide its users all the freedoms of the GPL for the entire combined work. This ensures that free software continues to be improved by its largest, most well-resourced users. Again, this has been most famously effective with Linux.

We can write similar licenses for distributed systems. When a distributed system links against a free service so licensed, that system would be required to provide its users with all the freedoms of free software. The users would have the ability to run and modify their own copies of the system, and make it better, just like with Linux. Such a license would be a free software license; this would be a provision essentially identical to copyleft.

Some are concerned that such a license would not allow a service to be hosted on top of proprietary cloud hosting; an end user would not be able to use the APIs of a proprietary distributed system to run the service. That's something that the GPL dealt with, too; it contains a system library exception, which allows linking against proprietary "system libraries" that are part of proprietary operating systems. Our new distributed license could have a similar exception to allow use with proprietary cloud hosting.

The proponents of the SSPL claim it is an attempt to write such a free software license. The detractors claim it's something completely different. I don't really care; I just want a free software license which requires that, when licensed software is used to provide a service as part of a larger distributed system, the users of that service must be provided the freedom to run and modify their own copies of that distributed system. If that's not the SSPL, then we should work on a new such license.

The AGPL was intended, in part, to guarantee this freedom to users, but it has failed. Proprietary distributed systems frequently incorporate AGPL software to provide services. Their operators believe that as long as the individual process that provides the service complies with the AGPL, the rest of the distributed system does not need to comply; and it appears that the legal world agrees.

I want users to be able to run high-quality distributed systems easily. There are lots of technical advances that we can pursue to make this easier, but we should also pursue social remedies. We will never have a large collection of low-cost, high-quality distributed system software unless the free software feedback loop starts running for distributed systems.