Ben Collins-Sussman posted an interesting blog entry, “Subversion’s Future?”. One of its main points is that while distributed source control systems are OK for smallish or open-source projects, Subversion’s sweet spot is huge projects. I couldn’t disagree more, and here’s my response.
I’ve been using distributed source control systems for more than a decade, and I’ve been watching other big projects use them. It seems to me that the bigger the project, the more benefit a DVCS provides. What are the characteristics of a huge project? In most cases, it means that there is a big team working on it. A big team means a global team, spread all over the world. This is not an “open-source thing”; it is a reality of corporate software development too. Most companies (well, at least those that actually produce huge projects) are global companies, with offices in the U.S., Europe, India, etc. Working globally against a central repository *is* painful and slow. Have you tried to bisect a regression introduced between two releases, switching between many revisions along the way? Have you tried to follow the history of some piece of code in Subversion? Doing this when the main repository is overseas is not fun, because every one of those steps has to go over the network to the server.
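To put a rough number on the bisect case, here is a minimal sketch of the binary search that a bisect performs. The revision range and the `is_bad` test are made up for illustration, and the sketch assumes the first revision is known good and the last one known bad. Each loop iteration stands for one "switch to this revision and test it" step, which is a round trip to the server when the history lives only in a remote central repository.

```python
def bisect_first_bad(revisions, is_bad):
    """Find the first bad revision by binary search.

    Assumes revisions[0] is known good and revisions[-1] is known bad.
    Returns (first_bad_revision, number_of_revisions_checked_out).
    """
    lo, hi = 0, len(revisions) - 1  # lo: known good, hi: known bad
    checkouts = 0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        checkouts += 1  # one "switch to revision and test" step
        if is_bad(revisions[mid]):
            hi = mid
        else:
            lo = mid
    return revisions[hi], checkouts


# Hypothetical history: revisions 1000..2000, regression introduced in 1537.
revisions = list(range(1000, 2001))
first_bad, checkouts = bisect_first_bad(revisions, lambda r: r >= 1537)
print(f"first bad revision: {first_bad}, revisions checked out: {checkouts}")
```

For a thousand revisions that is about ten checkouts. With a full local copy of the history each one is a local operation; against a remote server each one pays the network latency.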
A big team of engineers is typically organized as a hierarchy of smaller sub-teams, each focusing on a particular area of the product. It is much more natural to organize a hierarchy of workspaces that matches the structure of the organization. There are many benefits to that approach. The members of a sub-team mostly care about their own area and don’t touch or change anything elsewhere. They also know their code best, so they can find and fix new bugs and regressions faster. Typically, each sub-team has its own QA force, trained and specialized in testing that particular area. Once QA has tested and OK’ed a particular state of the team’s code, it can be pushed upwards to the integration workspace. Members of other teams are thus not disrupted by local problems and regressions, since they only receive the more stable, better-tested code. Distributed source control lets you do all of this beautifully and naturally. Doing it in Subversion is seriously painful.
Also, what are those mythical huge projects that nobody knows about? How about OpenSolaris or Linux? Are they “huge enough”? How about Mozilla or Ubuntu or NetBeans or JDK or MySQL? All of these projects use distributed source control tools. Solaris and the Java SDK were open-sourced only recently. Before that they were, by any measure, huge commercial software projects, each with many, many years of development by hundreds of people, and they were developed with a distributed source control system. There *is* a reason why Linux never used CVS or Subversion, and why even a commercial, non-open-source distributed system was used to develop Linux (since there were no good open DVCSs at the time). The reason, of course, is that distributed source control helps manage the overwhelming complexity of a huge project much better than a centralized system does.
Another point in Ben’s blog entry was about the usability and ease of use of Subversion. Yes, it is easier to use in simple scenarios, but as the size of a project grows, it gets harder and harder. Besides, if engineers are smart enough to develop and maintain a huge project, adjusting to a distributed source control system should be a piece of cake!
And for those folks who are stuck with Subversion, there is a great tool, git-svn, that lets you leverage the power of distributed source control while still working with a centralized Subversion repository.
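For anyone who hasn’t tried it, a typical git-svn round trip looks roughly like the sketch below. The repository URL and the directory name are placeholders, not a real server, and the small Python helper around the commands is purely illustrative. By default it only prints the commands; in real use you would make your local Git commits between the `rebase` and `dcommit` steps.

```python
import subprocess


def git_svn_round_trip(svn_url, workdir):
    """Return the git-svn commands for one clone / sync / publish cycle."""
    return [
        # Import the Subversion history into a local Git repository.
        ["git", "svn", "clone", svn_url, workdir],
        # Fetch new Subversion revisions and replay local commits on top.
        ["git", "-C", workdir, "svn", "rebase"],
        # Send each local Git commit back to Subversion as a revision.
        ["git", "-C", workdir, "svn", "dcommit"],
    ]


def run(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv in commands:
        print(" ".join(argv))
        if not dry_run:
            subprocess.run(argv, check=True)


# Placeholder URL and directory name, for illustration only.
commands = git_svn_round_trip("https://svn.example.com/repo/trunk", "repo-git")
run(commands)
```

Everything between the clone and the final `dcommit` (branching, committing, browsing history, bisecting) happens locally, which is exactly the part that hurts when the Subversion server is far away.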
I’m not really saying that Subversion is “bad”. It was actually great for its time, but now there are better and smarter tools out there.