When picking a library, check how it's version-controlled
Even if you just want to write a modest desktop version of Hello World, you’ll realize that you need tens of external libraries. You may end up using tons of them. Picking the right library is an art. You have to carefully balance many different aspects. How mature is it? How actively is it developed? Does it have a community? If so, how responsive is it? Are they nice and competent people? Does it have a fancy website? Does it conform to the most recent 2003 W3C recommendation? There are many more questions to answer. I want to explore a single, less emphasized aspect: what version control system does it use?
Depending on your project, libraries fall into one of four categories.
- Spades, or batteries that should be included
- Diamonds, or the fancy stuff
- Clubs where no one knows what’s going on
- Hearts that lie closest to you
A library may fall into one category for a certain project and into another one for a different project. Let’s have a closer look at these suits.
Spades, or batteries that should be included
These are the libraries that should be part of your development platform. Maybe they will be included in the next release. Or a simpler version is already included, but you need an extra feature supported by this library only. Or it just provides a usable API instead of the flawed one in your platform; you can find an alternative implementation for almost any package in the java.* namespace. Apache Commons usually belongs to this category.
Diamonds, or the fancy stuff
These are libraries that provide a specific feature which is used by your project, but is not absolutely necessary. It may be important to have beautiful reports generated by your application, but reports are only one item on your long list of features. Or to give a more technical example, being able to import JSON data into your database backend might come in handy in some cases, but it’s not a must. If, however, you’re writing a generic database loader, that fancy JSON importer becomes a must.
Hearts that lie closest to you
There are a few libraries that are intimately linked to your project. There may be none, and there are probably not more than one or two. You couldn’t move without them, and you can consider yourself lucky they exist. An “intimate” library can be anything, depending on your project. It may be a canvas implementation if you’re working on a vector drawing program, or a geospatial database plugin if you’re doing something GPS-related.
Clubs where no one knows what’s going on
Finally, there are dependencies required by some library or by one of its dependencies. It’s difficult to find out whether they are really needed, but you’ll keep them just to be on the safe side. If you have a complex enough project with several libraries falling into the above categories, you’ll be surprised when you look at the automatically managed dependencies. What’s this postscript-magic-2.4.3 doing here? Oh, it’s probably required by the report generator if you want PostScript output. You never ever want PostScript output, the format of a long gone era. You remove that library, then find out that it broke the PDF output in some mysterious way, even though there is a pdf-magic-2.4.3. So you put that PostScript thingy back, and you even leave foobricator-0.8 untouched, because who knows what would break if you removed it.
OK, but how is it related to version control?
The power of open source is that, er, you have access to the sources. If you find a bug you can’t work around and can’t wait to be fixed, you can dive into the code and fix the bug yourself. Then you send the fix to the project owners or creators or maintainers so other people can benefit from it. If you find a feature missing, you can implement it. And you are bound to find bugs and missing features in other people’s code. The more heavily and delicately you rely on someone else’s code, the sooner you’ll have such a case.
We are now getting quite close to version control. How do you fix that bug? Let’s see a Subversion scenario first. You download the most recent version, revision 998, and tinker with it for a week. By the time you are ready to send the patch, the author has just committed revision 1002. You do some diff gymnastics and send the patch. The author commits it to the codebase with big thanks to you. But what if the author says he doesn’t like your patch, because he’s going to solve that issue a different way? Or what if you implemented a feature needed only by your project? You can have your own version of the library, but you won’t be able to benefit from its improvements unless you constantly merge the changes. Forking a Subversion project means you are on your own now.
On the other hand, if you use a library managed with Git or Mercurial, or any other system with a good concept of changesets, branching and merging, you can stay in sync with the original line of development. Some public hosting services (GitHub, Bitbucket, etc.) even make it easy to send a pull request.
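To make "staying in sync" a bit more concrete, here is a rough sketch of the Git side of the story. Everything in it is a stand-in: the "upstream" repository is a throwaway local directory rather than a real library, and the directory names, file names and commit messages are made up for the illustration. It only assumes that git is installed.

```shell
#!/bin/sh
set -eu

# Scratch directory; every repository below is a local, hypothetical stand-in.
work=$(mktemp -d)
cd "$work"

# Throwaway identity so commits work without any global git configuration.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

# 1. A stand-in for the library's public repository.
git init -q upstream
cd upstream
echo "feature A" > a.txt
git add a.txt
git commit -q -m "Upstream: add feature A"
cd ..

# 2. Your own copy of the library, with a fix the author may never accept.
git clone -q upstream mine
cd mine
echo "bug fix" > fix.txt
git add fix.txt
git commit -q -m "Local: fix the bug we hit"
cd ..

# 3. Meanwhile, the original line of development moves on.
cd upstream
echo "feature B" > b.txt
git add b.txt
git commit -q -m "Upstream: add feature B"
cd ..

# 4. Bring the upstream work into your copy; your fix stays in place.
cd mine
git fetch -q origin
git merge -q --no-edit '@{upstream}'

# The history now contains both lines of development and a merge commit.
git log --oneline --graph
ls
```

Step 4 is the part you would repeat whenever the library moves forward. If the author later changes his mind, the same local commit is what a pull request would be built from.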
To summarize, when picking a library, all other things being equal, avoid the ones in Subversion (or, horrors, CVS).