You may now be tired of hearing me say it, but I will say it again: Your repository contains every version of everything which has ever been checked in to the repository. This is a Good Thing. We sleep better at night because we know that our efforts are always additive, never subtractive. Nothing is ever lost. As the team regularly checks in more stuff, the complete historical record is preserved, just in case we ever need it. But this feature is also a Bad Thing. It turns out that keeping absolutely everything isn't all that useful if you can't find anything later.
My woodshop is a painfully vivid illustration of this problem. I have a habit of never throwing anything away. When I build a piece of furniture, I save every scrap of wood, telling myself that I might need it someday. I keep every screw, nail, bolt or nut, just in case I ever need it. But I don't organize these things very well. So when the time comes that I need something, I usually can't find it. I'm not necessarily proud of this confession, but my workshop stands as an expression of who I am. Those who love me sometimes find my habits to be endearing.
But there is nothing endearing about a development team that can't find something when they need it. A good SCM tool must do more than just keep every version of everything. It must also provide ways of searching and viewing and sorting and organizing and finding all that stuff.
In the rest of this chapter, I will discuss several mechanisms that SCM tools provide to help make the historical data more useful.
Perhaps the most important feature for dealing with old versions is the notion of a "label." In CVS, this feature is called a "tag." By either name, the concept is the same -- labels offer the ability to associate a name with a specific version of something in the repository. A label assigns a meaningful symbolic name to a snapshot of your code so you can later find that snapshot more easily.
This is not altogether different from the descriptive and memorable names we use for variables and constants in our code. Which of the following two lines of code is easier to understand?
if (errorcode == ERR_FILE_NOT_FOUND)
if (e == -43)
Similarly, which of the following is a more intuitive description of a specific version of your code?
We create (or "apply") a label by specifying a few things:
- The string for the name of the label. This should be something descriptive that you can either remember or recognize later. Don't be afraid to put enough information in the name of the label. Note that CVS has strict rules for the syntax of a tag name (must start with a letter, no spaces, almost no punctuation allowed). I still follow that tradition even though Vault is more liberal.
- The folder to which the label will be applied. (You can apply a label or tag to a single file if you want, but why? Like most source control operations, labels are most useful when applied recursively to a whole folder.)
- Which versions of everything should be included in the snapshot. Often this is implicitly understood to be the latest version, but your SCM tool will almost certainly allow you to label something in the past. If it won't, take it out back and shoot it.
- A comment explaining the label. This is optional, and not all SCM tools support it (CVS doesn't), but a comment can be handy when you want to explain more than might be appropriate to say in the name of the label. This is particularly handy if your team has strict rules for the syntax of a label name (V22.214.171.1246.prod) which prevent you from putting in other information you need.
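To make the four items above concrete, here is a minimal Python sketch of a label as a small record. It is not how Vault, CVS, or any other tool stores labels. The names `Label`, `make_label`, and `is_cvs_style_name`, the label name `BUILD_3_0_RC1`, and the comment text are all hypothetical. The name check assumes the conservative CVS-style syntax described above: a leading letter, followed only by letters, digits, hyphens, or underscores.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Assumed CVS-style syntax: a letter first, then letters, digits, '-' or '_'.
_TAG_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9_-]*")


def is_cvs_style_name(name: str) -> bool:
    """Return True if the whole name fits the conservative tag syntax."""
    return _TAG_PATTERN.fullmatch(name) is not None


@dataclass(frozen=True)
class Label:
    """Hypothetical record of what we specify when applying a label."""
    name: str                      # descriptive symbolic name
    folder: str                    # folder the label applies to, recursively
    version: Optional[int] = None  # None stands for "the latest version"
    comment: str = ""              # optional free-form explanation


def make_label(name: str, folder: str,
               version: Optional[int] = None, comment: str = "") -> Label:
    """Build a Label, rejecting names that break the conservative syntax."""
    if not is_cvs_style_name(name):
        raise ValueError(f"not a CVS-style label name: {name!r}")
    return Label(name, folder, version, comment)


# The folder and version come from the example in the text;
# the label name and comment are made up for illustration.
label = make_label(
    "BUILD_3_0_RC1",
    "$/src/sgd/libsgdcore",
    version=155,
    comment="Release candidate handed to QA",
)
```

Sticking to the stricter syntax even when the tool is more liberal means the same label names would remain usable if the code ever moved to a tool with tighter rules, and anything that does not fit in the name can go in the comment.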
For example, in the following screen dump from Vault, I am labeling version 155 of the folder $/src/sgd/libsgdcore:
It is worth clarifying here that labels play a slightly different role in some SCM tools. In Subversion or Vault, folders have version numbers. Using the example from my screen dump above, the folder $/src/sgd/libsgdcore is at version 155.