CS 537 Notes, Revision Control Systems


1 What are Revision Control Systems, and why are they critical?

A revision control system (also known as a version control system, source control system, or source management system) is a system for storing multiple versions of a file or collection of files. For example, bank statements are a simple form of revision control: they record the state of your bank account after each transaction.

Let's say you've been working on a software project for a couple of months. You keep backups of your code because it's good practice. Eventually, you end up with this:
-rw-r--r--   1 bernat  bernat   8172 Jan 19 16:53 simulation.c
-rw-r--r--   1 bernat  bernat   8172 Jan 19 16:53 #simulation.c#
-rw-r--r--   1 bernat  bernat   8156 Jan 19 14:38 simulation.c~
-rw-r--r--   1 bernat  bernat   9312 Jan 17 12:01 simulation.c.old
-rw-r--r--   1 bernat  bernat   9320 Dec 21  2010 simulation.c.bak
-rw-r--r--   1 bernat  bernat   8905 Apr 16  2010 simulation.c.orig
-rw-r--r--   1 bernat  bernat   8678 Dec 13  2010 simulation.c.from-UMD
Or this:
-rw-r--r--   1 bernat  bernat   8172 Jan 19 16:53 simulation.c
-rw-r--r--   1 bernat  bernat   9312 Jan 17 12:01 simulation.c.2011.01.17
-rw-r--r--   1 bernat  bernat   9320 Dec 21  2010 simulation.c.2010.12.21
...
This is difficult to manage and error-prone. Instead, what if you could do this?
brie(32)% emacs simulation.c
brie(33)% cvs commit simulation.c
    CVS commit message: Added water simulation capability
    CVS: committed version 18
brie(34)%
Here the cvs command handles all of the tracking necessary to keep previous versions. With this, I can say:
brie(40)% cvs co simulation.c
brie(41)% cvs co -r 15 simulation.c
brie(42)% cvs co -D 20071231 simulation.c
The first command gets me the latest version of simulation.c, the second gets me version 15, and the third gets me the version as it stood on December 31, 2007.
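To make the three kinds of lookup concrete, here is a minimal sketch in Python of a history for one file that can be queried for the latest version, by revision number, or by date. It is a toy model, not how CVS stores data: the RevisionHistory class and its method names are made up for illustration, and it keeps a full copy of every version, where real systems typically store the differences between versions to save space.

```python
from bisect import bisect_right
from datetime import date


class RevisionHistory:
    """Toy history for a single file: every commit keeps the full contents."""

    def __init__(self):
        self._dates = []     # commit dates, in non-decreasing order
        self._contents = []  # _contents[i] holds revision i + 1

    def commit(self, contents, when):
        """Store a new revision and return its number (numbering starts at 1)."""
        if self._dates and when < self._dates[-1]:
            raise ValueError("commits must be recorded in date order")
        self._dates.append(when)
        self._contents.append(contents)
        return len(self._contents)

    def latest(self):
        """Return the newest revision."""
        return self._contents[-1]

    def by_number(self, number):
        """Return the revision with the given number."""
        if not 1 <= number <= len(self._contents):
            raise KeyError(f"no revision {number}")
        return self._contents[number - 1]

    def as_of(self, when):
        """Return the newest revision committed on or before the given date."""
        index = bisect_right(self._dates, when)
        if index == 0:
            raise KeyError(f"no revision on or before {when}")
        return self._contents[index - 1]


# Hypothetical commits, invented for this example.
history = RevisionHistory()
history.commit("v1: basic simulation\n", date(2007, 11, 2))
history.commit("v2: fixed timestep bug\n", date(2007, 12, 20))
history.commit("v3: added water simulation\n", date(2008, 1, 19))

print(history.latest(), end="")                   # like: cvs co simulation.c
print(history.by_number(2), end="")               # like: cvs co -r 2 simulation.c
print(history.as_of(date(2007, 12, 31)), end="")  # like: cvs co -D 20071231 simulation.c
```

The date lookup returns the newest revision that existed on that day, which is why asking for December 31 yields the version committed on December 20.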

2 Advantages of an RCS

What do RCSes give us? Many things. Let's go into that in more detail.

The first is obvious: the original purpose of an RCS was to allow one person to keep backups of important files quickly and easily.

But it gets better. The same system can also allow multiple people to edit the same file without stepping on each other's toes, or allow one person to edit from multiple locations. Imagine that Alice and Bob are both editing simulation.c. Without an RCS, it is easy for their changes to overlap (note: this is another good reason to split code across multiple source files!). The conversation goes something like this:
Alice: "Bob, I have a new version of the file."
Bob: "But Alice, I made some changes as well. Now I need to integrate them."
With an RCS, it goes something like this:
Alice: cvs commit simulation.c   <-- makes Alice's changes
Bob: cvs update simulation.c     <-- integrates Alice's changes
Bob: cvs commit simulation.c
Alice: cvs update simulation.c
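As you can see, the RCS acts as a mediator between the programmers. It can identify when updates conflict, and it marks the conflicting regions so they're easily noticed.

Let's summarize. An RCS can keep backups of important files quickly and easily, let multiple people edit the same files without overwriting each other's work, let one person edit from multiple locations, and flag conflicting updates.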
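One way to picture conflict detection is to compare each person's edit against the common starting version and check whether the two edits touch the same lines. The sketch below does this with Python's standard difflib module. It is an illustration under simplifying assumptions, not the merge algorithm CVS uses: the function names and the sample file contents are invented, and edits to adjacent lines are conservatively reported as conflicts.

```python
from difflib import SequenceMatcher


def changed_ranges(base, edited):
    """Return the (start, end) ranges of base lines that the edit modifies."""
    matcher = SequenceMatcher(None, base, edited, autojunk=False)
    return [(i1, i2)
            for tag, i1, i2, j1, j2 in matcher.get_opcodes()
            if tag != "equal"]


def overlaps(a, b):
    """True if two ranges of base lines overlap or touch.

    A pure insertion is an empty range (start == end), so the comparison
    is inclusive: two edits at the same point count as overlapping.
    """
    return a[0] <= b[1] and b[0] <= a[1]


def find_conflicts(base, first, second):
    """List pairs of changed ranges where both edits hit the same region."""
    return [(a, b)
            for a in changed_ranges(base, first)
            for b in changed_ranges(base, second)
            if overlaps(a, b)]


# Hypothetical contents of simulation.c, one list entry per line.
base = ["int main() {", "  init();", "  run();", "  cleanup();", "}"]

# Alice edits line 1; Bob edits line 3. The edits are far apart.
alice = ["int main() {", "  init(argc, argv);", "  run();", "  cleanup();", "}"]
bob = ["int main() {", "  init();", "  run();", "  cleanup_all();", "}"]

# A second version of Bob's edit that changes the same line Alice changed.
bob_same_line = ["int main() {", "  init(config);", "  run();", "  cleanup();", "}"]

print(find_conflicts(base, alice, bob))
print(find_conflicts(base, alice, bob_same_line))
```

When the list of conflicts is empty, both sets of changes can be combined automatically. When it is not, a person has to decide what the overlapping lines should say.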

3 Current RCSes

There are many revision control systems; these notes touch on four of them: RCS, CVS, SVN, and GIT.

4 Usage

We will discuss CVS and SVN since the CSL supports them directly. First, take a look at the CSL documentation:

SVN @ CSL and CVS @ CSL

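The basic concepts of a version control system are a repository, a current working copy, checking out, updating, and committing. In order: the repository is the shared store that holds every version of your files. The working copy is your own private copy of those files, and it is where you do your editing. Checking out creates a working copy from the repository. Updating brings changes that others have committed into your working copy. Committing sends your changes to the repository, where they become a new version.

That covers the basics of version control systems. From there, things get more interesting. One feature supported (well) by SVN and GIT is the branch. A branch is a separate version of the code in the repository, and is particularly handy if you have multiple people working on different features in the same codebase or if you need to fix bugs and maintain an older version of your code. For the purposes of this class, you probably will not need a branch; a single line of development in one repository is sufficient.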
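The checkout, update, and commit cycle can be sketched as a small Python model. Everything here is hypothetical and heavily simplified: the Repository and WorkingCopy classes are invented for illustration, the repository holds a single file, and update refuses to run when there are uncommitted local edits, where a real system would merge them.

```python
class OutOfDateError(Exception):
    """Raised when a commit is attempted from a stale working copy."""


class Repository:
    """Shared store: revisions[n] is the file contents at revision n."""

    def __init__(self, initial_contents=""):
        self.revisions = [initial_contents]

    @property
    def head(self):
        """Number of the newest revision."""
        return len(self.revisions) - 1

    def checkout(self):
        """Create a new working copy based on the newest revision."""
        return WorkingCopy(self)


class WorkingCopy:
    """Private copy of the file that one person edits."""

    def __init__(self, repository):
        self.repository = repository
        self.base_revision = repository.head
        self.contents = repository.revisions[self.base_revision]

    def update(self):
        """Bring in revisions committed by others since our base revision."""
        if self.contents != self.repository.revisions[self.base_revision]:
            # Simplification: a real system would merge local edits here.
            raise RuntimeError("local edits present; merging is not modeled")
        self.base_revision = self.repository.head
        self.contents = self.repository.revisions[self.base_revision]

    def commit(self):
        """Publish local contents as a new revision and return its number."""
        if self.base_revision != self.repository.head:
            raise OutOfDateError("run update before committing")
        self.repository.revisions.append(self.contents)
        self.base_revision = self.repository.head
        return self.base_revision


repo = Repository("line one\n")
alice = repo.checkout()
bob = repo.checkout()

alice.contents += "alice's line\n"
alice.commit()                  # creates revision 1

bob.update()                    # Bob now sees Alice's line
bob.contents += "bob's line\n"
bob.commit()                    # creates revision 2

alice.contents += "another line\n"
try:
    alice.commit()              # Alice is still based on revision 1
    rejected = False
except OutOfDateError:
    rejected = True

print(repo.head, rejected)
```

The check in commit is the important part: a working copy that has fallen behind the repository has to update before it can commit, so nobody silently overwrites changes they have never seen.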

5 Which is better?

Ahh... let's not go there. RCS and CVS are often called obsolete, but that's just another way of saying they're time tested and bulletproof. They might not offer the most features, but more features are not always better. GIT is the bleeding edge and has some very nice features for managing source code distributed among tens or hundreds of people. On the other hand, for a single person or a small team it probably adds more complexity than benefit. Personally, I suggest SVN as a happy medium that is also CSL supported, and CSL support goes a very long way.

Copyright © 2008, 2011 Andrew R. Bernat