Source Code Control for Web Development

So back in the old days of command lines and C (none of that ++ or # crap) CVS was the way you did source control. You checked out a branch, ran make, and then tested your code til it worked.

I am lost as to how to go about it these days. I work on Windows, my co-worker is on a Mac, and the web server runs Linux. What free/cheap source code control system should I use in this case? Where should the repository be? The server is hosted by someone else so I can’t just use NFS to share the file system. Do I telnet into the server and do my checkout/in and use FTP to move stuff back and forth to the development machines?

There has to be some better way.

Have you looked into Subversion? It’s a source code control system that is open source and very widely used. I’m not sure that it fits all your requirements, though.

The big three nowadays are SVN, Git, and Mercurial (AKA Hg). All three are free, both as in speech and as in beer, and work on all major platforms.

SVN is billed as a “better CVS”, in that if you’re used to CVS it will seem familiar. There are lots of graphical tools, and it’s been around longer than the other two. It is inherently a centralized system, in that there is one master server that everybody pushes their changes to. If you’re disconnected from the network, you can’t commit your changes, and you can’t look at history without being connected to the central server.

Git and Hg are newer and are both decentralized (or distributed) version control systems. This means that every checkout (more commonly called a “clone”) is a fully functional version control system in its own right. You can commit to your own local repository, look at history locally, all that good stuff. If you’re used to a centralized system like CVS it can be a bit of an adjustment, but many people find it to be incredibly useful, as you can feel free to commit your changes without publishing them. Git and Hg are much better with merging than CVS, since they have to be.

Both of these projects came out of the Linux kernel development community. Linus Torvalds, creator of Linux, also created Git, and it’s definitely aimed more at power users, as it makes it easier in some ways to shoot yourself in the foot. Hg has the capabilities to do all the same types of things, but they don’t make it quite so obvious. There are graphical frontends, but they tend to not be quite so mature as those for SVN. Both have very rich command-line toolsets. Git is easy to script with Perl or bash, and Hg is internally written in Python and there are many Python extensions (and you can easily write your own).

If you want free hosting, you can easily run your own server for any of them, or you can use Google Code for SVN or Hg hosting, Bitbucket for Hg, or GitHub for git.

I’ve probably opened up more questions than I’ve answered. I’ve used all three professionally, and I’ve contributed to both the Git and Hg Eclipse plugins. (I have one commit in the Git codebase, consisting of a one-word typo fix in some documentation. :slight_smile: )

Can I use Google Code for non-open-source software?

Looks like no. I’ve used Bitbucket a fair bit, and they do allow private projects for free up to a certain number of MBs/users, and after that you pay a small hosting fee. I think GitHub has something similar as well. Bitbucket lets you use SVN to access its repositories as well, even though they’re natively hosted using Hg.

Bitbucket and Github both have integrated bugtrackers as well, which can be super handy.

CVS still exists and is still used. I don’t see why it wouldn’t work with your current environment. I personally have a CVS repository on a Linux system, and access it from my Mac laptop and my desktop Windows box. It integrates nicely with many IDEs including Eclipse, which I can also use on both the Windows and Mac boxes. I’m sure it’s old-fashioned and “out of date” but it does everything I need it to. There are CVS interfaces that will hook into the desktop as well if you get bored with the command line (e.g. TortoiseCVS, SmartCVS… you can google a bunch of others).

I have not personally used a content management system, but you might want to check into something like Drupal, which is geared specifically towards web development. It runs under PHP, I believe.

I use Subversion as my personal version control system. It’s easy to set up and easy to use. The only issue with using it for the web is the .svn folders it puts in each directory. You can set up Apache to ignore them.
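If changing the Apache configuration isn’t an option, another way around the .svn folders is to publish a copy of the working directory with the metadata folders left out. Here is a minimal Python sketch of that idea, assuming Python 3.8 or later; the function name `publish` and both paths are made up for illustration and are not part of any tool mentioned in this thread.

```python
import shutil

# Metadata directory names used by the systems discussed in this thread.
VCS_DIRS = ("CVS", ".svn", ".git", ".hg")


def publish(working_copy, web_root):
    """Copy working_copy into web_root, skipping version control metadata.

    Hypothetical helper: anything named like a metadata folder is skipped
    at every level of the tree (this also skips plain files with those names).
    """
    shutil.copytree(
        working_copy,
        web_root,
        ignore=shutil.ignore_patterns(*VCS_DIRS),
        dirs_exist_ok=True,  # allow re-publishing over an existing copy (3.8+)
    )
```

You would call it as something like `publish("/home/me/site", "/var/www/site")`, where both paths are placeholders. Note that it only adds and overwrites files; it does not remove files from the web root that have been deleted from the working copy.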

You can look into Perforce (www.perforce.com). I think it’s better than Subversion in many ways, but it isn’t open source ($800/license). However, you can use it for free if you have two or fewer users, which generally means it’s free for personal use. The advantage of Perforce is that it doesn’t pollute your working disk with CVS or .svn directories like certain other version control systems do.

You can have your web directory be a working directory for your version control system.

Git and Hg keep all their versioning info in a .git or .hg folder at the root of the repository, as opposed to spewing CVS or .svn folders all over the place. It sounds like that is similar to Perforce.

SVN, Hg, and Git all have their ‘Tortoises’, for windows users who like having Explorer integration. There’s a SmartSVN and SmartGit as well, made by the same people who make SmartCVS.

One major advantage that all modern version control systems have over CVS is “atomic commits”. Basically, when you commit multiple files at once in CVS, you’re not really committing them all at once - you’re making N separate commits that happen to be made at the same time with the same commit message, one for each file. This can really cause problems when you’re trying to go back through your history.
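To make the history problem concrete, here is a toy Python sketch of what you end up doing with per-file commits: guessing which file revisions belonged together by matching author, message, and time. Everything in it (the `FileRevision` record, `guess_changesets`, the 300-second window) is an assumption made up for illustration, not how CVS or any particular tool stores or reports its history.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileRevision:
    """One per-file commit, the only unit a CVS-style history records."""
    path: str
    author: str
    message: str
    timestamp: int  # seconds; a made-up scale for the example


def guess_changesets(revisions, window=300):
    """Heuristically group per-file revisions into probable changesets.

    Two revisions are grouped when author and message match and they are
    at most `window` seconds apart. With atomic commits no guessing is
    needed, because the system records the group itself.
    """
    groups = []
    for rev in sorted(revisions, key=lambda r: r.timestamp):
        for group in groups:
            last = group[-1]
            if (last.author == rev.author
                    and last.message == rev.message
                    and rev.timestamp - last.timestamp <= window):
                group.append(rev)
                break
        else:
            groups.append([rev])
    return groups


log = [
    FileRevision("index.php", "alice", "Fix login", 1000),
    FileRevision("auth.php", "alice", "Fix login", 1002),
    FileRevision("style.css", "bob", "Tweak colors", 1001),
    FileRevision("auth.php", "alice", "Fix login", 5000),
]
for group in guess_changesets(log):
    print([rev.path for rev in group])
```

The guess breaks down as soon as two people reuse a message, or one slow commit straddles the time window, which is exactly the kind of ambiguity a real changeset removes.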

I like Git. It’s really a lot better than SVN (which is only marginally better than CVS), and it really shines when you’re working with multiple people on different branches.

You can easily share/host a git repository over ssh; all you really need for that is ssh and read/write access for each user in the directory where you have the repository stored.
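For the curious, the setup amounts to two commands, sketched here as a small Python helper that only builds the command lines (it does not run anything). The user, host, and path are placeholders, and the helper name is made up; `--shared=group` is one way to give every user in a Unix group write access, assuming that is how you manage permissions on the server.

```python
import shlex


def shared_repo_commands(user, host, repo_path):
    """Return the commands for hosting and cloning a repo over ssh.

    Hypothetical helper: repo_path is assumed to be an absolute path
    on the server, e.g. /srv/git/site.git.
    """
    login = f"{user}@{host}"
    return [
        # Run once: create a bare repository (no working tree) on the server,
        # group-writable so several users can push to it.
        ["ssh", login, f"git init --bare --shared=group {shlex.quote(repo_path)}"],
        # Run on each development machine: clone it over ssh.
        ["git", "clone", f"ssh://{login}{repo_path}"],
    ]


for command in shared_repo_commands("dev", "example.com", "/srv/git/site.git"):
    print(" ".join(command))
```

After cloning, the usual `git push` and `git pull` go over ssh to the same place.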

As mentioned above, Git is a distributed version control system, so every user typically has his/her own separate repository, and you can then push and pull changes directly between users/repos - though if you do that a lot, setting up a single, shared “master” repository where everybody pushes their finished changes usually makes sharing easier.

If you just want to try it out for a bit, or want to share some code with the world, http://github.com/ has free hosting for open source code with a fairly nice web front end, and some useful tutorial material on how to use Git. You can also host private code there for a fee, though you generally don’t really need to if you already have a Linux/Unix server with ssh access.

Only a minor addition I have to all the great information in this thread. It’s easy to use SVN in combination with SSH (just a matter of specifying svn+ssh as the protocol).

IIRC, there was a slight issue with TortoiseSVN on Windows when setting up the keys (different formats, although that might solely have been something PuTTY-related). Conversion was easy, however.

Virtually all of that applies to Hg, as well, if you replace Github with Bitbucket. (I don’t really have a dog in the Hg vs Git fight, as I’ve used and enjoyed both. I use Hg at work, but prefer Git on a philosophical level, as I think its technical underpinnings are more internally consistent.)

I’ve used Hg only a bit, but I’ve no problems with it either except that it’s annoying to have to learn two differing command sets. You can even convert Git repos <-> Hg if you want to (though I must say I’ve not done this on any large repositories, but it might be worth knowing for people who want to check out both).

My recommendation to the OP is Subversion given his background with CVS. With his small environment the advantages of git and Hg are minimal.

I think it depends. Git and Hg really push the idea of using parallel (private or shared) branches for doing work, and SVN’s limited merging facilities and central repository concept make that - extremely useful - idea much harder to implement.

On the other hand, it’s certainly true that switching from CVS to SVN is easier - the user interface/concept of SVN and CVS is pretty much identical. I’ve done CVS -> SVN -> Git and the switch from SVN to Git took a while, but in the end I do think it was worth it.

If you are a small shop and you have some old PCs kicking around, why not check out TurnKey Linux appliances? They have one for SCM.

I love TurnKey Linux because they have full featured and fully configured Linux builds to make a true “turn-key” appliance out of any old PC. Just pop in the CD, answer a few prompts, and in ten minutes your machine is totally set up as an Ubuntu Linux box with some specific tool already running.

I have deployed one MediaWiki box and two MoinMoin boxes to two small organizations where I provide network support.

This is particularly neat because you don’t need to back up anything other than the data. If the machine blows up, just pop the Appliance cd into another old PC and copy your backed up data to it. No need to scratch your head and figure out how you set up networking on Ubuntu two years ago.

There is a Revision Control Appliance:
“An integrated revision control server combining the world’s best open source Version Control Systems: Subversion, Git, Bazaar, and Mercurial. A web interface for each system is included, making it easy to browse through the code base, compare revisions and manage repositories for multiple projects.”

And here’s a Bugzilla Appliance:
“Bugzilla is a Web-based general-purpose bugtracker and testing tool originally developed and used by the Mozilla project. One of Bugzilla’s major attractions to developers is its lightweight implementation and speed. Many projects use it to track feature requests as well. Bugs can be submitted by anybody, and will be assigned to a particular developer.”

And while you’re at it, why not a Wiki? Try either the MediaWiki Appliance or the MoinMoin Appliance.

As a CVS administrator for many years in our corporate environment, I would say that CVS is still alive and kicking!

As much as folks like SVN and say it’s the way of the future, I still like CVS for the following reasons:
[ul]
[li]It’s filesystem based. What you see is what you get. Repository access is totally based on traditional Unix file access rights.[/li]
[li]It’s very extensible and fast. We have hundreds of developers across many departments and divisions, on multiple continents, using the same CVS repository running on a small pizza box Linux machine.[/li]
[li]Since the repository is just one big directory tree, backups are simple. It is easy to perform incremental backups of the repository as well (e.g. backups that use hard links to generate hourly/daily/weekly snapshots). This is not as easy with a database-based system.[/li]
[li]I find it’s easier to figure out problems with this structure.[/li]
[li]CVS is well proven technology. Software usually does not improve with age, but I like tried-and-true SCM tools.[/li]
[/ul]

Of course, there are lots of warts to CVS, so SVN might just be the thing. For example, there is no good standard way to see what is in the repository, and there is no metadata associated with tags (e.g. when the tag was applied and by who).

Subversion and git are also entirely filesystem based.

This is getting more into IMHO territory, but I would never even consider using CVS for a new project, and I’d be extremely unlikely to use SVN either. CVS is an obsolete piece of garbage. It doesn’t know what a directory is, branching is extremely painful, and there is no concept of a changeset (meaning you have to keep this metadata separately from the repository, which is a stupid and fragile waste of time).

SVN fixes the minor idiocies of CVS but makes no attempt to deal with its biggest weakness: its workflow does not (and cannot) support easy branching and merging. Branching in SVN doesn’t really exist: instead you can make copies of a directory structure in constant time and treat the copies as if they were new branches. This sounds really cool in theory, but the problem is that the only way to distinguish different branches of a repository is to get inside the head of whoever made the copies. This makes it extremely difficult for automated tools to see your branches.

Merging in SVN is even worse – it seems to have been hacked in after the fact. Merge metadata is attached as a property to objects. Objects without a mergeinfo property inherit it from their parent directory. This means that the only way to know what has been merged into a particular file is to recursively walk up the directory tree looking for that particular property. You also have to have a strict set of rules on how to perform merges properly in order to get the mergeinfo property applied to the right object. It’s very easy to screw this up. I’m involved with a project that has a fairly large SVN repository and merging things around is painful. The time to perform the merge seems to be proportional to the size of the repository and not the size of the changeset that gets merged around; for me, each merge takes about 5 minutes even if I’m merging a changeset that changed only one file.
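The inheritance lookup described above can be sketched in a few lines of Python. This is a deliberately simplified toy model, not Subversion’s actual implementation: the `properties` dictionary stands in for wherever the property is stored, the function name is made up, and the sample value only imitates the general shape of a mergeinfo entry.

```python
import posixpath


def effective_mergeinfo(path, properties):
    """Find the mergeinfo that applies to path by walking up the tree.

    Returns (owner_path, value) for the nearest path-or-ancestor that
    carries the property, or (None, None) if nothing up to the root has it.
    """
    current = path
    while True:
        if current in properties:
            return current, properties[current]
        parent = posixpath.dirname(current)
        if parent == current:  # reached the top without finding it
            return None, None
        current = parent


# Toy data: only /trunk carries the property, so files below inherit it.
properties = {"/trunk": "/branches/feature-x:120-135"}
print(effective_mergeinfo("/trunk/src/app.c", properties))
```

It also shows why putting the property on the wrong object is easy to do and hard to notice: a merge recorded on a subdirectory silently overrides what its parent says for everything below it.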

I also have to use a ClearCase repository quite frequently, and for merging there’s just no comparison. ClearCase, despite being much maligned (and usually for good reason) for being quite slow, performs merges in seconds (it must be admitted that ClearCase’s merge metadata is not as expressive as SVN’s, though). ClearCase does have a lot of problems and I certainly wouldn’t recommend it for a new project, but I personally regard branching and merging as the most important feature provided by a VCS, and SVN’s deficiencies in this area are pretty stark in comparison with ClearCase.

Personally, if I were starting a new project I’d use Git or Hg. I don’t find the distributed paradigm to be that difficult to learn once you understand a centralized VCS like CVS or SVN, and a distributed VCS scales to any size project easily. With SVN you can get off the ground a little quicker because most developers already know how CVS-like VCSes work, but there are inherent scalability issues in their workflow that will make it difficult to grow your project past a certain size. Of course, it’s always possible that your project will never hit this limit, and the limit gets larger the better modularized your code is. On the other hand, the more software versions that you will be actively supporting with bug fixes (or even – horror of horrors – new features), the worse SVN scales.

If you’re willing to consider commercial, I think Perforce is almost certainly the dominant player in cross-platform development (SourceSafe/TFS may be more prevalent in Windows-only shops). Every company I’ve worked for in the last two decades uses it, and their customer list reads like a Who’s Who of high-tech companies, but it might be overkill for your needs. (They’ve got a small-number-of-users free license as well; “small” used to be 2, but I think it may be 5 now).

It has a huge variety of (free with the server) tools available: command-line, Java visual, web-based; integration with Office, Adobe tools, and almost every IDE (Visual Studio, XCode, Eclipse, etc.); and is smart about things like merging simultaneous or branched changes, per-client line endings, white-space-agnostic diffs, etc.

It uses the check-out/modify/check-in metaphor rather than the everything’s always-checked-out/submit-everything-that’s changed metaphor of CVS/subversion, which will take a little getting used to if you’re used to the CVS model. On large teams, the Perforce-style “intentional checkout” model prevents accidental checkins; on small teams I think it’s mainly a matter of preference.