Programmers - teach me about version control

friedo · March 11, 2016, 2:11am

RCS dates to 1982. SCCS was around for years before that. The idea of having a proper revision control system in place is not new.

Another thing that isn’t new is the truism that banks and financial institutions have some of the worst IT practices in the industry.

HMS_Irruncible · March 11, 2016, 2:55am

At some point you’ll find yourself with several different versions of your script lying around with jerry-rigged names like current.py, worksgood.py, worksgood_last.py, worked_in_2013.py. That’s the point where you can recognize that a version control tool is a more convenient way to manage this than name munging.

Git is not the most intuitive tool in the world, but it requires almost zero overhead for setup (you don’t even need Github, really, but it’s a good idea). And there are tons of friendly manuals and people who can help you understand it.

Version control is one of those basic skills that sets you apart from a hobbyist.

black_rabbit · March 11, 2016, 3:30am

Spoken like somebody who never had to use SCCS on an enterprise scale. In 2010.

friedo · March 11, 2016, 3:47am

My condolences. I did once have to use an ancient version of AEGIS which sat atop (I think) RCS. That was…interesting.

Melbourne · March 11, 2016, 5:51am

SVN does not require an HTTP server, although that is a common way to set it up.

With zero setup you can tell the svn client to attach directly to a file system folder. Type
file://svnserver/main/folder

instead of
http://svnserver/main/folder

BTW, your “server” (if you want one) does not need to be configured for HTTP. That would just be what you would do if you already had an HTTP server, and wanted to do zero additional setup. If you were actually setting up a server, you would probably install the svn protocol:
svn://svnserver/main/folder

brad_d · March 11, 2016, 7:19am

I second the recommendation to consider Subversion and its Windows GUI overlay, TortoiseSVN. I use it for any code I write that starts to get much more complicated than “Hello World.” The ability to make more sweeping changes off in their own branch while keeping the trunk working, as well as the capacity to trivially back out mistakes, makes some kind of version control well worth the time investment to me. I got by for many years using some of the ad hoc version control techniques mentioned upthread, and it all feels so amazingly primitive and clumsy as I look back on it.

Like Melbourne points out, SVN does not require an HTTP server. If it’s one person working on a single machine, the “file://” repository access works just fine, and is extremely easy to set up.

TortoiseSVN is the dominant version control system at the companies I’ve worked at. There’s a push to migrate to Git at my current employer, and just this last week I spent a short amount of time trying to figure out TortoiseGit - the learning curve appears quite steep. The code I’m actually using or working on, though, is still maintained through Subversion.

TwoCarrotSnowman · March 11, 2016, 8:13am

Thanks everyone - really helpful replies.

I registered for GitHub, worked through their tutorial, and the mists are clearing a little bit. Perhaps I could check my understanding?

1. GitHub allows you to create a repository, which is a kind of online folder for the files in a particular project.
2. Initially, files live in the master branch - this is the “official” version of the code.
3. If you want to mess with the code, you can create a branch, which is a duplicate version of the files. You can mess with those as much as you like, because you’re not affecting the master.
4. When you’re happy with the changes you’ve made, you can do a pull request, which makes the changes available to the master, but doesn’t implement them (I think).
5. Finally, you do a merge, and the changes are baked into the master branch. The branch you created to work on the changes is now redundant and can be deleted.
6. Behind the scenes, GitHub is using git to achieve all this.

Is that about right? If so, I think I have the basic concepts down, and I can see the benefits. Even doing simple hobbyist stuff, branching sounds useful. This is for GitHub, though - when I read about git itself, I get a bit confused about staging folders, etc. GitHub seems to hide a lot of that from the user.

Where I’m struggling a bit now is the actual use of GitHub with the way I work. As I noted in the OP, I use Cloud 9, which is a hosted IDE. It works well for me. They boast of integration with GitHub, and indeed I have got as far as cloning a GitHub repo and having the files appear in Cloud 9 for editing. Where I got a bit confused is that I had created a branch for editing, and expected the branch to be the cloned folder in Cloud 9 - but it seemed to be the master. I think that, having got the GitHub repo cloned in Cloud 9, I need to use actual git commands on the command line within Cloud 9 (you get a full hosted Linux system so you can do all the command line stuff) to switch from the master to the editing branch (and to commit the files, I think…?) Am I on the right track?

I found this guide online, but it only talks about syncing files you are working on, not how to point them to a particular branch. I can follow the logic, even if I don’t totally grok the syntax yet. Do the contents of that website seem sensible? I found it via Google, and I trust the Dope more than some random page I found online.

Thanks so much for all the advice - it’s really helpful.

friedo · March 11, 2016, 8:30am

Oops, you’re right. I forgot about that since I’ve been in the git world for so long now. I don’t miss svn.

TwoCarrotSnowman:

I registered for GitHub, worked through their tutorial, and the mists are clearing a little bit. Perhaps I could check my understanding?

1. GitHub allows you to create a repository, which is a kind of online folder for the files in a particular project.
2. Initially, files live in the master branch - this is the “official” version of the code.
3. If you want to mess with the code, you can create a branch, which is a duplicate version of the files. You can mess with those as much as you like, because you’re not affecting the master.
4. When you’re happy with the changes you’ve made, you can do a pull request, which makes the changes available to the master, but doesn’t implement them (I think).
5. Finally, you do a merge, and the changes are baked into the master branch. The branch you created to work on the changes is now redundant and can be deleted.
6. Behind the scenes, GitHub is using git to achieve all this.

Is that about right?

All correct, but note that it’s not necessary to work on a branch. It’s perfectly fine to commit right to master if you want. Branches are more useful when working with other people (so you don’t clobber their stuff) or production systems (so you don’t accidentally deploy experimental code) or for doing experiments, so you don’t mess up your history with something that might not work.

You don’t have to do a pull request to merge your own code - you can always do that yourself. But if you want to use GitHub’s tools instead of the command line, you can send a pull request to yourself. Usually, pull requests are used to send changes to a repository that you don’t own (you are requesting that the owner pull your changes in.)

If so, I think I have the basic concepts down, and I can see the benefits. Even doing simple hobbyist stuff, branching sounds useful. This is for GitHub, though - when I read about git itself, I get a bit confused about staging folders, etc. GitHub seems to hide a lot of that from the user.

Where I’m struggling a bit now is the actual use of GitHub with the way I work. As I noted in the OP, I use Cloud 9, which is a hosted IDE. It works well for me. They boast of integration with GitHub, and indeed I have got as far as cloning a GitHub repo and having the files appear in Cloud 9 for editing. Where I got a bit confused is that I had created a branch for editing, and expected the branch to be the cloned folder in Cloud 9 - but it seemed to be the master. I think that, having got the GitHub repo cloned in Cloud 9, I need to use actual git commands on the command line within Cloud 9 (you get a full hosted Linux system so you can do all the command line stuff) to switch from the master to the editing branch (and to commit the files, I think…?) Am I on the right track?

I found this guide online, but it only talks about syncing files you are working on, not how to point them to a particular branch. I can follow the logic, even if I don’t totally grok the syntax yet. Do the contents of that website seem sensible? I found it via Google, and I trust the Dope more than some random page I found online.

Thanks so much for all the advice - it’s really helpful.

I don’t know anything about Cloud 9, but when you clone a repository, you’ll get a copy of the master branch and all the other branches, but you have to tell your local copy which branch you want to use. By default it will use the master. If you can’t easily figure out how to switch branches within Cloud 9, just start committing to master; it’s fine, nobody will arrest you. You can figure out branching later.
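
For anyone who wants to see the clone-then-switch step on a command line, here is a minimal sketch. It assumes nothing about Cloud 9 itself: a local bare repository called hosted.git stands in for the GitHub repo, and the names seed, workspace, editing and hello.py are made up for the illustration.

```shell
set -e
work=$(mktemp -d)
cd "$work"

# Build a small repository with a master branch and an "editing" branch.
git init -q seed
cd seed
git symbolic-ref HEAD refs/heads/master   # call the first branch master whatever the local default is
git config user.name "Example User"
git config user.email "user@example.com"
echo "print('hello world')" > hello.py
git add hello.py
git commit -q -m "Initial commit"
git branch editing
cd ..

# hosted.git plays the role of the copy stored on GitHub.
git clone -q --bare seed hosted.git

# Cloning checks out the default branch (master here)...
git clone -q hosted.git workspace
cd workspace
git branch -a                     # shows master plus remotes/origin/editing
git checkout -q editing           # ...so the other branch has to be selected explicitly
git rev-parse --abbrev-ref HEAD   # prints the branch now checked out
```

With a real GitHub repo the clone URL would replace hosted.git; the checkout step should be the same.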

Stealth_Potato · March 11, 2016, 8:43am

TwoCarrotSnowman:

Thanks everyone - really helpful replies.

I registered for GitHub, worked through their tutorial, and the mists are clearing a little bit. Perhaps I could check my understanding?

1. GitHub allows you to create a repository, which is a kind of online folder for the files in a particular project.
2. Initially, files live in the master branch - this is the “official” version of the code.
3. If you want to mess with the code, you can create a branch, which is a duplicate version of the files. You can mess with those as much as you like, because you’re not affecting the master.
4. When you’re happy with the changes you’ve made, you can do a pull request, which makes the changes available to the master, but doesn’t implement them (I think).
5. Finally, you do a merge, and the changes are baked into the master branch. The branch you created to work on the changes is now redundant and can be deleted.
6. Behind the scenes, GitHub is using git to achieve all this.

Is that about right? If so, I think I have the basic concepts down, and I can see the benefits. Even doing simple hobbyist stuff, branching sounds useful. This is for GitHub, though - when I read about git itself, I get a bit confused about staging folders, etc. GitHub seems to hide a lot of that from the user.

Well, you’ve conflated a couple of things, I think.

On GitHub, “pull requests” are a way to ask the owner of a repository to “pull” some changes that you have made in your own fork. For example, if you find somebody’s project on GitHub, and want to contribute to it, you “fork” it – which just means make a complete copy of the entire repository under your own account. You can then work on this repository, then when you’re done, issue a pull request, which is just a way of saying “please take branch X of my repository and merge it into branch Y of your repository.” Pull requests are not involved if you’re just working on your own stuff.

And working on feature branches and merging into master is just one possible workflow! Git is very flexible. You can just as easily make all your commits directly to the master branch. (But branches are very useful for helping manage changes, especially as projects grow in complexity.)

Also, even when using GitHub, you usually do most of your work on your local computer. You create branches, make commits, merge branches, and so on, and during this process, your local copy of the repository gradually becomes different from the copy that is stored on GitHub. Periodically, then, you do what is called a “push” – which just takes all the new things you’ve created on your local repo and sends them to the remote copy of the repo stored on GitHub.

(Note the terminology here as well. “Push” and “pull” are both about synchronizing differences between two repositories – “push” usually means that the repository accepting the changes is somewhere else, and “pull” means the repository is local. You have permission to “push” to the repositories you own on GitHub, but you cannot push to anybody else’s repository – instead, you must request that they “pull” from you.)
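
A rough sketch of the "work locally, then push" cycle described above, for anyone following along at a terminal. The bare repository remote.git is only a stand-in for a repo you own on GitHub, and the file and directory names are invented for the example.

```shell
set -e
work=$(mktemp -d)
cd "$work"

# remote.git stands in for the repository you own on GitHub.
git init -q --bare remote.git

git init -q local
cd local
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"
git remote add origin "$work/remote.git"

echo "first" > notes.txt
git add notes.txt
git commit -q -m "First commit"
git push -q -u origin master        # publish master and remember where it goes

# Two more commits exist only in the local repository...
echo "second" >> notes.txt
git commit -q -am "Second commit"
echo "third" >> notes.txt
git commit -q -am "Third commit"
ahead_before=$(git rev-list --count origin/master..master)
echo "commits not yet pushed: $ahead_before"

# ...until a push sends them to the remote copy.
git push -q origin master
ahead_after=$(git rev-list --count origin/master..master)
echo "commits not yet pushed: $ahead_after"
```

The count goes from 2 to 0: the commits were already safely recorded locally, and the push only synchronizes the remote copy.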

friedo · March 11, 2016, 9:00am

GitHub actually does allow you to send a pull request from one branch to another on the same repository; a fork is not necessary. This is a workflow commonly used by teams working on the same repo to do code review before merging.

Of course, if there’s only one person involved, a pull request is rather redundant.

TwoCarrotSnowman · March 13, 2016, 9:43am

I’d just like to say a huge thank you to everyone who contributed to this thread. I spent quite a lot of yesterday playing with GitHub, and while I still wouldn’t claim to be an expert in git, the concepts have definitely clicked in a way they didn’t before.

I can now create a new branch for a project I’m working on, check out that branch in Cloud 9 (my hosted IDE), make whatever changes I want to, commit the changes, push them back to GitHub and merge them into the Master branch. Woo hoo!

I know people have said there’s no point in doing pull requests as a solo programmer, but I find using branches to be a good way of experimenting with ideas that might break my working code*. As far as I can tell, having created a branch on GitHub, a pull request is the only way to merge it into the master - is that right? Or can I just somehow merge it directly?

* i.e. anything I write that’s more complex than print('hello world')

amanset · March 13, 2016, 1:39pm

Just a minor thing, but it is true that GitHub free accounts only allow public projects, right? So everyone else can get your code. Might be something to think about.

Bitbucket has free accounts that allow private repositories. And Atlassian have the free tool SourceTree for dealing with all the gitting.

BigT · March 14, 2016, 12:18am

I’m also wanting to understand this stuff. So my first question is–isn’t branching and committing done on GitHub? Sure, I will change my local copy (as it’s much faster to test that way), but then don’t I commit it to GitHub? Aren’t my branches on GitHub?

Now, I know you can use git and basically just use GitHub as a host. But can’t you also use it directly, and handle all that stuff in the nice GUI instead of the command line tool that gives you no visualization of your project?

Also, how does this workflow sound: Change local copy. Test. Commit to development branch. Continue working on local copy. When enough for a new version is ready, merge dev branch to master. After more extensive testing, compile master branch and host.

Or would three branches be better? A dev branch that works as above. A beta branch where I commit when I think I have a new version. And then commit to master only after extensive testing, and include a compiled copy.

Oh, and feel free to replace GitHub with BitBucket in the above, too. The stuff I make now is all public, but I might not always stick with open source.

Jragon · March 14, 2016, 12:52am

Eh, sometimes I use pull requests on my own stuff. Pull Requests leave a history on the github page of major feature additions, and by default merging a PR is a no-fast-forward merge, meaning you can undo a feature by reverting one merge commit instead of keeping track of the start and end commits for that branch. (Of course, you can easily do a no-ff merge from the command line too, but it’s not the default).

Note, though, that Pull Requests are a github thing, not a git thing. A Pull Request is just github’s sugar for requesting a merge to make collaboration easier.

Also, this is generally my workflow (for working alone, not as a team, for teams even bug fixes get branches and pull requests):

Minor changes can go straight on master
Major features go into a new branch
Before merging a branch, rebase the branch onto master, recompile/test, resolve conflicts
Do a no-fast-forward merge into master
Delete the branch
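
The list above can be sketched as a short command sequence. This is only an illustration in a throwaway repository: the branch name feature, the file names and the commit messages are all made up, and the "recompile/test" step is left as a comment.

```shell
set -e
work=$(mktemp -d)
cd "$work"
git init -q project
cd project
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

echo "v1" > app.txt
git add app.txt
git commit -q -m "Initial commit"

# A major feature goes into a new branch.
git checkout -q -b feature
echo "feature work" > feature.txt
git add feature.txt
git commit -q -m "Add feature"

# Meanwhile, a minor change goes straight onto master.
git checkout -q master
echo "v1.1" > app.txt
git commit -q -am "Minor fix on master"

# Before merging: rebase the branch onto master (then recompile/test here).
git checkout -q feature
git rebase -q master

# No-fast-forward merge into master, then delete the branch.
git checkout -q master
git merge -q --no-ff -m "Merge feature" feature
git branch -d feature

git log --oneline --graph
```

Because of --no-ff, the feature shows up in the history as a single merge commit, even though the rebase would have allowed a fast-forward.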

However, this really isn’t necessary for the OP. Honestly, I started git with just commit, add, push, pull, and reset. Generally with new files I do:

git add .
git commit -am "Whatever"

(The a isn’t necessary, but eh)

With no new files, just changes to existing ones:

git commit -am "Whatever"

And that’s sufficient for most purposes. Then when you get yourself in trouble just use “git reset --hard” to get yourself back to your previous commit (you can also use git reset --hard HEAD~N to go back N commits in case your recent few commits are borked).
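
Here is a small sketch of both forms of reset in a scratch repository, so nothing real is at risk. The file name script.txt and the commit messages are invented for the example.

```shell
set -e
work=$(mktemp -d)
cd "$work"
git init -q sandbox
cd sandbox
git config user.name "Example User"
git config user.email "user@example.com"

echo "works" > script.txt
git add .
git commit -q -m "Working version"

echo "broken" > script.txt
git commit -q -am "Bad idea 1"
echo "more broken" > script.txt
git commit -q -am "Bad idea 2"

# An uncommitted edit we decide we do not want.
echo "half-finished edit" > script.txt
git reset -q --hard            # throw away uncommitted changes, back to the last commit
cat script.txt                 # prints "more broken"

# The last two commits were a mistake as well.
git reset -q --hard HEAD~2     # move the branch back two commits
cat script.txt                 # prints "works"
```

Worth knowing before relying on it: reset --hard discards uncommitted changes to tracked files for good, so it is a tool for getting out of trouble, not a general undo.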

There’s a ton of stuff you can do with git, but I recommend heavily against “learning git”, just google stuff and pick up exciting new commands like rebase and cherry-pick as you need them. I first started using branches as sandboxes where I wanted to make a huge change that would require multiple commits, but wanted to be able to back out in case I ended up not liking my changes.

Rysto · March 14, 2016, 3:52am

The branches exist both on your local machine and in GitHub. This is the key difference between a distributed version-control system (DVCS) like git and a traditional centralized system like svn or cvs. DVCS separates the traditional commit operation into two different operations, which git calls “commit” and “push”. Commit modifies the local branch and adds a new version to the repository. The commit action tells git to remember the exact state of your code as it is right now. Push takes commits done locally and updates the remote (github) branch to match. When you are working with others, push is basically the act of publishing your work to others.

Separating the commit action from the publishing action is a really nice property of a DVCS. It gives you a lot more freedom to try wild experiments without affecting other people’s work. Sure, you can do that in a traditional system just by not committing, but if one step of your wild experiment goes wrong, you don’t have any ability to go back to a previous version of your experiment that was working.

You can try branching, but centralized systems tend to make branching a pain in the ass. It’s usually difficult just to create a branch, and once you have a branch, merging it back into the mainline can be incredibly painful. Some centralized systems do better at it than others, but none of them are as good as a full-featured DVCS.

BigT · March 14, 2016, 5:28pm

Rysto:

The branches exist both on your local machine and in GitHub. This is the key difference between a distributed version-control system (DVCS) like git and a traditional centralized system like svn or cvs. DVCS separates the traditional commit operation into two different operations, which git calls “commit” and “push”. Commit modifies the local branch and adds a new version to the repository. The commit action tells git to remember the exact state of your code as it is right now. Push takes commits done locally and updates the remote (github) branch to match. When you are working with others, push is basically the act of publishing your work to others.

Separating the commit action from the publishing action is a really nice property of a DVCS. It gives you a lot more freedom to try wild experiments without affecting other people’s work. Sure, you can do that in a traditional system just by not committing, but if one step of your wild experiment goes wrong, you don’t have any ability to go back to a previous version of your experiment that was working.

You can try branching, but centralized systems tend to make branching a pain in the ass. It’s usually difficult just to create a branch, and once you have a branch, merging it back into the mainline can be incredibly painful. Some centralized systems do better at it than others, but none of them are as good as a full-featured DVCS.

I still don’t see how this can work, since GitHub will not have access to my file system–other than if I explicitly upload a file. Unless they include some sort of app. Again, I’m not talking about using git itself. I’m talking about using GitHub.

I also don’t see how this extra work is an advantage for single-developer projects, and I see a whole lot of those on GitHub.

If you’re saying you have to use git with GitHub, then I see no reason to actually use Github. I’ve seen git. It uses a command line for stuff that should very much be GUI-based. Buttons make more sense than single word commands you have to remember. And it’s generally better to navigate a folder hierarchy in a GUI.

Mangetout · March 14, 2016, 7:05pm

Version control still makes sense for a lone developer, because (amongst other reasons):
- it provides granular ‘undo’ and ‘redo’ in a way that archived manual copies of source cannot
- it enables a hotfix or patch to be released for a module, even if there is other work midway completed for the same module
- projects that begin as solo efforts don’t always remain that way
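
The hotfix point is easiest to see in a toy repository. The following is a sketch under invented names (module.txt, new-feature, hotfix), not a prescribed procedure: half-finished work stays parked on its own branch while a fix is made from master.

```shell
set -e
work=$(mktemp -d)
cd "$work"
git init -q module
cd module
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

echo "release 1.0" > module.txt
git add module.txt
git commit -q -m "Release 1.0"

# Half-finished work lives on its own branch.
git checkout -q -b new-feature
echo "half-finished feature" > feature.txt
git add feature.txt
git commit -q -m "WIP: new feature"

# A bug report arrives: branch from master, fix, merge back.
git checkout -q master
git checkout -q -b hotfix
echo "release 1.0.1" > module.txt
git commit -q -am "Hotfix for 1.0"
git checkout -q master
git merge -q hotfix
git branch -d hotfix

ls                             # master has the fix but not the unfinished feature.txt
git checkout -q new-feature    # pick the feature work back up where it was left
```

The released line of code never sees the unfinished feature, which is hard to guarantee with hand-copied folders.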

Kinthalis · March 14, 2016, 8:19pm

If you’re looking to have a lot of personal projects you want to keep private, I’d also recommend looking into bitbucket.

For open source software github is fantastic, but bitbucket is where I like to keep code I don’t necessarily want to share with the world. This is because Github allows for unlimited open source projects, but you pay per private repository, while bitbucket charges you per collaborator.

Dr.Strangelove · March 14, 2016, 8:29pm

If nothing else, it’s a good way of backing up your project. Personally, I have several development machines that I switch between, and I need some centralized repository that’s always up. I use my own server but GitHub would be equivalent.

Code that only exists locally is just waiting to die. Even Linus Torvalds found this out the hard way.

Kinthalis · March 14, 2016, 8:29pm

You don’t need to use GitHub to use git, as you point out. You could always host a git server local to you. Of course using GitHub makes collaboration with others much easier, and it allows me, for example, to have access to my code and all versioning features from my PC at work and at home. It’s also safer to have your code both locally and in the cloud.

GitHub does have an app for Windows and OS X (don’t know about Linux) called GitHub Desktop. Bitbucket also has one called SourceTree - well, SourceTree actually works with both Bitbucket and GitHub; I believe it’s a third-party app.

As mentioned, single developer apps don’t always stay single developer apps. Moreover, a good versioning scheme can be an important part of sound software development. Branching for features, keeping master for deployable code (that has passed all your tests), hotfixing from master, etc, etc are all good practices and give you more confidence about the state of your code. Not to mention having a complete history of your code at your fingertips!
