I’m working in Go now, and have been getting into concurrency. I’ve been writing a few slightly complex component-based systems, and for practicality I decided to make them concurrent in various places. Obviously this means dealing with all the shit concurrency makes you deal with.
First, a tiny bit of syntax background, since most of you haven’t used Go. A goroutine is essentially a routine that runs concurrently with other goroutines (including the main routine). They’re not threads at the OS level (though they can run on separate threads), but in a lot of cases you can basically think of them as threads, since they behave in similar ways. To spawn a goroutine you type “go” in front of a function call. You can sort of think of a function spawned as a goroutine as a unix command launched with &: all it means is that your current goroutine/thread doesn’t wait for it to return.
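For anyone who wants to see the “go” keyword in action, here is a minimal sketch. The function name spawnAndWait is made up for illustration; sync.WaitGroup is the standard library's way to wait for goroutines, which the caller otherwise would not do.

```go
package main

import (
	"fmt"
	"sync"
)

// spawnAndWait is a hypothetical helper: it starts n goroutines with "go"
// and then explicitly waits for all of them to finish.
func spawnAndWait(n int) []int {
	results := make([]int, n)
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		// "go" starts the call in a new goroutine; the loop does not wait for it.
		go func(i int) {
			results[i] = i * i // each goroutine writes only to its own slot
			wg.Done()
		}(i)
	}
	wg.Wait() // without this, spawnAndWait could return before the goroutines ran
	return results
}

func main() {
	fmt.Println(spawnAndWait(4)) // prints [0 1 4 9]
}
```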
Another thing to understand is the “defer” keyword, which schedules a function call to run when the current function returns. Useful for closing files and unlocking mutexes.
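A small sketch of “defer”, assuming a made-up counter type (not part of my actual code). It shows the usual lock/defer-unlock pairing, plus the fact that deferred calls run in last-in, first-out order once the function body is done.

```go
package main

import (
	"fmt"
	"sync"
)

// counter is a hypothetical type used only for this illustration.
type counter struct {
	mu sync.Mutex
	n  int
}

// Incr locks the mutex, schedules the unlock with defer, and returns the new value.
func (c *counter) Incr() int {
	c.mu.Lock()
	defer c.mu.Unlock() // runs when Incr returns, including on early returns
	c.n++
	return c.n
}

// order records when the body and the deferred calls run.
func order() []string {
	var log []string
	func() {
		defer func() { log = append(log, "deferred first") }()
		defer func() { log = append(log, "deferred second") }()
		log = append(log, "body")
	}()
	return log
}

func main() {
	c := &counter{}
	c.Incr()
	fmt.Println(c.Incr()) // prints 2
	fmt.Println(order())  // prints [body deferred second deferred first]
}
```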
I am so happy my test cases caught this, because I wasn’t testing for it. They caught it purely by accident.
// Without too much fluff, "(sys *System)" just means "this function belongs to a pointer to an instance of type System"
func (sys *System) DoSomething() {
	go sys.functionWithSideEffect() // spawn a new goroutine with a side effect
}
func (sys *System) functionWithSideEffect() {
	sys.RWMutex.Lock()         // lock the mutex for writing
	defer sys.RWMutex.Unlock() // unlock the mutex after functionWithSideEffect returns
	sys.someVariable = something
}
func (sys *System) GetSomething() VarType {
	sys.RWMutex.RLock()         // lock the mutex for reading
	defer sys.RWMutex.RUnlock() // unlock after return
	return sys.someVariable
}
Seems okay, maybe.
Here were my tests (well, the minimal version):
sys := NewSystem()
sys.DoSomething()
a := sys.GetSomething()
if a == nil {
	t.Errorf("a wasn't set. Oh no!!") // t is the usual *testing.T
}
This, being a bug thread, was obviously calling the error function.
What’s the problem? Well… I had a race condition. With mutexes. Specifically, would the reader or writer lock it first?
I figured that it “made sense” to lock the mutex in the new goroutine. This was a bad idea. You see, apparently the “go” statement only schedules the new goroutine and doesn’t wait for it to actually start running, so DoSomething() returned before the code ever entered functionWithSideEffect().
However, the GetSomething function was not concurrent and did not have that overhead, so it got the lock first (i.e. didn’t wait for the function we called DoSomething for to do its thing), and thus when it returned the desired variable, it wasn’t set yet because the writer was waiting on the reader to unlock the mutex.
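To make the race concrete, here is a rough, self-contained sketch of the buggy layout. The System struct with an embedded sync.RWMutex and a *string field is my guess at a minimal stand-in for the real types, and readerWins is a made-up helper. The count it reports depends on the scheduler, so it will vary from run to run and machine to machine.

```go
package main

import (
	"fmt"
	"sync"
)

// System is a hypothetical minimal stand-in for the real type.
type System struct {
	sync.RWMutex
	someVariable *string
}

// DoSomething spawns the writer and returns immediately (the buggy version).
func (sys *System) DoSomething() {
	go sys.functionWithSideEffect()
}

func (sys *System) functionWithSideEffect() {
	sys.RWMutex.Lock() // the write lock is only taken once the goroutine runs
	defer sys.RWMutex.Unlock()
	v := "something"
	sys.someVariable = &v
}

func (sys *System) GetSomething() *string {
	sys.RWMutex.RLock()
	defer sys.RWMutex.RUnlock()
	return sys.someVariable
}

// readerWins counts how often the reader got the lock before the writer.
func readerWins(trials int) int {
	wins := 0
	for i := 0; i < trials; i++ {
		sys := &System{}
		sys.DoSomething()
		if sys.GetSomething() == nil {
			wins++ // the variable was not set yet
		}
	}
	return wins
}

func main() {
	fmt.Printf("reader got the lock first in %d of 1000 runs\n", readerWins(1000))
}
```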
This was easily fixed by locking the mutex before spawning the goroutine, but the reason I put the mutex where it was in the first place was to try to head off race conditions (and concurrent modification problems, obviously).
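Roughly, the fix looks like the sketch below. The struct layout, the *string field, and NewSystem are assumptions standing in for the real code. It relies on the documented fact that a locked sync.RWMutex is not tied to a particular goroutine, so one goroutine may Lock it and arrange for another to Unlock it.

```go
package main

import (
	"fmt"
	"sync"
)

// System is a hypothetical minimal stand-in for the real type.
type System struct {
	sync.RWMutex
	someVariable *string
}

// NewSystem is an assumed constructor.
func NewSystem() *System { return &System{} }

// DoSomething takes the write lock in the caller, before the goroutine exists,
// so no reader can sneak in between the call and the write.
func (sys *System) DoSomething() {
	sys.RWMutex.Lock()
	go sys.functionWithSideEffect()
}

// functionWithSideEffect assumes the write lock is already held and releases it.
func (sys *System) functionWithSideEffect() {
	defer sys.RWMutex.Unlock()
	v := "something"
	sys.someVariable = &v
}

func (sys *System) GetSomething() *string {
	sys.RWMutex.RLock() // blocks until the writer has unlocked
	defer sys.RWMutex.RUnlock()
	return sys.someVariable
}

func main() {
	sys := NewSystem()
	sys.DoSomething()
	fmt.Println(*sys.GetSomething()) // prints something
}
```

One trade-off with this layout: DoSomething itself now blocks if an earlier write is still in flight, and functionWithSideEffect must never be called without the lock already held.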
I’m sure anybody who’s worked with concurrency and multithreaded environments before is laughing at me and how obvious a bug it was. And to be fair, I fixed it in about 10-20 minutes, I just found it annoying and a bit funny that there was a race condition about who was going to lock the thing meant to prevent a race condition.