My son is taking computer science in high school. We were discussing whether computers could multi-task. He said that, according to his teacher, computer multi-tasking is a myth and no computer currently does, or has ever done, true multi-tasking.
As we discussed it, we defined “multi-tasking” as the ability to run two or more processes at the same time; it doesn’t matter whether one is in the foreground and one or more are in the background. I don’t know if this is a correct technical definition, though.
I was under the impression that various operating systems, like Unix, could do true multi-tasking (i.e., that it was operating-system based rather than hardware based). He said his teacher told him that as processors have become more powerful they can seamlessly switch between processes so fast that it is “virtual” multi-tasking, but they’re still only doing one thing quickly.
I’ve done some googling, but can’t find the definitive yes or no we were hoping for.
I know there are some very IT-knowledgeable dopers out there. What’s the straight dope on multi-tasking?
You are writing this in 2018? So there’s a bit of confusion here.
All modern operating systems have a component called a “scheduler”. Processes that want to run are stored in a data structure inside the scheduler, and the scheduler chooses the next process to run, lets it run for a while, and then control returns to the scheduler.
The reason this is possible is that pretty much all computers since the first IBM PCs have had, at a minimum, other hardware components that can run separately from the main task. At a minimum you need something called a “timer”, so the computer can track time independently of the process it is currently executing; when a certain amount of time has passed, the timer causes the computer to return control to the scheduler.
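Here's a toy illustration of the idea, just a sketch and nothing like how a real kernel is written: a round-robin scheduler in Python, where each generator plays the role of a process and each yield stands in for the timer handing control back to the scheduler.

```python
from collections import deque

def task(name, steps):
    """A toy 'process': each yield is a point where the scheduler can switch."""
    for i in range(steps):
        print(f"{name}: step {i}")
        yield  # hand control back to the scheduler

# A minimal round-robin scheduler: run each task for one time slice, then rotate.
ready_queue = deque([task("A", 3), task("B", 2), task("C", 3)])
while ready_queue:
    current = ready_queue.popleft()
    try:
        next(current)                 # let the task run for one "time slice"
        ready_queue.append(current)   # not finished: back to the end of the queue
    except StopIteration:
        pass                          # task finished: drop it
```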
Now, modern computers are much more complicated, of course. There are many more parts and it would not be useful to try to describe them all. The main thing is, they now do have multiple physical CPU cores - sort of like having more computers inside your computer - and have been capable of executing several parallel tasks at once for years now.
If a computer uses a roomful of processors, is that not “true” multi-tasking? What about multiple cores? What about a single core that handles multiple tasks in parallel because they all add up to less than 100% of its capacity so there is time to do them all? I don’t understand the objection.
This does sound like a true computer-science question rather than IT. A more complete answer would go into more detail on different types of multi-tasking (time slicing, preemption, cooperation, interruption, etc.), CPU architecture and ancillary hardware, hardware and software (including OS) support and integration, and other topics, but there is no doubt that computers do many things at the same time in every sense.
My expectation would be that there has been true multi-tasking since at least the invention of the external floating point processing unit. Task 1 chucks off a request to the FPU, yields the CPU to Task 2, and both CPU/Task 2 and FPU/Task 1 are processing data at the same time.
Graphics cards, similarly, are performing billions of calculations completely independent of the CPU, and they don’t stop just because the CPU flipped over to a different task.
And, of course, now with multi-core CPUs, there should be some real concurrency going on in the CPU itself. To some extent it may be kept from being as fully concurrent as it could be, due to contention over the kernel, access to hardware resources, etc., but any segments of code that are just doing complex calculations that take a while and have no external dependencies should happen in parallel with any other cores doing the same thing.
If you have a “hyperthreaded” dual CPU core, that is “faked” multitasking as you define it (it’s just one actual core with efficient context switching).
But otherwise you totally have real multi-tasking if you have multiple processors. They are separate processors running at the same time with different registers, etc. running completely concurrently.
What would be the point of having more than 1 core per CPU if computers couldn’t multitask? You may as well allocate all that area to more cache.
More broadly, isn’t every reader of this thread using a computer which runs several processes at the same time, with some in the foreground and others in the background?
The teacher probably confused the statement: “Computers can virtually multitask with one core by switching rapidly” with: “Computers can only multitask in a virtual way which they accomplish by switching rapidly”.
FPU eh? Computers with multiple ALUs showed up by the end of the 1940s. So you can tell your kid that true parallel computers have definitely existed since 1950.
“Processes” are distinct from “threads”. A process contains one or more threads, but also sets of data (registers, memory, etc.). Modern PCs with multiple cores can definitely run multiple threads simultaneously. However, even multi-core CPUs share a single memory controller (and memory space), a data bus (for expansion cards, like video cards), and, often, a certain layer of the cache is shared between all the processor cores, and I think that might be the definition of “process” the teacher is using. I can easily believe this distinction would be discussed in a modern high school comp sci class.
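To make the process/thread distinction concrete, here's a minimal Python sketch (the names and numbers are just for illustration): threads live inside one process and share its memory, while a separate process gets its own copy.

```python
import threading
import multiprocessing

counter = 0

def bump():
    global counter
    counter += 1

if __name__ == "__main__":
    # Threads share the process's memory...
    t = threading.Thread(target=bump)
    t.start()
    t.join()
    print("after thread:", counter)    # 1: the thread modified our variable

    # ...whereas a child process gets its own copy of the interpreter's data.
    p = multiprocessing.Process(target=bump)
    p.start()
    p.join()
    print("after process:", counter)   # still 1: the child changed its own copy
```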
What happens if you have TWO computers? If you have them run two different programs at the same time, is that multi-tasking?
Because that is what a “cluster” is: two (or oftentimes many more) computers connected with a fast network, all processing at the same time.
Yeah, true parallelism has existed for a long time at the instruction level. Old programming optimization involved a lot of deliberate ordering of instructions to take advantage of how many clock cycles you could effectively parallelize.
As far as the “task” level, multiple cores can absolutely run multiple processes at once. Truly parallel. There is some kernel intervention that goes on with schedulers, but things still run in parallel.
As always, there are caveats – notably synchronization points such as accessing shared memory. This is why things like false sharing are such a big deal: if two bits of memory fall on the same cache line, the processor has to keep that line consistent between two threads, and so it stalls each one while it painstakingly ensures read/write consistency. However, if you avoid these issues you can get true parallelism.
GPUs, in particular, are absurdly parallel. I’m not even joking. They’re parallel enough that if statements work differently. What an if statement typically does on a modern GPU is execute both branches (with the threads that didn’t take a given branch masked off) and keep the result of the branch each thread actually took. Same with loops (if it can’t unroll them).
There are a few programming languages that mess with this, though. Notably Python. It’s almost absurd to me that Python has become the de-facto language for numeric work when it just cannot multithread. Literally. There’s a global lock on the interpreter (the GIL) that prevents it: only one thread can execute Python bytecode at a time, so a running thread effectively gets exclusive access to the whole process. There is something called “multiprocessing”, which is a super hacky way of achieving parallelism in Python, but it involves… wait for it… using multiple processes with OS scheduling. Which is in fact the perfect refutation of what the instructor is saying: if the OS did what Python does, there’d be no benefit to hacking around the GIL, and no need for the GIL in the first place.
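If you want to see that for yourself, here's a rough sketch you can run (timings will vary by machine): the same CPU-bound function on four threads versus four processes. The threads take roughly as long as doing the work serially, since only one can execute Python bytecode at a time; the processes get scheduled onto separate cores by the OS.

```python
import time
from threading import Thread
from multiprocessing import Process

def burn_cpu(n=10_000_000):
    """Pure-Python busy work; a thread holds the GIL the whole time it runs this."""
    total = 0
    for i in range(n):
        total += i
    return total

def run_and_time(worker_cls, label):
    workers = [worker_cls(target=burn_cpu) for _ in range(4)]
    start = time.perf_counter()
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    print(f"{label}: {time.perf_counter() - start:.2f}s")

if __name__ == "__main__":
    run_and_time(Thread, "4 threads (serialized by the GIL)")
    run_and_time(Process, "4 processes (spread across cores by the OS)")
```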
I guess it’s not… wrong that there is always some level of synchronization going on with memory and bus access if nothing else – much of it at a hardware level below the OS layer even, but I feel like it’s misleadingly oversimplified to say there’s “no true multitasking”.
I’ve taken, hell, helped teach entire courses on parallelism and highly parallel computing. I know 90% of the weird pitfalls and caveats here, but at the end of the day you can absolutely get true parallel performance out of things, although these tasks will occasionally be interrupted by the OS. But between those interruptions – if designed right, absolutely parallel.
Let’s simplify this. I think your child misunderstood what the teacher was communicating, or the teacher is confused themselves.
Multiprocessing: Using more than one CPU at a time.
Multitasking: Multiple tasks sharing a single CPU.
Things are far more complex these days with multiple cores etc., but that is related to concurrency. I am guessing that the teacher was trying to say that *multitasking* is not parallel computing.
To be fair, in the post-SMP/multi-core era these subjects are pretty poorly covered, as time sharing, multi-tasking and preemptive multitasking just “are” in almost all modern operating systems and have been for well over 20 years now.
“Multitasking” is interleaved execution; it exists and has for almost 50 years. The tasks run one at a time but often appear to the user to run concurrently.
Jragon, if I may be allowed a mini-hijack, if Python is a de-facto standard for numerical work (and how could it be so, given what you explained), how come standard routines like LAPACK do not use it? Did you mean Python just creates a wrapper around the actual numerical work like the Numpy package, which I have found useful? Or, if it is so, what Python compiler do you recommend, if there be one, that can overcome the limitations you mentioned? Numba?
Slight diversion. Numeric work in Python is almost exclusively via numpy and other libraries. It is essentially the same model as Matlab. Your parallelism lives in the libraries. Once you dive into the library it can even release the GIL, and let the main program continue, syncing up later.
The GIL comes in for a lot of flak, but Guido makes a good point - until someone can come up with a way of avoiding it that doesn’t slow down single-threaded code, it will stay. The GIL is only really needed to protect the interpreter’s shared internal state (reference counts, the top-level dictionaries and the like), so you can write code that is multi-threaded, but it takes care to get right.
You could craft Python for parallel work in a data-parallel form. That would take work. But the basic numpy paradigm is not far off anyway, and will usually get you quite significant speedup via true parallelism.
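As a small illustration of that paradigm (whether the multiply actually spreads across cores depends on which BLAS your NumPy build links against, e.g. OpenBLAS or MKL):

```python
import numpy as np

# 2000x2000 double-precision matrices: the multiply below runs in compiled
# BLAS code, not in the Python interpreter, and most NumPy builds link a
# multi-threaded BLAS that spreads the work across cores.
a = np.random.rand(2000, 2000)
b = np.random.rand(2000, 2000)

c = a @ b   # the heavy lifting happens inside the library, outside the GIL

# The pure-Python equivalent (a triple nested loop) would be thousands of
# times slower and would hold the GIL the entire time.
print(c.shape)
```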
What’s so bad about the multiprocessing library? The asynchronous message-passing paradigm is an excellent and reliable way to do concurrent programming, and it avoids *many* possible bugs. I wrote a fairly complex program with it last year and I had zero bugs related to the concurrency itself during the development process.
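For the curious, here's roughly what that message-passing style looks like with the standard multiprocessing module (a toy sketch that squares numbers in a worker process):

```python
from multiprocessing import Process, Queue

def worker(inbox, outbox):
    """Receive work items as messages, send results back as messages."""
    while True:
        item = inbox.get()
        if item is None:          # sentinel: no more work
            break
        outbox.put(item * item)

if __name__ == "__main__":
    inbox, outbox = Queue(), Queue()
    p = Process(target=worker, args=(inbox, outbox))
    p.start()

    for n in range(5):
        inbox.put(n)              # no shared mutable state, just messages
    inbox.put(None)

    results = [outbox.get() for _ in range(5)]
    p.join()
    print(results)                # [0, 1, 4, 9, 16]
```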
And echoing Francis, with Python you can also use libraries that use multiple cores. And in today’s world, basically the most computationally intensive thing we do is neural networks. Conveniently, neural networks generally involve a one-time complex setup of the architecture, but once running, use a fixed architecture.
This works beautifully with Python. You set up the architecture in Python by creating a container object and specifying the architecture. This code may be “slow”, but 99.9% of the work is being done inside the library itself and it’s a one-time cost. Once you set it up, modern libraries use the GPU at full speed to run the model, spending very little execution time inside your interpreted Python script.
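As a rough sketch of that pattern, using PyTorch as one example library (the architecture and sizes here are arbitrary):

```python
import torch
import torch.nn as nn

# One-time setup in "slow" Python: build the architecture as a container object.
model = nn.Sequential(
    nn.Linear(784, 256),
    nn.ReLU(),
    nn.Linear(256, 10),
)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = model.to(device)

# From here on, the number-crunching runs inside the library (and on the GPU
# if one is available), not in the Python interpreter.
batch = torch.randn(64, 784, device=device)
logits = model(batch)
print(logits.shape)   # torch.Size([64, 10])
```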
Numpy is similarly quick. I just haven’t found the speed of Python to be an issue, and in the rare cases it is an issue, there are numerous tools and techniques to keep the bulk of your code readable and in Python, and you judiciously speed up the small portion that actually needs it by putting inline C in or using a C extension module or autogenerated C code.
You will also find a lot of jargon bandied about that has multiple meanings. That makes it hard to tease out what is meant. Words like process and thread and task need to be interpreted with care. They have different meanings in different operating systems. Even across Unix variants it is sometimes a little unclear, but once you get to fundamentally different OS designs it gets even harder.
Unless the notion of task is well defined you are going to find it hard to get a proper answer. One might also want to define computer and processor with a little more precision. As noted above, any multi-core processor can run more than one thread of execution at one time. Hyperthreaded processors are a very curious grey zone. They can only retire one instruction at a time, but can have instructions from more than one thread of execution in flight at once, and at every cycle multiple internal pipeline components can be working on each thread of control’s instructions. They are absolutely doing more than one thing at once some of the time. Beyond Intel’s feeble two-thread hyperthreading, architectures like the MTA from Tera could have instructions in flight from hundreds or thousands of separate threads at once. Each processor only retired one instruction at a time, but a significant amount of work could be occurring on a set of the executing threads every cycle. (The MTA was a really neat set of ideas, and it is a great shame it all vanished.)
But if you take the notion of a computer as box of stuff, it is unusual for there to be only one processor core inside. Your phone has multiple cores, and even what might be considered as very simple devices can have quite surprising capability.
The notion of at the same time needs defining well too. As noted above, there are sources of contention inside a computer system. But systems are designed so that such contention is limited, and things can mostly progress efficiently without running into contention too often. Caches eliminate the majority of bus contention for memory accesses. Multi-level caches are used to good effect inside multi-core processors to help avoid internal contention, as well as generally speeding up the cache system.
Threading has a central lock, which is problematic for many needs. “Multitasking” isn’t really related to the options for concurrency in Python though, as this is an OS function.
If you are a Python programmer, or want to know more about modern concurrency needs and concerns, watch this video by one of the Python core contributors.
If you don’t have the time to do so, the short version is: threading doesn’t scale, parallel is complicated, and distributed has latency. This is a horses-for-courses decision and not a generalized “good enough” topic. Sometimes “broadcasting”, like GPUs can do, is an even better option if you want to get complicated.
But for the OP “real multitasking” has been in use for decades, don’t get confused by these distractions.
“True Multitasking” exists, has for decades and the teacher needs to review their materials or teaching methods if they make any other claims.
Which “needs” are those? At the company I work at, our main product is a near unmaintainable smoldering disaster because the codebase is almost solely in C/C++. And the product is a device with only 2 actual processor cores. There are around 700 separate threads and many fine grained handwritten locks and mutexes spread all throughout the codebase.
Rare failures from race conditions clog our defect list and some of them have been there for years because they are nearly impossible to find.
Correctness is far more important than speed. (For a recent example, note all the recent CPU bugs; most of them are hardware-level race conditions from attempting to speed up serial tasks.) And for our product, given we only have 2 real cores, we get zero benefit from multi-threading at all.
Most Python scripts similarly get zero benefit, being rate limited at some other point in the code. And for whatever it is that you want to run using multithreading - why not carve out the tiny piece of your code that actually legitimately needs the speed and use an extension? Or multiprocessing?
Go and Rust both have much better stories for this kind of concurrent work. Go has its own scheduler for large-scale concurrent workloads, such as web work, where you can spawn a green thread (a goroutine) for, say, each inbound connection and let the scheduler figure out when to execute things.
Rust, in theory, is way better for intense compute-bound applications such as numeric computation or graphics work because it compile-time checks things like memory aliasing. (I say in theory because at the moment the type system makes it a bit of a pain to work with matrices; we’re waiting on type-level numerics.)
This is true in very limited situations, but there are a lot of cases where this just falls apart. For instance, any attempt to implement the A3C algorithm in Python is going to be obnoxious. So much so that PyTorch developed its own weird fork of multiprocessing just so people didn’t pull out their hair trying to figure it out. Hell, there are a ton of well-meaning tutorials on implementing A3C in Python that are just plain wrong because they use threading instead of multiprocessing.
I’m working on research right now which involves dozens of neural nets in an ensemble (it’s complicated), and since none of the NN libraries really have a notion of submitting a batch of requests for multiple networks in parallel and awaiting their responses, we need a multithread architecture in order to get something scalable. Python’s model works until it really, really doesn’t.
Numpy itself is quick, sure, but there’s a lot you can’t do with Python without involving really weird niche extensions like numba. And honestly the worst thing with Python is how much you need to dip into other languages when writing applications and then write C-wrappers that you then wrap with a Python interface.
At one point we had to write an intensely performance sensitive application to do reinforcement learning on and it was clear Python wasn’t fast enough for the volume of work it had to do (think a small RTS-type game). We had to write the whole thing in a native compiled language, wrap that in a C interface, and then write the learning code in Python (because 99% of all ML libraries worth using are in Python).
Pretty much this. Whether the teacher was confused or the child was confused, the interpretation in the OP is flat-out wrong. Multitasking has existed for at least 55 years (I’m thinking of the DEC PDP-6, but it may have been around longer than that).
As rat avatar has noted, there is a fundamental distinction between multitasking and multiprocessing. Multiprocessing came later, partly because it was hard to figure out how to do it from an OS architecture standpoint, and partly because CPUs were so expensive that it was rare to have more than one anyway.
There are many variations of these concepts, too, with correspondingly different names. Multitasking can be time-sliced, preemptive, or (less commonly) cooperative, and all of the above at the same time. When multitasking occurs in support of sharing a single computer among multiple users for general computing, with support for individual user security, accounting, and authentication, it’s commonly referred to as timesharing, relatively rare today. One could also argue that multitasking can also occur in support of large numbers of financial transactions, generally backed by a database and fronted by middleware like IBM’s Customer Information Control System (CICS); such an arrangement is commonly referred to as transaction processing.
Multitasking may also occur in realtime systems engaged in work like data acquisition. A prime example of such a system was a DEC PDP-11 running one of the family of Realtime System eXecutives like RSX-11M. The scheduled units of execution were even called “tasks” (its successor, VAX/VMS, and later Windows, started to refer to these protected execution units with their own virtual address space as “processes”). This is kind of important to note, because sufficiently responsive priority-based interrupt-driven preemptive multitasking not only gives you the “appearance” of a task apparently having the resources of the whole computer, but in actual fact being able to respond to time-critical realtime events with arbitrarily small latency. Timesharing systems, OTOH, are in one sense less demanding of their scheduling algorithms (because responsiveness is not a bright line between mission success and failure), and on the other, large systems were potentially more demanding because simplistic round-robin timeslicing had to be replaced by elaborate “fairness algorithms” to keep the system responsive under the load of potentially hundreds of users. I guess the takeaway here is that not only has multitasking existed for a long time, but it has many variants with different design goals.
Concepts like “hyperthreading” in hardware should not be confused with multitasking. My own view on this is that these hardware assist capabilities, and to some extent even multiple cores, were to at least some extent encouraged by the fact that Windows at the UI level fundamentally sucks at multitasking.
Multiprocessing is completely different and is the actual parallel execution of tasks (or processes) on multiple CPUs. In most contexts these CPUs share common memory; this is formally called tightly-coupled multiprocessing. Building the first such OS was a major challenge, and the first such attempts were often compromises called master-slave configurations, or asymmetric multiprocessors, where the OS on a master processor “farmed out” a compute-intensive task to the slave. It was only later that OS architectures supporting fully symmetric multiprocessing with large numbers of nodes were developed. Today we take for granted that Windows supports symmetric multiprocessing across multiple cores, but again, this has nothing whatsoever to do with multitasking.
Someone mentioned multiple CPUs connected by a fast network. Here, memory is not shared, and multiprocessing typically requires a collaborative distributed application architecture. In return for the disadvantages of loose coupling, clusters benefit from potentially high reliability due to redundancy and the potential for automatic failover. Thus, cluster configurations aren’t usually referred to as multiprocessing, and are frequently used when high reliability is paramount. Examples were the VAXcluster and current Windows server clusters.