What's the relationship between C and C++?

No, it’s really both. Low-level machine access is important, of course (like the memory-mapped configuration or I/O registers you mentioned). But the close mapping between language statements and generated code is also important (for my day-to-day work, significantly more important).
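To make that concrete, here’s a minimal sketch (the type and function names are made up for illustration). A function like this typically compiles to just a handful of instructions, so lining a disassembly up against the source is straightforward; the exact output depends on the compiler and flags, of course.

```cpp
// Hypothetical example: a trivial accessor whose generated code is easy to
// recognise in a disassembly. At -O2 a typical x86-64 compiler emits roughly
// a load, an xor with a memory operand, and a return for checksum().
struct Packet {
    unsigned len;
    unsigned crc;
};

unsigned checksum(const Packet& p) {
    return p.len ^ p.crc;   // maps almost one-to-one onto the emitted instructions
}

int main() {
    Packet p{42u, 7u};
    return static_cast<int>(checksum(p));
}
```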

Just the other day I was yet again presented with some crash information, which consisted of the assembly instructions near the fault and a register dump. I knew the function where it crashed but not much more. So I had to map the instructions back to the original source, making some inferences along the way due to the lack of symbols. It was annoying, but at least possible to do in C++. It would have been hopeless with some interpreted language, since the fault would have happened in the runtime or some external library. But in C++ it was relatively easy to map back to the source.

I was agreeing with everything you said, until you got to this bit, without, apparently, changing the subject.

The C runtime was always bigger than the FORTRAN or Pascal runtime: FORTRAN’s because its runtime didn’t include much functionality, and Pascal’s because (for original design reasons) it lent itself to smarter linking.

The design of the library, runtime, and linking system is surely one of the reasons why C is a systems language (alongside, as the originators observed, random luck and being in the right place at the right time), but it’s not as if C ever generated a smaller ‘massive infrastructure’ than those languages.

If that were really true in the large scheme of things, then all applications would always be written in assembler. The fact that tracing a machine-level crash dump (I presume you mean an application crash dump) back to the original source code happens to be useful to you does not make it a priority in selecting an appropriate programming language. “Let’s write this huge application in this low-level language instead of this much more abstract language specifically designed for it, so that interpreting a crash dump will be easier” is a statement that I’m sure has been made exactly zero times in the history of application development. All the more so when the more abstract and appropriate language may let you do the development in one-tenth the time. Abstract high-level languages will have their own appropriate debugging tools, providing diagnostic information at an appropriate and useful level, even if operating systems continue to give us (often useless) crash information like “Stack overflow at 0x0069F820”.

The problem we seem to have here is ambiguity about exactly what a “runtime” is, because the term is used in different ways in different programming environments. I’ve seen it used to refer to the C standard library, but note what I said upthread: “(runtime systems are not to be confused with the C standard libraries)”.

The important distinction is that any sort of function or macro that a programmer wishes to use, which in C would be defined by an included header file, is simply a library module that is included or not, depending on need, under the control of the programmer. In that respect it’s exactly the same as calling a library routine from an assembly program.

Whereas a true runtime system is an under-the-covers infrastructure that the programmer never sees directly, for which the compiler creates calls in the generated code, and which must always be present because no compiled executable can run without it. These runtimes support intrinsic features of the language. The more complex and functionally rich the language, the bigger these runtime systems tend to be.
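To make the distinction concrete, here’s a minimal sketch. Nothing in it calls the runtime by name, yet the compiler still emits calls into it to implement exception handling; the __cxa_* names in the comments are from the Itanium C++ ABI used by GCC and Clang, and MSVC has its own equivalents.

```cpp
// The programmer only writes throw/try/catch; the compiler generates the
// calls into the C++ runtime that actually allocate, throw, and unwind.
#include <stdexcept>

int parse_digit(char c) {
    if (c < '0' || c > '9')
        throw std::invalid_argument("not a digit");  // compiler emits calls such as
                                                     // __cxa_allocate_exception / __cxa_throw
    return c - '0';
}

int main() {
    try {
        return parse_digit('7');
    } catch (const std::invalid_argument&) {
        return -1;   // stack unwinding is driven by the runtime, not user code
    }
}
```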

It’s not an application crash dump, though it’s sorta equivalent to one. It’s not an application at all. Not everything is an application.

Straight assembly is (mostly) a no-go due to the tremendous cost of developing and maintaining it (CPU intrinsics get us most of the way there without actual assembly). C and C++ hit a happy middle ground here: they are relatively high-level languages, making for fairly productive engineers, but with mostly deterministic code generation.

As it happens, just recently I fought a stack overflow bug of that nature. It was not the typical case of a recursive function run amok. In fact it only happened when I set a compiler flag to enable C++17 support!

Again, I was only able to make progress because of the clear translation from code to assembly. The new compiler mode tried to optimize some function calls by grouping together all the stack pointer math. Instead of basically doing a push/pop on each call, it saved all the pops for the end, which meant it wasn’t reusing the stack space, which meant it was using way more stack for functions with many function calls and big structures in the arguments.
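A sketch of the kind of code shape that got bitten (the names and sizes here are made up, not the actual code):

```cpp
// Each call passes a large struct by value. If the compiler allocates the
// outgoing copy for each call site separately and defers all the
// stack-pointer adjustments to the end of the function, instead of reusing
// one slot per call, run_all()'s frame grows far beyond what a quick read
// of the source would suggest.
struct BigConfig {
    char buffer[4096];
    int  flags;
};

void process(BigConfig cfg) {   // by value: the caller builds a copy on its stack
    (void)cfg;                  // stand-in for real work
}

void run_all() {
    BigConfig cfg{};
    process(cfg);               // copy #1
    process(cfg);               // copy #2 -- may not reuse copy #1's space
    process(cfg);               // copy #3
    process(cfg);               // copy #4, and so on
}

int main() {
    run_all();
}
```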

It’s an example of non-deterministic code generation. But at least I was able to determine what the problem was and how to avoid it. Something that again would have been hopeless in an interpreted language.

I can come up with examples like this all day. I can’t afford to throw up my hands at errors with limited information. And I haven’t even gotten to the real low-level stuff, like decoding page table entries or dealing with system interrupts or figuring out race conditions in multithreaded code or priority inversion bugs or a million other things…

When it comes to performance there can be a lot more to things than how good the assembler listing looks. Modern processor implementations perform a lot of optimisation that is not reflected in the machine code.

I burned a few years of my life coding a mixed machine code, C++, and Python system. I wrote a discrete event simulator kernel in machine code that was used by a simulator library written in C++, where the simulations themselves were created, manipulated, visualised, and debugged in Python. It ran on both Windows and Linux, so there were some amusing things to learn about the deep parts of each. One became intimately familiar with the systems and tradeoffs. One thing is that C++ does not play well with the Windows exception model, so much so that Microsoft advised they’d prefer you didn’t use it.

One of the most important aspects of speed is that it isn’t about the instructions. You need to keep the pipeline full and that is about avoiding stalls. That is mostly about the data. Getting the data to and from the processor pipeline is the dominant place where you lose time. Caches are what makes things run fast. Then anything else that can stall your pipeline. Which mostly means jumps. Jump prediction tables and speculative execution help. None of this is obvious from inspection of the machine code. But modern compilers can perform some very deep optimisations if given the chance. Things like partial execution of code can yield insights the compiler can use to choose how blocks are structured or branches are taken. Hints can be inserted for preferred branch direction, and given enough information vector instructions can be used for tight critical stuff. It can be pretty amazing what can be done, but also depressing when it doesn’t actually work.
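For example, branch-direction hints are just annotations in the source (the C++20 [[likely]]/[[unlikely]] attributes here; GCC and Clang also accept __builtin_expect for the same purpose):

```cpp
#include <cstdio>

int checked_div(int a, int b) {
    if (b == 0) [[unlikely]] {       // tell the compiler this path is rare,
        std::puts("divide by zero"); // so the hot path can fall straight through
        return 0;
    }
    return a / b;
}

int main() {
    std::printf("%d\n", checked_div(10, 2));
}
```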

Data layout can kill your performance. On a 64-bit ISA, alignment can mean huge wins or losses. But pressure on the I-cache can be an issue.
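A quick illustration of the alignment point: reordering members changes a struct’s size purely through padding (the exact sizes assume a typical 64-bit ABI):

```cpp
#include <cstdint>
#include <cstdio>

struct Padded {       // the 8-byte pointer forces 8-byte alignment
    char     tag;     // 1 byte + 7 bytes of padding
    void*    ptr;     // 8 bytes
    uint16_t id;      // 2 bytes + 6 bytes of tail padding
};                    // typically 24 bytes

struct Repacked {
    void*    ptr;     // 8 bytes
    uint16_t id;      // 2 bytes
    char     tag;     // 1 byte + 5 bytes of tail padding
};                    // typically 16 bytes

int main() {
    std::printf("%zu vs %zu\n", sizeof(Padded), sizeof(Repacked));
}
```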

If you have a language that can express the semantics well it gives the compiler a lot more freedom to create good code. Writing code where you are trying to optimise things at the language level can be counter productive. The compiler can almost always do a better job of tracking dependencies in the code and working out what to evaluate when and how register allocation can be used to best effect. A lot depends upon the sort of code you are writing. Numerically intensive code has significantly different trade-offs, and the level of intrinsic parallelism possible similarly so.

There are lots of places where optimisations occur that are not obvious at the code level in higher level languages. Java has used just-in-time compilation for decades.

Compilers can perform lots of storage allocation optimisations. If a compiler can statically prove that an object allocated inside a routine is not visible outside it, it can allocate the object on the stack and avoid leaving it for the GC. This can lead to significant gains in unexpected places. Some languages have missed a few tricks with how they lay out objects.

There has been an argument that using C++ lets the compiler produce really good code. In my opinion and experience, the language is still too messy to allow really good optimisation. Believe it or not, a modern FORTRAN will wipe the floor with it for many HPC codes. For really low-level tight operations, getting down to the metal can yield big gains.

The simulator kernel I wrote was tweaked down to the cycle level and we profiled it to death, understanding where every cycle went, especially watching for cache and branch prediction gains. The core dispatcher could not be written in anything but assembler because it needed to bridge the gap between the OS process model and the call standard of the compiler. On Windows we needed to access segment registers as well, and manage both the C++ and Windows exception models. The main simulator was byte coded using a threaded interpreter. Again, one needed to understand exactly what the compiler generated; you could lose a huge amount of time with naive code structures, and understanding what each compiler did was critical.

The Windows and GNU C++ compilers do not generate similar code, and any idea that you could assume good code from either without going deeper was sadly wrong. This perhaps underlies the problem with assuming that any high-level language gets you close enough to the machine to always craft fast code. With the same source code, compare the output of different compiler back-ends and see what you get. You can get some remarkable surprises.

Oh you mean the fun stuff! That was the sort of thing that kept me enjoying the game. There is little more satisfying than tracking down these sorts of bugs and fixing them.

Now I am having a flashback to writing (in Modula2) a text-UI interface that drew strongly on dBase and Norton programs for making dialog boxes and other user input elements with buttons/dropdowns/file selectors/check-boxes.

I even included double-line shadows on buttons that vanished as the mouse-down event occurred to simulate the button being pressed in.

And all with a self-designed message-passing input signaling mechanism that was primarily informed by reading a book on Microsoft Windows programming (but never actually having written any code for Windows).

Niklaus Wirth would like a word, and the word is Oberon

:rofl:

How does that relate in any way to what I said?

The point of Oberon was to write the Oberon OS in a high-level structured programming language (Oberon), so that as much of the OS code was accessible as possible to an Oberon programmer. The OS reflected the language features, and the language reflected the OS.

I have always found the concept appealing, but from a pragmatic perspective, I see why C is favoured for low-level programming, for all its faults.

I guess my point is that it (Oberon the programming language) was developed after C and C++ in an attempt to create a modern language with the simplicity of C, but with the goal of writing systems programs as one of its core capabilities. It ain’t BASIC or FORTRAN. I’m clearly not disputing that substantial portions of an OS can be written in an appropriately capable high-level language. But application-oriented languages like BASIC and FORTRAN are not such languages.

The modern equivalent is Rust. It’s a compiled language with features appropriate for OS development but without the memory safety problems of C/C++. I’m looking forward to seeing it used for OS and driver code (and should really learn it one of these days).

Rust is really interesting. For the OP’s purposes it might be a good language to pick. Certainly a vast improvement over trying to pick up C++.

Oh, if it were only that simple.
First, you are ignoring routing. Plunking down tons of transistors doesn’t help much if you can’t connect them to anything.
Then there are design rules. Since there is variation in placing transistors and signal paths onto silicon, you have to back off doing the most crowded placement in order to get any kind of decent yield. The smaller the feature size, the bigger the problem this is.
Then there is reliability. The smaller the features the higher the early life fallout and the lower the expected life of the part. Reliability people tend to be conservative. Our guy told me confidently that there should be a certain failure rate for our processor. Nope. Looking at field returns, which we collected, I could tell exactly how many failed and it was a lot less than his prediction. But it could easily go the other way.

Finally, designing with all those transistors is not easy, and you can see that the way we do it is to replicate chunks of the design, either processor cores or memory. That’s going to be less efficient than designing from scratch, and slower also, but it does get the design done in some kind of reasonable time.

But this only holds for processors. For computing systems I agree with you. Optimization, which I’m old enough to have had to worry about, should get done top down, of course, through measurement, and finding places where you screwed up. The kind of detailed optimization the old assembly language people loved can happen in certain applications, but is rare today.
But the response of Word to a new character is not a lot faster than the response of emacs on a terminal hooked up to a 3B-20 35 years ago.

I’m ignoring a lot more than that :slight_smile: ! But really, routing should be one of the easier problems–much easier in 3D than it is in 2D. Well, assuming you have the tools for it. Power and heat, on the other hand…

I was joking a bit there, my point being just that we are nowhere close to the physical limits of computing hardware. And I could certainly use that extra horsepower, so current hardware feels slower than it really needs to be.

Where I work, it goes without saying that almost all the processors we produce will have defects, and so almost all are sold with a few functional units disabled. I certainly expect this to continue as transistor counts rise further.

I think we need to go further, though. The processor I described will have to be exceptionally power efficient, and run just barely above its minimum voltage. Low enough that they’d experience bit errors at an unacceptable level. At a certain point, it’ll be cheaper to add error detection/correction circuits than to raise the voltage enough to make them reliable.

We’re starting to see this in some places. PCI-e 6.0 will be fast enough that it simply can’t guarantee reliable bit transfers. It adds forward error correction so that it can run reliably on an unreliable link (and without cranking up the power, etc.). We’re used to that with noisy radio links or physical media, but not short-distance “digital” (note: no longer really digital!) transfers. I think the same basic ideas will make it inside the processors themselves.
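As a toy illustration of the forward-error-correction idea (nothing like the actual PCI-e scheme, which is far more elaborate), here’s a Hamming(7,4) code that corrects any single flipped bit without a retransmission:

```cpp
#include <bitset>
#include <cstdio>

// Encode 4 data bits into a 7-bit codeword (index 0 = codeword position 1).
std::bitset<7> encode(std::bitset<4> d) {
    std::bitset<7> c;
    c[2] = d[0]; c[4] = d[1]; c[5] = d[2]; c[6] = d[3];  // data at positions 3,5,6,7
    c[0] = d[0] ^ d[1] ^ d[3];                           // parity p1 covers 1,3,5,7
    c[1] = d[0] ^ d[2] ^ d[3];                           // parity p2 covers 2,3,6,7
    c[3] = d[1] ^ d[2] ^ d[3];                           // parity p3 covers 4,5,6,7
    return c;
}

// Decode, repairing a single-bit error if the syndrome flags one.
std::bitset<4> decode(std::bitset<7> c) {
    int s1 = c[0] ^ c[2] ^ c[4] ^ c[6];
    int s2 = c[1] ^ c[2] ^ c[5] ^ c[6];
    int s3 = c[3] ^ c[4] ^ c[5] ^ c[6];
    int pos = s1 + 2 * s2 + 4 * s3;      // 1-based position of the flipped bit, 0 if none
    if (pos) c.flip(pos - 1);
    std::bitset<4> d;
    d[0] = c[2]; d[1] = c[4]; d[2] = c[5]; d[3] = c[6];
    return d;
}

int main() {
    std::bitset<4> data("1011");
    std::bitset<7> sent = encode(data);
    std::bitset<7> received = sent;
    received.flip(5);                    // simulate one bit error on the "link"
    std::printf("recovered ok: %d\n", decode(received) == data ? 1 : 0);
}
```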

That’s true; it only gets worse when you include the rest of the system. The microsecond shaving I mentioned is only relevant because memory is slow (only 20 GB/s on a typical system). Disk, network, the OS, applications, etc. all take their further cut.

Never was digital. :stuck_out_tongue_winking_eye: It was always analog, just a matter of how much of an approximation you could get away with in design. You could do some fabulously evil circuit design with simple “digital” chips by understanding the intrinsic internal analog reality.
I remember trying to get this across to my undergraduate computer architecture class to help them understand why processor design was never a simple scaling problem. Everything is as wobbly as you can manage just before it stops working. I don’t know if it ever stuck. The EEs in the class at least got it. The CS students probably forgot 10 minutes after the exam.
I remember taking a Unibus extender cable into a lecture to explain a bus. A few students really enjoyed seeing that. It was about 10 feet long. Ah the days of steam driven computers when you had to carve your own bits from the rock with a hammer and chisel.

No, but mostly we could pretend that it was :slight_smile: . Well, at least I could, for a time…

One that I ran across recently: you can get a LED both to emit and sense ambient light with just two digital pins on a microcontroller. Hook the LED’s leads to the two pins with a current-limiting resistor inline. One direction, it’s a regular LED that you can control via PWM. However, if you reverse the polarity you will charge the small capacitance of the LED and pin drivers. If you then change the high-level pin to input, the charge will slowly drain based on the light levels–because even a plain LED will act as a crude photodiode. After some time the pin will switch from 1 to 0 as it falls below the threshold. The amount of time it takes will vary inversely based on ambient light levels. Get the timing right, and you can even put the sensing operation between PWM periods.
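Here’s a rough Arduino-style sketch of the trick; the pin numbers and timeout are made up, so adapt them to your board and wiring:

```cpp
// LED as both indicator and crude light sensor, using two digital pins.
// Anode on ANODE_PIN, cathode (through the current-limiting resistor) on
// CATHODE_PIN. Pin numbers here are hypothetical.
const int ANODE_PIN   = 2;
const int CATHODE_PIN = 3;

unsigned long senseLight() {
    // Reverse-bias the LED to charge its small junction capacitance.
    pinMode(ANODE_PIN, OUTPUT);
    pinMode(CATHODE_PIN, OUTPUT);
    digitalWrite(ANODE_PIN, LOW);
    digitalWrite(CATHODE_PIN, HIGH);
    delayMicroseconds(50);

    // Float the cathode and time how long the charge takes to leak away;
    // more ambient light means a faster discharge, so a smaller reading.
    pinMode(CATHODE_PIN, INPUT);
    unsigned long start = micros();
    while (digitalRead(CATHODE_PIN) == HIGH) {
        if (micros() - start > 100000UL) break;   // give up after ~100 ms in the dark
    }
    return micros() - start;
}

void setup() {
    Serial.begin(9600);
}

void loop() {
    Serial.println(senseLight());   // smaller number == brighter ambient light
    delay(200);
}
```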

Ooooh!! :astonished: That is fabulous.

Yes, that is an appealing concept, but as I said earlier, Oberon was a modern language developed subsequent to the era of C and C++, with system programming as one of its primary goals. I just wanted to add here that your comment got me thinking about an old timesharing OS called RSTS/E that ran on the DEC PDP-11 series back in the 70s and 80s, which was centered around BASIC, and seems to be a superficially similar concept (but isn’t really). RSTS/E was commonly used in educational settings, and there were several of these systems in the university where I worked, mostly running on the larger models of PDP-11.

RSTS/E was always centered around BASIC (actually an extended version called BASIC-PLUS) and that was the only language supported in the initial versions, with the runtime for compiled BASIC integrated into the OS. Users were faced with the traditional “Ready” prompt and could enter BASIC commands interactively or compose programs. As a timesharing system, these earlier versions were essentially just a glorified multi-user BASIC. If there was ever an operating system that should have been written in BASIC, this was it. And, indeed, the system utilities were in fact written in BASIC-PLUS.

So does this contradict what I said about BASIC and FORTRAN being unsuitable for writing operating systems? No, it does not. It proves the point. Because these utilities – known back in the day as Commonly Used System Programs, or CUSPs – were really just special application programs that interfaced with the user to perform utility functions. The actual OS – the kernel of RSTS/E that literally ran in protected kernel mode and did all the work – was written in assembler. It had to be. Furthermore, the build and booting of RSTS/E was accomplished using a traditional OS, initially an old version of DOS-11, later RT-11, a single-user OS. Later versions of RSTS/E optionally supported other languages by including runtime systems for them in the OS package, and supported DCL – the Digital Command Language common to all DEC platforms – but it remained fundamentally oriented to BASIC.

So, I repeat: :slight_smile:

LISP is totally OK, though.

The LISP Machines were great. The hardware gave you a lot of the intrinsics you needed. The most critical being tagged memory, including the ability to differentiate pointers. This made it possible to know ab-initio where the pointers were no matter what you were dealing with. GC could be put in the OS. Lots of other fun stuff too became possible. It always impressed me that the virtual memory manager was coded in Lisp.

There were other machines of course. The Xerox machines running Smalltalk and Mesa similarly provided that ability to run a huge amount of code, including OS and OS kernel in high level languages. They boasted bespoke microcode for each language/system.

The ultimate expression must be the Linn Rekursiv. The name stemmed from its ability to recursively call microcode. The GC was microcoded, as was the rest of the memory allocation system. An allocate instruction could call the GC function, and the layered GC could recursively run until the allocate would succeed.

You can buy an argument about what is OS code and what is layered. The non-negotiable part is really how the OS manages context switching, interrupts, process dispatch, memory, and devices. The glory days of micro-kernels tried to strip out everything that could not be run in kernel mode, and put these other tasks in user mode processes. Things like V-kernel, Mach, Chorus. There were others. Choices is particularly interesting in this conversation as it was written in C++ and designed to be an OO operating system. We are talking an early C++. (The Choices guys were great fun, and I had a few hangovers from time spent with them.) Mach lives on inside the Apple Mac as Darwin. There was a stupid one-upmanship back in the early 90s with papers trumpeting how few lines of code were needed to implement their kernel. Sanity eventually returned. Nowadays micro-kernels remain interesting because of their value in creating secure OS implementations.