I found out what I like [related to computer coding]

Dr.Strangelove · April 10, 2019, 9:28am

I would describe myself as of a “low-level DIY bent” as well. The beauty of C++ is that it is very expressive and yet lets you write code where you know exactly what is going on at the instruction or byte level. Here, I have four examples of things that, as far as I know, simply could not be implemented in anything but C++. These are real things in code shipping to hundreds of millions of customers that I implemented within the last year.

Compile-time string scrambling
We sometimes have a need to put debug switches, messages and the like in shipping binaries. These may have strings that we don’t really want to appear in plaintext. It’s not that they’re a security issue, but we just don’t want to make it easy for people to get into trouble, and some of the strings could be taken out of context. So we like to scramble some of the more sensitive ones. Not hard encryption, just something that makes people work a little if they want to open the hood.

We had been doing this in ad hoc fashion; some people XORing in constants to each byte in an array; running an offline script; manually converting bytes; etc. All pretty dumb. Instead, I wrote a library that encrypts strings at compile time using templates. In the codebase, the string appears totally normal and in plaintext–but it never appears in the binary. Instead, it’s encapsulated in a function that descrambles on demand. It’s all completely automatic and only requires the user to wrap the string in a single macro.
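
A minimal sketch of the idea, not the library described above. It assumes C++17 and a single-byte XOR key. The names `kKey`, `Scrambled`, `scramble` and `SCRAMBLED` are hypothetical. Whether the plaintext literal is really absent from the binary depends on the compiler and optimization level, so it is worth checking the output with a tool like `strings`.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical single-byte key. A real scheme would likely vary it per string.
constexpr unsigned char kKey = 0x5A;

// Holds the scrambled bytes of an N-byte literal (terminator included).
template <std::size_t N>
struct Scrambled {
    char data[N];

    // Descramble on demand at run time.
    std::string reveal() const {
        std::string out;
        for (std::size_t i = 0; i + 1 < N; ++i)  // skip the terminator
            out.push_back(static_cast<char>(data[i] ^ kKey));
        return out;
    }
};

template <std::size_t N, std::size_t... I>
constexpr Scrambled<N> scramble_impl(const char (&s)[N], std::index_sequence<I...>) {
    return Scrambled<N>{{static_cast<char>(s[I] ^ kKey)...}};
}

template <std::size_t N>
constexpr Scrambled<N> scramble(const char (&s)[N]) {
    return scramble_impl(s, std::make_index_sequence<N>{});
}

// Hypothetical wrapper macro: `constexpr` forces the XOR to happen in the compiler.
#define SCRAMBLED(str) \
    ([] { constexpr auto scrambled = scramble(str); return scrambled.reveal(); }())

// The scrambling is checkable at compile time.
static_assert(scramble("abc").data[0] == ('a' ^ kKey), "scrambled at compile time");

inline void scramble_demo() {
    assert(SCRAMBLED("debug-switch") == "debug-switch");
}
```

At the call site the string still reads as plaintext, e.g. `log(SCRAMBLED("enable verbose mode"))`, where `log` stands for whatever consumes the string.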

Compile-time hashing
Similar to the above, we sometimes need to hash strings such that the strings never actually appear in the binary. Unlike the scrambler, these are totally irreversible. People had been using offline tools to generate the hashes, pasting the result into the code, but this was both a pain in the ass and error prone. So I implemented the same hash routine using templates. Again, nothing offline was required, and the input string never appears in the binary in any form. This could be used on the crappiest microcontroller because only the final bytes are ever stored.

As it turned out, implementing this found several bugs where, due to capitalization and some other issues, the existing collection of hashes just had some errors. With the new system it’s impossible to ever have the wrong hash.
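
A rough sketch of the technique, assuming C++14 or later. The post does not say which hash routine was used; 32-bit FNV-1a is swapped in here purely as an illustration, and it uses a `constexpr` function rather than the template recursion an older codebase would need. `fnv1a`, `hash_literal`, `ForceConstant` and `HASHED` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// 32-bit FNV-1a over n bytes, usable in constant expressions.
constexpr std::uint32_t fnv1a(const char* s, std::size_t n) {
    std::uint32_t h = 2166136261u;  // FNV offset basis
    for (std::size_t i = 0; i < n; ++i) {
        h ^= static_cast<unsigned char>(s[i]);
        h *= 16777619u;             // FNV prime
    }
    return h;
}

// Hash a string literal, excluding its terminator.
template <std::size_t N>
constexpr std::uint32_t hash_literal(const char (&s)[N]) {
    return fnv1a(s, N - 1);
}

// Passing the hash through a template argument guarantees it is computed
// by the compiler, so only the final 4 bytes need to be stored.
template <std::uint32_t H>
struct ForceConstant {
    static constexpr std::uint32_t value = H;
};

#define HASHED(str) (ForceConstant<hash_literal(str)>::value)

// Published FNV-1a test vectors, checked at compile time.
static_assert(HASHED("") == 0x811c9dc5u, "empty string hashes to the offset basis");
static_assert(HASHED("a") == 0xe40c292cu, "known FNV-1a value for \"a\"");

inline void hash_demo() {
    // Capitalization changes the hash, which is the kind of slip that
    // hand-pasted values can hide.
    assert(HASHED("Debug") != HASHED("debug"));
}
```

Because the hash is derived from the literal on every build, a pasted value can no longer drift out of sync with the string it is supposed to represent.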

Automatic argument packing
We have a multiprocessor system that communicates through a shared memory space. One processor is just the normal CPU; the other is a massively parallel processor. However, each of these processors is simple and only has a few hundred bytes of fast “RAM” (actually registers).

We wanted to send messages, sorta like remote procedure calls, from the parallel processor to the host. But the message buffer is of limited size and has some restrictions like alignment. How to enforce these on the parallel processor at minimum cost?

Again using templates (variadic this time), I was able to write a library that packs an arbitrary set of function arguments together and writes them out to the shared memory. It does all of the alignment computation and bounds checking at compile time–if you overflow the buffer on a given call, the build will fail. And it’s impossible to ever have a misaligned argument. At the same time, the resulting code is as minimal as it’s possible to be: a series of memory writes at different offsets. So it’s both provably optimal and completely safe.
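
Something along these lines, as a sketch only. It assumes C++17 fold expressions, a hypothetical 32-byte buffer (`kBufferSize`), and a buffer whose start is itself suitably aligned. `align_up`, `packed_size` and `pack_args` are made-up names, and the real library presumably writes straight to shared memory rather than through `memcpy`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>

constexpr std::size_t kBufferSize = 32;  // hypothetical message buffer size

constexpr std::size_t align_up(std::size_t offset, std::size_t align) {
    return (offset + align - 1) / align * align;
}

// Bytes needed to store Ts... back to back, each at its natural alignment.
template <typename... Ts>
constexpr std::size_t packed_size() {
    std::size_t offset = 0;
    ((offset = align_up(offset, alignof(Ts)) + sizeof(Ts)), ...);
    return offset;
}

// Writes each argument at its aligned offset and returns the bytes used.
// Overflowing the buffer is a build error, not a run-time check.
template <typename... Ts>
std::size_t pack_args(unsigned char (&buffer)[kBufferSize], const Ts&... args) {
    static_assert((std::is_trivially_copyable<Ts>::value && ...),
                  "arguments must be plain data");
    static_assert(packed_size<Ts...>() <= kBufferSize,
                  "arguments overflow the message buffer");
    std::size_t offset = 0;
    ((offset = align_up(offset, alignof(Ts)),
      std::memcpy(buffer + offset, &args, sizeof(Ts)),
      offset += sizeof(Ts)), ...);
    return offset;
}

static_assert(packed_size<std::uint8_t, std::uint32_t>() == 8,
              "the 4-byte value is pushed out to offset 4");

inline void pack_demo() {
    alignas(8) unsigned char buf[kBufferSize] = {};
    const std::size_t used =
        pack_args(buf, std::uint8_t{1}, std::uint32_t{0x11223344u}, std::uint16_t{7});
    assert(used == 10);  // offsets 0, 4 and 8
}
```

Since every offset is a compile-time constant for a given argument list, an optimizing compiler can reduce the call to a handful of stores at fixed offsets, which matches the behavior described above.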

Automatic variable-length structure packing
Similar to the above, but this time we’re sending full data packets between devices, each of which consists of a series of C-like structs (which may be variable length), and again with alignment constraints.

I wrote a (templated again!) library to generate the proper size and offsets automatically given the requested structs. Performance, while also optimal, was actually not the primary concern here. Instead, type safety was the goal. Once the buffer was created, you could then request a pointer to one of the substructs contained in it. It would in return give you a typed pointer to that struct. The client did not have to do any casting at all–all of the type handling was done by the library, again at compile time.

The previous code had all kinds of nasty pointer math, having to cast back and forth between byte pointers and otherwise, and was very fragile and error prone. The new code is trivial and if someone adds a new member to the packet, everything else “just works,” shifting and realigning the other members as necessary, as well as computing the necessary buffer size.
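
A simplified sketch of the typed-access idea, assuming C++17. It only handles fixed-size structs that each appear once in the packet; the variable-length case mentioned above is left out. `Packet`, `offset_of`, `total_size` and the three example structs are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>
#include <type_traits>

constexpr std::size_t align_up(std::size_t offset, std::size_t align) {
    return (offset + align - 1) / align * align;
}

// Offset of T inside a packet laid out as First, Rest... (first match wins).
template <typename T, typename First, typename... Rest>
constexpr std::size_t offset_of(std::size_t start = 0) {
    const std::size_t here = align_up(start, alignof(First));
    if constexpr (std::is_same_v<T, First>) {
        return here;
    } else {
        static_assert(sizeof...(Rest) > 0, "requested struct is not in this packet");
        return offset_of<T, Rest...>(here + sizeof(First));
    }
}

// Total bytes needed for all structs at their natural alignment.
template <typename... Ts>
constexpr std::size_t total_size() {
    std::size_t offset = 0;
    ((offset = align_up(offset, alignof(Ts)) + sizeof(Ts)), ...);
    return offset;
}

template <typename... Ts>
class Packet {
public:
    static_assert(sizeof...(Ts) > 0, "a packet needs at least one struct");
    static constexpr std::size_t size = total_size<Ts...>();

    Packet() { (construct<Ts>(), ...); }

    // Typed access: the caller never casts or does pointer math.
    template <typename T>
    T* get() {
        return std::launder(reinterpret_cast<T*>(bytes_ + offset_of<T, Ts...>()));
    }

private:
    template <typename T>
    void construct() {
        ::new (static_cast<void*>(bytes_ + offset_of<T, Ts...>())) T{};
    }

    alignas(std::max_align_t) unsigned char bytes_[size];
};

// Hypothetical packet members.
struct Header  { std::uint16_t id; };
struct Payload { std::uint32_t value; };
struct Trailer { std::uint8_t checksum; };

static_assert(offset_of<Payload, Header, Payload, Trailer>() == 4,
              "Payload is realigned past the 2-byte Header");

inline void packet_demo() {
    Packet<Header, Payload, Trailer> packet;
    packet.get<Payload>()->value = 42;
    assert(packet.get<Payload>()->value == 42);
}
```

Adding a fourth struct to the `Packet<...>` list changes `size` and the offsets automatically, and asking for a struct that is not in the list fails the build.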

The common element here is that a huge amount of work is done at compile time. This has two big advantages: it’s fast (and provably so, because I can inspect the generated assembly to see how it compiled); and it’s very safe, because it ensures that a huge class of errors are impossible to make. There are languages, like C, that have the same low-level determinism as C++. And there are some, like C#, that have the same degree (or more) of expressiveness. And some that have the same degree of type safety. But I don’t know of anything else that gives you all of these together.

Some of the things above could have been implemented using generated code. We’re no stranger to that, either: using a script to generate C or C++ in exactly the form you want. That works, but increases the complexity of the build system and probably means there’s another language to know. It’s nicer when you can do everything in one environment.

septimus · April 10, 2019, 11:22am

Impressive!!

Dr.Strangelove · April 11, 2019, 5:58am

I feel like we’re actually just scratching the surface here, too. We’re actually quite conservative overall and just starting to use C++1x features.

To be clear, the remote processor I’m talking about is almost as limited as the ones beowulff brought up. They have a lot of floating point horsepower, but can’t even do an indirect function call (so no virtual functions). Memory is too slow for the stack, and there aren’t enough registers, so recursion is also out. No malloc, since there’s basically no way to service requests from thousands of processors at once. And so on.

But it has full-fledged C++ template support, because all the heavy lifting is done by the compiler. Done carefully, many sophisticated ops can reduce down to very tight machine code.

For instance, suppose one needs to sort small arrays. C has qsort–but that’s no good on a limited system, because it needs both recursion and function pointer support. But if you know the size of the array in advance (as is often the case if your system is doing something DSP-like, say), you can write a templated sort routine that reduces down to a near-optimal number of compare-and-swap operations for that length (using sorting networks). Won’t work for long buffers due to code size limits but great if you have, say, 9 samples and want to average together the middle 3.
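
To make the shape of this concrete, here is a sketch that assumes the array length is a template parameter. It uses an odd-even transposition network because it is short to write down; that network is valid but not minimal, so a production version would substitute a known near-optimal comparator list for the length in question. `compare_swap`, `network_sort` and `middle3_average_of_9` are hypothetical names.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// One comparator of the network: afterwards a <= b.
template <typename T>
inline void compare_swap(T& a, T& b) {
    if (b < a) {
        T tmp = a;
        a = b;
        b = tmp;
    }
}

// Odd-even transposition network for a fixed length N.
// The sequence of comparators depends only on N, never on the data, so the
// loops can be fully unrolled: no recursion and no function pointers.
template <typename T, std::size_t N>
void network_sort(std::array<T, N>& v) {
    for (std::size_t round = 0; round < N; ++round)
        for (std::size_t i = round % 2; i + 1 < N; i += 2)
            compare_swap(v[i], v[i + 1]);
}

// The example from the text: average the middle 3 of 9 samples.
template <typename T>
T middle3_average_of_9(std::array<T, 9> samples) {
    network_sort(samples);
    return (samples[3] + samples[4] + samples[5]) / T(3);
}

inline void sort_demo() {
    std::array<int, 9> samples{{9, 1, 8, 2, 7, 3, 6, 4, 5}};
    assert(middle3_average_of_9(samples) == 5);  // (4 + 5 + 6) / 3
}
```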

Limited systems may also have no floating point hardware, and depend on fixed-point arithmetic when they have fractional data. Sure, in C you can write a set of functions to handle the math: fixed_16_16_mul, fixed_16_16_add, etc. But it’s much nicer to use C++ operator overloading. There’s no additional overhead here; internally it works the same way as C-style functions. But it makes the code much more readable. Furthermore, it makes them compatible with templated algorithms that may depend on the standard arithmetic operations being available.
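
For illustration, a bare-bones 16.16 type might look like the following. This is a sketch assuming C++14, with no overflow or rounding handling; `Fixed16_16` and `dot3` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// 16.16 fixed point: the value represented is raw / 65536.
struct Fixed16_16 {
    std::int32_t raw;

    static constexpr Fixed16_16 from_int(std::int32_t v) {
        return Fixed16_16{v * 65536};
    }
};

constexpr Fixed16_16 operator+(Fixed16_16 a, Fixed16_16 b) {
    return Fixed16_16{a.raw + b.raw};
}

constexpr Fixed16_16 operator*(Fixed16_16 a, Fixed16_16 b) {
    // Widen to 64 bits so the intermediate product does not overflow,
    // then drop the extra 16 fractional bits.
    return Fixed16_16{static_cast<std::int32_t>(
        static_cast<std::int64_t>(a.raw) * b.raw / 65536)};
}

// A generic routine that only needs + and *; it works unchanged for
// int, float or Fixed16_16.
template <typename T>
constexpr T dot3(const T (&a)[3], const T (&b)[3]) {
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
}

inline void fixed_demo() {
    const Fixed16_16 one_and_a_half{98304};  // 1.5 * 65536
    const Fixed16_16 three = one_and_a_half * Fixed16_16::from_int(2);
    assert(three.raw == 3 * 65536);
}
```

The operators compile down to the same integer multiply and add that `fixed_16_16_mul` and `fixed_16_16_add` would perform; the difference is only in how the calling code reads.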

septimus · April 11, 2019, 12:17pm

I see the Wiki article mentions the 45-step network for N=13 but doesn’t mention Danny Hillis. This network (improving on the 46-step best known until recently) was discovered by Mr. Hillis using a predator-prey genetic algorithm.

wolly · April 11, 2019, 1:10pm

Now, back to my question: can you use C instead of C++ if you are coding hardware at the level of the computer, or at the level of devices that are used in electrical design rather than electronic design?

TriPolar · April 11, 2019, 2:06pm

Yes. Because it’s not a matter of the language. Only a few rather unique languages would bar you from doing this; it’s a question of what makes this easier to do in a particular development environment, and that rarely has anything to do with the particular language in use. It will be a matter of the development and production environment and available libraries.

Now when you ask the question of what language to learn, if it’s a choice you have instead of a requirement of a certain job, then the answer is that it doesn’t matter much. You need to learn some language fairly well, and then another one so you can appreciate the differences, and begin to see the commonality in all computer languages. Then you are starting your journey down the road of ever-changing technology. Everything will change; you may still be using C or some variant in 20 years, but you won’t recognize everything else about the environment you’re working in. And rarely in that time will you actually have a choice of what language or technology you use, so get used to being adaptable.

msmith537 · April 11, 2019, 2:15pm

In all fairness, if I am staffing a short project that needs Language A and someone gives me a developer who is only proficient in Language B but “is really eager and has a great attitude”, I still need to take into account the time it will take that individual to get up to speed on Language A.

Thudlow_Boink · April 11, 2019, 3:24pm

I know this is an old thread, but I haven’t seen it addressed (forgive me if I missed it):

Isn’t Python 3 just the most recent version of Python?

pulykamell · April 11, 2019, 3:57pm

OK, I’m glad I’m not the only one who was WTF about this statement, but not seeing a follow-up to it, I thought maybe I missed something or the posters were just ignoring the statement because it made no sense whatsoever.

TriPolar · April 11, 2019, 5:32pm

Python 3 is the latest version. Like some products, it is often mentioned with its version number. In the case of Python the version was significant because version 3 made some changes that were not backward compatible with previous versions and added new features. The version of Python is not really significant to the OP’s question though.

Melbourne · April 12, 2019, 7:28am

Or C++. Freescale, Arduino.

The Arduino language, used by beginners, is c++.

They don’t show raw c++ because raw c++ is a terrible language. They do use Arduino c++ because internally, it’s a better language than c.

Personally, if I was a beginner going from Python to c/c++, I’d start with the Boost library, because the only sane reason for using c++ is to use somebody’s library: Arduino, STL, Boost or whatever.

Melbourne · April 12, 2019, 7:33am

“Understanding pointers” is an over-rated skill. If you aren’t forced to use a language, like c, that handles pointers badly, you don’t need to know how to handle pointers in a language, like c, that handles pointers badly.

And after you’ve learned to handle the stupid way c uses pointers, that just makes languages like Python more confusing, because the object model in Python uses pointers differently.

Dr.Strangelove · April 12, 2019, 9:14am

I’m not sure you understand what pointers are. Pointers are just memory addresses. If you don’t understand how memory addresses work, then you don’t understand how computers work.

Scripting languages like Python don’t have pointers. They have references.

There’s no such thing as “Arduino C++”. It’s just C++, but with the code pasted together in strange ways that makes it easier for beginners to use modules, and harder for people doing complicated things.

TwoCarrotSnowman · April 12, 2019, 5:39pm

septimus:

The following may be a useless digression for someone seeking employment, or trying to find the best language for a particular niche or curriculum, BUT …

I’ve wondered whether personal temperament should play a role in a person’s choice of programming language. After all, programming can be fun and fun is good.

As some very old examples, some engineers loved Forth, while others hated it. Lisp also had both ardent lovers and haters.

Although C and C++ may seem similar, I wonder if they sometimes appeal to opposite personality types! The old-fashioned “philosophy” of C is very different from the “philosophy” of C++. Just as one hint of what I’m speaking of: “Use the Source, Luke!” was a common mantra in C programming, while making the source of libraries unreadable was seemingly sometimes touted under C++. :smack:

Yes, I absolutely think it does. I’m teaching myself to code, at my own pace and purely for fun. I tried C++ first, and didn’t get anywhere. Then I tried Visual Basic. I liked building Windows Forms, but the syntax didn’t stick. Then I tried Python, and something about it clicked. Now I make useful little programs for myself.

Voyager · April 12, 2019, 9:13pm

Pointers are in a sense like Gotos. Gotos in brain dead languages should implement control structures. Pointers should be used to implement data structures. Ditto with references. Hashes are good since you don’t have to worry about implementation and can use them as is.

Dr.Strangelove · April 12, 2019, 10:01pm

I’m in about halfway agreement with you. Pointers are like gotos in that it’s easy to get into trouble with them and that it is a kind of direct exposure to low level functionality. At the instruction set level, all control structures are conditional gotos. Likewise, at the physical memory level, all references are pointers/addresses.

However, it’s possible to almost completely avoid gotos with almost no downsides; and in the rare exceptions (like certain types of error handling) where gotos still make sense, it’s not a question of functionality or efficiency, but rather just coding style.

But you can’t avoid thinking about pointers if you want maximum efficiency. If you have two variables, how can you know if they share a cache line or not if you don’t have access to their address? You can’t. And if you can’t, then you’ll never know why your program is sometimes 10% slower than other times (just one example among many, of course).
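
As a small illustration of the address question, the check below assumes a 64-byte cache line, which is common but not universal. `kCacheLine`, `same_cache_line` and the two structs are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kCacheLine = 64;  // assumption: 64-byte lines

// Two addresses share a line when they fall in the same 64-byte block.
inline bool same_cache_line(const void* a, const void* b) {
    return reinterpret_cast<std::uintptr_t>(a) / kCacheLine ==
           reinterpret_cast<std::uintptr_t>(b) / kCacheLine;
}

// a starts a line and b sits 4 bytes after it: same line.
struct Adjacent {
    alignas(kCacheLine) int a;
    int b;
};

// Each member is forced onto its own line, a common fix for false sharing.
struct Separated {
    alignas(kCacheLine) int a;
    alignas(kCacheLine) int b;
};

inline void cache_line_demo() {
    Adjacent x{};
    Separated y{};
    assert(same_cache_line(&x.a, &x.b));
    assert(!same_cache_line(&y.a, &y.b));
}
```

None of this can be expressed without access to the addresses themselves.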

Most programmers don’t need to know this stuff, of course. Most code is just gluing together stuff that someone else wrote. There’s no such thing as high-performance Python, but there are Python programs that are just as fast as C/C++… because they use libraries written in C/C++. That’s fine, but someone wrote those libraries, and they had to know what pointers are, along with a bunch of other stuff.

Melbourne · April 13, 2019, 12:04am

I’m not sure you understand what references are.

Melbourne · April 13, 2019, 12:10am

The reason why ‘var’ variables were introduced in c++, the reason why anti-aliasing rules were introduced to c, is because ‘pointers’ in c and c++ made the languages less efficient, in the sense of ‘slower than Fortran’.

Also of course, less efficient in the sense of ‘error prone, harder to understand and take longer to write’, but for many CS students that was a feature, not a bug.

Dr.Strangelove · April 13, 2019, 12:31am

“var”? You must be thinking of some other language.

Maybe you are thinking of “auto”, which is a little like languages with “var” declarations, but with a key difference: the variable is still strongly typed. You can’t assign a different variable with a different type to it, or use it in an expression with a different type. Anyway, none of this has anything to do with aliasing.
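
A tiny example of that difference, with `auto_demo` as a made-up name:

```cpp
#include <cassert>
#include <type_traits>

inline double auto_demo() {
    auto count = 42;   // deduced as int, fixed for the variable's lifetime
    auto ratio = 0.5;  // deduced as double

    static_assert(std::is_same<decltype(count), int>::value, "count is an int");
    static_assert(std::is_same<decltype(ratio), double>::value, "ratio is a double");

    // count = "forty-two";  // would not compile: count is still an int
    count = 7;               // fine: same type

    return count * ratio;    // ordinary int-to-double promotion
}
```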

It’s true that Fortran can be faster than C/C++ in some cases due to aliasing rules, though it’s become much less of a problem with modern compilers. Fortran certainly earns its place among high-performance languages; few languages can make the same claim. That said, aliasing is just one of many factors that go into language performance.

Voyager · April 13, 2019, 5:00am

Not just in high level languages like assembler. I was a microprogrammer - same thing there.

One of the questions on my PhD orals was on this subject. The intent was to trip me up due to an aversion to gotos. It was around a place where you needed some kind of decent loop exit statement to create non-convoluted code. So the reason we don’t need gotos any more is that decent ways of handling this were added to languages. I didn’t bite on this. I learned machine language first so I have no allergy to gotos.

That sounds like a case where you need direct access to the underlying physical implementation. That is not very common, to say the least. Most pointers in C for instance are used to implement things like linked lists. That’s a deficiency of the language.

When I was in school we were still having the high-performance code in assembler versus HLL debate. Scripting languages are great for fast development at good enough speed. I was handling gigabytes of manufacturing data plenty fast enough using a scripting language.

But the main point is that programmers should think at the level of control structures, not the inherent branches that they are constructed of, and at the level of high level data structures, not the pointers or arrays with which they are constructed. If the data structure you want needs to be constructed with pointers, then fine. When I taught assembler we taught the students to use the PDP-11 branches to build control structures, and not use them haphazardly. I more or less did the same kind of thing when I taught data structures (from Wirth’s book) but I didn’t really understand them until after I had taught them.

That’s how pointers are like gotos. Useful when required, essential to understand, but to be avoided when there are better alternatives.
