Are secret government backdoors built into all chips today?

The encryption is based on an RNG inside the cpu. This RNG was compromised, hence the encryption is compromised.

The “encryption chip” is the cpu.

In case you hadn’t noticed, the media is terrible at explaining tech.

I do know more of the details about how RSA works, and while it can be cracked with enough computation power, “enough” in this case means “all of the computers ever built or which ever will be built in any of our lifespans working continuously for many times the lifespan of the Universe”. It’s possible to attack RSA more quickly if it’s using a poor-quality random number generator like the one built into most commercial chips, but everyone remotely qualified to work in security already knows that the RNGs built into most commercial chips are poor-quality, and so doesn’t use them.

You misunderstand. Weakening the RNG weakens the encryption that uses the RNG for the seed value. There’s no need to get into the CPU itself if they’ve put in a compromised RNG into its design.

You really should read up on public-key encryption and how it works.

Lots of misconceptions there, I don’t even know where to begin to correct it.

What you don’t understand here is that this situation you’ve described, where you send the decryption key via messenger, is exactly the weak link that public-key cryptography solves.

And they can’t break public-key crypto, according to every expert.

I gotta agree with you on that last part … but the NYT article doesn’t mention CPU either. The best I’m getting is some chip makers do this some of the time, and one software company does this all the time.

Let me pose this as a General Question for which there is a Factual Answer. How does the random number generated by the transmitting computer get to the the receiving computer? Let’s set aside the fact it’s being transmitted in the clear to the NSA. We still have to get it to where we want it to go.

It doesn’t. If it did, then the system wouldn’t be secure.

The NSA uses the weakness in the RNG to reduce the amount of time spent to decrypt. When the RNG is good and/or close to random, the time to decrypt is beyond a useful time frame, when RNG is weakened, the time to decrypt is reduced significantly.

Since 2009 this has been a big area of research in my field. I’m on the editorial board of one journal, and we get multiple special issue proposals on this subject. But no one has ever described an actual case of this. Counterfeiting on the other hand is indeed a big problem.
There are lots of interesting theoretical methods of showing if someone puts in a Trojan Horse, but anyone with the skill to do so would likely be working at a better place than the kind of fab that would do it. Big fabs, like TSMC and Global Foundries, are not going to risk billions of dollars of business for this.
And the problem btw is exactly the fab putting in a Trojan Horse. It is trivial for a design team to do it. You buy a chip designed by a subsidiary of the Chinese Army and you may be in for problems.

A fab does not get a netlist. A fab gets tapeout data (used to be a tape, not any more) with masks and other information that tell you where the transistors, pads, interconnect and vias go. Good luck figuring out where and how to wire anything new up with that. For instance, there are often spare gates that can be used to make relatively low cost changes without a new mask for all wiring layers. Using them for a Trojan horse is going to be tough. There is also not a lot of room left after layout and routing is done.
Hell, a tool that could go from layout to logic would be awesome to have for verification. I’m not aware of one that could work on entire chips. When I was at Bell Labs we tried to do a transistor to gate mapper, and it was really tough. I’m not sure one is available even now, we don’t use it for sure.

Let me just make it clear. Your knowledge of how all this works is very, very low. To even ask this question means you are vastly unaware of how public key encryption works and the importance of good RNGs.

To reiterate: there are hardware RNGs in many modern cpus. The one used in some Intel and VIA cpus was deliberately fudged by NSA to have a weakness they knew how to exploit.

(The idea that companies go to NSA for analysis of their crypto-related products and then follow the NSA’s suggestions is astonishing.)

As noted, the encryption method used by good, current versions of (SSL’s successor) TLS seems pretty good. But they rely on RNGs. If the RNG is broken, the encryption scheme breaks down.

I don’t see it as my responsibility to teach a course in Encryption 101 here. For one thing, you have some misconceptions that a typical beginner doesn’t have. So I’d need to erase those just to get to a reasonable beginning point.

What’s so tough about going from layout to netlist? Think of how the LVS stuff analog designers use works. The tool you’d want in this case is slightly different, because you’d want to work slightly higher up the hierarchy (like recognizing gates and other standard cells, not transistors), but in some ways easier, because each copy of the standard cell is identical. I don’t think that exists, but I think that’s from lack of demand, not difficulty. Maybe the EDA tool developers have stuff like this for internal debugging?

Inserting the backdoor would be tough, but again, I don’t think it would be impossible in most designs. If you couldn’t often make meaningful changes to a complicated design without moving any existing logic around, then designers wouldn’t include those “sewing kits” in the first place. And, since the fab is making all the masks, they’re not operating under those constraints; they’re free to move things around a bit if required, at greater risk of getting discovered in a die photograph of course.

I don’t think this is a very practical risk, for the same reason that I don’t think hardware backdoors included by the designers are a very practical risk: there’s just not a lot you can do with the backdoor that won’t be inevitably discovered, so it’s a lot of work for a one-shot attack that also destroys your reputation. That’s a different question from technical feasibility of the hardware backdoor itself, though.

Also, what NSA-backdoored hardware RNG are people talking about? The NSA almost certainly backdoored Dual_EC_DRBG, but that’s a software pseudo-random number generator (which could be hardware-accelerated, but would still be deterministic).

I’ve seen speculation that hardware true RNGs are backdoored, but no strong evidence. It certainly doesn’t make sense to use that (or any other opaque source of entropy) without mixing it with all other sources of entropy available in some cryptographically strong way. Any hardware exploit would then need to be tailored to the specific code using the RNG, a more complicated task.

Analog stuff, sure. But that is a trivially small amount of the area in a processor. When you have a bunch of transistors, how they are partitioned into standard cells is far from obvious. There are lots of possible mappings. And, remember, you want a netlist that can be read, not a collection of basic gates and bus drivers.
The reason we worked on it was that our internal clients back 25 years ago were making DSPs with custom logic. Their customers wanted netlists, and it was to our advantage to give the inefficient ones, so that if they tried to build a chip with it they’d come up with crap.
As for EDA companies, I don’t know for sure, but I used to do EDA and I’m heavily involved with people in EDA companies, and was on the program committee of one user group conference. Never heard of this as a practical tool. I can see it for a small block with maybe some hand correction, but for a billion transistors?

If our fab moved stuff around you’d hear the screams from California in Taiwan. Closing timing is not easy at our speeds. I’ve seen examples of a big yield loss from a dogleg in a signal path, which is nothing as big as what you’d get when you moved something. And I’m pretty much sure we do our own masks - or at least the masks are a direct representation of our detailed layout.
We also know the physical location of each gate on the chip, information I’ve used to see if failures are close together physically.

The people claiming that fab-inserted Trojans are a problem have never come up with any examples. In a sense there are designer inserted backdoors already. Remember when Intel got into trouble about chip ids? We have them on our chips - for the more recent ones, there is a command you can type at a Linux prompt which will give a hex number. If you give that to me I can tell you everything about what happened to that chip before it went to system assembly. Lot’s of companies build call home features into high end computers which, with permission of the customer of course, sends information about what is going on in the system, including voltage fluctuations, crashes, temperature. We don’t monitor data of course, but I suspect it is possible. But could be notices with even rudimentary security.
I once jokingly wrote in a column that we should worry about digital picture frames that take voice commands listening to us. An expert told me that this is very possible. So there are really simple ways of collecting information that are much safer than fab or designer inserted logic.

Whoever’s performing this attack probably has the libraries. Indeed, they might have designed them. So they know the set of possibilities, and if you do find cases where the mapping is ambiguous, then any netlist is still “correct”, in that it corresponds to the same logic, and given the complexity of the synthesis tools doesn’t seem obviously easier or harder than any other to reverse-engineer.

I think that’s the hard part, not extracting a topologically correct netlist. The layout may provide some clues, though. If you can reduce your attack to tampering with signals into a particular independently-routed and -placed structure (e.g., a register file, or an MMU), then you can drop the number of transistors to consider by orders of magnitude. I think these attacks would be closest to practical on some relatively small piece of hard IP, to limit the scope and give you time to reverse-engineer without delaying the customer, and maybe reuse the backdoor across multiple designs.

Maybe on some paths, but it’s not like you’re going to tamper with the ALU, program counter register, etc. The trick would be to find something in a conservatively-timed peripheral that let you perform an interesting attack, which is why I’m thinking of stuff like the MMU setup registers (for escalation of privilege from user-mode code, for example).

Is anyone really claiming with a straight face that this seems like a practical problem? I see this as an academically interesting attack, not something anyone’s really worried about. I’d tend to agree that the large number of people with visibility into this process at a typical fab would make it hard to secretly perform the attack, even if it were technically possible.

It’s possible for a large group of people to keep a secret (e.g., the Manhattan Project), but hard to keep secret that a large group of people is keeping a secret (e.g., the newspaper article about all those scientists in the desert). From what I know about the processes by which fabs hire and assign work to employees, they don’t seem to have the necessary security procedures in place to keep a secret like this.

And just for completeness: the interesting thing about attacks through the hardware RNG is that they’re not detectable by black box examination. This is true only for a peripheral that’s supposed to be non-deterministic, because there’s no way to test if it’s working properly. If that entropy is used in intelligent ways (mixed in a cryptographically strong way with other sources of entropy, like keystrokes, network activity, etc.), though, then a practical attack would become extremely complex.

We had the libraries. They were our libraries, so we had them to whatever level of detail was needed.

Have you ever seen the netlist of a microprocessor? Even a hierarchical netlist with useful names is hard enough to figure out - a flat mostly incorrect one with no names worth mentioning I suppose can be decoded, but remember they would have to do it fairly quickly.
Even if inserting something near a register file would be useful, many memory elements are timing critical - and the fab won’t know which ones. I know this for sure since we would love to insert some test hardware around memories, but the designers usually won’t let us. If the fab guesses wrong, they are going to be in serious trouble, since the first thing that would be looked at is the failing memory.
Plus, routing congestion is often pretty high around memories, which means there is even less space to route signals from the inserted hardware.

Some sort of IP for a memory driver or a USB or other bus signal would be easier. But we use soft IP, so you have the same problem, and while if they attacked and rewrote hard IP (breaking protection of it) you’d have the problem that it is bigger than advertised, and that would be a red flag. I’m not sure that this would let you hijack anything, but we can forget about that for this discussion.

Trust me, the justification sections of lots of papers and special issue proposals imply that this is a big problem now or will be very soon - but when you look closer, you see that it is actually counterfeiting or gray market chips which are the real problem. We’ve rejected several special issues because of this. One on the counterfeiting problem would be interesting, especially if you get some security people to write an article. There was a paper in IEEE Spectrum a few years ago about this which was quite good. No Trojans there.

In Feynmann’s autobiography, he mentioned that the stationmaster of the Princeton Junction station knew something was going on since all these Princeton professors and graduate school students were buying tickets to the middle of nowhere New Mexico.

I’m more of an FPGA guy than an ASIC guy, but yes, I’ve seen many flattened netlists for complex designs. Have you ever seen the binary reverse-engineering tools used e.g. to find vulnerabilities in x86 code? They’re sort of like a debugger that also lets you annotate and restructure the disassembled code in various ways, as you refine your best guess at what each instruction is doing.

I’m imagining the logic simulator equivalent of that. The goal, as with reverse-engineering of software, would be not to understand the complete design, but to find an attack that required you to understand as little as possible of that. I think you could develop a useful version of that tool for low millions of dollars, beyond academic budgets but easily within any government’s means.

How much bigger are you thinking the hard IP would get? I liked the MMU because it seems like you potentially could implement an interesting backdoor (enable access to the operating system’s memory upon some undocumented register write) in ~100 gates, a small fraction of the total area.

The memory controller seems like it could still do something interesting, but that a useful backdoor would be bigger. USB seems trickier, but I guess possible e.g. in DMA architectures where any master can write to arbitrary memory? Any attack on hard IP that’s available on the open market gets easier, if you can buy it yourself and read the documentation, or maybe even buy it in soft form and get a netlist to compare against.

Soft IP everywhere makes life much harder. Unless you have some chance to become familiar with the design before submission (like from an earlier version), or a really good excuse for the delay, that one might be hopeless.

Maybe I have low standards for honesty? I just assume that since all papers will claim that, all such claims should be treated as meaningless, like an investor talking his book, or parents who think their children are beautiful. In any case, agreed that hardware backdoors look like academic toys for now, while shady component brokers selling floor sweepings are everywhere.

The extraordinary cost of IC fabrication equipment may help to discourage these backdoors. There’s really no such thing as a fly-by-night fab (at least, for modern digital processes), and anyone who has made such an investment would think very carefully about the cost of the reputation they’re about to destroy before inserting a backdoor that will almost surely be discovered within a few months of use. That’s not quite true for the hardware RNGs, though…

For FPGAs such a tool would be fairly simple, if it doesn’t exist already. Of course you can just fiddle with the ROM holding the design so adding a backdoor would be fairly trivial.
I started writing machine language - no assembler - and used to use adb a lot. Basic disassembly is trivial, tracing variables by looking at data flow from memory looks very doable - and no doubt done. Much simpler than the raw gate problem.

I know FPGAs are sloppy. :slight_smile: When I was at Intel I owned a spreadsheet giving the area and number of transistors of every bit of logic my team owned.
Looking at die photos of some of the chips I worked on, I don’t see a lot of empty space except in the I/O ring, and something there would show right up.

No one expects total practicality - though I do challenge it. You can say that saving area is important, say, and that this method saves area - even if what it saves is trivial and not worth the time. But these proposals made grandiose claims that were easily refuted. Anyhow, we care about our circulation, and we are more accessible than Transactions type journals, so we do want subject of interest to more than a few dozen people.
And you got the real problem right.
Their work is actually very interesting, they just need a better justification for it.

Exactly. I don’t know much about recent hardware RNGs, so I have no opinion about that discussion.

I don’t know why I pay my taxes if they aren’t. :wink:

What would make such a tool simpler for an FPGA than for an ASIC? After you’ve extracted the netlist, isn’t it almost exactly the same problem, equivalently difficult for equivalently complex designs? ASIC designs are probably bigger, but hard macros within ASICs (if used) aren’t necessarily.

The reverse-engineering tools go beyond that, with tools to annotate, and, more importantly, to interactively reorder and transform the code in ways that are functionally equivalent but that the operator thinks give more insight into what’s going on. I agree that a netlist is probably more difficult to reverse-engineer, per unit of complexity, than object code for a typical processor, but typical ASICs are also less complex (like in some kind of Kolmogorov sense, the shortest possible representation of each design) than operating system, game engine, etc. software that routinely gets reverse-engineered.

I wonder what the most serious effort has been so far to reverse-engineer a digital design from its masks. Maybe some kind of competitor analysis? That seems risky legally, given the heavy patent coverage, but maybe that will become more common as treble damages get awarded less often for knowing infringement? I’ve never seen any large-scale academic work.

You wouldn’t really have to extract a netlist, would you? You can build it from the programming patterns. You’d still have to understand it, but that is an equivalent problem. If there is space for your backdoor it would be easy to add, but it is also easier to get caught.

I guess they reverse engineer virus code all the time. Small ASICs might be easier to understand, but they’d have to be in a part of the system getting useful information.

I’ve sat in on competitive analysis meetings, and it was all done either by looking at freely available information or by running various tests. I don’t think academics could handle anything that big - it is hard enough to get them to do anything on real ASIC netlists I was able to distribute as benchmarks.

“Extract a netlist” is maybe a bit of a pretension. I mean compute it from the configuration bitstream, as you say, and that’s of course an easier task than any that we’ve been discussing. The “understanding” part is the interesting one.

Here’s someone who seems to have reverse-engineered a Russian 8080 clone. There’s not a lot of detail, but he ended up with a schematic and a transistor count. So I guess he did it entirely by hand from the die photo? Of course, that’s just 4758 transistors…

http://zeptobars.com/en/read/KR580VM80A-intel-i8080-verilog-reverse-engineering