Misunderstanding Computers

Why do we insist on seeing the computer as a magic box for controlling other people?
Why do we want so much to control others when we won't control ourselves?

Computer memory is just fancy paper, CPUs are just fancy pens with fancy erasers, and the network is just a fancy backyard fence.

(original post -- defining computers site)

Monday, August 27, 2012

Some Basic Hardware and OS Security

(I really wanted to do something else today.)

There have been recent movements in the IP field that I think need a response. (Today, a pseudo-anonymous group of crackers and gray-hats dumped a new, huge, random harvest of private data. That came after a bunch of hoopla last week about a US standards body publishing its lame (pardon me for saying so out loud) list of hardware security requirements, among other things.)

#1 No standard DRM (digital rights management) systems in the design.

Let me repeat that. If you think system security is good, standard DRM is evil.


A: Every system will be vulnerable. (This is axiomatic, empirically proven in system theory.)

B: In order to be meaningful, DRM is sold as universal.

C: If universal DRM is adopted, every system using it will contain the same vulnerabilities (in addition to whatever vulnerabilities the system has on its own).

Thus, DRM will tend to induce universal vulnerabilities. When you know how to get into one computer, you can get into them all. That is a huge, huge reward to the bad guys, and it fights against the diversity which could otherwise protect systems with different (or no) DRM.

Because even nonstandard DRM must function at a low level, it tends to allow entities that do not have system security as a high priority access to the OS at way too low a level, which is another serious social engineering flaw, but that could be worked around somewhat, sort-of, if we really must have DRM. (I'll have to rant about that sometime, I suppose.) To make that work, DRM has to be re-thought, however, particularly the very odd concept that DRM must be enforced.

Yes, I'm pointing out the inherent flaws in the DMCA (Digital Millennium Copyright Act). Which leads to

#2 Repeal the DMCA. If possible, pass an amendment against any further national, state, or community-level attempt at attaching legal culpability to any sort of automated rights protection system. (Which is another subject for a rant sometime.)

The security implications of the DMCA are that, with the DMCA provisions against reverse-engineering, the engineering required of the end-user (or the end-user's system integrator or IT specialist) to secure a system has been technically criminalized.

You can provide all sorts of fancy argument about why this should not be so, but both my wife's and my (non-smart) cell phones are vulnerable. I can't get the information from the vendor or manufacturer that I would need to secure them. Under a DMCA-like law (I'm not sure whether the laws in Japan include such), I would be risking prison time to dig around in the phone to figure out how to fix the vulnerabilities.

Risk going to prison, just to keep the bad guys out of my phone. Not a win, okay?

Which leads to

#3 Both 3rd party and customer admin and support allowed, and the necessary documentation and source code provided by the vendor.

The options of non-vendor support and system administration are necessary to keep the vendors honest. The option of the customer doing it him/herself is necessary to keep the vendor from trying to use licenses and non-disclosure agreements to keep all the 3rd party companies in their back pocket.

A free-as-in-libre software license, such as the GPL, should be the recommended best practice, but it is not sufficient. The vendor support has to be optional, and not just theoretically so.

#4 Free or open source OS bootstrap code (BIOS, Open Firmware, etc.), supplied with the hardware.

Generally, the manufacturer should supply both the latest tested binary image and the source code over the 'net. But if the manufacturer goes under or drops support, or if you can't get on the internet, you might still be stuck. So a copy of the bootstrap code and object in the system at the time of original sale should be supplied with the hardware.

This should go without saying. If you can't fix the software that controls how your system starts up, you can't fix bugs there, either.

More important, if someone sneaks into your system from the internet and plants low-level malware in your BIOS, you may be left with no option but to junk the motherboard if you don't have some means of re-programming the BIOS to what it should be.

I know that means the customer would end up being able to re-purpose the hardware counter to the vendor's intent, but what is wrong with that? The vendor can (and should) explicitly disavow warranties when the user does things like modifying the bootstrap code.

But this brings us to two more points:

#5 Built-in bootstrap code re-programmer, not available during normal operation.

#6 Physical hardware disable for the bootstrap re-programmer which cannot be re-enabled by software, at least not during normal operation.

There is a bit of a conundrum here, because the typical (and often expected) solution here is to have a simple VB program that holds the user's hand while walking through the steps of re-flashing the BIOS. The problem is that, when you need to re-flash the BIOS, you don't want to run the operating system in order to do so. When the malware is in your bootstrap code/BIOS, you have to assume it's also in your OS, even in the so-called safe mode:

Why would the intruder mess with your BIOS/bootstrap code and not mess with the programs that test the integrity of the Operating System?

There are several ways to do this, but the cheapest will be to provide a backup of the bootstrap/BIOS, and a small button or strap on the motherboard, such that, when you push the button, the active bootstrap/BIOS will be automatically overwritten by the backup.
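To make the button idea concrete, here is a toy model in Python. It is only a sketch under my own assumptions: the class `FirmwareStore`, the attribute `write_jumper_closed`, and the method names are all made up for illustration, and real hardware would do this in the flash controller, not in software.

```python
class FirmwareStore:
    """Toy model (hypothetical) of a flash part with a protected backup image."""

    def __init__(self, factory_image: bytes):
        # The backup is set once here and never written again in this model.
        self._backup = bytes(factory_image)
        self.active = bytearray(factory_image)
        # Stands in for a physical strap. Only a person with a hand on the
        # motherboard is supposed to change it.
        self.write_jumper_closed = False

    def software_write(self, new_image: bytes) -> bool:
        """What the OS (or malware) can attempt during normal operation."""
        if not self.write_jumper_closed:
            return False  # the hardware refuses the write
        self.active = bytearray(new_image)
        return True

    def press_restore_button(self) -> None:
        """Physical button: overwrite the active image with the backup."""
        self.active = bytearray(self._backup)


store = FirmwareStore(b"GOOD")
print(store.software_write(b"EVIL"))   # strap open, so the write is refused
store.write_jumper_closed = True       # somebody left the strap closed
store.software_write(b"EVIL")          # now the write goes through
store.press_restore_button()           # and the button undoes the damage
print(bytes(store.active))
```

The point of the model is that the restore path never passes through anything the normal OS can touch.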

Of course, if you have patched your BIOS/bootstrap code, you lose all your patches when you do that. Which has induced manufacturers to play dodgy games with two flashable BIOSses in many current motherboards, but no real way to tell which one is the safe one.

This is the point at which the DRM advocates said, "Let's just take the whole mess out of the consumers' hands. We have to protect them from themselves."

Instead of doing a proper engineering job here, "protect the consumer from himself"!

And this makes the manufacturer a party with the malware installer.

It's not hard to get this right, but it's distracting from the point of this rant, so I'll explain it in the previous rant.

Why should the re-programmer be surrounded by so much fuss? Won't that make keeping the bootstrap/BIOS up to date all the more unlikely?

The bootstrap/BIOS should not be updated that often. But, yes, the manufacturer should provide an opt-in notification method, and instructions which could be printed out, when a bootstrap/BIOS level update is necessary.

This gets us safely booted. What's next?

#7 CPUs. And OS structure. Which should be the subjects of other rants, I guess, even though they're the reason I started ranting today.

Briefly, Intel, in their rush to lock the competition out, have always sacrificed security for features.

Even today, the best CPUs Intel has made fail to properly separate tasks from each other. The 8086 had those stupid segment registers, and they would not have been stupid at all, except they were just ways to extend the length of pointers in an otherwise pure 16-bit chip.

They looked like rudimentary memory management, but they were nothing of the sort. Just a misfeature that was used (fraudulently, if you ask me) to sell underperforming chips.

A: CPUs need to be able to separate flow-of-control information from data. (Yeah, I know, you're raising an eyebrow and saying that sounds like FORTH.)

How this works is that the flow-of-control information (the return pointer and possibly a frame pointer) is stored on its own stack. That stack is cached in a cache that cannot be accessed by any normal instruction, and you have a low wall against the attacker trying to overwrite return pointers. A low wall, yes, but better than none at all.
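A small simulation may show why the wall helps. This is a sketch, not a model of any real CPU: the memory layout, the function names (`unchecked_copy`, `unified_call`, `split_call`) and the addresses are my own inventions for illustration.

```python
def unchecked_copy(memory, start, data):
    """Copy with no bounds check, like a careless strcpy."""
    for offset, value in enumerate(data):
        memory[start + offset] = value


def unified_call(data, buffer_size=4, return_address=0x1000):
    """One stack: the local buffer sits right next to the return address."""
    memory = [0] * buffer_size + [return_address]
    unchecked_copy(memory, 0, data)
    return memory[buffer_size]  # where the CPU would return to


def split_call(data, buffer_size=4, return_address=0x1000):
    """Two stacks: data writes cannot reach the return stack at all."""
    return_stack = [return_address]
    memory = [0] * (buffer_size + 1)  # data stack, with one slot of neighbor data
    unchecked_copy(memory, 0, data)   # the overflow still clobbers the neighbor
    return return_stack.pop()


attack = [0x41] * 4 + [0xBAD]  # one word longer than the buffer
print(hex(unified_call(attack)))  # the attacker chose the return address
print(hex(split_call(attack)))    # the return address survived
```

Note that the overflow still damages neighboring data in the split version. That is why I call it a low wall.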

(A stack-oriented cache with hysteresis could also speed up subroutine call/return significantly, but that's a topic for another day. Yeah, I know Sun has processors that do something like this, but not quite.)

The parameter/locals stack could also have a stack-oriented cache (ergo, LIFO spill/fill with hysteresis). It would be larger than the cache for the flow-of-control stack, but not too large, because of the delay it would cause for task-switching.
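Here is roughly what I mean by spill/fill with hysteresis, as a Python sketch. The class `StackCache`, its `capacity` and `batch` parameters, and the `transfers` counter are hypothetical names of mine. The idea is that the cache moves entries to and from main memory in bursts, so that pushing and popping right at the edge of the cache does not cause a memory transfer every time.

```python
class StackCache:
    """Toy stack cache (hypothetical): spills and fills in bursts of `batch`."""

    def __init__(self, capacity=8, batch=4):
        self.capacity = capacity
        self.batch = batch
        self.cache = []     # top of stack is the end of this list
        self.memory = []    # spilled entries, oldest first
        self.transfers = 0  # number of spill or fill bursts so far

    def push(self, value):
        if len(self.cache) == self.capacity:
            # Spill the oldest `batch` entries in one burst.
            self.memory.extend(self.cache[:self.batch])
            del self.cache[:self.batch]
            self.transfers += 1
        self.cache.append(value)

    def pop(self):
        if not self.cache:
            if not self.memory:
                raise IndexError("pop from empty stack")
            # Fill: bring back the most recently spilled entries.
            self.cache = self.memory[-self.batch:]
            del self.memory[-self.batch:]
            self.transfers += 1
        return self.cache.pop()


stack = StackCache(capacity=4, batch=2)
for n in range(5):
    stack.push(n)        # the fifth push forces one spill burst
for _ in range(10):
    stack.push(stack.pop())  # call/return churn near the boundary
print(stack.transfers)   # still just the one burst
```

A larger `batch` means fewer bursts but more to save on a task switch, which is the trade-off I mentioned.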

B: CPUs need to be able to separate task-local (thread-local) static data from global data, or, rather, from the data of the calling environment.

Task-local data is accessed without regard to other processes/threads because it is allocated per-process/thread. (This is another way of disentangling the current usage of the unified stack, actually.)

The context of the calling task or thread may be read relatively freely, but writes to it must be filtered through semaphores, critical sections, monitors and the like.
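In software terms, the discipline looks something like the following Python sketch. It is only an analogy for what I want the hardware to help enforce, and the names `CallingContext`, `worker`, and `run` are made up for the example. `threading.local` and `threading.Lock` are from the standard library.

```python
import threading


class CallingContext:
    """Data owned by the caller: free to read, writes go through a lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._total = 0

    def read_total(self):
        return self._total

    def add(self, amount):
        with self._lock:  # critical section around the shared write
            self._total += amount


task_local = threading.local()  # each thread sees its own attributes


def worker(context, count):
    task_local.scratch = 0      # task-local: no locking needed
    for _ in range(count):
        task_local.scratch += 1
    context.add(task_local.scratch)  # one filtered write to the caller's data


def run(workers=4, count=1000):
    context = CallingContext()
    threads = []
    for _ in range(workers):
        threads.append(threading.Thread(target=worker, args=(context, count)))
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return context.read_total()


print(run())
```

Here nothing but programmer discipline keeps `worker` from writing `_total` directly. That gap is what the segment and access-mode support is supposed to close.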

There are four segments here, which must be kept somewhat distinct. This is what segment registers should be used for, but the memory interface itself should have access mode protection, similar to the seven modes supported (but never really used) in the original M68000.

Not quite the same, because the conditions for enabling read, write, and execute on those modes could not be tied to access via a specific address register (in the absence of segment registers). And there were no limit registers to prevent attempts to access one segment through a long offset in another.
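As a sketch of the kind of check I have in mind, here is a segment with a limit and access modes, in Python. The `Segment` fields, the mode constants, and the `translate` function are hypothetical, and the addresses are arbitrary. Real hardware would do this check in the memory interface on every access.

```python
from dataclasses import dataclass

READ, WRITE, EXECUTE = 1, 2, 4


class SegmentFault(Exception):
    """Raised where real hardware would trap."""


@dataclass(frozen=True)
class Segment:
    name: str
    base: int
    limit: int  # size of the segment; valid offsets are 0 .. limit - 1
    modes: int  # bitwise OR of READ, WRITE, EXECUTE


def translate(segment, offset, mode):
    """Return the physical address, or fault on a limit or mode violation."""
    if not 0 <= offset < segment.limit:
        raise SegmentFault(f"offset {offset:#x} outside {segment.name}")
    if not segment.modes & mode:
        raise SegmentFault(f"mode {mode} not allowed on {segment.name}")
    return segment.base + offset


# The locals segment happens to sit right below the return stack in memory.
locals_seg = Segment("locals", base=0x2000, limit=0x100, modes=READ | WRITE)

print(hex(translate(locals_seg, 0x10, WRITE)))
try:
    translate(locals_seg, 0x100, WRITE)  # a long offset into the neighbor
except SegmentFault as fault:
    print("trapped:", fault)
```

Without the limit check, offset `0x100` would quietly land at `0x2100`, in whatever segment comes next.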

These are the kinds of things where Intel has let us down, and actually fought against security.

Not because we knew we needed this back thirty years ago. (I only saw it dimly, until about twenty years ago. I don't know who else sees it.) But because Intel's processor is over-tuned to the structure of the C/Unix run-time defined forty years ago, and Intel has used very questionable means to destroy the market for more flexible processors which could have been used to explore alternative run-times.

Without Intel's hypercompetitivity, it is quite probable that the run-times that could prevent the most commonly exploited vulnerabilities in current OSses would be ready for production. As it is, we still need to explore and experiment a lot.

(Microkernels can apparently help for a while.)

There's much more that I need to say about this, but I'm out of time today.

[JMR201704211138: I had some thoughts on the low-level boot process, which might be relevant: http://defining-computers.blogspot.com/2017/04/model-boot-up-process-description-with.html.]

How to update your bootstrap code/BIOS.

No, this is not for existing systems. This is how systems should be designed to provide for updating the code that the hardware runs before anything else.

(If you want to understand why, see the later post on keeping the bootstrap code secure.)

First, your bootstrap OS has to be a bit more complete than BIOSses have tended to be in the past. Closer to Open Firmware, but usable by the average moderately-technical user and the average support-guy at the local shop.

You have to have four images:

  1. Working image,
  2. Backup of latest working image,
  3. Archive image,
  4. Fallback, or fail-safe image.

The working image is the one you normally boot. It boots your normal operating system. If the hardware allows the working image to be re-programmed from the normal OS, the normal OS must only provide access to the re-programming features in a special administrator mode that requires being re-booted to.

(Getting or compiling an updated bootstrap image is a separate topic that I will try to rant about later.)

The backup of the latest working image is never booted. It's basically there because a good cryptographic checksum cannot guarantee perfectly that something hasn't been inserted by a very clever mathematically inclined attacker. It's for checking the bootstrap before the bootstrap is allowed to continue.

(Yes, that means a pre-boot boot. Naturally.)

The backup is also checked against the current working image before re-programming is allowed.
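The pre-boot check could be sketched like this in Python. The function names `image_digest` and `preboot_check` are mine, and SHA-256 is just my pick for the example checksum. The point is that the digest is only a quick filter, and the byte-for-byte compare against the backup is the real test.

```python
import hashlib


def image_digest(image: bytes) -> str:
    """Checksum of an image. SHA-256 is an assumption for this sketch."""
    return hashlib.sha256(image).hexdigest()


def preboot_check(working: bytes, backup: bytes) -> bool:
    """Return True only if the working image matches the backup exactly."""
    # Quick filter first: a mismatched digest already means trouble.
    if image_digest(working) != image_digest(backup):
        return False
    # Full compare, because a checksum alone is not a perfect guarantee.
    return working == backup


backup = b"bootstrap v1"
print(preboot_check(b"bootstrap v1", backup))           # safe to continue
print(preboot_check(b"bootstrap v1 + rootkit", backup)) # stop and complain
```

The same function would serve as the gate before re-programming is allowed.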

Then, after you update the bootstrap code, run some integrity/security tests, reboot, and run some more integrity tests, the new bootstrap will copy itself to the backup before the normal OS is called.

The archive is also never booted. It must be physically impossible to write to it from either the normal bootstrap or the normal OS.

The administrator will set a period, a week or two, or a month, after which a grandfather backup will be scheduled, and the pre-boot bootstrap will copy the backup to the archive. (The waiting period is to leave enough time that the bootstrap can be assumed stable.)
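A minimal sketch of that schedule, in Python. The function name `preboot_archive_step`, the dictionary keys, and the two-week default are assumptions for illustration. In real hardware this would run in the pre-boot bootstrap, which is the only thing allowed to write the archive.

```python
from datetime import datetime, timedelta


def preboot_archive_step(images, backup_written, now,
                         waiting_period=timedelta(days=14)):
    """Copy backup to archive once the backup has aged past the waiting period.

    Returns True if the archive was updated on this boot.
    """
    stable_long_enough = (now - backup_written) >= waiting_period
    if stable_long_enough and images["archive"] != images["backup"]:
        images["archive"] = images["backup"]
        return True
    return False


images = {"backup": b"v2", "archive": b"v1"}
written = datetime(2012, 8, 1)
print(preboot_archive_step(images, written, datetime(2012, 8, 10)))  # too soon
print(preboot_archive_step(images, written, datetime(2012, 8, 15)))  # promoted
print(images["archive"])
```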

The fallback image, which includes the pre-boot bootstrap, must be physically impossible to write to, period. It's there for when all else has failed. It will include some command-line and (simple) menu-driven tools for testing, debugging, hunting for malware, etc.

There must be a physical button, switch, or electrical strap that will force booting to stop and wait at a command-line or menu instead of proceeding to the normal OS. In addition, an administrator tool should be provided for the normal OS, which directs the next boot to stop at the bootstrap level.

Another button, switch, or strap will direct bootup to the fallback.
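The decision the pre-boot code has to make from those inputs is small enough to sketch in Python. The names `BootTarget` and `select_boot_target` are hypothetical, and letting the fallback strap win over everything else is my own assumption about priority.

```python
from enum import Enum


class BootTarget(Enum):
    NORMAL_OS = "normal OS via the working image"
    BOOTSTRAP_PROMPT = "stop at the bootstrap command line or menu"
    FALLBACK = "boot the fail-safe image"


def select_boot_target(fallback_strap: bool, stop_strap: bool,
                       stop_requested_by_admin: bool) -> BootTarget:
    """Pick where to go, checking the physical straps before anything else."""
    if fallback_strap:
        # Assumed priority: the fail-safe strap overrides all other inputs.
        return BootTarget.FALLBACK
    if stop_strap or stop_requested_by_admin:
        return BootTarget.BOOTSTRAP_PROMPT
    return BootTarget.NORMAL_OS


print(select_boot_target(False, False, False))
print(select_boot_target(False, False, True))
print(select_boot_target(True, True, True))
```

The admin flag is the only input the normal OS can set, and the worst it can do is stop the boot at the prompt.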

Among the commands available will be one to get a new bootstrap (working) image from the manufacturer, over the network, or from some removable media. Another will provide for updating the kernel and lowest-level utilities of the normal OS without having to start any image of the normal OS.

In a brand-new, fresh-from-the-factory motherboard or system, all four images will be identical.

So, what about the normal OS?

A similar approach might be useful in updating normal OS and application code, as well.

Some code, such as the kernel, would do well to have full multiple copies for backup. Others, mostly end-user applications, might be okay with only good checksums, but I would be inclined to use full copy backup for any mission-critical application.

If four copies of every app is overkill, two copies and a good checksum would be a next best alternative. (And preferably, don't let the application updater directly overwrite the checksums.)
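The two-copies-and-a-checksum arrangement might look like this in Python. The function name `pick_good_copy` is made up, and SHA-256 is again just my choice for the example. The recorded checksum is assumed to live somewhere the application updater cannot directly overwrite.

```python
import hashlib


def pick_good_copy(primary: bytes, secondary: bytes, recorded_sha256: str):
    """Return the first copy that matches the recorded checksum, else None."""
    for candidate in (primary, secondary):
        if hashlib.sha256(candidate).hexdigest() == recorded_sha256:
            return candidate
    return None


good = b"application code"
recorded = hashlib.sha256(good).hexdigest()

print(pick_good_copy(b"tampered", good, recorded) == good)  # falls back to copy 2
print(pick_good_copy(b"tampered", b"also bad", recorded))   # nothing to trust
```

When both copies fail, the only honest answer is to refuse to run and go find a clean image.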

[JMR201704211138: I had some further thoughts on the low-level boot process, which might be interesting: http://defining-computers.blogspot.com/2017/04/model-boot-up-process-description-with.html.]