Misunderstanding Computers

Why do we insist on seeing the computer as a magic box for controlling other people?
Why do we want so much to control others when we won't control ourselves?

Computer memory is just fancy paper, CPUs are just fancy pens with fancy erasers, and the network is just a fancy backyard fence.
コンピュータの記憶というものはただ改良した紙ですし、CPU 何て特長ある筆に特殊の消しゴムがついたものにすぎないし、ネットワークそのものは裏庭の塀が少し拡大されたものぐらいです。

(original post/元の投稿 -- defining computers site/コンピュータを定義しようのサイト)

Saturday, October 24, 2015

Why Personal Information in E-Mail Is Not a Good Idea

Postcards are fun. You can use them to keep in contact with people without having to think too hard.

I never got in that habit, and I don't tweet for pretty much the same reasons.

But they are useful to many people.

But you would never write a password or your social security number or your bank account number on one, would you?

Hey, Margo, this is the cliff I dove off today! Acapulco is GREAT! We're extending a week. Wish you were here!
And, hey, my PIN for the account at ABC Bank there in Brownfield is 7734. Can you pull out a hundred and go to the kennel for me and make sure Snickers is being well taken care of? I'd really appreciate it!

Love you loads, George.

The cat's name in the above may or may not be bad to put on a postcard, depending on whether George is using it as one of those bits of information banks and such ask you for, to help identify you in the case of a lost password.

(And you really wouldn't use Snickers as a plain password, now, would you? Whether 7734 is a good PIN or not, well, I wouldn't use it. Banks really should quit using PINs at ATMs. Anyway.)

Now, you might put that in a letter, in an envelope. I wouldn't recommend it, but you might.

You want a fairly thick envelope if you do, to help in case someone holds your letter up to a light to read through the envelope.

And you want to fold the paper so that the sensitive information (the PIN, the town name, and the bank name) is covered by as many layers as possible.

Don't make the envelope too thick, though. That might tempt someone to steam it open.

And write more, so that the sensitive information doesn't stick out so much. Rephrase things a little. Tell Margo about the markets you visited:
Hey, Margo, you like this postcard? This is the cliff I dove off today! Got a selfie, but the internet connection is terrible at the hotel. Still, you may get the selfie in e-mail before you get this.

Acapulco is GREAT! Fantastic markets. Good bargaining, although I'm sure we're being too nice and still paying too much.

We're having so much fun, we're extending a week.
Got a favor to ask, and I'll bet you can guess. In fact, you're probably calling me irresponsible for extending with the cat in the kennel and all.

You know where I bank? Yeah, that ABC. I'd really appreciate it if you could get some money out for me and go check up on Snickers for me. A hundred should cover the extra time and treat for him. Give him a hug for me, huh? 7734 is the number to use at the you-know-where. (I know, it's one of my stupid calculator trick numbers and anyone who knows me could guess it quickly enough if they were trying to get into my bank account.) Maybe you should change it for me while you're getting the money out.

We found this cool leather purse that we know you'll love. The motifs look Toltec to me, but, hey, you don't have to tell anyone where I bought it, you think?

We should still make it back in time for the big soccer game.

Love you loads, and give our love to Fred and the kids.

If you fold that correctly in three, there can be a layer of (mostly) non-sensitive information over both front and back of the layer with sensitive information. And the postcard can add another layer. Look at how you're folding things and putting them in the envelope.

If someone knows the information is there, it's no longer secret. But you can avoid people who don't know it's there noticing it and deciding that this letter is the one they want to steam open.

The best thing, of course, is to arrange this with Margo and the kennel before you go, so that you could just tell her, on the postcard:
Hey, Margo, this is the cliff I dove off today! Acapulco is GREAT! We're extending a week. 
Sure appreciate you looking after Snickers for us. Could you give the poor fellow a hug and my apology for leaving him there another week when you go take care of things at the kennel? 
Love you loads, George.
And if you do need to mention the bank and paying the kennel, put it in an envelope.

So, what does this have to do with e-mail?

E-mail is a lot like postcards. There is a digital "envelope", but it doesn't really cover anything. Even though they call it an envelope, it's just a few more lines of data, and anyone who can read the envelope can read what you write in the e-mail.

Who does that anyone include?

Well, system admins, the NSA, interns at your provider who have been asked to check something on the mail server, etc. Random people at your office, or roommates, etc., who have discovered certain commonly used tools like tcpdump.

Depending on the way your provider sets things up, if your neighbor is on the same provider and is experimenting with her network interface card, she may be able to put it in a mode that lets it see all the data that passes through your modem.

Encryption is one solution. But you don't know how to do that.

That's one of the reasons why, when Microsoft decided to jump on the internet and make it a bandwagon before we really had proper standard methods for encrypting and decrypting e-mail, they were behaving really irresponsibly. (Criminal negligence, in my opinion, but apparently I don't count. And now we are all criminally negligent for not fixing things.)

Now, say you are stuck. You didn't plan ahead. And you need to put your bank information in on postcards. What can you do to make things a little safer?

Five different postcards. First:
August 10th, am. Hey, Margo, look at the cliff I'm diving from here! I dove off it four times. Keep this postcard. I want it when I get back. Poor Snickers is going to miss us. Can you call the kennel and ask them to give us extra time? Luvya, George.
August 10th, pm. Margo, Just had to send you this postcard of the market here, too. Found you a nice leather purse at the third shop on the right, there. You'll love it! Don't lose this card, either, okay? Hugs and kisses, George
August 11th, am. Hi again, Margo! Great lunch at this cantina. Seven guitarists. Unbelievable music. Make sure you keep this card, too, okay? your little brother, George
August 11th, pm. Margo, do you see this bank? The seventh building on the left on this street. Doesn't it remind you of the ABC bank back home? I think I've been getting too much sun, but keep this card, too, okay? George.

August 12th. Margo, lovely blue ocean, don't you think? Snickers would love the beach, if not the water. Speaking of the kennel, they need money. You need a number to get it. I'm looking backwards at it in time at the postcards I just sent you. love you loads. George.
Too obscure?

Well, look how we did this. The fact that a PIN was being sent in reverse order was saved until the last. (Okay, there was a number other than the date in each, see? And the PIN is reversed, just to make it more of a puzzle.) That way, someone at the post office in Acapulco would be much less likely to copy those numbers down.


  • Besides the obscurity? 
  • And the possibility that all five postcards end up in Margo's mailbox on the same day? 
  • Or the possibility that she just shook her head, said, "crazy little brother," and threw them all away?
  • Or that the words "save this postcard" induced someone at the NSA to take note and keep copies?
  • etc.
So, make arrangements in advance, in person, not through the mail.

E-mail has a further disadvantage, in that each server on the path from George to Margo has to make a copy to have a copy to send on. Some servers keep those cached copies around for a week. If that intern finds the messages, it's not going to be hard for him to put them together.

Never send a bank password or PIN through e-mail. Maybe no one will see it, but it's not worth the chance.

Well, if you do have to do something like that, have Margo change the PIN as soon as she's used it. (But that could be a race, to get it changed before someone can use it to steal from you.)

Of course, the bank has strictly warned you not to tell Margo your PIN, anyway. If you do and she robs you blind, it's not the bank's fault.

What if you have to send a bank account number through e-mail? Surely that's not as dangerous as a PIN?

Well, the bad guys can't use one without the other, sure. But having either one is half-way there. It's a lot closer than having neither. It's a lot safer for them to have neither.

How often can you change your bank account number?

So, make sure this kind of information never gets on the internet, by giving it to the other person directly, face-to-face. Or use proper e-mail encryption software, like PGP or GnuPG.

With that warning, I'll suggest an approach that will reduce the risks somewhat.

First, tell Margo to read this blog post.

Then, split the account number into five or more parts. Say the account number is

Branch: 1723; account: 12340987

make it

17 23 123 40 987

(Note: 40, not 09.)

Multiply each part by a different number, say, 9, 3, 7, 5, 2
  • 17 * 9 = 153
  • 23 * 3 = 69
  • 123 * 7 = 861
  • 40 * 5 = 200
  • 987 * 2 = 1974

Send Margo seven messages like this:
  • I'm going to send you those sample English sentences now.
  • My father is sixty-nine years old.
  • I live at 153 Downing Street.
  • I must remind myself of the year 1974.
  • That two hundred pound bear is mine.
  • Send me 861 grams of chocolate.
  • What do you think of those samples?
She sends you a message like this:
Well, those aren't bad sentences. I got five of them. I need more.
And you send her the next set of sentences, again, in seven different messages:
  • Okay, I think I found some more.
  • There are seven heavens for me.
  • I have nine lives.
  • My bonnie is three oceans away.
  • Sing mine five times, please.
  • I give myself two stars.
  • Will those do?
And she sends you a reply that she got them all.

Now, Margo knows you teach pronouns to your students in the order
I, my, me, mine, myself.
So she arranges those in the correct order:
  • I live at 153 Downing Street.
  • My father is sixty-nine years old.
  • Send me 861 grams of chocolate.
  • That two hundred pound bear is mine.
  • I must remind myself of the year 1974.
  • I have nine lives.
  • My bonnie is three oceans away.
  • There are seven heavens for me.
  • Sing mine five times, please.
  • I give myself two stars.
and she divides the first set by the second, to give the account number and branch number, all in one clot:
172312340987
And, since 17 and 23 were together, she guesses that the branch number is 1723.
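The arithmetic Margo does by hand can be sketched in C. This is only an illustration of the divide-and-join step, assuming the sentences have already been sorted into pronoun order; the function names recover_parts and join_parts are made up for the example. Note that a part with a leading zero would lose the zero when joined, which is the reason for splitting at 40 rather than 09.

```c
#include <assert.h>
#include <stdio.h>

/* Divide each received product by its multiplier to recover one part.
 * Returns 0 on success, -1 if a multiplier is zero or does not divide
 * evenly (a hint that the sentences were paired up in the wrong order). */
int recover_parts(const int *products, const int *multipliers,
                  int *parts, int count)
{
    assert(products != NULL && multipliers != NULL && parts != NULL);
    for (int i = 0; i < count; i++) {
        if (multipliers[i] == 0 || products[i] % multipliers[i] != 0)
            return -1;
        parts[i] = products[i] / multipliers[i];
    }
    return 0;
}

/* Join the parts into one digit string, so that {17, 23, 123, 40, 987}
 * becomes "172312340987". Returns 0 on success, -1 if out is too small. */
int join_parts(const int *parts, int count, char *out, size_t out_size)
{
    size_t used = 0;

    if (out_size == 0)
        return -1;
    out[0] = '\0';
    for (int i = 0; i < count; i++) {
        int n = snprintf(out + used, out_size - used, "%d", parts[i]);
        if (n < 0 || (size_t)n >= out_size - used)
            return -1;
        used += (size_t)n;
    }
    return 0;
}
```

Called with the products 153, 69, 861, 200, 1974 and the multipliers 9, 3, 7, 5, 2, recover_parts() gives back 17, 23, 123, 40, 987, and join_parts() strings them together.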

Now you need to tell her the name of the bank.
I drove our Mitsubishi Galant to USJ last week.
If she understands, she replies
That's a long way to drive.
If she doesn't understand, she replies,
Where's USJ?
And you try again. Or you just punt and say something a bit easier from the outset, like
Oh, I like Mitsubishi UFJ for banking, how about you?

Now your bank's branch name is Senri Central, so you send a message something like
Hey, when was the last time you took a trip to Senri Central Station? There's a nice park near there.
to which she replies something like
Huh? I don't recall seeing a park there.
if she can't understand the branch name, or
Oh, yeah, that's a nice park.
if she recognizes a branch name from that and finds the branch name and number on the bank's website.

Now, there's one little bit of information left, the account name. That has to be verbatim, so she just asks you how to spell your name, and you copy the account name from your passbook into your reply.

Cloak and dagger stuff. Takes a long time, huh? And it still leaves a trail the NSA or any other bunch of spooks that can afford to monitor you can follow and probably work out for themselves.


At minimum, if you have to send your account number through e-mail, send it in three or four separate messages, spaced somewhat apart in time, out of order.

And send the message that says what they are and gives the order in yet another message, afterwards, to make someone interested in the number have to look for it.

And, of course, never send your PIN through the e-mail.

Saturday, August 8, 2015

Is It Chrysler Or Is It Not?

I think it really is Chrysler. That is what has me bothered.

[Quick update: I searched for the dealer sites. Last time I looked, I couldn't find dealer sites. This time I found sites for both dealers, and both had live chat services. Very helpful operators. It looks like the problem I had before, of not being able to contact them, has been solved. That just leaves the domain name problems.]

[Second update: (24 October) Nope. My optimism was not founded. Or only half-founded. There is no Chrysler-wide policy. I've communicated directly with sales managers at the dealerships involved, and at least one seems to have a policy that they want addresses they can automatically dump their regular sales announcements to, more than they want real customers. (Sorry to be so blunt about it.) 

How is it that people would rather have a bigger pay check now than understand where the money comes from, and why and when the stream is going to dry up?

I guess that's a subject for another blog, sometime.]

One good tell-tale for phishing used to be found in the return e-mail address or in the url link that took you to the web page the mail was sent to inform you about.

If it was different from the domain name of the purported sender, you could guess that the message was not legitimate. And delete it or send it to the spam bucket for your admin to collect and add to the spam filters.

For example, if the mail claims to be from PayPal, and the return address (as, when I click "Reply") is
the address is in the paypal.com domain, which I have strong reason to believe is owned and operated by PayPal. (How I have reason to believe that is a subject for another post.)

On the other hand,
is in the advertising.com domain, and who knows who is operating that right now?

Likewise, if I copy a link url (right click, copy url or copy link) and paste it into a text editor window, I can see the raw url. Again,


is a url in the paypal.com domain. Unless I have been a victim of dns poisoning, the server I would go to when I click on that should be managed by the same people that manage paypal.com. But


is a url in the accountsurveys.tv domain, and could very well be somebody phishing for my PayPal password.
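The check I am doing by eye here can be sketched in C. This is a minimal illustration, not a complete phishing filter: it assumes the host name has already been pulled out of the url or the return address, the function name host_in_domain is made up, and it does nothing about dns poisoning or look-alike characters. The one subtle point it shows is that the match has to land on a dot boundary, so that a name like notpaypal.com does not pass for paypal.com.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Return 1 if host is the domain itself or a subdomain of it, else 0.
 * The comparison ignores ASCII case. */
int host_in_domain(const char *host, const char *domain)
{
    assert(host != NULL && domain != NULL);

    size_t hlen = strlen(host);
    size_t dlen = strlen(domain);
    if (dlen == 0 || hlen < dlen)
        return 0;

    /* Compare the tail of the host name against the domain. */
    const char *tail = host + (hlen - dlen);
    for (size_t i = 0; i < dlen; i++) {
        if (tolower((unsigned char)tail[i]) !=
            tolower((unsigned char)domain[i]))
            return 0;
    }

    /* Either the whole host is the domain, or the character just before
     * the tail must be a dot. */
    return hlen == dlen || tail[-1] == '.';
}
```

With this, a hypothetical host such as survey.paypal.com is accepted for paypal.com, while paypal.accountsurveys.tv and paypal.com.accountsurveys.tv are both rejected.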

If PayPal wants me to trust the link they send me, but wants to outsource the advertising or the customer surveys, they should delegate a subdomain to their contractor.

Paypal would enter a domain name record that says, effectively,
deptA.paypal.com  => deptpaypal.advertising.com
(This is not the actual command, and it's a little more complicated than just a single line, but it's something a good systems administrator should be able to take care of in an hour, with spare time for a snack, easily. Or, maybe five minutes today, ten minutes tomorrow, and twenty-five minutes in a week from now, checking the results. Done quickly, does not cost a lot.)

With that setting, which PayPal controls (barring software bugs), advertising.com can do their PayPal related stuff using e-mail addresses in the paypal.com domain, and that lets me know that they are, in fact, authorized by PayPal to do it.

(Not 100 percent sure, but better than 90% sure. Again, I should talk more about that elsewhere. But if they ask for my password or login ID by e-mail I should contact paypal.com directly, instead. e-mail is currently not safe to send passwords by.)

And a similar setting can let the folks at accountsurveys.tv (if they really are legitimate) put a whole web site up under
which is in the paypal.com domain, and, again, let me know they are authorized by paypal to do it.

Well, even so, they should never ask me to tell them my password in a survey site. Passwords are not their business.

By the way, I think I remember seeing PayPal slip up on this kind of thing once or twice in the past (new grads or summer interns?), but they are generally pretty good about it.

(If you see a message from PayPal that comes from some domain that is not a PayPal domain, it's not them. Just hit the spam button. Unless it has personal details on you, in which case, contact PayPal directly.)

But there are lots of non-IT companies who seem to be outsourcing stuff and not realizing they need to delegate the domain names to do it under. Chrysler would seem to be one.

Shoot, they have such a variety of websites that I can't tell if any of them really have anything to do with the car company. This is bad news, and may have some influence on the state of their bottom line.

Some specifics --

For more than a year, I have been regularly getting messages with subjects like this:

Hey Joel, come back to Fremont Chrysler Jeep Dodge Ram

or like this:

Your recent Dodge Grand Caravan service at Concord Chrysler Dodge Jeep Ram

Now I really liked the family car I drove when I was a teenager. It was a Mitsubishi-made Dodge Colt (circa 1974). Wonderful car, lightweight, fast, easy to park, room to cart my stereo to church for the dances, etc.

Sometime while I was taking a break from college, my parents bought a second hand Ram van and used it for a long time. Quite dependable, allowed them to travel in reasonable comfort between their home in Texas and my grandfather's home in Utah. That van was turned over to a friend who needed transportation some ten years ago, I think.

I like Chrysler vehicles in general, but I myself have never bought a Chrysler, Dodge, Jeep, or Ram. And the last several years have definitely not seen me anywhere near Fremont or Richmond.

However, there are plenty of people in the world with the same first and last name as me. Some of them have been known to be careless when giving out their e-mail address. I have had to tell people in both England and Australia that I am not the Wookie they are looking for. (And I really am not.)

It is very difficult to tell a Chrysler dealership that they have the wrong e-mail address for a customer. I've tried that and failed several times. I'll probably try looking at the dealer sites for some sort of human-powered contact again after I post this. I'm too nice. Perhaps I should just keep sending these to the unsolicited (spam) box. A click every week or so costs me less time than this post.

On the other hand, this is a good example of how people should not use the internet. So it's not a waste of time to explain what's wrong.

So, back to the subject.

In Google's webmail interface, you have the reply arrow above the message, on the right. Next to that is a little triangle pointing down. Click that triangle and you can get a lot of fun things like "reply to all", "forward", and "filter messages like this".

You can also get "show original". Click that and a new tab opens to show the raw plaintext of the message, including the e-mail headers and, if the message is html formatted, the html source code. Basically, this allows you to see through all the tricks that illegitimate mailers use to make you think a message came from someplace else.

One is from
Who is mar0.net?

It uses links in
Who are they?

And it says in the plain-text part,
PLEASE DO NOT REPLY TO THIS EMAIL. Your message will not be read. Instead, please contact us by phone at the Customer Assistance Center:
Have you ever tried calling a 1-800-number from Japan? It doesn't come free.

And it tells me the VIN of the car and gives me a PIN to register the car at
I remember Mopar from when I drove that Colt. Bought fan belts and distributor caps and carburetor kits from them.

But moparownerconnect is not the way to do a domain name. They could have that as a vanity domain and have it redirect to
and that would make much more sense. But the domain should be in mopar.com.


I don't own that car. But now I know the VIN. If I were a black-hat wannabe, I could cause them some serious customer problems.

No. They should not have sent me that. Not until after they had verified, probably by phone, that the real customer was getting his mail, at least. (And I wouldn't put it in unencrypted e-mail anyway. Sometime I need to blog about how to do that with the current mess, and the technical/marketing/political barriers to doing it right.)

Yes. It is important to understand what I'm trying to say in this over-long blog post.

The other has similar issues, but different.

Their contractor seems to be exacttarget.com.

exacttarget.com seems to think plaintext is evil. That is, there is no plaintext, just HTML that looks almost deliberately obscured.

And they seem to think shortened urls in e-mails are cool. Don't do that. Shortened urls tell you nothing, and you really should never trust them.

Well, a shortened url within a known domain might work, I suppose. Perhaps something like
would be reasonable, if cdjr.com were publicly advertised on the Chrysler, Dodge, Jeep, and Ram websites and in the dealers' show windows as being a shorthand site for all four brands.

There's more in this vein, but I think the above illustrate some common traps that should be avoided by users of e-mail, on both sides of the corporate divide.

Mind you, I have nothing against Chrysler/Dodge/Jeep/Ram or their dealers. And I am going to try to contact them again so they can fix this stuff. I think, if I can explain things reasonably enough, they'll be willing to go to the effort of using their domain names correctly.

Who is going to contact the other nine hundred and ninety-nine domain name abusers, I don't know.

Tuesday, March 31, 2015

Splitting the Return Addresses out of the Parameters

[I think I'm done editing now.]

My previous post here and some recent posts on misc@openbsd got me thinking again about significantly mitigating buffer overflows and the like by properly separating the parameters and the return addresses into separate stacks.

I posted my thoughts and asked for suggestions. Philip Guenther pointed out that I needed to unpack my thoughts a bit better.

I responded to several of his points, but I was asleep at the keyboard and didn't get my example of a function call using the separated stacks posted in my reply.

I don't want to add to the chatter on list, so I decided to unpack things here.

This is a lot more work than it seems on the surface.

First, typing in Euclid's method for greatest common divisors was faster than looking it up someplace where I've typed it in before:
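A minimal sketch of such a routine, reconstructed from the assembler listing further down rather than copied from my gcd.c. The variable names numA, numB, and temp come from the comments in that listing. The assert() is an addition for this sketch, since the subtraction method never finishes if either argument is zero or negative.

```c
#include <assert.h>

/* Greatest common divisor by repeated subtraction (Euclid's method).
 * Both arguments must be positive. */
int gcd(int numA, int numB)
{
    assert(numA > 0 && numB > 0);

    while (numA != numB) {
        if (numA < numB) {      /* keep the larger value in numA */
            int temp = numA;
            numA = numB;
            numB = temp;
        }
        numA -= numB;           /* subtract the smaller from the larger */
    }
    return numA;
}
```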
Then I compiled to assembler. Don't even have to think to do that:
cc -Wall -S gcd.c

Then I modified the internal gcd() routine to use a separate parameter stack.

Here's how to call the gcd() routine:

         subl    $2*4, %ebp      /* allocate parameters */
         movl    3*4(%ebp), %eax /* copy n2 for the call to gcd */
         movl    %eax, 1*4(%ebp)
         movl    4*4(%ebp), %eax /* copy n1 for the call to gcd */
         movl    %eax, (%ebp)
         call    gcd             /* return address on the flow-of-control stack */
         addl    $2*4, %ebp      /* de-allocate parameters */
 /* Recover the return value. */
         movl    %eax, (%ebp)    /* move the result to divisor */

Here's the gcd() hand-compiled to access its parameters on the parameter stack: 

 gcd:
         pushl   %ebp    /* Frame for unwinding */
         subl    $4, %ebp        /* Locals */
         jmp     .L2
 .L3:
         movl    4(%ebp), %eax   /* numA */
         cmpl    8(%ebp), %eax   /* numB */
         jge     .L4
         movl    4(%ebp), %eax   /* numA */
         movl    %eax, (%ebp)    /* temp */
         movl    8(%ebp), %eax   /* numB */
         movl    %eax, 4(%ebp)   /* numA */
         movl    (%ebp), %eax    /* temp */
         movl    %eax, 8(%ebp)   /* numB */
 .L4:
         movl    8(%ebp), %eax   /* numB */
         subl    %eax, 4(%ebp)   /* numA */
 .L2:
         movl    4(%ebp), %eax   /* numA */
         cmpl    8(%ebp), %eax   /* numB */
         jne     .L3
         movl    4(%ebp), %eax   /* numA */
         popl    %ebp    /* Restore previous frame */
         ret             /* return address comes off the flow-of-control stack */


(I'm not putting the full source of that up here because I'm lazy. Sorry. This is taking more time than I have.)

That wasn't too bad, but it really didn't answer Philip's objections.

From there, things got time consuming.

Providing shims to the C library functions was easy. I put generalized shims at the top of the assembler file because it was easy, but the generalized shims require loading the call address to %eax before calling the shim. Specific shims would be one instruction shorter and faster on the call, but would (potentially) require more shims, depending on how many calls of each number of parameters occur.

There are other ways to do the shims, of course, but the shims should go away once the whole OS is compiled with the separated stacks.

Tracking which variables were where, so I could demonstrate the shims by keeping them all on the parameter stack that I allocated, required a lot of grunt work -- hand de-compiling and re-compiling.

And the comments I inserted were for my own benefit, more than for yours. Without them, I could not have tracked the variables on the stack. Thankfully, current gcc makes it easier, allocating the whole rack of local variables at function entry. I should have done the same, but I found it easier to focus on small bits of code at a time.

Note that I initially just used calloc() to allocate space for the parameter stack in a C source file I called gcddummy.c, and then shifted to using mmap, so I could map the stack up high, around 64M below the return address stack.

Placement of the two stacks requires some thinking. Putting the parameter stack above the return address stack will leave you vulnerable to certain kinds of collisions, primarily deep recursion with large local variables.

Putting the parameter stack below the return address stack will leave you vulnerable to buffer overflows more than recursion -- large overflows, that is, in the 64 megabyte range in this example. A 256M gap would be even better, but the theoretical vulnerability remains.

I chose below because it would have been much more work to move the return address stack down.

But there should be a gap of unallocated memory between the stacks. The illegal access exception that should happen (if you have proper memory management hardware) when a buffer overflow hits the gap should keep the attacker from achieving anything more than a denial of service.
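A rough sketch of how a parameter stack with a gap above it might be set up using mmap() and mprotect(), assuming a POSIX system. The function name alloc_param_stack and the sizes are made up for illustration, and this is not the code from gcddummy.c. It lets the kernel choose the address, so it does not position the stack relative to the return address stack; it only shows the inaccessible gap that turns an upward overflow into a fault.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE         /* asks glibc to expose MAP_ANONYMOUS */
#endif

#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

#ifndef MAP_ANONYMOUS
#define MAP_ANONYMOUS MAP_ANON  /* some systems spell the flag MAP_ANON */
#endif

/* Reserve stack_size + gap_size bytes in one mapping. The lower
 * stack_size bytes become the usable parameter stack; the gap_size bytes
 * above them stay inaccessible. Both sizes should be multiples of the
 * page size. Returns the initial (highest) stack address, or NULL. */
void *alloc_param_stack(size_t stack_size, size_t gap_size)
{
    assert(stack_size > 0);

    /* PROT_NONE: every access faults until the protection is changed. */
    char *base = mmap(NULL, stack_size + gap_size, PROT_NONE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED)
        return NULL;

    /* Open up only the stack itself; the gap above it stays PROT_NONE. */
    if (mprotect(base, stack_size, PROT_READ | PROT_WRITE) != 0) {
        munmap(base, stack_size + gap_size);
        return NULL;
    }

    return base + stack_size;   /* the stack grows downward from here */
}
```

A buffer on this stack that is overrun toward higher addresses runs into the PROT_NONE pages and the process takes a fault, instead of the overrun quietly continuing into whatever is mapped next.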

Recursion may be more difficult to handle, but, if the OS provides automatic extension of the stack, one might hope that it would also provide some way to detect such recursions.

Anyway, ultimately, the OS should support the second stack, so that the compiler and the source code programmer won't have to deal with the allocation, other than compiler switches for the rare case when the default size or placement of the gap is not desired.

64-bit addresses, even if not fully decoded, should help with the collision issues.

This looks pretty simple.

The problem is that no current compiler I know of does it this way. That means I have to write such a compiler myself, or I have to go find the production rules in the source of an existing compiler and see whether I can successfully modify the function call and return and parameter access without breaking the compiler.

Learning (finally) how to use compiler-compilers and lexical analyzers would be the easy part. Not breaking the compiler is the hard part.

Writing a new compiler myself might be easier, considering how familiar I'd have to get with the compiler of choice.

And then I would have to go digging through the library source, looking for all the places that directly access the stack and fixing the code that avoids the return address that isn't there any more. va_arg is just the tip of the iceberg.

It's a big project, but I think it needs to be done. It's just too easy to walk on the return addresses (and do bad things with them) otherwise.

Now, it doesn't fix the general problem of using buffer overflows to modify the behavior of programs. It just makes the easiest way to do so significantly harder.

It also opens the C language to some paradigms that are currently blocked, and those paradigms just happen to make it easier to write robust programs. But that is a discussion for a later date.

Saturday, March 28, 2015

Converging CPU Models on the 68K Model?

Around 1986, I was talking with a fellow student at BYU about the CPU wars that "everyone" knew had finally (Oh, the relief!) been declared in Intel's favor.

(Do not listen to the salesmen when they tell you their product is the de-facto standard. They are usually exaggerating, but, if they aren't, you should throw them out on their ears. "Everybody does it!" is never a good priority argument.)

I told him that the industry would find itself painted into a corner with the 80x86, and everyone would find themselves having to convert their code to the 680X0.

On the one hand, I did not reckon on Intel being willing to waste as much money on the race to shrink, and on implementing self-fulfilling prophecy heuristics, just to keep their brain-damaged intellectual property somewhat competitive in apparent performance.

If the same amount of money were spent on shrinking and optimizing ARM, there would be no contest at all. Likewise on Freescale's ColdFire.

AMD's 64 bit "extension" to the INTEL 80x86 model basically saved INTEL's bacon.

And this is what is amusing about this whole thing: All the successful modern processors have ended up looking a lot like the 680X0 -- about 16 internal mostly orthogonally accessible general-purpose registers, and a control-flow stack model somewhat supported in the silicon.

If you don't have enough registers (see the 6800), or if the registers are not orthogonal and general-purpose (see the 8080 and 8086), you tend to waste a lot of instructions in moving data between the registers where you will work on them and the temporary storage where you keep intermediate results. Extra code is extra processing time. And it's extra space in cache, which is extra time moving code in and out of cache.

But if you have too many registers (see all the true "RISC" processors and most modified RISC such as the PowerPC), maybe you don't have to waste time getting data in and out of registers, but you lose on code density. The instructions are bigger and they often don't quite do enough, so you end up using more. And that means that you end up with more code to move in and out of cache again.

And it's actually getting the code in and out of cache that generally dings your performance most in general purpose computing. (Embedded applications that don't depend on cache take less of a hit, but the problem of instructions that don't quite do all that you want them to do, requiring more instructions to finish a calculation, still ends up being a trade-off.)

So, there it is. Look at the AMD64 and the first thing you notice is that they generalized the registers in the 80x86 model. Lots of talk about register-renaming, but the biggest thing was making the register set orthogonal to the instructions. Then you notice that they added eight registers somewhere along the line.

We spend a lot of money and time in keeping our code in the intermediate level C language or higher level languages, so that moving to a new CPU is just a compile away.

And the high-level code eats away at performance, too, so we waste a lot of money on bad attempts at optimizing the compilers and interpreters.

Now, we gain quite a lot in developer efficiency by giving up some of the performance to higher-level source code. So higher-level code is actually a good trade-off.

Even though I was wrong in details, I was essentially right. The current most successful "desktop" CPU looks a lot more like the 680X0 than the 80x86, and that CPU competes well in the server market, as well. The code that it spends most of its time running is not 80x86 code any more.

And the most successful CPU overall is the ARM, which also looks a lot more like the 68K than the 80x86. (Low order bytes first is a bit of brain damage that still has to be exorcised from the industry. The more we deal with multi-byte character sets, the more people will realize why.)

Motorola wasted a lot of resources in the CPU war, trying to keep the 680X0 ahead of the 80x86 in salescrew friendly features, many of which turned out to be less-than-useful.

The most important features (full 32 bit branching and a reliable exception model) were implemented in the 68010. Many of the advances between the 68020 and the 68060 actually came in the form of getting rid of features in ways that didn't penalize customers too much when they had blindly used those features.

And ColdFire, which is Freescale's more modern descendant of the 68K, gets rid of even more of the fluff features from the feature wars, and improves the exception model significantly. ColdFire is actually competitive with the ARM architecture in performance and power curve at the same level of shrinkage.

(AMD64/80x86 is only competitive on the power curve in salesperson's dreams, and, even that, with serious handicaps forced on ARM designs by the prevailing INTEL-leaning customs of the industry.)

There is lots of irony in history, and the computing industry is no exception.

Looking ahead, there is one more big irony that I see. When we understand the run-time model a bit better, we will quit interleaving the flow-of-control (return address) stack with the data stack.

Which 8 bit processor of a bygone era had two stacks?

By the way, here's a minimal execution (register) model I see as optimal:
  • Two accumulators, fully orthogonal, to allow tracking intermediate results of two different sets of calculations without hitting the memory bus;
  • The ability to concatenate the accumulators to work with double-width integers and pointers;
  • Two index registers, fully orthogonal, to allow referencing a source and a destination non-scalar without having to save and restore pointers;
  • Fully indexable data stack, so that accessing local variables on the stack doesn't require stack manipulation or using an index register;
  • Flow-of-control stack independent of the data stack, to get rid of the problems that happen when return addresses are overwritten by data;
  • It would be preferable to keep the return addresses in a memory area protected from normal access;
  • Local, per-task static allocation area for variables that are not accessed concurrently ("thread" local in current vernacular);
  • Global static allocation area for constants, code, and variables that need to be guarded by some sort of mutual exclusion;
  • Indexable program counter for calculated jump tables and constants buried in code.
Which archaic 8-bit processor fits this description almost to a "t", if you overlook the clumsiness of the 8-bit direct page register and the lack of a bus signal to indicate when a return address is being pushed or popped? Heh.

I've found that this model supports pretty much all the hand-optimized code I've produced without excess data movement, and allows a very compact machine code expression.
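
To make the list above a little more concrete, here is a rough Python sketch of the register set as a toy data structure. It is only an illustration: the class name, the field names, the 8-bit width, and the choice of A as the high half of the concatenated pair are my assumptions for the sketch, not a description of any particular chip's silicon.

```python
# Toy model of the register set described above. All names are illustrative.
from dataclasses import dataclass, field

MASK8 = 0xFF  # assumed accumulator width of 8 bits


@dataclass
class Registers:
    a: int = 0  # first accumulator
    b: int = 0  # second accumulator
    x: int = 0  # index register, e.g. source pointer
    y: int = 0  # index register, e.g. destination pointer
    data_stack: list = field(default_factory=list)    # parameters and locals
    return_stack: list = field(default_factory=list)  # return addresses only

    @property
    def d(self) -> int:
        """A and B concatenated into one double-width value (A assumed high)."""
        return (self.a << 8) | self.b

    @d.setter
    def d(self, value: int) -> None:
        self.a = (value >> 8) & MASK8
        self.b = value & MASK8

    def local(self, offset: int) -> int:
        """Read a local at `offset` from the top of the data stack, no popping."""
        return self.data_stack[-1 - offset]


regs = Registers()
regs.a, regs.b = 0x12, 0x34
wide = regs.d                  # both accumulators read as one 16-bit value
regs.d = 0xBEEF                # one wide write updates both accumulators
regs.data_stack.extend([10, 20, 30])
top = regs.local(0)            # indexed access to the newest local
below = regs.local(2)          # indexed access deeper in the stack
```

The point of `local()` is that reading a variable two slots down needs neither a pop nor a spare index register, which is what the "fully indexable data stack" bullet is asking for.
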

Now, the 6809 doesn't quite fit this model. The direct page register can't be indexed, which is okay for compiled code but not for interpreted code.

And it doesn't have a separate address space for a return address stack. I have a vague memory about something in the timing diagrams that might allow synthesizing a separate address space, but that's probably wishful thinking. I don't have timing diagrams handy, so I can't check.

Also, the 16 bit address limits are a bit tight.

If having products of your own company that compete with each other were a bad thing, Motorola would have been right to have de-emphasized the 6809.

But look at Freescale's current lineup. PowerPC, ColdFire, ARM? They essentially have three of their own products competing directly with each other in a lot of target product spaces.

When any two are less popular, the third keeps things stable.

The only problem here is that Freescale's salescrew have to educate themselves a bit more about the lineup. That's actually a good thing, because there is nothing more damaging to an industry than a pack of lazy salespeople.

The 68K can be programmed to fit this model, with a bit less compact machine code, except for the separate address space for the return addresses. (Again, I need to check the timing diagrams and, with certain members of the 68K family, the memory access function codes, to see if there might be some way to cache the stack.) Still, with a 32 bit address, you can get the return addresses far enough away from everything else that accidentally overwriting the return address becomes much less common, and most of the attacks on code that involve buffer overruns and such to overwrite the return addresses become very difficult.

The 68K provides other ways to concatenate small integers, so natural concatenation of accumulators is not necessary.

The PowerPC (which Freescale manufactures in cooperation with IBM) can be programmed to fit this model, too, but it usually isn't. Too many registers go unused. (It would be an interesting research project into register counts to test whether compiling to the large register set, single stack model on the PowerPC is more efficient than compiling to the above model. Note that the PowerPC code density is about 50% of the 68K at best.)

The ARM (which Freescale also manufactures under license) can also be easily mapped to this model. 64 bit ARM should allow even better separation of the return address stack from other parts of memory.

Renesas's SuperH has some odd non-orthogonalities that require a bit of intimacy with the processor, but it can be mapped to this run-time model as well.

80x86? Not so much. Too much funky stuff happening when you aren't expecting it, too much optimizing to the standard C runtime model which INTEL helped create. AMD64, on the other hand, should map rather well to the model. Maybe, if the register renaming doesn't get too much in the way.

Now, for the 68K, ARM, SH3, and other processors with a few more registers than you need, the extra registers can always be used.
  • A third index register allows one to access two sources and a destination without having to save and restore index registers.
  • An extra index register could be used to differentiate between global constants and global variables. Yet another could be used for linking to global or parent process callbacks, particularly those for maintaining concurrent access coherency -- semaphores, monitors, mutual exclusion code, etc.
  • Another index register could maintain a second data stack, to separate parameters from local variables. (It's an artificial distinction, I know, but there are many distinctions not required by the math that help optimize code and execution.)
  • You might even dedicate an indexable register to token-threaded interpreter virtual instruction-code list execution.
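
For readers who haven't met token threading, the last bullet can be sketched roughly like this in Python. The virtual program is a list of small tokens, each token selects a routine from a table, and the variable `ip` plays the part of the dedicated index register. The token names (`LIT`, `ADD`, `MUL`, `HALT`) and the function `run` are made up for this sketch.

```python
# Minimal token-threaded interpreter. Token names and layout are hypothetical.
LIT, ADD, MUL, HALT = range(4)


def run(program):
    stack = []  # data stack for the virtual machine

    def lit(ip):
        stack.append(program[ip])  # inline operand follows the token
        return ip + 1              # skip over the operand

    def add(ip):
        right = stack.pop()
        left = stack.pop()
        stack.append(left + right)
        return ip

    def mul(ip):
        right = stack.pop()
        left = stack.pop()
        stack.append(left * right)
        return ip

    def halt(ip):
        return None                # signals the dispatch loop to stop

    table = [lit, add, mul, halt]  # token value indexes the routine table

    ip = 0                         # stands in for the dedicated index register
    while ip is not None:
        token = program[ip]
        ip = table[token](ip + 1)
    return stack


# (2 + 3) * 4 expressed as a token list
program = [LIT, 2, LIT, 3, ADD, LIT, 4, MUL, HALT]
result = run(program)
```

On a real CPU the dispatch loop is a couple of instructions: fetch the token through the index register with auto-increment, then jump through the table. That is why giving it a register of its own pays off.
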
Returning to the separate stack that is currently not part of the standard C runtime model, another benefit of a separate stack is reduced function call overhead. Since the return address is elsewhere, call frames really don't need special maintenance. If there is no need to walk the variable space of the calling routines, no frame pointers need to be generated at all.

A downside is that it adds a separate page for memory management functions to keep track of, with the added complexity of the operating system needing to make sure the return address stack and the parameter stack both always stay in memory while the process is in memory.

However, if the CPU manufacturer could be convinced to help just a little, the return address could be cached in a dedicated stack-oriented cache, and the return address page management could be greatly simplified, compared to the rest of memory. Careful caching could basically postpone, or even eliminate, most of the remaining overhead of subroutine calls, reducing the overhead to the cost of the two ordinary jumps.

In addition, the return address stack does not have to maintain coherence with external memory. Spurious writes to the cache (in other words, not from the instruction fetch mechanism) could be silently dropped, or could trigger a memory access violation.
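
Here is a small Python sketch of that idea, under assumptions of my own: a toy `Machine` with a made-up memory layout, where only the call mechanism may write to the return address region and an ordinary store aimed there is either dropped or trapped. The sizes, names, and the `trap_on_spurious_write` switch are all hypothetical.

```python
# Toy machine with a guarded return-address region. Layout is hypothetical.
class MemoryAccessViolation(Exception):
    """Raised when an ordinary store targets the return-address region."""


class Machine:
    DATA_SIZE = 8    # addresses 0..7: ordinary data
    RETURN_SIZE = 4  # addresses 8..11: return addresses only

    def __init__(self, trap_on_spurious_write=False):
        self.memory = [0] * (self.DATA_SIZE + self.RETURN_SIZE)
        self.rsp = self.DATA_SIZE  # return stack pointer, grows upward
        self.trap = trap_on_spurious_write

    def _in_return_region(self, addr):
        return self.DATA_SIZE <= addr < self.DATA_SIZE + self.RETURN_SIZE

    def store(self, addr, value):
        """Ordinary data write, as a normal store instruction would issue."""
        if self._in_return_region(addr):
            if self.trap:
                raise MemoryAccessViolation(f"data write to address {addr}")
            return  # silently drop the spurious write
        self.memory[addr] = value

    def call(self, return_address):
        """Only the call mechanism pushes onto the return stack."""
        self.memory[self.rsp] = return_address
        self.rsp += 1

    def ret(self):
        self.rsp -= 1
        return self.memory[self.rsp]


def overrun(machine):
    """Write 6 values into a 4-slot buffer at addresses 4..7, past its end."""
    for offset in range(6):
        machine.store(4 + offset, 0xBAD)


quiet = Machine()
quiet.call(0x1000)
overrun(quiet)
survived = quiet.ret()  # the return address is untouched by the overrun

strict = Machine(trap_on_spurious_write=True)
strict.call(0x1000)
try:
    overrun(strict)
    trapped = False
except MemoryAccessViolation:
    trapped = True
```

In the interleaved single-stack model, the same overrun would land right on top of the return address. Here the buffer still gets trashed, but control flow does not go wherever the attacker's data says.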

(Certain kinds of exception processing are usually done by re-writing return addresses, but there are other, better ways to do that on a decent CPU.)

I'm going to make another wild prediction. Once people get used to having 16 registers to use as the code needs it, instead of being restricted to the standard C model that INTEL helped converge around their 80x86 execution model, we'll see more multi-stack solutions. One of the first splits will be the flow-of-control stack.

(More about this model some other time.)