Misunderstanding Computers

Why do we insist on seeing the computer as a magic box for controlling other people?
Why do we want so much to control others when we won't control ourselves?

Computer memory is just fancy paper, CPUs are just fancy pens with fancy erasers, and the network is just a fancy backyard fence.

(original post -- defining computers site)
Showing posts with label 68000. Show all posts

Saturday, May 6, 2023

No, Higher Costs Were Not the Real Reason for the 8088 in the IBM 5150

[This is a reply to a comment on Hackaday: https://hackaday.io/project/190838-ibm-pc-8088-replaced-with-a-motorola-68000]

*****

For some reason, I can't reply directly to your comment with the eejournal opinion piece [https://www.eejournal.com/article/how-the-intel-8088-got-its-bus/], but I suspect my earlier comment was too brief.

Let me try that again:

The 68000 had built-in support for 8-bit peripheral devices, both in the bus signals and the instruction set. Most of the popular implementations, including the Mac, made heavy use of 8-bit parts, and Motorola had application notes on interfacing other companies' 8-bit devices as well as their own. You could mix 8-bit peripherals and 16-bit memory without stretching.

Motorola even had an app-note about interfacing the 68000 directly to 8-bit memory, but any decent engineer would have looked at the note and realized that the cost of 16-bit memory was not really high enough to justify hobbling the 68000 with 8-bit memory. That's one of the reasons the 68008 didn't come out until a couple of years later, and the primary reason that very few people used it. There was no good engineering reason for it.

Well, there was one meaningful cost of 16-bit wide memory: You couldn't really build your entry-level model with 4 kB of RAM using just eight 4 kilobit DRAMs. (cough. MC-10.) You were forced to the next level up, 8 kB. 

IBM knew that the cost of RAM was coming down, and that they would be delivering relatively few with the base 16 kB RAM (16 kilobit by 8 wide) configuration. Starting at 32 kB (16 kilobit by 16) would not have killed the product. Similarly, the cost of the 68000 would come down, and they knew that.

Management was scared of that.

Something you don't find easily on the Internet about the history of the IBM Instruments S9000 was when the project started. My recollection was that it started before the 5150. It was definitely not later. It had much more ambitious goals, and a much higher projected price tag, much more in line with IBM's minicomputer series. There was a reason for the time it took to develop and the price they sold it at. But even many of the sales force in the computer industry didn't understand the cost of software and other intangible development costs.

Consider how much damage the 5150 did to IBM's existing desktop and minicomputer lines. Word Processing? Word Perfect was one of the early killer apps for the 8088-based PC. Spreadsheet? Etc.

IBM management knew too well that if they sold the 5150 with a 68000 in it instead of the 8088, a lot of their minicomputer customers were going to be complaining to high heavens about the price difference. They knew the answer, but their experience showed them that too many of the customers would not believe it.

That was the real reason. They hoped the 8088 would be limited enough to give them time to maintain control of the market disruption.

I think they were wrong. But taking the bull by the horns and driving the disruption would have required a level of foresight and vision that very few managers within IBM had (and very few outside IBM, either).

*****

Anyway, my point was that higher cost wasn't the real reason any more than the (at the time, much-rumored) technical deficiencies of the 68000.

Sunday, July 3, 2022

A Critique of Motorola's 68XX and 680XX CPUs

I want to note at the top here, that this is not about which company's CPU was better. This is not about comparing CPUs at all.

And this is not meant to disparage Motorola. Motorola did a pretty decent job of designing each of their CPUs, especially considering that they were pioneering more than microprocessor design. Engineers with experience designing CPUs were basically all already employed, mostly by other companies. (And many of those CPU engineers didn't really understand CPUs all that well, after all.) So Motorola was pioneering the design of CPUs in general, not just microprocessors.

The engineers at Motorola did a good job. But nobody's perfect. 

Taking these in the order that Motorola produced them:

6800 Niggles:

(1) It's not hard to guess that the improper optimization of the CPX (compare X index register) instruction was an attempt to be too clever, a bad case of penny-pinching and setting arbitrary deadlines, an oversight, or any and all of the foregoing. But, as a result, the branches implementing signed and unsigned comparisons just don't do what they would be expected to do after CPX.

  • C (Carry) is simply not affected by CPX on the 6800 (and 6802), so the branches implementing unsigned compare, BCC, BCS, BHI, and BLS just won't work after CPX. 
  • V (oVerflow) is the result from comparing the most-significant byte only, so the branches implementing signed comparison, BGE, BGT, BLE, and BLT fail in hard-to-predict ways after CPX. 
  • N (Negative) is also the result from comparing the most-significant byte only. It may not seem that this is a problem for BPL (branch if plus) and BMI (branch if minus), but the programmers' manual says neither N nor V is intended for conditional branching. It seems to me that the N flag will actually be set correctly after the CPX, giving the sign of the result of the thrown-away subtraction of the argument address from the address in X. But using BPL and BMI in ordered comparison is going to be a bit fiddly no matter what. You probably just won't get what you thought you wanted if you use BPL or BMI after CPX.

Z (Zero) is the result of all 16 bits of the compare, so BEQ (branch if equal) and BNE (branch if not equal) after a CPX work as expected.

In the abstract sense, pointers were thought at the time to be necessarily unordered, so it sort of didn't seem to matter. Ideally, you wouldn't be comparing addresses for order. But real algorithms often do want to give pointers order, and that meant that, on the 6800, you would have to use a sequence of instructions to cover all the cases in ordered comparison, because you couldn't rely on CPX alone.
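On the 6800, one re-entrant way to get an ordered compare of two 16-bit pointers is to go a byte at a time through an accumulator, since CMPA sets all the flags correctly. A minimal sketch, where PTRA and PTRB are hypothetical two-byte direct-page pointer variables (high byte first) and the branch targets are placeholders:

  ; unsigned ordered compare of PTRA against PTRB, avoiding CPX
        LDAA PTRA     ; high byte of the first pointer
        CMPA PTRB     ; C, V, N, and Z are all valid after CMPA
        BHI  ABOVE    ; high byte already decides it
        BCS  BELOW
        LDAA PTRA+1   ; high bytes equal, so compare the low bytes
        CMPA PTRB+1
        BHI  ABOVE
        BCS  BELOW
  ; equal pointers fall through to here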

This mis-feature was preemptively prevented in the designs of the 68000 and the 6809, and was fixed, pretty much without issue, in the 6801. In the 6805, it's prevented by making the X register an 8-bit register anyway, more on that below.

(2) Addressing temporary variables and parameters on a stack required using X, and if you had something you needed in X, you had to save X somewhere safe -- which meant on a stack if you wanted your code to be re-entrant. But the 6800 had no instructions to directly push or pop X. That left you with a conundrum. You had to save X to use X to save X. 

So you had to use a statically allocated temporary variable. Statically allocated temporaries tend to introduce race conditions even in single-processor designs, because you really don't want to take the time to block interrupts just to use the temporaries, especially for something like adjusting a stack pointer.

You can potentially work around the race conditions in some cases by having your interrupt-time stack pointers separate from your non-interrupt-time stack pointers, but that can also get pretty tricky pretty quickly.

The 6801 provides push (PSHX) and pop (PULX) instructions for X.
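Side by side, the difference looks something like this (XTMP is a hypothetical statically allocated temporary, and BUFFER is a placeholder address):

  ; 6800: no PSHX/PULX, so X goes to a static temporary
        STX  XTMP     ; race hazard if an interrupt handler shares XTMP
        LDX  #BUFFER  ; now X is free for other work
        ...
        LDX  XTMP     ; restore X

  ; 6801: the same thing, re-entrant
        PSHX          ; X goes on the S stack
        LDX  #BUFFER
        ...
        PULX          ; restore X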

Stack-addressable temporary variables and parameters were supported by definition in the 68000 and 6809 designs, but not on the 6801. They were considered out of scope on the 6805, but were addressed on descendants of the 6805.

(3) This niggle is somewhat controversial, but using a single stack that combines return addresses and parameters and temporary variables is a fiddly solution that has become widely accepted as the standard. Even though it is accepted, and learning how to set up a stack frame is something of a rite of passage, setting up stack frames to keep the regions on the stack straight consumes cycles, even when it can be done without inducing race conditions (see the above niggle about using X to address the stack).

Separating parameters and temporaries from return addresses is supported by design on the 68000 and 6809, but not on the 6801 or 6805.

(4) The lack of direct-page mode op-codes for the unary operators was, in my opinion, a serious strategic miss. Sure, you could address variables in the direct page with extended mode addressing, but it cost extra cycles, and it just felt funny. 

To explain, the binary instructions (loads, stores, two-operand arithmetic and logic) all have a direct-page mode. This allows saving a byte and a cycle when working on variables in the direct page (called zero page on other processors -- addresses from 0 to 255). 

The unary instructions (increment/decrement, shifts, complements, etc.) do not. The irony is that the unary instructions are the ones you use on memory when you don't want to waste time and accumulators loading and storing a result.
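To make the asymmetry concrete, here is a sketch for a counter variable in the direct page (the cycle counts are from memory, so treat them as approximate):

  ; binary operators get the short direct-page form ...
        LDAA COUNT    ; direct mode: 2 bytes, 3 cycles
        STAA COUNT    ; direct mode: 2 bytes, 4 cycles
  ; ... but the unary increment has to spell out the full address
        INC  COUNT    ; extended mode only: 3 bytes, 6 cycles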

This may have been another attempt to save transistors by not implementing every possible op-code. But a careful re-examination of the op-code table layout map indicates that it should have been possible without using significantly more transistors. In fact, I'm guessing it actually required more transistors to do it the way they ended up doing it. 

Or it may have been an attempt to avoid running into the situation where they would need an op-code for something important but had already used all of the available codes in a particular area of the map. But, again, re-examining the op-code map would have revealed room to fit the op-codes in. 

Maybe there just wasn't enough time to re-examine and reconsider the omissions before the scheduled deadlines, and they thought absolute/extended addressing should be good enough. 

I'll come back to the reasons it really wasn't further down.

This one was also fixed in the designs of the 68000 and the 6809, and sort-of in the 6805, but not in the 6801.

Fixing it in the 6801 would have been an awkward after-the-fact tack-on, but I'll look at that below. 

(5) The 6800 had a few instructions for inter-accumulator math -- ABA (add B to A), SBA (subtract B from A), and CBA (compare B with A, which is SBA but without storing the result). 

But it's missing the logical instructions AND, OR, and EOR (Exclusive-OR) of B into A, and doesn't have any instructions at all going the other direction, A into B. 

Surprisingly, this is not hard to work around in most cases, but the workarounds are case-by-case tricks with the condition codes. Otherwise, you're back to using statically allocated temporaries, and care must be taken to avoid potential race conditions by such things as using the same temporaries during interrupt processing.
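When the condition-code tricks don't apply, the fallback looks something like this (TMP8 is a hypothetical statically allocated byte, with the usual interrupt-sharing caveats):

  ; A := A AND B on the 6800, which has no inter-accumulator AND
        STAB TMP8     ; spill B to a static temporary
        ANDA TMP8     ; AND it back into A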

This is fixed in the design of the 68000, and eliminated from the scope of the 6805, effectively fixed in the 6809 (by the addition of stack-relative addressing for temporaries), and partially addressed in the 6801 (by adding 16-bit math, the most common place where it becomes a problem, more below).

(6) The 6800 has no native 16-bit math other than incrementing, decrementing, and comparing X, and incrementing and decrementing S. Synthesizing 16-bit math is straightforward, but -- especially without the inter-accumulator logical operators -- it does require temporary variables, requiring extra processor cycles and potentially inducing race conditions.

Also, you usually need one or more extra test cases to cover partial results in one or the other byte, or the use of a logical instruction to collect the results, and it's easy to forget or just fail to complete the math, per the problem with CPX.

And you need 16-bit arithmetic to deal with 16-bit addresses.
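A synthesized 16-bit add on the 6800 might look like this, where VAL1, VAL2, and RES are hypothetical two-byte variables stored high byte first:

  ; RES := VAL1 + VAL2, one byte at a time
        LDAA VAL1+1   ; low bytes first
        ADDA VAL2+1
        STAA RES+1
        LDAA VAL1     ; then high bytes, propagating the carry
        ADCA VAL2
        STAA RES
  ; the flags now describe only the high byte, so testing the
  ; full 16-bit result still takes extra work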

This is solved on the 6809 and 6801 by adding 16-bit addition and subtraction. On the 68000, the problem becomes 32-bit math, and it's solved for addition and subtraction, but, oddly, not quite completely for multiplication and division, more below.

(7) To explain this last niggle, of the above niggles, (1), (2), (3), (5), and (6) can be solved in the software application/operating system design by appropriate declaration of global pseudo-register variables, and globally accessible routines to handle the missing functionality, exercising care to separate variables and code for interrupt-time functions from those for non-interrupt-time functions. (These global routines and variables are a core feature of most 8-bit operating systems.)

For example, if your system design declares and systematically uses something like the following:

  ORG $80 ; non-interrupt time global pseudo-registers
PSTK RMB 2 ; two bytes for parameter stack pointer
QTMP RMB 2 ; temporary for high bytes of 32-bit quadruple accumulator
DTMP RMB 2 ;  temporary for 16-bit double accumulator
XTMP RMB 2 ; temporary for index math and copy source pointer
YTMP RMB 2 ; temporary for index math and copy destination pointer

  ORG $90 ; interrupt time global pseudo-registers
IPSTK RMB 2 ; two bytes for parameter stack pointer
IQTMP RMB 2 ; temporary for high bytes of 32-bit quadruple accumulator
IDTMP RMB 2 ;  temporary for 16-bit double accumulator
IXTMP RMB 2 ; temporary for index math and copy source pointer
IYTMP RMB 2 ; temporary for index math and copy destination pointer

... and  if all the processes running on your system respect those global variable declarations, then you may at least have a way to avoid the race conditions.

But that chews a piece out of the memory map for user applications. 

Now, if the unary operators all had direct-page mode versions (see niggle (4) above), the processor could also define a direct-page address-space function code, along with several other such function codes, allowing the system designer to optionally include hardware that separates the direct-page system resources from other resources in the address map, such as general data, stack, code, interrupt vectors, etc.

Two or three extra address lines could be provided as optional address function codes, to allow hardware to separate the spaces out.

This looks kind of like the I/O instructions on the 8080 and 8086 families, but it isn't separate instructions, it's separate address maps.

An example two-bit function code might be

  • 00: general (extended/absolute) data and I/O
  • 01: direct-page data and I/O
  • 10: code/interrupt vectors
  • 11: return address stack

Such extra address function signals can improve the utilization of the cramped 64 kilobyte address space, even though they would require increasing the number of pins on the processor package or multiplexing the functions onto other signals, raising the count of external support parts. 

But they provide a place for such things as bank-switch hardware, in addition to general I/O and system globals and temporaries, without having to eat holes in general address space. And completely separating the return pointer stack from general data greatly increases the security of the system.

I'm not sure if Motorola ever did so in any of their evolved microcontrollers, but this could also potentially allow optimizing access to direct-page pseudo-registers when direct-page RAM is provided on-chip in integrated system-on-a-chip devices like the 6801 and 6805 SOC packages.

The 68000 provides similar address function codes, but the address space on the 68000 is so much bigger than 64 kilobytes that the address function codes have been largely ignored.

Before Motorola began designing new microprocessors, such niggles in the 6800 were noticed and discussed in engineering and management within Motorola. The company decided to analyze code they had available, including internally developed code and code customers shared with them for the purpose of the analysis, looking for bottlenecks and inefficient sequences that an improved processor design could help avoid. The results of this code analysis motivated the design of the 68000 and the 6809. 

The 68000 and the 6809 were designed concurrently, by different groups within Motorola.

 

68000 Niggles:

The 68000 significantly increases the number of both accumulators (data registers) and index registers, and directly supports common address math in the instruction set. It also widens address and data registers to 32 bits. This solved a lot of problems, but it left a few niggles.

(1) The processor was excessively complex. Having a lot of registers reduced the need for complex instructions and for instructions that operated directly on memory without going through registers, but the 68000 did complex instructions and instructions that operated directly on memory, as well. 

IBM was just beginning work on the 801 (the precursor to the ROMP) at the time, and reduced instruction sets were not yet a common topic, so the assumption that complexity was necessary is understandable.

Still, the complexity required a lot of work to test and properly qualify products for production. 

(2) They got the stack frame for memory management exceptions wrong. That is, memory management hardware turned out to work significantly better using the approach they did not initially choose to support, so the frames they had defined did not contain enough information to recover using the preferred memory management techniques. This was fixed in the 68010.

(3) The exception vector space being global made it difficult to fully separate the user program space from the system program space. This was also fixed in the 68010.

(4) Constant offsets for the indexed modes were limited to 16 bits. This seems to be another false optimization -- not fatal because they included variable (register) offsets in the addressing modes, so you could load a 32-bit offset into a data register to get what you wanted. But it still had a cost in cycle counts and register usage. This was not fixed until the 68020, and then they went overboard, making the addressing even more complex, which made the 68020 even harder to test.

(5) They added hardware multiplication and division to the 68000, but they didn't fully support 32-bit multiply and divide. This also was not fixed until the 68020. It can make such things as accessing really large data structures in memory suddenly become slow, once the index into the data structure exceeds 32,767.

Of the above, (4) and (5) could conceivably have been dealt with in the initial design, if management had not been pushing engineering to find corners to cut. The first three were problems that simply required experience to get right.


6809 Niggles:

The 6809 does not increase the number of accumulators, but it does add instructions that combine the two 8-bit accumulators, A and B, into a single 16-bit accumulator D for basic math -- addition, subtraction, load, and store. 

On the other hand, it does increase the number of indexable registers to six, and it adds a whole class of address math that can be incorporated into the addressing portion of the instructions themselves, or can be calculated independently of other instructions. 

It supports using two of the index registers as stack pointers, and thus supports stack addressing, so that race conditions can generally be completely avoided by using temporary variables on stack. (In comparison, the 68000 can use any of the 8 address registers visible to the programmer as stack pointers.)

One of the stack-pointer capable registers can be used as a frame pointer, making stack frames less of a bottleneck. Or it can be used as a separate parameter stack pointer, pretty much eliminating the bottleneck and improving security. (In comparison, the 68000 includes an instruction to generate a stack frame, which, of course, you don't need when you use properly split stacks. It also includes an entirely superfluous instruction to destroy a stack frame.)
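For reference, the 68000 frame instructions mentioned above work like this (using A6 as the frame pointer is only a convention, and the 8 bytes of local space is illustrative):

  SUBR    LINK    A6,#-8    ; push A6, A6 := SP, then SP := SP - 8
          ...               ; locals live at -8(A6) through -1(A6)
          UNLK    A6        ; SP := A6, pop the saved A6
          RTS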

One of the index-capable registers is the PC, which simplifies such things as mixing tables of constants in the code. (This is also supported on the 68000, making a ninth index-capable register for the 68000.)

One of the index registers (DP, for direct page) is a funky 8-bit high-byte partial index for the direct page modes it inherits from the 6800. (This is not done on the 68000, but any of the 68000's address registers can be used in a similar way, with short constant offsets for compact code and reduced cycle counts.)

All unary instructions have a direct page mode op-code, which saves byte count if not cycle count.

(1) As a minor niggle, I can't tell that not providing a full 16-bit base address for the direct addressing mode actually saved them anything in terms of transistor count and instruction cycle count, but we are probably safe in guessing that was their reasoning for doing it that way. It is still useful, although it might have been more useful to have provided finer-grain control of the base address of the direct page. (See above about using any address register in the 68000 in a similar way.)

The DP can be used, with caveat, as a base for process-local static allocations, which greatly reduces potential for inadvertent conflicts in use of global variables and race conditions.

(2) Another niggle about the direct page, the caveat, is that the direct page base is not directly supported for address math. Just finding where the direct page is pointing requires moving DP to the A accumulator and clearing the B accumulator, after which you can move it to one of the index registers. Cycle and register consuming, but not fatal.
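The sequence the 6809 forces on you might look like this:

  ; X := base address of the current direct page
        TFR  DP,A     ; DP can only be read through an 8-bit register
        CLRB          ; D is now DP * 256
        TFR  D,X      ; X points at the bottom of the direct page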

(3) A third niggle about both the direct page and the indexed mode: it seems like cycle counts for both could have been better. The 6801 improved cycle counts for both, making the 6809 seem less attractive to engineers seeking speed. It would have been nice for Motorola to have followed the 6801 with an improved 6809 that fixed the DP niggles and the cycle-count niggles.

(4) The 6809 also does not have address function code signals. The overall design provides enough power to implement mini-computer class operating systems, but the 64 kilobyte address space then limits the size of user applications. Address function signals that allow separating code, stack, direct page, and extended data would have eased the limits significantly.

On the other hand, widening the index registers would have done even more to ease the addressing restrictions. (I've talked about that elsewhere, and I hope to examine it more carefully sometime in a rant on how the 6809 could have evolved.)

(5) Other than those niggles, the 6809 is about as powerful a design as you can get and still call a CPU an 8-bit processor. In spite of the fact that it would have meant letting the 6809 compete with the 68000 in the market, they could have used the 6809 as the base design of a family of very competitive 16-bit CPUs.

In other words, my fifth niggle is that Motorola never pursued the potential of the 6809. 

(6) But not really -- 8-bit CPUs are generally focused on keeping transistor count down for 8-bit applications, so hardware multiplication and division of 16-bit numbers doesn't really make sense in an 8-bit CPU design. This is probably the reason the 6809 only had 8-bit by 8-bit multiplication, and also probably the reason for the irregular structure of the operation. 

A similar 8-bit division of accumulator A by accumulator B, yielding 8 bits of quotient and 8 bits of remainder, might make sense, but I'm not sure we should want to waste the transistors.

16-bit multiply and divide would have been good for a true 16-bit version of the 6809, but that would include a full 16-bit instruction set.

 

6801 Niggles:

When the 6809 was introduced in the market, it was still a bit too much complexity in the CPU to comfortably integrate peripheral parts -- timers, serial and parallel ports, and such -- into the same semiconductor die that contained the CPU. So Motorola decided to fix just a few of the niggles of the 6800 for use as a core CPU in semi-custom designs that included on-chip peripheral devices.

(It's a commonly misunderstood point: the 6801 actually came after the 6809 historically, but it is best understood as a slightly improved 6800, not as a stripped-down 6809. Three steps forward, three steps back, half a step forward.)

As noted above, they fixed the CPX instruction in the 6801, but they did not fix the lack of direct-page unary instructions. They also added instructions to directly push and pop the X index register, which greatly helped when you had something in X that you needed to save before you used X for something else. 

And they added the 16-bit loads, stores, and math that combined A and B into a single 16-bit double accumulator D -- similar to the 6809, which overcame a lot of the other niggles about the 6800. In particular, you don't feel the lack of an OR B with A instruction to make sure both bytes of the result were zero, because the flags are correctly set after the D accumulator instructions. 
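A sketch of the difference, with VALUE and RESULT as hypothetical 16-bit variables:

  ; 6801: 16-bit math directly in the double accumulator D (A:B)
        LDD  VALUE    ; 16-bit load
        SUBD #1000    ; 16-bit subtract; flags reflect all 16 bits
        STD  RESULT
        BEQ  WASZERO  ; works without any collect-the-bytes fixup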

And they included the 8-bit multiply A by B from the 6809. They also included a couple of 16-bit double accumulator shifts, but only for D, not for memory, which is a very minor niggle, an engineering trade-off.

They also added an instruction to add B to X, ABX, to help calculate the addresses of fields within records. 

This brings up niggle (1) -- ABX is unsigned, and they did not include a subtract B from X instruction. Being able to subtract B from X, or add a negative value in B to X, would have significantly helped with allocating local variable space on the stack. As it is, ABX is primarily useful for addressing elements within records and structures. 
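A sketch of the intended use, with hypothetical names for the record pointer and field offset:

  ; address a field within the current record
        LDX  RECPTR   ; pointer to the record
        LDAB #FIELD   ; unsigned byte offset of the field
        ABX           ; X := X + B, unsigned only
        LDAA 0,X      ; fetch the field
  ; with no SBX, moving X downward (say, to carve out local space
  ; on an X-addressed stack) takes a longer sequence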

Although I/O devices tended to be assigned addresses in high memory on early 6800 designs, the 6801 put the built-in I/O devices in the direct page. They also put a bit of built-in RAM in the direct page, starting at $80.

But, as I noted above, niggle (2) is that they did not add direct-page mode unary instructions.

If they had done so, either they'd have broken object-code compatibility with the 6800, or they'd have had to spread the direct-page op-codes into awkward places in the 6800 op-code map, which definitely would have cost transistors that they wanted for the I/O devices and such. Either way, I think it would have been worth the cost.

I put together a table showing one possible way to spread them out among unimplemented op-code locations in the inherent/branch section of the op-code table for a chapter of one of my stalled novels, and I'll just copy below a list of where I allocated the direct page op-codes:

  • NEG direct: $02
  • ROR direct: $12
  • ASR direct: $03
  • COM direct: $13
  • LSR direct: $14
  • ROL direct: $15
  • ASL direct: $18
  • DEC direct: $1A
  • INC direct: $1C
  • TST direct: $1D
  • JMP direct: $1E
  • CLR direct: $1F

That doesn't prove anything other than that there were ultimately enough op-codes available. But I'm guessing this layout could have been done with a hundred or fewer extra transistors -- transistors that admittedly would then be unavailable for counters or port bits. But it could be done, and it wouldn't have cost that much.

Also, with these in the op-code map, they could have provided this version of the CPU for compatibility, and then provided another version with the direct-page op-codes correctly laid out for customers who were willing to simply re-assemble their source code. (That's all it would have taken, but many customers wouldn't be willing to take a chance that something would sneak up and bite them.)

One possible more efficient layout would have been to repeat the addressing of the binary op-code groups. Working from the right in the opcode map, there are four columns for accumulator B binary operators and four columns for accumulator A binary operators:

  • $FX is extended mode B, and $BX is extended mode A;
  • $EX is indexed mode B, and $AX is indexed mode A;
  • $DX is direct page B, and $9X is direct page A;
  • $CX is immediate mode B, and $8X is immediate mode A. 

In the existing 6800, this pattern continues down two more columns for the unaries, but then you get the unary A and B instructions:

  • $7X is extended mode unary;
  • $6X is indexed mode unary;
  • $5X is B unary;
  • $4X is A unary.

Then you have inherent mode instructions in columns $3X, $1X, and $0X, with the branches in column $2X.

In a restructured op-code map, it could be done like this:

  • $7X is extended mode unary;
  • $6X is indexed mode unary;
  • $5X would be direct page unary;
  • $4X would be B unary;
  • $0X would be A unary.

And the inherent mode operators would be more densely packed in the $1X and $3X columns.

This would require either moving the negate instructions or the halt-and-catch-fire instruction, I suppose. [I'm not finding my reference that had me thinking the 6801's test instruction was at $00. Cancel that thought.] Interestingly, when Motorola laid out the op-code map for the 6809, they kept A and B in columns $4X and $5X, and put the direct page in column $0X -- and left the negate at row $X0, so that they had to move the test instruction. [Again, I'm not finding my reference on the location of the 6809's test instruction. But they did leave negate where it was.]

Also interestingly, the 6801 has a direct-page jump to subroutine, which could be put to good use for a small set of quick global routines (like stack?). (The op-code is $9D, which some sources say was one of the accidental test instructions in the 6800).

Niggle (3) about the 6801 is that I think they should have split the stack. Add a parameter stack U, and then pushes and pops (PULs) would operate on the U stack, while JSR/BSR/RTS would operate on the S stack. This would make stack frames much less of a bottleneck, make it possible to reduce call and return cycle counts, and increase general code security somewhat.
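Purely hypothetical, since no 6801 derivative ever had it, but calls on such a split-stack 6801 would look something like this:

  ; caller: parameters go to the U stack, return address to S
        LDAA ARG1     ; hypothetical argument
        PSHA          ; pushed on U, not S
        JSR  SUBR     ; only the return address lands on S
  ; callee: the parameter is on top of U, no return address in the way
  SUBR  PULA          ; pop the argument directly -- no frame setup
        ...
        RTS           ; return address comes cleanly off S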

(Note again that the 6809 and the 68000 both directly support this kind of split stack. It was the education system that failed to teach engineers to use it.)

And I'll note here that the 68HC11 derivative of the 6801 added, among other things, a Y index, but no parameter stack.

 

6805 Niggles

Really the only niggle I have with the 6805 is the lack of a separate parameter stack, and the lack of any push/pop at all in the original 6805. Motorola did add pushes and pops to some derivatives of the 6805, but they were on the same S stack as the return address was going to.

The idea of an 8-bit index that could have a 16-bit base (as opposed to an offset) was novel to me when I first looked at the 6805, but it is rather useful. Instead of thinking in terms of putting a base address in X and then adding an offset, you think in terms of having a constant base address -- like an array with a known, fixed address, and the X register provides a variable offset. Indexed mode for binary operators includes no base, 8-bit base, and 16-bit base, allowing use anywhere in the address space. 

A small caveat is that unary operators do not have 16-bit base address indexed versions. This is a valid engineering tradeoff, and they cut the right corners here, fully supporting unary instructions for variables in the direct page.
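A sketch of how the 6805 scheme reads in practice, with TABLE and FLAGS as hypothetical fixed-address arrays and INDEX a hypothetical index variable:

  ; the base is the constant; X supplies the variable offset
        LDX  INDEX    ; X is an 8-bit index, 0..255
        LDA  TABLE,X  ; binary operators allow a full 16-bit base
        INC  FLAGS,X  ; unary operators get only the 8-bit-base form
                      ; (or direct page), per the caveat above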

The 8-bit index does not support generalized copying and other generalized functions needed to support self-hosted development environments (without self-modifying code), but that's not necessarily a problem. Hosted development environments are much more powerful tools than self-hosted. (I think a very small Tiny-BASIC interpreter could be constructed without self-modifying code, but that's more of an application than a self-hosted dev environment.)

It does make the CPX instruction much simpler, since it only has to compare 8 bits.

Motorola ultimately extended the index with an XHI in some derivatives of the 6805, which would have allowed self-hosting for those derivatives, but we won't go there today. Also, we won't look at the 68HC11 in detail today. Nor will we do more than glance at the 68HC12 and 68HC16, even though both are quite interesting designs -- in spite of not having split stacks.

I think this is enough to show that Motorola really did do a fairly decent job with their CPU designs.

Actually comparing CPUs, by the way, requires producing a lot of parallel code implementing several real-world applications for each CPU compared. I'd like to do that someday, but I doubt I'll ever have the spare time and money to do so.

Sunday, February 20, 2022

When the IBM 9000 Scientific Actually Existed (PC History and Context)

(This is not about the more recent mainframe called Enterprise System/9000. It's about a scientific (originally) workstation in the early 1980s called the System 9000.)

I was sure that the 68000-based IBM System 9000 scientific workstation actually predated the 8088-based IBM PC model 5150, but all sorts of articles say it started development after the machine we know as the IBM PC and was released in 1982 or 1983.

Nope, nope, and nope. I rediscovered something today (late Fri. night, Feb. 19, '22, when I probably really should have been going to bed). My memory was not wrong:

http://www.columbia.edu/cu/computinghistory/cs9000.html

(A picture says a lot. I'd like to post a picture here, but the only one I can find is the one at the end of the link above, which doesn't offer liberal use terms. That one shows a vertically stacked system with a plotter on top of a system unit that includes function keys on a slanted panel above the keyboard tray, and, mounted above the plotter, a monitor with floppy disk drives to its left. It looks like something you'd use in a lab.)

So there were running prototypes in 1980. Not released until 1982/83, but running prototypes in 1980.

IBM shot itself in the foot on this one. Big time. 

Effectively gave the industry to Bill Gates and Microsoft.

And I have to remind myself of a little bit of religious mystery: 

Microsoft was the only company -- well, one of the few companies -- willing to sell a defective operating system -- which thus allowed a lot of people to make money off fixing Microsoft's problems. (That's the short version. Maybe I shouldn't say "defective". "Unfinished" would probably be a better way to put it.) And it was good enough to let some people start keeping their family history work on their computers, among other things. That's why God (or natural consequence, for you atheists and agnostics out there) let Microsoft take over the market.

(I should note that shooting yourself in the foot is what this world is for. Shooting yourself in the foot doesn't have to be a fatal error, and, even though it's painful, it can be a good experience if you learn from it.)

 


Unpacking that:

First, when talking about software, unfinished and defective mean essentially the same thing -- that it doesn't (yet) perfectly satisfy the customer's needs. 

But there is no such thing as finished software. If you have a piece of software that's in use, someone is going to be regularly finding ways in which it doesn't meet spec, and other ways in which it could be extended and improved. 

Even now, software always has defects. Nobody sells truly finished software. When talking about software, finished means dead. 

But back in the 1970s and 1980s, most companies in the nascent software market intended to at least try to make their products free of known defects before they put them on the market. Microsoft, on the other hand, was willing to put products out that were known (by their own estimate) to be only 80% finished. That meant that the customers could be using it while they worked on a better version.

(It also meant customer frustration because of overly optimistic interpretation of sales literature claims. I got bitten hard several times by that, and, yes, the pain is still there. The second time, yeah, I should have been more wary. The third time, I guess it was my fault for taking the claims too literally yet again. Talk about painful mistakes, but I've learned that Microsoft's stuff doesn't work very well for me.)

In the 1990s, Microsoft made too much of this principle with their 80/20 rule of getting a product out the door when 80% of the function was implemented, and letting the customer help figure out the remaining 20%. All too often, it was closer to 20/80 in my opinion, but even that is not exactly wrong in the agile approach to technology. 

("Agile" is actually a discipline that I approve of in concept, if not in extant implementations. But I'm being a little more general than Agile techniques here.)

Other companies, including IBM, back in the 1980s still tended to give a product too much polish before turning it over to the customer. That approach may make for more satisfied engineers (maybe), but it gives the customer less say in how the product should develop, and less opportunity to revise and refine their ideas of their requirements early on, while the product requirements are easier to rework.

Microsoft BASIC is one example of this principle. (And Tiny BASIC is an even more extreme example.)

Dartmouth stripped down the definition of the Fortran (ForTran <= Formula Translator) language to produce a definition of a BASIC (Beginners' All-purpose Symbolic Instruction Code or something like that) language. Fortran was too complicated (for some meaning of complicated) for ordinary people to understand, so they made it simpler. More important, Fortran had to be compiled, meaning only people with a compiler could use it, putting it even further out of reach of ordinary people.

But even the Dartmouth definition of BASIC was more than your usual user thought they wanted at the time. A programmable calculator with a large screen was just fine for an awful lot of purposes, and was more than they/we had before then.

So Paul Allen and Bill Gates borrowed (with tacit non-disapproval) a certain company's PDP minicomputer at night for a couple of months and worked up a stripped-down derivative of a derivative of Dartmouth's BASIC (in the present intellectual property regime, they would probably have owed a lot of royalties to Dartmouth and DEC, among others) and got it running on the very early microcomputers, starting with the 8080 and continuing to the 6502, 6800, and many others.

(If I've made it sound easy, it wasn't: they ignored their studies during the two months they worked on the first 8080 version, drew on understanding acquired from previous experience elsewhere, and put in long hours the whole time.)

Microsoft BASIC was very incomplete. But it filled a big need. And customers were able to give them feedback, which was very important. (A different, somewhat freely distributable version of BASIC, Tiny BASIC, was even more incomplete, but it filled a similar big need in a more varied, but smaller overall market.)

Family history, at the time, was a field in which the professionals had very arcane rules about paper and ink to use, format of data, data to include, and so forth. As incomplete as anyone's implementation of BASIC was, programs written in BASIC were able to essentially help the researcher get the data into computer files fairly correctly and fairly painlessly. 

(Family and personal history are a couple of the hidden real reasons for the need for personal computing systems.)

The same sort of thing happened with Microsoft DOS and Windows operating systems. They were incomplete and even defective in many ways, but they provided a framework under which a variety of programs useful in business applications could be written and shared/sold. 

CP/M from Digital Research was more complete than DOS, but more expensive. So was the Macintosh, from Apple. (Microware's OS-9/6809 was very nicely done for the time and, on Radio Shack's Color Computer, was priced more within reach, but it had an image of being too technical for the average user, and Microware really wasn't trying hard to sell it in the general market.)

Essentially, the incomplete (or defective) nature of Microsoft's products provided a virtual meeting ground for the cooperation of a large number of smart people to fix, enhance, and extend the products.

Similar things could be said for Commodore's 6502-based computers, but they had the limits of an 8-bit architecture, and Jack Tramiel and Commodore's board of directors were way too content just selling lots of cheap stuff and letting the customers figure out what to do with it. 

When Tramiel picked up the (68000-based) Amiga, he didn't have a way for people to bridge the gap between Commodore's earlier 8-bit stuff and the Amiga.

One thing that can be said about Microsoft, they understood managing and selling upgrades.

Incompleteness and even defects can be a valuable feature of a technological product.

So this much was not bad, really. It was when Microsoft got too big for their britches in the mid-1980s that things started going really south, and when Microsoft refused to give up the tacit monopoly they had developed in the mid-to-late 1990s that things went permanently south.

Compare this to Apple's Macintosh? Apple had good technological reasons to keep closer control at the time. Even the 68000 didn't have enough oomph to provide a stable graphical user experience if too many people got their hands in the pie. But however good the technological reasons, maintaining that control and the resulting lack of approachability ultimately hurt the Macintosh's acceptance in the marketplace, more so than the price to the end-user.

Intel's development of the 8086 followed a similar pattern of required upgrades, moving from less complete to more, although they almost killed themselves with the 80286.

Both Intel and Microsoft are now eating the consequences of trying too hard to be the only dog. Too big, too heavy, too much momentum in the technology of the past. And we, the market, are eating the consequences of letting them get into that position.

All the big companies are in the way. When a company gets too big, it can't help but get in the way, especially when it is busy trying to establish or keep a tacit monopoly position in its market. (I'm very concerned about Google, even though I use their stuff. Google's stuff works for me now -- Microsoft's never did -- but I'm still looking for alternatives. Monopolies are not a good thing.)

I've wandered from the topic. 

Yes, using the 68000 would have required IBM to work harder to keep focused on a limited introductory feature set similar to the 8088 IBM PC.

So what about the IBM 9000 and Motorola's 68000 and the IBM 5150?

Could IBM have based their personal computer offering on the 68000 instead of the 8088 and been successful?

The IBM 9000 has been regularly taken up, along with Apple's Lisa, as an example of how developing a PC based on the 68000 would be prohibitively expensive for the personal computer market. 

But both are more of an example of how the 68000 allowed an awful lot more to be done than the 8086 -- so much more that over-specification became way too easy. It had lots of address space and decent speed at 16 bits, and wasn't slowed down significantly at 32 bits. And it was hard to tell the idealists in the company (the board of directors and the sales crew) who wanted to add features, "No, we can't do that yet," until it was too late -- until the product had departed significantly from what the customers wanted and was way over budget and way past the delivery date or market window. That's a significant part of the reason both the 9000 and the Lisa did not come out in 1980, and ultimately did not do well in the market.

The Macintosh is a counter-example. Much tighter focus in development, more accessible entry price, more accessible product. (And borrowing heavily from the lessons of the Lisa.) 

I often say that the 68000 may have been "too good" for the market at the time, since it seems to have required someone with Steve Jobs' focus and tenacity, and the lessons of the Lisa, to successfully develop the Macintosh.

The 9000 was targeted at the scientific community, and it was intended to be a "complete" (meaning all parts IBM) solution. That kept it off the market too long and kept the price high.

Could IBM have stripped down the 9000 design and built a machine comparable to the IBM PC with a 68000 and sold it at a comparable price? Or even started over with a simpler goal and successfully developed a 68000-based PC?

People have been "explaining" that it would have been "prohibitively expensive" for an awfully long time.

Sure, the 9000 was significantly more expensive than a PC, but it came with significantly more stuff. Expensive stuff. A complete, (relatively) solid real-time OS in ROM, for instance: 128K of ROM vs. the 8088 PC's 20K of ROM with BIOS+BASIC only. Base configuration included disk storage integrated into the OS, vs. no disk and no OS beyond BASIC in the original IBM PC. And 128K of RAM vs. the IBM PC's 16K in base configuration (not just double -- 8 times the amount).

The base configuration of the 9000 came with floppy disk storage, touch-panel display, keyboard designed for laboratory use, ports to interface it to scientific instruments, and something called memory management. All of that was very expensive stuff at the time. (I don't remember if the plotter was standard in the base configuration.) 

And it was expensive to put any of that stuff on an 8088 PC.

Speaking of memory management, some people thought the segment registers in the 8086 were for "memory management", but that's just plain wrong. They were not designed for the same class of function. No mechanism to control class of access, no bounds registers to set segment size, no help when trying to page things in and out of memory. More of a cheap substitute for bank switching. Again, it was not a bad thing, just not what some people thought it was.

FWIW, the 68000 didn't need bank switching at the time because of the large address bus. And it came with 8 address registers, 7 of which could easily be used as true segment registers without the clumsy 16-byte granularity of the 8086 segment registers. Segment size still had to be enforced in software on the 68000, rather than hardware.
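The granularity difference can be illustrated with a little address arithmetic. This is a back-of-envelope sketch of standard 8086 real-mode addressing versus a flat 68000 base register, not a model of any particular machine:

```python
# An 8086 segment register is a 16-bit value shifted left 4 bits
# (hence the 16-byte granularity), added to a 16-bit offset to give
# a 20-bit physical address. A 68000 address register just holds a
# full, byte-granular flat address.

def x86_physical(segment, offset):
    """8086 real-mode address: segment * 16 + offset, 20-bit result."""
    return ((segment << 4) + offset) & 0xFFFFF

# Many different segment:offset pairs alias the same physical address:
print(hex(x86_physical(0x1000, 0x0234)))  # 0x10234
print(hex(x86_physical(0x1023, 0x0004)))  # 0x10234

# A 68000 "segment" is just a base in an address register -- any byte
# address, no 16-byte rounding, no 64K offset limit in hardware:
base = 0x10235                # an odd base like this is unrepresentable
print(hex(base + 0x10))       # as an 8086 segment start; 0x10245
```

As the text says, the 68000 gives you the flat reach; bounds checking on such software "segments" is the part that stays in software.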

As the guy at the link I posted above said, strip the 9000 down to the kind of machine the original PC was and it would have been very competitive in 1980.

How much more would it have cost than the original IBM PC?

8 extra data lines. 4 extra address lines. 12 extra traces on the mainboard, twelve more lines in the expansion bus, two extra buffers. And they could've fudged on the extra address lines, left them out of the first model and kept the first model limited to a single megabyte address space. A megabyte of address space was huge at the time. 

Other than the bus connectors, less than ten dollars, including markup.

Bus connectors were often mentioned as a blocking point at the time. They had sources for the bus connector they used in the System/23 Datamaster, and the connector was not overly expensive. But, with just 52 lines, it was just big enough for 8 data bits, 20 address bits, and the control signals and power lines they wanted. They would have either had to get a wider connector, or they would have had to use a second connector like the one in the AT bus from the outset. Forty dollars (after markup) for just the bus connectors seemed large to them, I guess, even though I think they should have been able to see that the cost would come down at the volumes they would be purchasing, even with their woefully underestimated demand.

RAM chips? The original board laid them out in four banks of eight. Arranging that as two banks of sixteen instead of four banks of eight would not have killed any budgets. The one minor disadvantage was that you physically had to start with twice the base configuration RAM, 16 chips instead of 8. (The max on-mainboard would not have changed.) But that does not seriously harm the end price, either. Calculating the price as a portion of the base sticker price for the 8088 IBM PC, it would have added something like fifty dollars.

BIOS ROM? The original had four 2K ROMs, didn't it? Arrange those in 16-bit wide pairs and you're done. Same count of ROMs, one extra buffer. Three bucks added for liberal markup on the buffer and PC board traces. (I know there were engineers who felt that it was somehow sacrilege to use a pair of 8-bit ROMs instead of a single 16-bit wide ROM, but, no, it wasn't. And, like I say, it did not really add significantly to the cost to do it in pairs, if you're going to have four ROMs anyway.)

The ROM for BASIC? Yeah, that would have had to be done as a pair of ROMs, so add, I think it was, twenty dollars at manufacturing prices plus markup for two 8K by 8  ROMs instead of 15 dollars for a single 16K by 8 ROM.

Would the 68000's oft-proclaimed lower code density have meant more ROM?

No. Code density on the 68000 was not worse than on the 8086, unless you deliberately wrote the 68000 code like you were transliterating 8086 code. 

The 68000 does not have as good code density as the 8-bit 6809, but the 6809 is exceptionally code dense when programmed by someone who knows what he is doing. Different question.

If you were using compilers of the time to do the coding, compilers for the 68000 were often really weak at using the 68000's instruction set and addressing modes. The compiler engineers really seemed to be writing the code generators as if for some other, much more limited CPU.

If an engineer did the BIOS and BASIC in assembler and didn't bother to learn the CPU, sure, he would get similarly bad results. But the 68000 had a regular instruction set. It shouldn't take more than, say, eight hours of playing around with a prototyping kit (such as Motorola's inexpensive ECB) to understand.

Ah. Microsoft. Yeah. Their BASIC for the 6809 was not a model of either space or cycle efficiency. Apparently they did just map 8080 registers to 6809 registers and key 8080 instructions to something close in the 6809 instruction/addressing mode repertoire, and did the conversion automatically with no cleanup afterward. So, if they had gone that route, using Microsoft's BASIC, they'd have had to use a second 16K ROM. Add 15 dollars.

Do I fault Microsoft for their BASIC for 6809? Yeah, I guess I do, a little. It's a dog. Sorry. Some people think it's representative of 6809 code. No.

Peripheral chips? I've heard people talk about lack of 16-bit peripheral chips for the 68000. Why do they talk about that? I don't know.

The 68000 specifically included instructions to support the use of 8-bit port parts without any additional buffers or data bus translation. Hanging a 6821 on a 68000 was literally no more difficult than hanging one on a 6800 or 6809. Likewise the 6845 video controller that got used in the original PC. And non-Motorola parts would be no more difficult. Completely beside the point.

Price of the processor, yes. But not the four times price that is often tossed around. That's a small lot price. IBM's projected manufacturing, underestimated as it was, still would have allowed much better pricing on the CPU. So the 68000 would have added a hundred dollars to the cost of the first run. 

Availability? Motorola was always conservative on availability estimates. It was not an actual problem, although I suppose some of IBM's purchasing department might not have known that.

Scaling the operating system down from the 9000's? That could have been a problem.

But IBM ultimately infamously went to a third party for the OS of the 5150 PC, anyway. 

Operating systems aren't that hard to port to a CPU as capable as the 68000. I'm sure Microware would have been game for porting OS-9/6809 to the 68000 a couple of years earlier than actually happened. OS-9 was a good OS that was cheap enough for Radio Shack to sell for the 6809-based Color Computer, and it took them less than a year to go from the 6800 to the 6809 with both OS-9/6809 and Basic09 in 1979/1980. On to the 68000 in 1980/1981 instead of 1981/1982? Not a problem. 

Some people worry about the position-independent coding used in OS-9, but the 68000, like the 6809, directly supports PC-relative addressing (IP-relative, in Intel-speak), so you don't need a linking loader. No need for relocation tables. A module can load at any address and be linked and used by multiple processes, themselves loaded at arbitrary locations, without patching long lists of addresses.

You point an index/address register to the base of the module, and the caller knows the offsets, and the callee doesn't care where it is loaded. Everything is relative. That's why OS-9 can be real-time, multitasking with no MMU.
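That scheme can be sketched in a few lines. The module contents, offsets, and addresses below are invented for illustration; the point is that the image is copied anywhere unchanged, and callers dispatch by base-plus-offset, like `JSR offset(An)` with An holding the module base:

```python
# Sketch of the OS-9-style position-independent module scheme:
# entry points are fixed offsets from the module's load base, so the
# same image runs at any address with no relocation-table patching.
# (All names, offsets, and addresses here are hypothetical.)

MODULE = ["init", "read", "write"]   # stands in for code at offsets 0..2

def load(memory, base, module):
    """Copy the module image, byte-for-byte unchanged, to any base."""
    for offset, routine in enumerate(module):
        memory[base + offset] = routine
    return base                      # the caller keeps only the base

def call(memory, base, offset):
    """Like 'JSR offset(An)': dispatch relative to the module base."""
    return memory[base + offset]

memory = {}
b1 = load(memory, 0x4000, MODULE)    # one process's load address
b2 = load(memory, 0x9A00, MODULE)    # another process, same image
print(call(memory, b1, 1))  # read
print(call(memory, b2, 1))  # read
```

Nothing in the image had to change between the two loads -- that is the property that lets OS-9 share modules among processes without an MMU.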

Another possibility for third party OS and BASIC was Technical Systems Consultants, the producers of TSC Flex (similar to CP/M) and Uniflex (like a stripped-down Unix, but in a different way from OS-9 -- not real-time, not position-independent). They knew their way around assembly language on Motorola's CPUs, too. 

Several possible third-party sources.

Total price increase? $200, maybe $250.

Pushing the price up to $1750 from $1500 would not have been prohibitive -- not even a problem -- if they had simply done the necessary market research.

This sounds theoretical? 

I did some tentative design work about that time, similar to Motorola's ECB prototyping board for the 68000. Comparing it to the IBM PC specs, using factory volume prices on the CPU, they could have built a 68000-based equivalent with a base configuration of 32K of RAM instead of 16K (to make the 16-bit data bus easy) and still sold it at about 200, maybe 250 dollars more than the original IBM PC, without losing money. The market would have accepted that.

I mean look at what people were paying for OS-9/6809 systems in 1981. Well, okay, those systems came with at least one disk drive in base configuration for something more than $2000 total. But the acceptance would have been there.

I remember from conversations with people who worked at IBM that there was concern about the PC competing with the System/34. That was a sort-of valid concern, really. But that's the only thing that might have required them to set the price above $2000. And, no, the System/34 was a lot more than just a CPU, just like the 9000 was. It was misplaced concern.

I've mentioned the 6809 above, and I'll mention it again here. Motorola shot themselves in the foot by failing to upgrade the 6809 to be a true 16-bit CPU, either with expanded index registers, or with full-width segment registers, to break the 64K address barrier without external bank switching.

They also shot themselves in the foot by over-designing the 68020. Way overdone, that one. If they had gone directly in 1984 to what they eventually called the CPU32 instead, the early market for RISC CPUs would have taken a temporary but serious ding.

Was the 8088 PC a mistake, then?

Not exactly. 

I rather think that going only with the 68000 would have been a mistake of a different sort, and I'm pretty sure I would have complaints about that, as well.

I think I've hinted at some of the reasons above, but I can sort-of summarize by noting that the reason monopolies are bad in technological fields is that all the technology we have is unfinished, broken outside the context in which it was designed. We can't escape that. Solutions are always context-specific. And technology that is too powerful from the start can really get in the way.

So, I say all of this, and yet I maintain that going only with the 8088-based 5150 we know and love as the IBM PC was a mistake.

I personally think, if they had been really smart, they would have released both the 8088 PC and a 68000 PC at the same time, running relatively compatible OSes and software. Deliberately sowing the field with variety keeps options open for everyone.

Variety helps the market more than hinders it. But that is a separate topic for a separate rant another day. (Or an alternate reality novel. This is part of a theme I'm trying to explore in a novel that I never seem to have time to work on any more.)

[EDIT 20240519:

The Wikipedia article has links to more information, such as a 1984 BYTE article (Feb., p. 278). ]


Wednesday, December 22, 2021

Alternate Reality -- 33209 (pt. 2, summer 1981)

***** Alternate Reality Warning *****

So, for whatever reason, you are reading my fantasies about an alternate world where computer and information technology took a significantly different turn in the early 1980s. Maybe you started with the different Mac OS-9 that could have been. Maybe you have read the beginning of the 33209 technology timeline or the beginning of my story about the alternate world of the 33209.

At this point, the timeline has gone well beyond what I have published of the 33209 story, but I'm going to keep plotting it out in advance because I've hit a writer's block on my stories. I'm not sure the timeline here will match the story as it plays out, though.

(You're not sure you're interested? How odd. ;-) 

* Summer 1981:  

In late June, several engineering and CS students from the University of Texas's main campus in Austin come to visit, ask to join the group for the rest of the summer, and are accepted. Some of the UT Austin group are undergraduates, some are doing graduate work.

Microware formally begins a top-secret project to re-write OS-9/6809 to use a split stack and merge in other features and concepts from Susumu. TSC begins publicly announced re-writes of both Flex and Uniflex, merging features from Susumu into their systems. Both Microware and TSC begin developing versions of their OSes for the 68000 as well, Microware secretly and TSC publicly.

IBM and Apple collaborate with the Microware and TSC OS projects, Apple openly about the non-top-secret projects, IBM on the quiet. Various students from the group are invited to Clive and Chapel Hill for short internships, as well.

Bill Mensch and others from Western Design Center request to be permitted to come observe the research group's work, and, after a short discussion among students and sponsors, are allowed to visit.

By the end of June, the students working with TI's new version of the 9900 have gotten their prototypes running most of Susumu, better than the ports to the other processors except for the 6809, 68000, and 682801.

More games, and more complex games are evidence of the functionality of the language and the system, and, with other applications that the students develop and share, help motivate the students to add support for modularization and versioning (tracking and control) to the language. 

Versioning drives the development of archiving tools, and those who hadn't yet built fast tape hardware for their systems do so, for the mass storage necessary for backup and archiving. Even at a mere 4 kilobaud with one bit per baud, one channel of one side of a forty-five minute cassette yields the storage capacity of several of the floppy diskettes of the time. (Cassettes longer than twenty-three minutes per side tend to stretch more quickly, so the students generally tend not to use 90-minute or two-hour cassettes, except for permanent one-time backups.)
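For what it's worth, the arithmetic behind that capacity claim checks out. The floppy figure below is an assumption on my part (single-sided 5.25" diskettes of the era held very roughly 90-180 KB):

```python
# Back-of-envelope check of the cassette capacity claim:
# 4 kilobaud at 1 bit per baud, one channel, one 45-minute side.
bits_per_second = 4_000
seconds_per_side = 45 * 60
capacity_bytes = bits_per_second * seconds_per_side // 8
print(capacity_bytes)                  # 1350000 bytes, ~1.3 MB per side

# Against an assumed 160 KB single-sided floppy of the time, one
# cassette side really does hold "several" floppies' worth:
print(capacity_bytes // (160 * 1024))  # 8
```

So even at a modest tape rate, a cheap 45-minute cassette side out-stores a stack of the diskettes the students would otherwise have needed.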

Several of the students develop control hardware for their cassette recorders or cassette tape decks, to allow various levels of computer control.

Apple releases its first models of the Apple IX line running OS-9 on the 6809 in early July, using the 6847 for console video output and the 6845 for optional workstation video output. The BIOS ROM is essentially Susumu.

Also in early July, more magazine articles appear, including rushed articles discussing early versions of Split P and the common code base. Outside the group, Split P generates particular interest at Bell Labs and the parent telephone companies.

And also in early July, Motorola offers the group samples of the upgraded 6809s and 6845s, along with integrated support chips for the upgraded 6845, and most of the group who are not out for the summer volunteer to experiment with the parts in various configurations, including using 6845s with non-Motorola CPUs.  

Within a few days, students have Micro Chroma style computers with the upgraded 6809 running Susumu, using the 68471 for video output. Computers employing the upgraded 6845 take a bit more time to get up and running, but several are running within a week and work begins on improving the code generation for the processor's new features. 

The UT Austin students arrange to move their coursework to the local branch of UT for fall semester so they can continue to engage with the group. 

Bill Mensch arranges with a couple of the UT Austin students to help him design an upgrade to the 6502 that will be optimized for Susumu and Split P, and returns to Mesa. 

And also early in July, a couple of students get interested in text windows on the Univac terminals and start trying to implement them on their own computers. One of the teachers mentions the Xerox PARC project and graphical user interfaces, and several students begin back-burner, on-again-off-again projects trying to figure out how to make such a thing work. Various experiments in pointing devices are attempted. Joysticks are quickly produced, and functional light pens have been produced for the 6845-based displays by mid-July. Various attempts at tracking a pencil writing or sketching on a page are made, without much success.

And again, also starting early in July, inspired in part by the dc and bc calculator utilities the students have discovered in Unix and in part by Tiny BASIC, a group of students work to re-implement dc on top of the Forth grammar of Split P, and then re-implement bc on top of the C grammar. They then extend both with formatted output and other functions usually found in BASIC. At IBM's request, they add graphics and sound functions, and certain functions useful in a business environment.

Overall, July sees more general steady progress, with Susumu becoming stable on the upgraded version of the 9900, and, towards the end of July, on the other non-Motorola CPUs. 

Discussion of a license for Susumu and the hardware the students are developing reaches a head, and a software license similar to the licenses Berkeley and MIT would later use for open source software in our reality is produced by the students and approved by the college, with help from the sponsoring companies.

Several hardware design licenses are produced and approved by the college, and the college and the sponsoring companies help arrange for patent research and applications.

More stringent sharing agreements are established for participating in and getting support from the research group, similar to the OpenBSD project's source code requirements and the software-freedom terms of the GNU General Public License that would develop later in our reality.

Unix on the 6801 remains a bit slow but usable. It is a bit snappier on the 682801, more so on the 6809, even more so on the new version of the 6809. 

Unix is even more usable on the 68000, even though virtual memory management functions on the 68000 can't be handled well without a full memory management unit. Simple bank switching in an address space as large as the 68000's is, of course, not very workable for hardware memory management. And, of course, the students haven't quite figured out virtual memory yet, anyway, even with the help of the UT students and some industry engineers. The complexities of virtual memory were not generally well understood at the time.

Experiments begin with using four of the 68000's address registers as explicit run-time segment registers for Unix, mirroring their use in Susumu, to help work around lack of memory management.

Similar experiments in segmentation on the 8086 do not fare as well, because of the design of the 8086's segmentation, but still proceed. 

In very early August, Motorola publicly announces sampling of an upgrade to the 6809:
  • Instruction timings for the 681809 are brought up to par with the 6801 and 682801, and the 681809 is announced as an SOC core with most of the I/O library for the 682801 available, including on-board DMA and bank switching.

  • The 6809 already has direct page mode op-codes for the unary instructions, so that doesn't change. However, the direct page mode has been added to the index post-byte, allowing memory indirection on direct page variables and use of the load effective address (LEA) instruction to calculate the address of direct page variables.

  • A new I/O page register similar to the direct page register is added. Unlike the original DP register, there are no op-codes allocated for I/O page access. Addressing via the IOP register is implemented entirely as a new mode in the index mode post-byte, and access is entirely via arguments to the TFR and EXG instructions.

    Since no codes for the IOP are available to the PSH and PUL arguments, pushing and popping IOP requires a transfer through another register. Since the I/O page is treated as more of a hardware design constant, this is not expected to be a serious bottleneck in the use of the I/O page.

  • 64 byte spill-fill stack caches are implemented for the S and U stacks, similar to those in the 682801, but with a shorter spill point for S, to cover the larger register set -- 22:34:8.

  • In addition to the stack caches, the initial SOC versions of the 681809 have a half-kilobyte of internal RAM for use as fast direct-page RAM.

  • User and system modes are also implemented similarly to the 682801's. Access to the I/O page register can be prohibited in user mode, generating a new memory access violation interrupt.

  • Likewise, address mode function code outputs are provided, indicating addresses relative to the PC, to the S stack, to the U stack, to the direct page, to the I/O page, or to extended (absolute) addresses. Two of the eight codes are reserved.

    Because the index post-byte provides the ability to specify each of the separations, address space use for the 681809 is much more flexible than for the 682801 when the address space can be determined at compile time.

    As with the 682801, utilization of the extra address space is expected to require caution. However, since the index post-byte can specify the mode and therefore the space being used, the 681809 should be able to use the extra space more effectively, as long as the space is known at compile time. Better gains in address space are expected:

    • code <= 64K
    • general data <= 64K
    • direct page <= 256 of 64K (because of the direct page register)
    • I/O page <= 256 of 64K (because of the I/O page register)
    • parameter stack <= 64K
    • return stack <= 64K

    With the 681809, the actual aggregate active memory space per process is expected to be able to often exceed 128K, but not typically exceed 192K. Again, work with Susumu and Unix bears these expectations out.
     
  • The initial SOCs provide bank switching somewhat similar to the 682801's. But the mapping registers have four more bits than those on the 682801, allowing maximum physical addresses of 16M. With the larger maximum address space, bank switching is simplified for the two stack spaces and the direct and IO pages.

  • Small integer hardware division per the 682801 is also provided.

Motorola also announces the Micro Chroma 681809 as a prototyping kit for the new processor and the 68471, again with Susumu in ROM.

Improvements in the 68455 over the 6845 are not as easy to quantify, being mostly a redesign to better fit a video cache used for either text or graphics, along with improved support for hardware scrolling and generally improved graphics support. Modes which switch between direct pixel output and character ROM mapped output simplify the support circuitry for designs that include being able to switch between graphics mode and text mode screens. 

New support circuits for the 68455, to simplify the digital-to-analog output conversion for color and gray-scale, are announced, along with a companion DRAM refresh/bus arbitration part, the 68835, capable of directly supporting up to 512 kilobytes of video RAM, and a nearly identical 68837 that includes bank switching for the CPU side, for processors that can't address large address spaces on their own.

TI announces the 99S200, a version of the 9900 with the local stack frame implemented as entries in a spill-fill cache instead of in general memory. In addition, the 99S200 adds a separate return address stack with spill-fill caching and improved call overhead, and removes the PC entry in the local stack frame. The 99S200 does retain a 512-byte internal bank of fast RAM. SOC parts and libraries are also announced, initial SOCs including both bank switching and DMA. 

TI also announces the 99V180 video display, compatible with the 9918A, but supporting up to double the 9918A's horizontal and vertical resolution, if the external digital-to-analog circuitry and display device are fast enough.

And TI announces the TI-99/16 home computer, utilizing the new processor and video generator in a memory configuration less constricted than the 99/4's, with both Susumu and an improved, 99/4-compatible BASIC in ROM. A simplified version of the main circuit board for the new home computer is also made available as a prototyping kit, with only Susumu in the ROM.

Shipments of the 99/16 are expected to begin in plenty of time for Christmas.

A few days later, Motorola pre-announces the 68010 about half a year ahead of its announcement in our reality, with the improvements seen in our reality:

  • The Popek and Goldberg virtualization features, 
  • Exception stack frames corrected to allow recovery from bus faults, 
  • Three instruction loop mode, 
  • Vector base register,
  • And improved instruction cycle counts for certain instructions.

-- and a little bit more:

  • New addressing modes -- 32-bit constant offsets for indexed modes and branches. 
  • New 32-bit integer multiplies, with 32-bit and 64-bit results, and new 32-bit by 32-bit divide and 64-bit by 32-bit divide instructions.
  • A new system-mode A6 is provided in addition to the system-mode A7, to better support split-stack run-times.
  • Four spill-fill stack caches are provided, one each for system and user mode A7, and one each for system and user mode A6 as a parameter stack.

At the same time as the pre-announcement, samples are made available to the research group, and students dig into it, helping Motorola debug the design in much the way the students helped debug the 99S200. 

A few days later, moving plans up under pressure from TI's announcement of the 99/16, IBM announces the IBM PC in two models: the Business PC/09, based on the 681809 with bank switching, and the Family PC/88, based on the 8088, both with video based on the original 6845. Both models start at 16K of RAM, expandable on the mainboard to 64K. The 640K address space limit we are familiar with is present on both the PC/88 and the PC/09, but the PC/09 adds a second connector for each card slot, similar to the AT bus that would later be used in our world, that allows full 16M addressing, making the boundary mostly irrelevant. Pricing for the two models is comparable: $1500 for the PC/88, $1525 for the PC/09.

Neither comes with disk drives, but disk drives are an option. Both have a built-in cassette I/O interface and simple sound generating devices.

Both models offer integration with existing third party OS products -- CP/M or MP/M for the 8088 and Uniflex-S/6809 or OS-9S/6809 for the 681809, and both already have software and hardware products that allow integrating these PCs as workstations into IBM's mainframe and mid-range systems. 

OS-9S is a split-stack version of OS-9 based on OS-9 level 2 and Susumu, now no longer top-secret. Uniflex-S is based on Uniflex and Susumu.

Both have Susumu as their BIOS in ROM, and both include (as something of a last minute addition) the students' extended dc/bc languages in the ROMs.   

A third model is also announced, based on the (alternate world) 68010 which Motorola has just announced. It starts at 32K of RAM installed on the mainboard, but otherwise has a design similar to the 6809 version, including the expanded bus connector, but with a full 16-bit data bus. Uniflex-S/68000 and OS-9S/68000 are the available OSes. Pricing starts a little higher, at $1750.

Neither Microsoft BASIC nor PC-DOS are mentioned.

In answer to questions about models based on other processors, IBM specifies a product concept that focuses on the software rather than the hardware. They do not commit to using other CPUs, but they do mention development research on the 99S200 and the newly announced Z8001S. They do not mention their own ROMP.

A week after IBM's announcement, Radio Shack rushes to announce their own OS-9/6809 compatible model, the TRS-09 Color Business Computer, based on, and compatible with, their original Color Computer. It includes a built-in DMA controller, the Color Computer's Multi-Pak interface, a floppy disk controller, and two built-in hardware serial ports in addition to the Color Computer's single CPU-intensive bit-banging port. It boots to Microsoft Disk Color BASIC, but the DOS command can load an operating system from a floppy disk or ROM pack. Two cartridge slots for ROM packs are brought out to the side, the other two slot circuits being used internally for the built-in I/O. It comes with 16K of RAM, upgradable to 64K. Price and branding adjustments are also announced, with sales personnel pointing out that the entry level price is only half the price of an IBM PC/09.

The limit to RAM expansion is swept under the rug, but the magazines dedicated to Radio Shack's computers all point it out.

The two students who are under NDAs with IBM Instruments go to Danbury again for short internship sessions, and return before school starts.

Efforts to find a source for CRTs are still not very fruitful, but the students have developed some approaches that allow a couple of them to write magazine articles describing circuits using the 68471 for TVs and steps to tune the output to individual TV models, to get legible text at 64 columns, or even 85 columns using 512 pixel wide lines and 6 pixel-wide characters.

Parents of several of the students meet with the colleges and the other companies that have been supporting the research group, and set up a company to handle commercializing their work. Arguments arise, but a small core group of seven of the students (who have just returned from a little vacation to Japan) work hard to bring everyone to agreement. 

A non-profit research group is set up, and the students take membership in it.

Additionally, a for-profit development company is set up which can act as agent for the students in accepting contract work.

Microsoft attempts to sue IBM for getting shut out of the PC product, but IBM legal has their response already prepared. A legal skirmish commences, with public news coverage generally portraying Bill Gates as a modern David against IBM's Goliath.

Tandy/Radio Shack wakes up and sends lawyers, and then wakes up again and sends engineers, too. After some discussion and belated agreement to follow the licenses and consortium rules, they are allowed to join the sponsors' consortium.

Western Electric also approaches the group, but does not seek membership in the consortium because of its monopoly status in the communications industry. Top-level negotiations on the licensing of Unix ensue between the Susumu Sponsors Consortium and Western Electric and several other communications industry companies and educational institutions. The core student group participates, along with counselors and legal help from the college and university, representing the students' interests.

IBM Instruments again sends engineers to observe the students' work; they spend considerable time discussing both Susumu and Microware's OS-9 with the two NDA students and members of the core student group.

The core student group members all complain quietly about having to take time away from their own projects to deal with all the side issues.

As summer break ends, a group of foreign exchange students arrive from Japan, mostly on education/research leave from Japanese companies. This creates a bit of confusion and friction, but the core group members manage to iron things out and they join the research group.

~~~~~

How's that for more alternate reality? Doesn't it sound like it might have been even more fun than the reality we know?

Tuesday, November 23, 2021

Alternate Reality -- 33209 (pt. 1, winter 1981)

***** Alternate Reality Warning *****

So, having treated myself to a trip down memory lane, with some daydreaming about Apple and a different OS-9 and how things could have been, I decided maybe I should map out some of the directions the computer/information industry would take in the alternate world of the 33209.

As opposed to the previously mentioned daydream, this alternate reality begins branching from mainline reality in a general sense in early 1981. 

(In a more specific, personal sense, it branches much earlier. But that branch runs pretty much parallel with our reality until late in 1980. I won't deal with that here, read the novel linked above (what there is of it at this time) if you're interested.)

(You're not sure you're interested? Odd. It's not like this is just working out a time line for the evolution in technology at the core of the plot of that novel or anything like that. ;-) 


* First half of 1981:

A group of electronics and computer information science students at a community college in a football/oil field town in west Texas coalesces around the Micro Chroma 68 -- Motorola's prototyping board for the 6802/6808 (a 6800-compatible early system-on-a-chip (SOC) MPU), the 6846 ROM-I/O-Timer chip, and the 6847 TV-grade video controller -- and around building microprocessor trainers using the 6805 (a microcontroller that is not quite 6800 compatible, but close, with a little built-in I/O, a timer, ROM, and RAM) and turning the trainers into keyboard controllers.

Initially, all of the additional circuits they design are hand-wired.

From the beginning, several of the students help the others keep records of their work and of how they share what they produce.

Not only their teachers and other faculty members, but representatives of Motorola, IBM, and TI observe and encourage the students' work. (Why they do, I won't get into here. See the novel. Admittedly, this is the biggest jump in logic in the daydream, and big enough to keep it squarely in the realm of daydreams.)

Faculty from the nearby branch of the University of Texas drop in to observe, as well.

One of the students designs a floppy disk controller based on the 6801, and Motorola negotiates with him for options on the design. As a result, the group gets access to Motorola engineering support, in the form of both documentation and parts sampling.

The group has the right combination of talents, interests, resources, time, and connections, and by spring break, they all have working trainer/keyboards based on the 6805. 

During spring break, several of the students participate in internship programs with the three companies.

By the end of spring break, most of the students have their own working Micro Chroma 68s, using daughterboards to substitute the 6801, an integrated microcontroller with a more powerful 6800 compatible CPU, for the 6808 or 6802 originally specified for the board. 

Since most of the students do not yet have floppy disks, and since ROM is faster for reading than floppy disks and displaces less of the usable RAM space, as well, all students include hardware to program EPROMs in their computers. This allows them to program their own ROMs and construct their own monitor/debug systems and their own boot-level input-output software -- what we began to call BIOS (Basic Input/Output System) about this time under influence of CP/M.

Several simple graphical games get written in the process, for fun, relaxation, and testing the computers.

Several of the students get together to rewrite parts of the  Micro Chroma monitor/debug program ROM to use the features of the 6801, and they call the combination Micro Chroma 6801. The monitor/debugger is shared with everyone, and all the students also put ROMs containing the freely available Tiny BASIC in their computers and are able to type in programs and save their programs to cassette tape and load them back in.

Many of the students add bank switching, lots of RAM, and floppy disk drives to their computers. Fast cassette interfaces are also experimented with, especially for backing up their work. Some of the students adding floppy disks use the 6801-based floppy disk controller developed by their fellow student; others use various commercially available floppy disk controllers, allowing the group to learn how to separate I/O functions out into device driver modules and make their software somewhat hardware independent.

Some of the students add the 6844 DMAC (direct memory access controller) to their Micro Chroma 6801s, to ease timing for the floppy disk and high-speed cassette interfaces and device drivers.

Several students start experimenting with chemical etching to produce their circuit boards.

With floppy disk drives, they are able to bring up the Flex operating system from Technical Systems Consultants, which some of them have bought copies of.

Certain members of the group port the Forth Interest Group's freely available fig-Forth language/operating environment to their computers, allowing even those who don't buy an OS to (among other things) use their floppy disks without always having to type obscure machine code in at the keyboard.

Motorola and IBM both negotiate with the students for limited rights to use their design work, IBM asking for confidentiality concerning their being interested.

A small group begins an attempt on a port of Edition 7 Unix (now known as Research Unix) to their computers. The college makes their Univac 1100 available to them during off-hours, to run Unix on, to help with their work.

At this point, TI also negotiates with the students for limited rights to use their design work.

Engineers at Apple hear about the students' work and start visiting with IBM, TI, and Motorola when they visit. Members of the petroleum industry operating near and in the town also start visiting. 

Some of the students have misgivings about the visits until visiting engineers share some of their experience and help with the Unix port and other problems students have trouble solving by themselves. Nevertheless, the port gets a bit stuck in trying to produce a cross-compiler for the C programming language.

Many of the students begin building computers with other CPUs, sharing ideas and keeping their designs similar  enough to the Motorola CPU-based designs to make porting the Forth reasonably straightforward. 

TI provides 9900s and 9900 series peripheral parts, including their video controller, the 9918A, to several students who want to use them. Motorola provides 6809s and 68000s and peripheral parts to those who want to use them. IBM steps up to buy CPUs for students who want to use 8086s or 8088s. Apple steps up to help those who want to use 6502s. Some of the students want to use Z-80s or 8085s, and petroleum industry companies step in to help them. At the suggestion and assistance of certain engineers from the petroleum industry, a couple of students decide to try the Z8000, as well.

Students working with the more advanced CPUs initially try to implement too many of the advanced features, but after a short time resign themselves to just getting the CPUs working first.

Students working on the 68000, in particular, spend a week trying to design asynchronous bus interfaces, along with 16-bit peripheral circuits to take the place of 16-bit parts that aren't available from Motorola. Not making much progress with that, they settle on using the 6800-compatible bus signals, and rely on the special MOVEP instruction, which was designed to make it easy to use the existing 8-bit peripheral parts from the 6800 family. This allows them to get basic functions running well enough to start figuring out the advanced functions.

For most things that don't require large address spaces, the 6809 proves the easiest to work with, and students working with it are successful at getting more advanced operating system and hardware features running, including DMA access using the 6844 for high-speed I/O concurrent with other operations, and basic memory management functions using bank switching or the 6829 series memory management unit (MMU), which Motorola also provides samples of. Simple bank switching proves to allow the computers to operate at higher CPU and memory cycle speeds than the 6829, but with less flexibility in function. 

Many of the students elect to use the 6845 or 9918A to control video in the computers of their own designs instead of the more limited 6847. Some initially elect not to include video, using the Micro Chromas as terminals for their computers, instead. Other options for video output are also explored.

Motorola samples an upgrade to the 6805 to the group, along with a cross assembler that the students can run on their Micro Chroma 6801s, either under Flex or by loading and calling directly from the monitor/debugger firmware. A couple of students build small computers with the upgraded 6805 for practice, pairing it with the 6847 video controller. The limits of the byte-wide index register X make it difficult to port a full Forth, but a small subset of library-type functionality is put together. 

One of the students quickly redesigns the floppy controller using the libraries and the new microcontroller's internal direct memory access controller, and Motorola buys this design outright, giving the group a bit of money.

About the time that the new floppy controllers come up, the students working with the 68000 get their asynchronous bus interfaces running, speeding up memory access. But they decide to continue to use 8-bit peripheral parts and the MOVEP glue instruction rather than design their own peripherals, as much as possible.

In response to students' requests, Motorola offers to make MMU parts for the 68000 available, with the warning that most customers have not found them useful and have been implementing their own MMUs. They do make documentation available, and after a few days of studying the documentation and errata, the students decide to postpone memory management on the 68000 for the time being.

One of the students notices that a 512 byte table in ROM can be used to synthesize a fast byte multiplicative inverse on CPUs that lack hardware divide, and that is added as an optional part of their libraries, the source code for the table being generated mechanically on one of the students' Micro Chroma 6801s. The table can be used even on the 6805 by storing the more and less significant parts of the result in separate tables. 
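One plausible construction of such a table can be sketched in Python. The exact table contents and helper names here are illustrative assumptions, not the students' actual code; the sketch stores the 16-bit reciprocals as two 256-byte halves, as described above for the 6805, and synthesizes division as one table lookup, one 8x8-to-16 multiply, and a shift.

```python
# Sketch of the 512-byte reciprocal table described above: 256 entries,
# each a 16-bit value, stored as two 256-byte halves (high and low bytes)
# so that even a CPU like the 6805, with only byte indexing, can use it.
# Division then costs a table lookup, an 8x8->16 multiply, and a shift.

# Build the table. Using ceil(65536 / d) makes the result exact for all
# 8-bit dividends; the entry for d = 0 is unused and stored as 0.
recip_hi = [0] * 256
recip_lo = [0] * 256
for d in range(1, 256):
    r = -(-65536 // d)           # ceil(65536 / d)
    r &= 0xFFFF                  # d = 1 gives 65536, which wraps; see div8
    recip_hi[d] = r >> 8
    recip_lo[d] = r & 0xFF

def div8(n, d):
    """Quotient n // d for 0 <= n <= 255, 1 <= d <= 255, via the table."""
    if d == 1:
        return n                 # ceil(65536/1) doesn't fit 16 bits
    r = (recip_hi[d] << 8) | recip_lo[d]
    return (n * r) >> 16         # one hardware multiply plus a shift
```

Exactness follows because the rounding error of the ceiling reciprocal is under 1/257 for divisors up to 255, too small to push any 8-bit quotient across an integer boundary.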

One of the students writes a simple inclusion pre-processor inspired by the C pre-processor to handle assembly language file inclusion, making it easier to handle library code.
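A minimal version of such an inclusion pre-processor can be sketched in Python. The directive syntax (`include "file"`) and the cycle handling here are illustrative assumptions, not the student's actual tool.

```python
import re
from pathlib import Path

# Minimal inclusion pre-processor: lines pass through verbatim, except
# that a line of the form   include "file.inc"   is replaced with the
# (recursively pre-processed) contents of that file, resolved relative
# to the including file. Directive syntax is an assumption.
INCLUDE_RE = re.compile(r'^\s*include\s+"([^"]+)"\s*$')

def preprocess(path, seen=None):
    seen = set() if seen is None else seen
    path = Path(path)
    if path in seen:                       # guard against include cycles
        raise RuntimeError(f"circular include: {path}")
    seen.add(path)
    out = []
    for line in path.read_text().splitlines():
        m = INCLUDE_RE.match(line)
        if m:
            out.extend(preprocess(path.parent / m.group(1), seen))
        else:
            out.append(line)
    seen.discard(path)                     # non-cyclic repeat includes are fine
    return out
```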

A group of the students writes a set of library routines for the Micro Chroma 6801 inspired by the libraries of the programming language C, but using a split stack architecture inspired by Forth, with one stack to keep return addresses on and the other (a software stack on the 6801) to keep parameters on. The group then produces a mashup programming language based on fig-Forth and the programming language C, which gets nicknamed first "Split C", then, amid jokes about split pea soup, renamed "Split P" -- P as in program.
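The split-stack idea -- return addresses on one stack, parameters on a separate one, so a callee can leave results behind without disturbing the call chain -- can be illustrated with a toy interpreter. This Python sketch is purely illustrative; the word names and semantics are generic Forth conventions, not actual Split P definitions.

```python
# Toy illustration of the split-stack model: calls and returns use one
# stack (return addresses only), while all parameters travel on a second,
# independent stack.

class SplitStackVM:
    def __init__(self, words):
        self.words = words      # name -> list of tokens (ints or word names)
        self.params = []        # parameter (data) stack
        self.returns = []       # return-address stack: (word, index) pairs

    def run(self, start):
        word, ip = self.words[start], 0
        while True:
            if ip >= len(word):
                if not self.returns:            # outermost word finished
                    return
                word, ip = self.returns.pop()   # return pops only this stack
                continue
            tok = word[ip]; ip += 1
            if isinstance(tok, int):
                self.params.append(tok)         # literals go on the param stack
            elif tok == "+":
                b, a = self.params.pop(), self.params.pop()
                self.params.append(a + b)
            elif tok == "dup":
                self.params.append(self.params[-1])
            else:                               # call pushes a return address
                self.returns.append((word, ip))
                word, ip = self.words[tok], 0

vm = SplitStackVM({"double": ["dup", "+"], "main": [21, "double"]})
vm.run("main")
# vm.params is now [42]
```

Because the two stacks never interleave, "double" can be entered and left without any stack-frame juggling around the return address, which is the property the split-stack libraries exploit.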

The language has two grammars: a Forth grammar, which is modified from the fig-Forth, and a C grammar, which is kept compatible with K&R C. Implementing the C macro-preprocessor takes some effort, and eventually they implement it in the Forth grammar.

Almost as fast as it is implemented on the 6801s, Split P is implemented on the 6809 and 68000, and the three code bases become the core code bases for the language.

One of the programming teachers tells the class about Ken Thompson's "Reflections on Trusting Trust", and the students design a process for using cross-compiling to bootstrap the Split P compiler clean of potential library-hidden back doors.

Using Split P and borrowing more ideas from both Unix and Forth, members of the group begin designing a new OS they call Susumu (Japanese for "Proceed"). The BIOS for Susumu is designed to include a ROMmable monitor/debugger which is a subset of the Forth-like grammar of Split P and includes a minimal interactive assembler and disassembler. 

Full assemblers are also written, to make the operating system and programming language self-hosting. Early versions of the 68000 assembler do not handle the full instruction set, only enough to compile and run the high-level code.

The disassembler/debuggers, both minimal BIOS and full versions, and the relocating linking loader are ongoing projects that have to be rewritten as Split P evolves. In particular, the ability to relocate variables and labels within the 6801's direct page requires several tries to get right. These are not complete during April (or even May), but they are working well enough to fork (spin off) versions to use in the project to port Unix.

Some of the students write about their work and submit to certain electronics and computing magazines, and some of the articles are accepted, to be published beginning in June.

In late April, Motorola acts on the option to buy the right to use the 6801 floppy controller design, giving the group more money. 

Motorola then publicly announces the upgraded 6805 system-on-a-chip (SOC) microcontroller, with immediate sampling:

  • The 682805, using a pre-byte to expand the op-code map as necessary, gets a second parameter stack pointer (named U after the 6809 second stack pointer) and push/pop and transfer pointer instructions (PSH/PUL/TFR) to support it. It also gets push/pop/transfer instructions for the return stack, to reduce the bottleneck that a single stack tends to create. Careful design allows using the pre-byte to add U-stack indexed mode instructions to access local variables on the parameter stack without a significant increase in transistor count, as well.
     
  • The return address stack and its RAM are moved out of the direct page, with access optimized so that simple calls cost no more than jumps or branches. The parameter stack and its RAM replace the return stack in the direct page.

  • A two-channel chainable direct memory access controller (DMAC) with 8-bit counters is added to the list of optional integrated I/O devices. (Chainability is what makes the microcontroller especially suited to the floppy controller.) (It's worth noting here that, in our real world, Motorola didn't introduce DMA channels in their published SOC microcontrollers until much later.)
     
  • Optional instructions are also added, including the 8 by 8 to 16-bit multiply from the 6801 and 6809. These use the X index register to store the high byte of the product, similar to later versions of the 6805.

  • The MEK 682805 and Micro Chroma 682805 are announced as prototyping kits, crediting the students' work.

What is not announced is that monitoring the students' experiences is a large part of what has actually motivated the changes. 

  • Motorola engineers have also polished the floppy controller designs and implemented them as integrated, almost single-chip controllers, and Motorola publishes the designs, both as products and as reference design application notes on constructing custom SOC designs using Motorola's processors, again acknowledging the group's contributions.

At about this time, Motorola also samples upgrades to the 6801 and 6847 to the group, and several of the students elect to build prototyping kits similar to the un-expanded Micro Chroma kits using these parts, for fun and practice. They work quickly, producing operational prototypes in just a few days.

Comparing the 6845 and the upgraded 6847, a few of the other students decide 64 characters wide is good enough, and rework their video designs to use the upgraded 6847 instead of the 6845.

Most of the students are now building both chemically etched and hand-wired circuit boards. 

As students bring up computers of their own design on CPUs other than the core three, they begin to port both Split P and the nascent Susumu to their computers, keeping a high degree of compatibility between their systems in spite of the different CPUs. In order to make the ports work, techniques for specifying byte-order independence in source code are explored and implemented.

Several of the students work together to write cross-assemblers for the non-core CPUs in Split P, which allows much of the development to proceed on working hardware instead of the hardware prototypes. Coding significant parts of the OS and interpreter/compiler in Split P instead of assembler also helps speed development. 

Modifications are made to the Forth grammar of Split P, to include explicit named statically allocated local variables and global variables controlled by semaphores or counting monitors. 

Modifications are made to the C grammar to include structured return types for functions (using the split stack), global variables controlled by semaphores or counting monitors, and task-local static allocation.

Several students working on computers based on the 6809 purchase TSC's Uniflex or Microware's OS-9, as well. TSC and Microware both provide help to the students to bring their OSes up, both companies becoming interested in Split P and Susumu in the process.

Two of the students, frustrated with the restrictions against mixing code and string data in Microware's native assembler, write their own assembler to allow mixing string and other constants in with code without needing fix-up tables and code. This allows them to bring up Forth for OS-9 as a relocatable module.

Apple inquires about obtaining formal rights to produce products based on the work the group is doing. Motorola, IBM, TI, Microware, TSC, and the petroleum industry companies also express a need for more formal arrangements, and, with help from the college, the group formally organizes as a research group under the college umbrella, in order to simplify legal issues. Legally organized, the group itself becomes able to receive small grants from each of the interested companies, and individual students are more easily able to contract to do projects for various companies.

The organizational structure within the group is kept flat, and participation voluntary. With legal help from the colleges, they agree on a liberal license for collaboration in their work, allowing them to continue collaborating with each other and also with interested people not formally in their group. 

More students express interest in joining the group, both from within the college and from the nearby branch of the University of Texas. After some discussion, and after the new students agree to accept the license and respect requests for confidentiality from the corporate sponsors, they are cautiously welcomed in. 

As spring semester ends, some students head out for summer jobs and summer vacation, others stay for the summer terms and to keep the group moving forward. Certain of the students begin internships with one or more of the companies that are taking something of a sponsorship role by this point, working some days with engineers at the corporate campuses and some days back at the college lab. A newsletter is begun, to keep everyone up to date.

More articles are accepted by electronics and computer magazines, including a few contracted by Motorola and TI to publicize application notes -- the floppy controller and the 682805 in particular.

At Apple, during May, Jef Raskin prevails on management to produce an OS-9/6809-based business computer line, including workstations and network computers. 

Nothing really exciting happens within the student group during early June, only steady progress on various projects. Well, nothing exciting except that TI also brings samples of an upgraded variant of the 9900 CPU and of the 9918A video generator for the students, and several agree to try them out. These take a bit more work, with students helping to debug TI's new design work. 

The appearance of the magazine articles on the students' work creates some excitement outside the group, but the group is so far ahead of what is published that, other than the excitement of seeing their work in print, the articles are a bit anti-climactic for the students.

Engineers at Radio Shack are among those outside the group that take note of the articles, in particular showing their management an article detailing the Micro Chroma revisions for the 6801 and 6809 and another article showing the 6809 interfaced to the 6844 DMAC.

The students working on Susumu get Split P code generation on the 6801 working well enough to start using the C grammar directly as the system native compiler for the Unix port project, and parts of Unix are brought up on the Micro Chroma 6801s with bank switching. There isn't enough room in 64K for Unix without bank switching, but it does run with bank switching. It's also relatively slow on the 6801, but it works well enough to compile and run some system tools.

The ports of Susumu to the 6809 and 68000 are likewise brought up to a usable level by mid-June, and ports to the 8086, Z8000, 9900, Z-80, and 6502 are partially running, but buggy.

With work on 68000-based computers proceeding well, students working on those are invited to a meeting with engineers at IBM Instruments under non-disclosure. After some discussion, the student group agrees to let the invited students make up their own minds. 

Two of the invited students go, and return in a week with a couple of IBM engineers who observe the student group's activities without comment for several days and then return to Danbury. Everyone in the group studiously avoids tempting the two students to break their NDAs, which engenders some small tension, but the tension gradually evaporates.

In mid-June, just before the start of summer, Motorola publicly announces sampling of the upgrade to the 6801 SOC microcontroller:

  • The 682801 gets direct-page versions of the 6801's unary instructions, allowing more effective use of the direct page. (The 6800 and 6801 did not have them, I assume because the design engineers were concerned that there would not be enough op-codes left for the required inherent-addressing op-codes. They were present in the 6809, however.) The op-codes for the new direct-page unary modes are carefully allocated so that they fit in the primary op-code map and don't cost a pre-byte fetch to use, but they are scattered across the available unimplemented op-codes instead of being lined up with the rest of the unary instructions. This costs more in circuitry, but maintains object-code compatibility with the 6801 and 6800.
     
  • The 682801 also uses a pre-byte to expand the op-code map so that it can have a second U stack pointer for parameters and for index mode op-codes that allow access to the parameter stack without going through the X index register.

    In addition to the 6801's ABX (add B to X) instruction, new instructions SBX (subtract B from X) and the corollary ABU and SBU are provided for improved handling of address math and stack allocation/deallocation. And the compare D (CPD) instruction, which is missing on the 6801, is added to the 682801.

  • The 682801 gets a 64 byte spill-fill cache for the return stack, with 18:38:8 hysteresis to provide worst-case overhead space for interrupt frames, which is large enough for nineteen levels of frameless calls and nine levels of framed calls without going to the external bus to save the PC or the frame pointer. This allows simple calls to statistically cost no more memory cycles than jumps and branches. A similar 64 byte spill-fill cache for the parameter stack is provided, as well, but with 8:48:8 hysteresis, improving access cycle times to local parameters. These caches can be locked in place, for applications that don't need deep stacks.

  • Separate system and user modes are implemented, with a new status bit in a new status register in direct page I/O space. In order to make context switches faster for interrupts, there is a return stack cache and parameter stack cache for both system mode and user mode.

    In order to avoid complex delay circuitry for the caches under interrupt and return, the S and U registers also have system and user mode versions.

    Control registers for the caches are in the direct page I/O space.

  • Functionally equivalent modules for many of the 6800-series peripheral parts are announced as optional integrated I/O and timer functions, including the 6844 DMAC. Integrated devices will be shared with the 6805 series MCUs, too.

  • Address function signals are provided, to allow a system to distinguish between program/interrupt response, return address stack, direct page, and extended address/general data access for both user and system modes. This allows separate buses for each function, reducing memory conflicts.

    This also allows expanding the address space to an effective total active address space larger than 64K.

    (Theoretically:

    • code (f3): up to 64K 
    • + general data (f2): up to 64K 
    • + direct page (f0): 256 bytes of a 64K space
    • + return address stack (f1): up to 64K.

    Because of the 16-bit width of the index register X, the expected actual aggregate active process space will generally be less than 128K. Even so, the extra address space clears a number of design bottlenecks.)

    Because the index register X can only point into the general data area, using separation requires care in software when using addresses and pointers, and when assigning address range deselection in the hardware.

  • But with the return address stack in a separate address space based on the function codes, rogue code can't overwrite the return address, period. Also, the spills and fills associated with calls and returns can occur concurrently with other instructions.

  • Hardware 8-bit by 8-bit unsigned divide -- A is the dividend, B is the divisor; the quotient is returned in B and the remainder in A, to make it easier to repeat the division and get the binary fraction.

  • Bank switching memory management is also added to the list of optional integrated devices, to expand the physical address space and allow write and address mode protection and address function separation.

    In the bank-switch module provided by one initial part, 16 8-bit-wide bank switches provide linear mapping of 4-kilobyte banks of a 64K logical address space into a 1 megabyte maximum physical address space. The top 4 bits (LA12-LA15) of the logical (CPU) address select the bank switch, and the bank switch provides the top 8 bits of the physical address (PA12-PA19) instead. This part provides only 128 bytes of internal direct-page RAM.

    In the bank-switch module provided by another initial part, address function codes are appended to the top of the CPU address, giving 18 total logical address lines (LA0-LA15, FA16, FA17) for a 256K logical address space.

    • Data and code (f2 and f3, FA17:FA16=10 and 11) are each given their own sets of 16 8-bit-wide bank switches (providing PA12-PA19) to map 4K banks from the 64K space addressed in data mode (extended or indexed) or code mode (program counter or interrupt) into the full 1M max physical address space.
       
    • Data (f2, FA17:FA16=10) is given its own set of 16 10-bit-wide bank switches (providing PA12-PA21) to map 4K banks from the 64K address space pointed to by the index register or extended mode absolute addresses into the full 4M max physical address space.

    • Code (f3, FA17:FA16=11) is given its own set of 16 9-bit-wide bank switches (providing PA12-PA20, with PA21=1) to map 4K banks of the address space pointed to by the program counter or selected by an interrupt response into the top 2M of physical address space.
       
    • Stack references (f1, FA17:FA16=01) are hard-mapped to the second 64K range ($010000 - $01FFFF), and four sets of 16-bit bounds register latches and a stack bounds violation interrupt are provided for stack security. The first version of this part provides 256 bytes of internal stack-use RAM at $010000 - $0100FF.
       
    • Stack (f1, FA17:FA16=01) is given its own set of 16 12-bit-wide bank switches (providing PA8-PA19, with PA21:PA20=00). Shifting the switch addresses down allows stack to be allocated in 256 byte chunks, making more efficient use of stack RAM. The extra width allows allocating illegal address ranges around the stack, to improve system security in case of stack overflow, underflow, or corruption.

    • Direct page references (f0, FA17:FA16=00, LA15:LA8=00000000) are split into upper and lower halves and mapped into the lowest 32K range ($000000 - $007FFF), using the high bit of the direct page address (LA7/DP7) and a system/user state bit to select one of four 8-bit switches to provide physical address bits PA7 - PA14. In this part, 256 bytes of internal direct page RAM are provided at $000400 - $0004FF. Internal I/O addresses, including the bank switch registers, are provided from $000000 - $0000FF.

    • The upper half of the direct page is mapped through a simple 8-bit latch with constant higher bits that are appended above the lower 7 bits of the direct-page logical address produced by the CPU, to yield a physical address in the range $8000 - $FFFF. In this part, only 512 bytes of internal RAM are provided.

    • The lower half of the direct page is similarly mapped, to yield a physical address in the range $0000 - $7FFF. The bank switching registers are mapped into this range; in particular, the direct page latches and related control bits start at address zero.

    Initial SOC parts thus provide either simple general switching of 16 4K banks out of a single 1 megabyte maximum physical address space, or function-based switching of 16 4K banks each of code and data out of a 1M maximum space, plus 16 quarter-K banks of return stack out of a 64K maximum space at a constant offset into the CPU address space.

    Special bank switching for the direct page is another optional integrated peripheral, to allow switching half-pages from up to 32K of physical address space into the lower half of the direct page and 32K into the upper half (theoretical max, depending on the width of the implemented bank switch register), in 128 byte chunks.

    Motorola suggests the convention that I/O should be switched into the lower range from $00 to $7F, and (preferably internal) RAM into the range from $80 to $FF, and the initial SOC parts follow this convention, providing 512 bytes of actual internal direct page RAM. Initial parts implement only 4 bits of each direct-page bank switch, physically limiting total addressing for each half to 2K.


    With this kind of mapping, keeping task global variables in the $80 to $FF range of the direct page can speed task switching.

    Full 6829-style MMU functionality is mentioned as under consideration, but not yet committed to.
     
  • In addition to the 40-pin DIP package, 48-pin and 64-pin DIP packages will provide access to extra address and port signals.
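As a loose illustration of the remainder trick that the divide instruction's register assignment enables (a Python sketch of the arithmetic only, not a model of this hypothetical hardware -- the function name is mine): because the quotient lands in B while the remainder stays in A, repeating the division on the scaled remainder performs long division, pulling out successive digits of the binary fraction.

```python
# Sketch of the repeated-division idea: each pass divides the scaled
# remainder again, yielding one more base-256 digit of the fraction
# dividend/divisor. Real 8-bit hardware would need a wider intermediate
# or bit-serial steps; this shows only the arithmetic.

def fraction_digits(dividend, divisor, places):
    digits = []
    r = dividend % divisor                # the remainder left in A
    for _ in range(places):
        q, r = divmod(r * 256, divisor)   # shift up one byte, divide again
        digits.append(q)
    return digits

# 1/3 in base 256 is 0x55 repeating (0.0101... binary)
print([hex(d) for d in fraction_digits(1, 3, 3)])  # -> ['0x55', '0x55', '0x55']
```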
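The simple bank-switch mapping described above can also be sketched in a few lines of Python (an illustration under my own naming, not vendor code): the top 4 logical address bits select one of 16 8-bit registers, whose contents replace them as the top 8 physical address bits.

```python
# Sketch of the 16 x 8-bit bank-switch scheme: 4K logical banks of a
# 64K CPU address space mapped into a 1 MB physical space. The
# power-on identity default is my assumption for the example.

BANK_COUNT = 16

class BankSwitch:
    def __init__(self):
        self.regs = list(range(BANK_COUNT))   # assumed identity default

    def map(self, bank, phys_bank):
        # Load one 8-bit bank register (phys_bank becomes PA12-PA19).
        assert 0 <= bank < BANK_COUNT and 0 <= phys_bank < 256
        self.regs[bank] = phys_bank

    def translate(self, logical):
        # LA15-LA12 select the register; LA11-LA0 pass through unchanged.
        assert 0 <= logical < 0x10000
        return (self.regs[logical >> 12] << 12) | (logical & 0xFFF)

bs = BankSwitch()
bs.map(0xF, 0xFF)   # window logical $F000-$FFFF onto the top 4K of 1 MB
print(hex(bs.translate(0xF123)))   # -> 0xff123
print(hex(bs.translate(0x1234)))   # -> 0x1234 (still identity-mapped)
```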

They also announce the 68471, with the following extensions to the 6847:

  • In character mode, 32 or 64 characters per row and 16 or 32 rows, interlaced or non-interlaced. The internal character ROM now includes 128 characters. The default character set includes lower case, full ASCII punctuation, and common useful symbols in the control code range. External character ROM is also supported.
     
  • In graphics mode, 512 pixels per row and 384 rows per field are added to the 6847's lower resolution modes, interlaced or non-interlaced. One, two, or four bits per pixel are supported, allowing 2, 4, or 16 colors, if the output digital-to-analog converters and amplifiers are fast enough. A 24 kilobyte maximum video buffer gives 2 colors at the highest density, 4 at 256 by 192, and 16 at 128 by 64.

    8-bit programmable palette registers are provided, intended as 2 bits each of red, green, blue, and intensity.

  • 6883/74LS783 DRAM sequential address multiplexer (SAM) and refresh functions are built in. Bank switching into a CPU-side window of up to 16 kilobytes is also built in, to support the large video buffers for the higher density graphics modes on 16-bit addressing CPUs.
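The buffer sizes for the modes above follow from the usual frame-buffer arithmetic (a quick Python check; the function name is mine):

```python
# Frame buffer size in bytes = width * height * bits_per_pixel / 8.

def buffer_bytes(width, height, bpp):
    return width * height * bpp // 8

print(buffer_bytes(512, 384, 1))   # -> 24576 (24 KB: 2 colors, max density)
print(buffer_bytes(256, 192, 2))   # -> 12288 (4 colors)
print(buffer_bytes(128, 64, 4))    # -> 4096  (16 colors)
```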

And Motorola announces the Micro Chroma 682801 as a prototyping kit for the new CPU and video generator, with a 2 kilobyte Susumu monitor in internal ROM, implemented as Susumu library functions.

Specifications for upgrades to the 6809 and 68000 CPUs and 6845 video controller are also shown to the students, and more than a few express interest in seeing what they can do with them.

The students start searching in earnest for a source of CRTs with sufficient resolution to handle the 80 or more columns of text output that the 6845 and the improved 6845 can produce, but don't have real success. A few students are able to get a CRT here or there, but the interfaces are varied, and the video generator circuits the students produce are also varied.

As a result, most of the students remain dependent on TVs for video, and Motorola supplies them with enough 68471s that they can all get 64 column text output circuits for their more advanced computers. Unfortunately, results are rather uneven and require careful tuning to individual TVs, such that, even with the 68471s, the circuits they produce do not approach mass-marketable products.

Susumu and Split P code generation for the 682801, 6809, and 68000 are also functioning well enough as summer begins that the ports of Unix are re-booted using Split P as the system language and Susumu as the system programming language.

And this is getting long.

~~~~~

How's that for alternate reality?

And the outline for summer is ready now, here.