tag:blogger.com,1999:blog-39545769734207655202024-03-23T19:16:41.972+09:00defining computers零石http://www.blogger.com/profile/01111094813708912513noreply@blogger.comBlogger80125tag:blogger.com,1999:blog-3954576973420765520.post-53601297539245637902023-07-26T23:26:00.007+09:002023-07-26T23:26:51.778+09:00[NOTE] newline insertion on paste bug in gedit<p>Intermittent bug: </p><p>gedit inserts newlines on paste under certain conditions involving the contents of the search buffer.<br /></p><p>gedit --version reports <br /></p><blockquote><p>gedit - Version 3.28.1<br /></p></blockquote><p>from the command line on Ubuntu, uname --all: </p><blockquote><p>Linux {machine-name} 5.4.0-150-generic #167~18.04.1-Ubuntu SMP Wed May 24 00:51:42 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux <br /></p></blockquote><p>Looking it up, it might be something inherited from the GTK toolbox. <br /></p><p>Specific details.</p><p>Searching through a text file for lines beginning with a specified string, such as a label in an assembly language source file. Leaving the search string in the search buffer, select, copy, and paste a section of several lines of text including the searched string. 
</p><p>gedit inserts blank lines (newline character sequences) before occurrences of searched string.</p><p>Example:</p><p></p><blockquote>MESS DC.L DOCOL,WARN,AT,ZBRAN<br /> DC.L MESS3-*-NATWID<br /> DC.L DDUP,ZBRAN ; -DUP here is a bug from the original 6800 model, at least.<br /> DC.L MESS3-*-NATWID<br /> DC.L LIT16<br /> DC.W 4<br /> DC.L OFSET,AT,BSCR,SLASH,SUB,DLINE,BRAN<br /> DC.L MESS4-*-NATWID<br />MESS3 DC.L PDOTQ<br /> DC.B 6<br /> DC.B 'err # ' ; 'err # '<br /> DC.B 0 ; hand align<br /> DC.L DOT<br />MESS4 DC.L SEMIS</blockquote><p></p><p></p><p>with \nMESS in the search buffer.<br /></p><p>But then when I tried it with </p><p></p><blockquote><p>* This should add two 64-bit numbers:<br />ADD64STK:<br /> MOVEM.L (A6)+,D7/D6/D5/D4<br />ADDLO ADD.L D5,D7<br />ADDHI ADDX.L D4,D6<br /> MOVEM.L D6/D7,-(A6)<br /> RTS<br /></p><p></p></blockquote><p>it quit inserting the newlines, even in the original example text.<br /></p><p><br /></p><p><br /></p><p><br /></p><p><br /></p>零石http://www.blogger.com/profile/01111094813708912513noreply@blogger.com0tag:blogger.com,1999:blog-3954576973420765520.post-32683525973996252802023-07-09T10:20:00.046+09:002023-07-09T14:40:07.746+09:00Website about the TI Series 0100 Calculator Chips<p>
This is another website I want to remember. It gives a lot of detail on early
Texas Instruments and related calculators and the electronics that made them
work:
</p>
<p>
<a href="http://www.datamath.org/" target="_blank">http://www.datamath.org/</a></p><p>In particular, this page is dedicated to TI's series 0100 chips, which were in production concurrently with Intel's 4004:<br /></p><p>
<a href="http://www.datamath.org/Chips/TMS0100.htm" target="_blank">http://www.datamath.org/Chips/TMS0100.htm</a> <br /></p>
<p> </p>
<p> </p>
零石http://www.blogger.com/profile/01111094813708912513noreply@blogger.com0tag:blogger.com,1999:blog-3954576973420765520.post-79088771860662697952023-05-06T12:01:00.014+09:002023-05-06T12:04:23.773+09:00No, Higher Costs Were Not the Real Reason for the 8088 in the IBM 5150<p><i>[This is a reply to a comment on Hackaday: <a href="https://hackaday.io/project/190838-ibm-pc-8088-replaced-with-a-motorola-68000">https://hackaday.io/project/190838-ibm-pc-8088-replaced-with-a-motorola-68000</a>] </i><br /></p><p style="text-align: center;">***** <br /></p><p>For some reason, I can't reply directly to your comment with the
eejournal opinion piece <i>[<a href="https://www.eejournal.com/article/how-the-intel-8088-got-its-bus/">https://www.eejournal.com/article/how-the-intel-8088-got-its-bus/</a>]</i>, but I suspect my earlier comment was too brief.</p><p>Let me try that again: <br /></p><p>The
68000 had built-in support for 8-bit peripheral devices, both in the
bus signals and the instruction set. Most of the popular
implementations, including the Mac, made heavy use of 8-bit parts, and
Motorola had application notes on interfacing other companies' 8-bit
devices as well as their own. You could mix 8-bit peripherals and 16-bit
memory without stretching.</p><p>Motorola even had an app-note about
interfacing the 68000 directly to 8-bit memory, but any decent engineer
would have looked at the note and realized that the cost of 16-bit
memory was not really enough to justify hobbling the 68000 to 8-bit
memory. That's one of the reasons the 68008 didn't come out until a
couple of years later, and the primary reason that very few people used
it. There was no good engineering reason for it.</p><p>Well, there was
one meaningful cost of 16-bit wide memory: You couldn't really build
your introductory entry-level model with 4 kB of RAM using just eight 4
kilobit DRAMs. (cough. MC-10.) You were forced to the next level up, 8
kB. </p><p>IBM knew that the cost of RAM was coming down, and that they
would be delivering relatively few with the base 16 kB RAM (16 kilobit
by 8 wide) configuration. Starting at 32 kB (16 kilobit by 16) would not
have killed the product. Similarly, the cost of the 68000 would come
down, and they knew that. </p><p>Management was scared of that.</p><p>Something
you don't find easily on the Internet about the history of the IBM
Instruments S9000 was when the project started. <a href="https://defining-computers.blogspot.com/2022/02/when-ibm-9000-actually-existed-history-context.html" target="_blank">My recollection was that it started before the 5150</a>. It was definitely not later. It had much
more ambitious goals, and a much higher projected price tag, much more
in line with IBM's minicomputer series. There was a reason for the time
it took to develop and the price they sold it at. But even many of the
sales force in the computer industry didn't understand the cost of
software and other intangible development costs.<br /></p><p>Consider how
much damage the 5150 did to IBM's existing desktop and minicomputer
lines. Word processing? WordPerfect was one of the early killer apps
for the 8088-based PC. Spreadsheet? Etc. <br /></p><p>IBM management knew
too well that if they sold the 5150 with a 68000 in it instead of the
8088, a lot of their minicomputer customers were going to be complaining
to high heaven about the price difference. They knew the answer, but
their experience showed them that too many of the customers would
not believe it.</p><p>That was the real reason. They hoped the 8088
would be limited enough to give them time to maintain control of the
market disruption.<br /></p><p>I think they were wrong. But it would have
taken a level of foresight and vision that very few in management
within IBM had (and very few outside IBM, either) to take the bull by the
horns and drive the disruption.</p><p style="text-align: center;">*****</p><p style="text-align: left;">Anyway, my point was that higher cost wasn't the real reason any more than the (at the time, much-rumored) technical deficiencies of the 68000.<br /></p>零石http://www.blogger.com/profile/01111094813708912513noreply@blogger.com0tag:blogger.com,1999:blog-3954576973420765520.post-36265910146300873482023-03-11T22:46:00.003+09:002023-03-12T14:54:39.895+09:00Mapping the Panasonic Let's Note Japanese Keyboard to the Hatari Emulator <p>I never owned an ST family computer, which might have been a tactical error on my part. It would have been close to an unadorned 68000-based machine in the way that the <a href="https://en.wikipedia.org/wiki/TRS-80_Color_Computer" target="_blank">Radio Shack/Tandy Color Computer</a> was an unadorned 6809-based machine. <br /></p><p>I have been using the <a href="https://hatari.tuxfamily.org/" target="_blank">Atari ST emulator (simulator) Hatari</a> as a platform for <a href="https://osdn.net/projects/fig-forth-68000/" target="_blank">converting the fig Forth model for the 6800 to the 68000</a>.<br /></p><p>The <a href="https://defining-computers.blogspot.com/2022/05/panasonic-lets-note-cf-nx2-keyboard.html" target="_blank">keyboard on this Japanese Panasonic Let's Note</a> does not map well to Atari ST keyboard. The default mapping leaves important keys like equal (=) unavailable. 
(Some keys are available by using the FN key to select the ten-key pad that starts on the 7 key.)<br /></p><p></p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEie2r4tlbukAgrwpifmOHTJa9nfZhshdwi8yDihxvm3nWPciPEIGy2Ws6n5n09agZYFYjpJ1x7UgMKaY4AObYHAD-LhVqTpzX8AS8etVb9sGQ0_1kAcD19Cvv2xOKs3Ixo359oRfppfDi8ZV2hDOMVgpEzmXDoj_AREiK2AQ7m2pK-1Q3CQHAA3gpuBcA/s1518/Lets_note_keyboard_hi.jpeg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="908" data-original-width="1518" height="382" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEie2r4tlbukAgrwpifmOHTJa9nfZhshdwi8yDihxvm3nWPciPEIGy2Ws6n5n09agZYFYjpJ1x7UgMKaY4AObYHAD-LhVqTpzX8AS8etVb9sGQ0_1kAcD19Cvv2xOKs3Ixo359oRfppfDi8ZV2hDOMVgpEzmXDoj_AREiK2AQ7m2pK-1Q3CQHAA3gpuBcA/w640-h382/Lets_note_keyboard_hi.jpeg" width="640" /></a></div> For comparison, the Atari ST (US) keyboard is laid out like this: <p></p><p></p><div style="text-align: center;"><blockquote><p>!@#$%^&*() _+~<br />1234567890 -=`<br /><br />QWERTYUIOP{}<br />qwertyuiop[] del<br /><br />ASDFGHJKL:" |<br />asdfghjkl;' \<br /><br />ZXCVBNM<>?<br />zxcvbnm,./</p></blockquote></div><p>And it's precisely around where the equals key is, that the keyboard mapping goes wonky. <br /></p><p>I've been doing almost all the source code editing in <a href="https://wiki.gnome.org/Apps/Gedit" target="_blank">Gedit</a>, under <a href="https://www.ubuntu.com" target="_blank">Ubuntu</a>, just using Hatari for assembling and test runs. But I'm now into some really difficult debugging sessions, and the mapping of the keyboard is getting in the way. </p><p>So today I dug into Hatari's keyboard remapping, invoked something like:</p><blockquote><p>hatari -k keymap.text <br /></p></blockquote><p>on the command line. 
</p><p>In the source code for Hatari, there are some utilities in the tests/keymap directory for looking at what SDL sees the keyboard generating (if I got this right) -- listkeys.c and checkkeys.c. I downloaded the source code from the <a href="https://git.tuxfamily.org/hatari/hatari.git/" target="_blank">git repository</a>, changed to the tests/keymap directory, ran make, and got the executables there. You don't need them in the general path; you can execute them in place with</p><blockquote><p>./listkeys</p></blockquote><p>and </p><blockquote><p>./checkkeys</p></blockquote><p>They gave me some clues and not much more. Taking a look at the example keymap file in the source was also not very enlightening. And neither was the man page from SDL:</p><blockquote><p>man SDLKey<br /></p></blockquote><p>But after working through those, and after some reading in forums and playing around with Hatari's remapping a bit, I figured out what to put in the keymap file: </p><p>Each line consists of the key you want to remap, a comma, and the SDL (?) scancode. </p><p>Sort of. </p><p>Figuring out the scancodes was a bit tricky. First, I tried the codes I learned from the checkkeys and listkeys utilities to see how they would work:</p><p></p><blockquote>-,45<br />^,94<br />@,64<br />[,91<br />],93<br />;,59<br />:,58<br />/,47<br />\,92</blockquote>The results weren't even close to what I wanted.<p></p><p>So I used a little trick involving perverse keyboard mappings. 
What I did was line up the alphabet keys and just arbitrarily mapped almost all of them to sequential codes:</p><p></p><blockquote>-,45<br />a,1<br />b,2<br />c,3<br />d,4<br />f,5<br />g,6<br />h,7<br />j,8<br />k,9<br />l,10<br />m,11<br />n,12<br />o,13<br />p,14<br />q,15<br />r,16<br />s,17<br />u,18<br />v,19<br />w,20<br />y,21<br />z,22</blockquote><p></p><p>(The mapping for hyphen was the one I was able to make sense of from the utilities, but it turned out not to be one I am using.)</p><p>This is better than random guessing because it allows testing a bunch of the scan codes at once, and it helps you remember which ones you've tried. You can add the ones that look like they work at the top, like I did with hyphen, so you can test them -- and so you don't forget them.<br /></p><p>I think I gleaned one scancode from the sequence 1 to 22, then similarly gleaned a few more from the next set, starting at scancode 23 to about 43 temporarily and perversely mapped to key a through y. </p><p>(I left e, i, t, and x undefined to allow typing the exit command, and, after the first set, I left z un-assigned so I could use ctrl-Z to invoke the debugger.)<br /></p><p>But I couldn't figure out how to map individual scancodes. Each remapping seems to be done as a pair, which is kind of awkward. </p><p>Ultimately, I used this file:</p><p></p><blockquote>-,12<br />[,26<br />],27<br />^,13 </blockquote><p></p><p>and, while it makes the equals key available, it maps it to the caret/tilde key -- which leaves caret and tilde unavailable. <br /></p><p>It's not a good fit. It matches neither the keycaps on the PC keyboard nor the layout of the Atari ST keyboard. 
</p><p>But I think it will allow me to proceed with debugging.</p><p>So I'll leave this post here for my own notes and post a link here to a Hatari forum for the developers, if I can figure out the appropriate forum.<br /></p>零石http://www.blogger.com/profile/01111094813708912513noreply@blogger.com0tag:blogger.com,1999:blog-3954576973420765520.post-30563982831786398862022-11-12T23:18:00.018+09:002022-11-13T16:58:28.186+09:008080 Assembly Language Crib Sheet<p>
The 8080 is messy. I have a fairly easy time remembering the 680X assembly
languages. I don't have nearly as easy a time remembering the 8080 operators,
allowed operands, flags, etc. <br />
</p>
<p>So I'm putting up a crib sheet, mostly for myself:</p>
<table border="1">
<tbody>
<tr>
<th colspan="3">8080 Registers<span style="font-weight: normal;"> (8 & 16 bit)</span><br /></th>
</tr>
<tr>
<th><span style="font-weight: normal;">Temporary Registers</span></th>
<td style="text-align: center;" width="25%"><b>B</b></td>
<td style="text-align: center;" width="25%"><b>C</b></td>
</tr>
<tr>
<th><span style="font-weight: normal;">Temporary </span><span style="font-weight: normal;">Registers</span></th>
<td style="text-align: center;" width="25%"><b>D</b></td>
<td style="text-align: center;" width="25%"><b>E</b></td>
</tr>
<tr>
<th><span style="font-weight: normal;">Index High/Low</span><br /></th>
<td style="text-align: center;" width="25%"><b>H</b></td>
<td style="text-align: center;" width="25%"><b>L</b></td>
</tr>
<tr>
<th><span style="font-weight: normal;">Accumulator/Status</span><br /></th>
<td style="text-align: center;" width="25%"><b>A</b></td>
<td style="text-align: center;" width="25%"><b>PSW</b></td>
</tr>
<tr>
<th><span style="font-weight: normal;">Stack Pointer</span></th>
<td colspan="2" style="text-align: center;"><b>SP</b></td>
</tr>
<tr>
<th><span style="font-weight: normal;">Program Counter</span></th>
<td colspan="2" style="text-align: center;"><b>PC</b></td>
</tr>
</tbody>
</table>
<p>(Need to add some better short-short summary stuff here when I figure out how to organize it.)<br /></p>
<dl>
<dt>R byte operands --</dt>
<dd>
registers B,C,D,E,H,L,A<br />
memory M pointed to by HL
</dd>
<br />
<dt>Condition code flags (Program Status Word==PSW), in order --</dt>
<dd>Sign, Zero, (0), Auxiliary Carry, (0), Parity, (1), Carry</dd>
<br />
<dt>RP 16-bit operands --</dt>
<dd>subset of register pairs B:C, D:E, H:L, SP, A:PSW (accumulator is the high byte, flags the low byte)</dd>
<br />
<dt>index operands</dt>
<dd>
M (H:L pair)<br />
X: B (B:C pair), D (D:E pair)
</dd>
<br />
<dt>ORG D</dt>
<dd>set origin (assembly address) to absolute address D</dd>
<br />
<dt>L EQU V</dt>
<dd>define invariant value of label/symbol</dd>
<br />
<dt>L SET V</dt>
<dd>
set value of label/symbol<br />
SET labels may be redefined.
</dd>
<br />
<dt>END</dt>
<dd>end</dd>
<br />
<dt>DB/DW V</dt>
<dd>define a label and allocate and store byte or word value V there</dd>
<br />
<dt>DS SZ</dt>
<dd>define a label and only reserve space of size SZ</dd>
<br />
<dt>STC/CMC {C}</dt>
<dd>set/complement carry</dd>
<br />
<dt>INR/DCR R {ZSPA}</dt>
<dd>byte increment/decrement R/M</dd>
<br />
<dt>CMA {}</dt>
<dd>complement A</dd>
<br />
<dt>DAA {ZSCPA}</dt>
<dd>decimal adjust A</dd>
<br />
<dt>NOP</dt>
<dd>No OP</dd>
<br />
<dt>MOV Rdest,Rsrc {}</dt>
<dd>
move byte data R/M<br />
But MOV M,M is not valid (its opcode, 76H, is HLT).<br />
Other than the disallowed MOV M,M, moving a register to itself is effectively a NOP.
</dd>
<br />
<dt>MVI R,I {}</dt>
<dd>move 8-bit immediate data from instruction stream to R/M</dd>
<br />
<dt>LDA/STA D {}</dt>
<dd>
load/store (move) 8-bit data at direct (absolute) 16-bit address D to A<br />
or 8-bit data in A to direct (absolute) 16-bit address D
</dd>
<br />
<dt>LDAX/STAX X {}</dt>
<dd>load A from, or store A to, the address held in B:C or D:E</dd>
<br />
<dt>LHLD/SHLD D {}</dt>
<dd>
load/store (move) 16-bit data at direct (absolute) 16-bit address D to
H:L<br />
or 16-bit data in H:L to direct (absolute) 16-bit address D
</dd>
<br />
<dt>LXI RP,I {}</dt>
<dd>
move 16-bit immediate data from instruction stream to RP <br />
Destination can be B (B:C), D (D:E), H (H:L) or SP.
</dd>
<br />
<dt>ADD/ADC R {CSZPA}</dt>
<dd>
add without or with carry R/M to A<br />
ADD A is effectively shift left, but note flags.
</dd>
<br />
<dt>ADI/ACI I {CSZPA}</dt>
<dd>add without or with carry immediate data to A</dd>
<br />
<dt>SUB/SBB/CMP R {CSZPA}</dt>
<dd>
subtract/compare without or with borrow R/M from A<br />
SUB A clears A and sets the flags accordingly.<br />
For signed operands, the sense of the C flag in compare is inverted if the operand signs differ.
</dd>
<br />
<dt>SUI/SBI/CPI I {CSZPA}</dt>
<dd>
subtract/compare without or with borrow immediate data from A<br />
For signed operands, the sense of the C flag in compare is inverted if the operand signs differ.
</dd>
<br />
<dt>ANA R {CZSP}</dt>
<dd>
bit-and R/M into A<br />
Carry is always cleared.
</dd>
<br />
<dt>ANI I {CZSP}</dt>
<dd>
bit-and immediate data from instruction stream into A<br />
Carry is always cleared.
</dd>
<br />
<dt>XRA R {CZSPA}</dt>
<dd>
bit exclusive-or R/M into A<br />
Carry is always cleared.
</dd>
<br />
<dt>XRI I {CZSPA}</dt>
<dd>
bit exclusive-or data from instruction stream into A<br />
Carry is always cleared.
</dd>
<br />
<dt>ORA R {CZSP}</dt>
<dd>
bit-or R/M into A<br />
Carry is always cleared.
</dd>
<br />
<dt>ORI I {CZSP}</dt>
<dd>
bit-or data from instruction stream into A<br />
Carry is always cleared.
</dd>
<br />
<dt>RLC/RRC {C}</dt>
<dd>8-bit left/right rotate of A; the bit rotated out is also copied into carry</dd>
<br />
<dt>RAL/RAR {C}</dt>
<dd>9-bit left/right rotate of A through carry</dd>
<br />
<dt>PUSH/POP RP {},{all}</dt>
<dd>
push/pop register pairs:<br />
B (B:C), D (D:E), H (H:L), PSW (A:flags)<br />
Condition codes are affected only by POP PSW.
</dd>
<br />
<dt>DAD RP {C}</dt>
<dd>
16-bit add of register pair into H:L,<br />
RP can be B:C, D:E, H:L, SP<br />
DAD H is a 16-bit shift left of H:L, with the top bit going into carry.
</dd>
<br />
<dt>INX/DCX RP {}</dt>
<dd>
increment/decrement register pair<br />
RP can be B:C, D:E, H:L, SP
</dd>
<br />
<dt>XCHG {}</dt>
<dd>16-bit exchange D:E with H:L</dd>
<br />
<dt>XTHL {}</dt>
<dd>16-bit exchange of top of stack with H:L</dd>
<br />
<dt>SPHL {}</dt>
<dd>16-bit move H:L to SP</dd>
<br />
<dt>PCHL {}</dt>
<dd>
move H:L to PC<br />
This is the 8080's indexed jump.
</dd>
<br />
<dt>JMP D {}</dt>
<dd>jump unconditionally to direct (absolute) 16-bit address</dd>
<br />
<dt>JC/JNC D {}</dt>
<dd>
jump if C (carry) set/clear to direct (absolute) 16-bit address<br />
(Carry/No Carry)
</dd>
<br />
<dt>JZ/JNZ D {}</dt>
<dd>
jump if Z (zero) set/clear to direct (absolute) 16-bit address<br />
(Zero/Not Zero)<br />
Effectively equal/not equal after a subtract or compare.
</dd>
<br />
<dt>JM/JP D {}</dt>
<dd>
jump if S (sign) set/clear to direct (absolute) 16-bit address<br />
(Minus/Plus)
</dd>
<br />
<dt>JPE/JPO D {}</dt>
<dd>
jump if P (parity) set/clear to direct (absolute) 16-bit address<br />
(Even/Odd)
</dd>
<br />
<dt>CALL D {}</dt>
<dd>
call unconditionally to direct (absolute) 16-bit address<br />
Push address of next instruction on stack and jump.
</dd>
<br />
<dt>CC/CNC D, CZ/CNZ D, CM/CP D, CPE/CPO D</dt>
<dd>Conditional calls, same conditions as conditional JMPs.</dd>
<br />
<dt>RET {}</dt>
<dd>
return unconditionally to address saved on stack<br />
Pop top of stack into PC.
</dd>
<br />
<dt>RC/RNC, RZ/RNZ, RM/RP, RPE/RPO</dt>
<dd>
Conditionally return to address saved on stack,<br />
same conditions as conditional JMPs.
</dd>
<br />
<dt>RST N {}</dt>
<dd>
save address of next instruction on stack and jump to address N times 8<br />
N is 0 through 7, yielding address from 0 to 56 on 8-byte boundaries.<br />
Effects a software version of a numbered interrupt.<br />
Use ordinary RET or conditional return to return.<br />
Interrupt routine must explicitly save state of all registers used.
</dd>
<br />
<dt>DI/EI {}</dt>
<dd>
disable/enable interrupts<br />
Clears/sets the INTE interrupt enable flip-flop.
</dd>
<br />
<dt>IN/OUT P {}</dt>
<dd>
load A from/store A to 8-bit port number P<br />
P is an address in port space between 0 and 255.
</dd>
</dl>
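<p>As a quick check on two of the entries above, here is a minimal sketch in Python (chosen only because it is easy to run). The helper names pack_flags, unpack_flags, and rst_vector are made up for illustration and are not part of any 8080 toolchain. It packs and unpacks the flag byte in the order listed above, and computes the addresses that RST jumps to.</p>

```python
# Sketch only: helper names are illustrative, not from any 8080 toolchain.

# Flag byte layout from the crib sheet, bit 7 down to bit 0:
# Sign, Zero, (0), Auxiliary Carry, (0), Parity, (1), Carry
FLAG_BITS = {"S": 7, "Z": 6, "A": 4, "P": 2, "C": 0}


def pack_flags(**flags):
    """Build the flag byte from keyword arguments such as Z=True, C=True."""
    value = 0b00000010  # bit 1 is always 1; bits 3 and 5 are always 0
    for name, bit in FLAG_BITS.items():
        if flags.get(name, False):
            value |= 1 << bit
    return value


def unpack_flags(value):
    """Return a dict of flag letter -> bool for a flag byte."""
    return {name: bool((value >> bit) & 1) for name, bit in FLAG_BITS.items()}


def rst_vector(n):
    """Address that RST n jumps to: n times 8, for n in 0..7."""
    if not 0 <= n <= 7:
        raise ValueError("RST operand must be 0 through 7")
    return n * 8


if __name__ == "__main__":
    print(f"{pack_flags(Z=True, C=True):08b}")
    print(unpack_flags(0b10000110))
    print([rst_vector(n) for n in range(8)])
```

<p>The letters used for the flags are the same ones used in the braces after each mnemonic in the sheet, so {ZSPA} on INR/DCR reads directly as keys into FLAG_BITS.</p>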
<p>
Okay, I think I got the HTML right on that without losing any of the
entries.<br />
</p>
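<p>Two notes in the sheet say that ADD A and DAD H behave as left shifts. A small model, again a Python sketch with made-up function names, shows why: adding a value to itself doubles it, and the bit that falls off the top ends up in carry. The other flags that ADD A sets are left out of this model.</p>

```python
# Sketch only: models just the result and the carry flag, nothing else.

def add_a(a):
    """ADD A: add the 8-bit accumulator to itself; return (result, carry)."""
    total = a + a
    return total & 0xFF, total > 0xFF


def dad_h(hl):
    """DAD H: add the 16-bit H:L pair to itself; return (result, carry)."""
    total = hl + hl
    return total & 0xFFFF, total > 0xFFFF


if __name__ == "__main__":
    result, carry = add_a(0x81)
    print(f"ADD A on 81H gives {result:02X}H, carry={carry}")
    result, carry = dad_h(0x8001)
    print(f"DAD H on 8001H gives {result:04X}H, carry={carry}")
```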
零石http://www.blogger.com/profile/01111094813708912513noreply@blogger.com0tag:blogger.com,1999:blog-3954576973420765520.post-61095564303017086932022-10-23T16:29:00.001+09:002022-10-23T16:29:10.231+09:00Security Misfeature Report for Google GMail<p>Had another little unpleasant surprise from Google today.</p>
<p>
I wonder, how many would agree with me that this is a misfeature and reflects
poorly on Google's changing attitudes towards privacy and security?<br />
</p>
<p>Here it is:</p>
<div class="separator" style="clear: both; text-align: center;">
<br />
</div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjwTpN55cydX6b80YzW6UzqJpsVnScpZ_T1zXgVm0vAogUjX2k5Fu90zl5Re9AVR0WN5fCfJ822fXsAiGJZbh4WE6nj3ug2WkH2N9JcqLrvLSV55sVgxlrofEZnZfwFy8UQVsAOC5YBYO-OseWAi1gHlRafASnT2gMP5BHZSlFm_NWFKY2_GBZ180s5iA/s1600/Screenshot%20from%202022-10-23%2007-18-17_attachment_warning_google_mail_s0.jpeg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="900" data-original-width="1600" height="360" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjwTpN55cydX6b80YzW6UzqJpsVnScpZ_T1zXgVm0vAogUjX2k5Fu90zl5Re9AVR0WN5fCfJ822fXsAiGJZbh4WE6nj3ug2WkH2N9JcqLrvLSV55sVgxlrofEZnZfwFy8UQVsAOC5YBYO-OseWAi1gHlRafASnT2gMP5BHZSlFm_NWFKY2_GBZ180s5iA/w640-h360/Screenshot%20from%202022-10-23%2007-18-17_attachment_warning_google_mail_s0.jpeg" width="640" /></a></div><br />
<p></p><p>If you can't see the message, it says</p>
<p></p>
<blockquote>
<p>
<span style="color: #660000;">It seems like you forgot to attach a file.</span>
</p>
<p>
<span style="color: #660000;">You wrote "is attached" in your message, but there are no files attached.
Send anyway?</span>
</p>
<p style="text-align: right;">
<span style="background-color: #d9ead3;">Cancel</span> <span style="color: white;"><span style="background-color: #3d85c6;"> OK </span></span><br />
</p>
</blockquote>
<p>Seems convenient, doesn't it?</p><p>Let's think about this.</p><p>In case you missed it, here's what Gmail's deep inspection keyed on:</p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhYjENRF9UxBnL9xRRHjD7QmnJEMU05lUKme3SFjNznK8-Kpje0pLNFWPaiopNWnkWB3dtwhLSus5ocE0HGFeBrfWghpePgc2kD5d78PuGlASkxSV9s220E4ffXYKQs832KOSeVbqxMy4_wTXaJ7dw1eZ75HwXX5aQGGVfg_m31kjJMB46h0Xzvx8nNLg/s1600/Screenshot%20from%202022-10-23%2007-18-17_attachment_warning_google_mail_s1.jpeg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="900" data-original-width="1600" height="360" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhYjENRF9UxBnL9xRRHjD7QmnJEMU05lUKme3SFjNznK8-Kpje0pLNFWPaiopNWnkWB3dtwhLSus5ocE0HGFeBrfWghpePgc2kD5d78PuGlASkxSV9s220E4ffXYKQs832KOSeVbqxMy4_wTXaJ7dw1eZ75HwXX5aQGGVfg_m31kjJMB46h0Xzvx8nNLg/w640-h360/Screenshot%20from%202022-10-23%2007-18-17_attachment_warning_google_mail_s1.jpeg" width="640" /></a></div><br /><blockquote><p><span style="background-color: #01ffff;">The sign is attached to the desk.</span><br /></p></blockquote>
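<p>I have no idea how Gmail actually implements this check, so the following is purely a guess for illustration: a naive phrase match in Python, with a made-up phrase list and a made-up function name. It shows that any check of this kind has to read the body of the message, and that a simple one trips on the sentence above even though nothing was meant to be attached.</p>

```python
import re

# Hypothetical trigger phrases; whatever list Gmail really uses is not public.
ATTACHMENT_HINTS = re.compile(
    r"\b(?:is attached|are attached|see attached)\b", re.IGNORECASE
)


def looks_like_missing_attachment(body, attachment_count):
    """Return True if the body mentions an attachment but none is present."""
    return attachment_count == 0 and bool(ATTACHMENT_HINTS.search(body))


if __name__ == "__main__":
    print(looks_like_missing_attachment("The sign is attached to the desk.", 0))
    print(looks_like_missing_attachment("The report is attached.", 1))
```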
<p>What do you think? Is Google going too far with this?<br /></p>
<p><br /></p>
零石http://www.blogger.com/profile/01111094813708912513noreply@blogger.com1tag:blogger.com,1999:blog-3954576973420765520.post-11231255545775971152022-07-03T03:39:00.009+09:002022-07-07T22:05:57.334+09:00A Critique of Motorola's 68XX and 680XX CPUs<p>I want to note at the top here, that this is not about which company's CPU was better. This is not about comparing CPUs at all.<br /></p><p>And this is not disparaging Motorola. Motorola did a pretty decent job of designing each of their CPUs, especially when considering that they were not just pioneering microprocessor design. Engineers with experience designing CPUs were basically all already employed, mostly by other companies. (And many of those CPU engineers didn't really understand CPUs all that well, after all.) Motorola was also pioneering the design of CPUs in general.<br /></p><p>The engineers at Motorola did a good job. But nobody's perfect. </p><p>Taking these in the order that Motorola produced them: <br /></p><h4 style="text-align: left;">6800 Niggles: <br /></h4><p>(1) It's not hard to guess that the improper optimization of the CPX (compare X index register) instruction was an attempt to be too clever, a bad case of penny-pinching and setting arbitrary deadlines, an oversight, or any and all of the foregoing. But, as a result, the branches implementing signed and unsigned comparisons just don't do what they would be expected to do after CPX.<br /></p><ul style="text-align: left;"><li>C (Carry) is simply not affected by CPX on the 6800 (and 6802), so the branches implementing unsigned compare, BCC, BCS, BHI, and BLS just won't work after CPX. </li><li>V (oVerflow) is the result from comparing the most-significant byte only, so the branches implementing signed comparison, BGE, BGT, BLE, and BLT fail in hard-to-predict ways after CPX. </li><li>N (Negative) is also the result from comparing the most-significant byte only. 
It may not seem that this is a problem for BPL (branch if plus) and BMI (branch if minus), but the programmers' manual says neither N nor V is intended for conditional branching. It seems to me that the N flag will actually be set correctly after the CPX, giving the sign of the result of the thrown-away subtraction of the argument address from the address in X. But using BPL and BMI in ordered comparison is just going to be a bit fiddly, no matter what. You probably just won't get what you thought you wanted if you use BPL or BMI after CPX.<br /></li></ul><p>Z (Zero) reflects all 16 bits of the result of the compare, so BEQ (branch if equal) and BNE (branch if not equal) after a CPX work as expected. <br /></p>In the abstract sense, pointers were thought at the time to be necessarily unordered, so it sort of didn't seem to matter. Ideally, you wouldn't be comparing addresses for order. But real algorithms often do want to give pointers order, and that meant that, on the 6800, you would have to use a sequence of instructions to cover all the cases in ordered comparison, because you couldn't rely on CPX alone. <br /><p>This mis-feature was preemptively prevented in the designs of the 68000 and the 6809, and was fixed, pretty much without issue, in the 6801. In the 6805, it's prevented by making the X register an 8-bit register anyway, more on that below.<br /></p><p>(2) Addressing temporary variables and parameters on a stack required using X, and if you had something you needed in X, you had to save X somewhere safe -- which meant on a stack if you wanted your code to be re-entrant. But the 6800 had no instructions to directly push or pop X. That left you with a conundrum. You had to save X to use X to save X. </p><p>So you had to use a statically allocated temporary variable. 
Statically allocated temporaries tend to introduce race conditions even in single-processor designs, because you really don't want to take the time to block interrupts just to use the temporaries, especially for something like adjusting a stack pointer.</p><p>You can potentially work around the race conditions in some cases by having your interrupt-time stack pointers separate from your non-interrupt-time stack pointers, but that can also get pretty tricky pretty quickly.</p><p>The 6801 provides push (PSHX) and pop (PULX) instructions for X.<br /></p><p>Stack-addressable temporary variables and parameters were supported by definition in the 68000 and 6809 designs, but not on the 6801. They were considered out of scope on the 6805, but were addressed on descendants of the 6805.</p><p>(3) This niggle is somewhat controversial, but using a single stack that combines return addresses and parameters and temporary variables is a fiddly solution that has become widely accepted as the standard. Even though it is accepted, and learning how to set up a stack frame is something of a rite-of-passage, setting up stack frames to keep the regions on stack straight consumes cycles, even when it can be done without inducing race conditions (see the above niggle about using X to address the stack).</p><p>Separating parameters and temporaries from return addresses is supported by design on the 68000 and 6809, but not on the 6801 or 6805.<br /></p><p>(4) The lack of direct-page mode op-codes for the unary operators was, in my opinion, a serious strategic miss. Sure, you could address variables in the direct page with extended mode addressing, but it cost extra cycles, and it just felt funny. </p><p>To explain, the binary instructions (loads, stores, two-operand arithmetic and logic) all have a direct-page mode. This allows saving a byte and a cycle when working on variables in the direct page (called zero page on other processors -- addresses from 0 to 255). 
</p><p>The unary instructions (increment/decrement, shifts, complements, etc.) do not. The irony is that the unary instructions are the ones you use on memory when you don't want to waste time and accumulators loading and storing a result.<br /></p><p>This may have been another attempt to save transistors by not implementing every possible op-code. But a careful re-examination of the op-code table layout map indicates that it should have been possible without using significantly more transistors. In fact, I'm guessing it actually required more transistors to do it the way they ended up doing it. </p><p>Or it may have been an attempt to avoid running into the situation where they would need an op-code for something important but had already used all of the available codes in a particular area of the map. But, again, re-examining the op-code map would have revealed room to fit the op-codes in. </p><p>Maybe there just wasn't enough time to re-examine and reconsider the omissions before the scheduled deadlines, and they thought absolute/extended addressing should be good enough. </p><p>I'll come back to the reasons it really wasn't further down.<br /></p><p>This one was also fixed in the designs of the 68000, and 6809, and sort-of in the 6805, but not addressed or fixed in the 6801. <br /></p><p>Fixing it in the 6801 would have been awkward after-the-fact tack-on, but I'll look at that below. </p><p>(5) The 6800 had a few instructions for inter-accumulator math -- ABA (add B to A), SBA (subtract B from A), and CBA (compare B with A, which is SBA but without storing the result). </p><p>But it's missing the logical instructions AND, OR, and EOR (Exclusive-OR) of B into A, and doesn't have any instructions at all going the other direction, A into B. </p><p>Surprisingly, this is not hard to work around in most cases, but the workarounds are case-by-case tricks with the condition codes. 
Otherwise, you're back to using statically allocated temporaries, and care must be taken to avoid potential race conditions by such things as using the same temporaries during interrupt processing.</p><p>This is fixed in the design of the 68000, and eliminated from the scope of the 6805, effectively fixed in the 6809 (by the addition of stack-relative addressing for temporaries), and partially addressed in the 6801 (by adding 16-bit math, the most common place where it becomes a problem, more below).</p><p>(6) The 6800 has no native 16-bit math other than incrementing, decrementing, and comparing X, and incrementing and decrementing S. Synthesizing 16-bit math is straightforward, but -- especially without the inter-accumulator logical operators -- it does require temporary variables, requiring extra processor cycles and potentially inducing race conditions. <br /></p><p>Also, you usually need one or more extra test cases to cover partial results in one or the other byte, or the use of a logical instruction to collect the results, and it's easy to forget or just fail to complete the math, per the problem with CPX.<br /></p><p>And you need 16-bit arithmetic to deal with 16-bit addresses.</p><p>This is solved on the 6809 and 6801 by adding 16-bit addition and subtraction. On the 68000, the problem becomes 32-bit math, and it's solved for addition and subtraction, but, oddly, not quite completely for multiplication and division, more below. <br /></p>(7) To explain this last niggle, of the above niggles, (1), (2), (3), (5), and (6) can be solved in the software application/operating system design by appropriate declaration of global pseudo-register variables, and globally accessible routines to handle the missing functionality, exercising care to separate variables and code for interrupt-time functions from those for non-interrupt-time functions. 
(These global routines and variables are a core feature of most 8-bit operating systems.)<p>For example, if your system design declares and systematically uses something like the following:</p><p></p><blockquote><p> ORG $80 ; non-interrupt time global pseudo-registers<br />PSTK RMB 2 ; two bytes for parameter stack pointer<br />QTMP RMB 2 ; temporary for high bytes of 32-bit quadruple accumulator<br />DTMP RMB 2 ; temporary for 16-bit double accumulator<br />XTMP RMB 2 ; temporary for index math and copy source pointer<br />YTMP RMB 2 ; temporary for index math and copy destination pointer<br /><br /> ORG $90 ; interrupt time global pseudo-registers<br />IPSTK RMB 2 ; two bytes for parameter stack pointer<br />IQTMP RMB 2 ; temporary for high bytes of 32-bit quadruple accumulator<br />IDTMP RMB 2 ; temporary for 16-bit double accumulator<br />IXTMP RMB 2 ; temporary for index math and copy source pointer<br />IYTMP RMB 2 ; temporary for index math and copy destination pointer<br /></p></blockquote><p></p><p>... and if all the processes running on your system respect those global variable declarations, then you may at least have a way to avoid the race conditions.</p><p>But that chews a piece out of the memory map for user applications. 
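</p><p>As a rough sketch of how code might use such pseudo-registers, the routine below adds the 16-bit value in A (high byte) and B (low byte) to X on a 6800, using XTMP from the declarations above. The routine name ADDDX is made up for illustration, and an interrupt-time copy would have to use IXTMP instead:</p><blockquote><p>* X := X + A:B, for non-interrupt-time code only<br />ADDDX STX XTMP ; park X where its bytes can be reached<br /> ADDB XTMP+1 ; add the low bytes<br /> ADCA XTMP ; add the high bytes plus carry<br /> STAB XTMP+1 ; store the sum back, low byte<br /> STAA XTMP ; then high byte<br /> LDX XTMP ; X now holds the sum<br /> RTS<br /></p></blockquote><p>Because XTMP is in the direct page, every memory reference here can use the short direct-page form.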
</p><p>Now, if the unary operators all had direct-page mode versions (see niggle (4) above), the processor could also define a direct-page address space function code, along with several other such function codes, allowing the system designer to optionally include hardware to separate the direct-page system resources from other resources in the address map, such as general data, stack, code, interrupt vectors, etc.<br /></p><p>Two or three extra address lines could be provided as optional address function codes, to allow hardware to separate the spaces out.</p><p>This looks kind of like the I/O instructions on the 8080 and 8086 families, but it isn't separate instructions, it's separate address maps.<br /></p><p>An example two-bit function code might be<br /></p><ul style="text-align: left;"><li>00: general (extended/absolute) data and I/O<br /></li><li>01: direct-page data and I/O<br /></li><li>10: code/interrupt vectors<br /></li><li>11: return address stack<br /></li></ul>Such extra address function signals can improve the utilization of the cramped 64 kilobyte address space, even though they would require increasing the number of pins on the processor package or multiplexing the functions onto some other signals, raising the effective count of external parts. <p>But they provide a place for such things as bank-switch hardware, in addition to general I/O and system globals and temporaries, without having to eat holes in general address space. And completely separating the return pointer stack from general data greatly increases the security of the system.</p><p>I'm not sure if Motorola ever did so in any of their evolved microcontrollers, but this could also potentially allow optimizing access to direct-page pseudo-registers when direct-page RAM is provided on-chip in integrated system-on-a-chip devices like the 6801 and 6805 SOC packages. 
<br /></p><p>The 68000 provides similar address function codes, but the address space on the 68000 is so much bigger than 64 kilobytes that the address function codes have been largely ignored.</p><p>Before Motorola began designing new microprocessors, such niggles in the 6800 were noticed and discussed in engineering and management within Motorola. The company decided to analyze code they had available, including internally developed code and code customers shared with them for the purpose of the analysis, looking for bottlenecks and inefficient sequences that an improved processor design could help avoid. The results of this code analysis motivated the design of the 68000 and the 6809. </p><p>The 68000 and the 6809 were designed concurrently, by different groups within Motorola.<br /></p><p> </p><h4 style="text-align: left;">68000 Niggles:</h4><p>The 68000 significantly increases the number of both accumulators (data registers) and index registers, and directly supports common address math in the instruction set. It also widens address and data registers to 32 bits. They solved a lot of problems, but they left a few niggles.</p><p>(1) The processor was excessively complex. Having a lot of registers reduced the need for complex instructions and for instructions that operated directly on memory without going through registers, but the 68000 did complex instructions and instructions that operated directly on memory, as well. </p><p>IBM's work on the 801 (the predecessor of the ROMP) was still in its early stages at the time, and reduced instruction sets were still not a common topic, so the assumption of complexity can be understood.</p><p>Still, the complexity required a lot of work to test and properly qualify products for production. <br /></p><p>(2) They got the stack frame for memory management exceptions wrong. 
That is, memory management hardware turned out to work significantly better using the approach they did not initially choose to support, so the frames they had defined did not contain enough information to recover using the preferred memory management techniques. This was fixed in the 68010.</p><p>(3) The exception vector space being global made it difficult to fully separate the user program space from the system program space. This was also fixed in the 68010. <br /></p><p>(4) Constant offsets for the indexed modes were limited to 16 bits. This seems to be another false optimization -- not fatal because they included variable (register) offsets in the addressing modes, so you could load a 32-bit offset into a data register to get what you wanted. But it still had a cost in cycle counts and register usage. This was not fixed until the 68020, and then they went overboard, making the addressing even more complex, which made the 68020 even harder to test.</p><p>(5) They added hardware multiplication and division to the 68000, but they didn't fully support 32 bit multiply and divide. This also was not fixed until the 68020. This can make such things as accessing really large data structures in memory suddenly become slow, when the index to the data structure exceeds 32,767.<br /></p><p>Of the above, (4), and (5) could conceivably have been dealt with in the initial design, if management had not been pushing engineering to find corners to cut. The first three were problems that simply required experience to get right. <br /></p><p><br /></p><h4 style="text-align: left;">6809 Niggles:</h4><p>The 6809 does not increase the number of accumulators, but it does add instructions that combine the two 8-bit accumulators, A and B, into a single 16-bit accumulator D for basic math -- addition, subtraction, load, and store. 
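</p><p>A minimal sketch of what that looks like in 6809 code, with hypothetical variable names COUNT, STEP, and TOTAL, each a 16-bit value in memory:</p><blockquote><p> LDD COUNT ; A gets the high byte, B the low byte<br /> ADDD STEP ; 16-bit add, carry between bytes handled in hardware<br /> STD TOTAL ; store both bytes, high byte first<br /></p></blockquote><p>On the 6800 the same addition takes separate ADDB and ADCA steps, with separate loads and stores for each byte.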
<br /></p><p>On the other hand, it does increase the number of indexable registers to six, and it adds a whole class of address math that can be incorporated into the addressing portion of the instructions themselves, or can be calculated independently of other instructions. </p><p>It supports using two of the index registers as stack pointers, and thus supports stack addressing, so that race conditions can generally be completely avoided by using temporary variables on stack. (In comparison, the 68000 can use any of the 8 address registers visible to the programmer as stack pointers.)<br /></p><p>One of the stack-pointer capable registers can be used as a frame pointer, making stack frames less of a bottleneck. Or it can be used as a separate parameter stack pointer, pretty much eliminating the bottleneck and improving security. (In comparison, the 68000 includes an instruction to generate a stack frame, which, of course, you don't need when you use properly split stacks. It also includes an entirely superfluous instruction to destroy a stack frame.)<br /></p><p>One of the index-capable registers is the PC, which simplifies such things as mixing tables of constants in the code. (This is also supported on the 68000, making a ninth index-capable register for the 68000.)<br /></p><p>One of the index registers (DP, for direct page) is a funky 8-bit high-byte partial index for the direct page modes it inherits from the 6800. 
(This is not done on the 68000, but any of the 68000's address registers can be used in a similar way, with short constant offsets for compact code and reduced cycle counts.)</p><p>All unary instructions have a direct page mode op-code, which saves byte count if not cycle count.<br /></p><p>(1) As a minor niggle, I can't tell that not providing a full 16-bit base address for the direct addressing mode actually saved them anything in terms of transistor count and instruction cycle count, but we are probably safe in guessing that was their reasoning for doing it that way. It is still useful, although it might have been more useful to have provided finer-grain control of the base address of the direct page. (See above about using any address register in the 68000 in a similar way.)</p><p>The DP can be used, with caveat, as a base for process-local static allocations, which greatly reduces potential for inadvertent conflicts in use of global variables and race conditions.<br /></p><p>(2) Another niggle about the direct page, the caveat, is that the direct page base is not directly supported for address math. Just finding where the direct page is pointing requires moving DP to the A accumulator and clearing the B accumulator, after which you can move it to one of the index registers. Cycle and register consuming, but not fatal.</p><p>(3) A third niggle, about both the direct page and the indexed mode: it seems like cycle counts for both could have been better. The 6801 improved cycle counts for both, making the 6809 seem less attractive to engineers seeking speed. It would have been nice for Motorola to have followed the 6801 with an improved 6809 that fixed the DP niggles and cycle count niggles.<br /></p><p>(4) The 6809 also does not have address function code signals. The overall design provides enough power to implement mini-computer class operating systems, but the 64 kilobyte address space then limits the size of user applications. 
Address function signals that allow separating code, stack, direct page, and extended data would have eased the limits significantly.</p><p>On the other hand, widening the index registers would have done even more to ease the addressing restrictions. (I've talked about that elsewhere, and I hope to examine it more carefully sometime in a rant on how the 6809 could have evolved.)<br /></p><p><span>(5) Other than those niggles, the 6809 is about as powerful a design as you can get and still call a CPU an 8-bit processor. In spite of the fact that it would have meant letting the 6809 compete with the 68000 in the market, they could have used the 6809 as the base design of a family of very competitive 16-bit CPUs.</span></p><p><span>In other words, my fifth niggle is that Motorola never pursued the potential of the 6809. </span></p><p><span>(6) but not really -- 8-bit CPUs are generally focused on keeping transistor count down for 8-bit applications, so hardware multiplication and division of 16-bit numbers doesn't really make sense in an 8-bit CPU design. This is probably the reason the 6809 only had 8- by 8-bit multiplication, and also probably the reason for the irregular structure of the operation. </span></p><p><span>A similar 8-bit division of accumulator A by accumulator B yielding 8 bits of quotient and 8 bits of remainder might make sense, but I'm not sure we should want to waste the transistors.<br /></span></p><p><span>16-bit multiply and divide would have been good for a true 16-bit version of the 6809, but that would include a full 16-bit instruction set. </span><br /></p><p> </p><h4 style="text-align: left;">6801 Niggles: <br /></h4><p>When the 6809 was introduced in the market, it was still a bit too much complexity in the CPU to comfortably integrate peripheral parts -- timers, serial and parallel ports, and such -- into the same semiconductor die that contained the CPU. 
So Motorola decided to fix just a few of the niggles of the 6800 for use as a core CPU in semi-custom designs that included on-chip peripheral devices.</p><p>(It's something that is commonly misunderstood, that the 6801 actually came after the 6809 historically, but is best understood as a slightly improved 6800, not as a stripped-down 6809. Three steps forward, three steps back, half a step forward.) <br /></p><p>As noted above, they fixed the CPX instruction in the 6801, but they did not fix the lack of direct-page unary instructions. They also added instructions to directly push and pop the X index register, which greatly helped when you had something in X that you needed to save before you used X for something else. <br /></p><p>And they added the 16-bit loads, stores, and math that combined A and B into a single 16-bit double accumulator D -- similar to the 6809, which overcame a lot of the other niggles about the 6800. In particular, you don't feel the lack of an OR B with A instruction to make sure both bytes of the result were zero, because the flags are correctly set after the D accumulator instructions. <br /></p><p>And they included the 8-bit multiply A by B from the 6809. They also included a couple of 16-bit double accumulator shifts, but only for D, not for memory, which is a very minor niggle, an engineering trade-off.<br /></p><p>They also added an instruction to add B to X, ABX, to help calculate the addresses of fields within records. </p><p>This brings up niggle (1) -- ABX is unsigned, and they did not include a subtract B from X instruction. Being able to subtract B from X, or add a negative value in B to X, would have significantly helped with allocating local variable space on the stack. As it is, ABX is primarily useful for addressing elements within records and structures.<br /></p><p>Although I/O devices tended to be assigned addresses in high memory on early 6800 designs, the 6801 put the built-in I/O devices in the direct page. 
They also put a bit of built-in RAM in the direct page, starting at $80. <br /></p><p>But, as I noted above, niggle (2) is that they did not add direct-page mode unary instructions.<br /></p><p>If they had done so, either they'd have broken object code compatibility with the 6800, or they'd have had to spread the direct-page op-codes in awkward places in the 6800, which definitely would have cost transistors that they wanted for the I/O devices and such. Either way, I think it would have been worth the cost.<br /></p><p>I put together a table showing one possible way to spread them out among unimplemented op-code locations in the inherent/branch section of the op-code table for <a href="https://joelrees-novels.blogspot.com/2020/04/33209-headwinds-license.html" target="_blank">a chapter of one of my stalled novels</a>, and I'll just copy below a list of where I allocated the direct page op-codes:</p><ul style="text-align: left;"><li>NEG direct: $02</li><li>ROR direct: $12</li><li>ASR direct: $03 <br /></li><li>COM direct: $13</li><li>LSR direct: $14</li><li>ROL direct: $15</li><li>ASL direct: $18</li><li>DEC direct: $1A</li><li>INC direct: $1C</li><li>TST direct: $1D</li><li>JMP direct: $1E</li><li>CLR direct: $1F <br /></li></ul><p>That doesn't prove anything other than that there were ultimately enough op-codes available. But I'm guessing this layout could be done with a hundred or less extra transistors -- transistors that admittedly would then be unavailable for counters or port bits. But it could be done, and it wouldn't have cost that much.<br /></p><p>Also, with these in the op-code map, they could have provided this version of the CPU for compatibility, and then provided another version with the direct-page op-codes correctly laid out for customers who were willing to simply re-assemble their source code. 
(That's all it would have taken, but many customers wouldn't be willing to take a chance that something would sneak up and bite them.)<br /></p><p>One possible more efficient layout would have been to repeat the addressing of the binary op-code groups. Working from the right in the opcode map, there are four columns for accumulator B binary operators and four columns for accumulator A binary operators:</p><ul style="text-align: left;"><li>$FX is extended mode B, and $BX is extended mode A;</li><li>$EX is indexed mode B, and $AX is indexed mode A;</li><li>$DX is direct page B, and $9X is direct page A;</li><li>$CX is immediate mode B, and $8X is immediate mode A. <br /></li></ul><p>In the existing 6800, this continues down two more for the unaries, but then you have the unary A and B instructions:</p><ul style="text-align: left;"><li>$7X is extended mode unary;</li><li>$6X is indexed mode unary;</li><li>$5X is B unary;</li><li>$4X is A unary.</li></ul><p>Then you have inherent mode instructions in columns $3X, $1X, and $0X, with the branches in column $2X.</p><p>In a restructured op-code map, it could be done like this: <br /></p><ul style="text-align: left;"><li>$7X is extended mode unary;</li><li>$6X is indexed mode unary;</li><li>$5X would be direct page unary;</li><li>$4X would be B unary;</li><li>$0X would be A unary. <br /></li></ul><p>And the inherent mode operators would be more densely packed in the $1X and $3X columns.</p><p><strike>This would require either moving the negate instructions or the halt-and-catch-fire instruction, I suppose.</strike> <i>[I'm not finding my reference that had me thinking the 6801's test instruction was at $00. Cancel that thought.]</i> Interestingly, when Motorola laid out the op-code map for the 6809, they kept A and B in columns $4X and $5X, and put the direct page in column $0X -- and left the negate at row $X0<strike>, so that they had to move the test instruction</strike>. 
<i>[Again, I'm not finding my reference on the location of the 6809's test instruction. But they did leave negate where it was.]</i><br /></p><p>Also interestingly, the 6801 has a direct-page jump to subroutine, which could be put to good use for a small set of quick global routines (like stack?). (The op-code is $9D, which some sources say was one of the accidental test instructions in the 6800.)<br /></p><p>Niggle (3) about the 6801 is that I think they should have split the stack. Add a parameter stack U, and then pushes and pops (PULs) would operate on the U stack, but JSR/BSR/RTS would operate on the S stack. This would make stack frames much less of a bottleneck, make it possible to reduce call and return cycle counts, and increase general code security somewhat.</p><p>(Note again that the 6809 and the 68000 both directly support this kind of split stack. It was the education system that failed to teach engineers to use it.) <br /></p><p>And I'll note here that the 68HC11 derivative of the 6801 added, among other things, a Y index, but no parameter stack. <br /></p><p> <br /></p><h4 style="text-align: left;">6805 Niggles</h4><p>Really the only niggle I have with the 6805 is the lack of a separate parameter stack, and the lack of any push/pop at all in the original 6805. Motorola did add pushes and pops to some derivatives of the 6805, but they were on the same S stack as the return address was going to. <br /></p><p>The idea of an 8-bit index that could have a 16-bit base (as opposed to an offset) was novel to me when I first looked at the 6805, but it is rather useful. Instead of thinking in terms of putting a base address in X and then adding an offset, you think in terms of having a constant base address -- like an array with a known, fixed address, and the X register provides a variable offset. Indexed mode for binary operators includes no base, 8-bit base, and 16-bit base, allowing use anywhere in the address space. 
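</p><p>As a small sketch of the 16-bit base form on the 6805 (the names TABLE, LEN, SUM, and SUMLP are hypothetical; LEN is assumed to be an assembly-time constant from 1 to 255), this sums the bytes of a table that can sit anywhere in the address space:</p><blockquote><p> CLRA ; running 8-bit sum, wraps on overflow<br /> LDX #LEN ; X counts down from LEN to 1<br />SUMLP ADD TABLE-1,X ; constant base TABLE-1 plus variable offset X<br /> DECX<br /> BNE SUMLP ; loop until X reaches zero<br /> STA SUM ; SUM is a byte variable in RAM<br /></p></blockquote><p>The base address is a constant in the instruction and X is only the offset, which is the reverse of the usual 6800 habit of putting the base in X.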
<br /></p><p>A small caveat is that unary operators do not have 16-bit base address indexed versions. This is a valid engineering tradeoff, and they cut the right corners here, fully supporting unary instructions for variables in the direct page. <br /></p><p>The 8-bit index does not support generalized copying and other generalized functions needed to support self-hosted development environments (without self-modifying code), but that's not necessarily a problem. Hosted development environments are much more powerful tools than self-hosted. (I think a very small Tiny-BASIC interpreter could be constructed without self-modifying code, but that's more of an application than a self-hosted dev environment.)<br /></p><p>It does make the CPX operator much simpler -- as an 8-bit operator.</p><p>Motorola ultimately extended the index with an XHI in some derivatives of the 6805, which would have allowed self-hosting for those derivatives, but we won't go there today. Also, we won't look at the 68HC11 in detail today. Nor will we do more than glance at the 68HC12 and 68HC16, even though both are quite interesting designs -- in spite of not having split stacks.<br /></p><p>I think this is enough to show that Motorola really did do a fairly decent job with their CPU designs.</p><p>Actually comparing CPUs, by the way, requires producing a lot of parallel code implementing several real-world applications for each CPU compared. I'd like to do that someday, but I doubt I'll ever have the spare time and money to do so.<br /></p>零石http://www.blogger.com/profile/01111094813708912513noreply@blogger.com0tag:blogger.com,1999:blog-3954576973420765520.post-1672240456261937542022-05-07T19:58:00.062+09:002022-06-28T21:32:43.451+09:00Translation of Article on the 63C09 in _Oh!FM_ 1988-4: p.72 <p>
As I noted in my post of my transcription of this magazine article, if you're
a fan of Motorola's 6809 Advanced 8-bit CPU, you may be aware of Hitachi's
6809-compatible 6309 CPU. You may have heard of the article that "revealed"
the 6309's extensions to the 6809 register model and instruction set.
</p>
<p>
I am not especially a fan of the 6309, although I am
<a href="https://defining-computers.blogspot.com/2021/08/the-real-reason-ibm-chose-wrong-cpu-for.html" target="_blank">something of a fan of the 6809</a>. But there are many fans of the 6809 who are also fans of the 6309, and I've
heard enough of them ask for a translation of this article, and I happen to be
spinning my wheels on my more important goals at the moment, so I've decided
to take a crack at it.
</p>
<p></p>
<p>
Today (Saturday, May 7, 2022), I just finished typing in the background
portion of the article in Japanese, and that is probably the more interesting
part, anyway. Over the next several months or longer, when I'm not too tired
from the mail route, I'll either be typing more of the article in on
<a href="https://defining-computers.blogspot.com/2022/05/transcription-of-article-on-63c09-in-oh-fm-1988-4-72.html" target="_blank">the transcription post</a>, or I'll be translating more of it here. Right now, I only have the first
pass on the headline and the first paragraph in English.
</p>
<p>
I don't really want to work with the output of
<a href="https://translate.google.com" target="_blank">Google Translate</a>
here, because I find it distracting. (But people will ask for it, and I
guess it's not necessary to make everyone go to Google themselves, so I'll
paste it below for reference while I work, and to laugh at when I'm done.)<br />
</p>
<p>
If the publishers of <i>Oh! FM</i> or the authors of the article take
exception to me putting this up without permission, yes, I'll take it back
down. It has been water under the bridge for so long that it's hard to imagine anyone
thinking the information will do anyone damage now, but you never know.<br />
</p>
<p>
<strike>The beginning of the translation, just text for now, with a tiny bit of
crude markup:</strike><br />
</p>
<p></p>
<p>
The background sections are now (around May 21st) done, and I'm leaving the
Japanese mixed in here for reference.
</p>
<p>
<i>[JMR202206082344: <br /></i>
</p>
<p>
<i>And now (June 8th) the rest of the technical details section has been typed
up and posted in the transcription post linked above. I'll begin translating
it here real soon now, beginning at </i>
</p>
<blockquote>
<p>************<br />拡張レジスタ<br />Extension registers</p>
</blockquote>
<p></p>
<p>
<i>But I'll note again that the technical information in the article should be
considered to be for historical interest only. The technical reference by
Alan DeKok and Chet Simpson at
</i>
</p>
<p>
<i><a href="http://www.sandelman.ottawa.on.ca/People/Alan_DeKok/interests/6309.techref.html" target="_blank">http://www.sandelman.ottawa.on.ca/People/Alan_DeKok/interests/6309.techref.html</a></i>
</p>
<i> </i>
<p>
<i>is more complete and more accurate.</i></p><p><i>For further reference, the reportage by Hirotsugu Kakugawa which made the meat of the technical information widely available and led to the reference above can be found on comp.sys.m6809, archived by Google at <a href="https://groups.google.com/g/comp.sys.m6809/c/xxCoMu_gyA4/m/-mhmKurDc90J">https://groups.google.com/g/comp.sys.m6809/c/xxCoMu_gyA4/m/-mhmKurDc90J</a> .<br /></i>
</p>
<i> </i>
<p><i>]</i></p><p><i>[JMR202206262027:</i></p><p><i>Translation of the entire article is now complete. </i></p><p><i>There is no purpose in this other than historicity, so I will not be posting an English-only version of the article. Also, I cannot grant permission to copy. <br /></i></p><p><i>] <br /></i></p>
<p>
It should be remembered that this was written well over thirty years ago, and much
of what the (apparently pseudonymous) authors say is old news and dated
marketing hype.
</p>
<p>
Since I don't own the original copyrights, I cannot extend permission to copy
or publish in any way. Partly for that reason, I do not plan to publish an
English-only version. Go ahead and wade through the mix. If you feel inclined,
try reading the Japanese.<br />
</p>
<p>The translation, mixed in with the transcription:</p>
<p></p>
<hr />
<p>
[Original copyright 1988 Oh!FM -- 元発行社 Oh!FM 1988]<br />[Translation
copyright 2022 Joel Matthew Rees -- 翻訳文発行者 Joel Matthew Rees 2022]
<br />
</p>
<p>
[Oh!FM 1988-4: p.72]<br />=================================================<br />
</p>
<p>
16ビット乗除算/レジスタ間演算/ブロック転送が可能<br />16-bit
Multiplication & Division / Register-register Math / Block Transfers<br />超8ビット級MPU<br />8-bit
Super MPU<br />63C09の拡張機能をさぐる<br />Finding Extensions to the 6809 in
the 63C09<br />63C09解析委員会 UNO<br />63C09 Survey Committee (Kaiseki
Iinkai) UNO<br /><br /> 6809のマイナーチェンジ版に、63C09というLSIがあります。ハードに強い一部のユーザの間では、その高速性を買われ、本体の改造に使われたりしてきました。ところが、最近この63C09に各種の拡張機能が隠されていたことがわかりました。ここでは、それらの機能を発見し、探索してきた「63C09解析委員会」の方にその概要を報告していただきます。<br />The
63C09 is known as a minor upgrade to the 6809. Some users who are confident
with hardware have bought it for the higher speed rating, to speed up their
computers. Recently, various hidden functional enhancements have been
discovered in the 63C09. Here, we will have someone from the 63C09 Survey
Committee (Kaiseki Iinkai) that found and explored these extensions outline
what they have brought to light.
</p>
<p>
なお、本体の改造、ことにCPUの差し換えは、メーカーの修理は保証されず、他の周辺LSI、周辺機器も交換しなければならない場合もありえ、おまけに63C09だと従来のソフトの一部(あるいは多数)が動作しなくなる危険性がありますので、「私は自作したプログラムしか使わない!」という、よほど腕に自信のある方以外にはお勧めしかねます。<br />Now,
we need to point out the dangers. When you start adding your own custom
modifications to your computer, in particular when you swap out the CPU, the
manufacturer is not guaranteed to fix your mistakes for you. It is possible
that you will be required to change out other LSI parts and peripherals as
well. And with the 63C09 you may end up losing the ability to run some -- or
even most -- of the software you've been relying on, to boot. For these
reasons, we cannot recommend these modifications, except to those who have a
very high level of confidence in their own skills, to the point that they are
willing to forego the use of any software they don't themselves write.
</p>
<p></p>
<p><br /></p>
<p>
******************<br />6809の高速版 63C09<br />The 63C09 as a High-Speed 6809
</p>
<p></p>
<p>
パソコンの楽しみ方には色々ありますが、その筋の兵[つわもの]だけに許された遊びとして、ハードの改造があります。その昔からいろいろな改造が行なわれてきましたが、中でもCPUの高速化は、処理能力の向上が著しいことと、比較的簡単に行えることから、市販ソフトに頓着しないプログラム自作派の間では広く試みられてきました。古くは
FM-8 に積まれた 68A09 (1.2 MHz) を 68B09 (2 MHz) に変えて、
FM-7並の処理速度を与えたこと、最近では FM-11
を中心にCPUをクロックアップして 2.5~4 MHz
で動かすことが、その代表的なところです。<br />There are many ways to have fun
with a personal computer, but one game reserved for the adept is
customizing your hardware. Many customizations have been tried over the
years. Among them, accelerating your CPU stands out, both because the gain in
processing power is remarkable and because it is relatively easy to do, so it
has been widely tried among DIY-ers who do not get overly concerned about commercial
software. In the past, one might trade the 1.2 MHz 68A09 in an FM-8 for a 2
MHz 68B09, to give it speed comparable to the FM-7. More recently,
overclocking the processor at speeds of 2.5 to 4 MHz has become common,
especially with the FM-11.<br /><br /> 「えっ、 2.5~4MHz だって、そんなに速い
6809ってあったっけ」と思われる方もおられるでしょうが、実は存在するのです。秋葉原や日本橋のチップ屋さんで売っている日立の
63C09 というMPUがそれで、 3 MHz で動作する、
6809のC−MOS版です。ノーマルに従うなら従来の 68B09 (2 MHz) の
1.5倍の処理能力になり、選別して規格外の 4 MHz
で動くものを使えば2倍の処理能力となります。<br />Many people will be
thinking, "Huh? 2.5 to 4 MHz? Are there 6809s that fast?" Yes. There are.
Hitachi's 63C09, available at chip shops in Akihabara and Nihonbashi, is a
C-MOS 6809 that operates at 3 MHz. Standard parts are rated at 1.5 times the
speed of the (2 MHz) 68B09, and if you use a specially selected part, you can
run it out of spec at 4 MHz, to give double the functional speed.
<br /><br /> この 63C09
を使って高速化を行うわけで、基本的にはCPUとクロックを差し換えるだけの簡単なものですが、それだけでは済まないこともあります。
FM-8, 7, 77D1/D2/L4, 77AV, 11
のようにCPUがソケットに差さっている機種では単純に差し換えるだけですが、
FM-NEW7, 77AV20/40/20EX/40EX
のように基板に直接ハンダづけされている機種ではかなりの腕がないとCPUが引っこ抜けません。また、クロックアップした場合周辺LSIや周辺機器が追いつかないこともあり、その場合はそれらも交換しなければなりませんが、AV系のようにカスタムLSIが多用されている場合は難しいでしょう。
7/AV系のサブシステムのように微妙なタイミングで動いているものでは、クロックアップはかなり困難で、しかも、クロックアップしたが最後、プロテクトやその他の処理に内蔵タイマやソフトウェア的なタイミングを使った市販アップリケーションソフトの多くは全て使いものにならなくなってしまいます。<br />Using
the 63C09 to accelerate your computer may be basically a matter of replacing
the CPU and clock, but it's not always that simple. With models which have
socketed CPUs, such as FM-8, 7, 77D1/D2/L4, 77AV, and 11, it's just a matter
of replacing the chips. But with models that have the parts soldered directly
to the board, such as the FM-NEW7 and 77AV20/40/20EX/40EX, it requires a fair
degree of skill to get the CPU out. Also, when increasing the clock speed, the
support LSI ICs and external peripheral devices may not be able to keep up. In
that case, you must trade those out as well. With models which, like the AV
[audio/video => multimedia] models, use many custom LSI parts, this can be
exceedingly difficult [meaning, impossible]. Where the hardware runs on
delicate timing, as the subsystems of the 7/AV series do, raising the clock
rate is quite difficult and, once done, renders unusable most commercial
application software which uses
internal timers or software timing for processes such as [copy]
protection. <br /><br /> さて、これら様々な困難を伴った高速化ですが、そのもたらす結果は苦労を補って余りあるものです。<br />So,
accelerating your computer comes with a wide range of problems, but the
results more than make up for the effort.<br /><br /> ちなみに FM-11
の場合を例にとると、CPUの差し換えだけだと 2.5~3 MHz
ぐらいが限界のようで、それ以上を望むと一部周辺LSIの差し換え等が必要なようです。
11ではサブシステムの高速化も可能で、手を加えれば 4 MHz までいけます。とくに、
4 MHz化された FM-11
のサブシステムの表示速度は目を見張るものがあり、漢字の表示速度が漢字VRAMをもった
FM16β と大差ない速さになります。<br />As an aside, taking the FM-11 as an
example, just swapping out the CPU seems to top out at around 2.5 to 3 MHz. Getting
more out of it will require changing out some of the support LSI and
peripherals, etc. With the 11, the subsystem can also be accelerated so that,
with a little effort, 4 MHz speeds can be obtained. Accelerated to 4 MHz, the
FM-11 subsystem's display speed is astounding, with Kanji display speed not
very different from the Kanji VRAM-equipped FM16β.<br /><br />*************<br />拡張機能の発見<br />Discovering
the Functional Extensions<br /><br /> というわけで、私の周りの歴戦の勇士たちは次々と
FM-11 に高速化改善を行ったのですが、そこに1つ、奇妙な問題が発生しました。<br />Which
is why the brave wizards around me, one after another, accelerated their
FM-11s. But a rather curious problem occurred.<br /><br /> コマスのワープロWPV3が動かなくなってしまったのです。最初は、前述したソフト的なタイミングの問題か何かだと思われたのですが、驚いたことに、クロックを
2 MHz に落としても動きません。<br />Comas's word processor, WPV3, would not
work. At first, we suspected the sort of software timing issues I mentioned
above, but, to our surprise, when we brought the clock back down to 2 MHz, it
still didn't run.<br /><br /> そこで、友人の
Gigo氏を中心に原因探究が始まったのですが、ほどなく 6809
の未定義命令で引っかかっていることがわかりました。未定義命令とは、メーカが発表しているマニュアルで定義されていない命令のことで、建前上はそのような命令を使ってもなんの動作もしないことになっているのですが、実は隠し命令になっていることもままあります。一昔のパソコン雑誌には、よく各社製CPUの隠し命令の解析記事が載っていたりしました。このよう<br />Investigation
began at this point, and, with a friend I'll call Mr. Gigo taking a lead role,
we quickly found that it was getting stuck at unimplemented 6809 instructions.
Unimplemented instructions are instructions that are not defined in the
manufacturer's manuals, and fundamentally should not be expected to do any
[particular] thing, but it's not unusual that they are in fact hidden
instructions. A short time ago, you would find articles detailing hidden
instructions for each manufacturer's CPUs in Pasokon [nascent personal
computer] magazines. Such<br /><br />--------<br />72 Oh!FM 1988-4
警告 CPUを 63C09
に交換した場合、かなりの市販アプリケーション(とくにゲーム)が動作しなくなります。<br />p.
72 Oh! FM 1988-4 Warning: Exchanging the CPU for the 63C09 will result
in many commercial applications (especially games) failing to run.<br />--------<br />[Oh!FM
1988-4: p.73]<br /><br />な隠し命令は、 6809
にもそう大したものではありませんでしたがありました。さて、未定義命令の扱いが
63C09 と 6809 とでは異なるということは、隠し命令も異なる可能性があるわけです。
63C09 の出荷開始時期 (1985年秋)
を考えても、何か機能が追加されても当然なくらいで、「もしかしたら」の期待がわきました。<br />hidden
instructions on the 6809 are not very special, but do exist. Differences in
the effects of unimplemented instructions probably imply differences in
hidden instructions. Considering the timing of the introduction of the 63C09
(Fall 1985), it might even be expected that some additional functionality was
included. We began to get excited about the what-ifs.<br /><br /> 引っかかっているコードの1つに「$1F,
$62」というものがありました。命令自体は TFR
(レジスタ間のデータ転送命令)でおなじみのものですが、未定義レジスタから Y
レジスタへ転送するように指示されています。 6809 の場合は未定義なので Y
レジスタに $FFFF が返りますが、63C09 の場合は Y
レジスタにはめちゃくちゃな[値]が入ります。試みに、 Y
レジスタから未定義レジスタ番号にデータ転送してから未定義レジスタ番号から Y
レジスタへ戻してやると、元のデータがちゃんと残っていました……。つまり、 63C09
の未定義レジスタ番号は、番号が余って未定義となっていたのではなく、実在するレジスタを指す番号だったのです!<br />One
of the codes it was getting stuck at was $1F, $62. The instruction itself is
the familiar TFR (register-to-register transfer) instruction, but it specifies
copying the value from an unimplemented register to the Y register. On the
6809, it is unspecified, but the value $FFFF is returned to Y. On the 63C09, Y
gets loaded with an absurd value. As a test, when a [known] value was
transferred from Y to the unimplemented register number and then returned from
the unimplemented register number to Y, the value nicely left in Y was the
original value .... In other words, on the 63C09, the unimplemented register
number was not just a leftover number specifying an unimplemented register, it
specified an actually existing register! <br /><br /> その隠しレジスタを発見した
Gigo氏は、狂喜してかたっぱしから友人に電話をかけまくりました。そして、私のところにも夜も丑三つ[うしみつ]時[dead
of the night]過ぎにかかってきました……。話の内容は、「63C09
にはレジスタが余計にある。レジスタがあるからには命令もあるはずだ。みんなで手分けして調べよう」というもので、それから隠し命令を探す日日が始まり、ディスアセンブル表の割り当てのないコードをデバッガでメモリ上に書き、ブレークポイントを設定して
TFR
でレジスタに値をセットして実行させ、レジスタの内容を見るという単調な作業が繰り返されました。その日わかった結果を、パソコン通信を通じて情報交換するうちに、いつとはなしに、
FM-11 で OS-9 をやっている、それもほとんど病気に近いマニアが集まり、「63C09
解析委員会」なる集団が自然発生しました。<br />Mr. Gigo, who discovered the
hidden register, in his rapture, immediately called all his friends one after
another. It was beyond the dead of night when he called me. The substance of
his call was, "The 63C09 has extra registers. If it has registers, it ought to
have instructions, too. Let's split up the work and see what we can find."
Thus began long days of the monotonous job of hunting hidden instructions by
using a debugger to repeatedly write codes that are unallocated in the
disassembly table into memory, set breakpoints and use TFR to set values in
registers and execute the instructions, and check the register contents. Each
day we would share our results, communicating by personal computer networks,
and, before we knew it, a group of almost ill maniacs who run OS-9 on their
FM-11s had naturally coalesced into the "63C09 Survey Committee".<br />[The
grammar in the Japanese here is a bit murky. I think he should have said
something like, "... by repeatedly selecting unallocated codes from the
disassembly table and using the debugger to create test routines by (1)
writing them into memory sandwiched between TFR codes which set values in
registers and then extract the values to check them, (2) setting breakpoints,
and (3) executing the test routines". But the language in the article is not
that precise, and I can't find enough clues to justify that much
interpolation. If you've done this, you know what he means. If you haven't,
you'd need a full article, blog post, or video describing the process. It has
been done. Might be fun to do my own, if I had the time.]<br /><br /> マニアの執念は恐ろしいもので、ほどなく
63C09 の拡張機能の大筋が判明しました。概略を述べると、<br /> ・3種類のレジスタが増設されており、そのうちの1つはアキュムレータとして、またインデックスレジスタとして使える<br /> ・32÷16
ビット除算、 16÷8 ビット除算、 16×16
ビット乗算、レジスタ間演算、ビット操作、ブロック転送などの命令が拡張されている。<br /> ・未定義の命令を検出した場合トラップがかかる<br /> ・6809
コンパチのモードと、 63C09 本来のモードの2種類の動作モードをもつ<br />といったところで、今までの
6809
で不便であった部分、弱点であった部分が相当改善されており、またとても8ビットMPUとは思えない強力な機能も含まれています。<br />Fanatical
obsession can be scary. Before long, we had the general outline of the
extended functionality fleshed out. In brief,<br /><br />* 3 classes of
registers have been added, one of which can be used either as an accumulator
or as an index register;<br />* the extended instruction set includes 32-by-16
bit divide, 16 by 8 bit divide, 16 by 16 bit multiply, register-register math,
bit operations, block moves, and such;<br />* a trap is triggered when
unimplemented instructions are found;<br />* and there are two operating
modes, a 6809 compatible mode and a 63C09 native mode.<br /><br />Many of the
inconveniences and outright weaknesses of the 6809 are considerably
improved, and functionality that is hard to imagine existing in an 8-bit
MPU is included.<br /><br /> これらの解析結果はNANNO−NETを皮切りに、いくつかの
BBS
にアップされました。多くのネットワーカーの方から多大な反響を得ましたが、より多くの方に「8ビットを超える8ビットMPU」
63C09 の全貌[ぜんぼう=entire body]を知っていただくために、 Oh!FM
の誌上お借りしてご報告します。<br />The results of our analyses have been
uploaded to several BBSes, beginning with NANNO-NET. They have been
well-received among networkers [among netizens? in the on-line community],
but now we are borrowing magazine space in Oh!FM to make the full nature of
the more-than-8-bit 63C09 8-bit MPU known among a broader audience.<br /><br /><br />*******************<br />63C09化の<br />メリット/デメリット<br />Considerations
in <br />Converting to the 63C09<br />[Merits/demerits =>
Advantages/disadvantages => Considerations]<br /><br /> 日立から販売されている
HD63C09 は、モトローラの MC6809
とピンコンパチの8ビットMPUです。MPUの仕様は 6809
に拡張機能を付け加えた形のもので、
6809の上位コンパチになっています(未定義の命令を除く)。日立から公式発表はされていませんが、
63C09 の拡張機能を活用すると、
6809パソコンの処理能力を大幅に向上させることができます。<br />Hitachi's 63C09
is a 6809 pin-compatible 8-bit MPU. It is specified as upward compatible with
the 6809 (excepting 6809 unimplemented op-codes) with extended functionality.
The functionality is not publicly acknowledged by Hitachi, but using the
extended functionality of the 63C09 can yield significant improvements over
the performance of the 6809.<br /><br /> 6809パソコンのMPUを 63C09
に差し換えた場合のメリットは処理能力の向上につきます。その要因としては以下の3点が考えられます。<br />The
merits of replacing the MPU in a 6809-based personal computer are found in the
improved performance. These three factors can be considered:<br /><br /> 1 高速クロック<br /> 2 拡張命令/拡張レジスタ<br /> 3 ネイティブモード<br /><br />1:
Higher-speed clock<br />2: Instruction/register extensions<br />3: Native
mode<br /><br /> 1は当然のことで、MPUの動作クロックをあげればソフトの実行速度は上がります。未定義命令トラップやソフトウェアタイマの関係で引っかかる一部を除けば、従来のソフトが高速に動かせます。動作クロックの上昇率はハードにより異なり、場合によってはほとんどあげられないこともあります。<br />The
merits gained from the first factor are a matter of course -- if you raise the
MPU's processor clock rate, programs run faster. Except for certain programs
that will have problems related to unimplemented instruction traps, software
timers, and such, existing software runs faster unchanged. How fast the clock
can run depends on the hardware, and there are some cases where the clock can
hardly be raised at all.<br /><br /> 2は、新規にソフトを書き起こすか、従来のソフトにパッチを当てたときに効果があります。従来の
6809 だとアセンブラのマクロ機能で表現していた処理の相当が 63C09
の1命令で書けるようになり、マシンサイクルを短縮できます。<br />Merits due to
the second factor only come into play when writing new software or
patching existing software. Many operations that would be expressed as macros
in the existing 6809 instruction set are performed by a single instruction on
the 63C09, reducing machine cycle count.<br /><br /> 3は、
63C09固有のモードに切り換えて使うと、通常のエミュレーションモードよりも命令の実行サイクルが短くなることにより生じるものです。このモードを使うと、同じ動作クロックでも、通常より実行速度が最大20〜30%上がります(アドレッシングモードにより効果が違う)。ただ、スタックや割り込み関係で動作が異なる点があるので、ネイティブモードを利用するには
F-BASIC なり OS-9 なりのシステムの一部を書き換える必要があります。<br />Merits
from the third factor are obtained by switching to native mode, in which
instruction execution cycles are reduced over the normal emulation mode. When
using this mode, run speed can increase as much as 20 to 30%, even at the same
operation clock (varies according to addressing mode). However, due to
differences in handling the stack and interrupts, in order to use native mode,
parts of your system (F-BASIC, OS-9, etc.) must be rewritten.<br /><br /> 逆にデメリットとしては、まず、
6809
の未定義命令を使ったソフトに限らず、多くの内蔵タイマを使った市販ソフトその他が動作不良を起こしてしまうであろうこと、また、
63C09
の拡張機能を活用するにはそれらを活用するための開発ツールを自作できるくらいのそれなりの腕が必要で、ノービスには難しいことが難点といえるでしょう。<br />Looking
on the other hand at the demerits, first off, it is not only software that
uses unimplemented 6809 instructions that will break; much commercial software
which uses internal timers will also tend to malfunction. Also, taking
advantage of the 63C09 extensions requires enough skill to build your own
development tools,
which will be a point of difficulty for the novice user. <br /><br /> メリットとデメリットを比較すると、現状では自分でプログラム書くだけの人ならその恩恵を受けることができるが、ごく一般のゲームユーザやアプリユーザは決して手を出さないほうがよい、といったところでしょう。<br />Comparing
the merits and demerits, under the present conditions those who can write
their own programs can receive the benefits of the 63C09, but game and
application users in general should pass this one by.<br /><br /> それでは、
63C09 の拡張機能について以下順番に解説していきます。<br />With that, we shall
proceed to describe the extended functionality of the 63C09.
</p>
<p>
**********************<br /><i>[The essential content of the technical description is already available at
<br /><br /><a href="http://www.sandelman.ottawa.on.ca/People/Alan_DeKok/interests/6309.techref.html">http://www.sandelman.ottawa.on.ca/People/Alan_DeKok/interests/6309.techref.html</a>.<br /><br /><strike>I may go ahead and translate what the article has, for completeness. But
the above link will provide more correct information, now. So, if I
do,</strike>
I'll be working on the technical sections at lower priority.]<br /></i>**********************<br /><i></i>
</p>
<p>
************<br />拡張レジスタ<br />Extension registers<br /><br /> 63C09
では 6809 よりレジスタの数が3つ増えています(図1)。そのうち2つは
16ビットのレジスタで、もう1つは8ビットのモードステータスレジスタです。<br />The
63C09 has three more registers than the 6809 (Table 1). Two of those registers
are 16-bit registers, and one is an 8-bit mode/status register.<br /><br /><br />Wレジスタ[16ビット]<br />W
Register (16 bit)<br />~~~~~~~~~~~~~~~~~~~~<br /> アキュムレータとしても、インデックスレジスタとしても使用できる
16ビットレジスタです。<br />16 bit register useable as either an accumulator
or an index register.<br /><br /> アキュームレータとして使うときは、 16ビ<br />When
used as an accumulator,<br /><br />--------<br />Oh!FM 1988-4 73<br />Oh! FM
1988-4 p. 73<br />--------<br />[Oh!FM 1988-4: p.74]<br /><br />ットレジスタとしてのほか、2つの8ビットレジスタ(E/Fレジスタ)に分割して使うこともできます。ちょうど、既存の
D/A/Bレジスタがもう1組増えたようなものです。ただし、 AND/OR 等の W/E/F
レジスタでは使えない命令もあります。<br />in addition to being useable as a
16-bit register, W may be split and used as two 8-bit registers (E/F). This is
just like having an extra D/A/B register, except that there are instructions
such as AND and OR that cannot be used with W/E/F.<br /><br /> また、既存の
Dレジスタと連結して
32ビットレジスタ(Qレジスタ)として使うことができ、乗除算のときに利用します。<br />It
can also be concatenated with the existing D register and used as a 32-bit
register (the Q register), useful in multiplication and division operations.<br /><br /> インデックスレジスタとして使うときは、既存の
X/Yレジスタと同様に利用します。この場合、 6809
でポストバイトに用いられていないビットパターンを使用します。Wレジスタをインデックスレジスタとして使用したときの、アキュムレータオフセットと5ビットオフセット、8ビットのコンスタントオフセットはありません。
</p>
<p>
W functions similarly to the existing X and Y registers when used as an index.
In this case, it uses a post-byte bit pattern that is not used on the 6809.
However, when using the W register as an index, accumulator, 5-bit, and 8-bit
offsets are not available.<br /><br /> また、特徴な使い方として、ブロック転送でのカウンタレジスタとして使う方法があります。<br />As
a specialized use, W is used as the counter in block transfer operations.<br /><br />Vレジスタ[16ビット]<br />V
Register (16-bit)<br />~~~~~~~~~~~~~~~~~~~~<br /> Vレジスタを使う命令は、レジスタ間演算命令や
TFR
などに限られています。Vレジスタの特徴は、MPUをリセットしてもレジスタの値が変化しないことです。このレジスタをOSなどで定数等を保持するような目的に便利でしょう。<br />Instructions
that use the V register are limited to register-register math and TFR, etc.
The special feature of the V register is that its value is not changed, even
when you reset the MPU. This register could be useful for tracking constants
and such, for example, in operating system code.<br /><br />MDレジスタ[8ビット]<br />MD
Register (8-bit)<br />~~~~~~~~~~~~~~~~~~~~<br /> モード/ステータスビットレジスタの略で、除算実行時のエラー検出や未定義命令トラップの作動チェック、動作モードの設定など、
63C09
になって増えたモードやステータスの表示に用いられます。各ビットの意味は次のとおりです。<br />Abbreviated
MD for mode/status bit register, this register is for accessing the modes and
status that the 63C09 adds, including testing for divide-time errors and
checking for undefined instruction traps, and for setting the operating mode,
etc.<br />
</p>
<pre> ・ビット7 R 除算で 0 で割ったときに 1 がセットされる
・ビット6 R 未定義命令をフェッチしたときに 1 がセットされる
・ビット1 W FIRQ時のレジスタの退避モード設定ビット
0 -> FIRQ時、 PC と CC のみスタックに退避
1 -> FIRQ時、 すべてのレジスタを退避
・ビット0 W 動作モード設定ビット
0 -> エミュレートモード
1 -> ネイティブモード
</pre>
<pre> * bit 7 (R): set to 1 on divide-by-zero
* bit 6 (R): set to 1 on an undefined instruction fetch
* bit 1 (W): bit to set the FIRQ register save mode
0 -> on FIRQ, only save PC and CC
1 -> on FIRQ, save all registers
* bit 0 (W): bit to set the operating mode
0 -> emulation mode
1 -> native mode
</pre>
<p>
なお、リセット時にはすべてのビットは 0 になります。<br />On reset, all bits
are set to 0.
</p>
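<p>
[Not from the article: a small Python sketch of the MD bit assignments listed above, as I read them. The constant and function names are my own invention, not Hitachi's, and the sketch only interprets an 8-bit value; it says nothing about how the real part reads or writes the register.]
</p>

```python
# Bit masks for the MD register, following the list in the article.
# All names here are hypothetical, chosen only for this illustration.
MD_DIV_BY_ZERO = 0x80   # bit 7 (read): set to 1 on divide-by-zero
MD_UNDEFINED_OP = 0x40  # bit 6 (read): set to 1 on an undefined instruction fetch
MD_FIRQ_ALL = 0x02      # bit 1 (write): 1 = save all registers on FIRQ
MD_NATIVE = 0x01        # bit 0 (write): 1 = native mode, 0 = emulation mode


def describe_md(md):
    """Return a dict naming which of the four documented MD bits are set."""
    return {
        "divide_by_zero": bool(md & MD_DIV_BY_ZERO),
        "undefined_instruction": bool(md & MD_UNDEFINED_OP),
        "firq_saves_all": bool(md & MD_FIRQ_ALL),
        "native_mode": bool(md & MD_NATIVE),
    }


# After reset every bit is 0, which means emulation mode with the short FIRQ frame.
reset_state = describe_md(0x00)

# The value one would write to select native mode with full register save on FIRQ.
native_full_firq = MD_NATIVE | MD_FIRQ_ALL
print(hex(native_full_firq), describe_md(native_full_firq))
```

<p>
[Keeping the masks as named constants is just a convenience; the article itself only gives the bit numbers.]
</p>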
<p> </p>
<p></p>
<pre>図1 63C09レジスタ構成
__________________________________________________
|----------Q----------|
|----D----| |----W----|
|-A-| |-B-| |-E-| |-F-| アキュムレータ
----------- |----X----| X インデックスレジスタ
----------- |----Y----| Y インデックスレジスタ
----------- |----U----| ユーザスタックポインタ
----------- |----S----| システムスタックポインタ
----------- |---PC----| プログラムカウンタ
----------- |----V----| V(alue) レジスタ
----------- |---DP----| ダイレクトページレジスタ
----------------- |CC-| コンデションコードレジスタ
----------------- |MD-| モード/ステータスレジスタ
__________________________________________________
</pre>
<pre> </pre>
<pre>Table 1: 63C09 Registers
_________________________________________________
|----------Q----------|
|----D----| |----W----|
|-A-| |-B-| |-E-| |-F-| Accumulators
----------- |----X----| X index register
----------- |----Y----| Y index register
----------- |----U----| User stack pointer
----------- |----S----| System stack pointer
----------- |---PC----| Program counter
----------- |----V----| V(alue) register
----------- |---DP----| Direct page register
----------------- |CC-| Condition code register
----------------- |MD-| Mode/status Register
__________________________________________________
</pre>
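<p>
[Not from the article: a rough Python sketch of how the diagram nests the accumulators, assuming the diagram reads most-significant first from left to right (A above B in D, E above F in W, D above W in Q). The helper names are mine.]
</p>

```python
def join16(hi, lo):
    """Combine two 8-bit halves into one 16-bit value (A:B as D, E:F as W)."""
    return ((hi & 0xFF) << 8) | (lo & 0xFF)


def join32(d, w):
    """Combine D (assumed high word) and W (assumed low word) into 32-bit Q."""
    return ((d & 0xFFFF) << 16) | (w & 0xFFFF)


# Sample register contents, chosen only to make the nesting visible.
a, b, e, f = 0x12, 0x34, 0x56, 0x78
d = join16(a, b)
w = join16(e, f)
q = join32(d, w)
print(f"D=${d:04X} W=${w:04X} Q=${q:08X}")
```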
<p>
<br />*********<br />動作モード<br />Operating Modes<br /><br /> 63C09には2つの動作モードがあります。1つは
6809 とのコンパチビリティを考えたエミュレートモードで、もう1つは 63C09
の本来の機能を引き出すネイティブモードです。<br />The 63C09 has two operating
modes. One is the 6809 compatibility emulation mode, and one is the 63C09
native mode that brings out its own functionality.<br /><br /> と、いうと、「拡張レジスタや拡張命令が使えるのがネイティブモードで、使えないのがエミュレートモードだな」と思われる方もいるでしょうが、それはハズレです。
63C09 では、拡張レジスタと拡張命令をどちらのモードでも使えます。<br />That
said, some will think, "The mode which makes the extension registers and the
instructions available is the native mode, right?" But that is not correct.
The extension register and instruction set are accessible in either mode.<br /><br /> エミュレートモードとネイティブモードの違いは、インタラプト時のスタックの扱いの違いです。インタラプトがかかったときレジスタの内容はスタックに退避されますが、そのとき従来からのレジスタだけを退避させるのがエミュレートモードで、拡張レジスタの
W レジスタも退避させるのがネイティブモードです。<br />The difference between
emulation mode and native mode is how the stack is handled at interrupt time.
The register contents are saved on interrupt, but the mode that saves only the
original register set is the emulation mode, and the mode that also saves the
extension register W is the native mode.<br /><br /> 63C09
をリセットした直後はエミュレートモードに設定されています。このモードでは 6809
のソフトが問題なく動作する代わり、マルチタスクで拡張レジスタを使うときに気をつけなければなりません。たとえば、拡張レジスタの
W レジスタをカウンタに使うブロック転送命令 TFM
を使ったケースを考えます。タスクAで TFM
命令を使用中に、インタラプトをかけて、タスクBに移ったとしましょう。そのときにタスクBで
W レジスタを使用したとしたら、また元のタスクAに戻ったときに W
レジスタの中身が変更されていますので、誤動作を起こします(もちろんシングルタスクで使用するときや、マルチタスクでも1つのタスクでしか拡張レジスタを使用しないときは問題ありません)。よって、エミュレートモードでは、拡張レジスタを使用するときはいちいちインタラプト禁止して、さらにこのレジスタを一度スタックにセーブしてからでないと、別タスクに切り換えてはなりません。<br />Immediately
after reset, the 63C09 is in emulation mode. In this mode, software for the
6809 executes without problems, but you must be careful when using the
extension registers in multitasking. For example, consider using the block
transfer instruction TFM, which uses the extension register W as the counter.
You interrupt task A in the middle of using the TFM instruction, and switch to
task B. If you use W in task B, when you return to task A the contents of W
will be changed, which will cause malfunction. (Of course, when using the
processor only in single tasking, or if you limit use of the extension
registers to one task only, there is no problem.) Thus, when using the
extension registers in emulation mode, you must disable interrupts each time
you use them, and you must not switch to another task without first saving
the extension registers on the stack.<br /><br /> これでは、高速ゲームや OS-9
から拡張レジスタを使いづらいうえ、使いにくい拡張命令も発生します。そのために、拡張レジスタと拡張命令を使うことを前提にしたモード、ネイティブモードが存在します。このモードでのインタラプトは
PC, U, Y, X, DP, W, D, CC
の順にスタックにレジスタを退避して割り込み処理に入ります。ここで注意してほしいのは、
W は DP と D の間にあるということです。これは、 D と W の 32ビットレジスタペア
Q としてスタックに退避するという意味です。<br />Because of this, not only are
the extension registers difficult to use in high-performance games and OS-9,
but certain instructions also become problematic to use. This is why the
native mode exists as a mode in which the extension registers are assumed to
be in use. In this mode, interrupt processing is entered after saving the
registers on stack in the order PC, U, Y, X, DP, W, D, and CC. What you should
notice here is that W is saved between DP and D. What this means is that D and
W are saved together as the register pair Q.<br /><br /> ネイティブモードの特徴としては、もう1つ、命令のマシンサイクル短縮があげられます。その結果、アドレシングモードによって
20 ~ 35%高速に動作します。<br />One more feature of native mode is that
instruction machine cycles are shortened. As a result, depending on the
addressing mode, the processor operates 20 to 35% faster.<br /><br /> とくにダイレクト、エクステンド、インヘラント、で顕著[けんちょ=remarkable,
striking, conspicuous]にその効果が現れます。<br />This effect is particularly
conspicuous in direct, extended, and inherent addressing modes.<br /><br />--------<br />74 Oh!FM
1988-4<br />p. 74 Oh!FM 1988-4<br />--------<br />[Oh!FM 1988-4: p.75]<br /><br /> なお、ネイティブモードでも
V レジスタと MD レジスタはその性格上退避されませんので注意してください。<br />Take
note that the V and MD registers, due to their nature and expected uses, are
not saved.<br /><br /> エミュレートモードからネイティブモードに移行するには、新設された
MD レジスタのビット 0 (LSB) に 1 を書き込むことによって実現します。<br />Changing
from the emulation mode to native mode is accomplished by writing a 1 to bit 0
(LSB) of the newly provided MD register.<br /><br /> さて、 63C09
のモードには上の2つのほかに、 FIRQ
のスタック退避モードが用意されています。ご存知のように、 6809 では FIRQ
をフェッチすると、 PC と CC
のみをスタックに退避してインタラプト処理ルーチンへ分岐します。しかし、制御用のボードマイコンの場合、
FIRQ より IRQ がもう1つあったほうが便利なケースがあります。しかし、 63C09 は
6809
とピンコンパチを[?謳=うた?]っていますので、足の配置を変えるわけにはいきません。そこで、
FIRQ を IRQ
として使用できるように、スタックの退避をすべてのレジスタが行うようにモードをソフトで切り換えられるようになっています。
FIRQ を IRQ として使用する場合は MD レジスタのビット 1 に 1
を書くことによって実現します。<br />The 63C09 has, in addition to the two
modes mentioned above, modes for saving state to the stack during FIRQ. As you
know, when the 6809 fetches [sic] a FIRQ, it only saves the PC and CC to
stack before jumping to the interrupt processing routine. However, when
building a single-board controller, there are cases when it is more convenient
to have an extra IRQ than to have a FIRQ. But since the 63C09 asserts pin
compatibility with the 6809, it wouldn't do to change the layout of the chip's
feet. To that purpose, so that the FIRQ can be used as an IRQ, you can make a
software switch to a mode in which the full register set is saved on FIRQ.
When using FIRQ as an IRQ [style interrupt], switch to this mode by writing a
1 to bit 1 of the MD register.
</p>
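<p>
[Not from the article: a Python sketch that tabulates which registers land on the stack under the mode bits described above. It assumes that an FIRQ with MD bit 1 set stacks the same frame an IRQ would in the current mode, which the article implies but does not spell out. Function and table names are hypothetical.]
</p>

```python
# Register widths in bytes, used only to total up the frame size.
WIDTH = {"PC": 2, "U": 2, "Y": 2, "X": 2, "DP": 1, "W": 2, "D": 2, "CC": 1}


def pushed_on_interrupt(md, firq=False):
    """List the registers pushed, in push order, for a given MD value.

    Bit 0 of md selects native mode; bit 1 selects the full frame on FIRQ.
    """
    native = bool(md & 0x01)
    firq_all = bool(md & 0x02)
    if firq and not firq_all:
        return ["PC", "CC"]          # classic short FIRQ frame
    regs = ["PC", "U", "Y", "X", "DP"]
    if native:
        regs.append("W")             # W sits between DP and D
    return regs + ["D", "CC"]


def frame_bytes(regs):
    """Total number of bytes the listed registers occupy on the stack."""
    return sum(WIDTH[r] for r in regs)


for md in (0x00, 0x01):
    regs = pushed_on_interrupt(md)
    print(f"MD=${md:02X}: {', '.join(regs)} ({frame_bytes(regs)} bytes)")
```

<p>
[The two-byte difference between the frames is one reason system code that walks the interrupt stack has to be touched before switching to native mode.]
</p>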
<p><br /></p>
<p>
********<br />トラップ<br />Traps<br /><br /> 63C09
は以下の現象が発生したときにトラップがかかります。<br />The 63C09 takes a trap
under the following conditions:<br /><br /> 1 未定義命令がフェッチされたとき<br /> 2 除算命令の
DIV 命令で 0 で割ったとき<br />1: When an unimplemented/undefined instruction
is fetched<br />2: When a DIV instruction attempts to divide by 0
<br /><br /> トラップがかかると、エミュレートモードでは PC, U, Y, X, DP, B,
A, CC の順に、ネイティブモードでは PC, U, Y, X, DP, W, B, A, CC の順に S
レジスタにレジスタをプッシュした後、 $FFF0
のアドレスに書いてあるベクタに分岐します ($FFF0 は 6809 では
RESERVE)。このトラップはリセットの次の割り込み優先度があります。なお、未定義命令かゼロ・ディバイドかを判定する命令として
BITMD 命令があります。<br />When a trap is taken, after the internal registers
are pushed on the S stack in order PC, U, Y, X, DP, B, A, CC in emulation
mode and PC, U, Y, X, DP, W, B, A, CC in native mode, the processor jumps to
the address stored at vector $FFF0 (specified as RESERVED on the 6809). This
trap has the next-highest interrupt priority after reset. We
have the BITMD instruction to distinguish between an undefined instruction and
a zero divide.<br /><br /> このトラップのため、未定義命令を使っている 6809
のソフトが動作しなくなりますが、代わりに
OS-9/68000等で使われているトラップライブラリを組めるようになります。たとえば、未定義命令に浮動小数点演算プロセッサの呼び出しを割り当てておくと、その命令を未定義命令トラップに引っ掛け、処理ルーチンに飛ばすことが可能になります。このトラップライブラリを利用すると、オブジェクトのサイズをかなり縮められるので便利でしょう。<br />Because
of these traps, 6809 software that uses undefined instructions will fail to
function. On the other hand, the trap allows trap libraries of the sort that
are used in OS-9/68000 and others. For example, if we allocate calls to a
floating point [co]processor to an unimplemented instruction, when that
unimplemented instruction is trapped, we can use the trap to jump to the
handler routines. Using such a trap library should help significantly reduce
code size.
</p>
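<p>
[Not from the article: a toy Python model of the trap-library idea. The handler first works out why the trap fired from MD bits 6 and 7 (the job the article gives to BITMD), then looks up a service routine. The service numbers and routines are made up for illustration and are not real opcodes.]
</p>

```python
def trap_cause(md):
    """Classify a trap from the MD flags: bit 7 = zero divide, bit 6 = undefined op."""
    if md & 0x80:
        return "divide-by-zero"
    if md & 0x40:
        return "undefined-instruction"
    return "unknown"


def make_trap_library():
    """Hypothetical table of services reached through the undefined-instruction trap."""
    return {
        0: lambda x, y: x + y,   # stand-in for a floating point add service
        1: lambda x, y: x * y,   # stand-in for a floating point multiply service
    }


def handle_trap(md, service, library, *args):
    """Dispatch to a library routine, or fail loudly for any other trap."""
    cause = trap_cause(md)
    if cause == "undefined-instruction" and service in library:
        return library[service](*args)
    raise RuntimeError(f"unhandled trap: {cause}")


library = make_trap_library()
print(handle_trap(0x40, 0, library, 1.5, 2.0))
```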
<p> </p>
<p>
********<br />拡張命令<br />Instruction Extensions<br /><br /> 63C09
拡張命令には、既存の命令の対応レジスタを増やした追加命令と新規に設けられた新設命令に分けられます。<br />63C09
instruction set extensions can be divided into two classes -- additional
versions of existing instructions for working with the added registers, and
entirely new instructions.<br /><br /> 新設命令としては、レジスタ間演算命令や、ブロック転送命令、乗算/除算命令、ビット操作命令、ビット演算/転送命令等の命令があります。<br />The
newly added instructions include register-to-register math, block moves,
multiplication and division, bit operators, and bit math/extraction
instructions. <br />
</p>
<p>
<br />追加命令<br />Additional Instructions<br />~~~~~~~~<br /><br /> 63C09
では、既存の命令も拡張されていて、対応するレジスタが増えています。<br />In the
63C09, some existing instructions have been extended, widening the range of
registers they can operate on.<br /><br /> たとえば、今までありそうでなかった
TSTD, ADCD
などが追加されています。これらは従来でもアセンブラ上でマクロを使って表現できましたが、これらを使うことによりマシンサイクルを短縮できます。<br />For
example, instructions that seemed like they ought to exist but did not, such
as TSTD and ADCD, have been added. These operations could be expressed in
existing 6809 assembler as macros, but using the new instructions will reduce
machine cycle counts.<br /><br /> また、 ADD や SUB などの命令では、 E/F/W
レジスタが増えたことにより、それに対応する命令が増えています。いわば、 A/B/D
レジスタがもう1組増えたようなもので、プログラミングの柔軟性が増します。ただし、
A/B/D
レジスタで使える命令がすべて対応しているわけではありませんので注意してください(図2)。<br />In
addition, the number of registers has increased by E/F/W, and versions of
instructions such as ADD and SUB have been added for the new registers. It's
as if another set of the registers A/B/D has been added, and programming
flexibility has improved accordingly. However, you should be aware that not
all instructions that exist for A/B/D have their counterparts for E/F/W (Table 2).<br /><br /> 既存の命令の中で追加の度合いが大きいのは
TFR と EXG 命令でしょう。 TFR, EXG
命令では、対象レジスタの指定にポストバイトのビットパターンを用います。 63C09
ではレジスタが増えていますので、そのビットパターンの組み合わせも増えていて
(0110->W, 0111->V, 1110->E, 1111->F),
レジスタアドレッシングとでもいったらよい状態になっています。このレジスタアドレッシングは新設命令のレジスタ間演算でも使用しています。ここで注意しなければいけないのは、本当の未定義レジスタ番号を指定した場合、
63C09 と 6809 とでは動作が異なるということです。<br />Among existing
instructions, two that are significantly impacted are TFR and EXG. These both
use a postbyte to specify the registers that will be affected. There are more
registers in the 63C09, so the applicable bit patterns have also increased
(0110->W, 0111->V, 1110->E, 1111->F). The scheme might as
well be called register addressing, and this register addressing is also used in
the newly added register-to-register math instructions. Something to be
careful of here is that the results of specifying a register that does not
exist will differ between the 63C09 and the 6809.<br />
</p>
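<p>
[Not from the article: a Python sketch that decodes a TFR/EXG postbyte, using the standard 6809 register codes plus the four added codes given above. The high nibble is taken as the source and the low nibble as the destination, which matches the article's reading of $1F,$62 as a transfer into Y. Function and table names are mine.]
</p>

```python
# Register codes defined for the 6809 TFR/EXG postbyte.
REG_6809 = {
    0x0: "D", 0x1: "X", 0x2: "Y", 0x3: "U", 0x4: "S", 0x5: "PC",
    0x8: "A", 0x9: "B", 0xA: "CC", 0xB: "DP",
}
# Codes the article reports as added on the 63C09.
REG_63C09_EXTRA = {0x6: "W", 0x7: "V", 0xE: "E", 0xF: "F"}


def decode_postbyte(postbyte, extended=True):
    """Return (source, destination) names; '?' marks a code with no register."""
    table = dict(REG_6809)
    if extended:
        table.update(REG_63C09_EXTRA)
    src = table.get((postbyte >> 4) & 0x0F, "?")
    dst = table.get(postbyte & 0x0F, "?")
    return src, dst


# The postbyte from the code that tripped up the word processor.
print(decode_postbyte(0x62))
print(decode_postbyte(0x62, extended=False))
```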
<pre><code>
図2 アキュムレータで行える処理
Table 2: Operations that can be performed on the accumulator(s)
____________________________________
| A | B | E | F | D | W | Q |
CLR | o | o | o | o | o | o | |
INC | o | o | o | o | o | o | |
DEC | o | o | o | o | o | o | |
TST | o | o | o | o | o | o | |
COM | o | o | o | o | o | o | |
NEG | o | o | | | o | | |
SEX | o*| o*| | | o*| o*| |
ASL/LSL| o | o | | | o | | |
ASR | o | o | | | o | | |
LSR | o | o | | | o | o | |
ROL | o | o | | | o | o | |
LD | o | o | o | o | o | o | o |
ST | o | o | o | o | o | o | o |
ADD | o | o | o | o | o | o | |
SUB | o | o | o | o | o | o | |
CMP | o | o | o | o | o | o | |
ADC | o | o | | | o | | |
SBC | o | o | | | o | | |
AND | o | o | | | o | | |
OR | o | o | | | o | | |
EOR | o | o | | | o | | |
BIT | o | o | | | o | | |
MUL | o*| o*| | | o | | |
DIV | | | | | | | o |
____________________________________
* ワークとして使用
* Used as working registers.
</code></pre>
<p><br /></p>
<p>
レジスタ間演算命令<br />Register-to-register Instructions<br />~~~~~~~~~~~~~~~~~~<br /><br /> 6809での演算は、ほとんどレジスタ対メモリないしイミディエイト値で行われていました。そのため
A レジスタと B レジスタの値の AND
をとりたい場合は、どちらかのレジスタをメモリ上にストアしてから演算(この場合は
AND)を行わなければなりませんでした。 63C09
ではこれが解決されていてレジスタ同士の演算が可能になりました。これらは TFR や
EXG と同じレジスタアドレッシングを用います。<br />Math and logic operations on
the 6809 were mostly performed memory-to-register. Because of that, when you
wanted to take the AND of register A with register B, you had to store one or
the other register to memory first and then perform the operation (AND in this
case). This is solved in the 63C09 and direct register-to-register math
operations are possible. These use the same register addressing used by TFR
and EXG.<br /><br /> レジスタ間演算命令は以下のようなものがあります。<br />Register-to-register
operations include the following instructions:<br /><br /> ADDR, ADCR, SUBR,
SBCR,<br /> ANDR, ORR, EORR, CMPR<br />
</p>
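<p>
[Not from the article: a very small Python comparison of the two approaches described above. The first function imitates the 6809 workaround of parking one register in a scratch byte of memory; the second imitates a direct register-to-register AND. The names and the scratch address are hypothetical, and nothing here models cycle counts or condition codes.]
</p>

```python
def and_via_memory(a, b, memory, scratch):
    """6809 style: store B to a scratch byte, then AND A with that byte."""
    memory[scratch] = b & 0xFF
    return a & memory[scratch]


def and_register_direct(a, b):
    """63C09 style: combine the two 8-bit registers without touching memory."""
    return a & b & 0xFF


memory = bytearray(0x10000)
via_memory = and_via_memory(0xF0, 0x3C, memory, 0x0080)
direct = and_register_direct(0xF0, 0x3C)
print(f"${via_memory:02X} ${direct:02X}")
```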
<p><br /></p>
<p>
ブロック転送命令<br />Block Move/Transfers<br />~~~~~~~~~~~~~~~~<br /><br /> 6809
でメモリ上のデータを移動させるときは、一度そのデータをレジスタにロードしてきては、それをセーブするということを繰り返して行っていました。これはこれでよいのですが、問題はその処理にかかる時間です。そこで
Z80 や 8086 などにもあるブロック転送命令が、 63C09 にも設けられています。<br />Moving
data in memory with the 6809 involves repeatedly loading part of it into a
register and then saving it. This works, but has the problem of consuming
processing time. This is why the 63C09 includes block move instructions like
those found in processors such as the Z80 and 8086.<br /><br />--------<br />Oh!FM 1988-4
75<br />Oh! FM 1988-4 p. 75<br />--------<br />[Oh!FM 1988-4: p.76]<br /><br /> ブロック転送命令では、転送元アドレス(ソース)、転送先アドレス(ディスティネーション)の指定に
16ビットレジスタの D/X/Y/U/S
レジスタの中から1〜2使います。レジスタの指定にはポストバイトを使い、その形式はレジスタアドレッシングの形式をとります。また、ソースとディスティネーションを同じレジスタでも指定できます。転送するバイト数のカウントには
W レジスタを使います。<br />As the source and/or destination of block moves,
one or two of the 16-bit registers D/X/Y/U/S can be specified. Register
specification uses a postbyte of the same format as register addressing. The
same register can be specified as both the source and destination. The count
of bytes to transfer is specified by the W register.<br /><br /> 転送方法には4種類あり、正方向(TFM
r0+,r1+)/逆方向(TFM r0-,r1-)の通常のブロック転送のほか、 I/O
ポート等のアドレスにデータを次々と流し込むもの(TFM
r0+,r1)、指定ブロックを指定値で塗りつぶすもの(TFM r0,r1+)があります。<br />There
are four modes of transfer; in addition to forward (TFM r0+,r1+) and reverse
(TFM r0-,r1-), a mode that pours data one byte after another into a single
address such as that of an I/O port (TFM r0+,r1), and a mode that can paint
the specified block of data with a specified value (TFM r0,r1+) [sic].
</p>
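<p>
[Not from the article: a Python sketch of the four transfer modes listed above, with a plain loop standing in for the instruction. The count argument plays the role of W, and the sketch assumes the count simply runs down to zero; the mode strings are my own shorthand, not assembler syntax.]
</p>

```python
def tfm(memory, src, dst, count, mode):
    """Simulate a block transfer of `count` bytes and return (src, dst, count).

    mode "++": both addresses advance   (TFM r0+,r1+)
    mode "--": both addresses retreat   (TFM r0-,r1-)
    mode "+.": source advances, destination fixed, e.g. an I/O port (TFM r0+,r1)
    mode ".+": source fixed, destination advances, i.e. a fill     (TFM r0,r1+)
    """
    src_step, dst_step = {
        "++": (1, 1), "--": (-1, -1), "+.": (1, 0), ".+": (0, 1),
    }[mode]
    while count:
        memory[dst] = memory[src]
        src = (src + src_step) & 0xFFFF   # 16-bit address wrap
        dst = (dst + dst_step) & 0xFFFF
        count -= 1
    return src, dst, count


memory = bytearray(0x10000)
memory[0x1000:0x1004] = b"ABCD"
tfm(memory, 0x1000, 0x2000, 4, "++")      # forward copy
memory[0x3000] = 0xAA
tfm(memory, 0x3000, 0x4000, 8, ".+")      # fill eight bytes with $AA
print(bytes(memory[0x2000:0x2004]), memory[0x4000:0x4008].hex())
```

<p>
[For the reverse mode the starting addresses would be the last byte of each block rather than the first.]
</p>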
<p> </p>
<p>
乗算/除算命令<br />Multiply/Divide Instructions<br />~~~~~~~~~~~~~~<br /><br /> 6809
には MUL という 8×8ビットの乗算命令がありましたが、これは A レジスタと B
レジスタの値を掛け合わせるだけのものでした。 63C09
で設けられた16×16ビット乗算命令(MULD)では、いろいろなアドレシングモードが使え、追加というよりは新設に近いものです。<br />The
6809 has one 8 by 8 bit multiply instruction, MUL, which can only multiply the
contents of the A and B registers. The 63C09's 16 by 16 bit multiply, MULD,
can use a variety of addressing modes. More than an extension of existing
instructions, it is a newly added instruction.<br /><br /> また、 63C09
にはそれに加えて16÷8ビット除算(DIVD)、32÷16ビット除算(DIVQ)が設けられて、これらも、いろいろなアドレシングモードが使えるようになっています。<br />Additionally,
the 63C09 has a 16 by 8 bit divide, DIVD, and a 32 by 16 bit divide, DIVQ,
both of which can use a variety of addressing modes.
</p>
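<p>
[Not from the article: a Python sketch of the operand widths involved. It treats all values as unsigned for simplicity and does not model which registers receive the results, since the text above does not say; it only shows that a 16 by 16 product needs 32 bits and that a zero divisor is reported through bit 7 of MD rather than producing a result. All names are hypothetical.]
</p>

```python
MD_DIV_BY_ZERO = 0x80  # bit 7 of MD, per the register description earlier


def mul16(d, operand):
    """16 x 16 multiply (unsigned simplification); the product fits in 32 bits."""
    return (d & 0xFFFF) * (operand & 0xFFFF)


def divide(dividend, divisor, md=0):
    """Return (quotient, remainder, md); a zero divisor only sets the MD flag."""
    if divisor == 0:
        return None, None, md | MD_DIV_BY_ZERO
    return dividend // divisor, dividend % divisor, md


product = mul16(0xFFFF, 0xFFFF)
print(f"${product:08X}")                    # largest possible 16 x 16 product
print(divide(0x12345678, 0x1000))           # a 32-by-16 sized example
print(divide(1000, 0))                      # zero divide: flag set, no result
```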
<p></p>
<p>
<br />ビット操作命令(6301 コンパチ命令)<br />Bit [Pattern] Operation (6301
Compatibility) Instructions<br />~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~<br /><br /> 日立の
HD6301 には、 6801 の拡張命令としてビット操作命令が新設されていましたが、同じ
63 シリーズの 63C09
にも同じ命令があります。これらの命令はイミディエイトデータとメモリの内容を論理演算して、結果をメモリに戻したり、関連コンディションコードを変化させてしまうので、ビットパターンを操作するときなどに重宝します。<br />Hitachi's
HD6301 has, as extensions to the 6801 instruction set, new bit operation
instructions. The 63C09, a member of the same 63 series, has the same. These
instructions are of great value, performing logical operations directly
between memory and immediate data, returning the results to memory, and
setting the relevant condition codes accordingly.<br /><br /> 行える論理演算には、論理積(AIM)、論理和(OIM)排他的論理和(EIM)、論理積コンディションコード(TIM)があります。オブジェクトの構成は、<br /><br /> <命令コード>、<ビットの位置>、<オペランド><br /><br />の順になっています。<br />Operations
available are logical _and_ (AIM), logical _or_ (OIM), logical _exclusive_or_
(EIM), and bit test-to-condition-codes (TIM). The object code has the
structure<br /><br /> <op-code>, <bit position> [sic:
pattern], <operand><br /><br />in that order.<br /><br /> これらの命令を使うと、
6301 の命令はマクロアセンブラを用いれば 63C09 上で実行可能になります。つまり
OASYS Lite 等の組み込みプログラムを FM
上で動かすことができるかも知れないわけで、そういう意味でも美味しい命令なのです(もっとも、その前に根性で
ROM を逆アセンブルしなければなりませんが)。<br />Using these instructions, it
becomes possible by means of a macro-assembler to execute 6301 instructions on
the 63C09. In other words, it may be possible to run embedded [sic] software
such as OASYS Lite programs on the FM. In this sense, these are very tasty instructions.
(Of course, you'll first have to get tough and disassemble the ROM.) <br />
</p>
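<p><i>[Note: A minimal Python sketch of these four instructions follows; it is not from the article. The direct-mode op-codes are read from the 0x column of Table 3 below, and the byte order follows the object-code structure given above. The function names are hypothetical, and the choice of the N and Z flags as the "relevant condition codes" is an assumption.]</i></p>

```python
# Direct-mode op-codes as read from the 0x column of Table 3.
DIRECT_OPCODES = {"OIM": 0x01, "AIM": 0x02, "EIM": 0x05, "TIM": 0x0B}


def encode_direct(mnemonic, mask, direct_addr):
    """Return the three object-code bytes: op-code, immediate mask, operand."""
    return bytes([DIRECT_OPCODES[mnemonic], mask & 0xFF, direct_addr & 0xFF])


def execute(mnemonic, mem, addr, mask):
    """Apply the operation to mem[addr] and return the resulting N and Z flags.

    Assumption: N and Z reflect the 8-bit result; other flags are not modeled.
    """
    value = mem[addr]
    if mnemonic in ("AIM", "TIM"):
        result = value & mask
    elif mnemonic == "OIM":
        result = value | mask
    elif mnemonic == "EIM":
        result = value ^ mask
    else:
        raise ValueError(mnemonic)
    if mnemonic != "TIM":          # TIM only tests; memory is left untouched
        mem[addr] = result
    return {"N": bool(result & 0x80), "Z": result == 0}
```

<p><i>[For example, <code>encode_direct("OIM", 0x80, 0x20)</code> yields the three bytes $01, $80, $20, matching the byte count of 3 shown for oim direct in the table.]</i></p>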
<p> </p>
<p>
ビット演算/転送命令<br />Bit Calculation/Extraction Instructions<br />~~~~~~~~~~~~~~~~~~~~<br /><br /> 63C09
には、多分に I/O
を意識したビット演算/転送命令が存在しています。これらの命令は、アドレシングモードにダイレクトモードしかサポートしていない難はありますが、使い慣れれば便利に使えるでしょう。動作は、ダイレクトページの
LABEL のビット n と REG レジスタのビット m を論理演算して、 REG
レジスタに入れるものがほとんどです。ビット演算/転送命令には以下のようなものがあります。<br />The
63C09 has [individual] bit calculation and extraction instructions, probably
intended for I/O. These instructions have the deficiency of only supporting
direct mode addressing, but with a little practice, should be useful. They
perform logical operations on bit n of direct mode address LABEL and bit m of
register REG, most of them leaving their result in REG. The following bit
calculation/extraction instructions are supported:<br /><br /> BAND, BOR,
BEOR, BIAND,<br /> BIOR, BIEOR, LDBT, STBT<br /><br />オブジェクトの構成は<br /><br /> <命令コード
($11,$xx)>、<ポストバイト>、<オペランド><br /><br />と、かなり変則な構成をとります。オペランドはダイレクトアドレシングのみです。また、ポストバイトは特殊な形式をとります。<br />Object
code follows the following rather irregular structure:<br /><br /> <op-code
($11,$xx)>, <post-byte>, <operand><br /><br />The operand is
direct page addressing only, and the post-byte has its own peculiar format.
<br /><i>[Note that the post-byte format is not explained further in the article.
Again, see DeKok and Simpson's technical pages at </i>
</p>
<p>
<i><a href="http://www.sandelman.ottawa.on.ca/People/Alan_DeKok/interests/6309.techref.html">http://www.sandelman.ottawa.on.ca/People/Alan_DeKok/interests/6309.techref.html</a>
.]</i><br />
</p>
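<p><i>[Note: The single-bit behavior described above can be sketched in Python as follows. This is not from the article and does not model the post-byte encoding. The function name is hypothetical. That the BI- variants invert the memory bit before combining it, and that STBT writes to memory instead of the register, are assumptions drawn from the mnemonics and should be checked against the reference linked above.]</i></p>

```python
def bit_instruction(op, reg, m, mem_byte, n):
    """Combine bit m of `reg` with bit n of `mem_byte`; return (reg, mem_byte).

    Assumptions: BIAND/BIOR/BIEOR invert the memory bit first, LDBT copies the
    memory bit into the register, and STBT copies the register bit to memory.
    """
    r = (reg >> m) & 1
    b = (mem_byte >> n) & 1
    if op == "STBT":
        mem_byte = (mem_byte & ~(1 << n) & 0xFF) | (r << n)
        return reg, mem_byte
    if op == "LDBT":
        new_bit = b
    else:
        if op.startswith("BI"):
            b ^= 1                  # inverted-bit variants
            logic = op[2:]
        else:
            logic = op[1:]
        new_bit = {"AND": r & b, "OR": r | b, "EOR": r ^ b}[logic]
    reg = (reg & ~(1 << m) & 0xFF) | (new_bit << m)
    return reg, mem_byte
```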
<p><br /></p>
<p>
その他の命令<br />Other Instructions<br />~~~~~~~~~~~~<br /><br /> その他の命令としては、先ずモード切り換え命令があります。といっても、エミュレートモードからネイティブモードへの移行は、
MD レジスタのビット 0 (LSB) に1を書き込むことによって行われますので、 MD
レジスタに対する普通の LD 命令を使います。<br />Among the other instructions
is the mode change instruction mentioned above. Or, rather, the shift from
emulation mode to native mode is accomplished by writing a 1 to bit 0 of the
MD register, so an ordinary LD instruction for the MD register (LDMD) is used.<br /><br /> 次にトラップがかかったとき、未定義命令でかかったのか除算のエラーで起こったのかを調べる命令
BITMD があります。これは、 MD レジスタのステータスビット(ビット 7 or
6)を調べ、どちらでトラップがかかったのかを知らせます。ただし、この命令を実行すると
MD レジスタのステータスビット(ビット 7 and
6)はクリアされますので、未定義コードトラップか、 Divide by Zero
トラップかは、一度きりしか調べることはできません。<br />Next, when a trap is
taken, the BITMD instruction lets you determine whether it was caused by an
undefined instruction or by a divide error. BITMD tests the status
bits (bit 7 or 6) of the MD register to report the cause of the trap. However,
executing this instruction clears the status bits (bits 7 and 6), so you can only
check whether the source was the undefined instruction trap or the Divide by
Zero trap once.<br /><br /> そして、スタックに関するものがあります。 63C09
ではレジスタが増設されていますが、現在の PSHS/PSHU
ではそれらをスタックにセーブすることはできません。なぜなら、PSHS/PSHU および
PULS/PULU
のポストバイトが、すでにすべて割り当てられていて追加の余地がないということです。そこで、
63C09 では増設されたレジスタへのスタック操作は別命令の<br />Finally, we have
stack instructions. The 63C09 has additional registers, but the original
PSHS/PSHU cannot be used to save any of them. The post-byte for PSHS/PSHU and
PULS/PULU is already completely allocated, leaving no room for additional
registers. For that purpose, the 63C09 has the following new stack
instructions: <br /><br /> PSHSW, PULSW, PSHUW, PULUW<br /><br />を使います。ただし、これは
W
レジスタに対するもののみしかありません。よって、これらはポストバイトをもたないインヘレンとアドレシングのみです。<br />But
there are only stack instructions for the W register, so these instructions
operate in inherent mode and do not have a post-byte.<br />
</p>
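<p><i>[Note: The read-once behavior of the MD status bits can be sketched in Python as follows; this is an illustration, not code from the article. The constant and function names are hypothetical, and which of bits 7 and 6 stands for which trap is an assumption, since the passage above does not say.]</i></p>

```python
NATIVE_MODE = 0x01        # bit 0 of MD, per the article
STATUS_BITS = 0xC0        # bits 7 and 6 of MD, per the article
DIVIDE_BY_ZERO = 0x80     # assumption: which status bit means what
ILLEGAL_OPCODE = 0x40     # assumption


def bitmd(md, mask):
    """Return (tested_bits, new_md); the test clears both status bits."""
    tested = md & mask & STATUS_BITS
    return tested, md & ~STATUS_BITS & 0xFF


md = NATIVE_MODE | DIVIDE_BY_ZERO        # native mode, divide trap pending
first, md = bitmd(md, STATUS_BITS)       # first test sees the divide bit
second, md = bitmd(md, STATUS_BITS)      # second test sees nothing
```

<p><i>[A trap handler written along these lines would therefore save the result of its first test if it needs the cause more than once.]</i></p>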
<p><br /></p>
<p>
********<br />おわりに<br />Wrapping Up<br /><br /> 以上 63C09
に隠されていた機能の大筋を説明してきました。 6809
に+αで追加してほしかった機能がほぼ盛り込まれており、
6809派にはひさびさの好ニュースといえます。ただ、惜しむらくは登場時期が遅かったことで、そのため活躍の場がパソコンの改造か、産業用ワンボードマイコン程度に限られてしまったことです。ゲームパソコンも
68000系や 8086系、 65816 といった16ビットCPUを使い始めている現状では、
63C09
を積んだパソコンをメーカーが出荷することは、まずありえないことでしょう。当面は市場アプリに依存しない
FM-11等の改造にしか使えないというのは残念なことです。<br />We have given a
general outline of the hidden functionality of the 63C09 above. It contains a
wealth of 6809-plus-alpha extensions, and is a bit of welcome, if long-awaited
news for fans of the 6809. But it is regrettable that the news is rather
untimely, and the places where it can be put to good use are now pretty much
limited to customizing existing personal computers and designing industrial
one-board microcomputers. In the current market where game computers are now
moving on to the 68000, 8086, and 65816 class 16-bit CPUs, there really won't
be any personal computer shipping with a 63C09. It is unfortunate that the
only place to really use it now is in customizing such computers as the FM-11,
which do not have to depend on packaged applications from the marketplace.<br /><br /> なお、この稿をまとめるにあたっては、私が解析した資料のほか、
63C09 解析委員会の仲間(とくに Gigo氏と
Miyazaki氏)が解析された資料を参照させていただきました。 63C09
解析委員会関係者のご協力に感謝いたします。<br />In addition to material I have
produced myself, I have referenced material produced by members of the 63C09
Survey Committee (especially Mr. Gigo and Mr. Miyazaki). I offer my gratitude
for the cooperation of all who have participated in the 63C09 Survey
Committee.<br /><i>[Translator's note: Not having any clues about gender, I am by default
assuming Gigo-shi and Miyazaki-shi are male -- I could well be wrong.]</i><br />
</p>
<p><br /></p>
<p>
--------------------------------<br /><参考文献><br /><References><br /><br />・63C09解析委員会、「お年玉プレゼント
63C09 に隠し機能があった」など、NANNO−NET、1988年1月1日〜<br />*
63C09 Survey Committee -- "A New Year's Gift: There Were Hidden Functions in the
63C09" and other material on <i>NANNO-NET</i>, from Jan 1, 1988.
<br />・モトローラ、「MC6809-MC6809Eマイクロプロセッサプログラミングマニュアル」、CQ出版、1982年<br />*
Motorola, <i>MC6809-MC6809E Microprocessor Programming Manual</i> [Japanese
edition], CQ Shuppan [CQ Publishing], 1982<br />・「6809
インストラクションポケットブック」、Oh!FM 1983年第4号、日本ソフトバンク<br />*
"6809 Instruction Pocket Reference", <i>Oh!FM</i> 1983-4, Softbank Japan<br />・水谷隆太、「6809の未定義命令」、I/O
1985年5月号、工学社<br />* Mizutani, Ryūta, "6809 Undefined Instructions",<i>
I/O</i>, May 1985, Kohgakusha<br />・原進、「FM-11 のクロックを 3MHz
に」、パソコンワールド 1987年1月号、ピーシーワールドジャパン<br />* Hara,
Susumu, "Overclocking the FM-11 to 3MHz", <i>Pasokon World</i>, Jan 1987,
P-C-World Japan
<i>[Note: Probably not the current PCWorld founded in 1995.]</i><br /><br />--------<br />76 Oh!FM 1988-4<br />p. 76 Oh!FM 1988-4<br />--------<br />[Oh!FM
1988-4: p.77]<br />
</p>
<p></p>
<p>
図3 63C09で増えた命令(灰色に塗られた部分[=>*囲*])<br />Table 3:
Instruction Extensions for the 63C09 (extensions in gray) <br /><i>[In the original, the extensions were shown in gray background. Here I have
put asterisks around them.]</i><br /> (<br />横の列は、上からプリバイトなし、プリバイト
$10付き、フリバイト $11付きの順に並んでいる。また、<br />Rows in the following
order: without pre-byte, with $10 pre-byte, and with $11 pre-byte. Also<br />各項目中の下段左側の数値はサイクル数で、カッコ内がネイティブモード時の値。右側は命令長。<br />on
the second line of each entry, the left number is the cycle count, with native
cycle count in parentheses. The right number is the byte count.<br /> )<br /><br /><i>[原稿の編集に因る誤植在り。 Original contains typographical errors.]<br />[The
only Japanese word in the table is 「(なし)」 {"(nashi)" => "(none)"},
so it makes no real sense to repeat the table for each language. Refer to
the transcription at <br /><a href="https://defining-computers.blogspot.com/2022/05/transcription-of-article-on-63c09-in-oh-fm-1988-4-72.html">https://defining-computers.blogspot.com/2022/05/transcription-of-article-on-63c09-in-oh-fm-1988-4-72.html</a><br />if necessary.]</i><br />
</p>
<code><pre>=================================================================================================================================================================|
| DIRECT | | | REL |ACC A/D/E|ACC B/W/F| INDEXD | EXTEND | IMMED | DIRECT | INDEXD | EXTEND | IMMED | DIRECT | INDEXD | EXTEND |
|0000xxxx|0001xxxx|0010xxxx| 0011xxxx | 0100xxxx| 0101xxxx| 0110xxxx|0111xxxx|1000xxxx|1001xxxx| 1010xxxx|1011xxxx|1100xxxx|1101xxxx| 1110xxxx|1111xxxx|
| 0x | 1x | 2x | 3x | 4x | 5x | 6x | 7x | 8x | 9x | Ax | Bx | Cx | Dx | Ex | Fx |
=================================================================================================================================================================|
0000 0 | NEG | (PRE) | BRA | LEAX | NEGA | NEGB | NEG | NEG | SUBA | SUBA | SUBA | SUBA | SUBB | SUBB | SUBB | SUBB |
(none) | 6(5),2 | (BYTE1)| 3,2 | 4+,2+ | 2(1),1 | 2(1),1 | 6+,2+ | 7(6),3 | 2,2 | 4(3),2 | 4+,2+ | 5(4),3 | 2,2 | 4(3),2 | 4+,2+ | 5(4),3 |
--------|--------|--------|--------|------------|---------|---------|---------|--------|--------|--------|---------|--------|--------|--------|---------|--------|
| | | |* addr *|* negd *| | | |* subw *|* subw *|* subw *|* subw *| | | | |
($10) | | | | 4,3 | 3(2),2) | | | | 5(4),4 | 7(5),3)|7+(6+),3+| 8(6),4 | | | | |
--------|--------|--------|--------|------------|---------|---------|---------|--------|--------|--------|---------|--------|--------|--------|---------|--------|
| | | |* band *| | | | |* sube *|* sube *|* sube *|* sube *|* subf *|* subf *|* subf *|* subf *|
($11) | | | | 7(6),4 | | | | | 3,3 | (4),3 | 5+,3+ | 6(5),4 | 3,3 | 5(4),3 | 5+,3+ | 6(5),4 |
=================================================================================================================================================================|
0001 1 |* oim *| (PRE) | BRN | LEAY | | |* oim *|* oim *| CMPA | CMPA | CMPA | CMPA | CMPB | CMPB | CMPB | CMPB |
(none) | 6,3 | (BYTE2)| 3,2 | 4+,2+ | | | 7+,3+ | 7,4 | 2,2 | 4(3),2 | 4+,2+ | 5(4),3 | 2,2 | 4(3),2 | 4+,2+ | 5(4),3 |
--------|--------|--------|--------|------------|---------|---------|---------|--------|--------|--------|---------|--------|--------|--------|---------|--------|
| | | LBRN |* adcr *| | | | |* cmpw *|* cmpw *|* cmpw *|* cmpw *| | | | |
($10) | | | 5,4 | 4,3 | | | | | 5(4),4 | 7(5),3)|7+(6+),3+| 8(6),4 | | | | |
--------|--------|--------|--------|------------|---------|---------|---------|--------|--------|--------|---------|--------|--------|--------|---------|--------|
| | | |* biand *| | | | |* cmpe *|* cmpe *|* cmpe *|* cmpe *|* cmpf *|* cmpf *|* cmpf *|* cmpf *|
($11) | | | | 7(6),4 | | | | | 3,3 | 5(4),3 | 5+,3+ | 6(5),4 | 3,3 | 5(4),3 | 5+,3+ | 6(5),4 |
=================================================================================================================================================================|
0010 2 |* aim *| NOP | BHI | LEAS | | |* aim *|* aim *| SBCA | SBCA | SBCA | SBCA | SBCB | SBCB | SBCB | SBCB |
(none) | 6,3 | 2(1),1 | 3,2 | 4+,2+ | | | 7+,3+ | 7,4 | 2,2 | 4(3),2 | 4+,2+ | 5(4),3 | 2,2 | 4(3),2 | 4+,2+ | 5(4),3 |
--------|--------|--------|--------|------------|---------|---------|---------|--------|--------|--------|---------|--------|--------|--------|---------|--------|
| | | LBHI |* subr *| | | | |* sbcd *|* sbcd *|* sbcd *|* sbcd *| | | | |
($10) | | |5/6(5),4| 4,3 | | | | | 5(4),4 | 7(5),3)|7+(6+),3+| 8(6),4 | | | | |
--------|--------|--------|--------|------------|---------|---------|---------|--------|--------|--------|---------|--------|--------|--------|---------|--------|
| | | |* bor *| | | | | | | | | | | | |
($11) | | | | 7(6),4 | | | | | | | | | | | | |
=================================================================================================================================================================|
0011 3 | COM | SYNC | BLS | LEAU | COMA | COMB | COM | COM | SUBD | SUBD | SUBD | SUBD | ADDD | ADDD | ADDD | ADDD |
(none) | 6(5),2 | 2,1 | 3,2 | 4+,2+ | 2(1),1 | 2(1),1 | 6+,2+ | 7(6),3 | 4(3),3 | 6(4),2 |6+(5+),2+| 7(5),3 | 4(3),3 | 6(4),2 |6+(5+),2+| 7(5),3 |
--------|--------|--------|--------|------------|---------|---------|---------|--------|--------|--------|---------|--------|--------|--------|---------|--------|
| | | LBLS |* sbcr *|* comd *|* comw *| | | CMPD | CMPD | CMPD | CMPD | | | | |
($10) | | |5/6(5),4| 4,3 | 3(2),2 | 3(2),2 | | | 5(4),4 | 7(5),3)|7+(6+),3+| 8(6),4 | | | | |
--------|--------|--------|--------|------------|---------|---------|---------|--------|--------|--------|---------|--------|--------|--------|---------|--------|
| | | |* bior *|* come *|* comf *| | | CMPU | CMPU | CMPU | CMPU | | | | |
($11) | | | | 7(6),4 | 3(2),2 | 3(2),2 | | | 5(4),4 | 7(5),3)|7+(6+),3+| 8(6),4 | | | | |
=================================================================================================================================================================|
0100 4 | LSR | sexw | BHS/BCC| PSHS | LSRA | LSRB | LSR | LSR | ANDA | ANDA | ANDA | ANDA | ANDB | ANDB | ANDB | ANDB |
(none) | 6(5),2 | 4,1 | 3,2 | 5+(4+),2 | 2(1),1 | 2(1),1 | 6+,2+ | 7(6),3 | 2,2 | 4(3),2 | 4+,2+ | 5(4),3 | 2,2 | 4(3),2 | 4+,2+ | 5(4),3 |
--------|--------|--------|--------|------------|---------|---------|---------|--------|--------|--------|---------|--------|--------|--------|---------|--------|
| | |LBHS/BCC|* andr *|* lsrd *|* lsrw *| | |* andd *|* andd *|* andd *|* andd *| | | | |
($10) | | |5/6(5),4| 4,3 | 3(2),2 | 3(2),2 | | | 5(4),4 | 7(5),3)|7+(6+),3+| 8(6),4 | | | | |
--------|--------|--------|--------|------------|---------|---------|---------|--------|--------|--------|---------|--------|--------|--------|---------|--------|
| | | |* beor *| | | | | | | | | | | | |
($11) | | | | 7(6),4 | | | | | | | | | | | | |
=================================================================================================================================================================|
0101 5 | eim | | BLO/BCS| PULS | | |* eim *|* eim *| BITA | BITA | BITA | BITA | BITB | BITB | BITB | BITB |
(none) | 6,3 | | 3,2 | 5+(4+),2 | | | 7+,3+ | 7,4 | 2,2 | 4(3),2 | 4+,2+ | 5(4),3 | 2,2 | 4(3),2 | 4+,2+ | 5(4),3 |
--------|--------|--------|--------|------------|---------|---------|---------|--------|--------|--------|---------|--------|--------|--------|---------|--------|
| | |LBLU/BCS|* orr *| | | | |* bitd *|* bitd *|* bitd *|* bitd *| | | | |
($10) | | |5/6(5),4| 4,3 | | | | | 5(4),4 | 7(5),3 |7+(6+),3+| 8(6),4 | | | | |
--------|--------|--------|--------|------------|---------|---------|---------|--------|--------|--------|---------|--------|--------|--------|---------|--------|
| | | |* bieor *| | | | | | | | | | | | |
($11) | | | | 7(6),4 | | | | | | | | | | | | |
=================================================================================================================================================================|
0110 6 | ROR | LBRA | BNE | PSHU | RORA | RORB | ROR | ROR | LDA | LDA | LDA | LDA | LDB | LDB | LDB | LDB |
(none) | 6(5),2 | 5(4),3 | 3,2 | 5+(4+),2 | 2(1),1 | 2(1),1 | 6+,2+ | 2,2 | 7(6),3 | 4(3),2 | 4+,2+ | 5(4),3 | 2,2 | 4(3),2 | 4+,2+ | 5(4),3 |
--------|--------|--------|--------|------------|---------|---------|---------|--------|--------|--------|---------|--------|--------|--------|---------|--------|
| | | LBNE |* eorr *|* rord *|* rorw *| | |* ldw *|* ldw *|* ldw *|* ldw *| | | | |
($10) | | |5/6(5),4| 4,3 | 3(2),2 | 3(2),2 | | | 4,4 | 6(5),3 | 6+,3+ | 7(6),4 | | | | |
--------|--------|--------|--------|------------|---------|---------|---------|--------|--------|--------|---------|--------|--------|--------|---------|--------|
| | | |* ldbt *| | | | |* lde *|* lde *|* lde *|* lde *|* ldf *|* ldf *|* ldf *|* ldf *|
($11) | | | | 7(6),4 | | | | | 3,3 | 5(4),3 | 5+,3+ | 6(5),4 | 3,3 | 5(4),3 | 5+,3+ | 6(5),4 |
=================================================================================================================================================================|
0111 7 | ASR | LBSR | BEQ | PULU | ASRA | ASRB | ASR | ASR | | STA | STA | STA | | STB | STB | STB |
(none) | 6(5),2 | 9(7),2 | 3,2 | 5+(4+),2 | 2(1),1 | 2(1),1 | 6+,2+ | 7(6),3 | | 4(3),2 | 4+,2+ | 5(4),3 | | 4(3),2 | 4+,2+ | 5(4),3 |
--------|--------|--------|--------|------------|---------|---------|---------|--------|--------|--------|---------|--------|--------|--------|---------|--------|
| | | LBEQ |* cmpr *|* asrd *| | | | |* stw *|* stw *|* stw *| | | | |
($10) | | |5/6(5),4| 4,3 | 3(2),2 | | | | | 6(5),3 | 6+,3+ | 7(6),4 | | | | |
--------|--------|--------|--------|------------|---------|---------|---------|--------|--------|--------|---------|--------|--------|--------|---------|--------|
| | | |* stbt *| | | | | |* ste *|* ste *|* ste *| |* stf *|* stf *|* stf *|
($11) | | | | 8(7),4 | | | | | | 5(4),3 | 5+,3+ | 6(5),4 | | 5(4),3 | 5+,3+ | 6(5),4 |
=================================================================================================================================================================|
1000 8 | ASL/LSL| | BVC | |ASLA/LSLA|ASLB/LSLB| ASL/LSL | ASL/LSL| EORA | EORA | EORA | EORA | EORB | EORB | EORB | EORB |
(none) | 6(5),2 | | 3,2 | | 2(1),1 | 2(1),1 | 6+,2+ | 7(6),3 | 2,2 | 4(3),2 | 4+,2+ | 5(4),3 | 2,2 | 4(3),2 | 4+,2+ | 5(4),3 |
--------|--------|--------|--------|------------|---------|---------|---------|--------|--------|--------|---------|--------|--------|--------|---------|--------|
| | | LBVC |* pshsw *|* asld *| | | |* eord *|* eord *|* eord *|* eord *| | | | |
($10) | | |5/6(5),4| 5/6,2 | 3(2),2 | | | | 5(4),4 | 5(5),3 |7+(6+),3+| 8(6),4 | | | | |
--------|--------|--------|--------|------------|---------|---------|---------|--------|--------|--------|---------|--------|--------|--------|---------|--------|
| | | |*tfm(r+,r+)*| | | | | | | | | | | | |
($11) | | | | 6+3n,3 | | | | | | | | | | | | |
=================================================================================================================================================================|
1001 9 | ROL | DAA | BVS | RTS | ROLA | ROLB | ROL | ROL | ADCA | ADCA | ADCA | ADCA | ADCB | ADCB | ADCB | ADCB |
(none) | 6(5),2 | 2(1),1 | 3,2 | 5(4),1 | 2(1),1 | 2(1),1 | 6+,2+ | 2,2 | 7(6),3 | 4(3),2 | 4+,2+ | 5(4),3 | 2,2 | 4(3),2 | 4+,2+ | 5(4),3 |
--------|--------|--------|--------|------------|---------|---------|---------|--------|--------|--------|---------|--------|--------|--------|---------|--------|
| | | LBVS |* pulsw *|* rold *|* rolw *| | |* adcd *|* adcd *|* adcd *|* adcd *| | | | |
($10) | | |5/6(5),4| 6,2 | 3(2),2 | 3(2),2 | | | 5(4),4 | 7(5),3 |7+(6+),3+| 8(6),4 | | | | |
--------|--------|--------|--------|------------|---------|---------|---------|--------|--------|--------|---------|--------|--------|--------|---------|--------|
| | | |*tfm(r-,r-)*| | | | | | | | | | | | |
($11) | | | | 6+3n,3 | | | | | | | | | | | | |
=================================================================================================================================================================|
1010 A | DEC | ORCC | BPL | ABX | DECA | DECB | DEC | DEC | ORA | ORA | ORA | ORA | ORB | ORB | ORB | ORB |
(none) | 6(5),2 | 3(2),2 | 3,2 | 3(1),1 | 2(1),1 | 2(1),1 | 6+,2+ | 7(6),3 | 2,2 | 4(3),2 | 4+,2+ | 5(4),3 | 2,2 | 4(3),2 | 4+,2+ | 5(4),3 |
--------|--------|--------|--------|------------|---------|---------|---------|--------|--------|--------|---------|--------|--------|--------|---------|--------|
| | | LBPL |* pshuw *|* decd *|* decw *| | |* ord *|* crd *|* ord *|* ord *| | | | |
($10) | | |5/6(5),4| 6,2 | 3(2),2 | 3(2),2 | | | 5(4),4 | 7(5),3 |7+(6+),3+| 8(6),4 | | | | |
--------|--------|--------|--------|------------|---------|---------|---------|--------|--------|--------|---------|--------|--------|--------|---------|--------|
| | | |* tfm(r+,r)*|* dece *|* decf *| | | | | | | | | | |
($11) | | | | 6+3n,3 | 3(2),2 | 3(2),2 | | | | | | | | | | |
=================================================================================================================================================================|
1011 B |* tim *| | BMI | RTI | | |* tim *|* tim *| ADDA | ADDA | ADDA | ADDA | ADDB | ADDB | ADDB | ADDB |
(none) | 4,3 | | 3,2 | 6/15(17),1 | | | 5+,3+ | 5,4 | 2,2 | 4(3),2 | 4+,2+ | 5(4),3 | 2,2 | 4(3),2 | 4+,2+ | 5(4),3 |
--------|--------|--------|--------|------------|---------|---------|---------|--------|--------|--------|---------|--------|--------|--------|---------|--------|
| | | LBMI |* puluw *| | | | |* addw *|* addw *|* addw *|* addw *| | | | |
($10) | | |5/6(5),4| 4,3 | | | | | 5(4),4 | 7(5),3 |7+(6+),3+| 8(6),4 | | | | |
--------|--------|--------|--------|------------|---------|---------|---------|--------|--------|--------|---------|--------|--------|--------|---------|--------|
| | | |* tfm(r,r+)*| | | | |* adde *|* adde *|* adde *|* adde *|* addf *|* addf *|* addf *|* addf *|
($11) | | | | 6+3n,3 | | | | | 3,3 | 5(4),3 | 5+,3+ | 6(5),4 | 3,3 | 5(4),3 | 5+,3+ | 6(5),4 |
=================================================================================================================================================================|
1100 C | INC | ANDCC | BGE | CWAI | INCA | INCB | INC | INC | CMPX | CMPX | CMPX | CMPX | LDD | LDD | LDD | LDD |
(none) | 6(5),2 | 3(2),2 | 3,2 | 20(22),2 | 2(1),1 | 2(1),1 | 6+,2+ | 7(6),3 | 4(3),3 | 6(4),2 |6+(5+),2 | 7(5),3)| 3,3 | 5(4),2 | 5+,2+ | 6(5),3 |
--------|--------|--------|--------|------------|---------|---------|---------|--------|--------|--------|---------|--------|--------|--------|---------|--------|
| | | LBGE | |* incd *|* incw *| | | CMPY | CMPY | CMPY | CMPY | |* ldq *|* ldq *|* ldq *|
($10) | | |5/6(5),4| | 3(2),2 | 3(2),2 | | | 5(4),4 | 7(5),3 |7+(6+),3+| 8(6),4 | | 8(7),3 | 8+,3+ | 9(8),4 |
--------|--------|--------|--------|------------|---------|---------|---------|--------|--------|--------|---------|--------|--------|--------|---------|--------|
| | | |* BITMD *|* ince *|* incf *| | | CMPS | CMPS | CMPS | CMPS | | | | |
($11) | | | | 4,3 | 3(2),2 | 3(2),2 | | | 5(4),4 | 7(5),3 |7+(6+),3+| 8(6),4 | | | | |
=================================================================================================================================================================|
1101 D | TST | SEX | BLT | MUL | TSTA | TSTB | TST | TST | BSR | JSR | JSR | JSR |* ldq *| STD | STD | STD |
(none) | 6(4),2 | 2(1),1 | 3,2 | 11(10),1 | 2(1),1 | 2(1),1 |6+(5+),2+| 7(6),3 | 7(6),2 | 7(6),2 | 7+(6+),2| 8(6),4 | 5,5 | 5(4),2 | 5+,2+ | 6(5),3 |
--------|--------|--------|--------|------------|---------|---------|---------|--------|--------|--------|---------|--------|--------|--------|---------|--------|
| | | LBLT | |* tstd *|* tstw *| | | | | | | |* stq *|* stq *|* stq *|
($10) | | |5/6(5),4| | 3(2),2 | 3(2),2 | | | | | | | | 8(7),3 | 8+,3+ | 9(8),4 |
--------|--------|--------|--------|------------|---------|---------|---------|--------|--------|--------|---------|--------|--------|--------|---------|--------|
| | | |* ldmd *|* tste *|* tstf *| | |* divd *|* divd *|* divd *|* divd *| | | | |
($11) | | | | 5,3 | 3(2),2 | 3(2),2 | | | 25,3 |27(26),3| 27+,3+ |28(27),4| | | | |
=================================================================================================================================================================|
1110 E | JMP | EXG | BGT | | | | JMP | JMP | LDX | LDX | LDX | LDX | LDU | LDU | LDU | LDU |
(none) | 3(2),2 | 8(5),2 | 3,2 | | | | 3+,2+ | 4(3),3 | 3,3 | 5(4),2 | 5+,2+ | 6(5),3 | 3,3 | 5(4),2 | 5+,2+ | 6(5),3 |
--------|--------|--------|--------|------------|---------|---------|---------|--------|--------|--------|---------|--------|--------|--------|---------|--------|
| | | LBGT | | | | | | LDY | LDY | LDY | LDY | LDS | LDS | LDS | LDS |
($10) | | |5/6(5),4| | | | | | 4,4 | 6(5),3 |6+(6+),3+| 7(6),4 | 4,4 | 6(5),3 |6+(6+),3+| 7(6),4 |
--------|--------|--------|--------|------------|---------|---------|---------|--------|--------|--------|---------|--------|--------|--------|---------|--------|
| | | | | | | | |* divq *|* divq *|* divq *|* divq *| | | | |
($11) | | | | | | | | | 34,4 |36(35),3| 36+,3+ |37(36),4| | | | |
=================================================================================================================================================================|
1111 F | CLR | TFR | BLE | SWI | CLRA | CLRB | CLR | CLR | | STX | STX | STX | | STU | STU | STU |
(none) | 6(5),2 | 6(4),2 | 3,2 | 19(21),1 | 2(1),1 | 2(1),1 | 6+,2+ | 7(6),3 | | 5(4),2 | 5+,2+ | 6(5),3 | | 5(4),2 | 5+,2+ | 6(5),3 |
--------|--------|--------|--------|------------|---------|---------|---------|--------|--------|--------|---------|--------|--------|--------|---------|--------|
| | | LBLE | SWI2 |* clrD *|* clrw *| | | | STY | STY | STY | | STS | STS | STS |
($10) | | |5/6(5),4| 20(22),2 | 3(2),2 | 3(2),2 | | | | 6(5),3 |6+(6+),3+| 7(6),4 | | 6(5),3 |6+(6+),3+| 7(6),4 |
--------|--------|--------|--------|------------|---------|---------|---------|--------|--------|--------|---------|--------|--------|--------|---------|--------|
| | | | SWI3 |* clre *|* clrf *| | |* muld *|* muld *|* muld *|* muld *| | | | |
($11) | | | | 20(22),2 | 3(2),2 | 3(2),2 | | | 28,4 |30(29),3| 30+,3+ |31(30),4| | | | |
=================================================================================================================================================================|
</pre></code>
<p><i>[** These are the typographical errors I noticed in the above table: ]<br />[** Several other places appear to be errors and need to be checked.]<br />[*1* Cycle count for sube direct mode ($1190) is clearly a typo, see subf: (4),3 => 5(4),3 .]<br />[*2* Long branches with dual mnemonics (LBHS/LBCC=$1024 and LBLO/LBCS=$1025) abbreviate the second mnemonic in the original to fit the table.]<br />[*3* First mnemonic (LBLU) for LBLO/LBCS ($1025) is a typo.]<br />[*4* The cycle and byte counts for ROR extended ($76) and LDA immediate ($86) are swapped: ROR extended => 7(6),3 and LDA immediate => 2,2 .] <br />[*5* The byte count for LBSR ($17) is a typo: 2=>3 .]<br />[*6* The cycle and byte counts for ROL extended ($79) and ADCA immediate ($89) are swapped: ROL extended => 7(6),3 and ADCA immediate => 2,2 .] <br />[*7* The mnemonic (crd) for ord direct ($109A) is a typo.]<br />[*8* The cycle and byte counts for puluw ($103B) appear to be a typo.]<br />[*9* The byte count for CMPX indexed ($AC) is a typo: 2=>2+ .]<br />[*10* The op-code at which ldq immediate is shown ($CD) looks maybe out-of-place (=>$10CC?), needs to be checked.]<br />[*11* The cycle counts for LDY, STY, LDS, and STS indexed ($10AE, $10AF, $10EE, $10EF) look odd: 6+(6+), need to be checked.]<br />[*12* The mnemonic for clrd ($104F) is a typo: clrD .]</i><br /><br /><br />--------<br />Oh!FM 1988-4 77<br />Oh! FM 1988-4 p. 77<br /><br /></p>
<p><br /></p>
<hr /><br />
<hr /><br />
<p>
<i>[Google Translate output below (Yes, it turns into mush pretty quick.):]</i><br />
</p>
<p>
16-bit multiplication / division / register-to-register operation / block
transfer is possible<br />Super 8-bit class MPU<br />Search for extended
functions of 63C09<br />63C09 Analysis Committee UNO<br /><br />There is an
LSI called 63C09 in the minor change version of 6809. Some users who are
strong in hardware have bought the high speed and used it to modify the main
body. However, I recently found out that various extensions were hidden in
this 63C09. Here, we would like to ask the "63C09 Analysis Committee" who has
discovered and explored these functions to report the outline.<br /><br />In
addition, if the main body is modified, especially the CPU is replaced, the
manufacturer's repair is not guaranteed, and other peripheral LSIs and
peripheral devices may have to be replaced. In addition, 63C09 is a part of
the conventional software (6C09). Or many) may not work, so I would not
recommend it to anyone other than those who are very confident in their
skills, such as "I only use my own programs!".<br /><br /><br />******************<br />High-speed
version of 6809 63C09<br /><br />There are various ways to enjoy a personal
computer, but there is a hardware modification as a play that is only allowed
for the soldiers of that line. Various modifications have been made since
ancient times, but among them, the speedup of the CPU is widespread among
program self-made groups who do not care about commercial software because the
processing power is significantly improved and it can be done relatively
easily. It has been tried. In the old days, he changed his 68A09 (1.2 MHz)
loaded on FM-8 to 68B09 (2 MHz) and gave it the same processing speed as FM-7.
Clocking up and running at 2.5-4 MHz is a typical example.<br /><br />Some
people may think, "Well, 2.5 ~ 4MHz was such a fast 6809", but it actually
exists. Hitachi's 63C09 MPU sold at chip stores in Akihabara and Nihonbashi is
the C-MOS version of the 6809, which runs at 3 MHz. If you follow the normal,
he will have 1.5 times the processing power of 68B09 (2 MHz), and if you
select and use the one that operates at 4 MHz, which is out of the standard,
it will be twice the processing power.<br /><br />He uses the 63C09 to speed
things up, so it's basically as simple as swapping the CPU and clock, but
sometimes that's not enough. For models such as FM-8, 7, 77D1 / D2 / L4, 77AV,
11 where the CPU is plugged into the socket, simply replace it, but like
FM-NEW7, 77AV20 / 40 / 20EX / 40EX. For models that are soldered directly to
the board, the CPU will not pull out unless you have a considerable amount of
skill. Also, if the clock is up, peripheral LSIs and peripheral devices may
not be able to catch up, in which case they must be replaced, but it will be
difficult if custom LSIs are frequently used such as AV systems. It is quite
difficult to clock up with a 7 / AV system that operates at a delicate timing,
and even though the clock is up, the built-in timer and software timing are
used for protection and other processing at the end. Most of the commercially
available application software will be useless.<br /><br />By the way, the
speedup is accompanied by these various difficulties, but the result is more
than compensated for the hardship.<br /><br />By the way, taking the case of
FM-11 as an example, it seems that the limit is about 2.5 to 3 MHz if only the
CPU is replaced, and if you want more than that, it seems that you need to
replace some peripheral LSIs. The 11 can also speed up the subsystem, up to 4
MHz with some modifications. In particular, the display speed of his FM-11
subsystem, which has been converted to 4 MHz, is remarkable, and the display
speed of Chinese characters is not much different from that of FM16β, which
has a Chinese character VRAM.<br /><br /><br />*************<br />Discovery of
extensions<br /><br />So, the veteran heroes around me made speedup
improvements to his FM-11 one after another, but there was one strange
problem.<br /><br />Comas word processor WPV3 has stopped working. At first I
thought it was a soft timing issue or something mentioned above, but to my
surprise, he dropped the clock to 2 MHz and it didn't work.<br /><br />Then,
an investigation into the cause began, centered on my friend Gigo, and it soon
turned out that the program was tripping over an undefined instruction of the
6809. An undefined instruction is one that is not defined in the manual
published by the manufacturer. Such an instruction is supposedly unusable, but
sometimes it actually turns out to be a hidden instruction. A long time ago,
computer magazines often carried analysis articles on the hidden instructions
of CPUs made by various
companies.<br /><br />--------<br />72 Oh! FM 1988-4 Warning: If you replace the CPU
with 63C09, a lot of commercial applications (especially games) will not
work.<br />--------<br />[Oh! FM 1988-4: p.73]<br /><br />Likewise, the 6809 had
hidden instructions, though nothing of great consequence. Now, the fact that
the 63C09 and the 6809 treat undefined instructions differently means that
their hidden instructions may differ as well. Considering when the 63C09
began shipping (autumn 1985), it would be natural for some functions to have
been added, and I was half expecting as much.<br /><br />One of the code
sequences that tripped was "$1F, $62". The instruction itself is the familiar
TFR (register-to-register transfer instruction), but here it specifies a
transfer from an undefined register to the Y register. On the 6809, $FFFF is
returned in the Y register because the source is undefined, but on the 63C09
the Y register ends up holding a garbage value. As an experiment, I
transferred data from the Y register to the undefined register number and then
transferred it back from the undefined register number to the Y register, and
the original data came back intact ... In other words, the undefined register
number on the 63C09 was not undefined at all: it was a number pointing to a
real register!<br /><br />Discovering
the hidden register, Mr. Gigo was ecstatic and excitedly phoned his friends,
and the call reached me in the dead of night ... The gist of it was: "The
63C09 has extra registers. If there are registers, there must be instructions
that use them. Let's all search for them by hand." So began the days of
searching for hidden instructions. The monotonous task was repeated over and
over: write a code that is unassigned in the disassembly table into memory
with a debugger, set a breakpoint, set values in the registers with TFR,
execute, and look at the contents of the registers. While we exchanged each
day's findings over PC communications, enthusiasts who were almost
obsessively running OS-9 on the FM-11 gathered, and a group called the "63C09
Analysis Committee" formed naturally.<br /><br />The
enthusiasts' tenacity was frightening, and the outline of the 63C09 extensions
was soon revealed. In outline:<br />・ Three kinds of registers have been added, one of
which can be used as both an accumulator and an index register.<br />・ Instructions
such as 32 ÷ 16 bit division, 16 ÷ 8 bit division, 16 × 16 bit multiplication,
register-to-register operations, bit operations, and block transfer have been
added.<br />・ A trap is taken when an undefined instruction is
detected.<br />・ There are two operation modes, a 6809 compatible mode and a 63C09
native mode.<br />In short, the inconveniences and
weaknesses of the 6809 have been considerably improved, and powerful functions
that one would not expect of an 8-bit MPU are included.<br /><br />These analysis results have been
uploaded to several BBSs, starting with NANNO-NET, and we received a lot of
feedback from many network users. But so that more people can know the whole
picture [zenbo = entire body] of the 63C09, the "8-bit MPU that goes beyond 8
bits", I am borrowing the pages of Oh! FM magazine to report
it.<br /><br /><br />*******************<br />63C09<br />merit
and demerit<br /><br />The HD63C09 sold by Hitachi is an 8-bit MPU that is
pin-compatible with Motorola's MC6809. Its specification takes the form of the
6809 with extensions added, making it upward compatible with the 6809 (except
for undefined instructions). Although Hitachi has not officially announced
them, the processing power of a 6809-based PC can be greatly improved by taking
advantage of the extended functions of the 63C09.<br /><br />The merit of
replacing the MPU of a 6809 personal computer with the 63C09 is improved
processing capacity. The following three factors contribute to
it.<br /><br />1 High-speed clock<br />2 Extended instructions /
extended registers<br />3 Native mode<br /><br />1 goes without saying: if you
raise the operating clock of the MPU, software runs faster. Existing software
can be run at the higher speed, except for the parts that trip on undefined
instruction traps or software timers. How far the operating clock can be
raised depends on the hardware, and in some cases it can hardly be raised at
all.<br /><br />2 is
effective when writing new software or patching existing software.
Processing that on the conventional 6809 had to be expressed with an
assembler macro can now be written as a single 63C09 instruction, and takes
fewer machine cycles.<br /><br />3 comes from the
instruction execution cycles being shorter in the 63C09's native mode than in
the normal emulated mode. In this mode, even at the same
operating clock, execution is up to 20 to 30% faster than usual
(the effect differs depending on the addressing mode). However, stacking and
interrupts behave differently, so in order to use
native mode, part of the system software, such as F-BASIC
or OS-9, must be rewritten.<br /><br />On the other hand, the disadvantages are
that not only software that uses the undefined instructions of the 6809 but
also commercial software that relies heavily on built-in timers and the like
will malfunction, and that making use of the extended functions of the 63C09
requires enough skill to create your own development tools, so the difficulty
for the novice is high.<br /><br />Weighing the merits and demerits, at present
people who are comfortable writing their own programs can benefit from it, but
ordinary game users and application users should never touch it.<br /><br />Below,
I will explain the extended functions of the 63C09 in order.
</p>
<p>
************<br />Extended registers<br /><br />The 63C09 has three more
registers than the 6809 (Fig. 1). Two are 16-bit registers and
the other is an 8-bit mode / status register.<br /><br /><br />W register [16
bits]<br />~~~~~~~~~~~~~~~~~~~~<br />A 16-bit register that can be used both
as an accumulator and as an index register.<br /><br />When used as an
accumulator, it can be used as a 16-bit register<br /><br />--------<br />Oh! FM 1988-4 73<br />--------<br />[Oh!
FM 1988-4: p.74]<br /><br />or divided into two 8-bit registers (the E / F
registers). It is just like gaining another set
of the existing D / A / B registers. However, some instructions, such as AND / OR,
cannot be used with the W / E / F registers.<br /><br />It can also be used as a
32-bit register (the Q register) by concatenating it with the existing D
register; this is used for multiplication and division.<br /><br />When used as an
index register, it is used in the same way as the existing X / Y registers,
through postbyte bit patterns that were unused on the 6809. However, there is no
accumulator offset, 5-bit offset, or 8-bit constant offset when the W
register is the index register.<br /><br />A further characteristic use is as
the counter register in block transfers.<br /><br />V
register [16 bits]<br />~~~~~~~~~~~~~~~~~~~~<br />The only instructions that use the V
register are the inter-register operation instructions and TFR. The
distinguishing feature of the V register is that its value does not change
even when the MPU is reset. This register should be useful for
holding constants in an OS and the like.<br />
</p>
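<p>
To make the overlap between the accumulators concrete, here is a minimal
Python sketch of the layout described above. It is only an illustration, not
code from the article: the class name Accumulators and its attribute names are
made up for this example, and the byte order (A and E as the high bytes, D as
the high word of Q) is an assumption read off Figure 1.
</p>

```python
class Accumulators:
    """Toy model of the 63C09 accumulators A, B, E, F (hypothetical helper)."""

    def __init__(self):
        # Four independent 8-bit registers; D, W and Q are views onto them.
        self.a = self.b = self.e = self.f = 0

    @property
    def d(self):
        # D is A (high byte) concatenated with B (low byte), as on the 6809.
        return self.a * 256 + self.b

    @property
    def w(self):
        # W is E (high byte) concatenated with F (low byte).
        return self.e * 256 + self.f

    @w.setter
    def w(self, value):
        # Writing W changes only E and F; A and B (and so D) are untouched.
        self.e, self.f = divmod(value % 0x10000, 256)

    @property
    def q(self):
        # Q is D (high word) concatenated with W (low word).
        return self.d * 0x10000 + self.w

    @q.setter
    def q(self, value):
        d, w = divmod(value % 0x100000000, 0x10000)
        self.a, self.b = divmod(d, 256)
        self.e, self.f = divmod(w, 256)
```

<p>
Loading Q with a 32-bit value in this model fills all four 8-bit registers at
once, which is the sense in which W behaves like a second D register.
</p>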
<pre> MD register [8 bits]
~~~~~~~~~~~~~~~~~~~~
Abbreviation for mode / status bit register. It holds the modes and
status flags that were added in the 63C09:
error detection during division,
detection of undefined instruction traps,
and operation mode settings.
The meaning of each bit is as follows:
・ Bit 7 R  Set to 1 when a division by 0 occurs
・ Bit 6 R  Set to 1 when an undefined instruction is fetched
・ Bit 1 W  Register save mode setting bit for FIRQ
            0-> On FIRQ, only PC and CC are saved to the stack
            1-> On FIRQ, all registers are saved to the stack
・ Bit 0 W  Operation mode setting bit
            0-> Emulated mode
            1-> Native mode
At reset, all bits are set to 0.
</pre>
<p></p>
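<p>
The bit assignments in the list above can be written down as masks. The
following Python sketch is an illustration only; the constant and function
names are invented for this example, and it simply mirrors the four bits the
article describes.
</p>

```python
# Bit masks for the MD register, following the bit list above.
MD_DIVIDE_BY_ZERO = 0x80  # bit 7, status: a division by 0 occurred
MD_UNDEFINED_OP = 0x40    # bit 6, status: an undefined instruction was fetched
MD_FIRQ_SAVES_ALL = 0x02  # bit 1, setting: FIRQ stacks all registers
MD_NATIVE_MODE = 0x01     # bit 0, setting: native mode


def md_value(native=False, firq_saves_all=False):
    """Build the byte to write to MD for the requested mode settings."""
    value = 0
    if native:
        value |= MD_NATIVE_MODE
    if firq_saves_all:
        value |= MD_FIRQ_SAVES_ALL
    return value


def md_status(md):
    """Decode the two status bits of an MD value into a dict of booleans."""
    return {
        "divide_by_zero": bool(md & MD_DIVIDE_BY_ZERO),
        "undefined_instruction": bool(md & MD_UNDEFINED_OP),
    }
```

<p>
With no arguments md_value() gives 0, which matches the reset state (emulated
mode, FIRQ saving only PC and CC).
</p>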
<p>Figure 1 63C09 register configuration</p>
<pre>
__________________________________________________
|----------------- Q -----------------|
|------- D -------| |------- W -------|
|-- A --| |-- B --| |-- E --| |-- F --|  Accumulator
          |------- X -------|            X index register
          |------- Y -------|            Y index register
          |------- U -------|            User stack pointer
          |------- S -------|            System stack pointer
          |------ PC -------|            Program counter
          |------- V -------|            V (alue) register
          |------ DP -------|            Direct page register
                   |-- CC --|            Condition code register
                   |-- MD --|            Mode / Status Register
__________________________________________________
</pre>
<p><br /></p>
<p>
*********<br />Operation modes<br /><br />The 63C09 has two operation modes. One is an
emulated mode that is compatible with the 6809, and the other is a native mode
that brings out the 63C09's own capabilities.<br /><br />Put that way,
some people may assume that "extended registers and extended
instructions can be used in native mode, and cannot be used in emulated mode",
but that is a mistake. The 63C09 accepts extended registers and extended
instructions in either mode.<br /><br />The difference between emulated mode
and native mode is how the stack is handled when an interrupt occurs. When
an interrupt occurs, the contents of the registers are saved on the stack.
Emulated mode saves only the conventional registers, whereas
native mode also saves the W register from the extended set.<br /><br />Immediately
after reset, the 63C09 is in emulated mode. In this mode existing 6809
software works fine, but in exchange you have to be careful when using
extended registers under multitasking. For example, consider the block
transfer instruction TFM, which uses the W register as
a counter. Suppose task A is interrupted while executing a TFM
instruction and control moves to task B. If task B uses the W register,
the contents of the W register will have changed on return to
task A, which will cause a malfunction. (Of course, there is no problem in a
single-task system, or in a multitasking system where the extended
registers are used by only one task.) Therefore, in emulated mode, you must
disable interrupts whenever you use an extended register, or else save the
register on the stack before switching to another task.<br /><br />This makes it
difficult to use the extended registers from high-speed games and OS-9, and
makes the extended instructions hard to use as well. That is why there is a native
mode, a mode that assumes the use of extended registers and extended
instructions. Interrupts in this mode save the registers on the stack in the order
PC, U, Y, X, DP, W, D, CC and then enter interrupt processing. Note that W sits
between DP and D. This means that D and W are pushed onto the stack as the
32-bit register pair Q.<br /><br />Another feature of native mode is that
instruction machine cycles are shortened. As a result, code runs 20-35%
faster depending on the addressing mode.<br /><br />The effect is particularly
noticeable in direct, extended, and inherent addressing.<br /><br />--------<br />74 Oh! FM 1988-4<br />--------<br />[Oh!
FM 1988-4: p.75]<br /><br />Please note that, because of their nature, the V
register and MD register are not saved even in native mode.<br /><br />To switch
from emulated mode to native mode, write 1 to bit 0 (LSB) of the newly
added MD register.<br /><br />Besides the two modes above,
the 63C09 also has a FIRQ stacking mode. As you know, on the 6809 accepting a
FIRQ saves only PC and CC to the stack and branches to the interrupt
processing routine. However, on single-board microcomputers for control,
there are cases where having a second IRQ is more convenient than having FIRQ.
The 63C09 is pin compatible with the 6809, so its pinout cannot be changed.
Instead, so that FIRQ can be used as another IRQ, a mode can
be selected in software in which all registers are saved to the stack. Using FIRQ as an IRQ
is achieved by writing 1 to bit 1 of the MD register.<br /><br /><br />********<br />trap<br /><br />63C09
traps when either of the following events occurs.<br /><br />1 When an
undefined instruction is fetched<br />2 When a division (DIV) instruction
divides by 0<br /><br />When a trap occurs, the registers are pushed onto the S
stack in the order PC, U, Y, X, DP, B, A, CC in emulated mode, or in the order
PC, U, Y, X, DP, W, B, A, CC in native mode, and execution then branches to
the vector stored at address $FFF0 ($FFF0 is RESERVED on the 6809). This trap
has the highest interrupt priority after reset. To determine whether the trap
was caused by an undefined instruction or a division by zero, there is the
BITMD instruction.<br /><br />Because of this
trap, 6809 software that uses undefined instructions will not work, but
in exchange you can build trap libraries like those used in OS-9 / 68000 etc.
For example, if floating-point processing calls are assigned to undefined
instruction codes, those codes can be hooked through the undefined instruction
trap and dispatched to the processing routines. Such a trap library can be useful
because it can significantly reduce the size of objects.<br /><br /><br />********<br />Extension
instruction<br /><br />The 63C09 extended instructions can be divided into
additions that extend existing instructions to more registers, and newly
established instructions.<br /><br />The newly
established instructions include inter-register operation instructions, block
transfer instructions, multiplication / division instructions, bit manipulation
instructions, bit operation / transfer instructions, and so on.<br /><br />Additional
instructions<br />~~~~~~~~<br /><br />In the 63C09, existing instructions have
been extended to cover more registers.<br /><br />For
example, TSTD, ADCD, and others that seemed as if they should exist but never
did have been added. In the past these could be expressed with assembler
macros, but using the real instructions takes fewer machine
cycles.<br /><br />Also, for instructions
such as ADD and SUB, the number of variants has increased
with the addition of the E / F / W registers. It is like having another set of A
/ B / D registers, giving you more programming flexibility. Note, however, that
not every instruction available for the A / B / D registers is supported
(Fig. 2).<br /><br />Among the existing instructions, TFR and EXG
have probably gained the most. In the
TFR and EXG instructions, the postbyte bit pattern specifies the
registers involved. Since the 63C09 has more registers, there are more
bit pattern combinations (0110-> W, 0111-> V,
1110-> E, 1111-> F), to the point where it can fairly be called register
addressing. This register addressing is also used in the
register-to-register operations among the new instructions. One thing to keep in mind
here is that the 63C09 and 6809 behave differently if you specify a truly undefined
register number.
</p>
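<p>
As a rough illustration of the register addressing just described, the
following Python sketch decodes a TFR / EXG postbyte. It is not from the
article: the function name decode_tfr is made up, the four added codes are the
ones quoted in the text, and the remaining codes are the standard 6809 TFR /
EXG register numbers, which the article does not list.
</p>

```python
# Standard 6809 TFR/EXG register codes (assumed from the 6809 manuals).
REGS_6809 = {0x0: "D", 0x1: "X", 0x2: "Y", 0x3: "U", 0x4: "S", 0x5: "PC",
             0x8: "A", 0x9: "B", 0xA: "CC", 0xB: "DP"}

# Codes the article reports as added in the 63C09.
REGS_63C09_EXTRA = {0x6: "W", 0x7: "V", 0xE: "E", 0xF: "F"}


def decode_tfr(postbyte, extended=True):
    """Return (source, destination) register names for a TFR postbyte.

    The high nibble selects the source and the low nibble the destination.
    Codes missing from the table are reported as "undefined".
    """
    table = dict(REGS_6809)
    if extended:
        table.update(REGS_63C09_EXTRA)
    src, dst = divmod(postbyte % 256, 16)
    return table.get(src, "undefined"), table.get(dst, "undefined")
```

<p>
Applied to the $1F, $62 sequence from the discovery story, the postbyte $62
decodes as a transfer from W to Y when the added codes are enabled, and as a
transfer from an undefined register to Y when they are not.
</p>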
<p> </p>
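<p>
The stacking orders given in the operation mode and trap sections above can be
compared with a small Python sketch. This is an illustration under stated
assumptions: the function names are invented, D is written out as B then A as
in the trap description, and the register sizes are the usual 6809 ones plus a
16-bit W.
</p>

```python
# Size in bytes of each register that can appear in an interrupt frame.
SIZES = {"PC": 2, "U": 2, "Y": 2, "X": 2, "DP": 1,
         "W": 2, "B": 1, "A": 1, "CC": 1}


def push_order(native):
    """Registers in the order they are pushed on an interrupt or trap."""
    order = ["PC", "U", "Y", "X", "DP"]
    if native:
        order.append("W")  # native mode inserts W between DP and D (B, A)
    return order + ["B", "A", "CC"]


def frame_bytes(native):
    """Total number of bytes pushed for a full register frame."""
    return sum(SIZES[name] for name in push_order(native))
```

<p>
The two extra bytes in the native mode frame are the reason interrupt handlers
and task switchers in system software such as OS-9 have to be adjusted before
native mode can be used.
</p>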
<p>Figure 2 Processing that can be performed with the accumulator</p>
<pre>
____________________________________________
          | A  | B  | E  | F  | D  | W  | Q  |
CLR       | o  | o  | o  | o  | o  | o  |    |
INC       | o  | o  | o  | o  | o  | o  |    |
DEC       | o  | o  | o  | o  | o  | o  |    |
TST       | o  | o  | o  | o  | o  | o  |    |
COM       | o  | o  | o  | o  | o  | o  |    |
NEG       | o  | o  |    |    | o  |    |    |
SEX       | o* | o* |    |    | o* | o* |    |
ASL / LSL | o  | o  |    |    | o  |    |    |
ASR       | o  | o  |    |    | o  |    |    |
LSR       | o  | o  |    |    | o  | o  |    |
ROL       | o  | o  |    |    | o  | o  |    |
LD        | o  | o  | o  | o  | o  | o  | o  |
ST        | o  | o  | o  | o  | o  | o  | o  |
ADD       | o  | o  | o  | o  | o  | o  |    |
SUB       | o  | o  | o  | o  | o  | o  |    |
CMP       | o  | o  | o  | o  | o  | o  |    |
ADC       | o  | o  |    |    | o  |    |    |
SBC       | o  | o  |    |    | o  |    |    |
AND       | o  | o  |    |    | o  |    |    |
OR        | o  | o  |    |    | o  |    |    |
EOR       | o  | o  |    |    | o  |    |    |
BIT       | o  | o  |    |    | o  |    |    |
MUL       | o* | o* |    |    | o  |    |    |
DIV       |    |    |    |    |    |    | o  |
____________________________________________
</pre>
<p>* Used as work</p>
<p><br /></p>
<p>
Inter-register operation instructions<br />~~~~~~~~~~~~~~~~~~<br /><br />Most
operations on the 6809 were performed between a register and memory or an
immediate value. Therefore, to AND the values of the A and B registers, for
example, one of the registers had to be stored to memory before the operation
(AND in this case) could be performed. The 63C09 resolves this by allowing
register-to-register operations. They use the same register addressing as TFR
and EXG.<br /><br />The inter-register operation instructions are as
follows.<br /><br />ADDR, ADCR, SUBR, SBCR,<br />ANDR, ORR, EORR, CMPR<br /><br /><br />Block
transfer instruction<br />~~~~~~~~~~~~~~~~<br /><br />To move data in
memory on the 6809, you had to repeatedly load the data into a register and
then store it. That works, but the problem is the time it takes.
So the 63C09 has block transfer instructions like those found
on the Z80 and 8086.<br /><br />--------<br />Oh! FM 1988-4 75<br />--------<br />[Oh!
FM 1988-4: p.76]<br /><br />A block transfer instruction uses one or two of the
16-bit registers D / X / Y / U / S to specify the transfer
source address (source) and transfer destination address (destination).
The registers are specified in a postbyte, in the form of register
addressing. The same register can be specified as both source and
destination. The W register counts the number of bytes to transfer.<br /><br />There
are four transfer methods. In addition to normal block transfer in
the forward direction (TFM r0+, r1+) and the reverse direction (TFM r0-, r1-),
there is one that streams data to a fixed address such as an I / O port (TFM
r0+, r1) and one that fills the specified block with a
specified value (TFM r0, r1+).<br /><br /><br />Multiply / divide
instruction<br />~~~~~~~~~~~~~~<br /><br />The 6809 had an 8 × 8 bit
multiplication instruction called MUL, but it simply multiplied
the values of the A register and the B register. The 16 × 16 bit
multiplication instruction (MULD) provided by the 63C09 can be used with
various addressing modes, so it is more a new instruction than an
extension of the old one.<br /><br />In addition, the 63C09 provides 16 / 8 bit division
(DIVD) and 32 / 16 bit division (DIVQ), which can also be used with various
addressing modes.<br /><br />Bit manipulation instruction (6301 compatible
instruction)<br />~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~<br /><br />Hitachi's
HD6301 added bit manipulation instructions as extensions to the
6801, and the 63C09, from the same 63 series, has the same
instructions. These instructions perform a logical operation between immediate
data and the contents of memory, write the result back to memory, and update
the relevant condition codes, which is useful when manipulating bit
patterns.<br /><br />The logical
operations available are logical product (AIM), logical sum
(OIM), exclusive logical sum (EIM), and a logical product that affects only
the condition codes (TIM).
The object code is laid out in the order<br /><br /><Instruction code>, <Bit
position>, <Operand><br /><br />Using
these instructions, 6301 instructions can be reproduced on the 63C09 with a
macro assembler. In other words, it may become possible to run embedded programs
such as OASYS Lite on the FM, which makes these very tasty instructions (although you
have to disassemble the ROM by sheer guts first).<br /><br />Bit operation /
transfer instruction<br />~~~~~~~~~~~~~~~~~~~~<br /><br />The 63C09 has bit
operation / transfer instructions that were probably designed with I / O in
mind. These instructions have the
drawback of supporting only direct addressing, but once you
get used to them you will find them useful. Most of them perform a logical
operation between bit n of direct-page location LABEL and bit m of the REG
register and put the result in the REG register. The bit operation / transfer
instructions are as follows.<br /><br />BAND, BOR, BEOR, BIAND,<br />BIOR,
BIEOR, LDBT, STBT<br /><br />The object code is laid out as<br /><br /><Instruction
code ($11, $xx)>, <Post byte>, <Operand><br /><br />which
is a fairly irregular format. The operand is direct addressing only,
and the postbyte also takes a special form.<br /><br />Other instructions<br />~~~~~~~~~~~~<br /><br />As
for other instructions, first there is mode switching. The
transition from emulated mode to native mode is made by writing 1 to bit 0
(LSB) of the MD register, and this is done with an ordinary LD instruction
targeting the MD register.<br /><br />Next, there is the BITMD instruction,
which checks whether a trap was caused by an undefined instruction or a division
error. It examines the status bits (bits 7 and 6) of the MD register to tell
you the cause of the trap. However, executing this instruction clears the
status bits (bits 7 and 6) of the MD register, so you can check only once for
an undefined instruction trap or a divide-by-zero trap.<br /><br />Finally, there is
the matter of the stack. The 63C09 has more registers, but the existing PSHS
/ PSHU cannot save them on the stack. This is because the postbyte bits of PSHS
/ PSHU and PULS / PULU are all already allocated, leaving no
room for additions. Therefore, on the 63C09, stack operations on the added
registers use the separate instructions<br /><br />PSHSW, PULSW, PSHUW, PULUW<br /><br />However,
these handle only the W register, so they
use inherent addressing with no postbyte.<br /><br /><br />********<br />in
conclusion<br /><br />I have explained the outline of the functions hidden in
the 63C09. It includes most of the "+ α" features that people wished the 6809
had, which is good news for the 6809 crowd. Unfortunately, it
appeared too late, so its field of activity has been limited to
modified personal computers and industrial single-board microcomputers. As
for gaming PCs, now that 16-bit
CPUs such as the 68000 series, the 8086 series, and the 65816 are coming into
use, it is unlikely
that manufacturers will ship PCs equipped with the 63C09. It is a pity that,
for the time being, it
can only be used for modifying machines such as the FM-11 that do not depend
on commercial applications.<br /><br />In compiling this article, in
addition to the materials I analyzed myself, I referred to materials analyzed by
the members of the 63C09 Analysis Committee (especially Mr. Gigo and Mr.
Miyazaki). I would like to thank everyone involved in the 63C09 Analysis
Committee for their cooperation.<br /><br /><br />--------------------------------<br /><References><br /><br />・
63C09 Analysis Committee, "New Year's gift: the 63C09 had hidden functions",
NANNO-NET, January 1, 1988-<br />・ Motorola, "MC6809-MC6809E Microprocessor
Programming Manual", CQ Publishing, 1982<br />・ "6809 Instruction Pocket
Book", Oh! FM 1983 No. 4, Japan Softbank<br />・ Ryuta Mizutani, "Undefined
Instructions of the 6809", I / O May 1985, Engineering Co., Ltd.<br />・ Susumu Hara,
"Clocking the FM-11 up to 3 MHz", PC World January 1987 issue, PC World Japan<br /><br />--------<br />76
Oh! FM 1988-4<br />--------<br />[Oh! FM 1988-4: p.77]<br />
</p>
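<p>
The four TFM forms described in the block transfer section above can be
modelled in a few lines of Python. This is a sketch under stated assumptions,
not the article's code: the function name tfm and its parameters are invented,
memory is a plain bytearray, address wrap-around is ignored, and each step
copies one byte and then adjusts the addresses.
</p>

```python
def tfm(memory, src, dst, count, src_step, dst_step):
    """Model one TFM instruction on a bytearray.

    src_step and dst_step select the form:
      (+1, +1) forward copy      TFM r0+, r1+
      (-1, -1) reverse copy      TFM r0-, r1-
      (+1,  0) stream to a port  TFM r0+, r1
      ( 0, +1) fill a block      TFM r0,  r1+
    Returns the final (src, dst, w) register values.
    """
    w = count  # the W register holds the number of bytes still to move
    while w != 0:
        memory[dst] = memory[src]
        src += src_step
        dst += dst_step
        w -= 1
    return src, dst, w
```

<p>
For instance, tfm(mem, 0, 1, 7, 0, 1) copies the byte at address 0 into the
seven bytes that follow it, which is the fill form. Because W is the loop
counter, an interrupt that changes W part way through would corrupt the
transfer, which is the multitasking hazard described for emulated mode.
</p>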
<p><br /><br /></p>
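<p>
The read-once behaviour of BITMD noted in the "Other instructions" section
above is easy to get wrong in a trap handler, so here is a small Python model
of it. It follows the article's description (both status bits are cleared by
the instruction) and is only a sketch: the class and method names are
invented, and real silicon may differ in detail.
</p>

```python
DIVIDE_BY_ZERO = 0x80  # MD bit 7
UNDEFINED_OP = 0x40    # MD bit 6
STATUS_BITS = DIVIDE_BY_ZERO | UNDEFINED_OP


class ModeRegister:
    """Toy model of the MD register as seen by BITMD (hypothetical helper)."""

    def __init__(self):
        self.value = 0  # all bits are 0 after reset

    def bitmd(self, mask):
        """Test the status bits selected by mask, then clear both of them."""
        hit = (self.value & mask & STATUS_BITS) != 0
        self.value &= 0xFF ^ STATUS_BITS  # mode bits 0 and 1 are kept
        return hit
```

<p>
A handler built on this model has to remember the result of its first test:
a second call to bitmd always returns False, because the first call has
already cleared the cause.
</p>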
<p>
Fig. 3 Instructions added in the 63C09 (gray part [=> * box *])<br />((<br />Within
each row, the three lines are, from top to bottom, with no prebyte, with
prebyte $10, and with prebyte $11. Also,<br />the number on the lower left of
each item is the number of cycles, and the value in parentheses is the value
in native mode. The number on the right is the instruction length.<br />))<br /><br />[There
is a typographical error due to the editing of the manuscript. Original
contains typographical errors.]<br />=================================================
=================================================
================================================= =========== |<br />
DIRECT | | | REL | ACC A / D / E | ACC B / W / F | INDEXD | EXTEND | IMMED |
DIRECT | INDEXD | EXTEND | IMMED | DIRECT | INDEXD | EXTEND |<br />
0000xxxx | 0001xxxx | 0010xxxx | 0011xxxx | 0100xxxx | 0101xxxx | 0110xxxx |
0111xxxx | 1000xxxx | 1001xxxx | 1010xxxx | 1011xxxx | 1100xxxx | 1101xxxx |
1110xxxx | 1111xxxx |<br /> 0x | 1x
| 2x | 3x | 4x | 5x | 6x | 7x | 8x | 9x | Ax | Bx | Cx | Dx | Ex | Fx |<br />=================================================
=================================================
================================================= =========== |<br /> 0000
0 | NEG | (PRE) | BRA | LEAX | NEGA | NEGB | NEG | NEG | SUBA | SUBA | SUBA |
SUBA |<br /> (None) | 6 (5), 2 | (BYTE1) | 3,2 | 4+, 2+ | 2 (1), 1 | 2
(1), 1 | 6+, 2+ | 7 (6), 3 | 2,2 | 4 (3), 2 | 4+, 2+ | 5 (4), 3 | 2,2 | 4 (3),
2 | 4+, 2+ | 5 (4), 3 |<br />-------- | -------- | -------- | -------- |
------------ |- -------- | --------- | --------- | -------- | -------- | ---
----- | --------- | -------- | -------- | -------- | ------- -| -------- |<br />
| | | | * addr * | * negd * | | | | * subw * | * subw * | * subw * | * subw *
| | | | |<br /> ($ 10) | | | | 4,3 | 3 (2), 2) | | | | 5 (4), 4 | 7 (5),
3) | 7+ (6+), 3+ | 8 (6) ), 4 | | | | |<br />-------- | -------- | -------- |
-------- | ------------ |- -------- | --------- | --------- | -------- |
-------- | --- ----- | --------- | -------- | -------- | -------- | ------- -|
-------- |<br /> | | | | * band * |
| | | | * sube * | * sube * | * sube * | * sube * | * subf * | * subf * | *
subf * | * subf * |<br /> ($ 11) | | | | 7 (6), 4 | | | | | 3,3 | (4), 3
| 5 +, 3+ | 6 (5), 4 | 3,3 | 5 (4), 3 | 5+, 3+ | 6 (5), 4 |<br />=================================================
=================================================
================================================= =========== |<br /> 0001
1 | * oim * | (PRE) | BRN | LEAY | | | * oim * | * oim * | CMPA | CMPA | CMPA
| CMPA | CMPB | CMPB | CMPB | CMPB |<br /> (None) | 6,3 | (BYTE2) | 3,2 |
4+, 2+ | | | 7+, 3+ | 7,4 | 2,2 | 4 (3), 2 | 4+, 2+ 5 (4), 3 | 2,2 | 4 (3), 2
| 4+, 2+ | 5 (4), 3 |<br />-------- | -------- | -------- | -------- |
------------ |- -------- | --------- | --------- | -------- | -------- | ---
----- | --------- | -------- | -------- | -------- | ------- -| -------- |<br />
| | | LBRN | * adcr * | | | | | * cmpw * | * cmpw * | * cmpw * | * cmpw * | |
| | |<br /> ($ 10) | | | 5,4 | 4,3 | | | | | 5 (4), 4 | 7 (5), 3) | 7+
(6+), 3+ | 8 (6), 4 | | | | |<br />-------- | -------- | -------- | -------- |
------------ |- -------- | --------- | --------- | -------- | -------- | ---
----- | --------- | -------- | -------- | -------- | ------- -| -------- |<br />
| | | | * biand * | | | | | * cmpe * | * cmpe * | * cmpe * | * cmpe * | * cmpf
* | * cmpf * | * cmpf * | * cmpf * |<br /> ($ 11) | | | | 7 (6), 4 | | |
| | 3,3 | 5 (4), 3 | 5 +, 3+ | 6 (5), 4 | 3,3 | 5 (4) , 3 | 5+, 3+ | 6 (5), 4
|<br />=================================================
=================================================
================================================= =========== |<br /> 0010
2 | * aim * | NOP | BHI | LEAS | | | * aim * | * aim * | SBCA | SBCA | SBCA
| SBCA | SBCB | SBCB | SBCB | SBCB |<br /> (None) | 6,3 | 2 (1), 1 | 3,2
| 4+, 2+ | | | 7+, 3+ | 7,4 | 2,2 | 4 (3), 2 | 4+ , 2+ | 5 (4), 3 | 2,2 | 4
(3), 2 | 4+, 2+ | 5 (4), 3 |<br />-------- | -------- | -------- | -------- |
------------ |- -------- | --------- | --------- | -------- | -------- | ---
----- | --------- | -------- | -------- | -------- | ------- -| -------- |<br />
| | | LBHI | * subr * | | | | | * sbcd * | * sbcd * | * sbcd * | * sbcd * | |
| | |<br /> ($ 10) | | | 5/6 (5), 4 | 4,3 | | | | | 5 (4), 4 | 7 (5), 3)
| 7+ (6+), 3+ | 8 ( 6), 4 | | | | |<br />-------- | -------- | -------- |
-------- | ------------ |- -------- | --------- | --------- | -------- |
-------- | --- ----- | --------- | -------- | -------- | -------- | ------- -|
-------- |<br /> | | | | * bor * | |
| |
</p>
<p><br /></p>
零石http://www.blogger.com/profile/01111094813708912513noreply@blogger.com2tag:blogger.com,1999:blog-3954576973420765520.post-56233038684082684482022-05-07T19:33:00.029+09:002022-06-26T19:49:17.283+09:00Transcription of Article on the 63C09 in _Oh!FM_ 1988-4: p.72 <p>
If you're a fan of Motorola's 6809 Advanced 8-bit CPU, you may be aware of
Hitachi's 6809-compatible 6309 CPU. You may have heard of the article that
"revealed" the 6309's extensions to the 6809 register model and instruction
set. And if you don't know Japanese, you may have wondered if there were a
translation.
</p>
<p>
I am not especially a fan of the 6309, although I am
<a href="https://defining-computers.blogspot.com/2021/08/the-real-reason-ibm-chose-wrong-cpu-for.html" target="_blank">something of a fan of the 6809</a>. But there are many fans of the 6809 who are also fans of the 6309, and I've
heard enough of them ask for a translation of this article, and I happen to be
spinning my wheels in my more important goals at the moment, so I've decided to take a
crack at it.
</p>
<p>
Up until now, the article in question has only been available as a graphic
embedded in a PDF. So you couldn't even feed the text to Google Translate to get
a bad translation because the text had to be read by someone who can read
Japanese and type it. So that's what I'm doing here, reading it from the image and typing it. <br />
</p>
<p>
Today (May 7th), I just finished typing in the background portion of the
article, and that may be the more interesting part, anyway, so I'll go ahead
and post it here, adding the rest in the next several weeks or months, when
I'm not too tired from the mail route. <i>(Completed and posted below, June 8th.)
</i></p>
<p>
I'm also planning a real translation, a bit at a time. I've done the first
pass on the headline and the first paragraph <i>(and now background section)</i>, and those will form the start of
the
<a href="https://defining-computers.blogspot.com/2022/05/translation-of-article-on-63c09-in-oh-fm-1988-4-72.html.html" target="_blank">translation post</a>. (For the sake of not forcing everyone to go to Google Translate, I've
pasted what it does with the background part into the bottom of the
translation post for now.)
</p>
<p><i>[JMR202206072234:</i></p>
<p>
<i>I finished translating the background part a couple of weeks ago at the
<a href="https://defining-computers.blogspot.com/2022/05/translation-of-article-on-63c09-in-oh-fm-1988-4-72.html.html" target="_blank">translation post</a>
linked above, and forgot to make a note of that here. It's done, you can
read it there.
</i>
</p>
<p>
<i>I have now (June 7th) finished typing in the technical part of the transcription<strike>, and
will be posting it here over the next several days</strike>. <strike>Or I might make a second
post of the complete transcription.</strike> The tables and lists take special
handling, and can't be done all at once the way raw text can. </i>
</p>
<p>
<i>I'll begin translating the technical part <u>real soon now</u>, but I'll
note again that the technical information in the article should be
considered to be for historical interest only. The information at </i>
</p>
<p>
<a href="http://www.sandelman.ottawa.on.ca/People/Alan_DeKok/interests/6309.techref.html" target="_blank"><i>http://www.sandelman.ottawa.on.ca/People/Alan_DeKok/interests/6309.techref.html</i></a>
</p>
<p>
<i>is more complete and more accurate.<br /></i>
</p>
<p><i>]</i></p><p><i>[JMR202206082228:</i></p><p><i>(June 8th) Added the technical details section, starting with </i></p><blockquote><i>************<br />拡張レジスタ </i></blockquote><p><i>Tables (図) are buried in code and pre tags to preserve text formatting without html markup.</i></p><p><i>]</i></p>
<p>
  If the publishers of <i>Oh! FM</i> or the authors of the article take exception to my putting this up
  without permission, yes, I'll take it back down. So much water has gone under
  the bridge that it's hard to imagine anyone thinking it could do anyone any
  damage now, but you never know.</p><p>Since I don't own the original copyrights, I cannot extend permission to copy or publish in any way. <br /></p>
<p>The transcript text, with a tiny bit of crude markup:<br /></p>
<br />
<hr />
<p>[Original copyright 1988 Oh!FM -- 元発行社 Oh!FM 1988]<br />[Transcription copyright 2022 Joel Matthew Rees -- 書き起こし発行者 Joel Matthew Rees 2022] <br /></p><p>[Oh!FM 1988-4: p.72]<br />=================================================<br />16ビット乗除算/レジスタ間演算/ブロック転送が可能<br />超8ビット級MPU<br />63C09の拡張機能をさぐる<br />63C09解析委員会
UNO<br /><br /> 6809のマイナーチェンジ版に、63C09というLSIがあります。ハードに強い一部のユーザの間では、その高速性を買われ、本体の改造に使われたりしてきました。ところが、最近この63C09に各種の拡張機能が隠されていたことがわかりました。ここでは、それらの機能を発見し、探索してきた「63C09解析委員会」の方にその概要を報告していただきます。<br /><br /> なお、本体の改造、ことにCPUの差し換えは、メーカーの修理は保証されず、他の周辺LSI、周辺機器も交換しなければならない場合もありえ、おまけに63C09だと従来のソフトの一部(あるいは多数)が動作しなくなる危険性がありますので、「私は自作したプログラムしか使わない!」という、よほど腕に自信のある方以外にはお勧めしかねます。<br /><br /><br />******************<br />6809の高速版
63C09<br /><br /> パソコンの楽しみ方には色々ありますが、その筋の兵[つわもの]だけに許された遊びとして、ハードの改造があります。その昔からいろいろな改造が行なわれてきましたが、中でもCPUの高速化は、処理能力の向上が著しいことと、比較的簡単に行えることから、市販ソフトに頓着しないプログラム自作派の間では広く試みられてきました。古くは
FM-8 に積まれた 68A09 (1.2 MHz) を 68B09 (2 MHz) に変えて、
FM-7並の処理速度を与えたこと、最近では FM-11
を中心にCPUをクロックアップして 2.5~4 MHz
で動かすことが、その代表的なところです。<br /><br /> 「えっ、 2.5~4MHz
だって、そんなに速い
6809ってあったっけ」と思われる方もおられるでしょうが、実は存在するのです。秋葉原や日本橋のチップ屋さんで売っている日立の
63C09 というMPUがそれで、 3 MHz で動作する、
6809のC−MOS版です。ノーマルに従うなら従来の 68B09 (2 MHz) の
1.5倍の処理能力になり、選別して規格外の 4 MHz
で動くものを使えば2倍の処理能力となります。<br /><br /> この 63C09
を使って高速化を行うわけで、基本的にはCPUとクロックを差し換えるだけの簡単なものですが、それだけでは済まないこともあります。
FM-8, 7, 77D1/D2/L4, 77AV, 11
のようにCPUがソケットに差さっている機種では単純に差し換えるだけですが、
FM-NEW7, 77AV20/40/20EX/40EX
のように基板に直接ハンダづけされている機種ではかなりの腕がないとCPUが引っこ抜けません。また、クロックアップした場合周辺LSIや周辺機器が追いつかないこともあり、その場合はそれらも交換しなければなりませんが、AV系のようにカスタムLSIが多用されている場合は難しいでしょう。
  7/AV系のサブシステムのように微妙なタイミングで動いているものでは、クロックアップはかなり困難で、しかも、クロックアップしたが最後、プロテクトやその他の処理に内蔵タイマやソフトウェア的なタイミングを使った市販アプリケーションソフトの多くは全て使いものにならなくなってしまいます。<br /><br /> さて、これら様々な困難を伴った高速化ですが、そのもたらす結果は苦労を補って余りあるものです。<br /><br /> ちなみに
FM-11 の場合を例にとると、CPUの差し換えだけだと 2.5~3 MHz
ぐらいが限界のようで、それ以上を望むと一部周辺LSIの差し換え等が必要なようです。
11ではサブシステムの高速化も可能で、手を加えれば 4 MHz までいけます。とくに、
4 MHz化された FM-11
のサブシステムの表示速度は目を見張るものがあり、漢字の表示速度が漢字VRAMをもった
FM16β と大差ない速さになります。<br /><br /><br />*************<br />拡張機能の発見<br /><br /> というわけで、私の周りの歴戦の勇士たちは次々と
FM-11 に高速化改善を行ったのですが、そこに1つ、奇妙な問題が発生しました。<br /><br /> コマスのワープロWPV3が動かなくなってしまったのです。最初は、前述したソフト的なタイミングの問題か何かだと思われたのですが、驚いたことに、クロックを
2 MHz に落としても動きません。<br /><br /> そこで、友人の
Gigo氏を中心に原因探究が始まったのですが、ほどなく 6809
の未定義命令で引っかかっていることがわかりました。未定義命令とは、メーカが発表しているマニュアルで定義されていない命令のことで、建前上はそのような命令を使ってもなんの動作もしないことになっているのですが、実は隠し命令になっていることもままあります。一昔のパソコン雑誌には、よく各社製CPUの隠し命令の解析記事が載っていたりしました。このよう<br /><br />--------<br />72
Oh!FM 1988-4 警告 CPUを 63C09
に交換した場合、かなりの市販アプリケーション(とくにゲーム)が動作しなくなります。<br />--------<br />[Oh!FM
1988-4: p.73]<br /><br />な隠し命令は、 6809
にもそう大したものではありませんでしたがありました。さて、未定義命令の扱いが
63C09 と 6809 とでは異なるということは、隠し命令も異なる可能性があるわけです。
63C09 の出荷開始時期 (1985年秋)
を考えても、何か機能が追加されても当然なくらいで、「もしかしたら」の期待がわきました。<br /><br /> 引っかかっているコードの1つに「$1F,
$62」というものがありました。命令自体は TFR
(レジスタ間のデータ転送命令)でおなじみのものですが、未定義レジスタから Y
レジスタへ転送するように指示されています。 6809 の場合は未定義なので Y
レジスタに $FFFF が返りますが、63C09 の場合は Y
レジスタにはめちゃくちゃな[値]が入ります。試みに、 Y
レジスタから未定義レジスタ番号にデータ転送してから未定義レジスタ番号から Y
レジスタへ戻してやると、元のデータがちゃんと残っていました……。つまり、 63C09
の未定義レジスタ番号は、番号が余って未定義となっていたのではなく、実在するレジスタを指す番号だったのです!<br /><br /> その隠しレジスタを発見した
Gigo氏は、狂喜してかたっぱしから友人に電話をかけまくりました。そして、私のところにも夜も丑三つ[うしみつ]時[dead
of the night]過ぎにかかってきました……。話の内容は、「63C09
にはレジスタが余計にある。レジスタがあるからには命令もあるはずだ。みんなで手分けして調べよう」というもので、それから隠し命令を探す日日が始まり、ディスアセンブル表の割り当てのないコードをデバッガでメモリ上に書き、ブレークポイントを設定して
TFR
でレジスタに値をセットして実行させ、レジスタの内容を見るという単調な作業が繰り返されました。その日わかった結果を、パソコン通信を通じて情報交換するうちに、いつとはなしに、
FM-11 で OS-9 をやっている、それもほとんど病気に近いマニアが集まり、「63C09
解析委員会」なる集団が自然発生しました。<br /><br /> マニアの執念は恐ろしいもので、ほどなく
  63C09 の拡張機能の大筋が判明しました。概略を述べると、<br /> ・3種類のレジスタが増設されており、そのうちの1つはアキュムレータとして、またインデックスレジスタとして使える<br /> ・32÷16
ビット除算、 16÷8 ビット除算、 16×16
ビット乗算、レジスタ間演算、ビット操作、ブロック転送などの命令が拡張されている。<br /> ・未定義の命令を検出した場合トラップがかかる<br /> ・6809
コンパチのモードと、 63C09 本来のモードの2種類の動作モードをもつ<br />といったところで、今までの
6809
で不便であった部分、弱点であった部分が相当改善されており、またとても8ビットMPUとは思えない強力な機能も含まれています。<br /><br /> これらの解析結果はNANNO−NETを皮切りに、いくつかの
BBS
にアップされました。多くのネットワーカーの方から多大な反響を得ましたが、より多くの方に「8ビットを超える8ビットMPU」
63C09 の全貌[ぜんぼう=entire body]を知っていただくために、 Oh!FM
の誌上お借りしてご報告します。
<br /><br /><br />*******************<br />63C09化の<br />メリット/デメリット<br /><br /> 日立から販売されている
HD63C09 は、モトローラの MC6809
とピンコンパチの8ビットMPUです。MPUの仕様は 6809
に拡張機能を付け加えた形のもので、
6809の上位コンパチになっています(未定義の命令を除く)。日立から公式発表はされていませんが、
63C09 の拡張機能を活用すると、
6809パソコンの処理能力を大幅に向上させることができます。<br /><br /> 6809パソコンのMPUを
63C09
に差し換えた場合のメリットは処理能力の向上につきます。その要因としては以下の3点が考えられます。<br /><br /> 1 高速クロック<br /> 2 拡張命令/拡張レジスタ<br /> 3 ネイティブモード<br /><br /> 1は当然のことで、MPUの動作クロックをあげればソフトの実行速度は上がります。未定義命令トラップやソフトウェアタイマの関係で引っかかる一部を除けば、従来のソフトが高速に動かせます。動作クロックの上昇率はハードにより異なり、場合によってはほとんどあげられないこともあります。<br /><br /> 2は、新規にソフトを書き起こすか、従来のソフトにパッチを当てたときに効果があります。従来の
6809 だとアセンブラのマクロ機能で表現していた処理の相当が 63C09
の1命令で書けるようになり、マシンサイクルを短縮できます。<br /><br /> 3は、
63C09固有のモードに切り換えて使うと、通常のエミュレーションモードよりも命令の実行サイクルが短くなることにより生じるものです。このモードを使うと、同じ動作クロックでも、通常より実行速度が最大20〜30%上がります(アドレッシングモードにより効果が違う)。ただ、スタックや割り込み関係で動作が異なる点があるので、ネイティブモードを利用するには
F-BASIC なり OS-9 なりのシステムの一部を書き換える必要があります。<br /><br /> 逆にデメリットとしては、まず、
6809
の未定義命令を使ったソフトに限らず、多くの内蔵タイマを使った市販ソフトその他が動作不良を起こしてしまうであろうこと、また、
63C09
の拡張機能を活用するにはそれらを活用するための開発ツールを自作できるくらいのそれなりの腕が必要で、ノービスには難しいことが難点といえるでしょう。<br /><br /> メリットとデメリットを比較すると、現状では自分でプログラム書くだけの人ならその恩恵を受けることができるが、ごく一般のゲームユーザやアプリユーザは決して手を出さないほうがよい、といったところでしょう。<br /><br /> それでは、
63C09 の拡張機能について以下順番に解説していきます。<br /><br />************<br />拡張レジスタ<br /><br /> 63C09
では 6809 よりレジスタの数が3つ増えています(図1)。そのうち2つは
16ビットのレジスタで、もう1つは8ビットのモードステータスレジスタです。<br /><br /><br />Wレジスタ[16ビット]<br />~~~~~~~~~~~~~~~~~~~~<br /> アキュムレータとしても、インデックスレジスタとしても使用できる
16ビットレジスタです。<br /><br /> アキュームレータとして使うときは、 16ビ<br /><br />--------<br />Oh!FM
1988-4 73<br />--------<br />[Oh!FM 1988-4: p.74]<br /><br />ットレジスタとしてのほか、2つの8ビットレジスタ(E/Fレジスタ)に分割して使うこともできます。ちょうど、既存の
D/A/Bレジスタがもう1組増えたようなものです。ただし、 AND/OR 等の W/E/F
レジスタでは使えない命令もあります。<br /><br /> また、既存の
Dレジスタと連結して
32ビットレジスタ(Qレジスタ)として使うことができ、乗除算のときに利用します。<br /><br /> インデックスレジスタとして使うときは、既存の
X/Yレジスタと同様に利用します。この場合、 6809
でポストバイトに用いられていないビットパターンを使用します。Wレジスタをインデックスレジスタとして使用したときの、アキュムレータオフセットと5ビットオフセット、8ビットのコンスタントオフセットはありません。<br /><br /> また、特徴な使い方として、ブロック転送でのカウンタレジスタとして使う方法があります。<br /><br />Vレジスタ[16ビット]<br />~~~~~~~~~~~~~~~~~~~~<br /> Vレジスタを使う命令は、レジスタ間演算命令や
TFR
などに限られています。Vレジスタの特徴は、MPUをリセットしてもレジスタの値が変化しないことです。このレジスタをOSなどで定数等を保持するような目的に便利でしょう。<br /><br />MDレジスタ[8ビット]<br />~~~~~~~~~~~~~~~~~~~~<br /> モード/ステータスビットレジスタの略で、除算実行時のエラー検出や未定義命令トラップの作動チェック、動作モードの設定など、
63C09
になって増えたモードやステータスの表示に用いられます。各ビットの意味は次のとおりです。<br /> ・ビット7 R 除算で
0 で割ったときに 1 がセットされる<br /> ・ビット6 R 未定義命令をフェッチしたときに
1 がセットされる<br /> ・ビット1 W FIRQ時のレジスタの退避モード設定ビット<br /> 0
-> FIRQ時、 PC と CC のみスタックに退避<br /> 1 -> FIRQ時、
すべてのレジスタを退避<br /> ・ビット0 W 動作モード設定ビット<br /> 0
-> エミュレートモード<br /> 1 -> ネイティブモード<br />なお、リセット時にはすべてのビットは
0 になります。<br />
<br />
</p>
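<p><i>[Editorial sketch, not part of the original article: the MD register bit assignments listed above can be written out as constants. The Python below is only an illustration of those bit meanings. The names <code>MD_DIV_ZERO</code>, <code>MD_ILLEGAL</code>, <code>MD_FIRQ_ALL</code>, <code>MD_NATIVE</code>, <code>describe_md</code> and <code>mode_bits</code> are made up for this sketch, and nothing here touches real hardware.]</i></p>

```python
# Hypothetical names; bit positions follow the MD register list in the article.
MD_DIV_ZERO = 0x80  # bit 7 (read): set to 1 when a division by zero occurred
MD_ILLEGAL = 0x40   # bit 6 (read): set to 1 when an undefined instruction was fetched
MD_FIRQ_ALL = 0x02  # bit 1 (write): 1 = FIRQ stacks all registers, 0 = PC and CC only
MD_NATIVE = 0x01    # bit 0 (write): 1 = native mode, 0 = emulation mode

MD_RESET = 0x00     # the article states that every bit is 0 after reset


def describe_md(md):
    """Describe an 8-bit MD value using the bit meanings listed above."""
    return {
        "divide_by_zero": bool(md & MD_DIV_ZERO),
        "undefined_instruction": bool(md & MD_ILLEGAL),
        "firq_stacks_all": bool(md & MD_FIRQ_ALL),
        "mode": "native" if md & MD_NATIVE else "emulation",
    }


def mode_bits(native, firq_as_irq):
    """Build the value one would load into MD for the requested settings."""
    value = MD_RESET
    if native:
        value |= MD_NATIVE
    if firq_as_irq:
        value |= MD_FIRQ_ALL
    return value
```

<p><i>[For example, <code>mode_bits(True, False)</code> gives the value 1, the article's "write 1 to bit 0" step for entering native mode.]</i></p>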
<pre><code>図1 63C09レジスタ構成
__________________________________________________
|----------Q----------|
|----D----| |----W----|
|-A-| |-B-| |-E-| |-F-| アキュムレータ
----------- |----X----| X インデックスレジスタ
----------- |----Y----| Y インデックスレジスタ
----------- |----U----| ユーザスタックポインタ
----------- |----S----| システムスタックポインタ
----------- |---PC----| プログラムカウンタ
----------- |----V----| V(alue) レジスタ
----------- |---DP----| ダイレクトページレジスタ
----------------- |CC-| コンデションコードレジスタ
----------------- |MD-| モード/ステータスレジスタ
__________________________________________________
</code></pre>
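<p><i>[Editorial sketch, not part of the original article: Figure 1 (図1) above shows A and B forming D, E and F forming W, and D and W forming Q. The Python below models that pairing as plain integers. It assumes the left-hand register in the figure is the high-order half, as with A:B on the 6809; the helper names <code>join8</code>, <code>join16</code> and <code>split_q</code> are hypothetical.]</i></p>

```python
def join8(high, low):
    """Combine two 8-bit halves into one 16-bit value (A:B gives D, E:F gives W)."""
    return ((high & 0xFF) << 8) | (low & 0xFF)


def join16(high, low):
    """Combine two 16-bit halves into one 32-bit value (D:W gives Q)."""
    return ((high & 0xFFFF) << 16) | (low & 0xFFFF)


def split_q(q):
    """Split a 32-bit Q value back into its four 8-bit parts (A, B, E, F)."""
    d = (q >> 16) & 0xFFFF  # assumed high half: D
    w = q & 0xFFFF          # assumed low half: W
    return (d >> 8, d & 0xFF, w >> 8, w & 0xFF)


# Build Q from A=$12, B=$34, E=$56, F=$78
d = join8(0x12, 0x34)
w = join8(0x56, 0x78)
q = join16(d, w)
print(hex(q))
```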
<br />
<br />
*********<br />動作モード<br /><br /> 63C09には2つの動作モードがあります。1つは
6809 とのコンパチビリティを考えたエミュレートモードで、もう1つは 63C09
の本来の機能を引き出すネイティブモードです。<br /><br /> と、いうと、「拡張レジスタや拡張命令が使えるのがネイティブモードで、使えないのがエミュレートモードだな」と思われる方もいるでしょうが、それはハズレです。
63C09 では、拡張レジスタと拡張命令をどちらのモードでも使えます。<br /><br /> エミュレートモードとネイティブモードの違いは、インタラプト時のスタックの扱いの違いです。インタラプトがかかったときレジスタの内容はスタックに退避されますが、そのとき従来からのレジスタだけを退避させるのがエミュレートモードで、拡張レジスタの
W レジスタも退避させるのがネイティブモードです。<br /><br /> 63C09
をリセットした直後はエミュレートモードに設定されています。このモードでは 6809
のソフトが問題なく動作する代わり、マルチタスクで拡張レジスタを使うときに気をつけなければなりません。たとえば、拡張レジスタの
W レジスタをカウンタに使うブロック転送命令 TFM
を使ったケースを考えます。タスクAで TFM
命令を使用中に、インタラプトをかけて、タスクBに移ったとしましょう。そのときにタスクBで
W レジスタを使用したとしたら、また元のタスクAに戻ったときに W
レジスタの中身が変更されていますので、誤動作を起こします(もちろんシングルタスクで使用するときや、マルチタスクでも1つのタスクでしか拡張レジスタを使用しないときは問題ありません)。よって、エミュレートモードでは、拡張レジスタを使用するときはいちいちインタラプト禁止して、さらにこのレジスタを一度スタックにセーブしてからでないと、別タスクに切り換えてはなりません。<br /><br /> これでは、高速ゲームや
OS-9
から拡張レジスタを使いづらいうえ、使いにくい拡張命令も発生します。そのために、拡張レジスタと拡張命令を使うことを前提にしたモード、ネイティブモードが存在します。このモードでのインタラプトは
PC, U, Y, X, DP, W, D, CC
の順にスタックにレジスタを退避して割り込み処理に入ります。ここで注意してほしいのは、
W は DP と D の間にあるということです。これは、 D と W の 32ビットレジスタペア Q
としてスタックに退避するという意味です。<br /><br /> ネイティブモードの特徴としては、もう1つ、命令のマシンサイクル短縮があげられます。その結果、アドレシングモードによって
20 ~ 35%高速に動作します。<br /><br /> とくにダイレクト、エクステンド、インヘラント、で顕著[けんちょ=remarkable,
striking, conspicuous]にその効果が現れます。<br /><br />--------<br />74 Oh!FM
1988-4<br />--------<br />[Oh!FM 1988-4: p.75]<br /><br /> なお、ネイティブモードでも
V レジスタと MD レジスタはその性格上退避されませんので注意してください。<br /><br /> エミュレートモードからネイティブモードに移行するには、新設された
MD レジスタのビット 0 (LSB) に 1 を書き込むことによって実現します。<br /><br /> さて、
63C09 のモードには上の2つのほかに、 FIRQ
のスタック退避モードが用意されています。ご存知のように、 6809 では FIRQ
をフェッチすると、 PC と CC
のみをスタックに退避してインタラプト処理ルーチンへ分岐します。しかし、制御用のボードマイコンの場合、
FIRQ より IRQ がもう1つあったほうが便利なケースがあります。しかし、 63C09 は
6809
とピンコンパチを[?謳=うた?]っていますので、足の配置を変えるわけにはいきません。そこで、
FIRQ を IRQ
として使用できるように、スタックの退避をすべてのレジスタが行うようにモードをソフトで切り換えられるようになっています。
FIRQ を IRQ として使用する場合は MD レジスタのビット 1 に 1
を書くことによって実現します。<br /><br /><br />********<br />トラップ<br /><br /> 63C09
は以下の現象が発生したときにトラップがかかります。<br /><br /> 1 未定義命令がフェッチされたとき<br /> 2 除算命令の
DIV 命令で 0 で割ったとき<br /><br /> トラップがかかると、エミュレートモードでは
PC, U, Y, X, DP, B, A, CC の順に、ネイティブモードでは PC, U, Y, X, DP, W, B, A,
CC の順に S レジスタにレジスタをプッシュした後、 $FFF0
のアドレスに書いてあるベクタに分岐します ($FFF0 は 6809 では
RESERVE)。このトラップはリセットの次の割り込み優先度があります。なお、未定義命令かゼロ・ディバイドかを判定する命令として BITMD
命令があります。<br /><br /> このトラップのため、未定義命令を使っている 6809
のソフトが動作しなくなりますが、代わりに
OS-9/68000等で使われているトラップライブラリを組めるようになります。たとえば、未定義命令に浮動小数点演算プロセッサの呼び出しを割り当てておくと、その命令を未定義命令トラップに引っ掛け、処理ルーチンに飛ばすことが可能になります。このトラップライブラリを利用すると、オブジェクトのサイズをかなり縮められるので便利でしょう。<br /><br /><br />********<br />拡張命令<br /><br /> 63C09
拡張命令には、既存の命令の対応レジスタを増やした追加命令と新規に設けられた新設命令に分けられます。<br /><br /> 新設命令としては、レジスタ間演算命令や、ブロック転送命令、乗算/除算命令、ビット操作命令、ビット演算/転送命令等の命令があります。<br /><br />追加命令<br />~~~~~~~~<br /><br /> 63C09
では、既存の命令も拡張されていて、対応するレジスタが増えています。<br /><br /> たとえば、今までありそうでなかった
TSTD, ADCD
などが追加されています。これらは従来でもアセンブラ上でマクロを使って表現できましたが、これらを使うことによりマシンサイクルを短縮できます。<br /><br /> また、
ADD や SUB などの命令では、 E/F/W
レジスタが増えたことにより、それに対応する命令が増えています。いわば、 A/B/D
レジスタがもう1組増えたようなもので、プログラミングの柔軟性が増します。ただし、
A/B/D
レジスタで使える命令がすべて対応しているわけではありませんので注意してください(図2)。<br /><br /> 既存の命令の中で追加の度合いが大きいのは
TFR と EXG 命令でしょう。 TFR, EXG
命令では、対象レジスタの指定にポストバイトのビットパターンを用います。 63C09
ではレジスタが増えていますので、そのビットパターンの組み合わせも増えていて
(0110->W, 0111->V, 1110->E, 1111->F),
レジスタアドレッシングとでもいったらよい状態になっています。このレジスタアドレッシングは新設命令のレジスタ間演算でも使用しています。ここで注意しなければいけないのは、本当の未定義レジスタ番号を指定した場合、
63C09 と 6809 とでは動作が異なるということです。<br />
<br />
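<p><i>[Editorial sketch, not part of the original article: the register codes given above (0110 for W, 0111 for V, 1110 for E, 1111 for F) can be combined with the standard 6809 TFR/EXG codes to decode a postbyte such as the $62 in the "$1F, $62" sequence from the discovery section. The 6809 codes in the table are taken from the usual 6809 instruction set documentation rather than from this article, and the function name <code>decode_tfr</code> is hypothetical. Codes that neither table defines are returned as <code>None</code>; the sketch does not try to model what either chip actually does with them.]</i></p>

```python
# Standard 6809 register codes for the TFR/EXG postbyte (not from the article).
REGS_6809 = {
    0x0: "D", 0x1: "X", 0x2: "Y", 0x3: "U", 0x4: "S", 0x5: "PC",
    0x8: "A", 0x9: "B", 0xA: "CC", 0xB: "DP",
}

# Extra codes the article reports for the 63C09.
REGS_63C09_EXTRA = {0x6: "W", 0x7: "V", 0xE: "E", 0xF: "F"}


def decode_tfr(postbyte, cpu="63C09"):
    """Return (source, destination) register names for a TFR postbyte.

    The high nibble selects the source and the low nibble the destination.
    A code with no entry in the table is reported as None.
    """
    table = dict(REGS_6809)
    if cpu == "63C09":
        table.update(REGS_63C09_EXTRA)
    source = table.get((postbyte >> 4) & 0x0F)
    destination = table.get(postbyte & 0x0F)
    return source, destination


print(decode_tfr(0x62))              # the sequence that tripped up the 63C09
print(decode_tfr(0x62, cpu="6809"))  # the same postbyte read with 6809 codes only
```

<p><i>[Read with the 63C09 table, $62 names W as the source and Y as the destination, which fits the article's account of a value surviving a round trip through the "undefined" register number.]</i></p>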
<code>
<pre>図2 アキュムレータで行える処理
____________________________________
| A | B | E | F | D | W | Q |
CLR | o | o | o | o | o | o | |
INC | o | o | o | o | o | o | |
DEC | o | o | o | o | o | o | |
TST | o | o | o | o | o | o | |
COM | o | o | o | o | o | o | |
NEG | o | o | | | o | | |
SEX | o*| o*| | | o*| o*| |
ASL/LSL| o | o | | | o | | |
ASR | o | o | | | o | | |
LSR | o | o | | | o | o | |
ROL | o | o | | | o | o | |
LD | o | o | o | o | o | o | o |
ST | o | o | o | o | o | o | o |
ADD | o | o | o | o | o | o | |
SUB | o | o | o | o | o | o | |
CMP | o | o | o | o | o | o | |
ADC | o | o | | | o | | |
SBC | o | o | | | o | | |
AND | o | o | | | o | | |
OR | o | o | | | o | | |
EOR | o | o | | | o | | |
BIT | o | o | | | o | | |
MUL | o*| o*| | | o | | |
DIV | | | | | | | o |
____________________________________
* ワークとして使用
</pre>
</code>
<br />
レジスタ間演算命令<br />~~~~~~~~~~~~~~~~~~<br /><br /> 6809での演算は、ほとんどレジスタ対メモリないしイミディエイト値で行われていました。そのため
A レジスタと B レジスタの値の AND
をとりたい場合は、どちらかのレジスタをメモリ上にストアしてから演算(この場合は
AND)を行わなければなりませんでした。 63C09
ではこれが解決されていてレジスタ同士の演算が可能になりました。これらは TFR や
EXG と同じレジスタアドレッシングを用います。<br /><br /> レジスタ間演算命令は以下のようなものがあります。<br /><br /> ADDR,
ADCR, SUBR, SBCR,<br /> ANDR, ORR, EORR, CMPR<br /><br /><br />ブロック転送命令<br />~~~~~~~~~~~~~~~~<br /><br /> 6809
でメモリ上のデータを移動させるときは、一度そのデータをレジスタにロードしてきては、それをセーブするということを繰り返して行っていました。これはこれでよいのですが、問題はその処理にかかる時間です。そこで
Z80 や 8086 などにもあるブロック転送命令が、 63C09 にも設けられています。<br /><br />--------<br />Oh!FM
1988-4 75<br />--------<br />[Oh!FM 1988-4: p.76]<br /><br /> ブロック転送命令では、転送元アドレス(ソース)、転送先アドレス(ディスティネーション)の指定に
16ビットレジスタの D/X/Y/U/S
レジスタの中から1〜2使います。レジスタの指定にはポストバイトを使い、その形式はレジスタアドレッシングの形式をとります。また、ソースとディスティネーションを同じレジスタでも指定できます。転送するバイト数のカウントには
W レジスタを使います。<br /><br /> 転送方法には4種類あり、正方向(TFM
r0+,r1+)/逆方向(TFM r0-,r1-)の通常のブロック転送のほか、 I/O
ポート等のアドレスにデータを次々と流し込むもの(TFM
r0+,r1)、指定ブロックを指定値で塗りつぶすもの(TFM r0,r1+)があります。<br /><br /><br />乗算/除算命令<br />~~~~~~~~~~~~~~<br /><br /> 6809
には MUL という 8×8ビットの乗算命令がありましたが、これは A レジスタと B
レジスタの値を掛け合わせるだけのものでした。 63C09
で設けられた16×16ビット乗算命令(MULD)では、いろいろなアドレシングモードが使え、追加というよりは新設に近いものです。<br /><br /> また、
63C09
にはそれに加えて16÷8ビット除算(DIVD)、32÷16ビット除算(DIVQ)が設けられて、これらも、いろいろなアドレシングモードが使えるようになっています。<br /><br />ビット操作命令(6301
コンパチ命令)<br />~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~<br /><br /> 日立の HD6301
には、 6801 の拡張命令としてビット操作命令が新設されていましたが、同じ 63
シリーズの 63C09
にも同じ命令があります。これらの命令はイミディエイトデータとメモリの内容を論理演算して、結果をメモリに戻したり、関連コンディションコードを変化させてしまうので、ビットパターンを操作するときなどに重宝します。<br /><br /> 行える論理演算には、論理積(AIM)、論理和(OIM)排他的論理和(EIM)、論理積コンディションコード(TIM)があります。オブジェクトの構成は、<br /><br /> <命令コード>、<ビットの位置>、<オペランド><br /><br />の順になっています。<br /><br /> これらの命令を使うと、
6301 の命令はマクロアセンブラを用いれば 63C09 上で実行可能になります。つまり
OASYS Lite 等の組み込みプログラムを FM
上で動かすことができるかも知れないわけで、そういう意味でも美味しい命令なのです(もっとも、その前に根性で
ROM を逆アセンブルしなければなりませんが)。<br /><br />ビット演算/転送命令<br />~~~~~~~~~~~~~~~~~~~~<br /><br /> 63C09
には、多分に I/O
を意識したビット演算/転送命令が存在しています。これらの命令は、アドレシングモードにダイレクトモードしかサポートしていない難はありますが、使い慣れれば便利に使えるでしょう。動作は、ダイレクトページの
LABEL のビット n と REG レジスタのビット m を論理演算して、 REG
レジスタに入れるものがほとんどです。ビット演算/転送命令には以下のようなものがあります。<br /><br /> BAND,
BOR, BEOR, BIAND,<br /> BIOR, BIEOR, LDBT, STBT<br /><br />オブジェクトの構成は<br /><br /> <命令コード
($11,$xx)>、<ポストバイト>、<オペランド><br /><br />と、かなり変則な構成をとります。オペランドはダイレクトアドレシングのみです。また、ポストバイトは特殊な形式をとります。<br /><br />その他の命令<br />~~~~~~~~~~~~<br /><br /> その他の命令としては、先ずモード切り換え命令があります。といっても、エミュレートモードからネイティブモードへの移行は、
MD レジスタのビット 0 (LSB) に1を書き込むことによって行われますので、 MD
レジスタに対する普通の LD 命令を使います。<br /><br /> 次にトラップがかかったとき、未定義命令でかかったのか除算のエラーで起こったのかを調べる命令
BITMD があります。これは、 MD レジスタのステータスビット(ビット 7 or
6)を調べ、どちらでトラップがかかったのかを知らせます。ただし、この命令を実行すると
MD レジスタのステータスビット(ビット 7 and
6)はクリアされますので、未定義コードトラップか、 Divide by Zero
トラップかは、一度きりしか調べることはできません。<br /><br /> そして、スタックに関するものがあります。
63C09 ではレジスタが増設されていますが、現在の PSHS/PSHU
ではそれらをスタックにセーブすることはできません。なぜなら、PSHS/PSHU および
PULS/PULU
のポストバイトが、すでにすべて割り当てられていて追加の余地がないということです。そこで、
63C09 では増設されたレジスタへのスタック操作は別命令の<br /><br /> PSHSW,
PULSW, PSHUW, PULUW<br /><br />を使います。ただし、これは W
  レジスタに対するもののみしかありません。よって、これらはポストバイトをもたないインヘレントアドレシングのみです。<br /><br /><br />********<br />おわりに<br /><br /> 以上
63C09 に隠されていた機能の大筋を説明してきました。 6809
に+αで追加してほしかった機能がほぼ盛り込まれており、
6809派にはひさびさの好ニュースといえます。ただ、惜しむらくは登場時期が遅かったことで、そのため活躍の場がパソコンの改造か、産業用ワンボードマイコン程度に限られてしまったことです。ゲームパソコンも
68000系や 8086系、 65816 といった16ビットCPUを使い始めている現状では、
63C09
を積んだパソコンをメーカーが出荷することは、まずありえないことでしょう。当面は市場アプリに依存しない
FM-11等の改造にしか使えないというのは残念なことです。<br /><br /> なお、この稿をまとめるにあたっては、私が解析した資料のほか、
63C09 解析委員会の仲間(とくに Gigo氏と
Miyazaki氏)が解析された資料を参照させていただきました。 63C09
解析委員会関係者のご協力に感謝いたします。<br /><br /><br />--------------------------------<br /><参考文献><br /><br />・63C09解析委員会、「お年玉プレゼント
63C09 に隠し機能があった」など、NANNO−NET、1988年1月1日〜<br />・モトローラ、「MC6809-MC6809Eマイクロプロセッサプログラミングマニュアル」、CQ出版、1982年<br />・「6809
インストラクションポケットブック」、Oh!FM 1983年第4号、日本ソフトバンク<br />・水谷隆太、「6809の未定義命令」、I/O
1985年5月号、工学社<br />・原進、「FM-11 のクロックを 3MHz
に」、パソコンワールド 1987年1月号、ピーシーワールドジャパン<br /><br />--------<br />76 Oh!FM
1988-4<br />--------<br />[Oh!FM 1988-4: p.77]<br />
<br />
<code><pre>図3 63C09で増えた命令(灰色に塗られた部分[=>*囲*])
(
横の列は、上からプリバイトなし、プリバイト $10付き、フリバイト $11付きの順に並んでいる。また、
各項目中の下段左側の数値はサイクル数で、カッコ内がネイティブモード時の値。右側は命令長。
)
<i>[原稿の編集に因る誤植在り。 Original contains typographical errors.]</i>
=================================================================================================================================================================|
| DIRECT | | | REL |ACC A/D/E|ACC B/W/F| INDEXD | EXTEND | IMMED | DIRECT | INDEXD | EXTEND | IMMED | DIRECT | INDEXD | EXTEND |
|0000xxxx|0001xxxx|0010xxxx| 0011xxxx | 0100xxxx| 0101xxxx| 0110xxxx|0111xxxx|1000xxxx|1001xxxx| 1010xxxx|1011xxxx|1100xxxx|1101xxxx| 1110xxxx|1111xxxx|
| 0x | 1x | 2x | 3x | 4x | 5x | 6x | 7x | 8x | 9x | Ax | Bx | Cx | Dx | Ex | Fx |
=================================================================================================================================================================|
0000 0 | NEG | (PRE) | BRA | LEAX | NEGA | NEGB | NEG | NEG | SUBA | SUBA | SUBA | SUBA | SUBB | SUBB | SUBB | SUBB |
(なし) | 6(5),2 | (BYTE1)| 3,2 | 4+,2+ | 2(1),1 | 2(1),1 | 6+,2+ | 7(6),3 | 2,2 | 4(3),2 | 4+,2+ | 5(4),3 | 2,2 | 4(3),2 | 4+,2+ | 5(4),3 |
--------|--------|--------|--------|------------|---------|---------|---------|--------|--------|--------|---------|--------|--------|--------|---------|--------|
| | | |* addr *|* negd *| | | |* subw *|* subw *|* subw *|* subw *| | | | |
($10) | | | | 4,3 | 3(2),2) | | | | 5(4),4 | 7(5),3)|7+(6+),3+| 8(6),4 | | | | |
--------|--------|--------|--------|------------|---------|---------|---------|--------|--------|--------|---------|--------|--------|--------|---------|--------|
| | | |* band *| | | | |* sube *|* sube *|* sube *|* sube *|* subf *|* subf *|* subf *|* subf *|
($11) | | | | 7(6),4 | | | | | 3,3 | (4),3 | 5+,3+ | 6(5),4 | 3,3 | 5(4),3 | 5+,3+ | 6(5),4 |
=================================================================================================================================================================|
0001 1 |* oim *| (PRE) | BRN | LEAY | | |* oim *|* oim *| CMPA | CMPA | CMPA | CMPA | CMPB | CMPB | CMPB | CMPB |
(なし) | 6,3 | (BYTE2)| 3,2 | 4+,2+ | | | 7+,3+ | 7,4 | 2,2 | 4(3),2 | 4+,2+ | 5(4),3 | 2,2 | 4(3),2 | 4+,2+ | 5(4),3 |
--------|--------|--------|--------|------------|---------|---------|---------|--------|--------|--------|---------|--------|--------|--------|---------|--------|
| | | LBRN |* adcr *| | | | |* cmpw *|* cmpw *|* cmpw *|* cmpw *| | | | |
($10) | | | 5,4 | 4,3 | | | | | 5(4),4 | 7(5),3)|7+(6+),3+| 8(6),4 | | | | |
--------|--------|--------|--------|------------|---------|---------|---------|--------|--------|--------|---------|--------|--------|--------|---------|--------|
| | | |* biand *| | | | |* cmpe *|* cmpe *|* cmpe *|* cmpe *|* cmpf *|* cmpf *|* cmpf *|* cmpf *|
($11) | | | | 7(6),4 | | | | | 3,3 | 5(4),3 | 5+,3+ | 6(5),4 | 3,3 | 5(4),3 | 5+,3+ | 6(5),4 |
=================================================================================================================================================================|
0010 2 |* aim *| NOP | BHI | LEAS | | |* aim *|* aim *| SBCA | SBCA | SBCA | SBCA | SBCB | SBCB | SBCB | SBCB |
(なし) | 6,3 | 2(1),1 | 3,2 | 4+,2+ | | | 7+,3+ | 7,4 | 2,2 | 4(3),2 | 4+,2+ | 5(4),3 | 2,2 | 4(3),2 | 4+,2+ | 5(4),3 |
--------|--------|--------|--------|------------|---------|---------|---------|--------|--------|--------|---------|--------|--------|--------|---------|--------|
| | | LBHI |* subr *| | | | |* sbcd *|* sbcd *|* sbcd *|* sbcd *| | | | |
($10) | | |5/6(5),4| 4,3 | | | | | 5(4),4 | 7(5),3)|7+(6+),3+| 8(6),4 | | | | |
--------|--------|--------|--------|------------|---------|---------|---------|--------|--------|--------|---------|--------|--------|--------|---------|--------|
| | | |* bor *| | | | | | | | | | | | |
($11) | | | | 7(6),4 | | | | | | | | | | | | |
=================================================================================================================================================================|
0011 3 | COM | SYNC | BLS | LEAU | COMA | COMB | COM | COM | SUBD | SUBD | SUBD | SUBD | ADDD | ADDD | ADDD | ADDD |
(なし) | 6(5),2 | 2,1 | 3,2 | 4+,2+ | 2(1),1 | 2(1),1 | 6+,2+ | 7(6),3 | 4(3),3 | 6(4),2 |6+(5+),2+| 7(5),3 | 4(3),3 | 6(4),2 |6+(5+),2+| 7(5),3 |
--------|--------|--------|--------|------------|---------|---------|---------|--------|--------|--------|---------|--------|--------|--------|---------|--------|
| | | LBLS |* sbcr *|* comd *|* comw *| | | CMPD | CMPD | CMPD | CMPD | | | | |
($10) | | |5/6(5),4| 4,3 | 3(2),2 | 3(2),2 | | | 5(4),4 | 7(5),3)|7+(6+),3+| 8(6),4 | | | | |
--------|--------|--------|--------|------------|---------|---------|---------|--------|--------|--------|---------|--------|--------|--------|---------|--------|
| | | |* bior *|* come *|* comf *| | | CMPU | CMPU | CMPU | CMPU | | | | |
($11) | | | | 7(6),4 | 3(2),2 | 3(2),2 | | | 5(4),4 | 7(5),3)|7+(6+),3+| 8(6),4 | | | | |
=================================================================================================================================================================|
0100 4 | LSR | sexw | BHS/BCC| PSHS | LSRA | LSRB | LSR | LSR | ANDA | ANDA | ANDA | ANDA | ANDB | ANDB | ANDB | ANDB |
(なし) | 6(5),2 | 4,1 | 3,2 | 5+(4+),2 | 2(1),1 | 2(1),1 | 6+,2+ | 7(6),3 | 2,2 | 4(3),2 | 4+,2+ | 5(4),3 | 2,2 | 4(3),2 | 4+,2+ | 5(4),3 |
--------|--------|--------|--------|------------|---------|---------|---------|--------|--------|--------|---------|--------|--------|--------|---------|--------|
| | |LBHS/BCC|* andr *|* lsrd *|* lsrw *| | |* andd *|* andd *|* andd *|* andd *| | | | |
($10) | | |5/6(5),4| 4,3 | 3(2),2 | 3(2),2 | | | 5(4),4 | 7(5),3)|7+(6+),3+| 8(6),4 | | | | |
--------|--------|--------|--------|------------|---------|---------|---------|--------|--------|--------|---------|--------|--------|--------|---------|--------|
| | | |* beor *| | | | | | | | | | | | |
($11) | | | | 7(6),4 | | | | | | | | | | | | |
=================================================================================================================================================================|
0101 5 | eim | | BLO/BCS| PULS | | |* eim *|* eim *| BITA | BITA | BITA | BITA | BITB | BITB | BITB | BITB |
(なし) | 6,3 | | 3,2 | 5+(4+),2 | | | 7+,3+ | 7,4 | 2,2 | 4(3),2 | 4+,2+ | 5(4),3 | 2,2 | 4(3),2 | 4+,2+ | 5(4),3 |
--------|--------|--------|--------|------------|---------|---------|---------|--------|--------|--------|---------|--------|--------|--------|---------|--------|
| | |LBLU/BCS|* orr *| | | | |* bitd *|* bitd *|* bitd *|* bitd *| | | | |
($10) | | |5/6(5),4| 4,3 | | | | | 5(4),4 | 7(5),3 |7+(6+),3+| 8(6),4 | | | | |
--------|--------|--------|--------|------------|---------|---------|---------|--------|--------|--------|---------|--------|--------|--------|---------|--------|
| | | |* bieor *| | | | | | | | | | | | |
($11) | | | | 7(6),4 | | | | | | | | | | | | |
=================================================================================================================================================================|
0110 6 | ROR | LBRA | BNE | PSHU | RORA | RORB | ROR | ROR | LDA | LDA | LDA | LDA | LDB | LDB | LDB | LDB |
(なし) | 6(5),2 | 5(4),3 | 3,2 | 5+(4+),2 | 2(1),1 | 2(1),1 | 6+,2+ | 2,2 | 7(6),3 | 4(3),2 | 4+,2+ | 5(4),3 | 2,2 | 4(3),2 | 4+,2+ | 5(4),3 |
--------|--------|--------|--------|------------|---------|---------|---------|--------|--------|--------|---------|--------|--------|--------|---------|--------|
| | | LBNE |* eorr *|* rord *|* rorw *| | |* ldw *|* ldw *|* ldw *|* ldw *| | | | |
($10) | | |5/6(5),4| 4,3 | 3(2),2 | 3(2),2 | | | 4,4 | 6(5),3 | 6+,3+ | 7(6),4 | | | | |
--------|--------|--------|--------|------------|---------|---------|---------|--------|--------|--------|---------|--------|--------|--------|---------|--------|
| | | |* ldbt *| | | | |* lde *|* lde *|* lde *|* lde *|* ldf *|* ldf *|* ldf *|* ldf *|
($11) | | | | 7(6),4 | | | | | 3,3 | 5(4),3 | 5+,3+ | 6(5),4 | 3,3 | 5(4),3 | 5+,3+ | 6(5),4 |
=================================================================================================================================================================|
0111 7 | ASR | LBSR | BEQ | PULU | ASRA | ASRB | ASR | ASR | | STA | STA | STA | | STB | STB | STB |
(なし) | 6(5),2 | 9(7),2 | 3,2 | 5+(4+),2 | 2(1),1 | 2(1),1 | 6+,2+ | 7(6),3 | | 4(3),2 | 4+,2+ | 5(4),3 | | 4(3),2 | 4+,2+ | 5(4),3 |
--------|--------|--------|--------|------------|---------|---------|---------|--------|--------|--------|---------|--------|--------|--------|---------|--------|
| | | LBEQ |* cmpr *|* asrd *| | | | |* stw *|* stw *|* stw *| | | | |
($10) | | |5/6(5),4| 4,3 | 3(2),2 | | | | | 6(5),3 | 6+,3+ | 7(6),4 | | | | |
--------|--------|--------|--------|------------|---------|---------|---------|--------|--------|--------|---------|--------|--------|--------|---------|--------|
| | | |* stbt *| | | | | |* ste *|* ste *|* ste *| |* stf *|* stf *|* stf *|
($11) | | | | 8(7),4 | | | | | | 5(4),3 | 5+,3+ | 6(5),4 | | 5(4),3 | 5+,3+ | 6(5),4 |
=================================================================================================================================================================|
1000 8 | ASL/LSL| | BVC | |ASLA/LSLA|ASLB/LSLB| ASL/LSL | ASL/LSL| EORA | EORA | EORA | EORA | EORB | EORB | EORB | EORB |
(なし) | 6(5),2 | | 3,2 | | 2(1),1 | 2(1),1 | 6+,2+ | 7(6),3 | 2,2 | 4(3),2 | 4+,2+ | 5(4),3 | 2,2 | 4(3),2 | 4+,2+ | 5(4),3 |
--------|--------|--------|--------|------------|---------|---------|---------|--------|--------|--------|---------|--------|--------|--------|---------|--------|
| | | LBVC |* pshsw *|* asld *| | | |* eord *|* eord *|* eord *|* eord *| | | | |
($10) | | |5/6(5),4| 5/6,2 | 3(2),2 | | | | 5(4),4 | 5(5),3 |7+(6+),3+| 8(6),4 | | | | |
--------|--------|--------|--------|------------|---------|---------|---------|--------|--------|--------|---------|--------|--------|--------|---------|--------|
| | | |*tfm(r+,r+)*| | | | | | | | | | | | |
($11) | | | | 6+3n,3 | | | | | | | | | | | | |
=================================================================================================================================================================|
1001 9 | ROL | DAA | BVS | RTS | ROLA | ROLB | ROL | ROL | ADCA | ADCA | ADCA | ADCA | ADCB | ADCB | ADCB | ADCB |
(なし) | 6(5),2 | 2(1),1 | 3,2 | 5(4),1 | 2(1),1 | 2(1),1 | 6+,2+ | 2,2 | 7(6),3 | 4(3),2 | 4+,2+ | 5(4),3 | 2,2 | 4(3),2 | 4+,2+ | 5(4),3 |
--------|--------|--------|--------|------------|---------|---------|---------|--------|--------|--------|---------|--------|--------|--------|---------|--------|
| | | LBVS |* pulsw *|* rold *|* rolw *| | |* adcd *|* adcd *|* adcd *|* adcd *| | | | |
($10) | | |5/6(5),4| 6,2 | 3(2),2 | 3(2),2 | | | 5(4),4 | 7(5),3 |7+(6+),3+| 8(6),4 | | | | |
--------|--------|--------|--------|------------|---------|---------|---------|--------|--------|--------|---------|--------|--------|--------|---------|--------|
| | | |*tfm(r-,r-)*| | | | | | | | | | | | |
($11) | | | | 6+3n,3 | | | | | | | | | | | | |
=================================================================================================================================================================|
1010 A | DEC | ORCC | BPL | ABX | DECA | DECB | DEC | DEC | ORA | ORA | ORA | ORA | ORB | ORB | ORB | ORB |
(なし) | 6(5),2 | 3(2),2 | 3,2 | 3(1),1 | 2(1),1 | 2(1),1 | 6+,2+ | 7(6),3 | 2,2 | 4(3),2 | 4+,2+ | 5(4),3 | 2,2 | 4(3),2 | 4+,2+ | 5(4),3 |
--------|--------|--------|--------|------------|---------|---------|---------|--------|--------|--------|---------|--------|--------|--------|---------|--------|
| | | LBPL |* pshuw *|* decd *|* decw *| | |* ord *|* crd *|* ord *|* ord *| | | | |
($10) | | |5/6(5),4| 6,2 | 3(2),2 | 3(2),2 | | | 5(4),4 | 7(5),3 |7+(6+),3+| 8(6),4 | | | | |
--------|--------|--------|--------|------------|---------|---------|---------|--------|--------|--------|---------|--------|--------|--------|---------|--------|
| | | |* tfm(r+,r)*|* dece *|* decf *| | | | | | | | | | |
($11) | | | | 6+3n,3 | 3(2),2 | 3(2),2 | | | | | | | | | | |
=================================================================================================================================================================|
1011 B |* tim *| | BMI | RTI | | |* tim *|* tim *| ADDA | ADDA | ADDA | ADDA | ADDB | ADDB | ADDB | ADDB |
(なし) | 4,3 | | 3,2 | 6/15(17),1 | | | 5+,3+ | 5,4 | 2,2 | 4(3),2 | 4+,2+ | 5(4),3 | 2,2 | 4(3),2 | 4+,2+ | 5(4),3 |
--------|--------|--------|--------|------------|---------|---------|---------|--------|--------|--------|---------|--------|--------|--------|---------|--------|
| | | LBMI |* puluw *| | | | |* addw *|* addw *|* addw *|* addw *| | | | |
($10) | | |5/6(5),4| 4,3 | | | | | 5(4),4 | 7(5),3 |7+(6+),3+| 8(6),4 | | | | |
--------|--------|--------|--------|------------|---------|---------|---------|--------|--------|--------|---------|--------|--------|--------|---------|--------|
| | | |* tfm(r,r+)*| | | | |* adde *|* adde *|* adde *|* adde *|* addf *|* addf *|* addf *|* addf *|
($11) | | | | 6+3n,3 | | | | | 3,3 | 5(4),3 | 5+,3+ | 6(5),4 | 3,3 | 5(4),3 | 5+,3+ | 6(5),4 |
=================================================================================================================================================================|
1100 C | INC | ANDCC | BGE | CWAI | INCA | INCB | INC | INC | CMPX | CMPX | CMPX | CMPX | LDD | LDD | LDD | LDD |
(なし) | 6(5),2 | 3(2),2 | 3,2 | 20(22),2 | 2(1),1 | 2(1),1 | 6+,2+ | 7(6),3 | 4(3),3 | 6(4),2 |6+(5+),2 | 7(5),3)| 3,3 | 5(4),2 | 5+,2+ | 6(5),3 |
--------|--------|--------|--------|------------|---------|---------|---------|--------|--------|--------|---------|--------|--------|--------|---------|--------|
| | | LBGE | |* incd *|* incw *| | | CMPY | CMPY | CMPY | CMPY | |* ldq *|* ldq *|* ldq *|
($10) | | |5/6(5),4| | 3(2),2 | 3(2),2 | | | 5(4),4 | 7(5),3 |7+(6+),3+| 8(6),4 | | 8(7),3 | 8+,3+ | 9(8),4 |
--------|--------|--------|--------|------------|---------|---------|---------|--------|--------|--------|---------|--------|--------|--------|---------|--------|
| | | |* BITMD *|* ince *|* incf *| | | CMPS | CMPS | CMPS | CMPS | | | | |
($11) | | | | 4,3 | 3(2),2 | 3(2),2 | | | 5(4),4 | 7(5),3 |7+(6+),3+| 8(6),4 | | | | |
=================================================================================================================================================================|
1101 D | TST | SEX | BLT | MUL | TSTA | TSTB | TST | TST | BSR | JSR | JSR | JSR |* ldq *| STD | STD | STD |
(なし) | 6(4),2 | 2(1),1 | 3,2 | 11(10),1 | 2(1),1 | 2(1),1 |6+(5+),2+| 7(6),3 | 7(6),2 | 7(6),2 | 7+(6+),2| 8(6),4 | 5,5 | 5(4),2 | 5+,2+ | 6(5),3 |
--------|--------|--------|--------|------------|---------|---------|---------|--------|--------|--------|---------|--------|--------|--------|---------|--------|
| | | LBLT | |* tstd *|* tstw *| | | | | | | |* stq *|* stq *|* stq *|
($10) | | |5/6(5),4| | 3(2),2 | 3(2),2 | | | | | | | | 8(7),3 | 8+,3+ | 9(8),4 |
--------|--------|--------|--------|------------|---------|---------|---------|--------|--------|--------|---------|--------|--------|--------|---------|--------|
| | | |* ldmd *|* tste *|* tstf *| | |* divd *|* divd *|* divd *|* divd *| | | | |
($11) | | | | 5,3 | 3(2),2 | 3(2),2 | | | 25,3 |27(26),3| 27+,3+ |28(27),4| | | | |
=================================================================================================================================================================|
1110 E | JMP | EXG | BGT | | | | JMP | JMP | LDX | LDX | LDX | LDX | LDU | LDU | LDU | LDU |
(なし) | 3(2),2 | 8(5),2 | 3,2 | | | | 3+,2+ | 4(3),3 | 3,3 | 5(4),2 | 5+,2+ | 6(5),3 | 3,3 | 5(4),2 | 5+,2+ | 6(5),3 |
--------|--------|--------|--------|------------|---------|---------|---------|--------|--------|--------|---------|--------|--------|--------|---------|--------|
| | | LBGT | | | | | | LDY | LDY | LDY | LDY | LDS | LDS | LDS | LDS |
($10) | | |5/6(5),4| | | | | | 4,4 | 6(5),3 |6+(6+),3+| 7(6),4 | 4,4 | 6(5),3 |6+(6+),3+| 7(6),4 |
--------|--------|--------|--------|------------|---------|---------|---------|--------|--------|--------|---------|--------|--------|--------|---------|--------|
| | | | | | | | |* divq *|* divq *|* divq *|* divq *| | | | |
($11) | | | | | | | | | 34,4 |36(35),3| 36+,3+ |37(36),4| | | | |
=================================================================================================================================================================|
1111 F | CLR | TFR | BLE | SWI | CLRA | CLRB | CLR | CLR | | STX | STX | STX | | STU | STU | STU |
(なし) | 6(5),2 | 6(4),2 | 3,2 | 19(21),1 | 2(1),1 | 2(1),1 | 6+,2+ | 7(6),3 | | 5(4),2 | 5+,2+ | 6(5),3 | | 5(4),2 | 5+,2+ | 6(5),3 |
--------|--------|--------|--------|------------|---------|---------|---------|--------|--------|--------|---------|--------|--------|--------|---------|--------|
| | | LBLE | SWI2 |* clrD *|* clrw *| | | | STY | STY | STY | | STS | STS | STS |
($10) | | |5/6(5),4| 20(22),2 | 3(2),2 | 3(2),2 | | | | 6(5),3 |6+(6+),3+| 7(6),4 | | 6(5),3 |6+(6+),3+| 7(6),4 |
--------|--------|--------|--------|------------|---------|---------|---------|--------|--------|--------|---------|--------|--------|--------|---------|--------|
| | | | SWI3 |* clre *|* clrf *| | |* muld *|* muld *|* muld *|* muld *| | | | |
($11) | | | | 20(22),2 | 3(2),2 | 3(2),2 | | | 28,4 |30(29),3| 30+,3+ |31(30),4| | | | |
=================================================================================================================================================================|
[** These are the typographical errors I noticed in the above table: ]
[** Several other places appear to be errors and need to be checked.]
[*1* Cycle count for sube direct mode ($1190) is clearly a typo, see subf: (4),3 => 5(4),3 .]
[*2* Long branches with dual mnemonics (LBHS/LBCC=$1024 and LBLO/LBCS=$1025) abbreviate the second mnemonic in the original to fit the table.]
[*3* First mnemonic (LBLU) for LBLO/LBCS ($1025) is a typo.]
[*4* The cycle and byte counts for ROR extended ($76) and LDA immediate ($86) are swapped: ROR extended => 7(6),3 and LDA immediate => 2,2 .]
[*5* The byte count for LBSR ($17) is a typo: 2=>3 .]
[*6* The cycle and byte counts for ROL extended ($79) and ADCA immediate ($89) are swapped: ROL extended => 7(6),3 and ADCA immediate => 2,2 .]
[*7* The mnemonic (crd) for ord direct ($109A) is a typo.]
[*8* The cycle and byte counts for puluw ($103B) appear to be a typo.]
[*9* The byte count for CMPX indexed ($AC) is a typo: 2=>2+ .]
[*10* The op-code at which ldq immediate is shown ($CD) looks maybe out-of-place (=>$10CC?), needs to be checked.]
[*11* The cycle counts for LDY, STY, LDS, and STS indexed ($10AE, $10AF, $10EE, $10EF) look odd: 6+(6+), need to be checked.]
[*12* The mnemonic for clrd ($104F) is a typo: clrD .]
[** Do not assume that I found them all.]
</pre></code>
<br />
<p>--------<br />Oh!FM 1988-4 77<br /></p>
零石http://www.blogger.com/profile/01111094813708912513noreply@blogger.com0tag:blogger.com,1999:blog-3954576973420765520.post-28809399292871679492022-05-01T15:19:00.037+09:002022-05-01T16:04:01.578+09:00Panasonic Let's note CF-NX2 Japanese Keyboard<p>For reference, this is the Japanese keyboard on the Panasonic Let's note model CF-NX2:</p><p></p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjCxOpnCF2tgo4zVSsD8BXE8pEhu6yIOX7FCkuWJWhbG2XkW74AmIh4CJIsRncCboWxn1z-agHpYodvxzMgdP7-16dayl-HthTQIA-WpzgFjAmMOXat27JNz1-qFSrJHaBvrFr3D7nyhJI-6sZNlk1g1lbKcjx7lkK8C775Ou4XsGutf2bks72D4N4qGQ/s1518/Lets_note_keyboard_hi.jpeg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="908" data-original-width="1518" height="382" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjCxOpnCF2tgo4zVSsD8BXE8pEhu6yIOX7FCkuWJWhbG2XkW74AmIh4CJIsRncCboWxn1z-agHpYodvxzMgdP7-16dayl-HthTQIA-WpzgFjAmMOXat27JNz1-qFSrJHaBvrFr3D7nyhJI-6sZNlk1g1lbKcjx7lkK8C775Ou4XsGutf2bks72D4N4qGQ/w640-h382/Lets_note_keyboard_hi.jpeg" width="640" /></a></div><p><br />This is a fairly standard layout for Japanese keyboards since sometime around the early 1990s.<br /></p><p>Note that the ¥ (yen) key is by default mapped by Ubuntu 18 to the \ (backslash) character for most applications, unless you are using a Japanese input method and have Japanese input selected.<br /></p><p></p><p> </p>零石http://www.blogger.com/profile/01111094813708912513noreply@blogger.com0tag:blogger.com,1999:blog-3954576973420765520.post-67584488409481972602022-02-20T01:08:00.392+09:002022-02-24T00:00:35.901+09:00When the IBM 9000 Scientific Actually Existed (PC History and Context)<p><i>
</i><i>(This is not about the more recent mainframe called Enterprise
System/9000. It's about a scientific (originally) workstation in the
early 1980s called the <a href="https://en.wikipedia.org/wiki/IBM_System_9000" target="_blank">System 9000</a>.)</i> <br /></p><p>I was sure that the existence of the 68000-based IBM System 9000 scientific workstation actually predated the 8088-based IBM PC model 5150,
but all sorts of articles say it started development after the machine we know as the IBM PC and was released in
1982 or 1983.
</p>
<p>Nope, nope, and nope. I rediscovered something today (late Fri. night, Feb. 19, '22, when I probably really should have been going to bed). My memory was not wrong:</p>
<p><a href="http://www.columbia.edu/cu/computinghistory/cs9000.html">http://www.columbia.edu/cu/computinghistory/cs9000.html</a>
</p>
<p>
(A picture says a lot. I'd like to post a picture here, but the only one I can find is the one at the end of the link above, which doesn't offer liberal use terms. That one shows a vertically stacked system with a plotter on top of a system unit that includes function keys on a slanted panel above the keyboard tray, and, mounted above the plotter, a monitor with floppy disk drives to its left. It looks like something you'd use in a lab.) <br /></p><p>So there were running prototypes in 1980. Not released until 1982/83, but running prototypes
in 1980. <br />
</p>
<p>IBM shot itself in the foot on this one. Big time. </p>
<p>Effectively gave the industry to Bill Gates and Microsoft.</p>
<p>And I have to remind myself of a little bit of religious mystery: </p>
<p>
Microsoft was the only company -- well, one of the few companies -- willing to sell a defective operating system --
which thus allowed a lot of people to make money off fixing Microsoft's
problems. (That's the short version. Maybe I shouldn't say "defective". "Unfinished" would
probably be a better way to put it.) And it was good enough to start letting some people start keeping their
family history work on their computers, among other things. That's why God (or natural consequence, for you atheists
and agnostics out there) let Microsoft take over the market.
</p>
<p>(I should note that shooting yourself in the foot is what this world is for. Shooting yourself in the foot doesn't have to be a fatal error, and, even though it's painful, it can be a good experience if you learn from it.) <br /></p><p> </p>
<hr width="50%" />
<p></p>
<p>Unpacking that:<br /></p>
<p>
First, when talking about software, unfinished and defective mean essentially
the same thing -- that it doesn't (yet) perfectly satisfy the customer's needs. </p><p>But
there is no such thing as finished software. If you have a piece of software
that's in use, someone is going to be regularly finding ways in which it
doesn't meet spec, and other ways in which it could be extended and improved. </p><p>Even now, software always has defects. Nobody sells truly finished software. When talking about software, finished means dead. </p><p>But back in the 1970s and 1980s, most companies in the nascent software market intended to at least try to make their products free of known defects before they put them on the market. Microsoft, on the other hand, was willing to put products out that were known (by their own estimate) to be only 80% finished. That meant that the customers could be using it while they worked on a better version.</p><p>(It also meant customer frustration because of overly optimistic interpretation of sales literature claims. I got bitten hard several times by that, and, yes, the pain is still there. The second time, yeah, I should have been more wary. The third time, I guess it was my fault for taking the claims too literally yet again. Talk about painful mistakes, but I've learned that Microsoft's stuff doesn't work very well for me.)<br /></p>
<p>
In the 1990s, Microsoft made too much of this principle with their 80/20 rule
of getting a product out the door when 80% of the function was implemented,
and letting the customer help figure out the remaining 20%. All too often, it
was closer to 20/80 in my opinion, but even that is not exactly wrong in the
agile approach to technology.
</p>
<p>
(<a href="https://en.wikipedia.org/wiki/Agile_software_development" target="_blank">"Agile" is actually a discipline</a>
that I approve of in concept, if not in extant implementations. But I'm being
a little more general than Agile techniques here.)<br />
</p>
<p>
Other companies, including IBM, still tended back in the 1980s to try too hard to give a product too
much polish before turning it over to the customer. That approach may make for
more satisfied engineers (maybe), but gives the customer less say in how the
product should develop, and less opportunity to revise and refine their ideas
of their requirements early on, while the product requirements are easier to
rework.<br />
</p>
<p>
Microsoft BASIC is one example of this principle. (And Tiny BASIC is an even more extreme
example.)
</p>
<p>
Dartmouth stripped down the definition of the Fortran (ForTran &lt;= Formula
Translator) language to produce a definition of a BASIC (Beginners'
All-purpose Symbolic Instruction Code or something like that) language.
Fortran was too complicated (for some meaning of complicated) for ordinary
people to understand, so they made it simpler. More important, Fortran had to be compiled, meaning only people with a compiler could use it, putting it even further out of reach of ordinary people.<br /></p>
<p>
But even the Dartmouth definition of BASIC was more than your usual user thought they wanted at the
time. A programmable calculator with a large screen was just fine for an awful lot of
purposes, and was more than they/we had before then.
</p>
<p>
So Paul Allen and Bill Gates borrowed (with tacit non-disapproval) a certain company's PDP
minicomputer at night for a couple of months and worked up a stripped-down derivative of a derivative of Dartmouth's BASIC (and in
the present intellectual property regime would have probably owed a lot of
royalties to Dartmouth and DEC among others) and got it running on the very
early microcomputers, starting with the 8080 but continuing to the 6502, 6800, and many others.</p><p>(If I've made it sound easy, they were definitely ignoring their studies during those two months they worked on the first 8080 version, using understanding acquired from previous experience elsewhere, and putting in long hours for the whole two months.) <br /></p>
<p>Microsoft BASIC was very incomplete. But it filled a big need. And customers were able to
give them feedback, which was very important. (A different, somewhat freely distributable version of BASIC, Tiny BASIC, was even more incomplete, but it filled a similar big need in a more varied, but smaller overall market.)<br /></p>
<p>
Family history, at the time, was a field in which the professionals had very
arcane rules about paper and ink to use, format of data, data to include, and
so forth. As incomplete as anyone's implementation of BASIC was, programs written in BASIC were able to essentially help the
researcher get the data into computer files fairly correctly and fairly
painlessly.
</p>
<p>
(Family and personal history are a couple of the hidden real reasons for the
need for personal computing systems.)<br />
</p>
<p>
The same sort of thing happened with Microsoft DOS and Windows operating
systems. They were incomplete and even defective in many ways, but they provided a framework under which a variety of programs useful in
business applications could be written and shared/sold. </p><p>CP/M from Digital Research was more complete than DOS, but more expensive. So was the Macintosh, from Apple. (Microware's OS-9/6809 was very nicely done for the time and, on Radio Shack's Color Computer, was priced more within reach, but it had an image of being too technical for the average user, and Microware really wasn't trying hard to sell it in the general market.) <br /></p>
<p>
Essentially, the incomplete (or defective) nature of Microsoft's products provided
a virtual meeting ground for the cooperation of a large number of smart
people to fix, enhance, and extend the products. <br /></p><p>Similar things could be said for Commodore's 6502-based computers,
but they had the limits of an 8-bit architecture, and Jack Tramiel and Commodore's board of directors were way
too content just selling lots of cheap stuff and letting the customers figure out what to do with it. </p><p>When Commodore picked up the (68000-based) Amiga, after Tramiel had left the company, it didn't have a way for people to bridge the gap between Commodore's earlier 8-bit stuff and the Amiga. <br /></p><p>One thing that can be said about Microsoft: they understood managing and selling upgrades. <br /></p><p>Incompleteness and even defects can be a valuable feature of a technological product.<br />
</p><p>So this much was not bad, really. It was when Microsoft got too big for their
britches in the mid-1980s that things started going really south, and when Microsoft
refused to give up their developed tacit monopoly in the mid-to-late 1990s
that things went permanently south.<br />
</p>
<p>
Compare this to Apple's Macintosh? Apple had good technological reasons to keep
closer control at the time. Even the 68000 didn't have enough oomph to provide
a stable graphical user experience if too many people got their hands in the
pie. But the lack of approachability did ultimately hurt the Macintosh's acceptance in
the marketplace, more so than the price to the end-user. The technological reasons for doing so notwithstanding, maintaining that control hurt their market acceptance.</p>
<p>
Intel's development of the 8086 followed a similar pattern of required upgrades, moving from less complete to more, although they
almost killed themselves with the 80286.<br />
</p>
<p> Both Intel and Microsoft are now eating the consequences of trying too hard to
be the only dog. Too big, too heavy, too much momentum in the technology of
the past. And we, the market, are eating the consequences of letting them get into that
position. <br /></p>
<p>All the big companies are in the way. When a company gets too big, it can't help but get in the way, especially when they are busy trying to establish or keep the tacit monopoly position in their market. (I'm very concerned about Google, even though I use their stuff. Even though it works for me now -- Microsoft's stuff never worked very well for me. Even though Google's stuff works for me now, I'm looking for alternatives. Monopolies are not a good thing.)<br /></p><p>I've wandered from the topic. </p><p>Yes, using the 68000 would have required IBM to work harder to keep focused on a limited introductory feature set similar to the 8088 IBM PC.<br /></p>
<p>So what about the IBM 9000 and Motorola's 68000 and the IBM 5150?</p>
<p>
Could IBM have based their personal computer offering on the 68000 instead of the 8088 and been
successful?
</p>
<p>
The IBM 9000 has been regularly taken up, along with Apple's Lisa, as an
example of how developing a PC based on the 68000 would be prohibitively expensive
for the personal computer market.
</p>
<p>
But both are more of an example of how the 68000 allowed an awful lot more to be
done than the 8086 -- so much more that over-specification became way too
easy. It had lots of address space, decent speed at 16 bits, not slowed down
significantly at 32 bits. And it was hard to tell the idealists in the company (the board of directors and the sales crew) who wanted to add
features to a product, no, we can't do that yet, until it was too late and the product had departed significantly
from what the customers wanted and was way over budget and way past the delivery date or market window. That's a significant part of the reason both
the 9000 and the Lisa did not come out in 1980, and ultimately did not do well in the market.
</p>
<p>
The Macintosh is a counter-example. Much tighter focus in development, more accessible entry
price, more accessible product. (And borrowing heavily from the lessons of the Lisa.) </p><p>I often say that the 68000 may have been "too good" for the market at the time, since it seems to have required someone with Steve Jobs' focus and tenacity, and the lessons of the Lisa, to successfully develop the Macintosh.<br />
</p>
<p>
The 9000 was targeted at the scientific community, and it was intended to be a
"complete" (meaning all parts IBM) solution. That kept it off the market too
long and kept the price high. <br /></p><p>Could IBM have stripped down the 9000 design and built a machine comparable to the IBM PC with a 68000 and sold
it at a comparable price? Or even started over with a simpler goal and successfully developed a 68000-based PC?
</p><p>People have been "explaining" that it would have been "prohibitively expensive" for an awfully long time.</p>
<p>Sure, the 9000 was significantly more expensive than a PC, but it came with significantly more stuff. Expensive stuff. A complete, (relatively) solid real-time OS in ROM, for instance, 128K of ROM vs. the 8088 PC's 20K ROM with BIOS+BASIC only; base configuration included disk integrated into the OS, vs. no disk and no OS beyond BASIC in the original IBM PC. 128K of RAM vs. the IBM PC's 16K in base configuration (not just double, 8 times the amount). <br /></p><p>The base configuration of the 9000 came with floppy disk storage, touch-panel display, keyboard designed for laboratory use, ports to interface it to scientific instruments, and something called memory management. All of that was very expensive stuff at the time. (I don't remember if the plotter was standard in the base configuration.) </p><p>And it was expensive to put any of that stuff on an 8088 PC. </p><p>Speaking of memory management, some people thought the segment registers in the 8086 were for "memory management", but that's just plain wrong. They were not designed for the same class of function. No mechanism to control class of access, no bounds registers to set segment size, no help when trying to page things in and out of memory. More of a cheap substitute for bank switching. Again, it was not a bad thing, just not what some people thought it was. <br /></p><p>FWIW, the 68000 didn't need bank switching at the time because of the large address bus. And it came with 8 address registers, 7 of which could easily be used as true segment registers without the clumsy 16-byte granularity of the 8086 segment registers. Segment size still had to be enforced in software on the 68000, rather than hardware. <br /></p>
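<p>For reference, a minimal sketch of how the 8086 forms a physical address from a segment register and an offset, which is where the 16-byte granularity comes from. This is Python used as a calculator, not period code, and the function name is just something I made up for illustration:</p>

```python
def physical_address(segment: int, offset: int) -> int:
    """8086 real-mode address: 16-bit segment times 16, plus 16-bit offset.

    The result is wrapped to 20 bits, since the 8086 has 20 address lines.
    Nothing here checks a segment size or an access class, because the
    hardware has no such mechanism.
    """
    return (segment * 16 + offset) % 0x100000


# Segment bases can only land on 16-byte boundaries.
print(hex(physical_address(0x1000, 0x0000)))  # 0x10000
print(hex(physical_address(0x1001, 0x0000)))  # 0x10010

# Many segment:offset pairs name the same byte; the segments overlap freely.
print(physical_address(0x1234, 0x0010) == physical_address(0x1235, 0x0000))  # True

# Running off the top of the 1 MB space simply wraps around.
print(hex(physical_address(0xFFFF, 0x0010)))  # 0x0
```

<p>Compare that with a 68000 address register, which can hold any byte address directly, so a base pointer can sit anywhere. Either way, nothing in hardware enforces a segment size.</p>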
<p>As the guy at the link I posted above said, strip the 9000 down to the kind of machine the original PC was and it would have been very competitive in 1980.
</p>
<p>How much more would it have cost than the original IBM PC?
</p>
<p>8 extra data lines. 4 extra address lines. 12 extra traces on the mainboard, twelve more lines in the expansion bus, two extra buffers. And they could've fudged on the extra address lines, left them out of the first model and kept the first model limited to a single megabyte address space. A megabyte of address space was huge at the time. </p><p>Other than the bus connectors, less than ten dollars, including markup. <br /></p><p>Bus connectors were often mentioned as a blocking point at the time. They had sources for the bus connector they used in the System/23 Datamaster, and the connector was not overly expensive. But, with just 52 lines, it was just big enough for 8 data bits, 20 address bits, and the control signals and power lines they wanted. They would have either had to get a wider connector, or they would have had to use a second connector like the one in the AT bus from the outset. Forty dollars (after markup) for just the bus connectors seemed large to them, I guess, even though I think they should have been able to see that the cost would come down at the volumes they would be purchasing, even with their woefully underestimated demand.</p><p>RAM chips? The original board laid them out in four banks of eight. Arranging that as 2 by 16 instead of 4 by 8 would not have killed any budgets. The one minor disadvantage was that you physically had to start with twice the base configuration RAM, 16 chips instead of 8. (The max on-mainboard would not have changed.) But that does not seriously harm the end price, either. Calculating the price as a portion of the base sticker price for the 8088 IBM PC, it would have added something like fifty dollars.
</p>
<p>BIOS ROM? The original had four 2K ROMs, didn't it? Arrange those in 16-bit wide pairs and you're done. Same count of ROMs, one extra buffer. Three bucks added for liberal markup on the buffer and PC board traces. (I know there were engineers who felt that it was somehow sacrilege to use a pair of 8-bit ROMs instead of a single 16-bit wide ROM, but, no, it wasn't. And, like I say, it did not really add significantly to the cost to do it in pairs, if you're going to have four ROMs anyway.)</p><p>The ROM for BASIC? Yeah, that would have had to be done as a pair of ROMs, so add, I think it was, twenty dollars at manufacturing prices plus markup for two 8K by 8 ROMs instead of 15 dollars for a single 16K by 8 ROM. <br /></p>
<p>Would the 68000's declaimed lower code density have meant more ROM? </p><p>No. Code density on the 68000 was not worse than on the 8086, unless you deliberately wrote the 68000 code like you were transliterating 8086 code. </p><p>The 68000 does not have as good code density as the 8-bit 6809, but the 6809 is exceptionally code dense when programmed by someone who knows what he is doing. Different question.</p><p>If you were using compilers of the time to do the coding, compilers for the 68000 were often really weak on using the 68000's instruction set and addressing modes. The compiler engineers really seemed to be writing the code generators like they were writing for, who knows? Some other, much more limited CPU.</p><p>If an engineer did the BIOS and BASIC in assembler and didn't bother to learn the CPU, sure, he would get similarly bad results. But the 68000 had a regular instruction set. It shouldn't take more than, say, eight hours of playing around with a prototyping kit (such as Motorola's inexpensive ECB) to understand. <br /></p><p>Ah. Microsoft. Yeah. Their BASIC for the 6809 was not a model of either space or cycle efficiency. Apparently they did just map 8080 registers to 6809 registers and key 8080 instructions to something close in the 6809 instruction/addressing mode repertoire, and did the conversion automatically with no cleanup afterward. So, if they had gone that route, using Microsoft's BASIC, they'd have had to use a second 16K ROM. Add 15 dollars.</p><p>Do I fault Microsoft for their BASIC for 6809? Yeah, I guess I do, a little. It's a dog. Sorry. Some people think it's representative of 6809 code. No.<br /></p>
<p>Peripheral chips? I've heard people talk about lack of 16-bit peripheral chips for the 68000. Why do they talk about that? I don't know.<br /></p><p>The 68000 specifically included instructions to support the use of 8-bit port parts without any additional buffers or data bus translation. Hanging a 6821 on a 68000 was literally no more difficult than hanging one on a 6800 or 6809. Likewise the 6845 video controller that got used in the original PC. And non-Motorola parts would be no more difficult. Completely beside the point.<br /></p>
<p>Price of the processor, yes. But not the four times price that is often tossed around. That's a small lot price. IBM's projected manufacturing, underestimated as it was, still would have allowed much better pricing on the CPU. So the 68000 would have added a hundred dollars to the cost of the first run. </p><p>Availability? Motorola was always conservative on availability estimates. It was not an actual problem, although I suppose some of IBM's purchasing department might not have known that. <br /></p>
<p>Scaling the operating system down from the 9000's? That could have been a problem.</p><p>But IBM ultimately infamously went to a third party for the OS of the 5150 PC, anyway. </p><p>Operating systems aren't that hard to port to a CPU as capable as the 68000. I'm sure Microware would have been game for porting OS-9/6809 to the 68000 a couple of years earlier than actually happened. OS-9 was a good OS that was cheap enough for Radio Shack to sell for the 6809-based Color Computer, and it took them less than a year to go from the 6800 to the 6809 with both OS-9/6809 and Basic09 in 1979/1980. On to the 68000 in 1980/1981 instead of 1981/1982? Not a problem. </p><p>Some people worry about the position independent coding used in OS-9, but the 68000, like the 6809, directly supports PC-relative addressing (IP-relative, in Intel-speak), so you don't need a linking loader. No need for relocation tables. A module can load at any address and be linked and used by multiple processes, themselves loaded in arbitrary locations without patching long lists of addresses. </p><p>You point an index/address register to the base of the module, and the caller knows the offsets, and the callee doesn't care where it is loaded. Everything is relative. That's why OS-9 can be real-time, multitasking with no MMU.<br /></p><p>Another possibility for third party OS and BASIC was Technical Systems Consultants, the producers of TSC Flex (similar to CP/M) and Uniflex (like a stripped-down Unix, but in a different way from OS-9 -- not real-time, not position-independent). They knew their way around assembly language on Motorola's CPUs, too. </p><p>Several possible third-party sources.<br /></p><p>Total price increase? $200, maybe $250.<br /></p>
<p>Pushing the price up to $1750 from $1500 was not prohibitive, not even a problem, if they had simply done the necessary market research. <br /></p>
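<p>Adding up the per-item guesses above, as a rough sanity check. The figures are the ones quoted in this post, not researched prices. I am counting "less than ten dollars" as ten, treating the forty dollars for bus connectors as all added cost, and including the second BASIC ROM that only applies if Microsoft's BASIC were used. OS licensing is left out. The labels are mine.</p>

```python
# Per-item added cost in dollars, after markup, as estimated in the text above.
added_cost = {
    "extra traces and buffers (upper bound)": 10,
    "wider or second bus connector": 40,
    "16 RAM chips instead of 8 in base configuration": 50,
    "BIOS ROMs in 16-bit pairs (extra buffer, traces)": 3,
    "BASIC in two 8K ROMs instead of one 16K ROM": 20 - 15,
    "second 16K ROM for an unoptimized BASIC": 15,
    "68000 instead of 8088, at volume pricing": 100,
}

BASE_PRICE = 1500  # rough base sticker price of the 8088 PC used in the text

total_added = sum(added_cost.values())

for item, dollars in added_cost.items():
    print(f"{dollars:>4}  {item}")
print(f"{total_added:>4}  total added")
print(f"{BASE_PRICE + total_added:>4}  estimated sticker price")
```

<p>That comes to 223 dollars added, or about $1723, which is where the "$200, maybe $250" comes from.</p>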
<p>
</p>
<p>
</p><p>This sounds theoretical? </p><p>I did some tentative design work about that time, similar to Motorola's ECB prototyping board for the 68000. Comparing it to the IBM PC specs, using factory volume prices on the CPU, they could have sold a 68000-based equivalent with a base configuration of 32K RAM instead of 16K to make the data bus easy, and still sold it at about 200, maybe 250 dollars more than the original IBM PC, without losing money. The market would have accepted that.<br /></p><p>I mean, look at what people were paying for OS-9/6809 systems in 1981. Well, okay, those systems came with at least one disk drive in base configuration for something more than $2000 total. But the acceptance would have been there.<br /></p>I remember from conversations with people who worked at IBM that there was concern about the PC competing with the System/34. That was a sort-of valid concern, really. But that's the only thing that might have required them to set the price above $2000. But, no, the System/34 is a lot more than just CPU, just like the 9000 was. It was misplaced concern.<br /><p>I've mentioned the 6809 above, and I'll mention it again here. Motorola shot themselves in the foot by failing to upgrade the 6809 to be a true 16-bit CPU, either with expanded index registers, or with full-width segment registers, to break the 64K address barrier without external bank switching.</p><p>They also shot themselves in the foot by over-designing the 68020. Way overdone, that one. If they had gone directly in 1984 to what they eventually called the CPU 32 instead, the early market for RISC CPUs would have taken a temporary but serious ding.<br /></p><p>Was the 8088 PC a mistake, then?</p><p>Not exactly. </p><p>I rather think that going only with the 68000 would have been a mistake of a different sort, and I'm pretty sure I would have complaints about that, as well. 
<br /></p><p>I think I've hinted at some of the reasons above, but I can sort-of summarize by noting that the reason monopolies are bad in technological fields is that all the technology we have is unfinished, broken outside the context in which it was designed. We can't escape that. Solutions are always context-specific. And technology that is too powerful from the start can really get in the way.<br /></p><p>So, I say all of this, and yet I say that the 8088-based 5150 we know and love as the IBM PC was a mistake. <br /></p><p>I personally think, if they had been really smart, they would have released both the 8088 PC and a 68000 PC at the same time, running relatively compatible OSses and software. Deliberately sowing the field with variety keeps options open for everyone. <br /></p><p>Variety helps the market more than hinders it. But that is a separate topic for a separate rant another day. (Or an alternate reality novel. This is part of a theme I'm trying to explore in <a href="https://defining-computers.blogspot.com/2021/11/alternate-reality-33209-pt-1.html" target="_blank">a novel that I never seem to have time to work on any more</a>.)<br /></p>
零石http://www.blogger.com/profile/01111094813708912513noreply@blogger.com0tag:blogger.com,1999:blog-3954576973420765520.post-83670538005248282942021-12-22T22:55:00.031+09:002022-03-21T23:43:09.810+09:00Alternate Reality -- 33209 (pt. 2, summer 1981)<div style="text-align: center;">
***** Alternate Reality Warning *****<br />
</div>
<p></p>
<p>
So, for whatever reason, you are reading my fantasies about an alternate world where computer and information technology took a significantly different turn in the early 1980s. Maybe you started with the different <a href="https://defining-computers.blogspot.com/2021/10/alternate-reality-early-microcomputer.html" target="_blank">Mac OS-9 that could have been</a>. Maybe you have read <a href="https://defining-computers.blogspot.com/2021/11/alternate-reality-33209-pt-1.html" target="_blank">the beginning of the 33209 technology timeline</a> or the beginning of my story about
<a href="https://joelrees-novels.blogspot.com/2020/01/33209-2nd-Microcomputer-Revolution-Homecoming-TOC.html" target="_blank">the alternate world of the 33209</a>. </p>At this point, the timeline has gone well beyond what I have published of the 33209 story, but I'm going to keep plotting it out in advance because I've hit a writer's block on my stories. I'm not sure the timeline here will match the story as it plays out, though.<br /><p>
(You're not sure you're interested? How odd. ;-)
</p>
<p></p><h4 style="text-align: left;">* Summer 1981: <br /></h4><p>In late June, several engineering and CS students from the University of Texas's main campus in Austin come to visit, ask to join the group for the rest of summer, and are accepted in. Some of the UT Austin group are undergraduate, some are doing graduate work.<br /></p><p>
</p>
<p>
Microware formally begins a top-secret project to re-write OS-9/6809 to use a
split stack and merge in other features and concepts from <i>Susumu</i>. TSC begins publicly announced re-writes of both Flex and Uniflex, merging features from
<i>Susumu</i> into their systems. Both Microware and TSC begin developing
versions of their OSses for the 68000, as well, Microware secretly and TSC publicly. </p><p>IBM and Apple collaborate with the Microware and TSC OS projects, Apple
openly about the non-top-secret projects, IBM on the quiet. Various students from the group are invited to Clive and Chapel Hill for short internships, as well.<br /></p><p>Bill Mensch and others from Western Design Center request to be permitted to come observe the research group's work, and, after a short discussion among students and sponsors, are allowed to visit. <br /></p><p>By the end of June, the students working with TI's new version of the 9900 have gotten their prototypes running most of <i>Susumu</i>, better than the ports to the other processors except for the 6809, 68000 and 682801.</p><p>More games, and more complex games are evidence of the functionality of the language and the system, and, with other applications that the students develop and share, help motivate the students to add support for modularization and versioning (tracking and control) to the language. </p><p>Versioning drives the development of archiving tools, and those who hadn't yet built fast tape hardware for their systems do so, for the mass storage necessary for backup and archiving. Even at a mere 4 kilobaud with one bit per baud, one channel of one side of a forty-five minute cassette yields the storage capacity of several of the floppy diskettes of the time. (Cassettes longer than twenty-three minutes per side tend to stretch more quickly, so the students generally tend not to use 90-minute or two-hour cassettes, except for permanent one-time backups.)<br /></p><p>Several of the students develop control hardware for their cassette recorders or cassette tape decks, to allow various levels of computer control.<br /></p>
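<p>The cassette capacity claim above can be checked with a little arithmetic. This is only a rough sketch: it ignores leaders, inter-block gaps, and framing bits, and the floppy size used for comparison (about 90K for a single-sided, single-density 5.25 inch diskette) is my own assumption, not a figure from the timeline.</p>

```python
# Rough capacity of one channel of one side of a 45-minute cassette.
BITS_PER_SECOND = 4000        # "4 kilobaud with one bit per baud"
MINUTES_PER_SIDE = 45 / 2     # a 45-minute cassette runs 22.5 minutes per side
FLOPPY_BYTES = 90 * 1024      # assumption: roughly 90K per diskette


def cassette_side_bytes(bits_per_second=BITS_PER_SECOND, minutes=MINUTES_PER_SIDE):
    """Raw bytes on one channel of one side, ignoring gaps and framing."""
    total_bits = int(bits_per_second * minutes * 60)
    return total_bits // 8


raw = cassette_side_bytes()
print(raw, "raw bytes per channel per side")
print(round(raw / FLOPPY_BYTES, 1), "assumed diskettes' worth")
```

<p>Even after generous allowances for overhead, the result stays comfortably in "several diskettes" territory under these assumptions.</p>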
<p>
</p><p>Apple releases its first models of the Apple IX line running OS-9 on the 6809 in early July, using the 6847 for console video
output and the 6845 for optional workstation video output. The BIOS ROM is essentially <i>Susumu</i>.</p><p>Also in early July, more magazine articles appear, including rushed articles discussing early versions of Split P and the common code base. Outside the group, Split P generates particular interest at Bell Labs and the parent telephone companies.<br /></p>
<p>And also in early July, Motorola offers the group samples of the upgraded 6809s and 6845s, along with
integrated support chips for the upgraded 6845, and most of the
group who are not out for the summer volunteer to experiment with the parts in various
configurations, including using 6845s with non-Motorola CPUs.
</p>
<p>
Within a few days, students have Micro Chroma style computers with the upgraded 6809 running <i>Susumu</i>, using the 68471 for video output. Computers
employing the upgraded 6845 take a bit more time to get up and running, but
several are running within a week and work begins on improving the code generation for the processor's new features. </p><p>The UT Austin students arrange to move their coursework to the local
branch of UT for fall semester so they can continue to engage with the group. </p><p>Bill Mensch arranges with a couple of the UT Austin students to help him design an upgrade to the 6502 that will be optimized for <i>Susumu</i> and Split P, and returns to Mesa. </p><p>And also early in July, a couple of students get interested in text
windows on the Univac terminals and start trying to implement them in
their own computers. One of the teachers mentions the Xerox Parc project
and graphical user interfaces, and several students begin back-burner
on-again, off-again projects trying to figure out how to make such a
thing work. Various experiments in pointer devices are attempted. Joysticks are quickly produced, and functional light pens have been produced for the 6845-based displays by mid-July. Various attempts are made at tracking a pencil writing or sketching on a page, without much success.<br /></p><p>And again, also starting early in July, inspired in part by the dc/bc basic calculator the
students have discovered in Unix and in part by Tiny BASIC, a group of
students work to re-implement dc on top of the Forth grammar of Split
P, and then re-implement bc on top of the C grammar. They then extend
both with formatted output and other functions usually found in BASIC.
At IBM's request, they add graphics and sound functions, and certain
functions useful in a business environment.</p><p>Overall, July sees more general steady progress, with <i>Susumu</i> becoming stable on the upgraded version of the 9900, and, towards the end of July, on the other non-Motorola CPUs. </p>Discussion of a license for <i>Susumu</i> and the hardware the students are developing comes to a head, and a software license similar to the licenses Berkeley and MIT would later use in our reality for open source software is produced by the students and approved by the college, with help from the sponsoring companies. <p>Several hardware design licenses are produced and approved by the college, and the college and the sponsoring companies help arrange for patent research and applications.<br /></p><p>More stringent sharing agreements are established for participating in and getting support from the research group, similar to the OpenBSD project source code requirements and the software freedom terms of the GNU General Public License that would develop later in our reality. <br /></p><p>Unix on the 6801 remains a bit slow but usable. It is a bit snappier on the 682801, more so on the 6809, even more so on the new version of the 6809. </p><p>Unix is even more usable on the 68000, even though virtual memory management functions on the 68000 can't be handled well without a full memory management unit. Simple bank switching in an address space as large as the 68000's is, of course, not very workable for hardware memory management. And, of course, the students haven't quite figured out virtual memory yet, anyway, even
with the help of the UT students and some industry engineers. The complexities of virtual memory were not generally well understood at the time.<br /></p><p>Experiments begin with using four of the 68000's address registers as explicit run-time segment registers for Unix, mirroring their use in <i>Susumu,</i> to help work around lack of memory management.</p><p>Similar experiments in segmentation on the 8086 do not fare as well, because of the design of the 8086's segmentation, but still proceed. </p>In very early August, Motorola publicly announces sampling of an upgrade to the
6809: <br />
<ul style="text-align: left;">
<li>
Instruction timings for the 681809 are brought up to par with the 6801 and
682801, and the 681809 is announced as an SOC core with most of the I/O library for the 682801 available, including on-board DMA
and bank switching. <br /><br />
</li>
<li>
The 6809 already has direct page mode op-codes for the unary instructions,
so that doesn't change. However, the direct page mode has been added to the
index post-byte, allowing memory indirection on direct page variables and
use of the load effective address (LEA) instruction to calculate the address
of direct page variables.<br /><br />
</li>
<li>
A new I/O page register similar to the direct page register is added. Unlike the original DP register, there are no op-codes allocated for I/O page access. Addressing via the IOP register is implemented entirely as a new mode in the index mode post-byte, and access is entirely via arguments to the TFR and EXG instructions. <br /><br />Since no codes for the IOP are available to the PSH and PUL arguments, pushing and popping IOP require transfer through another register. Since the IO page is considered more of a hardware design constant, this is considered to probably not be a great bottleneck in the use of the IO page. <br /><br />
</li>
<li>
64 byte spill-fill stack caches are implemented for the S and U stacks,
similar to those in the 682801, but with a shorter spill point for S, to cover the
larger register set -- 22:34:8.<br /><br /></li><li>In addition to the stack caches, the initial SOC versions of the 681809 have a half-kilobyte of internal RAM for use as fast direct-page RAM.<br /><br />
</li>
<li>
User and system modes are also implemented similarly to the 682801's. Access to the I/O page register can be prohibited in user mode, generating a new memory access violation interrupt.<br /><br />
</li>
<li>
Likewise, address mode function code outputs are provided, indicating addresses relative to
the PC, to the S stack, to the U stack, to the direct page, to the I/O page, to
extended (absolute) addresses. Two codes of eight are reserved.<br /><br />Because the index post-byte provides the ability to specify each of the separations, address space use for the 681809 is much more flexible than for the 682801 when the address space can be determined at compile time.<br /><br />As with the 682801, utilization of the extra address space is expected to require caution. However, since the index post-byte can specify the mode and therefore the space being used, the 681809 should be able to use the extra space more effectively, as long as the space is known at compile time. Better gains in address space are expected:<br /><br /><ul><li>code <= 64K</li><li>general data <= 64K</li><li>direct page <= 256 of 64K (because of the direct page register)</li><li>I/O page <= 256 of 64K (because of the I/O page register)</li><li>parameter stack <= 64K<br /></li><li>return stack <= 64K<br /></li></ul><br />With the 681809, the actual aggregate active memory space per process is expected often to exceed 128K, but not typically to exceed 192K. Again, work with <i>Susumu</i> and Unix bears these expectations out.<br /> </li><li>The initial SOCs provide bank switching somewhat similar to the 682801's. But the mapping registers have four more bits than those on the 682801, allowing maximum physical addresses of 16M. With the larger maximum address space, bank switching is simplified for the two stack spaces and the direct and IO pages.<br /><br />
</li>
<li>Small integer hardware division per the 682801 is also provided.<br /></li>
</ul>
<p></p><p>Motorola also announces the Micro Chroma 681809 as a prototyping kit for the new processor and the 68471, again with <i>Susumu</i> in ROM. <br /></p><p>Improvements in the 68455 over the 6845 are not as easy to
quantify, being mostly a redesign to better fit a video cache used for either
text or graphics, along with improved support for hardware scrolling and generally improved graphics support. Modes which switch between direct pixel output and character ROM mapped output simplify the support circuitry for designs that include being able to switch between graphics mode and text mode screens. </p><p>New support circuits for the 68455, to simplify the digital-to-analog output conversion for color and gray-scale, are announced, along with a companion DRAM refresh/bus arbitration part, the <strike>68831</strike> 68835, capable of directly supporting up to 512 kilobytes of video RAM, and a nearly identical <strike>68832</strike> 68837 that includes bank switching for the CPU side, for processors that can't address large address spaces on their own. </p><p>TI announces the 99S200, a version of the 9900 with the local stack frame implemented as entries in a spill-fill cache instead of in general memory. In addition, the 99S200 adds a separate return address stack with spill-fill caching and improved call overhead, and removes the PC entry in the local stack frame. The 99S200 does retain a 512-byte internal bank of fast RAM. SOC parts and libraries are also announced, initial SOCs including both bank switching and DMA. </p><p>TI also announces the 99V180 video display, compatible with the 9918A, but supporting up to double the 9918A's horizontal and vertical resolution, if the external digital-to-analog circuitry and display device are fast enough. <br /></p><p>And TI announces the TI-99/16 home computer, utilizing the new processor and video generator in a memory configuration less constricted than the 99/4, with both <i>Susumu</i> and an improved 99/4 BASIC compatible BASIC in ROM. A simplified version of the main circuit board for the new home computer is also made available as a prototyping kit, with only <i>Susumu</i> in the ROM.</p><p>Shipments of the 99/16 are expected to begin in plenty of time for Christmas.<br /></p>
<p>A few days later, Motorola pre-announces the 68010 about half a year ahead of
the announcement in our reality, with the improvements seen in our reality:<br />
</p>
<ul style="text-align: left;">
<li>The Popek and Goldberg virtualization features, </li>
<li>
Exception stack frames corrected to allow recovery from bus faults,
</li>
<li>Three instruction loop mode, </li>
<li>Vector base register,</li>
<li>
And improved instruction cycle counts for certain instructions. <br />
</li></ul><p>-- and a little bit more:<br /></p><ul style="text-align: left;">
<li>New addressing modes -- 32-bit constant offsets for indexed modes
and branches.
</li>
<li>New 32-bit integer multiplies, with 32-bit and 64-bit results, and new
32-bit by 32-bit divide and 64-bit by 32-bit divide instructions. </li><li>A new system-mode A6 is provided in addition to the system-mode A7, to better support split-stack run-times.<br /></li><li>Four spill-fill stack caches are provided, one each for system and user mode A7, and one each for system and user mode A6 as a parameter stack. <br /></li></ul>
<p>At the same time as the pre-announcement, samples are made available to the research group, and students dig into it, helping Motorola debug the design in much the way the students helped debug the 99S200. </p><p>A few days later, moving plans up under pressure because of TI's announcement of the 99/16, IBM announces the IBM PC in two models, the Business PC/09 based on the 681809 with bank switching,
and the Family PC/88 based on the 8088, both with video based on the original 6845. Both models start at 16K of RAM expandable on the mainboard to 64K. The 640K address space limit we are familiar with is present on both the PC/88 and the PC/09, but the PC/09 adds a second connector for each card slot, similar to the AT bus that would later be used in our world, that allows full 16M addressing, making the boundary mostly irrelevant. Pricing for both models is comparable: $1500 for the PC/88 model, $1525 for the PC/09 model.</p><p>Neither comes with disk drives, but disk drives are an option. Both have a built-in cassette I/O interface and simple sound generating devices.<br /></p><p>Both models offer integration with existing third
party OS products -- CP/M or MP/M for the 8088 and Uniflex-S/6809 or OS-9S/6809 for the 681809, and both
already have software and hardware products that allow integrating these PCs as
workstations into IBM's mainframe and mid-range systems. </p><p>OS-9S is a split-stack version of OS-9 based on OS-9 level 2 and <i>Susumu</i>, now no longer top-secret. Uniflex-S is based on Uniflex and <i>Susumu</i>. <br /></p><p>Both have <i>Susumu</i> as their BIOS in ROM, and both include (as something of a last minute addition) the students' extended dc/bc languages in the ROMs. </p><p>A third model is also announced, based on the (alternate world) 68010 which Motorola has just announced. It starts at 32K RAM installed on the mainboard, but otherwise has a similar design to the 6809 version, including the expanded bus connector, but with a full 16-bit data bus. Uniflex-S/68000 and OS-9S/68000 are available OSses. Pricing starts a little higher, at $1750.<br /></p><p>Neither Microsoft BASIC nor PC-DOS are mentioned.</p><p>In answer to questions about models based on other processors, IBM specifies a product concept that focuses on the software rather than the hardware. They do not commit to using other CPUs, but they do mention development research on the 99S200 and the newly announced Z8001S. They do not mention their own ROMP.<br /></p><p>A week after IBM's announcement, Radio Shack rushes to announce their own OS-9/6809 compatible model, the TRS-09 Color Business Computer based on, and compatible with, their original Color Computer. It includes a built-in DMA controller, with the Color Computer's Multi-Pak interface, a floppy disk controller, and two hardware serial ports in addition to the single CPU-intensive bit-banging port of the Color Computer also built-in. It boots to Microsoft Disk Color BASIC, but the DOS
command can load an operating system from
a floppy disk or ROM pack. Two cartridge slots for ROM packs are brought out to the side, the other two slot circuits being used internally for the built-in I/O. It comes with 16K of RAM, upgradable to 64K. Price and branding adjustments are also announced, with sales personnel pointing out that the entry level price is only half the price of an IBM PC/09. </p><p>The limit to RAM expansion is swept under the rug, but the magazines dedicated to Radio Shack's computers all point it out.<br /></p><p>The two students who are under NDAs with IBM Instruments go to
Danbury again for short internship sessions, and return before school starts.<br /></p><p></p><p>Efforts to find a source for CRTs are still not very fruitful, but the students have developed some approaches that allow a couple of them to write magazine articles describing circuits using the 68471 for TVs and steps to tune the output to individual TV models, to get legible text at 64 columns, or even 85 columns using 512 pixel wide lines and 6 pixel-wide characters. <br /></p><p>Parents of several of the students meet with the colleges and the other companies that have been supporting the research group, and set up a company to handle commercializing their work. Arguments arise, but a small core group of seven of the students (who have just returned from a little vacation to Japan) work hard to bring everyone to agreement. </p><p>A non-profit research group is set up, and the students take membership in it.</p><p>Additionally, a for-profit development company is set up which can act as agent for the students in accepting contract work.<br /></p><p>Microsoft attempts to sue IBM for getting shut out of the PC product, but IBM legal has their response already prepared. A legal skirmish commences, with public news coverage generally portraying Bill Gates as a modern David against IBM's Goliath.<br /></p>
<p></p>
<p>
Tandy/Radio Shack wakes up and sends lawyers, and then wakes up again and sends engineers, too. After some discussion and belated agreement to follow the licenses and consortium rules, they are allowed to join the sponsors' consortium.<br /></p><p>Western Electric also approaches the group, but does not seek membership in the consortium because of its monopoly status in the communication industry. Top-level negotiations on the licensing of Unix ensue between the <i>Susumu</i> Sponsors Consortium and Western Electric and several other communications industry companies and educational institutions. The core student group participates, along with counselors and legal help from the college and university, representing the students' interests. </p><p>IBM Instruments again sends engineers to observe the students' work, and they spend considerable time discussing both <i>Susumu</i> and Microware's OS-9 with the two NDA students and members of the core student group. <br /></p><p>The core student group members all complain quietly about having to take time away from their own projects to deal with all the side issues.<br /></p><p>As summer break ends, a group of foreign exchange students arrives from Japan, mostly on education/research leave from Japanese companies. This creates a bit of confusion and friction, but the core group members manage to iron things out and the exchange students join the research group.<br /></p><p style="text-align: center;">~~~~~ <br /></p><p></p>How's that for more alternate reality? Doesn't it sound like it might have been even more fun than the reality we know?<br /><p></p>
零石http://www.blogger.com/profile/01111094813708912513noreply@blogger.com0tag:blogger.com,1999:blog-3954576973420765520.post-62172019970530579622021-11-23T18:58:00.041+09:002022-03-21T22:13:55.615+09:00Alternate Reality -- 33209 (pt. 1, winter 1981)<div style="text-align: center;">
***** Alternate Reality Warning *****<br />
</div>
<p></p>
<p>
So, having treated myself to a
<a href="https://defining-computers.blogspot.com/2021/10/alternate-reality-early-microcomputer.html" target="_blank">trip down memory lane, with some daydreaming about Apple and a different
OS-9 and how things could have been</a>, I decided maybe I should map out some of the directions the
computer/information industry would take in
<a href="https://joelrees-novels.blogspot.com/2020/01/33209-2nd-Microcomputer-Revolution-Homecoming-TOC.html" target="_blank">the alternate world of the 33209</a>. <br />
</p>
<p>
As opposed to the previously mentioned daydream, this alternate reality begins
branching from mainline reality in a general sense in early 1981.
</p>
<p>
(In a more specific, personal sense, it branches much earlier. But that branch
runs pretty much parallel with our reality until late in 1980. I won't deal
with that here, read the novel linked above (what there is of it at this time)
if you're interested.)
</p>
<p>
(You're not sure you're interested? Odd. It's not like this is just working
out a time line for the evolution in technology at the core of the plot of
that novel or anything like that. ;-)
</p>
<p><br /></p>
<h4 style="text-align: left;">* First half of 1981:</h4>
<p>
A group of electronics and computer information science students at a
community college in a football/oil field town in west Texas coalesces around
the Micro Chroma 68, Motorola's prototyping board for the 6802/6808, a
6800-compatible early system-on-a-chip (SOC) MPU, the 6846 ROM-I/O-Timer chip, and the 6847 TV grade video
controller, and around building microprocessor trainers using the 6805 (a
microcontroller that is not 6800 compatible, but close, with a little bit of built-in I/O,
timer, ROM, and RAM) and turning the trainers into keyboard controllers. </p><p>Initially, all of the additional circuits they design are hand-wired. <br /></p><p>From the beginning, several of the students help the others keep records of their work and of how they share what they produce. <br /></p>
<p>
Not only their teachers and other faculty members, but representatives of
Motorola, IBM, and TI observe and encourage the students' work. (Why they do,
I won't get into here. See the novel. Admittedly, this is the biggest jump in
logic in the daydream, and big enough to keep it squarely in the realm of
daydreams.)<br />
</p>
<p>
Faculty from the nearby branch of the University of Texas drop in to observe,
as well.<br />
</p>
<p>
One of the students designs a floppy disk controller based on the 6801, and
Motorola negotiates with him for options on the design. As a result, the group
gets access to Motorola engineering support, in the form of both documentation
and parts sampling.
</p>
<p>
The group has the right combination of talents, interests, resources, time,
and connections, and by spring break, they all have working trainer/keyboards
based on the 6805.
</p>
<p>
During spring break, several of the students participate in internship
programs with the three companies. <br />
</p>
<p> By the end of spring break, most of the students have their own working Micro
Chroma 68s, using daughterboards to substitute the 6801, an integrated
microcontroller with a more powerful 6800 compatible CPU, for the 6808 or 6802
originally specified for the board. </p><p>Since most of the students do not yet have floppy disks, and since ROM is faster for reading than floppy disks and displaces less of the usable RAM space, as well, all students include hardware to program EPROMs in their computers. This allows them to program their own ROMs and construct their own monitor/debug systems and their own boot-level input-output software -- what we began to call BIOS (Basic Input/Output System) about this time under influence of CP/M.<br /></p><p>Several simple graphical games get written in the process, for fun, relaxation, and testing the computers.<br />
</p>
<p>
Several of the students get together to rewrite parts of the Micro
Chroma monitor/debug program ROM to use the features of the 6801, and they
call the combination Micro Chroma 6801. The monitor/debugger is shared with
everyone, and all the students also put ROMs containing the freely available
Tiny BASIC in their computers and are able to type in programs and save their
programs to cassette tape and load them back in.
</p>
<p>
Many of the students add bank switching, lots of RAM, and floppy disk drives
to their computers. Fast cassette interfaces are also experimented with,
especially for backing up their work. Some of the students adding floppy disks
use the 6801-based floppy disk controller developed by their fellow student,
others use various commercially available floppy disk controllers, allowing
the group to learn how to separate I/O functions out into device driver modules and
make their software somewhat hardware independent. </p><p>Some of the students add the 6844 DMAC (direct memory access controller) to their Micro Chroma 6801s, to ease timing for the floppy disk and high-speed cassette interfaces and device drivers. <br /></p><p>Several students start experimenting with chemical etching to produce their circuit boards. <br /></p>
<p>
With floppy disk drives, they are able to bring up the
<a href="https://en.wikipedia.org/wiki/FLEX_(operating_system)" target="_blank">Flex operating system</a>
from Technical Systems Consultants, which some of them have bought copies of.
<br />
</p>
<p>
Certain members of the group port the
<a href="http://www.forth.org/" target="_blank">Forth Interest Group</a>'s
freely available fig-Forth language/operating environment to their computers,
allowing even those who don't buy an OS to (among other things) use their
floppy disks without always having to type obscure machine code in at the
keyboard.<br />
</p>
<p>
Motorola and IBM both negotiate with the students for limited rights to use
their design work, IBM asking for confidentiality concerning their being
interested.<br />
</p>
<p>
A small group begins an attempt on a port of Edition 7 Unix (now known as
<a href="https://en.wikipedia.org/wiki/Research_Unix" target="_blank">Research Unix</a>) to their computers. The college makes their Univac 1100 available to them
during off-hours, to run Unix on, to help with their work.
</p>
<p>
At this point, TI also negotiates with the students for limited rights to use
their design work.<br />
</p>
<p>
Engineers at Apple hear about the students' work and start visiting with IBM,
TI, and Motorola when they visit. Members of the petroleum industry operating
near and in the town also start visiting.
</p>
<p>
Some of the students have misgivings about the visits until visiting engineers
share some of their experience and help with the Unix port and other problems
students have trouble solving by themselves. Nevertheless, the port gets a bit
stuck in trying to produce a cross-compiler for the C programming language.
<br />
</p>
<p>
Many of the students begin building computers with other CPUs, sharing ideas
and keeping their designs similar enough to the Motorola CPU-based
designs to make porting the Forth reasonably straightforward.
</p>
<p>
TI provides 9900s and 9900 series peripheral parts, including their video
controller, the 9918A, to several students who want to use them. Motorola
provides 6809s and 68000s and peripheral parts to those who want to use them.
IBM steps up to buy CPUs for students who want to use 8086s or 8088s. Apple
steps up to help those who want to use 6502s. Some of the students want to use
Z-80s or 8085s, and petroleum industry companies step in to help them. At the
suggestion, and with the assistance, of certain engineers from the petroleum industry, a couple of
students decide to try the Z8000, as well.</p><p>Students working with the more advanced CPUs initially try to implement too many of the advanced features, but after a short time resign themselves to just getting them working first. </p><p>Students working on the 68000, in particular, spend a week trying to design asynchronous bus interfaces, along with 16-bit peripheral circuits to take the place of 16-bit parts that aren't available from Motorola. Not making much progress with that, they settle on using the 6800-compatible bus signals, and rely on the special MOVEP instruction, which was designed to make it easy to use the existing 8-bit peripheral parts from the 6800 family. This allows them to get basic functions running enough to start figuring out the advanced functions.</p><p>For most things that don't require large address spaces, the 6809 proves the easiest to work with, and students working with it are successful at getting more advanced operating system and hardware features running, including DMA access using the 6844 for high-speed I/O concurrent with other operations, and basic memory management functions using bank switching or the 6829 series memory management unit (MMU), which Motorola also provides samples of. Simple bank switching proves to allow the computers to operate at higher CPU and memory cycle speeds than the 6829, but with less flexibility in function. </p>
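<p>For readers who haven't met MOVEP, here is a minimal sketch of what the instruction mentioned above does when moving a register out to memory: it stores the bytes of the register, most significant first, at every other address, which is where an 8-bit peripheral's registers land when it is wired to one half of the 68000's 16-bit data bus. The dictionary standing in for the bus and the address 0xE0001 are hypothetical, purely for illustration.</p>

```python
def movep_write(memory, address, value, size=4):
    """Model MOVEP.W (size=2) or MOVEP.L (size=4), register to memory:
    bytes go out most significant first, to every other address."""
    for i in range(size):
        shift = 8 * (size - 1 - i)
        memory[address + 2 * i] = (value >> shift) & 0xFF


def movep_read(memory, address, size=4):
    """Model the memory-to-register direction: gather every other byte."""
    value = 0
    for i in range(size):
        value = (value << 8) | memory.get(address + 2 * i, 0)
    return value


# Hypothetical 8-bit peripheral on the odd byte lane, starting at 0xE0001.
bus = {}
movep_write(bus, 0xE0001, 0x12345678)
for addr in sorted(bus):
    print(hex(addr), hex(bus[addr]))
```

<p>One instruction thus fills four consecutive registers of an 8-bit part without any byte shuffling in software.</p>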
<p>
Many of the students elect to use the 6845 or 9918A to control video in the
computers of their own designs instead of the more limited 6847. Some
initially elect not to include video, using the Micro Chromas as terminals for
their computers, instead. Other options for video output are also explored.<br />
</p>
<p>
Motorola samples an upgrade to the 6805 to the group, along with a cross
assembler that the students can run on their Micro Chroma 6801s, either under
Flex or by loading and calling directly from the monitor/debugger firmware. A
couple of students build small computers with the upgraded 6805 for practice,
pairing it with the 6847 video controller. The limits of the byte-wide index
register X make it difficult to port a full Forth, but a small subset of
library-type functionality is put together.
</p>
One of the students quickly redesigns the floppy controller using the libraries
and the new microcontroller's internal direct memory access controller, and
Motorola buys this design outright, giving the group a bit of money.<br />
<p>
</p><p>About the time that the new floppy controllers come up, the students working with the 68000
get their asynchronous bus interfaces running, speeding up memory
access. But they decide to continue to use 8-bit peripheral parts
and the MOVEP glue instruction rather than design their own peripherals, as much as possible.</p><p>In response to students' requests, Motorola offers to make MMU parts for the 68000 available, with the warning that most customers have not found them useful and have been implementing their own MMUs. They do make documentation available, and after a few days of studying the documentation and errata, the students decide to postpone memory management on the 68000 for the time being. <br />
</p>
<p>One of the students notices that a 512 byte table in ROM can be used to
synthesize a fast byte multiplicative inverse on CPUs that lack hardware divide, and that is added as an
optional part of their libraries, the source code for the table being
generated mechanically on one of the students' Micro Chroma 6801s. The table
can be used even on the 6805 by storing the more and less significant parts of the
result in separate tables.
</p>
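<p>To make the table trick above a little more concrete, here is a sketch of one way it could work. The details are my assumptions, not something the timeline pins down: 256 entries of 16 bits each (512 bytes), each holding the reciprocal scaled by 65536 and rounded up, with divisors 0 and 1 handled as special cases. The high and low bytes are kept in separate 256-byte tables, the way an 8-bit index register like the 6805's would want them.</p>

```python
def build_reciprocal_table():
    """256 entries x 16 bits = 512 bytes: ceil(65536 / n) for n in 2..255.
    Entries 0 and 1 are unused (handled separately in div8)."""
    table = [0] * 256
    for n in range(2, 256):
        table[n] = -(-65536 // n)   # ceiling division
    return table


RECIP = build_reciprocal_table()
RECIP_HI = bytes(r >> 8 for r in RECIP)     # more significant bytes
RECIP_LO = bytes(r & 0xFF for r in RECIP)   # less significant bytes


def div8(a, b):
    """Unsigned 8-bit a // b using a multiply and the tables, no divide."""
    if b == 0:
        raise ZeroDivisionError("division by zero")
    if b == 1:
        return a                    # 65536 would not fit in 16 bits
    recip = (RECIP_HI[b] << 8) | RECIP_LO[b]
    return (a * recip) >> 16        # keep only the integer part


print(div8(200, 7), div8(255, 255), div8(99, 10))
```

<p>On a CPU with a byte multiply, such as the 6801, the product takes a couple of multiplies and adds, which is where the speed would come from.</p>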
<p>
One of the students writes a simple inclusion pre-processor inspired by the C
pre-processor to handle assembly language file inclusion, making it easier to handle library
code. <br />
</p>
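<p>An inclusion pre-processor of the kind described above can be very small. The following is only a sketch of the idea: the <code>INCLUDE "name"</code> directive syntax and the file names are hypothetical, and the sources are read through a caller-supplied function so the example doesn't depend on any particular file system.</p>

```python
import re

# Hypothetical directive syntax: INCLUDE "filename" alone on a line.
INCLUDE = re.compile(r'^\s*INCLUDE\s+"([^"]+)"\s*$', re.IGNORECASE)


def expand(name, read_source, stack=()):
    """Return the lines of source `name` with inclusions expanded in place."""
    if name in stack:
        chain = " -> ".join(stack + (name,))
        raise ValueError("circular inclusion: " + chain)
    out = []
    for line in read_source(name).splitlines():
        match = INCLUDE.match(line)
        if match:
            out.extend(expand(match.group(1), read_source, stack + (name,)))
        else:
            out.append(line)
    return out


# Made-up sources standing in for files on a floppy.
sources = {
    "main.asm": ' INCLUDE "lib.asm"\nSTART LDAA #1\n',
    "lib.asm": "MUL8 RTS\n",
}
print("\n".join(expand("main.asm", sources.__getitem__)))
```

<p>Tracking the chain of files being expanded is what catches a library that accidentally includes itself.</p>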
<p>
A group of the students writes a set of library routines for the Micro Chroma
6801 inspired by the libraries of the programming language C, but using a
split stack architecture inspired by Forth, with one stack to keep return
addresses and the other (a software stack on the 6801) to keep parameters on.
The group then produces a mashup programming language based on fig Forth and
<a href="https://en.wikipedia.org/wiki/C_(programming_language)" target="_blank">the programming language C</a>, which language gets nicknamed first "Split C", then, amid jokes about split
pea soup, renamed "Split P" -- P as in program.
</p>
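<p>The split stack idea can be modeled in a few lines. This is a toy in Python, with class and routine names I made up for illustration, and it says nothing about how the students' 6801 code was actually laid out. It only shows that parameter traffic and return addresses live in separate structures.</p>

```python
# Toy model of a split stack: one stack for return addresses, one for parameters.
# All names here are hypothetical.

class SplitStackMachine:
    """Toy machine with separate return-address and parameter stacks."""

    def __init__(self):
        self.return_stack = []   # return addresses only
        self.param_stack = []    # parameters and results only

    def call(self, routine, return_address):
        """Push the return address, run the routine, then pop the address to resume at."""
        self.return_stack.append(return_address)
        routine(self)
        return self.return_stack.pop()


def add(machine):
    """( a b -- a+b ) in Forth stack notation."""
    b = machine.param_stack.pop()
    a = machine.param_stack.pop()
    machine.param_stack.append(a + b)


def scribble(machine):
    """A misbehaving routine that overwrites every parameter cell it can reach."""
    for index in range(len(machine.param_stack)):
        machine.param_stack[index] = 0xDEAD


machine = SplitStackMachine()
machine.param_stack.extend([2, 3])
resume_at = machine.call(add, 0x1234)
```

<p>Even when a routine like <code>scribble</code> tramples the whole parameter stack, the address popped on return is the one that was pushed on call. In this toy nothing stops a routine from reaching for <code>return_stack</code> directly; the point is only that ordinary parameter accesses never index into it.</p>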
<p>
The language has two grammars, a Forth grammar, which is modified from the fig
Forth, and a C grammar, which is kept compatible with K&R C. Implementing the C macro-preprocessor takes some effort, and eventually they implement it in the Forth grammar.<br /></p><p>Almost as fast as it is implemented on the 6801s, Split P is implemented on the 6809 and 68000, and the three code bases become the core code bases for the language. <br /></p><p>One of the programming teachers tells the class about Ken Thompson's "Reflections on Trusting Trust", and the students design a process for using cross-compiling to bootstrap the compiler for Split P clean of potential library-hidden back doors. <br /></p><p>Using Split P and borrowing more ideas from both Unix and Forth, members of the
group begin designing a new OS they call <i>Susumu</i> (Japanese for "Proceed").
The BIOS for <i>Susumu</i> is designed to include a ROMmable monitor/debugger
which is a subset of the Forth-like grammar of Split P and includes a minimal
interactive assembler and disassembler. </p><p>Full assemblers are also written, to make the operating system and programming language self-hosting. Early versions of the 68000 assembler do not handle the full instruction set, only enough to compile and run the high-level code.<br />
</p>
<p>The disassembler/debuggers, both minimal BIOS and full versions, and the relocating linking loader are ongoing projects that have to be rewritten as Split P evolves. In particular, the ability to relocate variables and labels within the 6801's direct page requires several tries to get right. These are not complete during April (or even May), but they are working well enough to fork (spin off) versions to use in the project to port Unix. <br /></p>
<p></p>Some of the students write about their work and submit to certain electronics
and computing magazines, and some of the articles are accepted, to be published
beginning in June. <br />
<p>
In late April, Motorola acts on the option to buy the right to use the 6801
floppy controller design, giving the group more money.
</p>
<p>
Motorola then publicly announces the upgraded 6805 system-on-a-chip (SOC)
microcontroller, with immediate sampling:
</p>
<ul style="text-align: left;">
<li>
The 682805, using a pre-byte to expand the op-code map as necessary, gets a
second parameter stack pointer (named U after the 6809 second stack pointer)
and push/pop and transfer pointer instructions (PSH/PUL/TFR) to support it.
It also gets push/pop/transfer instructions for the return stack, to reduce
the bottleneck that a single stack tends to create. Careful design allows
using the pre-byte to add U-stack indexed-mode instructions to access
local variables on the parameter stack without significant increase in
transistor count, as well. <br />
</li>
<li>
The return address stack and its RAM are moved out of the direct page, with
access optimized so that simple calls cost no more than jumps or branches.
The parameter stack and its RAM replace the return stack in the direct
page.<br />
<br />
</li>
<li>
A two-channel chainable direct memory access controller (DMAC) with 8-bit
counters is added to the list of optional integrated I/O devices.
(Chainability is what makes the microcontroller especially suited to the floppy
controller.) (It's worth noting here that, in our real world, Motorola didn't introduce DMA channels in their published SOC microcontrollers until much later.)<br />
</li>
<li>
Optional instructions are also added, including the 8 by 8 to 16-bit multiply from
the 6801 and 6809. Both of these use the X index register to store the high
byte of the multiply, similar to later versions of the 6805. <br /><br />
</li>
<li>
The MEK 682805 and Micro Chroma 682805 are announced as prototyping kits,
crediting the students' work.<br />
</li>
</ul>
<p>
What is not announced is that monitoring the students' experiences is a large
part of what has actually motivated the changes.
</p>
<ul style="text-align: left;">
<li>
Motorola engineers have also polished the floppy controller designs and
implemented them as integrated, almost single-chip controllers, and Motorola publishes
the designs, both as products and as reference design application notes on
constructing custom SOC designs using Motorola's processors, again acknowledging
the group's contributions. <br />
</li>
</ul>
At about this time, Motorola also samples upgrades to the 6801 and 6847 to the
group, and several of the students elect to build prototyping kits similar to
the un-expanded Micro Chroma kits using these parts, for fun and practice. They work
quickly, producing operational prototypes in just a few days.
<p>
Comparing the 6845 and the upgraded 6847, a few of the other students decide 64 characters wide is good enough, and rework their video designs to use the upgraded 6847 instead of the
6845.</p><p>Most of the students are now building both chemically etched and hand-wired circuit boards.
</p>
<p>As students bring up computers of their own design on CPUs other than the core three, they begin to port both Split P and
the nascent
<i>Susumu</i> to their computers, keeping a high degree of compatibility
between their systems in spite of the different CPUs. In order to make the ports work, techniques for specifying byte-order independence in source code are explored and implemented.<br /></p>Several of the students work together to write cross-assemblers for the non-core CPUs in Split P, which allows much of the development to proceed
on working hardware instead of the hardware prototypes. Coding significant
parts of the OS and interpreter/compiler in Split P instead of assembler also
helps speed development.
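<p>One common technique for byte-order independence, sketched in Python: build and take apart 16-bit values with shifts and masks, one byte at a time, so the source never depends on how the host CPU lays out a word in memory. The function names and the <code>big_endian</code> flag are my own for illustration; the story doesn't say which techniques the students settled on.</p>

```python
# Sketch only: byte-order-explicit 16-bit access. Names and the flag are hypothetical.

def store16(memory, address, value, big_endian):
    """Store a 16-bit value as two bytes in the requested order."""
    high = (value >> 8) & 0xFF
    low = value & 0xFF
    if big_endian:
        memory[address], memory[address + 1] = high, low
    else:
        memory[address], memory[address + 1] = low, high


def fetch16(memory, address, big_endian):
    """Fetch a 16-bit value from two bytes stored in the requested order."""
    first, second = memory[address], memory[address + 1]
    if big_endian:
        return (first << 8) | second
    return (second << 8) | first


memory = bytearray(4)
store16(memory, 0, 0x1234, big_endian=True)    # 6801/6809/68000 order
store16(memory, 2, 0x1234, big_endian=False)   # 6502/Z-80/8086 order
```

<p>The same source then reads back 0x1234 from either layout, as long as the order is named at the point of access.</p>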
<p>
Modifications are made to the Forth grammar of Split P, to include explicit
named statically allocated local variables and global variables controlled by semaphores or
counting monitors. </p><p>Modifications are made to the C grammar to include
structured return types for functions (using the split stack), global variables controlled by
semaphores or counting monitors, and task-local static allocation.<br />
</p>
<p>
Several students working on computers based on the 6809 purchase TSC's Uniflex or Microware's OS-9, as well. TSC and Microware both provide
help to the students to bring their OSes up, both companies becoming interested in Split P and
<i>Susumu</i> in the process. </p><p>Two of the students, frustrated with the restrictions against mixing code and string data in Microware's native assembler, write their own assembler to allow mixing string and other constants in with code without needing fix-up tables and code. This allows them to bring up Forth for OS-9 as a relocatable module.<br /></p>
<p>
Apple inquires about obtaining formal rights to produce products based on the
work the group is doing. Motorola, IBM, TI, Microware, TSC, and the petroleum
industry companies also express a need for more formal arrangements, and, with help from the college, the
group formally organizes as a research group under the college umbrella, in
order to simplify legal issues. Legally organized, the group itself becomes able to
receive small grants from each of the interested companies, and individual
students are more easily able to contract to do projects for various
companies.<br />
</p>
<p>The organizational structure within the group is kept flat, and participation voluntary. With
legal help from the college, they agree on a liberal license for
collaboration in their work, allowing them to continue collaborating with each
other and also with interested people not formally in their group.
</p>
<p>
More students express interest in joining the group, both from within the
college and from the nearby branch of the University of Texas. After some
discussion, and after the new students agree
to accept the license and respect requests for confidentiality from the corporate sponsors, they are
cautiously welcomed in.
</p>
<p>
As spring semester ends, some students head out for summer jobs and summer
vacation, others stay for the summer terms and to keep the group moving forward. Certain of the students begin internships with one or more of the companies that are taking something of a sponsorship role by this point, working some days with engineers at the corporate campuses and some days back at the college lab. A
newsletter is begun, to keep everyone up to date.
</p>
<p>
More articles are accepted by electronics and computer magazines, including a
few articles that are contracted by Motorola and TI to publicize application notes, those for
the floppy controller and the 682805 in particular.</p><p>At Apple, during May, Jef Raskin prevails on management to produce an OS-9/6809-based
business computer line, including workstations and network
computers. </p><p>Nothing really exciting happens within the student group during early June, only steady progress on various projects. Well, nothing exciting except that TI also brings samples of an upgraded variant of the 9900 CPU and of the 9918A video generator for the students, and
several agree to try them out. These take a bit more work, with students helping
to debug TI's new design work. </p><p>The appearance of the magazine articles on the students' work creates some excitement outside the group, but the group is so far ahead of what is published that, other than the excitement of seeing their work in print, the articles are a bit anticlimactic for the students.</p><p>Engineers at Radio Shack are among those outside the group who take note of the articles, in particular showing their management an article detailing the Micro Chroma revisions for the 6801 and 6809 and another article showing the 6809 interfaced to the 6844 DMAC.<br /></p><p>The students working on <i>Susumu</i> get Split P code generation on the 6801
working well enough to start using the C grammar directly as the system native compiler for the Unix
port project, and parts of Unix are brought up on the Micro Chroma 6801s with bank
switching. There isn't enough room in 64K for Unix without bank switching, but it does run with bank switching.
It's also relatively slow on the 6801, but it works well enough to compile and run some system
tools.
</p>
<p>
The ports of <i>Susumu</i> to the 6809 and 68000 are likewise brought up to a usable
level by mid-June, and ports to the 8086, Z8000, 9900, Z-80, and 6502 are partially
running, but buggy.</p><p>With work on 68000-based computers proceeding well, students working on those are invited to a meeting with engineers at IBM Instruments under non-disclosure. After some discussion, the student group agrees to let the invited students make up their own minds. </p><p>Two of the invited students go, and return in a week with a couple of IBM engineers who observe the student group's activities without comment for several days and then return to Danbury. Everyone in the group studiously avoids tempting the two students to break their NDAs, which engenders some small tension. But the tension gradually evaporates naturally.<br /></p>
<p>
In mid-June, just before the start of summer, Motorola publicly announces sampling
of the upgrade to the 6801 SOC microcontroller:
</p>
<ul style="text-align: left;">
<li>
The 682801 gets direct-page versions of the 6801's unary instructions,
allowing more effective use of the direct page. (The 6800 and 6801 did not
have them, I assume because the design engineers were concerned that there
would not be enough opcodes for required inherent address op-codes. They were present in the 6809, however.) The
op-codes for the new direct-page unary modes are carefully allocated so that
they fit in the primary op-code map and don't cost a pre-byte fetch to use,
but are scattered across the available un-implemented op-codes instead of lined up with the
rest of the unary instructions. This costs more in circuitry, but maintains
object-code compatibility with the 6801 and 6800.<br />
</li>
<li>
The 682801 also uses a pre-byte to expand the op-code map so that it can
have a second U stack pointer for parameters and for index mode op-codes that
allow access to the parameter stack without going through the X index
register. <br /><br />In addition to the 6801's ABX (add B to X)
instruction, new instructions SBX (subtract B from X) and the corollary ABU and SBU are provided for improved
handling of address math and stack allocation/deallocation. And the compare D (CPD) instruction which is
missing on the 6801 is added to the 682801.<br /><br />
</li>
<li>
The 682801 gets a 64 byte spill-fill cache for the return stack, with
18:38:8 hysteresis to provide worst-case overhead space for interrupt
frames, which is large enough for nineteen levels of frameless calls and nine
levels of framed calls without going to the external bus to save the PC or
the frame pointer. This allows simple calls to statistically cost no more memory
cycles than jumps and branches. A similar 64 byte spill-fill cache for the
parameter stack is provided, as well, but with 8:48:8 hysteresis, improving
access cycle times to local parameters. These caches can be locked in place,
for applications that don't need deep stacks.<br />
<br />
</li>
<li>
Separate system and user mode are implemented, with a new status bit in a
<strike> new</strike> status register in direct page I/O space. <strike>In order to make context switches faster for
interrupts, there is a return stack cache and parameter stack cache for both
system mode and user mode. <br /><br />In order to avoid complex delay
circuitry for the caches under interrupt and return, the S and U registers
also have system and user mode versions.</strike> Control registers for the caches
are in the direct page I/O space.<br /><br />
</li>
<li>
Functional equivalent modules for many of the 6800-series peripheral parts are announced as optional
integrated I/O and timer functions, including the 6844 DMAC. Integrated
devices will be shared with the 6805 series MCUs, too.<br /><br />
</li>
<li>
Address function signals are provided, to allow a system to distinguish
between program/interrupt response, return address stack, direct page, and
extended address/general data access for both user and system modes. This allows separate
buses for each function, reducing memory conflicts. <br /><br />This also
allows expanding the address space to an effective total active address
space larger than 64K. <br /><br />(Theoretically: <br /><br />
<ul>
<li>code (f3) <= 64K </li>
<li>+ general data (f2) <= 64K </li>
<li>+ direct page (f0) <= 256 of 64K</li>
<li>+ return address stack (f1) <= 64K;</li>
</ul>
<br />Because of the 16-bit width of the index register X, expected actual aggregate active process space will generally be less than 128K. Even so, the extra address space clears a number of design bottlenecks.) <br /><br />Because the index register X can only point into the general data area, using separation requires care
in software when using addresses and pointers, and when assigning address range
deselection in the hardware.
<br /><br />But with the return address stack in a separate address space based on the
function codes, rogue code can't overwrite the return address, period. Also,
the spills and fills associated with calls and returns can occur concurrently
with other instructions. <br /><br />
</li>
<li>
Hardware 8 bit by 8 bit unsigned divide -- A is dividend, B is divisor,
result B is quotient, and A is remainder, to make it easier to repeat the
division and get the binary fraction.<br /><br />
</li>
<li>
Bank switching memory management is also added to the list of
optional integrated devices, to expand the physical address space and allow
write and address mode protection and address function separation. <br /><br />In the bank-switch module provided by one initial part, 16 8-bit-wide bank switches provide linear mapping of 4-kilobyte banks of a single linear 64K address space into a single 1 megabyte maximum physical address space. The top 4 bits (LA12-LA15) of the logical (CPU) address select the bank switch, and the bank switch provides the top 8 bits of physical address (PA12-PA19), instead. This part provides only 128 bytes of internal direct-page RAM.<br /><br />In the bank-switch module provided by another initial part, address function codes are appended to the top of the CPU address, giving 18 total logical address lines (LA0-LA15, FA16-FA17) for a 256K logical address space. <br /><br /><ul><li>Data and code (f2 and f3, FA17:FA16=10 and 11) are each given their
own sets of 16 8-bit-wide bank switches (providing PA12-PA19) to map 4K
banks from the 64K space addressed in data mode (extended or indexed) or code mode (program counter or interrupt) into the full 1M max physical
address space.<br /> </li><li><strike>Data (f2, LA17:LA16=10) is given its
own set of 16 10-bit-wide bank switches (providing PA12-PA21) to map 4K
banks from the 64K address space pointed to by the index register or extended mode absolute addresses into the full 4M max physical
address space. <br /><br /></strike></li><li><strike>Code (f3, LA17:LA16=11) is given its
own set of 16 9-bit-wide bank switches (providing PA12-PA20, with PA21=1) to map 4K
banks of the address space pointed to by the program counter or selected by an interrupt response into the top 2M of physical
address space. </strike><br /> </li><li>Stack references (f1, FA17:FA16=01) are hard-mapped to the second 64K range ($010000 - $01FFFF), and four sets of 16-bit bounds register latches and a stack bounds violation interrupt are provided for stack security. The first version of this part provides 256 bytes of internal stack use RAM at $10000-$100FF.<br /> <br /></li><li><strike>Stack (f1, LA17:LA16=01) is given its own set of 16 12-bit wide bank switches (providing PA8-PA19, with PA21:20=00). Shifting the switch addresses down allows stack to be allocated in 256 byte chunks, making more efficient use of stack RAM. The extra width allows allocating illegal address ranges around the stack, to improve system security in case of stack overflow, underflow, or corruption. (Too late at night.)<br /></strike><br /></li><li>Direct page references (f0, FA17:FA16=00,LA15:LA8=00000000) are split into upper and lower halves, and mapped into the lowest 32K range ($000000 - $007FFF), using the high bit of the direct page address (LA7/DP7) and a system/user state bit to select one of four 8-bit switches to provide physical address PA7 - PA14. In this part, 256
bytes of internal direct page RAM are provided at $000400 - $0004FF. Internal I/O addresses, including the bank switch registers, are provided from $000000 - $0000FF.<br /><br /></li><li><strike>The upper half of the direct page is mapped through a simple 8-bit latch with constant higher bits that are appended above the lower 7 bits of the direct-page logical address produced by the CPU, to yield a physical address in the range $8000 - $FFFF. In this part, only 512 bytes of internal RAM are provided.<br /><br /></strike></li><li><strike>The lower half of the direct page is similarly mapped, to yield a physical address in the range $0000 - $7FFF. The bank switching registers are mapped into this range, in particular, the direct page latches and related control bits start at address zero, and interrupts </strike><br /><br /></li></ul><strike>Initial SOC parts provide either simple general switching out of a single larger address space or function-based switching out of function-separated address spaces. of 16 by 4K banks out of a single 1 Megabyte max address space, or function-based switching of 16 by 4K banks each of code and data out of 1M max plus 16 1/4K banks of return stack out of 64K max with a constant offset into the CPU address space <br /><br />Special bank switching for the direct page is another optional integrated peripheral, to allow switching half-pages from up to 32K of physical address space into the lower half of the direct page and 32K into the upper half (theoretical max, depending on the width of the implemented bank switch register), in 128 byte chunks. <br /><br />Motorola suggests the convention that I/O should be switched in the lower range from $00 to $7F, and (preferably internal) RAM in the range from $80 to $FF, and the initial SOC parts follow this convention., providing 512 bytes of actual internal direct page RAM. 
Initial parts only implement 4 bits of direct-page bank switch, physically limiting total addressing for each half to 2K.</strike> <br /><br />With this kind of mapping, keeping task global variables in the direct page can speed task switches.<br /><br /><strike>By keeping task global variables in the $80 to $FF range, fast task switching can be supported.</strike><br /><br />Full 6829-style
MMU functionality is mentioned as under consideration, but not yet committed
to.<br />
</li>
<li>
In addition to the 40 pin DIP package, 48 pin and 64 pin DIP packages <strike>will</strike>
provide access to extra address and port signals.<br />
</li>
</ul>
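<p>The simpler of the two bank-switch modules in the list above (16 bank switches, each 8 bits wide, mapping 4K banks of the 64K logical space into 1 megabyte) comes down to a table lookup and some bit shuffling. Here is a sketch in Python. The function and constant names are mine, and the identity mapping used as a starting point is just an assumption for the example, not a stated reset state.</p>

```python
# Sketch only: logical-to-physical translation through 16 8-bit bank switches.
# Names and the initial identity mapping are assumptions for illustration.

BANK_SHIFT = 12        # 4K banks: the low 12 bits pass through unchanged
OFFSET_MASK = 0x0FFF


def translate(logical_address, bank_switches):
    """Map a 16-bit logical address to a 20-bit physical address."""
    if not 0 <= logical_address <= 0xFFFF:
        raise ValueError("logical address must fit in 16 bits")
    bank = logical_address >> BANK_SHIFT              # LA12-LA15 select the switch
    page = bank_switches[bank] & 0xFF                 # the switch supplies PA12-PA19
    return (page << BANK_SHIFT) | (logical_address & OFFSET_MASK)


switches = list(range(16))   # identity mapping to start with
switches[0xF] = 0xFF         # map the top logical bank to the top of the 1M space
switches[0x2] = 0x80         # map logical $2000-$2FFF to physical $80000-$80FFF
```

<p>With those settings, logical $F123 lands at physical $FF123 and logical $2ABC at $80ABC, while an untouched bank such as $1234 stays at $01234.</p>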
<p>They also announce the 68471, with the following extensions to the 6847:</p>
<ul style="text-align: left;">
<li>
In character mode, 32 or 64 characters per row, 16 or 32 rows, interlaced or
non-interlaced. Internal character ROM now includes 128 characters. The
default character set includes lower case, full ASCII punctuation, and
common useful symbols in the control code range. External character ROM is
supported.<br />
</li>
<li>
In graphics mode, 512 pixels per row and 384 rows per field modes are added to the 6847's lower resolution modes,
interlaced or non-interlaced. One, two, or four bits per pixel are supported, allowing 2, 4, or 16 colors, if the output digital-to-analog converters and amplifiers are
fast enough. The 24 kilobyte maximum video buffer gives 2 colors at the highest density, 4 at 256 by 192, and 16 at 128 by 64. <br /><br />8-bit programmable
palette registers are provided, intended as 2 bits each of red, green, blue, and intensity.<br /><br />
</li>
<li>
6883/74LS783 DRAM sequential address multiplexer (SAM) and refresh functions
are built-in. Bank switching into an up-to 16-kilobyte CPU-side window is also built in, to support the large video
buffers for the higher density graphics modes on 16-bit addressing CPUs.<br />
</li>
</ul>
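<p>The buffer sizes for the graphics modes listed above are simple arithmetic: width times height times bits per pixel, divided by eight. A quick sketch in Python, with a function name and mode table of my own, covering only the three combinations named in the list:</p>

```python
# Sketch only: video buffer sizes for the three mode combinations named in the list.

def buffer_bytes(width, height, bits_per_pixel):
    """Bytes needed for one frame at the given resolution and depth."""
    return width * height * bits_per_pixel // 8


MODES = {
    "512 x 384, 2 colors": buffer_bytes(512, 384, 1),
    "256 x 192, 4 colors": buffer_bytes(256, 192, 2),
    "128 x 64, 16 colors": buffer_bytes(128, 64, 4),
}
MAX_BUFFER = 24 * 1024
```

<p>The highest density mode at one bit per pixel fills the 24 kilobyte maximum exactly; the other two combinations come to 12K and 4K, so they fit with room to spare.</p>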
<p></p>
<p></p><p>And Motorola announces the Micro Chroma 682801 as a prototyping kit for the new CPU and video generator, with <strike><i>Susumu</i></strike> the 2 kilobyte monitor in internal ROM implemented as <i>Susumu</i> library functions. <br /></p><p>Specifications for upgrades to the 6809 and 68000 CPUs and 6845 video controller
are also shown to the students, and more than a few express interest in seeing
what they can do with them.</p><p>The students start searching in earnest for a source for CRTs that can function at sufficient resolution to handle 80-columns or more of text output that the 6845 and the improved 6845 can produce, but don't have real success. A few students are able to get a CRT here or there, but the interfaces are varied, and the video generator circuits the students produce are also varied.</p><p>As a result, most of the students remain dependent on TVs for video, and Motorola supplies them with enough 68471s that they can all get 64 column text output circuits for their more advanced computers. Unfortunately, results are rather uneven and require careful tuning to individual TVs, such that, even with the 68471s, the circuits they produce do not approach mass-marketable products.<br /></p><p><i>Susumu</i> and Split P code generation for the 682801, 6809, and 68000 are also functioning well enough as summer begins that the ports of Unix are re-booted using Split P as the system language and <i>Susumu</i> as the system programming language. <br /></p><p>And this is getting long. <br /></p><p style="text-align: center;">~~~~~ <br /></p><p>How's that for alternate reality?</p><p>And the <a href="https://defining-computers.blogspot.com/2021/12/alternate-reality-33209-pt-2-summer-1981.html" target="_blank">outline for summer</a> is ready now, <a href="https://defining-computers.blogspot.com/2021/12/alternate-reality-33209-pt-2-summer-1981.html" target="_blank">here</a>. <br /></p>
<p></p>
零石http://www.blogger.com/profile/01111094813708912513noreply@blogger.com0tag:blogger.com,1999:blog-3954576973420765520.post-76777111905663640882021-11-06T01:52:00.010+09:002022-07-03T09:39:07.965+09:00Connection between Motorola's 6800 and the DDP-116?<p>I think I may have found the computer that was the primary influence in the design of the 6800. This is going to require more research, but the Wikipedia page on <a href="https://en.wikipedia.org/wiki/Honeywell_316" target="_blank">Honeywell's H-316</a> describes the processor, but doesn't give any more information on how much of the design was inherited from the original DDP-116 that Honeywell got from Computer Control Company: </p><p>Accumulators A and B? 16-bit, but check.</p><p>Index? Check. <br /><br />I'm thinking it looks closer than either the <a href="https://en.wikipedia.org/wiki/PDP-8" target="_blank">PDP-8</a> or <a href="https://en.wikipedia.org/wiki/PDP-11_architecture" target="_blank">PDP-11</a>, both of which are usually cited as influences, but have much larger register sets of general purpose registers rather than the 6800's limited pair of small accumulators, single index register, stack pointer, and program counter (instruction pointer).</p><p>The 68000 (one more zero!) clearly shows influences from the PDP-11, and the 6809 borrows indexing modes from the PDP-11, but the 6800 does not look like a PDP in any way that is not attributable to both implementing Turing machines in register-memory architecture. <br /></p><p>[JMR20220703: </p><p>I've been looking around again, and I think the Data General Nova is another CPU that the 6800 might have been fairly directly influenced by -- </p><ul style="text-align: left;"><li>two accumulators (again, 16-bit), <br /></li><li>and, okay, two, not one, index registers. 
</li><li>The 6800's constant byte offset could be inspired by the Nova's constant offset indexed addressing mode, </li><li>and the 6800's direct page is similar to the Nova's page zero.</li></ul><p>Two index registers -- the 68HC11 descendant of the 6800 finally got a second index register, well more than a decade after the introduction of the 6800. <br /></p><p>It is fairly clear that the Nova was part of the inspiration for the original ARM architecture, I think, and I know several people who agree with me about that.<br /></p><p>]<br /></p><p>This needs more research, however. I'll try to update this stub when I have better information.<br /></p>零石http://www.blogger.com/profile/01111094813708912513noreply@blogger.com0tag:blogger.com,1999:blog-3954576973420765520.post-82143932642211719872021-10-30T01:39:00.022+09:002021-11-03T21:44:52.638+09:00Alternate Reality -- Early Microcomputer Operating Systems, OS-9, and Apple<p>
This comes from a response I posted to a thread in the
<a href="https://www.facebook.com/groups/macenthusiasts/" target="_blank">Vintage Apple Macintosh Enthusiasts</a>
group on Basshook <a href="https://www.facebook.com/groups/macenthusiasts/posts/4440009266064672/" target="_blank">about Michael Dell's bragging</a>, where the thread took a what-if turn about Rhapsody, Yellow Box, and
such. The OP hypothesized how the computer industry could have changed
if Dell had accepted the opportunity to preinstall Yellow Box (essentially,
Steve Jobs's NeXT, but on top of MSWindows, if I understand/remember
correctly) on the computers he sold.<br />
</p>
<p>
I suggested, for a what-if hypothesis-contrary-to-fact session of daydreaming,
going back more than a decade from there to Microware's OS-9, a modular
real-time operating system available for Motorola's
<a href="https://en.wikipedia.org/wiki/Motorola_6809" target="_blank">6809 CPU</a>
from 1979/80, and for Motorola's
<a href="https://en.wikipedia.org/wiki/Motorola_68000" target="_blank">68000</a>
from 1983, which included graphics and GUI component libraries.
</p>
<p>He said that was before his time.</p>
<p>
So I offered him a gratis (heh), technologically dense daydream of my own:<br />
</p>
<p style="text-align: center;">----- My Response -----</p>
<p>
Worth checking out. OS-9 was a Unix-like, modular multitasking real-time OS
available in 1979. (Wikipedia:
<a href="https://en.wikipedia.org/wiki/OS-9">https://en.wikipedia.org/wiki/OS-9</a>.)
</p>
<p>
The open source project has a FB group,
<a href="https://www.facebook.com/groups/1929079184021683/" target="_blank">Tandy Color Computer OS-9 / NITROS-9</a>, and you can download NITROS-9 and check it out using
<a href="https://en.wikipedia.org/wiki/MAME" target="_blank">MAME</a> or the
<a href="https://www.6809.org.uk/xroar/" target="_blank">XRoar</a> emulator.
</p>
<p>
Kind of surprising what could be done on what was essentially a
<a href="https://en.wikipedia.org/wiki/RadioShack" target="_blank">Radio Shack</a>
<a href="https://en.wikipedia.org/wiki/TRS-80_Color_Computer" target="_blank">toy computer</a>
in 1987.
</p>
<p>
In the late 1970s and early '80s, the
<a href="https://en.wikipedia.org/wiki/Motorola_6809" target="_blank">6809</a>
was viewed by many in the Apple community as the natural evolution from the
6502.
</p>
<p>
There were a few hanging points -- mainly the need to develop an interpreter
front-end to Microware's compiled
<a href="https://en.wikipedia.org/wiki/BASIC09" target="_blank">Basic09</a>,
byte order opposite the 6502, incomplete support of pointer variables in the
byte-addressed direct page, the need for a simpler, faster MMU than Motorola's
over-designed separate 6829 MMU, the penalty that OS-9 placed on
non-position-independent programming techniques, and Motorola's lack of plans
for evolving the 6809 to a fully 16-bit CPU.
</p>
<p>
Also, Apple would have had to plan on developing it as a product line to be
aggressively evolved.
</p>
<p>
But they could have delivered a multi-tasking network-server capable business
machine with high-level language support in 1980 instead of the Apple III,
with minimal engineering effort. This could have been Raskin's text-based
appliance computer. They probably would have wanted to introduce 80-column
Apple IIs with bundled terminal emulators to go with it.
</p>
<p>
A model with built-in hard disk support, bank-switching for a 2 MB memory map,
and a display buffer could have been produced in 1981. They could have
released their own GUI or an improved Multivue in 1983, and it wouldn't have
required 80+ hour workweeks for the engineers.
</p>
<p>
Experience with OS-9/6809 would have allowed them to put OS-9/68000 underneath
the Lisa, and bring that to market at least a half-year earlier, without MMU,
but cheaper.
</p>
<p>
With OS-9/68000 underneath the Lisa, the Macintosh would not have required a
separate technological base. Or it could have had different hardware with
little engineering penalty, and still been part of the same line as Lisa. The
toolbox would simply have been a collection of OS-managed library modules in
ROM.
</p>
<p>
Lack of a true OS underneath the Macintosh would never have been a problem,
and neither would *nix compatible networking. The Lisa/Macintosh would have
been Internet ready pretty much as it was.
</p>
<p>How's that for alternate reality?</p>
<p style="text-align: center;">----- End Response -----<br /></p>
<p> </p>
<p>
Well, some of what I put in that included false memories about who did what
when, but it wasn't that far off. <br />
</p>
<p>
But it was dense. So I'll use unpacking it as an excuse to walk through the
real-world background of that dense daydream:
</p>
<p><br /></p>
<p style="text-align: center;">===== Real Reality ===== <br /></p>
<p>
Two decades before Mac OS 9 -- a decade and a half before the failure of
Copland and the transition to Rhapsody, and a decade before Pink -- a small
company called Microware was contracted by Motorola to develop a structured,
i-code compiled version of BASIC for the 8-bit M6809 CPU (<a href="https://defining-computers.blogspot.com/2021/08/differences-between-6800-and-6801-with-notes-68hc11-6809.html" target="_blank">not the 6800, 6801, or 68HC11</a>). For an operating system to put underneath
<a href="https://en.wikipedia.org/wiki/BASIC09" target="_blank">Basic09</a>,
Microware developed the operating system
<a href="https://en.wikipedia.org/wiki/OS-9" target="_blank">OS-9</a>
(OS-9/6809).
</p>
(Microware's lawyers and Apple's lawyers met around the introduction of Mac OS
9, in 1999, and were able to come to an agreement about the name fairly
quickly. At the time I write this, Wikipedia's page on OS-9 has
<a href="https://en.wikipedia.org/wiki/OS-9#Name_conflicts_and_court_decisions" target="_blank">minimal mention</a>, but more information is available elsewhere.) <br />
<p>
OS-9 was (<a href="https://www.microware.com/" target="_blank">and still is</a>) a modular, multitasking, real-time Unix-like OS, and it was first available
for computers based on the 6809 in 1979. (OS-9/68000 and OS-9000 and later
versions were POSIX compliant.)<br />
</p>
<p>
It was also a showcase for certain aspects of the 6809 that regularly still
get overlooked by the noisier elements of the industry.
</p>
<p>
A lot of people seemed to notice only one or two features of the 6809, such as
the hardware multiply instruction or the second index register.
</p>
<p>
But the 6809 is much more than that. The architecture of the CPU supports a
programming style that is modular, understandable, and, well, engineerable.
OS-9 demonstrated that the architecture of the 6809 allowed using concise code
to solve truly difficult problems like guaranteed response times and
functional verification with generalized hardware and programming tools for a
wide range of 8-bit applications.
</p>
<p>
(The 68000 also supports such a programming style for 16 and 32 bit
applications, but uses roughly twice the number of registers in doing so -- in
a much more flexible way.) <br />
</p>
<p>
Engineers who understood the architecture made very good use of it in
telephony, sound synthesis, various kinds of real-time controllers, and even
business systems.
</p>
<p>
But there were a lot of famous pundits who could not see the forest for the
trees and spread a lot of useless information about the limitations of
isolated features of the 6809.<br />
</p>
<p>
As just one rather visible example of what the architecture enabled, with
Basic09 and OS-9 you had an integrated development environment pretty much as
powerful as, say, the combination of Turbo Pascal under CP/M and MS-DOS would
be two years later. Admittedly, the programming editor provided with the OS was
originally a line editor instead of a full screen editor, but third party
screen editors came pretty quickly.
</p>
One big difference from Turbo Pascal under CP/M or MS-DOS was that OS-9 was
multi-tasking from the outset. Send a long document to the printer and it
would print in the background while you worked on accounts receivable, your
neighbor in the next cubicle worked on a program she was developing using a
terminal on her desk, and a secretary proofread and prepared a letter on a
terminal on his desk, all at the same time on the same machine running
OS-9/6809.
<p>
(For the record, Digital Research produced the multi-tasking MP/M about the same time as
OS-9 came out. MP/M was more-or-less about desktop-level multitasking, where
OS-9 was more general, being able to multi-task on desktop applications as
well as controller applications. Also, for the record, MS-DOS was not
multi-tasking, and had to rely on the individual applications themselves, or
on terminate-and-stay-resident (TSR) utilities, to manage print spooling and
calendars and such.)<br />
</p>
<p>
The 64K RAM limit in OS-9 Level 1 in 1979 did constrain things, just
like with any of the 8-bit ALU/16-bit addressing microcomputers of the time.
These limits could be largely overcome with bank switching in OS-9 Level 2,
when it became available in 1980.<br />
</p>
Microware also made a Pascal compiler available soon after the release of OS-9
and Basic09. The 6809 instruction set and register set are well adapted to
high-level languages. A C compiler was not immediately available, but came out
about the time OS-9 Level 2 was released, if I recall correctly.<br />
<p>
OS-9 Level 2 supported bank switching, to open up the address space to 2 MB in
1980. You could have six users or more simultaneously doing pretty memory
intensive stuff on a single machine running Level 2. (And everybody spooling
print jobs, of course.) That meant that single users were also able to make
much more efficient use of the processor's extra cycles, such as compiling and
running utilities and test programs in the background while editing in the
foreground.<br />
</p>
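<p>
For readers who have not worked with bank switching, here is a rough sketch of
the arithmetic. The 8 KB block size and the 8-bit block registers are
assumptions I picked because they multiply out to the 2 MB figure; they are
not a description of any particular OS-9 Level 2 machine, and the names
<code>translate</code>, <code>task_a</code>, and <code>task_b</code> are
made up for the illustration.
</p>

```python
# Hypothetical bank-switching model: the 64 KB logical space is split into
# eight 8 KB blocks, and each block register holds an 8-bit physical block
# number. 256 physical blocks of 8 KB give a 2 MB physical map.
BLOCK_SIZE = 8 * 1024      # assumed block size
PHYSICAL_BLOCKS = 256      # assumed 8-bit block registers


def translate(block_registers, logical_address):
    """Map a 16-bit logical address to a physical address."""
    if logical_address not in range(0x10000):
        raise ValueError("logical address must fit in 16 bits")
    slot, offset = divmod(logical_address, BLOCK_SIZE)
    physical_block = block_registers[slot]
    if physical_block not in range(PHYSICAL_BLOCKS):
        raise ValueError("block number must fit in 8 bits")
    return physical_block * BLOCK_SIZE + offset


# Two tasks see different physical memory at the same logical address.
task_a = [0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07]
task_b = [0x40, 0x41, 0x42, 0x43, 0x44, 0x45, 0x46, 0xFF]

print(hex(translate(task_a, 0x2010)))   # 0x2010
print(hex(translate(task_b, 0x2010)))   # 0x82010
print(hex(translate(task_b, 0xFFFF)))   # 0x1fffff, the top of the 2 MB map
```

<p>
Each process still only sees 64 KB at a time, which is why several users could
each have a full address space without stepping on each other.
</p>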
<p>
OS-9 used device drivers to get around dependence on specific hardware, so
that you could build a computer with a cheap floppy disk controller or with a
good hard disk controller, and the only thing that would change in the OS
would be adding or swapping out the driver code. Application code would not
change at all (if the application code followed the rules). So you could hang
a hard disk on an OS-9 computer basically as soon as the actual hardware was
available, in 1980 or '81.<br />
</p>
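<p>
The driver idea can be sketched in a few lines. This is only a toy model in
Python, not OS-9's actual driver interface: the class names, the
<code>read_sector</code> and <code>write_sector</code> methods, and the
256-byte sector size are all assumptions made for the illustration. The point
is that <code>save_note</code>, standing in for application code, never
changes when the hardware behind it does.
</p>

```python
SECTOR_SIZE = 256  # assumed sector size for this sketch


def _pad(data):
    """Trim or zero-pad data to exactly one sector."""
    return bytes(data)[:SECTOR_SIZE].ljust(SECTOR_SIZE, b"\0")


class FloppyDriver:
    """Stand-in for a driver that talks to a floppy controller."""

    def __init__(self):
        self._sectors = {}

    def read_sector(self, number):
        return self._sectors.get(number, bytes(SECTOR_SIZE))

    def write_sector(self, number, data):
        self._sectors[number] = _pad(data)


class HardDiskDriver:
    """Stand-in for a driver that talks to a hard disk controller."""

    def __init__(self, sector_count):
        self._disk = bytearray(sector_count * SECTOR_SIZE)

    def read_sector(self, number):
        start = number * SECTOR_SIZE
        return bytes(self._disk[start:start + SECTOR_SIZE])

    def write_sector(self, number, data):
        start = number * SECTOR_SIZE
        self._disk[start:start + SECTOR_SIZE] = _pad(data)


def save_note(driver, sector, text):
    """Application code: knows only read_sector and write_sector."""
    driver.write_sector(sector, text.encode("ascii"))
    return driver.read_sector(sector).rstrip(b"\0").decode("ascii")


for driver in (FloppyDriver(), HardDiskDriver(sector_count=16)):
    print(type(driver).__name__, save_note(driver, 3, "accounts receivable"))
```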
<p>
One thing that OS-9 lacked was system support for virtual memory and demand
paging. But it wasn't that necessary, since even native machine code for OS-9
could be (and was supposed to be) written position-independently. You didn't
need a virtual machine for position independence. You hardly needed a virtual
machine at all.
</p>
<p>
A position-independent (PIC) code module, whether program or library, could be
loaded to any available address without a link loader patching it up first for
the address it was loaded to. And multiple processes could call the loaded
module from wherever they themselves were located. Even without virtual
memory, you essentially had shared dynamically loaded libraries from the first
version of the OS.
</p>
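<p>
A toy model may make the idea concrete. In the sketch below, which is my own
simplification and not OS-9's real module format, the "module" stores the
location of its data as an offset from its own start. That offset stands in
for the 6809's PC-relative addressing. Because nothing in the image is an
absolute address, the same bytes work at any load address with no patching.
</p>

```python
MEMORY = bytearray(0x10000)  # toy 64 KB address space

# A toy "module": byte 0 holds the offset of its data, relative to the
# module's own start. Nothing in the image is an absolute address.
MODULE_IMAGE = bytes([3, 0, 0]) + b"OS-9"


def load(image, base):
    """Copy the image to any free base address, with no patching."""
    MEMORY[base:base + len(image)] = image
    return base


def read_data(base, length):
    """Find the module's data by adding its stored offset to its base."""
    start = base + MEMORY[base]
    return bytes(MEMORY[start:start + length])


first = load(MODULE_IMAGE, 0x2000)
second = load(MODULE_IMAGE, 0xC100)
print(read_data(first, 4), read_data(second, 4))
```

<p>
Had the image stored an absolute address instead of an offset, a link loader
would have had to rewrite that address for each load location.
</p>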
<p>
Data modules could also be created and/or loaded pretty much arbitrarily, and
OS-9 could use an MMU for memory protection, so most of what virtual memory
gets used for was taken care of without virtual memory.
</p>
<p>
(I'm looking at, among other things, the pointer-pointers, or handles, that
enabled movable allocation in the Classic Macintosh Toolbox.) <br />
</p>
<p>
There is a fundamental conflict between generalized virtual memory and the
requirements of a real-time OS. Swapping system stuff in and out from disk
makes it hard to respond to real-world events in real time. For a true
real-time OS to support virtual memory would require separating the virtual
memory interface from the real memory interface, and the system itself would
not use the virtual memory system for anything that had to happen in real
time.
</p>
<p>
In other words, virtual memory in a real-time system must be a secondary
service, provided strictly for the run-time environment of non-critical
applications that need to access more memory than is physically available,
need to access large amounts of high-speed memory backed by persistent store,
or need to make sparse use of large address ranges. According to certain
approaches to computing systems, this is the proper approach to virtual memory
in any system, but it is not an approach that has seen much use in the
industry.<br />
</p>
<p>
The perception that a real-time OS is not for ordinary users may have
contributed to lack of interest in OS-9 in the general-purpose personal
computer market.
</p>
<p>
But the reality is that we as users want the computer to respond in real time,
even if we can choose to be patient. That's a huge part of the magic of the
Macintosh interface.<br />
</p>
Microware made graphics libraries available for OS-9/6809, and had a fairly
complete graphical user interface package no later than 1984. I don't remember
if it was originally called Multivue or not, but Radio Shack would sell it as
Multivue for the Color Computer 3 in 1986 or '87 (about three years after most
of us thought they should have).<br />
<p>
Network interfaces comparable to those available for Unix at the time were
also available from the first versions of OS-9.<br />
</p>
<p>
(If you want to take OS-9/6809 for a test drive, I recommend joining the
<a href="https://www.facebook.com/groups/1929079184021683//" target="_blank">Facebook group for NitrOS9</a>:
<a href="https://www.facebook.com/groups/1929079184021683/" target="_blank">https://www.facebook.com/groups/1929079184021683/</a>. They can point you to the web sites for the open source project,
<a href="http://www.nitros9.org/battle.html" target="_blank">NitrOS9</a>, and
they can help with emulation environments to run NitrOS9 in, such as MAME,
<a href="https://github.com/VCCE/VCC" target="_blank">VCC</a>,
<a href="https://github.com/WallyZambotti/OVCC" target="_blank">OVCC</a>, and
<a href="https://www.6809.org.uk/xroar/" target="_blank">XRoar</a>.)
</p>
<p>
It's rather surprising what OS-9 could do on what was essentially a toy
computer from Radio Shack,
<a href="https://en.wikipedia.org/wiki/TRS-80_Color_Computer" target="_blank">the Color Computer</a>, even in 1981. Radio Shack delivered the Color Computer 3, under-engineered
and rather late to the game in 1986, but it was still impressive, much more
usable than the early versions of Microsoft Windows. Even though it was still
only 8-bit, it had features that would not be replicated on PCs in general
until the mid-to-late 1990s.
</p>
<p>
(And the Tandy/Radio Shack Color Computer gets its own share of what-iffing,
of course.)<br />
</p>
<p>
An important note: Microware ported OS-9 to the 68000, releasing it as
OS-9/68000 in 1983. Graphics libraries and GUI for OS-9/68000 came a bit after
that, some of them from 3rd parties.
</p>
<p></p>
<p></p>
<p>
In the late 1970s and early '80s, the 6809 was viewed by at least some of the
talking heads in the Apple community as the next logical step up from the 6502
in the Apple II. Both the 6502 and the 6809 were derived from the 6800. The
indexing techniques commonly used in the 6502 were all more directly and more
efficiently supported by the 6809's design, which produced much faster code in
both benchmarks and real applications. The 6809's partially 16-bit design
helped speed development, as well.<br />
</p>
<p>
The hanging points relative to migrating from the 6502 to the 6809 that I
mentioned above did exist, but a look at them shows they were not serious
problems:
</p>
<ul style="text-align: left;">
<li>
The 6809 followed the 6800 in putting the more significant bytes of numbers
lower in memory, where the 6502's byte order was reversed, following Intel's
habit of putting the less significant bytes first. This was considered a
low-level optimization by some engineers.<br /><br />
<a href="https://defining-computers.blogspot.com/2017/04/lsb-1st-vs-msb-1st-ignore-gulliver.html" target="_blank">I disagree</a>
about it being an optimization. The technological arguments in favor of
less-significant digits appearing lower in memory ignore the global effects,
especially on testing and debugging. Also, arguments that byte order doesn't
really affect algorithms are only excuses for whichever order is chosen, not
reasons to make a choice. If less-significant first should be no detriment,
neither should more-significant first, and changing the mental habits is not
fatal.<br /><br />As an aside, the Z-80 had the same byte order as the 6502,
but the run-time architecture was not comfortable to programmers used to
working in assembly language on the 6502.<br /><br />
</li>
<li>
The 6502 inherited its short address zero page (or page zero) from the 6800.
On the 6800 it's called direct page, but it's pretty much the same idea,
although the 6502 allows somewhat more efficient use. In particular, the
6502 allows indirection on memory in the zero page. This allows a great deal
of flexibility and a fair degree of efficiency when working with pointer
variables and address math, once you learn how it's done on the 6502.<br /><br />The
6809 also inherited the direct page address mode from the 6800. But it has
both (almost) general memory indirection and the LEA address math
instruction, which mean that you don't need the zero page for address math.
Simpler, faster, improved, but different, meaning you have to take a few
hours puzzling out the differences.<br /><br />One of the frequent uses of
address math was incrementing and decrementing pointers, and the 6809
supports that as an address mode, often not even needing an extra
instruction. Or you can use the LEA instruction, if the addressing mode is
not flexible enough. A common issue for engineers used to other CPUs was not
realizing the addressing modes and LEA instructions could be used to replace
whole sequences of 6502 instructions -- or of 8080, Z-80, etc.,
instructions. Unfortunately, programmers who didn't know what they were
doing shared a lot of code, so example code was often less than useful.
<br /><br />If you want to write good code on a particular processor, you
have to spend a few hours looking at the instruction set and addressing
modes and thinking out how to use them.<br /><br />
It's a bit odd, but indirection through variables in the direct page somehow
got left out of the 6809's indexed address modes. For actual indirection,
it's not a big deal -- one extra load with maybe a save and restore of an
index register when necessary. But it makes the 6809's relocatable
direct-page not quite the advance over the 6502's page zero that it could
have been -- especially when trying to calculate the effective address of a
variable in the direct page, since it can't be done with just an LEA. This
is not a fatal problem, by any means. It just takes 3 to 5 instructions to
replicate what should have been doable with just a single LEA.<br /><br />Specifically,
the two workarounds look like this. (I prefer to use the U stack, so the following
uses PSHU and PULU. Those who prefer the S stack should use PSHS and PULS instead.)
If the value in X or D does not need to be saved, the push and pop instructions
can be left out, leaving just one or two extra instructions: <br /><br />
<ol>
<li>
To achieve indirection on a pointer in the direct page, instead of<br />
LDD [&lt;dp_pointer] ; load value at dp_pointer<br />
free an index register if necessary and use it to load the pointer first:<br />
PSHU X ; (if necessary)<br />
LDX &lt;dp_pointer ; get the pointer<br />
LDD ,X ; load through it<br />
PULU X ; (pop it back if it was pushed)<br /><br />
</li>
<li>
To get the address of a pointer in the direct page at run-time, instead
of <br />
LEAX &lt;dp_pointer<br />
free the double accumulator if necessary, and use it to calculate the address:<br />
PSHU A,B ; (if necessary)<br />
TFR DP,A ; high byte of base of direct page<br />
LDB #(dp_pointer-dp_base) ; offset in B<br />
TFR D,X ; put whole address in X<br />
PULU A,B ; (if it was pushed)<br /><br />
</li>
</ol>
It does feel a little more awkward than it really is. D usually will not need to be saved, so it will usually just be 3 instructions. And it's usually not
an operation that occurs often enough to have a significant impact on
execution time, overall.<br /><br />Note that the 6809 does allow memory
indirection on the extended address mode. But that's only useful relative to
the direct page if the direct page address is known and fixed at assemble
time, which interferes with position independent practices and significantly
reduces the usefulness of the direct page.<br /><br />
</li>
<li>
Motorola's plans for the future of the 6809 became somewhat unclear once
microcontrollers based on the 6801 and 6805 were shipping in volume and the
68000 CPU saw better than 99.999% reliability in production. <br /><br />The
earliest materials for the 6809 indicated that the 6809 was being considered
for system-on-a-chip integration, but that never happened -- at least where
it could be seen by general members of the public. Such references tended to
disappear from later materials.<br /><br />I understood from reliable
sources that the 6809 was viewed by some members of management as
cannibalizing the 68000's market share, and that those members of management
pushed for deprecation of the 6809 for marketing purposes. This resulted in
an apparent lack of a "road map", which did become a problem in the
marketplace. On the other hand, eating into the 68000's market never really
was a problem.<br /><br />
</li>
<li>
OS-9 did not directly support non-position-independent code. You could, with
effort, run such code on the system, but you lost most of the advantages of
the OS. <br /><br />Writing code in a position-independent manner was not a
common practice at the time, and many programmers thought it looked more
difficult than it really was.<br /><br />Similar issues existed with the
move to the Macintosh, but going to a 16/32-bit run time from an 8-bit
run-time could be seen to have more apparent benefits than going from the
Apple II/6502 to just the 8/16-bit run-time of the 6809.<br /><br />Because
of the above, code transitioning from the Apple II to OS-9 faced not just
translation from 6502 to 6809, but conversion to position independent
architecture, as well.<br /><br />
</li>
<li>
Since most of Apple's customers would be wanting to run code written in
BASIC for the Apple II on Apple computers based on the 6809, Apple would
need to provide some means of transitioning the code. Apple's engineers
would face the same apparent obstacles and compatibility issues porting
Apple's Integer BASIC to the 6809 under OS-9 that 3rd-party software
companies and end-users would face.<br />
</li>
<li>
Another alternative, producing software to analyze BASIC programs and
convert them (somewhat) automatically to Basic09, was considered the stuff
of pipe dreams, although it was considered. We just generally didn't know
how to do things like that back then. Still don't know how to do it well.
<br /><br />
</li>
<li>
Which would basically leave the alternative of a hardware Apple II emulator
card for the OS-9/6809 computer, for backwards compatibility, and an
OS-9/6809 emulator card to allow Apple II owners to run the OS-9 programs.
<br /><br />We should note that OS-9/6809 cards for the Apple II were, in
fact, produced and sold in early 1981.<br /><br />
</li>
<li>
Motorola had a memory management unit (MMU) for the 6809, called the 6829,
but it slowed the 6809 down by basically one memory cycle per memory access.
One memory cycle felt like a lot, and the 6829 was a little expensive, so
simple bank switching was more commonly used. <br /><br />But OS-9/6809
worked well with moderately-well designed bank switching hardware, so
bank-switching was not such a big problem. And one memory cycle per access
was actually not all that bad, either.<br />
</li>
</ul>
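<p>
For readers more comfortable with a high-level language than with 6809
assembler, the two direct-page workarounds in the list above can be modeled
roughly as follows. This is only a sketch of the arithmetic, not an emulator:
the function names and the sample addresses are made up, and it ignores
wrap-around at the top of memory.
</p>

```python
def direct_page_address(dp, offset):
    """Model of TFR DP,A / LDB #offset / TFR D,X."""
    if dp not in range(256) or offset not in range(256):
        raise ValueError("DP and the offset are both 8-bit values")
    a = dp              # TFR DP,A : high byte of the direct page base
    b = offset          # LDB #(dp_pointer-dp_base) : offset within the page
    return a * 256 + b  # TFR D,X : D is A:B, with A as the high byte


def load_through_pointer(memory, dp, offset):
    """Model of LDX dp_pointer (direct mode) followed by LDD ,X."""
    where = direct_page_address(dp, offset)
    x = memory[where] * 256 + memory[where + 1]   # LDX : pointer, MSB first
    return memory[x] * 256 + memory[x + 1]        # LDD ,X : 16-bit value


memory = bytearray(0x10000)
memory[0x7F10:0x7F12] = bytes([0x40, 0x00])   # dp_pointer holds 0x4000
memory[0x4000:0x4002] = bytes([0x12, 0x34])   # the value it points to

print(hex(direct_page_address(0x7F, 0x10)))           # 0x7f10
print(hex(load_through_pointer(memory, 0x7F, 0x10)))  # 0x1234
```

<p>
Note that the address calculation works wherever DP happens to point at run
time, which is what keeps it compatible with position-independent practice.
</p>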
<p>
Comparing the above to the 68000 as a path of evolution from the 6502 --<br />
</p>
<ul style="text-align: left;">
<li>
The 68000 was also most-significant byte first, so it would be something to
get used to in migrating to the 68000, just as with the 6809.<br /><br />
</li>
<li>
The 68000's equivalent of a direct page is, instead of 256 bytes from
address zero, the 64 kilobytes centered around address zero -- the highest
and lowest 32 kilobytes in address space. <br /><br />While it has no
specific direct page register comparable to the 6809's DP, to move that 64K of address space around with,
it has 8 address registers, one of which might be used as an equivalent of
the direct page register, and it has byte (8-bit) indexed address
modes to improve code density and reduce clock cycle counts near a base
address in any of those 8 registers.<br /><br />The 68000 did not have
memory indirection until the 68020, but, with all the address registers,
that was usually not a problem. Similar to memory indirection on a direct
page variable on the 6809, just load the pointer into a free address
register, and then use that address to load the value pointed to and you're
done. One extra instruction that usually didn't cost much in time or code
space.<br /><br />To illustrate, borrowing the examples above, of pointer math in the direct page on the 6809, and assuming A4 points to the local static allocation area and the variable pointed to is a long (32-bit) variable, <br /><br />
<ol>
<li>
to achieve indirection on a statically allocated local pointer variable, instead of<br />
MOVE.L [pointer(A4)],D0 ; load value at pointer<br />free an address
register if necessary and use it to load the pointer first:<br />
MOVE.L A0,-(A6) ; (not usually necessary)<br />
MOVE.L pointer(A4),A0 ; get the pointer<br /> MOVE.L
(A0),D0 ; load through it<br /> MOVE.L
(A6)+,A0 ; (pop it back if it was pushed)<br /><br />
</li>
<li>
To get the address of a local static pointer at run-time<br />
LEA pointer(A4),A0<br /><br />
</li>
</ol>Usually, A0 would be free anyway, and there would be no need to save and restore it with the push and pop auto-mode MOVEs. So the actual cost for indirection would usually be one extra instruction, not three. Also, I should note that, while I prefer to save things on the parameter stack in a case like this, others prefer not to have a separate parameter stack and the push and pop MOVEs, if used, would be<br /><br /> MOVE.L A0,-(A7)<br /><br />and <br /><br /> MOVE.L (A7)+,A0<br /><br />The 68000 had a large set of data registers (accumulators), so math in
general did not need a zero page. But it also had the full set of load
effective address instructions and auto-increment and auto-decrement addressing
modes, like the 6809, so you didn't even need to use the data registers to
do the address math. Still, you needed to take time to study the register
set and instruction set, if you wanted to write good code.<br /><br />
</li>
<li>
Motorola's plans for the future of the 68000 were made fairly clear by the
marketing team. They (or the vocal members of the sales team) considered the
68000 the future of the company. (Fortunately, that faction did not attempt
to push customers to use the 68000 where the 6805 or 6801 could obviously be
used -- most of the time.)<br /><br />Anyway, it was always clear during the
1980s that Motorola intended to support and improve the 68000 long term.
(That clarity disappeared going into the 1990s, but that is after the decisive
events.)<br /><br />
</li>
<li>
OS-9/68000 was a bit more forgiving of non-relocatable code, with the
68000's larger address space, but the full features of the OS were still
limited to position independent code. So, writing for OS-9/68000 also meant
learning to write PIC.<br /><br />
</li>
<li>
One convenience of the larger address space of the 68000 was that there
would be more room to build a more complete emulation of the functionality
of Apple II software such as Integer BASIC. Other than that, accomplishing
this on OS-9/68000 would have issues similar to the issues on OS-9/6809.<br />
</li>
<li>Likewise, automatic translation, <br /><br /></li>
<li>
and hardware emulation. If you had mission-critical software to carry over
from the Apple II, you wanted to keep your Apple IIs, or you wanted an
emulation card.<br /><br />
</li>
<li>
Motorola had a memory management unit (MMU) for the 68000, called the 68451,
and it slowed the 68000 down a bit, much as the 6829 did the 6809. One
difference was that the 68000's addressing timing was a little more
flexible, so that a full memory cycle delay was not induced by the 68451.<br /><br />Nonetheless,
the 68451 was not a perfect MMU by any means. Semiconductor technology and
the industry's understanding of the problems were not sufficient at the
time. <br /><br />This lack of technology is also seen in the 68000's design
for interfacing with a memory management unit. Things did not quite fit, and
the 68000 could not recover correctly under certain conditions when a new
page of memory had to be called in from hard disk. These problems were fixed
in the 68010 CPU, released publicly in 1982.<br /><br />But, like OS-9/6809,
OS-9/68000 worked well without memory management or virtual memory, and it
really wasn't a big problem. <br /><br />The 68000 provided support for
separating system mode from user mode, which was also improved in the 68010.
<br /><br />One improvement I think they missed was adding 32-bit
constant offsets to the indexed addressing modes of the 68000. The omission
penalized large module designs. This was addressed in the 68020, which, in
my opinion, added way too much complexity. <br /><br />Among other things,
too many of the new instructions and addressing modes of the 68020 cost more
in time and bytes of code than simply doing the same in existing 68000 code.
(The execution time did improve significantly in the 68030.) Also, testing
became significantly more difficult on the 68020 because of the
complexity.<br /><br />For some reason, Motorola refrained from pushing
customers to move their designs to the 68010. I assume that was because they
preferred the customers moving to the 68020, instead, when moving up.<br />
</li>
</ul>
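<p>
Both lists above start with byte order, so here is a small illustration of
what the difference looks like in memory. The value is arbitrary. The
debugging argument I make about byte order is easier to see when you compare
the two dumps with the number as written.
</p>

```python
value = 0x12345678

msb_first = value.to_bytes(4, "big")     # 6800, 6809, 68000 order
lsb_first = value.to_bytes(4, "little")  # 6502, 8080, Z-80, 8086 order

# What a memory dump shows, lowest address on the left.
print(msb_first.hex(" "))   # 12 34 56 78
print(lsb_first.hex(" "))   # 78 56 34 12

# Either order round-trips, as long as reader and writer agree.
assert int.from_bytes(msb_first, "big") == value
assert int.from_bytes(lsb_first, "little") == value
```

<p>
With the more significant byte first, the dump reads the same way the number
is written. With the less significant byte first, you have to reverse it in
your head.
</p>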
<p>
All of that said, Jef Raskin was known to be working on a text-based
appliance computer around 1979. The 6809 was one of the options he was known
to have considered. I would be surprised if he hadn't considered OS-9/6809 for
the project, but I do not at present have proof.
</p>
<p>That's the factual background.</p>
<p>
With the factual background laid out, here's the short version of the daydream
again:
</p>
<p><br /></p>
<p style="text-align: center;">
***** Alternate Reality, Short Version *****<br />
</p>
<p>
If Raskin had induced Apple to produce an Apple-branded text-based appliance
computer running OS-9/6809, they could have introduced the machine in late
1980 with minimal design effort and expense.
</p>
<p>
With that timing, the buggy Apple III of our reality could have been postponed
and redesigned as a bridge machine. Instead of a pair of 6502s, it could have
had a 6502 for Apple II emulation and a 6809 for business applications. Paired
with OS-9/6809, this would have provided a stable mini-computer class
offering.<br />
</p>
<p>
The 6502 emulator might have done double duty as the system console for the
6809, especially if the video buffer in the emulator provided 80-column wide
text. <br />
</p>
<p>
If the initial model did not have bank switching, an advanced model with bank
switching could have been produced quickly, in 1981. This model would have
supported a 2 MB memory map, giving more room for a larger video buffer.
</p>
<p>
They could have produced their own graphics environment/GUI, or, by 1983, they
could have delivered it with Microware's Multivue. None of this would have
required engineers to work 80+ hour weeks.<br />
</p>
<p>
When OS-9/68000 pre-release became available to Apple, Apple might have
re-engineered the Lisa with OS-9/68000 underneath. Perhaps the re-engineering
would have been part of the Macintosh project, or, if Microware were working
with Apple early enough, the Lisa might have had OS-9/68000 as its original
foundation OS. The Toolbox would have been a collection of OS-9 modules.
<br />
</p>
<p>
With a shared technological base, the Lisa and the Macintosh would have shared
most of their software architecture and the success of either would have
supported the success of the other, the Lisa positioned as an office machine
class computer and the Macintosh as a true personal computer at a price within
reach of individuals.
</p>
<p>
A/UX would never have been necessary. Copland could have been NeXT, instead of
a failed mess of marketing dreams.
</p>
<p>IBM PC? </p>
<p>
IBM would have had to re-think that launch. Even if they had ended up using
the 8088, they wouldn't have dared launch it with PC-DOS. They'd have been
motivated enough to overlook the miscommunications with Kildall, so maybe
MP/M. Or maybe they could have gone to TSC and got a Uniflex/8086 -- that
might have been achievable before year-end 1981. Or maybe they'd have been
sensible enough to scrub the 8088 prototype and redo it with a 68000.
</p>
<p>
Or simply copy the OS-9/6809 gambit, get it out in August, and put their stamp
of approval on Apple and Tandy. Or, more likely, set up two lines of PCs, one
based on the 6809 and one on the 8088. And when OS-9/68000 came out, simply
add another line based on the 68000.
</p>
<p>
At any rate, we'd have had a much more pluralistic market, relative to CPUs.
Whether that would have improved the substance of the CPU wars or pushed the
wars more toward pathological can't be seen from here, but we could hope:
</p>
<p>
Among other actual fixes to the 80x86 architecture, Intel might have been
motivated to stretch the segment register widths in the first iteration of the
8086 line, instead of just adding a DMA controller to the register-poor model
continued in the 80186. This might have allowed them to avoid most of the
detours of the 80286. (I think it's doubtful. They had the heritage of the
iAPX 432 to get over.)<br />
</p>
<p>
Motorola, on the other hand, might have made a fully 16-bit 6809 capable of 24
or 32 bits of address. True segment registers? Or maybe just extending the
width of the index registers and the direct page register? (I'd lean toward
the latter, actually.) Hopefully, they'd have fixed the infelicities I've
mentioned above, and a few others related to system/user space separation,
that I haven't.<br />
</p>
<p>
And maybe, just maybe, instead of over-designing the 68020, Motorola might
have taken the lessons of the 6809 and the nascent RISC designs to heart and
refrained from adding instructions and addressing modes that took more time
and more bits of code than simply doing the same thing with existing
instructions, and generally added to complexity more than utility. With a
simplified target run-time architecture, they could have brought a fully
tested, true 32-bit internal 680X0 to market much more quickly.<br />
</p>
<p>
Commodore, of course, would have jumped on OS-9/6809, if they wouldn't have
been doing it alone. <br />
</p>
<p>
And Tandy would have quit puttering around and focused on the Color Computer
line, and had a Color Computer 3 equivalent ready, probably with DMA and other
business-class features, when Microware released Multivue in 1983 or '84.<br />
</p>
<p>
Who knows what would have become of Microsoft? We can suppose they would have
built their company on Xenix. Microsoft BASIC running on Xenix? Heh. But, yes.
We can guess that "Windows" would have remained a GUI paradigm, instead of
becoming a captured trademark of Microsoft. Or maybe Bill Gates would have
doubled down on "Windows", but at least it would have had Xenix underneath.
</p>
<p>
BSD/Research Unix would probably have remained the dominant OS in colleges.
</p>
<p></p>
<p>
Minix, Xinu, and such would still have been developed. But Minix probably
would not have been written first for the 8088, since the 8088 would have been
left behind much earlier.
</p>
<p>
Likewise, the GNU re-implementation of the Unix Workbench. With more solid
microprocessor technology to choose from and earlier implementation of the
core tools, the migration from LISP would not have been necessary on the one
hand, and, on the other hand, the vaporware kernel for Hurd might have seen
real implementation by the early 1990s.<br />
</p>
<p>
Linus might still have made his own kernel, but probably not on the 80386. But
he might not have felt the need to, if the GNU/Hurd OS were already there.
<br />
</p>
<p>How's that for alternate reality?</p>
<p>
(Yeah, I've been thinking about this for a while. Maybe too long. In fact, I'm
<a href="https://joelrees-novels.blogspot.com/2020/01/33209-2nd-Microcomputer-Revolution-Homecoming-TOC.html" target="_blank">working on a novel based on something like this alternate reality, but a little different</a>.) <br />
</p>
零石http://www.blogger.com/profile/01111094813708912513noreply@blogger.com0tag:blogger.com,1999:blog-3954576973420765520.post-91767533810119537202021-08-26T23:08:00.008+09:002021-08-27T22:04:46.909+09:00Differences between the 6800 and the 6801, with Notes on 68HC11 and 6809<p>
<i>(It occurs to me that adding notes about the 68HC11 and 6809 in here would be
useful, but I want to leave the simpler
<a href="https://defining-computers.blogspot.com/2021/08/differences-between-6800-and-6801.html" target="_blank">comparison of the 6800 and 6801</a>
as it is. So I'm duplicating that here, and adding notes for the 6809 and 68HC11. However, I am not comparing any of these with the 6805 or its descendants here.) </i><br />
</p>
<p>
I've described the differences between the 6800 and the 6801 instruction
architectures at length in several other posts. (Many of those posts also take
up other CPUs.) This is a high-level overview.
</p>
<p>
**68HC11: I'll note here that, much as the 6801 is object-code upwards
compatible with the 6800, the 68HC11 is object-code upwards compatible with
the 6801, but with some timing differences.
</p>
<p>
**6809: The 6809 is not object-code upwards compatible with either, although
it is, to a great degree, assembler source upwards compatible with both, using
macros. It is not upward source code compatible with the 68HC11, because it
lacks the 68HC11's hardware divide instructions and direct bit manipulation
instructions. (The bit instructions generally need more than two 6809
instructions to synthesize, and the divide instructions just take too long to
wink at.)<br />
</p>
<p>
(But I still ignore the built-in ROM, RAM, and peripheral devices in the 6801
here. Those are important, but require separate treatment.)
</p>
<p>
**68HC11: (The 68HC11 has built-in RAM, ROM, and peripherals, like the
6801.
</p>
<p>
**6809: The 6809 does not, and never came in a publicly available version that
did, that I know of.)<br />
</p>
<p>
First, where the 6800 had two independent 8-bit accumulators, A and B, the
6801 has, in addition, the ability to combine them as a single 16-bit
accumulator (A:B, or D) for several key instructions, load, store, add, and
subtract: <br />
</p>
<p></p>
<blockquote>LDD<br />STD<br />ADDD<br />SUBD</blockquote>
<p>High byte is in A, low byte in B. <br /></p>
<p>
These are available in the full complement of binary addressing modes that the
6800 provides:
</p>
<ul style="text-align: left;">
<li>immediate (16-bit immediate value) <br /></li>
<li>direct/zero page (addresses 0 to 255)<br /></li>
<li>indexed (with 8-bit constant offset)</li>
<li>extended/absolute (addresses 0 to 65535) <br /></li>
</ul>
<p></p>
<p>
Note that there is no separate CMPD. If you really need to compare two 16-bit
values, you'll need to go ahead and do a destructive compare -- save D in a
temporary, if necessary, and use the subtract instruction. (The way the flags
work, it's only rarely necessary to do so.)
</p>
<p>
Be aware that the D register is not an actual additional register. It is
simply the concatenation of A and B. If you LDD #$ABCD, A will have $AB in it
and B will have $CD in it.
</p>
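<p>
A minimal sketch of that point in Python, as a toy model only. The class name
<code>Accumulators</code> and its attribute names are hypothetical, invented
for illustration. D has no storage of its own here; it is computed from A and
B every time it is read, and writing it just writes A and B.
</p>

```python
class Accumulators:
    """Toy model of the 6801 accumulators: D is a view of A:B, not a register."""

    def __init__(self):
        self.a = 0  # high byte of D
        self.b = 0  # low byte of D

    @property
    def d(self):
        # Reading D concatenates A (high) and B (low).
        return (self.a << 8) | self.b

    @d.setter
    def d(self, value):
        # Writing D splits the 16-bit value across A and B.
        value &= 0xFFFF
        self.a = value >> 8
        self.b = value & 0xFF


acc = Accumulators()
acc.d = 0xABCD   # models LDD #$ABCD
assert (acc.a, acc.b) == (0xAB, 0xCD)
acc.b = 0x00     # models CLRB; D changes along with B
assert acc.d == 0xAB00
```

<p>
The last two lines are the part to remember: anything that changes B (or A)
has changed D, because there is nothing else for D to be.
</p>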
<p></p>
<p></p>
<table border="1">
<caption>
Registers in the 6800:
</caption>
<tbody style="text-align: center;">
<tr>
<td width="50%"><br /></td>
<td>accumulator A:8</td>
</tr>
<tr>
<td width="50%"><br /></td>
<td>accumulator B:8</td>
</tr>
<tr>
<td colspan="2">Index X:16</td>
</tr>
<tr>
<td colspan="2">Stack Pointer SP:16</td>
</tr>
<tr>
<td colspan="2">Program Counter PC:16</td>
</tr>
<tr>
<td width="50%"><br /></td>
<td>Condition Codes CC:8</td>
</tr>
</tbody>
</table>
<p></p>
<table border="1">
<caption>
Registers in the 6801:
</caption>
<tbody style="text-align: center;">
<tr>
<td colspan="2">
<div>
accumulator D (A:B)<br />
<div style="border: 3px ridge black; display: inline-block; text-align: center; width: 47%;">
accumulator A
</div>
<div style="border: 3px ridge black; display: inline-block; text-align: center; width: 47%;">
accumulator B
</div>
</div>
</td>
</tr>
<tr>
<td colspan="2">Index X</td>
</tr>
<tr>
<td colspan="2">Stack Pointer SP</td>
</tr>
<tr>
<td colspan="2">Program Counter PC</td>
</tr>
<tr>
<td width="50%"><br /></td>
<td>Condition Codes CC</td>
</tr>
</tbody>
</table>
<p> </p>
<p>
**68HC11/6809: Both the 68HC11 and the 6809 include the CMPD instruction, with
the full set of addressing modes.
</p>
<p>
**68HC11: The 68HC11 has an actual additional Y (IY) index register. Other
than the additional byte and cycle implied by the prebyte for the IY
instructions, it can do essentially anything the IX register can do.
</p>
<p></p>
<table border="1">
<caption>
Registers in the 68HC11:
</caption>
<tbody style="text-align: center;">
<tr>
<td colspan="2">
<div>
accumulator D:16 (A:B)<br />
<div style="border: 3px ridge black; display: inline-block; text-align: center; width: 47%;">
accumulator A:8
</div>
<div style="border: 3px ridge black; display: inline-block; text-align: center; width: 47%;">
accumulator B:8
</div>
</div>
</td>
</tr>
<tr>
<td colspan="2">Index IX:16</td>
</tr>
<tr>
<td colspan="2">Index IY:16</td>
</tr>
<tr>
<td colspan="2">Stack Pointer SP:16</td>
</tr>
<tr>
<td colspan="2">Program Counter PC:16</td>
</tr>
<tr>
<td width="50%"><br /></td>
<td>Condition Codes CC:8</td>
</tr>
</tbody>
</table>
<p></p>
<p> </p>
<p>
**68HC11: Like the 6800/6801, the 68HC11 has only one indexing mode, a
constant 8-bit unsigned offset of 0 to 255 from the index register, but it can
be used with either IX or IY:
</p>
<ul style="text-align: left;">
<li>n,X or n,Y. <br /></li>
</ul>
<p>
**6809: The 6809 adds to the X index and S stack pointer a Y index
register and a U stack pointer, and allows indexed address modes with both
stack pointers. And the indexed addressing modes are significantly improved,
allowing auto-inc/dec, various sizes of signed constant offsets, variable
offset using the accumulators, pushing and popping multiple registers on the U
and S stacks, and memory indirection. (Deep breath.) Oh. I forgot. PC is
indexable, too.<br />
</p>
<p>
**6809: And the 6809 provides a DP register that serves as the top 8 bits of
address when using the direct page address mode. (Unfortunately, while memory
indirection on an extended address is provided, memory indirection on a
direct-page address is not. Darn.)<br />
</p>
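<p>
As a rough sketch in Python of how the direct page address is formed (the
function name <code>direct_address</code> is hypothetical, chosen for this
example): DP supplies the top 8 bits and the operand byte in the instruction
supplies the bottom 8.
</p>

```python
def direct_address(dp, offset):
    """Effective address for 6809 direct page mode: DP is the high byte."""
    return ((dp & 0xFF) << 8) | (offset & 0xFF)


# With DP = $20, a direct-mode operand of $34 refers to address $2034.
assert direct_address(0x20, 0x34) == 0x2034
# With DP = 0, the page covers addresses 0 to 255.
assert direct_address(0x00, 0x34) == 0x0034
```

<p>
With DP set to zero, the page covers addresses 0 to 255, the same range the
6800 and 6801 direct/zero page mode reaches.
</p>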
<p></p>
<table border="1">
<caption>
Registers in the 6809:
</caption>
<tbody style="text-align: center;">
<tr>
<td colspan="2">
<div>
accumulator D:16 (A:B)<br />
<div style="border: 3px ridge black; display: inline-block; text-align: center; width: 47%;">
accumulator A:8
</div>
<div style="border: 3px ridge black; display: inline-block; text-align: center; width: 47%;">
accumulator B:8
</div>
</div>
</td>
</tr>
<tr>
<td colspan="2">Index X:16</td>
</tr>
<tr>
<td colspan="2">Index Y:16</td>
</tr>
<tr>
<td colspan="2">Indexable Return Stack Pointer S:16</td>
</tr>
<tr>
<td colspan="2">Indexable User Stack Pointer U:16</td>
</tr>
<tr>
<td colspan="2">Indexable Program Counter PC:16</td>
</tr>
<tr>
<td width="50%"><br /></td>
<td>Direct Page Base DP:8</td>
</tr>
<tr>
<td width="50%"><br /></td>
<td>Condition Codes CC:8</td>
</tr>
</tbody>
</table>
<p> </p>
<p>**6809: Indexing modes for the 6809 for all four indexable registers X, Y, U, and S include </p>
<ul style="text-align: left;">
<li>zero offset,<br /></li>
<li>
constant signed offset of
<ul>
<li>5 bits (-16 to 15),<br /></li>
<li>8 bits (-128 to 127),<br /></li>
<li>16 bits (-32768 to 32767),<br /></li>
</ul></li>
<li>
(signed variable) accumulator offset of <br />
<ul>
<li>A (-128 to 127), </li>
<li>B (-128 to 127),</li>
<li>D (-32768 to 32767),<br /></li>
</ul></li>
<li>auto increment/decrement by 1 or 2.<br /></li></ul><div><ul style="text-align: left;">
</ul>
<p>**6809: In addition to the indexed modes referencing the four index registers, two additional indexed modes are provided via the index post-byte encoding:</p><ul style="text-align: left;"><li>program counter relative, with constant signed offset from PC of<br />
</li><ul><li>8 bits (-128 to 127) or <br /></li><li>16 bits (-32768 to 32767), </li></ul><li>extended (absolute) memory indirect.</li></ul><p>**6809: The index post-byte encoding also provides one level of memory indirection on the result address for all indexed addressing modes except for constant 5-bit offset mode. (However, this does not imply double indirection for the extended memory indirect mode.) Memory indirection allows, for example, popping a pointer off a stack and loading an accumulator from the pointer without using an intermediate index register -- thus </p><p></p><blockquote>LDX ,U++ ; pop pointer into X<br />LDA ,X ; use pointer to load A<br /></blockquote><p></p><p>can be done in one instruction, without using X:</p><blockquote><p>LDA [,U++]<br /></p></blockquote>
<p> </p>
<p>Second, the 6801 has an 8-bit by 8-bit integer multiply, </p>
<blockquote><p>MUL</p></blockquote>
<p></p>
<p></p>
<p></p>
<p>
which multiplies the A and B accumulators yielding a 16-bit result in the
double accumulator D. 16-bit multiplies can be done the traditional bit-by-bit
way to save a few bytes, or with four 8-bit MULs and appropriate adding of
columns for a much faster result.
</p>
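<p>
The "four MULs and adding of columns" approach can be sketched in Python. This
shows the arithmetic only, not register allocation or cycle counts, and the
function names <code>mul8</code> and <code>mul16</code> are hypothetical.
</p>

```python
def mul8(a, b):
    """Models MUL: unsigned 8-bit A times 8-bit B, 16-bit product in D."""
    assert 0 <= a <= 0xFF and 0 <= b <= 0xFF
    return a * b


def mul16(x, y):
    """Unsigned 16 x 16 -> 32-bit product built from four 8 x 8 multiplies."""
    xh, xl = x >> 8, x & 0xFF
    yh, yl = y >> 8, y & 0xFF
    low = mul8(xl, yl)                 # column weighted 2**0
    mid = mul8(xh, yl) + mul8(xl, yh)  # two partial products weighted 2**8
    high = mul8(xh, yh)                # column weighted 2**16
    return low + (mid << 8) + (high << 16)


assert mul16(0x1234, 0x5678) == 0x1234 * 0x5678
assert mul16(0xFFFF, 0xFFFF) == 0xFFFE0001
```

<p>
On the real CPU the column sums have to be done with add and add-with-carry,
since the two middle products can carry into the high word.
</p>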
<p>**68HC11/6809: Both the 68HC11 and the 6809 have the 8 by 8 multiply, as well. <br /></p>
<p>
There is no hardware divide in the 6801. You'll have to move up to 68HC11,
68HC12, 68HC16, 68000, or Coldfire, for that.
</p>
<p>
**6809: The 6809 does not have a hardware divide, either. </p><p>**68HC11: The 68HC11 has both
integer and fractional 16 bit by 16 bit hardware divide.<br />
</p>
<p>
Third, the 6801 adds two 16-bit shifts, logical shift left and right,
accumulator-only:
</p>
<p></p>
<blockquote>
<p>LSLD (ASLD)<br />LSRD</p>
</blockquote>
<p></p>
<p>
**68HC11: These two instructions are present in the 68HC11, as well.</p><p>**6809: These two instructions are not present in the 6809. <br />
</p>
<p>
These were considered key instructions. If you need 16-bit versions of
the rest of the accumulator shifts and rotates, they are easy, and not
expensive, to synthesize with shift-rotate or shift-shift pairs.
</p>
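<p>
Here is a sketch in Python of what "shift-rotate pairs" means, using an
arithmetic shift right and a rotate left of A:B as examples. It models the
bytes and the carry only; the function names are hypothetical. The trick is
that the first instruction's carry out becomes the second instruction's
carry in.
</p>

```python
def asr8(v):
    """ASR on a byte: bit 7 is kept, bit 0 falls into carry."""
    return (v & 0x80) | (v >> 1), v & 1


def ror8(v, carry):
    """ROR on a byte: carry enters bit 7, bit 0 falls into carry."""
    return (carry << 7) | (v >> 1), v & 1


def rol8(v, carry):
    """ROL on a byte: carry enters bit 0, bit 7 falls into carry."""
    return ((v << 1) & 0xFF) | carry, v >> 7


def asrd(a, b):
    """16-bit arithmetic shift right of A:B, as ASRA then RORB."""
    a, carry = asr8(a)
    b, carry = ror8(b, carry)
    return a, b, carry


def rold(a, b, carry):
    """16-bit rotate left of A:B through carry, as ROLB then ROLA."""
    b, carry = rol8(b, carry)
    a, carry = rol8(a, carry)
    return a, b, carry


assert asrd(0x80, 0x01) == (0xC0, 0x00, 1)     # $8001 -> $C000, carry set
assert rold(0x80, 0x80, 0) == (0x01, 0x00, 1)  # $8080 -> $0100, carry set
```

<p>
Right shifts start at the high byte (A), left shifts start at the low byte
(B), so that the carry travels in the direction the bits are moving.
</p>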
<p>
(Note that the 6800/6801 does not provide an arithmetic shift left distinct
from the logical shift left. As an exercise for the reader, see if you can
make an argument for doing so, and describe the separate behavior the
arithmetic shift left should have. Heh. Not sure if I'm kidding or not.
Saturation?)
</p>
<p>
Fourth, the 6801 adds a bit of index math, and push and pop for the index
register:
</p>
<blockquote>
<p>ABX (add B to X, unsigned only)<br />PSHX<br />PULX<br /></p>
</blockquote>
<p>
**68HC11: The 68HC11 adds ABY, PSHY, and PULY. It also, of course, adds increment and
decrement Y -- INY/DEY. (But no subtract B from X or Y, darn.)
</p>
<p>
**6809: The 6809 has two sets of pushes and pops, PSHS/U and PULS/U. Each takes a
register list, so you can essentially save or restore the entire processor
state on either stack in a single instruction.
</p>
<p>
**6809/68HC11: There is one important difference between the 6800/6801/68HC11 and the
6809:
</p>
<p>
**68HC11: The stack in the former group is post-decrement push, just as the 6800 stack is, always pointing one byte
below the top of stack (next free byte). (TSX, TSY, TXS, and TYS adjust the
pointer before moving it to or from the index register, so that indexing has
no surprises. But if you save S to memory, then load it to the index register,
surprise!)
</p>
<p>
**6809: The stacks in the 6809 are pre-decrement push, always pointing to the last
element pushed. Never any surprise on the 6809, but you do need to be careful
about this when moving code from the 6800/6801/68HC11 to the 6809.<br />
</p>
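<p>
A toy Python model of the two push conventions may make the off-by-one easier
to see. The class names <code>Stack6800</code> and <code>Stack6809</code>, the
flat <code>bytearray</code> memory, and the starting addresses are all
assumptions made for the example.
</p>

```python
class Stack6800:
    """Post-decrement push, as on the 6800/6801/68HC11 (toy model)."""

    def __init__(self, memory, sp):
        self.memory = memory
        self.sp = sp  # points at the next free byte

    def push(self, byte):
        self.memory[self.sp] = byte & 0xFF  # store first ...
        self.sp = (self.sp - 1) & 0xFFFF    # ... then decrement

    def tsx(self):
        """TSX: X gets SP + 1, the address of the last byte pushed."""
        return (self.sp + 1) & 0xFFFF


class Stack6809:
    """Pre-decrement push, as on the 6809 S and U stacks (toy model)."""

    def __init__(self, memory, s):
        self.memory = memory
        self.s = s  # points at the last byte pushed

    def push(self, byte):
        self.s = (self.s - 1) & 0xFFFF      # decrement first ...
        self.memory[self.s] = byte & 0xFF   # ... then store


old = Stack6800(bytearray(0x10000), 0x00FF)
old.push(0x42)
new = Stack6809(bytearray(0x10000), 0x0100)
new.push(0x42)

# Both leave $42 at $00FF, but the raw pointers differ by one.
assert old.memory[0x00FF] == new.memory[0x00FF] == 0x42
assert old.sp == 0x00FE and new.s == 0x00FF
assert old.tsx() == new.s
```

<p>
TSX hides the difference by adding one on the way into X. Storing the stack
pointer to memory and reloading it into X copies the raw value, which is
where the surprise comes from.
</p>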
<p>
**6809: The 6809 adds (hang on to your seat again) a load effective address
instruction that can load the result address of any indexed addressing mode
into any of the four indexable registers other than the PC. Use the LEA
instruction to add any signed constant offset to X, Y, U, or S, or to add
either 8-bit accumulator or the double accumulator to X, Y, U, or S. PC cannot
be used as a destination of LEA, but it can be used as a source, allowing such things as constant tables embedded in fully position-independent code without having to game the return stack to access them. Yes, this
seriously makes up for the otherwise limited register set of the 6809. Serious
magic.</p>
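<p>
To illustrate the difference between the unsigned ABX and a signed
accumulator offset used with LEA, here is a small Python sketch. It only
models the address arithmetic, and the function names
(<code>sign_extend</code>, <code>abx</code>, <code>lea_b</code>) are
hypothetical.
</p>

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's complement number."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    return ((value & mask) ^ sign) - sign


def abx(x, b):
    """ABX: B is added to X as an unsigned value (0 to 255)."""
    return (x + (b & 0xFF)) & 0xFFFF


def lea_b(base, b):
    """LEA with B as offset: B is treated as signed (-128 to 127)."""
    return (base + sign_extend(b, 8)) & 0xFFFF


assert abx(0x1000, 0xFF) == 0x10FF    # $FF means +255
assert lea_b(0x1000, 0xFF) == 0x0FFF  # $FF means -1
assert sign_extend(0x10, 5) == -16    # 5-bit constant offsets reach -16 to 15
```

<p>
The same sign extension applies to the 5-bit, 8-bit, and 16-bit constant
offsets listed above, which is why LEA can move a pointer backward as easily
as forward.
</p>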
<p>
There is one 6801 instruction (four op-codes, one for each of the binary
addressing modes), and one only, with different semantics from the 6800:
</p>
<blockquote>
<p>CPX (full results in flags)<br /></p>
</blockquote>
<p>
On the 6800, you could only depend on the Zero flag after a CPX. (Actually,
Negative was also set, and oVerflow, too, but not by rules that were useful.)
On the 6801, the Negative, oVerflow, and Carry flags are also set
appropriately, so that you can use any conditional branch, not just BEQ/BNE,
after a CPX and get meaningful results.
</p>
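<p>
What "set appropriately" means for a 16-bit compare can be sketched in Python.
This models the flag rules for a 16-bit subtraction whose result is thrown
away; it does not try to reproduce the 6800's partial behavior. The function
names are hypothetical.
</p>

```python
def compare16(x, m):
    """Flags from the 16-bit subtraction X - M, with the result discarded."""
    x &= 0xFFFF
    m &= 0xFFFF
    r = (x - m) & 0xFFFF
    n = r >> 15                           # sign bit of the result
    z = int(r == 0)
    v = (((x ^ m) & (x ^ r)) >> 15) & 1   # signed overflow
    c = int(x < m)                        # borrow
    return {"N": n, "Z": z, "V": v, "C": c}


def unsigned_lower(flags):
    """Condition tested by BCS/BLO: X is lower than M as unsigned values."""
    return flags["C"] == 1


def signed_less(flags):
    """Condition tested by BLT: X is less than M as signed values."""
    return (flags["N"] ^ flags["V"]) == 1


# $8000 vs $0001: higher as unsigned, but less as signed (-32768 vs 1).
flags = compare16(0x8000, 0x0001)
assert not unsigned_lower(flags)
assert signed_less(flags)
```

<p>
With only Z dependable, as on the 6800, neither of those two tests could be
made directly after a CPX.
</p>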
<p>**68HC11: The 68HC11 adds the CPY instruction, in full addressing modes, with full 16-bit comparison semantics like the 6801 CPX.</p>
<p>
**6809: The 6809 adds CMPY, CMPS, and CMPU, in full addressing modes, with full 16-bit comparison semantics. (The CPX mnemonic is
CMPX on the 6809.)<br />
</p>
<p>These improve index handling, help support stack frames, etc.</p>
<p>
(I am not a fan of stack frames on the return address stack, but the frame
pointer pushed on S could as easily be a pointer into a synthesized parameter
stack. I think maintaining a synthetic parameter stack is no more expensive
than maintaining stack frames in a combined stack on the 6800 or 6801.) </p><p>**6809: The 6809 U register can be used as a frame pointer in a combined stack run-time architecture. Alternatively, in a split-stack run-time, it can be pushed to the S stack on routine entry as a frame link. <br /></p>
<p>
The 6801 provides an additional op-code for the call instruction, adding the
direct/zero page addressing mode: <br />
</p>
<blockquote>
<p>JSR (direct page) <br /></p>
</blockquote>
<p>
This allows the programmer to put short, critical subroutines in the direct
page for more efficient calls.
</p>
<p>
(JMP does not get the additional op-code, which means that inner-interpreter
loops for virtual machines do not benefit from allocation in the direct page,
same as the 6800.)</p><p>**68HC11: The 68HC11 follows the 6801 with regard to JSR and JMP. Only extended and indexed mode for JMP, but direct page mode added for JSR.<br /></p><p>**6809: The 6809, on the other hand, adds direct page mode JMP as well as JSR.</p><p>**68HC11: The 68HC11 follows the 6800/6801 in not providing direct page mode addressing for unary instructions (increments/decrements, shifts, etc.).</p><p>**6809: The 6809, on the other hand, does provide the direct page mode opcodes for all unary instructions. (Unfortunately, it does not provide indirection through the direct page.)<br />
</p>
<p>Finally, the 6801 adds a branch never instruction:</p>
<blockquote><p>BRN</p></blockquote>
<p>
This can be useful as a marker no-op in object code, helpful in debugging,
linking, and compiler code generation. </p><p>**68HC11: The 68HC11 also includes BRN.</p><p>**6809: The 6809 also includes BRN.</p><p>**6809: The 6809 provides long versions of all branches. This, in addition to allowing PC relative indexing, provides significant support for position independent coding so that modules can be loaded anywhere in memory. <br /></p>
<p>
If you want more information than this and my writing seems understandable
--<br />
</p>
<p>
You can see something of how these additions improve code in a
<a href="https://joels-programming-fun.blogspot.com/2020/12/64-bit-addition-on-four-retro-cpus-6800.html" target="_blank">post I put up describing 64-bit math on several of the Motorola CPUs</a>: <br /><a href="https://joels-programming-fun.blogspot.com/2020/12/64-bit-addition-on-four-retro-cpus-6800.html" target="_blank"></a>
</p>
<blockquote>
<a href="https://joels-programming-fun.blogspot.com/2020/12/64-bit-addition-on-four-retro-cpus-6800.html" target="_blank">https://joels-programming-fun.blogspot.com/2020/12/64-bit-addition-on-four-retro-cpus-6800.html</a>
</blockquote>
Also, I discussed
<a href="https://defining-computers.blogspot.com/2021/08/guessing-which-motorola-microcontroller-6801-6805-6811-6809.html" target="_blank">microcontroller (plus 6809) differences in this post</a>:<br />
<blockquote>
<a href="https://defining-computers.blogspot.com/2021/08/guessing-which-motorola-microcontroller-6801-6805-6811-6809.html" target="_blank">https://defining-computers.blogspot.com/2021/08/guessing-which-motorola-microcontroller-6801-6805-6811-6809.html</a>
</blockquote>
I have a long
<a href="https://defining-computers.blogspot.com/2018/12/68hc11-is-not-modified-6809-and-what-if.html" target="_blank">rant discussing the differences between the 68HC11 and the 6809, which picks
up the 6801 and 68000 along the way</a>:
<p></p>
<blockquote>
<p>
<a href="https://defining-computers.blogspot.com/2018/12/68hc11-is-not-modified-6809-and-what-if.html" target="_blank">https://defining-computers.blogspot.com/2018/12/68hc11-is-not-modified-6809-and-what-if.html</a><br />
</p>
</blockquote>
<p>
And I have
<a href="https://joelrees-novels.blogspot.com/2020/02/33209-little-about-6800-and-others.html" target="_blank">this chapter</a>
of a
<a href="https://joelrees-novels.blogspot.com/2020/01/33209-2nd-Microcomputer-Revolution-Homecoming-TOC.html" target="_blank">novel in process</a>
(or suspended animation, not sure which),
<a href="https://joelrees-novels.blogspot.com/2020/02/33209-little-about-6800-and-others.html" target="_blank">which gives a bit more of a detailed discussion of the 6800 instruction set
architecture, touching on the 6801</a>:
</p>
<blockquote>
<p>
<a href="https://joelrees-novels.blogspot.com/2020/02/33209-little-about-6800-and-others.html" target="_blank">https://joelrees-novels.blogspot.com/2020/02/33209-little-about-6800-and-others.html</a><br />
</p>
</blockquote>
<p>
Also, with Joe H. Allen's permission, I forked his Exorsim project and
<a href="https://osdn.net/users/reiisi/pf/exorsim6801/wiki/FrontPage" target="_blank">added instruction set architecture support for the 6801</a>
to it. You can find source code for a fig-Forth implementation for the 6800
and an implementation (somewhat) optimized to the 6801 in the test source code
I include there:
</p>
<blockquote>
<p>
<a href="https://osdn.net/users/reiisi/pf/exorsim6801/wiki/FrontPage" target="_blank">https://osdn.net/users/reiisi/pf/exorsim6801/wiki/FrontPage</a>
<br />
</p>
</blockquote>
<p>
My
<a href="https://sourceforge.net/projects/asm68c/" target="_blank">assembler for 6800/6801</a>
may be useful for assembling the fig-Forth source:
</p>
<blockquote>
<p>
<a href="https://sourceforge.net/projects/asm68c/" target="_blank">https://sourceforge.net/projects/asm68c/</a><br />
</p>
</blockquote>
<p> </p>
</div>零石http://www.blogger.com/profile/01111094813708912513noreply@blogger.com0tag:blogger.com,1999:blog-3954576973420765520.post-46349291978606851492021-08-23T22:03:00.008+09:002021-08-30T16:00:19.358+09:00Differences between the 6800 and the 6801 (Revisited)<p>I've described the differences between the 6800 and the 6801 instruction architectures at length in several other posts. This is a high-level overview.</p><p><i>(This same <a href="https://defining-computers.blogspot.com/2021/08/differences-between-6800-and-6801-with-notes-68hc11-6809.html" target="_blank">rant with notes on 68HC11 and 6809</a> can be found here: <a href="https://defining-computers.blogspot.com/2021/08/differences-between-6800-and-6801-with-notes-68hc11-6809.html" target="_blank">https://defining-computers.blogspot.com/2021/08/differences-between-6800-and-6801-with-notes-68hc11-6809.html</a> .)</i><br /></p><p>(But I still ignore the built-in ROM, RAM, and peripheral devices in the 6801. Those are important, but require separate treatment.)<br /></p><p>First, where the 6800 had two independent 8-bit accumulators, A and B, the 6801 has, in addition, the ability to combine them as a single 16-bit accumulator (A:B, or D) for several key instructions, load, store, add, and subtract: <br /></p><p></p><blockquote>LDD<br />STD<br />ADDD<br />SUBD</blockquote><p>High byte is in A, low byte in B. <br /></p><p>These are available in the full complement of binary addressing modes that the 6800 provides:</p><ul style="text-align: left;"><li>immediate (16-bit immediate value) <br /></li><li>direct/zero page (addresses 0 to 255)<br /></li><li>indexed (with 8-bit constant offset)</li><li>extended/absolute (addresses 0 to 65535) <br /></li></ul><p></p><p>Note that there is no separate CMPD. If you really need to compare two 16-bit values, you'll need to go ahead and do a destructive compare -- save D in a temporary, if necessary, and use the subtract instruction. 
(The way the flags work, it's only rarely necessary to do so.)</p><p>Be aware that the D register is not an actual additional register. It is simply the concatenation of A and B. If you LDD #$ABCD, A will have $AB in it and B will have $CD in it.<br /></p><p>Second, the 6801 has an 8-bit by 8-bit integer multiply, </p><blockquote><p>MUL</p></blockquote><p></p><p></p><p></p><p>which multiplies the A and B accumulators yielding a 16-bit result in the double accumulator D. 16-bit multiplies can be done the traditional bit-by-bit way to save a few bytes, or with four 8-bit MULs and appropriate adding of columns for a much faster result.</p><p>There is no hardware divide in the 6801. You'll have to move up to 68HC11, 68HC12, 68HC16, 68000, or Coldfire, for that.<br /></p><p>Third, the 6801 adds two 16-bit shifts, logical shift left and right, accumulator-only:</p><p></p><blockquote><p>LSLD (ASLD)<br />LSRD</p></blockquote><p></p><p>These were considered key instructions. If you need 16-bit versions of the rest of the accumulator shifts and rotates, they are easy, and not expensive, to synthesize with shift-rotate or shift-shift pairs.<br /></p><p>(Note that the 6800/6801 does not provide an arithmetic shift left distinct from the logical shift left. As an exercise for the reader, see if you can make an argument for doing so, and describe the separate behavior the arithmetic shift left should have. Heh. Not sure if I'm kidding or not. Saturation?) </p><p>Fourth, the 6801 adds a bit of index math, and push and pop for the index register: </p><blockquote><p>ABX (add B to X, unsigned only)<br />PSHX<br />PULX<br /></p></blockquote><p>There is one 6801 instruction (four op-codes, one for each of the binary addressing modes), and one only, with different semantics from the 6800:</p><blockquote><p>CPX (full results in flags)<br /></p></blockquote><p>On the 6800, you could only depend on the Zero flag after a CPX. 
(Actually, Negative was also set, and oVerflow, too, but not by rules that were useful.) On the 6801, the Negative, oVerflow, and Carry flags are also set appropriately, so that you can use any conditional branch, not just BEQ/BNE, after a CPX and get meaningful results. <br /></p><p>These improve index handling, help support stack frames, etc.</p><p>(I am not a fan of stack frames on the return address stack, but the frame pointer pushed on S could as easily be a pointer into a synthesized parameter stack. I think maintaining a synthetic parameter stack is no more expensive than maintaining stack frames in a combined stack on the 6800 or 6801.) </p><p>The 6801 provides an additional op-code for the call instruction, adding the direct/zero page addressing mode: <br /></p><blockquote><p>JSR (direct page) <br /></p></blockquote><p>This allows the programmer to put short, critical subroutines in the direct page for more efficient calls. </p><p>(JMP does not get the additional op-code, which means that inner-interpreter loops for virtual machines do not benefit from allocation in the direct page, same as the 6800.)<br /></p><p>Finally, the 6801 adds a branch never instruction:</p><blockquote><p>BRN</p></blockquote><p>This can be useful as a marker no-op in object code, helpful in debugging, linking, and compiler code generation.</p><p>At this point, you should suspect that the 6801 is fully object-code level upwards compatible with the 6800. It is. 
<br /></p><p>If you want more information than this and my writing seems understandable --<br /></p><p>You can see something of how these additions improve code in a <a href="https://joels-programming-fun.blogspot.com/2020/12/64-bit-addition-on-four-retro-cpus-6800.html" target="_blank">post I put up describing 64-bit math on several of the Motorola CPUs</a>: <br /><a href="https://joels-programming-fun.blogspot.com/2020/12/64-bit-addition-on-four-retro-cpus-6800.html" target="_blank"></a></p><blockquote><a href="https://joels-programming-fun.blogspot.com/2020/12/64-bit-addition-on-four-retro-cpus-6800.html" target="_blank">https://joels-programming-fun.blogspot.com/2020/12/64-bit-addition-on-four-retro-cpus-6800.html</a></blockquote>Also, I discussed <a href="https://defining-computers.blogspot.com/2021/08/guessing-which-motorola-microcontroller-6801-6805-6811-6809.html" target="_blank">microcontroller (plus 6809) differences in this post</a>:<br /><blockquote><a href="https://defining-computers.blogspot.com/2021/08/guessing-which-motorola-microcontroller-6801-6805-6811-6809.html" target="_blank">https://defining-computers.blogspot.com/2021/08/guessing-which-motorola-microcontroller-6801-6805-6811-6809.html</a></blockquote>I have a long <a href="https://defining-computers.blogspot.com/2018/12/68hc11-is-not-modified-6809-and-what-if.html" target="_blank">rant discussing the differences between the 68HC11 and the 6809, which picks up the 6801 and 68000 along the way</a>:<p></p><blockquote><p><a href="https://defining-computers.blogspot.com/2018/12/68hc11-is-not-modified-6809-and-what-if.html" target="_blank">https://defining-computers.blogspot.com/2018/12/68hc11-is-not-modified-6809-and-what-if.html</a><br /></p></blockquote><p>And I have <a href="https://joelrees-novels.blogspot.com/2020/02/33209-little-about-6800-and-others.html" target="_blank">this chapter</a> of a <a 
href="https://joelrees-novels.blogspot.com/2020/01/33209-2nd-Microcomputer-Revolution-Homecoming-TOC.html" target="_blank">novel in process</a> (or suspended animation, not sure which), <a href="https://joelrees-novels.blogspot.com/2020/02/33209-little-about-6800-and-others.html" target="_blank">which gives a bit more of a detailed discussion of the 6800 instruction set architecture, touching on the 6801</a>:</p><blockquote><p><a href="https://joelrees-novels.blogspot.com/2020/02/33209-little-about-6800-and-others.html" target="_blank">https://joelrees-novels.blogspot.com/2020/02/33209-little-about-6800-and-others.html</a><br /></p></blockquote><p>Also, with Joe H. Allen's permission, I forked his Exorsim project and <a href="https://osdn.net/users/reiisi/pf/exorsim6801/wiki/FrontPage" target="_blank">added instruction set architecture support for the 6801</a> to it. You can find source code for a fig-Forth implementation for the 6800 and an implementation (somewhat) optimized to the 6801 in the test source code I include there:</p><blockquote><p><a href="https://osdn.net/users/reiisi/pf/exorsim6801/wiki/FrontPage" target="_blank">https://osdn.net/users/reiisi/pf/exorsim6801/wiki/FrontPage</a> <br /></p></blockquote><p>My <a href="https://sourceforge.net/projects/asm68c/" target="_blank">assembler for 6800/6801</a> may be useful for assembling the fig-Forth source:</p><blockquote><p><a href="https://sourceforge.net/projects/asm68c/" target="_blank">https://sourceforge.net/projects/asm68c/</a><br /></p></blockquote><p> </p>零石http://www.blogger.com/profile/01111094813708912513noreply@blogger.com0tag:blogger.com,1999:blog-3954576973420765520.post-69703418794717790382021-08-15T17:07:00.014+09:002022-05-07T19:05:49.907+09:00The Real Reason IBM Chose the "Wrong" CPU for the PC<p>[JMR202108181958 -- adding summary:</p><p>My own sisters don't want to read this because I'm speaking too much Geek. Re-reading it, I guess they're right. 
And what I wrote seems to wander all over without apparent reason -- not apparent unless you already know what I'm trying to say.</p><p>So, I guess I should put a high-level summary up front here:</p><p>Up until twenty years ago, common wisdom was that IBM picked the "right" CPU for the IBM PC. Only crackpots like me thought differently. But the evidence mounts, and now the common wisdom is that IBM picked the "wrong" CPU for the "right reasons". </p><p>And then you still hear lots of things that don't match reality. Or, at least, I still hear a lot of things that don't match what I know and remember on the subject. That's part of the reason I wrote this rant, to tell what I remember of things.</p><p>But you never hear what I think is the real reason, and that's the real reason I wrote this.</p><p>The short version of what I understood at the time was the reason -- and I have seen no real evidence to the contrary -- is this:<br /></p><p>(1) IBM didn't choose Motorola's 6809 because Motorola did not support it like they should have. Motorola didn't support the 6809 like they should have because they were afraid of it eating into the 68000's market.</p><p>(2) IBM didn't choose Motorola's 68000 because they (IBM) were afraid that a personal computer based on the 68000 would eat into their market for their mid-range (minicomputer-class) computers based on the System 3.</p><p>That's the conclusion, and the rest of this rant is about where that conclusion comes from.<br /></p><p>] <br /></p><p>You need to understand the situation in 1979-81 clearly.<br /></p><p>You have to remember that IBM was not officially considering entering the personal/home computer manufacturing business. </p><p>In another company, the project to develop the PC might have been called a skunkworks project. But the PC project had even less official recognition. Yes, they were working separately from the main company, yes, management kept hands-off. (Some had even washed their hands of it.) 
Work was performed in secrecy, and it was not only started without contract or official directive, it was mostly complete by the time upper-level management acknowledged it.</p><p>IBM already had their blue-sky research projects, which was something more akin to the Skunk Works at Lockheed. This was different.<br /></p><p>Not to say that Skunk Works and blue-sky were completely free from adversarial management, but the PC project was pursued in a much more adversarial management environment. The engineers who built the initial prototype were permitted to do so by their manager, who acted in specific contradiction to direction from the next-up level of management. </p><p>At least, that's the story I heard several times while working internships for IBM, and those stories matched what I was seeing, where later stories do not.<br /></p><p>Of course, those who let the project move ahead were, to more-or-less extent, putting their own jobs on the line for the results. <br /></p><p>IBM's marketing and engineering did not want to deal with the threat of microprocessors in general-purpose computing devices. The attitude I heard was, of course microprocessors can't do the job. They are strictly for controls devices and calculators. </p><p>And it wasn't exactly a mistaken attitude. All existing microprocessors at the time were missing elements that were important to general-purpose computing -- memory management hardware, direct memory access input-output channels, a hardware timer for dividing the CPU's time between tasks, proper hardware division of task resources, .... And the list goes on. </p><p>Microprocessors are still missing much of that list. But even the "big" computers of the time didn't have all these things in place, either. So it was, in fact, ignoring reality. <br /></p><p>I will mention this attitude again further down, but this is enough to get a feel of how things were at IBM.<br /></p><p>Apple and Commodore's history is pretty well known. 
That is to say, I was not close enough to them to add much, so I won't. Everyone knows that Apple IIs were selling well in business and education markets, and Commodore's offerings were just behind in business and education, but were ahead in personal/home use sales, and eating away at the dedicated games machine market.<br /></p><p>But Radio Shack's history is not so well known. This should not be surprising. Radio Shack really didn't have an approach to write about. The TRS-80 sort of fell into their lap. </p><p>(Again, I was listening to local management discuss things while it was happening. I was trying to be a Radio Shack salesman in Odessa, Texas when the first TRS-80s were delivered. I got to unpack our demo unit and write something up as a demonstration program.) </p><p>The guy who designed the Z-80-based TRS-80 original model (now called Model 1) just kludged some stuff together from demonstration circuits published by Zilog and hobby circuits from the hobby industry. I saw the circuit diagrams, and I knew enough to tell where some of the short-cuts had been taken that were a bit beyond specifications for the parts. It was intended as a proof-of-concept, but Radio Shack had no engineers at the time to actually fix the design, and had no motivation to do so, either. It was part of their culture, and they were looking to compete with folks like Tramiel.</p><p>Now, I took a break from the technical world during 1979 and 1980, to serve God according to my understandings and belief. When I returned, Radio Shack had hired a few engineers and was half-heartedly trying to clean up the Z-80-based model designs. </p><p>And they had had another personal/home computer design fall into their lap, the M6809-based Color Computer. This one was a Motorola example circuit, with just a little modification. 
</p><p>The 6809 was potentially more powerful than the Z-80, powerful enough to go head-to-head with the 8088 in certain applications, powerful enough to build a minicomputer-class microcomputer with. But Radio Shack didn't have the engineers or the motivation to do anything but sell the thing as a game machine -- or as a toy for geeks. </p><p>Microware had been able to get their OS-9/6809 operating system running on the Color Computer, and it was suddenly a serious business and/or industrial controller class machine -- a seriously (woefully) hobbled business machine, but squarely in the same class the IBM PC would shortly be released into. So Radio Shack let some engineers cobble together a floppy controller that could support OS-9/6809 and could be plugged into the game cartridge port -- instead of investing in designing a true business-class machine as an upgrade to the Color Computer.<br /></p>Admittedly, Motorola was not really supporting the 6809. They were busy selling into the microcontroller market as many true 8-bit 6805s and mixed 8/16-bit 6801s as they could manufacture, and they considered their future to be riding on such smaller microcontrollers and on the 68000.<p>We should remember that the nascent PC market was not nearly as important to Motorola (or to Zilog) as it was to Intel or Commodore. Motorola sold several orders of magnitude more microcontrollers than any CPU manufacturer has ever sold microprocessors for personal computers. And it was hard to argue with their attitude. It would be fifteen years down the road before mind-share issues would be eroding their position in the controls market enough to get the full attention of Motorola's management.<br /></p><p>Focusing on controls was not a mistake for Motorola. Not seeing the mind-share problem was. 
And only a very few engineers anywhere were seeing the potential uses for PCs as communications devices at the time, so it's not too surprising that Motorola didn't recognize why the PC was important to their future.<br /></p><p>That was how things stood when I returned in late 1980, intending to work my way through school as an electronics tech.<br /></p><p>So, Radio Shack needed to have better engineering to do anything with the better CPU technology that had fallen into their lap -- twice. </p><p>Commodore's Tramiel wasn't the only one in the industry who thought engineering should be sacrificed for price and immediate profits, and was not the only one who found himself left to drink from the marketing stream he himself had polluted. <br /></p><p>So what about engineering? Was it really so obviously necessary back then as it seems in hind-sight?<br /></p><p>Motorola was fighting with engineering missteps in the 68000. Intel was struggling with over-engineering the iAPX 432 (and under-engineering the 8086, but ...). Zilog was struggling with similar problems with their Z8000. </p><p>The first 32-bit addressing CPUs (microprocessors or not) all bore the marks of engineering specs that were way too ambitious in application areas that were poorly understood -- too much engineering without foundation in real-world experience. (We're still doing it.) <br /></p><p>Consider this -- the 32-bit CPUs existing before the 68000 were not microprocessors, of course. They were all leading-edge hardware in mainframes, and the companies that produced them were protecting their inventions and technology with strict secrecy. </p><p>Large memory systems, sharing a processor between multiple tasks, and coordinating the work of multiple processors were all brand new application fields for most of the industry. </p><p>Real technology was just not available. That is, what was available other than theoretical technology was really hard to come by. 
And when you engineer things that complex with little real-world experience to guide you, you're going to make mistakes.<br /></p><p>Insane competition drove the industry to push ahead into the 32-bit world a lot harder and faster than was safe or wise. In some senses, IBM's and Motorola's hesitation made sense.<br /></p><p>Usually, you hear that IBM's options to the 8086 were Motorola's 68000 and Zilog's Z8000. Of the three, the obvious choice was the 68000, and the usual question is why it was not chosen.<br /></p><p>Remember, nobody knew what they were doing. </p><p>The 68K was the best of the actually
available options, but Motorola's design for handling the big memory
space was too ambitious, relying on false understandings of the
underlying problems. That was where the "bugs" were, although nobody
really had a better handle on it at the time. </p><p>Three specific design
misfeatures in the 68000: </p><p>(1) Complexity, and the expectations that induced, although that didn't get really out of hand until the 68020. </p><p>Nowadays, the 68000 seems relatively tame, but it was at least an order of magnitude more complex than, say, the 6809, and we lacked testing and design tools for the complexity then. </p><p>You could use 8-bit engineering on the 68000, but you had to ignore all the cool features of the CPU to do so. That was hard for engineers designing for the 68000 to stomach, not to mention for management and marketing to justify. </p><p>I think it was less the cost of supporting it all than having to plan on starting with a machine that both you and the customer would know was going to be replaced by a more complete re-design within the year.</p><p>So, why not build the more complex design to start with? That was what a lot of tech companies tried to do, and what a lot of them failed at.<br /></p><p>It took too long to test, especially to the expectation levels we had then. We were too afraid of bugs in non-critical software and hardware. We were missing tools, and management was too scared of sinking money into the projects to fund developing them.</p><p>We are now used to the market itself being a required testing stage, but we weren't used to that idea back then. (Think about how your "smart" phone does so many things you don't want it to now. Those are misfeatures, you know -- bugs -- that you are testing.)</p><p>For the record, note that the 8086 was about half an order of magnitude more complex than the 6809, although it was less complex than the 68000.<br /></p><p>(2) The original 68000 was not directly fully 32-bit addressing when using position-independent code. (Absolute addressing code, yes. Position independent, almost, but requiring software shims for modules larger than 64K.) There were missing 32-bit constant offsets in certain addressing modes. </p><p>Position
independence was a great plan, and Motorola should be commended for embracing it. But by only giving offsets of +/- 32K in those indexing
modes, Motorola had built the 68K with a hidden barrier to
overcoming the 64K module size barrier when designing for position independent code. You had to use two instructions and a register (a 32-bit load immediate in addition to the register-offset indexing mode of the instruction you wanted) to get full 32-bit range with position independence. </p><p>This is most of the reason Codewarrior for Mac had a small memory model similar
to the x86's small memory model. If the 68000 had had full-range constant offsets,
the memory models for the 68000 could have been blended, and programmers
wouldn't have had to plan for it, and the small model would have disappeared as the compiler matured.</p><p>Imagine telling your manager about that, after you successfully
lobbied for the 68000 instead of the cheaper, but less capable 8086, because the 8086 had the barrier and the 68000 didn't. (Then think about trying to explain why position independence, which wasn't really achievable on the 8086, was important.) </p><p>Well, two instructions and one data register on a CPU with 8 data registers is not nearly the impediment that four-to-eight instructions and the only accumulator are on a CPU with only one accumulator. The small and big models on the 68000 could have been blended anyway, but we (the market) had this antipathy to compilers that took more than two optimizing passes and then added another optimizing pass in the link phase. </p><p>(It would be a few years before we as an industry generally recognized that optimizing too early was a big problem, but that's a rant for another day.)<br /></p><p>(3) Motorola made some problematic design decisions in how the 68000 handled exceptions in the intransigent case of memory page faults and such. As a result, there was not enough information about the cause of the exception to recover and continue, which made it difficult to share memory between running applications safely.</p><p>Intel just
didn't handle it at all in the 8086, which left them able to quickly recover from
the mistakes of trying to implement too much complexity in the 80286
when they moved to the 80386. Most companies who needed to deal with memory exceptions for the 80286 said they would try to implement the exception stuff in their next
software upgrades, but by the time they were getting started on the next
upgrades, Intel had the design for the 386 ready. The 80386's approach was much simpler,
and nobody needed to go down the 80286 rabbit hole any further.</p><p>Motorola
tried to handle those exceptions in the original 68000, but got it slightly wrong, and that, more than
the extra clock cycles, was what kept the 68451 from being the MMU in
any of the major workstations built on the 68K. Engineers understood the
problems of wait states, and expected that the newer versions would be
able to run with fewer wait states. They could expect wait states to be
handled in hardware. But the exception handling misfeature meant having to plan on
re-writing the very code that they thought they only wanted to write
once. </p><p>Motorola did fix that in the 68010, which was released to the market in 1982. Unfortunately, they did not fix the 32-bit constant offset problems until the 68020.</p><p>Now, it should be noted that, ultimately, problems in the code for handling memory management are
still biting us in the form of microcode vulnerabilities, and somebody has to regularly rewrite it. (Remember the Spectre and Meltdown vulnerabilities, for example?) ARM64 suffers
less than AMD/Intel64, but the CPU vendors are still struggling with it. It's a very difficult problem to solve well.<br /></p><p>Which means
that the mistakes in the 68000's exception handling were not really a 68000 problem, they were a general problem for the whole industry. But everyone still likes to call it a 68000 problem, because no one really
wants to admit we as a race don't already know everything we need to
know for handling today's problems yesterday.</p><p>Now, Intel did have an advantage of sort-of learning from their
misadventures with the iAPX 432. That is, they just decided to punt on the problem with the 8086, and gave it non-enforcing segment registers -- which aren't really segment registers. It was a (not very good) non-solution with interesting and sort-of useful side-effects that cause more problems down the road.<br /></p><p>The 8086 segment registers, instead of being the width of the 20-bit addressing of the 8086, were just sixteen bits slid over four, which made the segment registers clumsy to work with as base registers for large arrays and such. You ended up needing four to eight instructions and two or three registers to properly handle 20-bit pointers. <br /></p><p>And there were no segment limit registers, which is why I say they were not really segment registers. Implementing segment limits in software on the 8086 essentially doubled the already excessive instruction overburden.<br /></p><p>In a sense, the pseudo-segmentation was used like a cheap alternative to bank-switching, avoiding the use of external hardware to achieve expanded memory range. Ironically, though, until the 80386 was available, external hardware bank switching was still used in addition to those pseudo-segment registers on the 80x86-based PCs with large memory requirements.<br /></p><p>The segment registers did not really solve the underlying problems, nor did they contribute to a proper external hardware solution, they just allowed (with the cooperation of the market) the problems to be swept under the rug until the 80386 was available.</p><p>Yes, as I mentioned above, the 68000 had sort-of similar problems, but not nearly to the same degree. If you needed segmentation on the 68000, it was just a matter of index modes, and even with the shortcoming with constant offsets, it only took two instructions and a register to take care of, not four-to-eight instructions. 
(You just had to ignore the obvious CHK instruction if your segments were going to be larger than 64K -- until the 68020, when that was fixed. But, back then, nobody wanted to waste time bounds-checking things anyway.)<br /></p><p>(4) (Note, this is more than three.) The 8088's 8-bit external was cheaper! -- but no, not really. </p><p>You may have heard such nonsense about the 8-bit external 8088 being cheaper to design for than the 16-bit external 68000. Let's calculate this: <br /></p><p>-- Yes, the entry level model would have had sixteen 16Kx1 dRAMS instead of just eight. </p><p>No, that would not have broken the bank. The initial sticker price could have added the extra USD 32.00 or so without breaking any significant price barrier. (Go look up the original prices.) That differential would drop as production ramped up.<br /></p><p>The RAM configuration would have been 16 chips wide instead of 8, but that was only a small routing problem -- eight extra wires. The max on-board RAM for the mainboard in the original version could have remained at 64K, although in configuration of 32Kx16 bits (32Kx2 bytes).</p><p>-- Yes, the expansion bus did present a problem. Catering to the 8088's limitations allowed sweeping the expansion bus width under the rug for a couple of years, limiting performance options that could otherwise have been had if the 8086 had been a planned option from the beginning. </p><p>Wider connectors were more expensive, but only until ordered in large numbers. Total added to the initial price? Maybe USD 5.00. And that would disappear quickly as production ramped up.<br /></p><p>Really, the narrower 8-bit expansion bus was solving a different problem than cost. <br /></p><p>-- The 8 kilobytes of BIOS ROM was done as 4 2Kx8 ROMS, and the only difference necessary would have been the 16-bit wide data bus configuration. 
16-bit data meant pairing the ROMs, but 8 kilobytes is 8 kilobytes whether done as 4 unpaired ROMs or 2 pairs of the same size ROMs:</p><blockquote><p>4x(2Kx8)== 8Kx8 bits => eight kilobytes<br /></p></blockquote><p>vs. <br /></p><blockquote><p>2x2x(2Kx8) == 4Kx16 bits => eight kilobytes<br /></p></blockquote><p>Yes, the engineers would have had to take a little time learning the 68000's superior instruction set to be sure they weren't wasting space and cycles in the 4 2Kx8 BIOS ROMs, whereas they were already familiar with the 8086's instruction set. </p><p>Marketing types with no patience to understand what they were talking about tended to say things like, </p><blockquote><p>But the instructions are 16 bits wide! That's going to be <b>twice the memory to store programs!!!!!!</b><br /></p></blockquote><p>That is potentially a legitimate concern with 32-bit RISC instruction sets, which are usually not as densely encoded as either the 8086 or 68000. </p><p>But instructions in the 8086 are variable width, in 8-bit chunks. Instructions in the 68000 are also variable width, in 16-bit chunks that do more work than the 8-bit chunks of the 8086. Motorola was careful there.</p><p>(Just for the record, neither the 8086 nor the 68000 is as densely encoded as the 6809, but if the 6809 instruction set is expanded to handle addresses larger than 16-bits, you lose some of that density.)<br /></p><p>-- What else? Peripheral chips? </p><p>The 68000 had instructions
specifically to enable using 8-bit wide peripheral parts without having
to adapt the peripheral's 8-bit wide interface to the 68000's 16-bit
wide data bus. It might have felt a little "awkward" or "non-ideal" to
some engineers, but most experienced engineers would not have even blinked an eye. </p><p>As a specific case-in-point, the original IBM PC used Motorola's 6845 to
generate video. Motorola had a reference design for using that exact
chip with the 68000 (and, indeed, hardware for the 68000 based off that reference design was not unknown in the industry).</p><p>(5) Lack of software isn't exactly a misfeature, but it is often invoked as a reason the 68000 wasn't ready.</p><p>Remember that CP/M-86 had not entered into development when the IBM PC unofficial project began. Whatever CPU they chose, they were either going to be dependent on the CPU's manufacturer for an existing OS, developing their own, or getting a third party to develop one for them. Choosing the 8086, they initially turned to Digital Research. </p><p>Note that it was already known that 8080 or Z-80 software needed more than just re-assembling the source code with an 8086 assembler. Transliteration was possible to an extent, but cleaning up the transliteration did take time. <br /></p><p>Considering the amount of Z-80 software that was quickly (and crudely) transliterated to the 68000, asking Digital Research to develop a version for the 68000 would not have been unreasonable. Likewise, they might have reached out to Technical Systems Consultants or Microware for a 68000 version of their OS products for the 6800 and 6809.<br /></p><p>At this point, you should be able to see that all of the usually mentioned strikes against the 68000 were not strikes at all. Balls. The 68000 really should have walked the bases, so to speak.<br /></p><p>Setting the 68000 aside for a moment, I've mentioned the 6809 a bit above. You may be aware that Apple considered the 6809 for the Macintosh, and even wired up a prototype before deciding that the Macintosh really needed more address space. <br /></p><p>I've heard, but have not corroborated, that the 6809 was also considered by IBM for their PC somewhere along the line. Would it have been a bad choice?</p><p>On the plus side:</p><p>(1) The 6809 was designed from the beginning to handle high-level languages, multi-tasking and such. I mentioned Microware's OS-9/6809 above. 
Uniflex from Technical Systems Consultants was also available in 1979. (TSC's Flex for 6809 was available almost as soon as the CPU was.) IBM would have had two relatively mature quality OSes and a good developers' environment and community from the outset, and much less aggressive partners to work with.</p>(2) Motorola did have a page-mode MMU part for the 6809, the 6829. External MMUs do tend to slow a processor down a little, but, with the 6829, the 6809 was able to compete with minicomputers -- minicomputers from the mid-1970s, but that's not bad considering that those mid-1970s minicomputers were still very actively used into the mid-1980s.<p>(3) Motorola had a floating-point ROM for the 6809 as well, the 6839. It was not as fast as floating point in hardware, but it was ready, and cheaper. <br /></p><p>(4) The 6809 was fairly well-known at the time for its ability to handle graphics. <br /></p><p></p><p>On the minus side:<br /></p><p>(1) The 6809 did not have segment registers. Breaking the 64K barrier would have required bank switching or full paging, and doing large arrays with bank switching or full paging requires a bit more code than even with the 8086's sloppy segments.</p><p>(2) The 6809 did not have hardware divide. And hardware multiply was only eight bits wide, so you had to put four of those together to get a 16-bit multiply. That also slowed down the floating-point operations.</p><p>(3) Motorola was not talking about extending the design. They were focused on their profitable 6805 and 6801 CPUs.<br /></p><p>That last was possibly the killer for the 6809.</p>If Motorola had been showing signs of actually incorporating the 6809 as a core in microcontrollers similar to the 6801 and 6805, IBM could have been confident of being able to get them to build a 6809 with something like the 6829 memory management unit built-in, and integrated MMUs tend not to slow CPUs down nearly as much as external MMUs. 
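<p>(Going back to minus-side point (2) for a moment: here is a rough sketch, in Python rather than 6809 assembler, of what "putting four of those together" means. The names mul8 and mul16 are just mine for illustration. mul8 stands in for an 8-bit by 8-bit unsigned hardware multiply with a 16-bit result, which is what the 6809's MUL gives you, and mul16 builds the full 32-bit product of two 16-bit numbers out of four of them plus shifts and adds. It is a model of the arithmetic only, not of instruction timing or register use.)</p>

```python
def mul8(a, b):
    """Stand-in for an 8-bit x 8-bit unsigned hardware multiply
    that yields a 16-bit product (hypothetical helper name)."""
    assert 0 <= a <= 0xFF and 0 <= b <= 0xFF
    return a * b


def mul16(x, y):
    """16-bit x 16-bit unsigned multiply, full 32-bit product,
    built from four 8-bit multiplies (hypothetical helper name)."""
    assert 0 <= x <= 0xFFFF and 0 <= y <= 0xFFFF
    xh, xl = x >> 8, x & 0xFF   # split each operand into high and low bytes
    yh, yl = y >> 8, y & 0xFF

    low = mul8(xl, yl)                   # weight 2**0
    mid = mul8(xh, yl) + mul8(xl, yh)    # two cross products, weight 2**8
    high = mul8(xh, yh)                  # weight 2**16

    # Shift each partial product to its weight and add them up.
    return (high << 16) + (mid << 8) + low
```

<p>(In real 6809 code, each of those shifts and adds turns into loads, stores, and add-with-carry instructions around the four MULs, which is where the slowdown comes from. If you only need the low 16 bits of the result, you can drop the high-by-high product and get by with three.)</p>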
<p>And they could have had reason to expect Motorola to extend the 6809 architecture. Simply adding linear segment registers and full 16-bit hardware
multiply and divide to the 6809 would have made it head-to-head competitive with the 8086, in spite of the 6809's 8-bit architecture. Extending the architecture in something like the way the 8086 was extended to the 80386 would have been no particular problem, either.<br /></p><p>But such a 6809 would also, as the theory goes, have eaten into the 68000's market. </p><p>(It is true that adding segment registers would have conflicted with the page-mode MMU operation that had become sort-of expected in 6809 software, requiring a small bit of work to bring, for example, OS-9/6809 up in a segmented model instead of a paged model.)<br /></p><p>I personally think Motorola should have given the 6809 more attention, as I have <a href="https://defining-computers.blogspot.com/2018/12/68hc11-is-not-modified-6809-and-what-if.html" target="_blank">mentioned elsewhere in this blog</a>.<br /></p><p>Anyway.</p><p>No, not anyway.</p><p>The Z8000 was not ready, and was not mature. <br /></p><p>The 6809 was ready, and was mature. It would have made a very good base for a PC, if Motorola had had plans to upgrade the CPU architecture in future offerings. Such plans were beginning to be obviously not in evidence.</p><p>The 68000 was ready, and Motorola was clearly committing to supporting and upgrading the architecture. Anyone who tells you otherwise either does not know or is ignoring the overall situation in the marketplace.</p><p>The 68000 was also 8-bit capable, contrary to what some have said.<br /></p><p>Even if it would have taken an additional six months to a year to qualify, IBM never did a proper qualification of the 8088 PC design either, and there was no formal marketing study that defined a market window to be met or anything like that. 
And there really was no reason to expect it to take any longer.<br /></p><p>It was not theoretical delays, not deficiencies in the 68000, per se, not anything like that.</p><p><b>So, drum roll:</b></p><p>Here is the real reason, from what I heard at the time, and I still believe it:</p><p>The
68000 was powerful enough to allow building a microcomputer that would
have competed with IBM's System/32/34/36/38 series of minicomputers. </p><p>(Yeah. I saw that, when I was doing the internship with IBM. The System/3 CPU even looked a little like the 68000, superficially, inside. We
could guess they'd have done better by moving the System/3 series to
their own custom version of 68K, but thinking about that requires enough
hypothesis contrary to fact to push us into the realm of <a href="https://joelrees-novels.blogspot.com/2020/01/33209-2nd-Microcomputer-Revolution-Homecoming-TOC.html" target="_blank">writing alternative reality SF</a>. </p><p>And we can give a nod to IBM's first exercise in putting the system 360 on a microprocessor, while we're at it. Look up the IBM Personal Computer XT/370. That was real history.)</p><p>The x86 was not powerful enough for that. That left IBM's marketing team able to imagine they had breathing room.</p><p>The real reason was the same reason Motorola didn't support
the 6809 the way they should have:</p><p>There was this meme that seemed to go around every marketing department back then -- <b> </b></p><p></p><blockquote><b>"MUST. NOT. COMPETE. WITH. OURSELVES!!!!!!!!!"</b></blockquote><p>(I think we all now know that, when you're careful to avoid going to all-out war with yourself, competing with yourself keeps you from getting complacent.)<br /></p><p>So the IBM PC project had to be kept hidden. And that is why it resulted in a non-optimal product. And the casual engineering of the x86 PC itself ultimately almost took IBM down with it, anyway. </p><p>Microsoft and Intel would avoid the back-swill of their sins with a lot of planned smoke-and-mirrors, always (barely) able to make it look like they were leading the way out of the mess they themselves were creating. </p><p>And that is probably as close as you can get to the real story of why bad engineering prevailed,
but was only truly successful for Microsoft and Intel, both of whom are now eating and drinking the
pollution they made -- along with the rest of us.</p><p>The real question is</p><blockquote><p>Why did such a non-optimal design succeed? <br /></p></blockquote><p>Full answer takes us deep into religion. I'm not going to ask you to go there with me today. Maybe some other day. </p><p>Quicker answer, if only partial: </p><p>The (business) world wanted spreadsheets along with the word processors. We didn't all know exactly what they were,
but we wanted a bigger and better calculator that would allow us to do with our
accounting books what word processors allowed with prose records. And we wanted
it cheap, and we wanted it big. And we wanted it on our desks.</p><p>The Apple II came close to giving us that, but spreadsheets on the Apple were not as intuitive as word processors were becoming, and were somewhat limited in size.<br /></p><p>Both the x86 and the 68K supported more intuitive spreadsheet apps capable of handling larger spreadsheets, but the 68K was a threat to marketing departments.</p><p>That was why the IBM PC snowballed. Small, weak things are sometimes bigger and badder than big strong things. (That's the short version of the religious discussion, too, by the way.)<br /> </p><p>There really was only one way to have avoided the mess, and that was for IBM to have resisted the temptation to try to jump into the front of the race. (And I'm busy writing a <a href="https://joelrees-novels.blogspot.com/2020/01/33209-2nd-Microcomputer-Revolution-Homecoming-TOC.html" target="_blank">not-very-good novel about that</a>, when I'm not too tired after a day of delivering mail to keep food on the table, so I'll forego talking about that here.)<br /></p>零石http://www.blogger.com/profile/01111094813708912513noreply@blogger.com0tag:blogger.com,1999:blog-3954576973420765520.post-29796990322005728682021-08-09T13:04:00.017+09:002021-08-24T22:00:50.482+09:00Guessing Which Motorola Microcontroller Part It Is (6801/6805/68HC11/6809)<p>
Wasted too much time on this. </p><p>This is extracted from my response to a
post
to the Facebook
<a href="https://www.facebook.com/groups/1233645553343772/" target="_blank">vintage {Computers | Microprocessors | Microcontrollers} group</a>, asking for
<a href="https://www.facebook.com/groups/1233645553343772/posts/5924044950970452/" target="_blank">help identifying a Motorola-logo microcontroller found in a washing machine
with an apparent custom SOC part number ZC85148L</a>, with an apparent date stamp from early 1984. </p><p>There were many guesses as to what the ZC85148 was, and I thought I'd put my guesses and reasoning out here, to make them more available for searching: <br /></p>
<p></p>
<p>
</p>
<p>I'm guessing either 6805/68HC05 or 6801/68HC01.</p>
<p>6805
was essentially a stripped-down 6800, with only one eight-bit accumulator (A),
one eight-bit index (X, yes, eight-bit), bit instructions, expanded indexed
modes, better power-saving stuff, lots of timers, some analog-to-digital, and
other integrated I/O to choose from. At least some parts included hardware
8-bit multiply.</p>
<p>6801/68HC01 was exactly the 6800 <a href="https://defining-computers.blogspot.com/2021/08/differences-between-6800-and-6801.html" target="_blank">with a few new 16-bit instructions for the double accumulator A:B pair, better X handling, hardware 8-bit multiply</a>, and better power-saving mode stuff.</p><p>My reasoning is as follows: <br /></p><p>If the chip were a bare CPU, it would need separate ROM and RAM parts on the circuit board, and such were nowhere in evidence. From the late 1970s until Motorola spun off the microprocessors business and renamed it Freescale, they provided Systems-On-a-Chip (SOC) semi-custom microcontrollers which included RAM, ROM, and I/O on chip. It's a pretty safe bet that the part was an SOC microcontroller. <br /></p>
<p>I am not aware
of any 6809 SOC products from Motorola, so, since the circuit board showed no evidence of ROM or RAM, I'm pretty sure that it is not a
6809 variant of any sort. (If there were any special-order 6809 SOC microcontrollers,
that would be interesting to hear about.) </p><p>The 68HC11 did start
shipping in 1984, so it could possibly be a 68HC11. </p><p>(68HC11 is a 6801 in
HCMOS, with an additional Y index register and a pre-byte that converts
X-indexed op-codes to Y-indexed opcodes, plus hardware integer and fraction
divide, bit instructions, and a little bit more. Some people confuse the 68HC11 with the 6809. I discussed the
differences between the 68HC11 and the 6809, comparing their
architecture and lineage, <a href="https://defining-computers.blogspot.com/2018/12/68hc11-is-not-modified-6809-and-what-if.html" target="_blank">in another rant here, several years back</a>: <a href="https://defining-computers.blogspot.com/2018/12/68hc11-is-not-modified-6809-and-what-if.html">https://defining-computers.blogspot.com/2018/12/68hc11-is-not-modified-6809-and-what-if.html</a>. There are a few errors there I need to go back and correct sometime, but they are not errors of substance, I think.)<br /></p>
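<p>(To make the pre-byte idea concrete, here is a small sketch in Python. The function name is mine, and the byte values are from my memory of the 68HC11 opcode map, so treat them as assumptions to check against the reference manual: $A6 for LDAA with an 8-bit offset from X, as on the 6800 and 6801, and $18 as the pre-byte that turns the X-indexed form into the Y-indexed form.)</p>

```python
# Assumed values, from memory of the 68HC11 opcode map; verify before relying on them.
PREBYTE_Y = 0x18   # pre-byte selecting the Y-indexed variant of an X-indexed op-code
LDAA_INDX = 0xA6   # LDAA offset,X (same op-code as on the 6800/6801)


def encode_ldaa_indexed(offset, use_y=False):
    """Return the machine-code bytes for LDAA offset,X or LDAA offset,Y.

    Hypothetical helper for illustration: the Y form is just the X form
    with one extra byte in front of it.
    """
    if not 0 <= offset <= 0xFF:
        raise ValueError("indexed offset is an unsigned 8-bit value")
    body = bytes([LDAA_INDX, offset])
    if use_y:
        return bytes([PREBYTE_Y]) + body
    return body
```

<p>(So, under those assumptions, LDAA 4,X assembles to two bytes and LDAA 4,Y to three, which is also why Y-indexed code on the 68HC11 is a little bigger and slower than the X-indexed equivalent.)</p>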
<p>I don't have
solid information on when the first HC08 microcontrollers started shipping,
but my impression is no sooner than the late 1980s. So I'm guessing it
was not an HC08 or HCS08 or any of the later extensions thereof. </p><p>(HC08s were
68HC05s with a high-byte extension for the X register and some other useful
stuff, including hardware 8-bit multiply and divide.)</p>
<p>Other possibilities -- I understand
that Motorola did second-source at least one other company's CPU in the 1970s. Whether the Intel 8051 might have been one of those, well, it seems ludicrous, but I have some conflicting memories. I think they had mostly gotten out of that business by the mid-1980s.</p>
<p>It is my memory that they dabbled in manufacturing IBM-compatible desktop PCs in the mid-to-late
1980s, but I don't remember whether that included manufacturing their own 8086-compatible CPUs, or second-sourcing Intel's. Anyway, that effort was only for desktops, and management quickly recognized
that the business model was not going to be profitable for them (and had been a
marketing misstep). <br /></p>
<p>I did also see announcements and engineering
materials for 6502 core SOCs from Motorola, somewhere around 1986 or '87,
IIRC, but those also seemed to have been dropped pretty quickly. It would not
have been there in 1984.</p>
<p>And it is also my understanding
that Motorola provided manufacturing for some mil-std microcontrollers, but I think that was
only to the military and maybe NASA. Those were 16-bit, and not based on any
of the 68XX or 68XXX series. I suppose it would not be impossible to see something like that in a washing machine controller, but it would be overkill.<br /></p>
<p>There were also 4-bit and 1-bit SOC
microcontrollers that Motorola produced in the late 1970s, but I don't
remember any of them in a 40-pin DIP. I think they would be a bit underpowered.<br /></p>
<p>I don't think the 88000 RISC
series was even announced yet in 1984, and the Power architecture discussions
with IBM and Apple had not even been imagined yet. There was the one-off
custom implementation of the 360 architecture borrowing from the 68000. We can
be sure that none of these would have been in a 40-pin chip in a washer in
1984.</p>
<p>Which is why my guesses come down to the 6805/68HC05 or the 6801/68HC01.</p><p><br /></p>
零石http://www.blogger.com/profile/01111094813708912513noreply@blogger.com0tag:blogger.com,1999:blog-3954576973420765520.post-4361762829662782812021-03-01T08:50:00.004+09:002021-03-01T11:15:58.548+09:00What Is (the Programming Language) Forth?<p>The colon definition grammar.</p><p>The post-fix expression grammar.</p><p>Two stacks.</p><p>On-line dictionary (symbol table) at development time, and, unless explicitly removed, at run-time.</p><p>The ability to blend run-time with compile-time at development time.</p><p>But this is way too loose. It leaves out all sorts of implementation details that non-trivial applications depend on. <br /></p><p>Comparing it to the programming language C, it's like saying that, not just all K&R C compilers are included as C, but all the different versions of Small C and Tiny C. (And maybe even Objective C, Javascript, Java, Ruby, PHP, and a number of other languages that borrow heavily from C syntax and grammar?)</p><p>The solution?</p><p><a href="http://www.forth.org/fig-forth/contents.html" target="_blank">fig-Forth</a> is one group of dialects that have a lot in common. We could define and develop a standard fig-Forth.</p><p> I've <a href="https://sourceforge.net/projects/asm68c/files/fig-forth_6800-stuff/" target="_blank">transcribed the 6800 fig-Forth model</a> and <a href="https://osdn.net/users/reiisi/pf/exorsim6801/scm/tree/master/" target="_blank">optimized it for the 6801</a>. In addition to some I/O bugs, there are enough differences from the 6502 model to cause problems for non-trivial applications. <br /></p><p>I have a near-fig-Forth I call <a href="https://osdn.net/projects/bif-6809/" target="_blank">BIF-6809</a> (and a non-functional one I call <a href="https://sourceforge.net/projects/bif-c/" target="_blank">BIF-C</a>). They use a binary tree symbol table, and that alters the language enough to make it unreasonable not to give them separate names. (Double negative intentional. 
Not just reasonable to have separate names, but unreasonable not to.) <br /></p><p>Forth77 and <a href="http://forth.sourceforge.net/standard/fst83/" target="_blank">Forth83</a> are separate languages, and should be treated as such. And they should be referred to by their complete names.</p><p><a href="https://www.forth.com/swiftforth/" target="_blank">SwiftForth</a>'s language should be referred to as SwiftForth. <br /></p><p><a href="http://www.forth.org/svfig/Win32Forth/DPANS94.txt" target="_blank">ANSI Forth</a> should be renamed. Call it CommitteeForth or something.</p><p>The name Forth, unadorned, should be reserved for whatever <a href="https://en.wikipedia.org/wiki/Charles_H._Moore" target="_blank">Charles H. Moore</a> (the original author of Forth) wants to call Forth.</p><p>(I should note that Moore himself calls his own dialect <a href="https://en.wikipedia.org/wiki/ColorForth" target="_blank">ColorForth</a>. He's leading out, here.)<br /></p><p>We need to make the nomenclature a part of the dictionary/symbol table. A word called version should bring up a version number, sure. </p><p>We need a word that returns (without printing it to the terminal) a string containing the name of the language/dialect. Maybe even one word for the language family name and one for the dialect. That would give us a base point, after which it would become possible to check what kind of glue needs to be brought in to make a particular source code compilable with a particular compiler.</p><p>After some consideration, I'll suggest the following four new words:</p><p>* language ( --- adr )</p><p style="margin-left: 40px; text-align: left;">Returns a string containing the language name, "Forth". 
(This would allow distinction from derived but different languages.)<br /></p><p style="text-align: left;">* dialect ( --- adr )</p><p style="margin-left: 40px; text-align: left;">Returns a string containing a dialect name, "fig-Forth", "Forth77", "Forth83", "SwiftForth", "gforth", "BIF-6809", etc.<br /></p><p>* sub-dialect ( --- adr )</p><p style="margin-left: 40px; text-align: left;">Returns a string containing a modifier of the dialect name. </p><p style="text-align: left;">* target-cpu ( --- adr )</p><p style="margin-left: 40px; text-align: left;">Returns a string containing the CPU targeted, such as "6502", "6801", "Z80", etc.</p><p style="text-align: left;">In addition,</p><p style="text-align: left;">* version ( --- ud )</p><p style="margin-left: 40px; text-align: left;">Should return an unsigned double integer in which the first byte contains the major version number, the second byte the minor, and the third and fourth contain a sequencing number within the minor version.<br /></p><p>To further aid tuning source to the host language, run-time, CPU, etc., dialects which adopt this practice should also adopt the practice of defining words that describe such things as cell width, sign representation, the boolean constant that represents a logical true, etc. We could take some inspiration from C's (original, sparse) limits.h include file for this, but it should not duplicate the contents of limits.h. <br /></p>零石http://www.blogger.com/profile/01111094813708912513noreply@blogger.com0tag:blogger.com,1999:blog-3954576973420765520.post-11514976278612631832020-11-19T06:14:00.001+09:002020-11-19T06:14:23.544+09:00Only install system updates from within the OS – from the settings menu.Got an update notice from Playstore. 
<div><br></div><div>I went to System Settings to update.</div><div><br></div><div>Nothing.</div><div><br></div><div>That means Playstore apps can send notices that look like system update notices.</div><div><br></div><div>Fake notices.</div><div><br></div><div>From apps that probably install malware of some sort. I need to tell Google about this but I don't have time until lunch.</div><div><br></div><div>Lesson?</div><div><br></div><div>Only install system updates from within the OS – from the settings menu.</div>零石http://www.blogger.com/profile/01111094813708912513noreply@blogger.com0tag:blogger.com,1999:blog-3954576973420765520.post-30094030865302544362020-10-26T19:43:00.007+09:002020-10-26T19:43:44.072+09:006800 Example VM with Synthetic Parameter Stack<p></p><p></p><p></p><p></p><p>In <a href="https://defining-computers.blogspot.com/2020/10/forth-threading-is-not-process-threads.html" target="_blank">Forth Threading (Is Not Process Threads)</a>, I commented on the fig Forth models insisting on using the CPU stack register for parameters. The code I presented for the 6800 follows that design choice, but the code I presented there for the 6809 and 68000 does not. Here I'll show, for your consideration, the snippets for indirect threading on the 6800 done right (according to me) -- with the CPU stack implementing RP and the synthesized stack implementing Forth's SP. </p><p>You'll note that the code here will end up maybe 5% slower and 5% bigger overall than the code that the fig model for the 6800 shows. 
But it will not mix return addresses and parameters on the same stack, and it should end up a bit easier to read and analyze, meaning that it should be more amenable to optimization techniques, when generating machine code instead of i-code lists.</p><p>In the code below, the virtual registers are allocated as follows:<br /></p><p></p><ul style="text-align: left;"><li>W RMB 2 the instruction register points to 6800 code</li><li>IP RMB 2 the instruction pointer points to pointer to 6800 code</li><li>* RP is S</li><li>FSP RMB 2 Forth SP, the parameter stack pointer</li><li>UP RMB 2 the pointer to base of current user's 'USER' table ( altered during multi-tasking ) <br /></li><li>N RMB 10 used as scratch by various routines</li></ul><p style="text-align: left;">The symbol table entry remains the same:</p><blockquote> HEADER<br />LABEL CODE_POINTER<br /> DEFINITION</blockquote>So we won't discuss it here. (See the discussion <a href="https://defining-computers.blogspot.com/2020/10/forth-threading-is-not-process-threads.html" target="_blank">in the post on threading techniques</a>.)<p style="text-align: left;">I won't repeat the fig model here. 
I'll be bringing more of it into the discussion than I did in the post on threading techniques, refer to the <a href="https://sourceforge.net/p/asm68c/code/ci/master/tree/fig-forth/fig-forth.68c" target="_blank">model in my repository</a>, here: <a href="https://sourceforge.net/p/asm68c/code/ci/master/tree/fig-forth/fig-forth.68c" target="_blank">https://sourceforge.net/p/asm68c/code/ci/master/tree/fig-forth/fig-forth.68c</a>.<br /></p><p style="text-align: left;">This is an exercise in refraining from early optimization, so the parameter stack pointer will point to the last item pushed, even though it's tempting to cater to the lack of negative index offset in the 6800 and point to the first free cell beyond top of stack.<br /></p><p style="text-align: left;">The inner interpreter of the virtual machine looks the same, but the code surrounding it changes. </p><p style="text-align: left;">PULABX is only used by "!" (STORE), so let's start the rework there:</p><p style="text-align: left;"><span></span></p><a name='more'></a><p> STORE FDB *+2 ; ( n adr --- )<br /> LDX FSP<br /> LDAA 2,X ; get n<br /> LDAB 3,X<br /> LDX 0,X ; get adr<br /> STAA 0,X ; store it away<br /> STAB 1,X<br />DEALL2 LDX FSP ; potential code-stealing point<br /> INX<br /> INX<br /> INX<br /> INX<br /> STX FSP<br /> JMP NEXT<br /></p><p><span></span></p><!--more--><p>And we no longer have a use for PULABX. </p><p>As a quick estimate, we're using 2 more instructions and maybe 12 more cycles this way. (I should count cycles, but that would take several minutes.) The code is much more readable, and much less likely to be impacted by changing between indirect and direct threading and between subroutine calls and hard jumps.<br /></p><p>STABX is used in eight places, including AND, OR, XOR, and +, which all follow the same pattern. 
Let's look at +, and then at replacement code stealing in AND.</p><p><span></span></p><!--more--><p>PLUS FDB *+2 ; ( n1 n2 --- sum )<br /> LDX FSP<br /> LDAA 2,X<br /> LDAB 3,X<br /> ADDB 1,X<br /> ADCA 0,X<br />DEAST1 INX ; code steal if X is FSP, to deallocate a cell and overwrite top with A:B<br /> INX<br /> STX FSP<br />STABX STAB 1,X ; code steal to store A:B at X and proceed<br /> STAA 0,X<br /> JMP NEXT<br />*<br />*<br />AND FDB *+2 ; OR and XOR also follow this template<br /> LDX FSP<br /> LDAA 0,X<br /> LDAB 1,X<br /> ANDB 3,X<br /> ANDA 2,X<br /> JMP DEAST1<br />*<br />* 0= and 0< also use the DEAST1 code steal<br /></p><p></p><p><span></span></p><!--more--> Again, a few more bytes and a few more cycles than using S for parameters.<br /><p></p><p style="text-align: left;">GETX is used in five places (six if you don't alias I to R), so we'll look more closely at that, as well. I is the first place in the model, and the simplest. We might consider locating the body of I/R in front of NEXT, to optimize loop counter access, but there are other uses to consider.</p><p style="text-align: left;"><span></span></p><!--more-->R FDB *+2 ; ( --- n )<br /> TSX ; Remember that the 6800 adjusts S for this.<br />GETX LDA A 0,X <br /> LDA B 1,X ; no deallocate<br />PUSHBA LDX FSP<br /> DEX<br /> DEX<br /> STX FSP<br /> STAA 0,X<br /> STAB 1,X<br /> JMP NEXT<br />*<br />*<br />I FDB R+2<br />*<span></span><p></p><p style="text-align: left;"><span></span></p><!--more-->GETX also uses PUSHBA, which is used in 18 places, and we'll examine that, as well.<p></p><p style="text-align: left;">I possibly has the most frequent use of GETX, but @ has the most frequent use of PUSHBA, and is significantly more used than I and R together, so if we locate a definition body in front of NEXT, it probably should be @.<br /></p><span><!--more--></span><span></span><p style="text-align: left;">AT FDB *+2 ; ( adr --- n )<br /> LDX FSP<br /> LDX 0,X<br />* Same as GETX? 
No.<br /> LDAA 0,X<br /> LDAB 1,X ; no need to deallocate or allocate<br />* Same as STABX?<br /> STAB 1,X ; code steal to store A:B at X and proceed<br /> STAA 0,X<br />* JMP NEXT? Put NEXT here?<br />* Wait. That's the same code as STABX up there. <br /></p><p style="text-align: left;"><span></span></p><!--more--><p></p><p style="text-align: left;">At this point, we are sure that we need differently organized code from the model, so we throw out all the code above and start over. Well, we don't just throw it out, we save it away where we can look at it as we rebuild it.<br /></p><p style="text-align: left;">We consider that @ is more frequently used than !, so we make @ the focus:<br /></p><p style="text-align: left;"><span></span></p><!--more--><p>R FDB *+2 ; ( --- n ) ( n *** n )<br /> TSX ; Remember that the 6800 adjusts S for this.<br />GETX LDA A 0,X <br /> LDA B 1,X ; no deallocate<br />PUSHBA LDX FSP<br /> DEX<br /> DEX<br /> STX FSP<br /> BRA STABX ; Doesn't save time, but saves a few bytes of repeated code.<br />*<br />*<br />I FDB R+2<br />*<br />*<br />STORE FDB *+2 ; ( n adr --- )<br /> LDX FSP<br /> LDAA 2,X ; get n<br /> LDAB 3,X<br /> LDX 0,X ; get address<br /> STAA 0,X ; store n away<br /> STAB 1,X<br />DEALL2 LDX FSP ; potential code-stealing point<br />DEALLS<br /> INX<br /> INX<br />DEALL1 INX<br /> INX<br /> STX FSP<br /> BRA NEXT ; although the optimizer will recognize a JMP more directly<br />*<br />*<br />AT FDB *+2 ; ( adr --- n )<br /> LDX FSP<br /> LDX 0,X<br />LDABX LDAA 0,X<br /> LDAB 1,X ; no need to deallocate or allocate<br />STOTOP LDX FSP<br />STABX STAB 1,X ; code steal to store A:B at X and proceed<br /> STAA 0,X<br />NEXT LDX IP<br />NEXTW INX ; pre-increment mode<br /> INX<br /> STX IP<br />NEXT2 LDX 0,X ; get W which points to CFA of word to be done<br />NEXT3 STX W<br /> LDX 0,X<br />NEXTGO JMP 0,X<br />*<br />*<br />DOCOL LDAA IP ; 6801 can PSHX<br /> LDAB IP+1<br /> PSHB<br /> PSHA<br /> LDX W ; Get first sub-word of that 
definition<br /> BRA NEXTW ; and execute it<br />*<br />*<br />SEMIS FDB *+2<br /> TSX<br /> INS<br /> INS<br /> LDX 0,X get address we have just finished.<br /> BRA NEXTW increment the return address and do next word<br /></p><p></p><p style="text-align: left;"><span></span></p><!--more--><p>@ , DOCOL, and ;S are cases where the synthetic stack yields better performance and tighter code than the model code, balancing the hit we take in definitions such as ! . </p><p>Continuing, instead of focusing on + for STABX, 0= is more frequently used, so we'll focus on that, and return a -1 flag instead of a 1, in keeping with more common Forth practices:<br /></p><p></p><p style="text-align: left;"><span></span></p><!--more--><p>ZEQU FDB *+2 ; ( n --- flag )<br /> LDX FSP<br /> LDAB 0,X ; high byte of n<br /> ORAB 1,X ; OR in the low byte, B is zero only if n is zero<br /> BNE ZEQUT<br /> DEC B ; B was 0, now $FF<br /> TBA ; A:B is -1<br /> BRA STABX<br />ZEQUT CLRA<br /> CLRB<br /> JMP STABX<br />*<br />*<br />PLUS FDB *+2 ; ( n1 n2 --- sum )<br /> LDX FSP<br /> LDAA 2,X<br /> LDAB 3,X<br /> ADDB 1,X<br /> ADCA 0,X<br />DEAST1 INX ; code steal if X is FSP, to deallocate a cell and overwrite top with A:B<br /> INX<br /> STX FSP<br /> JMP STABX<br />*<br />*<br />AND FDB *+2 ; ( n1 n2 --- n3 ) OR and XOR can do this, too.<br /> LDX FSP<br /> LDAA 0,X<br /> LDAB 1,X<br /> ANDB 3,X<br /> ANDA 2,X<br /> BRA DEAST1<br />*<br />*<br />DPLUS FDB *+2 ; ( d1 d2 --- dsum )<br /> LDX FSP<br /> CLC<br /> LDA B #4<br />DPLUS2 LDA A 3,X ; This loop is worth flattening on the 6801 and 6809.<br /> ADC A 7,X<br /> STA A 7,X<br /> DEX<br /> DEC B<br /> BNE DPLUS2<br /> JMP DEALL2 ; X has been walked down 4 by the loop, so reload FSP before deallocating d2<br /></p><p></p><p style="text-align: left;"><span></span></p><!--more--><p>Except for a few very specialized definitions, the coding in i-code list
definitions won't change. The 2+ example is not one of those few that
would change, so we won't take the time to look at it.<br /></p><p>As noted above, improved DOCOL and ;S will improve entry and exit times for high-level definitions, so I think we can be justified in using a synthetic parameter stack to avoid mixing parameters with return addresses.<br /></p><p>Why did I waste a day going through these calculations?</p><p>One, to demonstrate that focusing too early on optimizations often leads us into tricky code that is not necessary. </p><p>(This is not the swamp monster some people have made of it, however. Working out optimizations often helps us understand the dark corners of code that are blocking progress. It's just something to keep in mind.)</p><p>Two, as I mentioned in the <a href="https://defining-computers.blogspot.com/2020/10/forth-threading-is-not-process-threads.html" target="_blank">previous post</a> and above, understandable code lends itself better to mechanical optimization. Even 6800 code can be optimized. Let's look at 2+ again, after all.<br /></p><p></p><blockquote>TWOP FDB DOCOL<br /> FDB TWO,PLUS<br /> FDB SEMIS</blockquote>DOCOL and ;S are recognized as entry and exit. Let's probe 2:<p></p><p></p><blockquote>TWO FDB DOCON<br /> FDB 2 </blockquote>Our optimizer should give constants specific attention anyway, so maybe we won't probe DOCON further. We can just put <br /><p></p><p></p><blockquote> CLRA<br /> LDAB #2<br /> JMP PUSHBA</blockquote><p></p><p>into the object code trough and expand PUSHBA, then STABX:</p><p></p><blockquote> CLRA<br /> LDAB #2<br /> LDX FSP<br /> DEX<br /> DEX<br /> STX FSP<br /> STAB 1,X<br /> STAA 0,X</blockquote><p></p><p>which brings us to NEXT. 
Now we can probe + and expand it, then expand STABX at the end:<br /></p><p></p><blockquote> CLRA<br /> LDAB #2<br /> LDX FSP<br /> DEX<br /> DEX<br /> STX FSP<br /> STAB 1,X<br /> STAA 0,X<br /> LDX FSP<br /> LDAA 2,X<br /> LDAB 3,X<br /> ADDB 1,X<br /> ADCA 0,X<br /> INX <br /> INX<br /> STX FSP<br /> STAB 1,X<br /> STAA 0,X</blockquote><p></p><p>Find and clear the redundant STX and LDX:<br /></p><p></p><blockquote> CLRA<br /> LDAB #2<br /> LDX FSP<br /> DEX<br /> DEX<br />* STX FSP<br /> STAB 1,X<br /> STAA 0,X<br />* LDX FSP<br /> LDAA 2,X<br /> LDAB 3,X<br /> ADDB 1,X<br /> ADCA 0,X<br /> INX <br /> INX<br /> STX FSP<br /> STAB 1,X<br /> STAA 0,X </blockquote><p></p><p> Move accumulator activity closer together:</p><p></p><blockquote> LDX FSP<br /> DEX<br /> DEX<br /> CLRA<br /> LDAB #2<br /> STAB 1,X<br /> STAA 0,X<br /> LDAA 2,X<br /> LDAB 3,X<br /> ADDB 1,X<br /> ADCA 0,X<br /> INX <br /> INX<br /> STX FSP<br /> STAB 1,X<br /> STAA 0,X</blockquote><p></p><p>Recognize that we are storing a constant on the stack and then adding it to what was the top as a pattern, and collapse the data movement: <br /></p><p></p><blockquote> LDX FSP<br /> DEX<br /> DEX<br />* CLRA<br />* LDAB #2<br />* STAB 1,X<br />* STAA 0,X<br /> LDAA 2,X<br /> LDAB 3,X<br />* ADDB 1,X<br />* ADCA 0,X<br /> ADDB #2<br /> ADCA #0<br /> INX <br /> INX<br /> STX FSP<br /> STAB 1,X<br /> STAA 0,X</blockquote> Recognize the unnecessary adjustments to the stack pointer, including the unnecessary update at the end:<br /><p></p><p></p><blockquote> LDX FSP<br />* DEX<br />* DEX<br />* LDAA 2,X<br />* LDAB 3,X<br /> LDAA 0,X<br /> LDAB 1,X<br /> ADDB #2<br /> ADCA #0<br />* INX <br />* INX<br />* STX FSP<br /> STAB 1,X<br /> STAA 0,X </blockquote><p></p><p>Examine the result,<br /></p><p></p><blockquote> LDX FSP<br /> LDAA 0,X<br /> LDAB 1,X<br /> ADDB #2<br /> ADCA #0<br /> STAB 1,X<br /> STAA 0,X </blockquote><p></p><p>to weigh it against the allowed size expansion -- 6 bytes of i-code vs. 
14 bytes of machine code. </p><p>That's more than double the size, but it provides opportunities for further optimizations where 2+ is used, so it is likely to be kept.</p><p>Of course, an actual optimizer would already have optimized DOCON constants out, so a lot of that would already be done. </p><p>But it should be clear that code that is readable is easier to optimize, which, added to the increased safety of not mixing parameters with return addresses, justifies using a synthetic stack. </p><p>It might even be motivation to use a locals stack instead of the return stack for the actual target of R , >R and R> , which is something else I would encourage considering.<br /></p>零石http://www.blogger.com/profile/01111094813708912513noreply@blogger.com0tag:blogger.com,1999:blog-3954576973420765520.post-82251407439658312412020-10-26T01:24:00.005+09:002020-10-26T19:51:15.240+09:00Forth Threading (Is Not Process Threads)<p>Code-threading in the context of Forth deserves its own rant to help
untangle the knot I picked up in <a href="https://defining-computers.blogspot.com/2020/10/computer-languages-interpreted-vs.html" target="_blank">Computer Languages -- Interpreted vs. Compiled Forth?</a>. This is not a complete treatment, only background and an overview. YMMV. </p><h4 style="text-align: center;">Forth Threading (Is Not Process Threads)</h4><p>If
you've worked much with low-level code, when I mentioned (in <a href="https://defining-computers.blogspot.com/2020/10/computer-languages-interpreted-vs.html" target="_blank">comparing compiled and interpreted Forth</a>) that Forth
compiles definitions to a list of the addresses of definitions, you will
have recognized that there is a recursion there that has to be broken at leaf
calls if a Forth VM ever wants to get any useful work done. </p><p>At some
point, real code has to be executed.</p><p>The terms "direct-threaded",
"indirect-threaded", and "subroutine threaded" are generally invoked to
describe the means of breaking the recursion, but they tend to be used
differently by different engineers. <br /></p><p>Before I get into things, consider a casual assertion I slipped past you in the previous post:</p><blockquote><p>The run-time architecture of a compiled language is a corollary of a virtual machine. <br /></p></blockquote><p>I'm
not sure whether this is obvious or not. When I dug into my first fig
model, the one for the 6800, it seemed obvious to me. But many people
talking about Forth seemed to not even want to talk about the
inner-interpreter as a VM. When I started looking at the Pascal virtual
machine, I saw all sorts of parallels, and when I started reading the
output of the C compilers at school, I saw most of the same essential
parallels. (And none of my professors seemed to want to compare run-time architectures with VMs, either.)<br /></p><ul style="text-align: left;"><li>Stack pointer(s): Both VMs and non-VM run-times usually have one or more stack pointers.</li><li>Call protocol: Both VMs and non-VM run-times have call protocols.</li><li>Global variables/parameters: Both VMs and non-VM run-times have them.</li><li>Local variables/parameters: Likewise.</li><li>Instruction pointer(s): Again, in both.</li><li>Scratch registers: In both. </li><li>Current procedure/function pointer: Necessary in either if object-like self-inspection is part of the language.</li><li>Stack frames: of questionable utility in either, more often not present when parameters are separated from return pointers. <br /></li></ul><p>Stack
frames in C had me scratching my head, at first. I could almost
understand why Pascal defined them, because of the existence of local
procedures and functions. But C had no such concepts, and still wasted
the time to build and tear down stack frames. The only conclusion I
could come to was that the compiler authors wanted the convenience of
not having to remember how deeply things were stacked in the current
expression evaluation, plus the offset for the local parameters. And it
seemed obvious to me that the difficulty in tracking was primarily due
to interleaving the stacks.</p><p>Should I have considered the
possibility that certain well-known system architects simply preferred
to have easily crashed stacks?<br /></p><p>Anyway, a stack frame is not a required element of a run-time architecture.<br /></p><p>Setting aside the call protocols for a bit, let's talk about register mappings.</p>The fig-Forth model run time includes the following registers:<br /><ul style="text-align: left;"><li>W: pointer to the currently running definition</li><li>IP: pointer to the next i-code for the VM to execute<br /></li><li>RP: pointer to the top of the stack of nested calls</li><li>SP: pointer to the top of the stacked parameters <br /></li><li>UP: pointer to the base of the user process state table</li><li>PC: pointer to next machine code to execute (on non-Forth CPUs) <br /></li><li>scratch registers for intermediate values and operators.<br /></li></ul><p>Common C run-times statically map the following registers:</p><ul style="text-align: left;"><li>IP or PC: pointer to next CPU instruction to perform</li><li>SP: pointer to the top of the stack of nested calls</li><li>Link: cached most recent return address, not saved in leaf routines (Not present in all run-times.)</li><li>Heap pointer: Often implicitly the bottom of global address space</li><li>Here pointer: Pointer to the state of the currently active object in object-oriented languages<br /></li><li>Scratch registers for intermediate values and operators<br /></li></ul><p>Now we can make some comparisons.</p><p>IP
is to virtual instructions when a VM is running, and is separate from
the processor's PC. But they are the same essential concept.</p><p>SP and RP, as mentioned, are tangled concepts for saving the state of a routine or function before it calls another.</p><p>W
and Link are not the same, even if neither is saved in leaf routines. W
is the currently executing definition, Link is the caller.</p><p>UP and the heap address/pointer are essentially corollary, even if what they contain is organized differently.<br /></p><p>However, ...</p><p>Here are some possible register mappings for (fig) Forth vs. certain common C run-times: <br /></p><p>On the 68000: </p><ul style="text-align: left;"><li>A3: W vs. Link (maybe), here pointer, or scratch<br /></li><li>A4: IP vs. scratch<br /></li><li>A7: RP vs. interleaved stack pointer <br /></li><li>A6: Forth SP vs. frame pointer (if present) (Note that MOVEM={PUSH|POP} for all An.)<br /></li><li>A5:UP vs. heap pointer (if present)<br /></li><li>A0~A2, D0~D7: scratch<br /></li></ul><p>On the 6809:</p><ul style="text-align: left;"><li>DP[0] (first two bytes in the direct page): W vs. unspecified/scratch<br /></li><li>Y: IP (save before using for other things) vs. unspecified/scratch<br /></li><li>S: RP vs. interleaved stack pointer <br /></li><li>U: Forth SP vs. frame pointer or scratch<br /></li><li>DP: UP (putting W in the user state table) vs. optional optimization use or per-task variables<br /></li><li>X, D (A:B) vs. scratch<br /></li></ul><p>On the 6800:<br /></p><ul style="text-align: left;"><li>$F0 in the direct (zero) page: W vs. frame pointer or scratch<br /></li><li>$F6 in the direct (zero) page: IP vs. scratch<br /></li><li>SP: RP vs. interleaved stack pointer<br /></li><li>$F4 in the direct (zero) page: SP vs. scratch <br /></li><li>$F2: UP vs. heap pointer or scratch<br /></li><li> X, A, B: scratch <br /></li></ul><p>What
I want to point out is that C is really stripped down. This is because C
was originally structured to be very lightweight, to use as few of the
processor resources as possible, leaving the rest to the compiler or
application to make optimal use of. (But current trends in C have been
adding things like the here pointer to the run-time model.)</p><p>Also, C was designed to, if possible, use no global RAM. This is not possible on the 6800 because there are so few registers, and you have to use scratch RAM to do many things. On the 6801, it becomes possible, if a bit awkward, to move all scratch RAM to an interleaved stack.<br /></p><p>I should note that my assignment of the CPU's stack pointer to the return stack pointer and the consequent use of a synthetic stack pointer as the parameter stack pointer is the opposite of the fig Forth models. I'm not sure if the engineers who built the 68000 or 6809 versions of the fig models were aware that the MOVEM instruction that does the PUSH and POP on the 68000 operates equally on all address registers, or that the 6809's U register has its own PSHU and PULU instructions, with no overhead compared to PSHS and PULS.</p><p>With the 6800, it is probable that the PSH/PUL A/B instructions were desired for the parameter stack, but I have done scratch calculations showing that they provide only minimal advantage. (For me, five percent is minimal when the price is something like putting data and return addresses on the same stack.) <br /></p><p>You have to use some sort of global RAM variable for one or the other of the stack pointers on the 6800. If it's in the direct page, the choice balances out a bit.</p><p>I'll emphasize this -- I prefer to avoid mixing parameters with return addresses whenever possible.<br /></p><p>Now let's look at the call protocols. </p><p>In C, various call protocols have been used. Points where the protocols differ:</p>
<ul style="text-align: left;"><li>Saving registers --
<ul><li>Calling routine saves the registers it needs preserved, vs.<br /></li><li>the called routine saves the registers it uses.</li></ul>
</li><li>Building a stack frame or not</li><li>Caching the most recent caller in a register or not<br /></li></ul><p>Likewise, in Forth, various call protocols have been used. </p><p>A certain aspect of the call protocol is the
primary point of distinction between the three threading types in Forth, but there is still a bit of variation in how the calls are implemented.</p><p>I'll initially analyze the 6800 fig model (Find the actual <a href="https://sourceforge.net/p/asm68c/code/ci/master/tree/fig-forth/fig-forth.68c" target="_blank">source here</a>, and the <a href="https://sourceforge.net/p/asm68c/code/ci/master/tree/fig-forth/fig-forth.list.goal" target="_blank">assembled listing here</a>.), then extrapolate from there. In this analysis, I'll use the fig Forth register assignments, with all VM registers in the direct (zero) page except for SP:</p><ul style="text-align: left;"><li>W RMB 2 the instruction register points to 6800 code</li><li>IP RMB 2 the instruction pointer points to pointer to 6800 code</li><li>RP RMB 2 the return stack pointer</li><li>UP RMB 2 the pointer to base of current user's 'USER' table ( altered during multi-tasking ) <br /></li><li>N RMB 10 used as scratch by various routines</li><li>* SP is S<br /></li></ul><p style="text-align: left;">(For an in-depth look at how this plays out using a synthetic stack for the parameters and the CPU stack for RP, see <a href="https://defining-computers.blogspot.com/2020/10/6800-example-vm-with-synthetic.html" target="_blank">here</a>.) <br /></p><p style="text-align: left;">The inner interpreter of the virtual machine looks like this:</p><p style="text-align: left;"></p><blockquote><p style="text-align: left;">NEXT<br /> LDX IP<br /> INX pre-increment fetch from i-code list in definition<br /> INX<br /> STX IP Update it.<br />NEXT2<br /> LDX 0,X Get the i-code which points to a pointer to CPU code<br />NEXT3<br /> STX W This is the same as the label in the symbol table.<br /> LDX 0,X Get pointer to executable code.<br />NEXTGO<br /> JMP 0,X Jump to executable code. 
<br /></p></blockquote><p></p><p>The definition header consists of a symbol table entry like this:</p><p></p><blockquote> HEADER<br />LABEL CODE_POINTER<br /> DEFINITION</blockquote>The header consists of the name that the Forth command line interpreter will recognize, massaged a bit, and a link to the previous definition. (I'll explain some of that below.)<br /><p></p><p>The CODE_POINTER is a pointer to machine language code the CPU can execute directly.</p><p>For a low-level definition, the definition consists of the machine code that the CPU can execute, and this is where the CODE_POINTER points. The code ends with a jump back to NEXT (or some other appropriate place that leads back to NEXT).<br /></p><p>For a "high-level" (non-leaf) definition, the CODE_POINTER points to a nesting routine, and the definition consists of the list of definition addresses -- as i-codes for the virtual machine inner interpreter as described in <a href="https://defining-computers.blogspot.com/2020/10/computer-languages-interpreted-vs.html" target="_blank">part one of this rant</a>. This list will be terminated by an i-code that unnests the VM and returns to the caller.<br /></p><p>For an easy example of a low-level (leaf) definition, we can look at the code to add two double integers, starting at line 1008 in the assembled listing. </p><p>Note that the header starts with the length of the symbol name for Forth combined with some mode bits, followed by all but the last of the characters of the name, then the last character with its high bit set. ("$" says hexadecimal; ASCII for "+" is hexadecimal 2B, which becomes AB with the high bit set.) The mode/length byte has its high bit set, as well, to brace the name. The link to the previous definition in the table, PLUS, is adjusted to point to that definition's name: two bytes of link, one byte of length, and one byte of name makes 4. </p><p>"*+2" means "address of here plus 2", which address is where the machine code starts:<br /></p><p>
</p>
<hr />
<pre> 13FD 82 FCB $82
13FE 44 FCC 1,D+
13FF AB FCB $AB
1400 13 ED FDB PLUS-4
1402 14 04 DPLUS FDB *+2
1404 30 TSX
1405 0C CLC
1406 C6 04 LDA B #4
1408 A6 03 DPLUS2 LDA A 3,X
140A A9 07 ADC A 7,X
140C A7 07 STA A 7,X
140E 09 DEX
140F 5A DEC B
1410 26 F6 BNE DPLUS2
1412 31 INS
1413 31 INS
1414 31 INS
1415 31 INS
1416 7E 10 34 JMP NEXT</pre>
<hr />
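<p>To make the name-field layout concrete, here is a minimal C sketch of the encoding just described. It is not taken from the fig source: encode_name_field() is a hypothetical helper, it sets no mode bits other than the high bit, and the 31-character limit is an assumption that five bits are left for the length.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Encode a fig-style name field into out[] (hypothetical helper).
 * Layout, as in the listing above: a length byte with its high bit set,
 * then the name characters, the last one also with its high bit set.
 * Returns the number of bytes written. */
size_t encode_name_field(const char *name, unsigned char *out)
{
    size_t len = strlen(name);

    /* Assumption: the length shares its byte with mode bits, leaving 5 bits. */
    assert(len >= 1 && len <= 31);

    out[0] = (unsigned char)(0x80 | len);       /* "D+" gives $82 */
    for (size_t i = 0; i < len; i++)
        out[1 + i] = (unsigned char)name[i];    /* 'D' is $44 */
    out[len] |= 0x80;                           /* last char: '+' $2B becomes $AB */
    return len + 1;
}
```

<p>Fed the name "D+", this produces the three bytes 82 44 AB that open the listing above; fed ";S", it produces 82 3B D3, matching the header of SEMIS further down.</p>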
<p>For an easy example of a definition that consists of i-codes, we can look at a convenience definition that adds 2 to the top integer on stack, starting at line 1535. The listing is a little awkward -- it's hard to see that TWOP labels the address 171D, and that the six bytes starting there are the label values of DOCOL (1525), TWO (15B2), and PLUS (13F1), with SEMIS (1367) following on the next line. (Remember, the 6800 is most significant byte first, so you don't have to reverse the bytes in your mind.)<br /></p><p>So the CODE_POINTER for "2+" is DOCOL, and the i-code list ends with SEMIS. <br /></p>
<hr />
<pre> 1718 82 FCB $82
1719 32 FCC 1,2+
171A AB FCB $AB
171B 17 0B FDB ONEP-5
171D 15 25 15 B2 13 F1
TWOP FDB DOCOL,TWO,PLUS
1723 13 67 FDB SEMIS
</pre>
<hr />
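<p>Before digging into the 6800 code for DOCOL and SEMIS, it may help to see the same nesting and unnesting in a few lines of C. This is only a rough model under my own assumptions, not a translation of the fig source: the cell union, the BYE word, and run_two_plus() are made up for the illustration, TWO is treated as a low-level word for simplicity, and IP is post-incremented rather than pre-incremented as in the 6800 NEXT.</p>

```c
#include <assert.h>

typedef union cell cell;
typedef void (*code_fn)(void);

/* A cell is either a CODE_POINTER or an i-code (pointer to a CODE_POINTER). */
union cell {
    code_fn code;
    const cell *word;
};

static const cell *ip;              /* IP: next i-code to fetch          */
static const cell *w;               /* W: definition being executed      */
static const cell *rstack[16];      /* return stack, indexed by rp       */
static int rp;
static int stack[16];               /* parameter stack, indexed by sp    */
static int sp;
static int running;

static void push(int v) { assert(sp < 16); stack[sp++] = v; }
static int pop(void)    { assert(sp > 0);  return stack[--sp]; }

/* Low-level (leaf) words. */
static void do_two(void)   { push(2); }
static void do_plus(void)  { int b = pop(); int a = pop(); push(a + b); }
static void do_bye(void)   { running = 0; }      /* hypothetical: stop the VM */

/* Nesting and unnesting. */
static void docol(void)    { assert(rp < 16); rstack[rp++] = ip; ip = w + 1; }
static void do_semis(void) { assert(rp > 0);  ip = rstack[--rp]; }

/* Definitions: element 0 is the CODE_POINTER, the rest is the i-code list. */
static const cell TWO[]   = { { .code = do_two } };
static const cell PLUS[]  = { { .code = do_plus } };
static const cell SEMIS[] = { { .code = do_semis } };
static const cell BYE[]   = { { .code = do_bye } };
static const cell TWOP[]  = { { .code = docol },
                              { .word = TWO }, { .word = PLUS }, { .word = SEMIS } };

/* Push n, execute 2+ through the inner interpreter, return the result. */
int run_two_plus(int n)
{
    static const cell boot[] = { { .word = TWOP }, { .word = BYE } };

    sp = 0;
    rp = 0;
    push(n);
    ip = boot;
    running = 1;
    while (running) {           /* NEXT */
        w = (ip++)->word;       /* fetch the i-code into W, advance IP */
        w->code();              /* jump through the CODE_POINTER       */
    }
    return pop();
}
```

<p>Calling run_two_plus(5) walks boot, nests into TWOP through docol, runs TWO and PLUS, unnests through do_semis, and comes back with 7. The two levels of indirection in the loop (fetch the i-code, then call through what it points to) are the part to keep an eye on in what follows.</p>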
<p>Let's look at DOCOL, starting at line 1206. It's part of the mixed definition COLON, and has no header of its own:</p><hr />
<pre> 1525 DE F4 DOCOL LDX RP make room in the stack
1527 09 DEX
1528 09 DEX
1529 DF F4 STX RP
152B 96 F2 LDA A IP
152D D6 F3 LDA B IP+1
152F A7 02 STA A 2,X Store address of the high level word
1531 E7 03 STA B 3,X that we are starting to execute
1533 DE F0 LDX W Get first sub-word of that definition
1535 7E 10 36 JMP NEXT+2 and execute it
</pre>
<hr />We can see how it decrements RP to make room, saves the current IP there, loads the address that NEXT just stored in W, then jumps into NEXT just past the load of IP, so that the two INX instructions step over the CODE_POINTER and the result is saved as the new IP before the appropriate new W is fetched.<br /><p>To brace our understanding of this, let's look at SEMIS, starting at line 896:<br /></p><hr />
<pre> 1362 82 FCB $82
1363 3B FCC 1,;S
1364 D3 FCB $D3
1365 13 52 FDB RPSTOR-6
1367 13 69 SEMIS FDB *+2
1369 DE F4 LDX RP
136B 08 INX
136C 08 INX
136D DF F4 STX RP
136F EE 00 LDX 0,X get address we have just finished.
1371 7E 10 36 JMP NEXT+2 increment the return address & do next word
</pre>
<hr /><p>We can see that it is a low-level leaf. It increments RP, gets the saved IP, and jumps into the right place in NEXT to store it back in IP and continue where the caller left off.</p><p>This threading of functionality through low-level and high-level definitions is what the Forth community calls "threaded".</p><p>And the above model is an example of indirect threading, where the i-codes are pointers to pointers to code.</p><p>In case you're wondering, no, this is not the only way to do indirect threading. It's just one example.</p><p>Now, what about direct threading?</p><p>We'll stick with the same register model. And we'll start with the routine for double add, since it's sort-of familiar. The header structure will be the same, up to the label.</p><pre><blockquote>DPLUS TSX ; index the parameters<br /> CLC ; so we can do this in a loop<br /> LDA B #4 ; bytes to add, start with least significant<br />DPLUSL LDA A 3,X ; right-hand term<br /> ADC A 7,X ; left-hand term<br /> STA A 7,X ; overwrite left-hand term<br /> DEX<span> ; </span><br /> DEC B ; done?<br /> BNE DPLUSL ; back for more<br /> INS ; Deallocate right-hand term.<br /> INS<br /> INS<br /> INS<br /> JMP NEXT </blockquote></pre><p>The only thing that has changed is that the CODE_POINTER is missing: the label DPLUS now sits directly on the machine code. So how do the i-code lists get their starts? Let's look at 2+:<br /></p><pre><blockquote>TWOP JMP DOCOL<br /> FDB TWO,PLUS,SEMIS
</blockquote></pre><p>Now the i-code lists start with a little machine language code. On the 6801, DOCOL could be located in the direct page, and the jump would only be two bytes, FWTW. But you'd still be starting an i-code list with something that doesn't even look like an i-code. That adds a bump in designing debuggers, and in code analysis of the i-code lists.<br /></p><p>Let's see how NEXT changes to support this:</p><pre><blockquote>NEXT LDX IP<br /> INX ; We can still do pre-increment mode<br /> INX<br /> STX IP<br />NEXT2 LDX 0,X ; get W which points to definition to be done<br />NEXT3 STX W<br /> JMP 0,X ; Just jump right to it.</blockquote></pre><p>This looks like it ought to be faster, by a small amount. What happens to DOCOL and SEMIS, then? SEMIS just loses its CODE_POINTER. DOCOL needs one small adjustment: W now points at the three-byte JMP DOCOL instead of a two-byte CODE_POINTER, so DOCOL needs one more INX before it jumps to NEXT+2. </p><p>So why use indirect threading?</p><p>Basically, it keeps the i-code list pure, which simplifies debugging tools and such. Also, indirect threading helps avoid some of the problems your developers' tools may make for you, when they try to help you. (Yet another topic.)<br /></p><p>Again, this is not the only way to do direct-threading.</p><p>How about subroutine threading?</p><p>In subroutine threading, the definitions are called by subroutine. This can be done both indirect-threaded and direct-threaded, but, typically, it is done direct-threaded, in the idea that it is faster. NEXT looks like this:</p><pre><blockquote>NEXT LDX IP<br /> INX ; We can still do pre-increment mode<br /> INX<br /> STX IP<br />NEXT2 LDX 0,X ; get W which points to definition to be done<br />NEXT3 STX W<br />* LDX 0,X ; if doing indirect<br /> JSR 0,X ; Use a native call.<br /> BRA NEXT ; The RTS lands here, so go around again.<br /></blockquote></pre><p>To match this, low-level definitions end in a RTS instead of a JMP NEXT. The call and return actually take a bit more time than JMP, having to save and restore data on the CPU stack. </p><p>DOCOL doesn't have to change, and SEMIS ends in a RTS. 
Certain definitions we haven't looked at change because of return addresses that are now on the parameter stack. (Using the CPU's call stack for the parameter stack causes the greater ripple effect here.)<br /></p><p>Again, these are not the only ways of doing subroutine threading. In particular, since the call in the direct-threaded model actually saves W on the return stack, explicitly having a W register in the VM and storing X to it at NEXT3 can be done away with. Such an approach does require changes to DOCOL, however.<br /></p><p>Some daredevils pervert the CPU's stack pointer into the IP and NEXT becomes a CPU return. I say daredevil, because, if the CPU gets any sort of interrupt, the interrupt walks all over the code. (Most CPUs save interrupt state on the call stack.) If you're that desperate to use an IP register with auto-increment mode, use a CPU that has a valid auto-increment mode for a non-stack register.<br /></p><p>Just for curiosity's sake, let's look at non-subroutine indirect threading on the 6809, going with my recommendation of using the U stack for parameters this time. The header will not change. 
Registers in the virtual machine will be assigned as follows:</p><ul><li>W: X (Save before using for other things, if you need W.)<br /></li><li>IP: Y (save before using for other things and restore before NEXT)<br /></li><li>RP: S <br /></li><li>SP: U</li><li>UP: DP<br /></li></ul><p>The examples, as with the 6800:</p><p></p><blockquote>DPLUS FDB *+2<br /> LDD 6,U ; left-hand<br /> ADDD 2,U ; right-hand<br /> STD 6,U<br /> LDD 4,U ; left-hand<br /> ADCB 1,U ; No ADCD, do it by bytes.<br /> ADCA ,U<br /> LEAU 4,U ; deallocate before store to save a cycle.<br /> STD ,U<br /> JMP NEXT<br />* JMP [,Y++] ; This could be NEXT on the 6809, if ignoring W<br />*</blockquote><p> DOCOL, NEXT, and SEMIS are significantly optimized:<br /></p><blockquote>DOCOL PSHS Y ; save the caller's IP, which already points to its next i-code<br /> LEAY 2,X ; new IP: the first i-code after the CODE_POINTER that W points to<br />NEXT LDX ,Y++ ; fetch W, using post-increment instead of pre-<br /> JMP [,X]<br />* <br />SEMIS FDB *+2<br /> PULS Y<br /> BRA NEXT<br /></blockquote><p></p><p>How about the 68000? It gives us a 32-bit add naturally, so we'll define a 64-bit double add. 
Using A6 for SP, A7 for RP, A4 for IP, and A3 for W,</p><p></p><blockquote>DPLUS DC.L *+4<br /> MOVEM.L (A6),D0/D1/D2/D3 ; D0:D1 is the right-hand term, D2:D3 the left-hand<br /> ADD.L D1,D3 ; less significant 32 bits<br /> ADDX.L D0,D2 ; more significant 32 bits<br /> LEA 8(A6),A6 ; deallocate right-hand term<br /> MOVEM.L D2/D3,(A6) ; store the sum over the left-hand term<br /> JMP NEXT</blockquote>And<br /><p></p><p></p><blockquote>DOCOL MOVE.L A4,-(A7) ; save the caller's IP, which already points to its next i-code<br /> LEA 4(A3),A4 ; new IP: the first i-code after the CODE_POINTER that W points to<br />NEXT MOVE.L (A4)+,A3 ; fetch W, using post-increment instead of pre-<br /> MOVE.L (A3),A2<br /> JMP (A2)<br />* <br />SEMIS DC.L *+4<br /> MOVE.L (A7)+,A4<br /> BRA NEXT</blockquote> The shift from indirect-threaded to direct, and the use of subroutine-threading would follow pretty much as the shift in the 6800, modulo the advantage of having the registers in actual registers instead of in memory.<br /><p></p><p>There is one more step I mentioned in passing in part one of this rant. If using native calls, a certain degree of speed optimization can be obtained by flattening an i-code list. 
Instead of starting the definition to be optimized this way with a jump to the inner interpreter, the definition can be compiled as a series of calls.</p><p>On the 6809, 2+ can be changed from<br /></p><p></p><blockquote>TWOP JMP DOCOL<br /> FDB TWO,PLUS,SEMIS </blockquote><p></p><p>to </p><p></p><blockquote>TWOP JSR TWO<br /> JSR PLUS<br /> RTS</blockquote>And from there, optimizing to <br /><p></p><p></p><blockquote>TWOP LDD #2<br /> ADDD ,U<br /> STD ,U<br /> RTS</blockquote><p></p><p>becomes almost trivial.</p><p>(I also discuss optimizations on the 6800 a little in the <a href="https://defining-computers.blogspot.com/2020/10/6800-example-vm-with-synthetic.html" target="_blank">synthetic stack examples here</a>: <a href="https://defining-computers.blogspot.com/2020/10/6800-example-vm-with-synthetic.html" target="_blank">https://defining-computers.blogspot.com/2020/10/6800-example-vm-with-synthetic.html</a>.)<br /></p><p>Caveat: It's way past my bedtime again, and I may have glaring mistakes in the above. If so, leave me comments, please.</p><p>Why is this important?</p><p>Using the fig Forth VM, it is a bit difficult to share the Forth words as library code for compiled languages such as C. Using the native CPU calls instead, it becomes possible to share, if the compiler for the other language uses a split stack.<br /></p>零石http://www.blogger.com/profile/01111094813708912513noreply@blogger.com0tag:blogger.com,1999:blog-3954576973420765520.post-3649745104556256172020-10-22T00:34:00.013+09:002020-10-26T09:21:39.946+09:00Computer Languages -- Interpreted vs. 
Compiled Forth?<p>[JMR202010251132:<br /></p><p>Comments on a <a href="https://www.facebook.com/groups/2225595150855239/permalink/3405586649522744/" target="_blank">post by Peter Forth</a> in the <a href="https://www.facebook.com/groups/2225595150855239" target="_blank">Forth2020 Users-Group</a> on FB set me off on a long response that turned into this webrant on the distinctions between<br /></p><h4 style="text-align: center;">Interpreter vs. compiler? -- Interpreted vs. Compiled? </h4><h4 style="text-align: center;">Forth?<br /></h4>][JMR202010251132 -- edited for clarification.] <br /><p>This is a particularly tangled intersection of concepts and jargon.</p>
<p>Many early BASICs directly interpreted the source code text. If you typed</p>
<blockquote><p>PRINT "HELLO "; 4*ATN(1)<br /></p></blockquote>
<p>and hit ENTER, the computer would parse (scan) along the line of text you had typed and find the space after the "PRINT", and go to the symbol table (the BASIC interpreter's dictionary) and look it up. There it would find some routines that would prepare to print something out on the screen (or other active output device).</p>
<p>Then it would keep parsing and find the quoted </p>
<blockquote><p>"HELLO " </p></blockquote>
<p>followed by the semicolon, and it would prepare the string "HELLO " for printing, probably by putting the string into a print buffer.</p>
<p>Continuing to scan, it would find the numeric text "4", followed by the asterisk. It would recognize it as a number and convert the numeric text to the number 4, then, since asterisk is the multiplication symbol in BASIC, put both the number and the multiplication operation somewhere to remember them. </p><p>Then it would scan the symbol "ATN" followed by the opening parenthesis character. Looking this up in the symbol table, it would find a routine to compute the arctangent of a number, and remember that waiting function call, along with the fact that it was looking for the closing parenthesis character.</p><p>Then it would scan the text "1", followed by the closing parenthesis. Recognizing the text as a number, it would convert it to the number 1. Then it would act on the closing parenthesis and call the saved function, passing it the 1 to work on. The function would return the result, 0.7853982, and save it away after the 4 stored earlier.</p><p>Then it would see that the line had come to an end, and it would go back and see what was left to do. It would find the multiplication, and multiply 4 x 0.7853982, giving 3.1415927 (which needs a little explaining), and remember the result. Then it would see that it was in the middle of PRINTing, and convert the number to text, put it in the buffer following the string, and print the buffer to the screen, something like</p><blockquote><p>HELLO 3.1415927 <br /></p></blockquote><p>I know that seems like a lot of work to go to. Maybe it would not help you to know I'm only giving you the easy overview. There are lots more steps I'm not mentioning.</p>(So, you're wondering why 4 x 0.7853982 is 3.1415927, not 3.1415928? Cheap
calculators are inexact after about four to eight digits of decimal
fraction. BASIC languages tended to implement a cheap calculator --
especially early BASIC languages. But, just for the record, written with a few bits more accuracy, π/4 ≒ 0.78539816; π ≒ 3.14159265. The computer often keeps more accuracy than it shows.)<br /><p>Interpreted vs. compiled actually indicates a spectrum, not a binary classification. What I have described above is one extreme end of the spectrum, referred to in such terms as "pure source text interpreter".</p><p>Other approaches to the BASIC language were taken. Some were pure compiled languages. Instead of acting immediately on each symbol as it parses, a compiler saves the corresponding native machine code for the CPU away in a file, and the user has to call the compiled program back up to run it -- after the compiler has finished checking that the code follows the grammar rules and has finished saving all the generated machine code. (Again, there are steps I am not mentioning at this point.) <br /></p><p>In the case of the above line of BASIC, the resultant machine code might look something like the following, if the compiler compiles to object code for the 6809 CPU: </p><p></p><blockquote>48454C4C4F20<br />00<br />40800000<br />3F800000<br />308CEE<br />3610<br />17CFE9<br />308CED<br />3610<br />308CEC<br />3610<br />17D828<br />17D801<br />17CFEE<br />17CFF7<br /></blockquote><p>That isn't very easily read by humans. 
Here is what the above looks like in 6809 assembly language:<br /></p><p></p><p></p><blockquote><code>1000 00007 S0001<br />1000 48454C4C4F20 00008 FCC 'HELLO '<br />1006 00 00009 FCB 0<br />1007 00010 FP0001<br />1007 40800000 00011 FCB $40,$80,00,00 ; 4<br />100B 00012 FP0002<br />100B 3F800000 00013 FCB $3F,$80,00,00 ; 1<br />100F 00014 L0001<br />100F 308CEE 00015 LEAX S0001,PCR<br />1012 3610 00016 PSHU X<br />1014 17CFE9 00017 LBSR BUFPUTSTR<br />1017 308CED 00018 LEAX FP0001,PCR<br />101A 3610 00019 PSHU X<br />101C 308CEC 00020 LEAX FP0002,PCR<br />101F 3610 00021 PSHU X<br />1021 17D828 00022 LBSR ATAN<br />1024 17D801 00023 LBSR FPMUL<br />1027 17CFEE 00024 LBSR BUFPUTFP<br />102A 17CFF7 00025 LBSR PRTBUF <br /></code></blockquote>
<p></p><p>Think you could almost read that? You see the word "HELLO " in there, and you can guess that </p>
<ol style="text-align: left;">
<li>FP0001 and FP0002 are the floating point encodings for 4 and 1, respectively. </li>
<li>LEAX calculates a pointer to the value given to it, and </li>
<li>LBSR calls subroutines which
<ol type="a">
<li>put the string in the print buffer, </li>
<li>calculate the arctangent, </li>
<li>multiply the arctangent (by the saved 4.0), </li>
<li>convert the result to text and put it in the print buffer, </li>
<li>and send the print buffer to the screen (or other current output device).<br /></li>
</ol>
</li>
</ol>
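<p>If you want to check item 2 against the hex, the last byte of each LEAX is a signed 8-bit offset from the address of the following instruction. The C sketch below just redoes that arithmetic for the listing above. The function names are mine, and it assumes the 8-bit PC-relative form (post-byte $8C) for LEAX and the 16-bit relative offset for LBSR.</p>

```c
#include <assert.h>
#include <stdint.h>

/* Target of a 3-byte "LEAX label,PCR" with an 8-bit offset.
 * insn_addr is where the instruction starts; offset is its last byte. */
uint16_t pcr8_target(uint16_t insn_addr, uint8_t offset)
{
    /* Sign-extend the offset by hand: $80..$FF mean -128..-1. */
    int delta = (offset & 0x80) ? (int)offset - 256 : (int)offset;

    /* The offset is relative to the PC after the instruction (3 bytes on). */
    return (uint16_t)(insn_addr + 3 + delta);
}

/* Target of a 3-byte LBSR: 16-bit offset, wrapping at 64K. */
uint16_t lbsr_target(uint16_t insn_addr, uint16_t offset)
{
    return (uint16_t)(insn_addr + 3u + offset);
}

/* The three LEAX instructions from the listing point at the data above them. */
void check_listing(void)
{
    assert(pcr8_target(0x100F, 0xEE) == 0x1000);   /* S0001:  "HELLO " */
    assert(pcr8_target(0x1017, 0xED) == 0x1007);   /* FP0001: 4        */
    assert(pcr8_target(0x101C, 0xEC) == 0x100B);   /* FP0002: 1        */
}
```

<p>By the same arithmetic, the first LBSR (17CFE9 at address 1014) lands on $E000, which is presumably where this made-up run-time library keeps BUFPUTSTR.</p>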
<p>Since the CPU interprets the machine language directly, the compiled code is going to run very fast. </p><p>With something this short, we don't really care about how fast it is, but with a really long program we might care a lot.<br /></p><p>On the other hand, with something as short as this, we are less interested in how fast it runs than in the effort and time it takes to prepare and run the code. </p><p>Compiled languages add a compile step between typing the code in and running it. If we want to change something, we have to edit the source code and run it through the compiler again. </p><p>On modern computers with certain integrated developer environments, that's actually less trouble than it sounds, but without those IDEs, or on slower computers, the pure text interpreter is much easier to use for short programs.</p><p>On eight bit computers, those fancy IDEs took too much of the computer's resources, because the machine code of a CPU can be very complicated.</p><p>Hmm. The example of compiling to the 6809 is a bit counter to my purpose here, because the architecture and machine code of the 6809 were at once quite simple and well-matched to the programs we write. </p><p>Let's see what that line of code would look like compiled on a "modern" CPU. I'm going to cheat a little and show you what it looks like written in C, and what it looks like compiled from C, because I don't want to take the time to install either FreeBasic or Gambas in my workstation and figure out how to get a look at the compiled output. 
Here's the C version, wrapped in the mandatory main() procedure:<br /></p><p></p><blockquote>/* Using the C compiler as a calculator.<br />*/<br /><br />#include <stdlib.h><br />#include <math.h><br />#include <stdio.h><br /><br />int main( int argc, char *argv[] )<br />{<br /> printf( "HELLO %g\n", 4 * atan( 1 ) );<br />}</blockquote><p>That's a lot of wrapper just to do calculations, don't you think?</p><p>(Not all compiled languages require the wrapping main function to be explicitly declared, but many do.)<br /></p><p>Well, here's the assembly language output on my Ubuntu (Intel64 perversion of AMD64 evolution of 'x86 CPU) workstation: </p><p></p><blockquote><p> .file "printsample.c"<br /> .text<br /> .section .rodata<br />.LC1:<br /> .string "HELLO %g\n"<br /> .text<br /> .globl main<br /> .type main, @function<br />main:<br />.LFB5:<br /> .cfi_startproc<br /> pushq %rbp<br /> .cfi_def_cfa_offset 16<br /> .cfi_offset 6, -16<br /> movq %rsp, %rbp<br /> .cfi_def_cfa_register 6<br /> subq $32, %rsp<br /> movl %edi, -4(%rbp)<br /> movq %rsi, -16(%rbp)<br /> movq .LC0(%rip), %rax<br /> movq %rax, -24(%rbp)<br /> movsd -24(%rbp), %xmm0<br /> leaq .LC1(%rip), %rdi<br /> movl $1, %eax<br /> call printf@PLT<br /> movl $0, %eax<br /> leave<br /> .cfi_def_cfa 7, 8<br /> ret<br /> .cfi_endproc<br />.LFE5:<br /> .size main, .-main<br /> .section .rodata<br /> .align 8<br />.LC0:<br /> .long 1413754136<br /> .long 1074340347<br /> .ident "GCC: (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0"<br /> .section .note.GNU-stack,"",@progbits<br /></p><p></p></blockquote><p>Now, with a bit of effort, I can read that. I can see where it's setting up a stack frame on entry and discarding it on exit. I can see where the compiler has pre-calculated the constant, so that all that is left to do at run-time is load the result and print it. 
But, even though you can see the string "HELLO " and the final call to printf(), if you can decipher the stuff in between without explanation, you probably already know anything I can tell you about this stuff. <br /></p><p>The IDE running on your computer has to keep track of stuff like this very quickly, in order to make it easy for you to change something and get quick results. (Back in the 1980s, the <a href="https://osdn.net/projects/splitstack-runtimelib/" target="_blank">Think C compiler</a> for the 68000 (Apple Macintosh) was able to do stuff like this, in no small part because the 68000, like its little brother the 6809, is pretty powerful even though it's relatively simple.)</p><p>So, back in the 1970s and 1980s, companies that weren't, for various reasons, willing to use either the 6809 or 68000 in their designs, wanted something a little simpler than the CPU to compile to, so that the IDE (and the humans) could keep track of what was happening.</p><p>So they devised "virtual machines" that interpreted an intermediate code between the complexity of the CPU's raw machine code and the readability of pure text. These virtual machines also made it easier to have compilers that were portable between various processors, but that's another topic.</p>
<p>The intermediate codes these VMs ran were known as "i-codes", but were often called "p-codes" (for Pascal i-codes) or "byte-codes" (because many of them, especially Java, were byte-oriented encodings). (And, BTW, "P-code" was also an expression used at the time to indicate pseudo-coding, which can be a bit confusing, but that's another topic.)</p><p>If you want an idea what an i-code looked like, one possible i-code using prefix 3-tuples and converted to human-readable form might look like</p><p></p><blockquote><p>string 'HELLO '<br />putstr stdout s0001<br />float 4<br />float 1<br />library arctan fp0002<br />multiply fp0001 res0001<br />putfloat stdout res0002<br />printbuffer stdout</p><p></p></blockquote><p></p>
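<p>A toy interpreter for tuples like these might look like the following C sketch. Everything in it is an assumption made for illustration: the operator names, the explicit numbered slots (the i-code above leaves result names implicit), and lib_arctan(), a small series standing in for a run-time library routine so that the sketch needs no math library.</p>

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

enum op { OP_PUTSTR, OP_FLOAT, OP_ARCTAN, OP_MULTIPLY, OP_PUTFLOAT, OP_END };

/* One i-code: an operator and two operands (slot numbers or a literal). */
struct tuple { enum op op; int a; int b; };

/* Stand-in library routine: Euler's series for the arctangent. */
static double lib_arctan(double x)
{
    double y = x * x / (1.0 + x * x);
    double term = x / (1.0 + x * x);
    double sum = term;

    for (int n = 0; n < 60; n++) {
        term *= y * (2.0 * n + 2.0) / (2.0 * n + 3.0);
        sum += term;
    }
    return sum;
}

/* Append text to the print buffer, refusing to overflow it. */
static void append(char *buf, size_t size, const char *text)
{
    size_t used = strlen(buf);

    assert(used + strlen(text) < size);
    strcpy(buf + used, text);
}

/* Interpret a tuple list. No bounds checks on slot numbers; a real VM
 * would validate them. */
void run_icode(const struct tuple *pc, const char *const *strings,
               char *buf, size_t size)
{
    double slot[8] = { 0 };
    char number[32];

    assert(size > 0);
    buf[0] = '\0';
    for (; pc->op != OP_END; pc++) {
        switch (pc->op) {
        case OP_PUTSTR:                 /* a: index into strings[]      */
            append(buf, size, strings[pc->a]);
            break;
        case OP_FLOAT:                  /* a: slot, b: integer literal  */
            slot[pc->a] = (double)pc->b;
            break;
        case OP_ARCTAN:                 /* slot[a] = arctan(slot[b])    */
            slot[pc->a] = lib_arctan(slot[pc->b]);
            break;
        case OP_MULTIPLY:               /* slot[a] = slot[a] * slot[b]  */
            slot[pc->a] *= slot[pc->b];
            break;
        case OP_PUTFLOAT:               /* a: slot to convert to text   */
            snprintf(number, sizeof number, "%.7f", slot[pc->a]);
            append(buf, size, number);
            break;
        case OP_END:
            break;
        }
    }
}

/* The PRINT line from the top of this post, hand-compiled to tuples. */
static const char *const hello_strings[] = { "HELLO " };
static const struct tuple hello_program[] = {
    { OP_PUTSTR, 0, 0 },
    { OP_FLOAT, 0, 4 },
    { OP_FLOAT, 1, 1 },
    { OP_ARCTAN, 1, 1 },
    { OP_MULTIPLY, 0, 1 },
    { OP_PUTFLOAT, 0, 0 },
    { OP_END, 0, 0 }
};

void run_hello(char *buf, size_t size)
{
    run_icode(hello_program, hello_strings, buf, size);
}
```

<p>Calling run_hello() with a buffer of, say, 64 bytes leaves "HELLO 3.1415927" in it. The point is the shape of the loop: one switch, one case per operator, and nothing in it that cares which CPU is underneath.</p>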
<p>A stacked mixed-tuple postfix i-code would also be possible:</p><p></p><blockquote>'HELLO ' putstr<br />4.0 <br />1.0<br />arctan<br />multiply<br />putfloat<br />stdout printbuffer <br /></blockquote><p></p><p>There were versions of BASIC that ran on VMs. The Basic compilers compiled the source code to the i-codes, and the VM interpreted the i-codes. <br />
</p><p>Note that we have added a level of interpretation between the source text and the CPU. <br /></p><p>You might think that the 6809 wouldn't need an i-code interpreted BASIC, but a byte-code could be designed to be more compact even than 6809 machine code. </p><p>Basic09 was basically developed at the same time as the processor itself was designed. The 6809 was designed to run the Basic09 VM very efficiently, and it did. But it was a compiled language, in the sense that there was no command-line interpreter built-in. (You could build or buy one, but it wasn't built-in.) It was source-code compiled, i-code interpreted. (Analyzing the functionality of Basic09 is also yet another topic, but it worked out well, partly because of the host OS-9 operating system.)</p>
<p>Even now, there are a number of languages that (mostly) run in a VM, for various reasons. The modular VM of Java, with the division between the CPU and the language, can make the language safer to use, which is yet another topic. (Except that now Java has had so much added to it, that .... Which is another topic, indeed.)<br /></p><p>Which brings me to Forth.</p><p>Early Forth languages tended to be i-code interpreted VMs, for simplicity and portability. The i-code was very efficient. Not byte-code, at least not originally, but i-code. </p><p>(Specifically, Forth built-in operations were essentially functions, and, when defining a new function, the function definition would be compiled as a list of addresses of the functions to call. A newly-defined function would also be callable by its address in the same way as built-in functions. How that is done involves something the Forth community calls threaded code, which is, ahem, another topic, partially addressed <a href="https://defining-computers.blogspot.com/2020/10/forth-threading-is-not-process-threads.html" target="_blank">here</a>: <a href="https://defining-computers.blogspot.com/2020/10/forth-threading-is-not-process-threads.html" target="_blank">https://defining-computers.blogspot.com/2020/10/forth-threading-is-not-process-threads.html</a>.)<br /></p><p>Forth interpreters include a command-line interpretation mode, which is made easier because of the i-code interpretation. 
</p><p>(In a sense, the interpreter mode is much like a directly interpreted BASIC interpreter with a giant case statement, where new definitions implicitly become part of the giant case statement.)<br /></p><p>The built-in i-codes include commands that can be invoked as post-fix calculator commands, directly from the command-line.</p><p>The above example, entered at the command line of gforth, which is a Forth that knows floating-point, would look like this:</p><blockquote><p>4.0e0 1.0e0 fatan f* f.<br /></p></blockquote><p>(giving the result 3.14159265358979.) <br /></p>
<p>Uh, yes, I know, in terms of readability, that only looks marginally improved over 6809 assembler source. Maybe.</p>
<p>Lemme 'splain s'more!</p>
<p>That's because Forth is, essentially, the i-code of its VM. Not byte-code, but address-word-code VM. </p><p>Forth has a postfix grammar, something like the RPN of Hewlett Packard calculators. So the first two text blobs are 4.0 x 10<sup>0</sup> and 1.0 x 10<sup>0</sup> in a modified scientific notation, pushed on the floating point stack in that order. Fatan is the floating point arctangent function, which operates on the topmost item on the floating point stack, 1.0. It returns the result -- 0.785398163397448, or a quarter of π -- in the place of the 1.0e0. </p><p>'F*' is not a swear word, it's the floating point multiply, and it multiplies the top two items on the floating point stack, resulting in (an approximation of) π. 'F.' prints the top floating point number to the current output device, in this case, the screen.<br /></p><p>The stack, in combination with the post-fix notation, allows most of the symbols the user types in from the command-line to have a one-to-one correspondence with the i-codes themselves. <br /></p><p>The 6809 and 68000 have an interesting quality in common (almost, but not quite, shared by the 8086) -- both the 6809 and the 68000 have an instruction set that includes op-codes that can be mapped one-to-one or one-to-two with a significant subset of the Forth primitives.</p><p>There have been other processors designed specifically to be mapped one-to-one with all the important Forth leaf-level primitives. The Forth Interest Group lists a <a href="http://www.forth.org/cores.html" target="_blank">few of the current ones on one of their web pages</a>.<br /></p><p></p><p>Thus, Forth is a bit unusual, even among byte-code and i-code languages. It partakes of both ends of the spectrum, and of the middle, all at once, being both interpreted at the text level and compiled at the machine-code level, as well as operating at the intermediate level (although not at all levels at once in every implementation). 
<br /></p><p>Now, before we think we have covered the bases, we should note that all modern compilers compile to an intermediate form, as a means of separating the handling of the source language from the handling of the target CPU.</p><p>Forth might actually define one of the possible intermediate forms a compiler might compile to, if the compiler compiles to a split stack run-time model instead of a linked-list-of-interleaved-frames-in-a-single-stack-model. </p><p>(Note here that the run-time model of compiled languages is essentially a close corollary of a virtual machine.)<br /></p><p>An interleaved single-stack model mixes local variables, parameters, and return addresses on the same stack. This means that a called function can walk all over both the link to the calling frame and the address in the calling code to return to, if the programmer is only a little bit careless. When that happens, the program usually loses complete track of what it's doing, and may begin doing many kinds of random and potentially damaging things.<br /></p><p>This is where most of the stack vulnerabilities and stack attack vectors that you hear about in software come from.<br /></p><p>Some engineers consider the single stack easier to manage and maintain than a split stack. 
You only have to ask the operating system to allocate one stack region, and you only have to worry about overflow and underflow in that region.</p><p>But interleaving the return addresses with the parameters and local variables comes at a trade-off cost of having to worry about frame bounds in every function, in addition to the cost of maintaining the frame.</p><p>With a (correctly) split stack, there is no need to maintain frames in most run-times because they maintain themselves, and it takes more than carelessness to overwrite the return address (and optional frame pointer for those languages which need one).</p><p>Given this, one might expect that most modern run-times would prefer the problems of maintaining a split stack over the problems of maintaining a single stack. </p><p>But memory was tight in the 1980s (or so we thought), and the single stack was the model the industry grew up with. (Maybe you can tell, but I believe this was a serious error.) <br /></p><p>(By the way, this is one of the key places where the 8086 misses being able to implement a near one-to-one mapping -- segmentation biases it towards the single-stack model. BP shares a segment register with SP. If BP had its own segment register, you could put the return address stack in its own somewhat well-protected segment, but it doesn't, so you can't.)</p><p>Especially with CPUs that map well to the Forth model, Forth interpreters can be designed to compile to machine code, and run without an intermediary virtual machine. <i>[JMR202010222211: I'm actually working -- a little at a time -- on a project that will do that, here: <a href="https://osdn.net/projects/splitstack-runtimelib/">https://osdn.net/projects/splitstack-runtimelib/</a>.]</i> </p><p>One more topic that needs to be explicitly addressed: Most languages have only limited ability to extend their symbol tables, especially at run-time. 
Most text-only interpreters have severe limits to extending their symbol tables -- in some cases just ten possible user functions named fn0 to fn9, in others no extensions at all.<br /></p><p>I've implied this above, but, by design, Forth interpreters do not have such limits. Symbols defined by the programmer have pretty much equal standing with pre-defined symbols. This makes Forth even less like most interpreted languages.</p><p>Yes, Forth is often an interpreted language, but not like other interpreted languages.</p><p>And it's after midnight here, and I keep drifting off, so this rant ends here.</p><p>[JMR202010251351:</p><p>Discussion of threading added in a <a href="https://defining-computers.blogspot.com/2020/10/forth-threading-is-not-process-threads.html" target="_blank">separate post</a>: <a href="https://defining-computers.blogspot.com/2020/10/forth-threading-is-not-process-threads.html">https://defining-computers.blogspot.com/2020/10/forth-threading-is-not-process-threads.html</a>, also linked above. <br /></p><p>] <br /></p>零石http://www.blogger.com/profile/01111094813708912513noreply@blogger.com0tag:blogger.com,1999:blog-3954576973420765520.post-10482206088949517362020-09-22T07:55:00.006+09:002020-09-22T07:56:26.041+09:00Google Bias against Plain Speech? Google has been telling me and others that my blogs that I haven't set to auto-redirect to HTTPS encryption are "<b>Insecure!</b>" and "<b>Could be dangerous to access!</b>".<p>WTFoolishness?</p><p>HTTPS
has nothing to do with safe content. If you want to host a virus,
hosting it over HTTPS doesn't block it or change the fact that it's a
virus. If somebody gets your password or gets into your server and drops
malware in your web pages for you, encrypting it won't change the fact
that it's malware.<br /></p><p>HTTPS can be used to help ensure identity, by making it harder for a man-in-the-middle to send you, the reader, to http://defining-computers.blogsp0t.com instead of http://defining-computers.blogspot.com. (Did you see that was blogsp0t instead of blogspot?) I won't explain the weaknesses in HTTPS here, but I can tell you that even that protection is not all that strong.</p>And encrypting everything actually gives an attacker more surface area to attack the encryption. <p>HTTPS is a speed bump, or, at best, a low wall. That's the best it can give you. <br /></p><p>Proper use of HTTPS encryption is to limit it to the pages where it is needed, and to identity tokens on otherwise unsecured ordinary information-type pages (like blogs). <br /></p><p>I just published a post on my political blog <a href="https://joel-for-president.blogspot.com/2020/09/cap-in-hand.html" target="_blank">about the evils of mocking</a>, but -- </p><p>I
use nothing but plain text and a few simple images on my blogs. Okay, on
some of my blogs, I let Google put their advertisements up, since I can't afford to
pay Google for the use.</p><p>Plain text, simple images, and Google's
own ads. </p><p>No other content. If there's dangerous content, it would have
to be Google's ads.</p><p>And I am letting Blogspot host the blogs --
again, because I can't afford to host them on my own server. If there's
some problem with identity, it's not me handling my on-line identity,
it's Blogspot/Google.</p><p>Okay, I guess that's why they want me to let them force my blogs to HTTPS. They think it makes things simpler for them. </p><p>Anyway, I am turning on automatic HTTPS redirect for all my blogs, not because it's a good idea, but because I don't have money or time to host my blogs myself, the right way. And I don't want Google warning innocent bystanders that I'm dangerous.</p><p>(Whether the ideas I expound on my blogs are dangerous or not is another problem.)<br /></p>零石http://www.blogger.com/profile/01111094813708912513noreply@blogger.com2