Misunderstanding Computers

Why do we insist on seeing the computer as a magic box for controlling other people?
Why do we want so much to control others when we won't control ourselves?

Computer memory is just fancy paper, CPUs are just fancy pens with fancy erasers, and the network is just a fancy backyard fence.

(original post -- defining computers site)

Thursday, October 22, 2020

Computer Languages -- Interpreted vs. Compiled Forth?

[JMR202010251132:

Comments on a post by Peter Forth in the Forth2020 Users-Group on FB set me off on a long response that turned into this webrant on the distinctions between

Interpreter vs. compiler? -- Interpreted vs. Compiled? 

Forth?

][JMR202010251132 -- edited for clarification.]

This is a particularly tangled intersection of concepts and jargon.

Many early BASICs directly interpreted the source code text. If you typed

PRINT "HELLO "; 4*ATN(1)

and hit ENTER, the computer would parse (scan) along the line of text you had typed and find the space after the "PRINT", and go to the symbol table (the BASIC interpreter's dictionary) and look it up. There it would find some routines that would prepare to print something out on the screen (or other active output device).

Then it would keep parsing and find the quoted 

"HELLO " 

followed by the semicolon, and it would prepare the string "HELLO " for printing, probably by putting the string into a print buffer.

Continuing to scan, it would find the numeric text "4", followed by the asterisk. It would recognize "4" as a number and convert the numeric text to the number 4, then, since the asterisk is the multiplication symbol in BASIC, put both the number and the multiplication operation somewhere to remember them.

Then it would scan the symbol "ATN" followed by the opening parenthesis character. Looking this up in the symbol table, it would find a routine to compute the arctangent of a number, and remember that waiting function call, along with the fact that it was looking for the closing parenthesis character.

Then it would scan the text "1", followed by the closing parenthesis. Recognizing the text as a number, it would convert it to the number 1. Then it would act on the closing parenthesis and call the saved function, passing it the 1 to work on. The function would return the result, 0.7853982, which the interpreter would save away after the 4 stored earlier.

Then it would see that the line had come to an end, and it would go back and see what was left to do. It would find the multiplication, and multiply 4 x 0.7853982, giving 3.1415927 (which needs a little explaining), and remember the result. Then it would see that it was in the middle of PRINTing, and convert the number to text, put it in the buffer following the string, and print the buffer to the screen, something like

HELLO  3.1415927

I know that seems like a lot of work to go to. Maybe it would not help you to know I'm only giving you the easy overview. There are lots more steps I'm not mentioning.

(So, you're wondering why 4 x 0.7853982 is 3.1415927, not 3.1415928? Cheap calculators are inexact after about four to eight digits of decimal fraction. BASIC languages tended to implement a cheap calculator -- especially early BASIC languages. But, just for the record, written with a few bits more accuracy, π/4 ≒ 0.78539816; π ≒ 3.14159265. The computer often keeps more accuracy than it shows.)
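If you want a feel for the symbol table lookup part of that, here is a minimal sketch in C -- my own illustration, not any real BASIC's internals -- of a dictionary that pairs names with routines to call. The names and the table are invented for the example:

/* A tiny "dictionary" sketch (hypothetical, for illustration):
** scan a name out of the source text, search a table of
** name/routine pairs, and call what you find.
** Compile with the math library, e.g. cc lookup.c -lm
*/

#include <math.h>
#include <stdio.h>
#include <string.h>

typedef double (*fn1)( double );      /* a one-argument numeric routine */

struct entry { const char *name; fn1 fn; };

static struct entry table[] = {       /* the interpreter's built-ins */
  { "ATN", atan },
  { "SIN", sin },
  { "SQR", sqrt }
};

static fn1 lookup( const char *name ) /* linear search, as many early BASICs did */
{
  size_t i;
  for ( i = 0; i < sizeof table / sizeof table[ 0 ]; ++i )
  {
    if ( strcmp( table[ i ].name, name ) == 0 )
      return table[ i ].fn;
  }
  return NULL;
}

int main( int argc, char *argv[] )
{
  fn1 f = lookup( "ATN" );             /* "ATN" scanned out of the source text */
  if ( f != NULL )
  {
    printf( "HELLO %g\n", 4 * f( 1 ) ); /* then called on its argument */
  }
  return 0;
}

A real interpreter also has to handle the parsing, the operator precedence, and the error messages, which is most of the work, but the lookup really is this kind of thing.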

Interpreted vs. compiled actually indicates a spectrum, not a binary classification. What I have described above is one extreme end of the spectrum, referred to by terms such as "pure source text interpreter".

Other approaches to the BASIC language were taken. Some were pure compiled languages. Instead of acting immediately on each symbol as it parses, a compiler saves the corresponding code, in the CPU's native machine language, away in a file, and the user has to call the compiled program back up to run it -- after the compiler has finished checking that the code follows the grammar rules and has finished saving all the generated machine code. (Again, there are steps I am not mentioning at this point.)

In the case of the above line of BASIC, the resultant machine code might look something like the following, if the compiler compiles to object code for the 6809 CPU: 

48454C4C4F20
00
40800000
3F800000
308CEE
3610
17CFE9
308CED
3610
308CEC
3610
17D828
17D801
17CFEE
17CFF7

That isn't very easily read by humans. Here is what the above looks like in 6809 assembly language:

1000                  00007         S0001
1000 48454C4C4F20     00008             FCC 'HELLO '
1006 00               00009             FCB 0
1007                  00010         FP0001
1007 40800000         00011             FCB $40,$80,00,00 ; 4
100B                  00012         FP0002
100B 3F800000         00013             FCB $3F,$80,00,00 ; 1
100F                  00014         L0001
100F 308CEE           00015             LEAX S0001,PCR
1012 3610             00016             PSHU X
1014 17CFE9           00017             LBSR BUFPUTSTR
1017 308CED           00018             LEAX FP0001,PCR
101A 3610             00019             PSHU X
101C 308CEC           00020             LEAX FP0002,PCR
101F 3610             00021             PSHU X
1021 17D828           00022             LBSR ATAN
1024 17D801           00023             LBSR FPMUL
1027 17CFEE           00024             LBSR BUFPUTFP
102A 17CFF7           00025             LBSR PRTBUF

Think you could almost read that? You see the word "HELLO " in there, and you can guess that 

  1. FP0001 and FP0002 are the floating point encodings for 4 and 1, respectively. 
  2. LEAX calculates a pointer to the value given to it, and 
  3. LBSR calls subroutines which
    1. put the string in the print buffer, 
    2. calculate the arctangent, 
    3. multiply the arctangent (by the saved 4.0), 
    4. convert the result to text and put it in the print buffer, 
    5. and send the print buffer to the screen (or other current output device).

Since the CPU interprets the machine language directly, the compiled code is going to run very fast. 

With something this short, we don't really care about how fast it is, but with a really long program we might care a lot.

On the other hand, with something as short as this, we are less interested in how fast it runs than in the effort and time it takes to prepare and run the code. 

Compiled languages add a compile step between typing the code in and running it. If we want to change something, we have to edit the source code and run it through the compiler again. 

On modern computers with certain integrated development environments, that's actually less trouble than it sounds, but without those IDEs, or on slower computers, the pure text interpreter is much easier to use for short programs.

On eight-bit computers, those fancy IDEs took too much of the computer's resources, because the machine code of a CPU can be very complicated.

Hmm. The example of compiling to the 6809 is a bit counter to my purpose here, because the architecture and machine code of the 6809 were at once quite simple and well-matched to the programs we write. 

Let's see what that line of code would look like compiled on a "modern" CPU. I'm going to cheat a little and show you what it looks like written in C, and what it looks like compiled from C, because I don't want to take the time to install either FreeBasic or Gambas on my workstation and figure out how to get a look at the compiled output. Here's the C version, wrapped in the mandatory main() procedure:

/* Using the C compiler as a calculator.
*/

#include <stdlib.h>
#include <math.h>
#include <stdio.h>

int main( int argc, char *argv[] )
{
  printf( "HELLO %g\n", 4 * atan( 1 ) );
}

That's a lot of wrapper just to do calculations, don't you think?

(Not all compiled languages require the wrapping main function to be explicitly declared, but many do.)

Well, here's the assembly language output on my Ubuntu (Intel64 perversion of AMD64 evolution of 'x86 CPU) workstation: 

    .file    "printsample.c"
    .text
    .section    .rodata
.LC1:
    .string    "HELLO %g\n"
    .text
    .globl    main
    .type    main, @function
main:
.LFB5:
    .cfi_startproc
    pushq    %rbp
    .cfi_def_cfa_offset 16
    .cfi_offset 6, -16
    movq    %rsp, %rbp
    .cfi_def_cfa_register 6
    subq    $32, %rsp
    movl    %edi, -4(%rbp)
    movq    %rsi, -16(%rbp)
    movq    .LC0(%rip), %rax
    movq    %rax, -24(%rbp)
    movsd    -24(%rbp), %xmm0
    leaq    .LC1(%rip), %rdi
    movl    $1, %eax
    call    printf@PLT
    movl    $0, %eax
    leave
    .cfi_def_cfa 7, 8
    ret
    .cfi_endproc
.LFE5:
    .size    main, .-main
    .section    .rodata
    .align 8
.LC0:
    .long    1413754136
    .long    1074340347
    .ident    "GCC: (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0"
    .section    .note.GNU-stack,"",@progbits

Now, with a bit of effort, I can read that. I can see where it's setting up a stack frame on entry and discarding it on exit. I can see where the compiler has pre-calculated the constant, so that all that is left to do at run-time is load the result and print it. But, even though you can see the string "HELLO " and the final call to printf(), if you can decipher the stuff in between without explanation, you probably already know anything I can tell you about this stuff. 

The IDE running on your computer has to keep track of stuff like this very quickly, in order to make it easy for you to change something and get quick results. (Back in the 1980s, the Think C compiler for the 68000 (Apple Macintosh) was able to do stuff like this, in no small part because the 68000, like its little brother the 6809, is pretty powerful even though it's relatively simple.)

So, back in the 1970s and 1980s, companies that weren't, for various reasons, willing to use either the 6809 or the 68000 in their designs wanted something a little simpler than the CPU to compile to, so that the IDE (and the humans) could keep track of what was happening.

So they devised "virtual machines" that interpreted an intermediate code between the complexity of the CPU's raw machine code and the readability of pure text. These virtual machines also made it easier to have compilers that were portable between various processors, but that's another topic.

The intermediate codes these VMs ran were known as "i-codes", but were often called "p-codes" (for Pascal i-codes) or "byte-codes" (because many of them, especially Java, were byte-oriented encodings). (And, BTW, "P-code" was also an expression used at the time to indicate pseudo-coding, which can be a bit confusing, but that's another topic.)

If you want an idea of what an i-code looked like, one possible i-code using prefix 3-tuples and converted to human-readable form might look like

string 'HELLO '
putstr stdout s0001
float 4
float 1
library    arctan fp0002
multiply fp0001 res0001
putfloat stdout res0002
printbuffer stdout

A stacked mixed-tuple postfix i-code would also be possible:

'HELLO ' putstr
4.0
1.0
arctan
multiply
putfloat
stdout printbuffer
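That stacked postfix form maps almost directly onto the classic VM inner loop: fetch a code, dispatch on it, repeat. Here is a minimal sketch in C of such a loop -- my own hypothetical encoding, not any historical BASIC's or Forth's i-code:

/* A toy i-code interpreter (hypothetical encoding, for illustration):
** fetch, dispatch, repeat.  The i-code below is our running example,
** 4 * ATN( 1 ), printed.  Compile with the math library, e.g. cc vm.c -lm
*/

#include <math.h>
#include <stdio.h>

enum { OP_LIT, OP_ATAN, OP_MUL, OP_PRINT, OP_HALT };

int main( int argc, char *argv[] )
{
  double lits[] = { 4.0, 1.0 };        /* literal pool */
  int code[] = { OP_LIT, 0, OP_LIT, 1, OP_ATAN, OP_MUL, OP_PRINT, OP_HALT };

  double stack[ 16 ];                  /* the VM's evaluation stack */
  int sp = 0;
  int pc = 0;

  while ( 1 )
  {
    switch ( code[ pc++ ] )
    {
      case OP_LIT:   stack[ sp++ ] = lits[ code[ pc++ ] ];       break;
      case OP_ATAN:  stack[ sp - 1 ] = atan( stack[ sp - 1 ] );  break;
      case OP_MUL:   --sp;  stack[ sp - 1 ] *= stack[ sp ];      break;
      case OP_PRINT: printf( "%g\n", stack[ --sp ] );            break;
      case OP_HALT:  return 0;
    }
  }
}

The dispatch loop is small and easy to move to a new CPU, which is a big part of why the VM approach was attractive.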

There were versions of BASIC that ran on VMs. The BASIC compilers compiled the source code to the i-codes, and the VM interpreted the i-codes.

Note that we have added a level of interpretation between the source text and the CPU.

You might think that the 6809 wouldn't need an i-code interpreted BASIC, but a byte-code could be designed to be even more compact than 6809 machine code.

Basic09 was developed at essentially the same time as the processor itself was designed. The 6809 was designed to run the Basic09 VM very efficiently, and it did. But it was a compiled language, in the sense that there was no command-line interpreter built in. (You could build or buy one, but it wasn't built in.) It was source-code compiled, i-code interpreted. (Analyzing the functionality of Basic09 is also yet another topic, but it worked out well, partly because of the host OS-9 operating system.)

Even now, there are a number of languages that (mostly) run in a VM, for various reasons. The modular VM of Java, with the division between the CPU and the language, can make the language safer to use, which is yet another topic. (Except that now Java has had so much added to it, that .... Which is another topic, indeed.)

Which brings me to Forth.

Early Forth languages tended to be i-code interpreted VMs, for simplicity and portability. The i-code was very efficient. Not byte-code, at least not originally, but i-code. 

(Specifically, Forth built-in operations were essentially functions, and, when defining a new function, the function definition would be compiled as a list of addresses of the functions to call. A newly-defined function would also be callable by its address in the same way as built-in functions. How that is done involves something the Forth community calls threaded code, which is, ahem, another topic, partially addressed here: https://defining-computers.blogspot.com/2020/10/forth-threading-is-not-process-threads.html.)
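To make the "list of addresses" idea a little more concrete, here is a minimal sketch in C -- mine, and only a rough cartoon of real Forth threading -- in which a definition is stored as an array of pointers to the routines it calls, and a tiny inner interpreter walks that array:

/* A toy address-list ("threaded code") sketch, for illustration only.
** Each word is a routine; a definition is a NULL-terminated array of
** pointers to words; the inner interpreter calls them in order.
** Compile with the math library, e.g. cc thread.c -lm
*/

#include <math.h>
#include <stdio.h>

typedef void (*word)( void );

static double fstack[ 16 ];           /* floating point stack */
static int fsp = 0;

static void lit_four( void )  { fstack[ fsp++ ] = 4.0; }
static void lit_one( void )   { fstack[ fsp++ ] = 1.0; }
static void f_atan( void )    { fstack[ fsp - 1 ] = atan( fstack[ fsp - 1 ] ); }
static void f_star( void )    { --fsp;  fstack[ fsp - 1 ] *= fstack[ fsp ]; }
static void f_dot( void )     { printf( "%g ", fstack[ --fsp ] ); }

/* the "compiled" body of a definition: just the addresses of words to call */
static word pi_print[] = { lit_four, lit_one, f_atan, f_star, f_dot, NULL };

static void inner( word *ip )         /* the inner interpreter */
{
  while ( *ip != NULL )
  {
    (*ip++)();
  }
}

int main( int argc, char *argv[] )
{
  inner( pi_print );                  /* prints an approximation of pi */
  printf( "\n" );
  return 0;
}

Real Forths use tighter representations than this, and that is what the threading post linked above goes into.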

Forth interpreters include a command-line interpretation mode, which is made easier because of the i-code interpretation. 

(In a sense, the interpreter mode is much like a directly interpreted BASIC interpreter with a giant case statement, where new definitions implicitly become part of the giant case statement.)

The built-in i-codes include commands that can be invoked as post-fix calculator commands, directly from the command-line.

The above example, entered at the command line of gforth, which is a Forth that knows floating-point, would look like this:

4.0e0   1.0e0   fatan   f*   f.

(giving the result 3.14159265358979.)

Uh, yes, I know, in terms of readability, that only looks marginally improved over 6809 assembler source. Maybe.

Lemme 'splain s'more!

That's because Forth is, essentially, the i-code of its VM. Not a byte-code VM, but an address-word-code VM.

Forth has a postfix grammar, something like the RPN of Hewlett Packard calculators. So the first two text blobs are 4.0 x 10⁰ and 1.0 x 10⁰ in a modified scientific notation, pushed on the floating point stack in that order. The word fatan is the floating point arctangent function, which operates on the topmost item on the floating point stack, 1.0. It returns the result -- 0.785398163397448, or a quarter of π -- in the place of the 1.0e0.

'F*' is not a swear word, it's the floating point multiply, and it multiplies the top two items on the floating point stack, resulting in (an approximation of) π. 'F.'  prints the top floating point number to the current output device, in this case, the screen.

The stack, in combination with the post-fix notation, allows most of the symbols the user types in from the command-line to have a one-to-one correspondence with the i-codes themselves.

The 6809 and 68000 have an interesting quality in common (almost, but not quite, shared by the 8086) -- both have instruction sets that include op-codes that can be mapped one-to-one or one-to-two with a significant subset of the Forth primitives.

There have been other processors designed specifically to be mapped one-to-one with all the important Forth leaf-level primitives. The Forth Interest Group lists a few of the current ones on one of their web pages.

Thus, Forth is a bit unusual, even among byte-code and i-code languages. It partakes of both ends of the spectrum, and of the middle, all at once, being both interpreted at the text level and compiled at the machine-code level, as well as operating at the intermediate level (although not at all levels at once in every implementation).

Now, before we think we have covered the bases, we should note that all modern compilers compile to an intermediate form, as a means of separating the handling of the source language from the handling of the target CPU.

Forth might actually define one of the possible intermediate forms a compiler might compile to, if the compiler compiles to a split stack run-time model instead of a linked-list-of-interleaved-frames-in-a-single-stack-model. 

(Note here that the run-time model of compiled languages is essentially a close corollary of a virtual machine.)

An interleaved single-stack model mixes local variables, parameters, and return addresses on the same stack. This means that a called function can walk all over both the link to the calling frame and the address in the calling code to return to, if the programmer is only a little bit careless. When that happens, the program usually completely loses track of what it's doing, and may begin doing many kinds of random and potentially damaging things.

This is where most of the stack vulnerabilities and stack attack vectors that you hear about in software come from.
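If you want to see the accident in miniature, here is a deliberately careless sketch in C -- mine, not anyone's production code. The local buffer and the saved return address share one stack, so copying an overlong string runs past the buffer into the frame link and return address. (Modern compilers and operating systems add stack guards, so on a current machine this will probably just abort instead of wandering off.)

/* A deliberately broken sketch of a single-stack overwrite,
** for illustration only.  Do not do this.
*/

#include <stdio.h>
#include <string.h>

static void careless( const char *input )
{
  char buf[ 8 ];             /* lives on the same stack as the return address */
  strcpy( buf, input );      /* no bounds check: a long input overwrites
                             ** whatever sits above buf in the frame          */
  printf( "%s\n", buf );
}

int main( int argc, char *argv[] )
{
  careless( "this string is much longer than eight bytes" );
  return 0;                  /* we may never get back here */
}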

Some engineers consider the single stack easier to manage and maintain than a split stack. You only have to ask the operating system to allocate one stack region, and you only have to worry about overflow and underflow in that region.

But interleaving the return addresses with the parameters and local variables comes at the cost of having to worry about frame bounds in every function, in addition to the cost of maintaining the frame.

With a (correctly) split stack, there is no need to maintain frames in most run-times because they maintain themselves, and it takes more than carelessness to overwrite the return address (and optional frame pointer for those languages which need one).
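Here is the same careless copy reworked as a minimal split-stack sketch -- my own illustration, and not code from the splitstack-runtimelib project linked a little further down. The buffer lives in a separately allocated data stack, so the same sloppiness scribbles over other data, which is still a bug, but it cannot reach a return address on the CPU's own stack:

/* A toy split-stack sketch, for illustration only: parameters and
** buffers go on a separately allocated data stack, return addresses
** stay on the CPU's own stack, out of reach of this kind of overrun.
*/

#include <stdio.h>
#include <string.h>

static char dstack[ 4096 ];           /* separately allocated data stack */
static char *dsp = dstack;

static void careless( const char *input )
{
  char *buf = dsp;                    /* "allocate" eight bytes ...      */
  dsp += 8;                           /* ... on the data stack           */
  strcpy( buf, input );               /* still sloppy, but it scribbles
                                      ** inside dstack, not over a
                                      ** return address                  */
  printf( "%s\n", buf );
  dsp = buf;                          /* "free" the local buffer         */
}

int main( int argc, char *argv[] )
{
  careless( "this string is much longer than eight bytes" );
  return 0;                           /* the return address was never in danger */
}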

Given this, one might expect that most modern run-times would prefer the problems of maintaining a split stack over the problems of maintaining a single stack. 

But memory was tight in the 1980s (or so we thought), and the single stack was the model the industry grew up with. (Maybe you can tell, but I believe this was a serious error.)

(By the way, this is one of the key places where the 8086 misses being able to implement a near one-to-one mapping -- segmentation biases it towards the single-stack model. BP shares a segment register with SP. If BP had its own segment register, you could put the return address stack in its own somewhat well-protected segment, but it doesn't, so you can't.)

Especially with CPUs that map well to the Forth model, Forth interpreters can be designed to compile to machine code, and run without an intermediary virtual machine. [JMR202010222211: I'm actually working -- a little at a time -- on a project that will do that, here: https://osdn.net/projects/splitstack-runtimelib/.]

One more topic that needs to be explicitly addressed: most languages have only limited ability to extend their symbol tables, especially at run-time. Most text-only interpreters place severe limits on extending their symbol tables -- in some cases just ten possible user functions named fn0 to fn9, in some cases no extensions at all.

I've implied this above, but, by design, Forth interpreters do not have such limits. Symbols defined by the programmer have pretty much equal standing with pre-defined symbols. This makes Forth even less like most interpreted languages.
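Mechanically, that equal standing can be as simple as the lookup sketch near the top of this rant, with one addition: the dictionary grows at run-time, and the search starts from the newest entry. Here is a minimal sketch in C -- mine, only a cartoon of a real Forth dictionary:

/* A toy growable dictionary, for illustration only: a user definition
** added at run-time is found by the same search as a built-in.
*/

#include <stdio.h>
#include <string.h>

typedef void (*word)( void );

struct entry { char name[ 16 ]; word fn; };

static struct entry dict[ 64 ];        /* the dictionary */
static int latest = 0;

static void define( const char *name, word fn )
{
  strncpy( dict[ latest ].name, name, 15 );  /* entries start zeroed, so
                                             ** the name stays terminated */
  dict[ latest ].fn = fn;
  ++latest;
}

static word find( const char *name )    /* search newest first, as Forth does */
{
  int i;
  for ( i = latest - 1; i >= 0; --i )
  {
    if ( strcmp( dict[ i ].name, name ) == 0 )
      return dict[ i ].fn;
  }
  return NULL;
}

static void hello( void ) { printf( "HELLO \n" ); }

int main( int argc, char *argv[] )
{
  define( "GREET", hello );             /* a "user-defined" word ...      */
  word w = find( "GREET" );             /* ... found just like a built-in */
  if ( w != NULL )
  {
    w();
  }
  return 0;
}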

Yes, Forth is often an interpreted language, but not like other interpreted languages.

And it's after midnight here, and I keep drifting off, so this rant ends here.

[JMR202010251351:

Discussion of threading added in a separate post: https://defining-computers.blogspot.com/2020/10/forth-threading-is-not-process-threads.html, also linked above.

]
