In the early days of PC-compatible computers, MS-DOS came with GW-BASIC, a BASIC interpreter with many built-in commands for graphics, event handling and file access. We had commands like:
1000 ON KEY(1) GOSUB 100: REM call a subroutine when F1 key is pressed
1010 LINE (100,200)-(300,400): REM draw a line on the screen
1020 OPEN "myfile.txt" FOR INPUT AS #1 : REM speaks for itself
1030 PRINT USING "####.##";A : REM print number with specific formatting
1040 PRINT #1,"The numbers are:";A,B: REM print to a file, different parameter separators.
Each of these commands had its own special syntax. Apart from ON KEY we also had ON PEN to specify an action for the light pen, a truly forgotten device that nevertheless got its own keyword in the syntax. The parentheses around the end points in the LINE command were part of the syntax, and the “-” did not mean subtraction; it was there to suggest from..to. The PRINT command had an optional USING part, but also an optional “#” part to specify a file, and different separators between parameters: “;” to specify that the next parameter had to be printed immediately after the previous one and “,” to specify that it had to be printed at the next “tab stop”.
At the same time, GW-BASIC did not allow you to create new procedures, only the humble GOSUB, which is in some respects below the level of assembly language. QBASIC did add named subroutines, but calls to them did not look anything like the built-in commands.
COBOL also has a very large number of commands built into the language, each with its own syntax.
These languages represent one end of the extensibility spectrum. They came with many features included, but what you got was all you would ever have.
Suffice it to say that no modern programming language comes with built-in commands to draw lines and circles or to bind keypress events to functions. If you need this type of functionality, you can load a library for it.
Pascal
Pascal is a nice step up from the horrors of GW-BASIC. It lets you define your own functions and procedures and even your own data types. But if you look more closely at Pascal, you will find that built-in procedures can do many things that user-defined procedures can’t:
- The read and write procedures can take parameters of many different types. The same is true for some other built-in procedures, like reset, rewrite, new and dispose.
- The read and write procedures can take a variable number of parameters.
- The write procedure has special formatting syntax with colon characters to describe how a number should be printed, like:
write(a:12:3);
Pascal clearly has its I/O functions baked into the language, using many privileged procedures.
Small versus Big Languages
Almost every modern programming language keeps its I/O functionality outside the core language. Any I/O functionality in the standard library could also be implemented in a library written in the language itself. Programs in those languages can run on embedded systems that do not have terminal or file I/O, but very different I/O functionality instead.
C is a small language, but flexible enough to write its own I/O library in C itself. Functions from stdio.h, such as fopen and fread, are ordinary functions that you can just write in C. Even the printf function, with its variable number and types of parameters, can be written in C, though it requires some trickery. Better yet, heap allocation functions like malloc and free can just be written in C.
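To make the "trickery" a bit more concrete, here is a minimal sketch of a printf-like function written in plain C with the standard stdarg.h macros. The name mini_format and its helpers are hypothetical, it writes into a caller-supplied buffer instead of a terminal, and it only understands %d, %s and %%, so it is an illustration of the mechanism rather than a real printf.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <string.h>

/* Append one character if there is still room (one byte is kept for '\0'). */
static void put_char(char *buf, size_t size, size_t *pos, char c)
{
    if (*pos + 1 < size)
        buf[(*pos)++] = c;
}

/* Append a signed decimal integer, most significant digit first. */
static void put_int(char *buf, size_t size, size_t *pos, int value)
{
    char digits[16];
    size_t n = 0;
    /* Negate in unsigned arithmetic so the most negative int also works. */
    unsigned int u = (value < 0) ? 0u - (unsigned int)value
                                 : (unsigned int)value;

    if (value < 0)
        put_char(buf, size, pos, '-');
    do {
        digits[n++] = (char)('0' + u % 10);
        u /= 10;
    } while (u != 0);
    while (n > 0)
        put_char(buf, size, pos, digits[--n]);
}

/* Hypothetical mini_format: a tiny printf-like formatter (assumption:
 * only %d, %s and %% are supported). Returns the number of characters
 * stored, not counting the terminating '\0'. */
size_t mini_format(char *buf, size_t size, const char *fmt, ...)
{
    va_list args;
    size_t pos = 0;

    assert(buf != NULL && size > 0);
    va_start(args, fmt);
    for (; *fmt != '\0'; fmt++) {
        if (*fmt != '%') {
            put_char(buf, size, &pos, *fmt);
            continue;
        }
        fmt++;                       /* look at the conversion character */
        if (*fmt == 'd') {
            put_int(buf, size, &pos, va_arg(args, int));
        } else if (*fmt == 's') {
            const char *s = va_arg(args, const char *);
            size_t len = strlen(s);
            for (size_t i = 0; i < len; i++)
                put_char(buf, size, &pos, s[i]);
        } else if (*fmt == '%') {
            put_char(buf, size, &pos, '%');
        } else if (*fmt == '\0') {
            break;                   /* lone '%' at the end of the format */
        }
        /* any other conversion character is silently skipped */
    }
    va_end(args);
    buf[pos] = '\0';
    return pos;
}
```

With a 32-byte buffer buf, a call such as mini_format(buf, sizeof buf, "%s=%d", "x", 42) leaves the string x=42 in buf. Nothing here needs compiler privileges: the function is as ordinary as any other C function, which is exactly what a Pascal user could not do with write.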
If you write code in C and don’t call any functions outside of your program, the resulting program will include very few external functions when it is linked. If you are on a RISC system without integer division, you get an integer division function linked with your program and you may get memcpy or its equivalent when you assign struct variables to one another. Software floating point functions are another thing you might get if your CPU has no hardware floating point and your program uses floating point. You can use C to write operating system kernels, boot loaders, embedded firmware, real-time applications and more.
Zig, C3 and Odin are similarly small languages that don’t force you to link with an excessively sized runtime library.
On the other hand, if your language has a garbage collector, your compiled programs will depend on a large piece of external code, the garbage collector itself. It takes away the control that you need when you write an operating system kernel or a real-time embedded application.
C does not have all the extension features that exist in programming languages. It has no operator overloading, no generics, no true modules and the preprocessor macros are very crude by today’s standards. A language like C3 has more extensibility features in that respect, even though this language is still not very big.
Operator overloading makes a language more complex, but at the same time, it allows you to add data types like matrices, vectors, complex numbers and arbitrary size integers later, complete with algebraic operators, so we do not have to choose between including them in the base language (like Odin does) or not having the convenience of algebraic expressions with these types at all.
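As a small illustration of what you give up without operator overloading, here is a sketch of a user-defined vector type in C. The Vec2 type and the vec2_ functions are hypothetical names chosen for this example; the point is only that every algebraic operation has to be spelled out as a named function call.

```c
/* Hypothetical 2D vector type; C has nothing like it built in. */
typedef struct {
    double x;
    double y;
} Vec2;

/* Component-wise sum of two vectors. */
static Vec2 vec2_add(Vec2 a, Vec2 b)
{
    return (Vec2){ a.x + b.x, a.y + b.y };
}

/* Multiply a vector by a scalar. */
static Vec2 vec2_scale(Vec2 v, double k)
{
    return (Vec2){ v.x * k, v.y * k };
}

/* Midpoint of a and b. With operator overloading this could read
 * (a + b) * 0.5; in C it becomes nested function calls. */
Vec2 vec2_midpoint(Vec2 a, Vec2 b)
{
    return vec2_scale(vec2_add(a, b), 0.5);
}
```

For two or three operations this is tolerable, but longer formulas quickly turn into deeply nested calls, which is the convenience argument for overloading made above.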
Ada, C++, Java, Go, Rust and D are much larger languages than, for example, C. Some of these have a garbage collector; Rust achieves memory safety using extensive compiler checks. Some of these languages support exceptions, so an error can cause the program to unwind the stack through many call levels and then return to the level at which the exception is caught.
While Perl has hash tables and array lists as built-in types, most modern languages have the extensibility features to define them in their standard libraries, without them being necessary in the core language. Even a small language can have these features at its disposal, if it has enough extensibility features to add them.
The Upper End of the Spectrum
In terms of extensibility, LISP is probably the best language. While some languages brag about object-oriented features, LISP has always been extensible enough to add them itself.
LISP has macros that can rewrite programs in an arbitrary way. You can add new control structures, like more advanced loops or switch statements. Very few programming languages have that ability.
LISP is one of very few languages that has no infix expressions. That makes its parser very simple. Once parsed, the LISP program is just a bunch of nested lists, a direct representation of the parse tree.
An honourable mention goes to FORTH. Like LISP, it sacrifices infix expressions, but unlike LISP, it does not parse an expression into a nested list; instead it directly executes each token and lets the stack do all the work. While LISP uses prefix notation with parentheses to indicate nesting, FORTH uses postfix notation without parentheses. FORTH can also build new control structures. FORTH taps into some of the unique extensibility features that LISP also has, but its implementation is considerably simpler. It could work on machines with very little memory and it would execute much faster than interpreted BASIC or LISP. I should devote a separate post to this little language.
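The "execute each token and let the stack do the work" idea can be sketched in a few lines of C. This is not a FORTH implementation: the function eval_postfix is a hypothetical helper that only knows non-negative integer literals and the words +, - and *, with no dictionary and no way to define new words. It does show why no parse tree is needed, because every token is acted upon the moment it is read.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

#define STACK_MAX 32

/* Hypothetical helper: evaluate space-separated postfix tokens.
 * Sketch only: errors (stack underflow, unknown word) abort via assert. */
long eval_postfix(const char *src)
{
    long stack[STACK_MAX];
    int depth = 0;
    const char *p = src;

    while (*p != '\0') {
        if (isspace((unsigned char)*p)) {
            p++;                          /* skip token separators */
            continue;
        }
        if (isdigit((unsigned char)*p)) {
            /* A number is "executed" by pushing it on the stack. */
            char *end;
            long value = strtol(p, &end, 10);
            assert(depth < STACK_MAX);
            stack[depth++] = value;
            p = end;
        } else {
            /* An operator pops two operands and pushes the result. */
            long a, b;
            assert(depth >= 2);
            b = stack[--depth];
            a = stack[--depth];
            switch (*p) {
            case '+': stack[depth++] = a + b; break;
            case '-': stack[depth++] = a - b; break;
            case '*': stack[depth++] = a * b; break;
            default:  assert(!"unknown word"); break;
            }
            p++;
        }
    }
    assert(depth == 1);                   /* exactly one result remains */
    return stack[0];
}
```

For example, eval_postfix("2 3 + 4 *") pushes 2 and 3, replaces them with 5, pushes 4 and finally leaves 20 on the stack, the same result as the infix expression (2 + 3) * 4 but without parentheses or precedence rules.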