  • What was wrong with Pascal?

    Pascal was developed in 1970 by Niklaus Wirth. It was primarily an educational programming language that taught students structured programming and data structures.

    In its standard form, Pascal had some limitations that made it less suitable for real-world applications. Later languages like Modula-2, Oberon and Ada were better for writing larger real-world programs, but they never had the success that Pascal once had.

    Brian Kernighan already pointed out Pascal’s limitations in his classic paper “Why Pascal Is Not My Favorite Programming Language” https://www.cs.virginia.edu/~evans/cs655/readings/bwk-on-pascal.html. More practical implementations of Pascal added features that mitigated those limitations, but these features were not standard. However, the situation was not half as bad as with BASIC, where even the basic syntax of procedure definitions differed wildly among dialects.

    Pascal on Microcomputers

    One of the first implementations of Pascal on microcomputers was UCSD Pascal. It was popular on the old Apple II and on many other machines. As it used P-code (comparable to Java byte code) instead of native compilation, it was not the fastest Pascal system on the planet. It did not implement the full standard, but it did add useful features, like separately compiled modules, usable file access and a usable string type.

    Turbo Pascal came out in 1983, both for CP/M and for MS-DOS. It contained an editor and a compiler in one large program and you could edit, compile and run your programs from that program. It was a very basic IDE. The compiler was extremely fast for the time and it compiled to native machine code. Turbo Pascal had usable file access and a usable string type too, plus some more useful features, like bitwise operations. Modules (units) came in version 4 of Turbo Pascal.

    Pascal was very popular in the 1980s and much of the early Apple Macintosh software was written in Pascal. In the 1990s, Turbo Pascal evolved into Delphi, adding C++ style object-oriented programming and GUI development.

    But What Was Wrong with it?

    I should not repeat everything Brian Kernighan already stated in his 1981 paper, but to sum some things up (not all of them are mentioned in that paper):

    • No default clause with the CASE statement. If the selector matched none of the alternatives, behaviour was undefined according to the standard. Most if not all implementations did add an ELSE or OTHERWISE clause to CASE.
    • An array’s length is part of its type, so if you want to implement a set of functions acting on arrays, each function can only act on arrays of one size. For example, a function to multiply matrices could not take matrices of different sizes. A later version of the Pascal standard added conformant array parameters, but few if any Pascal compilers implemented them.
    • What’s true for arrays in general is also true for strings. In standard Pascal, a string array type has a predefined length and it is impossible to create functions that work on strings of any length. In many cases the programmer had to pad string literals with spaces to give them all the same length. UCSD Pascal and Turbo Pascal did add a usable string type.
    • Only a limited number of operator precedence levels. As a result, the AND and OR operators had a higher precedence than the relational operators like ‘<=’ or ‘=’, so comparisons combined with AND or OR had to be put in parentheses.
    • Variables of the main program had to be declared far away from the main program itself, with all procedures and functions in between.
    • No usable file access in standard Pascal. For example a program could not specify a name of a file to open.
    • A program was a stand-alone entity. Pascal was not unique in this. For example Algol-68 had the same flaw. UCSD Pascal and later versions of Turbo Pascal did add modules (that were called units).
    • Pascal has no exception mechanism. Certain errors simply cause the program to abort. Turbo Pascal came with a mini-spreadsheet program. The program checked against division by zero, but when a program got a floating point overflow because you multiplied two huge numbers, it was game over for the program and you lost your work.
    • You could not write procedures or functions that took a variable number of parameters or parameters of different types. Yet the built-in procedures read and write could take parameters of different type and a variable number of parameters. You could not write your own procedures that had the same flexibility as the built-in ones.
    • And one of my pet peeves. Identifiers in Pascal consist of letters and digits only. No spaces or underscores are allowed. While in German it is normal to write composite words together, in English it is not. Therefore they ended up putting uppercase letters in the middle of names to mark the start of the constituent words, leading to the ugly camelCase convention. For centuries, words had only one capital letter at the start (if they were not written in all uppercase). Now this camelCase convention has proliferated into words and brand names like eBook and iPhone. I really prefer the snake_case convention, but Pascal does not support it. As a saving grace, Pascal is case-insensitive.
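
    The precedence problem from the list above can be made visible with a toy expression parser. This is only a sketch in Python, assuming the usual Pascal levels (NOT binds tightest, then the multiplying operators including AND, then the adding operators including OR, and the relational operators last). The names LEVELS, tokenize and parse are made up for this illustration.

```python
import re

# Pascal's binary operators, from lowest to highest precedence.
# The unary NOT binds tighter than all of them.
LEVELS = [
    {"=", "<>", "<", "<=", ">", ">="},   # relational operators
    {"+", "-", "or"},                     # adding operators
    {"*", "/", "div", "mod", "and"},      # multiplying operators
]

TOKEN = re.compile(r"\s*(<=|>=|<>|[A-Za-z_]\w*|\d+|[-+*/=<>()])")

def tokenize(text):
    # Pascal is case-insensitive, so everything is folded to lowercase.
    return [t.lower() for t in TOKEN.findall(text)]

def parse(tokens, level=0):
    """Return the expression as a fully parenthesised string."""
    if level == len(LEVELS):
        return parse_factor(tokens)
    left = parse(tokens, level + 1)
    # Simplification: real Pascal allows only one relational operator per
    # expression; this loop accepts a chain so the grouping stays visible.
    while tokens and tokens[0] in LEVELS[level]:
        op = tokens.pop(0)
        right = parse(tokens, level + 1)
        left = f"({left} {op} {right})"
    return left

def parse_factor(tokens):
    tok = tokens.pop(0)
    if tok == "not":
        return f"(not {parse_factor(tokens)})"
    if tok == "(":
        inner = parse(tokens)
        tokens.pop(0)  # drop the closing ")"
        return inner
    return tok

print(parse(tokenize("a < b and c < d")))      # ((a < (b and c)) < d)
print(parse(tokenize("(a < b) and (c < d)")))  # ((a < b) and (c < d))
```

    The first line groups b and c together, which a Pascal compiler would reject for integer operands, so the parentheses in the second line are mandatory.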

    And the Good Things

    Of course Pascal had some good things as well:

    • In 1557, the Welsh mathematician Robert Recorde invented the equals sign. Remember, the symbol ‘=’ means equals, not assignment. Pascal uses ‘:=’ for assignment and ‘=’ for equals, like Modula-2 and Ada, but unlike C, Java, Python and anything else introduced after 1990. The C syntax has led to countless bugs in C programs because ‘=’ was written instead of ‘==’ in a conditional expression.
    • The syntax is generally easier to read than that of C. It also has fewer pitfalls.
    • The use of GOTO was discouraged, as it should be. Structured programming was the norm in Pascal.
    • Pascal had recursive procedures and functions, like any language should have, but FORTRAN didn’t.
    • A true Boolean type (not in C until C99), real enumerated types (not fancy names for integer constants like in C) and a SET data type.
    • The string data type in Pascal (those dialects that had it) was a lot more memory-safe than C strings. A string variable recorded its maximum size and the actual string length of the currently stored string. Any function that altered a string variable, checked the maximum string size.
    • As opposed to C, Pascal procedures were aware of the sizes of any arrays passed to them, making them more memory-safe. Letting arrays ‘decay’ into pointers when passed as a parameter was really a bad idea.
    • Compilers were comparatively fast and they required little memory (compared to compilers of other languages).
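
    The string point from the list above can be sketched in a few lines of Python. This is a rough model, not the actual memory layout of any particular compiler: it assumes a length byte followed by the characters and it assumes that an assignment that is too long is silently truncated, as Turbo Pascal did. The class name ShortString and its methods are invented for the example.

```python
class ShortString:
    """Rough model of a Turbo Pascal style string[N] variable."""

    def __init__(self, max_len):
        if not 0 < max_len <= 255:
            raise ValueError("the length must fit in one byte")
        self.max_len = max_len
        # Byte 0 holds the current length, bytes 1..max_len the characters.
        self.buf = bytearray(max_len + 1)

    def assign(self, text):
        # Cut the text off at the maximum size, so the buffer is never overrun.
        data = text.encode("ascii")[: self.max_len]
        self.buf[0] = len(data)
        self.buf[1 : 1 + len(data)] = data

    def value(self):
        return self.buf[1 : 1 + self.buf[0]].decode("ascii")


s = ShortString(5)
s.assign("HELLO WORLD")
print(s.value())   # HELLO
```

    A C function like strcpy has no such limit to check against, because a plain char pointer carries neither the capacity nor the current length.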

  • What was wrong with BASIC?

    The BASIC programming language was introduced in 1964 at Dartmouth College by John Kemeny and Thomas Kurtz. It ran on one of the first time-sharing systems, where students could sit behind a teletype, type in their programs and start running them immediately. In those days it was far more common that you had to type your program on punched cards, hand it over to the computer centre and get your printout back after a few hours.

    BASIC was easy to learn and it was useful to university students who were not computer specialists. It supported many mathematical functions and early on, it supported matrix operations, a feature that got lost when BASIC was later ported to microcomputers. And what else got lost?

    No Structured Programming

    The earliest versions of Dartmouth BASIC did not support structured programming, other than the FOR..NEXT loop. BASIC used line numbers for two different purposes:

    • To support editing of programs. Terminals were teletypes that printed everything you typed on paper. There was no computer screen where you could move the cursor around and start editing code anywhere in your program. The internal editor kept the program ordered by line number. If you wanted to insert a line into a program, you simply entered a line with an intermediate line number (which is why you started your program with line numbers that incremented by 10, so there were intermediate numbers available). By entering a line with an existing number, you replaced that line. By entering a line number by itself, you deleted that line.
    • As labels for GOTO and GOSUB.
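
    The editing rules in the first point are simple enough to sketch in Python. This is an illustration of the principle only, assuming the program is kept as a mapping from line number to text. The function names enter and listing are invented here and no real BASIC stored its program this way.

```python
def enter(program, typed):
    """Apply one typed line to a program kept as {line_number: text}."""
    number, _, text = typed.strip().partition(" ")
    if text.strip():
        program[int(number)] = text.strip()   # insert a new line or replace one
    else:
        program.pop(int(number), None)        # a bare line number deletes

def listing(program):
    """Return the program ordered by line number, like LIST would show it."""
    return [f"{n} {program[n]}" for n in sorted(program)]


program = {}
for typed in ['10 PRINT "HELLO"', "20 GOTO 10", '15 PRINT "WORLD"',
              '10 PRINT "HI"', "20"]:
    enter(program, typed)

print(listing(program))   # ['10 PRINT "HI"', '15 PRINT "WORLD"']
```

    Line 15 ends up between 10 and 20 although it was typed later, line 10 is replaced by its second version and line 20 is gone.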

    At least in FORTRAN, you had labels only at lines to which you wanted to jump. In BASIC, every line could be jumped to. This may be somewhat easier to learn, but it made programs harder to maintain.

    The first version of FORTRAN had no subroutines at all, but all versions after that had named subroutines, with local variables and named parameters. None of that existed in early BASIC. In BASIC you had GOSUB to a line number. With Assembly language you could at least give your subroutines meaningful names. This was not the case with early versions of BASIC (including early versions for microcomputers). That made BASIC feel like an even lower level language than Assembly language in that respect. If you renumbered your program to make the line numbers less haphazard, your subroutines would change line numbers.

    The very earliest BASIC could only specify line numbers in an IF statement, for example: IF A<B THEN 30 ELSE 50

    All versions of Microsoft BASIC I know did allow you to execute one or more statements conditionally, so you did not need line numbers and implied GOTO statements for all IF statements, for example: IF A<B THEN PRINT "TOO LOW": C=C+1

    Not all versions of Microsoft BASIC had ELSE though. And ANSI Minimal BASIC only supported the variant with line numbers, as did TI BASIC.

    Too Much Variation

    Microsoft was the main developer of BASIC implementations for microcomputers. But Sinclair, Acorn and TI had their own implementations of BASIC. There was very little that the BASIC versions had in common. Some only supported single-letter variable names (Sinclair, except for numeric scalar variables), others allowed names of arbitrary length. Most had string variables that allowed strings of variable length, but in some versions of BASIC all strings in a string array had the same length (Sinclair).

    Even within Microsoft itself, there was a wild difference between the commands and functions supported. For example, some BASIC versions supported user-defined functions, but only with numeric values and a single numeric parameter. Others supported both string and numeric functions with an arbitrary number of parameters of any type. Some BASIC versions did not have user-defined functions at all. In some versions of Microsoft BASIC you had long variable names with all characters significant, but in other versions, only the first two characters of a variable name counted. So the variables DEBT and DECLARED would be the same. And if you dared to use a variable name MORTGAGE, the parser would get confused and tokenise the letters OR into a single byte (the OR operator). This happened only in some versions of Microsoft BASIC (for example MSX BASIC), but not in others (for example GW-BASIC on MS-DOS). On some computer systems, the disk BASIC was not even compatible with the cassette BASIC on the same machine.
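
    Both variable name pitfalls are easy to reproduce in a small Python sketch. It assumes a tokeniser that replaces a keyword wherever it finds one, without checking whether it sits inside a longer name. The token values in KEYWORDS are made up for the illustration (the real tables differed between BASIC versions), and crunch_naive and significant are hypothetical names.

```python
# Hypothetical token values; real tables differed between BASIC versions.
KEYWORDS = {"PRINT": 0x91, "GOTO": 0x89, "AND": 0xF6, "OR": 0xF7}

def crunch_naive(line):
    """Replace keywords wherever they occur, even in the middle of a name."""
    out, i = bytearray(), 0
    while i < len(line):
        for word, token in KEYWORDS.items():
            if line.startswith(word, i):
                out.append(token)        # the keyword shrinks to one byte
                i += len(word)
                break
        else:
            out.append(ord(line[i]))     # any other character is kept as-is
            i += 1
    return bytes(out)

def significant(name, count=2):
    """The part of a variable name that the interpreter actually compares."""
    return name[:count]


print(crunch_naive("MORTGAGE"))                        # b'M\xf7TGAGE'
print(significant("DEBT") == significant("DECLARED"))  # True
```

    The name MORTGAGE is stored as M, the OR token and TGAGE, so the interpreter later sees a variable M followed by an operator. A tokeniser that first collects the whole identifier before looking for keywords avoids this.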

    Acorn introduced BBC BASIC in 1981. It had structured loops like REPEAT..UNTIL, it had named procedures and named multi-line functions, while it was still largely compatible with the then current versions of Microsoft BASIC. Later versions of BBC BASIC added a full block-structured IF-THEN-ELSE-ENDIF, a proper WHILE loop and more. Procedures and functions had parameters and local variables.

    Unfortunately, BBC BASIC completely ignored the ANSI standard for full BASIC. Microsoft did not. When they introduced QuickBASIC and later QBasic, these had named subroutines (with local variables and parameters) and the usual block-structured constructs, but they were completely different from those of BBC BASIC.

    You really can’t call two versions of BASIC the same language if the basic constructs for named subroutines or procedures are so different. You might as well call Ada and Pascal the same language.

    Specialised Syntax

    In BASIC, each command tends to have its own unique syntax. For example, the PRINT statement uses commas and semicolons between parameters to determine how the output will be formatted. A numeric parameter starting with # specifies a file to print to. Many BASIC versions have a USING parameter to allow formatted output. We have funny specialised syntaxes in Microsoft BASIC for OPEN (admittedly quite readable), like: OPEN "name" FOR INPUT AS #1

    And we have funny little syntaxes for line drawing, like LINE (100,100)-(200,150)

    User-defined subroutines always have to be called with CALL, like CALL MYSUB(10,20)

    They are not allowed to look like normal BASIC commands and they can certainly not have the same wild syntax variations as built-in commands.

    Pascal commits the same sin to a lesser degree. Pascal is very rigid in not allowing procedures with a variable number of parameters or parameters of different types. But the built-in procedures read and write can have a variable number of parameters and parameters of many different types. Add to this the weird formatting syntax like write(a:12:7); to specify that the number A must be printed with 12 characters and 7 digits after the decimal point. But other than the colon formatting syntax, calls to read and write look like ordinary procedure calls.
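
    For readers who never used it, the colon syntax corresponds to a field width and a number of decimals. A rough Python equivalent, assuming the value is a real number printed in fixed-point notation, looks like this (pascal_write is a made-up helper name):

```python
def pascal_write(value, width, decimals):
    """Mimic Pascal's write(value:width:decimals) for a real number."""
    return format(value, f"{width}.{decimals}f")


print(repr(pascal_write(2.5, 12, 7)))   # '   2.5000000'
```

    The result is right-aligned in 12 characters with 7 digits after the decimal point, just like write(a:12:7). In Python the width and precision are ordinary arguments, whereas in Pascal they are part of the syntax of the call.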

    C, on the other hand, has no specialised syntax for printf or fopen. These are just ordinary functions. The printf function does have a variable number of arguments and it is somewhat of a challenge to implement such a function, but it can be done. Details of the I/O library are not baked into the syntax of the language itself.

    And the Good Things?

    Of course, BASIC did have some good things too:

    • Most versions of BASIC supported variable length strings and quite a few functions to manipulate strings. Standard Pascal lacked a variable length string type and the language went out of its way to make you suffer. You ended up counting out spaces to make all messages the same length, for example if they had to be passed as parameters. Turbo Pascal did have a usable string type, but with extensions like this, there is no standardisation among implementations.
    • Data embedded in the program. READ, DATA and RESTORE may look clumsy compared to initialised C arrays, but standard Pascal did not have a way to embed data in the program itself. You had to read it from a file or you had to type endless lines of assignment statements to initialise the elements of an array.
    • Low memory usage. This was key in early microcomputers. A simple Microsoft BASIC fit into 8 kilobytes of ROM and programs were stored in RAM in such a way that every keyword occupied one byte. There was no separate storage of source code and parsed byte code. The LIST command showed you the program as you had typed it, with all byte tokens converted back to full keywords and all binary representations of line numbers printed in decimal. Interpretation was usually rather slow, but it got done with very little memory.
    • Floating point support and a full set of mathematical functions. Apart from integer-only BASICs and some very stripped-down versions, you had full trigonometric and logarithmic functions.

  • Cross-border AM radio: the big thing that’s gone missing

    Back in the early 1980s, The Netherlands had four national radio stations and three of these were on mediumwave. They could be heard far outside our national borders. On the other hand, two Dutch language stations from Belgium could be heard throughout The Netherlands, as well as two French language stations from that country. That was not all you could hear though: there were a bunch of German stations and some English stations to be heard as well. BBC World Service was booming in on 648 kHz. During some years, we had some good pirate stations too.

    Much radio listening still happened on mediumwave, as not all radios had the FM band back then and one of the national radio stations in The Netherlands was on mediumwave only. Starting in 1985, we got an additional national station (Hilversum 5) and it was on mediumwave only, but it took the frequency of Hilversum 1, which became FM-only.

    If your radio had longwave too, you got a bunch of foreign stations: from France, Luxembourg, the UK, Germany and some other countries. After sunset, the mediumwave band came to life and if you took the trouble, you could hear radio stations from all over Europe. These were some of the main national programmes of those countries, hence it was quality stuff. BBC Radio 1 played the latest British pop songs, weeks or sometimes months before they were allowed to enter our country. Radio Moscow once had a Dutch language service on mediumwave, one of the few international broadcasters to have one, along with South Africa (that had an even more objectionable regime, and its service was on the 16 meter band on shortwave). But at night the mediumwave band was indeed used as another shortwave band, on both sides of the Iron Curtain.

    If you had a good radio and you were skilled with the rotating ferrite bar antenna, you could often pick up two different stations on the same channel. Hearing stations outside Europe was a challenge though, as there were many more European stations occupying those frequencies. Today it’s relatively easy to pick up North Africa for example.

    And of course many, many countries had a presence on shortwave. Back in the day, the 49 meter band was chock full of stations, whatever time of the day you listened. Most European countries were there, all the time. Higher shortwave bands, like 25m, 19m and 16m, gave you the more distant stations. Not everybody listened to shortwave though. But everybody got exposed to mediumwave during day-to-day listening. And everybody occasionally heard foreign stations, if only unintentionally.

    We are less exposed to radio stations from other European countries than we were a few decades ago. In 2003, the FM band in The Netherlands was reorganised and it became very overpopulated, reducing the opportunities to hear foreign stations on that band. DAB has a far shorter reach across the border than FM used to have. We have internet radio now, but some countries (in particular the UK) restrict their internet radio streams to listeners within their own country. With true broadcast radio, there’s nothing between you and the station you want to hear that can be controlled by governments. You can add jamming stations, but you can’t take radio propagation away. Compare that to the many links that exist between an internet radio station and you.

    Exposure to radio stations from other European countries was a good thing, even if you were not intentionally listening to them. It created an awareness of being part of a larger Europe. In most places in The Netherlands we have access to over 50 stations on DAB+, but many of these are non-stop computer generated playlist stations without a soul. BBC World Service is actually available on DAB+ in The Netherlands, which is a good thing. But this will only last as long as somebody pays to have it there. And we should have German and French voices as well, plus something from Belgium. Some of those more-of-the-same popular music stations can really be missed and be exchanged for existing foreign radio stations. I know, this is not how commercial radio works. But why should everything work according to the rules of commercial radio?

    The UK is the last country with a significant presence on mediumwave in our part of Europe and it will likely stop by the end of this year. Nobody will bring mediumwave back. But maybe the EU should facilitate airing public radio broadcasts outside national borders, to replace what mediumwave used to do for us naturally. There’s spare capacity on some DAB+ multiplexes, like that of our national public broadcaster NPO. It carries 10 programmes, 2 of which will likely be axed by the end of this year. You can easily have 12 programmes with good audio. Why not get some Belgian, French or German stations there?

  • Why the Dutch keyboard layout never caught on.

    The Netherlands is one of the few countries in Europe that uses the US-QWERTY keyboard layout for computers. Before there were personal computers, we had typewriters and nearly all typewriters sold in The Netherlands had two dead keys: one dead key contained the acute and grave accents and the other contained the diaeresis and circumflex accents. When you press a dead key on a typewriter, the accent is printed, but the carriage does not advance. The next letter you type gets printed under the accent you just typed. The positions of many symbols on the keyboard were not standardised: different brands of typewriters could have the question mark in a different location. The letters always followed the QWERTY layout. Some did have a special key for the letter ij, on others you had to type the letters I and J separately.

    Typewriters in the UK and the USA typically didn’t have dead keys. If you really, really wanted to type an accented letter, you could type the apostrophe, then backspace and then the letter. The same could be done with the double quote to get something that resembled a letter with an umlaut or diaeresis. In The Netherlands we used to type a comma, then backspace and then the letter C to get a C-cedilla (ç). Typewriters in Germany had dedicated keys for Ä, Ö, Ü and ß. Typewriters in Sweden had Ä, Ö and Å, in Norway and Denmark they had Æ, Ø and Å. Most of them had dead keys for accented letters too.

    France and Belgium had the weird AZERTY layout. French typewriters typically had a single dead key for circumflex and diaeresis, the few letters with acute and grave had their own keys, as did c-cedilla, but all only in lowercase.

    When personal computers were introduced, nearly all European countries standardised on a keyboard layout that was similar to the layout of the local typewriters, except for The Netherlands, that used the US layout, without any support for accented letters. In Dutch, accented letters are infrequent, but they are still important. The IBM-PC allowed you to type arbitrary symbols by holding down the Alt-key and then typing the numeric code of the character on the numeric keypad. For example the letter ë could be had by typing Alt-1-3-7. Oddly enough, this still works in modern Windows, even though it no longer uses that character encoding at all. Yes, the old CP-437 codes are translated to whatever Unicode character is applicable.
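
    The translation from an old Alt code to a Unicode character can be checked with Python’s cp437 codec. This is a small sketch for the codes from 128 to 255 only, and the function name alt_code is made up for the example.

```python
def alt_code(number):
    """Character for Alt+<number>, assuming code page 437 (codes 128-255)."""
    return bytes([number]).decode("cp437")


print(alt_code(137))             # ë
print(hex(ord(alt_code(137))))   # 0xeb
```

    Code 137 in CP-437 ends up as U+00EB in Unicode, so the number you type and the number that is stored are no longer the same.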

    National keyboard layouts other than US-QWERTY all have one additional key left of the leftmost letter on the lower row. On QWERTY, this means left of the Z key, but on QWERTZ, it’s left of the Y and on AZERTY, it’s left of the W. In most national layouts, this key has the less-than and greater-than symbols. As there is an extra key in that position, the left shift key is much smaller than on the US keyboard. Some US keyboards (ISO layout) have this additional key too, while it is redundant for US-QWERTY. Many typists prefer the true ANSI layout with the wider shift key and the horizontal Enter key over the ISO layout with the narrow shift key and the vertical Enter key.

    National keyboard layouts other than US-QWERTY also use the right Alt key (marked as Alt-Gr) to access additional symbols. The ASCII characters @, [, ], { and } often require the Alt-Gr key. Most national keyboard layouts use the keys right of the letters for national letters (like Ä, Æ or Ñ) and/or dead keys, making fewer keys available for symbols. Only the UK keyboard layout does not have national letters or dead keys and is very similar to US-QWERTY. It swaps the @ and double-quote, it puts the £ instead of the # above the 3 and it puts the # on the key normally used for the \ symbol, which moves to the additional key left of the Z.

    Programmers prefer the straight US-QWERTY layout over any national keyboard layout, because of the easy access to square brackets and curly braces, which are used frequently in C and similar programming languages.

    Interestingly enough, The Netherlands does have a national keyboard layout. See https://en.wikipedia.org/wiki/List_of_QWERTY_keyboard_language_variants under Dutch. This keyboard has dead keys, like the old typewriters, and shuffles the symbols around quite a bit. The square brackets end up on the key left of the Z. These proper Dutch keyboards are very rare. They may have been used by government institutions and Dutch publishers, but as 99% of the users have US-QWERTY at home, they are used to it and want to use it for work too.

    The reason why the Dutch layout never caught on may be a combination of programmer preference and cost awareness. US keyboards are made in larger series, therefore they tend to be somewhat cheaper. Schools may have selected US-QWERTY because it is more practical for programming.

    In Belgium, even in the Dutch speaking part, AZERTY is still the norm. Some programmers do use QWERTY keyboards, but that’s always a special order.

    At least since the 1990s, Windows lets you configure the US-QWERTY keyboard as US International with dead keys. This changes the following:

    • The apostrophe and double quote key becomes a dead key for acute accent and diaeresis. The same goes for the caret (circumflex on shift 6) and the grave and tilde key.
    • The right Alt key gives access to many additional symbols and accented letters. Unfortunately for Dutch users, the letters ë and ï are not accessible this way, they require the dead double quote key instead.

    The US International layout with dead keys is considered the Dutch keyboard layout, but this is not the same as the real Dutch keyboard layout. The big disadvantage is that you need to type an additional space whenever you type an apostrophe or double quote (or caret, left quote or tilde). This is super annoying for programmers, but it can also get in your way when typing just text.

    Linux distributions come with another option: US International with AltGr dead keys. It differs from US international with dead keys in the following way:

    • The dead keys only become dead when you type them with AltGr (the right Alt key). When you type the apostrophe key normally, you just get the apostrophe. It only becomes the dead acute accent key when typed with AltGr.
    • The set of symbols accessible with AltGr (without dead keys) is changed somewhat. The important Dutch accented letters ë and ï are now in.

    But there are other layouts based on US-QWERTY as well, see for example: https://altgr-weur.eu/

    In Linux you can also configure a Compose key. You can use the right Control key, the right “Windows” key or the Menu key for that purpose. The disadvantage is that it requires three keystrokes to get a composed character. For example you type Compose, followed by /, followed by o to get ø. The advantage is you get access to many more symbols than with just dead keys or Alt-Gr combinations and that these symbols are mostly logical and easy to remember combinations of ASCII characters.

    Finally, Linux allows you to type any arbitrary Unicode character by first typing Ctrl-Shift-u, then the hex code of the desired character and finally Enter. For example Ctrl-Shift-u, then the letters e b, then Enter gives you ë.
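
    What the hex input and the dead keys produce can be verified in Python. This is only a sketch of the character arithmetic, not of how any input method is implemented, and the names from_hex, dead_key and DIAERESIS are invented for the example.

```python
import unicodedata

DIAERESIS = "\u0308"   # the combining diaeresis character

def from_hex(code):
    """Character for a hexadecimal Unicode code point, as typed after Ctrl-Shift-u."""
    return chr(int(code, 16))

def dead_key(accent, letter):
    """Merge a letter and a combining accent into one precomposed character."""
    return unicodedata.normalize("NFC", letter + accent)


print(from_hex("eb"))                          # ë
print(dead_key(DIAERESIS, "e"))                # ë
print(from_hex("eb") == dead_key(DIAERESIS, "e"))   # True
```

    Both routes end at the same code point U+00EB, which is also what the old Alt-1-3-7 trick delivers on Windows.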

  • Open source radio receiver projects

    Today’s single-chip radio tuner chips make it possible for hobbyists to construct decent quality broadcast receivers. One chip that has been around for 15 years or so is the Skyworks (formerly known as Silicon Labs) Si4735. This chip contains a full LW/MW/SW/FM radio tuner from antenna to audio. It requires a microcontroller to control it, for example the 8-bit AVR processor used by Arduino. There are many hobbyist and open-source designs based on this chip. The Si4735 is an SDR internally: it does all filtering and demodulation on its internal DSP. There is a custom firmware blob for it that can demodulate SSB. However, this firmware blob is not open source. The AVR sends the Si4735 its commands to tune to a specified frequency and set its other operating parameters. The AVR also controls the display and the user controls of the radio. There are many ready-made Chinese radios designed around this chip (or the Si4732) and an Arduino/AVR. This chip gives nearly continuous LW/MW/SW coverage up to 30 MHz. It is also the main part (or the final IF filter/demodulator) of nearly all world band radios currently on the market.

    Another single-chip radio tuner is the TEF-6686, designed by NXP. This chip contains a very capable FM tuner, along with LW, MW and SW. SW coverage is limited to 27 MHz (not 30) and there is no way to do SSB. As microcontrollers are now predominantly 32-bit, the TEF-6686 radios (and some newer Si473x designs too) use an ESP32 microcontroller. One such open-source design was provided by PE5PVB https://www.pe5pvb.nl. This design has found its way to numerous ready-made Chinese radios too.

    There is also the Skyworks Si4684, a complete FM/DAB+ radio on a chip. PE5PVB designed a very capable DAB+ radio around it, using this chip, an ESP32 and a colour display. As the ESP32 does slide show decoding and display, the software uses nearly all the flash of the ESP32, so the FM side of the chip cannot be accessed using the current software. Unfortunately, there are no ready-made Chinese versions of this radio yet. DAB+ is not a thing in China and in many of the markets these Chinese radios get exported to.

    As interesting as these projects are, they leave the nitty-gritty radio stuff to a closed-source DSP and the open-source part is only about user control and sometimes RDS decoding or decoding of other digital data that comes with the radio signal. The Pico Rx project is different though. See https://github.com/dawsonjon/PicoRX. There, a Raspberry Pi Pico does all the hard work of an SDR in software. It uses minimalistic front-end circuitry (a set of analogue multiplexers to act as an IQ mixer and some front-end filters) and uses the on-board circuitry of the Pico itself for the rest, including A/D converters and the I/Q oscillator signal generation. This circuit gives you a capable all-mode HF receiver (0-30 MHz), optionally with a spectrum display. All the DSP stuff is done on the Pico, which is a fairly powerful dual-core 32-bit microcontroller in its own right. The radio can run on batteries and draws little current.