C was developed in 1972 by Dennis Ritchie at Bell Labs as a systems programming language to implement the Unix operating system; Brian Kernighan later co-authored the classic book on the language with him. It was based on B, which was in turn based on BCPL, which was in turn based on CPL, a very full-featured programming language. B was the most minimalist of the series: it had only one data type, serving both as integer and as pointer (and as a sequence of characters). C added new data types to it, like separate char, short, int and long, plus pointers. Structures were added very soon. C became very popular in the 1980s, eventually overtaking Pascal.
What’s Wrong
There are a lot of things wrong with C. They have been discussed ad nauseam already, so I won’t go deep here. Here are a few of my favourites:
- Syntactic pitfalls, for example a stray semicolon after if or while, making the next statement separate from the conditional statement, even though it looks like it is controlled by it. There is also fall-through in switch statements and the dreaded single ‘=’ in a conditional where ‘==’ was intended. Even if you know to be careful with these, one in a thousand times your attention slips and a new bug is born.
- Null-terminated strings might have been a good idea in the 1970s, when every byte counted, but nowadays they are a bug magnet. It is not easy to find the length of a string, therefore it is not easy to ensure that a string will fit the buffer allocated to it. For example, before you call a function like ‘strcat’, you need two calls to ‘strlen’ to determine the lengths of the two inputs. You need to add them, and add one for the terminating null byte, before you can compare the result to the size of the destination buffer.
- When an array is passed as a parameter to a function, no information about its length is automatically passed with it. From the point of view of the called function, it is just a pointer. We say that arrays decay into pointers. If you want to pass the array length to a function, it has to be done in a separate parameter. Both the caller and the called function have to do the right thing to make it work.
- Macros are very error-prone, in particular function-like macros. You always need to surround the parameters with parentheses and put the resulting expression in parentheses as well. The expansion has to look either like a single (parenthesised) expression or like a do-while statement. C macros are not Turing-complete (which may be a good thing after all), as conditional evaluation cannot occur during expansion of a macro. There’s always m4 if you want this kind of flexibility.
- Header files are there only by convention; the language itself has no clear idea about modules and module interfaces. We end up using external tools to sort out the dependencies between object files and header files, or editing Makefiles manually.
And we are lucky that since C89 (the first ANSI standard), the full parameter list is part of the function prototype (its declaration in the header). In earlier versions this was not the case. Your header only specified the name and the return type of a function. For example, a function that took one integer as a parameter could be called with two floats and a pointer as parameters instead. The compiler had no way of knowing that the function was called in the wrong way, as it only looked at one source file at a time. A separate program called ‘lint’ was there to find inconsistencies like that.
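To make a few of the pitfalls above concrete, here is a minimal sketch of the string, array and macro cases. The names ‘concat_checked’, ‘sum_ints’ and ‘SQUARE’ are made up for this illustration; they are not part of any library, and this is just one way of writing such checks.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Function-like macro: the parameter and the whole result are both
   parenthesised, so SQUARE(a + 1) expands to ((a + 1) * (a + 1)).
   The argument is still evaluated twice, so SQUARE(i++) stays a bug. */
#define SQUARE(x) ((x) * (x))

/* Concatenate a and b into dst, which has room for dst_size bytes.
   Returns 0 on success, or -1 if the result plus the terminating null
   byte would not fit. Nothing is written on failure.
   dst must not overlap a or b. */
int concat_checked(char *dst, size_t dst_size, const char *a, const char *b)
{
    assert(dst != NULL && a != NULL && b != NULL);

    size_t len_a = strlen(a);   /* first walk over a string */
    size_t len_b = strlen(b);   /* second walk over a string */

    /* We need len_a + len_b + 1 <= dst_size. The comparison is written
       with a subtraction so that the sum itself cannot overflow. */
    if (len_a >= dst_size || len_b >= dst_size - len_a)
        return -1;

    memcpy(dst, a, len_a);
    memcpy(dst + len_a, b, len_b + 1);  /* + 1 copies the null byte too */
    return 0;
}

/* The array argument decays to a pointer, so the element count has to
   travel in a separate parameter that the caller must get right. */
long sum_ints(const int *values, size_t count)
{
    long total = 0;
    for (size_t i = 0; i < count; i++)
        total += values[i];
    return total;
}
```

With an 8-byte buffer, concatenating "foo" and "bar" succeeds (7 bytes including the null byte), while "foo" and "barbaz" is rejected because it would need 10 bytes. For ‘sum_ints’, the usual idiom at the call site is ‘sizeof v / sizeof v[0]’, which only works where ‘v’ is still a real array and not yet a pointer.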
What’s Good
There are a lot of good things in C:
- C lets you do low-level things that standard Pascal does not allow. C lets you cast any integer to a pointer and then access memory via that pointer. C lets you do pointer arithmetic. C lets you do bitwise ‘and’ and ‘or’, which is not standard in Pascal. C lets you implement a function like ‘malloc’ in C itself.
- C does not restrict you to arrays of one fixed size when you pass them as parameters to a function; the same function can work on arrays of any length.
- As opposed to early standards of Pascal, C lets you compile parts of your program separately and it lets you reference functions in other parts via header files. It’s lower level than true modules, but it can be done.
- Any C function that you can use can be implemented in the language itself. Compare that to Pascal, where you cannot implement functions like ‘read’ and ‘write’ yourself, as they take a variable number of parameters. On top of that, ‘write’ has the special formatting syntax using the colon character. The C standard library is clearly separate from the language itself and it can be completely implemented in C.
- C is low level, so it does not require an extensive runtime library. C on an embedded system only requires a small initialisation routine. You can run C programs (without most of the standard library functions) on a bare-metal system with no operating system.
- C control structures are flexible, compared to Pascal. You can terminate functions early with ‘return’ and terminate loops early with ‘break’. This helps avoid excessive nesting, stupid additional Booleans and ‘goto’ statements.
- It is usually easy to inspect the machine code generated by the compiler and compare it with the C source code.
- A lot of libraries are written in C.
- C compilers are available for every platform under the sun.
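Two of the points above, writing an allocator in C itself and writing your own function with a variable number of parameters, fit in a short sketch. This is a toy, not a replacement for ‘malloc’: the names ‘bump_alloc’, ‘bump_reset’ and ‘sum_args’, the 1024-byte arena and the 16-byte alignment are all assumptions chosen for the example. The pointer-to-integer conversion is implementation-defined, although it behaves as expected on mainstream platforms.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdint.h>

#define ARENA_SIZE 1024u
#define ALIGNMENT  16u      /* must be a power of two */

/* A fixed block of memory that the toy allocator hands out in pieces. */
static unsigned char arena[ARENA_SIZE];
static size_t arena_used = 0;

/* Bump allocator: returns the next aligned chunk of the arena, or NULL
   when the arena is exhausted. Individual chunks cannot be freed. */
void *bump_alloc(size_t size)
{
    assert((ALIGNMENT & (ALIGNMENT - 1)) == 0);

    /* Treat the addresses as integers so we can round up with bitwise
       operations: add ALIGNMENT - 1, then clear the low bits. */
    uintptr_t base    = (uintptr_t)arena;
    uintptr_t start   = base + arena_used;
    uintptr_t aligned = (start + (ALIGNMENT - 1)) & ~(uintptr_t)(ALIGNMENT - 1);
    size_t offset     = (size_t)(aligned - base);

    if (offset > ARENA_SIZE || size > ARENA_SIZE - offset)
        return NULL;

    arena_used = offset + size;
    return arena + offset;      /* pointer arithmetic on the arena */
}

/* Throw away everything allocated so far. */
void bump_reset(void)
{
    arena_used = 0;
}

/* A variadic function written in plain C: sums `count` int arguments. */
int sum_args(int count, ...)
{
    va_list ap;
    int total = 0;

    va_start(ap, count);
    for (int i = 0; i < count; i++)
        total += va_arg(ap, int);
    va_end(ap);
    return total;
}
```

A bump allocator is about the simplest allocation scheme there is, which makes it popular on small embedded systems where memory is allocated once at start-up and never released.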
C Alternatives
Many alternatives exist to C.
First the big languages:
- C++ is, for the most part, a superset of C. It has objects and inheritance, it has operator overloading and exceptions. Modern C++ adds smart pointers (including ones that have a single owner at any given time, like in Rust) and it has convenient data types, provided by the STL, that ‘just work’. It is a highly complex language. And because it is largely a superset of C, the dirty pitfalls are still there. You can still do manual allocation with ‘malloc’. Because of that, it can be harder to know what the right thing to do is for any given data type. C++ does not have a garbage collector. If you learned C++ in the 1990s, you will be amazed at what has been added to the language during the past decades.
- Go does have a garbage collector and it also has parallelism built in. It was originally developed by Google. See https://go.dev. It is an ideal language for multi-threaded servers. Go aims to be a safe language, where simple mistakes cannot lead to memory corruption.
- Rust, on the other hand, avoids the garbage collector and instead uses an ownership model for each dynamically allocated object. At any given time, one piece of the program owns the object. References can be borrowed by other functions. Rust is not a truly object-oriented language with inheritance, but most of the benefits can be had with interfaces, which are called ‘traits’ in Rust. There are no exceptions, but the language helps you to handle error returns at each function call level. See https://rust-lang.org. Like Go, Rust aims to be a safe language.
- D is an extremely feature-rich language (including an optional garbage collector). It has many high-level constructs, but as opposed to Python, it is still statically typed and still truly compiled. Its syntax is mostly C-like. See https://dlang.org
There are also smaller languages that want to stay closer to the true spirit of C. They want to fix some of the flaws of C, without introducing highly complex features like garbage collection, multiple inheritance, exceptions or parallel execution. Some of these are single-developer projects, therefore they have no large communities around them. Some developers are very firm about features that will never be part of their languages, like inheritance, operator overloading, exceptions or macros. These languages do tend to have explicit allocation, a ‘defer’ statement (which specifies that something must be done whenever a scope is left) and array slices. Some of these languages are:
- Zig has no macros, but it has compile-time execution instead. Learn one language and program the build system, generics and everything else. Memory allocators are explicit, error handling is explicit and integer overflow is checked by default. It can directly include C headers to call C functions. See https://ziglang.org
- Odin is another language at roughly the same level. Like in Go, there is no ‘while’ keyword and the ‘for’ loop allows you to just specify the terminating condition, so it behaves exactly like ‘while’ in C. Odin does not have methods and has only a limited form of polymorphism. Map types (hash tables) are part of the language itself. See https://odin-lang.org
- C3 has operator overloading and it has a macro system that closely matches the desired use cases. It has explicit error handling using an ‘Optional’ type. It supports contracts (assertion checking). C3 has a very C-like syntax, but it has capitalisation rules to distinguish type names, constant names and other names (this simplifies parsing). See https://c3lang.org
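If you only know C, the array slices that these languages share may need a word of explanation: a slice is roughly a pointer bundled with a length, so the length can no longer get lost on the way into a function. The following is a C emulation of the idea, with the made-up names ‘IntSlice’, ‘slice_of’, ‘slice_sub’ and ‘slice_get’. It is not how Zig, Odin or C3 actually implement slices, where the compiler does this bookkeeping for you.

```c
#include <assert.h>
#include <stddef.h>

/* A slice: a view of `len` ints starting at `data`. It does not own
   the memory it points to. */
typedef struct {
    int   *data;
    size_t len;
} IntSlice;

/* Wrap an existing array (or part of one) in a slice. */
IntSlice slice_of(int *data, size_t len)
{
    assert(data != NULL || len == 0);
    IntSlice s = { data, len };
    return s;
}

/* Sub-slice covering the half-open range [start, end).
   Returns an empty slice if the range does not fit inside s. */
IntSlice slice_sub(IntSlice s, size_t start, size_t end)
{
    IntSlice empty = { NULL, 0 };
    if (s.data == NULL || start > end || end > s.len)
        return empty;

    IntSlice r = { s.data + start, end - start };
    return r;
}

/* Bounds-checked read: stores the element in *out and returns 1,
   or returns 0 and leaves *out alone if i is out of range. */
int slice_get(IntSlice s, size_t i, int *out)
{
    if (i >= s.len)
        return 0;
    *out = s.data[i];
    return 1;
}
```

In C, every caller still has to remember to go through ‘slice_get’ instead of indexing ‘data’ directly. In the languages above, the check is part of ordinary indexing.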
None of these smaller languages is going to displace C in the near future. Of the bigger languages mentioned, C++ is extremely widespread for large applications, and Rust is taking over some of the code in the Linux kernel and in system utilities that were originally programmed in C.