There are many ways in which programming languages may support runtime error handling. Languages like Python, C++ and Java support exceptions, but this feature has become less popular in recent years, because it makes it hard to release resources reliably. Rust and Zig instead require you to handle errors at each call level, while allowing you to return any error to the calling function with minimal additional code.
Some early BASIC interpreters would simply abort when they detected a runtime error. Turbo Pascal would do the same. It came with a mini-spreadsheet program that checked for division by zero and avoided that error, but a floating-point overflow would still abort the program without giving you any opportunity to save the data you had entered. By contrast, even GW-BASIC had an ON ERROR GOTO statement, which allowed you to keep the program running when an error occurred.
Error Detection
There are two ways a runtime error can be detected:
- A hardware trap. For example, many CPUs trap on access to an invalid memory address and on integer division by zero. Some can even trap on integer overflow. On Unix systems, such a hardware trap causes your program to receive a ‘signal’. By default the signal aborts your program, but you can install a handler function that is called when the signal occurs.
- A check in software. Most I/O functions return a result code. For example, if you try to open a non-existent file, your OS detects in software that the file does not exist, and the open system call returns an error code. The C library function fopen then returns a NULL pointer instead of a pointer to a valid FILE data structure. No hardware trap is involved in this case.
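The software check can be observed from Python, used here only as an illustration language. This is a minimal sketch: the helper name try_open and the file name are made up for the example. Python reports the failure of the open system call as an OSError exception, and the sketch converts it back into the underlying error code.

```python
import errno
import os
import tempfile

def try_open(path):
    """Return (fd, None) on success or (None, error_code) on failure."""
    try:
        return os.open(path, os.O_RDONLY), None
    except OSError as exc:
        # Python wraps the error code reported by the OS in an exception;
        # exc.errno holds the original code.
        return None, exc.errno

with tempfile.TemporaryDirectory() as folder:
    # Hypothetical file name; the fresh temporary folder is empty.
    missing = os.path.join(folder, "no_such_file.txt")
    fd, code = try_open(missing)
    print(fd, errno.errorcode[code])  # prints "None ENOENT"
```

The error code ENOENT ("no such file or directory") comes from a check inside the OS, not from a CPU trap.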
Early Strategies
Many versions of BASIC had an ON ERROR GOTO statement, which named the line number of an error handler. The handler at that line would then try to fix the error condition and continue the program with RESUME or RESUME NEXT. RESUME retried the statement that caused the error, while RESUME NEXT skipped it. Instead of resuming, the handler could also try to end the program more gracefully, possibly allowing the user to save unsaved data first.
C and Unix provide the signal functions, which allow you to handle errors like out-of-bounds memory access and division by zero. C also has the infamous setjmp/longjmp functions, which implement a very crude form of exception handling. The function setjmp stores the contents of the registers in a jmp_buf structure, including the current value of the stack pointer and the address that setjmp will return to. It then returns the value 0. The function longjmp restores the registers from the jmp_buf structure and jumps back to the point where setjmp returned, but this time setjmp appears to return a non-zero value. Any function called after setjmp can later call longjmp to return to the location of the setjmp call, as long as the function that called setjmp has not itself returned. A signal handler could call longjmp to throw the program back to a defined state, from which it could safely recover.
I/O errors in C are typically reported through the return values of function calls, and each function has to propagate the error back to its caller.
Another possibility is not to disrupt the control flow when an error occurs, but to store a special result value like NaN or Infinity instead. This is typically used for floating-point computations. The program continues to run, and at the end some results are invalid, while others may be valid and useful.
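A small sketch of this behaviour in Python, whose float type follows IEEE 754 for these operations. The function difference_of_squares and the input values are invented for the illustration. Note that Python raises an exception for float division by zero, so the sketch triggers the special values through overflow instead.

```python
import math

def difference_of_squares(a, b):
    """Compute a*a - b*b with no error checks at all."""
    return a * a - b * b

pairs = [(3.0, 2.0), (1e200, 1e200), (1e200, 1.0)]
results = [difference_of_squares(a, b) for a, b in pairs]

# 1e200 * 1e200 overflows to inf; inf - inf yields nan.
print(results)  # prints [5.0, nan, inf]

# The run was never interrupted; invalid results are filtered afterwards.
usable = [r for r in results if math.isfinite(r)]
print(usable)   # prints [5.0]
```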
Exceptions
Some languages implement exceptions, which act a bit like the setjmp/longjmp functions in C, but with a nicer language syntax around them. In addition, an exception can be handled at multiple levels. For example, if function A catches an exception and calls function B, which also catches that exception, then B handles the exception first when it occurs. If B re-raises the exception, it is handled again by A.
In a simplistic implementation, when an exception occurs, registers such as the stack pointer are restored to their state at the point where the exception was last caught. Any memory allocated at intermediate call levels would not be freed and would likely be leaked, and I’m not even mentioning other resources like open files, network connections and windows in a GUI. A slightly more sophisticated implementation visits each stack frame in turn and frees all memory that is known to need freeing when that function returns. Garbage-collected languages do not free the memory at that point anyway and leave it to the garbage collector at a later time. But even they typically do not release other resources automatically.
Handling an exception properly, under all circumstances, is very hard. Exceptions are therefore a much less popular feature than they were a few decades ago. They are still great when you want to handle error conditions that are typically detected by a hardware trap, such as integer division by zero, or integer overflow in general.
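The two-level handling can be sketched in Python as follows. The function names mirror A and B from the text, and the events list exists only to record the order of handling. The sketch assumes that B re-raises the exception after handling it; without the re-raise, A would never see it.

```python
def function_b(events):
    try:
        raise ValueError("bad input")
    except ValueError:
        events.append("B handled")
        raise  # re-raise so that the caller gets to handle it as well

def function_a(events):
    try:
        function_b(events)
    except ValueError as exc:
        events.append(f"A handled: {exc}")

events = []
function_a(events)
print(events)  # prints ['B handled', 'A handled: bad input']
```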
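As an illustration of how a language can release non-memory resources while an exception travels up the stack, here is a sketch using Python's with statement. TrackedResource is a hypothetical stand-in for a file or network connection; it only records when it is opened and closed.

```python
class TrackedResource:
    """Hypothetical resource that logs when it is acquired and released."""

    def __init__(self, name, log):
        self.name = name
        self.log = log

    def __enter__(self):
        self.log.append(f"open {self.name}")
        return self

    def __exit__(self, exc_type, exc, traceback):
        # Runs on normal exit and also while an exception passes through.
        self.log.append(f"close {self.name}")
        return False  # do not swallow the exception

def inner(log):
    with TrackedResource("file", log):
        raise RuntimeError("failure at the deepest level")

def middle(log):
    with TrackedResource("connection", log):
        inner(log)

def outer():
    log = []
    try:
        middle(log)
    except RuntimeError:
        log.append("handled")
    return log

print(outer())
```

Both resources are closed, innermost first, before the handler in outer runs. This only works because every intermediate level opted in with a with block; a resource acquired without one would still be left open.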
It may be feasible to add error handling code to each I/O function call, but it would be very unwieldy to add this to every arithmetic expression that might overflow. Exceptions may still be a good solution for these.
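A sketch of this idea in Python. Python integers never overflow, so the helpers add32 and mul32 are assumptions made for the illustration: they emulate 32-bit signed arithmetic that raises an exception on overflow, in the way a trapping CPU or a checked language mode might. The point is that the arithmetic expression itself stays free of error checks, and a single handler covers all of it.

```python
INT32_MIN, INT32_MAX = -2**31, 2**31 - 1

def _check32(value):
    """Raise OverflowError if value does not fit in a signed 32-bit integer."""
    if not INT32_MIN <= value <= INT32_MAX:
        raise OverflowError(f"{value} does not fit in 32 bits")
    return value

def add32(a, b):
    return _check32(a + b)

def mul32(a, b):
    return _check32(a * b)

def total_price(quantity, unit_price, shipping):
    # No error checks in the expression itself.
    return add32(mul32(quantity, unit_price), shipping)

def describe(quantity, unit_price, shipping):
    try:
        return str(total_price(quantity, unit_price, shipping))
    except OverflowError:
        # One handler covers every operation in the expression.
        return "overflow"

print(describe(3, 100, 5))        # prints 305
print(describe(70000, 70000, 0))  # prints overflow
```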
Error Propagation
Rust and Zig use error propagation instead of exceptions. They have special result types that are logically the union of an error result and a regular result, so a function can return either one. Function results are checked at every call, but there is concise syntax to return from the function early and propagate the error result to the caller. Suppose function A calls function B, which can return an error result, and function B opens a file. When the file open call returns an error result, function B can simply return that error to A. In these languages, each function always returns to its caller and there is no magic stack unwinding.
Zig (and C3 and Odin) have the defer statement, which specifies that some statements have to be executed before returning from the function (or leaving another scope), regardless of how the function returns. Deferred statements are executed even in case of error propagation, and they are typically used to free up resources that were allocated during that function.
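The control flow can be emulated in Python, with the caveat that this is not how Rust or Zig spell it. The classes Ok and Err and the wrapper open_file are hypothetical names for this sketch. Python has no built-in result type and no propagation syntax, so the early return that those languages abbreviate is written out by hand.

```python
from dataclasses import dataclass

@dataclass
class Ok:
    value: object

@dataclass
class Err:
    error: str

def open_file(path):
    """Wrap open() so that failure is a return value, not an exception."""
    try:
        return Ok(open(path, encoding="utf-8"))
    except OSError as exc:
        return Err(str(exc.strerror))

def function_b(path):
    opened = open_file(path)
    if isinstance(opened, Err):
        return opened  # propagate the error result unchanged
    with opened.value as handle:
        return Ok(len(handle.readlines()))

def function_a(path):
    counted = function_b(path)
    if isinstance(counted, Err):
        # A decides what to do with the error; here it adds context.
        return Err(f"function_a failed: {counted.error}")
    return Ok(f"{counted.value} lines")

# Hypothetical path, assumed not to exist.
print(function_a("/no/such/directory/data.txt"))
```

Every function returns normally to its caller, and the error travels upwards as an ordinary value.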
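Python has no defer statement, but contextlib.ExitStack from its standard library behaves similarly and can serve as an illustration. Callbacks registered on the stack run in reverse order when the with block is left, whether by a normal return, an early error return or an exception. The function process and the log entries are made up for this sketch.

```python
from contextlib import ExitStack

def process(fail, log):
    with ExitStack() as deferred:
        log.append("acquire buffer")
        deferred.callback(log.append, "release buffer")

        log.append("acquire lock")
        deferred.callback(log.append, "release lock")

        if fail:
            return "error"  # early return, as with error propagation

        log.append("work done")
        return "ok"

failed_log = []
print(process(True, failed_log), failed_log)

succeeded_log = []
print(process(False, succeeded_log), succeeded_log)
```

In both runs the two release entries appear at the end of the log, with the lock released before the buffer.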
Conclusion
Error handling is complex and no matter what strategy you use for it, error conditions are always the least tested aspects of a program. Whether it is acceptable to abort the program when a serious error occurs depends on the situation. Embedded control systems often do not have the option to abort and must keep the system under control under all circumstances. Your fly-by-wire system is never allowed to drop the plane from the sky. Of course, a single program could abort in this case (and probably be restarted), as long as a fallback system is in place. Coping with error situations in cars, planes and nuclear power plants is an engineering discipline of its own.
Programs like text editors or spreadsheets, which allow a user to enter large amounts of data, must be designed to allow that data to be saved when an error occurs. Losing a few hours’ worth of spreadsheet data may not be as bad as a crashed plane, but it is certainly rude to lose that data without putting up a fight to save it.
Ignoring an error is never a good idea. Checking inputs to ensure that buffer overflows cannot happen and integers cannot overflow is generally a good idea. It is generally better to refuse to complete a transaction because it fails range checks (even if the values would have been valid after all) than to complete it with a silent overflow and a bogus result.
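The difference between a silent overflow and a refused transaction can be sketched in Python. Python integers do not overflow, so wrap32 is an assumption made for the illustration: it emulates the wraparound of a 32-bit signed integer. The function names and amounts are invented.

```python
INT32_MIN, INT32_MAX = -2**31, 2**31 - 1

def wrap32(value):
    """Emulate silent 32-bit two's-complement wraparound."""
    return (value - INT32_MIN) % 2**32 + INT32_MIN

def silent_credit(balance, amount):
    # Completes the transaction no matter what, possibly with a bogus result.
    return wrap32(balance + amount)

def checked_credit(balance, amount):
    # Refuses the transaction when a range check fails.
    if not 0 <= amount <= INT32_MAX:
        raise ValueError("amount out of range")
    if balance > INT32_MAX - amount:
        raise ValueError("balance would overflow")
    return balance + amount

print(silent_credit(2_000_000_000, 2_000_000_000))  # prints -294967296

try:
    checked_credit(2_000_000_000, 2_000_000_000)
except ValueError as exc:
    print("refused:", exc)  # prints refused: balance would overflow
```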