Overflow checking

Written by

in

In the beginning we had 16-bit integers. £327.67 was still a lot of money and the total amount in a single sale of your shop was unlikely to be more than that. Until it was and you got a negative amount on your sales slip. When the amount was represented in decimal, you still found yourself in for surprises when amounts started to exceed £999.99 or whatever you thought the maximum would be.

How computer systems handle integer overflow, depends very much on the programming language used. Integer BASIC versions usually aborted the program with an overflow error. That sucked, but at least you knew there was something wrong. Many compiled languages on the other hand, silently ignored the error and just wrapped the number around. Compiled C typically ignores the error. Most CPUs do have an overflow flag, so the machine has the possibility to check for overflow and throw an error in this case. But the designers of RISC-V, in all their wisdom, decided to leave out the overflow status flag because C was the only language worth dealing with and nobody ever checked for integer overflow. Of course you can still do it on RISC-V, but it takes multiple instructions.

In some cases, the wrap-around behaviour of integer overflow is actually desired, especially in hash functions or when emulating the behaviour of real hardware. This mostly applies to unsigned integers. The C standard specifies wrap-around behaviour for unsigned integers and (until recently) specified signed integer overflow as undefined behaviour.

Possible Handling

There are many ways to deal with integer overflow:

  • Ignore it and wrap around, which is the default behaviour in C.
  • Set an integer status bit, which could be checked after every addition and subtraction. Most CPUs, such as x86 and ARM have this.
  • Trap the overflow in hardware. Some CPUs, like MIPS, can do this.
  • Have a ‘sticky’ overflow status bit. It could be cleared before a computation and could be checked at the end of a computation. If any operation caused overflow, the bit would be set. This is what the exception flags in floating point hardware actually do.
  • Use the maximum negative value 0x8000, 0x80000000 or 0x8000000000000000 as a special overflow value. If any of the inputs of an addition or subtraction have this value, or if an overflow occurs, the result is this value. It would be similar to the NaN value for floating point operations.
  • Redo the operation with floating point (some versions of BASIC) or arbitrary size integers (Python).
  • Throw an exception on integer overflow.
  • Panic (abort the program) on integer overflow
  • Saturate, i.e. let the result be either the maximum positive value (0x7ffffffff) or the maximum negative value (0x80000000). Some DSPs or SIMD instructions in some general purpose CPUs can do this.

Consequences

If you ignore integer overflow and just wrap around, integer addition retains its associative properties (a+b)+c is equal to a+(b+c). As soon as you start to saturate or trap the overflow condition, the associative property no longer holds. (MAXINT+1)-1 is not equal to MAXINT+(1-1). In the former case the result will be MAXINT-1 (in the case of saturation) or an error condition. In the latter case the result will be MAXINT. Therefore, in some cases, you are saved by not detecting overflow, but this is not something you should be relying on.

One early observation was that (A-B) < 0 is not a reliable way to check A < B. A-B could overflow and you would get the wrong result. For 16-bit signed integers, 30000 – -10000 overflows and gives a negative number, wile of course 30000 > -10000. Most CPUs have dedicated compare and conditional branch instructions that give you the correct result in all cases. The 8080 did not have an overflow flag at all and the 6502 and Z80 had an overflow flag, but no dedicated Branch-if-less-than instruction. You had to do multiple conditional branches to properly test for less-than of signed quantities. The 6809 on the other hand did have a BLT (Branch if less than) instruction, as did the 8086, 68000, ARM and anything more modern.

Computing the average of two integers can fail due to overflow, if done in a straightforward way. There are tricks to make it work under all circumstances. For unsigned integers we have the expression (a&b) + ((a^b)>>1)

Even if the computation itself does not overflow (as it is done in 32-bit on most modern machines), storing the result in a shorter variable (8 or 16 bits) could still overflow. Compilers may or may not add checks for this. For C they typically don’t,

In some cases you have to be very careful to cast the operands to a wider type before doing the operations. For example:

uint32_t a=1000000;
uint32_t b=2000000;
uint64_t c = (uint64_t)a * (uint64_t)b;

If we forget to cast, we do a 32×32 bit multiplication that overflows.

The Zig programming language is very picky about overflow checking. It even checks unsigned operations. If you do want wrap-around behaviour (for example for hash functions), you have to explicitly use operations that wrap around, like ‘+%’ and ‘-%’.

If a multiplication is part of the size computation for an allocation, we could allocate a way too small buffer if the multiplication overflows, which in turn could lead to memory corruption. It is therefore important to detect the overflow in this case. (but in the case of C, this is often not possible).

Avoiding Overflow

Integer overflow can often be avoided by choosing the data types for your computation sufficiently wide and to range-check your inputs.

For example, in a webshop (for consumers), it makes no sense to accept orders of more than 100 of the same item. If you allow the user to type any integer, even 1 billion, and do not check this is within reasonable limits, you could cause an overflow when multiplying it by the item price. If both the item price and the number of items are within a set range, you can prove that the multiplication cannot overflow.

With simple formulas, it is often feasible to assure that no overflow occurs whatever the input values, when the inputs are all within the acceptable range. If you also range-check intermediate values (like the running total of your order), you can prevent overflow from happening and possibly error out with a sensible message, such as “Orders above £10000 are not accepted”.

Dealing with Overflow

If you cannot avoid overflow (or stumble across it despite best efforts), you have to deal with it. It depends entirely on the situation what you should do. If your job is controlling an aircraft, it’s never acceptable to just abort and let the plane drop from the sky. It your job is computing this year’s tax statement, it might be good to abort, so somebody could recompile the program with larger integer types and redo the calculations. If your program contains lots of unsaved user data, you should at least try to save that data before aborting. A spreadsheet may convert the integer values to floating point values and redo the calculation that way.

Maybe the database format needs to be changed and all existing databases converted, because some fixed width fields are now too small to hold values that can occur after all. You should see these things coming in advance so you can plan the conversion ahead. For example the Unix timestamp overflow in 2038.

As with any error handling, there is never one good way that works in all cases.

Comments

Leave a Reply

Your email address will not be published. Required fields are marked *