Operator overloading

Last time I talked about garbage collection, a feature that some programming languages have and others don’t. Garbage collection adds runtime overhead, but makes memory management safer. There is also a third way, used by Rust, in which the compiler performs checks at compile time and guarantees memory safety that way, but at the cost of very complex restrictions that are hard for new programmers to grasp.

This time I will talk about operator overloading. Some people like the feature, but others insist on it being left out of their favourite programming language because it adds unnecessary complexity to the language.

Infix operators

In the mid-1950s, FORTRAN introduced infix operators, complete with precedence rules (in A+B*C the multiplication of B and C is performed first, before the result is added to A) and parentheses (in A*(B+C) the addition of B and C is performed first, before the result is multiplied by A). This infix notation closely follows the algebraic formulas used in mathematics for centuries.

Nearly all programming languages adopted infix expressions. Notable exceptions are FORTH (which uses Reverse Polish Notation, like B C + A *) and LISP (which puts the operator first and uses lots of parentheses, like (* A (+ B C))).
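To make the difference between the notations concrete, here is a small sketch in Python (my choice for illustration only; the function name eval_rpn and the variable values are made up). It evaluates the Reverse Polish form B C + A * with a stack and compares it with the infix form A*(B+C):

```python
import operator

# Map each operator token to the corresponding two-argument function.
OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}

def eval_rpn(tokens, env):
    """Evaluate a list of Reverse Polish Notation tokens using a stack."""
    stack = []
    for tok in tokens:
        if tok in OPS:
            right = stack.pop()  # the operand pushed last is the right-hand one
            left = stack.pop()
            stack.append(OPS[tok](left, right))
        else:
            stack.append(env[tok])  # anything else is a variable name
    return stack.pop()

# Hypothetical values for the variables used in the text.
env = {"A": 2, "B": 3, "C": 4}

infix_result = env["A"] * (env["B"] + env["C"])
rpn_result = eval_rpn("B C + A *".split(), env)
print(infix_result, rpn_result)  # 14 14
```

No precedence rules or parentheses are needed in the Reverse Polish form: the order of the tokens alone determines the order of evaluation.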

Infix expressions were a step up from assembly, where we used to write things like:

ADD T, B, C
MUL T, T, A

Operands in infix expressions can be variables, constants, array elements and function calls. Infix expressions are typically used on the right-hand side of an assignment. An assignment statement in FORTRAN may look like:

A = A + COS(PHI) + 2*SIN(B(I))

Here B is an array (indexed by I), A and PHI are variables, and COS and SIN are functions.

Operator Overloading

Languages like FORTRAN, BASIC, Pascal and C can use operators on a fixed set of data types, always including integers and real numbers. In Pascal we can use the + and * operators on sets, in BASIC we can use + to concatenate strings and in C we can use + to add an offset to a pointer. Nearly all languages have relational operators like < and > and also Boolean operators like AND and OR. But the operators and their semantics, as well as the data types they can be used with, are hardcoded into the programming language. For example, it is not possible to define a COMPLEX number type and then define infix operators for it that mirror the algebraic operators for complex numbers in mathematics (some of these languages have a COMPLEX data type, but it is then also hardcoded into the language).

Algol-68 was one of the first programming languages to have operator overloading. It could redefine existing operators for new types, it allowed you to add completely new infix operators and you could specify the priorities of any of these operators. This might have been too much flexibility and room for abuse.

Most large programming languages, like Python, C++, Ada and Rust, do allow you to overload infix operators, but none of them allows you to add completely new infix operators or to redefine operator priorities.
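As a rough sketch of what this looks like in practice, here is a user-defined complex number type in Python. The class name Complex and its fields are my own invention for illustration (Python already has a built-in complex type); the special methods __add__ and __mul__ are the hooks Python calls for + and *:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Complex:
    """Hypothetical complex number type, for illustration only."""
    re: float
    im: float

    def __add__(self, other):
        # Called for: self + other
        return Complex(self.re + other.re, self.im + other.im)

    def __mul__(self, other):
        # Called for: self * other, using (a+bi)(c+di) = (ac-bd) + (ad+bc)i
        return Complex(
            self.re * other.re - self.im * other.im,
            self.re * other.im + self.im * other.re,
        )

a = Complex(1.0, 2.0)
b = Complex(3.0, -1.0)
c = Complex(0.0, 1.0)

# The multiplication still binds tighter than the addition:
# overloading changes what the operators do, not their priorities.
result = a + b * c
print(result)  # Complex(re=2.0, im=5.0)
```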

Benefits

In mathematics, algebraic expressions are not just used with numbers, but also, for example, with vectors and matrices. Especially for those with knowledge of the problem domain, infix expressions are very readable. A single infix expression with vectors and matrices can replace a rat’s nest of hard-to-read function calls. Such an expression is certainly more readable than a loop over the separate array values, or the nested loops needed for matrix multiplication.

Types that are suitable for operator overloading:

  • Vectors and matrices, with scalar multiplication, inner product, matrix multiplication and matrix-vector multiplication.
  • New numeric types, like arbitrary size integers or complex numbers.
  • Sets
  • Lists and strings, with the + operator for concatenation and sometimes the * operator for repeating n times.
  • Arbitrary algebraic types as mathematicians define them, as long as they can be represented in a computer and operations can be implemented.
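To illustrate the readability argument, here is a minimal sketch in Python of a two-dimensional vector type. All names (Vec2, vec_add, vec_scale) are hypothetical. The same computation is written once with plain function calls and once with overloaded operators:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Vec2:
    """Hypothetical 2D vector type, for illustration only."""
    x: float
    y: float

    def __add__(self, other):
        # vector + vector
        return Vec2(self.x + other.x, self.y + other.y)

    def __mul__(self, k):
        # vector * scalar
        return Vec2(self.x * k, self.y * k)

    # scalar * vector ends up here, with the operands swapped
    __rmul__ = __mul__

# The same operations as ordinary functions, without overloading.
def vec_add(a, b):
    return Vec2(a.x + b.x, a.y + b.y)

def vec_scale(k, a):
    return Vec2(k * a.x, k * a.y)

p = Vec2(1.0, 2.0)   # position
v = Vec2(0.5, -1.0)  # velocity
t = 4.0              # elapsed time

with_calls = vec_add(p, vec_scale(t, v))
with_ops = p + t * v
print(with_calls == with_ops, with_ops)  # True Vec2(x=3.0, y=-2.0)
```

With only two operations the difference is small, but it grows quickly as formulas get longer.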

Drawbacks

To an unsuspecting reader, an expression with infix operators may look like it’s just adding and multiplying integers (or maybe floating point numbers), while in reality it is doing arbitrarily complex operations on arbitrarily complex data structures.

Sometimes, operators are overloaded to perform operations that are totally unrelated to their original meaning. An example is the << operator in C++, which originally meant “left shift”, but is also used to write something to an output stream.
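The same trick can be imitated in other languages. The following Python sketch uses a made-up wrapper class Out (not part of any library) whose << operator writes to a stream, while << on integers keeps its usual meaning:

```python
import io

class Out:
    """Hypothetical stream wrapper imitating C++-style output with <<."""

    def __init__(self, stream):
        self.stream = stream

    def __lshift__(self, value):
        # Called for: self << value. Nothing is shifted here.
        self.stream.write(str(value))
        return self  # returning self makes chaining possible

buf = io.StringIO()
out = Out(buf)

out << "x = " << 1 << "\n"  # writes text
shifted = 1 << 4            # shifts bits

print(repr(buf.getvalue()), shifted)  # 'x = 1\n' 16
```

The two lines near the end look alike, but only the types of the operands tell the reader which of the two meanings applies.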

In some languages, such as C++, you can also overload operations like assignment or array subscripting. This is all fine as long as you use it to implement sane assignment or subscripting semantics for your own types, but it gets pretty ugly if you implement totally unrelated functionality for these operators.
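As an example of what I would call sane subscripting semantics, here is a sketch in Python, which lets you overload subscripting (through __getitem__ and __setitem__) but not plain assignment. The class SparseVector is hypothetical: it behaves like a large array, but stores only the non-zero elements:

```python
class SparseVector:
    """Hypothetical vector that stores only its non-zero entries."""

    def __init__(self, size):
        self.size = size
        self._data = {}  # index -> non-zero value

    def _check(self, i):
        if not 0 <= i < self.size:
            raise IndexError(i)

    def __getitem__(self, i):
        # Called for: v[i]
        self._check(i)
        return self._data.get(i, 0.0)

    def __setitem__(self, i, value):
        # Called for: v[i] = value
        self._check(i)
        if value == 0:
            self._data.pop(i, None)  # zeros are not stored
        else:
            self._data[i] = value

v = SparseVector(1_000_000)
v[42] = 3.5
print(v[42], v[7], len(v._data))  # 3.5 0.0 1
```

To the user, v[i] still means “element i of v”, which is exactly what a reader would expect.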

Even in cases where it does make sense to overload, there are some drawbacks:

  • It is not always clear what an overloaded operator will do. For example: is the ‘*’ operator between matrices doing matrix multiplication or component-wise multiplication instead?
  • Overloaded operators, and the function calls that the compiler generates for them, are in many cases not the optimal way to perform a task. For example, each operator in a longer expression may produce a temporary result object. A dedicated function call for the specific task may run more efficiently.
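The ambiguity of ‘*’ for matrices can be shown with a short Python sketch. The Matrix class below is made up for illustration. It follows the convention used by NumPy, where * is component-wise and the separate @ operator (available since Python 3.5) is matrix multiplication, but another library could just as well choose differently:

```python
class Matrix:
    """Hypothetical matrix type stored as a list of rows."""

    def __init__(self, rows):
        self.rows = [list(r) for r in rows]

    def __mul__(self, other):
        # Called for: self * other (component-wise product here)
        return Matrix(
            [[a * b for a, b in zip(ra, rb)]
             for ra, rb in zip(self.rows, other.rows)]
        )

    def __matmul__(self, other):
        # Called for: self @ other (matrix product)
        cols = list(zip(*other.rows))  # columns of the right-hand matrix
        return Matrix(
            [[sum(a * b for a, b in zip(row, col)) for col in cols]
             for row in self.rows]
        )

m = Matrix([[1, 2], [3, 4]])
n = Matrix([[0, 1], [1, 0]])

print((m * n).rows)  # [[0, 2], [3, 0]]
print((m @ n).rows)  # [[2, 1], [4, 3]]
```

Both results are valid “products” of the same two matrices, so a reader has to know the convention of the library in use.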

What Some Languages Do

Three “small” languages that aim to replace C chose different solutions:

  • Zig does not have operator overloading. It has some infix operations on vectors though.
  • Odin does not have operator overloading, but it has many operations on vectors, matrices, complex numbers and even quaternions. AFAIK, this is the only language that has quaternions as a built-in type.
  • C3 does allow operator overloading. The documentation contains a plea to use it only in useful ways, but that might not help too much.
