Clojure, numbers, despair

Tags: clojure

Warning: this is a very angry post, but most points in here are valid despite the tone.

Once upon a time, a high-level language was developed. Its beginnings were humble and the developers focused on things that mattered. Numbers were not things that mattered. Numbers were used, but how they were used mattered very little.

So...

The language is Clojure. And numbers in Clojure are this:

;; Auto-promotion is cool
user> (type (inc Integer/MAX_VALUE))
java.lang.Long
;; Except it doesn't always work!
user> (type (inc Long/MAX_VALUE))
ArithmeticException integer overflow  clojure.lang.Numbers.throwIntOverflow (Numbers.java:1501)

or this:

;; Because three ways of parsing a string as a number is a Good Thing
user> (= (Double/parseDouble "1.2") (Double/valueOf "1.2") (read-string "1.2"))
true
;; Because having a function return different types based on its arguments is an Even Better Thing
user> (= (type (/ 3 2)) (type (/ 2 2)))
false
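
If the result type matters to you, the only defense I know of is to say which kind of division you want. A minimal sketch using only clojure.core functions; the var names are made up for illustration:

```clojure
;; Truncating integer division: stays an integer for integer arguments.
(def truncated (quot 3 2))            ; 1

;; The remainder that goes with quot.
(def leftover (rem 3 2))              ; 1

;; Coerce to double regardless of whether / returned a Ratio or a Long.
(def as-double    (double (/ 3 2)))   ; 1.5
(def exact-double (double (/ 2 2)))   ; 1.0
```

Both coerced results are java.lang.Double, so at least the type stops depending on the operands.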

Do you think that girl was pretty?

There's no way to put it lightly: I hate Clojure number types. Java keeps leaking into it and no-one cares. To add insult to injury, on top of what you have on the JVM, Clojure adds two more ways of representing numbers and then builds a huge pile of logic on top of that. Let's quickly cover what types one may find in a typical Clojure application:

  • clojure.lang.BigInt
  • clojure.lang.Ratio
  • java.lang.Number
  • java.lang.Integer
  • java.lang.Long
  • java.math.BigInteger
  • java.math.BigDecimal
  • java.lang.Float
  • java.lang.Double

Not surprisingly, most of these are just Java types. However, two more types are added: BigInt and Ratio. Both are weird. I'd like to focus a bit on Ratio. A Ratio can be created by integer division, but only when the division cannot produce an integer:

;; Aight
user> (type (/ 1 2))
clojure.lang.Ratio
;; Not really expecting this
user> (type (/ 1 1))
java.lang.Long
;; Yeah, well, WAIT WHAT
user> (type (/ 1N 1M))
java.math.BigDecimal

We can also just call the Ratio constructor (and fail miserably in some cases):

;; Cool
user> (type 1/2)
clojure.lang.Ratio
;; Eh?
user> (clojure.lang.Ratio. 1 1)
ClassCastException java.lang.Long cannot be cast to java.math.BigInteger  user/eval21314 (form-init5235971328632709373.clj:1)
;; Ah!
user> (clojure.lang.Ratio. (biginteger 1) (biginteger 1))
1/1

The proper way is to coerce the parameters to java.math.BigInteger. Why? Historical reasons: clojure.lang.Ratio only accepts java.math.BigInteger because back when it was written Clojure didn't have the clojure.lang.BigInt type, and no-one has touched the code since, for quite literally forever [1].

The fun train doesn't stop here. For example, we may want to create a ratio with a denominator of 0. Let's try the usual way:

;; Good
user> 1/0
ArithmeticException Divide by zero  clojure.lang.Numbers.divide (Numbers.java:158)
;; Consistent!
user> (/ 1 0)
ArithmeticException Divide by zero  clojure.lang.Numbers.divide (Numbers.java:158)

Bummer. But then again it might make sense; after all, a Ratio with a denominator of 0 may result in some weird math occurring. But we haven't tried all the available constructors yet, so let's do that:

;; I hate this :/
user> (clojure.lang.Ratio. (biginteger 1) (biginteger 0))
1/0

WAIT WHAT.
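
One way to stay out of this mess is to never touch the raw constructor and go through / instead, which does the zero check and the reduction for you. A minimal sketch; checked-ratio is a hypothetical helper name, not something Clojure ships:

```clojure
(defn checked-ratio
  "Hypothetical helper: builds n/d via / rather than the Ratio constructor,
  so a zero denominator throws and the result is reduced."
  [n d]
  (/ (bigint n) (bigint d)))

(checked-ratio 1 2)   ; a clojure.lang.Ratio, 1/2
(checked-ratio 4 2)   ; reduced to 2N, a clojure.lang.BigInt
;; (checked-ratio 1 0) throws ArithmeticException instead of producing 1/0
```

Note that you still get two different result types out of it, because of course you do.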

Combining java.math.BigInteger with clojure.lang.Ratio is even more fun, especially when it comes to corner cases:

;; Alright makes sense
user> (.denominator (* 7919/7920 (/ 1 Long/MAX_VALUE)))
73049106531889824391440
user> (class (.denominator (* 7919/7920 (/ 1 Long/MAX_VALUE))))
java.math.BigInteger
;; WAIT BUT WHY
user> (/ 7919 (* 7919/7920 (/ 1 Long/MAX_VALUE)))
73049106531889824391440N
user> (class (/ 7919 (* 7919/7920 (/ 1 Long/MAX_VALUE))))
clojure.lang.BigInt

The result type differs even though logically you performed the exact same computation. And don't forget that those types do not always cooperate nicely, so you introduce more corner cases. Oh boy!
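
If you need the two results to agree on a type, you have to normalize by hand. A small sketch, assuming you are fine with clojure.lang.BigInt as the target; the var names are only for illustration:

```clojure
(def r (* 7919/7920 (/ 1 Long/MAX_VALUE)))

;; clojure.core/denominator hands back the raw java.math.BigInteger...
(def raw (denominator r))

;; ...and bigint converts it to clojure.lang.BigInt, the same type
;; that (/ 7919 r) produces.
(def normalized (bigint raw))
```

After the coercion, normalized and (/ 7919 r) have the same type and compare equal with =.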

Who wears Cheetah?

Leaking abstractions is not cool. Clojure tries to present leaking abstractions as a feature. This is doubly not cool.

Number type promotion is not cool if there's no clear way to demote the type again. It's doubly not cool in Clojure, because there's no clear documentation on how and when promotion happens. Existing documentation is lacking at best.

Consistency is great. Clojure is not great at consistency though, and sometimes it feels like the "principle of least astonishment" is being proactively broken by Clojure's design in the numbers domain.

Here's an incomplete and perhaps redundant list of things that I find annoying, surprising or outright stupid in Clojure:

  • Arithmetic overflows everywhere! Multiplying java.lang.Integer values will never overflow, but java.lang.Long will not be auto-promoted. To be fair, this behavior is right there in the docstring for *, but then again, who reads the docstring for multiplication? There's also *', +' and -', all of which auto-promote the result, but what are the chances you ever even knew about them?
  • clojure.lang.Ratio uses java.math.BigInteger and not clojure.lang.BigInt for numerator and denominator. Why? Because when Ratio was created (back in 2010) clojure.lang.BigInt simply didn't exist, and when it was finally created, Ratio was not updated to reflect the change. Bonus points for figuring out why clojure.lang.BigInt was created in the first place.
  • Floats and doubles are... Well, the same floats and doubles as in Java. There's no attempt to hide them away. So, things like infinity and NaN are there, but they're not really supported by Clojure. How does one check if the number is NaN or Infinity in Clojure? You use java.lang.Float or java.lang.Double classes for that, specifically static methods such as isNaN, isFinite, etc. Hardly a portable solution.
  • Documentation is bad. Like, terribad. We're talking about a language with 8 years of development history, with strong backing from commercial companies, with successful commercial and open-source products written in the language, and yet we see very little focus on documenting things, even essential things, numbers being one of them.
  • Unsigned math is not supported. There's nothing in Java, thus there's nothing in Clojure. Make what you want out of it.
  • Bit operations do not belong in the core namespace. It's clutter; most programs don't need them. More than that, they're simply broken. More on that in a few bits.
  • *unchecked-math* is one big can of worms and can quite literally screw up your library performance or even behavior when someone using your library sets said dynamic var.
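
For the record, here is roughly what the primed operators and the Java-interop NaN checks from the list look like in practice. A minimal sketch: *' and inc' are real clojure.core functions, while nan? and infinite-number? are hypothetical helper names I made up for illustration:

```clojure
;; Primed operators promote to BigInt instead of throwing on long overflow.
(def doubled (*' 2 Long/MAX_VALUE))   ; 18446744073709551614N
(def bumped  (inc' Long/MAX_VALUE))   ; 9223372036854775808N

;; Hypothetical helpers wrapping the static methods on java.lang.Double.
;; float? is true for both Float and Double values.
(defn nan? [x]
  (and (float? x) (Double/isNaN (double x))))

(defn infinite-number? [x]
  (and (float? x) (Double/isInfinite (double x))))
```

The helpers only paper over the interop; they are exactly as non-portable as calling java.lang.Double yourself.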

So, bit operations. Clojure really lets you down here, and you as a programmer have to be extremely careful to avoid the common pitfalls. Most recovering C and C++ addicts would say that a bit shift to the left by one bit is equal to multiplying by 2. Clojure says NO. Unless you multiply by a different kind of two:

user> (bit-shift-left Long/MAX_VALUE 1)
-2
user> (* 2 Long/MAX_VALUE)
ArithmeticException integer overflow  clojure.lang.Numbers.throwIntOverflow (Numbers.java:1501)
user> (* 2N Long/MAX_VALUE)
18446744073709551614N

The first behavior is a result of lacking a proper unsigned, modular number type. The exception in the second is the result of "protecting" the users from overflowing, instead of promoting the type (as expected). And then the third one does the right thing. Or maybe the wrong thing, but in any case you would expect all three expressions to do the same thing. What's worse is that there are plenty of similar examples. Predictability is important, people!
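
To be clear about what the shift actually does: it agrees with wrapping (modulo 2^64) multiplication, which Clojure hides behind yet another function. A small sketch using unchecked-multiply from clojure.core; the var names are only for illustration:

```clojure
(def shifted  (bit-shift-left Long/MAX_VALUE 1))       ; -2
(def wrapped  (unchecked-multiply 2 Long/MAX_VALUE))   ; -2, wraps silently
(def promoted (* 2N Long/MAX_VALUE))                   ; 18446744073709551614N

;; Keeping only the low 64 bits of the BigInt gives the same -2 again.
(def low-bits (.longValue ^Number promoted))
```

So the three results are the same number modulo 2^64; you just have to know which of the four spellings of "times two" gives you which one.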

I wanna look tan

Even though no-one asked me, I'll try to imagine a better world of Clojure math. First off, the number types. There should be only two ways to represent numbers in Clojure: integers and reals. Integers should be signed and unbounded. Integer division always produces reals WITHOUT EXCEPTIONS. Integers can be promoted to reals, but reals can never be demoted to integers. Reals can follow the same approach as java.math.BigDecimal, Python's decimal module or MPFR. Now, obviously you'll immediately find a problem with this approach, namely that you need a proper context for all decimal operations. I say, default to large, and I mean LARGE precision. As in, precision that doesn't even make sense anymore, like 2^20. Let people control the precision through a context. Leave only basic math operations in the core namespace: addition, subtraction, multiplication and division. Define those operations clearly, make sure that division always produces reals and truncate where need be.
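
To make the "division always produces reals" idea a bit more concrete, here is a rough sketch on top of what exists today: java.math.BigDecimal plus clojure.core's with-precision. The name real-div and the default of 64 digits are my assumptions for illustration (nowhere near the 2^20 I'm asking for), and it only handles integer arguments:

```clojure
(defn real-div
  "Hypothetical: divides two integers and always returns a BigDecimal,
  rounded to `precision` significant digits (default 64, an arbitrary pick)."
  ([a b] (real-div a b 64))
  ([a b precision]
   ;; with-precision binds the math context used by BigDecimal division.
   (with-precision (int precision)
     (/ (bigdec a) (bigdec b)))))

(real-div 1 2)      ; 0.5M
(real-div 2 2)      ; 1M, still a BigDecimal rather than a Long
(real-div 1 3 10)   ; 0.3333333333M
```

Same function, same result type, every time. That's the whole point.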

Then, introduce a math namespace. Put modular math operations in math.modular. math.binary for binary math and bit shifting. math.real containing functions and macros that help with handling the real context, rounding, etc. math.ratio for, well, Ratio. math.float for IEEE 754-2008 floating point numbers. math.platform.jvm and math.platform.js for exposing platform-specific numbers.

But most importantly, write documentation. Everything has to be documented extensively and clearly, without exceptions. Great code and great design is only half the battle, clear documentation is the other.

As far as the negative impact of said change goes, I can only think of performance. But only a small minority of Clojure users type-hint everything or use Zachary Tellman's primitive-math. Everyone else? They get to enjoy a math setup that has very questionable decisions baked into it without worrying much about the performance.

Let me take a selfie

I tend to complain. A lot. The math in Clojure is just one of my complaint targets. However, it's a valid target. The math is neglected in Clojure, I see no attention being paid to it by core developers, there's no organized effort to make it better, and there have been zero calls to the community to ask for improvement ideas. And what's worse is that this math is completely ingrained into the Clojure core namespace and you can't replace it easily. There's no way to fix the numbers in Clojure from the outside.

I could take this on: spend plenty of time writing the proposal, push it to Clojure core, write the code afterwards and push for the solution. But what are the chances that it's ever going to get accepted? The upfront cost of this work is tremendous and there's very little chance that such work would ever end up in Clojure core.


  1. Where "literally forever" is used in terms of Internet age.