Erlang Term Ordering: Evaluating Elixir Functions Part 2

By: Scott Newcomer
Ordered button patterns

In part 1, we looked at the mechanics of how Stream.unfold/2 works. Don’t worry; we’ll be digging into other Stream methods. But let’s take a break with a simple look at Erlang term ordering. Take a peek at this page and you will see this referenced in lots of places. Before we jump into code examples, I want to first evaluate some finer linguistic details.

Elixir documentation

Erlang Term Ordering

Term is a keyword I see throughout the Elixir community — Erlang term ordering, Erlang Term Storage (ETS), and Disk-Based Term Storage are a few examples. In Erlang, an expression like sub_exp1 + sub_exp2 are made up of two terms. A term can be an integer, float, atom, string, list, map, or tuple. Term comparison operators also include ==, /=, >, and many more that you are probably familiar with.

Can we compare across data types? There is a pecking order.

number < atom < reference < fun < port < pid < tuple < list < bit-string
iex> {1, 2} > %{:a => 1}
false

We can scope this idea of Erlang term ordering to a few data types. That simplifies things. Except, there is no exclusive string data type in Erlang. I was surprised as well.

We are ready to look at some examples where term ordering comes into play. Here is a basic tuple sort.

iex> Enum.sort([{2, 1}, {1, 2}])
[{1, 2}, {2, 1}]

What about tuples of various lengths?

iex> coordinates = [{1, 1}, {0, 0, 0}, {1, 6}]
iex> Enum.sort(coordinates)
[{1, 1}, {1, 6}, {0, 0, 0}]

What if we need to reverse it? We can take advantage of Erlang term ordering explicity and use Enum.sort/2.

iex> coordinates = [{1, 1}, {0, 0, 0}, {1, 6}]
iex> Enum.sort_by(coordinates, &(&1 > &2))
[{0, 0, 0}, {1, 6}, {1, 1}]

This leads us to one conclusion about tuple sizes.

iex> {0, 0, 0} > {0, 0}
true

What about maps?

iex> coordinates = [%{:b => 1}, %{:a => 0, :c => 0}]
iex> Enum.sort(coordinates)
[%{b: 1}, %{a: 0, c: 0}]

iex> coordinates = [%{:b => 1, :a => 0}, %{:a => 1}]
iex> Enum.sort(coordinates)
[%{a: 1}, %{a: 0, b: 1}]

This implies map sizes matter

iex> %{:a => 1, :b => 1} > %{:a => 1} 
true

and key order matters

iex> %{:b => 1, :a => 0} > %{:a => 0, :b => 1}
true

And now a binary

iex> vowels = "aeiou"
iex> vowels? = "aeyou"
iex> vowels > vowels?
false

We can see a pattern emerging that can be solidified into various maxims on how Erlang term ordering works.

  • Lists are compared element by element.

  • Tuples are ordered by size, two tuples with the same size are compared element by element.

  • Maps are ordered by size; two maps with the same size are compared by keys in ascending term order and then by values in key order. In maps, key order integers types are considered less than floats types.

  • When comparing an integer to a float, the term with the lesser precision is converted into the type of the other term, unless the operator is one of =:= or =/=. A float is more precise than an integer until all significant figures of the float are to the left of the decimal point. The conversion strategy is changed depending on the size of the float, because otherwise, comparison of large floats and integers would lose their transitivity.

  • Bitstrings are compared byte by byte, incomplete bytes are compared bit by bit.

Let’s look at one more function before we move on to make sure this knowledge is solidified. min_max/2 is defined as returning a tuple with the minimal and the maximal elements in the enumerable according to Erlang’s term ordering.

Ordering based on ASCII value.

iex> coordinates = [:a, :z, :b]
iex> Enum.min_max(coordinates)
{:a, :z}

Ordering based on map size.

iex> coordinates = [%{:a => 1}, %{:a => 1, :b => 2, :c => 3}, %{:a => 1, :b => 2}]
iex> Enum.min_max(coordinates)
{%{a: 1}, %{a: 1, b: 2, c: 3}}

Our knowledge is cross functional. I hope this is getting easier.

We can also take a quick look at hairier examples with Date and DateTime. To our comparison operators above, Date and DateTime is just a map.

iex> early_january = ~D[2019-01-02]
iex> early_february = ~D[2019-02-01]

iex> early_february > early_january
false WAT?!?!

Well that is weird. However, as stated in the docs:

Remember, comparisons in Elixir using ==/2, >/2, </2 and friends are structural and based on the DateTime struct fields.

If we check i early_february and i early_january, it may make sense that comparison fails, since we learned above how maps will eventually compare each key by order.

# ~D[2019-01-02]
Raw representation
%Date{calendar: Calendar.ISO, day: 2, month: 1, year: 2019}

# ~D[2019-02-01]
Raw representation
%Date{calendar: Calendar.ISO, day: 1, month: 2, year: 2019}

Another interesting tidbit: In the Erlang VM, atoms can have a maximum of 255 characters. On the other hand, integers in Erlang/Elixir uses bignum arithmetic. As such, integers are only limited by the memory available on the system. Consequently, I can always give you a number higher or lower than what you stated the integer is: infinity + 1 > infinity!

That’s It

Before I dived into writing this article, I had taken Erlang term ordering as a given and hopelessly applied it without evaluating what it really means. I hope some of the finer details help solidify your knowledge of Erlang terms and how they can be used with comparison operators in Elixir.

In part 3, we will jump back to the Stream module and look at the mechanics of how methods like Stream.map/2 create composable, lazy enumerables differently than in part 1.

DockYard is a digital product agency offering exceptional strategy, design, full stack engineering, web app development, custom software, Ember, Elixir, and Phoenix services, consulting, and training. With a nationwide staff, we’ve got consultants in key markets across the U.S., including Portland, Los Angeles, Salt Lake City, Minneapolis, Miami, Washington D.C., and Boston.