Collatz in J

We can improve slightly on the Collatz examples of Chapter 10 of Learning J by noting that, while taking a function power of infinity produces the fixed-point, taking a function power of boxed infinity (or simply the empty box) also gives us back the intermediate values. Let's start by bringing back the function itself:

collatz =. -:`(1 + 3 * ])@.(2&|)

To review briefly, we can treat f`g@.t as

false`true @. test

In other words, this is a very condensed version of what in C would look something like:

(test x) ? (true x) : (false x)

So the overall expression says "if odd" (2&|) "then multiply by 3 and add one" ((1 + 3 * ])) "otherwise halve" (:-), . The @. causes the evaluation of cases and the backtick combines the cases.

As it happens there are some other J idioms that happen around this @. function, whose name is "agenda:"

<zero-case>`<non-zero case> @. *

This exploits the fact that the monadic * gives you 0 for 0 and 1 for all other positive numbers. And indeed we find over a hundred instances of this in the distribution, especially in the math addons. There are several other idioms in the standard library, such as @.(*@#), which can be read:

<empty array case>`<non-empty array case> @. (*@#)

or

<not length-N case>`<length N case> @. (N=#)

or

<not square matrix case>`<square matrix case> @. (=/@$)

The trick here being that reducing by equality against the shape will only be true if all the dimensions are the same.

The next step of Collatz is to build something that applies the function multiple times and gives us back the result. In Learning J we see that iterated application will gives 1 eventually, but we don't see that using the ace will give us all those intermediates at once:

hailstones =. collatz^:(1&~:)^:a:

So there's basically two tricks going on here. The ^:a: is literally, to the power of empty box, which means in J, to the power of infinity, but give me all the results. The function being applied thusly is collatz^:(1&~:), which you can read as collatz, while it isn't equal to 1. This is because our definition has a stupid loop in it, from 1 to 4 to 2 to 1:

   collatz 1
4
   collatz 4
2
   collatz 2
1
   collatz 1
4

With it the change, we see that it works:

   hailstones 9
9 28 14 7 22 11 34 17 52 26 13 40 20 10 5 16 8 4 2 1

A note about the pattern ^:a:, it occurs pretty rarely in the library, only 9 times, mostly in dissect. ^:_ is not terribly common either. Looking through the code, I see a very large number of occurrences of ^:_1 to access inverse or obverse functions. +:^:_2 is a pattern for square root (which has its own primitive, %:) and *:^:_2 seems to be a pattern for 4th-root (why you would need this is beyond me).

One interesting thing happens if we try to apply our hailstones monad to an array:

   hailstones i. 4
|rank error: collatz
|       hailstones i.4

This is because the rank of our functions appears to be wrong:

   hailstones b. 0
_ _ _
   collatz b. 0
_ _ _

At rank infinity, these are hoping to apply to the whole array at once, which obviously will not work. We can improve our definition of collatz by adding an explicit rank, and then the problem with hailstones will go away, or we can just apply a rank to hailstones when we call it:

   hailstones"0 i. 4
0  0 0  0 0 0 0 0
1  0 0  0 0 0 0 0
2  1 0  0 0 0 0 0
3 10 5 16 8 4 2 1

Now this is also surprising. We have a matrix with a bunch of zeros, and the answers are in the non-zero cells. This turns out to be something J and APL do to fill in when vectors of inequal length are combined to form a matrix. A better solution would probably be to box them.

   (<@hailstones)"0 i. 4
┌─┬─┬───┬─────────────────┐
│0│1│2 1│3 10 5 16 8 4 2 1│
└─┴─┴───┴─────────────────┘

But, it might be more interesting to see how many iterations were necessary, rather than what those were, and just count them up:

   (#@hailstones)"0 i. 4
1 1 2 8

Now we can just walk through the first hundred and find the one with the largest number of hops:

   (] i. >./) ((#@hailstones)"0 i.100)
97

The >./ says, "maximum" and dyadic i. finds the index of a value. So, the train ] i. >./ means, give me the index of the maximum value of this array. The answer is 97.

   hailstones 97
97 292 146 73 220 110 55 166 83 250 125 376 188 94 47 142 71 214 107 322 161 484 242 121 364 182 91 274 137 412 206 103 310 155 466 233 700 350 175 526 263 790 395 1186 593 1780 890 445 1336 668 334 167 502 251 754 377 1132 566 283 850 425 1276 638 319 958...
   # hailstones 97
119

There are 119 skips for #97. Wow!