Quick J: Thousands Separators

Often, we deal with big numbers. Sometimes the numbers are represented in scientific notation, sometimes they are not. When they are not, it can be challenging to gauge the size of a number if it is a single long string of digits:


   NB. ! is factorial
   ! 20x
2432902008176640000

Wow! That’s hard to read. Let’s add some thousands separators.

What are thousands separators?

If you don’t know what they are, they are commas (or full stops/periods in other locales) which go between every three digits of a number. This breaks the number up into groups of three that are easy to read. You probably know this.

There are other systems in the world, too, such as the Indian System, but I won’t go into those here.

Separating the Thousands

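Let’s consider the steps we will take to solve this problem:

1. Convert our number to a string
2. Break the string into consecutive groups of three, starting from the right
3. Put commas between them

Should be simple!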

Convert our number to a string

J makes this easy — it’s just ": default format. Here’s an example:


   number =: 123456789
   datatype number
integer
   ": number
123456789
   datatype ": number
literal
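
Because the result is an ordinary list of characters, the usual array verbs work on it. As a small illustrative check (not part of the final solution), # tally counts the digits:

   NB. number is the value defined above
   # ": number
9
   NB. the factorial from the introduction has 19 digits
   # ": ! 20x
19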

Break the string into consecutive groups of three, starting from the right

This is where things start getting annoying. J has several ways of manipulating arrays to extract subarrays, but none of them are quite perfect for this use case. The issue is that we are starting from the right. Our three best options, \ infix, ;.0 subarray, and ;.3 subarrays all start from the left. Dang.

One option could be to try and fashion indices to start from and take substrings, but an easier option is to reverse the string and then we are starting from the left. To do this, we will use |. reverse, < box, @ atop, and \ infix.

So, let’s build this step-by-step.

Start by reversing the array:


   |. ": number
987654321

Now we need the groups of three:


   _3 <\ |. ": number
┌───┬───┬───┐
│987│654│321│
└───┴───┴───┘

Here, the negative part of _3 (minus three) specifies to infix that we want non-overlapping groups. If we had used 3 (positive three), the groups would have overlapped:


   3 <\ |. ": number
┌───┬───┬───┬───┬───┬───┬───┐
│987│876│765│654│543│432│321│
└───┴───┴───┴───┴───┴───┴───┘
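
Our example number happens to have a digit count divisible by three. When it does not, the non-overlapping form leaves a shorter final group, which is exactly what we want once the string has been reversed. A quick sketch, using a seven-digit number picked only for illustration:

   NB. the leftover digit ends up in its own short group
   _3 <\ |. ": 1234567
┌───┬───┬─┐
│765│432│1│
└───┴───┴─┘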

Also note that \ is an adverb. In J, an adverb modifies the operand on its left, and infix needs a verb there. Our verb here is < box, which just takes an object and puts it in a box. A box is like a pointer to something, and since J arrays have to be flat and homogeneous, it can be used to create an equivalent of jagged or heterogeneous arrays.
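
To make the point about boxes concrete, here is a minimal sketch that puts a number, a string, and a list of numbers into one list. The values are arbitrary, and , append (covered properly further down) is only used to join the boxes:

   NB. three boxes holding different types and lengths
   (<1) , (<'two') , <3 4 5
┌─┬───┬─────┐
│1│two│3 4 5│
└─┴───┴─────┘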

However, it would be better if the substrings were not reversed. So, we can compose in another reversal to swap each group around into the right way:


   <@|. 'asdf'
┌────┐
│fdsa│
└────┘
   _3 <@|.\ |. ": number
┌───┬───┬───┐
│789│456│123│
└───┴───┴───┘

Put commas between them

This should be easy! We can take advantage of another J adverb here, / insert. Insert inserts the verb it modifies between elements of an array. So, as a basic example, +/ 1 2 3 is the same as 1 + 2 + 3.
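
One detail matters for what follows: J evaluates from right to left, so the inserted verb groups from the right. Subtraction makes this visible. A small sketch with arbitrary numbers:

   +/ 1 2 3
6
   NB. grouped as 1 - (2 - 3), not (1 - 2) - 3
   -/ 1 2 3
2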

For our application, we need the verb to be “concatenate and put a comma in between”: aka, a few uses of , append which appends arrays.


   '123' , ',' , '456'
123,456

But wait! Our groups are in reverse order, so we actually want to swap which way around they get combined.


   '456' {{ y , ',' , x }} '123'
123,456

J language aficionados may recognise the shape x u (v y) as exactly what a hook computes: x (u v) y means x u (v y). That gives us a way of writing this verb in a tacit (point-free) way. It will also require us to use & bond, however, which is like partial application.


   prependComma =: ','&,
   prependComma '123'
,123
   '456' {{ y , prependComma x }} '123'
123,456
   '456' (, prependComma) '123'
456,123
   '456' (, ','&,) '123'
456,123
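
If the hook is hard to see with strings, a numeric version may help. This is only an illustration; *: square is not used anywhere else in this post:

   NB. dyadic hook: x (u v) y is x u (v y)
   3 (+ *:) 4
19
   NB. the same thing written out
   3 + *: 4
19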

When we made the hook, we lost the swapped arguments. Let’s get them back! There’s a J adverb for this, and it’s ~ passive (the same symbol is called reflex when the resulting verb is used with one argument). It swaps arguments.


   '456' (, ','&,)~ '123'
123,456
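
The same swap is easy to check with a verb where argument order matters, such as subtraction. A tiny sketch with arbitrary numbers:

   10 - 3
7
   NB. with ~ the arguments trade places, so this is 10 - 3
   3 -~ 10
7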

Now all that’s left is making it work on the boxed list we have. To do that, we just need to unbox our operands first.

There’s another J composition conjunction for this, too! It’s &: appose, which takes two verbs.

It applies the right-hand verb to each operand separately, and then combines the results with the left-hand verb. Here’s a quick diagram to try and show what I mean.

x u&:v y

 x   y
 |   |
 v   v
  \ /
   u
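
In other words, x u&:v y is (v x) u (v y). Here is a short sketch of that, first with numbers (*: square is only for illustration) and then with the unboxing we need:

   NB. (*: 3) + (*: 4)
   3 +&:*: 4
25
   NB. open both boxes, then append the contents
   (<'123') ,&:> (<'456')
123456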

Unboxing is as simple as > open. So, composing with this, and combining with the previous steps, we have:


   number
123456789
   (, ','&,)~&:>/ _3 <@:|.\ |. ": number
123,456,789

And that’s everything! The verb can be packaged up into a direct definition (cleaner than tacit in this case) as follows:


   thousandsSeparate =: {{ (, ','&,)~&:>/ _3 <@:|.\ |. ": y }}
   ! 20x
2432902008176640000
   thousandsSeparate ! 20x
2,432,902,008,176,640,000
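
A couple more spot checks, assuming thousandsSeparate is defined as above. The inputs are arbitrary; the first has a digit count that is not a multiple of three:

   thousandsSeparate 1234567
1,234,567
   thousandsSeparate 1000
1,000

Numbers below 1000 form a single group, so insert has nothing to join; that edge case is worth checking separately.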

That’s all for today! See you ’round for the next time I decide to write a quick blog post about a piece of J I write randomly.