Look-up tables in MakeCode/JavaScript vs C - speed?

I am in the middle of creating a MakeCode extension with lots of useful graphics/pixel routines for the Pimoroni 119-LED display.

As part of this I am using look-up tables to speed up computing the location of a pixel, because the LEDs in the array are in an awkward order.

So in MakeCode/JS, I have the table as a constant array:

const COLUMN_MAP: number[] = [16, 14, 12, 10, 8, 6, 4, 2, 0, 1, 3, 5, 7, 9, 11, 13, 15, 17]
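For context, here is a minimal sketch of how such a column map might be used to turn logical coordinates into a linear LED index (the 18-column width and the index formula are assumptions for illustration; the real wiring may differ):

```typescript
// Map a logical column number to the physical column wired on the board.
const COLUMN_MAP: number[] = [16, 14, 12, 10, 8, 6, 4, 2, 0, 1, 3, 5, 7, 9, 11, 13, 15, 17]
const WIDTH = 18  // assumed columns per row, matching the map's length

// Convert (col, row) in logical coordinates to a linear LED index.
function ledIndex(col: number, row: number): number {
    return row * WIDTH + COLUMN_MAP[col]
}
```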

But it is also possible to define a look-up function in a .cpp file:

const int column_map[] = {16, 14, 12, 10, 8, 6, 4, 2, 0, 1, 3, 5, 7, 9, 11, 13, 15, 17};
int COL_MAP(int x)
{
    return column_map[x];
}

My question is this: why is it that when the look-up table is 18 bytes long it is quicker to do the lookup directly in MakeCode, whereas when the look-up table is 127 bytes long it appears to execute faster if you define it in C using a .cpp file within the extension?

The array is a mutable runtime object, so it has some overhead. Try using a buffer literal to encode your look-up table:

 const COL_MAP = hex`0123457789abcdef...`

It will be compiled as a buffer in flash and won't use any heap at runtime. The buffer has various APIs to read out the data in a number format.
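As a sketch of the idea (using a plain Uint8Array here to stand in for MakeCode's hex`...` buffer, which on the device lives in flash rather than on the heap):

```typescript
// Stand-in for: const COL_MAP = hex`...` in MakeCode.
// The byte values are the same column map as the array version above.
const COL_MAP = new Uint8Array([16, 14, 12, 10, 8, 6, 4, 2, 0, 1, 3, 5, 7, 9, 11, 13, 15, 17])

// On the device you would index the buffer directly, e.g. COL_MAP[x].
function colMap(x: number): number {
    return COL_MAP[x]
}
```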

Oh, I realised I was using the code in different contexts, hence the difference in execution time; the look-up table is in fact always faster directly in the JS. But I don't know about buffer literals, so I will look into that, thanks. I need this for a 128-byte look-up table used to reverse the order of a 7-bit number.
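A sketch of how that 128-entry bit-reversal table could be built once at startup (the generation approach is an assumption based on the description; the actual table may be hand-written or a hex literal):

```typescript
// Reverse the 7 low bits of n, e.g. 0b0000001 -> 0b1000000.
function reverse7(n: number): number {
    let out = 0
    for (let bit = 0; bit < 7; bit++) {
        out = (out << 1) | ((n >> bit) & 1)
    }
    return out
}

// Precompute all 128 possible 7-bit values into a table,
// so the hot path is a single array index.
const REV7: number[] = []
for (let i = 0; i < 128; i++) {
    REV7.push(reverse7(i))
}
```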

I'm also using two short look-up tables to calculate 2**x and 2**(7-x), and I do that very frequently, so I want the quickest possible code.

Whilst doing run-time tests I discovered that for some bitwise operations such as X = (X AND A) OR B, where A and B are values fetched from look-up tables, it is a few percent faster on the micro:bit to write it this way:

let A = LOOK_UP[num1]
let B = LOOK_UP[num2]
myarray[x] = (myarray[x] & A) | B

than it is to do it with a single, complex statement:

myarray[x] = (myarray[x] & LOOK_UP[num1]) | LOOK_UP[num2]

Which suggests I should check all other time-critical, frequent operations for similar optimisations. This is only my first week messing about with micro:bits, MakeCode and JavaScript.
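The kind of comparison described above can be harnessed like this (a sketch only: Date.now() stands in for the device's microsecond timer, and LOOK_UP/myarray are made-up stand-ins, so the absolute numbers mean nothing off-device):

```typescript
const LOOK_UP: number[] = [1, 2, 4, 8, 16, 32, 64, 128]
const myarray: number[] = [0, 0, 0, 0, 0, 0, 0, 0]

// Run a body many times and return the elapsed milliseconds.
function timeIt(body: () => void): number {
    const start = Date.now()
    for (let i = 0; i < 100000; i++) {
        body()
    }
    return Date.now() - start
}

// Variant 1: fetch table entries into locals first.
const tSplit = timeIt(() => {
    const A = LOOK_UP[3]
    const B = LOOK_UP[5]
    myarray[0] = (myarray[0] & A) | B
})

// Variant 2: one complex statement.
const tSingle = timeIt(() => {
    myarray[0] = (myarray[0] & LOOK_UP[3]) | LOOK_UP[5]
})
```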

But it seems that, for a simple buffer storing bytes, using the buffer[x] read/write syntax is a lot faster than using the buffer.getNumber(…) method, as I discovered when time-trialling these two alternatives:

let bitmask = (col < 9) ? TWOS_R[row] : TWOS[row] 

let bitmask = (col < 9) ? TWOS_R.getNumber(NumberFormat.UInt8LE, row) : TWOS.getNumber(NumberFormat.UInt8LE, row)

Buffer literals are always going to be better memory-wise, and you'll want that for the micro:bit. @mmoskal, any comments on the perf?

Yes, buffer[idx] is inlined in assembly, whereas buffer.getNumber() is a C++ call. Always use the indexer if you just want a byte.

The array lookup in JS is faster than calling a C++ method, since you skip a bunch of conversions and the lookup is again inlined in assembly.

The complex vs simpler expression shouldn’t make much of a difference, I would actually expect the “complex” one to be faster.

There are some notes on profiling performance here: https://makecode.com/js/profiling - you can use that on Arcade with STM32 chips only right now, but the results should translate to micro:bit (allowing for the fact that micro:bit will be 8x slower or so).

I can understand why it might be quicker to separate some of the assignments and break things down when a complex expression involves more than two elements. It may not be true in general, but in the tests I have just done, measuring run-time in microseconds, it makes a noticeable difference to fetch the array/look-up-table elements into local variables first and then perform the complex expression on three simple variables, as in the (myarray[] & A) | B example above.

It’s interesting to experiment on this and see how and where valuable time can be shaved off.