So, the first thing I did upon discovering MakeCode Arcade was to throw together this Mandelbrot hack, just for fun:
@frank_schmidt kindly ran it on real hardware (I did not have a Meowbit at the time - I now do!). The result was so slow that it was effectively unusable, because Arcade is configured so that TypeScript always uses software-emulated double-precision floating point.
@peli pointed out that there is a fixed-point library, declaring the Fx8 type, which makes real-number operations faster than the slow double emulation. But I still figured there must be a way to use the FPU which the Cortex M4 has.
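To illustrate why fixed point avoids the floating point emulation entirely, here is a minimal sketch of arithmetic with 8 fractional bits using only integer operations. The helper names (toFixed8, fromFixed8, addFixed8, mulFixed8) are made up for this post and are not the Fx8 library's API; the choice of 8 fractional bits is an assumption based on the type's name.

```typescript
// Hypothetical helpers for illustration only; not the Arcade Fx8 API.
const FRACTION_BITS = 8;
const FIXED_ONE = 1 << FRACTION_BITS; // 256 represents 1.0

// Convert a real number to its fixed-point integer representation.
function toFixed8(value: number): number {
    return Math.round(value * FIXED_ONE) | 0;
}

// Convert a fixed-point integer back to a real number.
function fromFixed8(fixed: number): number {
    return fixed / FIXED_ONE;
}

// Addition is plain integer addition.
function addFixed8(a: number, b: number): number {
    return (a + b) | 0;
}

// Multiplication is an integer multiply followed by a shift
// that discards the extra fractional bits.
function mulFixed8(a: number, b: number): number {
    return Math.imul(a, b) >> FRACTION_BITS;
}

const fixedProduct = fromFixed8(mulFixed8(toFixed8(1.5), toFixed8(2.25)));
const fixedSum = fromFixed8(addFixed8(toFixed8(1.5), toFixed8(2.25)));
```

The trade-off is range and precision: with only 8 fractional bits, deep Mandelbrot zooms run out of resolution quickly, which is part of why I still wanted the real FPU.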
Just as a test, I thought maybe I could pass a pair of numbers, boxed inside a Buffer, to a C++ extension (leading to my other thread: "pxt target arcade" fails - #16 by paul, and thanks are due to @mmoskal for his help there), write code to add them together using the "float" type rather than "double", and that this would emit hardware floating point code. Effectively, my idea was to model the FPU as if it were a separate hardware device, and communicate with it by passing Buffers back and forth to a native extension…
However, using arm-none-eabi-objdump (from Yotta) to disassemble the generated code showed that single-precision floating point operations still yielded software emulation. The "add(Buffer f1, Buffer f2)" function's code looked like:
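The boxing idea can be sketched in plain TypeScript, as a rough model only. This version uses the standard ArrayBuffer and DataView rather than the MakeCode Buffer type, and the function names (boxFloat, unboxFloat, addBoxed) are hypothetical; on the device, the addition itself would happen inside the C++ extension.

```typescript
// Hypothetical names; a host-side model of "a float boxed in a buffer".

// Store a number as a 4-byte little-endian single-precision float.
function boxFloat(value: number): ArrayBuffer {
    const box = new ArrayBuffer(4);
    new DataView(box).setFloat32(0, value, true);
    return box;
}

// Read the single-precision float back out of the box.
function unboxFloat(box: ArrayBuffer): number {
    return new DataView(box).getFloat32(0, true);
}

// Add `other` into `target` in place, rounding to single precision,
// which stands in for what the native add would do.
function addBoxed(target: ArrayBuffer, other: ArrayBuffer): void {
    const sum = Math.fround(unboxFloat(target) + unboxFloat(other));
    new DataView(target).setFloat32(0, sum, true);
}

const boxedA = boxFloat(13);
const boxedB = boxFloat(17);
addBoxed(boxedA, boxedB);
const boxedResult = unboxFloat(boxedA);
```

The point of the box is that the TypeScript side never does arithmetic on the value; it only carries four opaque bytes to and from the native code.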
push {r4, lr}
ldr r1, [r1, #8]
mov r4, r0
ldr r0, [r0, #8]
bl 0 <__aeabi_fadd>
str r0, [r4, #8]
pop {r4, pc}
The call to __aeabi_fadd is the offending call into the GCC floating point emulation library. That was not what I expected. So as a test, I tried compiling for the samd51 target instead of the stm32f401 target. In this case, the emitted code used the expected hardware instructions (vldr, vadd, vstr):
vldr s15, [r0, #8]
vldr s14, [r1, #8]
vadd.f32 s15, s15, s14
vstr s15, [r0, #8]
bx lr
nop
As a further test, I tried inlining the above assembly language and compiling again for stm32f401, but the assembler complained that these instructions were invalid. Even writing a raw .s file gave the same result - the assembler wasn't going to accept the vldr, vadd, or vstr instructions as valid. But I was sure that they were.
It turns out that whilst the samd51 target is configured to emit native floating point instructions for single-precision floats, the stm32f401 is not. These configs are in the relevant codal packages, i.e.:
and
vs
and
So, I took a mirror of the codal-big-brainpad repo, and made its cpu_opts the same as the itsybitsy target. Then, on a local copy of pxt-arcade, I modified the pxtarget.json to point to it instead of Lancaster University's repo, i.e.
Changed
"url": "https://github.com/lancaster-university/codal-big-brainpad",
"branch": "v1.0.22",
To
"url": "https://github.com/junk100/codal-big-brainpad",
"branch": "master",
Running this locally via "pxt serve" and compiling test code against my extension finally had the desired result: I could compile my inline assembly and run it successfully on hardware, proving that hardware floating point is possible (but requires the codal-big-brainpad target to be patched).
My extension is here: https://github.com/junk100/pxt-shimtest
The test code I wrote is simply:
let f = fpu.createFloat(13)
let f2 = fpu.createFloat(17)
f.addToSelf(f2)
game.splash("Hello" + f.get())
(To prove that the native shim is definitely being used, I added a deliberate math mistake to the in-simulator version of the extension, by adding a random number to the result - https://github.com/junk100/pxt-shimtest/blob/master/extension.ts#L33). Running the code in the simulator gives the "wrong" answer, and the correct calculation is performed on the hardware.
I did actually suspect that the Meowbit might not have the FPU enabled (i.e. requiring this startup code: http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0439b/BEHBJHIG.html), but the bootloader must already have this covered.
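The trick of making the simulator deliberately wrong can be sketched roughly like this. It is not the code from my extension (that is at the link above); the class name SimulatedFloat and the injectable noise parameter are hypothetical, added here so the behaviour can be checked deterministically.

```typescript
// Hypothetical simulator-side fallback; not the actual pxt-shimtest code.
class SimulatedFloat {
    private value: number;

    constructor(initial: number) {
        this.value = Math.fround(initial);
    }

    // Adds `other` to this value, plus a deliberate error from `noise`.
    // On hardware the native implementation would run instead, so the
    // error only ever shows up when the simulator fallback is used.
    addToSelf(other: SimulatedFloat, noise: () => number = Math.random): void {
        this.value = Math.fround(this.value + other.value + noise());
    }

    get(): number {
        return this.value;
    }
}

const simulated = new SimulatedFloat(13);
simulated.addToSelf(new SimulatedFloat(17), () => 0.5);
const simulatedResult = simulated.get(); // off by the injected error
```

If the device ever showed a perturbed result like this, it would mean the TypeScript fallback was running there instead of the native code.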
So, now that this overly long post is nearly over:
A question for the Arcade owners
Could/should the codal-big-brainpad target be patched to allow floating point instructions, at least for native extensions? If the "big brainpad" itself doesn't have the FPU, then perhaps there could be a separate target for the Meowbit (which definitely does).
Thanks for reading!