I cleaned up the code a bit and used `ftbench` to get performance numbers for the `smooth` and `dense` rasterizers.
The performance graph shows that the ported code still has plenty of room for improvement.
(ftbench output for both renderers at 20ppem attached)
Phase 2 will focus on profiling the ported code and improving its performance.
This looks encouraging: the gap seems within reach after some more optimizations, although the quadratic shape of the curve is probably unavoidable for the dense arrays. I suggest you use `ftbench -bc` to focus on rendering and skip the other benchmarks. You can also add `-f 0x2` to apply FT_LOAD_NO_HINTING, and `-i 0-100` or so to restrict the run to a range of good glyphs and avoid simple diacritics.
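Put together, the options suggested above might look like the following sketch. The font path is a hypothetical placeholder, and `-s 20` is an addition not mentioned in the reply, included only so the run matches the 20ppem results attached earlier.

```shell
#!/bin/sh
# Hypothetical font path (an assumption): substitute any font you have locally.
FONT="${FONT:-/path/to/font.ttf}"

# -bc       run only the render benchmark
# -f 0x2    load flags in hex; 0x2 is FT_LOAD_NO_HINTING
# -i 0-100  restrict the run to glyph indices 0..100
# -s 20     face size in ppem (added here to match the earlier 20ppem run)
BENCH_ARGS="-bc -f 0x2 -i 0-100 -s 20"

# Show the command that would be executed.
echo "ftbench $BENCH_ARGS $FONT"

# Run it only if ftbench is installed and the font file exists.
if command -v ftbench >/dev/null 2>&1 && [ -f "$FONT" ]; then
    # BENCH_ARGS is left unquoted on purpose so it splits into separate options.
    ftbench $BENCH_ARGS "$FONT"
fi
```

Running the same command against both the `smooth` and `dense` builds keeps the comparison limited to rendering time on the same glyph range.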
I also strongly recommend `perf record ftbench -bc...` followed by `perf report` to find the functions that take up most of the time. Fedora distributes perf in kernel-tools.
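As a rough sketch, the profiling step could be scripted as below. The font path and the output file name are hypothetical placeholders, and the ftbench options are simply the ones suggested earlier in this reply; adjust them to taste.

```shell
#!/bin/sh
# Hypothetical placeholders (assumptions): font path and perf output file name.
FONT="${FONT:-/path/to/font.ttf}"
PERF_DATA="${PERF_DATA:-ftbench-perf.data}"

# Render-only benchmark, unhinted, limited to glyph indices 0..100.
BENCH_CMD="ftbench -bc -f 0x2 -i 0-100"

# Show the two steps that would be executed.
echo "perf record -o $PERF_DATA -- $BENCH_CMD $FONT"
echo "perf report -i $PERF_DATA"

# Run them only if both tools are installed and the font file exists.
if command -v perf >/dev/null 2>&1 &&
   command -v ftbench >/dev/null 2>&1 &&
   [ -f "$FONT" ]; then
    # Record samples while the benchmark runs.
    perf record -o "$PERF_DATA" -- $BENCH_CMD "$FONT"
    # Print the hottest functions as plain text instead of the interactive UI.
    perf report -i "$PERF_DATA" --stdio | head -n 40
fi
```

The functions at the top of the report are the natural first candidates for optimization in the ported code.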