|
From: | David Mertens |
Subject: | Re: [Tinycc-devel] Implementing gcc intrinsics |
Date: | Sun, 24 Apr 2016 13:58:59 -0400 |
> PS: please help to fix a current version of the exsymtab. There are some
> changes in the current tcc which breaks Symtab streaming (asm_label)
Well, this looks like a ginormous patch (close to 7k lines). I had browsed the exsymtab fork on github before (had plans to hoist its tests suite that never flied) and frankly either I don't get the point of it or the impl seems over-complicated.
Tries for sym lookup IMO are a dead end. I already implemented one on ident_table to replace the TOK_HASH_FUNC and results are negative (at least 20% performance hit). It's not that I not tried a lot to fine tune the implementation -- including direct 256 elements arrays for children and even balanced AVL trees in desperation.
The problem with tries (and trees in general) is that these consume extra memory (10s of MB) and that costs in term of cache misses. Trees are average on most operations (including insert/update/delete) but sym lookup needs fastest possible retrieval, initial inserts are amortized by lookups in the long run.
IMO, the way to go w/ performance improvement is to vectorize the curent hash implementation. Instead of computing the hash one character at a time, take next 4 characters, process these in parallel and compute 4 hashes. Then combine these hashes (w/ XOR or anything) and proceed with collisions handling. Even collisions are not perf problem at the moment because increasing hash table to 64k slots (practicly w/o collisions) has no noticable performance improvement. On contrary -- w/ large hash tables cache misses on array lookups start to kick in.
[Prev in Thread] | Current Thread | [Next in Thread] |