freetype
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [ft] Optimizing ft_adobe_glyph_list


From: Povilas Kanapickas
Subject: Re: [ft] Optimizing ft_adobe_glyph_list
Date: Fri, 12 Oct 2018 09:12:25 +0300
User-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.12; rv:52.0) Gecko/20100101 Thunderbird/52.9.1

On 11/10/2018 11:10, Werner LEMBERG wrote:
> 
>> I've recently came across ft_adobe_glyph_list array which is a 56kB
>> table.  It seems that it's possible to compress it better by
>> extracting duplicate substrings possibly at some small performance
>> cost.
>>
>> My preliminary calculations put the size of the optimized table
>> somewhere between 24-26kB.  That would be 30kB savings which is
>> non-insignificant part of the freetype library.
>>
>> Would you be interested in this kind of optimization?  I'd be glad
>> to continue working on it.
> 
> Yes, I'm interested!  However, before you are investing more time into
> it, could you give us a reference to what algorithm you plan to use?
> Alternatively, please outline with a few words how you are going to
> implement the compression.
> 

The size reduction is a result of several optimizations:

 - the most common substrings are deduplicated by putting them into a
single location and referring to them by index. We use the fact that the
AGL uses only ~90 ASCII points out of 256 available, so we can fit
enough indices into the byte that previously referred to the letter of
the node. The "notlast" field is no longer used.

 - the size of the node and its value is reduced from 3 to 2 bytes in a
lot of cases. This has been achieved by noticing that with some remap
function 13 bits are enough to store information required to produce the
original 16-bit unicode value.

 - the size needed to represent links from a node to its children has
been reduced from 2*<num_children> bytes to <num_children>-1 bytes in a
lot of cases. We notice that most nodes have few children which are
nearby, so instead of absolute offset in the table we use child size as
a way to get relative offset and thus absolute offset. This makes binary
search slower in such cases, but linear search is likely not much slower
in nodes with few children anyway.

 - a\d+ and afii\d+ entries are handled separately in a simple, but more
optimized format.

All of the above increase the complexity and size of the retrieval
function. Unfortunately we can't know the impact of the changes until
basically everything is finished and we do tests. Maybe the slowdown
will be so severe that the proposal does not make sense :-)

Regards,
Povilas




reply via email to

[Prev in Thread] Current Thread [Next in Thread]