Re: [ft-devel] More fuzzing for freetype2?

+address@hidden (as suggested by Werner)

Hello FreeType developers!

tl;dr: we have set up a continuous fuzzing bot for FreeType:

https://github.com/kcc/libfuzzer-example/wiki/FreeType-Fuzzer-Bot

The goal is to find more bugs in FreeType and prevent regressions in the future.

But we may need a bit of help from the FreeType community.

More details below.

On Sat, Oct 3, 2015 at 9:21 PM, Werner LEMBERG <address@hidden> wrote:

Hello team!

> We are currently setting up an Open Source Security Group at Google
> to provide more structured resources to critical Open Source
> efforts. We’re starting with fuzzing and build infrastructure “as a
> service,” and we’d like to continue collaborating with you and
> FreeType to understand how you can benefit from these resources.

Great!

> On to practicalities: We’re continuing fuzz testing of FreeType.
> The fuzzer has been running for several days, and so far hasn’t
> found any new issues, so we’ve set up an 8-CPU continuously running
> public bot
> <https://github.com/kcc/libfuzzer-example/wiki/FreeType-Fuzzer-Bot>.
>
> Our current plan is to have the bot running indefinitely, with the
> hope that it will help us detect regressions and maybe find some
> more issues over time.

BTW, I completely forgot to tell you that FreeType has a fuzzer on its
own! Please have a look at `src/tools/ftrandom'. Some years ago
George Williams and I ran it for some time, and indeed the program
found a lot of bugs.

This looks like a mutation-based fuzzer (which makes it somewhat similar to libFuzzer), but there are at least two key differences:

1. It uses fork/exec, which means significant overhead per unit of testing. For FreeType this may not be critical, because running a single unit is expensive anyway, but it is still overhead we can avoid.

2. It is not guided, i.e. if a mutation is interesting, it is not added back to the test corpus (right?)
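
To make the "guided" distinction concrete, here is a toy sketch of a coverage-guided mutation loop. It is not libFuzzer's actual algorithm: RunTarget, Mutate and FuzzGuided are hypothetical names, and the "coverage" is a hand-written set of branch ids rather than compiler instrumentation. The only point is the last step of the loop, where a mutation that reaches new coverage is added back to the corpus.

```cpp
// Toy illustration of coverage-guided mutation; not libFuzzer's real algorithm.
#include <cstddef>
#include <cstdint>
#include <random>
#include <set>
#include <vector>

using Input = std::vector<uint8_t>;

// Hypothetical target: returns the ids of the branches this input reached.
// A real fuzzer obtains this from compiler instrumentation instead.
std::set<int> RunTarget(const Input &in) {
  std::set<int> hit = {0};
  if (in.size() > 0 && in[0] == 'F') {
    hit.insert(1);
    if (in.size() > 1 && in[1] == 'T') {
      hit.insert(2);
    }
  }
  return hit;
}

// Overwrite one random byte (or append one byte if the input is empty).
Input Mutate(Input in, std::mt19937 &rng) {
  std::uniform_int_distribution<int> byte(0, 255);
  if (in.empty()) {
    in.push_back(static_cast<uint8_t>(byte(rng)));
    return in;
  }
  std::uniform_int_distribution<size_t> pos(0, in.size() - 1);
  in[pos(rng)] = static_cast<uint8_t>(byte(rng));
  return in;
}

// Runs `iterations` mutations and returns the number of branch ids covered.
// Returns 0 without doing anything if the starting corpus is empty.
size_t FuzzGuided(std::vector<Input> &corpus, int iterations, unsigned seed) {
  if (corpus.empty()) return 0;
  std::mt19937 rng(seed);
  std::set<int> covered;
  for (const Input &in : corpus) {
    std::set<int> hit = RunTarget(in);
    covered.insert(hit.begin(), hit.end());
  }
  for (int i = 0; i < iterations; ++i) {
    std::uniform_int_distribution<size_t> pick(0, corpus.size() - 1);
    Input candidate = Mutate(corpus[pick(rng)], rng);
    size_t before = covered.size();
    std::set<int> hit = RunTarget(candidate);
    covered.insert(hit.begin(), hit.end());
    if (covered.size() > before) {
      corpus.push_back(candidate);  // the "guided" step: keep interesting inputs
    }
  }
  return covered.size();
}
```

An unguided fuzzer would be the same loop without the final push_back, so reaching the inner 'T' branch would require hitting both bytes starting from the original seed files every time.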

However, I haven't done this recently, and your
bot approach is far more intensive, of course.

Correct. The key word here is not "fuzzing" but "continuous".

> At the bare minimum expect us to pass you reports when issues are
> discovered. Beyond that, if you’re willing, it would be helpful and
> productive if you could do the following:
>
> * Accept the target function
> <https://github.com/kcc/libfuzzer-example/blob/master/freetype-experiment/freetype2_fuzzer.cc>
> into the FreeType trunk.

This ...

> * Extend it to cover all the interesting functionality, possibly
> split it into several independent functions.

... and that could be based on the `ftrandom' code, so please have a
look first.

I did. As is, I can't use this code together with libFuzzer: libFuzzer requires a standalone function that does not fork/exec/exit/assert, etc.

An example is in the link above.
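
As a rough illustration of what "standalone" means here, a libFuzzer target is a single function with the signature below that receives one input as a byte array and returns 0. This sketch uses a made-up ToyParseFont in place of the real library call (for FreeType the body would instead hand the same bytes to an API such as FT_New_Memory_Face); the toy format and its error codes are assumptions for the example only, not the code behind the link.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a library entry point that takes a byte array.
// Returns 0 on success and a nonzero error code on malformed input.
static int ToyParseFont(const uint8_t *data, size_t size, uint32_t *num_tables) {
  if (size < 6) return 1;  // too short: report the error, do not abort
  if (data[0] != 0 || data[1] != 1 || data[2] != 0 || data[3] != 0) return 2;
  *num_tables = (static_cast<uint32_t>(data[4]) << 8) | data[5];
  // In this toy format each table record occupies 16 bytes after the header.
  if (size < 6 + static_cast<size_t>(*num_tables) * 16) return 3;
  return 0;
}

// The target: no fork, no exec, no exit, no assert on bad input.
// Malformed data is an expected outcome, so the error code is simply ignored.
extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
  uint32_t num_tables = 0;
  (void)ToyParseFont(data, size, &num_tables);
  return 0;
}
```

When linked against libFuzzer, the library supplies main() and calls this function over and over in the same process, which is why the function has to return normally for every input.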

> * Point us to a public test corpus that we can use to extend the
> code coverage further. Ideally, it should be maintained in the
> FreeType git or similar.

What exactly do you mean by `test corpus'?

A set of files that you can pass to FT_New_Memory_Face or any other API function that receives byte arrays.

These do not have to be valid font files only. For example, inputs that triggered bugs in the past would be good to have.
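
In other words, a corpus is just a directory of files, each of which is handed to the target as one byte array. A minimal sketch of that idea, assuming C++17 and a hypothetical helper named LoadCorpus (libFuzzer reads the corpus directory itself, so this is only to show the shape of the data):

```cpp
#include <cstdint>
#include <filesystem>
#include <fstream>
#include <iterator>
#include <utility>
#include <vector>

namespace fs = std::filesystem;

// Read every regular file in `dir` into memory, one byte array per file.
// Files are not validated: broken or empty files are legitimate corpus entries.
std::vector<std::vector<uint8_t>> LoadCorpus(const fs::path &dir) {
  std::vector<std::vector<uint8_t>> corpus;
  for (const fs::directory_entry &entry : fs::directory_iterator(dir)) {
    if (!entry.is_regular_file()) continue;
    std::ifstream file(entry.path(), std::ios::binary);
    std::vector<uint8_t> bytes((std::istreambuf_iterator<char>(file)),
                               std::istreambuf_iterator<char>());
    corpus.push_back(std::move(bytes));
  }
  return corpus;
}
```

Each element of the returned vector could then be passed as (data, size) to any function that accepts a byte array.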

Note that the biggest
problem of testing FreeType (mainly to compare rendering results of
valid fonts)

The goal of fuzzing is not to test the correctness of rendering (although that may be achievable too), but to protect against stability/reliability/security bugs.

is that most fonts of big importance are copyrighted so
that I can't add them to a public repository...

Of course, we don't want any copyrighted material in the public test corpus.

> * Look at the coverage reports generated by the bot, see what parts
> of code are not covered, provide test inputs for that code.

Yes, this is very interesting. Are the coverage reports cumulative?

Of course, that's the whole point!

Currently, the bot produces function-level coverage using this script:

https://github.com/kcc/libfuzzer-example/blob/master/freetype-experiment/dump_uncovered.sh

The data shows that ~500 out of ~1000 functions were never executed.

The report looks like this (it simply lists the file name and line number of each uncovered function):

...
493 src/truetype/ttobjs.c:884:0
494 src/type1/t1afm.c:236:0
495 src/type1/t1afm.c:363:0
496 src/type1/t1afm.c:56:0
497 src/type1/t1afm.c:88:0
498 src/type1/t1driver.c:102:0
...
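
A report in this shape is easy to summarize per source file, which may help to see which modules the corpus misses entirely. The sketch below assumes each line has the form "<index> <path>:<line>:<column>" as in the excerpt above; CountUncoveredPerFile is a hypothetical helper, not part of the bot.

```cpp
#include <cstddef>
#include <map>
#include <sstream>
#include <string>

// Count uncovered functions per source file from report lines of the form
// "<index> <path>:<line>:<column>". Lines that do not match are skipped.
std::map<std::string, int> CountUncoveredPerFile(const std::string &report) {
  std::map<std::string, int> counts;
  std::istringstream lines(report);
  std::string line;
  while (std::getline(lines, line)) {
    std::istringstream fields(line);
    std::string index, location;
    if (!(fields >> index >> location)) continue;  // e.g. "..." or blank lines
    size_t colon = location.find(':');
    if (colon == std::string::npos) continue;
    ++counts[location.substr(0, colon)];  // key is the path before the first ':'
  }
  return counts;
}
```

For the six lines shown above, this would attribute four uncovered functions to src/type1/t1afm.c and one each to src/truetype/ttobjs.c and src/type1/t1driver.c.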

My hope is that if we extend the target function and the test corpus we can get more coverage.

Only then will it make sense to actually look at the coverage report in detail.

If we ever get close to ~90% function coverage, it will become interesting to go deeper into basic block (or edge) level coverage.

That can be achieved by replacing -fsanitize-coverage=func with -fsanitize-coverage=bb (basic block) or -fsanitize-coverage=edge in the compilation command.

In particular, a single input file normally tests a single font module
only.

We are on the same page here.

This might be another reason to look at `ftrandom' since it
starts with a whole directory of fonts that can cover input files for
all font modules.

libFuzzer does the same.

> *And, most importantly, we’d love your feedback.* Our goal is for
> this to be actually useful for those doing the hard work developing
> software, such as yourself. We would love your insight on how we
> can make that happen. What's missing in libFuzzer, sanitizers,
> coverage, documentation, bot? How can we make the process simpler
> for you so that you concentrate on the quality of code and not on
> the testing infrastructure? What other similar resources could we
> provide?

This is something that will take time to get acquainted with. Right
now, the current setup, that is, you run the bot and report the bugs,
is ideal for me :-)

Yea!

Unfortunately, I don't scale to 1000 OSS projects :(

Our goal is to remove all the obstacles for you (and many other projects) so that you can benefit from fuzzing without having to deal with infrastructure.

But we can't go too deep into every project's internals.

> Please let us know if there’s someone else we should be in touch
> with on the FreeType team, if it’s not you.

I suggest that you write to the `freetype-devel' mailing list:

Done!

There
you find all the interested people, in particular Behdad Esfahbod
(also from Google) – his HarfBuzz library certainly deserves the same
tests as FreeType, BTW.

May I forward this mail to the list? This would be a good start, I
think.

Werner

--kcc

From:	Kostya Serebryany
Subject:	Re: [ft-devel] More fuzzing for freetype2?
Date:	Sun, 4 Oct 2015 11:40:38 -0700