|From:||William ML Leslie|
|Subject:||Re: Verifying Toolchain Semantics|
|Date:||Mon, 6 Oct 2014 19:51:36 +1100|
From: William ML Leslie
Date: Mon, 6 Oct 2014 00:57:49 +1100
On 3 October 2014 22:56, Taylan Ulrich Bayirli/Kammer <address@hidden> wrote:
> Say, for example, that I can guess you will end up viewing the
> document with one of five output programs - maybe, evince, chrome,
> tk, gecko translating xsl-fo to xul or firefox viewing html 4
> directly, and say that I only have exploits for the first three.
> Now I have a bit of a problem: I need to find a way to get your tool
> to emit the specific structure that will invoke the exploit. Since
> it will show a document that looks correct, although underneath may
> have a different structure, I will have to account for variation in
> what it emits but there will be a limited class of algebraic
> manipulations that it can make without changing the semantics of the
It's not about this. These are all the programs that have problems.
We're not going to use these sorts of programs anymore. We're going to
use auto-generated, nearly perfect code. And after a few years it
won't be just near perfect, it'll be perfect.
> Maybe I would not find a way to /reliably/ produce a document that
> will infect your machine. Or maybe I would. It's hypothetical.
> But you know as well as I do that fuzzing a document generator to
> see what instructions produce exploitable output is easier than
> either writing a verifiable PDF viewer or writing a tool that
> bisimulates a PDF document with a class of layout languages. The
> cherry on top is that all that extra code you introduce is surely
> ripe with additional vulnerabilities - so maybe I don't need to
> bother with producing those back-end exploits at all.
This is wrong. If you look at the automatically-generated assembler
code from a formal spec, in the form of a functional program
generating that code:
you will see that the extra code has no extra attack vectors. It's all
machine-generated. If this were a piece of code for reading a binary
file, then it would have been generated from a grammar describing
exactly what are the valid words in that language. (Using words and
language in the technical sense as "valid strings" and "set of all
valid strings".) So buffer overruns won't happen. And if all the
semantics are formalised, then off-by-one errors won't happen
either. If they do, they'll happen all over the place and you will
find out very quickly.
> Though I think your analogy is wrong. PDF is already an abstract
> semantics, like any formal language spec. The question is whether
> an implementation realizes precisely those semantics, or adds "oh
> and insert a backdoor" to some pieces of code in that language, "as
> an addition to the language semantics." This augmentation to the
> semantics needn't be apparent in the source code of the
> implementation, since the language it's written in might again be
> compiled by a compiler that "augments" the semantics of the language
> it compiles, and that in turn needn't be apparent in the compiler's
> source ... until you arrive at the since long vanished initial
> source code that planted the virus.
> I hope by now you understand why I think this to be nonsense. It's
> still a C compiler. You know that, I know that. Why wouldn't the
> machine be able to tell?
Because it's formally undecidable. That's Rice's Theorem. And it's not
"in practice" probably detectable by semantantic methods, because we
are going to extra lengths to obscure the effective operational
semantics by multiple re-encodings. How are you going to recognise a
compiled Fortran90 program translating a C program into an abstract
assembler like lightning, as a C compiler? You don't even know what
the opcode numbers are: we can permute them.
> The problem that I have with this discussion is that rather than
> address many* of the points in my email, he instead resorted to
> insisting I was a shill. It's very difficult to have a conversation
> with someone who uses silencing tactics. Or better - it's
What are you talking about? Here is what you wrote to me:
> Please understand that our lack of enthusiasm for it is not without
> our own individual commitments to the security and freedoms of
> users; rather while the extra assurances your project may provide
> sound nice, they are frankly rather minimal and don't have much
> additional utility, and so are not priority one.
Now if you wanted me to continue answering your questions, then well,
I'm sorry, I just got the wrong end of stick completely. I thought you
wanted me to fuck off!
So I replied as below. To you, privately. And that was it. Silencing
tactics? You then copied it to the world, and a guy who I _told_ you,
was deputy director of the NSA's National Computer Security Centre!
|[Prev in Thread]||Current Thread||[Next in Thread]|