Re: Verifying Toolchain Semantics

guile-devel

From:	Ian Grant
Subject:	Re: Verifying Toolchain Semantics
Date:	Tue, 7 Oct 2014 13:18:31 -0400
On Mon, Oct 6, 2014 at 12:23 AM, Mike Gerwitz <address@hidden> wrote:
> On Sun, Oct 05, 2014 at 12:11:00PM -0400, Ian Grant wrote:
>> > As has been stated---your concerns are substantiated and understood,
>>
>> I wasn't aware that my concerns _have_ been substantiated. How? I am
>> not sure they have been understood, either.
>
> They were substantiated long ago by the very references you've provided.
> Your application to the GNU project doesn't make this a novel discussion.

This issue is not all of my concerns. This issue of insecure software
is a tiny part of the project. I am not "applying" for anything from
"the GNU project". I am _giving_ the GNU project a lifeline. I have
given all my ideas about how to write software properly, which you
desperately need to know. And I am giving you my time, teaching you
how to think about software and computing. I don't expect anything in
return. Some respect would be nice, but it's not essential, as you can
see.

> The problem is understood. A proper and **practical** solution to the
> problem may be less well understood, but you haven't yet convinced those in
> charge of making such decisions that this is a threat (a) greater than all
> other threats worth devoting time to protect against; and (b) worthy of such
> an enormous time investment at this time.

The problem is not understood by _you_, otherwise you wouldn't say it
was an enormous investment of time. We would have had half of it done
by now if people had responded to me constructively. But instead I
have had to spend over a month countering incoherent objections made
by smart-ass kids who can't tell one end of a computer from the other.

> It is worthy of addressing, and addressing it in the manner that you and
> Dijkstra suggest will certainly go a long way to preventing nearly all
> problems contributing to (a), but you need to take a different approach if
> you're going to incite such change.

Why do you think that?

>> Well my body is getting on for 50 years old, so it doesn't matter so
>> much. Here's a "better" example:
>>
>>    http://en.wikipedia.org/wiki/Aaron_Swartz
>
> I hope very much that using Aaron as an example is to further your argument
> that programming is a life-or-death situation, and that it's not a "better"
> example of personal neglect or instability.

No I did not mean it was a better example of personal neglect. But
arguably suicide *is* a better example of personal neglect and
instability. I wouldn't have been so crass as to say that though,
especially since I _sincerely doubt_ Aaron Swartz did commit suicide.
It is _much_ more likely that he was murdered.

> The former has no relation to the topic at hand, and the latter
> is---well, let's just hope that you poorly represented yourself.

Yes it _does_ have a relation to the topic at hand. I said "you are
combatants in a global information war which will cost some of you your
lives" and you replied saying I was being melodramatic and that you
personally were sure there were a million things more likely to take
your life, and suggested that remaining seated for extended periods
and/or neglecting personal hygiene was one of them. Aaron's was a
better example of how this war could take someone's life. I put
"better" in quotes, thinking people would know then that I didn't in
any way think that what happened to Aaron was in any way good.

>> > This argument is not valid---why is it hard to alter a PDF? In fact,
>> > PDF manipulation is a dark (and probably cancer-causing) art that's
>> > automated by countless businesses worldwide; it is a topic that eats
>> > up a significant portion of development time at my employer's office.
>>
>> It is valid. If you want to see why, then try to alter one of the PDFs
>> I've sent out.
>> What will you do when I later send out a series of checksums using a
>> checksum algorithm that neither you nor anyone else has ever heard of
>> before?
>
> Your argument favored PDF over plain text. The point that I had made was
> that you are unable to verify that two PDFs---unless binary
> identical---render text that is unchanged from the original. If I convert
> your PDF to a PNG and back to a PDF, it renders the same text. What does
> that mean to you? Does that mean that the text has been altered? Do you care
> that your work has been distributed in an alternative form? At what point
> does it become "modified"? And how can you distinguish reasonable
> alterations from unreasonable ones?

You can convert my PDF to anything you like, and if it represents the
same text, then I don't care. It would be a pointless thing to do
however, unless you wanted to defeat some automated mass-surveillance
system like Echelon.  What I don't want is people producing something
that says things I didn't say but which others might think I wrote. I
also don't want shoddy reproductions of text being passed around as
emails, accumulating quote marks and comments. I also want people to
see the text as I typeset it. I encode messages in these texts, and
they are only decodable from the typesetting.

> What does your checksum algorithm accomplish that existing checksum or
> hashing algorithms do not? Why can that same principle not be applied to
> plain text---why does the implementation of your checksum make it any more
> difficult to modify the PDF? And why is your checksum better than an
> established and heavily studied cryptographic standard (e.g. SHA-256)?

Do you mind me asking: do you have a university degree-level
qualification in computer science? If so, from which university is it?
(I don't have any such thing, in case you are wondering.)  Even
cryptographic checksums are not unique. So if you know what checksum
you are trying to defeat (MD5, say) you could make a new PDF with
different text, but which had the same checksum. The fact that this
wasn't obvious to you demonstrates something important: people
mistakenly use cryptography as a substitute for actual knowledge, even
when they _know_ they don't know how or why it actually works.
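
A minimal sketch of the non-uniqueness point above, in Python. A digest
has a fixed length, so distinct inputs must share digests; how expensive
it is to find such a pair depends on the algorithm and the digest
length. To keep the search feasible, this sketch assumes a deliberately
weakened checksum: the first two bytes of SHA-256. The names
short_digest and find_collision are hypothetical, and this is not an
attack on MD5 or on full-length SHA-256.

```python
import hashlib
from itertools import count


def short_digest(data: bytes, nbytes: int = 2) -> bytes:
    """First nbytes of SHA-256: a deliberately weakened stand-in checksum."""
    return hashlib.sha256(data).digest()[:nbytes]


def find_collision(nbytes: int = 2):
    """Return two distinct messages whose truncated digests are equal.

    With nbytes=2 there are only 65536 possible digests, so the pigeonhole
    principle guarantees a repeat within 65537 distinct messages.
    """
    seen = {}
    for i in count():
        msg = f"document revision {i}".encode()
        d = short_digest(msg, nbytes)
        if d in seen:
            return seen[d], msg
        seen[d] = msg


first, second = find_collision()
print(first, second, short_digest(first).hex())
```

The two messages differ but carry the same truncated checksum. Widening
the digest raises the cost of the search; it does not make the digest
unique.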

>> > Have you considered just distributing a GPG/PGP signature with your
>> > works, or even signing the work itself? After all, this whole
>> > discussion is about proving the unlikelihood of and preventing the
>> > modification of data.
>>
>> Of course I have thought of that, and rejected it as a pointless waste
>> of time! If you read what I've written, on this list
>
> You have written a lot on this list, and there has been an absurd amount of
> redundancy. I have not read it all.

The absurd amount of redundancy is because people like you and "Manicz
Patzigkeit" and William Leslie and Mark Weaver and Oleg and Richard
Stallman argue with me before they have understood what I've already
written on this list. It's a self-perpetuating error, it seems.

>> and also in that blog article Mark pointed up, you will see why I think
>> this. And even if public key crypto was positively secure, it is worthless
>> when I have no rational basis for confidence in the security of the system
>> which holds the private keys.
>
> In the case of PGP, you hold your private keys. I suppose if you're worried
> about a Thompson virus, you can't even provide yourself assurances of the
> security of your own system, but in such a severe case, there is nothing you
> do on your system that can be trusted, so surely you must not enter your
> password to access your e-mail on any of those systems. Because if you did,
> surely someone could just impersonate you to begin with, and there'd be no
> point in signing anything cryptographically, because such attempts would
> have been preempted.

I have zero confidence in the security of my own system, and not
because of a "Thompson virus". (It's not a virus, and it's no more
associated with Thompson than it is with the NSA.) It's because it's
mostly GNU software, which I know is so totally insecure you wouldn't
need a compiler trap door to get into any and every system.

> All that aside, it is still possible to host your private key on a machine
> that is wholly segregated from the rest of the world, perform your
> cryptographic operation on that, transfer its output (doesn't really matter
> how at this point, as long as it's output-only and cannot be used to
> compromise the machine---it only matters that the same output is supplied to
> each machine) to N machines with N different lineages (even if one is a
> PDP-11!), and verify the result of the cryptographic operation on those
> machines to ensure that your data were not modified in the process of
> applying the cryptographic operation.

What is amazing to me is that you think I don't know what you're
telling me. I have been writing software professionally for over 30
years, and I have been a unix sys-admin for fifteen years in an
institution where some of the secretaries apparently know more about
computing than you do.

> But we're not talking about such assurances here, so most of those steps
> would be unnecessary---upon receiving your signed request from the machine
> hosting your private key, it *does not matter* what its output is: the plain
> text is left, well, plain. You can verify it by hand. Then everyone else in
> the world with your public key can verify that it is indeed the signature
> produced by the same machine that produced the signature for all of your
> other archived messages all over the internet (mailing lists, your own blog
> entries, etc)---a task that can be trivially automated to scan for
> discrepancies.
>
> Your checksum would merely state "this PDF has not been modified". This
> would say "this PDF has not been modified and I am actually the same person
> that wrote all these other articles / e-mail messages, while claiming to be
> Ian". And even if you have no trust in the cryptography that links the
> signature to the private key, or have no trust in public key cryptography,
> the signature *still* contains a hash[0], and so you get the former
> assurance for free, assuming a [currently] trusted digest algorithm.
>
> So tell me: how does your yet-to-be-released checksum algorithm provide
> any better assurances than this?

I think I've explained that.

> Oh, but let's not stop there! Your signature relies on the security of the
> hashing algorithm---if a collision is found, what does that mean for
> cleartext signing? Well, if you had distributed a plain-text document in
> ASCII (and Unicode is almost certainly fine too), it's very highly likely,
> unless the hashing algorithm is just brutally and fundamentally flawed, that
> the collision will contain a lot of garbage---garbage that would be
> immediately noticeable. But if you distribute a PDF, it might be possible to
> hide all of that garbage within its constructs.

Let's do stop there. For your sake: you are in grave danger of making
an even bigger fool of yourself than you already have.

>> PK crypto, in this state we're in, has negative value: it just creates a
>> false sense of security, and _discourages_ people from actually thinking
>> about the real problem because they "know" their comms. are secure. They
>> don't know that at all.
>
> I certainly hope that developers of systems that use public-key cryptography
> understand the obvious, fundamental principle that you described in the
> article that Mark linked.[1] Of course, I understand that this is not often
> the case, and it's certainly not the case for most *users* of the software.
> And that is dangerous, because GPG/PGP/PK-crypto users should certainly be
> aware that any message they encrypt, even if the private key is discarded,
> may very well one day be broken. And the adversary may be little 10yo Bobby
> on his modern-era PC in the not-too-distant future.

I sincerely doubt that developers of systems that use public-key
cryptography know that. In fact, I doubt anyone will be able to
provide me a reference to a commercially published source that clearly
and explicitly says essentially what I said there. I doubt anyone
could even give me a URL for a text that clearly and explicitly states
that. And to show it is widely acknowledged you would need to point to
dozens of instances.

> But for the purpose of *this* discussion, the integrity of your chosen PK
> algorithm isn't entirely relevant. Yes, your identity assurance is nice to
> have, but at the *very least*, your signature provides a hash that is
> verifiable.

I don't give a shit about my identity. It's the identity of the *text*
that's important.

> Let's not play the "well the software performing the verification may be
> compromised" card---it's still possible to manually perform the verification
> if you really felt it to be necessary, if the situation were dire enough.
>
> So again---have you considered distributing a signature with your documents?

No. You have not told me anything I didn't already know. But I hope
you appreciate I have told you things you should have known, had you
only thought a little about them.

>> I could ship them as low-res PNG files. But really, it would be much
>> better just to get good working PDF viewers. There are _millions_ of
>> important documents which are in PDF form. We need to be able to
>> reliably verify and reference text within them. These include things
>> like the Intel architecture reference manuals.
>
> Certainly, and I'm not arguing that. I'm arguing against your assertion that
> PDFs are harder to modify than plain text and somehow provide greater
> assurances; I'm arguing the exact opposite.

PDFs _are_ harder to modify than plain text. If you really don't
believe me, then modify one of the PDFs I sent and tell me how easy it
was to do. And PDFs can provide greater assurances that the content is
original because they are structured documents, not linear strings of
characters. The same "multiple representations of a given text"
arguments apply to plain text files. PDFs also look better, and they
are easier for people to read. And some of them include complex
formatting that is more clearly expressed using high-quality
typesetting than linear text strings. PDFs can also be incrementally
and _verifiably_ modified.
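
A minimal sketch of the "multiple representations of a given text"
point as it applies to plain text, in Python. It assumes Unicode text
encoded as UTF-8 and uses NFC normalization as one possible canonical
form; the function names digest and canonical_digest are hypothetical.
The two strings render as the same word but are different byte
sequences, so a checksum over the raw bytes treats them as different
documents.

```python
import hashlib
import unicodedata

composed = "caf\u00e9"     # 'é' stored as a single code point
decomposed = "cafe\u0301"  # 'e' followed by a combining acute accent


def digest(text: str) -> str:
    """SHA-256 over the UTF-8 bytes exactly as given."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def canonical_digest(text: str) -> str:
    """SHA-256 over the text after normalizing it to NFC first."""
    return digest(unicodedata.normalize("NFC", text))


print(digest(composed) == digest(decomposed))
print(canonical_digest(composed) == canonical_digest(decomposed))
```

Comparing raw digests answers "are these the same bytes?", while
comparing digests of a canonical form answers a narrower version of
"is this the same text?". Which canonical form is appropriate is a
decision for whoever publishes the checksums.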

It's really funny. You kids are all up in arms over an international
standard, meticulously defined, beautifully documented format that you
don't actually seem to know anything about. I suppose that's because
the specification is, _quelle horreur!_, a PDF file! And if you had just
done what I told you to do, and got a decent lexical analyzer
generator working for guile, and typed in the grammar, and got the
lightning interface for guile working, you would have had a scheme
program that generated PDF reader programs by now. And that would have
given you a guile PDF display plugin for Firefox, for example.

>> I am well aware of this, and it is one reason why we need to be able
>> to read PDFs reliably. We can then vary the concrete representation
>> and defeat attempts at mass-surveillance and also prevent people from
>> using automated methods to edit the files in transit.
>
> Preventing file modifications in transit is a well-understood problem
> (and the problems with various implementations understood) unrelated to this
> one.

It is related, because those documents can be transformed if/when
necessary, and there are hopefully hundreds of copies around the place
by now, so it will be a bit hard for anyone to have them all deleted.

> But having a PDF reader that is generated from a formal definition won't
> prevent the problems that I described (although it'd prevent the majority of
> today's attacks against, say, well-known proprietary readers), because *PDF
> exists to present formatted text*, and there are countless ways to present
> what appears to be the exact same formatting. So the answer to the
> fundamental question "do PDFs X and Y present equivalent content?" is
> non-trivial.

You've lost the point of your argument now. This is incoherent. I am
not interested in automatically determining whether two PDFs represent
the same content, and I never was. That is formally undecidable, just
like William's "do programs X and Y present equivalent content?". It
requires actual human knowledge about what the glyphs represent.

>> And before some smart-ass says "Oh yeah, and how are you going to
>> verify the contents with cryptographic checksums?" I warn you to
>> _think_ first, because I will metaphorically tear you into fifty thin,
>> bloody little strips before you can say "Doh!"!
>
> I don't follow. Surely your method isn't novel.

Well if the method involved anything "novel" would you expect to be
able to know what it was just by thinking about it?

> Free Software Hacker | GNU Maintainer

GNU software needs a lot more developers and far fewer maintainers
than it has. We don't need to maintain crap, we need to replace it.

And maybe one day you won't feel embarrassed to call yourself a
programmer, or a software engineer. Here's hoping, anyway.

Ian