Re: GPG-Signed Commits proposal
From: Sylvain Beucler
Subject: Re: GPG-Signed Commits proposal
Date: Mon, 29 Aug 2005 23:40:12 +0200
User-agent: Mutt/1.5.9i
I realize that the follow-up I sent a week or two ago didn't make it
to the list :/
Below are some comments on your reply, followed by my original
follow-up :)
On Mon, Aug 29, 2005 at 03:57:29PM -0400, Derek Price wrote:
> I think it would be best to add an RCS newphrase in the archive file for
> storing signatures. Old versions of CVS and RCS which don't understand
> the newphrase would even ignore it. See the recent addition of commit
> ids for an example.
Sounds good :)
> As for working around keywords, I don't think signing can be performed
> securely with keywords in use at all.
>
> For instance, consider the line in a function:
>
> char *author = "$Author$";
>
> If we decided to sign the -kk version of the file or even the -ko
> version of the file, then a compromised server could send a line like:
>
> char *author = "$Author: ";int gotcha = dosomethingnasty();char *dummy = "$";
>
> and a verifier which converted to -kk or -ko mode for verification would
> happily confirm the file was the original.
>
> And don't forget, even if we decide to ignore keywords and tell folks
> they can't use signing with keywords (a warning to set -ko mode from CVS
> when keywords are detected may be in order), files will still need to be
> converted to UNIX EOLs before signing and verification on systems which
> would have converted the EOLs for network transport.
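To make the attack above concrete, here is a minimal sketch in Python. It assumes a naive verifier that collapses expanded keywords back to their bare form before checking; the regular expression and the function name collapse_keywords are purely illustrative and are not CVS's actual code.

```python
import re

# Naive "-kk"-style collapsing: "$Author: alice $" becomes "$Author$".
# Only a few keyword names are listed; this is an illustration, not CVS's logic.
KEYWORD = re.compile(r"\$(Author|Date|Id|Revision)(?::[^$\n]*)?\$")

def collapse_keywords(text):
    """Replace every expanded keyword with its bare form."""
    return KEYWORD.sub(lambda m: "$" + m.group(1) + "$", text)

original = 'char *author = "$Author$";\n'
tampered = ('char *author = "$Author: ";int gotcha = dosomethingnasty();'
            'char *dummy = "$";\n')

# Both lines collapse to the same text, so a signature computed over the
# collapsed form cannot tell them apart.
same_after_collapse = collapse_keywords(tampered) == collapse_keywords(original)
```

The injected code hides inside what looks like the keyword's value, which is why signing the exact stored bytes looks safer than signing any keyword-normalized form.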
I had in mind to sign what is stored in the RCS file, that is, the
commit-time content minus EOL conversions under MS Woe. AFAICS, that
commit-time content is what you get using '-kb' later on. The verifier
would check the signature before performing keyword substitution (I
assumed the client did the substitution, but maybe I'm wrong). In the
worst case the file could be checked out twice: once with -kb for
signature checking and once with whatever keyword substitution is wanted.
> > >How much time do you think it would take to a good CVS hacker to
> > >implement this in CVS (or even code this as an external wrapper?). If
> > >you think that's possible maybe I could implement a prototype myself.
> >
> >
> > I am not sure how long it would take to hack CVS changes into place.
>
>
> I don't think that this should be very complicated at all since RCS
> keywords must be ignored by necessity. Most of the necessary code for
> EOL conversion and RCS newphrases should be inside CVS already and not
> very complicated to hook into. If you were to hook into the src/run.c
> code for external GPG execution and trust the user to maintain their gpg
> executable and keyring, I'd imagine a motivated developer could make
> short work of this.
I think the user can be trusted to maintain the keyring, but it
should be possible to specify a dedicated keyring, to avoid accepting
signatures from people who have nothing to do with the development of
the current repository.
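As a rough sketch of what "specifying a keyring" could look like for a wrapper, the Python helper below only builds the gpg command line. The helper name and the file names are hypothetical; the gpg options used (--no-default-keyring, --keyring, --verify) are standard ones.

```python
def gpg_verify_argv(keyring, signature_file, data_file):
    """Build a gpg command that checks a detached signature against one keyring only."""
    return [
        "gpg",
        "--no-default-keyring",    # ignore the user's personal keyring
        "--keyring", keyring,      # trust only the project's keyring
        "--verify", signature_file, data_file,
    ]

# Hypothetical paths, for illustration only.
argv = gpg_verify_argv("/path/to/project-keyring.gpg", "foo.c.asc", "foo.c")
```

A wrapper would run this with subprocess.run(argv) and treat a non-zero exit status as a failed check.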
Here are some more thoughts on the implementation, then a summary.
1) We need a canonical version of each revision to sign.
I figure one could get that canonical version using binary mode ('cvs
update -kb'). When the file is ready to be committed, it should match
that canonical version, except for line endings. AFAICS keyword
substitution has not occurred yet at that point; it is rather done at
update time (if we consider that 'cvs ci' performs an update after the
commit per se).
So the only work to get a canonical version just before the commit is
to convert line endings on non-Unix platforms, except if we deal with
a real binary file. Even a wrapper should be able to reimplement that
transformation easily, shouldn't it?
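For what it's worth, the transformation a wrapper would need is tiny. A minimal sketch in Python, assuming the only platform difference is CRLF versus LF and that the caller already knows whether the file is binary (the function name is made up):

```python
def canonicalize(data, binary=False):
    """Return the bytes to sign: untouched for binary files, LF line endings otherwise."""
    if binary:
        return data  # real binary files must never be rewritten
    return data.replace(b"\r\n", b"\n")

unix_form = canonicalize(b"int main(void)\r\n{\r\n}\r\n")
```

Lone CR characters are left alone here; whether they should be converted too is an open question this sketch does not settle.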
The verification, if done by a wrapper, needs to check out the file
twice (the normal checkout and the canonical/binary one); hopefully
this could be optimized with a builtin solution.
Is there anything else that can change the content of a revision? Are
there server-side hooks that can modify the content of the file
itself (as verifymsg does for the commit message)?
2) We need to specify the checksum
a) I thought that we could directly use whatever GPG uses for 'gpg -a
-b'. I could not find out exactly which checksum that is, though. The
advantage is that we seamlessly follow GPG's evolution and don't have
to bother updating the checksum in use over the years.
b) Otherwise we can use whatever checksum is considered secure today. We
just need to state explicitly which checksum was used, in case it
changes in the future.
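Option b) could be as simple as prefixing the digest with the name of the algorithm. A minimal sketch in Python using the standard hashlib module; the "type:hexdigest" layout and the choice of SHA-256 are assumptions for illustration, not a proposal for the final format:

```python
import hashlib

def tagged_checksum(canonical, algo="sha256"):
    """Return 'algo:hexdigest' so that the algorithm in use is always explicit."""
    return algo + ":" + hashlib.new(algo, canonical).hexdigest()

def checksum_matches(canonical, tagged):
    """Recompute the digest with the algorithm named in the tag and compare."""
    algo, _, _ = tagged.partition(":")
    return tagged_checksum(canonical, algo) == tagged

revid = tagged_checksum(b"int main(void)\n{\n}\n")
```

Because the tag travels with the digest, an old revision signed with one checksum can still be verified after the default changes.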
3) We need to sign more than the checksum,
else an attacker could put an old vulnerable version back to HEAD.
So we need to state different "assertions" about the code: the
checksum, the filename, the path, the revision, the date, the author,
the commit message. Maybe also the new commitid you mentioned.
With this the revision should be difficult to move around without
notice.
To avoid replay attacks, I see two ways to combine the information:
a) Sign a canonical representation of all this information at
once. This may cause trouble later: for example, if we want to add
the commitid to the signed information, that requires a change to the
canonical representation (which could perhaps be triggered by the
presence of the commitid field).
b) Assign a unique ID to the revision and include it in all signed
information; still following the example of Monotone, this implies we
use solution b) for point 2), and that we include this checksum in all
signed information. The advantage is that we can put each piece of
information in a separate signature, pile them up, and hence add any
new information at any time in the future. The drawback is that it is
easier to remove part of the signed information, and we lose the
ease of maintenance of 2a).
It would also be good to consider multiple signatures of the same
revision (signed again by the package maintainer after an old key
expired or was revoked, for example)
We're still vulnerable to 'cvs admin -b', which apparently can change
the HEAD of a given file to any revision. Not sure how we can fix that
point :/
4) Where to store the information?
If the signature can be added as a new field in the RCS file without
breaking backward compatibility, that would be great. The commit
message is subject to modification by the verifymsg hook, so it would be
better to avoid it, I guess. The fact the commit log is changeable via
'cvs admin' is not really an issue, since it would allow later
additional signatures, though it would be better if such changes were
append-only. If we store the information somewhere else, then a new
'cvs admin' command would be needed so as to change the related
signature as well.
Do you think it's possible and reasonable to store all the signatures
as a new RCS field?
5) Command line options
> True. You could also add some kind of new cvs command to perform the
> checksum validation after checkout has finished on demand rather than
> doing it all of the time.
I think a 'cvs' (or an 'update/co/export') option would also be
good. It needn't be the default, and could easily be added to
~/.cvsrc. A separate tool for batch validation would indeed be needed
as well.
Another configuration option would be to specify which set of
signatures you trust for a given repository.
To sum up:
1. Identification (revid):
a) checksum_type+checksum
or
b) rev+date+file+(path? problem with CVSROOT/modules?)
2. Format
---
`gpg -a -b "$REVID$CANONICAL_VERSION"`
(`gpg --clearsign` of:
revid: cf 1.
assertion1: value1
assertion2: value2
.
.
.
){1,}
---
Required assertions (unless they're part of the revid):
- rev
- date
- file
- path??
- *author*
- commit message??
Assertions about tags ('cvs tag') need to be signed as well.
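To illustrate the "revid + assertions" text that would be handed to 'gpg --clearsign', here is a small sketch in Python. The one-line "key: value" layout, the sorted key order, and the function names are assumptions; the uncertain items (path, commit message) are deliberately left out of the required set, and for simplicity the sketch requires the others even if they are already part of the revid.

```python
REQUIRED = ("rev", "date", "file", "author")

def build_payload(revid, assertions):
    """Serialize the revid and the assertions as 'key: value' lines, in a fixed order."""
    missing = [key for key in REQUIRED if key not in assertions]
    if missing:
        raise ValueError("missing required assertions: " + ", ".join(missing))
    lines = ["revid: " + revid]
    for key in sorted(assertions):
        value = str(assertions[key])
        if "\n" in value:
            # Multi-line values (e.g. commit messages) would need an escaping rule.
            raise ValueError("multi-line value for " + key)
        lines.append(key + ": " + value)
    return "\n".join(lines) + "\n"

def parse_payload(text):
    """Inverse of build_payload: return a dict that includes the revid."""
    fields = {}
    for line in text.splitlines():
        key, _, value = line.partition(": ")
        fields[key] = value
    return fields

payload = build_payload("sha256:abc", {
    "rev": "1.4", "date": "2005-08-29", "file": "foo.c", "author": "alice",
})
```

Sorting the keys keeps the signed text stable regardless of the order in which the assertions were collected.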
3. Canonical version:
'cvs update -p -kb' + line ending conversion to Unix-style
4. Verification algorithm
Some sigs may not work (say, revoked or expired), but at the end of
the process, the required set of assertions needs to be true and signed.
False signed assertions that are not required will make the
verification fail anyway. Assertions that are unverifiable but not
required (e.g. the presence of a valid commitid, or a non-matching log
message) will not make the verification fail. If multiple assertions of
the same type exist, only one true+verified assertion is needed (this
allows re-signing, or fixing a transmission error (or CVS bug) a
posteriori).
Assertions to check may be modified with options, so if cvs requires
more assertions to be valid by default in the future, old archives can
still be signed.
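Here is one possible reading of these rules as a Python sketch. It assumes the GPG check of each signature has already been done elsewhere and is passed in as a boolean, and that "unverifiable" means we have no actual value to compare against; the class and function names are made up.

```python
from dataclasses import dataclass

@dataclass
class SignedAssertion:
    kind: str           # e.g. "rev", "author", "commitid"
    value: str
    signature_ok: bool  # False for bad, revoked or expired signatures

def verify(assertions, actual, required):
    """Apply the rules above to one revision; 'actual' holds what the server sent."""
    signed = {}
    for a in assertions:
        if a.signature_ok:  # signatures that do not check out are simply ignored
            signed.setdefault(a.kind, []).append(a.value)
    # Every required kind needs at least one true and validly signed assertion.
    for kind in required:
        if kind not in actual or actual[kind] not in signed.get(kind, []):
            return False
    # Optional kinds fail only when they can be checked and none of them is true.
    for kind, values in signed.items():
        if kind in required or kind not in actual:
            continue
        if actual[kind] not in values:
            return False
    return True

sigs = [
    SignedAssertion("rev", "1.4", True),
    SignedAssertion("author", "alice", False),  # signed with an expired key
    SignedAssertion("author", "alice", True),   # re-signed later
]
ok = verify(sigs, {"rev": "1.4", "author": "alice"}, ("rev", "author"))
replayed = verify(sigs, {"rev": "1.5", "author": "alice"}, ("rev", "author"))
```

With this reading, serving the signed revision 1.4 as if it were 1.5 is rejected, while an expired signature next to a fresh one does no harm.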
5. Where to store?
a) for a prototype in commit message
b) for a real implementation in a RCS revision newphrase
Comments? :)
--
Sylvain