Re: Non-committers can't keep authenticated forks updated

guix-devel
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: Non-committers can't keep authenticated forks updated

From:	45mg
Subject:	Re: Non-committers can't keep authenticated forks updated
Date:	Thu, 16 Jan 2025 17:29:42 +0000
Hi Liliana,

Liliana Marie Prikler <liliana.prikler@gmail.com> writes:

> Am Donnerstag, dem 16.01.2025 um 13:10 +0000 schrieb 45mg:
>> As the 'Authenticate Your Git Checkouts'
>> blog post [9] pointed out, we wouldn't need `guix git authenticate`
>> if we were willing to delegate our security to a trusted third party,
>> like all the open source projects that sport those "fancy “✅
>> verified” badges as found on GitLab and on GitHub" do. We shouldn't
>> force anyone hosting a fork to do so as well.
> I mean, you can host your own fork and use the fancy “✅ verified” badge
> of your host as source of trust – it just won't be checked by `guix
> pull', that's all.  If you do do that, I'd recommend using a file://
> URI with a local checkout for your channel, so that you can verify that
> little check mark on your own (then you only need to trust your own
> file system).

Yeah, I know I can. My point is that people who use remote forks
shouldn't have to rely on a trusted third party. We've figured out a way
for upstream Guix not to have to, now let's try to extend that to forks.

>> > I think you're making this more complicated than it needs to be. 
>> > checkout, authenticate, rebase*, merge* ought to have you covered.
>> > 
>> > * you can authenticate after these if you're paranoid 
>> 
>> The complexity is due to the requirements of not bumping the channel
>> introduction (to avoid the increased attack surface from having to
>> keep obtaining the updated one, as I discussed earlier), keeping fork
>> history intact (to avoid force pulls), keeping upstream history
>> intact, and being able to authenticate both upstream and fork
>> commits. If you can think of a simpler method that meets these
>> requirements, I'd love to hear it.
> Guix committers are more than happy to use work trees and rebases,
> which simplify this a lot – again, it's as simple as pull,
> authenticate, rebase.

No doubt this works for those with commit access. The title of this
issue is 'Non-committers can't keep authenticated forks updated'.

Since most future committers will take years to attain that status, and
many (most?) Guix contributors can't commit (heh) to being committers, I
think it would be a good thing for them to be able to make use of our
security mechanisms.

(Actually, if you mean the variant of rebasing you describe below, then
I see what you mean, never mind)

> W.r.t. keeping history intact, we had the following exchange on IRC
> yesterday.
>
> <Rutherther>  lfam: that's interesting that there is really a merge
> commit, for example if I remember right, the core updates merge few
> months ago happened by directly appending the commits instead of a
> merge commit
> <lfam>        Yes, there are two ways to do it (rebase and merge) and it's a
> matter of taste
> <lfam>        Of course there is a correct choice, as with all questions of
> taste ;)
> <Rutherther>  I personally prefer a merge commit, since it has two
> parents, you can track where the previous master pointed to
> <lfam>        And I prefer a rebase. But ultimately it's up to whoever is in
> front of the keyboard
> <lilyp>       FWIW, a rebase is cleaner, but requires that only one person
> signs off commits on any given branch (or else you're signing commits
> that someone else signed before and have to update the trailer… not
> ideal)
> <lfam>        It doesn't matter who signs the commits as long as they are
> authorized. That's the security model we have
>
> So yeah, even for (branch-)local work at least some committers prefer
> rebasing.

That seems to be a discussion about a merge commit in upstream Guix, not
about the kind of merge commits that I'm trying to allow.

Again, not disputing that things work fine for people with commit
access. Perhaps that is part of why this issue hasn't been addressed
before :P

>> > No, it wouldn't.  You would rebase those changes on top of what you
>> > already have on those respective branches.
>> 
>> It looks like at least one of us is misunderstanding rebasing. Could
>> be me, so I'm consulting the relevant chapter from the Pro Git book
>> [11] for a refresher.
>> 
>> Let's imagine that the first example given there represents our fork
>> of Guix, where the 'experiment' branch marks the beginning of our
>> fork (and its channel introduction) and the 'master' branch tracks
>> upstream Guix.  After `git rebase master`, the commit that used to be
>> C4 is gone, and now C4' takes its place. It may contain the same
>> changes, but it's a different commit - so it (and any commits that
>> it's the parent of) has a different hash. So the channel introduction
>> has changed, and so has the entire history of the `experimental`
>> branch. So we need to force-pull.
> Yes, that's one variant – the one where you need to keep bumping your
> channel introductions.  The other direction would be to rebase Guix
> changes on top of your local branch.  This keeps your channel
> introduction as-is.

Ah, I see. Actually, I think that might work... if I create a
'upstream-backup' branch before rebasing, and reset the 'upstream'
branch to that branch after, I can keep the full history of upstream,
and authenticate it separately. And thanks to Ricardo's suggestion [12]
to compare by Change-ID, it should even be possible to find corresponding
commits between my fork and upstream. Looks like this solution meets all
four requirements that I stated earlier, and it wouldn't be TOO annoying
or complicated to script either.

One limitation I can think of is that you can't really verify whether a
commit ostensibly rebased from master is actually the same commit
(imagine 100 commits in a rebase, are you going to check the diff for
each? Versus with a merge you only have to check that the merge commit
looks sane). Then again, this is really only a problem if you're using
someone ELSE's fork and need to ensure they've not gone mad or evil,
which is not my personal use-case, and perhaps not that common.

You know what, I think I'll give it a try. If there aren't any unforseen
complications, then this solution may be good enough for me, and perhaps
good enough to add to the documentation as a recommendation so that
others don't have to puzzle this out themselves. Thanks for giving me
the idea!

(At any rate, feel free to reply to my points here in the meantime; If
it turns out that the rebase workflow is insufficient for some reason
then I can pick up where we left off)

>> > > 
>> […]
>> This led my to think of an attack that's possible with my design - if
>> I want to screw with anyone `guix pull`ing from my fork, I can merge
>> upstream into my fork branch, add a bunch of malicious commits, and
>> then make the upstream branch ref point to the latest such commit.
>> Now anyone pulling from my fork will recieve the malicious commits as
>> part of upstream's history - since no commit hashes needed to change,
>> the pull is a regular fast-forward one, with no indication that
>> anything is amiss. Authentication will succeed since the malicious
>> merge commit has our fork as its (first) parent, and that parent has
>> the primary introduction as its most recent ancestor.
>> 
>> The takeaway here is that anyone authorized via the primary
>> introduction can fake new upstream commits.
> Care to state how designating one introduction as "primary" adds to
> security here?  

The problem I'm trying to solve is that you can't merge upstream into
your fork unless you sign the merge commit with a key authorized in
upstream, because of the intersections-of-parent-authorizations design.
Tomas's solution was to do away with the 'intersections' requirement,
and allow a key that's authorized for any parent to sign the merge
commit; this is vulnerable to attacks like [4]. My solution falls
somewhere in between - we keep the 'intersections' requirement, /except/
when one of the parent commits is a descendant of an introduction
designated as the 'primary introduction'.

If you drop the 'primary introduction' designation (make all of them
'primary'), then you're basically dropping the intersections
requirement. IOW, we've returned to the situation in Tomas's
solution, where [4] is possible - anyone with even a single signed
commit in Guix can now create a merge commit between Guix and your fork,
even if their key was later removed from Guix.

>> So why bother with additional introductions at all, then? Because as
>> far as I can tell, they are still the only solution mentioned so far
>> that satisfies the requirements I mentioned earlier:
>> > not bumping the channel introduction (to avoid the increased attack
>> > surface from having to keep obtaining the updated one, as I
>> > discussed earlier), keeping fork history intact (to avoid force
>> > pulls), keeping upstream history intact, and being able to
>> > authenticate both upstream and fork commits
>> ...and yes, you do have to trust the fork maintainer to not
>> deliberately mess those things up. But that's nothing new - even in
>> the existing design, we have to trust everyone who can make trusted
>> commits not to mess things up on purpose.
> You are trading one attack surface for another.  Again, all is fine
> while you only have to trust yourself, but weakening an invariant is
> weakening an invariant (:

Which attack surface am I introducing or worsening? The 'authorized
committer can mess things up' attack surface was already /there/. The
attack I described does not make it worse AFAICT - its potential impact
is to compromise the ability to easily authenticate all upstream and
fork commits in one go, but that ability is not something we previously
had in the first place.

>> So what does this all of this mean for the statement of my design?
>> Well, it means that we need to stop thinking in terms of 'which
>> branch can be merged into which?' and more in terms of 'which merge
>> commits can be authenticated?'. And the answer to that question, with
>> my design, would be:
>> 
>> 1. Any merge commit signed with a key in the intersection of its
>>    parents' .guix_authorizations. (Standard authorization invariant.)
>> 
>> 2. Any merge commit that doesn't meet the above conditions, but has a
>>    parent whose most recent ancestor is the primary introduction, and
>>    is signed by a key in the .guix_authorizations of that parent. (My
>>    weakened authorization invariant.)
> That's a pretty long way of saying "Any merge commit signed with a key
> in one of its parents' .guix_authorizations."

Not quite. Merge commits between upstream Guix and your fork, whose
signing key is in the .guix_authorizations of its parent in upstream but
not in the .guix_authorizations of its parent in your fork, would not be
authenticated. (This is the kind of merge commit that would be used for
[4].)

> It is (by your design) trivial to have the "primary introduction"
> under your control.

Yes, as long as you're using your own fork this will be the case.

>> Finally, let me restate the conditions for authenticating merge
>> commits, taking this into account:
>> 
>> --8<---------------cut here---------------start------------->8---
>> For commits that have multiple parents - ie. merge commits - we
>> weaken the invariant as follows:
>> 
>> 1. If all parents have the primary introduction as their most recent
>>    ancestor, then the invariant holds as usual.
>>    
>> 2. If one or more parents has the primary introduction as its most
>>    recent ancestor (call these the 'primary parents'), and the rest
>>    have any of the additional introductions, then the merge commit is
>>    authenticated if and only if it's signed by a key authorized in 
>>    all of the primary parents.
>>    
>> 3. If all parents have the same additional introduction as their most
>>    recent ancestor, then the invariant holds as usual.
>>    
>> 4. If none of the parents have the primary introduction as their most
>>    recent ancestor, nor do they have the same additional
>>    introduction, then the merge commit cannot be authenticated.
>> --8<---------------cut here---------------end--------------->8---
>> 
>> I merged 2a. into 2., and removed 2b.
>> 
>> Now let me try to respond to you:
>> 
>> > Yeah, I think this scheme will still end up in [4].  As pointed out
>> > in [8], "primary" is just a convention that we can't rely on.
>> 
>> Not really. As I discussed, [8] points out that /merge order/ is the
>> convention that we can't rely on. Introductions can be deliberately
>> specified as primary or additional, whether via command-line flags or
>> separate sections in .git/config.
>> 
>> > So let's just talk about the idea of widening one channel
>> > introduction to any number of channel introductions – we can always
>> > store a mapping of HEAD → first authenticated commit and then
>> > assert that this set is a subset of what we declare as
>> > introductions.  (This mapping will also make authentication as
>> > efficient as it currently is, since we don't need to reauthenticate
>> > everything all the time.)
>> 
>> I'm not sure what you mean. What do you mean by "mapping of HEAD →
>> first authenticated commit"? Does this perhaps mean 'all commits
>> between the latest one and the first authenticated commit'?
> Little refresher: Guix stores a list of already authenticated commits
> so as to not redo this work all over again.  If we were to allow
> multiple introductions, we would also need to find the first
> authenticated commit among them to match against the channel
> introductions.

Ok, so I guess you meant what I thought you did.

We can cache the introduction of each commit. In fact, that could
actually help block the attack I described in my previous mail [12], as
in that attack the introduction for all the upstream commits suddenly
changes to the primary introduction; if we manage to detect this, the
conflict with the cached introduction can tell us that something is off.
It's not a real defense as it wouldn't work on a fresh clone, but it's
something, at least.

>> What does "assert that this set is a subset of what we declare as
>> introductions" mean?
> Let's say that you work on branches B, C, and D with "primary"
> introductions I, J , and K.  If you want to merge C into B, you need to
> remember that B has I as its primary introduction, C has J, and so on.

Ah, got it. Yes, that's correct. (Except that there's only one primary
introduction, at least as I intended it.)

>> > Is this good enough?  No: an attacker could easily add their own
>> > introduction and call it a day.  In fact, this scheme is even worse
>> > than what was exploited in [4], because they never need commit
>> > access to the Guix repo to do so.  Ahh, but wait!  `guix pull` on
>> > the user's side uses their clean set of channels for
>> > authentication.  Those only have upstream Guix… unless you actually
>> > pull your own fork or manage an attack as outlined below (in which
>> > case you do need commit access for some amount of time).
>> 
>> I should point out - my design does not require us to distribute any
>> introductions besides Guix's existing one, nor will it provide any
>> mechanism to automatically 'install' someone else's introduction.
> Yes it will, per `%default-guix-channel'.

Ok, technically true. And someone with the ability to make trusted
commits upstream could modify that so that a fresh `guix pull` skips
whatever malicious stuff they've done. But my solution doesn't make this
situation any worse AFAICT.

>> An introduction is a tuple of (introductory commit, key that signs
>> it) that you specify as arguments to `guix git authenticate`. An
>> attacker would have to convince the entire Guix community to specify
>> their (the attacker's) own introduction on the command line (or
>> directly add it into .git/config). And given that there is no reason
>> to ever do so unless you're using someone's fork... that's a hard
>> sell.
> Well, another hard sell would be introducing a feature to `guix git-
> authenticate` that must not ever be used in Guix itself.  Now, since
> you are already soft-forking Guix, you can obviously add this to your
> own guix command, but do beware the dragons you're summoning with it.

By that logic, we would never have gotten the ability to specify
multiple channels, since that isn't used in upstream Guix and doesn't
need to be.

> Cheers

[12] https://lists.gnu.org/archive/html/bug-guix/2025-01/msg00130.html
[13] https://lists.gnu.org/archive/html/bug-guix/2025-01/msg00135.html
[Prev in Thread]
Current Thread
[Next in Thread]
Re: Non-committers can't keep authenticated forks updated, (continued)
- Re: Non-committers can't keep authenticated forks updated, Attila Lendvai, 2025/01/15
  - [Attila Lendvai] Re: Non-committers can't keep authenticated forks updated, 45mg, 2025/01/16
Prev by Date: Common Lisp packaging conventions
Next by Date: Re: bug#74736: [PATCH v2 0/1] Add Request-For-Comment process.
Previous by thread: Re: Non-committers can't keep authenticated forks updated
Next by thread: Re: Non-committers can't keep authenticated forks updated
Index(es):
- Date
- Thread