P2P Guix package building and distribution
From: Christine Lemmer-Webber
Subject: P2P Guix package building and distribution
Date: Wed, 21 Aug 2024 18:07:58 -0400
User-agent: mu4e 1.12.4; emacs 29.3
"Jonathan Frederickson" <jonathan@terracrypt.net> writes:
> On Tue, Aug 13, 2024, at 12:23 PM, Sergio Pastor Pérez wrote:
>
>> Wouldn't it be enough to have a few independent seeders that have the
>> same derivation output? We could have a field in the p2p service type
>> which allows the user to configure a "level of trust", where the user
>> specifies the minimum number of seeders with the same output for the
>> daemon to accept the substitute.
>
> This might be enough if you could do it, but the trouble is
> identifying "independent" seeders. If you get the same output from
> five different seeders, that could be five different people... or I
> could have set up five different nodes participating in the swarm
> serving my malicious substitutes. (This is known as a Sybil attack.)
>
> But maybe taking inspiration from this... perhaps you could do
> something more akin to some of the web-of-trust features of
> e.g. PGP. In other words, you might have the ability to partially
> trust a server's substitutes such that you'll only use a substitute if
> N other partially trusted servers (or at least one fully trusted
> server) serve up the same content. This would still not let you have a
> totally permissionless set of P2P substitutes, but it would allow the
> community to build a list of individuals who are at least trusted not
> to collude with one another, if not fully trusted.
>
> Though there's a detail that might need addressing for this to
> work... you would want this to be an indication that multiple
> individuals were able to reproducibly build the same packages
> bit-for-bit. But my impression is that substitutes served by 'guix
> publish' are always signed with the substitute server's signing key,
> regardless of where they were built. That does mean that if 4 people
> were to pull substitutes of a package from one other person, those 5
> people would end up serving substitutes originating from one
> person. You may want a way for someone running a substitute server to
> additionally attest that they had individually built the derivation in
> question.
I definitely think that this is a future we'd want with Guix.

Goals:

- That our software be fully reproducible in the first place
- P2P distribution mechanisms for inputs (this one is relatively easy!)
- "Community participation" in building derivations
- P2P distribution of built artifacts (actually, if you have p2p
  distribution of inputs, you can have this one relatively for
  free/cheap)

There are challenges with all of this, but really we know well enough
what p2p content-addressed infrastructure looks like; this isn't the
hard part. Figuring out how to build a set of "semi-trusted sources" and
the UX around it is the hard part.
(In a weird way, compiling and verifying software is a "soft trap door
problem", I have been thinking. Certainly not as much so as the
functions we require for cryptography to work, but it's still a trap
door, which is why build farms are expensive.)
I guess a worthwhile question is "where are the costs coming from"?
Ludovic said:
> The various options and back-of-the-envelope estimates we came up with
> are as follows:
>
> 1. Buying and hosting hardware:
> 250k€ for hardware
> 3k€/month (36k€/year)
>
> 2. Renting machines (e.g., on Hetzner):
> 6k€/month (72k€/year)
>
> 3. Sponsored:
> get hardware and/or hosting sponsored (by academic institutions or
> companies).
So I am guessing bandwidth costs are significant, but the 250k EUR for
hardware indicates this is especially a build farm issue rather than a
content distribution / bandwidth issue. (Do I have that right?)
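
As a rough sanity check on options 1 and 2 quoted above, here is a minimal Python sketch of the break-even point between buying and renting. It uses only the quoted figures and assumes, purely for illustration, that the monthly costs stay flat and that the hardware needs no replacement; electricity, admin time and bandwidth are not modelled.

```python
# Figures taken from the quoted estimates; everything else is an assumption.
HARDWARE_EUR = 250_000          # option 1: one-off hardware purchase
HOSTING_EUR_PER_MONTH = 3_000   # option 1: hosting the bought hardware
RENTING_EUR_PER_MONTH = 6_000   # option 2: renting machines


def cumulative_cost_owned(months):
    """Total spent after `months` when buying and hosting hardware."""
    return HARDWARE_EUR + HOSTING_EUR_PER_MONTH * months


def cumulative_cost_rented(months):
    """Total spent after `months` when renting machines."""
    return RENTING_EUR_PER_MONTH * months


# Renting costs this much more per month than hosting owned hardware.
monthly_saving = RENTING_EUR_PER_MONTH - HOSTING_EUR_PER_MONTH

# Months until the up-front hardware cost is paid back by that saving.
break_even_months = HARDWARE_EUR / monthly_saving

print(f"Break-even after about {break_even_months:.1f} months "
      f"({break_even_months / 12:.1f} years)")
```

Under those simplifying assumptions, buying only becomes cheaper than renting after roughly seven years, which is consistent with the up-front hardware cost dominating the picture.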
Regardless, something I have thought about... #2 and #3 are cheaper but
"less preferable" than #1 because of security concerns, per my
understanding.
But... what if we managed to make #2 and #3 *more secure*? Here is an
idea that is semi-p2p, and maybe a path towards a more full p2p option,
that we could possibly pursue.
- We have machines hosted that we trust a bit less, at some hosting
  facility, or possibly sponsored from someplace we trust even less.
  Let's imagine we went with the imaginary MegaCloud Inc.
- We have a set of keys for semi-trusted "Guix Builders", who have
  machines that they run in their houses/lodgings/etc which sit around
  compiling Guix packages all day. These could be e.g. people who aren't
  even committers but have gained the trust of committers and maybe
  have even come to something like Guix Days in person.
Now imagine for a moment that I wanted to download the latest version
of... let's go oldschool in our FOSS references and say some
expensive-to-compile browser named IceWeasel. ;)
I want to download the latest version of IceWeasel. I could compile it
myself, or I could get a substitute. #1 feels like the most
"trustworthy" option at first glance, but it could actually be a single
point of failure for an attacker to target.
Okay, but what if instead I had the option to download something signed
off by *all of* the MegaCloud build service and two "Guix Builders", and
they all came to the same hash?
This seems even better than #1 from a security/integrity perspective, I
think.
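
The acceptance rule described above could be sketched roughly as follows. This is only an illustration in Python, not anything Guix implements today: the `Attestation` record, the role names and the `accepted_hash` function are all hypothetical, and the sketch assumes signatures have already been verified against a list of keys the user has chosen to trust.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Attestation:
    """Hypothetical record: 'signer built this derivation and got this hash'."""
    signer: str       # name of an already-verified, pre-approved signing key
    role: str         # "cloud" (e.g. MegaCloud) or "builder" (a Guix Builder)
    output_hash: str  # hash of the build output the signer obtained


def accepted_hash(attestations, builders_required=2):
    """Return the agreed output hash, or None if the policy is not met.

    Policy (an assumption based on the scenario above): the cloud build
    service and at least `builders_required` distinct Guix Builders must
    all report the same hash, and nobody may report a different one.
    """
    hashes = {a.output_hash for a in attestations}
    if len(hashes) != 1:
        return None  # no attestations at all, or at least one disagreement

    has_cloud = any(a.role == "cloud" for a in attestations)
    builders = {a.signer for a in attestations if a.role == "builder"}
    if has_cloud and len(builders) >= builders_required:
        return next(iter(hashes))
    return None


if __name__ == "__main__":
    atts = [
        Attestation("megacloud", "cloud", "abc123"),
        Attestation("alice", "builder", "abc123"),
        Attestation("bob", "builder", "abc123"),
    ]
    print(accepted_hash(atts))      # policy satisfied
    print(accepted_hash(atts[:2]))  # only one builder, so rejected
```

Counting distinct signers only helps if the builder keys were vetted out of band; the sketch does nothing by itself against someone registering many keys.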
Just speculating...
- Christine