guix-devel

Re: About SWH, let avoid the wrong discussion


From: Vagrant Cascadian
Subject: Re: About SWH, let avoid the wrong discussion
Date: Fri, 21 Jun 2024 09:51:30 -0700

On 2024-06-21, MSavoritias wrote:
> On Fri, 21 Jun 2024 11:46:56 +0200
> Andreas Enge <andreas@enge.fr> wrote:
>> Am Fri, Jun 21, 2024 at 12:12:13PM +0300 schrieb MSavoritias:
>> > and as I mention in my first email I want to apply social pressure and 
>> > make it clear to package authors what is happening so we can move to an 
>> > opt-in model.  
>> 
>> Well, the opt-in model is in place: As soon as I put my code under a free
>> license on the Internet, I opt in for it to be harvested by SWH (and anybody
>> else, including non-friendly companies and state actors).
>
> That may be how you have understood it but that is not how most people 
> understand it.
> See for example mirroring videos that creators have made online, or more 
> recently some activitypub software harvesting posts for a search engine.

I think the fundamental difference is that such videos or activitypub
posts are not necessarily released under a license that *expressly*
permits sharing.

In most cases, those posts and videos are released without any
license at all, and the person retains the legal, social, moral and
ethical rights to decide how that content is shared, if at all. (I am
using those terms in the "plain" English sense, although they
may have specific legal meanings in some contexts.)


> As I have been saying a lot in this thread (because there seem to be a
> lot of people in the Guix community not familiar that legal are not
> the same as social rules):

> -Just because you CAN do something doesn't mean you SHOULD. In the sense that 
> yes somebody can probably harvest all my posts from activitypub and post them 
> somewhere else, 
> in practise they are an asshole tho and probably are going to be
> defederated pretty fast for breaking the social rules of common human
> decency :)

With something released under a Free Software license, calling someone
an "asshole" simply for using the permissions granted by that license,
by the very person who granted those permissions, starts to feel a bit
like a baited trap and honestly, maybe outright duplicitous. Certainly
rude, at the very least.

Again, that is different from some arbitrary post or video or cat
picture on the internet, which more likely than not has no explicit
permissions granted.


> TBH it seems you are not the only one in this thread not knowing that laws 
> (legal rules of states) ie. the FSF licenses and work and whatever, are not 
> the same as social rules.
> But given that Guix has a CoC and social rules on top of that I am hopeful :)

Well... free software ... is a bunch of social rules. Licenses are
social rules. Contracts are social rules. Laws are social
rules. Admittedly, a lot of the mechanics involved in law creation and
enforcement are dubious and suspect and weighted in favor of large,
wealthy and/or otherwise powerful entities...

I am not sure arguing about social vs. legal vs. whatever is even really
a useful direction... almost missing the point entirely.

I would rather ask... what is the intention of the Free Software
movement?

The licenses are merely imperfect tools to achieve those aims, and a
clever way to leverage some specific legal mechanisms, but the licenses
are not an end unto themselves.

For me personally, it is about creating a shared commons that can be
used to build healthy thriving local, regional, global and virtual
communities that do useful or interesting things... I dare dream that
some of those collaboration skills leak into other aspects of life too,
not just software!

I have a lot of doubts that the LLM training from SWH data is going to
further this vision for free software... while the overall work of SWH
most definitely does.


Given my crude understanding of how LLM training works, it seems hard to
imagine that it could actually produce models that comply with all of
the license terms of innumerable free software projects, some of which
have mutually incompatible terms. For just a handful of examples that
are incompatible with the GPL:

  https://www.gnu.org/licenses/license-list.html#GPLIncompatibleLicenses

So unless they are very extremely exceedingly excruciatingly careful
about not including incompatible licenses... I have significant doubts.
The incentives are just not there.


I am a bit disappointed with the very optimistic take SWH has regarding
LLMs for code:

  https://www.softwareheritage.org/2023/10/19/swh-statement-on-llm-for-code/

Even with all the identifiers to show which code a model was trained on,
the whole point of a large model is it is built from a huge
dataset... my guess is it takes significantly more effort to audit that
dataset than to create an LLM with it.

Which is to say license compliance, one of the few tools of the Free
Software movement, seems unlikely to be effective. It is barely
effective with more traditional software development.


In short, er, at length, I am really not sure what to do.

I find the opt-out/opt-in angle to be almost tangential.

I find all the hype, and more importantly, the active harm done with
LLMs, to be appalling: a very serious threat to free software, various
disadvantaged communities, and possibly the literal liveability of our
biggest commons so far, dear planet earth.


If some social pressure from the Guix community could improve things, by
all means, though I worry that it might be at best performative rather
than effective, especially if the pressure is placed N parties removed
from the source of the actual problem (e.g. those irresponsibly training
LLMs without respecting the licenses).


Aaaaaand... I have to cut myself off now. :)


live well,
  vagrant
