Re: Re-approaching package tagging

From: swedebugia
Subject: Re: Re-approaching package tagging
Date: Wed, 19 Dec 2018 08:42:24 +0100

On 2018-12-19 07:51, swedebugia wrote:
On 2018-12-18 08:48, Catonano wrote:

On Mon, 17 Dec 2018 at 22:10, swedebugia <address@hidden> wrote:

    Hi :)

    On 2018-12-17 20:01, Christopher Lemmer Webber wrote:
     > Hello,
     > In the past when we've discussed package tagging, I think Ludo' has been
     > against it, primarily because it's a giant source of bikeshedding.  I
     > agree that it's a huge space for bikeshedding... no space provides more
     > bikeshedding than naming things, and tagging things is a many to many
     > naming system.
     > However, I will say that finding packages based on topical interest is
     > pretty hard right now.  If I want to find all the available
     > address@hidden:~$ guix package -A rogue
     > hyperrogue    10.5    out     gnu/packages/games.scm:3652:2
     > roguebox-adventures   2.2.1   out  gnu/packages/games.scm:1047:2
     > Hm, that's strange, there's definitely more roguelikes that should show
     > up than that!  A more specific search is even worse:
     > address@hidden:~$ guix package -A roguelike
     > address@hidden:~$
     > What I should have gotten back:
     >   - angband
     >   - cataclysm-dda
     >   - crawl
     >   - crawl-tiles
     >   - hyperrogue
     >   - nethack
     >   - roguebox-adventures
     >   - tome4
     > So I only got 1/4 of the entries I was interested in in my first
     > Too bad!
     > I get that we're opening up space for bikeshedding and *that's true*.
     > But it seems like not doing so makes things hard on users.
     > What do you think?  Is there a way to open the (pandora's?) box of tags
     > safely?
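
The many-to-many relationship described above can be illustrated with a small inverted index. This is only a sketch: Guix has no such tag field in the message above, the `PACKAGE_TAGS` table and both function names are hypothetical, and the only thing taken from the message is the list of eight packages a "roguelike" search should have returned.

```python
from collections import defaultdict

# Hypothetical tag assignments. The package names come from the list in the
# message; the tags themselves are assumptions made for this illustration.
PACKAGE_TAGS = {
    "angband": {"game", "roguelike"},
    "cataclysm-dda": {"game", "roguelike"},
    "crawl": {"game", "roguelike"},
    "crawl-tiles": {"game", "roguelike"},
    "hyperrogue": {"game", "roguelike"},
    "nethack": {"game", "roguelike"},
    "roguebox-adventures": {"game", "roguelike"},
    "tome4": {"game", "roguelike"},
}


def build_tag_index(package_tags):
    """Invert a package -> tags mapping into a tag -> packages mapping."""
    index = defaultdict(set)
    for package, tags in package_tags.items():
        for tag in tags:
            index[tag].add(package)
    return index


def packages_with_tag(index, tag):
    """Return the sorted package names carrying `tag` (empty list if none)."""
    return sorted(index.get(tag, ()))


if __name__ == "__main__":
    index = build_tag_index(PACKAGE_TAGS)
    for name in packages_with_tag(index, "roguelike"):
        print(name)
```

A lookup by tag does not depend on the tag appearing in the package name, which is why it would find `nethack` or `crawl` where a name search for "rogue" does not.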

    Yes and no.

    Pjotr and I have discussed this relating to biotech software. He said
    that many scientists have a hard time finding the right tools for the job.

    I proposed tight integration with wikidata[1] (every software in the
    world will eventually have an item there) and Guix (QID on every
    and lookup/category integration) and leave all the categorizing to
    Ha problem sidestepped, they are bikeshedding experts over there in
    wikiland! :D

    The advantage of this is that everyone using wikidata (every package
    manager) could pull the same categorization so we only do it once in a

    What do you think?


There is also the Free Software Directory

I don't know what the relationship between Wikidata and the FSD is

Does Wikidata import data from the FSD? Or vice versa?

I don't know. For now at least, Wikidata keeps a reference to the FSD on software entries that exist in the FSD.

We could integrate the FSD also but I have yet to investigate if they provide an API for their entries.

Anyways I view FSD as a subset of Wikidata/Wikipedia. Wikidata is the node and FSD the leaf. Wikidata/Wikipedia will probably within a few years contain the data or links to the data that now exists in the FSD.

Correct me if I'm wrong but the only advantage of FSD over Wikidata & Wikipedia is that they do not include references to proprietary software at all.

In my view it is more feasible to compile the information in a structured way in a central node and then pull the relevant bits to the leaf.

E.g. FSD of the future could be generated from all wikidata-entries and extracts of wikipedia that are an instance of This would avoid fragmentation and help concentrate on building a large shared collective source of all knowledge within the wiki-community. FSD could exist anyhow and surely help enrich the upstream data.

Similarly we could generate a Wikipedia subset without any entries pointing to (evil) private corporations (any entry that is part of or whatever). I can't imagine what this would be good for, but it is possible.
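
As a rough illustration of generating such a subset, the sketch below builds a SPARQL query for the public Wikidata Query Service that lists every item that is an instance of (P31) a given class. The function names are hypothetical, and the class QID is left as a parameter because the message does not name it. The code only builds the query and the request URL; it does not contact the network.

```python
import re
from urllib.parse import urlencode

# Public Wikidata Query Service endpoint.
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"


def build_subset_query(class_qid, limit=100):
    """Return a SPARQL query for items that are an instance of `class_qid`.

    `class_qid` is a Wikidata item identifier such as "Q42". Which class to
    use is an assumption left to the caller.
    """
    if not re.fullmatch(r"Q[1-9]\d*", class_qid):
        raise ValueError(f"not a Wikidata QID: {class_qid!r}")
    if limit < 1:
        raise ValueError("limit must be at least 1")
    return (
        "SELECT ?item ?itemLabel WHERE {\n"
        f"  ?item wdt:P31 wd:{class_qid} .\n"
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }\n'
        "}\n"
        f"LIMIT {limit}"
    )


def build_request_url(class_qid, limit=100):
    """Return a GET URL that asks the query service for JSON results."""
    params = {"query": build_subset_query(class_qid, limit), "format": "json"}
    return WDQS_ENDPOINT + "?" + urlencode(params)


if __name__ == "__main__":
    # "Q42" is only a syntactically valid placeholder, not the intended class.
    print(build_subset_query("Q42", limit=5))
```

Excluding entries instead of including them would be a matter of adding a `FILTER NOT EXISTS { ... }` clause to the same query.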

I cannot imagine that the information in the FSD would not be accepted in any of the Wikimedia projects. I could be wrong though, as I honestly have not visited or studied the FSD very much.

Also the license of the FSD (GFDL 1.2) differs from both Wikidata (CC0) and Wikipedia (CC-BY-SA 4.0 + GFDL 1.2).

This is not to their advantage in the long run.

I fear the FSD is already becoming unmaintained and obsolete with people favoring more open and smarter solutions from the wikimedia-projects (at least I am).

When it comes to completeness, at least 500,000 packages are missing from both Wikidata and the FSD (450,000+ MIT- and CC0-licensed npm packages). Would any of you like to import those twice? I wouldn't, and as I see it Wikidata is far superior in multiple ways for getting the job done and doing it well, with a big community backing it up with tools, bots, manual edits, et al. Who wants to update new versions in two places when we have over half a million free software packages to juggle?

Here is a small comparison example:
Top 8 JS packages according to (900,000+ repositories in total!) (I filtered out a few non-software entries)

1. angular.js
2. node
3. axios
not found in either
4. three.js
not found (poor search function in my view)
6. reveal.js
not found
not found
7. chart.js
not found
not found
8. json-server
not found
not found

Wikidata already contains way more entries and data on the entries I compared (e.g. node, npm, gcc) than FSD despite it being a much younger project.

Cheers Swedebugia
