[bug#49828] [PATCH 06/20] guix: Add ContentDB importer.


From: Leo Prikler
Subject: [bug#49828] [PATCH 06/20] guix: Add ContentDB importer.
Date: Thu, 05 Aug 2021 18:41:00 +0200
User-agent: Evolution 3.34.2

Hi,

On Monday, 02.08.2021 at 17:50 +0200, Maxime Devos wrote:
> * guix/import/contentdb.scm: New file.
> * guix/scripts/import/contentdb.scm: New file.
> * tests/contentdb.scm: New file.
> * Makefile.am (MODULES, SCM_TESTS): Register them.
> * po/guix/POTFILES.in: Likewise.
> * doc/guix.texi (Invoking guix import): Document it.
> [...]
> diff --git a/doc/guix.texi b/doc/guix.texi
> index 43c248234d..d06c9b73c5 100644
> --- a/doc/guix.texi
> +++ b/doc/guix.texi
> @@ -11313,6 +11313,31 @@ and generate package expressions for all those packages that are not yet
>  in Guix.
>  @end table
>  
> +@item contentdb
> +@cindex ContentDB
> +Import metadata from @uref{https://content.minetest.net, ContentDB}.
> +Information is taken from the JSON-formatted metadata provided through
> +@uref{https://content.minetest.net/help/api/, ContentDB's API} and
> +includes most relevant information, including dependencies.  There are
> +some caveats, however.  The license information on ContentDB does not
> +distinguish between GPLvN-only and GPLvN-or-later.  The commit id is
> +sometimes missing.  The descriptions are in the Markdown format, but
> +Guix uses Texinfo instead.  Texture packs and subgames are unsupported.
What is the "commit id"?  Is it the hash?  A tag?  Anything that
resolves to a commit?

Also, since ContentDB sounds fairly generic (a database of content?),
perhaps we ought to call this the "minetest" importer instead?

> [...]
> +;; The ContentDB API is documented at
> +;; <https://content.minetest.net>.
> +
> +(define %contentdb-api
> +  (make-parameter "https://content.minetest.net/api/"))
> +
> +(define (string-or-false x)
> +  (and (string? x) x))
> +
> +(define (natural-or-false x)
> +  (and (exact-integer? x) (>= x 0) x))
> +
> +;; Descriptions on ContentDB use carriage returns, but Guix doesn't.
> +(define (delete-cr text)
> +  (string-delete #\cr text))
> +
> +;; Minetest package.
> +;;
> +;; API endpoint: /packages/AUTHOR/NAME/
> +(define-json-mapping <package> make-package package?
> +  json->package
> +  (author            package-author) ; string
> +  (creation-date     package-creation-date ; string
> +                     "created_at")
> +  (downloads         package-downloads) ; integer
> +  (forums            package-forums "forums" natural-or-false) ; natural | #f
This comment and some others like it seem to simply be repeating
already present information.  Is there a use for them?  Should we
instead provide a third argument on every field to verify/enforce the
type?
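
To illustrate that last question, here is a rough sketch only, not a tested patch. `checked` is a hypothetical helper (it is neither in the patch nor in Guix), and I am assuming the optional converter slot of a field specification takes a one-argument procedure, the way `natural-or-false` is used above.

```scheme
;; Hypothetical helper: turn a predicate into a converter that returns
;; the value unchanged when it has the expected type and raises an
;; error naming the field otherwise.
(define (checked field-name predicate)
  (lambda (value)
    (if (predicate value)
        value
        (error "unexpected type for field" field-name value))))

;; Example converters built from ordinary predicates.
(define check-license (checked "license" string?))
(define check-downloads (checked "downloads" exact-integer?))

;; Assumed usage inside the mapping (untested):
;;   (license package-license "license" (checked "license" string?))
(check-license "GPLv3")
```

With something like this the trailing type comments would become redundant, because the type would be stated once, in code.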
> +  (issue-tracker     package-issue-tracker "issue_tracker") ; string
> +  (license           package-license) ; string
> +  (long-description  package-long-description "long_description") ; string
> +  (maintainers       package-maintainers ; list of strings
> +                     "maintainers" vector->list)
> +  (media-license     package-media-license "media_license") ; string
> +  (name              package-name) ; string
> +  (provides          package-provides ; list of strings
> +                     "provides" vector->list)
> +  (release           package-release) ; integer
> +  (repository        package-repository "repo" string-or-false) ; string | #f
> +  (score             package-score) ; flonum
> +  (screenshots       package-screenshots "screenshots" vector->list) ; list of strings
> +  (short-description package-short-description "short_description") ; string
> +  (state             package-state) ; string
> +  (tags              package-tags "tags" vector->list) ; list of strings
> +  (thumbnail         package-thumbnail) ; string
> +  (title             package-title) ; string
> +  (type              package-type) ; string
> +  (url               package-url) ; string
> +  (website           package-website "website" string-or-false)) ; string | #f
> +
> +(define-json-mapping <release> make-release release?
> +  json->release
> +  (commit               release-commit "commit" string-or-false) ; string | #f
> +  (downloads            release-downloads) ; integer
> +  (id                   release-id) ; integer
> +  (max-minetest-version release-max-minetest-version) ; string | #f
> +  (min-minetest-version release-min-minetest-version) ; string | #f
> +  (release-date         release-data) ; string
> +  (title                release-title) ; string
> +  (url                  release-url)) ; string
> +
> +(define-json-mapping <dependency> make-dependency dependency?
> +  json->dependency
> +  (optional? dependency-optional? "is_optional") ; #t | #f
Also known as "boolean".
> +  (name dependency-name) ; string
> +  (packages dependency-packages "packages" vector->list)) ; list of strings
> +
> +(define (contentdb-fetch author name)
> +  "Return a <package> record for package NAME by AUTHOR, or #f on failure."
> +  (and=> (json-fetch
> +          (string-append (%contentdb-api) "packages/" author "/" name "/"))
> +         json->package))
Is there a reason for author and name to be separate keys?  To me it
makes more sense to take AUTHOR/NAME as a single search string from
users and then perform queries based on that.  If ContentDB allows
searching, we might also resolve a bare NAME to a single package where
possible and otherwise error out, telling the user to choose one.
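
As a minimal sketch of the first half of that idea (the search fallback is left out): `parse-author/name` is a hypothetical name, and returning a pair or #f is just one possible convention.

```scheme
(use-modules (ice-9 match))

;; Hypothetical helper: split a user-supplied "AUTHOR/NAME" string into
;; a pair (AUTHOR . NAME).  Return #f when the string does not consist
;; of exactly two non-empty components.
(define (parse-author/name str)
  (match (string-split str #\/)
    ((author name)
     (and (not (string-null? author))
          (not (string-null? name))
          (cons author name)))
    (_ #f)))

;; A well-formed specification yields a pair; a bare NAME yields #f,
;; which is where a search (or an error message) could take over.
(parse-author/name "Jeija/mesecons")
(parse-author/name "mesecons")
```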

> [...]
> +
> +(define (important-dependencies dependencies author name)
> +  (define dependency-list
> +    (assoc-ref dependencies (string-append author "/" name)))
> +  (filter-map
> +   (lambda (dependency)
> +     (and (not (dependency-optional? dependency))
> +          ;; "default" must be provided by the 'subgame' in use
> +          ;; and does not refer to a specific minetest mod.
> +          ;; "doors", "bucket" ... are provided by the default minetest
> +          ;; subgame.
> +          (not (member (dependency-name dependency)
> +                       '("default" "doors" "beds" "bucket" "doors" "farming"
> +                         "flowers" "stairs" "xpanes")))
> +          ;; Dependencies often have only one implementation.
> +          (let* ((/name (string-append "/" (dependency-name dependency)))
> +                 (likewise-named-implementations
> +                  (filter (cut string-suffix? /name <>)
> +                          (dependency-packages dependency)))
> +                 (implementation
> +                  (and (not (null? likewise-named-implementations))
> +                       (first likewise-named-implementations))))
> +            (and implementation
> +                 (apply cons (string-split implementation #\/))))))
> +   dependency-list))
What exactly does the likewise-named-implementations bit do here?

> +(define (contentdb-recursive-import author name)
> +  ;; recursive-import expects upstream package names to be strings,
> +  ;; so do some conversions.
> +  (define (split-author/name author/name)
> +    (string-split author/name #\/))
+1 for my AUTHOR/NAME suggestion above, since recursive imports already
require this splitting.
> +  (define (author+name->author/name author+name)
> +    (string-append (car author+name) "/" (cdr author+name)))
> +  (define* (contentdb->guix-package* author/name #:key repo version)
> +    (receive (package . maybe-dependencies)
> +        (apply contentdb->guix-package (split-author/name author/name))
> +      (and package
> +           (receive (dependencies)
> +               (apply values maybe-dependencies)
> +             (values package
> +                     (map author+name->author/name dependencies))))))
> +  (recursive-import (author+name->author/name (cons author name))
> +                    #:repo->guix-package contentdb->guix-package*
> +                    #:guix-name
> +                    (lambda (author/name)
> +                      (contentdb->package-name
> +                       (second (split-author/name author/name))))))
> +
> +;; A list of license names is available at
> +;; <https://content.minetest.net/api/licenses/>.
> +(define (string->license str)
> +  "Convert the string STR into a license object."
> +  (match str
> +    ("GPLv3"        license:gpl3)
> +    ("GPLv2"        license:gpl2)
> +    ("ISC"          license:isc)
> +    ;; "MIT" means the Expat license on ContentDB,
> +    ;; see <https://github.com/minetest/contentdb/issues/326#issuecomment-890143784>.
> +    ("MIT"          license:expat)
> +    ("CC BY-SA 3.0" license:cc-by-sa3.0)
> +    ("CC BY-SA 4.0" license:cc-by-sa4.0)
> +    ("LGPLv2.1"     license:lgpl2.1)
> +    ("LGPLv3"       license:lgpl3)
> +    ("MPL 2.0"      license:mpl2.0)
> +    ("ZLib"         license:zlib)
> +    ("Unlicense"    license:unlicense)
> +    (_ #f)))
The link mentions that ContentDB now supports all SPDX identifiers.
Do we have an SPDX->Guix converter lying around in some other importer
that we could use as the default case here (especially w.r.t. "or later")?
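
Roughly what I have in mind, as a sketch under assumptions: `fake-spdx->license` is a stand-in for whichever SPDX converter we end up using, its two entries are only examples, and licenses are represented as plain symbols here so that the snippet runs outside of Guix.

```scheme
(use-modules (ice-9 match))

;; Stand-in for an existing SPDX converter (assumption: the real one
;; would live in some other importer or utility module).
(define (fake-spdx->license str)
  (match str
    ("GPL-3.0-only"     'license:gpl3)
    ("GPL-3.0-or-later" 'license:gpl3+)
    (_ #f)))

;; Keep the ContentDB-specific names as special cases and fall back to
;; the SPDX converter for everything else.
(define (string->license str spdx->license)
  (match str
    ("CC BY-SA 3.0" 'license:cc-by-sa3.0)
    ("CC BY-SA 4.0" 'license:cc-by-sa4.0)
    (_ (spdx->license str))))

(string->license "GPL-3.0-or-later" fake-spdx->license)
```

Passing the converter as an argument is only there to keep the sketch self-contained; in the importer it would simply be called in the default clause.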

WDYT?





