guix-patches

[bug#28235] [PATCH 2/3] gnu: Add python-html5-parser, python2-html5-parser


From: Marius Bakke
Subject: [bug#28235] [PATCH 2/3] gnu: Add python-html5-parser, python2-html5-parser
Date: Sat, 02 Sep 2017 13:30:11 +0200
User-agent: Notmuch/0.25 (https://notmuchmail.org) Emacs/25.2.1 (x86_64-unknown-linux-gnu)

Roel Janssen <address@hidden> writes:

> * gnu/packages/python.scm (python-html5-parser): New variable.
>   (python2-html5-parser): New variable.
> ---
>  gnu/packages/python.scm | 29 +++++++++++++++++++++++++++++
>  1 file changed, 29 insertions(+)
>
> diff --git a/gnu/packages/python.scm b/gnu/packages/python.scm
> index 9bf46fb6f..8629228db 100644
> --- a/gnu/packages/python.scm
> +++ b/gnu/packages/python.scm
> @@ -5868,6 +5868,35 @@ and written in Python.")
>  (define-public python2-html5lib-0.9
>    (package-with-python2 python-html5lib-0.9))
>  
> +(define-public python-html5-parser
> +  (package
> +    (name "python-html5-parser")
> +    (version "0.4.4")
> +    (source (origin
> +              (method url-fetch)
> +              (uri (pypi-uri "html5-parser" version))
> +              (sha256
> +               (base32
> +                "1d8sxhl41ffh7qlk7wlsy17xw6slzx5v1yna9s72wx5qrpaa3wxr"))))
> +    (build-system python-build-system)
> +    (native-inputs
> +     `(("pkg-config" ,pkg-config)))
> +    (inputs
> +     `(("libxml2" ,libxml2)))
> +    (propagated-inputs
> +     `(("python-lxml" ,python-lxml)
> +       ("python-beautifulsoup4" ,python-beautifulsoup4)))
> +    (home-page "https://html5-parser.readthedocs.io")
> +    (synopsis "Fast C-based HTML5 parsing for Python")
> +    (description "This package provides a fast implementation of the HTML5
> +parsing spec for Python.  Parsing is done in C using a variant of the gumbo
> +parser.  The gumbo parse tree is then transformed into an lxml tree, also in
> +C, yielding parse times that can be a thirtieth of the html5lib parse times.")
> +    (license license:asl2.0)))

The files 'src/as-libxml.[ch]' are GPL3.  Everything else in this series LGTM!
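
One possible way to reflect this in the package definition is to list both
licenses in the 'license' field and note which files each covers.  This is
only a sketch: it assumes the two files are under version 3 only
('license:gpl3'); if their headers say "or any later version",
'license:gpl3+' would be the matching identifier instead.  The fragment
replaces the last field of the quoted definition, so the closing parentheses
of the package form follow it.

```scheme
    ;; Most of the package is under the Apache License 2.0; the files
    ;; 'src/as-libxml.[ch]' are under GPL version 3.  (Assumption: version 3
    ;; only.  Use license:gpl3+ if the file headers allow later versions.)
    (license (list license:asl2.0 license:gpl3))
```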
