bug#32348: 27.0.50; EWW/SHR: Please add support for hiding DOM nodes with aria-hidden=true

From: Noam Postavsky
Subject: bug#32348: 27.0.50; EWW/SHR: Please add support for hiding DOM nodes with aria-hidden=true
Date: Wed, 15 Aug 2018 19:51:40 -0400
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/26.1 (gnu/linux)

"T.V Raman" <address@hidden> writes:

> Another place to leverage aria-hidden might be in EWW's "readable"
> command. Note that in general those nodes are shown on screen but are
> useless for the most part -- and usually inoperable under EWW. For
> example, articles from the BBC use these nodes to add
> buttons for "share via messenger" etc.

Unless the "readable" command is actually giving bad results for
real-world pages (is it?), I wouldn't start complicating the heuristic
with more checks.  It doesn't even check for display:none at the moment.

I've updated my previous patch with some more documentation, so that
someone who doesn't know anything about "aria-hidden" (e.g., me, before
this thread) could have some chance of figuring out what it's good for.
I'll push to master in a few days.
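
For anyone who wants to try it once the patch is applied, here is a
minimal sketch of enabling the option and checking its effect on a small
HTML fragment.  The helper name `my-shr-render-string' is made up for
illustration and is not part of the patch; the sketch also assumes an
Emacs built with libxml2 support.

```elisp
(require 'shr)

;; Enable the new option globally (assumes the patch below is applied).
(setq shr-discard-aria-hidden t)

;; Hypothetical helper, not part of the patch: parse HTML with libxml,
;; render it with shr in a temporary buffer, and return the text.
(defun my-shr-render-string (html)
  "Render the string HTML with shr and return the resulting text."
  (let ((dom (with-temp-buffer
               (insert html)
               (libxml-parse-html-region (point-min) (point-max)))))
    (with-temp-buffer
      (shr-insert-document dom)
      (buffer-string))))

;; Compare the rendering with the option off and on.  With it on, the
;; <span aria-hidden="true"> element should be skipped.
(list
 (let ((shr-discard-aria-hidden nil))
   (my-shr-render-string
    "<p>Article text <span aria-hidden=\"true\">Share</span></p>"))
 (let ((shr-discard-aria-hidden t))
   (my-shr-render-string
    "<p>Article text <span aria-hidden=\"true\">Share</span></p>")))
```

The option can also be set through `M-x customize-variable', since it is
a defcustom.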

From ead5619e844fced271866349fd34990641bf75c6 Mon Sep 17 00:00:00 2001
From: Noam Postavsky <address@hidden>
Date: Tue, 7 Aug 2018 20:40:56 -0400
Subject: [PATCH v2] Optionally skip rendering of tags with aria-hidden

* lisp/net/shr.el (shr-discard-aria-hidden): New option.
(shr-descend): Suppress aria-hidden=true tags if it's set.
* doc/misc/eww.texi (Advanced): Document shr-discard-aria-hidden.
* etc/NEWS: Announce it.

fixup! shr: Allow skipping tags with aria-hidden (Bug#32348)
 doc/misc/eww.texi | 10 ++++++++++
 etc/NEWS          |  3 +++
 lisp/net/shr.el   | 12 +++++++++++-
 3 files changed, 24 insertions(+), 1 deletion(-)

diff --git a/doc/misc/eww.texi b/doc/misc/eww.texi
index 43adc2eda0..bd5aef6a8a 100644
--- a/doc/misc/eww.texi
+++ b/doc/misc/eww.texi
@@ -262,6 +262,16 @@ Advanced
 variables @code{shr-color-visible-distance-min} and
 @code{shr-color-visible-luminance-min} to get a better contrast.
+@vindex shr-discard-aria-hidden
+@cindex aria-hidden
+  The HTML attribute @code{aria-hidden} is meant to tell screen
+readers to ignore a tag's contents.  You can customize the variable
+@code{shr-discard-aria-hidden} to tell @code{shr} to ignore such tags.
+This can be useful when using a screen reader on the output of
+@code{shr} (e.g., on EWW buffer text).  It can be useful even without a
+screen reader, since web authors often put this attribute on
+non-essential decorative content.
 @cindex Desktop Support
 @cindex Saving Sessions
   In addition to maintaining the history at run-time, EWW will also
diff --git a/etc/NEWS b/etc/NEWS
index 60951dfac0..d2e111fd60 100644
--- a/etc/NEWS
+++ b/etc/NEWS
@@ -396,6 +396,9 @@ and its value has been changed to Duck Duck Go.
 'shr-selected-link' face to give the user feedback that the command
 has been executed.
+*** New option 'shr-discard-aria-hidden'.
 ** Htmlfontify
 *** The functions 'hfy-color', 'hfy-color-vals' and
diff --git a/lisp/net/shr.el b/lisp/net/shr.el
index edea7cb297..0fbaf6f211 100644
--- a/lisp/net/shr.el
+++ b/lisp/net/shr.el
@@ -68,6 +68,14 @@ shr-use-fonts
   :group 'shr
   :type 'boolean)
+(defcustom shr-discard-aria-hidden nil
+  "If non-nil, don't render tags with `aria-hidden=\"true\"'.
+This attribute is meant to tell screen readers to ignore the
+tag's content."
+  :version "27.1"
+  :group 'shr
+  :type 'boolean)
 (defcustom shr-use-colors t
   "If non-nil, respect color specifications in the HTML."
   :version "26.1"
@@ -509,7 +517,9 @@ shr-descend
          (setq style nil)))
       ;; If we have a display:none, then just ignore this part of the DOM.
-      (unless (equal (cdr (assq 'display shr-stylesheet)) "none")
+      (unless (or (equal (cdr (assq 'display shr-stylesheet)) "none")
+                  (and shr-discard-aria-hidden
+                       (equal (dom-attr dom 'aria-hidden) "true")))
         ;; We don't use shr-indirect-call here, since shr-descend is
         ;; the central bit of shr.el, and should be as fast as
         ;; possible.  Having one more level of indirection with its

