Re: [Pan-users] HTML
From: Duncan
Subject: Re: [Pan-users] HTML
Date: Thu, 8 Jun 2017 02:25:10 +0000 (UTC)
User-agent: Pan/0.142 (He slipped to Sam a double gin; 8a67a1642)
Beartooth posted on Wed, 07 Jun 2017 17:57:16 +0000 as excerpted:
> I have my email set not to display html messages. Can I do that with
> Pan? How?
Short non-technical answer:
Not simply. But an accurate answer gets quite technical and complex, due
to the way NNTP works (meaning it's not just pan, all clients will have
the same general problems).
Slightly longer and more accurate answer:
It's /sort/ /of/ possible to do via scoring down to ignore level, but
it's complex and not as efficient as scoring on the most commonly viewable
headers such as author and subject, and even then, the scoring will have
either a lot of false-positives or a lot of false-negatives, depending on
which of two scoring choices you take.
The long (semi-)technical explanation:
NNTP has what's called an overview that contains basically the
information you see in the header pane, subject, author, and date
headers, plus a few others that contain threading information
(references) and allow proper identification of the message (message-id)
-- as I said, the information necessary to populate and thread the header
pane, and to request download of individual messages.
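
As a rough illustration of what the overview does and doesn't carry, here's a minimal Python sketch that splits one overview line into the default fields defined by RFC 3977 (the OVER command, formerly XOVER). The sample line, the addresses in it and the helper name are made up for the example. The point is simply that no Content-Type field is among the defaults.

```python
# Default field order of a tab-separated NNTP overview line
# (RFC 3977 OVER, formerly XOVER).
OVERVIEW_FIELDS = (
    "number", "subject", "from", "date",
    "message-id", "references", ":bytes", ":lines",
)


def parse_overview_line(line: str) -> dict:
    """Split one overview line into a dict keyed by field name."""
    values = line.rstrip("\r\n").split("\t")
    return dict(zip(OVERVIEW_FIELDS, values))


# Made-up sample line, for illustration only.
sample = "\t".join([
    "3000234",
    "Re: [Pan-users] HTML",
    "Poster <poster@example.invalid>",
    "Thu, 8 Jun 2017 02:25:10 +0000",
    "<abc@example.invalid>",
    "<parent@example.invalid>",
    "4818",
    "120",
])

overview = parse_overview_line(sample)
print(overview["subject"])
print("content-type" in overview)  # the MIME type is simply not there
```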
But the MIME headers containing information about the types and parts of
the message are not in the overview, so the message will need to be
downloaded before pan can see them. Even then, most such messages will
be multi-part, containing at least two parts, the text/html part you're
trying to score against, and the text/plain part that is intended for
display on clients that don't do HTML. Tho these days many clients
assume HTML and either don't include a text/plain alternative at all, or
include one but it's blank.
And to my knowledge (this part is pan-specific) pan can only score on the
overall message headers, not the headers of individual parts. So it
won't see the sub-headers of the text/html part and won't be able to
score on them.
Thus, only the global message type is available for scoring, and the
choice is between scoring on the worst offenders, the text/html-only
messages, and missing (false-negatives) all the ones that are actually
multipart, or scoring on multipart, and hitting on (false-positive)
multipart messages that don't contain HTML, but do contain other parts
such as binary attachments or message signing (attached pgp/gpg
signatures, etc).
So if you want to strictly accept single-part plain-text ONLY messages,
scoring down everything else, you can do that. But along with html,
you'll also hit messages with attachments, including those with
pgp-signature attachments as well as binaries.
Or you can strictly score on text/html ONLY posts, but then you'll miss
the multipart posts that have HTML as one of many parts.
And in both cases, on top of the hit/miss problem, you'll have the
inefficiency of only being able to score on already downloaded messages.
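
To make the hit/miss problem concrete, here's a small Python sketch using the standard email module. It is NOT how pan scores anything; the two sample messages and the function names are made up for illustration. It compares what a test on the top-level type alone can see with what a walk over every part finds.

```python
from email import message_from_string


def top_level_type(raw: str) -> str:
    """Content type of the message as a whole (all a header-only score sees)."""
    return message_from_string(raw).get_content_type()


def has_html_part(raw: str) -> bool:
    """True if any part, at any depth, is text/html (needs the full message)."""
    msg = message_from_string(raw)
    return any(part.get_content_type() == "text/html" for part in msg.walk())


# Made-up sample: plain text plus an HTML alternative.
HTML_ALTERNATIVE = (
    "Subject: hello\n"
    "MIME-Version: 1.0\n"
    'Content-Type: multipart/alternative; boundary="b1"\n'
    "\n"
    "--b1\n"
    "Content-Type: text/plain\n"
    "\n"
    "hi\n"
    "--b1\n"
    "Content-Type: text/html\n"
    "\n"
    "<p>hi</p>\n"
    "--b1--\n"
)

# Made-up sample: plain text plus a detached signature, no HTML anywhere.
SIGNED_PLAIN = (
    "Subject: signed\n"
    "MIME-Version: 1.0\n"
    'Content-Type: multipart/signed; boundary="b2"\n'
    "\n"
    "--b2\n"
    "Content-Type: text/plain\n"
    "\n"
    "hi\n"
    "--b2\n"
    "Content-Type: application/pgp-signature\n"
    "\n"
    "(signature data would go here)\n"
    "--b2--\n"
)

for name, raw in (("html alternative", HTML_ALTERNATIVE), ("signed plain", SIGNED_PLAIN)):
    print(name, top_level_type(raw), has_html_part(raw))
```

A rule matching only a top-level text/html misses the first message, while a rule matching any multipart type also catches the second, which contains no HTML at all.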
Explanation summary:
So as you see, the short but not entirely accurate answer is no, but
there are complex ways around that if you're willing to deal with
inefficient scores that apply only after the message is already
downloaded and available locally, AND deal with a high rate of either
false-negatives or false-positives as well, depending on which of the two
header scoring choices you make.
Bigger picture:
Of course in accordance with GNKSA pan doesn't display the HTML formatted
messages as HTML formatted, and never has, displaying the raw text HTML
code instead, tho there's a place in the applications tab of preferences
to put in an HTML previewer that I've never used, so I can't exactly say
how that works.
But html format is simply an HTML MIME-type specified part of a message
that, if it's abiding by the RFCs, should have a plain-text part with the
same text content, as well (tho some messages don't, and only have an
HTML part or have a plain-text part as well, but leave it blank). Pan
displays both parts, both as plain text, no HTML formatting.
So I'm assuming that by "not display", you mean to not display the
message at all, since pan already doesn't format as HTML, instead
displaying the raw HTML text code, along with the plain-text version if
it exists, as well.
One pan alternative:
FWIW, one alternative to pan is claws-mail (also gtk-based), which I use
for mail, and as a feeds (atom, etc) client as well, but it can do news
too. It has a somewhat different approach to HTML, displaying the
text/plain part only by default if it's available, displaying the text/
html part if there's no text/plain part, BUT...
** displaying text/html and text/xml as plain text by filtering out all
the tags and related junk, so only the plain text remains to be shown.**
This actually works surprisingly well in general, much better than I
expected going in, and well enough that I can use it on 100% XML-based
feeds with only very occasional issues.
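
I don't know how claws-mail implements this internally, but the general idea of dropping the tags and keeping the text can be sketched in a few lines of Python with the standard html.parser module. The class and function names are made up for the example, and skipping script/style content is my own assumption about what counts as "related junk".

```python
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Collect text content, dropping tags and script/style bodies."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep text only when we are not inside a skipped element.
        if not self._skip_depth:
            self.chunks.append(data)


def strip_tags(html: str) -> str:
    """Return the text of an HTML fragment with whitespace collapsed."""
    parser = TextOnly()
    parser.feed(html)
    parser.close()
    return " ".join("".join(parser.chunks).split())


print(strip_tags("<p>Hello <b>world</b> &amp; all</p>"))
```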
Critically for the subject at hand, it actually CAN and DOES allow
filtering (not scoring, but if you are content with a binary show/no-
show, it's fine) on not only headers, but on body only, or on entire
message content. I *seriously* wish pan had those two options, scoring
on message body, or entire message, as well as headers. It would make
things a lot simpler. If I could code... pan would almost certainly have
had this years ago. But unfortunately I'm not a coder (tho I'm
reasonable at bash scripting, and back in my MS days, did VB).
With that you can filter on the appropriate header and content regardless
of whether it shows up in the headers, sub-part headers, or main body,
altho there's a small chance at false-positives on discussions such as
this, made smaller if you filter on the full header AND its text/html
value instead of text/html only (which is why I don't combine the two in
a single string here, to avoid such filters).
And I actually do just that, using such a filter on my mail. But
unfortunately, pan doesn't have part-header, body, or full message,
scoring/filtering.
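
For what it's worth, the kind of full-message test I mean can be sketched in Python as below. This is an illustration of the idea, not claws-mail's filter syntax, and the function name is made up. It matches the full header name together with its HTML value at the start of any line, whether that line is a main header, a sub-part header, or (the remaining false-positive case) body text.

```python
import re

# Header name AND value together, anchored at the start of a line.
HTML_PART = re.compile(
    r"^content-type:[ \t]*text/html\b",
    re.IGNORECASE | re.MULTILINE,
)


def mentions_html_part(raw_message: str) -> bool:
    """True if any line of the raw message declares an HTML part."""
    return HTML_PART.search(raw_message) is not None


# A discussion that only names the MIME type in passing is not matched.
discussion = "Subject: HTML\n\nCan I hide text/html messages in pan?\n"
print(mentions_html_part(discussion))
```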
So why am I still using pan for news, instead of claws with its full-
message filtering? Three reasons, two technical, one social:
1) claws-mail is single-threaded and single-connection at a time. That's
fine, tho occasionally a bit frustrating, for my use of mail and feeds,
and would probably work fine for most text-only or occasional trivial
binary attachment news users as well, but simply won't cut it on big
binary download jobs.
Since I only do text and occasional trivial binary downloads 98+ percent of
the time, claws-mail could actually work for me in that regard, and if I
hadn't already worked with pan for years, I might actually choose claws-
mail for news as well. But I *do* very occasionally do a binary
download, where pan's *vastly* better, and I *am* already familiar with
pan and have it well configured for my general news needs, even if it's
mostly text, and the single-threading and single-connection thing /is/ a
frustration, if a minor one, so I stay with pan.
2) claws-mail's primary emphasis is on mail, and it has never
incorporated a yenc decoder.
For anyone that does any significant news binaries at all, this is an
absolute no-go, since due to its efficiency yenc is the binary encoding
form preferred by uploaders and has been for over a decade, and in the
news-binary world, it's the uploaders that make the choice, and the
downloaders that either work with it or simply don't get those files.
Of course as I said I so rarely do binaries now that this won't be a
major issue for my normal usage, and I'd be technically fine with claws.
For the rare occasions I do binary, I could keep pan around, or, since
it's rare enough, install it only for that. Alternatively, there's
separate utilities available that can ydecode from raw news message
files. I do binaries rarely enough that I could use these if I had to,
tho it'd still be a hassle.
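
For the curious, the core of what such a ydecode utility does is small. Here's a minimal Python sketch of the basic yEnc byte transform only, assuming a single-part post already split into lines. It skips the =ybegin/=ypart/=yend control lines without checking their size or CRC fields, which a real decoder would do, and the function name is made up.

```python
def ydecode_lines(lines):
    """Decode the data lines of a single-part yEnc post into bytes."""
    out = bytearray()
    for line in lines:
        line = line.rstrip(b"\r\n")
        # Control lines carry metadata, not data (not validated here).
        if line.startswith((b"=ybegin", b"=ypart", b"=yend")):
            continue
        i = 0
        while i < len(line):
            byte = line[i]
            if byte == 0x3D and i + 1 < len(line):  # '=' escapes the next byte
                i += 1
                byte = (line[i] - 64) % 256
            out.append((byte - 42) % 256)  # undo the +42 offset
            i += 1
    return bytes(out)


# "Pan" encoded by hand: each byte plus 42.
print(ydecode_lines([b"=ybegin line=128 size=3 name=x.txt\r\n",
                     bytes([0x50 + 42, 0x61 + 42, 0x6E + 42]) + b"\r\n",
                     b"=yend size=3\r\n"]))
```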
But in news terms handling yenc is a basic principle for me. A news
client that doesn't do yenc, unless it's /purely/ text-only, isn't
something I consider a "real" news client. And so it is here. Claws-
mail is primarily a mail client, and while it has enough news capacities
I could use it in a pinch (but would probably use the text-mode lynx web
browser, which I discovered basically by accident can do news as well,
instead), without yenc, I don't consider it a "real" news client, and I'd
definitely feel crippled attempting to use it as such, simply due to no
yenc, even if it was multi-threaded/multi-connection.
3) I've simply been with pan for so long, seeing it thru thick and thin,
thru dark days when it was abandoned by the coders and I was about the
only one here keeping the lights on, more or less on deathwatch by the
bedside, waiting for the day I could no longer get it to build with the
latest gcc and the maintainers and coders didn't find it worth the hassle
of patching any longer, and thru great days when features we'd been
waiting on for over a decade, like binary posting and native encrypted-
connections support, were added and I could actually use them...
After all that, saying goodbye to pan would be saying goodbye to an old
friend.
These days I mostly use pan on gmane, following this mailing list and
others, as newsgroups, and gmane itself had a close call recently, but
even if gmane were to die and I effectively wasn't using pan any longer,
except perhaps for once every two years to twice a year, when I happen to
get the binary bug... I'd certainly dust off that old pan mailing list
subscription and take it out of vacation mode, or resubscribe, and
continue to do this list/group via email.
Of course then, I'd gradually fall away from pan, as from an old friend I
don't see or talk to much any more, and perhaps there'd come a point when
I no longer knew pan well enough to properly answer questions about it on
the list, and I may at that point unsubscribe, but even then, it'd be an
old friend, that nothing could or would ever replace.
And of course if gmane died and I didn't pick up with binary newsgroups
again or something, if at some point I deleted pan... it obviously
wouldn't be to install some /other/ news client, but rather, because I no
longer needed a news client at all, however much of an old friend it
might be.
Truth be told, 20 or 30 or 40 years from now, assuming I survive that
long, I'll be 70 or 80 or 90 (being 50 now), and likely at least by 40
years out, if I'm still alive, in a nursing home. But should news still
be around, and especially should gmane still be around, and of course if
pan is still around and buildable by then as well, I'll probably still be
subscribed to this list as a newsgroup via gmane and pan, and still
answering questions from my wheelchair or bed, perhaps until that day I
log off the computer and the net the final time and pass into eternity.
Tho as long as those I've helped remain around, in some way I will as
well. Until they pass into eternity as well, but they too will remain
with those they've helped, and thru them, in some small part, so will I.
And so on thru the generations... =:^)
(Meanwhile, OT for this list, but I feel much the same way about gentoo,
tho it's not the deep friend that I think of pan as. If gentoo and I are
both still around 20 or 30 or 40 years from now, I'm not sure what the
computers will be like then, but I can well imagine I'll still be running
and updating gentoo on them... until I log off that final time and pass
into eternity... having built my last ebuild and helped the last person
I'll ever help... except by others passing it on, of course.)
--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman