pan-users

Re: [Pan-users] HTML


From: Duncan
Subject: Re: [Pan-users] HTML
Date: Thu, 8 Jun 2017 02:25:10 +0000 (UTC)
User-agent: Pan/0.142 (He slipped to Sam a double gin; 8a67a1642)

Beartooth posted on Wed, 07 Jun 2017 17:57:16 +0000 as excerpted:

> I have my email set not to display html messages. Can I do that with
> Pan? How?

Short non-technical answer:

Not simply.  But an accurate answer gets quite technical and complex, due 
to the way NNTP works (meaning it's not just pan, all clients will have 
the same general problems).


Slightly longer and more accurate answer:

It's /sort/ /of/ possible to do via scoring down to ignore level, but 
it's complex and not as efficient as scoring the most commonly viewable 
headers such as author and subject, and even then, the scoring will have 
either a lot of false-positives or a lot of false-negatives, depending on 
which of two scoring choices you take.


The long (semi-)technical explanation:

NNTP has what's called an overview that contains basically the 
information you see in the header pane, subject, author, and date 
headers, plus a few others that contain threading information 
(references) and allow proper identification of the message (message-id) 
-- as I said, the information necessary to populate and thread the header 
pane, and to request download of individual messages.
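As a rough illustration of that point, here is a minimal Python sketch of what a client gets from one overview line. The field order follows the standard overview format in RFC 3977; the sample line and the helper name `parse_overview_line` are made up for the example, and real servers may append extra fields such as Xref. Nothing here is pan's actual code.

```python
# Standard overview fields in order (RFC 3977); the article number comes first.
OVERVIEW_FIELDS = [
    "number", "subject", "from", "date",
    "message-id", "references", "bytes", "lines",
]

def parse_overview_line(line):
    """Split one tab-separated overview line into a field dict (hypothetical helper)."""
    values = line.rstrip("\r\n").split("\t")
    return dict(zip(OVERVIEW_FIELDS, values))

# Made-up sample data, for illustration only.
sample = "\t".join([
    "12345",
    "Example subject",
    "Poster <poster@example.invalid>",
    "Thu, 08 Jun 2017 02:25:10 +0000",
    "<example-id@example.invalid>",
    "",            # no references: this would be a thread root
    "2048",
    "40",
])

overview = parse_overview_line(sample)

# The MIME type is simply not among the overview fields.
print("content-type" in overview)
```

So anything that depends on the message's MIME type has to wait until the article itself has been fetched.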

But the MIME headers containing information about the types and parts of 
the message are not in the overview, so the message will need to be 
downloaded before pan can see them.  Even then, most such messages will 
be multi-part, containing at least two parts, the text/html part you're 
trying to score against, and the text/plain part that is intended for 
display on clients that don't do HTML.  Tho these days many clients 
assume HTML and either don't include a text/plain alternative at all, or 
include one but it's blank.

And to my knowledge (this part is pan-specific) pan can only score on the 
overall message headers, not the headers of individual parts.  So it 
won't see the sub-headers of the text/html part and won't be able to 
score on them.  

Thus, only the global message type is available for scoring, and the 
choice is between scoring on the worst case, text/html only, and missing 
(false-negatives) all the ones that are actually multipart, or scoring on 
multipart, and hitting (false-positive) multipart messages that don't 
contain HTML, but do contain other parts such as binary attachments or 
message signing (attached pgp/gpg signatures, etc).

So if you want to strictly allow only messages consisting of a single 
plain-text part, scoring down everything else, you can do that.  But 
along with html, you'll also hit messages with attachments, including 
those with pgp-signature attachments as well as binaries.

Or you can strictly score on text/html ONLY posts, but then you'll miss 
the multipart posts that have HTML as one of many parts.

And in both cases, on top of the hit/miss problem, you'll have the 
inefficiency of only being able to score on already downloaded messages.
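The hit/miss problem can be sketched with Python's standard email package. This is only an illustration of the two choices described above, not how pan scores; the three sample messages and the rule names `rule_html_only` and `rule_any_multipart` are invented for the example.

```python
from email.message import EmailMessage

def make_alternative():
    """HTML post that also carries a plain-text alternative."""
    msg = EmailMessage()
    msg.set_content("Hello in plain text.")
    msg.add_alternative("<p>Hello in HTML.</p>", subtype="html")
    return msg

def make_with_attachment():
    """Plain-text post with a small binary attached, no HTML anywhere."""
    msg = EmailMessage()
    msg.set_content("Plain text with a binary attached.")
    msg.add_attachment(b"\x00\x01\x02", maintype="application",
                       subtype="octet-stream", filename="data.bin")
    return msg

def make_html_only():
    """Post whose only body is HTML."""
    msg = EmailMessage()
    msg.set_content("<p>HTML only.</p>", subtype="html")
    return msg

def rule_html_only(msg):
    # Choice 1: match only when the top-level type itself is text/html.
    return msg.get_content_type() == "text/html"

def rule_any_multipart(msg):
    # Choice 2: match every multipart message, whatever its parts are.
    return msg.get_content_maintype() == "multipart"

def really_has_html(msg):
    # Ground truth, which needs a look at the individual parts.
    return any(part.get_content_type() == "text/html" for part in msg.walk())

for name, msg in [("alternative", make_alternative()),
                  ("attachment", make_with_attachment()),
                  ("html-only", make_html_only())]:
    print(name, msg.get_content_type(),
          rule_html_only(msg), rule_any_multipart(msg), really_has_html(msg))
```

The first rule misses the multipart/alternative post (a false negative), while the second also catches the attachment-only post (a false positive). Only the per-part check gets both right, and that is the check that isn't available when scoring on top-level headers alone.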


Explanation summary:

So as you see, the short but not entirely accurate answer is no, but 
there are complex ways around that if you're willing to deal with 
inefficient scores that apply only after the message is already 
downloaded and available locally, AND deal with a high false-negative or 
false-positive rate as well, depending on which of the two header scoring 
choices you make.


Bigger picture:

Of course in accordance with GNKSA pan doesn't display the HTML formatted 
messages as HTML formatted, and never has, displaying the raw text HTML 
code instead, tho there's a place in the applications tab of preferences 
to put in an HTML previewer that I've never used, so I can't exactly say 
how that works.

But html format is simply an HTML MIME-type specified part of a message 
that, if it's abiding by the RFCs, should have a plain-text part with the 
same text content, as well (tho some messages don't, and only have an 
HTML part or have a plain-text part as well, but leave it blank).  Pan 
displays both parts, both as plain text, no HTML formatting.

So I'm assuming that by "not display", you mean to not display the 
message at all, since pan already doesn't format as HTML, instead 
displaying the raw HTML text code, along with the plain-text version if 
it exists, as well.


One pan alternative:

FWIW, one alternative to pan is claws-mail (also gtk-based), which I use 
for mail, and as a feeds (atom, etc) client as well, but it can do news 
too.  It has a somewhat different approach to HTML, displaying the 
text/plain part only by default if it's available, displaying the text/
html part if there's no text/plain part, BUT...

** displaying text/html and text/xml as plain text by filtering out all 
the tags and related junk, so only the plain text remains to be shown.**

This actually works surprisingly well in general, much better than I 
expected going in, and well enough that I can use it on 100% XML-based 
feeds with only very occasional issues.
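For a feel of what "filtering out the tags" amounts to, here is a minimal sketch using Python's standard html.parser module. It is an assumption-laden illustration of the general technique, not claws-mail's actual implementation; the class name `TextOnly` and the function `strip_tags` are invented for the example.

```python
from html.parser import HTMLParser

class TextOnly(HTMLParser):
    """Collect text content, dropping tags plus script/style bodies."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)

def strip_tags(html):
    """Return the text of an HTML fragment with whitespace collapsed."""
    parser = TextOnly()
    parser.feed(html)
    parser.close()
    return " ".join("".join(parser.chunks).split())

print(strip_tags("<p>Hello <b>world</b> &amp; all</p><style>p{}</style>"))
```

A real client would presumably also turn block-level tags into line breaks and handle links, which this sketch ignores.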

Critically for the subject at hand, it actually CAN and DOES allow 
filtering (not scoring, but if you are content with a binary show/no-
show, it's fine) on not only headers, but on body only, or on entire 
message content.  I *seriously* wish pan had those two options, scoring 
on message body, or entire message, as well as headers.  It would make 
things a lot simpler.  If I could code... pan would almost certainly have 
had this years ago.  But unfortunately I'm not a coder (tho I'm 
reasonable at bash scripting, and back in my MS days, did VB).

With that you can filter on the appropriate header and content regardless 
of whether it shows up in the headers, sub-part headers, or main body, 
altho there's a small chance at false-positives on discussions such as 
this, made smaller if you filter on the full header AND its text/html 
value instead of text/html only (which is why I don't combine the two in 
a single string here, to avoid such filters).

And I actually do just that, using such a filter on my mail.  But 
unfortunately, pan doesn't have part-header, body, or full message, 
scoring/filtering.
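The difference between filtering on the value alone and on the header plus its value can be sketched in Python. This is a generic regex illustration, not claws-mail's filter syntax; the two sample messages are invented, and the header name and value are kept as separate strings and joined only at runtime, in the same spirit as the precaution mentioned above.

```python
import re

HEADER_NAME = "Content-Type"
HTML_VALUE = "text/html"

# Header name at the start of a line, then the value; works on the raw
# message, so it sees top-level headers and sub-part headers alike.
HEADER_AND_VALUE = re.compile(
    r"^" + re.escape(HEADER_NAME) + r":\s*" + re.escape(HTML_VALUE),
    re.IGNORECASE | re.MULTILINE,
)

def matches_value_only(raw):
    return HTML_VALUE in raw.lower()

def matches_header_and_value(raw):
    return HEADER_AND_VALUE.search(raw) is not None

# Made-up multipart post with an HTML sub-part.
html_post = (
    "Subject: hi\n"
    f'{HEADER_NAME}: multipart/alternative; boundary="b"\n'
    "\n"
    "--b\n"
    f"{HEADER_NAME}: text/plain\n\nhello\n"
    "--b\n"
    f"{HEADER_NAME}: {HTML_VALUE}\n\n<p>hello</p>\n"
    "--b--\n"
)

# Made-up plain-text post that merely talks about the MIME type.
discussion = (
    "Subject: Re: HTML\n"
    f"{HEADER_NAME}: text/plain\n"
    "\n"
    f"Can pan hide {HTML_VALUE} parts?\n"
)

for name, raw in [("html_post", html_post), ("discussion", discussion)]:
    print(name, matches_value_only(raw), matches_header_and_value(raw))
```

The value-only test flags the discussion as well, while the header-plus-value test flags only the post that really carries an HTML part. A message that quotes the complete header at the start of a line would still slip through as a false positive.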

So why am I still using pan for news, instead of claws with its full-
message filtering?  Three reasons, two technical, one social:

1) claws-mail is single-threaded and single-connection at a time.  That's 
fine, tho occasionally a bit frustrating, for my use of mail and feeds, 
and would probably work fine for most text-only or occasional trivial 
binary attachment news users as well, but simply won't cut it on big 
binary download jobs.

Since I only do text and occasional trivial binary downloads 98+ percent of 
the time, claws-mail could actually work for me in that regard, and if I 
hadn't already worked with pan for years, I might actually choose claws-
mail for news as well.  But I *do* very occasionally do a binary 
download, where pan's *vastly* better, and I *am* already familiar with 
pan and have it well configured for my general news needs, even if it's 
mostly text, and the single-threading and single-connection thing /is/ a 
frustration, if a minor one, so I stay with pan.

2) claws-mail's primary emphasis is on mail, and it has never 
incorporated a yenc decoder.

For anyone that does any significant news binaries at all, this is an 
absolute no-go, since due to its efficiency yenc is the binary encoding 
form preferred by uploaders and has been for over a decade, and in the 
news-binary world, it's the uploaders that make the choice, and the 
downloaders that either work with it or simply don't get those files.

Of course as I said I so rarely do binaries now that this won't be a 
major issue for my normal usage, and I'd be technically fine with claws.  
For the rare occasions I do binary, I could keep pan around, or, since 
it's rare enough, install it only for that.  Alternatively, there are 
separate utilities available that can ydecode from raw news message 
files.  I do binaries rarely enough that I could use these if I had to, 
tho it'd still be a hassle.

But in news terms handling yenc is a basic principle for me.  A news 
client that doesn't do yenc, unless it's /purely/ text-only, isn't 
something I consider a "real" news client.  And so it is here.  Claws-
mail is primarily a mail client, and while it has enough news capacities 
I could use it in a pinch (but would probably use the text-mode lynx web 
browser, which I discovered basically by accident can do news as well, 
instead), without yenc, I don't consider it a "real" news client, and I'd 
definitely feel crippled attempting to use it as such, simply due to no 
yenc, even if it was multi-threaded/multi-connection.

3) I've simply been with pan for so long, seeing it thru thick and thin, 
thru dark days when it was abandoned by the coders and I was about the 
only one here keeping the lights on, more or less on deathwatch by the 
bedside, waiting for the day I could no longer get it to build with the 
latest gcc and the maintainers and coders didn't find it worth the hassle 
of patching any longer, and thru great days when features we'd been 
waiting on for over a decade, like binary posting and native encrypted-
connections support, were added and I could actually use them...

After all that, saying goodbye to pan would be saying goodbye to an old 
friend.

These days I mostly use pan on gmane, following this mailing list and 
others, as newsgroups, and gmane itself had a close call recently, but 
even if gmane were to die and I effectively wasn't using pan any longer, 
except perhaps for once every two years to twice a year, when I happen to 
get the binary bug... I'd certainly dust off that old pan mailing list 
subscription and take it out of vacation mode, or resubscribe, and 
continue to do this list/group via email.

Of course then, I'd gradually fall away from pan, as from an old friend I 
don't see or talk to much any more, and perhaps there'd come a point when 
I no longer knew pan well enough to properly answer questions about it on 
the list, and I may at that point unsubscribe, but even then, it'd be an 
old friend, that nothing could or would ever replace.

And of course if gmane died and I didn't pick up with binary newsgroups 
again or something, if at some point I deleted pan... it obviously 
wouldn't be to install some /other/ news client, but rather, because I no 
longer needed a news client at all, however much of an old friend it 
might be.

Truth be told, 20 or 30 or 40 years from now, assuming I survive that 
long, I'll be 70 or 80 or 90 (being 50 now), and likely at least by 40 
years out, if I'm still alive, in a nursing home.  But should news still 
be around, and especially should gmane still be around, and of course if 
pan is still around and buildable by then as well, I'll probably still be 
subscribed to this list as a newsgroup via gmane and pan, and still 
answering questions from my wheelchair or bed, perhaps until that day I 
log off the computer and the net the final time and pass into eternity. 
Tho as long as those I've helped remain around, in some way I will as 
well.  Until they pass into eternity as well, but they too will remain 
with those they've helped, and thru them, in some small part, so will I.  
And so on thru the generations...  =:^)

(Meanwhile, OT for this list, but I feel much the same way about gentoo, 
tho it's not the deep friend that I think of pan as.  If gentoo and I are 
both still around 20 or 30 or 40 years from now, I'm not sure what the 
computers will be like then, but I can well imagine I'll still be running 
and updating gentoo on them... until I log off that final time and pass 
into eternity... having built my last ebuild and helped the last person 
I'll ever help... except by others passing it on, of course.)

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman



