Re: [Pan-users] error: plonked posters posts' showing up as new
From: Duncan
Subject: Re: [Pan-users] error: plonked posters posts' showing up as new
Date: Mon, 21 Mar 2022 06:31:13 -0000 (UTC)
User-agent: Pan/0.150 (Moucherotte; 7b0b3fc12)
David Chmelik posted on Sun, 20 Mar 2022 02:29:35 -0000 (UTC) as
excerpted:
> Usenet may again be a little better than it was in mid-to-late 1990s in
> terms of spam--some newsgroups have no spam--but unfortunately once
> again others still are almost all spam.
>
> So, I have a large killfile again and will be plonking more advertisers/
> pr0n & drug & weapons dealers, trolls, proselytizer religious fanatics,
> etc..
>
> However I noticed what often happens is I get a large number updates. I
> go to those groups then see sometimes all posts are by plonked posters
> with spam subject lines... just for a split-second, then disappear.
>
> Since I'm now subscribed to over 1000 newsgroups (if you add Usenet and
> Gmane) seeing all those false updates wastes considerable time.
>
> Shouldn't the logic be fixed to omit those before they ever even get
> counted as updates, so you don't waste a lot of time still seeing dozens
> spam updates?
There are several factors to consider here, some of which are inherent in
the news protocol and thus not something pan can do anything about.
First of all, there's a quick and very bandwidth efficient counts update
mode (which I'm not actually sure pan uses at all) whereby group message
counts can be updated quickly, with little bandwidth usage and very little
additional information (no headers, etc). This simply asks the server for
the first and last message sequence numbers it currently has in whatever
group(s) and compares them to the message sequence numbers the client
already knows about, so it can update the count of unread messages
accordingly.
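To make the arithmetic concrete, here's a minimal sketch of that kind of counts update, in Python rather than anything pan itself uses. It assumes the server answers the NNTP GROUP command with the usual "211 count low high group" status line; the function names and the sample numbers are made up for illustration, and I'm not claiming this is how pan does it.

```python
def parse_group_response(line):
    """Split a '211 count low high group' status line into its parts."""
    parts = line.split()
    if len(parts) < 5 or parts[0] != "211":
        raise ValueError("not a successful GROUP response: %r" % line)
    count, low, high = int(parts[1]), int(parts[2]), int(parts[3])
    return count, low, high, parts[4]


def max_new_messages(server_high, client_high):
    """Upper bound on unseen messages, from sequence numbers alone.

    No headers have been fetched at this point, so nothing can be
    scored or killfiled yet; the real number may well be lower.
    """
    return max(0, server_high - client_high)


# Hypothetical server reply and hypothetical client-side high-water mark.
count, low, high, group = parse_group_response(
    "211 1234 3000234 3002322 misc.test")
print(group, max_new_messages(high, client_high=3002300))
```

Note that nothing in that exchange identifies who posted what, which is why a client can't apply a killfile at this stage.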
However, the result is always the *maximum* number of potential messages
available, not necessarily the number *actually* available. In
particular, some servers assign message numbers before they do their
filtering if any, and some messages may simply be gone from the server due
to server-policy-specific spam filtering, copyright or COPA takedown
orders, message cancels, no-carry policies like binaries posted to
anything outside of the alt.binaries.* hierarchy (which can affect binaries
groups too if the post was cross-posted to non-binaries groups), etc.
These will appear in the initial counts but not actually be available.
Second, there's overviews mode, aka downloading "headers". But, this does
*NOT* download true headers. Rather, it downloads an abridged version
containing only the most common headers typically used for display of the
message list. This typically includes From, Subject, Size/Lines,
References (necessary for threading), and Message-IDs, and server admins
can configure it to include others if they wish, but it does *NOT*
normally include less common/useful headers such as organization, custom
headers, etc.
This affects scoring/watching/killfiling in that headers available in the
overview can be scored against with just the information in the overview,
that is, without downloading the actual message, while those not in the
overview require actually downloading the message to apply that bit of the
score.
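As a rough sketch of that distinction (Python, purely illustrative, not pan's code): given the headers a set of score rules test, split them into those answerable from the overview and those that force a full download. The field set below is an assumption based on the usual default overview fields; a given server may be configured to offer more, and the names split_rules and OVERVIEW_FIELDS are my own.

```python
# Assumed default overview fields; server admins can add others.
OVERVIEW_FIELDS = {
    "subject", "from", "date", "message-id", "references",
    "bytes", "lines",
}


def split_rules(rule_headers, overview_fields=OVERVIEW_FIELDS):
    """Return (cheap, costly) header lists, preserving input order.

    cheap:  can be scored from the overview alone.
    costly: need the full article downloaded before they can be scored.
    """
    cheap = [h for h in rule_headers if h.lower() in overview_fields]
    costly = [h for h in rule_headers if h.lower() not in overview_fields]
    return cheap, costly


cheap, costly = split_rules(["From", "Organization", "Subject", "X-Trace"])
print("overview only:", cheap)
print("needs download:", costly)
```

Anything that lands in the second list can't be killed before the message itself has been fetched.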
Of course it's far better to be able to score without downloading, thereby
making it possible for killfiles to avoid downloading the message
entirely, but for nym-switching posters in particular that's not always
possible, yet there's often still something scoreable in the full headers
(or body content) and being able to auto-ignore those posts even if they
have to be downloaded to do it can still be quite useful.
So depending on what headers exactly you're scoring on, or even depending
on how the server does its numbering and filtering, you may see quite a
number of messages that pan can't preemptively do anything about, until it
gets more information, either downloading "headers" (actually overviews),
or for headers not in the overview, even downloading the entire message.
Meanwhile, particularly if your scorefile is large and not efficiently
structured, processing it will take some time too. Here's a short example
from my pr0n scorefile (very dated now because, as I've posted before, I've
not been active in the binaries for years, possibly over a decade now):
[alt.*]
Score:: =-9999 %Alt kill
From: Seeking teens
From: teens seeker
From: sex coed
From: NudeGirls
Subject: R/-\\PE
Subject: R/-\|PE
That's going to be **FAR** more efficient than individual score entries
for each of those. And note that they're headers that should be in the
overview as well.
If your scorefile instead looks the way it will if you've only added
entries from the pan GUI and never text-edited them into something more
efficient like the above, and if you're doing over 1000 groups as you
mentioned, you could *easily* have tens of thousands of individual
single-entry scores that could be combined into a far more efficient
100-200 or so compound entries like the above. I've never let mine get
overgrown and
really haven't done anything lately with it at all, so I can't do any
before/after comparisons, but I'm guessing it could make the difference
between seeing some of the killfiled posts momentarily while pan processes
the inefficient mess, and having them all processed before it displays
anything (especially on a fast machine with plenty of RAM and NVDIMM
storage, something my now decade-old machine is lacking, tho I did do the
SSD upgrade from spun-glass on the SATA3s).
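For anyone wanting to script that consolidation rather than hand-edit, here's a minimal Python sketch. It assumes you've already parsed your single-entry scores into (section, score, header, pattern) tuples; that parsing step, the merge_rules name, and the sample patterns are all hypothetical. It groups entries sharing a section and score into one "Score::" block laid out like my example, and drops exact duplicates. Treat the output as a draft to review by hand, and back up the original scorefile first.

```python
def merge_rules(rules):
    """Combine single-entry rules into compound entries.

    rules: iterable of (section, score, header, pattern) tuples.
    Returns scorefile-style text with one 'Score::' block per
    (section, score) pair, in first-seen order.
    """
    grouped = {}
    for section, score, header, pattern in rules:
        conditions = grouped.setdefault((section, score), [])
        if (header, pattern) not in conditions:  # skip exact duplicates
            conditions.append((header, pattern))

    lines = []
    last_section = None
    for (section, score), conditions in grouped.items():
        if section != last_section:
            lines.append("[%s]" % section)
            last_section = section
        lines.append("Score:: %s" % score)
        lines.extend("%s: %s" % (h, p) for h, p in conditions)
    return "\n".join(lines)


# Hypothetical single-entry rules, as a GUI might have accumulated them.
print(merge_rules([
    ("alt.*", "=-9999", "From", "spammer one"),
    ("alt.*", "=-9999", "From", "spammer two"),
    ("alt.*", "=-9999", "From", "spammer one"),
    ("comp.*", "-100", "Subject", "buy now"),
]))
```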
--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman