[Pan-users] Re: Filtering out the flood at sci.crypt
From: Frank Tabor
Subject: [Pan-users] Re: Filtering out the flood at sci.crypt
Date: Mon, 25 Jun 2007 20:09:37 +0000 (UTC)
User-agent: Pan/0.131 (Ghosts: First Variation)
On Mon, 25 Jun 2007 16:26:41 +0000, Duncan wrote:
> JCA <address@hidden> posted
> a10b0c8a0706250650x7e27a0scd042a801b659f6d-JsoAwUIsXosN address@hidden,
> excerpted below, on Mon, 25 Jun 2007 06:50:26 -0700:
>
>> For the last few weeks some idiot has taken to flooding sci.crypt
>> (and possibly other groups) with junk. The postings are spoofed to
>> appear as coming from regulars in the group, and the contents of the
>> postings are just random drivel.
>>
>> Anybody know a rule, or set of rules, to filter them out? It would
>> appear that the bogus postings all come from a specific news provider -
>> things like
>>
>> news.highwinds-media.com!hw-filter.lga!newsfe04.lga.POSTED!53ab2750
>>
>> but I don't know how to filter this out.
>
> <mode=rant>
>
> This is one reason I've pushed for a long time to have scoring/filtering
> (since before pan had scoring, when it was all binary decision
> filtering, that's how long) that could match anywhere in the post, in
> the body, or in headers not in the overviews. The problem is, the stuff
> in the overviews can generally be entirely controlled by the poster, so
> if they want to be deliberately disruptive and therefore deliberately
> and continuously modify this info, in order to evade scoring systems
> like pan's, unfortunately, there's not a lot that the poor users of such
> clients can do.
>
> The problem is, in order to score/filter on things not in the
> overviews, the post must be downloaded first. For better or for worse,
> Charles' position has always seemed to emphasize scoring in order to
> choose /what/ to download (and/or what to delete without downloading),
> simply trusting that the overview data used to make such decisions isn't
> going to be deliberately obfuscated in order to prevent such
> scoring/filters from working.
>
> My position, OTOH, is that while it's a bonus if a useful score can be
> used to ignore (ultimately, to kill/delete) or watch (ultimately, to
> auto-download or at least mark for download) before downloading, just
> because the post must be downloaded first doesn't mean the war is
> already lost. It still takes time to view the message, and if automated
> tools (scoring/filtering) can be used to either prioritize the viewing
> (in the case of watch or positive scores), or to allow mark-read or
> deletion without actual viewing (in the case of ignore or negative
> scores), well, the war is still won, tho admittedly not as easily.
>
> Unfortunately, while I'd have much rather had effective filtering based
> on /anything/ in the message, than scoring still restricted to overview
> data only, and while I've been a very active volunteer here on the pan
> lists/groups, your problem and mine don't appear to hit enough
> people to be very high on the priority list.
>
> Back years ago, when I originally filed the request, Charles stated that
> yes, he agreed that sort of thing would be useful. However, it was for
> him pretty much in the "nice to have at some point" category, and thus
> was "blueskied" (aka "backburnered") into never-never-land.
>
> BTW, even the official slrn scorefile documentation (slrn's scorefile
> format is what pan uses) says non-overview headers can be matched, tho
> it goes to pains to point out that it's less efficient since the posts
> must be downloaded before those scores will match.
>
> Of course, Charles has always been quite open to patches, and I've
> little doubt if someone with the skills had submitted a patch to
> implement this functionality, we'd not be talking about it now as it'd
> work as well as overview scoring does. Unfortunately, that's not a set
> of skills I have, and no one else has seemed to have the itch to
> scratch, so the functionality remains "bluesky", nice to have "someday".
>
> OTOH, the very fact that I'm still here means regardless of whether this
> particular feature I'd sure like has been instituted or not, pan
> continues to work better for me than the alternatives, so I guess I
> can't complain too strenuously.
>
> </mode=rant>
>
> Meanwhile, despite the fact that we're left fighting with the equivalent
> of our hands tied behind our backs, there's still a slight chance you
> can find something useful to match. I assume you've already found
> nothing useful to match in the subject or author headers, and date,
> group, line-count, xref, etc, are too generic to be useful.
>
> That leaves one remaining possibility, the message-ID. If you are lucky
> and this guy isn't an expert at this yet, the message-ID header, which
> *IS* part of the overview headers, will contain something identifying
> that can be scored on, hopefully without matching a bunch of other posts
> in the process.
>
> Message-ID is (or is supposed to be) unique for each post, so you'll
> have to use contains or regex type matching. You'll also
> have to hand-edit the score in your scorefile, altho you can get it most
> of the way there using pan's GUI. Of course, you first have to see if
> there's part of the message-ID that's uniquely his, but matches all his
> messages. Turn view headers on and check that header in several of his
> messages. You will likely want to compare those of other regulars as
> well, just to be sure you won't over-match. If you find something
> useful to match, select one of his messages and add a score on it, based
> on the References header, which pan will auto-fill-out with the
> message-ID. You'll need to edit out the part that changes, of course.
> Once you have it set up, add the score (without rescore), but keep open
> the view scores dialog. Then load the scorefile in your favorite text
> editor and find the score (should be at the end). Edit the References
> line, changing it to Message-ID. Save the file, and back in pan, NOW
> hit the close and rescore in the view article's score window. If you
> got it right, that should do it, and won't match anyone else's real
> posts.
>
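
A rough way to hunt for that identifying part, sketched in Python: collect a
few Message-IDs from the flood and a few from known-good regulars, then list
the fragments that appear in every flood ID and in none of the regulars'.
The function names and the sample IDs in the note below are made up for
illustration. This is only a helper for finding candidates by hand, not
anything pan itself provides.

```python
def common_substrings(ids, min_len=4):
    """Return every substring of length >= min_len found in all of ids."""
    if not ids:
        return set()
    first = ids[0]
    found = set()
    # Enumerate substrings of the first ID and keep those present in the rest.
    for i in range(len(first)):
        for j in range(i + min_len, len(first) + 1):
            piece = first[i:j]
            if all(piece in other for other in ids[1:]):
                found.add(piece)
    return found


def distinguishing_fragments(flood_ids, regular_ids, min_len=4):
    """Fragments shared by all flood IDs that appear in no regular's ID.

    Longest fragments come first, since they are the least likely to
    over-match other posters.
    """
    candidates = common_substrings(flood_ids, min_len)
    safe = [c for c in candidates if not any(c in r for r in regular_ids)]
    return sorted(safe, key=len, reverse=True)
```

With the hypothetical flood IDs `<abc123@flood.example>` and
`<zz9@flood.example>` and a regular's ID `<x1@news.example>`, the longest
fragment returned is `@flood.example>`. Whatever fragment you pick still has
to be pasted into the Message-ID line of the score by hand, as described
above.
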
> As I said tho, the good attackers won't overlook message-ID and will
> already set it themselves so their provider won't, and you'll have no
> reliable way to score their posts. The best attackers won't just fake
> the message-ID, they'll make it look like the one the regular author
> they are faking uses, so matching it will unfortunately match the
> regular author's posts as well.
>
> BTW, that highwinds-media entry looks familiar. My ISP (Cox) outsources
> from them, so all Cox users get that stamp. If it's a Cox user,
> however, not some other non-Cox user of the same server, a number of
> other headers will show up as well, including an unencrypted
> NNTP-Posting-Host, an X-Complaints-To header listing
> address@hidden, and an X-Trace header listing the
> same user IP as the NNTP-Posting-Host and the same server as the posted
> entry. If it doesn't have those elements, it's probably not a Cox user,
> anyway. Unfortunately, none of those headers normally appear in the
> overviews, so pan can't properly score against them. =8^(
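
If you do end up downloading a suspect post, the header check described
above can be sketched like this in Python with the standard email parser.
The function name is hypothetical, and the logic only encodes the
description above (all three headers present, and the NNTP-Posting-Host
value also showing up in X-Trace); real headers may be laid out differently.

```python
from email import message_from_string


def has_isp_header_set(raw_article):
    """Check a downloaded article for the three headers described above.

    Returns True only if NNTP-Posting-Host, X-Complaints-To and X-Trace
    are all present and the posting host also appears inside X-Trace.
    """
    msg = message_from_string(raw_article)
    host = msg.get("NNTP-Posting-Host")
    complaints = msg.get("X-Complaints-To")
    trace = msg.get("X-Trace")
    if not (host and complaints and trace):
        return False
    # Assumption: the host appears verbatim somewhere in the X-Trace value.
    return host.strip() in trace
```

This only works on the full article, which is exactly the limitation under
discussion: none of it is available from the overview data.
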
Unfortunately, I lack the skills to do any patching or programming either,
but I am with you that I'd like to have the ability to score on more than
the overview headers. That's one of the drawbacks to Agent and
Thunderbird as well.
--
Frank Tabor
Just to have it is enough.