[Pan-users] Re: Global scoring with exclusions
From: Duncan
Subject: [Pan-users] Re: Global scoring with exclusions
Date: Thu, 14 Jun 2007 18:23:36 +0000 (UTC)
User-agent: Pan/0.131 (Ghosts: First Variation)
Dave <address@hidden> posted
address@hidden, excerpted below, on Thu, 14 Jun 2007
17:11:26 +0100:
> I have a global score set up to mark trollish cross-posts, ie to three
> or more newsgroups. But I need to exclude some groups from this global
> score.
>
> I've tried the following:
>
> %Score crossposts to 3+ NGs, exclude BY and VM groups
> [~blueyonder\.announce*,~virginmedia\.announce*,*.*]
> Score:: =-9999 %CrossPosts
> Newsgroups: (.*:){3} %Crossposted3PlusGroups Xref: (.*:){3}
> %Crossposted3PlusGroups
>
> %Score crossposts to 3+ NGs, exclude BY and VM groups
> [*.*,~blueyonder\.announce*,~virginmedia\.announce*]
> Score:: =-9999 %CrossPosts
> Newsgroups: (.*:){3} %Crossposted3PlusGroups
> Xref: (.*:){3} %Crossposted3PlusGroups
>
>
> %Score crossposts to 3+ NGs, exclude BY and VM groups
> [~blueyonder.announce*,~virginmedia.announce*,*.*]
> Score:: =-9999 %CrossPosts
> Newsgroups: (.*:){3} %Crossposted3PlusGroups
> Xref: (.*:){3} %Crossposted3PlusGroups
>
> %Score crossposts to 3+ NGs, exclude BY and VM groups
> [~blueyonder\.announce*,~virginmedia\.announce*,*.*]
> Score:: =-9999 %CrossPosts
> Newsgroups: (.*:){3} %Crossposted3PlusGroups
> Xref: (.*:){3} %Crossposted3PlusGroups
You don't mention what pan version. Those section/newsgroup lines (all
but the third example) aren't going to do what you want in new-pan
(>0.90), period. That's because it's stricter slrn style than old-pan
was, and the newsgroups lines are NOT regex based, only * wildcard.
Therefore, those \. entries will match LITERAL backslash chars in the
actual group name.
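To see why the backslash matters, here is a minimal sketch in Python. It assumes pan's star-only group matching behaves roughly like shell-style globbing, which Python's fnmatch.fnmatchcase approximates; it is not pan's actual matcher, and the group name is made up for illustration.

```python
from fnmatch import fnmatchcase

# Hypothetical group name, for illustration only.
group = "blueyonder.announce.service"

# Regex-style pattern: in a glob the backslash is an ordinary character,
# so this pattern demands a literal "\" in the group name.
escaped = fnmatchcase(group, r"blueyonder\.announce*")

# Plain wildcard pattern: the dot is literal and * matches the rest.
plain = fnmatchcase(group, "blueyonder.announce*")

print(escaped, plain)
```

Under that assumption, only the pattern without the backslash matches the group.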
Here's the slrn scoring doc: http://www.slrn.org/docs/score.txt
Now, it /does/ mention ~ in the section/newsgroup lines negating, so you
got that part right. However, the way it is worded implies you /cannot/
mix negative and positive group matches. The ~ must be the first char
following the opening bracket and negates the entire section match. You
therefore choose positive or negative matching, not both. If you choose
negative, it's all /but/, so the *.* is already the positive match. You
don't specifically list it. You only list what /not/ to match. Further,
~ as the first char negates the entire entry, so it appears only there.
Further appearances would again match literal ~ chars in the group names.
Therefore, try this:
[~blueyonder.announce*,virginmedia.announce*]
Now for your scoring match, the actual Newsgroups: header.
Newsgroups: (.*:){3} %Crossposted3PlusGroups
OK, that's a regex, so the form is right, but the match isn't. Why?
Because the newsgroups header doesn't use colon separators, it uses
commas. So try this instead:
Newsgroups: (.*,){3} %Crossposted3PlusGroups
The next question is exactly how many groups did you want the xpost to
match? The comma only appears /between/ groups, not at the end, so three
commas means four groups, and a three-group xpost has only two. Also,
because the regex is unanchored and .* can itself span commas, a
multiplier of three really means /at least/ three commas, so as-is it
would match four or more groups and miss the three-group case.
You probably want this (two commas plus, no upper limit, so 3+ groups):
Newsgroups: (.*,){2,} %Crossposted3PlusGroups
Which of course could also be written:
Newsgroups: ,.*, %Crossposted3PlusGroups
Since the regex isn't anchored at either the left (^) or right ($), it
matches anywhere in the line. So, it should match as long as there are
at least two commas, with any number of other characters between them
(even zero; you'd use a plus instead of the asterisk if you wanted to
require one or more).
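The comma counting is easy to check outside pan. The sketch below uses Python's re module, on the assumption that it treats these simple patterns the same way pan's regex engine does; the group names are invented, and header() is just a helper for this example.

```python
import re

def header(n):
    """Build a made-up Newsgroups value listing n groups."""
    return ",".join(f"alt.test{i}" for i in range(n))

patterns = {
    "colons x3": r"(.*:){3}",
    "commas x3": r"(.*,){3}",
    "commas 2+": r"(.*,){2,}",
    "two commas": r",.*,",
}

# re.search is unanchored, like the score file match described above.
results = {
    name: [n for n in range(1, 7) if re.search(rx, header(n))]
    for name, rx in patterns.items()
}

for name, counts in results.items():
    print(f"{name:10} matches group counts {counts}")
```

In this sketch the colon pattern never matches, the two "two commas or more" forms match from three groups upward, and the unanchored {3} form matches from four groups upward, including longer lists.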
However, note that the doc specifies precisely two types of comments,
those on their own line begun with a percent char, and comments after
scores, also begun with a percent char. The doc does **NOT** state that
ANY line may end in a comment (or put it this way, if it does, I missed
it). Therefore, depending on how strictly the documentation is followed,
the lines above MAY be parsed as matching a LITERAL
"%Crossposted3PlusGroups". I don't believe that's what you want either,
so that leaves us with one of the two entries below (put a comment on its
own line above or below the matching line, if necessary):
Newsgroups: (.*,){2,}
Newsgroups: ,.*,
Of course, you'll need to adjust the Xref match line as well. Without
checking, however, I can't say for sure whether it uses colons or commas
as separators. You may have the colons part right, there, but would
still need to dump the comment and adjust the frequency.
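For what it's worth, the Netnews article format (RFC 5536) describes Xref as a server name followed by space-separated newsgroup:article-number pairs, which would mean one colon per group. Whether pan matches the pattern against exactly that text is an assumption here, and the header values below are made up. A minimal Python sketch of the counting:

```python
import re

# Made-up Xref values: server name, then group:article-number pairs.
xref_two = "news.example.com alt.test:101 misc.test:202"
xref_three = "news.example.com alt.test:101 misc.test:202 comp.test:303"

# The pattern from the original score entry, minus the trailing comment.
pattern = r"(.*:){3}"

two = bool(re.search(pattern, xref_two))
three = bool(re.search(pattern, xref_three))
print(two, three)
```

If pan really does see the header in that form, three colons would already correspond to three groups, so the count on the Xref line may not need changing even though the comma count on the Newsgroups line does.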
See if you get any better luck with the above changes. If not, maybe pan
isn't quite following the documented format after all. =8^\
--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman