Re: [Pan-users] how to use additional X-headers for scoring?


From: Duncan
Subject: Re: [Pan-users] how to use additional X-headers for scoring?
Date: Tue, 4 Oct 2011 12:34:17 +0000 (UTC)
User-agent: Pan/0.135 (Tomorrow I'll Wake Up and Scald Myself with Tea; GIT 8e43cc5 branch-master)

FritzS - gmx posted on Mon, 03 Oct 2011 17:24:25 +0200 as excerpted:

> I am new here on pan users.
> I run pan 0.135 under Mac OS X Lion and MacPorts (GIT 30dc37b master;
> x86_64-apple-darwin11.1.0)

You will certainly want to compare notes with SciFi, a regular user on 
this list, then.  He is, I believe, the biggest OSX expert here.  You may 
wish to check the list archive for his posts and see if there's anything 
useful you can get from them.

AFAIK, he also has (or had, perhaps the fixes are all merged, now) a 
github pan repo, with I believe a number of OSX-specific fixes for 
various things.  I don't know if he follows HM/judgefudge's github repo 
or just khaley's (which is more conservative and the direct upline to 
what ultimately becomes the public releases after going thru the official 
gnome repo via Petr Kovar).  You could also possibly cherry-pick the OSX-
specific commits from SciFi's repo and apply them to HM/judgefudge's repo, 
if you want both leading edge and OSX-specific fixes.  SciFi should be 
able to give you rather more detail both about pan on OSX and about his 
specific fixes, if any, so it's definitely worth trying to contact him 
directly, if he doesn't respond here within a few days.

> how I could use additional X-headers such as X-Authenticated-User:
> or NNTP-Posting-Host: for scoring?
> 
> In Score this don't work:
> %BOS
> %Score created by Pan on Wed Sep 28 06:26:18 2011
> [~xxxxxx]
> Score:: -600
> X-Authenticated-User: ^$$m$x1dqu63w4useridf$
> %EOS

You don't mention where you got your pan scorefile formatting info from 
or what details you may or may not know about it.  So here's the format 
documentation links I keep around, here, for referring to myself, and for 
posting when people ask about it:

The scorefile follows SLRN's format in general, tho AFAIK pan doesn't 
understand some of the advanced stuff like the include statement, and 
pan's scores are case-insensitive with keyword: , case-sensitive with 
keyword= .

http://www.slrn.org/docs/score.txt

Here's a second document, the xnews scorefile doc.  xnews uses a similar 
style, but it takes group regexes, while slrn and pan use shell-style *-
wildcards only, for groups, NOT regexes.

http://xnews.remarqs.net/scoring.txt

Originally, pan only accepted the much more limited set of header 
keywords listed in the xnews doc, basically, the ones in the overviews.

But... reasonably recent pan (from 0.134 or 0.135, since khaley took over 
as lead code monkey) *SHOULD* score on ALL headers, *IF* set up correctly, 
according to khaley.  But, there are two caveats.  First, as you obviously 
know by now, you must edit the scorefile directly to set up non-overview 
header scores, as pan's GUI doesn't handle them.

Second, and this *MIGHT* be what's getting you:

** PAN CAN ONLY SCORE ON OVERVIEW HEADERS PRE-DOWNLOAD, OTHER SCORES 
WON'T APPLY UNTIL AFTER DOWNLOAD **

This is because overviews only contain a relatively limited set of 
headers, subject, from, date, lines/size, message-id and references, etc.  
Pan doesn't see the other headers until after it has actually downloaded 
the articles, and it obviously can't score on data it hasn't yet seen.

X-Authenticated-User isn't going to be in the overviews, for most 
servers, so pan can only score on it post-download.
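
To make the pre-download/post-download split concrete, here's a minimal 
Python sketch.  It is NOT pan's code.  The header set is an assumption 
based on the default NNTP overview fields, and a given server may 
advertise more.  The function name scorable_before_download is made up 
for illustration.

```python
# Assumed default overview fields; a server may advertise extra ones,
# so treat this set as illustrative only.
OVERVIEW_HEADERS = {
    "subject", "from", "date", "message-id", "references", "bytes", "lines",
}


def scorable_before_download(header_name: str) -> bool:
    """Return True if a rule on this header can apply from overview data alone."""
    return header_name.strip().rstrip(":").lower() in OVERVIEW_HEADERS


for name in ("From:", "X-Authenticated-User:", "NNTP-Posting-Host:"):
    when = "pre-download" if scorable_before_download(name) else "post-download only"
    print(f"{name:<24} {when}")
```

Under that assumption, From: comes out as pre-download, while the two 
X-style headers asked about only become visible after the article body 
has been fetched.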

It's quite possible you already realized this, and pan isn't scoring on 
it even after download.  But you don't say either way, so...


Meanwhile, based on the below, you know about the initial ~ negation, and 
it appears you know regular expressions (but I suspect some regex 
formatting problems, see below).

However, the above includes comments (the lines starting with %, so the 
%BOS, %EOS, and %Score created by... lines), which you don't specifically 
mention /as/ comments, so I'm not sure if you're taking them as required, 
based on pan putting them there, or if you realized that they /are/ 
comments.

Anyway, FWIW, I removed the %BOS and %EOS comments from my scorefile, and 
actually combined a whole bunch of sections (denoted by the [newsgroup] 
lines) and scores, *VASTLY* simplifying my scorefile.  Of course, that 
means I generally edit it by hand, except for the real temporary stuff.  
Similarly, I remove the created on date comments for the permanent stuff, 
only keeping it for the temporary stuff with expires, thus again 
simplifying things by removing what is mostly just "pan comment noise", 
once you understand the format reasonably well.

Oh, and leading spaces are ignored, so you can indent as makes sense to 
you.  Here, I'm not indenting, since indents can unnecessarily complicate
quoting and wrapping in replies.

So these are the operative lines (re-quoted):

> [~xxxxxx]
> Score:: -600
> X-Authenticated-User: ^$$m$x1dqu63w4useridf$
 
> the string $$m$x1dqu63w4useridf  is the user ID on the NNTP server and
> posted in the header
> 
> The original line in header:
> X-Authenticated-User: $$m$x1dqu63w4useridf

OK, you haven't made clear whether you're expecting pan to be able to 
score on that BEFORE download, like it does with the from header (which 
*IS* in overviews, so it can do so), or whether you know it can't do 
that, and are reporting that it isn't scoring AFTER download, either, 
which AFAIK, according to khaley, it should.

Assuming that it's not working AFTER download either, then we have a 
problem.

Have you tried simplifying the score, at least temporarily?  What if you 
try something as simple as the score below, which should apply to EVERY 
post with an X-Authenticated-User header?  (FWIW, I chose =5000 so that, 
if you have the colorcoding set up correctly in preferences, it should be 
reasonably easy to spot, and a positive score so it won't be hidden if 
you have below-zero posts hidden by default.  Scores aren't permanent, so 
we can add this for testing only, then delete it and force a rescore, and 
pan should return to normal scoring as if we'd never tried this one.):

[~xxxxxx]
Score: =5000
X-Authenticated-User: ^.*$

If that works, but ONLY AFTER downloading the message so it can see the 
header in question, as I suspect it SHOULD, then we know it's scoring on 
the header as it should, but tripping up on the regex you tried to use.

Here's your attempted regex line again:

> X-Authenticated-User: ^$$m$x1dqu63w4useridf$

You are trying to match:

> $$m$x1dqu63w4useridf 

As you know if you know regexes, $ is the regex symbol for right-anchor, 
end of line.  (^ is the symbol for left-anchor, beginning of line.)

The problem, I suspect, is that you're not escaping (with a backslash, 
see the slrn doc examples) those end-of-lines.  Since the first symbol 
pan sees in the regex is $, it's parsing that as end-of-line, thus, the 
header would have to be there but empty in order to match your score.
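
The effect of the unescaped $ can be checked outside pan.  Here's a 
small sketch using Python's re module.  That's an assumption on my part: 
it is not the regex engine pan uses, but ^, $ and backslash escaping 
behave the same way for these particular patterns.  The variable names 
are mine.

```python
import re

header_value = "$$m$x1dqu63w4useridf"   # header content as quoted above

unescaped = r"^$$m$x1dqu63w4useridf$"   # pattern from the original score
escaped = "^" + re.escape(header_value) + "$"   # literal $ signs, anchored

# Unescaped: each $ is read as end-of-line, so the real value can't match.
print(re.search(unescaped, header_value))

# Escaped: every $ in the value is matched literally.
print(escaped)
print(bool(re.search(escaped, header_value)))

# The catch-all test pattern matches any value, including an empty one.
print(bool(re.search(r"^.*$", header_value)), bool(re.search(r"^.*$", "")))
```

The first print shows no match, and the escaped pattern it builds is the 
same backslashed form as in the slrn doc examples.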

If the ^.*$ test worked (^ is left anchor, . is any character, * means 
any number of the previous, so together .* means any number of any 
character, $ is end of line, so we have beginning of line/header, any 
number of any character, including no characters at all, end of line, and 
it should thus always match if that header appears at all), try this 
instead (be sure and delete the ^.*$ test tho, or it'll stop on the first 
one it sees due to the =5000 score):

[~xxxxxx]
Score: =1111
X-Authenticated-User: \$\$m\$x1dqu63w4useridf

That is without anchors so it should indeed match the desired header 
content, but it might match others that include it, too.  But we first 
want to get it working, then try to narrow the scope and make sure it 
keeps working.  So assuming that works, now we'll try it with the anchors 
(again, delete the test above first):

[~xxxxxx]
Score: =2222
X-Authenticated-User: ^\$\$m\$x1dqu63w4useridf$


If that too works, now, again, only after download since we're dealing 
with a non-overview header, then the final bit is to change that score to 
your desired score:

[~xxxxxx]
Score: -600
X-Authenticated-User: ^\$\$m\$x1dqu63w4useridf$

One final note.  That's a /relative/ -600.  So other score matches can 
increase or decrease it.  That may or may not be what you want.  If you 
want it set to -600, absolute (and to quit processing further relative 
scores), use =-600 instead of just -600.
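
A toy model of the relative/absolute difference, in Python.  This is a 
sketch of the behavior as described in this mail, not pan's actual 
scoring code.  final_score and its list-of-strings input are invented 
for illustration.

```python
def final_score(matching_rules):
    """Combine the Score values of all matching rules, in scorefile order.

    matching_rules: score strings such as "-600" (relative) or "=-600"
    (absolute).  An absolute score sets the result and stops processing.
    """
    total = 0
    for rule in matching_rules:
        if rule.startswith("="):
            return int(rule[1:])   # absolute: set the score, look no further
        total += int(rule)         # relative: add to whatever matched so far
    return total


print(final_score(["-600", "100"]))    # relative -600, nudged by a later match
print(final_score(["=-600", "100"]))   # absolute -600, later match ignored
```

So with a plain -600, some other matching +100 score leaves the post at 
-500, while =-600 pins it at -600 no matter what follows.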


> This works well
> %BOS
> %Score created by Pan on Mon Oct  3 10:21:15 2011
> [~xxxxxx]
> Score:: =-606
> From: address@hidden \(Name Equall\)$
> %EOS

Four things to note (mostly repeat/reinforcement from above) here.  
First, the scoring is on the from header, which is in overviews, so this 
score should work pre-download.  Second, it's using = scoring, so it sets 
an absolute =-606 to anything matching (provided it didn't match any 
previous absolute scores and thus never got this far), and quits looking 
for further matches.  Third, again, the %... lines are comments.  Fourth, 
note the backslash escaping of the . and ().

=:^)

> [~xxxxxx] (is not xxxxxx)  works  as Joker for all groups
> [*] don't work

What about [*.*] ?  (What sort of group name doesn't contain a . at all?  
None I've ever come across.)  But you found something that's working 
already, so why bother changing?  So it's just for the sake of argument.

It's worth repeating here, as it's a common source of confusion: section/
group lines take * wildcards, **NOT** the regex format used in header 
matches.
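
To illustrate wildcard matching vs regex matching on group names, here's 
a rough Python sketch.  It uses Python's fnmatch module as a stand-in 
for shell-style wildcards, which is an assumption: pan's own parser may 
differ in the details (as the [*] report above suggests), and 
section_applies is a made-up helper, not anything from pan.

```python
import re
from fnmatch import fnmatchcase


def section_applies(section: str, group: str) -> bool:
    """Hypothetical helper: does a [section] line cover this group name?

    A leading ~ negates the wildcard, as in [~xxxxxx].
    """
    negate = section.startswith("~")
    pattern = section[1:] if negate else section
    matched = fnmatchcase(group, pattern)
    return not matched if negate else matched


print(section_applies("~xxxxxx", "de.comp.os.unix"))   # not xxxxxx, so it applies
print(section_applies("*.*", "de.comp.os.unix"))       # any name containing a dot

# The same text is not even a valid regex: * has nothing to repeat.
try:
    re.compile("*.*")
except re.error:
    print("'*.*' is a wildcard, not a regex")
```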

> .....
> How can I change, expand the selection at / Article / add new rating ..
> / From the article ....
> 
> In German /Artikel/Neue Wertung hinzufügen ../Und die Artikel ....
> .....
> The next X-Header I want to use for scoring is the NNTP-Posting-Host:
> and with a Joker for a part of the hostname looks like this
> NNTP-Posting-Host: *.hostname.domain.com * is the ip from the posting
> host and a Joker is needed .....

The discussion above should have answered that, but if not, try

NNTP-Posting-Host: \.hostname\.domain\.com$

Again, ^ left-anchors but we do NOT want that here.  The first part 
doesn't matter, and unanchored is one way to do that. A single dot 
matches any single character, so to match ONLY a dot, use the backslash 
escape (otherwise both xhostname and x.hostname would match, we want a 
dot and only a dot).  We *DO* want it anchored at the end, thus the $ 
(otherwise, it'd match .com , .command , .compass ...).

Alternatively, with a left-anchor and explicit "match anything" bit 
immediately after it:

NNTP-Posting-Host: ^.*\.hostname\.domain\.com$
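
The escaping and anchoring points are easy to check with a few sample 
values.  Below is a short Python sketch, again using Python's re module 
as a stand-in for pan's regex handling (an assumption).  The sample 
hostnames are invented, built around the hostname.domain.com placeholder 
from the question.

```python
import re

hosts = [
    "10-0-0-1.hostname.domain.com",   # what we want to catch
    "xhostname.domain.com",           # no dot before "hostname"
    "a.hostname.domain.command",      # does not end in .com
]


def matching_hosts(pattern: str) -> list:
    """Return the sample hosts that the pattern matches anywhere in the value."""
    return [h for h in hosts if re.search(pattern, h)]


print(matching_hosts(r"\.hostname\.domain\.com$"))      # escaped, right-anchored
print(matching_hosts(r"^.*\.hostname\.domain\.com$"))   # explicit match-anything
print(matching_hosts(r".hostname.domain.com"))          # unescaped, unanchored
```

The first two patterns pick out only the first sample host, while the 
unescaped, unanchored one lets all three through.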


(Now do you see why I couldn't really handle all that when I was too 
tired to think straight, with jacked up blood sugar as well?  But 
hopefully the answer was worth the wait!  =:^)

> Plus and minus scoring appear interchanged in the pan German version
> too.

That's covered in other replies, but briefly, HM just caught that maybe a 
week or two ago and already has it changed in his git repo.  However, I 
don't believe that patch has made it to khaley's mainline repo (lostcoder 
on github) yet, or from there to the official gnome repo via pkovar.

> Could I attach some screenshots in this mailing list too?

Attachments should work, yes.  But please be considerate of others' 
bandwidth and don't overdo it.  256-color pngs of just a dialog window 
should be fine.  Full-screen 24-bit or 32-bit color pngs really aren't 
appropriate, tho you probably won't get much complaint for a one-off, but 
might if you do several.  For something big like that, I'd recommend 
uploading it to your webspace, or imagebin.org (good for a couple weeks 
only, tho), or the like, and posting the link and description.

> PS: Please excuse my bad english, my native language is German, I live
> in Austria :-)

As the saying goes, you speak far better English than I do German! =:^)  

Seriously, you write English better than many native speakers.


One final thing.  I and others have mentioned repos a few times, but it 
occurs to me that you may not know where they are.  You'll need to have 
git installed to fetch the git sources, but you can just browse via the 
web interface.  Of course, these are sources, so to do anything more than 
look, you'll need to know how to build them locally.

official gnome repo (releases are made off of this):

http://git.gnome.org/pan2
git://git.gnome.org/pan2

pkovar's github repo (from which the gnome repo is pulled)

https://github.com/pmkovar/pan2
git://github.com/pmkovar/pan2

khaley/lostcoder's github repo (mainline from which pkovar pulls)

https://github.com/lostcoder/pan2
git://github.com/lostcoder/pan2

hmueller/judgefudge's github repo (leading edge experimental)

https://github.com/judgefudge/pan2
git://github.com/judgefudge/pan2

There's also jlynch/aexoden's github repo.  I noticed this browsing github 
one day, but he doesn't appear to be active here, and I really don't have 
much info on it but what's easily apparent from a 5 minute browse of the 
repo via https.  I DO know that a couple of his patches have made it into 
mainline, tho.

https://github.com/aexoden/pan2
git://github.com/aexoden/pan2

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman



