bug-cfengine
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

editfiles bug and odd behaviour


From: David Ressman
Subject: editfiles bug and odd behaviour
Date: Fri, 16 Mar 2001 15:59:33 -0600

I apologize for the length of this email, but these problems take a lot of 
work to explain accurately.  Unless specifically stated otherwise, all 
source is from 1.6.3.


We use the RunScriptIfNoLineMatching command in editfiles to look for a
printer definition in the printcap file and run a script to add the printer
if it's not found.  Our cfengine input file looks like this:

 { /etc/printers.conf
   SetScript "/bin/lpset -a bsdaddr=host,printer -a description=blah 
    printer ;"
   RunScriptIfNoLineMatching "^printer:\\$"

   <repeated for every printer>
 }

It didn't ever work until we added the ";" to the end of the SetScript,
although at the time, we didn't bother to investigate further.  (This is
important, for reasons explained later.)

This worked fine under 1.5.4, but when we switched to 1.6.0 it broke 
completely.  It would execute the input file without error, but the command
to add the printer wouldn't actually be run.  If we looked at the debugging
output, we could see that there was a problem.

As cfengine parsed the file, everything looked fine (\'s added by me):

  AddEditAction(/etc/printers.conf,SetScript,/bin/lpset -a bsdaddr=host,\
   printer -a description=blah printer)
  EditActionsToCode(SetScript)
  AddEditAction(/etc/printers.conf,RunScriptIfNoLineMatching,^printer:\\$)

  SetScript [/bin/lpset -a bsdaddr=host,printer -a description=blah printer]
  RunScriptIfNoLineMatching [^printer:\\$]

However, when it was actually trying to do it, we got this output:

  Edit action: SetScript
  Edit action: RunScriptIfNoLineMatching
  Running command: ^printer:\\$ /etc/printers.conf solaris  2>&1
  cfpopen(^printer:\\$ /etc/printers.conf solaris  2>&1)
  cfpclose(pp)
  cfpopen - Waiting for process 10394

And this didn't seem right at all.  From the debugging output, it looked
like it was actually trying to execute the search regexp string instead of
the defined script.  This looked like a simple case of just passing the
wrong argument to some function, so we decided to take a look.

In edittools.c, there are 3 places of interest to look (all in DoEditFile):

 Line 332: 
   char *currenteditscript, searchstr[bufsize], expdata[bufsize];

 Line 413:
   ExpandVarstring(ep->data,expdata,NULL);

 Line 778:
    case SetScript:
        currenteditscript = expdata;
        break;

    <...>

    case RunScriptIfNoLineMatching:
        if (! LocateNextItemMatching(filestart,expdata))
           {
           if (! RunEditScript(currenteditscript,filename,&filestart,ptr))
           <...> 
           
The last 2 (from just above 413) are enclosed in a while loop, and it 
appears that for every file to be edited by editfiles, it goes through this
loop one time for every action to be performed on that file.

Since we're running SetScript and then RunScriptIfNoLineMatching, it goes
through this list at least twice for each printer we do this with.  For each
iteration of this while loop, the command at line 413 will take the 
arguments for the particular action, expand any variables, and put them in 
the expdata array.  When it runs through the SetScript case, it assigns that 
currenteditscript pointer to point to the expdata array so that the actual 
script will be stored with the data provided by SetScript when it runs 
through the RunScriptIfNoLineMatching case.

Here's the problem.  Since the expdata array gets populated with new data 
at every iteration of the while loop, and since currenteditscript is only a
pointer to the expdata array, the data that is supposed to stay in 
currenteditscript for the next iteration of the loop gets rewritten.  That's 
why we see it trying to execute "^printer:\\$".  Since "^printer:\\$" is in 
expdata for the RunScriptIfNoLineMatching case, currenteditscript just
points to that data.  This was easily fixed by changing currenteditscript to
a regular array and changing SetScript so that it strncpys the data into it.
This fixes that particular problem very well, and it appears to me to be the
proper way to do it (although I didn't write it, so Mark would know better).

See enclosed patch1 for the fix.

However, once this bug was fixed, something else popped up. (This 
description is heavily plagiarized from an email my boss sent me.)

There's one difference between 1.5.4 & 1.6.3 w/ regard to the cfpopen() call 
in RunEditScript() in edittools.c.  Specifically, in 1.5.4, RunEditScript 
*doesn't use* cfpopen(); it rather, it calls the Unix C library function 
popen().  Most likely, cfpopen() is designed to be a private work-alike for 
the popen() function.  However, there is at least one major difference, 
which is the the source of our problem.

popen() runs its argument by passing it to "/bin/sh -c", whereas cfpopen 
execv()'s its argument.  execv() simply executes its argument, w/o passing 
it to a shell.  The consequence of this for our editscript is that the ";" 
we had to put at the end of our script when we were still using 1.5.4 and 
the "solaris" and "2>&1" cfengine adds are passed directly to what
gets execv()'d (/bin/lpset); since execv() doesn't call a shell, there's no 
shell to interpret these extra args, and so get passed directly to lpset.

We have a problem with the extra arguments cfengine is adding.  If you
run lpset manually w/ these extra args, it will complain w/ a usage
message.  I'm guessing the reason our edit script worked w/ 1.5.x is
because the script was being passed to a shell via the popen(); as a
result, the ";" at the end of our script separated the real lpset
command & its args from the args cfengine tacked on.  Hence, the lpset
ran correctly, and then shell tried to run the extra args as another
command (at no consequence however).  I'm guessing this is the reason
why we needed to add ";" to make that work.

A workaround, w/o any change to the cfengine code, would be define our
commands using a shell.  That is, rather than:

  SetScript "/bin/lpset -a bsdaddr=host,printer -a
             description=blah printer"

we prefix all of that "/bin/sh -c":

  SetScript "/bin/sh -c '/bin/lpset -a bsdaddr=host,printer -a
             description=blah printer'"

/bin/sh will just ignore the extra arguments that cfengine supplies.

Alternatively, we could write a shell wrapper to "do the right thing", but
I like the idea of cfengine running the actual lpset command more than I
like the idea of having to write a wrapper just to deal with these extra
arguments.

Unfortunately, this reeks of hack to us, but to fix this in code, we'd have 
to not add the extra args, revert to using popen() rather than cfpopen(), or 
change cfpopen() to use "/bin/sh -c"  Any of these would depend on what 
Mark's intent was when he replaced popen and added those extra arguments.

Since it seems unlikely that many people out there are actually using these
RunScript[...] commands (since no one has complained about the pointer bug),
I'd like to see those extra arguments go away (at the very least, the 2>&1
should scram since cfpopen doesn't use a shell).  If you intend for them to
stay, it would be nice to have a configuration option to leave them out.

David

Attachment: cfengine-1.6.3-patch1
Description: Text document


reply via email to

[Prev in Thread] Current Thread [Next in Thread]