John M Collins wrote:
> On Thu, 2009-09-24 at 19:04 +0200, Trygve Laugstøl wrote:
>> John M Collins wrote:
>> > On Wed, 2009-09-23 at 23:18 +0200, Trygve Laugstøl wrote:
>> >> Hi
>> >>
>> >> I'm trying to package GNUBatch for OpenCSW [1]. I've created a package
>> >> and installed GNUBatch quite successfully. I can start it, run jobs with
>> >> gbch-r and they're executed. I even get an email about it every time.
>> >>
>> >> Now, the question is how do I get it to run the job on other nodes? I've
>> >> been through the manuals but haven't been able to find much info on the
>> >> subject. I can't find much info on how to create different queues, only
>> >> how to write expressions to select them.
>> >>
>> >> [1]: http://opencsw.org
>> >>
>> >> --
>> >> Trygve
>> >>
>> > You first need to have each other node set up so it sees "exported" jobs
>> > and variables from its peers - you should be able to change job
>> > parameters remotely etc.
>> >
>> > You may have to run gbch-hostedit to set up other nodes' IP addresses
>> > and stop/restart the scheduler.
>>
>> I think I've got the host file correctly configured. When I'm in
>> gbch-hostedit I can see the (correct) IP of the host.
>>
>> However, I'm not sure how to verify the file and the connection to the
>> other hosts.
>>
>> This is my setup:
>>
>> $ cat /opt/csw/etc/gnubatch.hosts
>> # Host file created on 24/09/09 at 18:59:19
>>
>> skybert-6 s6 probe,manual,trusted
>>
>>
>> When I'm trying to access a variable on a remote node (assuming this is
>> the correct syntax) I'm getting:
>>
>> $ gbch-var skybert-6:CLOAD
>> gbch-var: Unknown variable skybert-6:CLOAD
>>
>> I did try to run "gbch-conn skybert-6" which seemed to work just fine
>> after switching the connection type to manual.
> If there had been anything wrong with the hosts file it would have given
> some error message at that point.
>
>> Are there any log files I can look at?
> I think it has probably worked OK
>
> You have got each machine with a hosts file entry pointing to the other
> one, haven't you?
Yep:

telestes:$ cat /opt/csw/etc/gnubatch.hosts
# Host file created on 24/09/09 at 18:59:19
skybert-6 s6 probe,manual,trusted

skybert-6:]$ cat /opt/csw/etc/gnubatch.hosts
# Host file created on 24/09/09 at 18:58:09
telestes - probe,manual,trusted
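
For what it's worth, here is a rough Python sketch of the cross-check you asked about, i.e. that each machine's hosts file names the other one. It is only a guess at the file format based on the two files above: I'm assuming three whitespace-separated fields (host name, alias, comma-separated flags), that "-" means "no alias", and that lines starting with "#" are comments. parse_hosts() and knows() are names I made up for this; they are not part of GNUBatch.

```python
def parse_hosts(text):
    """Return {hostname: (alias or None, [flags])} from hosts-file text.

    Assumed format (inferred from the two files above, not from the manual):
    "<name> <alias or -> <flag,flag,...>", with "#" starting a comment line.
    """
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split()
        name = fields[0]
        # Treat "-" as "no alias" (assumption).
        alias = fields[1] if len(fields) > 1 and fields[1] != "-" else None
        flags = fields[2].split(",") if len(fields) > 2 else []
        entries[name] = (alias, flags)
    return entries


def knows(entries, host):
    """True if host appears as a name or an alias in the parsed entries."""
    return any(host == name or host == alias
               for name, (alias, _flags) in entries.items())


# The two files exactly as shown above.
telestes_hosts = """# Host file created on 24/09/09 at 18:59:19
skybert-6 s6 probe,manual,trusted
"""
skybert_hosts = """# Host file created on 24/09/09 at 18:58:09
telestes - probe,manual,trusted
"""

# Each side should list the other one.
print(knows(parse_hosts(telestes_hosts), "skybert-6")
      and knows(parse_hosts(skybert_hosts), "telestes"))  # prints True
```

This only confirms that the entries are reciprocal; it says nothing about whether the schedulers have actually connected to each other.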

Now they're both in manual mode. I haven't seen any messages in the
"btsched-reps" file.

telestes:$ netstat -a|grep gnubatch
*.gnubatch                   Idle
*.gnubatch-netsrv            Idle
*.gnubatch          *.*      0      0 49152      0 LISTEN
*.gnubatch-feeder   *.*      0      0 49152      0 LISTEN
*.gnubatch-netsrv   *.*      0      0 49152      0 LISTEN
*.gnubatch-api      *.*      0      0 49152      0 LISTEN

skybert-6:$ netstat -a|grep gnubatch
*.gnubatch                   Idle
*.gnubatch-netsrv            Idle
*.gnubatch          *.*      0      0 49152      0 LISTEN
*.gnubatch-feeder   *.*      0      0 49152      0 LISTEN
*.gnubatch-netsrv   *.*      0      0 49152      0 LISTEN
*.gnubatch-api      *.*      0      0 49152      0 LISTEN
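
The netstat output only shows that the listeners exist locally. To see whether the TCP ports are reachable from the other box, something like the following Python sketch could be run on each host against its peer. It assumes the four service names resolve through the local services database (they show up by name in netstat, so they should), and it only tests that a TCP connection can be opened; it does not speak the GNUBatch protocol, and it ignores the UDP ports. port_open() and check_peer() are my own helper names.

```python
import socket

# TCP service names as they appear in the LISTEN lines above.
SERVICES = ["gnubatch", "gnubatch-feeder", "gnubatch-netsrv", "gnubatch-api"]


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, unreachable, or name lookup failed.
        return False


def check_peer(host, services=SERVICES):
    """Map each service name to True/False, or None if the name
    is not in the local services database."""
    results = {}
    for name in services:
        try:
            port = socket.getservbyname(name, "tcp")
        except OSError:
            results[name] = None
            continue
        results[name] = port_open(host, port)
    return results


# Example, run on telestes:
#   for name, state in check_peer("skybert-6").items():
#       print(name, state)
```

A False for any of these would point at a firewall or routing problem between the hosts rather than at the hosts file.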