Re: [help-gnubatch] How do I run jobs on a remote node
From: Trygve Laugstøl
Subject: Re: [help-gnubatch] How do I run jobs on a remote node
Date: Fri, 25 Sep 2009 14:48:03 +0200
User-agent: Thunderbird 2.0.0.23 (X11/20090817)
John M Collins wrote:
On Thu, 2009-09-24 at 19:04 +0200, Trygve Laugstøl wrote:
John M Collins wrote:
> On Wed, 2009-09-23 at 23:18 +0200, Trygve Laugstøl wrote:
>> Hi
>>
>> I'm trying to package GNUBatch for OpenCSW [1]. I've created a package
>> and installed GNUBatch quite successfully. I can start it, run jobs with
>> gbch-r and they're executed. I even get an email about it every time.
>>
>> Now, the question is how do I get it to run the job on other nodes? I've
>> been through the manuals but haven't been able to find much info on the
>> subject. I can't find much info on how to create different queues, only
>> how to write expressions to select them.
>>
>> [1]: http://opencsw.org
>>
>> --
>> Trygve
>>
> You first need to have each other node set up so it sees "exported" jobs
> and variables from its peers - you should be able to change job
> parameters remotely etc.
>
> You may have to run gbch-hostedit to set up other nodes' IP addresses
> and stop/restart the scheduler.
I think I've got the host file correctly configured. When I'm in
gbch-hostedit I can see the (correct) IP of the host.
However, I'm not sure how to verify the file and the connection to the
other hosts.
This is my setup:
$ cat /opt/csw/etc/gnubatch.hosts
# Host file created on 24/09/09 at 18:59:19
skybert-6 s6 probe,manual,trusted
When I'm trying to access a variable on a remote node (assuming this is
the correct syntax) I'm getting:
$ gbch-var skybert-6:CLOAD
gbch-var: Unknown variable skybert-6:CLOAD
I did try to run "gbch-conn skybert-6" which seemed to work just fine
after switching the connection type to manual.
If there had been anything wrong with the hosts file it would have given
some error message at that point.
Are there any log files I can look at?
I think it has probably worked OK.
You have got each machine with a hosts file entry pointing to the other
one, haven't you?
Yep:
telestes:$ cat /opt/csw/etc/gnubatch.hosts
# Host file created on 24/09/09 at 18:59:19
skybert-6 s6 probe,manual,trusted
skybert-6:]$ cat /opt/csw/etc/gnubatch.hosts
# Host file created on 24/09/09 at 18:58:09
telestes - probe,manual,trusted
Now they're both in manual mode. I haven't seen any messages in the
"btsched-reps" file.
telestes:$ netstat -a|grep gnubatch
*.gnubatch Idle
*.gnubatch-netsrv Idle
*.gnubatch *.* 0 0 49152 0 LISTEN
*.gnubatch-feeder *.* 0 0 49152 0 LISTEN
*.gnubatch-netsrv *.* 0 0 49152 0 LISTEN
*.gnubatch-api *.* 0 0 49152 0 LISTEN
skybert-6:$ netstat -a|grep gnubatch
*.gnubatch Idle
*.gnubatch-netsrv Idle
*.gnubatch *.* 0 0 49152 0 LISTEN
*.gnubatch-feeder *.* 0 0 49152 0 LISTEN
*.gnubatch-netsrv *.* 0 0 49152 0 LISTEN
*.gnubatch-api *.* 0 0 49152 0 LISTEN
If you look in the "btsched-reps" file there will be messages if it
doesn't understand a connection attempt.
After you've run "gbch-conn" check for a connection on the gnubatch port
using "netstat -a|grep gnubatch".
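A rough way to script that check is sketched below. It assumes a live peer connection shows up in the netstat output as a gnubatch line in the ESTABLISHED state, as opposed to the LISTEN and Idle lines that only mean the daemons are waiting. The function name has_gnubatch_peer is made up for illustration and is not part of GNUBatch.

```shell
# Hypothetical helper, not a GNUBatch command.
# Reads `netstat -a` output on stdin and succeeds only if at least one
# gnubatch line is in the ESTABLISHED state (assumed to indicate a live
# connection to a peer). LISTEN and Idle lines are ignored.
has_gnubatch_peer() {
    grep gnubatch | grep -q ESTABLISHED
}
```

It would be used after gbch-conn as "netstat -a | has_gnubatch_peer && echo connected". If nothing is printed, only the listening sockets are present.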
You won't "see" the variables on the other machine until you've marked
them for export on the other machine with "gbch-var -E varname". The
same is true of jobs. (I had to make it like that because otherwise the
network traffic is too great, especially when you have several hosts.)
I tried this on telestes as the user "gnubatch":
$ gbch-var -C TRYGVE -s 123
gbch-var: Unknown variable TRYGVE
Is this the right syntax?
$ gbch-vlist
CLOAD 0 Export # Current value of load level
LOADLEVEL 20000 # Maximum value of load level
LOGJOBS # File to save job record in
LOGVARS # File to save variable record in
MACHINE telestes-nge0.vs.inamo.no # Name of current host
STARTLIM 15 # Number of jobs to start at once
STARTWAIT 30 # Wait time in seconds for job start
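To see at a glance which variables are marked for export, the listing can be filtered. This is only a sketch: it assumes the layout shown in the listing above, where an exported variable carries the word "Export" somewhere between its name and the "#" comment. The function name exported_vars is invented for this example.

```shell
# Hypothetical helper, not a GNUBatch command.
# Reads gbch-vlist output on stdin and prints the name of every variable
# whose line carries an "Export" marker before the "#" comment.
# The column layout is assumed from the listing in this message.
exported_vars() {
    awk '{
        line = $0
        sub(/#.*/, "", line)          # drop the trailing comment
        n = split(line, f, " ")       # f[1] is the variable name
        for (i = 2; i <= n; i++)
            if (f[i] == "Export") { print f[1]; break }
    }'
}
```

It would be run as "gbch-vlist | exported_vars". With the listing above, only CLOAD carries the marker.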
On skybert-6:
$ gbch-vlist -R
CLOAD #
LOADLEVEL #
LOGJOBS #
LOGVARS #
MACHINE skybert-6.vs.inamo.no # Name of current host
STARTLIM #
STARTWAIT #
--
Trygve