Sunday 29 June 2014

Setting up a Torque PBS system

For Server

1. install:
sudo apt-get install torque-server torque-scheduler torque-client
2. create a new server configuration database:
pbs_server -t create
3. set the server name to macondo03 in /var/spool/torque/server_name
4. check the status by
# qmgr -c 'p s'
5. add computational nodes at /var/spool/torque/server_priv/nodes:
macondo01 np=64
macondo02 np=64
macondo03 np=64
macondo04 np=64

6. configure the server and create a default queue (the expected qmgr -c 'p s' output is shown after these commands):

qmgr -c "set server acl_hosts = macondo03"
qmgr -c "set server scheduling=true"
qmgr -c "create queue batch queue_type=execution"
qmgr -c "set queue batch started=true"
qmgr -c "set queue batch enabled=true"
qmgr -c "set queue batch resources_default.nodes=1"
qmgr -c "set queue batch resources_default.walltime=3600"
qmgr -c "set server default_queue=batch"
qmgr -c "set server keep_completed = 86400"    # amount of time to keep complete
qmgr -c "set server query_other_jobs = true"   # let other people see your job


7. check node status using
$ pbsnodes -a
8. run the PBS server and the scheduler:
pbs_sched
pbs_server
9. set pbs_sched and pbs_server to start on boot (a minimal init script sketch follows the commands below):

mv /filename /etc/init.d/
chmod +x /etc/init.d/filename 
update-rc.d filename defaults
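For reference, a minimal init script along these lines should work (a sketch only; it assumes the daemons are installed in /usr/sbin, so adjust the paths for your installation):

#!/bin/sh
# /etc/init.d/pbs - start/stop pbs_server and pbs_sched (minimal sketch)
case "$1" in
  start)
    /usr/sbin/pbs_server
    /usr/sbin/pbs_sched
    ;;
  stop)
    qterm -t quick      # shuts down pbs_server cleanly
    killall pbs_sched
    ;;
  *)
    echo "Usage: $0 {start|stop}"
    exit 1
    ;;
esac
exit 0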





For client:


(1)  install torque client

sudo apt-get install torque-client torque-mom
(2) specify the server name in /var/spool/torque/server_name:

macondo04
(3) configure /var/spool/torque/mom_priv/config and make sure it looks like this:
$pbsserver      macondo04          # note: this is the hostname of the headnode
$logevent       255                 # bitmap of which events to log
$usecp  *:/home /home
$usecp  *:/home/users /home/users
$usecp  *:/home/users1 /home/users1
$usecp  *:/home/users2 /home/users2
$usecp  *:/home/users3 /home/users3
Note that the $usecp lines specify how the compute node (the node that does the computational work) exchanges files with the head node (the node from which jobs are submitted).
Since the current system shares these directories over NFS, files can be copied directly with cp, without any barriers; a quick check is shown below.
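As a quick sanity check (assuming /home and the /home/users* directories are the NFS-shared paths), you can confirm on a compute node that they really are network mounts:

mount | grep /home     # should show nfs entries for /home and /home/users*
df -hT /home           # the Type column should read nfs or nfs4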

(4) make sure pbs_mom is running on the client machine:
pbs_mom
ps aux | grep "pbs"

(5) add pbs_mom to /etc/rc.local so that this daemon starts on boot (see the sketch below):
pbs_mom
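A minimal /etc/rc.local along these lines would do it (sketch; the path to pbs_mom may differ on your system):

#!/bin/sh -e
# /etc/rc.local - start the Torque MOM daemon at boot
/usr/sbin/pbs_mom
exit 0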

(6) you can check whether the head node can see the compute nodes by running the following command on any node:
pbsnodes -a
macondo01
     state = free
     np = 64
     ntype = cluster
     status = rectime=1404106821,varattr=,jobs=,state=free,netload=2275358527127,gres=,loadave=41.04,ncpus=64,physmem=131988228kb,availmem=259618040kb,totmem=266160896kb,idletime=1703,nusers=6,nsessions=20,sessions=2817 59937 18341 19455 19858 21924 59201 31663 32133 35793 54824 7341 42678 52013 53858 53863 58843 59605 59606 59741,uname=Linux macondo01 3.2.0-38-generic #61-Ubuntu SMP Tue Feb 19 12:18:21 UTC 2013 x86_64,opsys=linux

macondo02
     state = free
     np = 64
     ntype = cluster
     status = rectime=1404106834,varattr=,jobs=,state=free,netload=672949712804,gres=,loadave=0.00,ncpus=64,physmem=131988228kb,availmem=263269024kb,totmem=266160896kb,idletime=7647,nusers=6,nsessions=9,sessions=2691 16924 11828 16164 16336 19656 19765 49307 50259,uname=Linux macondo02 3.2.0-38-generic #61-Ubuntu SMP Tue Feb 19 12:18:21 UTC 2013 x86_64,opsys=linux

macondo03
     state = free
     np = 64
     ntype = cluster
     status = rectime=1404106809,varattr=,jobs=,state=free,netload=1285952855161,gres=,loadave=0.03,ncpus=64,physmem=131987480kb,availmem=261421664kb,totmem=266160148kb,idletime=1285,nusers=5,nsessions=5,sessions=2041 5539 15054 35499 36593,uname=Linux macondo03 3.8.0-39-generic #57~precise1-Ubuntu SMP Tue Apr 1 20:04:50 UTC 2014 x86_64,opsys=linux

macondo04
     state = free
     np = 64
     ntype = cluster
     status = rectime=1404106833,varattr=,jobs=,state=free,netload=366267047029,gres=,loadave=45.76,ncpus=64,physmem=131988228kb,availmem=253846276kb,totmem=266160896kb,idletime=2895,nusers=9,nsessions=23,sessions=2934 5542 12767 60790 21701 30995 31420 32046 36411 36593 36670 36675 52291 45704 47664 59365 57029 59697 62688 61747 62270 62342 63025,uname=Linux macondo04 3.2.0-38-generic #61-Ubuntu SMP Tue Feb 19 12:18:21 UTC 2013 x86_64,opsys=linux

working submission script:
#PBS -A uq-CivilEng
#PBS -N data
#PBS -l walltime=530:00:00
#PBS -l nodes=2:ppn=64
#PBS -j oe
#PBS -m abe -M chenming.zhang@uq.edu.au

cd $PBS_O_WORKDIR
date 1>output 2>error
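To use it, save the script under some name (job.pbs here is just an example) and submit it from the directory containing your input files:

qsub job.pbs      # submit; prints a job id such as 123.macondo03 (example id)
qstat -a          # check its state (Q = queued, R = running)
qdel 123          # remove the job if needed (use the real job id)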

PBS commands:
qterm -t quick
pbs_server
The two commands above restart the server without disturbing running simulations.





Problem so far:
how to restrict users from directly executing the files (i.e. running jobs outside the queue)


* qstat
* showq
* qmgr
* diagnose
- Maui's current tables for important metrics (-f: fairshare, -p: priority)
* pbsnodes
* qsummary
* idle_nodes
- displays the nodes that have $1 (or more) cpus free (default 1 or more)
* checkjob
* checknode

* showstate
* showstats
* qalter
- alter the parameters of a job. Usually applies to a job that is in the Q state and you want to change values (like cpu-time, target nodes) before it runs
* qhold
- put a hold on a job before it starts running (applies to job in Q state)
* qrun -H
- force a job to run on a particular node (example below)
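A few of these commands in use (the job id 123 and the node name are placeholders):

qalter -l walltime=100:00:00 123    # change the walltime of queued job 123
qhold 123                           # hold job 123 before it starts
qrun -H macondo02 123               # force job 123 to start on macondo02
checkjob 123                        # show the scheduler's view of the job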
To use the PBS batch queue manager on the feynman cluster:
- User must be added to ACL, thus authorized to submit jobs
- node016 and node017 are almost always free. They are time-shared nodes;
mainly for small (less than an hour) jobs, can be shared among multiple jobs.
- Node001-node015 are in exclusive mode, one job per CPU.
- All nodes have 2 CPUs each. There are two ways to submit a job:
- by default, your job goes to either node016 or node017. That is, all
you have to say is: "qsub "
- to run jobs in exclusive mode on node001-node015, do:
"qsub -l nodes=n ", where 'n' is the number of exclusive cpus
you want for your job.
- A better solution would be to use MAUI's fairshare, which tracks usage
per time interval and computes fairshare as a weighted sum of usage, with the
weight of older time intervals progressively decreasing.
(Use qsub2,qstat2,qdel2 for PBS server#2 @feynman:13001 serving Long Q)
Also see: http://physics/it/cluster/admin/pbs_cmds.html
qsub #submit a job, see man qsub
qdel -p jobid #will force purge the job if it is not killed by qdel
qdel_all #kill all jobs & stop PBS server.
#Run this before shutdown of cluster nodes.
#When all nodes are up again, run on feynman: service pbs restart
qsummary #lists summary of jobs/user
qstat #list information about queues and jobs
showq #calculated guess which job will run next
xpbs #GUI to PBS commands
qstat -q #list all queues on system
qstat -Q #list queue limits for all queues
qstat -a #list all jobs on system
qstat -au userid #list all jobs owned by user userid
qstat -s #list all jobs with status comments
qstat -r #list all running jobs
qstat -f jobid #list full information known about jobid
qstat -Qf queueid #list all information known about queueid
qstat -B #list summary information about the PBS server
qstat -iu userid #get info for jobs of userid
qstat -n -1 jobid #will list nodes on which jobid is running in one line
pbsnodes -l #will list nodes which are either down or offline
checknode node034 #will list status of node034 including jobids on that node
checkjob jobid #will list job details
pbsnodes -s feynman:13001 -o node100 #marks node100 offline to PBS2 (does not affect running jobs)
pbsnodes -s feynman:13001 -c node100 #clears offline status of node100 to PBS2
diagnose -f #lists current recent usage of a user/group vs quota?
showstart jobid #lists start of running or estimate for waiting jobs, get jobids from qstat2
setspri #to raise/lower priority see, http://www.clusterresources.com/products/maui/docs/commands/setspri.shtml
Restarting PBS server: Use "qterm -t quick" to shutdown the server without disturbing any job ( running or queued ). The server will cleanly shut-down and can be restarted when desired. Upon restart of the server, jobs that continue to run are shown as running; jobs that terminated during the server's absence will be placed into the exiting state.
- A stuck/hung uniprocessor job can be purged with qdel -p, but for multi-CPU MPI jobs
it is better to reboot the hung nodes first and then issue qdel from the PBS server, so that
the server can talk to the node to clean up the queue.
- It is important to specify an accurate walltime for your job in your PBS submission script. Selecting the default of 4 hours for jobs that are known to run for less time may result in the job being delayed by the scheduler due to an overestimation of the time the job needs to run.
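As an illustration, a job expected to run for about 90 minutes could request a walltime with only a small margin instead of the default:

#PBS -l walltime=02:00:00    # slight margin over the expected 90 minutes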
pbstop    # useful command to show the current running situation

It is said that Maui is a better scheduler than the built-in Torque scheduler (pbs_sched). The steps below give a rough guideline for installing Maui on Ubuntu 12.04; a build sketch follows.

1. install the build dependencies (libtorque, libnuma-dev, etc.):

apt-get install libadns1-dev libnuma-dev texlive-latex-extra texlive-core nvidia-cuda-dev

2. check the user resource limits with ulimit -a, and once the scheduler is running use pbstop to monitor the cluster.
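The Maui build itself then typically follows the usual configure/make pattern. A sketch, assuming the Maui 3.3.1 tarball and a Torque installation under /usr (the version and paths are assumptions, not taken from these notes):

# unpack the Maui source (version assumed for illustration)
tar xzf maui-3.3.1.tar.gz
cd maui-3.3.1

# point the build at the existing Torque installation
./configure --with-pbs=/usr --prefix=/usr/local/maui
make
sudo make install

# start the Maui daemon; stop pbs_sched so that only one scheduler runs
sudo /usr/local/maui/sbin/maui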

Reference:
https://wiki.archlinux.org/index.php/TORQUE
https://help.ubuntu.com/community/TorquePbsHowto
