Monday 30 June 2014

setup openldap host

The purpose of setting up an OpenLDAP host is to centralise authentication for the cluster: each user can log in to any node of the cluster with a single account. This tutorial explains the setup of the server. To make OpenLDAP work, one also needs to configure the client machines (the machines that use OpenLDAP to authenticate users).

1.  install slapd
sudo apt-get update
sudo apt-get install slapd ldap-utils phpldapadmin
2. invoke the slapd configuration wizard by executing
sudo dpkg-reconfigure slapd

  • Omit OpenLDAP server configuration? No
  • DNS domain name? macondo04.eait.uq.edu.au
  • Organization name?  macondo04.eait.uq.edu.au
  • Administrator password? input password
  • Database backend to use? HDB
  • Remove the database when slapd is purged? No
  • Move old database? Yes
  • Allow LDAPv2 protocol? No
3. configure /etc/phpldapadmin/config.php:
// $servers->setValue('server','host','domain_name_or_IP_address');
$servers->setValue('server','host','macondo04.eait.uq.edu.au');
// $servers->setValue('server','host','127.0.0.1');  // alternative: local loopback
$servers->setValue('server','base',array('dc=macondo04,dc=eait,dc=uq,dc=edu,dc=au'));
$servers->setValue('login','auth_type','session');
$servers->setValue('login','bind_id','cn=admin,dc=macondo04,dc=eait,dc=uq,dc=edu,dc=au');
4. Now you should be able to log in at macondo04.eait.uq.edu.au/phpldapadmin.
Note that the LDAP server is only available from the student region, so one needs to forward into the staff region by:


  ssh -L9980:127.0.0.1:80  -L10000:127.0.0.1:10000 -XY mpiuser@macondo04

then one should be able to log on to the LDAP server at:


 http://127.0.0.1:9980/phpldapadmin/
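Besides the phpLDAPadmin web UI, entries can be added from the command line with ldapadd. A hypothetical sketch (the user jdoe, the ou=people subtree, and the uid/gid numbers are made up for illustration; ou=people must already exist under the base DN):

```shell
# Bind as the admin DN configured above; -W prompts for the admin password
ldapadd -x -D "cn=admin,dc=macondo04,dc=eait,dc=uq,dc=edu,dc=au" -W <<'EOF'
dn: uid=jdoe,ou=people,dc=macondo04,dc=eait,dc=uq,dc=edu,dc=au
objectClass: inetOrgPerson
objectClass: posixAccount
objectClass: shadowAccount
cn: John Doe
sn: Doe
uid: jdoe
uidNumber: 10001
gidNumber: 10001
homeDirectory: /home/users/jdoe
loginShell: /bin/bash
EOF
```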

If one wishes his home folder not to be seen by others, he/she can do:
chmod 700 /home/user/chenming
(note: chmod takes only a mode; the user:group pair is chown syntax)

Reference:
https://help.ubuntu.com/community/LDAPClientAuthentication
http://hswong3i.net/blog/hswong3i/ldap-single-sign-webmin-ubuntu-12-04-howto
http://www.linux.com/learn/tutorials/377952%3Amanage-ldap-data-with-phpldapadmin
https://www.digitalocean.com/community/tutorials/how-to-authenticate-client-computers-using-ldap-on-an-ubuntu-12-04-vps
https://www.digitalocean.com/community/tutorials/how-to-install-and-configure-a-basic-ldap-server-on-an-ubuntu-12-04-vps

Sunday 29 June 2014

setup hostbased ssh

Host-based ssh refers to the protocol in which authentication at the remote server (where the sshd service runs) is done by checking the host machine (where the ssh command is executed), rather than a username and password. This is extremely important in several scenarios:

(1) it allows a group of users to access a cluster of machines once they have logged in from any single machine in the cluster;
(2) it allows a PBS system to be set up in which users can access all the CPU resources seamlessly from any node they log in to.

Procedures: 
1. Put all nodes and their aliases into /etc/hosts on both the remote server and the host machine:
127.0.0.1       hostname.example.com           localhost
130.102.72.43   macondo04.eait.uq.edu.au        macondo04
130.102.72.42   macondo03.eait.uq.edu.au        macondo03
130.102.72.41   macondo02.eait.uq.edu.au        macondo02
130.102.72.40   macondo01.eait.uq.edu.au        macondo01
2. Put all trusted nodes into /etc/hosts.equiv of the remote server:
130.102.72.40    # the IP addresses are the most important entries
130.102.72.41
130.102.72.42
130.102.72.43
10.33.20.120
macondo01 macondo03 macondo04 macondo01.eait.uq.edu.au macondo02.eait.uq.edu.au macondo03.eait.uq.edu.au macondo04.eait.uq.edu.au
# the aliases seem not very useful for host-based ssh; however, they are
# quite important for allowing OpenLDAP accounts to execute jobs.
This is one of the most important parts of host-based authentication, so one should carefully check the validity of /etc/hosts.equiv.

3. configure /etc/ssh/sshd_config of the remote server as follows:


# Package generated configuration file
# See the sshd_config(5) manpage for details

# What ports, IPs and protocols we listen for
Port 22
# Use these options to restrict which interfaces/protocols sshd will bind to
#ListenAddress ::
#ListenAddress 0.0.0.0
Protocol 2
# HostKeys for protocol version 2
HostKey /etc/ssh/ssh_host_rsa_key
HostKey /etc/ssh/ssh_host_dsa_key
HostKey /etc/ssh/ssh_host_ecdsa_key
#Privilege Separation is turned on for security
UsePrivilegeSeparation yes

# Lifetime and size of ephemeral version 1 server key
KeyRegenerationInterval 3600
ServerKeyBits 768

# Logging
SyslogFacility AUTH
LogLevel INFO

# Authentication:
LoginGraceTime 120
PermitRootLogin yes
StrictModes yes

RSAAuthentication yes
PubkeyAuthentication yes
#AuthorizedKeysFile     %h/.ssh/authorized_keys

# Don't read the user's ~/.rhosts and ~/.shosts files
IgnoreRhosts yes
# For this to work you will also need host keys in /etc/ssh_known_hosts
# changed by chenming (a trailing comment on the option line itself would break sshd_config parsing)
RhostsRSAAuthentication yes
# similar for protocol version 2
HostbasedAuthentication yes
# Uncomment if you don't trust ~/.ssh/known_hosts for RhostsRSAAuthentication
#IgnoreUserKnownHosts yes

# To enable empty passwords, change to yes (NOT RECOMMENDED)
PermitEmptyPasswords no

# Change to yes to enable challenge-response passwords (beware issues with
# some PAM modules and threads)
ChallengeResponseAuthentication no

# Change to no to disable tunnelled clear text passwords
#PasswordAuthentication yes

# Kerberos options
#KerberosAuthentication no
#KerberosGetAFSToken no
#KerberosOrLocalPasswd yes
#KerberosTicketCleanup yes

# GSSAPI options
#GSSAPIAuthentication no
#GSSAPICleanupCredentials yes

X11Forwarding yes
X11DisplayOffset 10
PrintMotd no
PrintLastLog yes
TCPKeepAlive yes
#UseLogin no

#MaxStartups 10:30:60
#Banner /etc/issue.net

# Allow client to pass locale environment variables

AcceptEnv LANG LC_*

Subsystem sftp /usr/lib/openssh/sftp-server

# Set this to 'yes' to enable PAM authentication, account processing,
# and session processing. If this is enabled, PAM authentication will
# be allowed through the ChallengeResponseAuthentication and
# PasswordAuthentication.  Depending on your PAM configuration,
# PAM authentication via ChallengeResponseAuthentication may bypass
# the setting of "PermitRootLogin without-password".
# If you just want the PAM account and session checks to run without
# PAM authentication, then enable this but set PasswordAuthentication
# and ChallengeResponseAuthentication to 'no'.
UsePAM yes

 The most important change in this section is to add:
RhostsRSAAuthentication yes
HostbasedAuthentication yes
4. On the remote server (macondo04 in this example), store the RSA public keys of the host machines by executing the following. (This is one of the most important parts of the configuration, as all the machines listed in /etc/ssh/ssh_known_hosts obtain the privilege to visit the host without entering a password.)
ssh-keyscan -t rsa macondo01 >> /etc/ssh/ssh_known_hosts
ssh-keyscan -t rsa macondo02 >> /etc/ssh/ssh_known_hosts
ssh-keyscan -t rsa macondo03 >> /etc/ssh/ssh_known_hosts
after that, /etc/ssh/ssh_known_hosts on the remote server (macondo04 in this example) will look like the following:
macondo01 ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDS9bpDVmAgB4SEljkS2zxxY
macondo02 ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC49CqXF
macondo03 ssh-rsa AAAB3NzaC1yc2EAAAADAQABAAABAQC9Nre7E2EUmWx/xso4MYCTPXdCyiad4q
To add the IP address and full DNS name to ssh_known_hosts on the remote server (macondo04 in this example), change the file to the following format:
macondo01,macondo01.eait.uq.edu.au,130.102.72.40 ssh-rsa AAAAB3NzaC1yc2EAAAADAQ
macondo02,macondo02.eait.uq.edu.au,130.102.72.41 ssh-rsa AAAAB3NzaC1yc2EAAAADA
macondo03,macondo03.eait.uq.edu.au,130.102.72.42 ssh-rsa AAAB3NzaC1yc2EAAAADAQA
Testing showed that if the IP address is not included in ssh_known_hosts, host-based authentication may fail.
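The keyscan-then-rename steps above can be wrapped in a small helper; the sed rewrite is the only non-obvious part (host names and IPs follow the /etc/hosts table above):

```shell
# Rewrite "macondo01 ssh-rsa AAAA..." into the three-name form used above.
expand_names() {   # $1=short name  $2=fqdn  $3=ip ; reads keyscan output on stdin
    sed "s/^$1 /$1,$2,$3 /"
}

# On the real server one would run, for each node (needs network access):
#   ssh-keyscan -t rsa macondo01 | expand_names macondo01 macondo01.eait.uq.edu.au 130.102.72.40 >> /etc/ssh/ssh_known_hosts

# Demonstration on a canned keyscan line:
echo "macondo01 ssh-rsa AAAAB3Nza" | expand_names macondo01 macondo01.eait.uq.edu.au 130.102.72.40
# -> macondo01,macondo01.eait.uq.edu.au,130.102.72.40 ssh-rsa AAAAB3Nza
```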

5. Don't forget to restart the ssh service after changing the configuration, and enable the ssh service on boot:
sudo service ssh restart
sudo update-rc.d ssh defaults

6. In the host machine, change /etc/ssh/ssh_config as follows:
Host *
 HostbasedAuthentication yes
 PreferredAuthentications     hostbased,publickey,keyboard-interactive,password
 EnableSSHKeysign        yes
 SendEnv LANG LC_*
 HashKnownHosts yes
 Now one should be able to ssh from the host machine to the remote server without entering a password. If it is still not working, one can use
ssh -vvvv abc@macondo01
to debug the host machine, or use:
/usr/sbin/sshd -ddd
to debug the remote server.

Appendix:
SSH from A to C via B

# hostA:~/.ssh/config (see man 5 ssh_config for details)
Host hostC
ProxyCommand ssh hostB nc %h %p  # or netcat or whatever you have on hostB

hostA:~$ ssh hostC  # this will automatically tunnel through ssh hostB

http://superuser.com/questions/107679/forward-ssh-traffic-through-a-middle-machine

Reference:
https://en.wikibooks.org/wiki/OpenSSH/Cookbook/Host-based_Authentication
http://users.telenet.be/mydotcom/howto/linux/sshpasswordless.htm

Setup torque pbs system

For Server

1. install:
sudo apt-get install torque-server torque-scheduler torque-client
2. create a new server database by:
pbs_server -t create
3. set the server name to macondo03 in /var/spool/torque/server_name
4. check the status by
# qmgr -c 'p s'
5. add computational nodes at /var/spool/torque/server_priv/nodes:
macondo01 np=64
macondo02 np=64
macondo03 np=64
macondo04 np=64

6. add more settings on the server:

qmgr -c "set server acl_hosts = macondo03"
qmgr -c "set server scheduling=true"
qmgr -c "create queue batch queue_type=execution"
qmgr -c "set queue batch started=true"
qmgr -c "set queue batch enabled=true"
qmgr -c "set queue batch resources_default.nodes=1"
qmgr -c "set queue batch resources_default.walltime=3600"
qmgr -c "set server default_queue=batch"
qmgr -c "set server keep_completed = 86400"    # amount of time to keep complete
qmgr -c "set server query_other_jobs = true"   # let other people see your job


7. check node status using
$ pbsnodes -a
8. run the pbs server and pbs scheduler by
pbs_sched
pbs_server
9. set pbs_sched and pbs_server to run on boot by installing an init script (filename below is a placeholder):

mv /filename /etc/init.d/
chmod +x /etc/init.d/filename
update-rc.d filename defaults
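What such an init script might contain, as a minimal sketch (the torque-server name and the use of qterm for a clean stop are assumptions, not the packaged script):

```shell
#!/bin/sh
# /etc/init.d/torque-server -- minimal sketch, not the packaged init script.
case "$1" in
    start)
        pbs_server       # the server daemon
        pbs_sched        # the bundled FIFO scheduler
        ;;
    stop)
        qterm -t quick   # shut down without disturbing running jobs
        ;;
    *)
        echo "Usage: $0 {start|stop}"
        exit 1
        ;;
esac
```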





For client:


(1)  install torque client

sudo apt-get install torque-client torque-mom
(2) specify the server name in /var/spool/torque/server_name

macondo04
(3) configure /var/spool/torque/mom_priv/config; make sure it looks like this:
$pbsserver      macondo04          # note: this is the hostname of the headnode
$logevent       255                 # bitmap of which events to log
$usecp  *:/home /home
$usecp  *:/home/users /home/users
$usecp  *:/home/users1 /home/users1
$usecp  *:/home/users2 /home/users2
$usecp  *:/home/users3 /home/users3
Note that the $usecp lines specify how the client node (the node that does the computational work) exchanges files with the head node (the node that submits the jobs).
Since the current system uses NFS, files can be copied directly with cp, without barriers.

(4) make sure pbs_mom is running on the client machine:
pbs_mom
ps aux | grep "pbs"

(5) add pbs_mom to /etc/rc.local so that this daemon runs on boot:
pbs_mom

(6) you can check whether the host machine can find the nodes by running the following command on any node:
pbsnodes -a
macondo01
     state = free
     np = 64
     ntype = cluster
     status = rectime=1404106821,varattr=,jobs=,state=free,netload=2275358527127,gres=,loadave=41.04,ncpus=64,physmem=131988228kb,availmem=259618040kb,totmem=266160896kb,idletime=1703,nusers=6,nsessions=20,sessions=2817 59937 18341 19455 19858 21924 59201 31663 32133 35793 54824 7341 42678 52013 53858 53863 58843 59605 59606 59741,uname=Linux macondo01 3.2.0-38-generic #61-Ubuntu SMP Tue Feb 19 12:18:21 UTC 2013 x86_64,opsys=linux

macondo02
     state = free
     np = 64
     ntype = cluster
     status = rectime=1404106834,varattr=,jobs=,state=free,netload=672949712804,gres=,loadave=0.00,ncpus=64,physmem=131988228kb,availmem=263269024kb,totmem=266160896kb,idletime=7647,nusers=6,nsessions=9,sessions=2691 16924 11828 16164 16336 19656 19765 49307 50259,uname=Linux macondo02 3.2.0-38-generic #61-Ubuntu SMP Tue Feb 19 12:18:21 UTC 2013 x86_64,opsys=linux

macondo03
     state = free
     np = 64
     ntype = cluster
     status = rectime=1404106809,varattr=,jobs=,state=free,netload=1285952855161,gres=,loadave=0.03,ncpus=64,physmem=131987480kb,availmem=261421664kb,totmem=266160148kb,idletime=1285,nusers=5,nsessions=5,sessions=2041 5539 15054 35499 36593,uname=Linux macondo03 3.8.0-39-generic #57~precise1-Ubuntu SMP Tue Apr 1 20:04:50 UTC 2014 x86_64,opsys=linux

macondo04
     state = free
     np = 64
     ntype = cluster
     status = rectime=1404106833,varattr=,jobs=,state=free,netload=366267047029,gres=,loadave=45.76,ncpus=64,physmem=131988228kb,availmem=253846276kb,totmem=266160896kb,idletime=2895,nusers=9,nsessions=23,sessions=2934 5542 12767 60790 21701 30995 31420 32046 36411 36593 36670 36675 52291 45704 47664 59365 57029 59697 62688 61747 62270 62342 63025,uname=Linux macondo04 3.2.0-38-generic #61-Ubuntu SMP Tue Feb 19 12:18:21 UTC 2013 x86_64,opsys=linux

A working submission script:
#PBS -A uq-CivilEng
#PBS -N data
#PBS -l walltime=530:00:00
#PBS -l nodes=2:ppn=64
#PBS -j oe
#PBS -m abe -M chenming.zhang@uq.edu.au

cd $PBS_O_WORKDIR
date 1>output 2>error
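Assuming the script above is saved as job.pbs (the filename is arbitrary), it is submitted and tracked with:

```shell
qsub job.pbs        # prints a job id, e.g. 123.macondo03
qstat -u $USER      # list your own jobs
qdel 123            # remove a job by id if needed
```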

PBS commands:
qterm -t quick
pbs_server      # these two commands restart the system without disturbing running simulations





Problems so far:
restrict users' right to directly execute the files


* qstat
* showq
* qmgr
* diagnose
maui's current tables for important metrics
-f: fairshare
-p: priority
* pbsnodes
* qsummary
* idle_nodes
- displays the nodes that have $1 (or more) cpus free (default 1 or more)
* checkjob
* checknode

* showstate
* showstats
* qalter
- alter the parameters of a job. Usually applies to a job that is in the Q state and you want to change values (like cpu-time, target nodes) before it runs
* qhold
- put a hold on a job before it starts running (applies to job in Q state)
* qrun -H
- force to run a job on a particular node
To use PBS batch queue manger on feynman cluster:
- User must be added to ACL, thus authorized to submit jobs
- node016 and node017 are almost always free. They are time-shared nodes;
mainly for small (less than an hour) jobs, can be shared among multiple jobs.
- Node001-node015 are in exclusive mode, one job per CPU.
- All nodes have 2 CPUs each. There are two ways to submit a job:
- by default, your job goes to either node016 or node017. That is, all
you have to say is: "qsub "
- to run jobs in exclusive mode on node001-node015, do:
"qsub -l nodes=n ", where 'n' is the number of exclusive cpus
you want for your job.
- A better solution will be to use MAUI's fairshare which tracks usage
per time-intervals. Computes fairshare by weighted sum of usage with
weight of usage in older time-intervals progressively decreasing.
(Use qsub2,qstat2,qdel2 for PBS server#2 @feynman:13001 serving Long Q)
Also see: http://physics/it/cluster/admin/pbs_cmds.html
qsub #submit a job, see man qsub
qdel -p jobid #will force purge the job if it is not killed by qdel
qdel_all #kill all jobs & stop PBS server.
#Run this before shutdown of cluster nodes.
#When all nodes are up again, do@feynman service pbs restart
qsummary #lists summary of jobs/user
qstat #list information about queues and jobs
showq #calculated guess which job will run next
xpbs #GUI to PBS commands
qstat -q #list all queues on system
qstat -Q #list queue limits for all queues
qstat -a #list all jobs on system
qstat -au userid #list all jobs owned by user userid
qstat -s #list all jobs with status comments
qstat -r #list all running jobs
qstat -f jobid #list full information known about jobid
qstat -Qf queueid #list all information known about queueid
qstat -B #list summary information about the PBS server
qstat -iu userid #get info for jobs of userid
qstat -n -1 jobid #will list nodes on which jobid is running in one line
pbsnodes -l #will list nodes which are either down or offline
checknode node034 #will list status of node034 including jobids on that node
checkjob jobid #will list job details
pbsnodes -s feynman:13001 -o node100 #marks node100 offline to PBS2 (does not affect running jobs)
pbsnodes -s feynman:13001 -c node100 #clears offline status of node100 to PBS2
diagnose -f #lists current recent usage of a user/group vs quota?
showstart jobid #lists start of running or estimate for waiting jobs, get jobids from qstat2
setspri #to raise/lower priority see, http://www.clusterresources.com/products/maui/docs/commands/setspri.shtml
Restarting PBS server: Use "qterm -t quick" to shutdown the server without disturbing any job ( running or queued ). The server will cleanly shut-down and can be restarted when desired. Upon restart of the server, jobs that continue to run are shown as running; jobs that terminated during the server's absence will be placed into the exiting state.
- A stuck/hung uniprocessor job can be purged with qdel -p but for multi-cpu MPI jobs,
you better reboot the hung nodes first and then do a qdel from pbs server so that the
server can talk to the node for cleaning up the queue.
- It is important to specify an accurate walltime for your job in your PBS submission script. Selecting the default of 4 hours for jobs that are known to run for less time may result in the job being delayed by the scheduler due to an overestimation of the time the job needs to run.
pbstop   # useful command to show the running situation

It is learned that Maui is better than the Torque scheduler. The steps below provide a guideline for installing Maui on Ubuntu 12.04.
1. install the dependencies:

apt-get install libadns1-dev libnuma-dev texlive-latex-extra texlive-core nvidia-cuda-dev
 ulimit -a

pbstop
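The dependencies above are for building Maui from source. A hypothetical build sketch (the tarball name, version, and install prefix are assumptions; the tarball comes from Adaptive Computing):

```shell
tar xzf maui-3.3.1.tar.gz
cd maui-3.3.1
# --with-pbs must point at the Torque installation; if it does not,
# configure fails complaining that libtorque / pbs-config is missing
./configure --prefix=/opt/maui --with-pbs=/usr
make && sudo make install
```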

Reference:
https://wiki.archlinux.org/index.php/TORQUE
https://help.ubuntu.com/community/TorquePbsHowto

setup openldap client

(1)  Install the following software:
apt-get install ldap-utils libpam-ldap libnss-ldap nslcd
During the installation, you will be asked several questions:
   a). configuring ldap-auth-config, initial value:
         ldapi:///
       we need to change this to:
         ldap://macondo04.eait.uq.edu.au
       note that it is ldap rather than ldapi; also note that there are only two "/" rather than three
    b). Distinguished name of the search base. change this to 
         dc=macondo04,dc=eait,dc=uq,dc=edu,dc=au
    c). LDAP version to use: select 3
    d). make local root database admin: select Yes
    e). Does the LDAP database require login?  No
     f). LDAP account for root:
         cn=admin,dc=macondo04,dc=eait,dc=uq,dc=edu,dc=au
    g). LDAP root account password: Your-LDAP-root-password
    h). LDAP server URI: 
          ldap://macondo04.eait.uq.edu.au/
    i).  LDAP server search base: 
          dc=macondo04,dc=eait,dc=uq,dc=edu,dc=au

This wizard is actually a procedure to configure /etc/ldap.conf. Make sure it looks like this:

base dc=macondo04,dc=eait,dc=uq,dc=edu,dc=au
uri ldap://macondo04.eait.uq.edu.au
ldap_version 3
rootbinddn cn=admin,dc=macondo04,dc=eait,dc=uq,dc=edu,dc=au
pam_password md5
If one wants to go through this process again, one can run:
dpkg-reconfigure ldap-auth-config
(2)  modify /etc/nsswitch.conf  file:
#Original file looks like this
passwd: compat 
group : compat  
shadow: compat 

#After appending "ldap" lines look like these
passwd: compat ldap
group : compat ldap  
shadow: compat ldap 
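After editing nsswitch.conf, getent is a quick way to confirm that NSS resolves accounts through every listed source (the LDAP user name below is hypothetical):

```shell
getent passwd root      # local account, resolved via "compat"
# getent passwd jdoe    # an LDAP account would appear in the same passwd format
```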
(3)  execute the following commands so that, if a user doesn't have a home folder, the system will create one at login:
echo "session required pam_mkhomedir.so skel=/etc/skel umask=0022">> /etc/pam.d/login
echo "session required pam_mkhomedir.so skel=/etc/skel umask=0022" >> /etc/pam.d/lightdm  
echo "session required    pam_mkhomedir.so skel=/etc/skel umask=0022">> /etc/pam.d/common-session
(4) One also needs to make sure the NFS server is properly mounted to the system.
echo "macondo04:/home/users /home/users nfs">> /etc/fstab
sudo mount macondo04:/home/users /home/users
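A quick way to confirm the share is actually mounted (assuming the export above):

```shell
mount | grep /home/users    # should show the macondo04 NFS export
df -h /home/users           # and its size/usage
```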
(5) remove the use_authtok parameter from /etc/pam.d/common-password on each host and client node so that all users can change their passwords using the passwd command.

(6) make sure you have restarted your nscd:
/etc/init.d/nscd restart

Reference:
http://askubuntu.com/questions/127389/how-to-configure-ubuntu-as-an-ldap-client
https://www.digitalocean.com/community/tutorials/how-to-authenticate-client-computers-using-ldap-on-an-ubuntu-12-04-vps
http://askubuntu.com/questions/340340/how-to-allow-ldap-user-to-change-password


Wednesday 25 June 2014

decode hpc cluster


$clienthost pbsserver
$clienthost paroo3
cpuset_create_flags 0
# Enforce memory limits
$enforce mem
# Use local disk for temporary files, job files and checkpoints
$jobdir_root /scratch
$tmpdir /scratch
##$checkpoint_path /scratch/checkpoint
# Restrict use of batch nodes to PBS, interactive logins not allowed
##$restrict_user on
##$restrict_user_maxsysid 500
# Use cp rather than scp to transfer/stage files i.e. use Panasas
$usecp barrine*.hpcu.uq.edu.au:/home/ /home/
$usecp *.barrine.hpcu.uq.edu.au:/home/ /home/
$usecp barrine*.hpcu.uq.edu.au:/home2/ /home2/
$usecp *.barrine.hpcu.uq.edu.au:/home2/ /home2/
$usecp barrine*.hpcu.uq.edu.au:/home3/ /home3/
$usecp *.barrine.hpcu.uq.edu.au:/home3/ /home3/
$usecp barrine*.hpcu.uq.edu.au:/home4/ /home4/
$usecp *.barrine.hpcu.uq.edu.au:/home4/ /home4/
$usecp barrine*.hpcu.uq.edu.au:/panfs/imb /panfs/imb
$usecp *.barrine.hpcu.uq.edu.au:/panfs/imb /panfs/imb
$usecp barrine*.hpcu.uq.edu.au:/work1 /work1/
$usecp *.barrine.hpcu.uq.edu.au:/work1/ /work1/
$usecp barrine*.hpcu.uq.edu.au:/work2/ /work2/
$usecp *.barrine.hpcu.uq.edu.au:/work2/ /work2/
$usecp barrine*.hpcu.uq.edu.au:/HPC/home /HPC/home/
$usecp *.barrine.hpcu.uq.edu.au:/HPC/home/ /HPC/home/
$usecp barrine.hpcu.uq.edu.au:/PROJ/ /PROJ/
$usecp *.barrine.hpcu.uq.edu.au:/PROJ/ /PROJ/
$usecp barrine.hpcu.uq.edu.au:/ebi/home /ebi/home
$usecp *.barrine.hpcu.uq.edu.au:/ebi/home /ebi/home
$usecp barrine.hpcu.uq.edu.au:/ebi/bscratch /ebi/bscratch
$usecp *.barrine.hpcu.uq.edu.au:/ebi/bscratch /ebi/bscratch

# Dynamic host-level resource definitions - added 13/02/10
# /tmp on tmpfs filesystem in memory (reports bytes free)
##localtmp !/opt/sw/sys/pbs/pbsres_diskspaceavail.bash /tmp
# /scratch on local disk (report bytes free)
##scratch !/opt/sw/sys/pbs/pbsres_diskspaceavail.bash /scratch
#scratch !/usr/local/bin/diskspace /scratch

ip address
10.120.12.50

# GigE Login Node Entries
10.120.12.50    barrine1.barrine.hpcu.uq.edu.au barrine1-ge     barrine1


uqczhan2@barrine1:~> df -Hh
df: `/home3/uqmmallo/.gvfs': Permission denied
df: `/home/uqdgree5/.gvfs': Permission denied
df: `/home/uqdgree5/MappedDrives': Permission denied
Filesystem                         Size  Used Avail Use% Mounted on
/dev/sda2                           25G   16G  9.1G  64% /
udev                                24G  232K   24G   1% /dev
tmpfs                               24G   12K   24G   1% /dev/shm
/dev/sda1                          504M   60M  420M  13% /boot
/dev/sda5                          1.6T   21G  1.5T   2% /scratch
panfs://10.150.250.193/acceptance   92T   71T   21T  78% /panfs/acceptance
panfs://10.150.250.193/sw          2.8T  2.1T  696G  76% /sw
panfs://10.150.250.193/home         28T   12T   17T  42% /home
panfs://10.150.250.193/home2        28T  6.5T   21T  24% /home2
panfs://10.150.250.193/home3        28T  6.5T   21T  24% /home3
panfs://10.150.250.193/home4        28T  6.5T   21T  24% /home4
panfs://10.150.250.193/work1        27T   23T  3.6T  87% /work1
panfs://10.150.250.193/work2        92T   71T   21T  78% /work2
ebiserver:/bscratch                200T  174T   26T  87% /ebi/bscratch
cirrus:/HPC/backup1                2.0T  1.3T  645G  68% /HPC/backup1
cirrus:/HPC/home                   2.0T  1.3T  720G  65% /HPC/home
paroo3:/var/spool/PBS/sched_logs    99G   55G   39G  59% /var/spool/PBS/sched_logs
cirrus:/PROJ/jarrah                1.2T 1002G  170G  86% /PROJ/jarrah

######panfs is panasas purchased from outside

#################################################################
this is machine for computing


Filesystem                         Size  Used Avail Use% Mounted on
/dev/sda2                           27G  8.4G   18G  32% /
udev                               509G  205k  509G   1% /dev
tmpfs                              509G     0  509G   0% /dev/shm
/dev/sda1                          529M   63M  440M  13% /boot
/dev/sda5                          555G   23G  532G   5% /scratch
panfs://10.150.250.193/acceptance  101T   78T   23T  78% /panfs/acceptance
panfs://10.150.250.193/sw          3.0T  2.3T  747G  76% /sw
panfs://10.150.250.193/home         30T   13T   18T  42% /home
panfs://10.150.250.193/home2        30T  7.2T   23T  24% /home2
panfs://10.150.250.193/home3        30T  7.2T   23T  24% /home3
panfs://10.150.250.193/home4        30T  7.2T   23T  24% /home4
panfs://10.150.250.193/work1        29T   26T  3.9T  87% /work1
panfs://10.150.250.193/work2       101T   78T   23T  78% /work2
ebiserver:/bscratch                220T  191T   29T  87% /ebi/bscratch


uqczhan2@b11a07:/var/spool/PBS/mom_priv> cat config
$clienthost paroo3
cpuset_create_flags 0
$restrict_user_maxsysid 499
$tmpdir /scratch
$jobdir_root /scratch
$usecp barrine*.hpcu.uq.edu.au:/home /home
$usecp *.barrine.hpcu.uq.edu.au:/home /home



1. Problem:
qsub job
qsub: Bad UID for job execution MSG=User szhang does not exist in server password file
   a first attempted fix (following the setup of barrine):
 echo "+::::::" >> /etc/passwd
actual solution:
  Make sure /etc/hosts.equiv has the machine name in it. See the Torque manual on this.

2. Problem:
add bash as the default shell in phpLDAPadmin


3. to list all entries in the directory:

ldapsearch -x -b 'dc=macondo04,dc=eait,dc=uq,dc=edu,dc=au' 'objectclass=*'

4. cannot change password


5. how to add


6. phpLDAPadmin cannot be accessed by all the users (which probably is a good idea)



7. Maui/libtorque is not working at the moment:

a library called libtorque or pbs-config is missing  15-01-04

8. who: check who is using the server.