Monday 1 December 2014

Compiling modpath6, modflow-nwt and modflow2005 on Linux (makefile change)

The source files are as follows:
MOD_GLOBAL.for  MOD_PARTICLEDATA.for    MP6Flowdata.for  MP6ParticleMgr.for     Writpts0.for    MOD_MPBAS.for   MOD_PrecisionCheck.for  MP6.for          MP6TrackParticles.for  MOD_MPDATA.for  MP6Budgetrd.for         MP6MPBAS1.for    MP6Util.for
Note that MP6PrecisionCheck.for has been renamed to MOD_PrecisionCheck.for because it contains a module; Fortran itself does not care about file names.
At first, I used:

gfortran *.for *.inc -o a.out
Errors were reported.

Then I decided to compile step by step:
gfortran -c MOD*.for
gfortran -c MP6*.for
gfortran *.o -o def
The binary file has been created, but it runs with errors.
So I decided to use ifort for compiling:
ifort -c MOD*.for
ifort -c MP6*.for
ifort *.o -o def
Everything works perfectly.
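
The two-step build above can be sketched as a small dry-run helper. This is only an illustration: the function name plan_build is hypothetical, it prints the commands instead of running them, and it assumes that the module files need no particular order among themselves, which may not hold for every code base.

```shell
# Print the compile commands in dependency order: module sources (MOD_*.for)
# first, then every other .for file, then the link step.
# The output name "def" follows the notes above; nothing is executed.
# Usage: plan_build COMPILER FILE...
plan_build() {
    fc="$1"; shift
    mods=""; rest=""
    for f in "$@"; do
        case "$f" in
            MOD_*.for) mods="$mods $f" ;;   # files that define modules
            *.for)     rest="$rest $f" ;;   # files that use them
        esac
    done
    if [ -n "$mods" ]; then echo "$fc -c$mods"; fi
    if [ -n "$rest" ]; then echo "$fc -c$rest"; fi
    echo "$fc *.o -o def"
}
```

For example, plan_build ifort *.for prints the three commands for the ifort build; piping the output to sh would run them.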

Learned:
  1. If direct compiling does not work, create the object files first and then link them.
  2. The *.inc files may not need to be listed in the compile command.
  3. Some programs depend on the compiler (either gfortran or ifort).

The same trick may also need to be used for SUTRA-MS

Modflow2005 is almost working out of the box, except that one needs to change the inc file.

For modflow-nwt, it was found that gfortran lacks the ieee_arithmetic module, while ifort has no problem finding it. A question has been raised on Stack Overflow. the other trick i made



Sunday 5 October 2014

use matlab -nodesktop -nosplash and oop

>>commandhistory

 This pops up the command history window, but the window does not remember all the commands executed in the terminal emulator.

 >>setenv('LD_LIBRARY_PATH',[getenv('PATH') ':' getenv('LD_LIBRARY_PATH')])

 This adds the system paths in front of the MATLAB library path (note the ':' separator between the two parts). It is extremely useful when running system commands in MATLAB using "!".


Debugging in command line

dbstop in Datareading

where Datareading is the name of the file in which the script is supposed to stop

Datareading

Then the script starts to run, and the editor pops up.

dbclear

clears all the breakpoints
2.  Use editor to edit one file
>>edit ~/Dropbox/Matlab/SutraLab/SutraLab/mfiles/slsetpath

methodsview  -- show all the methods

3.  Make sure tiff files are exactly the same as eps file:
See my question: http://stackoverflow.com/questions/3600945/printing-a-matlab-plot-in-exact-dimensions-on-paper
Alternative solution, using a bash command:
convert -colorspace RGB -density 300 coutour.eps -resize 1024x1024 image.jpg
A better command that gives highly compressed files:
convert -colorspace RGB -density 1000 coutour.eps -resize 1000x1000 -compress ZIP image.tif
export_fig is a useful tool for figure output,
but I found it is not very friendly with Linux.
So the workaround is, again, to do the calculation on Linux and then use Windows to output the results.
4.  Font size does not change in matlab in Linux 15-07-14
There are several things to look into to solve this problem:
(1) Check whether any fonts are missing.
http://stackoverflow.com/questions/16218979/changing-figure-fonts-in-matlab-has-no-effect
sudo apt-get install xfonts-75dpi xfonts-100dpi
This works for my ubuntu machine.

In Gentoo Linux:
emerge -av media-fonts/font-adobe-100dpi media-fonts/font-adobe-75dpi liberation-fonts
I also rebooted the computer.
Here are some useful commands:

 eselect fontconfig list
 find /usr/share/ -iname 'helvet'*
fc-match Helvetica
(2) Some people also report that MATLAB figures are very slow over ssh. The problem is caused by the nvidia driver; to resolve it, put the two Option lines below in the 'Device' section of /etc/X11/xorg.conf.
http://ifixdit.blogspot.com.au/2011/08/how-to-fix-matlab-small-figures-and.html

Option "UseEdidDpi"   "false"
Option "Dpi"          "92 x 92"
I have done this in Gentoo as well, but on Ubuntu (installed on a machine with an ATI card) I did not have to.







Monday 30 June 2014

setup openldap host

The reason for setting up an OpenLDAP host is to centralize the accounts of the cluster, so that each user can log in to any node with one account. This tutorial explains the setup of the server. To make OpenLDAP work, one also needs to configure the client machines (the machines that use OpenLDAP to authenticate users).

1.  install slapd
sudo apt-get update
sudo apt-get install slapd ldap-utils phpldapadmin
2. invoke the slapd configuration guide by executing
sudo dpkg-reconfigure slapd

  • Omit OpenLDAP server configuration? No
  • DNS domain name? macondo04.eait.uq.edu.au
  • Organization name?  macondo04.eait.uq.edu.au
  • Administrator password? input password
  • Database backend to use? HDB
  • Remove the database when slapd is purged? No
  • Move old database? Yes
  • Allow LDAPv2 protocol? No
3. configure /etc/phpldapadmin/config.php:
$servers->setValue('server','host','domain_nam_or_IP_address');
$servers->setValue('server','host','macondo04.eait.uq.edu.au');
$servers->setValue('server','host','127.0.0.1');
$servers->setValue('server','base',array('dc=macondo04,dc=eait,dc=uq,dc=edu,dc=au'));
$servers->setValue('login','auth_type','session');
$servers->setValue('login','bind_id','cn=admin,dc=macondo04,dc=eait,dc=uq,dc=edu,dc=au');
4. Now you should be able to log in at macondo04.eait.uq.edu.au/phpldapadmin.
Note that the LDAP server is only available from the student region, so one needs to forward the ports into the staff region with:


  ssh -L9980:127.0.0.1:80  -L10000:127.0.0.1:10000 -XY mpiuser@macondo04

Then one should be able to log on to the LDAP server at:


 http://127.0.0.1:9980/phpldapadmin/

If a user does not want their home folder to be seen by others, they can run:
chmod 700 /home/user/chenming

Reference:
https://help.ubuntu.com/community/LDAPClientAuthentication
http://hswong3i.net/blog/hswong3i/ldap-single-sign-webmin-ubuntu-12-04-howto
http://www.linux.com/learn/tutorials/377952%3Amanage-ldap-data-with-phpldapadmin
https://www.digitalocean.com/community/tutorials/how-to-authenticate-client-computers-using-ldap-on-an-ubuntu-12-04-vps
https://www.digitalocean.com/community/tutorials/how-to-install-and-configure-a-basic-ldap-server-on-an-ubuntu-12-04-vps

Sunday 29 June 2014

setup hostbased ssh

Host-based ssh refers to the scheme in which the remote server (where the sshd service runs) authenticates a login by checking the host machine (where the ssh command is executed) rather than a username and password. This is extremely important for several scenarios:

(1) allowing a group of users to access every machine in a cluster once they have logged in to any single machine of the cluster.
(2) setting up a PBS system where users can access all the CPU resources seamlessly from any node they log in to.

Procedures: 
1. Put all nodes and their aliases into /etc/hosts on both the remote server and the host machine:
127.0.0.1       hostname.example.com           localhost
130.102.72.43   macondo04.eait.uq.edu.au        macondo04
130.102.72.42   macondo03.eait.uq.edu.au        macondo03
130.102.72.41   macondo02.eait.uq.edu.au        macondo02
130.102.72.40   macondo01.eait.uq.edu.au        macondo01
2. Put all trusted nodes into /etc/hosts.equiv of the remote server:
130.102.72.40    #ip address is the most important one
130.102.72.41
130.102.72.42
130.102.72.43
# the alias seems not very useful for hostbased ssh, however, it is
# quite important to allow accounts in openldap to execute jobs.
10.33.20.120
macondo01 macondo03 macondo04 macondo01.eait.uq.edu.au macondo02.eait.uq.edu.au macondo03.eait.uq.edu.au macondo04.eait.uq.edu.au
This is one of the most important parts of host-based authentication. One should carefully check the validity of /etc/hosts.equiv.
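
As a rough sketch of how steps 1 and 2 fit together, the trusted-host list can be derived from the /etc/hosts entries instead of being typed twice. The function name hosts_to_equiv is hypothetical, and the sketch assumes one host per output line, which is my reading of the hosts.equiv format rather than something taken from the notes above.

```shell
# Read /etc/hosts-style lines on stdin and print every address and name of
# each non-loopback entry, one per line, as candidate hosts.equiv entries.
hosts_to_equiv() {
    awk '
        NF == 0       { next }   # blank line
        $1 ~ /^#/     { next }   # comment line
        $1 ~ /^127\./ { next }   # loopback entry
        {
            for (i = 1; i <= NF; i++) {
                if ($i ~ /^#/) break   # stop at a trailing comment
                print $i
            }
        }
    '
}
```

A possible use is hosts_to_equiv < /etc/hosts, reviewing the output by hand before copying it into /etc/hosts.equiv.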

3. configure /etc/ssh/sshd_config of the remote server as follows:


# Package generated configuration file
# See the sshd_config(5) manpage for details

# What ports, IPs and protocols we listen for
Port 22
# Use these options to restrict which interfaces/protocols sshd will bind to
#ListenAddress ::
#ListenAddress 0.0.0.0
Protocol 2
# HostKeys for protocol version 2
HostKey /etc/ssh/ssh_host_rsa_key
HostKey /etc/ssh/ssh_host_dsa_key
HostKey /etc/ssh/ssh_host_ecdsa_key
#Privilege Separation is turned on for security
UsePrivilegeSeparation yes

# Lifetime and size of ephemeral version 1 server key
KeyRegenerationInterval 3600
ServerKeyBits 768

# Logging
SyslogFacility AUTH
LogLevel INFO

# Authentication:
LoginGraceTime 120
PermitRootLogin yes
StrictModes yes

RSAAuthentication yes
PubkeyAuthentication yes
#AuthorizedKeysFile     %h/.ssh/authorized_keys

# Don't read the user's ~/.rhosts and ~/.shosts files
IgnoreRhosts yes
# For this to work you will also need host keys in /etc/ssh_known_hosts
RhostsRSAAuthentication yes   #done by chenming
# similar for protocol version 2
HostbasedAuthentication yes
# Uncomment if you don't trust ~/.ssh/known_hosts for RhostsRSAAuthentication
#IgnoreUserKnownHosts yes

# To enable empty passwords, change to yes (NOT RECOMMENDED)
PermitEmptyPasswords no

# Change to yes to enable challenge-response passwords (beware issues with
# some PAM modules and threads)
ChallengeResponseAuthentication no

# Change to no to disable tunnelled clear text passwords
#PasswordAuthentication yes

# Kerberos options
#KerberosAuthentication no
#KerberosGetAFSToken no
#KerberosOrLocalPasswd yes
#KerberosTicketCleanup yes

# GSSAPI options
#GSSAPIAuthentication no
#GSSAPICleanupCredentials yes

X11Forwarding yes
X11DisplayOffset 10
PrintMotd no
PrintLastLog yes
TCPKeepAlive yes
#UseLogin no

#MaxStartups 10:30:60
#Banner /etc/issue.net

# Allow client to pass locale environment variables

AcceptEnv LANG LC_*

Subsystem sftp /usr/lib/openssh/sftp-server

# Set this to 'yes' to enable PAM authentication, account processing,
# and session processing. If this is enabled, PAM authentication will
# be allowed through the ChallengeResponseAuthentication and
# PasswordAuthentication.  Depending on your PAM configuration,
# PAM authentication via ChallengeResponseAuthentication may bypass
# the setting of "PermitRootLogin without-password".
# If you just want the PAM account and session checks to run without
# PAM authentication, then enable this but set PasswordAuthentication
# and ChallengeResponseAuthentication to 'no'.
UsePAM yes

 The most important change for this section is to add:
RhostsRSAAuthentication yes 
HostbasedAuthentication yes
4.  On the remote server (macondo04 in this example), store the RSA public key of each host machine by executing the commands below. This is one of the most important parts of the configuration, as all the machines listed in /etc/ssh/ssh_known_hosts obtain the privilege to log in to the server without entering a password.
ssh-keyscan -t rsa macondo01 >> /etc/ssh/ssh_known_hosts
ssh-keyscan -t rsa macondo02 >> /etc/ssh/ssh_known_hosts
ssh-keyscan -t rsa macondo03 >> /etc/ssh/ssh_known_hosts
After that, /etc/ssh/ssh_known_hosts on the remote server (macondo04 in this example) looks like the following:
macondo01 ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDS9bpDVmAgB4SEljkS2zxxY
macondo02 ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC49CqXF
macondo03 ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC9Nre7E2EUmWx/xso4MYCTPXdCyiad4q
To add the IP address and the full DNS name to ssh_known_hosts on the remote server (macondo04 in this example), change the file to the following format:
macondo01,macondo01.eait.uq.edu.au,130.102.72.40 ssh-rsa AAAAB3NzaC1yc2EAAAADAQ
macondo02,macondo02.eait.uq.edu.au,130.102.72.41 ssh-rsa AAAAB3NzaC1yc2EAAAADA
macondo03,macondo03.eait.uq.edu.au,130.102.72.42 ssh-rsa AAAAB3NzaC1yc2EAAAADAQA
Testing showed that if the IP address is not included in ssh_known_hosts, host-based authentication may fail.
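
The manual edit described above could be scripted along these lines. This is a minimal sketch: the function name known_hosts_entry is hypothetical, and it only rewrites the first field of an ssh-keyscan output line, leaving the key type and key untouched.

```shell
# Turn "host keytype key" lines (as printed by ssh-keyscan) into
# ssh_known_hosts entries that list the alias, the full DNS name and the IP.
# Usage: ssh-keyscan -t rsa ALIAS | known_hosts_entry ALIAS FQDN IP
known_hosts_entry() {
    awk -v names="$1,$2,$3" '
        $1 ~ /^#/ { next }                  # skip banner/comment lines
        NF >= 3   { print names, $2, $3 }   # keep key type and key as they are
    '
}
```

For example, ssh-keyscan -t rsa macondo01 | known_hosts_entry macondo01 macondo01.eait.uq.edu.au 130.102.72.40 >> /etc/ssh/ssh_known_hosts would append one complete entry.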

5. Don't forget to restart the ssh service after changing the configuration, and enable the ssh service on boot:
sudo service ssh restart
sudo update-rc.d ssh defaults

6. In the host machine, change /etc/ssh/ssh_config as follows:
Host *
 HostbasedAuthentication yes
 PreferredAuthentications     hostbased,publickey,keyboard-interactive,password
 EnableSSHKeysign        yes
 SendEnv LANG LC_*
 HashKnownHosts yes
 Now one should be able to ssh from the host machine to the remote server without entering a password. If it still does not work, one can use
ssh -vvvv abc@macondo01
to debug the host machine, or use:
/usr/sbin/sshd -ddd
to debug the remote server.

Appendix:
SSH from A to C via B

# hostA:~/.ssh/config (see man 5 ssh_config for details)
Host hostC
ProxyCommand ssh hostB nc %h %p  # or netcat or whatever you have on hostB

hostA:~$ ssh hostC  # this will automatically tunnel through ssh hostB

http://superuser.com/questions/107679/forward-ssh-traffic-through-a-middle-machine

Reference:
https://en.wikibooks.org/wiki/OpenSSH/Cookbook/Host-based_Authentication
http://users.telenet.be/mydotcom/howto/linux/sshpasswordless.htm

Setup torque pbs system

For Server

1. install:
sudo apt-get install torque-server torque-scheduler torque-client
2. create new profile by:
pbs_server -t create
3. change the server name to macondo03 in /var/spool/torque/server_name
4. check the status by
# qmgr -c 'p s'
5. add computational nodes at /var/spool/torque/server_priv/nodes:
macondo01 np=64
macondo02 np=64
macondo03 np=64
macondo04 np=64
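
When every node has the same core count, the nodes file can be generated rather than typed. A minimal sketch, with the hypothetical function name torque_nodes:

```shell
# Print one "name np=N" line per node, in the format of server_priv/nodes.
# Usage: torque_nodes NP NODE...
torque_nodes() {
    np="$1"; shift
    for node in "$@"; do
        printf '%s np=%s\n' "$node" "$np"
    done
}
```

For example, torque_nodes 64 macondo01 macondo02 macondo03 macondo04 reproduces the four lines above.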

6. add more criteria in the server

qmgr -c "set server acl_hosts = macondo03"
qmgr -c "set server scheduling=true"
qmgr -c "create queue batch queue_type=execution"
qmgr -c "set queue batch started=true"
qmgr -c "set queue batch enabled=true"
qmgr -c "set queue batch resources_default.nodes=1"
qmgr -c "set queue batch resources_default.walltime=3600"
qmgr -c "set server default_queue=batch"
qmgr -c "set server keep_completed = 86400"    # amount of time to keep complete
qmgr -c "set server query_other_jobs = true"   # let other people see your job


7. check node status using
$ pbsnodes -a
8. run pbs server and pbs schedule by
pbs_sched
pbs_server
9. put the pbs_sched and pbs_server running on boot

mv /filename /etc/init.d/
chmod +x /etc/init.d/filename 
update-rc.d filename defaults





For client:


(1)  install torque client

sudo apt-get install torque-client torque-mom
(2) specify servername in /var/spool/torque/server_name

macondo04
(3) configure /var/spool/torque/mom_priv/config, and make sure it looks like this:
$pbsserver      macondo04          # note: this is the hostname of the headnode
$logevent       255                 # bitmap of which events to log
$usecp  *:/home /home
$usecp  *:/home/users /home/users
$usecp  *:/home/users1 /home/users1
$usecp  *:/home/users2 /home/users2
$usecp  *:/home/users3 /home/users3
Note that the $usecp lines specify how the client node (the node that does the computational work) exchanges files with the head node (the node that submits the jobs).
Since the current system shares these directories over NFS, files can be transferred directly with cp.

(4) make sure pbs_mom is running on the client machine
pbs_mom
ps aux |grep "pbs"

(5) add pbs_mom to /etc/rc.local so that this daemon runs on boot
pbs_mom

(6) you can check whether the host machine can find the nodes by running the following command on any node:
pbsnodes -a
macondo01
     state = free
     np = 64
     ntype = cluster
     status = rectime=1404106821,varattr=,jobs=,state=free,netload=2275358527127,gres=,loadave=41.04,ncpus=64,physmem=131988228kb,availmem=259618040kb,totmem=266160896kb,idletime=1703,nusers=6,nsessions=20,sessions=2817 59937 18341 19455 19858 21924 59201 31663 32133 35793 54824 7341 42678 52013 53858 53863 58843 59605 59606 59741,uname=Linux macondo01 3.2.0-38-generic #61-Ubuntu SMP Tue Feb 19 12:18:21 UTC 2013 x86_64,opsys=linux

macondo02
     state = free
     np = 64
     ntype = cluster
     status = rectime=1404106834,varattr=,jobs=,state=free,netload=672949712804,gres=,loadave=0.00,ncpus=64,physmem=131988228kb,availmem=263269024kb,totmem=266160896kb,idletime=7647,nusers=6,nsessions=9,sessions=2691 16924 11828 16164 16336 19656 19765 49307 50259,uname=Linux macondo02 3.2.0-38-generic #61-Ubuntu SMP Tue Feb 19 12:18:21 UTC 2013 x86_64,opsys=linux

macondo03
     state = free
     np = 64
     ntype = cluster
     status = rectime=1404106809,varattr=,jobs=,state=free,netload=1285952855161,gres=,loadave=0.03,ncpus=64,physmem=131987480kb,availmem=261421664kb,totmem=266160148kb,idletime=1285,nusers=5,nsessions=5,sessions=2041 5539 15054 35499 36593,uname=Linux macondo03 3.8.0-39-generic #57~precise1-Ubuntu SMP Tue Apr 1 20:04:50 UTC 2014 x86_64,opsys=linux

macondo04
     state = free
     np = 64
     ntype = cluster
     status = rectime=1404106833,varattr=,jobs=,state=free,netload=366267047029,gres=,loadave=45.76,ncpus=64,physmem=131988228kb,availmem=253846276kb,totmem=266160896kb,idletime=2895,nusers=9,nsessions=23,sessions=2934 5542 12767 60790 21701 30995 31420 32046 36411 36593 36670 36675 52291 45704 47664 59365 57029 59697 62688 61747 62270 62342 63025,uname=Linux macondo04 3.2.0-38-generic #61-Ubuntu SMP Tue Feb 19 12:18:21 UTC 2013 x86_64,opsys=linux

working scripts:
#PBS -A uq-CivilEng
#PBS -N data
#PBS -l walltime=530:00:00
#PBS -l nodes=2:ppn=64
#PBS -j oe
#PBS -m abe -M chenming.zhang@uq.edu.au

cd $PBS_O_WORKDIR
date 1>output 2>error
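
The queue default above is given in seconds (3600) while the script uses the HH:MM:SS form (530:00:00). A small helper, with the hypothetical name secs_to_walltime, converts between the two; it is only a convenience sketch, not part of torque.

```shell
# Convert a number of seconds to the HH:MM:SS form used by "#PBS -l walltime=".
# Hours are not wrapped at 24, so long jobs stay in a single field.
secs_to_walltime() {
    s="$1"
    printf '%02d:%02d:%02d\n' $((s / 3600)) $((s % 3600 / 60)) $((s % 60))
}
```

For example, secs_to_walltime 3600 gives 01:00:00 and secs_to_walltime 1908000 gives 530:00:00.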

pbs commands:
qterm -t quick
pbs_server
The above two commands restart the server without disturbing running simulations.





Problem so far:
restrict users' rights to execute files directly


* qstat
* showq
* qmgr
* diagnose
maui's current tables for important metrics
-f: fairshare
-p: priority
* pbsnodes
* qsummary
* idle_nodes
- displays the nodes that have $1 (or more) cpus free (default 1 or more)
* checkjob
* checknode

* showstate
* showstats
* qalter
- alter the parameters of a job. Usually applies to a job that is in the Q state and you want to change values (like cpu-time, target nodes) before it runs
* qhold
- put a hold on a job before it starts running (applies to job in Q state)
* qrun -H
- force to run a job on a particular node
To use PBS batch queue manager on feynman cluster:
- User must be added to ACL, thus authorized to submit jobs
- node016 and node017 are almost always free. They are time-shared nodes;
mainly for small (less than an hour) jobs, can be shared among multiple jobs.
- Node001-node015 are in exclusive mode, one job per CPU.
- All nodes have 2 CPUs each. There are two ways to submit a job:
- by default, your job goes to either node016 or node017. That is, all
you have to say is: "qsub "
- to run jobs in exclusive mode on node001-node015, do:
"qsub -l nodes=n ", where 'n' is the number of exclusive cpus
you want for your job.
- A better solution will be to use MAUI's fairshare which tracks usage
per time-intervals. Computes fairshare by weighted sum of usage with
weight of usage in older time-intervals progressively decreasing.
(Use qsub2,qstat2,qdel2 for PBS server#2 @feynman:13001 serving Long Q)
Also see: http://physics/it/cluster/admin/pbs_cmds.html
qsub #submit a job, see man qsub
qdel -p jobid #will force purge the job if it is not killed by qdel
qdel_all #kill all jobs & stop PBS server.
#Run this before shutdown of cluster nodes.
#When all nodes are up again, do@feynman service pbs restart
qsummary #lists summary of jobs/user
qstat #list information about queues and jobs
showq #calculated guess which job will run next
xpbs #GUI to PBS commands
qstat -q #list all queues on system
qstat -Q #list queue limits for all queues
qstat -a #list all jobs on system
qstat -au userid #list all jobs owned by user userid
qstat -s #list all jobs with status comments
qstat -r #list all running jobs
qstat -f jobid #list full information known about jobid
qstat -Qf queueid #list all information known about queueid
qstat -B #list summary information about the PBS server
qstat -iu userid #get info for jobs of userid
qstat -n -1 jobid #will list nodes on which jobid is running in one line
pbsnodes -l #will list nodes which are either down or offline
checknode node034 #will list status of node034 including jobids on that node
checkjob jobid #will list job details
pbsnodes -s feynman:13001 -o node100 #marks node100 offline to PBS2 (does not affect running jobs)
pbsnodes -s feynman:13001 -c node100 #clears offline status of node100 to PBS2
diagnose -f #lists current recent usage of a user/group vs quota?
showstart jobid #lists start of running or estimate for waiting jobs, get jobids from qstat2
setspri #to raise/lower priority see, http://www.clusterresources.com/products/maui/docs/commands/setspri.shtml
Restarting PBS server: Use "qterm -t quick" to shutdown the server without disturbing any job ( running or queued ). The server will cleanly shut-down and can be restarted when desired. Upon restart of the server, jobs that continue to run are shown as running; jobs that terminated during the server's absence will be placed into the exiting state.
- A stuck/hung uniprocessor job can be purged with qdel -p but for multi-cpu MPI jobs,
you better reboot the hung nodes first and then do a qdel from pbs server so that the
server can talk to the node for cleaning up the queue.
- It is important to specify an accurate walltime for your job in your PBS submission script. Selecting the default of 4 hours for jobs that are known to run for less time may result in the job being delayed by the scheduler due to an overestimation of the time the job needs to run.
pbstop useful command to show the running situation

I have learned that maui is a better scheduler than the default torque scheduler.
The steps below give a guideline for installing maui on Ubuntu 12.04.
1. download libtorque libnuma-dev

apt-get install libadns1-dev  libnuma-dev texlive-latex-extra texlive-core
nvidia-cuda-dev
 ulimit -a

pbstop

Reference:
https://wiki.archlinux.org/index.php/TORQUE
https://help.ubuntu.com/community/TorquePbsHowto

setup openldap client

(1)  Install the following software:
apt-get install ldap-utils libpam-ldap libnss-ldap nslcd
During the installation, you may be asked to answer several prompts:
   a). configuring ldap-auth-config with initial:
         ldapi:///
       we need to change this to:
         ldap://macondo04.eait.uq.edu.au
       Note that it is ldap rather than ldapi. Also note that there are only two "/" rather than three.
    b). Distinguished name of the search base. change this to 
         dc=macondo04,dc=eait,dc=uq,dc=edu,dc=au
    c). LDAP version to use: Select 3
    d). make local root database admin: select Yes
    e). Does the LDAP database require login?  No
     f). LDAP account for root:
         cn=admin,dc=macondo04,dc=eait,dc=uq,dc=edu,dc=au
    g). LDAP root account password: Your-LDAP-root-password
    h). LDAP server URI: 
          ldap://macondo04.eait.uq.edu.au/
    i).  LDAP server search base: 
          dc=macondo04,dc=eait,dc=uq,dc=edu,dc=au
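
The search base used in these prompts is simply the server's DNS name with each label turned into a dc= component. A small sketch of that mapping, with the hypothetical function name domain_to_dn:

```shell
# Convert a DNS domain name into an LDAP base DN made of dc= components.
# Usage: domain_to_dn macondo04.eait.uq.edu.au
domain_to_dn() {
    printf '%s\n' "$1" | sed -e 's/\./,dc=/g' -e 's/^/dc=/'
}
```

Prefixing the result with "cn=admin," gives the root account name asked for in prompt f).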

This wizard is actually a procedure to configure /etc/ldap.conf. Make sure the file looks like this:

base dc=macondo04,dc=eait,dc=uq,dc=edu,dc=au
uri ldap://macondo04.eait.uq.edu.au
ldap_version 3
rootbinddn cn=admin,dc=macondo04,dc=eait,dc=uq,dc=edu,dc=au
pam_password md5
To go through this process again, run:
dpkg-reconfigure ldap-auth-config
(2)  modify /etc/nsswitch.conf  file:
#Original file looks like this
passwd: compat 
group: compat
shadow: compat 

#After appending "ldap" lines look like these
passwd: compat ldap
group: compat ldap
shadow: compat ldap 
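
The same change can be made non-interactively with sed. This is a sketch under the assumption that the three lines contain only "compat", as in the original file shown above; the function name add_ldap_source is hypothetical. Lines that already list ldap, or that list other sources, are left alone.

```shell
# Append "ldap" to the passwd, group and shadow lines of an nsswitch.conf
# read from stdin; lines with any other content pass through unchanged.
add_ldap_source() {
    sed -E 's/^(passwd|group|shadow)[[:space:]]*:[[:space:]]*compat[[:space:]]*$/\1: compat ldap/'
}
```

A cautious way to use it is add_ldap_source < /etc/nsswitch.conf > /tmp/nsswitch.conf.new, then compare the two files before replacing the original.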
(3)  execute the following command to make sure that if the user doesn't have a home folder, the system will make one:
echo "session required pam_mkhomedir.so skel=/etc/skel umask=0022">> /etc/pam.d/login
echo "session required pam_mkhomedir.so skel=/etc/skel umask=0022" >> /etc/pam.d/lightdm  
echo "session required    pam_mkhomedir.so skel=/etc/skel umask=0022">> /etc/pam.d/common-session
(4) One also needs to make sure the NFS server is properly mounted to the system.
echo "macondo04:/home/users /home/users nfs">> /etc/fstab
sudo mount macondo04:/home/users /home/users
(5) remove the use_authtok parameter in /etc/pam.d/common-password on the host and on each client node so that all users can change their password with the passwd command

(6) make sure you have restarted your nscd:
/etc/init.d/nscd restart

Reference:
http://askubuntu.com/questions/127389/how-to-configure-ubuntu-as-an-ldap-client
https://www.digitalocean.com/community/tutorials/how-to-authenticate-client-computers-using-ldap-on-an-ubuntu-12-04-vps
http://askubuntu.com/questions/340340/how-to-allow-ldap-user-to-change-password


Wednesday 25 June 2014

decode hpc cluster


$clienthost pbsserver
$clienthost paroo3
cpuset_create_flags 0
# Enforce memory limits
$enforce mem
# Use local disk for temporary files, job files and checkpoints
$jobdir_root /scratch
$tmpdir /scratch
##$checkpoint_path /scratch/checkpoint
# Restrict use of batch nodes to PBS, interactive logins not allowed
##$restrict_user on
##$restrict_user_maxsysid 500
# Use cp rather than scp to transfer/stage files i.e. use Panasas
$usecp barrine*.hpcu.uq.edu.au:/home/ /home/
$usecp *.barrine.hpcu.uq.edu.au:/home/ /home/
$usecp barrine*.hpcu.uq.edu.au:/home2/ /home2/
$usecp *.barrine.hpcu.uq.edu.au:/home2/ /home2/
$usecp barrine*.hpcu.uq.edu.au:/home3/ /home3/
$usecp *.barrine.hpcu.uq.edu.au:/home3/ /home3/
$usecp barrine*.hpcu.uq.edu.au:/home4/ /home4/
$usecp *.barrine.hpcu.uq.edu.au:/home4/ /home4/
$usecp barrine*.hpcu.uq.edu.au:/panfs/imb /panfs/imb
$usecp *.barrine.hpcu.uq.edu.au:/panfs/imb /panfs/imb
$usecp barrine*.hpcu.uq.edu.au:/work1 /work1/
$usecp *.barrine.hpcu.uq.edu.au:/work1/ /work1/
$usecp barrine*.hpcu.uq.edu.au:/work2/ /work2/
$usecp *.barrine.hpcu.uq.edu.au:/work2/ /work2/
$usecp barrine*.hpcu.uq.edu.au:/HPC/home /HPC/home/
$usecp *.barrine.hpcu.uq.edu.au:/HPC/home/ /HPC/home/
$usecp barrine.hpcu.uq.edu.au:/PROJ/ /PROJ/
$usecp *.barrine.hpcu.uq.edu.au:/PROJ/ /PROJ/
$usecp barrine.hpcu.uq.edu.au:/ebi/home /ebi/home
$usecp *.barrine.hpcu.uq.edu.au:/ebi/home /ebi/home
$usecp barrine.hpcu.uq.edu.au:/ebi/bscratch /ebi/bscratch
$usecp *.barrine.hpcu.uq.edu.au:/ebi/bscratch /ebi/bscratch

# Dynamic host-level resource definitions - added 13/02/10
# /tmp on tmpfs filesystem in memory (reports bytes free)
##localtmp !/opt/sw/sys/pbs/pbsres_diskspaceavail.bash /tmp
# /scratch on local disk (report bytes free)
##scratch !/opt/sw/sys/pbs/pbsres_diskspaceavail.bash /scratch
#scratch !/usr/local/bin/diskspace /scratch

ip address
10.120.12.50

# GigE Login Node Entries
10.120.12.50    barrine1.barrine.hpcu.uq.edu.au barrine1-ge     barrine1


uqczhan2@barrine1:~> df -Hh
df: `/home3/uqmmallo/.gvfs': Permission denied
df: `/home/uqdgree5/.gvfs': Permission denied
df: `/home/uqdgree5/MappedDrives': Permission denied
Filesystem                         Size  Used Avail Use% Mounted on
/dev/sda2                           25G   16G  9.1G  64% /
udev                                24G  232K   24G   1% /dev
tmpfs                               24G   12K   24G   1% /dev/shm
/dev/sda1                          504M   60M  420M  13% /boot
/dev/sda5                          1.6T   21G  1.5T   2% /scratch
panfs://10.150.250.193/acceptance   92T   71T   21T  78% /panfs/acceptance
panfs://10.150.250.193/sw          2.8T  2.1T  696G  76% /sw
panfs://10.150.250.193/home         28T   12T   17T  42% /home
panfs://10.150.250.193/home2        28T  6.5T   21T  24% /home2
panfs://10.150.250.193/home3        28T  6.5T   21T  24% /home3
panfs://10.150.250.193/home4        28T  6.5T   21T  24% /home4
panfs://10.150.250.193/work1        27T   23T  3.6T  87% /work1
panfs://10.150.250.193/work2        92T   71T   21T  78% /work2
ebiserver:/bscratch                200T  174T   26T  87% /ebi/bscratch
cirrus:/HPC/backup1                2.0T  1.3T  645G  68% /HPC/backup1
cirrus:/HPC/home                   2.0T  1.3T  720G  65% /HPC/home
paroo3:/var/spool/PBS/sched_logs    99G   55G   39G  59% /var/spool/PBS/sched_logs
cirrus:/PROJ/jarrah                1.2T 1002G  170G  86% /PROJ/jarrah

######panfs is panasas purchased from outside

#################################################################
this is machine for computing


Filesystem                         Size  Used Avail Use% Mounted on
/dev/sda2                           27G  8.4G   18G  32% /
udev                               509G  205k  509G   1% /dev
tmpfs                              509G     0  509G   0% /dev/shm
/dev/sda1                          529M   63M  440M  13% /boot
/dev/sda5                          555G   23G  532G   5% /scratch
panfs://10.150.250.193/acceptance  101T   78T   23T  78% /panfs/acceptance
panfs://10.150.250.193/sw          3.0T  2.3T  747G  76% /sw
panfs://10.150.250.193/home         30T   13T   18T  42% /home
panfs://10.150.250.193/home2        30T  7.2T   23T  24% /home2
panfs://10.150.250.193/home3        30T  7.2T   23T  24% /home3
panfs://10.150.250.193/home4        30T  7.2T   23T  24% /home4
panfs://10.150.250.193/work1        29T   26T  3.9T  87% /work1
panfs://10.150.250.193/work2       101T   78T   23T  78% /work2
ebiserver:/bscratch                220T  191T   29T  87% /ebi/bscratch


uqczhan2@b11a07:/var/spool/PBS/mom_priv> cat config
$clienthost paroo3
cpuset_create_flags 0
$restrict_user_maxsysid 499
$tmpdir /scratch
$jobdir_root /scratch
$usecp barrine*.hpcu.uq.edu.au:/home /home
$usecp *.barrine.hpcu.uq.edu.au:/home /home



1.Problem:
qsub job
qsub: Bad UID for job execution MSG=User szhang does not exist in server password file
   solve:
 echo "+::::::" >> /etc/passwd
(for reference, see the setup of barrine)
solution:
  Make sure /etc/hosts.equiv has the machine name in it. See the torque manual on this.

2. Problem
add default shell as bash in phpLDAPadmin


3.

ldapsearch -x -b 'dc=macondo04,dc=eait,dc=uq,dc=edu,dc=au' 'objectclass=*'

4 can not change password


5. how to add


6. phpLDAPadmin cannot be accessed by all the users (which is probably a good idea).



7. maui libtorque is not working at the moment.

one library called libtorque or pbs-config is missing  15-01-04

8. who. check who is using the server.

Saturday 26 April 2014

abcde - good CD ripper and itunes

By default, all the tracks will be converted to ogg file by executing:
abcde

One can change it into mp3 file by:
abcde -o mp3:"-b 256"

If only several file require to be converted, one can do:
abcde -o mp3:"-b 256" 1,2,3-7



Delete music with missing file in itunes:
There's a trick I just tried with itunes 11, I don't remember the details for a previous (better) version of itunes.

To summarize the trick I detail below: pick a video field and set it for all your music tracks, then let itunes apply it and check all tracks. Itunes will set it only for tracks with a valid file and will update the track state to show when a track has no file. Then you can use that field value to sort the tracks of your music library and delete all tracks with a missing file.

So for itunes 11.0.1:
- Select your library in itunes to see all tracks.
- Then choose the tracks presentation.
- Then in Presentation options add the serie (video section)
- Sort the list by Season to check it's not a used field; it shouldn't be, as it's music.

If the season field isn't used then you can use the trick :
- Select all in the list
- Get information
- Go in Video Tab
- In season number enter 1
- Then check ok.
- Then let itunes set the field for all tracks and it will check them all. It takes time if you have many tracks.
- Once it's finished just sort the list by season, and all tracks with no season are those with a missing file. Well, perhaps also those with a protected file that can't be changed. I didn't test that last point, but you could give the elements a visual check to be sure that all are showing the icon for a missing file.

So with that sort you can select all the tracks with missing file to delete them.

Then you can restore the season field to none to reset all valid tracks:
- Select all tracks in your library
- Right click to get information
- Select video tab and in season number field erase the 1 you set before.
- Ok to apply.

Well it's neither fast nor without risks of human errors, a paid software well designed and well programed is a wiser choice, but that manual process is free of money (well if time isn't money).

The small adventage I can see is in that way you are sure that itunes will check all files of your library.

EDIT: Take care when you do the operation that all the removable disk you are using to store files in you itunes are well connected. That's this sort of little traps a human could forget check and it could generate a little disaster. :-)

CMUS - the best music player

Some useful commands in CMUS


Load the playlist named diablo.pl:
:load -p diablo.pl
Comment: to find the current folder, go to the file browser (press 5).

Save the playlist to diablo.pl:
:save -p diablo.pl

To add a file to the playlist, simply press y.


cmus-remote -Q  prints a block of status information about the song being played





Tuesday 18 February 2014

Installing and maintaining Gentoo box


1. Tough excursion from non-multilib to multilib (2013)
After installing Gentoo on the T5500 machine, I found several problems that the system could not handle:

 a. three packages cannot be updated: (a) sandbox (b) glibc (c) gcc. All three are flagged for update because multilib is enabled. When glibc fails, the error mentions "stub" and says the x86 compile failed.
 b. skype cannot be started, even though it is properly installed.
 c. the open-source virtualbox cannot be compiled; only the binary version can be installed.
 The reason for these problems is that the base system I installed is non-multilib. Many websites say that changing from non-multilib to multilib is very difficult. The easiest way to solve the problem is to reinstall Gentoo from a multilib stage3 tarball. Although some people say one can build another glibc from scratch, this is likely to make the problem very complex in the end.

2. cannot properly disable the nouveau driver
   After the system was installed in the chroot stage, it could not boot properly under systemd. There are two symptoms:
a. The boot messages show that nouveau has been loaded.
b. systemd hangs at the stage where the following lines show up repeatedly:
Jun 4 03:16:56 histon kernel: [ 1067.937418] NVRM: The NVIDIA probe routine was not called for 2 device(s). 
Jun 4 03:16:56 histon kernel: [ 1067.937424] NVRM: This can occur when a driver such as nouveau, rivafb, 
Jun 4 03:16:56 histon kernel: [ 1067.937426] NVRM: nvidiafb, or rivatv was loaded and obtained ownership of 
Jun 4 03:16:56 histon kernel: [ 1067.937428] NVRM: the NVIDIA device(s). 
Jun 4 03:16:56 histon kernel: [ 1067.937432] NVRM: Try unloading the conflicting kernel module (and/or 
Jun 4 03:16:56 histon kernel: [ 1067.937434] NVRM: reconfigure your kernel without the conflicting 
Jun 4 03:16:56 histon kernel: [ 1067.937435] NVRM: driver(s)), then try loading the NVIDIA kernel module 
Jun 4 03:16:56 histon kernel: [ 1067.937437] NVRM: again. 
Jun 4 03:16:56 histon kernel: [ 1067.937439] NVRM: No NVIDIA graphics adapter probed! 
    This is probably because nouveau is loaded before the nvidia driver starts to load. The following steps solve the problem:

a. make sure nouveau is disabled by running
# echo "blacklist nouveau" >> /etc/modprobe.d/blacklist.conf
b. make sure that xorg-drivers has nvidia enabled:

 [ebuild   R    ] x11-base/xorg-drivers-1.14  INPUT_DEVICES="evdev keyboard mouse -acecad -aiptek -elographics -fpit -hyperpen -joystick -mutouch -penmount -synaptics -tslib -vmmouse -void -wacom" VIDEO_CARDS="nvidia -apm -ast -chips -cirrus -dummy -epson -fbdev -fglrx (-geode) -glint -i128 (-i740) -intel -mach64 -mga -modesetting -neomagic -nouveau -nv (-omap) (-omapfb) -qxl -r128 -radeon -radeonsi -rendition -s3virge -savage -siliconmotion -sisusb (-sunbw2) (-suncg14) (-suncg3) (-suncg6) (-sunffb) (-sunleo) (-suntcx) -tdfx -tga -trident -tseng -v4l -vesa -via -virtualbox -vmware (-voodoo)" 0 kB
c. follow the instructions listed on these websites:
http://wiki.gentoo.org/wiki/X_server
https://wiki.gentoo.org/wiki/NVidia/nvidia-drivers
https://wiki.gentoo.org/wiki/Xorg/Configuration

d. copy xorg.conf into /etc/X11/xorg.conf.d

Then it should work.
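
A quick way to confirm step a is to grep the blacklist file for the exact module name, since a misspelled name is written without complaint and then simply never matches. This is only a sketch: it assumes the file path used in step a, and is_blacklisted is a made-up helper name, not part of modprobe.

```shell
#!/bin/sh
# Sketch: report whether a module is blacklisted in a modprobe config file.
# is_blacklisted is a hypothetical helper; the default path is the one from step a.
is_blacklisted() {
    module=$1
    conf=${2:-/etc/modprobe.d/blacklist.conf}
    # Match a whole line of the form "blacklist <module>", tolerating extra blanks.
    grep -Eq "^[[:space:]]*blacklist[[:space:]]+${module}[[:space:]]*\$" "$conf"
}
```

For example, is_blacklisted nouveau && echo ok prints ok only when the file really contains a "blacklist nouveau" line.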

3. common packages required to be installed

# emerge -av --keep-going xorg-server xorg-drivers media-libs/mesa systemd i3lock i3 emacs rxvt-unicode skype dropbox gnome nvidia-drivers scrot vlc mplayer gimp inkscape fuse htop adobe-flash alsa-utils gnumeric virtualbox-guest-additions i3status openssh rdesktop sshfs-fuse R mercurial ranger cmus ntfs3g feh mupdf exiftool ifuse gentoolkit dev-python/pip id3 id3v2 cdparanoia abcde virtualbox-extpack-oracle texlive dvipng dvisvgm texlive-latexextra matplotlib eix openmpi hdf5 paraview gdb p7zip gv openntpd screen
4. files that need to be backed up
/etc/portage/{make.conf,package.use,package.keywords}
/usr/src/linux/.config
/etc/fstab
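
The files listed in item 4 can be collected into one tarball. The sketch below is just one way to do it: backup_gentoo_config is a made-up helper name, and the optional ROOT argument is my own addition so the function can be tried on a scratch directory first.

```shell
#!/bin/sh
# Sketch: archive the config files from item 4 into a single tarball.
# backup_gentoo_config is a hypothetical helper.
# Usage: backup_gentoo_config ARCHIVE [ROOT]
backup_gentoo_config() {
    archive=$1
    root=${2:-/}
    # Paths are relative to ROOT so the archive can be unpacked anywhere.
    set -- etc/portage/make.conf etc/portage/package.use \
           etc/portage/package.keywords usr/src/linux/.config etc/fstab
    existing=""
    for f in "$@"; do
        # Skip files that are absent on this machine instead of failing.
        [ -e "$root/$f" ] && existing="$existing $f"
    done
    [ -n "$existing" ] || return 1
    # $existing is intentionally unquoted: it holds a space-separated list.
    tar -czf "$archive" -C "$root" $existing
}
```

Running backup_gentoo_config /tmp/gentoo-config.tar.gz as root would archive whichever of those files exist; tar -tzf on the result shows what went in.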

5. webkit-gtk is time consuming to emerge
On my laptop, installing webkit-gtk can take ages. To avoid that, I decided not to emerge this atom. First, I checked which atoms depend on webkit-gtk:

equery depends webkit-gtk

Then I unmerged them together with webkit-gtk (versioned atoms need the = prefix):

emerge -C =gnome-extra/gnome-documents-3.8.5 =gnome-extra/sushi-3.10.0 =gnome-extra/yelp-3.8.1 =gnome-extra/zenity-3.8.0 =media-gfx/gimp-2.8.10-r1 =media-gfx/shotwell-0.15.1 =media-sound/rhythmbox-3.0.2 =net-im/empathy-3.8.6 =net-libs/gnome-online-accounts-3.10.4 =net-libs/libproxy-0.4.11-r1 webkit-gtk

6. list of useful binary packages
It is very time consuming to install every package from source. So that laptops can also enjoy Gentoo, I have listed the useful binary packages, which reduce the compiling time significantly.
google-chrome firefox-bin libreoffice-bin thunderbird-bin virtualbox-bin icedtea-bin openfoam-bin
7. locations where files should be cleaned
This is where source files are downloaded:

/usr/portage/distfiles
Alternatively, clean the distfiles directly with:
# eclean-dist
List installed packages by size:
# qsize -a -k | sort -n -k 6
8. get ip address for enp6s0
# dhcpcd enp6s0
or
# dhclient enp6s0
9. change emul-linux to abi (2015-06-22)
Users will find that the emul-linux packages are gone; what replaces them is abi (the abi_x86_32 USE flag).
https://wiki.gentoo.org/wiki/Multilib_System_without_emul-linux_Packages
https://forums.gentoo.org/viewtopic-p-7728476.html?sid=54b9db45ef48cc2eb46dd4f79c9c08e8

two more useful links:
1. https://forums.gentoo.org/viewtopic-t-984500-highlight-ffmpeg.html
abi_x86_32 disabled.

2. https://forums.gentoo.org/viewtopic-t-1020872-highlight-.html
never try to enable abi_x86_32 globally

useful commands:
find all packages that depend on the installed emul-linux packages
# for EMUL in $(eix -I --only-names emul-linux); do equery depends "$EMUL"; done
10. keep kernels in the eselect kernel list (2015-06-22)
One may notice that after running emerge --depclean, all kernel sources not listed in the world set are gone (so on the laptop I can no longer build an old one such as 3.10.25).

related:
https://forums.gentoo.org/viewtopic-p-7696064.html#7696064
# emerge --noreplace =gentoo-sources-3.17.8-r1
After running this command, the selected atom is added to the world set (/var/lib/portage/world), so emerge --depclean will not wipe it.
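
Before running emerge --depclean it is cheap to confirm that the kernel really landed in the world file. A minimal sketch, assuming the world file path mentioned above; in_world is a made-up helper name, and I match on a plain substring because I am not sure of the exact form in which portage records the atom.

```shell
#!/bin/sh
# Sketch: print the world-file entries that mention a given string.
# in_world is a hypothetical helper; the default path is the one named in the note.
in_world() {
    pattern=$1
    world=${2:-/var/lib/portage/world}
    # -F treats the pattern as a plain string (package versions contain dots).
    grep -F -- "$pattern" "$world"
}
```

in_world gentoo-sources prints the matching entries and returns non-zero when there are none, so it can be used as a guard in front of the depclean.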
11. Problem when using emerge @preserved-rebuild (2015-06-22)
If a package that owns a preserved library has been deleted, the box produces an error when running emerge @preserved-rebuild. The way around it is to manually delete the associated files, which can be listed with:
portageq list_preserved_libs /
related:
https://wiki.gentoo.org/wiki/Preserve-libs
https://forums.gentoo.org/viewtopic-t-959998-start-0.html
12. Remove all gnome and kde associated packages (2015-06-22)
The reason to do so is to keep the system compact, so that any computer is able to run Gentoo.
(1) add -gnome -kde to the USE flags in /etc/portage/make.conf (it seems there are no flags for lxde and xfce)
(2) change profiles
eselect profile list
(3) uninstall
emerge -C $(grep gnome /var/lib/portage/world)
Also find all installed packages associated with gnome and perhaps uninstall all of them:
equery list "*" | grep gnome
Related Links:
https://forums.gentoo.org/viewtopic-t-850239-start-0.html
https://wiki.gentoo.org/wiki/KDE/Removal
13. Check use flags of a package (2015-06-22)
find the use flags
equery uses gnumeric
find the packages that depend on it
equery depends gnumeric
all packages associated with gnome

equery list "*" | grep gnome
other information (running equery alone prints its help)

equery 
get all installed packages

equery list "*" >> installed
So far I have got 926 on 2015-06-22. Try to reduce it.
                         895 for toshiba
equery hasuse bindist
list all the packages that have the bindist flag (whether it is enabled or not)
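
To decide what to trim, the two "installed" lists (926 lines on one machine, 895 on the toshiba) can be compared line by line. A rough sketch, assuming each list was saved with equery list "*" as above; only_in_first is a made-up helper name, and lines only match when the package versions are identical on both machines.

```shell
#!/bin/sh
# Sketch: print the lines of LIST_A that do not appear in LIST_B.
# only_in_first is a hypothetical helper.
# Usage: only_in_first LIST_A LIST_B
only_in_first() {
    a=$(mktemp)
    b=$(mktemp)
    # comm needs sorted input; -u also drops repeats left by ">>" appends.
    sort -u "$1" > "$a"
    sort -u "$2" > "$b"
    # -23 suppresses lines unique to B and lines common to both.
    comm -23 "$a" "$b"
    rm -f "$a" "$b"
}
```

Piping the output through wc -l gives the number of packages that exist only on the first machine.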
14. use distcc (2015-06-26)
see the list of compile machines.
distcc-config --get-hosts
https://forums.gentoo.org/viewtopic-p-7207506.html
15. a correction to my method of using bash (2015-06-27)
https://forums.gentoo.org/viewtopic-t-1020878-highlight-.html
16. How to properly update a kernel (2015-06-27)
At the moment 4.0.5 is not working on the T5500. The reason is that nvidia-drivers cannot be installed (the bloody nvidia drivers have to be emerged every single time: https://forums.gentoo.org/viewtopic-t-996602-highlight-nvidia.html). Even though I have followed:
https://forums.gentoo.org/viewtopic-p-7766108.html?sid=f4c9970e533b6bf99b7450aa1e9e89b4
it is still not working anyway.
17. Solving no mouse problem on toshiba laptop [unsolved] (2015-06-27)
question posted:
https://forums.gentoo.org/viewtopic-t-1020934-highlight-.html
18. Control tty (2015-06-27)
following:
http://superuser.com/questions/67659/linux-share-keyboard-over-network
On the user machine (client), run:
cat /dev/input/by-path/pci-0000:00:1a.0-usb-0:1.1:1.0-event-kbd | nc 10.33.21.70 4444
and on the host, run:
nc -l -p 4444 > /dev/input/by-path/platform-i8042-serio-0-event-kbd
Then you are able to control the tty perfectly.

lsusb -t
qca failed to emerge; log:
/var/log/portage/app-crypt:qca-2.1.0.3:20150628-000221.log

list all the loaded modules
cat /proc/modules

ae429-3176 proc # cd /lib/modules/4.0.5-gentoo/kernel/drivers/hid/
ae429-3176 hid # ls
hid-logitech-dj.ko  hid-logitech-hidpp.ko  uhid.ko

modinfo hid-logitech-hidpp

uqczhan2@macondo03 ~ $ lsmod |grep hid
mac_hid                13253  0
hid_generic            12548  0
usbhid                 53111  0
hid                   106605  2 hid_generic,usbhid

https://forums.gentoo.org/viewtopic-t-883988-highlight-keyboard.html

 qlist -IC x11-drivers

working case: 3.10.25
ae429-0176 chenming # dmesg |grep Logitech
[    2.760539] usb 4-1: Manufacturer: Logitech
[    2.769325] input: Logitech USB Receiver as /devices/pci0000:00/0000:00:12.0/usb4/4-1/4-1:1.0/input/input5
[    2.770902] hid-generic 0003:046D:C52F.0001: input,hidraw0: USB HID v1.11 Mouse [Logitech USB Receiver] on usb-0000:00:12.0-1/input0
[    2.779733] input: Logitech USB Receiver as /devices/pci0000:00/0000:00:12.0/usb4/4-1/4-1:1.1/input/input6
[    2.783405] hid-generic 0003:046D:C52F.0002: input,hiddev0,hidraw1: USB HID v1.11 Device [Logitech USB Receiver] on usb-0000:00:12.0-1/input1
[    2.829659] input: PS/2 Logitech Wheel Mouse as /devices/platform/i8042/serio1/input/input7

not working case: 3.17.??
dmesg |grep Logitech
[    2.515042] input: PS/2 Logitech Wheel Mouse as /devices/platform/i8042/serio1/input/input6
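
The visible difference between the two excerpts is that hid-generic binds to the Logitech receiver on 3.10.25 and never shows up on 3.17. One way to make that comparison repeatable is to save the full dmesg on each kernel and count the lines that mention a driver. This is only a sketch: count_driver_lines is a made-up helper name, and the log files are assumed to be saved by hand, e.g. with dmesg > some-file.

```shell
#!/bin/sh
# Sketch: count the lines of a saved kernel log that mention a driver.
# count_driver_lines is a hypothetical helper.
# Usage: count_driver_lines DRIVER LOGFILE
count_driver_lines() {
    driver=$1
    logfile=$2
    # grep -c prints the number of matching lines (0 when there are none).
    grep -c -- "$driver" "$logfile"
}
```

On the two excerpts above, count_driver_lines hid-generic would give 2 for the working kernel and 0 for the other one; a full dmesg may of course contain more matches from other devices.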

Currently I can use
Shift+Insert to paste things on the command line.