[Wien] Parallel calculations in cluster - choices

Laurence Marks L-marks at northwestern.edu
Mon Aug 8 05:12:47 CEST 2011


On Sun, Aug 7, 2011 at 9:38 PM, Dr Qiwen  YAO <Yao.Qiwen at nims.go.jp> wrote:
> Dear Professor Marks,
> Thank you very much for your helpful and kind reply.
> I still need more help, if possible:
> 1. Could you please send me a copy of the set of commands you mentioned in your last reply?
> 2. The node CPUs are Xeon X5560 2.80GHz, 4 cores x 2 sockets - so I would go by the "bi-Xeon 5320 (overcl 2.67GHz)" entry on the benchmark page. Does this mean the most efficient way to run Wien2k on this cluster is 1 job with 4 threads, or 1 job with 8 threads (i.e. should I set export OMP_NUM_THREADS=4 or 8 in my .bashrc)? Sorry, I do not fully understand the benchmarks. And in that case, should I still use 8-fold k-point parallelism with 8 mpi processes?

No, you are confusing two things. mpi parallelism uses lapw[0-2]_mpi and
is different from Intel (mkl) threading. Depending upon your specific
hardware (including the interconnect), mpi may be faster or slower than
running with just threads, and this also depends upon the size of the
problem (grep -e RKM *.scf1* and look at the matrix size). For the
systems I work with mpi is more effective than the mkl threading, but
you need to test this yourself. For instance, test 1 job with 4 threads;
2 jobs with 4 threads; and 1 mpi task with 8 cores (and by jobs I mean
separate lapw1 tasks).
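
As a rough illustration of those three test cases, the lapw1/lapw2 part
of the .machines file could look as follows (a minimal sketch, assuming
a single 8-core node called node01 and a threaded mkl; adapt the host
name to whatever PBS actually gives you):

# (a) 1 k-point job with 4 mkl threads
#     (OMP_NUM_THREADS=4 or MKL_NUM_THREADS=4 in the job environment)
1:node01

# (b) 2 k-point jobs with 4 threads each (two lapw1 tasks side by side)
1:node01
1:node01

# (c) 1 mpi task on 8 cores (OMP_NUM_THREADS=1, runs lapw1_mpi/lapw2_mpi)
1:node01:8

In each case the lapw0 line and the usual granularity:1 / extrafine:1
lines are added as in the script quoted below.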

> 3. You mentioned in your email that there is a way to check the number of k-points needed for a 1x1x1 cell - would you mind explaining it a bit more or providing a formula, if there is one?

You check how many k-points are needed by increasing the number and
plotting the total energy until the convergence is "adequate". What is
adequate is something you need to think about; there is no magic
answer.
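
If it helps, a minimal sketch of such a test for the 1x1x1 cell could
look like the following (written in tcsh like your submission script;
the k-point values, the "0" answer to kgen's shift question and the use
of runsp_lapw are only illustrative assumptions - adapt them to your
case):

#!/usr/bin/tcsh
# increase the k-mesh and record the total energy after each converged run
foreach nk (100 200 400 800)
# answer kgen's questions: number of k-points in the whole cell, no shift
echo $nk >  kgen.in
echo 0   >> kgen.in
x kgen < kgen.in
runsp_lapw -ec 0.0001 -p
# :ENE is the total energy in Ry; keep the last (converged) value
echo -n "$nk  " >> kconv.dat
grep ":ENE" *.scf | tail -1 >> kconv.dat
end
# plot kconv.dat and increase the k-mesh until the energy change is small enough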

>
> Sorry for taking up so much of your time with simple questions like these.
>
> Thank you again.
>
> Kind regards,
> Qiwen
>
>
> ------Original Message------
> From:"Laurence Marks"<L-marks at northwestern.edu>
> To:"A Mailing list for WIEN2k users"<wien at zeus.theochem.tuwien.ac.at>
> Cc:
> Subject:Re: [Wien] Parallel calculations in cluster - choices
> Date:08/07/2011 11:07:51 AM(-0500)
>>On Sun, Aug 7, 2011 at 5:00 AM, Dr Qiwen  YAO <Yao.Qiwen at nims.go.jp> wrote:
>>> Dear Wien2k users,
>>> I was running a spin-polarized 3x3x1 supercell calculation for a 4-atom double perovskite compound.
>>> The job was killed by the cluster because the walltime limit was exceeded - see the relevant error message below:
>>
>>> =>> PBS: job killed: walltime 86419 exceeded limit 86400
>>> ----------
>>> Question 1.
>>> In a case like this, what is the best way for me to continue the previous calculation (or is it possible just to re-run the same job as if nothing had happened, since the calculation itself did not crash)? If I restart the job, will Wien2k be able to pick up where it left off and continue? I could not find anything similar to this in the email archive.
>>
>>Yes, just add "-NI" to the runXYZ command (XYZ as appropriate).
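>>
>>For example, to continue your previous run with the same options (this
>>is only an illustration based on the script you quote below):
>>
>>runsp_lapw -NI -ec 0.0001 -p
>>
>>The -NI switch keeps the existing case.broyd* mixing history instead of
>>deleting it, so the SCF cycle continues from where it stopped.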
>>
>>>
>>> Question 2. The choice between mpi and k-point parallelism - for my future calculations.
>>> The cluster I am running Wien2k on is a 512-node cluster with PBS. Each node has 8 cores and 2.85 GB of memory per core.
>>
>>You need to benchmark the speed of your system first. Use the
>>benchmark from the Wien2k web page, and try different configurations.
>>Work out as well the speed of lapw1 (the slowest program) for
>>different problem sizes on your cluster (and different numbers of nodes).
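>>
>>As a rough sketch (assuming the serial benchmark's test_case has been
>>unpacked into a working directory; the thread count and the label you
>>grep for are only illustrative):
>>
>>setenv OMP_NUM_THREADS 4      # in bash: export OMP_NUM_THREADS=4
>>x lapw1 -c                    # the benchmark is a complex (-c) case
>>grep TIME test_case.output1   # compare the lapw1 times between runs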
>>
>>Note, as well, that in general you want to use a comparable density of
>>k-points in reciprocal space. Hence for a 3x3x1 cell you need 1/9 the
>>number of k-points that you need for a 1x1x1 cell. In addition, for an
>>insulator you generally need fewer k-points than for a metal. Check
>>the number that you need for a 1x1x1 cell, then use the same density
>>for the 3x3x1 (not the same number).
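>>
>>For example, if the 1x1x1 cell turns out to need roughly 900 k-points,
>>the 3x3x1 supercell only needs about 900/9 = 100 k-points for the same
>>density in reciprocal space (these numbers are purely illustrative).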
>>
>>Last, you should check what is actually produced by the script you have,
>>i.e. look at the .machines file (cat .machines). I have a small set of
>>commands that I use to control what one gets in a more flexible
>>fashion. I will send it to Peter and ask that he put it on the
>>unsupported software page. (I can send it separately on request, but
>>not via the list.)
>>
>>> I was running with 8-fold k-point parallelism and 8 mpi processes (suggested by the system engineer, as I do not have much idea what the best choice is), with these options:
>>>
>>> #!/usr/bin/tcsh
>>> #QSUB2 queue qaM
>>> #QSUB2 core 8
>>> #QSUB2 mpi 8
>>> #QSUB2 smp 1
>>>
>>> These lines were provided by the system support staff. It seems we do not use a dynamic script to find the available nodes/cores at run time; instead we specify the number of cores/nodes we want in the script (we use a point system to run our scripts) and then submit the job to the queue. I could specify up to maybe 24 nodes x 8 cores/node (#QSUB2 core 192 and #QSUB2 mpi 192) in the script - would that help speed up the calculation on the current system?
>>>
>>> The system engineer thought that might not be better for Wien2k, as Wien2k is most efficient with k-point parallelism. So he suggested I use 8-fold k-point parallelism and 8 mpi processes - even though he is not sure what the best choice is - and I have even less idea what the best choice is on the current system.
>>>
>>> Below is the complete script for submitting my previous job:
>>> -------------------------
>>> #!/usr/bin/tcsh
>>> #QSUB2 queue qaM
>>> #QSUB2 core 8
>>> #QSUB2 mpi 8
>>> #QSUB2 smp 1
>>>
>>> cd ${PBS_O_WORKDIR}
>>>
>>> source /etc/profile.d/modules.csh
>>> module load intel11.1/sgimpt
>>>
>>> cat $PBS_NODEFILE > .machines_current
>>> set aa=`wc .machines_current`
>>> echo '#' > .machines
>>>
>>> # example for an MPI parallel lapw0
>>> echo -n 'lapw0:' >> .machines
>>> set i=1
>>> while ($i < $aa[1])
>>> echo -n `cat $PBS_NODEFILE |head -$i | tail -1` ' ' >> .machines
>>> @ i ++
>>> end
>>> echo  `cat $PBS_NODEFILE |head -$i|tail -1` ' ' >> .machines
>>>
>>> #example for k-point parallel lapw1/2
>>> set i=1
>>> while ($i <= $aa[1])
>>> echo -n '1:' >> .machines
>>> head -$i .machines_current |tail -1 >> .machines
>>> @ i ++
>>> end
>>> echo 'granularity:1' >> .machines
>>> echo 'extrafine:1' >> .machines
>>>
>>> runsp_lapw -ec 0.0001 -p
>>> ------------------------------
>>>
>>> Any comment or suggestion would be highly appreciated.
>>>
>>> Thank you very much.
>>> Qiwen
>>>
>>
>>
>>
>
> **********************************************************
>
> Dr QiWen YAO
>
> JSPS Fellow
> Multifunctional Materials Group
> Optical and Electronic Materials Unit
> Environment and Energy Materials Research Division
>
> National Institute for Materials Science
>
> 1-2-1 Sengen, Tsukuba, Ibaraki 305-0047, Japan
> Phone: +81-29-851-3354, ext. no. 6482, Fax: +81-29-859-2501
>
> **********************************************************
>
> _______________________________________________
> Wien mailing list
> Wien at zeus.theochem.tuwien.ac.at
> http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien
>



-- 
Laurence Marks
Department of Materials Science and Engineering
MSE Rm 2036 Cook Hall
2220 N Campus Drive
Northwestern University
Evanston, IL 60208, USA
Tel: (847) 491-3996 Fax: (847) 491-7820
email: L-marks at northwestern dot edu
Web: www.numis.northwestern.edu
Chair, Commission on Electron Crystallography of IUCR
www.numis.northwestern.edu/
Research is to see what everybody else has seen, and to think what
nobody else has thought
Albert Szent-Gyorgi

