[Wien] Parallel calculations in cluster - choices

Dr Qiwen YAO Yao.Qiwen at nims.go.jp
Mon Aug 8 08:46:49 CEST 2011


Dear Professor Marks,
Your answers are quite clear and I understand what I need to do now. Thank you so much for your time and your help.

kind regards,
Qiwen


------Original Message------
From:"Laurence Marks"<L-marks at northwestern.edu>
To:"A Mailing list for WIEN2k users"<wien at zeus.theochem.tuwien.ac.at>
Cc:
Subject:Re: [Wien] Parallel calculations in cluster - choices
Date:08/07/2011 10:12:47 PM(-0500)
>On Sun, Aug 7, 2011 at 9:38 PM, Dr Qiwen  YAO <Yao.Qiwen at nims.go.jp> wrote:
>> Dear Professor Marks,
>> Thank you very much for your helpful and kind reply.
>> I still need more help if possible:
>> 1. Can you please send me a copy of the set of commands you mentioned in your last reply?
>> 2. The system/node CPU is a Xeon X5560 2.80GHz, 4 cores x 2 sockets - so I would go by the benchmarks for the bi-Xeon 5320 (overclocked 2.67GHz) on the benchmark page. Does this mean the most efficient way to run Wien2k on this cluster is 1 job with 4 threads, or 1 job with 8 threads? (That is, should I set export OMP_NUM_THREADS=4 or 8 in my .bashrc?) Sorry, I do not fully understand the benchmarks. And in this case, do I still use 8 k-point parallel with 8 mpi?
>
>No, you are confusing things: mpi uses lapw[0-2]_mpi, and this is
>different from Intel threading. Depending upon your specific hardware
>(including interconnects), mpi may be faster or slower than running
>with just threads, and this also depends upon the size of the problem
>(grep -e RKM *.scf1* and look at the matrix size). For the systems I
>work with, mpi is more effective than mkl threading, but you need to
>test this yourself. For instance, test 1 job with 4 threads; 2 jobs
>with 4 threads each; and 1 mpi task with 8 cores (and by jobs I mean
>separate lapw1 tasks).
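As a sketch, the three tests above would correspond to .machines files roughly like the following (node1 is a placeholder hostname; syntax as I recall it from the WIEN2k user's guide, so check it against your version; for variants (a) and (b) the thread count is set via OMP_NUM_THREADS in the environment, not in .machines):

```
# (a) 1 lapw1 job on the node, mkl threading across 4 cores
#     (export OMP_NUM_THREADS=4 in the job environment)
1:node1

# (b) 2 k-parallel lapw1 jobs on the same 8-core node, 4 threads each
1:node1
1:node1

# (c) 1 mpi lapw1 task spanning all 8 cores of the node
1:node1:8
```

Only one variant goes into .machines at a time; comparing the lapw1 wall times in case.dayfile is one way to decide between them.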
>
>> 3. You mentioned in your email that there is a way to check the number of k-points needed for a 1x1x1 cell - would you mind explaining it a bit more, or providing a formula if there is any?
>
>You check how many k-points are needed by increasing the number and
>plotting the total energy until the convergence is "adequate". What is
>adequate is something you need to think about; there is no magic
>answer.
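As an illustration of the bookkeeping only (the energies below are made-up placeholder numbers, not real results), one might tabulate the final :ENE value from each run against the k-mesh size and watch the successive differences shrink:

```shell
# Hypothetical convergence table: number of k-points vs. total energy (Ry).
# With real runs, each energy would come from: grep :ENE case.scf | tail -1
cat > kconv.txt <<'EOF'
100   -2876.1234
200   -2876.1301
500   -2876.1318
1000  -2876.1319
EOF

# Print the energy change between successive meshes; when the change falls
# below your tolerance (what is "adequate" is your call), the mesh is fine.
awk 'NR>1 {printf "%s -> %s : %.4f Ry\n", pk, $1, $2-pe} {pk=$1; pe=$2}' kconv.txt
```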
>
>>
>> Sorry for taking up so much of your time with simple questions like these.
>>
>> Thank you again.
>>
>> Kind regards,
>> Qiwen
>>
>>
>> ------Original Message------
>> From:"Laurence Marks"<L-marks at northwestern.edu>
>> To:"A Mailing list for WIEN2k users"<wien at zeus.theochem.tuwien.ac.at>
>> Cc:
>> Subject:Re: [Wien] Parallel calculations in cluster - choices
>> Date:08/07/2011 11:07:51 AM(-0500)
>>>On Sun, Aug 7, 2011 at 5:00 AM, Dr Qiwen  YAO <Yao.Qiwen at nims.go.jp> wrote:
>>>> Dear Wien2k users,
>>>> I was running a spin-polarized calculation on a 3x3x1 supercell of a 4-atom double perovskite compound.
>>>> The job was killed by the cluster because the walltime limit was exceeded - see the relevant error message below:
>>>
>>>> =>> PBS: job killed: walltime 86419 exceeded limit 86400
>>>> ----------
>>>> Question 1.
>>>> In a case like this, what is the best way for me to continue the previous calculation (or is it possible just to re-run the same job, pretending nothing happened, since the calculation itself did not crash)? If I restart the job, will Wien2k be able to pick up from the last point and continue? I could not find anything similar to this in the email archive.
>>>
>>>Yes, just add "-NI" to the runXYZ command (XYZ as appropriate).
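For example, if the run was started as in the script quoted further below, the restart would be the same command with -NI added (as I understand the flag, -NI skips re-initialization and keeps the case.broyd* mixing history, so the SCF cycle continues from the last completed iteration):

```
runsp_lapw -NI -ec 0.0001 -p
```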
>>>
>>>>
>>>> Question 2. The choice of mpi and k-point parallelism - for my future calculations.
>>>> The cluster I am running wien2k on is a 512-node cluster with PBS. Each node has 8 cores and 2.85 GB of memory per core.
>>>
>>>You need to benchmark the speed of your system first. Use the
>>>benchmark from the Wien2k web page, and try different configurations.
>>>Work out as well the speed of lapw1 (the slowest program) for
>>>different problem sizes on your cluster (and different numbers of nodes).
>>>
>>>Note, as well, that in general you want to use a comparable density of
>>>k-points in reciprocal space. Hence for a 3x3x1 cell you need 1/9 the
>>>number of k-points that you need for a 1x1x1 cell. In addition, for an
>>>insulator you generally need fewer k-points than for a metal. Check
>>>the number that you need for a 1x1x1 cell, then use the same density
>>>for the 3x3x1 (not the same number).
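The density argument is just arithmetic; as a sketch (NK1=1000 is a made-up example count, not a recommendation):

```shell
# A 3x3x1 supercell has 9x the in-plane cell area, so its Brillouin zone
# shrinks by 9x in-plane; keeping the same k-point *density* therefore
# needs roughly 1/9 the number of points. NK1 is a hypothetical
# converged count for the 1x1x1 cell.
NK1=1000
NK_SUPER=$(( NK1 / 9 ))
echo "1x1x1: $NK1 k-points  ->  3x3x1: about $NK_SUPER k-points"
```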
>>>
>>>Last, you should see what is being produced by the script you have,
>>>i.e. look at the .machines file (cat .machines). I have a small set of
>>>commands that I have used to control what one gets in a more flexible
>>>fashion. I will send it to Peter and ask that he put it on the
>>>unsupported software page. (I can send it separately on request, but
>>>not via the list.)
>>>
>>>> I was running 8 k-point parallel and 8 mpi parallel (suggested by the system engineer, as I don't have much idea what the best choice is) with these options:
>>>>
>>>> #!/usr/bin/tcsh
>>>> #QSUB2 queue qaM
>>>> #QSUB2 core 8
>>>> #QSUB2 mpi 8
>>>> #QSUB2 smp 1
>>>>
>>>> These lines were provided by the system support. It seems we do not use a dynamic script to find the available nodes/cores at run time; instead, we specify the number of cores/nodes we want in the script (we use a point system to run our scripts) and then submit the job to the queue. I could specify up to maybe 24 nodes x 8 cores/node (#QSUB2 core 192 and #QSUB2 mpi 192) in the script - would that help speed up the calculation on the current system?
>>>>
>>>> The system engineer thought it might not be better for wien2k, as wien2k is most efficient with k-point parallelism. So he suggested that I use 8 k-point parallel and 8 mpi parallel - even though he is not sure what the best choice is, and I have even less idea about the best choice on the current system.
>>>>
>>>> Below is the complete script for submitting my previous job:
>>>> -------------------------
>>>> #!/usr/bin/tcsh
>>>> #QSUB2 queue qaM
>>>> #QSUB2 core 8
>>>> #QSUB2 mpi 8
>>>> #QSUB2 smp 1
>>>>
>>>> cd ${PBS_O_WORKDIR}
>>>>
>>>> source /etc/profile.d/modules.csh
>>>> module load intel11.1/sgimpt
>>>>
>>>> cat $PBS_NODEFILE > .machines_current
>>>> set aa=`wc .machines_current`
>>>> echo '#' > .machines
>>>>
>>>> # example for an MPI parallel lapw0
>>>> echo -n 'lapw0:' >> .machines
>>>> set i=1
>>>> while ($i < $aa[1])
>>>>   echo -n `cat $PBS_NODEFILE | head -$i | tail -1` ' ' >> .machines
>>>>   @ i ++
>>>> end
>>>> echo `cat $PBS_NODEFILE | head -$i | tail -1` ' ' >> .machines
>>>>
>>>> # example for k-point parallel lapw1/2
>>>> set i=1
>>>> while ($i <= $aa[1])
>>>>   echo -n '1:' >> .machines
>>>>   head -$i .machines_current | tail -1 >> .machines
>>>>   @ i ++
>>>> end
>>>> echo 'granularity:1' >> .machines
>>>> echo 'extrafine:1' >> .machines
>>>>
>>>> runsp_lapw -ec 0.0001 -p
>>>> ------------------------------
>>>>
>>>> Any comment or suggestion would be highly appreciated.
>>>>
>>>> Thank you very much.
>>>> Qiwen
>>>>
>>>>
>>>>
>>>
>>>
>>>
>>
>>
>
>
>
>-- 
>Laurence Marks
>Department of Materials Science and Engineering
>MSE Rm 2036 Cook Hall
>2220 N Campus Drive
>Northwestern University
>Evanston, IL 60208, USA
>Tel: (847) 491-3996 Fax: (847) 491-7820
>email: L-marks at northwestern dot edu
>Web: www.numis.northwestern.edu
>Chair, Commission on Electron Crystallography of IUCR
>www.numis.northwestern.edu/
>Research is to see what everybody else has seen, and to think what
>nobody else has thought
>Albert Szent-Gyorgi
>_______________________________________________
>Wien mailing list
>Wien at zeus.theochem.tuwien.ac.at
>http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien

**********************************************************

Dr QiWen YAO

JSPS Fellow
Multifunctional Materials Group
Optical and Electronic Materials Unit
Environment and Energy Materials Research Division

National Institute for Materials Science

1-2-1 Sengen, Tsukuba, Ibaraki 305-0047, Japan
Phone: +81-29-851-3354, ext. no. 6482, Fax: +81-29-859-2501

**********************************************************


