[Wien] Parallel job error

jadhikari@clarku.edu jadhikari at clarku.edu
Tue Mar 6 18:18:16 CET 2007


Prof. L Marks,

Thank you very much for the suggestions. I tried all the combinations and
finally found that the problem lies with the sleep and delay parameters,
which I do not know how to set for my current calculations.
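
For anyone who hits the same hang, here is a minimal sketch of one check
that may help narrow it down. The dayfile line quoted below removes a
.lock_ file as each k-point job finishes; assuming (I have not verified
this in the script) that lapw1para keeps waiting while such files exist,
any lock file left behind long after LAPW1 END would point to the job that
was never cleared. The function name stale_locks and the min_age_s
threshold are made up for illustration and are not part of Wien2k.

```python
import os
import time
from pathlib import Path


def stale_locks(case_dir, min_age_s=60.0, now=None):
    """Return (file name, age in seconds) for .lock_* files older than min_age_s.

    Assumption: the parallel scripts create .lock_* files in the case
    directory and remove them when a k-point job is done.
    """
    now = time.time() if now is None else now
    found = []
    for path in sorted(Path(case_dir).glob(".lock_*")):
        age = now - path.stat().st_mtime  # seconds since last modification
        if age >= min_age_s:
            found.append((path.name, age))
    return found


if __name__ == "__main__":
    # Run from the case directory while the job appears to be stuck.
    for name, age in stale_locks(os.getcwd()):
        print(f"{name}: untouched for {age:.0f} s")
```

Run it from the case directory while the job appears stuck; no output
means that no old lock files were found there.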

Subin

> It is 99.99% certain that this has nothing to do with Wien2k or
> compilation options. It sounds like the jobs are not being
> appropriately dispatched to the different processors. If they are on
> different machines, that means that something is failing in terms of
> your nfs/ssh communications and it might be as simple as an incorrect
> nfs setup or the nfs bug (search the mailing list). In this case do a
> simple ssh to the other node and see if it is running via top. Also
> look in :log, case.dayfile and/or turn on debugging in lapw1para so you
> can find out what is going on; look also in the case.output1_?? files.
>
> If it is running many points on a single, multiprocessor machine it
> might be something similar -- is your ssh/rsh or whatever working
> right? Use top and ps.
>
> On 3/2/07, jadhikari at clarku.edu <jadhikari at clarku.edu> wrote:
>> Dear Wien users,
>>
>> The calculation (with the input file below) runs to convergence with 1 k
>> point and 1 processor. It takes about 18 hours for 18 cycles.
>>
>> Then, with a higher number of k points and multiple processors, it always
>> fails. It gets stuck after LAPW1 END in the first cycle. The following is
>> part of the dayfile:
>>
>> [1]  - Done                      ( cd $PWD; $t $exe ${def}_$loop.def; rm -f .lock_$lockfile[$p] ) >>  ...
>>
>> Then it seems to be static and not moving forward.
>>
>> This calculation involving NaNbO3 has never converged in parallel mode,
>> but I could manage with other systems like TiO2 and NbO2. Regarding the
>> static and floating compiler options/flags previously mentioned on the
>> list, this does not seem to be an issue in our case.
>>
>> Is there any other parameter that has to be set after switching to a
>> different space group or a system with a different number of atoms? I
>> guess this is not the cause. Why is this system different from NbO2, which
>> runs fine in parallel mode? There is something about the parallel option
>> in the present case that we are missing.
>>
>> Any help in fixing this error will be highly appreciated.
>>
>> Have a happy lunar eclipse.
>> Subin
>>
>>
>> case.in1
>> __________________________________________________________
>> WFFIL        (WFPRI, SUPWF)
>>   7.00       10    4 (R-MT*K-MAX; MAX L IN WF, V-NMT
>>  .05320   5   0      global e-param with N other choices, napw
>>  0    0.140     0.000 CONT 1
>>  0   -3.248     0.002 CONT 1
>>  1    0.238     0.000 CONT 1
>>  1   -1.189     0.000 CONT 1
>>  2    0.215     0.000 CONT 1
>>  .05320   5   0      global e-param with N other choices, napw
>>  0    0.111     0.000 CONT 1
>>  0   -3.265     0.002 CONT 1
>>  1    0.204     0.000 CONT 1
>>  1   -1.206     0.000 CONT 1
>>  2    0.195     0.000 CONT 1
>>  .05320   6   0      global e-param with N other choices, napw
>>  0    0.054     0.000 CONT 1
>>  0   -3.611     0.002 CONT 1
>>  1    0.225     0.000 CONT 1
>>  1   -1.858     0.000 CONT 1
>>  2    0.096     0.000 CONT 1
>>  2   -0.867     0.000 CONT 1
>>  .05320   3   0      global e-param with N other choices, napw
>>  0    0.184     0.000 CONT 1
>>  0   -0.851     0.000 CONT 1
>>  1    0.201     0.000 CONT 1
>>  .05320   3   0      global e-param with N other choices, napw
>>  0    0.187     0.000 CONT 1
>>  0   -0.870     0.000 CONT 1
>>  1    0.184     0.000 CONT 1
>>  .05320   3   0      global e-param with N other choices, napw
>>  0    0.201     0.000 CONT 1
>>  0   -0.884     0.000 CONT 1
>>  1    0.167     0.000 CONT 1
>>  .05320   3   0      global e-param with N other choices, napw
>>  0    0.204     0.000 CONT 1
>>  0   -0.815     0.000 CONT 1
>>  1    0.234     0.000 CONT 1
>> K-VECTORS FROM UNIT:4   -10.0       2.0      emin/emax window
>> _______________________________________________________________
>> case.struct
>>
>> Sodium Niobate
>> P   LATTICE,NONEQUIV.ATOMS:  757_Pbcm
>> MODE OF CALC=RELA unit=bohr
>>  10.404800 10.518200 29.328600 90.000000 90.000000 90.000000
>> ATOM  -1: X=0.24300000 Y=0.75000000 Z=0.00000000
>>           MULT= 4          ISPLIT= 8
>>       -1: X=0.75700000 Y=0.25000000 Z=0.00000000
>>       -1: X=0.24300000 Y=0.75000000 Z=0.50000000
>>       -1: X=0.75700000 Y=0.25000000 Z=0.50000000
>> Na1        NPT=  781  R0=0.00010000 RMT=    2.5000   Z: 11.0
>> LOCAL ROT MATRIX:    0.0000000 0.0000000 1.0000000
>>                      1.0000000 0.0000000 0.0000000
>>                      0.0000000 1.0000000 0.0000000
>> ATOM  -2: X=0.23900000 Y=0.78200000 Z=0.25000000
>>           MULT= 4          ISPLIT= 8
>>       -2: X=0.76100000 Y=0.21800000 Z=0.75000000
>>       -2: X=0.76100000 Y=0.28200000 Z=0.25000000
>>       -2: X=0.23900000 Y=0.71800000 Z=0.75000000
>> Na2        NPT=  781  R0=0.00010000 RMT=    2.5000   Z: 11.0
>> LOCAL ROT MATRIX:    1.0000000 0.0000000 0.0000000
>>                      0.0000000 1.0000000 0.0000000
>>                      0.0000000 0.0000000 1.0000000
>> ATOM  -3: X=0.25660000 Y=0.27220000 Z=0.12620000
>>           MULT= 8          ISPLIT= 8
>>       -3: X=0.74340000 Y=0.72780000 Z=0.87380000
>>       -3: X=0.25660000 Y=0.27220000 Z=0.37380000
>>       -3: X=0.74340000 Y=0.72780000 Z=0.62620000
>>       -3: X=0.74340000 Y=0.77220000 Z=0.12620000
>>       -3: X=0.25660000 Y=0.22780000 Z=0.87380000
>>       -3: X=0.74340000 Y=0.77220000 Z=0.37380000
>>       -3: X=0.25660000 Y=0.22780000 Z=0.62620000
>> Nb         NPT=  781  R0=0.00010000 RMT=    1.8000   Z: 41.0
>> LOCAL ROT MATRIX:    1.0000000 0.0000000 0.0000000
>>                      0.0000000 1.0000000 0.0000000
>>                      0.0000000 0.0000000 1.0000000
>> ATOM  -4: X=0.30400000 Y=0.25000000 Z=0.00000000
>>           MULT= 4          ISPLIT= 8
>>       -4: X=0.69600000 Y=0.75000000 Z=0.00000000
>>       -4: X=0.30400000 Y=0.25000000 Z=0.50000000
>>       -4: X=0.69600000 Y=0.75000000 Z=0.50000000
>> O 1        NPT=  781  R0=0.00010000 RMT=    1.4000   Z:  8.0
>> LOCAL ROT MATRIX:    0.0000000 0.0000000 1.0000000
>>                      1.0000000 0.0000000 0.0000000
>>                      0.0000000 1.0000000 0.0000000
>> ATOM  -5: X=0.19100000 Y=0.23300000 Z=0.25000000
>>           MULT= 4          ISPLIT= 8
>>       -5: X=0.80900000 Y=0.76700000 Z=0.75000000
>>       -5: X=0.80900000 Y=0.73300000 Z=0.25000000
>>       -5: X=0.19100000 Y=0.26700000 Z=0.75000000
>> O 2        NPT=  781  R0=0.00010000 RMT=    1.4000   Z:  8.0
>> LOCAL ROT MATRIX:    1.0000000 0.0000000 0.0000000
>>                      0.0000000 1.0000000 0.0000000
>>                      0.0000000 0.0000000 1.0000000
>> ATOM  -6: X=0.53600000 Y=0.03200000 Z=0.14000000
>>           MULT= 8          ISPLIT= 8
>>       -6: X=0.46400000 Y=0.96800000 Z=0.86000000
>>       -6: X=0.53600000 Y=0.03200000 Z=0.36000000
>>       -6: X=0.46400000 Y=0.96800000 Z=0.64000000
>>       -6: X=0.46400000 Y=0.53200000 Z=0.14000000
>>       -6: X=0.53600000 Y=0.46800000 Z=0.86000000
>>       -6: X=0.46400000 Y=0.53200000 Z=0.36000000
>>       -6: X=0.53600000 Y=0.46800000 Z=0.64000000
>> O 3        NPT=  781  R0=0.00010000 RMT=    1.4000   Z:  8.0
>> LOCAL ROT MATRIX:    1.0000000 0.0000000 0.0000000
>>                      0.0000000 1.0000000 0.0000000
>>                      0.0000000 0.0000000 1.0000000
>> ATOM  -7: X=0.96600000 Y=0.46700000 Z=0.11000000
>>           MULT= 8          ISPLIT= 8
>>       -7: X=0.03400000 Y=0.53300000 Z=0.89000000
>>       -7: X=0.96600000 Y=0.46700000 Z=0.39000000
>>       -7: X=0.03400000 Y=0.53300000 Z=0.61000000
>>       -7: X=0.03400000 Y=0.96700000 Z=0.11000000
>>       -7: X=0.96600000 Y=0.03300000 Z=0.89000000
>>       -7: X=0.03400000 Y=0.96700000 Z=0.39000000
>>       -7: X=0.96600000 Y=0.03300000 Z=0.61000000
>> O 4        NPT=  781  R0=0.00010000 RMT=    1.4000   Z:  8.0
>> LOCAL ROT MATRIX:    1.0000000 0.0000000 0.0000000
>>                      0.0000000 1.0000000 0.0000000
>>                      0.0000000 0.0000000 1.0000000
>>    8      NUMBER OF SYMMETRY OPERATIONS
>> -1 0 0 0.00000000
>>  0-1 0 0.00000000
>>  0 0-1 0.00000000
>>        1
>>  1 0 0 0.00000000
>>  0 1 0 0.00000000
>>  0 0 1 0.00000000
>>        2
>> -1 0 0 0.00000000
>>  0-1 0 0.00000000
>>  0 0 1 0.50000000
>>        3
>> -1 0 0 0.00000000
>>  0 1 0 0.50000000
>>  0 0-1 0.50000000
>>        4
>> -1 0 0 0.00000000
>>  0 1 0 0.50000000
>>  0 0 1 0.00000000
>>        5
>>  1 0 0 0.00000000
>>  0-1 0 0.50000000
>>  0 0-1 0.00000000
>>        6
>>  1 0 0 0.00000000
>>  0-1 0 0.50000000
>>  0 0 1 0.50000000
>>        7
>>  1 0 0 0.00000000
>>  0 1 0 0.00000000
>>  0 0-1 0.50000000
>>        8
>> ___________________________________________________________
>>
>> _______________________________________________
>> Wien mailing list
>> Wien at zeus.theochem.tuwien.ac.at
>> http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien
>>
>
>
> --
> Laurence Marks
> Department of Materials Science and Engineering
> MSE Rm 2036 Cook Hall
> 2220 N Campus Drive
> Northwestern University
> Evanston, IL 60208, USA
> Tel: (847) 491-3996 Fax: (847) 491-7820
> email: L-marks at northwestern dot edu
> Web: www.numis.northwestern.edu
> EMM2007 http://ns.crys.ras.ru/EMMM07/