[Wien] Problem Running Parallel Job

Laurence Marks L-marks at northwestern.edu
Sat Nov 12 21:40:42 CET 2011


There is no obvious indication that it failed.

And, do not just send files. The purpose of this list is to provide
advice, but you are expected to do most of the work of debugging your
own calculation, as we have no access to your files.
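For example, you can sanity-check the .machines file your script produces by rebuilding it by hand from a sample nodefile and inspecting the result. This is only a sketch of the same logic as the csh loops in the script quoted below; the hostnames are illustrative, not from your cluster:

```shell
# Sketch: rebuild a .machines file from a PBS-style nodefile (one hostname
# per allocated slot), mirroring the csh loops in the quoted PBS script.
# Hostnames below are illustrative only.
cat > nodefile.example <<'EOF'
oliver1
oliver1
oliver2
oliver2
EOF

echo '#' > .machines
# lapw0 line: a single MPI job spanning every allocated slot
printf 'lapw0:' >> .machines
tr '\n' ' ' < nodefile.example >> .machines
echo '' >> .machines
# k-point parallel lapw1/lapw2: one "1:host" line per slot
while read host; do
  echo "1:$host" >> .machines
done < nodefile.example
echo 'granularity:1' >> .machines
echo 'extrafine:1' >> .machines
cat .machines
```

If the file printed at the end does not look like this (one lapw0 line listing all slots, then one 1:host line per slot), the problem is in the script, not in WIEN2k.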

2011/11/12 Chinedu Ekuma <panaceamee at yahoo.com>:
> Dear Dr. Laurence,
> The MPI is well compiled, as other software uses it. Let me know the
> specific information you need to help, and I will send it to you. Below is
> the output from the case.dayfile.
> using WIEN2k_11.1 (Release 14/6/2011) in
> /home/packages/wien2k/11.1/intel-11.1-mvapich-1.1/src
>
>
>     start     (Thu Nov 10 15:57:56 CST 2011) with lapw0 (90/99 to go)
>
>     cycle 1     (Thu Nov 10 15:57:56 CST 2011)     (90/99 to go)
>
>>   lapw0 -p    (15:57:56) starting parallel lapw0 at Thu Nov 10 15:57:56
>> CST 2011
> -------- .machine0 : 16 processors
> 0.080u 0.140s 0:03.81 5.7%    0+0k 0+0io 0pf+0w
>>   lapw1  -p      (15:58:00) starting parallel lapw1 at Thu Nov 10 15:58:00
>> CST 2011
> ->  starting parallel LAPW1 jobs at Thu Nov 10 15:58:00 CST 2011
> running LAPW1 in parallel mode (using .machines)
> 16 number_of_parallel_jobs
>      oliver1(380) 26.000u 3.270s 1:29.71 32.63%      0+0k 0+0io 0pf+0w
>      oliver1(380) 25.980u 4.220s 1:38.81 30.56%      0+0k 0+0io 0pf+0w
>      oliver1(380) 26.380u 3.650s 1:48.06 27.79%      0+0k 0+0io 0pf+0w
>      oliver1(380) 25.940u 3.190s 1:28.54 32.90%      0+0k 0+0io 0pf+0w
>      oliver1(380) 25.610u 3.340s 1:59.38 24.25%      0+0k 0+0io 0pf+0w
>      oliver1(380) 25.890u 3.260s 1:51.26 26.20%      0+0k 0+0io 0pf+0w
>      oliver1(380) 26.030u 3.460s 1:44.84 28.13%      0+0k 0+0io 0pf+0w
>      oliver1(380) 25.890u 3.100s 1:45.37 27.51%      0+0k 0+0io 0pf+0w
>      oliver1(380) 25.370u 3.340s 1:50.97 25.87%      0+0k 0+0io 0pf+0w
>      oliver1(380) 25.850u 4.720s 1:53.82 26.86%      0+0k 0+0io 0pf+0w
>      oliver1(380) 25.710u 3.120s 1:35.85 30.08%      0+0k 0+0io 0pf+0w
>      oliver1(380) 26.060u 3.390s 1:44.48 28.19%      0+0k 0+0io 0pf+0w
>      oliver1(380) 25.480u 3.310s 1:46.31 27.08%      0+0k 0+0io 0pf+0w
>      oliver1(380) 25.430u 3.360s 1:46.49 27.03%      0+0k 0+0io 0pf+0w
>      oliver1(380) 25.580u 3.160s 1:48.18 26.56%      0+0k 0+0io 0pf+0w
>      oliver1(380) 25.250u 3.270s 1:46.57 26.76%      0+0k 0+0io 0pf+0w
>      oliver1(1) 0.270u 0.010s 0.47 58.95%      0+0k 0+0io 0pf+0w
>      oliver1(1) 0.240u 0.040s 0.47 59.20%      0+0k 0+0io 0pf+0w
>      oliver1(1) 0.240u 0.020s 0.47 54.74%      0+0k 0+0io 0pf+0w
>      oliver1(1) 0.250u 0.000s 0.76 32.85%      0+0k 0+0io 0pf+0w
>    Summary of lapw1para:
>    oliver1     k=6084     user=413.45     wallclock=2014.58
> 0.270u 1.870s 2:06.40 1.6%    0+0k 0+0io 0pf+0w
>>   lapw2 -p      (16:00:07) running LAPW2 in parallel mode
>       oliver1 12.990u 0.970s 16.61 84.05% 0+0k 0+0io 0pf+0w
>       oliver1 13.890u 1.780s 24.15 64.86% 0+0k 0+0io 0pf+0w
>       oliver1 13.140u 1.360s 28.73 50.46% 0+0k 0+0io 0pf+0w
>       oliver1 14.740u 4.600s 54.22 35.67% 0+0k 0+0io 0pf+0w
>       oliver1 14.060u 1.030s 52.49 28.74% 0+0k 0+0io 0pf+0w
>       oliver1 13.330u 1.000s 57.72 24.83% 0+0k 0+0io 0pf+0w
>       oliver1 14.340u 0.870s 1:05.97 23.05% 0+0k 0+0io 0pf+0w
>       oliver1 13.420u 1.040s 1:06.51 21.74% 0+0k 0+0io 0pf+0w
>       oliver1 13.850u 1.050s 1:13.63 20.23% 0+0k 0+0io 0pf+0w
>       oliver1 13.320u 1.110s 1:07.55 21.36% 0+0k 0+0io 0pf+0w
>       oliver1 13.100u 1.100s 1:11.05 19.98% 0+0k 0+0io 0pf+0w
>       oliver1 13.980u 1.000s 1:09.53 21.54% 0+0k 0+0io 0pf+0w
>       oliver1 12.980u 1.170s 1:06.62 21.24% 0+0k 0+0io 0pf+0w
>       oliver1 13.150u 1.300s 1:07.4 21.44% 0+0k 0+0io 0pf+0w
>       oliver1 13.940u 0.980s 1:07.78 22.01% 0+0k 0+0io 0pf+0w
>       oliver1 13.280u 1.080s 1:03.40 22.65% 0+0k 0+0io 0pf+0w
>       oliver1 0.080u 0.100s 3.32 5.41% 0+0k 0+0io 0pf+0w
>       oliver1 0.110u 0.040s 3.17 4.72% 0+0k 0+0io 0pf+0w
>       oliver1 0.090u 0.040s 2.43 5.34% 0+0k 0+0io 0pf+0w
>       oliver1 0.110u 0.030s 2.52 5.54% 0+0k 0+0io 0pf+0w
>    Summary of lapw2para:
>    oliver1     user=217.9     wallclock=15710.7
> 3.670u 5.790s 1:34.56 10.0%    0+0k 0+0io 5pf+0w
>>   lcore    (16:01:41) 0.030u 0.000s 0:00.16 18.7%    0+0k 0+0io 0pf+0w
>>   mixer    (16:01:42) 0.030u 0.040s 0:00.28 25.0%    0+0k 0+0io 0pf+0w
> :ENERGY convergence:  0 0 .0169862050000000
> :CHARGE convergence:  0 0.0001 .2566193
> ec cc and fc_conv 1 0 1
>
>
>
> Regards,
> Chinedu Ekuma
> Department of Physics and Astronomy
> Louisiana State University
> 202 Nicholson Hall, Tower Dr
> Baton Rouge, Louisiana, 70803-4001
> Phone (Mobile): +12254390766
>
> ...The Ways of God are Mysterious
>                     As Always
> I wish you God's PANACEA
>
>
>
>
>
> ________________________________
> From: Laurence Marks <L-marks at northwestern.edu>
> To: A Mailing list for WIEN2k users <wien at zeus.theochem.tuwien.ac.at>
> Sent: Saturday, November 12, 2011 2:14 PM
> Subject: Re: [Wien] Problem Running Parallel Job
>
> Wien2k works in parallel, which means that
> a) The script is wrong
> b) You do not have mpi compiled
> c) Your OS/pbs is different
> d) Something else
>
> Without more information nobody can help you -- saying "it does not
> work" is not enough.
>
> 2011/11/12 Chinedu Ekuma <panaceamee at yahoo.com>:
>> Dear Wien2k Users,
>> We recently installed WIEN2k version 11 on our cluster to run in parallel.
>> Below is the PBS script we use, but our computer administrator says that
>> it does not run in parallel. Could you kindly help us rectify the
>> problem? Thanks in anticipation of your help.
>>
>>
>> #PBS -l walltime=14:20:0
>> #PBS -j oe
>> #PBS -N
>> #This is necessary on my pbs cluster:
>> #setenv SCRATCH /scratch/
>>
>> # change into your working directory
>> cd "$WORKDIR"
>>
>> #start creating .machines
>> cat $PBS_NODEFILE |cut -c1-7 >.machines_current
>> set aa=`wc .machines_current`
>> echo '#' > .machines
>>
>> # example for an MPI parallel lapw0
>> echo -n 'lapw0:' >> .machines
>> set i=1
>> while ($i < $aa[1] )
>> echo -n `cat $PBS_NODEFILE |head -$i | tail -1` ' ' >>.machines
>> @ i ++
>> end
>> echo  `cat $PBS_NODEFILE |head -$i|tail -1` ' ' >>.machines
>>
>> #example for k-point parallel lapw1/2
>> set i=1
>> while ($i <= $aa[1] )
>> echo -n '1:' >>.machines
>> head -$i .machines_current |tail -1 >> .machines
>> @ i ++
>> end
>> echo 'granularity:1' >>.machines
>> echo 'extrafine:1' >>.machines
>>
>> #define here your WIEN2k command
>>
>> runsp_lapw -p -i 40 -cc 0.0001 -I
>>
>>
>> Regards,
>> Chinedu Ekuma
>>
>>
>>
>> _______________________________________________
>> Wien mailing list
>> Wien at zeus.theochem.tuwien.ac.at
>> http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien
>>
>>
>
>
>
> --
> Professor Laurence Marks
> Department of Materials Science and Engineering
> Northwestern University
> www.numis.northwestern.edu 1-847-491-3996
> "Research is to see what everybody else has seen, and to think what
> nobody else has thought"
> Albert Szent-Gyorgi
> _______________________________________________
> Wien mailing list
> Wien at zeus.theochem.tuwien.ac.at
> http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien
>
>
>
>



-- 
Professor Laurence Marks
Department of Materials Science and Engineering
Northwestern University
www.numis.northwestern.edu 1-847-491-3996
"Research is to see what everybody else has seen, and to think what
nobody else has thought"
Albert Szent-Gyorgi

