[Wien] LAPW2 crashed!

Laurence Marks L-marks at northwestern.edu
Thu Aug 16 15:04:39 CEST 2012


Most errors are due to user mistakes in the input. You have not provided
enough information for anyone to do more than make a guess.

My suspicion is that someone gave you the script and said "use this".
Scripts are good if you are an experienced user, but most experienced
users also know where to look to diagnose errors.
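
As a rough illustration of where to look: the programs normally leave a
.error file in the case directory, and a non-empty one usually marks the
step that failed. The sketch below prints any such files. The helper name
show_errors is made up for this example and is not part of WIEN2k, and the
"empty means clean" convention is an assumption to check against your
version.

```shell
# Print every non-empty *.error file found in a case directory.
# Assumption: a program that finished cleanly leaves its .error file empty.
show_errors() {
    dir="${1:-.}"                    # default to the current directory
    found=0
    for f in "$dir"/*.error; do
        [ -s "$f" ] || continue      # skip missing or empty files
        found=1
        echo "== $f"
        cat "$f"
    done
    [ "$found" -eq 1 ] || echo "no non-empty .error files in $dir"
}
```

Called as show_errors /home/x_yunli/WIEN2k/GaNCu, this would list whatever
lapw2 wrote before it stopped. The last lines of the dayfile and the
scheduler's stdout/stderr are the other obvious places to check.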

You probably should do the initialization by hand so you can understand the
steps and work out what has gone wrong. Have you worked through the examples
in the user guide first?
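
A minimal sketch of what "by hand" could look like: run one step at a time
and stop at the first one that fails, so that its output can be read before
anything else runs. The helper run_steps is hypothetical. The step list in
the usage comment (nn, sgroup, symmetry, lstart, kgen, dstart) is an
assumption based on the user guide's description of init_lapw, so check it
against the guide for your version.

```shell
# Run each command line in order; stop and report at the first failure.
run_steps() {
    for step in "$@"; do
        echo ">>> $step"
        # Unquoted on purpose: each argument is one simple command line
        # that the shell splits into a program name and its arguments.
        $step || { echo "step failed: $step" >&2; return 1; }
    done
}

# Hypothetical usage, typed inside the case directory (not run here):
#   run_steps "x nn" "x sgroup" "x symmetry" "x lstart" "x kgen" "x dstart"
```

Several of these programs ask questions interactively, so typing them one
by one and reading what each prints is the real point; the wrapper only
makes the "stop at the first failure" rule explicit.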

---------------------------
Professor Laurence Marks
Department of Materials Science and Engineering
Northwestern University
www.numis.northwestern.edu 1-847-491-3996
"Research is to see what everybody else has seen, and to think what nobody
else has thought"
Albert Szent-Gyorgi
On Aug 16, 2012 7:47 AM, "Yunguo Li" <yunguo at kth.se> wrote:

> Dear support,
> - I am running wien version WIEN2k_11.1 (Release 14/6/2011).
> - The purpose of my calculations is to calculate XAS, firstly I am running
> SCF, I considered ferromagnetic calculation.
> - I am running this case using this sbatch script:
> #!/bin/bash
>
> #SBATCH -A matter4
> #SBATCH -J tst
> #SBATCH -N 4
> #SBATCH -t 00:14:00
>
>
> export SCRATCH=/scratch/local
> export WIENROOT=/home/x_yunli/wien2k
>
> # set .machines for parallel job
> # lapw0 running on one node
> echo -n "lapw0: " > .machines
> echo -n $(hostlist -e $SLURM_JOB_NODELIST | tail -1) >> .machines
> echo "$i:8" >> .machines
> # run one mpi job on each node (splitting k-mesh over nodes)
> for i in $(hostlist -e $SLURM_JOB_NODELIST)
> do
>  echo "1:$i:8 " >> .machines
> done
> echo granularity:1 >> .machines
> echo extrafine:1   >> .machines
>
> #start WIEN2k
> x_lapw -f GaNCu -up -c -p -fermi
> #initio
> init_lapw -sp -red 3 -ecut 8 -numk 144
> #main
> runsp_lapw -ec 0.0001Ry -i 40 -p -I
>
> - The program stops at this point,  This is the content of the day file:
> Calculating GaNCu in /home/x_yunli/WIEN2k/GaNCu
> on m371 with PID 18960
> using WIEN2k_11.1 (Release 14/6/2011) in /home/x_yunli/wien2k
>
>
>     start       (Thu Aug 16 13:56:39 CEST 2012) with lapw0 (40/99 to go)
>
>     cycle 1     (Thu Aug 16 13:56:39 CEST 2012)         (40/99 to go)
>
> >   lapw0 -p    (13:56:39) starting parallel lapw0 at Thu Aug 16 13:56:39 CEST 2012
> -------- .machine0 : 8 processors
> mpprun INFO: Starting openmpi run on 4 nodes (32 ranks)...
> 0.364u 0.590s 0:24.04 3.9%      0+0k 0+0io 27pf+0w
> >   lapw1  -c -up -p    (13:57:03) starting parallel lapw1 at Thu Aug 16 13:57:03 CEST 2012
> ->  starting parallel LAPW1 jobs at Thu Aug 16 13:57:03 CEST 2012
> running LAPW1 in parallel mode (using .machines)
> 4 number_of_parallel_jobs
>      m371 m371 m371 m371 m371 m371 m371 m371(18) 0.010u 0.006s 0:00.02 50.0%    0+0k 0+0io 1pf+0w
>      m372 m372 m372 m372 m372 m372 m372 m372(18) 0.010u 0.005s 0:00.01 100.0%   0+0k 0+0io 0pf+0w
>      m373 m373 m373 m373 m373 m373 m373 m373(18) 0.012u 0.004s 0:00.01 100.0%   0+0k 0+0io 0pf+0w
>      m374 m374 m374 m374 m374 m374 m374 m374(18) 0.011u 0.006s 0:00.01 100.0%   0+0k 0+0io 0pf+0w
>    Summary of lapw1para:
>    m371  k=0     user=0  wallclock=0
>    m372  k=0     user=0  wallclock=0
>    m373  k=0     user=0  wallclock=0
>    m374  k=0     user=0  wallclock=0
> 0.161u 0.239s 0:06.53 5.9%      0+0k 0+0io 11pf+0w
> >   lapw1  -c -dn -p    (13:57:09) starting parallel lapw1 at Thu Aug 16 13:57:09 CEST 2012
> ->  starting parallel LAPW1 jobs at Thu Aug 16 13:57:09 CEST 2012
> running LAPW1 in parallel mode (using .machines.help)
> 4 number_of_parallel_jobs
>      m371 m371 m371 m371 m371 m371 m371 m371(18) 0.011u 0.005s 0:00.01 100.0%   0+0k 0+0io 0pf+0w
>      m372 m372 m372 m372 m372 m372 m372 m372(18) 0.009u 0.008s 0:00.02 0.0%     0+0k 0+0io 0pf+0w
>      m373 m373 m373 m373 m373 m373 m373 m373(18) 0.009u 0.006s 0:00.01 0.0%     0+0k 0+0io 0pf+0w
>      m374 m374 m374 m374 m374 m374 m374 m374(18) 0.008u 0.007s 0:00.01 0.0%     0+0k 0+0io 0pf+0w
>    Summary of lapw1para:
>    m371  k=0     user=0  wallclock=0
>    m372  k=0     user=0  wallclock=0
>    m373  k=0     user=0  wallclock=0
>    m374  k=0     user=0  wallclock=0
> 0.138u 0.253s 0:06.39 5.9%      0+0k 0+0io 0pf+0w
> >   lapw2 -c -up  -p    (13:57:16) running LAPW2 in parallel mode
> **  LAPW2 crashed!
> 0.027u 0.039s 0:00.15 33.3%     0+0k 0+0io 0pf+0w
> error: command   /home/x_yunli/wien2k/lapw2cpara -up -c uplapw2.def failed
>
> >   stop error
>
> Could you please find the problem ?
>
> Best regards,
> Li