[Wien] [SPAM?] Re: k point parallel calculations

Laurence Marks L-marks at northwestern.edu
Tue Feb 24 19:39:44 CET 2015


I am not certain, but it looks like the mixer error for versions 12/13 is due
to a format error in your case.in0. I may be wrong about this, so please look
at what is at line 168 of your mixer.F.

In most cases where I have seen errors such as this it is because something
has gone wrong earlier. Check with "cat *.error", as all these files should
be empty. Check that your case.clmval and case.clmcor are not empty and do
not contain NaN. Look at the end of the case.output* files to check that
the programs really worked.
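
The checks above could be scripted roughly as follows. This is only a sketch: the function name check_wien_run and its argument are made up for illustration and are not part of WIEN2k, and it assumes you run it from the case directory after the failed cycle. Note that the case-insensitive NaN search may also match an unrelated word in the header of the file (for example a title containing "nano"), so treat a hit as a prompt to look at the file, not as proof.

```shell
#!/bin/sh
# Hypothetical helper (not part of WIEN2k): run the sanity checks
# described above for the case whose name is given as $1.
check_wien_run() {
    case_name="$1"
    status=0

    # 1. Every *.error file should be empty.
    for f in *.error; do
        [ -e "$f" ] || continue          # no match: glob stays literal
        if [ -s "$f" ]; then
            echo "non-empty error file: $f"
            status=1
        fi
    done

    # 2. The density files must exist, be non-empty, and contain no NaN.
    for f in "$case_name.clmval" "$case_name.clmcor"; do
        if [ ! -s "$f" ]; then
            echo "missing or empty: $f"
            status=1
        elif grep -qi 'nan' "$f"; then   # may false-positive on header text
            echo "NaN found in: $f"
            status=1
        fi
    done

    # 3. Print the end of each output file for manual inspection.
    for f in "$case_name".output*; do
        [ -e "$f" ] || continue
        echo "== tail of $f =="
        tail -n 5 "$f"
    done

    return "$status"
}
```

After sourcing the file, something like "check_wien_run mycase" (with mycase replaced by your own case name) prints any problems it finds and returns a non-zero status if a check failed. The tails of the output files still have to be read by eye.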

___________________________
Professor Laurence Marks
Department of Materials Science and Engineering
Northwestern University
www.numis.northwestern.edu
MURI4D.numis.northwestern.edu
Co-Editor, Acta Cryst A
"Research is to see what everybody else has seen, and to think what nobody
else has thought"
Albert Szent-Gyorgyi
On Feb 24, 2015 12:19 PM, "Priyanka Seth" <priyanka.seth at polytechnique.edu>
wrote:

> Hello all,
>
> I have been trying to run some k-point parallel calculations for some
> large structures and have been having problems for versions 12, 13 and
> 14 on an ifort compilation. In all cases, I am running on the same
> number of cores as k vectors. Note that calculations started from the same
> input and run on a single core complete without any problems.
>
> v12/v13
> =====
>
> This is the output for versions 12 and 13 (I've removed the
> node-dependent lines):
>
> LAPW0 END
> LAPW1 END
> LAPW2 - FERMI; weighs written
> LAPW2 END
> SUMPARA END
> CORE  END
> forrtl: severe (59): list-directed I/O syntax error, unit -5, file Internal List-Directed Read
> Image              PC                Routine            Line Source
> mixer              000000000051693D  Unknown               Unknown Unknown
> mixer              0000000000515445  Unknown               Unknown Unknown
> mixer              00000000004BC9E0  Unknown               Unknown Unknown
> mixer              000000000046F4BA  Unknown               Unknown Unknown
> mixer              000000000046ECB0  Unknown               Unknown Unknown
> mixer              0000000000492B76  Unknown               Unknown Unknown
> mixer              000000000049043B  Unknown               Unknown Unknown
> mixer              0000000000407E7E  MAIN__                    168 mixer.F
> mixer              000000000040414C  Unknown               Unknown Unknown
> libc.so.6          00000037C241D994  Unknown               Unknown Unknown
> mixer              0000000000403FC9  Unknown               Unknown Unknown
>
>  >   stop error
>
> Looking at the error files, I have "Error in MIXER" in both versions.
>
> The dayfile ends as follows:
> 1.884u 0.844s 0:09.73 27.9%    0+0k 0+0io 8pf+0w
>  >   lcore    (09:33:51) 0.046u 0.007s 0:00.14 28.5%    0+0k 0+0io 7pf+0w
>  >   mixer    (09:33:51) 0.000u 0.005s 0:00.04 0.0%    0+0k 0+0io 8pf+0w
> error: command   /home/pseth/SOURCES/WIEN2K_v13/mixer mixer.def failed
>
>  >   stop error
>
>
> v14
> ===
>
> I get to the second cycle, but then the calculation crashes with "Error
> in LAPW1" in lapw1_*.error:
>
>   LAPW2 END
>   SUMPARA END
>   CORE  END
>   MIXER END
> ec cc and fc_conv 0 0 1
> in cycle 2    ETEST: 0   CTEST: 0
>   LAPW0 END
>
> There is nothing obviously wrong looking at the case.scf1_* files or at
> the dayfile which ends like this:
>
>  >   lapw1  -p           (09:37:40) starting parallel lapw1 at Tue Feb 10 09:37:40 CET 2015
> ->  starting parallel LAPW1 jobs at Tue Feb 10 09:37:40 CET 2015
> running LAPW1 in parallel mode (using .machines)
> 24 number_of_parallel_jobs
> [1] 30405
> [2] 30437
> [3] 30471
> [4] 30507
> [5] 30559
> [6] 30606
> [7] 30653
> [8] 30717
> [9] 30809
> [10] 30916
> [11] 31000
> [12] 31070
> [13] 31192
> [14] 31329
> [15] 31428
> [16] 31504
> [17] 31664
> [18] 31788
> [19] 31871
> [20] 31900
> [21] 31928
> [22] 31956
> [23] 31982
> [24] 32010
> [5]    Done                          ( ( $remote $machine[$p]  ...
>
>
> I understand that this is not much information to go on, but I don't
> really know where else to look! Has anyone had similar issues? What else
> would help in diagnosing the problem?
>
> Many thanks,
> Priyanka