[Wien] commlib error

Gavin Abo gsabo at crimson.ua.edu
Fri Jul 10 16:24:09 CEST 2015


An additional comment. The same kind of error is reported in the post at:

https://arc.liv.ac.uk/pipermail/gridengine-users/2010-October/032729.html

There, the error has the form:

error: commlib error: got select error (Connection reset by peer)
error: executing task of job x failed: failed sending task to 
execd at hostname: can't find connection

It looks like they might have tracked down the problem to the master 
daemon (qmaster), as seen in the post at:

https://arc.liv.ac.uk/pipermail/gridengine-users/2010-October/032758.html

So the error might be caused by a daemon problem (with the 
tachyon1478 node).
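
To check whether the failures always point at the same node, something 
like the following could be run over the job's error file. This is only 
a minimal sketch: the message wording is copied from the error quoted in 
this thread (not from the Grid Engine documentation), and the function 
name failing_hosts is made up for illustration.

```python
import re
from collections import Counter

# Pattern for the task-failure message as quoted in this thread.
# Assumption: the wording in your own error file is the same; adjust if not.
TASK_FAILURE = re.compile(
    r"executing task of job (\S+) failed: failed sending task to execd at (\S+?):"
)


def failing_hosts(log_text):
    """Count how often each execution host appears in a task-failure message."""
    # Collapse all whitespace so a message wrapped across lines still matches.
    flat = " ".join(log_text.split())
    return Counter(host for _job, host in TASK_FAILURE.findall(flat))


if __name__ == "__main__":
    # Sample text taken from the error reported in this thread.
    sample = (
        "error: commlib error: got select error (Connection reset by peer)\n"
        "error: executing task of job 2424636 failed: failed sending task\n"
        "to execd at tachyon1478: can't find connection\n"
    )
    for host, count in failing_hosts(sample).most_common():
        print(host, count)
```

If one host dominates the counts over several failed jobs, that would 
support the idea of a problem with the daemon on that particular node.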

On 7/10/2015 5:01 AM, Laurence Marks wrote:
>
> From a brief Google search, this is an MPI error.
>
> How did you compile? It is easy to use the wrong BLACS combination.
>
> Have you run simpler cases such as TiC first?
>
> ---
> Professor Laurence Marks
> Department of Materials Science and Engineering
> Northwestern University
> http://www.numis.northwestern.edu
> Corrosion in 4D http://MURI4D.numis.northwestern.edu
> Co-Editor, Acta Cryst A
> "Research is to see what everybody else has seen, and to think what 
> nobody else has thought"
> Albert Szent-Györgyi
>
> On Jul 10, 2015 03:05, "Imran Khan" <imrankhanswati80 at gmail.com 
> <mailto:imrankhanswati80 at gmail.com>> wrote:
>
>     Dear WIEN2k experts and users,
>     I am using WIEN2k version 14.2 on a queuing system (SGE), with
>     the Intel compiler 11.1, the MPI library mpi/openmpi-1.6.3 and the
>     math library fftw-3.3.4. With these options, WIEN2k installed
>     without any compile-time errors.
>     The purpose of my calculation is to find the stable site for
>     different substituents in NdFeB intermetallics.
>     I am running the case.struct given in the attachment, using 200
>     k-points (6 6 4). My RKmax value is 7 and Gmax is 12, and I am
>     using the LDA+U method.
>     I am using the following command:
>     runsp_lapw -p -orb -i 80 -ec 0.0001 -cc 0.001
>     Every time I submit my job, it is terminated after a few SCF
>     cycles with the following error in the error tag file.
>
>     error: commlib error: got select error (Connection reset by peer)
>     error: executing task of job 2424636 failed: failed sending task
>     to execd at tachyon1478: can't find connection
>         .
>         .
>         .
>      LAPW2 END
>      LAPW2 END
>      LAPW2 END
>      LAPW2 END
>     real    0m53.638s
>     forrtl: No such file or directory
>     forrtl: severe (29): file not found, unit 21, file
>     /home01/x1030imr/khan/Wien2K/Neomagnet/Pr-doped/f-site/AFM/Pr-Af/Pr-Af.scf2up_31
>     Image              PC                Routine      Line        Source
>     sumpara            00000000004A671D  Unknown         Unknown  Unknown
>     sumpara            00000000004A5225  Unknown         Unknown  Unknown
>     sumpara            0000000000456259  Unknown         Unknown  Unknown
>     sumpara            0000000000416A5A  Unknown         Unknown  Unknown
>     sumpara            0000000000416250  Unknown         Unknown  Unknown
>     sumpara            0000000000421E3D  Unknown         Unknown  Unknown
>     sumpara            0000000000410771  scfsum_             126  scfsum.f
>     sumpara            000000000040EE82  MAIN__            219  sumpara.f
>     sumpara            00000000004033DC  Unknown         Unknown  Unknown
>     libc.so.6          00000035AA81D974  Unknown         Unknown  Unknown
>     sumpara            00000000004032E9  Unknown         Unknown  Unknown
>     cp: cannot stat `.in.tmp': No such file or directory
>
>     I have discussed this error with the engineers of that queuing
>     system (tachyon) and have also searched the mailing list, but
>     could not find a solution.
>     Your guidance in solving this issue will be greatly appreciated.
>     Best regards
>     Imran.
>