[Wien] MPI error

leila mollabashi le.mollabashi at gmail.com
Tue Apr 13 21:33:46 CEST 2021


Dear Prof. Peter Blaha and WIEN2k users,

Thank you for your assistance.

> At least now the error: "lapw0 not found" is gone. Do you understand why??

Yes, I think it is because the path is now clearly known.

>How many slots do you get by this srun command ?

Usually I get a node with 28 CPUs.

>Is this the node with the name  e0591  ???

Yes, it is.

>Of course the .machines file must be consistent (dynamically adapted) with the actual nodename.

Yes, I use my script to do this.
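
For readers following the thread, here is a minimal sketch of how a .machines file could be adapted to the current node. It is not the script mentioned above. The function name make_machines is hypothetical, and the sketch assumes a single-node run where SLURM_NTASKS reflects the value passed to "srun -n"; the granularity and extrafine settings are illustrative choices.

```shell
#!/bin/sh
# Hypothetical helper: print a WIEN2k .machines file for a single-node MPI run.
#   $1 = node name, $2 = number of MPI processes
make_machines() {
    echo "lapw0:$1:$2"      # MPI-parallel lapw0 on this node
    echo "1:$1:$2"          # one MPI-parallel job for lapw1/lapw2
    echo "granularity:1"
    echo "extrafine:1"
}

# Assumption: inside the srun allocation SLURM_NTASKS holds the task count
# requested with -n; fall back to 1 if it is not set.
node="$(hostname)"
make_machines "${node%%.*}" "${SLURM_NTASKS:-1}"
```

Redirecting the output to .machines in the case directory before "x lapw0 -p" would keep the file consistent with whichever node srun assigns.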

When I use "srun --pty -n 8 /bin/bash", which takes me to a node with 8 free cores, and then run x lapw0 -p, this happens:

starting parallel lapw0 at Tue Apr 13 20:50:49 CEST 2021
-------- .machine0 : 4 processors
[1] 12852
[e0467:12859] mca_base_component_repository_open: unable to open mca_btl_uct: libucp.so.0: cannot open shared object file: No such file or directory (ignored)
[e0467][[56319,1],1][btl_openib_component.c:1699:init_one_device] error obtaining device attributes for mlx4_0 errno says Protocol not supported
[e0467:12859] mca_base_component_repository_open: unable to open mca_pml_ucx: libucp.so.0: cannot open shared object file: No such file or directory (ignored)
LAPW0 END
[1]    Done                          mpirun -np 4 -machinefile .machine0 /home/users/mollabashi/v19.2/lapw0_mpi lapw0.def >> .time00

Sincerely yours,

Leila Mollabashi