[Wien] MPI error
leila mollabashi
le.mollabashi at gmail.com
Tue Apr 13 21:33:46 CEST 2021
Dear Prof. Peter Blaha and WIEN2k users,
Thank you for your assistance.
> At least now the error: "lapw0 not found" is gone. Do you understand
> why??
Yes, I think it is because the path is now explicitly known.
>How many slots do you get by this srun command ?
Usually I get a node with 28 CPUs.
>Is this the node with the name e0591 ???
Yes, it is.
>Of course the .machines file must be consistent (dynamically adapted)
>with the actual nodename.
Yes, I use my script to do this.
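For illustration only, a minimal sketch of how such a script might adapt .machines to the allocated node (this is not the script actually used here). The function name write_machines is hypothetical, and the sketch assumes a single-node job, that "hostname -s" returns the node name, and that SLURM sets SLURM_NTASKS when "-n" is given:

```shell
#!/bin/bash
# Hypothetical helper (not part of WIEN2k): write a single-node .machines file.
# $1 = node name, $2 = number of MPI processes, $3 = output file (default .machines)
write_machines() {
    local node="$1" ncores="$2" out="${3:-.machines}"
    {
        echo "lapw0: ${node}:${ncores}"   # MPI-parallel lapw0 on this node
        echo "1:${node}:${ncores}"        # one k-point job using all processes
        echo "granularity:1"
        echo "extrafine:1"
    } > "$out"
}

# Inside an srun/sbatch allocation: use the current host and the task count
# reported by SLURM (assumption: fall back to 1 if the variable is unset).
write_machines "$(hostname -s)" "${SLURM_NTASKS:-1}"
```

Running this inside the interactive shell before "x lapw0 -p" should keep the node name in .machines in step with the node that was actually allocated.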
When I use "srun --pty -n 8 /bin/bash", which puts me on a node with 8 free
cores, and then run "x lapw0 -p", this happens:
starting parallel lapw0 at Tue Apr 13 20:50:49 CEST 2021
-------- .machine0 : 4 processors
[1] 12852
[e0467:12859] mca_base_component_repository_open: unable to open mca_btl_uct: libucp.so.0: cannot open shared object file: No such file or directory (ignored)
[e0467][[56319,1],1][btl_openib_component.c:1699:init_one_device] error obtaining device attributes for mlx4_0 errno says Protocol not supported
[e0467:12859] mca_base_component_repository_open: unable to open mca_pml_ucx: libucp.so.0: cannot open shared object file: No such file or directory (ignored)
LAPW0 END
[1] Done mpirun -np 4 -machinefile .machine0 /home/users/mollabashi/v19.2/lapw0_mpi lapw0.def >> .time00
Sincerely yours,
Leila Mollabashi