[Wien] Error while parallel run

Laurence Marks L-marks at northwestern.edu
Thu Jul 26 10:27:51 CEST 2012


I think you may have misunderstood how Wien2k works in parallel. It uses IP
addresses or host names, and I suspect that with "cpu1" you are trying to
address the first core of your machine, though I may be wrong. Is there a
physical machine called cpu1? Can you, from a terminal, do "ssh cpu1"?
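
As a quick check (a minimal sketch, using the names from the .machines file
quoted below), from the machine where the job is started:

  # does the name resolve at all?
  getent hosts cpu1
  # can it be reached without a password prompt?
  ssh cpu1 hostname

If these fail with "Name or service not known", as in the log below, then
cpu1 is not a host that exists on your network.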

---------------------------
Professor Laurence Marks
Department of Materials Science and Engineering
Northwestern University
www.numis.northwestern.edu 1-847-491-3996
"Research is to see what everybody else has seen, and to think what nobody
else has thought"
Albert Szent-Gyorgi
 On Jul 26, 2012 3:18 AM, "alpa dashora" <dashoralpa at gmail.com> wrote:

>  Dear Wien2k Users and Prof. Marks,
>
> Thank you very much for your reply. Here is some more information.
> Wien2k version: WIEN2k_11.1, on an 8-processor server (each processor has two cores).
> mkl library: 10.0.1.014
> openmpi: 1.3
> fftw: 2.1.5
>
> My OPTION file is as follows:
>
> current:FOPT:-FR -O3 -mp1 -w -prec_div -pc80 -pad -ip -DINTEL_VML
> -traceback -l/opt/openmpi/include
> current:FPOPT:-FR -mp1 -w -prec_div -pc80 -pad -ip -traceback
> current:LDFLAGS:-L/root/WIEN2k_11/SRC_lib
> -L/opt/intel/cmkl/10.0.1.014/lib/em64t -lmkl_em64t -lmkl_blacs_openmpi_lp64
> -lmkl_solver -lguide -lpthread -i-static
> current:DPARALLEL:'-DParallel'
> current:R_LIBS:-L/opt/intel/cmkl/10.0.1.014/lib/em64t -lmkl_scalapack_lp64
> -lmkl_solver_lp64_sequential -Wl,--start-group -lmkl_intel_lp64
> -lmkl_sequential -lmkl_core -lmkl_blacs_openmpi_lp64 -Wl,--end-group
> -lpthread -lm -L/opt/openmpi/1.3/lib/ -lmpi_f90 -lmpi_f77 -lmpi -lopen-rte
> -lopen-pal -ldl -Wl,--export-dynamic -lnsl -lutil -limf
> -L/opt/fftw-2.1.5/lib/lib/ -lfftw_mpi -lrfftw_mpi -lfftw -lrfftw
> current:RP_LIBS:-L/opt/intel/cmkl/10.0.1.014/lib/em64t -lmkl_scalapack_lp64
> -lmkl_solver_lp64_sequential -Wl,--start-group -lmkl_intel_lp64
> -lmkl_sequential -lmkl_core -lmkl_blacs_openmpi_lp64 -Wl,--end-group
> -lpthread -lm -L/opt/openmpi/1.3/lib/ -lmpi_f90 -lmpi_f77 -lmpi -lopen-rte
> -lopen-pal -ldl -Wl,--export-dynamic -lnsl -lutil -limf
> -L/opt/fftw-2.1.5/lib/lib/ -lfftw_mpi -lrfftw_mpi -lfftw -lrfftw
> current:MPIRUN:/opt/openmpi/1.3/bin/mpirun -v -n _NP_ _EXEC_
>  My parallel_options file is as follows:
>
> setenv USE_REMOTE 0
> setenv MPI_REMOTE 0
> setenv WIEN_GRANULARITY 1
> setenv WIEN_MPIRUN "/opt/openmpi/1.3/bin/mpirun -v -n _NP_ -machinefile
> _HOSTS_ _EXEC_"
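
For context, a rough sketch of what the WIEN2k parallel scripts do with this
WIEN_MPIRUN template (the file name .machine0 below is illustrative): _NP_,
_HOSTS_ and _EXEC_ are substituted before the command is run, so for the
lapw0 line of .machines the script ends up executing something like

  /opt/openmpi/1.3/bin/mpirun -v -n 4 -machinefile .machine0 lapw0_mpi lapw0.def

where the machine file contains exactly the host names listed on that line
(cpu1 cpu2 cpu3 cpu4 here). mpirun then tries to start a daemon on each of
those hosts over ssh, which is where the "ssh: cpu1: Name or service not
known" messages come from.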
>  Compilation produced no error messages and all the executables were
> generated. After editing the parallel_options file, the error message
> changed to the following:
>
> [arya:01254] filem:rsh: copy(): Error: File type unknown
> ssh: cpu1: Name or service not known
>
> --------------------------------------------------------------------------
> A daemon (pid 9385) died unexpectedly with status 255 while attempting
> to launch so we are aborting.
>
> There may be more information reported by the environment (see above).
>
> This may be because the daemon was unable to find all the needed shared
> libraries on the remote node. You may set your LD_LIBRARY_PATH to have the
> location of the shared libraries on the remote nodes and this will
> automatically be forwarded to the remote nodes.
> --------------------------------------------------------------------------
> --------------------------------------------------------------------------
> mpirun noticed that the job aborted, but has no info as to the process
> that caused that situation.
> --------------------------------------------------------------------------
> ssh: cpu2: Name or service not known
>
> ssh: cpu3: Name or service not known
>
> ssh: cpu4: Name or service not known
>
> mpirun: clean termination accomplished
>
> LAPW1 - Error
> LAPW1 - Error
> LAPW1 - Error
> LAPW1 - Error
> LAPW1 - Error
> LAPW1 - Error
> LAPW1 - Error
>  I have used the following .machines file for 16 k-points:
>
> granularity:1
> 1:cpu1
> 1:cpu2
> 1:cpu3
> 1:cpu4
> 1:cpu5
> 1:cpu6
> 1:cpu7
> 1:cpu8
> 1:cpu9
> 1:cpu10
> 1:cpu11
> 1:cpu12
> 1:cpu13
> 1:cpu14
> 1:cpu15
> 1:cpu16
> extrafine:1
> lapw0: cpu1:1 cpu2:1 cpu3:1 cpu4:1
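
If cpu1 ... cpu16 are not real, resolvable machine names (for example, if all
16 cores sit in the single host arya that appears in the error messages above,
which is an assumption on my part), a .machines file for plain k-point
parallelism on that one host would instead look something like

  granularity:1
  1:arya
  1:arya
  ...
  1:arya
  extrafine:1
  lapw0: arya:4

with one "1:arya" line per parallel k-point job (16 of them here); every entry
should be a host name that actually resolves on your network.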
>  Could anyone please suggest a solution to this problem?
>
> With kind regards,
>
>
> On Mon, Jul 23, 2012 at 4:50 PM, Laurence Marks <L-marks at northwestern.edu>wrote:
>
>> You probably have an incorrect MPIRUN environment variable. You have not
>> provided enough information, and you need to do a bit more of the analysis
>> yourself.
>>
>> ---------------------------
>> Professor Laurence Marks
>> Department of Materials Science and Engineering
>> Northwestern University
>> www.numis.northwestern.edu 1-847-491-3996
>> "Research is to see what everybody else has seen, and to think what
>> nobody else has thought"
>> Albert Szent-Gyorgi
>>   On Jul 23, 2012 6:17 AM, "alpa dashora" <dashoralpa at gmail.com> wrote:
>>
>>> Dear Wien2k Users,
>>>
>>> I recently installed Wien2k with openmpi on a 16-processor server.
>>> Installation completed without any compilation errors. While running the
>>> run_lapw -p command, I received the following error:
>>>
>>> ------------------------------------------------------------------------------------------------------------------------------
>>>
>>> mpirun was unable to launch the specified application as it could not
>>> find an executable:
>>>
>>> Executable:-4
>>> Node: arya
>>>
>>> while attempting to start process rank 0.
>>>
>>> -------------------------------------------------------------------------------------------------------------------------------
>>>
>>> Kindly suggest a solution.
>>> mpirun is available in /opt/openmpi/1.3/bin
>>>
>>> Thank you in advance.
>>>
>>> Regards,
>>>
>>> --
>>> Dr. Alpa Dashora
>>>
>>
>> _______________________________________________
>> Wien mailing list
>> Wien at zeus.theochem.tuwien.ac.at
>> http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien
>>
>>
>
>
> --
> Alpa Dashora
>