[Wien] problem in k-point parallel job
Laurence Marks
L-marks at northwestern.edu
Thu Feb 23 23:45:35 CET 2017
This is almost certainly not a WIEN2k issue. As your output states:
"cannot change to directory /mnt/oss/hod/wien_case/TCO_1: No such file or
directory"
This means that, for whatever reason, this directory is not available to your
batch job. Why is hard to say, but it is unlikely to have anything to do with
WIEN2k. It may be something to do with what is mounted where on your system,
which I don't think we can help with.
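A quick way to check would be something along these lines (a minimal sketch,
run from the login node; the compute-N names are only guessed from your
mpiexec messages, and it assumes you can ssh/qrsh to the nodes):

#!/bin/tcsh
# Check whether the case directory is actually visible on each compute node.
# Adjust the host names to whatever "qconf -sel" lists for your cluster.
foreach host (compute-0 compute-1 compute-2 compute-3)
    echo "--- $host ---"
    ssh $host "ls -d /mnt/oss/hod/wien_case/TCO_1"
    # or go through SGE itself (assuming the standard 'hostname' resource):
    # qrsh -l hostname=$host ls -d /mnt/oss/hod/wien_case/TCO_1
end

If any node cannot see the directory, the filesystem is simply not mounted
there, and that is something for your system administrator, not for WIEN2k.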
On Thu, Feb 23, 2017 at 4:33 PM, Dr. K. C. Bhamu <kcbhamu85 at gmail.com>
wrote:
> Dear Wien2k Experts,
> I am trying to submit a job, but the chance of submission failure is more
> than 90%.
>
> It is an SGE resource-manager system, and the job file is taken from the FAQs unchanged.
>
> In the job.out file, I am getting this message:
>
> qrsh_starter: cannot change to directory /mnt/oss/hod/wien_case/TCO_1: No
> such file or directory
> qrsh_starter: cannot change to directory /mnt/oss/hod/wien_case/TCO_1: No
> such file or directory
>
> In job.err:
>
> qrsh_starter: cannot change to directory /mnt/oss/hod/wien_case/TCO_1: No
> such file or directory
> qrsh_starter: cannot change to directory /mnt/oss/hod/wien_case/TCO_1: No
> such file or directory
> [mpiexec at compute-0] control_cb (./pm/pmiserv/pmiserv_cb.c:717): assert
> (!closed) failed
> [mpiexec at compute-1] HYDT_dmxu_poll_wait_for_event
> (./tools/demux/demux_poll.c:77): callback returned error status
> [mpiexec at compute-2] HYD_pmci_wait_for_completion
> (./pm/pmiserv/pmiserv_pmci.c:435): error waiting for event
> [mpiexec at compute-3] main (./ui/mpich/mpiexec.c:901): process manager
> error waiting for completion
>
>
> I tried varying the number of cores (16/32/48/64), but the problem
> persists.
>
> System specifications:
>
> SGE cluster (linuxifc) with 5 nodes, each node having 16 cores and each core
> having 4 GB RAM (~2 GB/processor), with a 40 Gbps InfiniBand interconnect.
> I used the "mpiifort" and "mpiicc" compilers with the ScaLAPACK, BLAS, FFTW3,
> and BLACS libraries (without ELPA and LIBXC-3.0.0).
>
>
> Parallel options:
>
> setenv TASKSET "no"
> if ( ! $?USE_REMOTE ) setenv USE_REMOTE 0
> if ( ! $?MPI_REMOTE ) setenv MPI_REMOTE 0
> setenv WIEN_GRANULARITY 1
> setenv DELAY 0.1
> setenv SLEEPY 1
> setenv WIEN_MPIRUN "mpirun -np _NP_ -machinefile _HOSTS_ _EXEC_"
> setenv CORES_PER_NODE 16
> # if ( ! $?PINNING_COMMAND) setenv PINNING_COMMAND "--cpu_bin=map_cpu:"
> # if ( ! $?PINNING_LIST ) setenv PINNING_LIST "0,8,1,9,2,10,3,11,4,12,5,13,6,14,7,15"
>
> Sincerely
> Bhamu
>
--
Professor Laurence Marks
"Research is to see what everybody else has seen, and to think what nobody
else has thought", Albert Szent-Gyorgi
www.numis.northwestern.edu ; Corrosion in 4D: MURI4D.numis.northwestern.edu
Partner of the CFW 100% program for gender equity, www.cfw.org/100-percent
Co-Editor, Acta Cryst A