[Wien] Logging output issue for parallel jobs?
Peter Blaha
peter.blaha at tuwien.ac.at
Thu May 2 18:48:25 CEST 2024
No, not really.
lapw0 (via lapw0para_lapw) writes only to :parallel_lapw0, as the scripts show:
susi:/area51/WIEN2k_23> grep 'starting parallel' *_lapw
dstartpara_lapw:echo "starting parallel dstart at `date`"
dstartpara_lapw:echo "starting parallel dstart at `date`" >>$log
hfpara_lapw:echo "-> "starting parallel hf$cmplx at `date` >>$log
irreppara_lapw:echo "-> "starting parallel irrep at `date` >>$log
lapw0para_lapw:echo "starting parallel lapw0 at `date`"
lapw0para_lapw:echo "starting parallel lapw0 at `date`" >>$log
lapw1para_lapw:echo "starting parallel lapw1 at `date`"
lapw1para_lapw:echo "starting parallel lapw1 at `date`" >>$log
lapw1para_lapw:echo "-> starting parallel LAPW1 jobs at `date`"
lapw2para_lapw:echo "-> "starting parallel lapw2$cmplx at `date` >>$log
lapwdmpara_lapw:echo "-> "starting parallel lapwdm$cmplx at `date` >>$log
lapwsopara_lapw:echo "-> "starting parallel lapwso at `date` >>$log
nlvdwpara_lapw:echo "starting parallel nlvdw at `date`"
nlvdwpara_lapw:echo "starting parallel nlvdw at `date`" >>$log
opticpara_lapw:echo "-> "starting parallel optic at `date` >>$log
and
susi:/area51/WIEN2k_23> grep 'done at' *_lapw
dstartpara_lapw:echo "<- " done at `date`>>$log
hfpara_lapw:echo "<- "done at `date` >>$log
hfpara_lapw:echo "<- "done at `date` >>$log
irreppara_lapw:echo "<- "done at `date` >>$log
irreppara_lapw:echo "<- "done at `date` >>$log
lapw0para_lapw:echo "<- " done at `date`>>$log
lapw1para_lapw:echo "<- " done at `date`>>$log
lapw2para_lapw:echo "<- "done at `date` >>$log
lapw2para_lapw:echo "<- "done at `date` >>$log
lapwdmpara_lapw:echo "<- "done at `date` >>$log
lapwsopara_lapw:echo "<- "done at `date` >>$log
lapwsopara_lapw:echo "<- "done at `date` >>$log
nlvdwpara_lapw:echo "<- " done at `date`>>$log
opticpara_lapw:echo "<- "done at `date` >>$log
opticpara_lapw:echo "<- "done at `date`
opticpara_lapw:echo "<- "done at `date` >>$log
where log is set as:
susi:/area51/WIEN2k_23> grep 'set log' *_lapw
checkparam_lapw:set logfile = checkparam.log
dstartpara_lapw:set log = :parallel_dstart
hfpara_lapw:set log = :parallel
init_lapw:set logfile = :log
init_so_lapw:set logfile = :log
init_w2w_lapw:set logfile = :log
irreppara_lapw:set log = :parallel
kill_w2web_lapw:set logpid="$tmp_dir/w2web.pid.$$"
kill_w2web_lapw:set logpid1=`cat $logpid`
lapw0para_lapw:set log = :parallel_lapw0
lapw1para_lapw:set log = :parallel
.....
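
The grep output above shows that each *para_lapw script appends a "starting parallel ..." line and a "<- done at ..." line to its own $log file. If you want step timings from those files, the pairs can be matched mechanically. The following is only a minimal sketch, not part of WIEN2k: parallel_timing is a hypothetical helper, and it assumes that the timestamps come from plain `date` with the clock time as a separate HH:MM:SS field and that no step runs for 24 hours or longer.

```sh
#!/bin/sh
# Pair each "starting parallel <step> at <date>" line with the next
# "<- done at <date>" line and print: step, start time, end time, seconds.
# Assumption: timestamps are plain `date` output with an HH:MM:SS field.
parallel_timing() {
    awk '
    # Seconds since midnight of the first HH:MM:SS field on the line, else -1.
    function secs(   i, t) {
        for (i = 1; i <= NF; i++)
            if ($i ~ /^[0-9][0-9]:[0-9][0-9]:[0-9][0-9]$/) {
                split($i, t, ":")
                stamp = $i
                return t[1] * 3600 + t[2] * 60 + t[3]
            }
        return -1
    }
    /starting parallel/ {
        step = ""
        # The step name is the word right after "parallel".
        for (i = 1; i < NF; i++)
            if ($i == "parallel") { step = $(i + 1); break }
        start = secs(); start_stamp = stamp
        next
    }
    /done at/ && step != "" {
        stop = secs()
        if (start >= 0 && stop >= 0) {
            elapsed = stop - start
            if (elapsed < 0) elapsed += 86400   # step ran past midnight
            printf "%s %s %s %d\n", step, start_stamp, stamp, elapsed
        }
        step = ""
    }
    ' "$@"
}
```

Called as `parallel_timing :parallel_lapw0`, a pair such as "starting parallel lapw0 at Wed May 1 16:25:52 CDT 2024" followed by "<- done at Wed May 1 16:26:18 CDT 2024" would be reported as `lapw0 16:25:52 16:26:18 26`. Since lapw0 logs to :parallel_lapw0 and most other steps log to :parallel, each file has to be processed separately.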
-------------------------------
I agree, the :parallel* files are a bit confusing. If you want to monitor
the scf cycle, I'd concentrate on case.dayfile, although the dayfile gets
overwritten by a new scf cycle.
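
Because the dayfile is overwritten, it may be worth keeping a copy before starting the next scf cycle. A small sketch of how that could be done follows; save_dayfile is a hypothetical helper, not a WIEN2k command, and it assumes the usual convention that the case name equals the name of the case directory.

```sh
#!/bin/sh
# Keep a timestamped copy of case.dayfile before a new scf cycle overwrites it.
# Assumption: the case name is the name of the case directory.
save_dayfile() {
    dir=${1:-.}
    case_name=$(basename "$(cd "$dir" && pwd)")
    src="$dir/$case_name.dayfile"
    # Nothing to do if no dayfile has been written yet.
    [ -f "$src" ] || { echo "no $src to save" >&2; return 1; }
    dst="$src.$(date +%Y%m%d-%H%M%S)"
    # -p preserves the modification time of the original dayfile.
    cp -p "$src" "$dst" && echo "$dst"
}
```

Running `save_dayfile` in the case directory just before the next scf run prints the name of the copy; two calls within the same second would write to the same file name.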
Regards
On 02.05.2024 at 17:47, Straus, Daniel B wrote:
>
> Hi,
>
> I’m running ver 23.2 on a SLURM-managed cluster. I’ve been looking at
> the :parallel file for job step timing to optimize performance.
>
> I think there is a minor bug in the output, where it refers to lapw0
> as lapw1.
>
> Here is relevant text from a :parallel file:
>
> starting parallel lapw1 at Wed May 1 16:25:39 CDT 2024
> <- done at Wed May 1 16:25:52 CDT 2024
> -----------------------------------------------------------------
> starting parallel lapw1 at Wed May 1 16:26:18 CDT 2024
> cypress01-029 cypress01-029 cypress01-029 cypress01-029 cypress01-029 cypress01-029 cypress01-029 cypress01-029 cypress01-029(3) 0.019u 0.056s 23:55.17 0.0% 0+0k 0+8io 0pf+0w
> cypress01-031 cypress01-031 cypress01-031 cypress01-031 cypress01-031 cypress01-031 cypress01-031 cypress01-031 cypress01-031(2) 0.031u 0.053s 40:08.71 0.0% 0+0k 0+16io 0pf+0w
> cypress01-032 cypress01-032 cypress01-032 cypress01-032 cypress01-032 cypress01-032 cypress01-032 cypress01-032 cypress01-032(2) 0.025u 0.066s 56:58.72 0.0% 0+0k 0+16io 0pf+0w
> cypress01-036 cypress01-036 cypress01-036 cypress01-036 cypress01-036 cypress01-036 cypress01-036 cypress01-036 cypress01-036(2) 0.032u 0.067s 1:13:47.00 0.0% 0+0k 0+16io 0pf+0w
> Summary of lapw1para:
> cypress01-029 k=3 user=0.019 wallclock=1435.17
> cypress01-031 k=2 user=0.031 wallclock=2408.71
> cypress01-032 k=2 user=0.025 wallclock=3418.72
> cypress01-036 k=2 user=0.032 wallclock=73
> <- done at Wed May 1 17:40:07 CDT 2024
> -----------------------------------------------------------------
>
> The first “lapw1” should be lapw0.
>
> A separate file called :parallel_lapw0 is also generated. This refers
> to lapw0 correctly.
>
> starting parallel lapw0 at Wed May 1 16:25:52 CDT 2024
> <- done at Wed May 1 16:26:18 CDT 2024
> -----------------------------------------------------------------
> starting parallel lapw0 at Wed May 1 18:55:36 CDT 2024
> <- done at Wed May 1 18:56:00 CDT 2024
> -----------------------------------------------------------------
>
> Perhaps this is as intended, though I got confused by it.
>
> Daniel Straus
>
> Assistant Professor
>
> Department of Chemistry
>
> Tulane University
>
> 5088 Percival Stern Hall
>
> 6400 Freret Street
>
> New Orleans, LA 70118
>
> (504) 862-3585
>
> http://straus.tulane.edu/
>
>
> _______________________________________________
> Wien mailing list
> Wien at zeus.theochem.tuwien.ac.at
> http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien
> SEARCH the MAILING-LIST at:http://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/index.html
--
-----------------------------------------------------------------------
Peter Blaha, Inst. f. Materials Chemistry, TU Vienna, A-1060 Vienna
Phone: +43-158801165300
Email:peter.blaha at tuwien.ac.at
WWW:http://www.imc.tuwien.ac.at WIEN2k:http://www.wien2k.at
-------------------------------------------------------------------------