[Wien] Need extensive help for a job file for slurm job scheduler cluster

Dr. K. C. Bhamu kcbhamu85 at gmail.com
Fri Nov 13 10:23:28 CET 2020


Dear All

I need your help.
I have tried to provide all the details needed to understand my
requirement. If I have missed something, please let me know.

I am looking for a job file for our cluster. The job files available in the
FAQs do not work for me. They produce only the files
.machine0          .machines          .machines_current
where .machines contains just a "#" and the other two are empty.

The script that works fine for Quantum Espresso on the 44core partition
is below:
#!/bin/sh
#SBATCH -J test #job name
#SBATCH -p 44core #partition name
#SBATCH -N 1 #node
#SBATCH -n 18 #core
#SBATCH -o %x.o%j
#SBATCH -e %x.e%j
export I_MPI_PMI_LIBRARY=/usr/lib64/libpmi.so #Do not change here!!
srun ~/soft/qe66/bin/pw.x  < case.in > case.out
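
For comparison, here is a minimal sketch of what an analogous WIEN2k job file might look like. It rests on several assumptions that are not confirmed anywhere in this message: k-point parallelization only (no MPI), run_lapw available on the PATH of the compute node, and `srun hostname` printing one hostname per allocated task. Whether the parallel jobs are started locally or via ssh depends on the WIEN2k parallel settings of the installation. The function name hosts_to_machines is made up for illustration.

```shell
#!/bin/sh
#SBATCH -J wien2k_test
#SBATCH -p 44core
#SBATCH -N 1
#SBATCH -n 18
#SBATCH -o %x.o%j
#SBATCH -e %x.e%j

# Hypothetical helper: read hostnames (one per task) on stdin and print
# a k-point-parallel .machines file with one "1:host" line per task.
hosts_to_machines() {
    echo "#"
    while read -r host; do
        if [ -n "$host" ]; then
            echo "1:$host"
        fi
    done
    echo "granularity:1"
    echo "extrafine:1"
}

# Only act inside a SLURM allocation, so the file can be sourced safely.
if [ -n "$SLURM_JOB_ID" ]; then
    # Assumption: srun prints one hostname per allocated task.
    srun hostname | sort | hosts_to_machines > .machines
    # Assumption: run_lapw is on the PATH and the case is initialized.
    run_lapw -p
fi
```

With the 18 tasks requested above, .machines would then hold 18 lines of the form 1:node09 (or 1:node10) instead of only "#".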

I have compiled Wien2k_19.2 on our CentOS cluster with the queuing system;
its head node runs kernel Linux 3.10.0-1127.19.1.el7.x86_64.

I used compilers_and_libraries_2020.2.254, fftw-3.3.8 and libxc-4.34 for the
installation.

The details of the nodes that I can use are as follows (I can log in to
these nodes with my user password):
NODELIST   NODES PARTITION       STATE CPUS    S:C:T MEMORY TMP_DISK WEIGHT AVAIL_FE REASON
elpidos        1    master        idle 4       4:1:1  15787        0      1   (null) none
node01         1    72core   allocated 72     72:1:1 515683        0      1   (null) none
node02         1    72core   allocated 72     72:1:1 257651        0      1   (null) none
node03         1    72core   allocated 72     72:1:1 257651        0      1   (null) none
node09         1    44core       mixed 44     44:1:1 128650        0      1   (null) none
node10         1    44core       mixed 44     44:1:1 128649        0      1   (null) none
node11         1   52core*   allocated 52     52:1:1 191932        0      1   (null) none
node12         1   52core*   allocated 52     52:1:1 191932        0      1   (null) none
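
A listing of this shape can be filtered to show only the nodes that still have free cores. The following is a small sketch, assuming the column order of the table above (node name in column 1, partition in column 3, state in column 4, CPU count in column 5); the function name usable_nodes is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: read node rows on stdin and print
# "node partition cpus" for every node whose STATE is idle or mixed.
usable_nodes() {
    awk '$4 == "idle" || $4 == "mixed" { print $1, $3, $5 }'
}

# Example with three rows copied from the table above.
usable_nodes <<'EOF'
node03         1    72core   allocated 72     72:1:1 257651        0      1   (null) none
node09         1    44core       mixed 44     44:1:1 128650        0      1   (null) none
elpidos        1    master        idle 4       4:1:1  15787        0      1   (null) none
EOF
```

On the cluster itself the same function could presumably be fed from `sinfo -N -l | usable_nodes`; the header lines are skipped automatically because their fourth field is neither idle nor mixed.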

The other nodes run a mixture of kernel versions, as listed below.

   OS=Linux 3.10.0-1062.12.1.el7.x86_64 #1 SMP Tue Feb 4 23:02:59 UTC 2020
   OS=Linux 3.10.0-1127.19.1.el7.x86_64 #1 SMP Tue Aug 25 17:23:54 UTC 2020
   OS=Linux 3.10.0-514.el7.x86_64 #1 SMP Tue Nov 22 16:42:41 UTC 2016
   OS=Linux 3.10.0-957.12.2.el7.x86_64 #1 SMP Tue May 14 21:24:32 UTC 2019
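
To see which node runs which kernel, the per-node details reproduced at the end of this message can be condensed to one line per node. This is a sketch only, assuming input in the `scontrol show node` layout (a NodeName= line followed by an indented OS= line); the function name kernel_by_node is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: read "scontrol show node" style text on stdin and
# print "node kernel-release" for every node.
kernel_by_node() {
    awk '
        # Remember the node name from lines such as "NodeName=node07 Arch=..."
        /^NodeName=/ { split($1, a, "="); node = a[2] }
        # "OS=Linux <release> #1 SMP ...": the release is the second field
        $1 ~ /^OS=/  { print node, $2 }
    '
}

# Example with two nodes taken from the details in this message.
kernel_by_node <<'EOF'
NodeName=node07 Arch=x86_64 CoresPerSocket=1
   OS=Linux 3.10.0-514.el7.x86_64 #1 SMP Tue Nov 22 16:42:41 UTC 2016
NodeName=node09 Arch=x86_64 CoresPerSocket=1
   OS=Linux 3.10.0-957.12.2.el7.x86_64 #1 SMP Tue May 14 21:24:32 UTC 2019
EOF
```

On the cluster the input would come from `scontrol show node | kernel_by_node`.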

Your extensive help will improve my research productivity.

Thank you very much.
Regards
Bhamu

*Full details of the nodes are here:*

NodeName=elpidos Arch=x86_64 CoresPerSocket=1
   CPUAlloc=0 CPUTot=4 CPULoad=0.06
   AvailableFeatures=(null)
   ActiveFeatures=(null)
   Gres=(null)
   NodeAddr=10.0.0.250 NodeHostName=elpidos Version=20.02.3
   OS=Linux 3.10.0-1127.19.1.el7.x86_64 #1 SMP Tue Aug 25 17:23:54 UTC 2020
   RealMemory=15787 AllocMem=0 FreeMem=5597 Sockets=4 Boards=1
   State=IDLE ThreadsPerCore=1 TmpDisk=0 Weight=1 Owner=N/A MCS_label=N/A
   Partitions=master
   BootTime=2020-10-13T14:25:13 SlurmdStartTime=2020-10-13T14:25:26
   CfgTRES=cpu=4,mem=15787M,billing=4
   AllocTRES=
   CapWatts=n/a
   CurrentWatts=0 AveWatts=0
   ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s


NodeName=node01 Arch=x86_64 CoresPerSocket=1
   CPUAlloc=72 CPUTot=72 CPULoad=72.00
   AvailableFeatures=(null)
   ActiveFeatures=(null)
   Gres=(null)
   NodeAddr=10.0.0.1 NodeHostName=node01 Version=20.02.3
   OS=Linux 3.10.0-1127.19.1.el7.x86_64 #1 SMP Tue Aug 25 17:23:54 UTC 2020
   RealMemory=515683 AllocMem=0 FreeMem=363362 Sockets=72 Boards=1
   State=ALLOCATED ThreadsPerCore=1 TmpDisk=0 Weight=1 Owner=N/A MCS_label=N/A
   Partitions=72core
   BootTime=2020-10-13T20:44:04 SlurmdStartTime=2020-10-14T05:44:23
   CfgTRES=cpu=72,mem=515683M,billing=72
   AllocTRES=cpu=72
   CapWatts=n/a
   CurrentWatts=0 AveWatts=0
   ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s


NodeName=node02 Arch=x86_64 CoresPerSocket=1
   CPUAlloc=72 CPUTot=72 CPULoad=71.92
   AvailableFeatures=(null)
   ActiveFeatures=(null)
   Gres=(null)
   NodeAddr=10.0.0.2 NodeHostName=node02 Version=20.02.3
   OS=Linux 3.10.0-1127.19.1.el7.x86_64 #1 SMP Tue Aug 25 17:23:54 UTC 2020
   RealMemory=257651 AllocMem=0 FreeMem=142057 Sockets=72 Boards=1
   State=ALLOCATED ThreadsPerCore=1 TmpDisk=0 Weight=1 Owner=N/A MCS_label=N/A
   Partitions=72core
   BootTime=2020-10-13T20:44:04 SlurmdStartTime=2020-10-14T05:44:17
   CfgTRES=cpu=72,mem=257651M,billing=72
   AllocTRES=cpu=72
   CapWatts=n/a
   CurrentWatts=0 AveWatts=0
   ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s


NodeName=node03 Arch=x86_64 CoresPerSocket=1
   CPUAlloc=72 CPUTot=72 CPULoad=71.96
   AvailableFeatures=(null)
   ActiveFeatures=(null)
   Gres=(null)
   NodeAddr=10.0.0.3 NodeHostName=node03 Version=20.02.3
   OS=Linux 3.10.0-1127.19.1.el7.x86_64 #1 SMP Tue Aug 25 17:23:54 UTC 2020
   RealMemory=257651 AllocMem=0 FreeMem=168118 Sockets=72 Boards=1
   State=ALLOCATED ThreadsPerCore=1 TmpDisk=0 Weight=1 Owner=N/A MCS_label=N/A
   Partitions=72core
   BootTime=2020-10-13T20:44:33 SlurmdStartTime=2020-10-14T05:43:35
   CfgTRES=cpu=72,mem=257651M,billing=72
   AllocTRES=cpu=72
   CapWatts=n/a
   CurrentWatts=0 AveWatts=0
   ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s


NodeName=node04 Arch=x86_64 CoresPerSocket=1
   CPUAlloc=0 CPUTot=20 CPULoad=0.01
   AvailableFeatures=(null)
   ActiveFeatures=(null)
   Gres=(null)
   NodeAddr=10.0.0.4 NodeHostName=node04 Version=20.02.3
   OS=Linux 3.10.0-1127.19.1.el7.x86_64 #1 SMP Tue Aug 25 17:23:54 UTC 2020
   RealMemory=128664 AllocMem=0 FreeMem=126677 Sockets=20 Boards=1
   State=IDLE ThreadsPerCore=1 TmpDisk=0 Weight=1 Owner=N/A MCS_label=N/A
   Partitions=20core
   BootTime=2020-10-13T20:43:24 SlurmdStartTime=2020-10-14T05:42:43
   CfgTRES=cpu=20,mem=128664M,billing=20
   AllocTRES=
   CapWatts=n/a
   CurrentWatts=0 AveWatts=0
   ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s


NodeName=node05 Arch=x86_64 CoresPerSocket=1
   CPUAlloc=0 CPUTot=4 CPULoad=0.01
   AvailableFeatures=(null)
   ActiveFeatures=(null)
   Gres=(null)
   NodeAddr=10.0.0.5 NodeHostName=node05 Version=20.02.3
   OS=Linux 3.10.0-1062.12.1.el7.x86_64 #1 SMP Tue Feb 4 23:02:59 UTC 2020
   RealMemory=64190 AllocMem=0 FreeMem=63350 Sockets=4 Boards=1
   State=IDLE ThreadsPerCore=1 TmpDisk=0 Weight=1 Owner=N/A MCS_label=N/A
   Partitions=4core
   BootTime=2020-10-29T11:34:18 SlurmdStartTime=2020-10-29T11:34:30
   CfgTRES=cpu=4,mem=64190M,billing=4
   AllocTRES=
   CapWatts=n/a
   CurrentWatts=0 AveWatts=0
   ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s


NodeName=node06 Arch=x86_64 CoresPerSocket=1
   CPUAlloc=0 CPUTot=4 CPULoad=0.01
   AvailableFeatures=(null)
   ActiveFeatures=(null)
   Gres=(null)
   NodeAddr=10.0.0.6 NodeHostName=node06 Version=20.02.3
   OS=Linux 3.10.0-1127.19.1.el7.x86_64 #1 SMP Tue Aug 25 17:23:54 UTC 2020
   RealMemory=64190 AllocMem=0 FreeMem=63084 Sockets=4 Boards=1
   State=IDLE ThreadsPerCore=1 TmpDisk=0 Weight=1 Owner=N/A MCS_label=N/A
   Partitions=4core
   BootTime=2020-10-19T11:07:32 SlurmdStartTime=2020-10-19T11:07:51
   CfgTRES=cpu=4,mem=64190M,billing=4
   AllocTRES=
   CapWatts=n/a
   CurrentWatts=0 AveWatts=0
   ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s


NodeName=node07 Arch=x86_64 CoresPerSocket=1
   CPUAlloc=0 CPUTot=64 CPULoad=0.01
   AvailableFeatures=(null)
   ActiveFeatures=(null)
   Gres=(null)
   NodeAddr=10.0.0.7 NodeHostName=node07 Version=20.02.3
   OS=Linux 3.10.0-514.el7.x86_64 #1 SMP Tue Nov 22 16:42:41 UTC 2016
   RealMemory=80241 AllocMem=0 FreeMem=75316 Sockets=64 Boards=1
   State=IDLE ThreadsPerCore=1 TmpDisk=0 Weight=1 Owner=N/A MCS_label=N/A
   Partitions=64core
   BootTime=2020-10-13T20:52:40 SlurmdStartTime=2020-10-13T21:10:59
   CfgTRES=cpu=64,mem=80241M,billing=64
   AllocTRES=
   CapWatts=n/a
   CurrentWatts=0 AveWatts=0
   ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s


NodeName=node08 Arch=x86_64 CoresPerSocket=1
   CPUAlloc=0 CPUTot=64 CPULoad=0.01
   AvailableFeatures=(null)
   ActiveFeatures=(null)
   Gres=(null)
   NodeAddr=10.0.0.8 NodeHostName=node08 Version=20.02.3
   OS=Linux 3.10.0-514.el7.x86_64 #1 SMP Tue Nov 22 16:42:41 UTC 2016
   RealMemory=47987 AllocMem=0 FreeMem=42188 Sockets=64 Boards=1
   State=IDLE ThreadsPerCore=1 TmpDisk=0 Weight=1 Owner=N/A MCS_label=N/A
   Partitions=64core
   BootTime=2020-10-13T20:51:08 SlurmdStartTime=2020-10-13T20:57:12
   CfgTRES=cpu=64,mem=47987M,billing=64
   AllocTRES=
   CapWatts=n/a
   CurrentWatts=0 AveWatts=0
   ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s


NodeName=node09 Arch=x86_64 CoresPerSocket=1
   CPUAlloc=36 CPUTot=44 CPULoad=35.99
   AvailableFeatures=(null)
   ActiveFeatures=(null)
   Gres=(null)
   NodeAddr=10.0.0.9 NodeHostName=node09 Version=20.02.3
   OS=Linux 3.10.0-957.12.2.el7.x86_64 #1 SMP Tue May 14 21:24:32 UTC 2019
   RealMemory=128650 AllocMem=0 FreeMem=78059 Sockets=44 Boards=1
   State=MIXED ThreadsPerCore=1 TmpDisk=0 Weight=1 Owner=N/A MCS_label=N/A
   Partitions=44core
   BootTime=2020-10-13T20:47:11 SlurmdStartTime=2020-10-13T20:47:29
   CfgTRES=cpu=44,mem=128650M,billing=44
   AllocTRES=cpu=36
   CapWatts=n/a
   CurrentWatts=0 AveWatts=0
   ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s


NodeName=node10 Arch=x86_64 CoresPerSocket=1
   CPUAlloc=18 CPUTot=44 CPULoad=18.01
   AvailableFeatures=(null)
   ActiveFeatures=(null)
   Gres=(null)
   NodeAddr=10.0.0.10 NodeHostName=node10 Version=20.02.3
   OS=Linux 3.10.0-957.12.2.el7.x86_64 #1 SMP Tue May 14 21:24:32 UTC 2019
   RealMemory=128649 AllocMem=0 FreeMem=82279 Sockets=44 Boards=1
   State=MIXED ThreadsPerCore=1 TmpDisk=0 Weight=1 Owner=N/A MCS_label=N/A
   Partitions=44core
   BootTime=2020-10-13T20:47:36 SlurmdStartTime=2020-10-13T20:48:00
   CfgTRES=cpu=44,mem=128649M,billing=44
   AllocTRES=cpu=18
   CapWatts=n/a
   CurrentWatts=0 AveWatts=0
   ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s


NodeName=node11 Arch=x86_64 CoresPerSocket=1
   CPUAlloc=52 CPUTot=52 CPULoad=52.01
   AvailableFeatures=(null)
   ActiveFeatures=(null)
   Gres=(null)
   NodeAddr=10.0.0.11 NodeHostName=node11 Version=20.02.3
   OS=Linux 3.10.0-1127.13.1.el7.x86_64 #1 SMP Tue Jun 23 15:46:38 UTC 2020
   RealMemory=191932 AllocMem=0 FreeMem=147904 Sockets=52 Boards=1
   State=ALLOCATED ThreadsPerCore=1 TmpDisk=0 Weight=1 Owner=N/A MCS_label=N/A
   Partitions=52core
   BootTime=2020-10-13T20:47:02 SlurmdStartTime=2020-10-13T20:47:13
   CfgTRES=cpu=52,mem=191932M,billing=52
   AllocTRES=cpu=52
   CapWatts=n/a
   CurrentWatts=0 AveWatts=0
   ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s


NodeName=node12 Arch=x86_64 CoresPerSocket=1
   CPUAlloc=52 CPUTot=52 CPULoad=52.01
   AvailableFeatures=(null)
   ActiveFeatures=(null)
   Gres=(null)
   NodeAddr=10.0.0.12 NodeHostName=node12 Version=20.02.3
   OS=Linux 3.10.0-1127.13.1.el7.x86_64 #1 SMP Tue Jun 23 15:46:38 UTC 2020
   RealMemory=191932 AllocMem=0 FreeMem=162998 Sockets=52 Boards=1
   State=ALLOCATED ThreadsPerCore=1 TmpDisk=0 Weight=1 Owner=N/A MCS_label=N/A
   Partitions=52core
   BootTime=2020-10-13T20:47:31 SlurmdStartTime=2020-10-13T20:47:42
   CfgTRES=cpu=52,mem=191932M,billing=52
   AllocTRES=cpu=52
   CapWatts=n/a
   CurrentWatts=0 AveWatts=0
   ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s


NodeName=node13 Arch=x86_64 CoresPerSocket=1
   CPUAlloc=0 CPUTot=4 CPULoad=0.01
   AvailableFeatures=(null)
   ActiveFeatures=(null)
   Gres=(null)
   NodeAddr=10.0.0.13 NodeHostName=node13 Version=20.02.3
   OS=Linux 3.10.0-1127.19.1.el7.x86_64 #1 SMP Tue Aug 25 17:23:54 UTC 2020
   RealMemory=31836 AllocMem=0 FreeMem=31093 Sockets=4 Boards=1
   State=IDLE ThreadsPerCore=1 TmpDisk=0 Weight=1 Owner=N/A MCS_label=N/A
   Partitions=4core
   BootTime=2020-10-13T20:48:12 SlurmdStartTime=2020-10-13T20:48:20
   CfgTRES=cpu=4,mem=31836M,billing=4
   AllocTRES=
   CapWatts=n/a
   CurrentWatts=0 AveWatts=0
   ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s