[Wien] Need extensive help for a job file for slurm job scheduler cluster
Dr. K. C. Bhamu
kcbhamu85 at gmail.com
Fri Nov 13 10:23:28 CET 2020
Dear All,
I need your extensive help.
I have tried to provide full details to help you understand my
requirement. If I have missed something, please let me know.
I am looking for a job file for our cluster. The job files available in the
FAQs are not working. They only produce the files
.machine0 .machines .machines_current
where .machines contains just a # and the other two are empty.
The script that works fine for Quantum Espresso on the 44core partition
is below:
#!/bin/sh
#SBATCH -J test #job name
#SBATCH -p 44core #partition name
#SBATCH -N 1 #node
#SBATCH -n 18 #core
#SBATCH -o %x.o%j
#SBATCH -e %x.e%j
export I_MPI_PMI_LIBRARY=/usr/lib64/libpmi.so #Do not change here!!
srun ~/soft/qe66/bin/pw.x < case.in > case.out
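For comparison, a WIEN2k job would have to write a populated .machines file from the SLURM allocation before calling run_lapw -p. The following is only a minimal sketch of that idea, assuming k-point parallelization on a single node (as with -N 1 -n 18 above). The helper name make_machines is hypothetical, and the granularity/extrafine settings are common defaults that may need adjusting for this cluster.

```shell
#!/bin/sh
# Hypothetical helper (not part of WIEN2k): print a k-point-parallel
# .machines file. $1 = whitespace-separated host names,
# $2 = number of parallel k-point jobs to start on each host.
make_machines() {
    hosts=$1
    per_host=$2
    echo "# .machines generated from the SLURM allocation"
    for h in $hosts; do
        i=0
        while [ "$i" -lt "$per_host" ]; do
            echo "1:$h"          # one k-point job with weight 1 on host $h
            i=$((i + 1))
        done
    done
    echo "granularity:1"
    echo "extrafine:1"
}

# Inside a SLURM job, take the host list and task count from the
# environment. Assumption: single-node job, so SLURM_NTASKS is per node.
if [ -n "$SLURM_JOB_NODELIST" ]; then
    hosts=$(scontrol show hostnames "$SLURM_JOB_NODELIST")
    make_machines "$hosts" "$SLURM_NTASKS" > .machines
    run_lapw -p
fi
```

Called as make_machines "node09" 18, this would print 18 lines of the form 1:node09 followed by the granularity and extrafine lines, instead of a .machines file that contains only a #.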
I have compiled Wien2k_19.2 on a CentOS cluster with a queuing system, whose
head node runs kernel Linux 3.10.0-1127.19.1.el7.x86_64.
I used compilers_and_libraries_2020.2.254, fftw-3.3.8, and libxc-4.34 for the
installation.
The details of the nodes that I can use are as follows (I can log in to
these nodes with my user password):
NODELIST  NODES  PARTITION  STATE      CPUS  S:C:T    MEMORY  TMP_DISK  WEIGHT  AVAIL_FE  REASON
elpidos   1      master     idle       4     4:1:1    15787   0         1       (null)    none
node01    1      72core     allocated  72    72:1:1   515683  0         1       (null)    none
node02    1      72core     allocated  72    72:1:1   257651  0         1       (null)    none
node03    1      72core     allocated  72    72:1:1   257651  0         1       (null)    none
node09    1      44core     mixed      44    44:1:1   128650  0         1       (null)    none
node10    1      44core     mixed      44    44:1:1   128649  0         1       (null)    none
node11    1      52core*    allocated  52    52:1:1   191932  0         1       (null)    none
node12    1      52core*    allocated  52    52:1:1   191932  0         1       (null)    none
The nodes run a mixture of kernel versions, as listed below.
OS=Linux 3.10.0-1062.12.1.el7.x86_64 #1 SMP Tue Feb 4 23:02:59 UTC 2020
OS=Linux 3.10.0-1127.19.1.el7.x86_64 #1 SMP Tue Aug 25 17:23:54 UTC 2020
OS=Linux 3.10.0-514.el7.x86_64 #1 SMP Tue Nov 22 16:42:41 UTC 2016
OS=Linux 3.10.0-957.12.2.el7.x86_64 #1 SMP Tue May 14 21:24:32 UTC 2019
Your extensive help will improve my research productivity.
Thank you very much.
Regards
Bhamu
*Full details of the nodes are here:*
NodeName=elpidos Arch=x86_64 CoresPerSocket=1
CPUAlloc=0 CPUTot=4 CPULoad=0.06
AvailableFeatures=(null)
ActiveFeatures=(null)
Gres=(null)
NodeAddr=10.0.0.250 NodeHostName=elpidos Version=20.02.3
OS=Linux 3.10.0-1127.19.1.el7.x86_64 #1 SMP Tue Aug 25 17:23:54 UTC 2020
RealMemory=15787 AllocMem=0 FreeMem=5597 Sockets=4 Boards=1
State=IDLE ThreadsPerCore=1 TmpDisk=0 Weight=1 Owner=N/A MCS_label=N/A
Partitions=master
BootTime=2020-10-13T14:25:13 SlurmdStartTime=2020-10-13T14:25:26
CfgTRES=cpu=4,mem=15787M,billing=4
AllocTRES=
CapWatts=n/a
CurrentWatts=0 AveWatts=0
ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s
NodeName=node01 Arch=x86_64 CoresPerSocket=1
CPUAlloc=72 CPUTot=72 CPULoad=72.00
AvailableFeatures=(null)
ActiveFeatures=(null)
Gres=(null)
NodeAddr=10.0.0.1 NodeHostName=node01 Version=20.02.3
OS=Linux 3.10.0-1127.19.1.el7.x86_64 #1 SMP Tue Aug 25 17:23:54 UTC 2020
RealMemory=515683 AllocMem=0 FreeMem=363362 Sockets=72 Boards=1
State=ALLOCATED ThreadsPerCore=1 TmpDisk=0 Weight=1 Owner=N/A MCS_label=N/A
Partitions=72core
BootTime=2020-10-13T20:44:04 SlurmdStartTime=2020-10-14T05:44:23
CfgTRES=cpu=72,mem=515683M,billing=72
AllocTRES=cpu=72
CapWatts=n/a
CurrentWatts=0 AveWatts=0
ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s
NodeName=node02 Arch=x86_64 CoresPerSocket=1
CPUAlloc=72 CPUTot=72 CPULoad=71.92
AvailableFeatures=(null)
ActiveFeatures=(null)
Gres=(null)
NodeAddr=10.0.0.2 NodeHostName=node02 Version=20.02.3
OS=Linux 3.10.0-1127.19.1.el7.x86_64 #1 SMP Tue Aug 25 17:23:54 UTC 2020
RealMemory=257651 AllocMem=0 FreeMem=142057 Sockets=72 Boards=1
State=ALLOCATED ThreadsPerCore=1 TmpDisk=0 Weight=1 Owner=N/A MCS_label=N/A
Partitions=72core
BootTime=2020-10-13T20:44:04 SlurmdStartTime=2020-10-14T05:44:17
CfgTRES=cpu=72,mem=257651M,billing=72
AllocTRES=cpu=72
CapWatts=n/a
CurrentWatts=0 AveWatts=0
ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s
NodeName=node03 Arch=x86_64 CoresPerSocket=1
CPUAlloc=72 CPUTot=72 CPULoad=71.96
AvailableFeatures=(null)
ActiveFeatures=(null)
Gres=(null)
NodeAddr=10.0.0.3 NodeHostName=node03 Version=20.02.3
OS=Linux 3.10.0-1127.19.1.el7.x86_64 #1 SMP Tue Aug 25 17:23:54 UTC 2020
RealMemory=257651 AllocMem=0 FreeMem=168118 Sockets=72 Boards=1
State=ALLOCATED ThreadsPerCore=1 TmpDisk=0 Weight=1 Owner=N/A MCS_label=N/A
Partitions=72core
BootTime=2020-10-13T20:44:33 SlurmdStartTime=2020-10-14T05:43:35
CfgTRES=cpu=72,mem=257651M,billing=72
AllocTRES=cpu=72
CapWatts=n/a
CurrentWatts=0 AveWatts=0
ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s
NodeName=node04 Arch=x86_64 CoresPerSocket=1
CPUAlloc=0 CPUTot=20 CPULoad=0.01
AvailableFeatures=(null)
ActiveFeatures=(null)
Gres=(null)
NodeAddr=10.0.0.4 NodeHostName=node04 Version=20.02.3
OS=Linux 3.10.0-1127.19.1.el7.x86_64 #1 SMP Tue Aug 25 17:23:54 UTC 2020
RealMemory=128664 AllocMem=0 FreeMem=126677 Sockets=20 Boards=1
State=IDLE ThreadsPerCore=1 TmpDisk=0 Weight=1 Owner=N/A MCS_label=N/A
Partitions=20core
BootTime=2020-10-13T20:43:24 SlurmdStartTime=2020-10-14T05:42:43
CfgTRES=cpu=20,mem=128664M,billing=20
AllocTRES=
CapWatts=n/a
CurrentWatts=0 AveWatts=0
ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s
NodeName=node05 Arch=x86_64 CoresPerSocket=1
CPUAlloc=0 CPUTot=4 CPULoad=0.01
AvailableFeatures=(null)
ActiveFeatures=(null)
Gres=(null)
NodeAddr=10.0.0.5 NodeHostName=node05 Version=20.02.3
OS=Linux 3.10.0-1062.12.1.el7.x86_64 #1 SMP Tue Feb 4 23:02:59 UTC 2020
RealMemory=64190 AllocMem=0 FreeMem=63350 Sockets=4 Boards=1
State=IDLE ThreadsPerCore=1 TmpDisk=0 Weight=1 Owner=N/A MCS_label=N/A
Partitions=4core
BootTime=2020-10-29T11:34:18 SlurmdStartTime=2020-10-29T11:34:30
CfgTRES=cpu=4,mem=64190M,billing=4
AllocTRES=
CapWatts=n/a
CurrentWatts=0 AveWatts=0
ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s
NodeName=node06 Arch=x86_64 CoresPerSocket=1
CPUAlloc=0 CPUTot=4 CPULoad=0.01
AvailableFeatures=(null)
ActiveFeatures=(null)
Gres=(null)
NodeAddr=10.0.0.6 NodeHostName=node06 Version=20.02.3
OS=Linux 3.10.0-1127.19.1.el7.x86_64 #1 SMP Tue Aug 25 17:23:54 UTC 2020
RealMemory=64190 AllocMem=0 FreeMem=63084 Sockets=4 Boards=1
State=IDLE ThreadsPerCore=1 TmpDisk=0 Weight=1 Owner=N/A MCS_label=N/A
Partitions=4core
BootTime=2020-10-19T11:07:32 SlurmdStartTime=2020-10-19T11:07:51
CfgTRES=cpu=4,mem=64190M,billing=4
AllocTRES=
CapWatts=n/a
CurrentWatts=0 AveWatts=0
ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s
NodeName=node07 Arch=x86_64 CoresPerSocket=1
CPUAlloc=0 CPUTot=64 CPULoad=0.01
AvailableFeatures=(null)
ActiveFeatures=(null)
Gres=(null)
NodeAddr=10.0.0.7 NodeHostName=node07 Version=20.02.3
OS=Linux 3.10.0-514.el7.x86_64 #1 SMP Tue Nov 22 16:42:41 UTC 2016
RealMemory=80241 AllocMem=0 FreeMem=75316 Sockets=64 Boards=1
State=IDLE ThreadsPerCore=1 TmpDisk=0 Weight=1 Owner=N/A MCS_label=N/A
Partitions=64core
BootTime=2020-10-13T20:52:40 SlurmdStartTime=2020-10-13T21:10:59
CfgTRES=cpu=64,mem=80241M,billing=64
AllocTRES=
CapWatts=n/a
CurrentWatts=0 AveWatts=0
ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s
NodeName=node08 Arch=x86_64 CoresPerSocket=1
CPUAlloc=0 CPUTot=64 CPULoad=0.01
AvailableFeatures=(null)
ActiveFeatures=(null)
Gres=(null)
NodeAddr=10.0.0.8 NodeHostName=node08 Version=20.02.3
OS=Linux 3.10.0-514.el7.x86_64 #1 SMP Tue Nov 22 16:42:41 UTC 2016
RealMemory=47987 AllocMem=0 FreeMem=42188 Sockets=64 Boards=1
State=IDLE ThreadsPerCore=1 TmpDisk=0 Weight=1 Owner=N/A MCS_label=N/A
Partitions=64core
BootTime=2020-10-13T20:51:08 SlurmdStartTime=2020-10-13T20:57:12
CfgTRES=cpu=64,mem=47987M,billing=64
AllocTRES=
CapWatts=n/a
CurrentWatts=0 AveWatts=0
ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s
NodeName=node09 Arch=x86_64 CoresPerSocket=1
CPUAlloc=36 CPUTot=44 CPULoad=35.99
AvailableFeatures=(null)
ActiveFeatures=(null)
Gres=(null)
NodeAddr=10.0.0.9 NodeHostName=node09 Version=20.02.3
OS=Linux 3.10.0-957.12.2.el7.x86_64 #1 SMP Tue May 14 21:24:32 UTC 2019
RealMemory=128650 AllocMem=0 FreeMem=78059 Sockets=44 Boards=1
State=MIXED ThreadsPerCore=1 TmpDisk=0 Weight=1 Owner=N/A MCS_label=N/A
Partitions=44core
BootTime=2020-10-13T20:47:11 SlurmdStartTime=2020-10-13T20:47:29
CfgTRES=cpu=44,mem=128650M,billing=44
AllocTRES=cpu=36
CapWatts=n/a
CurrentWatts=0 AveWatts=0
ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s
NodeName=node10 Arch=x86_64 CoresPerSocket=1
CPUAlloc=18 CPUTot=44 CPULoad=18.01
AvailableFeatures=(null)
ActiveFeatures=(null)
Gres=(null)
NodeAddr=10.0.0.10 NodeHostName=node10 Version=20.02.3
OS=Linux 3.10.0-957.12.2.el7.x86_64 #1 SMP Tue May 14 21:24:32 UTC 2019
RealMemory=128649 AllocMem=0 FreeMem=82279 Sockets=44 Boards=1
State=MIXED ThreadsPerCore=1 TmpDisk=0 Weight=1 Owner=N/A MCS_label=N/A
Partitions=44core
BootTime=2020-10-13T20:47:36 SlurmdStartTime=2020-10-13T20:48:00
CfgTRES=cpu=44,mem=128649M,billing=44
AllocTRES=cpu=18
CapWatts=n/a
CurrentWatts=0 AveWatts=0
ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s
NodeName=node11 Arch=x86_64 CoresPerSocket=1
CPUAlloc=52 CPUTot=52 CPULoad=52.01
AvailableFeatures=(null)
ActiveFeatures=(null)
Gres=(null)
NodeAddr=10.0.0.11 NodeHostName=node11 Version=20.02.3
OS=Linux 3.10.0-1127.13.1.el7.x86_64 #1 SMP Tue Jun 23 15:46:38 UTC 2020
RealMemory=191932 AllocMem=0 FreeMem=147904 Sockets=52 Boards=1
State=ALLOCATED ThreadsPerCore=1 TmpDisk=0 Weight=1 Owner=N/A MCS_label=N/A
Partitions=52core
BootTime=2020-10-13T20:47:02 SlurmdStartTime=2020-10-13T20:47:13
CfgTRES=cpu=52,mem=191932M,billing=52
AllocTRES=cpu=52
CapWatts=n/a
CurrentWatts=0 AveWatts=0
ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s
NodeName=node12 Arch=x86_64 CoresPerSocket=1
CPUAlloc=52 CPUTot=52 CPULoad=52.01
AvailableFeatures=(null)
ActiveFeatures=(null)
Gres=(null)
NodeAddr=10.0.0.12 NodeHostName=node12 Version=20.02.3
OS=Linux 3.10.0-1127.13.1.el7.x86_64 #1 SMP Tue Jun 23 15:46:38 UTC 2020
RealMemory=191932 AllocMem=0 FreeMem=162998 Sockets=52 Boards=1
State=ALLOCATED ThreadsPerCore=1 TmpDisk=0 Weight=1 Owner=N/A MCS_label=N/A
Partitions=52core
BootTime=2020-10-13T20:47:31 SlurmdStartTime=2020-10-13T20:47:42
CfgTRES=cpu=52,mem=191932M,billing=52
AllocTRES=cpu=52
CapWatts=n/a
CurrentWatts=0 AveWatts=0
ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s
NodeName=node13 Arch=x86_64 CoresPerSocket=1
CPUAlloc=0 CPUTot=4 CPULoad=0.01
AvailableFeatures=(null)
ActiveFeatures=(null)
Gres=(null)
NodeAddr=10.0.0.13 NodeHostName=node13 Version=20.02.3
OS=Linux 3.10.0-1127.19.1.el7.x86_64 #1 SMP Tue Aug 25 17:23:54 UTC 2020
RealMemory=31836 AllocMem=0 FreeMem=31093 Sockets=4 Boards=1
State=IDLE ThreadsPerCore=1 TmpDisk=0 Weight=1 Owner=N/A MCS_label=N/A
Partitions=4core
BootTime=2020-10-13T20:48:12 SlurmdStartTime=2020-10-13T20:48:20
CfgTRES=cpu=4,mem=31836M,billing=4
AllocTRES=
CapWatts=n/a
CurrentWatts=0 AveWatts=0
ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s