[Wien] lapwso_mpi error

Md. Fhokrul Islam fislam at hotmail.com
Sun Nov 13 01:33:50 CET 2016


Hi Prof. Blaha,


   I wasn't aware of the bug, but I will check the updates. I have repeated the calculation
with 16 cores (square processor grid) as you suggested, but I still get the same error.
As before, the job crashes at lapwso. I don't see any missing files, as you can see from the
list of vector files below.


-rw-r--r--. 1 eishfh kalmar 12427583862 Nov 12 10:04 3Mn.vectordn_1

-rw-r--r--. 1 eishfh kalmar       77760 Nov 12 10:26 3Mn.vectorsodn_1

-rw-r--r--. 1 eishfh kalmar       77760 Nov 12 10:26 3Mn.vectorsoup_1

-rw-r--r--. 1 eishfh kalmar 12428559726 Nov 12 04:17 3Mn.vectorup_1
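
For anyone who wants to collect the same information from a script rather than from `ll *vector*`, here is a minimal Python sketch. The helper names `vector_file_sizes` and `report` are my own and not part of WIEN2k; the script only globs for `*vector*` in a directory and reports byte sizes, it does not check whether the files are complete or readable.

```python
from pathlib import Path


def vector_file_sizes(case_dir="."):
    """Return {file name: size in bytes} for every *vector* file in case_dir."""
    return {
        p.name: p.stat().st_size
        for p in sorted(Path(case_dir).glob("*vector*"))
        if p.is_file()
    }


def report(sizes):
    """Format the sizes as aligned text lines, one per file."""
    return [f"{size:>14d}  {name}" for name, size in sizes.items()]


if __name__ == "__main__":
    # Run inside the case directory (or pass the SCRATCH directory instead of ".").
    for line in report(vector_file_sizes(".")):
        print(line)
```

Running it once in the working directory and once in the SCRATCH directory would show which of the two actually holds the vector files.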


Here are the dayfile and output error files. These are the only error messages I got.


case.dayfile:


    cycle 1     (Sat Nov 12 01:21:39 CET 2016)  (100/99 to go)


>   lapw0 -p    (01:21:39) starting parallel lapw0 at Sat Nov 12 01:21:39 CET 2016

-------- .machine0 : 16 processors

14031.329u 15.362s 14:40.87 1594.6%     0+0k 90152+1974560io 175pf+0w

>   lapw1  -up -p   -c  (01:36:20) starting parallel lapw1 at Sat Nov 12 01:36:20 CET 2016

->  starting parallel LAPW1 jobs at Sat Nov 12 01:36:20 CET 2016

running LAPW1 in parallel mode (using .machines)

1 number_of_parallel_jobs

     au188 au188 au188 au188 au188 au188 au188 au188 au188 au188 au188 au188 au188 au188 au188 au188(1) 121331.481u 33186.223s 2:41:04.62 1598.7%       0+0k 0+29485672io 118pf+0w

   Summary of lapw1para:

   au188         k=0     user=0  wallclock=0

121367.583u 33215.702s 2:41:06.83 1599.1%       0+0k 288+29487024io 121pf+0w

>   lapw1  -dn -p   -c  (04:17:27) starting parallel lapw1 at Sat Nov 12 04:17:27 CET 2016

->  starting parallel LAPW1 jobs at Sat Nov 12 04:17:27 CET 2016

running LAPW1 in parallel mode (using .machines.help)

1 number_of_parallel_jobs

     au188 au188 au188 au188 au188 au188 au188 au188 au188 au188 au188 au188 au188 au188 au188 au188(1) 233187.228u 100041.449s 5:47:30.00 1598.2%      0+0k 5832+35169304io 116pf+0w

   Summary of lapw1para:

   au188         k=0     user=0  wallclock=0

233263.580u 100102.639s 5:47:31.69 1598.7%      0+0k 6296+35170640io 118pf+0w

>   lapwso -up  -p -c   (10:04:59) running LAPWSO in parallel mode

**  LAPWSO crashed!

1233.319u 23.612s 21:29.72 97.4%        0+0k 13064+7712io 17pf+0w

error: command   /lunarc/nobackup/users/eishfh/SRC/Wien2k14.2-iomkl/lapwsopara -up -c lapwso.def   failed


>   stop error

-----------------------
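
In case it helps others reading long dayfiles, below is a small Python sketch that reports which step was running when a program crashed. It assumes only the line shapes visible in the excerpt above (a `>   program options (hh:mm:ss)` line for each step and a `**  PROGRAM crashed!` line); the function name `crashed_steps` is hypothetical and other WIEN2k versions may format the dayfile differently.

```python
import re

# A step line such as ">   lapwso -up  -p -c   (10:04:59) running ..."
STEP = re.compile(r"^>\s+(\S+)(.*?)\((\d\d:\d\d:\d\d)\)")
# A crash line such as "**  LAPWSO crashed!"
CRASH = re.compile(r"^\*\*\s+(\S+) crashed!")


def crashed_steps(dayfile_text):
    """Return a list of (crashed program, last step started, start time)."""
    last_step = (None, None)
    crashes = []
    for raw in dayfile_text.splitlines():
        line = raw.strip()
        step = STEP.match(line)
        if step:
            # Collapse repeated blanks between the options.
            options = " ".join(step.group(2).split())
            last_step = (f"{step.group(1)} {options}".strip(), step.group(3))
            continue
        crash = CRASH.match(line)
        if crash:
            crashes.append((crash.group(1), last_step[0], last_step[1]))
    return crashes


if __name__ == "__main__":
    # "case.dayfile" is a placeholder; use the dayfile of your own case.
    with open("case.dayfile") as fh:
        for program, step, started in crashed_steps(fh.read()):
            print(f"{program} crashed during '{step}' (started {started})")
```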

lapwso.error file:


**  Error in Parallel LAPWSO

**  Error in Parallel LAPWSO

-----------------------

output error file:


 LAPW0 END

 LAPW1 END

 LAPW1 END

forrtl: severe (39): error during read, unit 9, file /lunarc/nobackup/users/eishfh/WIEN2k/GaAs_ZB/David_project/3Mn001/ALL/test-so/3Mn/./3Mn.vectordn_1

Image              PC                Routine            Line        Source

lapwso_mpi         00000000004634E3  Unknown               Unknown  Unknown

lapwso_mpi         000000000047F3C4  Unknown               Unknown  Unknown

lapwso_mpi         000000000042BA1F  kptin_                     56  kptin.F

lapwso_mpi         0000000000431566  MAIN__                    523  lapwso.F

lapwso_mpi         000000000040B3EE  Unknown               Unknown  Unknown

libc.so.6          00002BA34EDECB15  Unknown               Unknown  Unknown

lapwso_mpi         000000000040B2E9  Unknown               Unknown  Unknown

-----------------------


Thanks,
Fhokrul



________________________________
From: Wien <wien-bounces at zeus.theochem.tuwien.ac.at> on behalf of Peter Blaha <pblaha at theochem.tuwien.ac.at>
Sent: Friday, November 11, 2016 7:12 PM
To: A Mailing list for WIEN2k users
Subject: Re: [Wien] lapwso_mpi error

> I have repeated the calculation as you suggested. I have used the current
> working directory as SCRATCH but I got the same error. I don't see
> anything wrong with lapw1.

You have to send us detailed error messages.

It cannot be true that your SCRATCH is the working directory, when an
error points to  /local/slurmtmp.287632/3Mn.vectordn_1

What is your error now ?

Do you see this missing file:
/local/slurmtmp.287632/3Mn.vectordn_1

When doing    ll *vector*
in the "correct" directory, what length do these files have ?

PS: There was a bug report for non-square processor grids (20=4*5) and
RLOs. Did you fix that?
Perhaps try 16 cores only.
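
To illustrate the arithmetic behind "square" versus "non-square" grids, here is a short Python sketch that factors a core count into the most nearly square rows x cols pair. It is only an illustration under my own assumptions: the function names are hypothetical, and the way WIEN2k or the underlying MPI libraries actually lay out the processor grid may differ.

```python
import math


def most_square_grid(ncores):
    """Return (rows, cols) with rows * cols == ncores and rows <= cols,
    choosing rows as close to sqrt(ncores) as possible."""
    if ncores < 1:
        raise ValueError("ncores must be a positive integer")
    rows = math.isqrt(ncores)
    while ncores % rows != 0:
        rows -= 1
    return rows, ncores // rows


def is_square_grid(ncores):
    """True if ncores can be arranged as an n x n grid."""
    rows, cols = most_square_grid(ncores)
    return rows == cols


if __name__ == "__main__":
    for n in (16, 20):
        rows, cols = most_square_grid(n)
        kind = "square" if is_square_grid(n) else "non-square"
        print(f"{n} cores -> {rows} x {cols} ({kind})")
```

With these definitions, 20 cores give a 4 x 5 grid and 16 cores give a 4 x 4 grid, matching the numbers mentioned above.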

On 11.11.2016 at 16:01, Md. Fhokrul Islam wrote:
>> /local/slurmtmp.287632/3Mn.vectordn_1

--
--------------------------------------------------------------------------
Peter BLAHA, Inst.f. Materials Chemistry, TU Vienna, A-1060 Vienna
Phone: +43-1-58801-165300             FAX: +43-1-58801-165982
Email: blaha at theochem.tuwien.ac.at    WIEN2k: http://www.wien2k.at
WWW:   http://www.imc.tuwien.ac.at/staff/tc_group_e.php
--------------------------------------------------------------------------
_______________________________________________
Wien mailing list
Wien at zeus.theochem.tuwien.ac.at
http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien
SEARCH the MAILING-LIST at:  http://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/index.html